When running:
gcloud firebase test android run --type=instrumentation --app=app.apk --test=test_app.apk
The Firebase command line is stuck for many minutes at "Creating individual test executions".
When debugging further, it seems that the command line periodically polls a backend, "https://testing.googleapis.com:443", until it gets an OK.
Is there a way to speed this up? This step can take 5 minutes and it takes up unnecessary CI time.
Update:
The command line was missing these flags: --device model=NexusLowRes,version=29 --verbosity=debug
I analyzed the issue further. It takes about 100 s to upload both the app and the test app, and another 150 s to create the test execution, so I think this is a limitation in the system and nothing can be done here. Maybe the size of the APK is the limiting factor: it is about 200 MB, and it takes a lot of time to scan it.
Please see my comment on your question asking for additional details that could affect the answer.
One option is to add --async to your command. This will only poll the matrix status until it verifies that the matrix is created successfully, then exit without waiting for the test to actually run.
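For example, the full command with the device flag from the update above and --async added could look like this (all flags shown are ones already mentioned in this question):
gcloud firebase test android run \
  --type=instrumentation \
  --app=app.apk \
  --test=test_app.apk \
  --device model=NexusLowRes,version=29 \
  --async
With --async the CLI exits as soon as the test matrix is created instead of polling until the tests themselves finish.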
I'm running a DAG in Google Cloud Composer (hosted Airflow) which runs fine in Airflow locally. All it does is print "Hello World". However, when I run it through Cloud Composer I receive the error:
*** Log file does not exist: /home/airflow/gcs/logs/matts_custom_dag/main_test/2020-04-20T23:46:53.652833+00:00/2.log
*** Fetching from: http://airflow-worker-d775d7cdd-tmzj9:8793/log/matts_custom_dag/main_test/2020-04-20T23:46:53.652833+00:00/2.log
*** Failed to fetch log file from worker. HTTPConnectionPool(host='airflow-worker-d775d7cdd-tmzj9', port=8793): Max retries exceeded with url: /log/matts_custom_dag/main_test/2020-04-20T23:46:53.652833+00:00/2.log (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f8825920160>: Failed to establish a new connection: [Errno -2] Name or service not known',))
I've also tried making the DAG add data into a database, and it actually succeeds 50% of the time. However, it always returns this error message (and no other print statements or logs). Any help on why this might be happening would be much appreciated.
We also faced the same issue, raised a support ticket to GCP, and got the following reply:
The message is related to the latency of syncing logs from the Airflow workers to the web server; it takes at least a few minutes (depending on the number of objects and their size).
The total log size is not large, but it is enough to noticeably slow down synchronization, hence we recommend cleaning up/archiving the logs.
Basically, we recommend relying on Stackdriver logs instead, because of the latency caused by the design of this sync.
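As a rough illustration of reading those Stackdriver (Cloud Logging) logs directly instead of waiting for the UI sync; the resource type and the environment-name label below are assumptions that may need adjusting for your project:
gcloud logging read \
  'resource.type="cloud_composer_environment" AND resource.labels.environment_name="my-environment"' \
  --limit 50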
I hope this will help you solve the problem.
I have the same problem after upgrading Google Cloud Composer from 1.10.3 to 1.10.6.
I can see in my logs that Airflow is trying to get the logs from a bucket whose name ends with -tenant, while the bucket in my account ends with -bucket.
In the configuration, I can see something weird too.
## airflow.cfg
[core]
remote_base_log_folder = gs://us-east1-dada-airflow-xxxxx-bucket/logs
## the running configuration also says
core remote_base_log_folder gs://us-east1-dada-airflow-xxxxx-tenant/logs env var
I wrote to google support and they said the team is working on a fix.
EDIT:
I've been accessing my logs with gsutil, replacing the bucket name suffix with -bucket:
gsutil cat gs://us-east1-dada-airflow-xxxxx-bucket/logs/...../5.logs
I faced the same situation on multiple occasions.
As soon as a job finished, looking at its log in the Airflow web UI gave me this same error, although when I checked the same logs in the UI a minute or two later, I could see them properly.
As per the above answers, it's a sync issue between the webserver and the worker node.
In general, the issue described here is a sporadic one.
In certain situations, what can help is setting default-task-retries to a value that allows a task to be retried at least once.
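In Cloud Composer this can be done with an Airflow configuration override; a minimal sketch, where the environment name, location and exact override key are assumptions to adapt to your setup:
gcloud composer environments update my-environment \
  --location us-east1 \
  --update-airflow-configs=core-default_task_retries=1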
This issue has been resolved since at least Airflow version 1.10.10+composer.
I use Oozie to run a workflow, but a simple official example shell-wf (echo hello oozie) gets stuck in the RUNNING state and never ends. The workflow can be submitted but stays in the RUNNING state, and there is no error in the job log in the Oozie UI.
When submitting a shell action with spark-submit inside, the job is never submitted and cannot be seen in the Spark UI. I suspect the shell script didn't run at all.
What's the possible problem?
A Quick Checklist
For those who have the same problem, here is a checklist for checking your system. Hope it helps!
Check jobTracker in your Oozie configuration. Note: if a job has been run successfully before, it is probably not a jobTracker problem. Related discussion can be found here.
Check your disk usage. If disk usage is greater than 90%, remove some files to bring it back under 90%. (That was my case!) Rough commands for this check and the Oozie-log check are sketched after this list.
Check the Console URL of the stuck action. It can be found under Job - Job Info tab - Actions - Action - Action Info tab. The job state there may help you find the problem.
Check the Oozie log. It's typically in /usr/local/oozie/logs. Check oozie.log* to see whether there are exceptions.
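A rough sketch of commands for the disk-usage and Oozie-log checks above (the log path is the typical default mentioned above and may differ on your system):
df -h                                                          # check disk usage per filesystem
grep -iE "exception|error" /usr/local/oozie/logs/oozie.log*    # scan the Oozie logs for exceptions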
Details
Disk usage
If your action state is
YarnApplicationState: ACCEPTED: waiting for AM container to be allocated, launched and register with RM.
That may be a disk problem. A related discussion can be found in "MapReduce job hangs, waiting for AM container to be allocated". Solutions can be found in "Why does Hadoop report 'Unhealthy Node local-dirs and log-dirs are bad'?".
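A quick way to see whether YARN has marked any NodeManagers unhealthy (for example because of the disk usage described above) is to list the nodes together with their states:
yarn node -list -all    # shows each NodeManager and whether it is RUNNING or UNHEALTHY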
Is there any way that a Jenkins job can be paused until a notification is received, ideally with a payload as well?
I have a "test" job which does a whole bunch of remote tests, and I'd like it to wait until the tests are done, at which point I send an HTTP notification via curl with a payload including a test success code.
Is this possible with any default Jenkins plugins?
If Jenkins 2.x is an option for you, I'd consider taking a look at writing a pipeline job.
See https://jenkins.io/doc/book/pipeline/
Perhaps you could create a pipeline with multiple stages, where:
The first batch of work (your test job) is launched by the first pipeline stage.
That stage is configured (via Groovy code) to wait until your tests are complete before continuing. This is of course easy if the command that runs your tests blocks; but if your tests launch and then detach without providing an easy way to determine when they exit, you can add extra Groovy code to the stage to poll the machine where the tests are running and discover whether the work is complete (a rough polling sketch follows this list).
Subsequent stages can be run once the first stage exits.
As for passing a payload from one stage to another, that's possible too - for exit codes and strings, you can use Groovy variables, and for files, I believe you can have a stage archive a file as an artifact; subsequent stages can then access the artifact.
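For instance, the waiting stage could run a shell step along these lines; the host and status endpoint are hypothetical placeholders for whatever your test machine exposes:
# poll a (hypothetical) status URL until it reports success, then let the stage continue
until curl --silent --fail http://test-host.example.com/status; do
  echo "tests still running, waiting..."
  sleep 30
done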
Or, as Hani mentioned in a comment, you could create two Jenkins jobs, and have your tests (launched by the first job) use the Jenkins API to launch the second job when they complete successfully.
As you suggested, curl can be used to trigger jobs via the API, or you can use a Jenkins API wrapper package for your preferred language (I've had success using the Python jenkinsapi package for this sort of work: http://pythonhosted.org/jenkinsapi/).
If you need to pass parameters from your API client code to the second Jenkins job, that's possible by adding parameters to the second job using the Parameterized Build features built into Jenkins: https://wiki.jenkins-ci.org/display/JENKINS/Parameterized+Build
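As a sketch, triggering a parameterized downstream job with curl could look like this; the host, job name, parameter name and credentials are placeholders:
# POST to the job's buildWithParameters endpoint, passing the result as a query parameter
curl -X POST --user "user:api-token" \
  "https://jenkins.example.com/job/downstream-job/buildWithParameters?TEST_RESULT=success"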
I have a question about how the build queue is configured in CC.NET.
I believe we have an issue: when trying to "force" build a scheduled project, the server tries to run several builds at the same time, and all of them fail except the one that started first.
We need to get to a state where, regardless of how many builds are scheduled or how many we "force" start at about the same time, all build requests are placed into a build queue and executed one after another in the order they were placed, with no extra requests generated.
A "Build Failed" email is sent, but the build was actually successful.
In short, the erroneous email is likely due to an error in the build server's scheduler/queue: it tries to run two builds instead of one when asked for a "forced" build; as a result, the first one succeeds and the second one fails.
How can I correct/resolve this issue?
Thanks
Nilesh
To specify your projects' queue, you need to set the queue property like this:
<project name="MyFirstProject" queue="Q1" queuePriority="1">
The default is one queue per project. If you manually set the same queue (for example Q1) for all your projects, you will have a single shared queue.
As for queuePriority, the projects (not yet started) in the queue are ordered by queuePriority; projects with a lower queuePriority start first.
It's all described in the CC.NET documentation, which is currently offline due to a problem at SourceForge.
We have scheduled a script to run every 5 minutes.
How can we check whether the script is actually running every 5 minutes in Linux?
If anyone knows, please reply.
Thanks in advance.
We created the script, but we don't run it ourselves. Our support team runs it every 5 minutes, and if they find any error they update us.
How can we confirm whether or not they are running this script properly?
If you don't want to modify your script and you've scheduled it in cron, you could change your cron line to:
*/5 * * * * /home/me/myscript.sh; date >> /tmp/mylog
And check /tmp/mylog - a new line should be added with the date and time every run.
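You can then check that the recorded timestamps are roughly five minutes apart, for example:
tail -n 5 /tmp/mylog    # the last few lines should show timestamps about 5 minutes apart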
You could also make the script log to a file with a timestamp, so you can verify that the timestamps look right, something like:
date >> /tmp/myprocess.log
at the top of the script (or in the loop, if that's how you're "running" it); then you can examine the log file to check.
Maybe you could just add a log entry with a timestamp in your script?
Then you could see whether the script actually ran every 5 minutes.
Without changing the script?
atop is a process monitoring package that can record the history of programs being run and, when run as root, will catch terminations as well. See http://www.atcomputing.nl/Tools/atop/whyatop.html
Also consider the process accounting tools http://www.faqs.org/docs/Linux-mini/Process-Accounting.html
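Hedged examples for the tools mentioned above; package names, log paths and the file names shown are placeholders that vary by distribution:
atop -r /var/log/atop/atop_20200420      # replay atop's recorded history for a given day
sudo accton /var/log/account/pacct       # turn process accounting on (acct/psacct package; the file must exist)
lastcomm myscript.sh                     # afterwards, list recorded executions of the script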