Again, I'm stuck on Gearman. I was implementing the ulabox GearmanBundle, which works nicely, but there are two things I don't understand yet.
How do I start a worker?
In the documentation, I should first execute a worker and then start the client code.
https://github.com/ulabox/GearmanBundle/blob/master/README.md
Open the first console and run:
$ php app/console gearman:worker:execute --worker=AcmeDemoBundle:AcmeWorker
Now open another console and run:
$ php app/console gearman:client:execute --client=UlaboxGearmanBundle:GearmanClient:hello_world --worker=AcmeDemoBundle:AcmeWorker --params="{\"foo\": \"bar\" }"
So if I don't start the worker manually, the job should be done by itself, right? If I start the worker, everything is fine. But it seems a bit strange to have to start it manually, even when an iteration count of x is set so that the worker kills itself after that number of jobs.
So please, can anyone help me out with this? :(
Thanks in advance and kind regards,
Phil
Yes, to run tasks in the background, not only the Gearman server needs to be running but also the workers.
So you have the Gearman server running, waiting for commands (e.g. "send this email").
Additionally, you have workers waiting.
When Gearman sees a new command, it looks for the first free worker and passes the command to it.
The worker then executes the command and, once finished, reports back to the Gearman server that it is done and ready to process the next command.
The more workers you have, the faster the commands in the queue are processed.
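For illustration, here is a minimal sketch of what such a worker looks like using the raw PHP gearman extension (the bundle's gearman:worker:execute command wraps this kind of loop for you); the function name hello_world and the server address are assumptions:

<?php
// worker.php - hypothetical stand-alone worker using the pecl "gearman" extension
$worker = new GearmanWorker();
$worker->addServer('127.0.0.1', 4730); // default Gearman job server

// Register the function this worker can handle
$worker->addFunction('hello_world', function (GearmanJob $job) {
    $params = json_decode($job->workload(), true);
    echo "hello_world received: " . print_r($params, true) . "\n";
});

// Block and wait for jobs; work() handles one job per call
while ($worker->work());

A client then only pushes a job onto the queue, e.g. with $client->doBackground('hello_world', json_encode(['foo' => 'bar'])), and returns immediately, which is roughly what the gearman:client:execute command from your question ends up doing.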
You can use "supervisor" for automatic maintenance workers running.
Bellow you can find few links with more information:
http://www.daredevel.com/php-jobs-with-gearman-and-supervisor/
http://www.masnun.com/2011/11/02/gearman-php-and-supervisor-processing-background-jobs-with-sanity.html
Running Gearman Workers in the Background
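For example, a minimal supervisord program entry that keeps two worker processes alive could look like the following sketch; the paths, process count, and worker name are placeholders:

[program:gearman-worker]
command=php /path/to/app/console gearman:worker:execute --worker=AcmeDemoBundle:AcmeWorker
process_name=%(program_name)s_%(process_num)02d
numprocs=2
directory=/path/to/project
autostart=true
autorestart=true
redirect_stderr=true
stdout_logfile=/var/log/gearman-worker.log

With autorestart=true, supervisord restarts a worker as soon as it exits, so letting the worker kill itself after x iterations and having it restarted automatically is a common pattern.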
I have a workflow with many sessions that run in parallel to each other. When one of the sessions fails, the workflow waits for the other sessions to complete and only then does the entire workflow fail. We have selected the option "fail parent if this task fails", but we want the workflow to fail and stop immediately if any session fails, without waiting for the other sessions to finish.
PS: We have a Unix shell script that calls all the workflows one by one, so if we can solve this with Unix shell scripting, that would be fine as well.
Does anyone have a solution for this?
The best thing you can do in Informatica is use a Control Task to abort the workflow, and have it connected from all sessions with an OR condition. Something like:
start--S1--S2--S3
        \   \   \
         \---\---\-(OR)-CTL
I need to restart Airflow. I want to make sure I do it when it's idle, so that I don't interrupt a job by restarting the worker component of Airflow.
How do I see what DAGs are running?
I don't see anything in the UI that would list currently running DAGs.
I don't see any command in the airflow CLI to list currently running DAGs.
I found airflow shell, which lets me connect to the DB, but I don't know enough about Airflow internals to know where to look to see what's running.
You can also query the database to get all running tasks at once:
select * from task_instance where state='running'
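If you want whole DAG runs rather than individual task instances, the metadata database also has a dag_run table (column names here assume a standard Airflow schema):

select dag_id, run_id, state from dag_run where state = 'running';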
You can use the command-line command airflow jobs check, which returns "No alive jobs found." when no jobs are running.
I found it... it's in the UI, on the DAGs page: the second circle under "Recent Tasks".
I use Oozie to run a workflow, but the simple official shell-wf example (echo hello oozie) gets stuck in the RUNNING state and never ends. The workflow can be submitted but stays in the RUNNING state, and there are no errors in the job log in the Oozie UI.
When I submit a shell action with spark-submit inside, the job is never submitted and cannot be seen in the Spark UI. I suspect the shell script didn't run at all.
What could the problem be?
A Quick Checklist
For those who have the same problem, here is a checklist for checking your system. Hope it helps!
Check the jobTracker setting in your Oozie configuration. Note: if a job has already run successfully, the jobTracker is probably not the problem. Related discussion can be found here.
Check your disk usage. If disk usage is greater than 90%, remove some files to bring it back under 90%. (That was my case!)
Check the Console URL of the stuck action. It can be found in Job - Job Info tab - Actions - Action - Action Info tab. The job state shown there may help you find the problem.
Check the Oozie logs. They are typically in /usr/local/oozie/logs. Look through oozie.log* to see whether there are any exceptions.
Details
Disk usage
If your action state is
YarnApplicationState: ACCEPTED: waiting for AM container to be allocated, launched and register with RM.
That may be a disk problem. A related discussion can be found in MapReduce job hangs, waiting for AM container to be allocated, and solutions can be found in Why does Hadoop report "Unhealthy Node local-dirs and log-dirs are bad"?.
We have configured Hangfire to run as part of our web app using OWIN, as provided in the tutorial.
We enqueue a long-running background process via an API we provide. The job initializes an R process in the background using the .NET Process class. The R code internally spawns a number of processes to finish the job faster, so we see a number of Rscript processes in Task Manager while the job runs.
When we manually recycle the app pool of our web app (to see how a process restart behaves), the Rscript processes are not killed. We have a custom kill strategy in our code to get rid of all the Rscript processes:
while (IsNotTimedOut())
{
    try
    {
        // Throws OperationCanceledException once Hangfire cancels the token
        _token.ThrowIfCancellationRequested();
        Thread.Sleep(2000);
    }
    catch (OperationCanceledException)
    {
        // Clean up the spawned Rscript processes, then let the exception bubble up
        Kill();
        throw;
    }
}
Within the Kill method, we block using the Process.WaitForExit() method.
When we do the manual recycle, not all of the processes are killed. The current thread, instead of blocking until the processes are killed, just dies after killing a couple of Rscript processes.
The Hangfire code seems to just cancel the token; it doesn't appear to wait for the code listening on the cancellation token to finish killing the processes. Can someone suggest how we can get this working? Please let me know if more details are required.
All,
I'm looking for a good way to do some job backgrounding through either of these two services.
I see PHPFog supports IronWorks, but I need something more real-time. Through these cloud-based PaaS services I'm not able to use popen(background.php --token=1234), so I'm thinking the best solution might be to kick off a Gearman worker to handle the job. (Actually, my preferred method would be to use websockets to keep a connection open and receive feedback from the job, rather than long-polling a DB table through AJAX, but neither of these providers supports websockets.)
Question 1: is there a better solution than using Gearman to offload the job?
Question 2: I see PagodaBox supports 'worker listeners' (http://help.pagodabox.com/customer/portal/articles/430779). Has anybody set this up with Gearman? Would it work?
Thanks
I am using PagodaBox with a background worker in an application I am building right now. Basically, PagodaBox daemonizes a PHP process for you (meaning it will continually run in the background), so all you really have to do is create a script that checks a database table for tasks to run, runs them, and then sleeps a bit so it's not running too many queries against your database.
This is a simplified version of what I have running:
// Remove time limit
set_time_limit(0);
// Show ALL errors
error_reporting(-1);
// Run daemon
echo "--- Starting Daemon ---\n";
while (true) {
    // Query 'work_queue' table for new tasks
    // Loop over items and do whatever tasks are associated with them
    // Update row to mark task as completed

    // Wait a bit
    sleep(30);
}
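Purely as an illustration, a fleshed-out version of that loop might look like the sketch below; the work_queue table, its columns (id, task, status), and the PDO connection details are all hypothetical:

<?php
// Hypothetical fleshed-out version of the simplified daemon above.
// Assumes a work_queue table with columns (id, task, status).
set_time_limit(0);   // remove time limit
error_reporting(-1); // show all errors

$db = new PDO('mysql:host=127.0.0.1;dbname=app', 'app_user', 'secret');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

echo "--- Starting Daemon ---\n";
while (true) {
    // Query the work_queue table for new tasks
    $tasks = $db->query("SELECT id, task FROM work_queue WHERE status = 'pending'")
                ->fetchAll(PDO::FETCH_ASSOC);

    foreach ($tasks as $task) {
        echo "Running task {$task['id']}: {$task['task']}\n";

        // ... do whatever work is associated with this task ...

        // Mark the row as completed so it is not picked up again
        $done = $db->prepare("UPDATE work_queue SET status = 'done' WHERE id = :id");
        $done->execute(['id' => $task['id']]);
    }

    // Wait a bit before polling again
    sleep(30);
}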
A benefit to this approach is that it's easy to test via CLI:
php tasks.php
You will see all the echo statements come through in the console as it runs, and of course it's much easier to set up than a more complicated setup with other dependencies like Gearman.
So whenever you add a new task to the table, the maximum amount of time you'll wait for that task to be picked up is 30 seconds (or whatever your sleep time is). This is preferable to cron jobs: if you set up a cron job to run every minute (the lowest possible interval) and the work takes longer than a minute, another cron process will start working on the same queue, and you can end up with a lot of duplicated task work that is hard to debug and troubleshoot. If you instead have either a single background worker that runs all tasks, or multiple background workers that work on different task types, you will never run into this issue.