JobRunr - Trying to run multiple recurring jobs with Spring Boot - background-process

I am using JobRunr to run my background jobs and in this I am providing the users to setup recurring jobs using an endpoint like below:
#PostMapping("/schedule-recurring")
public String scheduleRecurring(#RequestBody ExecutionJob executionJob) {
return BackgroundJob.scheduleRecurrently(executionJob.getId(),executionJob.getCronExpression(), ()
-> jobService.executeSomeJob(executionJob, JobContext.Null));
}
These jobs could run in 5 mins, 10 mins or it might sometimes take upto 4 hours. This all depends on how many records to process. Right now, I am in a phase where I have only one background job server since I am building a POC for this. However, in future we plan to scale it to support 1 instance per customer having one or multiple background job server based on the client license.
My issue here is that, I have 2 recurring jobs which run 1 hour apart from each other. The first recurring job executes for more than 1 hour and deems the second job unexecuted, because this job is not triggered as there is no available Background Job Worker to address this request. I am thinking of adding a check to the job to trigger itself only if a Background Job Worker is available. But is there a better idea where-in the schedule-recurring method itself adds a condition to queue the job if a Background Job worker is not available?
Thanks in advance.

Related

Correct Approach For Airflow DAG Project

I am trying to see if Airflow is the right tool for some functionality I need for my project. We are trying to use it as a scheduler for running a sequence of jobs
that start at a particular time (or possibly on demand).
The first "task" is to query the database for the list of job id's to sequence through.
For each job in the sequence send a REST request to start the job
Wait until job completes or fails (via REST call or DB query)
Go to next job in sequence.
I am looking for recommendations on how to break down the functionality discussed above into an airflow DAG. So far my approach would :
create a Hook for the database and another for the REST server.
create a custom operator that handles the start and monitoring of the "job" (steps 2 and 3)
use a sensor to poll handle waiting for job to complete
Thanks

For Hangfire, is there any sample code for non-simple tasks; and how should recurring tasks be handled when re-publishing?

I am considering using Hangfire https://www.hangfire.io to replace an older home-grown scheduling ASP.NET web site/app.
I have created a simple test project using Hangfire. I am able to start the project with Hangfire, submit (in code) a couple of very simple single and recurring tasks, view the dashboard, etc.
I'm looking for more suggestions for creating a little more complex code (and classes) for tasks to be scheduled, and I have a question about what happens with permanently scheduled tasks when re-publishing a Hangfire site to production.
I have read some of the documentation on the Hangfire site, reviewed the 2 tutorials, scanned the Hangfire forums, and searched StackOverflow and the web a bit. A lot of what I have seen shows you how to schedule something very simple (like Console.WriteLine), but nothing more complex. The "Highlighter" tutorial was useful, but that essentially shows how to schedule a single instance of a (slightly longer-running) task in response to an interactive user input. I understand how useful that can be, but I'm more interested in recurring tasks that are submitted and then run every day (or every hour, etc.) and don't need to be submitted again. These tasks could be for something like sending a batch of emails to users each night, batch processing some data, importing a nightly feed of external data, periodically calling a web service to perform some processing, etc.
Is there any sample code available that shows some examples like this, or any guidance on the most appropriate approach for structuring such code in an interface and class(es)?
Secondly, in my case, most of the tasks would be "permanent" (always existing as a recurring task). If I set up code to add these as recurring tasks shortly after starting the Hangfire application in production, how should I handle it when publishing updates to production (when this same initialization would run again)? Should I just call "AddOrUpdate" with the same ID and Hangfire will take care of it? Should I first call "RemoveIfExists" and then add the recurring task again? Is there some other approach that should be used?
One example would be a log janitor, which would run every weekday # 5:00PM to remove logs that are older than 5 days.
public void Schedule()
{
RecurringJob.AddOrUpdate<LogJanitor>(
"Janitor - Old Logs",
j => j.OnSchedule(null),
"0 17 * * 1,2,3,4,5",
TimeZoneInfo.FindSystemTimeZoneById("CST"));
}
Then we would handle it this way
public void OnSchedule(
PerformContext context)
{
DateTime timeStamp = DateTime.Today.AddDays(-5);
_logRepo.FindAndDelete(from: DateTime.MinValue, to: timeStamp);
}
These two methods are declared inside LogJanitor class. When our application starts, we get an instance of this class then call Schedule().

How to prevent a Hangfire recurring job from restarting after 30 minutes of continuous execution

I am working on an asp.net mvc-5 web application, and I am facing a problem in using Hangfire tool to run long running background jobs. the problem is that if the job execution exceed 30 minutes, then hangfire will automatically initiate another job, so I will end up having two similar jobs running at the same time.
Now I have the following:-
Asp.net mvc-5
IIS-8
Hangfire 1.4.6
Windows server 2012
Now I have defined a hangfire recurring job to run at 17:00 each day. The background job mainly scan our network for servers and vms and update the DB, and the recurring job will send an email after completing the execution.
The recurring job used to work well when its execution was less than 30 minutes. But today as our system grows, the recurring job completed after 40 minutes instead of 22-25 minutes as it used to be. and I received 2 emails instead of one email (and the time between the emails was around 30 minutes). Now I re-run the job manually and I have noted that that the problem is as follow:-
"when the recurring job reaches 30 minutes of continuous execution, a
new instance of the recurring job will start, so I will have two
instances instead of one running at the same time, so that why I received 2 emails."
Now if the recurring job takes less than 30 minutes (for example 29 minute) I will not face any problem, but if the recurring job execution exceeds 30 minutes then for a reason or another hangfire will initiate a new job.
although when I access the hangfire dashboard during the execution of the job, I can find that there is only one active job, when I monitor our DB I can see from the sql profiler that there are two jobs accessing the DB. this happens after 30 minutes from the beginning of the recurring job (at 17:30 in our case), and that why I received 2 emails which mean 2 recurring jobs were running in the background instead of one.
So can anyone advice on this please, how I can avoid hangfire from automatically initiating a new recurring job if the current recurring job execution exceeds 30 minutes?
Thanks
Did you look at InvisibilityTimeout setting from the Hangfire docs?
Default SQL Server job storage implementation uses a regular table as
a job queue. To be sure that a job will not be lost in case of
unexpected process termination, it is deleted only from a queue only
upon a successful completion.
To make it invisible from other workers, the UPDATE statement with
OUTPUT clause is used to fetch a queued job and update the FetchedAt
value (that signals for other workers that it was fetched) in an
atomic way. Other workers see the fetched timestamp and ignore a job.
But to handle the process termination, they will ignore a job only
during a specified amount of time (defaults to 30 minutes).
Although this mechanism ensures that every job will be processed,
sometimes it may cause either long retry latency or lead to multiple
job execution. Consider the following scenario:
Worker A fetched a job (runs for a hour) and started it at 12:00.
Worker B fetched the same job at 12:30, because the default invisibility timeout was expired.
Worker C (did not fetch) the same job at 13:00, because (it
will be deleted after successful performance.)
If you are using cancellation tokens, it will be set for Worker A at
12:30, and at 13:00 for Worker B. This may lead to the fact that your
long-running job will never be executed. If you aren’t using
cancellation tokens, it will be concurrently executed by WorkerA and
Worker B (since 12:30), but Worker C will not fetch it, because it
will be deleted after successful performance.
So, if you have long-running jobs, it is better to configure the
invisibility timeout interval:
var options = new SqlServerStorageOptions
{
InvisibilityTimeout = TimeSpan.FromMinutes(30) // default value
};
GlobalConfiguration.Configuration.UseSqlServerStorage("<name or connection string>", options);
As of Hangfire 1.5 this option is now Obsolete. Jobs that are being worked on are invisible to other workers.
Say goodbye to confusing invisibility timeout with unexpected
background job retries after 30 minutes (by default) when using SQL
Server. New Hangfire.SqlServer implementation uses plain old
transactions to fetch background jobs and hide them from other
workers.
Even after ungraceful shutdown, the job will be available for other
workers instantly, without any delays.
I was having trouble finding documentation on how to do this properly for a Postgresql database, every example I was see is using sqlserver, I found how the invisibility timeout was a property inside the PostgreSqlStorageOptions object, I found this here : https://github.com/frankhommers/Hangfire.PostgreSql/blob/master/src/Hangfire.PostgreSql/PostgreSqlStorageOptions.cs#L36. Luckily through trial and error I was able to figure out that the UsePostgreSqlStorage has an overload to accept this object. For .Net Core 2.0 when you are setting up the hangfire postgresql DB in the ConfigureServices method in the startup class add this(the default timeout is set to 30 mins):
services.AddHangfire(config =>
config.UsePostgreSqlStorage(Configuration.GetConnectionString("Hangfire1ConnectionString"), new PostgreSqlStorageOptions {
InvisibilityTimeout = TimeSpan.FromMinutes(720)
}));
I had this problem when using Hangfire.MemoryStorage as the storage provider. With memory storage you need to set the FetchNextJobTimeout in the MemoryStorageOptions, otherwise by default jobs will timeout after 30 minutes and a new job will be executed.
var options = new MemoryStorageOptions
{
FetchNextJobTimeout = TimeSpan.FromDays(1)
};
GlobalConfiguration.Configuration.UseMemoryStorage(options);
Just would like to point out that even though, it is stated the thing below:
As of Hangfire 1.5 this option is now Obsolete. Jobs that are being worked on are invisible to other workers.
Say goodbye to confusing invisibility timeout with unexpected background job retries after 30 minutes (by default) when using SQL Server. New Hangfire.SqlServer implementation uses plain old transactions to fetch background jobs and hide them from other workers.
Even after ungraceful shutdown, the job will be available for other workers instantly, without any delays.
It seems that for many people using MySQL, PostgreSQL, MongoDB, InvisibilityTimeout is still the way to go: https://github.com/HangfireIO/Hangfire/issues/1197

Autosys Email Generation if the job runs more than the time specified

Currenty I have requirement in my enviroment for the autosys email notifition.
Requirement: If the job runs more than the specified time it should trigger an email.
What I am trying is using max_run_alam, but I am not successful.
Lets say i have a job that runs for 10mins(lets say the time as 10.00). i set max_run_alarm as 3. i should get an email at 10.03 where i can goahead and see why the job is running more than the max_run_alarm. if i use max_run_alarm i am able to see in the logs triggering that alarm, but I cannot spend all day monitoring the logs to see which job is taking long as i have many jobs. my question is am i using max_run_alarm in the correct way or is there something else i am missing or is there entirely different way for the emails to generate.
pls advise.
We are using autosys R11 at work. I believe the triggering of emails is already automated in higher versions of autosys, but, in our version, to send automatic emails after a certain time, we create two extra autosys jobs. One autosys job starts at the same time as the job you want to "monitor". This job contains a 'sleep' command. (in your example, the command would be "sleep 180" for the job to run for 3 minutes until completion). The second extra job is the sending of the email and only starts after successful completion of the sleep-job.
To prevent the mail from being send every time the autosys box starts, you have to add your first job as BOX_SUCCESS condition. The sleep-job will run to completion, but the mail-job went from the "ACTIVATED" state to the "INACTIVE" state because the autosys box isn't RUNNING anymore.

Pagodabox or PHPfog + Gearman

All,
I'm looking for a good way to do some job backgrounding through either of these two services.
I see PHPFog supports IronWorks, but i need something more realtime. Through these cloud based PaaS services, I'm not able to use popen(background.php --token=1234). So I'm thinking the best solution, might be to try to kick off a gearman worker to handle the job. (Actually my preferred method would be to use websockets to keep a connection open and receive feedback from the job, rather than long polling a db table through AJAX, but none of these guys support websockets)
Question 1 is, is there a better solution than using gearman to offload the job?
Question 2 is, http://help.pagodabox.com/customer/portal/articles/430779 I see pagodabox supports 'worker listeners' ... has anybody set this up with gearman? Would it work?
Thanks
I am using PagodaBox with a background worker in an application I am building right now. Basically, PagodaBox daemonizes a PHP process for you (meaning it will continually run in the background), so all you really have to do is create a script that checks a database table for tasks to run, runs them, and then sleeps a bit so it's not running too many queries against your database.
This is a simplified version of what I have running:
// Remove time limit
set_time_limit(0);
// Show ALL errors
error_reporting(-1);
// Run daemon
echo "--- Starting Daemon ---\n";
while(true) {
// Query 'work_queue' table for new tasks
// Loop over items and do whatever tasks are associated with them
// Update row to mark task as completed
// Wait a bit
sleep(30);
}
A benefit to this approach is that it's easy to test via CLI:
php tasks.php
You will see all the echo statements come through in console as it's running, and of course it's much easier to do than a more complicated setup with other dependencies like Gearman.
So whenever you add a new task to the table, the maximum amount of time you'll wait for that task to be started in a batch is 30 seconds (or whatever your sleep time is). This is better and preferable to cron jobs, because if you setup a cron job to run every minute (the lowest possible interval) and the work you have to do takes longer than a minute, another cron process will start working on the same queue and you could end up with quite a lot of duplicated task work that could cause a lot of issues that are hard to debug and troubleshoot. So if you instead have either only one background worker that runs all tasks, or multiple background workers that work on different task types, you will never run into this issue.

Resources