Make scheduler run in only one instance of multiple micro-service


I have built a micro-service with an API called deleteToken. When invoked, this API is supposed to change the status of the DB tuple corresponding to the token (identified by token id) to "MARK_DELETE". Once a tuple has had the status "MARK_DELETE" for 30 days, a REST call should be made to a downstream service API called deleteTokenFromPartner. There is no mandate that the call to deleteTokenFromPartner be made exactly at 30 days; it can also happen a few hours later.
So my plan was to write a scheduler (using Quartz or a Java ExecutorService) with a period such that it runs once a day. It will query the DB for all rows that have status="MARK_DELETE" and whose status update is older than 30 days, and then call deleteTokenFromPartner for each of those rows in turn. There is a single DB, which is highly available, and consistency should not be an issue because we delete only after 30 days.
The problem I see is that this micro-service has N instances, so every instance will query the DB, get the same set of rows, and make the same calls for the same rows. Is there any tweak I can make so that these duplicate calls are avoided? FYI, we don't make any config changes based on hostnames, and it would also be fine if only one instance were capable of running the scheduler.
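The selection rule described in the question (status MARK_DELETE and a status update at least 30 days old) can be sketched as a small predicate. This is only an illustration: the class name, the method name, and the statusUpdatedAt parameter are hypothetical, and the sketch does not by itself address the duplicate calls across instances.

```java
import java.time.Duration;
import java.time.Instant;

public class DeletionEligibility {
    // Retention window taken from the question: 30 days after the status change.
    public static final Duration RETENTION = Duration.ofDays(30);

    // Hypothetical helper: true when a row is marked for deletion and its
    // status update happened at least 30 days before "now".
    public static boolean isDue(String status, Instant statusUpdatedAt, Instant now) {
        return "MARK_DELETE".equals(status)
                && !statusUpdatedAt.plus(RETENTION).isAfter(now);
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2024-03-01T00:00:00Z");
        // Marked 60 days ago: due for the partner call.
        System.out.println(isDue("MARK_DELETE", Instant.parse("2024-01-01T00:00:00Z"), now));
        // Marked 10 days ago: not due yet.
        System.out.println(isDue("MARK_DELETE", Instant.parse("2024-02-20T00:00:00Z"), now));
    }
}
```

Passing the current time in as a parameter keeps the rule easy to test regardless of which instance ends up running the scheduler.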

Related

Correct Approach For Airflow DAG Project

I am trying to see if Airflow is the right tool for some functionality I need for my project. We are trying to use it as a scheduler for running a sequence of jobs that start at a particular time (or possibly on demand).
1. The first "task" is to query the database for the list of job ids to sequence through.
2. For each job in the sequence, send a REST request to start the job.
3. Wait until the job completes or fails (via REST call or DB query).
4. Go to the next job in the sequence.
I am looking for recommendations on how to break down the functionality discussed above into an Airflow DAG. So far my approach would be to:
create a Hook for the database and another for the REST server
create a custom operator that handles the start and monitoring of the "job" (steps 2 and 3)
use a sensor to handle polling while waiting for the job to complete
Thanks

JobRunr - Trying to run multiple recurring jobs with Spring Boot

I am using JobRunr to run my background jobs, and I let users set up recurring jobs through an endpoint like the one below:
@PostMapping("/schedule-recurring")
public String scheduleRecurring(@RequestBody ExecutionJob executionJob) {
    // Schedule a recurring job under the given id with the user-supplied cron expression.
    return BackgroundJob.scheduleRecurrently(executionJob.getId(), executionJob.getCronExpression(),
            () -> jobService.executeSomeJob(executionJob, JobContext.Null));
}
These jobs could run for 5 minutes, 10 minutes, or sometimes up to 4 hours, depending on how many records there are to process. Right now I have only one background job server, since I am building a POC. In the future, however, we plan to scale to one instance per customer, with one or more background job servers depending on the client license.
My issue is that I have 2 recurring jobs scheduled 1 hour apart. The first recurring job runs for more than 1 hour, and the second job is then left unexecuted: it is not triggered because no background job worker is available to pick it up. I am thinking of adding a check so that the job triggers itself only if a background job worker is available. But is there a better approach, where the schedule-recurring method itself adds a condition to queue the job when no background job worker is available?
Thanks in advance.

Google Cloud Scheduler Run at Set Times Every Minute

I am trying to call an API every minute for ski lift status and check for changes. I am going to store whether each lift is open or closed in Firebase (Realtime Database), compare it with the value from the API, and only update/write that node when the value is different. Then I can set up a cloud function that listens for database changes and sends push notifications to the list of FCM tokens for that channel. I am not sure if this is the most efficient way, but I was going to set up scheduled functions to call the third-party API.
I have been using these docs:
https://firebase.google.com/docs/functions/schedule-functions
I was planning to do something like this:
exports.scheduledFunction = functions.pubsub.schedule('every 5 minutes').onRun((context) => {
    // Call my API in here and update the database if the snapshot that comes back is different.
    return null;
});
I was wondering how I would run it only between set times, say 8am-6pm EST. I am struggling to find anything about restricting the times it runs. Should I just run the function every minute and then pause and resume by checking the time? In that case, how does it know to keep checking the time while it is paused?

Firebase scheduled functions use Cloud Scheduler to implement the schedule. It accepts cron style time specifiers to indicate when a job should be run. The full spec for that can be found here. You will have to use ranges of numbers to indicate the valid times and frequency of the schedule. For example, you might use "8-18" in the hour field to limit the hours of execution.
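For reference, cron ranges are inclusive on both ends, so an hour field of "8-18" matches every hour from 08:00 through 18:59. The sketch below models only that hour-field check in plain Java. It is not Cloud Scheduler or Firebase code, and the class and method names are made up for illustration. In the function itself the change would be to the schedule string, for example something like '* 8-18 * * *', interpreted in whatever time zone the schedule is configured with.

```java
public class CronHourRange {
    // Hypothetical helper: checks an hour (0-23) against a cron-style
    // "start-end" hour range, which is inclusive on both ends.
    public static boolean hourMatches(String range, int hour) {
        String[] parts = range.split("-");
        int start = Integer.parseInt(parts[0].trim());
        int end = Integer.parseInt(parts[1].trim());
        return hour >= start && hour <= end;
    }

    public static void main(String[] args) {
        // Print whether each sample hour falls inside the "8-18" range.
        for (int hour : new int[] {7, 8, 17, 18, 19}) {
            System.out.println(hour + " -> " + hourMatches("8-18", hour));
        }
    }
}
```

Because 18 is included, a job on "8-18" still fires during the 18:00 hour; to stop at 6pm, "8-17" would be the range to use.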

Spring Kafka batch within time window

A Spring Boot environment listening to Kafka topics (@KafkaListener / @StreamListener).
I configured the listener factory to operate in batch mode:
ConcurrentKafkaListenerContainerFactory#setBatchListener
or via application.properties:
spring.kafka.listener.type=batch
How can I configure the framework so that, given two numbers N and T, it will try to fetch N records for the listener but won't wait more than T seconds, as described here: https://doc.akka.io/docs/akka/2.5/stream/operators/Source-or-Flow/groupedWithin.html
Some properties I've looked at:
max-poll-records: ensures you won't get more than N records in a batch
fetch-min-size: get at least this amount of data in a fetch request
fetch-max-wait: but don't wait longer than necessary
idleBetweenPolls: just sleep a bit between polls :)
It seems like fetch-min-size combined with fetch-max-wait should do it, but they compare bytes, not messages/records.
It is obviously possible to implement that by hand; I'm looking at whether it's possible to configure Spring to do that for me.

"It seems like fetch-min-size combined with fetch-max-wait should do it but they compare bytes, not messages/records."
That is correct; unfortunately, Kafka provides no mechanism such as fetch.min.records.
I don't anticipate that Spring would layer this functionality on top of the kafka-clients; it would be better to ask for a new feature in Kafka itself.
Spring does not manipulate the records returned from the poll at all, except that you can now specify subBatchPerPartition to get batches containing just one partition, in order to properly support zombie fencing when using exactly-once read/process/write.
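For the "by hand" route mentioned in the question, one possible shape is sketched below. It assumes the listener hands individual records to an in-memory queue and a separate consumer drains that queue in groups. It uses only java.util.concurrent and is not part of Spring Kafka or kafka-clients; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class GroupedWithin {
    // Collects up to maxRecords items from the queue, but returns early
    // (possibly with fewer items, or none) once maxWaitMillis has elapsed.
    public static <T> List<T> nextBatch(BlockingQueue<T> queue, int maxRecords, long maxWaitMillis)
            throws InterruptedException {
        List<T> batch = new ArrayList<>();
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(maxWaitMillis);
        while (batch.size() < maxRecords) {
            long remaining = deadline - System.nanoTime();
            if (remaining <= 0) {
                break; // the time window T is over
            }
            T item = queue.poll(remaining, TimeUnit.NANOSECONDS);
            if (item == null) {
                break; // timed out waiting for the next record
            }
            batch.add(item);
        }
        return batch;
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        for (int i = 0; i < 5; i++) {
            queue.add("record-" + i);
        }
        System.out.println(nextBatch(queue, 3, 200)); // N = 3 reached immediately
        System.out.println(nextBatch(queue, 3, 200)); // only 2 records left, returns once T expires
    }
}
```

Note that buffering records outside the listener changes when offsets can safely be committed, so this sketch covers only the grouping logic.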

How to prevent a Hangfire recurring job from restarting after 30 minutes of continuous execution

I am working on an ASP.NET MVC 5 web application, and I am facing a problem using Hangfire to run long-running background jobs. The problem is that if a job's execution exceeds 30 minutes, Hangfire automatically initiates another job, so I end up with two similar jobs running at the same time.
I have the following setup:
Asp.net mvc-5
IIS-8
Hangfire 1.4.6
Windows server 2012
I have defined a Hangfire recurring job to run at 17:00 each day. The background job mainly scans our network for servers and VMs and updates the DB, and the recurring job sends an email after completing its execution.
The recurring job used to work well when its execution took less than 30 minutes. But today, as our system has grown, the recurring job completed after 40 minutes instead of the 22-25 minutes it used to take, and I received 2 emails instead of one (the time between the emails was around 30 minutes). I re-ran the job manually and noted that the problem is as follows:
"When the recurring job reaches 30 minutes of continuous execution, a new instance of the recurring job starts, so I have two instances running at the same time instead of one, which is why I received 2 emails."
If the recurring job takes less than 30 minutes (for example 29 minutes), I do not face any problem, but if its execution exceeds 30 minutes then, for one reason or another, Hangfire initiates a new job.
Although the Hangfire dashboard shows only one active job during the execution, when I monitor our DB with SQL Profiler I can see two jobs accessing the DB. This happens 30 minutes after the start of the recurring job (at 17:30 in our case), which is why I received 2 emails: 2 recurring jobs were running in the background instead of one.
So can anyone advise on this, please: how can I prevent Hangfire from automatically initiating a new recurring job if the current recurring job's execution exceeds 30 minutes?
Thanks

Did you look at the InvisibilityTimeout setting from the Hangfire docs?
Default SQL Server job storage implementation uses a regular table as
a job queue. To be sure that a job will not be lost in case of
unexpected process termination, it is deleted from a queue only
upon a successful completion.
To make it invisible from other workers, the UPDATE statement with
OUTPUT clause is used to fetch a queued job and update the FetchedAt
value (that signals for other workers that it was fetched) in an
atomic way. Other workers see the fetched timestamp and ignore a job.
But to handle the process termination, they will ignore a job only
during a specified amount of time (defaults to 30 minutes).
Although this mechanism ensures that every job will be processed,
sometimes it may cause either long retry latency or lead to multiple
job execution. Consider the following scenario:
Worker A fetched a job (runs for an hour) and started it at 12:00.
Worker B fetched the same job at 12:30, because the default invisibility timeout had expired.
Worker C did not fetch the same job at 13:00, because it will be deleted after successful performance.
If you are using cancellation tokens, it will be set for Worker A at
12:30, and at 13:00 for Worker B. This may lead to the fact that your
long-running job will never be executed. If you aren’t using
cancellation tokens, it will be concurrently executed by WorkerA and
Worker B (since 12:30), but Worker C will not fetch it, because it
will be deleted after successful performance.
So, if you have long-running jobs, it is better to configure the
invisibility timeout interval:
var options = new SqlServerStorageOptions
{
    InvisibilityTimeout = TimeSpan.FromMinutes(30) // default value
};
GlobalConfiguration.Configuration.UseSqlServerStorage("<name or connection string>", options);
As of Hangfire 1.5 this option is now Obsolete. Jobs that are being worked on are invisible to other workers.
Say goodbye to confusing invisibility timeout with unexpected
background job retries after 30 minutes (by default) when using SQL
Server. New Hangfire.SqlServer implementation uses plain old
transactions to fetch background jobs and hide them from other
workers.
Even after ungraceful shutdown, the job will be available for other
workers instantly, without any delays.
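The Worker A / Worker B scenario from the quoted documentation can be illustrated with a toy model. This is not Hangfire's implementation (which, per the quote, does the check and the FetchedAt update in one atomic SQL UPDATE); it is a plain Java sketch with hypothetical names that only reproduces the visibility rule: a job can be fetched if nobody has fetched it yet, or if the last fetch is older than the invisibility timeout.

```java
import java.time.Duration;
import java.time.Instant;

public class InvisibilityTimeoutModel {
    // Simplified stand-in for a queue row; fetchedAt stays null until a worker takes the job.
    public static final class QueuedJob {
        public Instant fetchedAt;
    }

    // A worker may take the job if it was never fetched, or if the previous
    // fetch is at least invisibilityTimeout old. On success the timestamp is refreshed.
    public static boolean tryFetch(QueuedJob job, Instant now, Duration invisibilityTimeout) {
        boolean visible = job.fetchedAt == null
                || !job.fetchedAt.plus(invisibilityTimeout).isAfter(now);
        if (visible) {
            job.fetchedAt = now;
        }
        return visible;
    }

    public static void main(String[] args) {
        Instant noon = Instant.parse("2024-01-01T12:00:00Z");
        Duration timeout = Duration.ofMinutes(30);
        QueuedJob job = new QueuedJob();

        // Worker A takes the job at 12:00 and keeps running for an hour.
        System.out.println("A at 12:00: " + tryFetch(job, noon, timeout));
        // Another worker at 12:15 still sees the job as taken.
        System.out.println("B at 12:15: " + tryFetch(job, noon.plus(Duration.ofMinutes(15)), timeout));
        // At 12:30 the timeout has expired, so Worker B takes the same job.
        System.out.println("B at 12:30: " + tryFetch(job, noon.plus(Duration.ofMinutes(30)), timeout));
    }
}
```

With a timeout longer than the job's running time (for example two hours for a one-hour job), the 12:30 fetch in this model would be refused, which is the effect of raising InvisibilityTimeout.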

I was having trouble finding documentation on how to do this properly for a PostgreSQL database; every example I saw used SQL Server. I found that the invisibility timeout is a property of the PostgreSqlStorageOptions object, here: https://github.com/frankhommers/Hangfire.PostgreSql/blob/master/src/Hangfire.PostgreSql/PostgreSqlStorageOptions.cs#L36. Luckily, through trial and error, I was able to figure out that UsePostgreSqlStorage has an overload that accepts this object. For .NET Core 2.0, when you are setting up the Hangfire PostgreSQL DB in the ConfigureServices method of the Startup class, add this (the default timeout is 30 minutes):
services.AddHangfire(config =>
    config.UsePostgreSqlStorage(Configuration.GetConnectionString("Hangfire1ConnectionString"), new PostgreSqlStorageOptions {
        InvisibilityTimeout = TimeSpan.FromMinutes(720)
    }));

I had this problem when using Hangfire.MemoryStorage as the storage provider. With memory storage you need to set FetchNextJobTimeout in MemoryStorageOptions; otherwise, by default, jobs will time out after 30 minutes and a new job will be executed.
var options = new MemoryStorageOptions
{
    FetchNextJobTimeout = TimeSpan.FromDays(1)
};
GlobalConfiguration.Configuration.UseMemoryStorage(options);

I would just like to point out that even though the following is stated:
As of Hangfire 1.5 this option is now Obsolete. Jobs that are being worked on are invisible to other workers.
Say goodbye to confusing invisibility timeout with unexpected background job retries after 30 minutes (by default) when using SQL Server. New Hangfire.SqlServer implementation uses plain old transactions to fetch background jobs and hide them from other workers.
Even after ungraceful shutdown, the job will be available for other workers instantly, without any delays.
It seems that for many people using MySQL, PostgreSQL, MongoDB, InvisibilityTimeout is still the way to go: https://github.com/HangfireIO/Hangfire/issues/1197
