Quartz.NET Manually invoked recurring job - asp.net

I have an ASP.NET Core application and I'm using Quartz.NET for a recurring job which runs my TestMethod() every 60 seconds, and it works fine.
Sometimes I need to run this job manually. How can I force Quartz.NET to make the next recurring call 60 seconds after the manual run?
This is what I need to accomplish:
00:01:00 -> Automatic run of TestMethod();
00:02:00 -> Automatic run of TestMethod();
00:02:10 -> Manual run of TestMethod();
00:03:10 -> Automatic run of TestMethod(); (Note: 60 seconds after last run)
...
Maybe it is possible with the Hangfire library?

I guess you can't do that using Quartz or Hangfire. Schedule-based jobs are usually not affected by manual runs. You may also notice that schedule-based jobs do not take execution time into account, so it may lead to a situation where you have one job still in progress (from the previous run) and another one started by the schedule.
Instead of using simple recurring jobs you can use the following pattern:
Schedule a single execution of your job.
Re-schedule the completed job at the end of its execution.
This will help you avoid the situation described above, and it will also let you schedule the next run after a manual trigger; a sketch follows.
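A minimal sketch of that pattern (this assumes Quartz.NET 3.x; the "testJob"/"testTrigger" keys and the TestMethod body are placeholders, not part of the original answer):

using System;
using System.Threading.Tasks;
using Quartz;

public class TestJob : IJob
{
    public async Task Execute(IJobExecutionContext context)
    {
        await TestMethod();

        // Re-schedule: fire this same job once, 60 seconds from now.
        // A fixed trigger key lets a manual run find and cancel the
        // pending trigger before it fires.
        var nextRun = TriggerBuilder.Create()
            .WithIdentity("testTrigger")
            .ForJob(context.JobDetail)
            .StartAt(DateTimeOffset.UtcNow.AddSeconds(60))
            .Build();
        await context.Scheduler.ScheduleJob(nextRun);
    }

    private Task TestMethod() => Task.CompletedTask; // your actual work here
}

At startup you would store the job durably under a known key and schedule the first one-shot trigger. For a manual run, cancel the pending trigger and fire the job directly; because Execute() re-schedules at the end, the next automatic run then lands 60 seconds after the manual one:

// manual run, e.g. from a controller action
await scheduler.UnscheduleJob(new TriggerKey("testTrigger"));
await scheduler.TriggerJob(new JobKey("testJob"));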

Related

Airflow runs manual and scheduled Dag even though max_active_runs_per_dag=1

Good Day,
we use Airflow to orchestrate runs of our jobs.
The Job in question is usually scheduled for 2:30 and takes quite some time.
Due to a new data source it was expected to run a full day.
Since our jobs don't run in parallel, we set max_active_runs_per_dag to 1 to ensure that there are no multiple instances of the same job, even when it takes more than 24h. In general this seems to work, but not in this case.
What happened:
We triggered a manual run at 13:00
At 2:30 (the next day) the scheduled run was triggered and ran simultaneously
Expectation:
The scheduled run should wait for the manual run to finish
More Information:
The Airflow instance did not restart.
Airflow version 1.10.2
I thank you for any advice.
This seems to be an open issue which will be fixed in 2.1 and 1.15.
No workaround has been provided yet.
https://github.com/apache/airflow/issues/9975

How to launch a Dataflow job with Apache Airflow and not block other tasks?

Problem
Airflow tasks of the type DataflowTemplateOperator take a long time to complete. This means other tasks can be blocked by them (correct?).
When we run more of these tasks, we would need a bigger Cloud Composer cluster (in our case) to execute tasks that are essentially blocking while they shouldn't be (they should be async operations).
Options
Option 1: just launch the job and the Airflow task is immediately successful
Option 2: write a wrapper as explained here and use a reschedule mode as explained here
Option 1 does not seem feasible as the DataflowTemplateOperator only has an option to specify the wait time between completion checks called poll_sleep (source).
For the DataflowCreateJavaJobOperator there is an option check_if_running to wait for completion of a previous job with the same name (see this code)
It seems that after launching a job, the wait_for_finish is executed (see this line), which boils down to an "incomplete" job (see this line).
For Option 2, I need Option 1.
Questions
Am I correct to assume that Dataflow tasks will block others in Cloud Composer/Airflow?
Is there a way to schedule a job without a "wait to finish" using the built-in operators? (I might have overlooked something)
Is there an easy way to write this myself? I'm thinking of just executing a bash launch script, followed by a task that checks whether the job finished correctly, but in a reschedule mode.
Is there another way to avoid blocking other tasks while running dataflow jobs? Basically this is an async operation and should not take resources.
Answers
Am I correct to assume that Dataflow tasks will block others in Cloud Composer/Airflow?
A: Partly yes. Airflow has a parallelism option in the configuration which defines the number of tasks that can execute at a time across the system. Having a task block one of these slots can slow down execution in the system, but this issue is bound to happen as you increase the number of tasks and DAGs. You can increase this value in the configuration depending on your needs.
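For example, in airflow.cfg (the value shown is illustrative, not a recommendation):

[core]
# maximum number of task instances allowed to run concurrently
# across the entire Airflow installation
parallelism = 64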
Is there a way to schedule a job without a "wait to finish" using the built-in operators? (I might have overlooked something)
A: Yes. You can use a PythonOperator, and in the python_callable you can use the Dataflow hook to launch the job in async mode (launch and don't wait), as sketched below.
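A minimal sketch of that approach, assuming Airflow 2.x import paths and the google-api-python-client package. It calls the Dataflow templates.launch REST endpoint directly rather than a specific hook method, since hook signatures vary across provider versions; the project, region, bucket and task names are placeholders:

from airflow.operators.python import PythonOperator
from googleapiclient.discovery import build

def launch_dataflow_template(**context):
    """Launch a Dataflow template and return without waiting for it."""
    service = build("dataflow", "v1b3", cache_discovery=False)
    response = (
        service.projects()
        .locations()
        .templates()
        .launch(
            projectId="my-project",                          # placeholder
            location="europe-west1",                         # placeholder
            gcsPath="gs://my-bucket/templates/my-template",  # placeholder
            body={"jobName": f"my-job-{context['ds_nodash']}"},
        )
        .execute()
    )
    # templates.launch returns as soon as the job is created; push the
    # job id to XCom so a downstream task can check on it later.
    return response["job"]["id"]

# inside your DAG definition
launch_job = PythonOperator(
    task_id="launch_dataflow_job",
    python_callable=launch_dataflow_template,
)

The task finishes as soon as the API call returns, so it holds a worker slot only for a few seconds.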
Is there an easy way to write this myself? I'm thinking of just executing a bash launch script, followed by a task that checks whether the job finished correctly, but in a reschedule mode.
A: When you say reschedule, I'm assuming that you are going to retry the task that checks if the job finished correctly. If I'm right, you can set the task to retry and choose the delay at which you want the retries to happen, as sketched below.
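A sketch of that retry-based check, reusing the placeholders above and pulling the job id from the launch task's XCom; the retry count and delay are arbitrary:

from datetime import timedelta
from airflow.operators.python import PythonOperator
from googleapiclient.discovery import build

def check_dataflow_job(job_id, **context):
    """Raise while the job is still running so Airflow retries later."""
    service = build("dataflow", "v1b3", cache_discovery=False)
    job = (
        service.projects()
        .locations()
        .jobs()
        .get(projectId="my-project", location="europe-west1", jobId=job_id)
        .execute()
    )
    state = job["currentState"]
    if state == "JOB_STATE_DONE":
        return  # success: let downstream tasks proceed
    if state in ("JOB_STATE_FAILED", "JOB_STATE_CANCELLED"):
        raise RuntimeError(f"Dataflow job ended in state {state}")
    raise RuntimeError(f"Dataflow job still in state {state}")

check_job = PythonOperator(
    task_id="check_dataflow_job",
    python_callable=check_dataflow_job,
    op_kwargs={"job_id": "{{ ti.xcom_pull(task_ids='launch_dataflow_job') }}"},
    retries=20,                        # up to 20 checks
    retry_delay=timedelta(minutes=5),  # 5 minutes apart
)

Between retries the task instance is not occupying a worker slot, which also addresses the fourth question.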
Is there another way to avoid blocking other tasks while running dataflow jobs? Basically this is an async operation and should not take resources.
A: I think I answered this in the second question.

Autosys: Can a job run multiple instances at the same time?

I am trying to understand Autosys jobs. Suppose I have Job A that runs every 15 minutes. If, for some reason, Job A takes more than 15 minutes, will another instance of it run, or will it wait for the job to finish before running another instance?
In my experience, if the previous run is still in progress, another instance will not start when the next scheduled time comes. The job next runs once the previous run has finished and the next scheduled time arrives.
Another user also experienced this according to this answer.
I did not find any AutoSys documentation that officially confirms what happens in this situation, but I guess the best way to find out is to test it on your AutoSys instance.
I have experienced this first-hand and can confirm that there won't be two instances in the mentioned scenario. The job will wait for the previous run to complete and will immediately kick off the next instance if the time condition was met before the previous one completed.
But this is the case only when the job is in the running state; if the job is in any other state, it will kick off based on the given start_time condition.

Oozie job taking longer than scheduled interval

I am scheduling an Oozie MapReduce job to run every 15 minutes. I wonder what would happen if a job takes longer than that interval? Will it result in a job backlog? Or will Oozie create a new task / thread / fork for the new job while the previous one is still running?
Oozie won't run the next job before the previous one is over. If the first job takes more than 15 minutes to execute, then the next one will run later than its scheduled time. So scheduled time and actual run time may differ in Oozie.
EDIT:
Anyway, the behaviour described above is only the default and can be changed. You can set the concurrency property in the controls block to more than 1, and the next job will run even while the first one is still running. Check my answer on a similar question. A sketch of such a coordinator definition follows.
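For reference, a sketch of the controls block in an Oozie coordinator definition; the names, dates and concurrency value here are illustrative:

<coordinator-app name="my-coord" frequency="${coord:minutes(15)}"
                 start="2021-01-01T00:00Z" end="2021-12-31T00:00Z"
                 timezone="UTC" xmlns="uri:oozie:coordinator:0.4">
  <controls>
    <!-- allow up to 2 materialized actions to run at the same time -->
    <concurrency>2</concurrency>
  </controls>
  <action>
    <workflow>
      <app-path>${workflowAppPath}</app-path>
    </workflow>
  </action>
</coordinator-app>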

How to create a wait job in Informatica

My requirement is to create a job in Informatica which will run every 15 minutes and look at a status column in the abc table. If it is "Approved", it will exit and kick off the rest of the jobs.
If the status is not approved, it will do nothing and run again after 15 minutes. This process will continue until we have an approved status.
So, no matter which of the two scenarios above happens, this process will run every 15 minutes.
I have implemented the same requirement in Unix using loops and conditional statements, but I am not sure how this can be achieved using Informatica. Could you please help me with this.
Regards,
Karthik
I would try adding a scheduler that runs every 15 minutes. The best way that I've found to "loop" sessions in Informatica is:
run the session once, check if it failed using conditional links
if it did fail, run a timer task for an amount of time (a minute, an hour, whatever)
then try to run the same session again by copying and pasting it after the timer task, and repeat a few times as necessary.
So if you added a scheduler into the mix, you could set it to run the workflow every 15 minutes, and have the timer tasks halt the workflow for 4 or 5 minutes each. Then you could use the SESSSTARTTIME function in some pre-/post-session task to determine when the scheduler will fire next, and simply abort the workflow before that time.
