CronJob - schedule a job to run every 12 hours starting immediately - unix

Is there a way to define a cron job that will run every 12 hours but will start immediately?
For example, if it's now 14:30 and I start cron, I want the job to run at 15:00, 3:00, 15:00, and so on.
But if I start cron at 16:22, I want it to run at
17:00, 5:00, 17:00, and so on.

Why not just have your entry as:
0 00,12 * * * /script/
This will run every day at midnight and noon. You can then edit the schedule whenever you start or restart cron (although I'm not sure why you'd want a schedule that varies depending on when the cron daemon was started or stopped).
edit:
How about
0 */12 * * * /script
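or, if you also want it to run on boot (assuming a cron that supports the @reboot extension, such as Vixie cron or cronie)
@reboot /script
Note that * */12 * * * /script would not do this: it runs every minute during hours 0 and 12.
You may need to add a sleep of, say, 5 minutes to your script (unless it's ready instantaneously).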
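The "edit the schedule whenever you start cron" idea can be sketched in a few lines of Python. This is only an illustration, assuming you regenerate the crontab line yourself at start-up; the helper name crontab_entry and the /script path are placeholders, not anything cron provides.

```python
from datetime import datetime, timedelta


def crontab_entry(start: datetime, command: str = "/script") -> str:
    """Build a crontab line that fires at the next full hour and 12 hours later.

    Hypothetical helper: a start exactly on the hour keeps that hour,
    anything else is rounded up to the next hour.
    """
    floored = start.replace(minute=0, second=0, microsecond=0)
    first = floored if floored == start else floored + timedelta(hours=1)
    # The two daily run hours are 12 hours apart, wrapping around midnight.
    hours = sorted({first.hour, (first.hour + 12) % 24})
    return f"0 {','.join(map(str, hours))} * * * {command}"


if __name__ == "__main__":
    print(crontab_entry(datetime(2024, 1, 1, 14, 30)))
    print(crontab_entry(datetime(2024, 1, 1, 16, 22)))
```

Starting at 14:30 gives the hours 3 and 15; starting at 16:22 gives 5 and 17. The generated line still has to be installed (for example with crontab) each time cron is restarted.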

Related

Airflow, number of run instances does not match start_date, end_date and interval

I am new to Airflow. I have set the dates as below:
start_date=datetime.datetime(2022,7,30,7,33),
end_date=datetime.datetime(2022,7,30,7,35),
schedule_interval='*/1 * * * *'
If I run the DAG before the start_date, I expect to see 3 runs:
the first run executing after 2022,7,30,7,34
the second run executing after 2022,7,30,7,35
the third run executing after 2022,7,30,7,36
However, my DAG produced 4 runs. The first run happened immediately once I clicked start, even though I clicked start well before the start_date.
Shouldn't Airflow wait until start_date + interval to execute the first run?

How to schedule execution without delay?

I need to launch a DAG on the first day of every month, but I have a problem: the run for 1 October was only executed on 1 November. I need the 1 October run to execute on 1 October and the 1 November run to execute on 1 November, without delaying the execution by one month.
My schedule was: '0 10 1 * *'
Thanks
This is how Airflow works.
Airflow schedules DAG runs at the end of the interval.
So if you have:
from datetime import datetime
from airflow import DAG
DAG(
    dag_id='tutorial',
    schedule_interval='0 10 1 * *',
    start_date=datetime(2021, 10, 1),
)
The first run will start on 2021-11-01, and it will have an execution date of 2021-10-01. This behavior is consistent with how data pipelines work: in November you will want to process October's data. Or, in the terminology mentioned above, your monthly interval starts at the beginning of October and ends at the beginning of November, so at the beginning of November you can run the job that processes October's data.
That said - In the job itself you can process any interval you wish. For that you can use Airflow macros.
Simply put, if you want your first DAG run to start on 2021-10-01, you should set start_date=datetime(2021, 9, 1).
Airflow 2.2.0 brought an enhancement in this area.
With the completion of AIP-39 (Richer Scheduler), Airflow decoupled "when to run" from "what interval to process". You can read about the concept of Timetables in this doc.
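To make the "run at the end of the interval" rule concrete, here is a small sketch in plain Python. It does not use or reproduce Airflow's scheduler; it only models the rule described in this answer for the schedule '0 10 1 * *', and the names monthly_ticks and scheduled_runs are made up for illustration.

```python
from datetime import datetime


def monthly_ticks(start, count, day=1, hour=10):
    """Return the first `count` instants 'day-of-month at hour:00' at or after start."""
    ticks = []
    year, month = start.year, start.month
    while len(ticks) < count:
        tick = datetime(year, month, day, hour)
        if tick >= start:
            ticks.append(tick)
        # Advance one calendar month, rolling over the year after December.
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return ticks


def scheduled_runs(start_date, count):
    """Pair each interval start (execution date) with the time the run actually starts."""
    ticks = monthly_ticks(start_date, count + 1)
    return [(ticks[i], ticks[i + 1]) for i in range(count)]


if __name__ == "__main__":
    for execution_date, started_at in scheduled_runs(datetime(2021, 10, 1), 2):
        print("execution date", execution_date, "-> run starts", started_at)
```

With start_date 2021-10-01 the first pair is an execution date in October with a run that starts in November; moving start_date back to 2021-09-01 makes the first run start in October, which is the workaround suggested above.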

Cron job run at the wrong time in AIX 7.1

I configured my cron job to run on the first Monday of every month at 8:40am, as below
40 08 1-7 * 1 /fs/test/testtime.sh
But it doesn't only run on Monday; it also ran today, which is Tuesday.
Is there anything I'm missing?
From the man page for crontab (my emphasis):
Note: The day of a command's execution can be specified by two fields - day of month, and day of week. If both fields are restricted (i.e., aren't *), the command will be run when either field matches the current time.
For example, 30 4 1,15 * 5 would cause a command to be run at 4:30 am on the 1st and 15th of each month, plus every Friday.
So, in your case, the job runs on every one of the first seven days in each month, plus every Monday.
You can do what you wish by adding an AND condition in the command rather than relying on an OR condition in the time specification, something like:
40 08 1-7 * * test $(date +\%u) -eq 1 && /fs/test/testtime.sh
This will run the actual cron job on all those days (first seven days in each month) but the payload (the script) will only run if the day is Monday.
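The OR rule from the man page and the AND workaround can be modelled in a few lines of Python. This is a simplified sketch of the day-matching logic only (it ignores the minute, hour, and month fields), and the function names are invented for this example.

```python
from datetime import date


def cron_day_matches(d, dom=None, dow=None):
    """Model cron's day matching; None stands for an unrestricted '*' field.

    dom is a set of days of the month, dow a set of cron weekdays (0 = Sunday).
    When both fields are restricted, cron fires if EITHER one matches.
    """
    cron_dow = d.isoweekday() % 7  # isoweekday: Monday=1 .. Sunday=7 -> Sunday=0
    dom_ok = dom is None or d.day in dom
    dow_ok = dow is None or cron_dow in dow
    if dom is not None and dow is not None:
        return dom_ok or dow_ok
    return dom_ok and dow_ok


def first_monday_only(d):
    """Model '40 08 1-7 * *' plus the in-command test that the weekday is Monday."""
    return cron_day_matches(d, dom=set(range(1, 8))) and d.isoweekday() == 1


if __name__ == "__main__":
    first_week = set(range(1, 8))
    tuesday_2nd = date(2024, 1, 2)
    print(cron_day_matches(tuesday_2nd, dom=first_week, dow={1}))
    print(first_monday_only(tuesday_2nd))
```

For Tuesday 2024-01-02 the original entry matches (the day of month is in 1-7), while the guarded version does not; only Monday 2024-01-01 passes both checks in that month.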

How to consider daylight savings time when using cron schedule in Airflow

In Airflow, I'd like a job to run at specific time each day in a non-UTC timezone. How can I go about scheduling this?
The problem is that once daylight savings time is triggered, my job will either be running an hour too soon or an hour too late. In the Airflow docs, it seems like this is a known issue:
In case you set a cron schedule, Airflow assumes you will always want
to run at the exact same time. It will then ignore day light savings
time. Thus, if you have a schedule that says run at end of interval
every day at 08:00 GMT+1 it will always run end of interval 08:00
GMT+1, regardless if day light savings time is in place.
Has anyone else run into this issue? Is there a work around? Surely the best practice cannot be to alter all the scheduled times after Daylight Savings Time occurs?
Thanks.
Starting with Airflow 1.10, time-zone aware DAGs can be defined using time-zone aware datetime objects to specify start_date. For Airflow to schedule DAG runs always at the same time (regardless of a possible daylight-saving-time switch), use cron expressions to specify schedule_interval. To make Airflow schedule DAG runs with fixed intervals (regardless of a possible daylight-saving-time switch), use datetime.timedelta() to specify schedule_interval.
For example, consider the following code that, first, uses a cron expression to schedule two consecutive DAG runs, and then uses a fixed interval to do the same.
import pendulum
from airflow import DAG
from datetime import datetime, timedelta
START_DATE = datetime(
    year=2019,
    month=10,
    day=25,
    hour=8,
    minute=0,
    tzinfo=pendulum.timezone('Europe/Kiev'),
)
def gen_execution_dates(start_date, schedule_interval):
    dag = DAG(
        dag_id='id', start_date=start_date, schedule_interval=schedule_interval
    )
    execution_date = dag.start_date
    for i in range(1, 3):
        execution_date = dag.following_schedule(execution_date)
        print(
            f'[Run {i}: Execution Date for "{schedule_interval}"]:',
            dag.timezone.convert(execution_date),
        )
gen_execution_dates(START_DATE, '0 8 * * *')
gen_execution_dates(START_DATE, timedelta(days=1))
Running the code produces the following output:
[Run 1: Execution Date for "0 8 * * *"]: 2019-10-26 08:00:00+03:00
[Run 2: Execution Date for "0 8 * * *"]: 2019-10-27 08:00:00+02:00
[Run 1: Execution Date for "1 day, 0:00:00"]: 2019-10-26 08:00:00+03:00
[Run 2: Execution Date for "1 day, 0:00:00"]: 2019-10-27 07:00:00+02:00
For the zone [Europe/Kiev], the daylight saving time of 2019 ends on 2019-10-27 at 03:00:00+03:00. That is, between Run 1 and Run 2 in our example.
The first two output lines show that for the DAG runs scheduled with a cron expression the first run and second run are both scheduled for 08:00 (although, in different timezones: Eastern European Summer Time (EEST) and Eastern European Time (EET) respectively).
The last two output lines show that for the DAG runs scheduled with a fixed interval the first run is scheduled for 08:00 (EEST), and the second run is scheduled exactly 1 day (24 hours) later, which is at 07:00 (EET) due to the daylight-saving-time switch.
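The same distinction can be reproduced without Airflow, using only the standard library. This is a rough sketch, assuming Python 3.9+ with the zoneinfo module and a system time zone database that still carries the Europe/Kiev name; the two helper functions are hypothetical and only mimic the two scheduling styles.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

KIEV = ZoneInfo("Europe/Kiev")


def next_wall_clock(run):
    """Same local time on the next day (what a daily cron expression aims for)."""
    # Adding a timedelta to an aware datetime keeps the wall-clock time;
    # the UTC offset is re-derived from the zone for the new date.
    return run + timedelta(days=1)


def next_fixed_24h(run):
    """Exactly 24 elapsed hours later (what a timedelta interval means)."""
    in_utc = run.astimezone(timezone.utc) + timedelta(hours=24)
    return in_utc.astimezone(KIEV)


if __name__ == "__main__":
    run_1 = datetime(2019, 10, 26, 8, 0, tzinfo=KIEV)
    print("wall clock:", next_wall_clock(run_1).isoformat())
    print("fixed 24h: ", next_fixed_24h(run_1).isoformat())
```

Across the 2019-10-27 switch, the wall-clock version stays at 08:00 local time, while the fixed 24-hour version lands at 07:00 local time, matching the output shown above.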
The following figure illustrates the example:

Cron job setting, explanation

For a cron job setting, I defined a job as below:
* */12 * *
I had assumed this means the cron job will run every 12 hours, but unfortunately I found that it's running more frequently. I'm not sure how frequently, but the output was larger than expected.
Can anyone explain it simply? I went through different docs, but it seems there are several ways to set a single cron job.
By the way, I updated my cron job to run every 12 hours, like below:
* * 12 * * ?
Thanks in advance.
This line should run at 12AM and 12PM:
0 0,12 * * * /path/to/command
* */12 * * means run every minute during hours 0 and 12. If you want it to run only once in each of those hours, put a number in place of the first star (*), such as
20 */12 * *
By the way, there should be 5 time fields in the cron entry instead of 4, so it should actually be
20 */12 * * *
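To see why the leading star matters, the minute and hour fields can be expanded by hand. Below is a minimal sketch in Python that handles only a subset of cron syntax (*, */n, a-b, comma lists, and plain numbers); expand_field and runs_per_day are illustrative names, not part of any cron tool.

```python
def expand_field(field, lo, hi):
    """Expand one cron field into the sorted list of values it matches."""
    values = set()
    for part in field.split(","):
        rng, _, step = part.partition("/")
        step = int(step) if step else 1
        if rng == "*":
            start, end = lo, hi
        elif "-" in rng:
            a, b = rng.split("-")
            start, end = int(a), int(b)
        else:
            start = end = int(rng)
        values.update(range(start, end + 1, step))
    return sorted(values)


def runs_per_day(entry):
    """Count daily runs from the minute and hour fields of a crontab entry."""
    minute, hour = entry.split()[:2]
    return len(expand_field(minute, 0, 59)) * len(expand_field(hour, 0, 23))


if __name__ == "__main__":
    for entry in ("* */12 * * *", "20 */12 * * *"):
        print(entry, "->", runs_per_day(entry), "runs per day")
```

With a star in the minute field the job matches all 60 minutes of hours 0 and 12, so it runs 120 times a day; with 20 in the minute field it runs twice a day.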
I was searching for the same thing and found this site very useful.
It will explain your cron job.
Look at the examples below:
*/15 * * * * cd ~/ecg;
Explanation: The command cd ~/ecg; will execute every 15 minutes of every hour on every day of every month.
15 09 07 03 * php abc.php
Explanation: The command php abc.php will execute at 9:15am on the 7th of March.
