SubDagOperator success is based on the success state of the tasks within it. Since migrating from Airflow 1.10 to Airflow 2, we have seen two instances of abnormal behaviour.
a) Scenario 1:
The total run duration shown for the SubDagOperator is not the sum of the time taken by all the underlying tasks.
-- The underlying tasks took 1 hour to complete according to the Gantt chart, but hovering over the SubDagOperator shows 10 minutes.
b) Scenario 2:
DAG: scheduled to run at 12:50 am UTC on Sunday
Task1 (BigQueryOperator) --> Task2 (SubDagOperator) --> Task3 (BigQueryOperator)
Task1 >> Task2 >> Task3
Task2 has 3 tasks within it running in parallel, all of them using BigQueryOperator:
Subtask1
Subtask2
Subtask3
Also, Task2.Subtask1 uses one of the tables from Task1 as input.
In the ideal scenario, Task1 completes before Task2 or any of its underlying subtasks is triggered.
Contrary to that, the logs of Task2.Subtask1 show that the task was triggered 2 minutes before the table in Task1 was created, so Task2.Subtask1 eventually failed with an "input table not found" error.
Are the tasks within a SubDagOperator treated differently in Airflow 2?
Note: We have 2 schedulers in Airflow 2. Could this be related to a communication gap between the schedulers?
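One way to confirm the ordering problem in Scenario 2 is to compare the start and end timestamps of the two tasks directly. This is a minimal sketch, assuming the timestamps are copied by hand from the task logs. The TaskRun record and the sample times are hypothetical and are not taken from the actual DAG.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class TaskRun:
    """Hypothetical record of one task instance, filled in from the logs."""
    task_id: str
    start: datetime
    end: datetime


def started_too_early(upstream: TaskRun, downstream: TaskRun) -> bool:
    """Return True if the downstream task started before the upstream one finished."""
    return downstream.start < upstream.end


# Illustrative values only: Subtask1 starts 2 minutes before Task1 ends.
task1 = TaskRun("Task1", datetime(2022, 1, 2, 0, 50), datetime(2022, 1, 2, 0, 58))
subtask1 = TaskRun("Task2.Subtask1", datetime(2022, 1, 2, 0, 56), datetime(2022, 1, 2, 0, 57))

print(started_too_early(task1, subtask1))
```

If this check is true for real log timestamps, the subtask did not wait for its upstream task, which points at scheduling rather than at BigQuery.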
Related
I am trying to call a procedure at 11 pm every day using a MySQL event, but it is not firing even though the event scheduler is ON. Does anyone have a solution? I am using the following syntax.
CREATE EVENT deleteandprocess
ON SCHEDULE
EVERY 1 DAY
STARTS (TIMESTAMP(CURRENT_DATE) + INTERVAL 23 HOUR)
DO
CALL deleteUnprocessCartItems();
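The STARTS expression above evaluates to 23:00 on the day the event is created. The same arithmetic can be sketched in Python to check when the first run would be expected. The helper first_run_at is hypothetical, and rolling over to the next day once 23:00 has passed is an assumption about the desired behaviour.

```python
from datetime import datetime, time, timedelta


def first_run_at(now: datetime, hour: int = 23) -> datetime:
    """Return the next occurrence of `hour`:00, today or tomorrow."""
    today_run = datetime.combine(now.date(), time(hour=hour))
    if now < today_run:
        return today_run
    # Assumption: if today's slot has already passed, start tomorrow.
    return today_run + timedelta(days=1)


print(first_run_at(datetime(2024, 1, 1, 10, 0)))
```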
I want to schedule a task that runs every 15 minutes on weekdays only. I've found the weekly option but am confused about what I should put in "Recur every".
I want this to run indefinitely until I manually stop it.
Thanks in advance.
Even though the Create New Scheduled Task Wizard only lets you specify a single simple schedule, a single Task can still have multiple schedules, which you can configure after the task is created.
Open Task Scheduler > (your task) > (right-click) > Properties > Triggers
Then remove the default trigger you set in the wizard.
Then add 5 separate schedule triggers for Mon, Tue, Wed, Thu, Fri as follows:
Begin the task: On a schedule
Weekly
Start: Midnight of tonight.
Recur every: 1 weeks
On (a single day corresponding to the trigger, so choose Monday for the first one).
Advanced settings:
Repeat task every 15 minutes
...for a duration of 1 day
[X] Stop all running tasks at end of repetition duration (just for safety)
[X] Stop task if it runs longer than (X minutes) (also for safety)
(Ensure "Expire" is unchecked)
[X] Enabled
Repeat the trigger setup above for Tuesday, Wednesday, Thursday and Friday.
So at the end, you should have five weekly triggers, one for each weekday.
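The schedule these triggers are meant to produce can be sketched in Python as a sanity check. This is only an illustration of the intended run times, not something Task Scheduler executes, and the function names are hypothetical.

```python
from datetime import datetime, timedelta


def should_run(moment: datetime, interval_minutes: int = 15) -> bool:
    """True on Monday-Friday at every multiple of the interval since midnight."""
    is_weekday = moment.weekday() < 5  # Monday=0 .. Friday=4
    minutes_since_midnight = moment.hour * 60 + moment.minute
    return is_weekday and minutes_since_midnight % interval_minutes == 0


def runs_on(day: datetime) -> int:
    """Count how many minutes of the given day trigger a run."""
    start = day.replace(hour=0, minute=0, second=0, microsecond=0)
    return sum(should_run(start + timedelta(minutes=m)) for m in range(24 * 60))


print(runs_on(datetime(2024, 1, 1)))  # a Monday
print(runs_on(datetime(2024, 1, 6)))  # a Saturday
```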
I am facing an issue with a child DAG that depends on multiple parent DAGs.
Dags A, B and C are my parent dags.
Dag D is my child dag.
Dag A has tasks --> A1,A2
Dag B has tasks --> B1,B2
Dag C has tasks --> C1,C2
The child dag D should run after successful execution of the following parent tasks:
A1,
B2,
C2
How can I achieve that? Any suggestions are appreciated. Thanks in advance.
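In Airflow, a cross-DAG dependency like this is commonly handled with one ExternalTaskSensor per parent task. The gating rule itself can be sketched in plain Python. The names REQUIRED_PARENTS and child_can_run, and the "success" state string, are hypothetical and only illustrate the condition; this is not Airflow API code.

```python
from typing import Dict, Tuple

# (dag_id, task_id) pairs that must succeed before DAG D may run.
REQUIRED_PARENTS = {("A", "A1"), ("B", "B2"), ("C", "C2")}


def child_can_run(task_states: Dict[Tuple[str, str], str]) -> bool:
    """True only when every required parent task is in the 'success' state."""
    return all(task_states.get(key) == "success" for key in REQUIRED_PARENTS)


states = {("A", "A1"): "success", ("B", "B2"): "success", ("C", "C2"): "running"}
print(child_can_run(states))
```

Tasks that are not in the required set, such as A2, B1 and C1, do not affect the result.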
Is it possible to dynamically create Control-M jobs?
Here's what I want to do:
I want to create two jobs. First one I call a discovery job, the second one I call a template job.
The discovery job runs against some database and comes back with an array of parameters. I then want to start the template job for each element in the returned array passing in that element as a parameter. So if the discovery job returned [a1,a2,a3] I want to start the template job 3 times, first one with parameter a1, second with parameter a2 and third one with parameter a3.
Only when all of the template jobs have finished successfully should the discovery job show as completed successfully. If one of the template job instances fails, I should be able to manually retry that one instance, and when it succeeds the discovery job should become successful.
Is this possible? And if so, how should this be done?
Between the various components of Control-M this is possible.
The originating job will have an On/Do tab, which can perform subsequent actions based on the output of the first job. This can be set to work in various ways, but it basically works on the principle of "do x if y happens". The 'y' can be the job status (ok or not), the exit code (0 or not) or a text string in standard output (e.g. "system wants you to run 3 more jobs"). The 'x' can be a whole list of things too: demand in a job, add a specific condition, set variables.
You should check out the Auto Edit variables (I think they've changed the name of these in the latest versions) but these are your user defined variables (use the ctmvar utility to define/alter these). The variables can be defined for a specific job only or across your whole system.
If you don't get the degree of control you want then the next step would be to use the ctmcreate utility - this allows full on-the-fly job definition.
You can do it. The way I found that worked was to loop through a create script, which plugs in your variable name from your look-up. You can do the same for the job number by using a counter to generate a job name such as adhoc0001, adhoc0002, etc. What I have done is create as many adhoc jobs as the query requires, order them into a new group, and then, once the group is complete, send the downstream conditions on. If one fails, you can re-run it as normal. I use ctmcreate -input_file for this, which works a treat.
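The counter-based naming and the "group is complete" rule described above can be sketched as follows. This is a minimal illustration in Python that only plans job names and evaluates statuses. It does not call ctmcreate, and the function names and the "OK" status string are hypothetical.

```python
from typing import Dict, List


def plan_adhoc_jobs(parameters: List[str], prefix: str = "adhoc") -> Dict[str, str]:
    """Map generated job names (adhoc0001, adhoc0002, ...) to their parameters."""
    return {f"{prefix}{i:04d}": p for i, p in enumerate(parameters, start=1)}


def group_succeeded(statuses: Dict[str, str]) -> bool:
    """The group counts as complete only when every job in it ended OK."""
    return bool(statuses) and all(s == "OK" for s in statuses.values())


jobs = plan_adhoc_jobs(["a1", "a2", "a3"])
print(jobs)

statuses = {"adhoc0001": "OK", "adhoc0002": "FAILED", "adhoc0003": "OK"}
print(group_succeeded(statuses))

# After a manual re-run of the failed instance succeeds:
statuses["adhoc0002"] = "OK"
print(group_succeeded(statuses))
```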
Job A runs Mon-Fri and triggers job B. However, I want Job B to run every day. How can I ensure Job B runs on Saturday and Sunday, even though it is set up to only run after Job A completes?
For a similar issue, we created a job that is scheduled only on weekends. This job need not have any IN condition, or it can be time-dependent.
Let's call this job 'JOB_WEEKEND'.
JOB_WEEKEND out condition:
JOB_WEEKEND-OK
Now Job B should have these conditions:
JOB A-OK or JOB_WEEKEND-OK
You can select this from 'IN condition relationships' below the conditions tab. This weekend job can be used for any other similar jobs too.
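The OR relationship between the two IN conditions amounts to the following check. This is a sketch in Python of the logic only, with a hypothetical function name; Control-M evaluates the conditions itself.

```python
from typing import Set


def job_b_can_run(conditions: Set[str]) -> bool:
    """Job B is eligible when either Job A or the weekend job has posted its OK condition."""
    return "JOB A-OK" in conditions or "JOB_WEEKEND-OK" in conditions


print(job_b_can_run({"JOB A-OK"}))        # weekday: Job A finished
print(job_b_can_run({"JOB_WEEKEND-OK"}))  # weekend: dummy job finished
print(job_b_can_run(set()))               # neither condition present
```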
If you are using Control-M version 8, you can use the adjust condition option with smart folders instead of creating a new dummy job.
You just need to put '#' before the IN condition of Job B for Job A, as below:
IN #-JOB A--JOB B
With this condition, Job B waits for Job A whenever Job A is present (Mon-Fri). On weekends (Sat and Sun), Job B does not wait for Job A.