My job A depends on a parent job B; A is triggered when B succeeds.
The problem is that B may finish several times during the day by mistake (a bug upstream).
How can I make A depend on B but trigger only if it has not already been triggered the same day?
I didn't find any solution other than introducing a new 'defensive job' in the middle with:
command: [ "X$(date +%F)" != "X$(cat defensive_trigger_date 2>/dev/null)" ] && ( date +%F > defensive_trigger_date; date +%F )
High-level logic:
If the current system date does not equal the last trigger date (read from the file), write the current date to the file and exit with a zero exit code (the job succeeds).
Otherwise, exit with a non-zero exit code (the job fails).
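Expanded into a standalone script, the same check looks like this (a minimal sketch; the marker-file name and location are assumptions):

#!/usr/bin/env bash
# Defensive job: succeed only the first time it runs on a given day.
MARKER=defensive_trigger_date        # assumed marker-file path
TODAY=$(date +%F)
LAST=$(cat "$MARKER" 2>/dev/null)    # empty on the very first run

if [ "X$TODAY" != "X$LAST" ]; then
    echo "$TODAY" > "$MARKER"        # remember that we fired today
    exit 0                           # succeed -> downstream job A triggers
else
    exit 1                           # fail -> job A is skipped
fi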
By transactional-like I mean: either all tasks in a set are successful, or none of them are and they should be retried from the first one.
Consider two operators, A and B: A downloads a file, and B reads it and performs some actions.
A successfully executes, but before B comes into play a blackout/corruption occurs, and B cannot process the file, so it fails.
To pass, it needs A to re-download the file, but since A is in success status, there is no direct way to do that automatically. A human must go and clear A's status.
Now I wonder: is there a known way to clear the statuses of task instances up to a certain task if some task fails?
I know I can use a hook, as in clear an upstream task in airflow within the dag, but that looks a bit ugly.
This feature is not supported by Airflow, but you can use on_failure_callback to achieve it:
from airflow import DAG
from airflow.operators.bash import BashOperator
from airflow.operators.dummy import DummyOperator

def clear_task_A(context):
    execution_date = context["execution_date"]
    # Shell out to the Airflow CLI to clear task A for this run.
    # -s: start date, -t: task_id regex, -d: include downstream tasks, -y: skip confirmation
    clear_cmd = BashOperator(
        task_id='clear_task_A',
        bash_command=f'airflow tasks clear -s {execution_date} -t A -d -y <dag_id>'
    )
    return clear_cmd.execute(context=context)

with DAG(...) as dag:
    A = DummyOperator(
        task_id='A'
    )
    B = BashOperator(
        task_id='B',
        bash_command='exit 1',
        on_failure_callback=clear_task_A
    )
    A >> B
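Because the command passes -d, the clear covers A and its downstream tasks, so both A and B are cleared for that execution date and the scheduler re-runs the pair from the start, which gives the retry-from-the-first-task behavior described in the question.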
I want to make a VDP scheduler job in Denodo 8 wait for a certain amount of time. The wait function in the job creation process is not working as expected, so I figured I'd write it into the VQL. However, when I try the suggested function from the documentation (https://community.denodo.com/docs/html/browse/8.0/en/vdp/vql/stored_procedures/predefined_stored_procedures/wait), the Denodo 8 VQL shell doesn't recognize the function.
--Not working
SELECT WAIT('10000');
Returns the following error:
Function 'wait' with arity 1 not found
--Not working
WAIT('10000');
Returns the following error:
Error parsing command 'WAIT('10000')'
Any suggestions would be much appreciated.
There are two ways of invoking WAIT:
Option #1:
-- Wait for one minute
CALL WAIT(60000);
Option #2:
-- Wait for ten seconds
SELECT timeinmillis
FROM WAIT()
WHERE timeinmillis = 10000;
I am executing the getURL command to check the status of a task on the server, using the call below:
getURL(content(t2)$statusURL)
The status could be either "Processing" when the task is in progress or "Completed_Successfully" when the task is completed.
Only if the task status is "Completed_Successfully" should the following code be executed:
getURL(content(t2)$outputURL)
If the task status is still "Processing", the code should wait until it changes to "Completed_Successfully".
I need help writing this logic in R.
I had a similar problem a few years ago. I decided to ask the Windows Task Scheduler (yes, I have sinned) to run a specific script on a regular basis. Your script could be along the lines of:
isok <- FALSE
i <- 1

while (isok == FALSE) {
  record.start <- Sys.time()
  message("Checking if job done")
  Sys.sleep(i)                                  # wait i seconds between checks
  record.end <- Sys.time()
  see.difference <- record.end - record.start
  message(paste("Waiting time:", round(see.difference)))
  if (see.difference >= 5) {                    # stand-in for the real "job done" test
    isok <- TRUE
    message("Job completed")
  }
  i <- i + 1
}

Example output:
Checking if job done
Waiting time: 1
Checking if job done
Waiting time: 2
Checking if job done
Waiting time: 3
Checking if job done
Waiting time: 4
Checking if job done
Waiting time: 5
Job completed
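For the real check, the loop body can poll the status URL instead of the mock timing test. A minimal sketch using RCurl (the field names statusURL/outputURL and the status strings come from the question; the function name and the 10-second poll interval are assumptions):

library(RCurl)

# Poll status_url until it reports "Completed_Successfully",
# then fetch and return the contents of output_url.
wait_and_fetch <- function(status_url, output_url, poll_seconds = 10) {
  repeat {
    status <- getURL(status_url)
    if (grepl("Completed_Successfully", status)) break
    message("Still processing; retrying in ", poll_seconds, " seconds")
    Sys.sleep(poll_seconds)                     # assumed poll interval
  }
  getURL(output_url)                            # fetch output only after completion
}

# Usage with the question's objects, where t2 is the response from the question:
# result <- wait_and_fetch(content(t2)$statusURL, content(t2)$outputURL)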
I have to write a wrapper script to run 3 jobs in Control-M if a variable is Y, i.e.
$EOM_1='Y' [in case End of Month is true]
$EOM_1='N' [in case End of Month is false]
i.e. if $EOM_1='Y', run jobs like ${DirTOOLS}/WaitUnitFileExists.sh $APIS/files/tr/chqload/local/INWUSD.dat 60 05 30
Why is the wrapper script needed? You can do this with the Job Definitions within Control-M. In the DAYS field of the job scheduling definition for the job that runs on the last day of the month, you would put L1.
For the job that will run on all but the last day of the month, you specify -L1 in the DAYS field. For the job that has to wait on a file, run the Control-M File Watcher utility, which will then add the condition for the job or force the job to run.
I have been working on this for a little while and have been unable to come to a solution. Any help would be appreciated. I am working on a UNIX workstation and have a 30-40 MB text file that I am working with. In my real file there are hundreds of jobs. Example of input file:
# misc logging data
Job 1 start
Task start
Task stop
Task start
Task stop
Job 1 stop
# Other misc logging data
Job 2 start
Task start
Task stop
Job 2 stop
# Other misc logging data
Job 3 start
Task start
Task stop
Task start
Task stop
Task start
Task stop
Job 3 stop
My desired output is:
Job 1, 2 Tasks
Job 2, 1 Tasks
Job 3, 3 Tasks
Thanks again...
awk '/^Job .* start$/ { jobname = $2; taskcount = 0; }
     /^Task start/    { taskcount++; }
     /^Job .* stop$/  { printf "Job %s, %d Tasks\n", jobname, taskcount; }'
This doesn't do a lot of checking (making sure the job ending is the job that was started; checking that each started task stopped, etc.), but it does process the data you provided and give the output you required.
If the 'Other misc logging data' lines could contain stuff confused with a given job and its tasks (could match the Task start line, etc), then you have to be a bit cleverer.
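If you do need some of that checking, a slightly cleverer version (a sketch; logfile is a placeholder name, and writing to /dev/stderr assumes gawk or another modern awk) might warn about mismatched stop lines:

awk '
/^Job .* start$/ { jobname = $2; taskcount = 0; injob = 1 }
/^Task start$/   { if (injob) taskcount++ }
/^Job .* stop$/  {
    if ($2 != jobname)
        printf "warning: stop for Job %s while Job %s open\n", $2, jobname > "/dev/stderr"
    printf "Job %s, %d Tasks\n", jobname, taskcount
    injob = 0
}' logfile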