Is there a way to check the value of a variable in a script while it is running? One way would be to set up print statements in the script to print the value of the variable periodically, but I forgot to do that and it is a long program. Is there any other way?
I have a list of HTTP endpoints, each performing a task on its own. We are trying to write an application which will orchestrate them by invoking these endpoints in a certain order. In this solution we also have to process the output of one HTTP endpoint and generate the input for the next HTTP endpoint. Also, the same workflow can be invoked simultaneously depending on the trigger.
What I have done until now:
1. Defined a new operator derived from the HttpOperator, adding the ability to write the output of the HTTP endpoint to a file.
2. Written a Python operator which can transform the output according to the necessary logic.
Since I can have multiple instances of the same workflow executing at the same time, I cannot hardcode the output file names. Is there a way to make the HTTP operator I wrote write to a unique file name, with that same file name made available to the next task so that it can read and process the output?
Airflow does have a feature for operator cross-communication called XCom.
XComs can be “pushed” (sent) or “pulled” (received). When a task pushes an XCom, it makes it generally available to other tasks. Tasks can push XComs at any time by calling the xcom_push() method.
Tasks call xcom_pull() to retrieve XComs, optionally applying filters based on criteria like key, source task_ids, and source dag_id.
To push to XCom, use:
ti.xcom_push(key=<variable name>, value=<variable value>)
To pull an XCom value, use:
myxcom_val = ti.xcom_pull(key=<variable name>, task_ids='<task to pull from>')
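For example, here is a minimal sketch of pushing a per-run file name from one task and pulling it in the next. It assumes two PythonOperator tasks and older Airflow 1.x-style imports; the task ids, key name, and file path are illustrative, not taken from your DAG:

from datetime import datetime
from airflow import DAG
from airflow.operators.python_operator import PythonOperator

def push_filename(**context):
    # build a file name unique to this DAG run and publish it via XCom
    filename = '/tmp/http_output_{}.json'.format(context['run_id'])
    context['ti'].xcom_push(key='output_file', value=filename)

def pull_filename(**context):
    # retrieve the file name pushed by the upstream task
    filename = context['ti'].xcom_pull(key='output_file', task_ids='call_endpoint')
    print('processing', filename)

with DAG('http_orchestration', start_date=datetime(2017, 1, 1), schedule_interval=None) as dag:
    call_endpoint = PythonOperator(task_id='call_endpoint', python_callable=push_filename,
                                   provide_context=True)
    process_output = PythonOperator(task_id='process_output', python_callable=pull_filename,
                                    provide_context=True)
    call_endpoint >> process_output

Inside your custom HttpOperator-derived operator you can do the same from execute(), for example via self.xcom_push(context, key='output_file', value=filename).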
With the BashOperator, you just set xcom_push=True and the last line written to stdout is pushed as the XCom value.
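For instance, a short sketch assuming an older Airflow release where BashOperator still accepts xcom_push (newer versions call the flag do_xcom_push); the task id and command are made up:

from airflow.operators.bash_operator import BashOperator

generate_name = BashOperator(
    task_id='generate_name',
    bash_command='echo /tmp/output_$RANDOM.json',  # last line of stdout becomes the XCom value
    xcom_push=True,
    dag=dag)

# downstream tasks can then pull it with the default key:
# ti.xcom_pull(task_ids='generate_name')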
You can view the XCom value while your task is running by simply opening the task instance from the Airflow UI and clicking on the XCom tab.
I have a workflow that starts with a shell script node that accepts a numeric parameter and routes into different Hive scripts based on this parameter. How do I loop this workflow so that it executes once for each number in a range of parameter values?
What I do right now is change the parameter in the GUI, execute, wait for it to finish, then change the parameter to the next number and run it again.
You can achieve this using a sub-workflow; read the following blog post to understand how to implement it: http://www.helmutzechmann.com/2015/04/23/oozie-loops/
The shell action output can be captured and accessed by the other actions:
${wf:actionData('shellAction')['variablename']}
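A rough, untested sketch of how those two ideas combine into a loop (the action, property, and key names are made up): the shell action decides whether to continue and emits the next parameter value, and the workflow then invokes itself as a sub-workflow with that value.

<decision name="loop-decision">
    <switch>
        <!-- shellAction echoes continue=true/false and nextParam=N to stdout -->
        <case to="next-iteration">${wf:actionData('shellAction')['continue'] eq "true"}</case>
        <default to="end"/>
    </switch>
</decision>

<action name="next-iteration">
    <sub-workflow>
        <!-- the workflow calls itself with the next parameter value -->
        <app-path>${wf:appPath()}</app-path>
        <configuration>
            <property>
                <name>param</name>
                <value>${wf:actionData('shellAction')['nextParam']}</value>
            </property>
        </configuration>
    </sub-workflow>
    <ok to="end"/>
    <error to="fail"/>
</action>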
Hope this helps.
-Ravi
I have a coordinator-A running whose workflow generates output to a directory
/var/test/output/20161213-randomnumber/
Now I need to pass the directory name "20161213-randomnumber" to another coordinator-B, which needs to start as soon as the workflow of coordinator-A has completed.
I am not able to find any pointers on how to pass the directory name, or on how coordinator-B can be triggered with the directory generated by coordinator-A.
However, I have seen numerous examples of triggering coordinators for a specific date or for daily, weekly, or monthly datasets. In my case the dataset is not time-dependent; it can arrive arbitrarily.
In your case, you can add one more action to coordinator-A's workflow that writes an empty trigger file (trig.txt) after the action that generates the data in /var/test/output/20161213-randomnumber/. Then, in coordinator-B, add a data dependency pointing to the trigger file; once the file exists, coordinator-B will start. After B has started, you can delete the trigger file to get ready for the next run.
You can use this data dependency to solve the problem; you cannot pass a parameter from one coordinator to another coordinator.
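A rough sketch of what the dataset and input event in coordinator-B could look like, assuming the trigger file is written to a fixed path such as /var/test/trigger/ (the path, frequency, schema version, and property names here are illustrative):

<coordinator-app name="coordinator-B" frequency="${coord:minutes(15)}"
                 start="2016-12-13T00:00Z" end="2099-01-01T00:00Z"
                 timezone="UTC" xmlns="uri:oozie:coordinator:0.4">
    <datasets>
        <dataset name="trigger" frequency="${coord:minutes(15)}"
                 initial-instance="2016-12-13T00:00Z" timezone="UTC">
            <!-- coordinator-A's workflow writes trig.txt here when its output is ready -->
            <uri-template>${nameNode}/var/test/trigger</uri-template>
            <done-flag>trig.txt</done-flag>
        </dataset>
    </datasets>
    <input-events>
        <data-in name="triggerReady" dataset="trigger">
            <instance>${coord:current(0)}</instance>
        </data-in>
    </input-events>
    <action>
        <workflow>
            <app-path>${workflowBAppPath}</app-path>
        </workflow>
    </action>
</coordinator-app>

Since the parameter itself cannot be passed between coordinators, coordinator-B's workflow could read the actual "20161213-randomnumber" directory name from the contents of trig.txt (or list the output directory) as its first step.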
I have a mapreduce job which is scheduled by an Oozie coordinator and runs every 4 hours. This mapreduce job takes a parameter, let's say k, whose value is set in the job.config file. I'd like to know: if I change the value of this parameter between two runs, does it pick up the updated (new) value or stick to the original (old) value?
If the job is already running, it will stick to the old parameter value; if the job is still waiting to be scheduled, it will pick up the latest value.
Actually, there is a devious way to "dynamically" fetch a parameter value at run time:
- insert a dummy Shell Action at the beginning of the Workflow, with the <capture-output/> option set
- in the shell script, just download a properties file from HDFS and dump it to STDOUT
- the "capture-output" option tells Oozie to parse STDOUT into a Map (i.e. a key/value list)
- then use an E.L. function to retrieve the appropriate value(s) in the next Actions
${wf:actionData("DummyShellAction")["some.key"]}
http://oozie.apache.org/docs/4.0.0/WorkflowFunctionalSpec.html#a4.2.6_Hadoop_Jobs_EL_Function
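Roughly, the shell script can be a single line that runs hdfs dfs -cat on the properties file, and the dummy action might look like the sketch below (the script name, paths, and transition targets are made up). Keep the printed output small, since Oozie caps captured output at 2 KB by default.

<action name="DummyShellAction">
    <shell xmlns="uri:oozie:shell-action:0.2">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <!-- dump_props.sh just does: hdfs dfs -cat /path/to/runtime.properties -->
        <exec>dump_props.sh</exec>
        <file>dump_props.sh#dump_props.sh</file>
        <!-- tells Oozie to parse stdout into a key/value map -->
        <capture-output/>
    </shell>
    <ok to="real-work"/>
    <error to="fail"/>
</action>

Every key=value line the script prints then becomes retrievable in later actions with the wf:actionData E.L. function shown above.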
I have a function in my UNIX script which contains a call to a PL/SQL function. The UNIX function is called with values taken from the user at runtime. These values then need to be passed to the PL/SQL function within the UNIX function, along with some variables storing SQL results. How do I call both functions? How do I pass the values?
Thanks in advance
Pranay
You need an Oracle command-line interpreter to run PL/SQL code from within a UNIX (bash?) program. You will probably need to call sqlplus, provide the corresponding credentials, and prepare a string with the SQL command corresponding to the PL/SQL procedure call.
You probably need something like this:
#!/bin/sh
# write the PL/SQL call into a temporary script file
echo "your sql sentence" > yourscript.sql
# run it through sqlplus with the appropriate credentials
sqlplus user/pass@db @yourscript.sql
I know you said you use ksh but I don't know its syntax, and I am sure you can do something similar to what I posted.
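Since you also need to pass runtime values into the PL/SQL call, a here-document is often handier than a temporary file. Here is a rough sketch (the function, procedure, and credentials are made up; the same idea works in ksh):

#!/bin/sh
# call_plsql: run a PL/SQL procedure, passing in a value supplied by the caller
call_plsql() {
    p_value="$1"
    sqlplus -s user/pass@db <<EOF
        SET SERVEROUTPUT ON
        BEGIN
            my_package.my_procedure('${p_value}');
        END;
        /
        EXIT;
EOF
}

# example: take a value from the user at run time and pass it through
printf "Enter a value: "
read value
call_plsql "$value"

The unquoted EOF delimiter lets the shell expand ${p_value} inside the here-document, which is how the runtime value reaches the PL/SQL call.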