I'm running an ANT task in the background and checking at 60-second intervals whether that task has completed. If it has not, a message should be displayed on screen every 60 seconds: "Deploy process is still running. $slept seconds since deploy started", where $slept is 60, 120, 180 and so on.
There's a limit of 1200 seconds, after which the script will show the log via 'ant log' command and ask the user whether to continue. If the user chooses to continue, 300 seconds are added to the time limit and the process repeats.
The code that I am using for this task is -
ant deploy &
limit=1200
deploy_check()
{
while [ ${slept:-0} -le $limit ]; do
sleep 60 && slept=`expr ${slept:-0} + 60`
if [ $$ = "`ps -o ppid= -p $!`" ]; then
echo "Deploy process is still running. $slept seconds since deploy started."
else
wait $! && echo "Application ${New_App_Name} deployed successfully" || echo "Deployment of ${New_App_Name} failed"
break
fi
done
}
deploy_check
if [ $$ = "`ps -o ppid= -p $!`" ]; then
echo "Deploy process did not finish in $slept seconds. Here's the log."
ant log
echo "Do you want to kill the process? Press Ctrl+C to kill. Press Enter to continue."
read log
limit=`expr ${limit} + 300`
deploy_check
fi
Now, the problem is that this code is not working. It looks like perfectly good code to me, and yet it does not work. Can anyone point out what is wrong with it, please?
If the user chooses to continue, deploy_check gets run again, but there's never another opportunity for the user to continue or cancel (although Ctrl-C could be pressed at any time). So you may want to wrap that in a while loop.
Also, pressing Ctrl-C is probably not going to kill the child process. You need to prompt for yes or no and, if the answer is yes, run kill $!.
Edit:
Here is how your code flows:
Call deploy_check
Loop until slept exceeds limit (it will be 1260 at that point)
Break if deployed successfully
If it's still running, prompt the user
If the user presses Ctrl-C, the script exits but ant deploy is left running (unless you have a trap set elsewhere in your script that processes the Ctrl-C and does a kill on the ant deploy process)
If the user presses enter, the limit is raised to 1500
Call deploy_check testing slept==1260 against limit==1500
Loop until slept exceeds limit (it will be 1560 at that point)
Break if deployed successfully
The script (or this section of it) ends without prompting the user again
You would need a loop to cause the prompt to occur again.
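The flow above could be restructured roughly as follows. This is only a sketch, not the original script: the names is_running and monitor and the parameters POLL, LIMIT and EXTEND are made up for illustration, and kill -0 is used in place of the ps test to check whether the job is still alive. It prompts every time the limit is reached and kills the job itself when the user answers yes.

```sh
#!/bin/sh
# Sketch only: is_running, monitor, POLL, LIMIT and EXTEND are
# illustrative names, not part of the original script.

# Succeeds while the process with the given PID is still alive.
is_running() {
    kill -0 "$1" 2>/dev/null
}

# monitor PID POLL LIMIT EXTEND
# Polls every POLL seconds. Each time LIMIT is reached the user is asked
# whether to kill the job; if not, LIMIT grows by EXTEND and polling resumes.
monitor() {
    pid=$1 poll=$2 limit=$3 extend=$4
    slept=0
    while is_running "$pid"; do
        if [ "$slept" -ge "$limit" ]; then
            echo "Process did not finish in $slept seconds."
            printf 'Kill it? [y/N] '
            read -r answer
            case $answer in
                [Yy]*)
                    kill "$pid"
                    wait "$pid"
                    echo "Process killed."
                    return 1
                    ;;
            esac
            limit=$((limit + extend))
        fi
        sleep "$poll"
        slept=$((slept + poll))
        if is_running "$pid"; then
            echo "Still running. $slept seconds since start."
        fi
    done
    # The job has ended: collect its exit status.
    if wait "$pid"; then
        echo "Finished successfully."
    else
        echo "Finished with an error."
        return 1
    fi
}
```

With the numbers from the question it would be called as ant deploy & followed by monitor $! 60 1200 300. It has to run in the same shell that started the job (not in a pipeline or command substitution), otherwise wait cannot collect the exit status.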
I'm running a long executing command (currently on day 7) under screen. I would like to temporarily suspend this command to perform some other computationally expensive operation and then resume the screened command afterwards.
Is this even possible?
OK. I figured it out.
To Pause:
screen -r attaches to the screened command:
CTRL+z suspends the process.
CTRL+A d detaches from the screened command. (Do not exit.)
To Resume:
screen -r attaches to the paused screen.
fg continues the previously suspended command.
CTRL+A d detaches from the screened command, leaving it running in the background.
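If you know the PID of the command, the same pause and resume can probably be done without attaching to screen at all, by sending signals. A minimal sketch, assuming the helper names pause_job and resume_job (they are invented here): Ctrl+Z sends SIGTSTP from the terminal, while the sketch uses SIGSTOP, which a program cannot catch or ignore.

```sh
#!/bin/sh
# Sketch only: pause_job and resume_job are illustrative helper names.

# Suspend the process with the given PID (SIGSTOP cannot be caught).
pause_job() {
    kill -s STOP "$1"
}

# Let a previously suspended process continue.
resume_job() {
    kill -s CONT "$1"
}
```

Note that this stops only the one process whose PID is given; children it has already started keep running unless they are signalled as well.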
I need to run multiple scripts in parallel. The script needs to ensure that x scripts are always running. I trigger 6 jobs in parallel, and as soon as one finishes the next one is triggered, but if something fails it should stop processing any further jobs. I was able to get hold of a script which helped me with the first part, but I am not sure how to handle the error tracking. I do not have GNU parallel.
PROCS=6
SLEEP=20
while read LINE
do
while true
do
NUM=$(jobs | wc -l)
echo NUM=$NUM
if [ $NUM -lt $PROCS ]
then
stuff "$LINE" &
break
else
sleep $SLEEP
fi
done
done < textfile
wait
So the above ensures that 6 processes are always running. I want a way to find out whether one of the processes has failed, and in that case not trigger any more jobs.
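One possible way to add the error tracking, sketched under some assumptions: every job leaves a marker file (N.ok or N.fail) in a temporary directory when it ends, so the launcher can count unfinished jobs and see failures without relying on the jobs builtin. The function name run_pool and its parameters are invented for this sketch, and mktemp -d is assumed to be available.

```sh
#!/bin/sh
# Sketch only: run_pool and its marker-file scheme are illustrative.

# run_pool MAX POLL CMD
# Reads one argument per line from stdin and runs "CMD line" with at most
# MAX jobs at a time. Once any job has failed, no further jobs are started.
# Returns 1 if at least one job failed, 0 otherwise.
run_pool() {
    max=$1 poll=$2 cmd=$3
    dir=$(mktemp -d) || return 1
    launched=0
    while IFS= read -r line; do
        # Wait until fewer than MAX launched jobs are still unfinished.
        while [ $((launched - $(ls "$dir" | wc -l))) -ge "$max" ]; do
            sleep "$poll"
        done
        # A *.fail marker means some job failed: launch nothing further.
        if ls "$dir"/*.fail >/dev/null 2>&1; then
            break
        fi
        launched=$((launched + 1))
        n=$launched
        # Each job records its own outcome; stdin is detached so the job
        # cannot swallow the remaining input lines.
        ( if "$cmd" "$line"; then : >"$dir/$n.ok"; else : >"$dir/$n.fail"; fi ) </dev/null &
    done
    wait    # let the jobs that are already running finish
    if ls "$dir"/*.fail >/dev/null 2>&1; then status=1; else status=0; fi
    rm -rf "$dir"
    return "$status"
}
```

For the script above this would be called as run_pool 6 20 stuff < textfile. Jobs that are already running when a failure is detected are allowed to finish; killing them as well would need their PIDs to be recorded.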
I often have to relaunch a server to see if my changes are fine. I keep this server opened in a shell, so I have a quick access to current logs. So here is what I type in my shell: ^C!!⏎. That is send SIGINT, and then relaunch last event in history.
So what I would like is to type, say ^R, and have the same result.
(Note: I use zsh)
I tried the following:
relaunch-function() {
kill -INT %% && !!
}
zle -N relaunch-widget relaunch-function
bindkey "^R" relaunch-widget
But it seems that while my server is running, ^R won't be passed to the shell but to the server, which doesn't notice the shell. So I can't see a generic solution, while testing return value and process name should be feasible.
As long as the job is running in the foreground, keys will not be passed to the shell. So setting a key binding for killing a foreground process and starting it again won't work.
But you could start your server in an endless loop, so that it restarts automatically. Assuming the name of the command is run_server, you can start it like this in the shell:
(TRAPINT(){};while sleep .5; do run_server; done)
The surrounding parentheses start a sub-shell, TRAPINT(){} disables SIGINT for this shell. The while loop will keep restarting run_server until sleep exits with an exit status that is not zero. That can be achieved by interrupting sleep with ^C. (Without setting TRAPINT, interrupting run_server could also interrupt the loop)
So if you want to restart your server, just press ^C and wait for 0.5 seconds. If you want to stop your server without restarting, press ^C twice in 0.5 seconds.
To save some typing you can create a function for that:
doloop() {(
TRAPINT(){}
while sleep .5
do
echo running \"$@\"
eval $@
done
)}
Then call it with doloop run_server. Note: You still need the additional surrounding () as functions do not open a sub-shell by themselves.
eval allows for shell constructs to be used. For example doloop LANG=C locale. In some cases you may need to use (single) quotes:
$ doloop echo $RANDOM
running "echo 242"
242
running "echo 242"
242
running "echo 242"
242
^C
$ doloop 'echo $RANDOM'
running "echo $RANDOM"
10988
running "echo $RANDOM"
27551
running "echo $RANDOM"
8910
^C
We have two Informatica jobs that run in parallel.
One starts at 11.40 CET and it has around 300 Informatica workflows in it out of which one is fact_sales.
The other job runs at 3.40 CET and has around 115 workflows in it, many of which depend on fact_sales in terms of data consistency.
The problem is that fact_sales should finish before certain workflows in process 2 start, so that the data is accurate, but this generally doesn't happen.
What we are trying to do is split process 2 in such a way that the fact_sales-dependent workflows run only after fact_sales has finished.
Can you suggest how to write a Unix shell script that checks the status of fact_sales and, if it was successful, kicks off the other dependent workflows, and if not, sends a failure mail?
thanks
I don't see the need to write a custom shell script for this. Most of this is pretty standard/common functionality that can be implemented using Command Task and event waits.
Process1 - runs at 11:50
....workflow
...
fact_sales workflow. Add a command task at the end that drops a flag, say, fact_sales_0430.done
...
....workflow..500
All the dependent processes will have an event wait that waits on this .done file. Since there are multiple dependent workflows, make sure none of them deletes the file right away. You can delete the .done file at the end of the day or when the load starts for the next day.
workflow1
.....
dependantworkflow1 -- Event wait, waiting on fact_sales_0430.done (do not delete file).
dependantworkflow2 -- Event wait, waiting on fact_sales_0430.done (do not delete file).
someOtherWorkflow
dependantworkflow3 -- Event wait, waiting on fact_sales_0430.done (do not delete file).
....
......
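Outside Informatica, the same flag-file handshake can be sketched in plain shell. This is only an illustration of the idea, not what the Command Task or the Event Wait does internally; drop_flag, wait_for_flag and the timeout parameters are names invented for the sketch.

```sh
#!/bin/sh
# Sketch only: drop_flag and wait_for_flag are illustrative helper names.

# What the command task at the end of fact_sales would do:
# create an empty flag file.
drop_flag() {
    : >"$1"
}

# wait_for_flag FILE TIMEOUT POLL
# Polls every POLL seconds until FILE exists. Returns 0 when the flag
# appears and 1 if it is still missing after TIMEOUT seconds.
# The flag is deliberately not deleted, so several waiters can share it.
wait_for_flag() {
    flag=$1 timeout=$2 poll=$3
    waited=0
    while [ ! -e "$flag" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep "$poll"
        waited=$((waited + poll))
    done
    return 0
}
```

A dependent job could then start with something like wait_for_flag fact_sales_0430.done 7200 60 || exit 1, where the two-hour timeout is just an example value.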
A second approach can be as follows -
You must be running some kind of scheduler to launch these workflows, since Informatica can't schedule multiple workflows as a set; it can only handle dependency management at the worklet/session level.
From the scheduler, create a dependency between the sales fact load workflow and the other dependent workflows.
I think the script below will work for you. Please update the parameters.
WAIT_LOOP=1
while [ ${WAIT_LOOP} -eq 1 ]
do
# $WORKFLOW_NAME must be set to the workflow being checked (here: fact_sales)
WF_STATUS=`pmcmd getworkflowdetails -sv $INFA_INTEGRATION_SERVICE -d $INFA_DOMAIN -uv INFA_USER_NAME -pv INFA_PASSWORD -usd Client -f $FOLDER_NAME $WORKFLOW_NAME | grep "Workflow run status:" | cut -d'[' -f2 | cut -d']' -f1`
echo ${WF_STATUS} | tee -a $LOG_FILE_NAME
case "${WF_STATUS}" in
Aborted)
WAIT_LOOP=0
;;
Disabled)
WAIT_LOOP=0
;;
Failed)
WAIT_LOOP=0
;;
Scheduled)
WAIT_LOOP=0
;;
Stopped)
WAIT_LOOP=0
;;
Succeeded)
WAIT_LOOP=0
;;
Suspended)
WAIT_LOOP=0
;;
Terminated)
WAIT_LOOP=0
;;
Unscheduled)
WAIT_LOOP=0
;;
esac
if [ ${WAIT_LOOP} -eq 1 ]
then
sleep $WAIT_SECONDS
fi
done
if [ "${WF_STATUS}" = "Succeeded" ]
then
# set $WORKFLOW_NAME to the dependent workflow to start (for example dependent_one)
pmcmd startworkflow -sv $INFA_INTEGRATION_SERVICE -d $INFA_DOMAIN -uv INFA_USER_NAME -pv INFA_PASSWORD -usd Client -f $FOLDER_NAME -paramfile $PARAMETER_FILE $WORKFLOW_NAME | tee $LOG_FILE_NAME
else
(echo "Please find attached Logs for Run" ; uuencode $LOG_FILE_NAME $LOG_FILE_NAME )| mailx -s "Execution logs" $EMAIL_LIST
exit 1
fi
I can see your main challenge is maintaining the dependencies between a large number of Informatica workflows.
You have two options:
First, you can use an automated scheduling tool to set up the dependencies and run the workflows in the proper order. There are many free tools; choose one depending on your comfort, time, cost and so on.
Second, you can create your own custom job scheduler. I built a similar scheduler using a UNIX script and an Oracle table. Here are the steps:
Categorize all your workflows into groups. Independent workflows go into group 1, workflows that depend on group 1 go into group 2, and so on.
Set up your process to pick workflows from these groups one by one and kick them off. If the kick-off queue is empty, it should wait. Call this loop 2.
Keep a polling loop that checks the status of the kicked-off workflows. If one has failed, aborted and so on, fail the process, mail the user and mark all in-queue/dependent workflows as failed. If it is still running, keep polling. If it succeeded, give control back to loop 2.
If the kick-off queue is empty, go to the next group only if all workflows in the current group succeeded.
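A stripped-down version of the group idea, without the Oracle table, might look like the sketch below. It is not the scheduler described above: run_group, run_groups and start_workflow are names invented for illustration, and start_workflow is only a placeholder that you would replace with a command that blocks until the workflow ends and returns non-zero on failure (for example pmcmd startworkflow with -wait).

```sh
#!/bin/sh
# Sketch only: all function names here are illustrative.

# Placeholder launcher: replace with the real command. It must block until
# the workflow has ended and return non-zero if the workflow failed.
start_workflow() {
    echo "would start $1"
}

# run_group WORKFLOW...
# Starts every workflow of one group in parallel, then waits for each.
# Returns 1 if any of them failed.
run_group() {
    pids=
    for wf in "$@"; do
        start_workflow "$wf" &
        pids="$pids $!"
    done
    failed=0
    for pid in $pids; do
        wait "$pid" || failed=1
    done
    return "$failed"
}

# run_groups "GROUP1 WORKFLOWS" "GROUP2 WORKFLOWS" ...
# Runs the groups in order and stops at the first group with a failure.
run_groups() {
    for group in "$@"; do
        # $group is left unquoted on purpose: it is split into workflow names.
        if ! run_group $group; then
            echo "Group failed, stopping: $group"
            return 1
        fi
    done
}
```

For example, run_groups "fact_sales dim_load" "dependent_one dependent_two" would start the second group only if every workflow in the first one succeeded; these workflow names are made up apart from fact_sales.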
This is a somewhat tricky process, but it pays off once you have set it up. You can add as many workflows as you want, and maintenance will be much smoother than with the Informatica scheduler, worklets and so on.
You can query the repository database, using tables such as REP_SESS_LOG, to check whether fact_sales has succeeded. Only then should you proceed with the second job.
I have a shell script that transfers a build.xml file to a remote unix machine (devrsp02) and executes the ANT task wldeploy on that machine (devrsp02). Now, this wldeploy task takes around 15 minutes to complete and while this is running, the last line at the unix console is -
"task {some digit} initialized".
Once this task is complete, we get a "task Completed" message and the next task in the script is executed only after that.
But sometimes there is a problem with the WebLogic domain and the deployment fails internally, with no effect on the status of the wldeploy task. The unix console stays stuck at "task {some digit} initialized". The deployment error is logged in a file called output.a
So, what I want now is -
Start a time counter before running wldeploy. If the wldeploy runs for more than 15 minutes, the following command should be run -
tail -f output.a ## without terminating the wldeploy
or
cat output.a ## after terminating the wldeploy forcefully
Point to be noted here is - I can't run the wldeploy task in background, as in that case the user won't get to know when the task is complete, which is crucial for this script.
Could you please suggest anything to achieve this?
Create this script (deploy.sh for example):
#!/bin/sh
sleep 900 && pkill -n wldeploy && cat output.a &
wldeploy
Then from the console
chmod +x deploy.sh
Then run
./deploy.sh
This script will start a counter (15 minutes) that will forcibly kill the wldeploy process if it's running, and if the process was running you'll see the contents of output.a.
If wldeploy has already terminated, pkill will not return true and output.a will not be shown.
I would call this task monitoring rather than "parallel processing" :)
This will only kill the wldeploy process it started, tell you whether wldeploy returned success or failure, and run no more than 30 seconds after wldeploy finishes.
It should be sh-compatible, but the /bin/sh I've got access to now seems to have a broken wait command.
#!/bin/ksh
wldeploy &
while [ ${slept:-0} -le 900 ]; do
sleep 30 && slept=`expr ${slept:-0} + 30`
if [ $$ = "`ps -o ppid= -p $!`" ]; then
echo wldeploy still running
else
wait $! && echo "wldeploy succeeded" || echo "wldeploy failed"
break
fi
done
if [ $$ = "`ps -o ppid= -p $!`" ]; then
echo "wldeploy did not finish in $slept seconds, killing it"
kill $!
cat output.a
fi
The part without terminating wldeploy is easy; just execute this before starting it:
{ sleep 900; tail -f output.a; } &
The part that kills it is more complex, as you have to determine the PID of the wldeploy process. pra's answer does exactly that, so I would just refer to it.
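For completeness, here is one way the kill variant could be packaged so that the script still blocks until the command ends, which keeps the user informed of completion. It is a sketch under assumptions, not a tested deployment script: run_with_timeout is an invented name, and the exit status 143 (128 + SIGTERM) is what common shells report for a process killed by plain kill.

```sh
#!/bin/sh
# Sketch only: run_with_timeout is an illustrative helper name.

# run_with_timeout SECONDS COMMAND [ARG...]
# Runs COMMAND, kills it if it is still running after SECONDS, and
# returns COMMAND's exit status (143 when it was killed by the watchdog).
run_with_timeout() {
    limit=$1
    shift
    "$@" &
    cmd_pid=$!
    # Watchdog: after LIMIT seconds, kill only the PID started above.
    ( sleep "$limit"; kill "$cmd_pid" ) </dev/null >/dev/null 2>&1 &
    watchdog_pid=$!
    # Block here until the command ends, as a foreground run would.
    wait "$cmd_pid"
    status=$?
    # The command ended first: stop the watchdog so it never fires.
    kill "$watchdog_pid" 2>/dev/null
    wait "$watchdog_pid" 2>/dev/null
    return "$status"
}
```

It could be used as run_with_timeout 900 wldeploy, followed by a check such as [ $? -eq 143 ] && cat output.a to show the log only when the watchdog had to kill the deployment.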