How to reset a job status to inactive? - autosys

I have a job J1 which depends on the success of a job J0.
The first time around, J0 failed, so J1 is currently in the "FAILURE" status.
That is fine.
However, I'd now like to reset its status so I can run another test.
How do I set J1's status from "FAILURE" back to "INACTIVE", in other words its initial status?

sendevent -J J1 -E CHANGE_STATUS -s INACTIVE

You can use the command below. The question was asked long ago; I'm answering so that other people can refer to it:
sendevent -E CHANGE_STATUS -s INACTIVE -J {job name}
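To verify the change, you can then query the job; autorep should report the INACTIVE status (using the job name J1 from the question):
autorep -J J1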

Related

Autosys job to change the status to success

Hi all, I tried this command:
D:\Apps\tools\autorep\ixpautorep.exe -M sendevent -J jobname -E CHANGE_STATUS -s SUCCESS
But it reports that the status of the event is missing and ignores SUCCESS.
Please help.
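For reference, the plain sendevent form used in the answer above, adapted to this case, would be (jobname is a placeholder):
sendevent -E CHANGE_STATUS -s SUCCESS -J jobname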

Kill all R processes that hang for longer than a minute

I use a cron task to run Rscript regularly. Unfortunately, I need to do this on a small AWS instance, and the process may hang, piling up more and more processes until the whole system lags.
I would like to write a cron task to kill all R processes lasting longer than one minute. I found another answer on Stack Overflow that I've adapted and that I think would solve the problem. I came up with:
if [[ "$(uname)" = "Linux" ]];then killall --older-than 1m "/usr/lib/R/bin/exec/R --slave --no-restore --file=/home/ubuntu/script.R";fi
I copied the command directly from htop, but it does not work as I expect: I get a No such file or directory error, even though I've checked it a few times.
I need to kill all R processes that have lasted longer than a minute. How can I do this?
You may want to avoid killing processes from another user and try SIGKILL (kill -9) after SIGTERM (kill -15). Here is a script you could execute every minute with a CRON job:
#!/bin/bash

PROCESS="R"
MAXTIME=`date -d '00:01:00' +'%s'`

function killpids()
{
    PIDS=`pgrep -u "${USER}" -x "${PROCESS}"`

    # Loop over all matching PIDs
    for pid in ${PIDS}; do
        # Retrieve duration of the process
        TIME=`ps -o time:1= -p "${pid}" |
              egrep -o "[0-9]{0,2}:?[0-9]{0,2}:[0-9]{2}$"`
        # Convert TIME to timestamp
        TTIME=`date -d "${TIME}" +'%s'`

        # Check if the process should be killed
        if [ "${TTIME}" -gt "${MAXTIME}" ]; then
            kill ${1} "${pid}"
        fi
    done
}

# Leave a chance to kill processes properly (SIGTERM)
killpids "-15"
sleep 5

# Now kill remaining processes (SIGKILL)
killpids "-9"
Why add an extra process every minute with cron?
Wouldn't it be easier to start R with timeout from coreutils? The process will then be killed automatically after the time you chose:
timeout [option] duration command [arg]…
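For example, the cron entry itself could wrap the script in timeout instead of relying on a separate watchdog. A sketch using the script path from the question (cron may need full paths to timeout and Rscript, and the 15-second kill-after grace period is an arbitrary choice):
# send SIGTERM after 60 seconds; escalate to SIGKILL 15 seconds later if still running
* * * * * timeout -k 15 60 Rscript /home/ubuntu/script.R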
I think the best option is to do this with R itself. I am no expert, but it seems the future package will allow executing a function in a separate thread. You could run the actual task in a separate thread, and in the main thread sleep for 60 seconds and then stop().
Update
user1747036's answer, which recommends timeout, is a better alternative.
My original answer
This question is more appropriate for Super User, but here are a few things wrong with:
if [[ "$(uname)" = "Linux" ]];then
killall --older-than 1m \
"/usr/lib/R/bin/exec/R --slave --no-restore --file=/home/ubuntu/script.R";
fi
The name argument is either the name of the executable image or a path to it; you have included its parameters as well.
If -s signal is not specified, killall sends SIGTERM, which your process may ignore. Are you able to kill the long-running script with this on the command line? You may need SIGKILL / -9.
More at http://linux.die.net/man/1/killall
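Putting those two points together, a corrected one-liner might look like this (assuming the process name really is just R, as it normally is for /usr/lib/R/bin/exec/R, and a psmisc version recent enough to support --older-than):
if [[ "$(uname)" = "Linux" ]]; then
    # match by process name only, and use SIGKILL in case SIGTERM is ignored
    killall -s SIGKILL --older-than 1m R
fi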

How to filter the Running jobs in a particular server in Autosys..?

I am new to Autosys. Is it possible to find all the running jobs on a particular server in Autosys (like what we do in TWS or other monitoring tools)?
The command to show running jobs on a specific server is this:
autorep -M servername -d
Note that the servername value has to match exactly what's in the JIL.
Also, the output isn't as detailed as what you'd get from "autorep -d -j jobname", but it does what you asked!
If you want the job detail, you might want to use the previous suggestion, or you could use autorep -M as I suggested to get the running jobs and pipe that list through autorep -j to get detail just for those specific jobs (see the sketch below).
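A rough sketch of that pipeline, assuming a Unix shell and that the job name is the first column of the running-jobs listing (as in the sample output in the next answer):
# list running jobs on the machine, then pull the detail report for each one
autorep -M servername -d | awk '/RUNNING/ {print $1}' | while read -r job; do
    autorep -d -J "$job"
done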
Yes, you can find the list of running jobs in Autosys. Run the command autorep -j % | find " RU "; on a Unix machine, use grep in place of find.
autorep -M agent_name -d
autorep -M machine -d shows the jobs that are running on the specified node.
autorep -M machine1 -d
Machine Name    Max Load    Current Load    Factor    O/S          Status
machine1        ---         ---             1.00      Sys Agent    Online

Current Jobs:
Job Name    Machine     Status     Load    Priority
CMD1        machine1    RUNNING    0       0
CMD2        machine1    RUNNING    0       0
autorep -I MachineName | Find "RU"

How to get the complete list of Autosys jobs with their dependencies

Can you please help me with getting the complete list of jobs present on the Autosys server or box, with their details?
Is there any utility available?
autorep -J ALL | cat > jobList.txt
autorep -J %BOX_NAME% -q will spit out all jobs in a box.
Note that the % sign denotes a wild card
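Combining the two, a sketch that dumps every job definition, including the condition attribute that holds its dependencies, to a file:
# -q prints the JIL source of each job; dependencies appear in the condition: lines
autorep -J ALL -q > all_jobs.jil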

LSF bsub: job always in PENDING state, not going to RUN state

I'm stuck on a small problem.
I'm launching many bsub commands at the same time, each one on a specified host:
bsub -sp 20 -W 0:5 -m $myhostname -q "myQueue" -J "mkdir_script" -o $log_file "script_to_launch param1 param2 param3"
all of this inside a for loop, once per hostname.
The problem is that everything is OK for all hosts except one (always the same one). The job stays in the PENDING state and never moves to the RUN state.
The script to execute simply checks for a folder and creates it if it is not there (so a very small task).
Is there a way to see what is happening on that host and why my job is not going to the RUN state?
PS: I just found the bjobs -p command and I get the following message:
Not specified in job submission: 81 hosts;
Closed by LSF administrator: 3 hosts;
What does this message mean?
The -m option limits you to a particular host, which excludes 81 hosts. The other three have been closed by your system administrator. You would have to contact them to find out why.
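If you want to see the host's state yourself before contacting the administrator, a couple of standard LSF commands can help (myhostname is the host from your -m option; JOBID is the ID of the pending job):
bhosts -l $myhostname    # detailed host status, e.g. closed_Adm when an admin closed it
bjobs -l JOBID           # long listing for the job, including its pending reasons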