How to determine the execution stop event of a program - unix

I have a script which logs certain data while a benchmark (some C code, such as a matrix multiplication) runs.
I want the log script to start when the benchmark starts. That part is easy, since I can just start the benchmark binary from the log script and then proceed to log the info.
But the real question is: when do I stop logging? The benchmark can stop at any time, and the log script shouldn't be the one to stop it. How do I get some piece of information or a variable the log script can use to stop itself when the benchmark program stops?
I was thinking of using the PID of the benchmark, but then thought there should be a better solution than searching for and polling a PID.
Thanks!

Perhaps you could try something like this:
#!/bin/bash
#
# Your main script
#
# Run your log program in the background
your_log_program &
# $! holds the PID of the last background process
LOGPROGRAMPID=$!
# Install an EXIT trap (EXIT is one of bash's special trap conditions)
trap 'kill -15 $LOGPROGRAMPID; exit 0' EXIT
# Launch your benchmark program in the foreground
run_your_benchmark_program
# When the benchmark program ends, the trap fires and kills your log
# program with SIGTERM (signal 15).
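Conversely, if you want the log script itself to stay in charge (as described in the question), it can start the benchmark in the background and poll or wait for it; no PID searching is needed because $! gives it to you directly. A minimal sketch, where ./benchmark and the echo line are placeholders for your actual programs:
#!/bin/bash
# Start the benchmark in the background and remember its PID
./benchmark &
BENCHPID=$!
# Keep logging for as long as the benchmark process is alive
while kill -0 "$BENCHPID" 2>/dev/null; do
    echo "logging..."   # replace with your actual logging step
    sleep 1
done
# Reap the child and pick up its exit status
wait "$BENCHPID"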

Related

Run command with `system2(..., wait = FALSE)` in R and kill it later

Assume I want to start a local server from R, access it a few times for calculations, and kill it again at the end. So:
## start the server
system2("sleep", "100", wait = FALSE)
## do some work
# Here I want to kill the process started at the beginning
This should be cross-platform (Mac, Linux, Windows, ...), as it is part of a package.
How can I achieve this in R?
EDIT 1
The command I have to run is a java jar:
system2("java", "... plantuml.jar ...")
Use the processx package.
proc <- processx::process$new("sleep", c("100"))
proc
# PROCESS 'sleep', running, pid 431792.
### ... pause ...
proc
# PROCESS 'sleep', running, pid 431792.
proc$kill()
# [1] TRUE
proc
# PROCESS 'sleep', finished.
proc$get_exit_status()
# [1] 2
The finished output is just an indication that the process has exited, not whether it was successful or erred. Use the exit status, where 0 indicates a good exit.
proc <- processx::process$new("sleep", c("1"))
Sys.sleep(1)
proc
# PROCESS 'sleep', finished.
proc$get_exit_status()
# [1] 0
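Applied to the jar from the question, the same pattern works. A sketch, where the jar path and arguments are placeholders for your real plantuml.jar invocation:
## hypothetical arguments; substitute your actual jar path and options
proc <- processx::process$new("java", c("-jar", "plantuml.jar"))
## ... do some work against the server ...
proc$kill()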
FYI, base R's system and system2 work fine when you don't need to kill the process easily and don't have any embedded spaces in the command or its arguments. system2 looks like it should handle embedded spaces better (since it accepts a vector of arguments), but under the hood all it does is
command <- paste(c(shQuote(command), env, args), collapse = " ")
and then
rval <- .Internal(system(command, flag, f, stdout, stderr, timeout))
which does nothing to protect the arguments.
You said "do some work", which suggests that you need to pass something to/from it periodically. processx does support writing to the standard input of the background process, as well as capturing its output. Its documentation at https://processx.r-lib.org/ is rich with great examples on this, including one-time calls for output or a callback function.

Is there a way to make a Denodo 8 VDP scheduler job WAIT() for a certain amount of time?

I want to make a VDP scheduler job in Denodo 8 wait for a certain amount of time. The wait function in the job creation process is not working as expected, so I figured I'd write it into the VQL. However, when I try the function suggested in the documentation (https://community.denodo.com/docs/html/browse/8.0/en/vdp/vql/stored_procedures/predefined_stored_procedures/wait), the Denodo 8 VQL shell doesn't recognize it.
--Not working
SELECT WAIT('10000');
Returns the following error:
Function 'wait' with arity 1 not found
--Not working
WAIT('10000');
Returns the following error:
Error parsing command 'WAIT('10000')'
Any suggestions would be much appreciated.
There are two ways of invoking WAIT:
Option #1:
-- Wait for one minute
CALL WAIT(60000);
Option #2:
-- Wait for ten seconds
SELECT timeinmillis
FROM WAIT()
WHERE timeinmillis = 10000;

How to make zsh's REPORTTIME work? (time for long-running commands)

This is my .zshrc:
export REPORTTIME=3
When I run sleep 4 it doesn't output anything.
If I change it to REPORTTIME=blablabla (or anything nonsensical), it doesn't raise an error and starts behaving like REPORTTIME=0, i.e. printing the time taken for every command.
Interestingly, if I try REPORTTIME=3s I get the following message:
zsh: bad math expression: operator expected at `s'
sleep 4 0.00s user 0.00s system 0% cpu 4.004 total
So I get the error and still the output.
I tried REPORTTIME="3" and even REPORTTIME=1+2. None of these work.
Also, if I run python -c "import time; time.sleep(4)" I get the same results (so the problem is not with sleep).
Of course, I tried other values too (other than 3).
I'm running macOS with iTerm2, and zsh is my default shell.
You need to set it explicitly to a non-negative integer representing seconds; i.e.
% REPORTTIME=3
Setting it to anything other than a non-negative integer does not work on my Zsh v5.4.2 either. Running something like a system update (e.g., yaourt) then acts as if I had put time on the front of it. Pretty slick!
So you need a command that eats some user/system CPU time; sleep does not. Although its total elapsed time is long enough, its user and system time is not:
% time sleep 3
sleep 3  0.00s user 0.00s system 0% cpu 3.002 total
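To see REPORTTIME fire, you need a command that actually burns CPU. A quick hypothetical test (assuming python3 is installed; any compute-heavy command will do):
% REPORTTIME=3
% python3 -c 'sum(i*i for i in range(10**8))'
# when this finishes, zsh prints a time-style summary, because the
# python3 process used more than 3 seconds of user CPU time
# (increase the exponent if your machine finishes it too quickly)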
Also, no need to export this since it's directly used by Zsh.
You can undo/turn off this behavior with:
% unset REPORTTIME
Docs on REPORTTIME from man zshparam:
If nonnegative, commands whose combined user and system execution times (measured in seconds) are greater than this value have timing statistics printed for them. Output is suppressed for commands executed within the line editor, including completion; commands explicitly marked with the time keyword still cause the summary to be printed in this case.

Out-of-memory error in parfor: kill the slave, not the master

When an out-of-memory error is raised in a parfor, is there any way to kill only one Matlab slave to free some memory instead of having the entire script terminate?
Here is what happens by default when an out-of-memory error occurs in a parfor: the entire script terminates.
I wish there were a way to kill just one slave (i.e. remove a worker from the parpool), or to stop using it, so as to release as much memory as possible.
If you get an out-of-memory error in the master process, there is no way to recover. For an out-of-memory error on a slave, the following should do it:
The idea of the code is simple: restart the parfor again and again with the missing iterations until you have all the results. If one iteration fails, a flag file is written, which makes all other iterations throw an error as soon as the first error has occurred. This way we get "out of the loop" without wasting time producing further out-of-memory errors.
%Your intended iterator
iterator = 1:10;
%flags which indicate which iterations succeeded
succeeded = false(size(iterator));
%result array
result = nan(size(iterator));
FLAG = 'ANY_WORKER_CRASHED';
while ~all(succeeded)
    fprintf('Another try\n')
    %determine which iterations still need to be done
    todo = iterator(~succeeded);
    %initialize array for the remaining results
    partresult = nan(size(todo));
    %initialize flags indicating which of the remaining iterations succeeded
    %(we cannot simply throw errors, as that would throw away results)
    partsucceeded = false(size(todo));
    %the flag file indicates that some worker crashed; a file-based
    %solution is used for lack of a better cross-worker signal
    if exist(FLAG, 'file')
        delete(FLAG);
    end
    try
        parfor falseindex = 1:sum(~succeeded)
            realindex = todo(falseindex);
            try
                % The flag is used to let all other workers jump out of the
                % loop as soon as one calculation has crashed.
                if exist(FLAG, 'file')
                    error('some other worker crashed');
                end
                % insert your code here
                %dummy code which randomly throws an exception
                if rand < .5
                    error('hit out of memory')
                end
                partresult(falseindex) = realindex*2;
                % End of user code
                partsucceeded(falseindex) = true;
                fprintf('trying to run %d and succeeded\n', realindex)
            catch ME
                % catch errors within workers to preserve the work done so far
                partresult(falseindex) = nan;
                partsucceeded(falseindex) = false;
                fprintf('trying to run %d but it failed\n', realindex)
                %touch the flag file to signal the other workers
                fclose(fopen(FLAG, 'w'));
            end
        end
    catch
        %reduce pool size by 1
        newsize = matlabpool('size') - 1;
        matlabpool close
        matlabpool(newsize)
    end
    %put the results of the current pass into the full result
    result(~succeeded) = partresult;
    succeeded(~succeeded) = partsucceeded;
end
After quite a bit of research, and a lot of trial and error, I think I may have a decent, compact answer. What you're going to do is:
Declare some max memory value. You can set it dynamically using the MATLAB function memory, but I like to set it directly.
Call memory inside your parfor loop, which returns the memory information for that particular worker.
If the memory used by the worker exceeds the threshold, cancel the task that worker was working on. Now, here it gets a bit tricky. Depending on the way you're using parfor, you'll need to delete or cancel either the task or the worker. I've verified that the code below works when there is one task per worker, on a remote cluster.
Insert the following code at the beginning of your parfor contents. Tweak as necessary.
memLimit = 280000000; %// This doesn't have to be in the parfor. Everything else does.
memData = memory;
if memData.MemUsedMATLAB > memLimit
    task = getCurrentTask();
    cancel(task);
end
Enjoy! (Fun question, by the way.)
One other option to consider: since R2013b, you can open a parallel pool with 'SpmdEnabled' set to false. This allows MATLAB worker processes to die without the whole pool being shut down; see the doc at http://www.mathworks.co.uk/help/distcomp/parpool.html . Of course, you still need to arrange somehow to shut down the workers.
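For example, a minimal sketch (the 'local' profile and the pool size of 4 are placeholders):
% workers in this pool may die individually without the
% whole pool being shut down
parpool('local', 4, 'SpmdEnabled', false);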

Kill a calculation programme after user defined time in R

Say my executable is c:\my directory\myfile.exe and my R script calls this executable with system("myfile.exe").
The R script gives parameters to the executable programme, which uses them to do numerical calculations. From the output of the executable, the R script then tests whether the parameters are good or not. If they are not good, the parameters are changed and the executable is rerun with updated parameters.
Now, as this executable carries out mathematical calculations whose solutions may converge only slowly, I wish to be able to kill the executable once it takes too long (say 5 seconds) to carry out the calculations.
How do I do this time-dependent kill?
PS:
My question is a little related to this one (a non-time-dependent kill):
how to run an executable file and then later kill or terminate the same process with R in Windows
You can add code to your R function which issued the executable call:
setTimeLimit(elapsed = 5, transient = TRUE)
This will kill the calling function, returning control to the parent environment (which could well be a function as well). Then use the examples in the question you linked to for further work.
Alternatively, set up a loop which examines Sys.time(), and if the expected update to the parameter set has not taken place after 5 seconds, break the loop and issue the system kill command to terminate myfile.exe; a sketch of this approach follows.
There might possibly be nicer ways, but it is a solution.
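A minimal sketch of that polling loop, assuming Windows and that myfile.exe signals completion by writing a hypothetical marker file result.txt:
## launch the executable without blocking the R session
system("myfile.exe", wait = FALSE)
start <- Sys.time()
## poll until the (hypothetical) output file appears or 5 seconds pass
while (difftime(Sys.time(), start, units = "secs") < 5) {
  if (file.exists("result.txt")) break
  Sys.sleep(0.1)
}
## time is up and no result: force-kill the process
if (!file.exists("result.txt")) {
  system('taskkill /im "myfile.exe" /f', show.output.on.console = FALSE)
}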
The assumption here is that myfile.exe successfully finishes its calculation within 5 seconds:
library(R.utils)  # provides evalWithTimeout()
try.wtl <- function(timeout = 5)
{
    y <- evalWithTimeout(system("myfile.exe"), timeout = timeout, onTimeout = "warning")
    if (inherits(y, "try-error")) NA else y
}
case 1 (myfile.exe is closed after successful calculation)
g <- try.wtl(5)
case 2 (myfile.exe is not closed after successful calculation)
g <- try.wtl(0.1)
MS-DOS taskkill is required in case 2 to recommence from the beginning:
if (is.null(g)) { system('taskkill /im "myfile.exe" /f', show.output.on.console = FALSE) }
PS: inspiration came from Time out an R command via something like try()
