I have the following situation: In my R script I start a third-party program with system2. The program is called lots of times, and unfortunately it is not very stable and crashes sometimes. If this happens, control is not returned to R until I kill the program via Task Manager manually.
What I would like to do: If the program has not returned control after 10 minutes, kill it automatically.
I could of course wrap the program in C++, Java or similar, implement this functionality in the wrapper, and call the wrapper from R. Quite possibly I could also utilize Rcpp.
However, I wonder if there is any way to achive this in R directly?
Btw: I am on Windows 7.
Thanks for any hints!
If you are using a unix-like system, you can add the unix command timeout to your system call. Example:
# system command that times out
> exitcode = system('timeout 1 sleep 20')
> exitcode
[1] 124
# system command that does not time out
> exitcode = system('timeout 2 sleep 1')
> exitcode
[1] 0
system returns the exit status of the process so you can check whether it is 0 (OK) or 124 (timed out).
Related
I am writing an R code on a Linux system using RStudio. At some point in the code, I need to use a system call to a command that will download a few thousand of files from the lines of a text file:
down.command <- paste0("parallel --gnu -a links.txt wget")
system(down.command)
However, this command takes a little while to run (a couple of hours), and the R prompt stays locked while the command runs. I would like to keep using R while the command runs on the background.
I tried to use nohup like this:
down.command <- paste0("nohup parallel --gnu -a links.txt wget > ~/down.log 2>&1")
system(down.command)
but the R prompt still gets "locked" waiting for the end of the command.
Is there any way to circumvent this? Is there a way to submit system commands from R and keep them running on the background?
Using ‘processx’, here’s how to create a new process that redirects both stdout and stderr to the same file:
args = c('--gnu', '-a', 'links.txt', 'wget')
p = processx::process$new('parallel', args, stdout = '~/down.log', stderr = '2>&1')
This launches the process and resumes the execution of the R script. You can then interact with the running process via the p name. Notably you can signal to it, you can query its status (e.g. is_alive()), and you can synchronously wait for its completion (optionally with a timeout after which to kill it):
p$wait()
result = p$get_exit_status()
Based on the comment by #KonradRudolph, I became aware of the processx R package that very smartly deals with system process submissions from within R.
All I had to do was:
library(processx)
down.command <- c("parallel","--gnu", "-a", "links.txt", "wget", ">", "~/down.log", "2>&1")
processx::process$new("nohup", down.comm, cleanup=FALSE)
As simple as that, and very effective.
I am queueing and running an R script on a HPC cluster via sbatch and mpirun; the script is meant to use foreach in parallel. To do this I've used several useful questions & answers from StackOverflow: R Running foreach dopar loop on HPC MPIcluster, Single R script on multiple nodes, Slurm: Use cores from multiple nodes for R parallelization.
It seems that the script completes, but a couple of strange things happen. The most important is that the slurm job keeps on running afterwards, doing nothing(?). I'd like to understand if I'm doing things properly. I'll first give some more specific information, then explain the strange things I'm seeing, then I'll ask my questions.
– Information:
R is loaded as a module, which also calls an OpenMPI module. The packages Rmpi, doParallel, snow, foreach were already compiled and included in the module.
The cluster has nodes with 20 CPUs each. My sbatch file books 2 nodes and 20 CPUs per node.
The R script myscript.R is called in the sbatch file like this:
mpirun -np 1 Rscript -e "source('myscript.R')"
My script calls several libraries in this order:
library('snow')
library('Rmpi')
library('doParallel')
library('foreach')
and then sets up parallelization as follows at the beginning:
workers <- mpi.universe.size() - 1
cl <- makeMPIcluster(workers, outfile='', type='MPI')
registerDoParallel(cl)
Then several foreach-dopar are called in succession – that is, each starts after the previous has finished. Finally
stopCluster(cl)
mpi.quit()
are called at the very end of the script.
mpi.universe.size() correctly gives 40, as expected. Also, getDoParWorkers() gives doParallelSNOW. The slurm log encouragingly says
39 slaves are spawned successfully. 0 failed.
starting MPI worker
starting MPI worker
...
Also, calling print(clusterCall(cl, function() Sys.info()[c("nodename","machine")])) from within the script correctly reports the node names shown in the slurm queue.
– What's strange:
The R script completes all its operations, the last one being saving a plot as pdf, which I do see and is correct. But the slurm job doesn't end, it remains in the queue indefinitely with status "running".
The slurm log shows very many lines with
Type: EXEC. I can't find any relation between their number and the number of foreach called. At the very end the log shows 19 lines with Type: DONE (which make sense to me).
– My questions:
Why does the slurm job run indefinitely after the script has finished?
Why the numerous Type: EXEC messages? are they normal?
There is some masking between packages snow and doParallel. Am I calling the right packages and in the right order?
Some answers to the StackOverflow questions mentioned above recommend to call the script with
mpirun -np 1 R --slave -f 'myscript.R'
instead of using Rscript as I did. What's the difference? Note that the problems I mentioned remain even if I call the script this way, though.
I thank you very much for your help!
Is there a simple way to trigger a crash in R? This is for testing purposes only, to see how a certain program that uses R in the background reacts to a crash and help determine if some rare problems are due to crashes or not.
The easiest way is to call C-code. C provides a standard function abort()[1] that does what you want. You need to call: .Call("abort").
As #Phillip pointed out you may need to load libc via:
on Linux, dyn.load("/lib/x86_64-linux-gnu/libc.so.6") before issuing .Call("abort"). The path may of course vary depending on your system.
on OS X, dyn.load("/usr/lib/libc.dylib")
on Windows (I just tested it on XP as I could not get hold of a newer version.) you will need to install Rtools[2]. After that you should load dyn.load("C:/.../Rtools/bin/cygwin1.dll").
There is an entire package on GitHub dedicated to this:
crash
R package that purposely crash an R session. WARNING: intended
for test.
How to install a package from github is covered in other questions.
I'm going to steal an idea from #Spacedman, but I'm giving him full conceptual credit by copying from his Twitter feed:
Segfault #rstats in one easy step:
options(device=function(){});plot(1)
reported Danger, will crash your R session.
— Barry Rowlingson (#geospacedman) July 16, 2014
As mentioned in a comment to your question, the minimal approach is a simple call to the system function abort(). One way to do this in one line is to
R> Rcpp::cppFunction('int crashMe(int ignored) { ::abort(); }');
R> crashMe(123)
Aborted (core dumped)
$
or you can use the inline package:
R> library(inline)
R> crashMe <- cfunction(body="::abort();")
R> crashMe()
Aborted (core dumped)
$
You can of course also do this outside of Rcpp or inline, but then you need to deal with the system-dependent ways of compiling, linking and loading.
I'll do this in plain C because my C++-foo isn't Dirkian:
Create a C file, segv.c:
#include <signal.h>
void crashme(){raise(SIGSEGV);}
Compile it at the command line (windows users will have to work this out for themselves):
R CMD SHLIB segv.c
In R, load and run:
dyn.load("segv.so") # or possibly .dll for Windows users
.C("crashme")
Producing a segfault:
> .C("crashme")
*** caught segfault ***
address 0x1d9e, cause 'unknown'
Traceback:
1: .C("crashme")
Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace
Selection: 1
aborting ...
Segmentation fault
This is the same behaviour as the one Thomas references in the graphics system bug report which I have filed and might get fixed one day. However this two-liner will always raise a segfault...
Maybe Dirk can one-line-Rcpp-ise it?
If you want to crash your R, try this
lapply("", function(x) eval(sys.call(1)))
(Save everything before running because this immediately results in "R Session Aborted")
Edit: This works for me on Windows 10.
I want R to shutdown my computer after my (extensive) simulation and saving results, is this possible?
Yes, look at the function shutdown in the package fun.
The flags for system command shutdown depends on your operating system, the function simply calls the appropriately flagged command.
fun::shutdown
function (wait = 0)
{
Sys.sleep(wait)
ifelse(.Platform$OS.type == "windows", shell("shutdown -s -t 0"),
system("shutdown -h now"))
}
R can send commands to the system with ?system, and so whatever is required for Windows can be done with that:
http://www.microsoft.com/resources/documentation/windows/xp/all/proddocs/en-us/shutdown.mspx?mfr=true
R has a .Last() function controlled by quit() (or q()) with its runLast argument, so this is where you would send the shutdown commands via system, so that it occurs after quitting R. Saving objects with R is done with save or save.image, though there is a default to save as well with quit().
Executing
system("shutdown -s")
will shut your computer down. Just add it at the end of your script
I want R to shutdown my computer after my (extensive) simulation and saving results, is this possible?
Yes, look at the function shutdown in the package fun.
The flags for system command shutdown depends on your operating system, the function simply calls the appropriately flagged command.
fun::shutdown
function (wait = 0)
{
Sys.sleep(wait)
ifelse(.Platform$OS.type == "windows", shell("shutdown -s -t 0"),
system("shutdown -h now"))
}
R can send commands to the system with ?system, and so whatever is required for Windows can be done with that:
http://www.microsoft.com/resources/documentation/windows/xp/all/proddocs/en-us/shutdown.mspx?mfr=true
R has a .Last() function controlled by quit() (or q()) with its runLast argument, so this is where you would send the shutdown commands via system, so that it occurs after quitting R. Saving objects with R is done with save or save.image, though there is a default to save as well with quit().
Executing
system("shutdown -s")
will shut your computer down. Just add it at the end of your script