Kill a calculation program after a user-defined time in R

Say my executable is c:\my directory\myfile.exe and my R script calls this executable with system("myfile.exe").
The R script passes parameters to the executable, which uses them to do numerical calculations. From the output of the executable, the R script then tests whether the parameters are good or not. If they are not good, the parameters are changed and the executable is rerun with the updated parameters.
Now, since this executable carries out mathematical calculations whose solutions may converge only slowly, I wish to be able to kill the executable once it takes too long (say 5 seconds) to carry out the calculations.
How do I do this time-dependent kill?
PS:
My question is a little related to this one (a time-independent kill):
how to run an executable file and then later kill or terminate the same process with R in Windows

You can add code to your R function which issued the executable call:
setTimeLimit(elapsed = 5, transient = TRUE)
This will kill the calling function, returning control to the parent environment (which could well be a function as well). Then use the examples in the question you linked to for further work.
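For instance, a minimal sketch, where run_calc() is a hypothetical wrapper around the system() call (note that the limit is only checked at R-level interrupt points, so a blocking system() call may not be stopped until control returns to R):
run_calc <- function(params) {
  # 'params' is assumed to be a single string of command-line arguments
  setTimeLimit(elapsed = 5, transient = TRUE)  # transient: limit lifts once this call returns
  system(paste("myfile.exe", params))
}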
Alternatively, set up a loop that examines Sys.time() and, if the expected update to the parameter set has not taken place after 5 seconds, breaks out and issues the system kill command to terminate myfile.exe, as sketched below.
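A minimal sketch of that polling approach, assuming myfile.exe writes its result to a file when done (the file name result.txt is hypothetical):
system("myfile.exe", wait = FALSE)            # launch without blocking
start <- Sys.time()
while (!file.exists("result.txt")) {
  if (difftime(Sys.time(), start, units = "secs") > 5) {
    system('taskkill /im "myfile.exe" /f')    # Windows-only forced kill
    break
  }
  Sys.sleep(0.1)                              # avoid busy-waiting
}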

There may well be nicer ways, but here is one solution. The assumption is that myfile.exe normally finishes its calculation within 5 seconds.
library(R.utils)  # provides withTimeout(), called evalWithTimeout() in older versions
try.wtl <- function(timeout = 5)
{
  # returns the exit status of myfile.exe, or NULL (plus a warning) on timeout
  withTimeout(system("myfile.exe"), timeout = timeout, onTimeout = "warning")
}
Case 1 (myfile.exe closes after a successful calculation):
g <- try.wtl(5)
Case 2 (myfile.exe does not close after a successful calculation):
g <- try.wtl(0.1)
The Windows taskkill command is required in case 2 to start again from the beginning:
if (is.null(g)) system('taskkill /im "myfile.exe" /f', show.output.on.console = FALSE)
PS: inspiration came from Time out an R command via something like try()

Related

Run command with `system2(..., wait = FALSE)` in R and kill it later

Assume I want to start a local server from R, access it a few times for calculations, and kill it again at the end. So:
## start the server
system2("sleep", "100", wait = FALSE)
## do some work
# Here I want to kill the process started at the beginning
This should be cross-platform (Mac, Linux, Windows, ...), as it is part of a package.
How can I achieve this in R?
EDIT 1
The command I have to run is a Java jar:
system2("java", "... plantuml.jar ...")
Use the processx package.
proc <- processx::process$new("sleep", c("100"))
proc
# PROCESS 'sleep', running, pid 431792.
### ... pause ...
proc
# PROCESS 'sleep', running, pid 431792.
proc$kill()
# [1] TRUE
proc
# PROCESS 'sleep', finished.
proc$get_exit_status()
# [1] 2
The finished output is just an indication that the process has exited, not whether it succeeded or failed. Use the exit status, where 0 indicates a clean exit.
proc <- processx::process$new("sleep", c("1"))
Sys.sleep(1)
proc
# PROCESS 'sleep', finished.
proc$get_exit_status()
# [1] 0
FYI, base R's system and system2 work fine when you don't need to kill the process easily and there are no embedded spaces in the command or its arguments. system2 looks like it should handle embedded spaces better (since it accepts a vector of arguments), but under the hood all it does is
command <- paste(c(shQuote(command), env, args), collapse = " ")
and then
rval <- .Internal(system(command, flag, f, stdout, stderr, timeout))
which does nothing to protect the arguments.
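If you stick with base R, one workaround is to shell-quote each argument yourself before that paste happens; a minimal sketch for a Unix-alike shell (the file name is hypothetical):
# shQuote() wraps each element so embedded spaces survive the paste above
system2("ls", shQuote(c("-l", "my dir/file with spaces.txt")))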
You said "do some work", which suggests that you need to pass something to/from it periodically. processx does support writing to the standard input of the background process, as well as capturing its output. Its documentation at https://processx.r-lib.org/ is rich with great examples on this, including one-time calls for output or a callback function.

Julia Distributed, redundant iterations appearing

I ran
mpiexec -n $nprocs julia --project myfile.jl
on a cluster, where myfile.jl has the following form
using Distributed; using Dates; using JLD2; using LaTeXStrings
@everywhere begin
    using SharedArrays; using QuantumOptics; using LinearAlgebra; using Plots; using Statistics; using DifferentialEquations; using StaticArrays
    # defining some other functions and SharedArrays to be used later, e.g.
    MySharedArray = SharedArray{SVector{Nt,Float64}}(Np, Np)
end
@sync @distributed for pp in 1:Np^2
    for jj in 1:Nj
        # do some stuff with local variables
        for tt in 1:Nt
            # do some stuff with local variables
        end
    end
    MySharedArray[pp] = ...  # using linear indexing
    println("$pp finished")
end
timestr = Dates.format(Dates.now(), "yyyy-mm-dd-HH:MM:SS")
filename = "MyName" * timestr
@save filename*".jld2"
# later on, some other small stuff like making and saving a figure. (This does give an error "no method matching heatmap_edges(::Surface{Array{Float64,2}}, ::Symbol)", but I think that is a technicality about Plots and not very related to the bigger issue here.)
However, when looking at the output, a few issues make me conclude that something is wrong:
The "$pp finished" output is repeated many times for each value of pp. The number of repetitions appears to equal 32 = $nprocs.
Despite the code not having finished, "MyName" files are generated. There should be just one, but I get a dozen of them with different timestr components.
EDIT: two more things that I can add
the output of the different "MyName" files is not identical, but this is expected since random numbers are used in the inner loops. There are 28 of them, a number I don't recognize except that it is again close to the 32 of $nprocs.
earlier I wrote that the walltime was exceeded, but this turns out not to be true. The .o file ends with "BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES ... EXIT CODE: 9" shortly after the last output file.
$nprocs is obtained in the pbs script through
#PBS -l select=1:ncpus=32:mpiprocs=32
nprocs=`cat $PBS_NODEFILE | wc -l`
As pointed out by adamslc on the Julia Discourse, launching Julia with mpiexec starts $nprocs independent Julia processes that each run the entire script (Distributed.jl is not MPI-aware), which explains the repeated "$pp finished" lines and the multiple output files. The proper way to use Julia on a cluster is to either
start a session with one core from the job script and add more workers with addprocs() in the Julia script itself, or
use more specialized Julia packages (for example, ClusterManagers.jl).
https://discourse.julialang.org/t/julia-distributed-redundant-iterations-appearing/57682/3

Opening a new instance of R and sourcing a script within that instance

Background/Motivation:
I am running a bioinformatics pipeline that, if executed linearly from beginning to end, takes several days to finish. Fortunately, some of the tasks don't depend on each other, so they can be performed individually. For example, Tasks 2, 3, and 4 all depend on the output from Task 1 but do not need information from each other. Task 5 uses the output of 2, 3, and 4 as input.
I'm trying to write a script that will open new instances of R for each of the three tasks and run them simultaneously. Once all three are complete I can continue with the remaining pipeline.
What I've done in the past, for more linear workflows, is have one "master" script that sources (source()) each task's subscript in turn.
I've scoured SO and Google and haven't been able to find a solution to this particular problem. Hopefully you guys can help.
From within R, you can run system() to invoke commands in a terminal, including macOS's open command to open files or applications. For example, the following opens a new Terminal instance:
system("open -a Terminal .",wait=FALSE)
Similarly, I can start a new R session by using
system("open -a r .")
What I can't figure out for the life of me is how to set the "input" argument so that it sources one of my scripts. For example, I would expect the following to open a new Terminal instance, call R within the new instance, and then source the script:
system("open -a Terminal .",wait=FALSE,input=paste0("r; source(\"/path/to/script/M_01-A.R\",verbose=TRUE,max.deparse.length=Inf)"))
Answering my own question in the event someone else is interested down the road.
After a couple of days of working on this, I think the best way to carry out this workflow is to not limit myself to working just in R. Writing a bash script offers more flexibility and is probably a more direct solution. The following example was suggested to me on another website.
#!/bin/bash
# Run task 1
Rscript Task1.R
# now run the three jobs that use Task1's output
# we can fork these using '&' to run in the background in parallel
Rscript Task2.R &
Rscript Task3.R &
Rscript Task4.R &
# wait until background processes have finished
wait %1 %2 %3
Rscript Task5.R
You might be interested in the future package (I'm the author). It allows you to write your code as:
library("future")
v1 %<-% task1(args_1)
v2 %<-% task2(v1, args_2)
v3 %<-% task3(v1, args_3)
v4 %<-% task4(v1, args_4)
v5 %<-% task5(v2, v3, v4, args_5)
Each of those v %<-% expr statements creates a future based on the R expression expr (and all of its dependencies) and assigns it to a promise v. Only when v is used will it block and wait for the value of v to become available.
How and where these futures are resolved is decided by the user of the above code. For instance, by specifying:
library("future")
plan(multiprocess)
at the top, then the futures (= the different tasks) are resolved in parallel on your local machine. If you use,
plan(cluster, workers = c("n1", "n3", "n3", "n5"))
they're resolved on those machines (where n3 accepts two concurrent jobs).
This works on all operating systems (including Windows).
If you have access to an HPC cluster with a scheduler such as Slurm, SGE, or TORQUE / PBS, you can use the future.BatchJobs package, e.g.
plan(future.BatchJobs::batchjobs_torque)
PS. One reason for creating future was to do large-scale bioinformatics in parallel / distributed fashion.

I want to run R code in batch and wait for user input

I want to run a program in batch mode and have it wait for user input.
This is what I have:
library('XLConnect')
FolderPath<-"C:/Documents/Testing/"
load(paste('C:/Documents/Testing/Program.ml',sep=''));Run()
This code is named Test.R, and it calls a compiled file that runs a function that does what I need. I need some inputs for this exercise, so I wanted to ask the user to enter some dates.
The program should do something like this:
What are the years for this simulation?
Then the user enter:
>2001, 2002, 2003, 2004
The program will save this vector in a variable, let's call it y.
Then, when I load the compiled function, I will use y.
The problem is that I run this R code in cmd batch.
Yukio, try the function readline (documentation here) as in the following example, in which a user inputs two years separated by a comma.
> years = readline('What are the years for this simulation? ')
What are the years for this simulation? 2001, 2002
> years = as.numeric(unlist(strsplit(years, ",")))
> years
[1] 2001 2002
By the way, your English is great!
As far as I understand (someone please correct me if this is wrong), you can't send messages to the user at the command line when executing a script with R CMD BATCH (it writes all input and output to an .Rout file). Executing a script with Rscript.exe, however, does let you write to stdout. To get input from a user when executing with Rscript.exe, you need to use file('stdin') as the connection for your input function, which, funnily enough, won't work if you then try to run the script from the R interpreter with the source function. Here's an example showing how to prompt the user for a series of comma-separated years, regardless of whether the script runs with Rscript.exe or with the source function in interactive mode.
con <- if (interactive()) stdin() else file('stdin')
message('what are the years')
years <- scan(file=con, sep=',', nlines=1, quiet=TRUE)
print(years)
You can deal with user input easily in a batch file rather than in R.
Create a batch file, say launcher.bat, like this for example:
ECHO I am Rscript launcher
set /p years= What are the years for this simulation?
cd R_SCRIPT_PATH
Rscript yourscript.R %years%
The user can enter as many years as they want; they all go into the variable years. Then, in your script, you parse this variable into a valid list of years, as sketched below.
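A minimal sketch of the parsing side in yourscript.R (splitting on commas and/or spaces is an assumption about how the user types the years):
# read the years passed on the command line by launcher.bat
args <- commandArgs(trailingOnly = TRUE)
# split on commas and/or spaces, then convert to numeric
years <- as.numeric(unlist(strsplit(paste(args, collapse = " "), "[ ,]+")))
print(years)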

How to successfully interrupt R from tcltk scripting widget?

I am trying to create a scripting widget for R using the tcltk package, but I don't know how to create a STOP button to interrupt a script running in the widget. Basically, I would like to have a button, a menu option, and/or a key binding that will interrupt the current script execution, but I can't figure out how to make it work.
One (non-ideal) strategy is to just use the RGui STOP button (or <ESC> or <Ctrl-c> at the console), but this seems to cause the tk widget to hang permanently.
Here's a minimal example of the widget based on the tcl/tk examples (http://bioinf.wehi.edu.au/~wettenhall/RTclTkExamples/evalRcode.html):
require(tcltk)
tkscript <- function() {
  tt <- tktoplevel()
  txt <- tktext(tt, height = 10)
  tkpack(txt)
  run <- function() {
    code <- tclvalue(tkget(txt, "0.0", "end"))
    e <- try(parse(text = code))
    if (inherits(e, "try-error")) {
      tkmessageBox(message = "Syntax error", icon = "error")
      return()
    }
    print(eval(e))
  }
  tkbind(txt, "<Control-r>", run)
}
tkscript()
In the scripting widget, if you execute Sys.sleep(20) and then interrupt from the console, the widget hangs. The same thing happens if you run, for example, an infinite loop like while(TRUE) 2+2.
I think what I'm experiencing may be similar to the bug reported here: https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=14730
Also, I should mention that I'm running this on R 3.0.0 on Windows (x64), so maybe the problem is platform-specific.
Any thoughts on how to interrupt the running script without causing the widget to hang?
It depends on what the script is doing; a script that sits waiting for the user to do something is easy to interrupt (since you can make it listen for your interruption message), but a script that is in the middle of an intensive loop is rather more tricky. The possible solutions depend on the version of Tcl inside.
Interpreter Cancellation — Requires 8.6
If you are using Tcl 8.6, you can use interpreter cancellation to stop the script. All you have to do is arrange for:
interp cancel -unwind
to be run, and the script will return control back to you. A reasonable way of doing this would be to use the extra Tcl package TclX (or Expect) to install a signal handler that will run the command when a signal is received:
package require Tcl 8.6
package require TclX
# Our signal handler
proc doInterrupt {} {
    # Print a message so you can see what's happening
    puts "It goes boom!"
    # Unwind the stack back to the R code
    interp cancel -unwind
}
# Install it...
signal trap sigint doInterrupt
# Now evaluate the code which might try to run forever
Adding signal handling in earlier versions is possible, but not quite as easy, since you can't guarantee that things will return control to you so cleanly; the stack unwinding isn't there.
Execution Time Limiting — Requires 8.5 (or 8.6)
The other thing you could try is setting an execution time limit on a slave interpreter and running the user script in that slave. The time limit machinery will then guarantee a trap back to you every so often, giving you a chance to check for interruption and a way to do the stack unwinding. This is a considerably more complex method.
proc nextSecond {} {
    clock add [clock seconds] 1 second
}
interp create child
proc checkInterrupt {} {
    if {["decide if the R code wanted an interrupt"]} {
        # Do nothing
        return
    }
    # Reset the time limit to another second ahead
    interp limit child time -seconds [nextSecond]
}
interp limit child time -seconds [nextSecond] -command checkInterrupt
interp eval child "the user script"
Think of this mechanism being a lot like how an operating system works, and yes, it can stop a tight loop.
Use a Subprocess — Any version of Tcl
The most portable mechanism is to run the script in a subprocess (with the tclsh program; the exact name varies by version, platform, and distribution, but it's all variations on that) and just kill off that subprocess when it is no longer wanted, e.g. with pskill. The downside is that you cannot (easily) carry any state over from one execution to another; subprocesses are pretty isolated from each other. The other methods described above can leave the state accessible from another run: they do a real interrupt, whereas this destroys everything.
Also, I don't know exactly how to start the subprocess in such a way that you can communicate with it from R while it is still running; system and system2 don't seem to give quite enough control, and hacking something together with forking is non-portable. An R expert is needed here. Alternatively, use a Tcl script (running inside the R process) to do it with:
set executable "tclsh"; # Adjust this line
set scriptfile "file/where/you/put/the_user/script.tcl"
# Open a bi-directional pipe to talk to the subprocess
set pipeline [open |[list $executable $scriptfile] "r+"]
# Get the subprocess's PID
set thePID [pid $pipeline]
That is actually reasonably portable to Windows (if not perfectly so) but intermediate states with forking are not.
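From the R side, the processx package (used earlier on this page) can fill that gap; a minimal sketch, where the tclsh name and script path are assumptions:
library(processx)
# start tclsh on the user script with both standard streams piped
p <- process$new("tclsh", "the_user_script.tcl", stdin = "|", stdout = "|")
p$write_input("some input\n")   # talk to the running subprocess
p$read_output_lines()           # read whatever it has printed so far
p$kill()                        # kill it off when no longer wanted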
I seem to have found a solution that prevents the tk widget from hanging: embed the eval in a tryCatch that handles interrupt conditions. Unfortunately, it requires interruption from the console rather than from the widget, but it does work. tryCatch is pretty poorly documented, so I'm putting this out here in case anyone else has similar needs in the future.
require(tcltk)
tkscript <- function() {
  tt <- tktoplevel()
  txt <- tktext(tt, height = 10)
  tkpack(txt)
  run <- function() {
    code <- tclvalue(tkget(txt, "0.0", "end"))
    e <- try(parse(text = code))
    if (inherits(e, "try-error")) {
      tkmessageBox(message = "Parse error", icon = "error")
      tkfocus(txt)
      return()
    }
    # tryCatch() handles both errors and console interrupts, so the widget
    # no longer hangs when the evaluation is interrupted
    e <- tryCatch(eval(e),
                  error = function(errmsg)
                    tkmessageBox(message = as.character(errmsg), icon = "error"),
                  interrupt = function(errmsg)
                    tkmessageBox(message = as.character(errmsg), icon = "error"))
    print(e)
  }
  tkbind(txt, "<Control-r>", run)
}
tkscript()
Another strategy I stumbled across is using tools::pskill (in the form pskill(Sys.getpid(), SIGINT) as a tk menu option) to interrupt the process, but - at least on Windows - this terminates the entire R process, including the tk widget. So it's not a great solution, but at least it exits everything as an absolute fallback.
