Execution of custom command exiting R session - r

Is it possible to run a command when exiting an R session, similar to the commands in the .Rprofile file but executed only on leaving the session?
I know, of course, that a .RData file can be saved automatically, but since I often switch machines, which might have different storage settings, it would be easier to execute a custom save.image() command per session.

The help for q can give some hints. You can either create a function called .Last or register a finalizer on an environment to run on exit.
> reg.finalizer(.GlobalEnv,function(e){message("Bye Bye")},onexit=TRUE)
> q()
Save workspace image? [y/n/c]: n
Bye Bye
You can register the finalizer in your R startup file (e.g. .Rprofile) if you want it to be fairly permanent.
[edit: previously I registered the finalizer on a new environment, but that means keeping this object around and not removing it, because garbage collection would trigger the finalizer. As I've now written it, the finalizer is hooked onto the Global Environment, which shouldn't get garbage collected during normal use.]
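The other route mentioned above, a function called .Last, might look like the following minimal sketch. The helper make_last and the backup path are hypothetical names chosen for illustration; q() only calls .Last when runLast = TRUE, which is its default.

```r
# Hypothetical helper: returns a function that saves the workspace to `path`.
make_last <- function(path) {
  function() {
    save.image(file = path)
    message("Workspace saved to ", path)
  }
}

# Assumed location; pick whatever suits the machine you are on.
# Defining .Last writes nothing yet; it runs when the session ends.
.Last <- make_last(path.expand("~/session-backup.RData"))
```

Putting these lines in .Rprofile would make the custom save happen at the end of every session started from that profile.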

Related

R not responding request to interrupt stop process

Every now and then I have to run a function that takes a lot of time, and I need to interrupt the processing before it's complete. To do so, I click the red "stop" sign at the top of the console in RStudio, which quite often returns the message below:
R is not responding to your request to interrupt processing so to stop the current operation you may need to terminate R entirely.
Terminating R will cause your R session to immediately abort. Active computations will be interrupted and unsaved source file changes and workspace objects will be discarded.
Do you want to terminate R now?
The problem is that I click "No" and then RStudio seems to freeze completely. I would like to know if others face a similar issue and if there is any way to get around this.
Is there a way to stop a process in RStudio quickly without losing the objects in the workspace?
Unfortunately, RStudio is currently not able to interrupt R in a couple of situations:
R is executing an external program (e.g. you cannot interrupt system("sleep 10")),
R is executing (for example) a C / C++ library call that doesn't provide R an opportunity to check for interrupts.
In such a case, the only option is to forcefully kill the R process -- hopefully this is something that could change in a future iteration of RStudio.
EDIT: RStudio v1.2 should now better handle interrupts in many of these contexts.
This can happen when R is not executing R code itself but is inside an external library call. The only option is to close the project window. Fortunately, unsaved changes, including objects, are retained when RStudio is opened again.

Delaying part of an R script inside of a loop

I'm executing a batch file inside an R script. I'd like to run this and another large section of the R script twice using a foreach loop.
foreach (i=1:2, .combine = rbind)%do%{
shell.exec("\\\\network\\path\\to\\batch\\script.ext")
*rest of the R script*
}
One silly problem, though, is that this batch file generates data, and that data is connected to SQL Server localdb inside the loop. I thought at first that the script would execute the batch file, wait for it to finish and then move on. However (it seems obvious in hindsight), the script instead executes the batch file, tries to grab data that hasn't been created yet (because the file isn't finished running) and then executes the batch file again before it finishes the first time.
I've been trying to find a way to delay the rest of the script from executing until the batch script has finished, but have not come up with anything yet. I'd appreciate any insights anyone has.
Use system2 instead of shell.exec. system2 calls are blocking, meaning the function waits until the external program has finished running. On most systems, this can be used directly to run scripts. On Windows, you may have to invoke rundll32 to execute a script:
cmd = c('rundll32.exe', 'Shell32.dll,ShellExecute', 'NULL', 'open', scriptpath)
# system2 takes the program and its arguments separately
system2(cmd[1], args = shQuote(cmd[-1]))
Windows users may use shell, which defaults to wait=TRUE, so R waits for the command to complete. You may choose whether or not to "intern" the result directly.
On unix-like systems, use system, which also defaults to wait=TRUE.
If your batch file simply launches another process and terminates, then it may need to be modified to either wait for completion or return a suitable process or file indicator that can be monitored.
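If the batch file cannot be changed to wait, one way to monitor a file indicator from R is to poll for it. This is only a sketch: wait_for_file, its default timeout and interval, and the idea that the batch job writes a marker file when done are all assumptions, not something stated in the question.

```r
# Hypothetical helper: block until `path` exists or `timeout` seconds pass.
# Returns TRUE if the file appeared, FALSE if the wait timed out.
wait_for_file <- function(path, timeout = 60, interval = 0.5) {
  deadline <- Sys.time() + timeout
  while (!file.exists(path)) {
    if (Sys.time() > deadline) {
      return(FALSE)
    }
    Sys.sleep(interval)  # avoid a busy loop while waiting
  }
  TRUE
}
```

Inside the loop, the rest of the script could then be guarded with something like if (!wait_for_file(marker)) stop("batch script did not finish in time"), where marker is whatever file the batch job is known to create last.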

Can programs access the global variables in a package?

I have a package with global variables that hold an open file (*os.File) and its associated logger.
On the other hand, I'm going to build several commands that use that package, and I don't want to open the file and set it as the logger every time I run a command.
So the first program to run will set the global variables, and here is my question:
Can the programs that use the package afterwards access those global variables without problems? A command could be created with one flag to initialize those values before they are used by other programs, and another flag to finish (unset the global variables in the package).
If that is not possible, what is the best option to avoid that I/O cost? Using a server on Unix sockets?
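Assuming by 'program' you actually mean 'process', the answer is no: each process has its own address space, so package-level variables set by one process are not visible to another.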
If you want to share a (customized perhaps) logging functionality between processes then I would consider a daemon-like (Go doesn't yet AFAIK support writing true daemons) process/server and any kind of IPC you see handy.

have R halt the EC2 machine it's running on

I have a few work flows where I would like R to halt the Linux machine it's running on after completion of a script. I can think of two similar ways to do this:
run R as root and then call system("halt")
run R from a root shell script (could run the R script as any user) then have the shell script run halt after the R bit completes.
Are there other easy ways of doing this?
The use case here is for scripts running on AWS, where I would like the instance to stop after script completion so that I don't get charged for machine time after the job has run. The instance I use for data analysis is EBS-backed, so I don't want to terminate it, simply suspend it. Issuing a halt command from inside the instance has the same effect as a stop/suspend from the AWS console.
I'm impressed that works. (For anyone else surprised that an instance can stop itself, see notes 1 & 2.)
You can also try "sudo halt", as you wouldn't need to run as a root user, as long as the user account running R is capable of running sudo. This is pretty common on a lot of AMIs on EC2.
Be careful about assuming that R will quit cleanly: believe it or not, one can crash R. It may be better to have a separate script that watches the R PID and, once that PID is no longer active, terminates the instance. Issuing the halt command inside R means that if R crashes, it never reaches the call to halt. Calling it from within another script can be dangerous, too. If you know Linux well, what you're looking for is the PID from starting R, which you can pass to another script that checks ps, say every second, and then terminates the instance once the PID is no longer running.
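The PID-watching idea can be sketched as below, run from a second R process (a shell script would do equally well). The names pid_alive and watch_pid are hypothetical, and the check assumes a Unix-like ps that accepts -p and exits non-zero when the PID is not found, as procps on Linux does.

```r
# TRUE while a process with this PID exists (Unix-like systems only).
pid_alive <- function(pid) {
  status <- system2("ps", c("-p", as.character(pid)),
                    stdout = FALSE, stderr = FALSE)
  status == 0
}

# Poll until the PID disappears, then run `action` (e.g. a halt call).
watch_pid <- function(pid, action, interval = 1) {
  while (pid_alive(pid)) {
    Sys.sleep(interval)
  }
  action()
}
```

A watcher might then call watch_pid(r_pid, function() system("sudo halt")), where r_pid is the PID recorded when the analysis job was started.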
I think a better solution is to use the EC2 API tools (see: http://docs.amazonwebservices.com/AWSEC2/latest/APIReference/ for documentation) to terminate OR stop instances. There's a difference between the two of these, and it matters if your instance is EBS backed or S3 backed. You needn't run as root in order to terminate the instance - the fact that you have the private key and certificate shows Amazon that you're the BOSS, way above the hoi polloi who merely have root access on your instance.
Because these credentials can be used for mischief, be careful about running API tools from a given server: you'll need your certificate and private key on that server, which is a bad idea in the event that you have a security problem. It would be better to send a message to a master server and have it shut down the instance. If you have messaging set up in any way between instances, this can do all the work for you.
Note 1: Eric Hammond reports that the halt will only suspend an EBS instance, so you still have storage fees. If you happen to start a lot of such instances, this can clutter things up. Your original question seems unclear about whether you mean to terminate or stop an instance. He has other good advice on this page
Note 2: A short thread on the EC2 developers forum gives advice for Linux & Windows users.
Note 3: EBS instances are billed for partial hours, even when restarted. (See this thread from the developer forum.) Having an auto-suspend close to the hour mark can be useful, assuming the R process isn't working, in case one might re-task that instance (i.e. to save on not restarting). Other useful tools to consider: setTimeLimit and setSessionTimeLimit, and various checkpointing tools (I have a Q that mentions a couple). Using an auto-kill is useful if one has potentially badly behaved code.
Note 4: I recently learned of the shutdown command in package fun. This is multi-platform. See this blog post for commentary, and code is here. Dangerous stuff, but it could be useful if you want to adapt to Windows. I haven't tried it, though.
Update 1. Three more ideas:
You could use .Last() and runLast = TRUE for q() and quit(), which could shut down the instance.
If using littler or a script that invokes the script via Rscript, the same command line functions could be used.
My favorite package of today, tcltk2, has a neat timer mechanism called tclTaskSchedule() that can be used to schedule the execution of an expression. You could then go crazy with the execution of stuff just before an hourly interval has elapsed.
system("echo 'rootpassword' | sudo -S halt")
However, the downside is having your password in plain text in the script (the -S flag is what makes sudo read the password from standard input).
AFAIK those ways you mentioned are the only ones. In any case the script will have to run as root to be able to shut down the machine (if you find a way to do it without root that's possibly an exploit). You ask for an easier way but system("halt") is just an additional line at the end of your script.
sudo is an option -- it can be configured to run certain commands without prompting for any password. Just put something like this in /etc/sudoers:
<username> ALL=(ALL) PASSWD: ALL, NOPASSWD: /sbin/halt
(of course replacing <username> with the name of the user running R) and system('sudo halt') should just work.
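With passwordless sudo in place, a script can also make sure the halt call still runs when the job throws an ordinary R error (this does not help if R itself crashes). A minimal sketch, in which run_then_halt and its halt argument are hypothetical names:

```r
# Run `job`, then always call `halt`, even if `job` signals an error.
# `halt` defaults to the sudo call discussed above; pass another
# function to try the wrapper without shutting anything down.
run_then_halt <- function(job, halt = function() system("sudo halt")) {
  tryCatch(
    job(),
    error = function(e) message("Job failed: ", conditionMessage(e)),
    finally = halt()
  )
}
```

For a dry run, something like run_then_halt(function() stop("boom"), halt = function() message("would halt now")) shows that the halt step is reached despite the error.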

How can I wrap an executable on UNIX (SunOS) so that it is never run more than once at the same time?

I have an executable (no source) that I need to wrap, to make sure that it is not called more than once at a time. I immediately think of some sort of queue wrapper, but how do I actually make it so that my wrapper is called instead of the executable itself? Is there a better way to do this? The solution needs to be invisible because the users are other applications. Any information/recommendations are appreciated.
Method 1: Put the executable in some location not in the standard path. Create a shell script that checks a sentinel file and, if the sentinel file is absent, creates it, executes the program, waits for the program to complete, then deletes the sentinel file. If the sentinel file is present, the script will enter a loop with a short delay (1 second? How long is the standard execution of this program? Take that and halve it), check the sentinel file again, and so on.
Method 2: Create a separate program that does the same thing as the script, but using a system-level semaphore or lock instead. You could even simply use a read/write lock on a file. The program would do a fork() and exec() on the real program, waiting for child exit before clearing the sentinel.
If the users are other applications, you can just rename the executable (e.g. name -> name.real) and call the wrapper with the original name. To make sure that it's only called once at a time, you can use the pidof command (e.g. pidof name.real) to check if the program is running already (pidof actually gives you the PID of the running process, so that you can use stuff such as kill or whatever to send signals to it).

Resources