Pass commands to a running R runtime

Is there a way to pass commands (from a shell) to an already running R runtime/R GUI, without copy and paste?
So far I only know how to call R from the shell with the -f or -e options, but in both cases a new R runtime processes the R script or R command I pass to it.
I would rather have an open R runtime waiting for commands passed to it over whatever connection is possible.

What you ask for cannot be done. R is single threaded and has a single REPL (read-eval-print loop), which is attached to a single input, e.g. the console in the GUI, or stdin if you pipe into R, but never to two at once.
Unless you use something else, e.g. the most excellent Rserve, which (when hosted on an OS other than Windows) can handle multiple concurrent requests over TCP/IP. You may, however, have to write your own client connection; examples for Java, C++ and R exist in the Rserve documentation.
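For example, here is a minimal sketch of driving a running Rserve instance from a second R session using the RSclient package; the host, port, and the evaluated expression are just placeholders:
# assumes an Rserve server has already been started, e.g. with
#   library(Rserve); Rserve()
library(RSclient)
con <- RS.connect(host = "localhost", port = 6311)  # 6311 is Rserve's default port
RS.eval(con, Sys.time())   # evaluated on the server side
RS.close(con)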

You can use Rterm (under C:\Program Files\R\R-2.10.1\bin on Windows with R 2.10.1). Or you can start R from the shell by typing "R" (if the shell does not recognize the command, you need to modify your path).

You could try simply saving the workspace from one session and manually loading it into the other (or any variation on this theme, such as saving only the objects you share between the two sessions with saveRDS or similar). That requires some extra save and load commands, but you could automate it further by adding a few lines to your .Rprofile file, which is executed at the beginning of every R session; see ?Startup for more detailed information about what R runs on startup. It all depends, of course, on what you are doing inside the R sessions.
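For instance, here is a minimal sketch of handing a single object from one session to another through a file (the path and object name are placeholders):
# session A: save the object you want to share
saveRDS(big_result, "C:/temp/shared_object.rds")

# session B: read it back in
big_result <- readRDS("C:/temp/shared_object.rds")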

Related

How can I permanently set an environment variable using Autotools?

I'm adapting an existing program to use Autotools for its build, but the resulting process depends on an environment variable. Is there a way to permanently set this environment variable during the build or installation process?
The program is intended to be used by Unix users. I could try appending an export command directly to the .bashrc file and warn the user if it fails, because most of them will actually just use Ubuntu to run it (it's a relatively simple program that targets students), but I'd like to know if there's a more portable way to do this.
That's what I wouldn't like to do:
echo 'export VAR=/my/totally/not/hardcoded/path' >> $HOME/.bashrc
Sorry to come to this late, but all of the answers to date are shockingly ... incomplete.
Building and installing software are both core use cases for the Autotools, and the installation part can absolutely involve adding or modifying files that affect user environments. If the software is installed by a user with sufficient privilege, then such effects can absolutely be applied to all system users, though the details may vary a bit from system to system (and the Autotools can help with that, too!).
For example, on Red Hat-family Linuxes such as Red Hat Enterprise Linux, Fedora, Oracle Linux, and various others, you can drop an appropriately named file in /etc/profile.d, and the commands in it will automatically be read and executed by every login shell. Setting environment variables for all users is one of the common uses of this feature. I'm uncertain about Debian-family Linuxes such as Ubuntu, but it is always possible to modify /etc/profile instead to the same effect, and you absolutely can write an Automake install hook to do that.
Or for an altogether different approach, you can always provide a wrapper script around your program that sets the needed environment variables (supposing that the point is other than to add a directory to the PATH so as to find the program in the first place). In that case, you can even install the main program in a location that is not ordinarily in the path, so that users don't accidentally run it directly. This mechanism has the advantage that the environment variables are scoped to a run of the program, not a whole login session, but the disadvantage that users cannot override them.
I guess not.
The Autotools are about building your program, not about setting up the environment for the program to run; that's what users/admins are supposed to do. (Well, I can imagine doing this, but I really don't want to try to figure it out, because the idea itself seems broken to me.)
If your program REALLY needs some environment variable at run time, then you should patch your sources so that the application tests whether the variable exists and falls back to a sensible default value if it doesn't. Another idea is to require a mandatory command-line switch to pass the value in.
It's not clear what this has to do with autotools (or any other build system). No build system, by itself, can arrange for an environment variable to be present when the program it builds is run at a later time.
One solution is for your program to have a hardcoded default value for the variable, used if the environment variable isn't present when the program starts running. Another frequently used solution is to name your binary something like myprog.bin and install a shell script named myprog that sets up the environment before doing exec myprog.bin.
I'm adapting an existing program to use Autotools for its build, but the resulting process depends on an environment variable. Is there a way to permanently set this environment variable during the build or installation process?
You've not been very concrete about what the program is (e.g. is it a daemon? A user program?) or the nature of the environment variable dependency (e.g. is it another program? A mount point? A URL? A DB connection string?). Being more specific might get you a better answer.
Anyway, autotools is not likely to offer any feature to help: It's a build system. Depending on the nature of your environment variable dependency, you're likely going to need package management (if you package it) or system administration level setup.
Since you think your primary user base is on Ubuntu this help page might give you some ideas.

use a batch script to run an R script interactively in the R shell

I can run an R script non-interactively from a batch file and save the output.
I can open an interactive R session using a batch file.
I can open R AND run a script from a batch file by doing one, then the other (but then the variables from the script are not available in the interactive session).
What I have not been able to do is to use a batch file to open an interactive R session and run a script from within that interactive session. How can I do this?
Regarding workarounds:
The purpose is to use the script to load and process a very large dataset daily, and then have it available for use during an interactive R session, so using a non-interactive session is not an option for this task. Likewise, I am aware that I can run an interactive session from the Windows command prompt, but for a variety of reasons I do not want to do that; I want everything loaded into the R shell for use there. I realize Task Scheduler would be useful for this, but unfortunately I do not have permissions to modify Task Scheduler. I am only allowed to use a batch file, which is then scheduled by IT.
I apologize if I simply lack the vocabulary to search for the answer to my question effectively and welcome answers directing me to previous questions.
Thank you.
One possibility would be to create an Rprofile.site script that sources a setup file which loads the data. Then, from your batch script, set the R_PROFILE environment variable for that session to point to this Rprofile.site script.
In your batch script,
@echo off
set R_PROFILE=C:\path\to\Rprofile.site
Rterm.exe
In Rprofile.site
.First <- function() {
  print("Loading some stuff")
  source("setup.R")
}
In setup.R
dat <- data.frame(x=1:10)
I guess you could do without the setup.R file and put it all in the .First function as well.
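For example, here is a minimal sketch of that all-in-.First variant; the data path is a placeholder for whatever loads your large daily dataset:
.First <- function() {
  message("Loading the daily dataset")
  # assign into the global environment so the object survives after .First() returns
  assign("dat", readRDS("C:/data/daily_extract.rds"), envir = .GlobalEnv)
}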

How to easily execute R commands on remote server?

I use Excel + R on Windows on a rather slow desktop. I have full admin access to a very fast Ubuntu-based server. I am wondering: how can I remotely execute commands on the server?
What I can do is save the needed variables with saveRDS, load them on the server with readRDS, execute the commands on the server, and then save the results and load them back on Windows.
But it is all very interactive and manual, and can hardly be done on a regular basis.
Is there any way to do the stuff directly from R, like
Connect with the server via e.g. ssh,
Transfer the needed objects (which can be specified manually)
Execute given code on the server and wait for the result
Get the result.
I could run the whole of R remotely, but then it would create network-related problems. Most R commands I run from within Excel are very fast and data-hungry. I just need to remotely execute some specific commands, not all of them.
Here is my setup.
Copy your code and data over using scp. (I use GitHub, so I clone my code from GitHub; this has the benefit of making sure my work is reproducible.)
(optional) Use sshfs to mount the remote folder on your local machine. This allows you to edit the remote files using your local text editor instead of ssh command line.
Put all things you want to run in an R script (on the remote server), then run it via ssh in R batch mode.
There are a few options; the simplest is to exchange secure keys to avoid entering SSH/SCP passwords manually all the time. After this, you can write a simple R script that will:
Save the necessary variables into a data file,
Use scp to upload the data file to the Ubuntu server,
Use ssh to run a remote script that processes the data (which you have just uploaded) and stores the result in another data file,
Again, use scp to transfer the results back to your workstation.
You can use R's system command to run scp and ssh with the necessary options.
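For example, here is a minimal sketch of that round trip, assuming passwordless ssh is already set up; the host name, paths, and the remote script process.R are placeholders:
# save the inputs, ship them over, run the remote script, fetch the result
saveRDS(mydata, "input.rds")
system("scp input.rds user@server:/home/user/input.rds")
system("ssh user@server 'Rscript /home/user/process.R'")  # process.R is assumed to write result.rds
system("scp user@server:/home/user/result.rds result.rds")
result <- readRDS("result.rds")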
Another option is to set up a cluster worker on the remote machine; then you can export data using clusterExport and evaluate expressions using clusterEvalQ and clusterApply.
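A hedged sketch of that approach with the parallel package (the host name "server", the user, and the object names are placeholders; it assumes passwordless ssh and R installed on the remote machine):
library(parallel)
cl <- makePSOCKcluster("server", user = "me")  # one worker started over ssh
clusterExport(cl, "mydata")                    # copy the local object to the worker
res <- clusterEvalQ(cl, summary(mydata))       # evaluated on the remote machine
stopCluster(cl)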
There are a few more options:
1) You can do the stuff directly from R by using Rserve. See: https://rforge.net/
Keep in mind that Rserve can accept connections from R clients; see, for example, how to connect to Rserve with an R client.
2) You can set up a cluster on your Linux machine and then use those cluster facilities from your Windows client. The simplest is to use snow, https://cran.r-project.org/package=snow; also see foreach and many other cluster libraries.

Can my CGI call R?

I know barely more than zero about R: until yesterday I didn't know how to spell it. But I'm suicidal: for my web site, I'm thinking about letting a visitor type in an R "program" (is it even called a "program"?) and then, at submit time, blindly calling the R interpreter from my CGI. I'd then return the interpreter's output to the visitor.
Does this make sense? Or does it amount to useless noise?
If it's workable, what are the pitfalls in this approach? For example, what are the security issues, if any? Is it possible to make R crash, killing my CGI program? Do I have to clean up the R code before calling the interpreter? And the like.
You could take a look at Rserve, which allows you to execute R scripts via a TCP/IP interface that is available in PHP, for example, if I'm not mistaken.
It's just asking for trouble to let people run arbitrary R code on your server. You could try running it in a chroot jail, but these things can be broken out of. Even in a chroot, the R process could delete or alter files, spawn a long-running process, download a file to your server, and commit all manner of other nastiness.
You might look at Rweb, which has exactly this behavior: http://www.math.montana.edu/Rweb/
Since you can read and write files from R, it would not be safe to let people run arbitrary R code on your server. I would check whether R has something like PHP's safe mode... If not, and if you are root, you can try running R as the user nobody in a chroot (you must also place the packages and libraries there for read-only access, and a temporary directory for read-write access).

monitoring for changes in file(s) in real time

I have a program that monitors certain files for changes. As soon as a file gets updated, it is processed. So far I've come up with this general approach to handling "real time analysis" in R. I was hoping you have other approaches; maybe we can discuss their advantages/disadvantages.
monitor <- TRUE
file_to_watch <- "data.csv"                    # placeholder for the monitored file
start.state <- file.info(file_to_watch)$mtime  # modification time of the file when initiating
while (monitor) {
  change.state <- file.info(file_to_watch)$mtime
  if (start.state < change.state) {
    # process the file, then remember the new modification time
    start.state <- change.state
  } else {
    print("Nothing new.")
  }
  Sys.sleep(sleep.time)
}
Similar to the suggestion to use a system API, this can also be done using qtbase, which gives a cross-platform approach from within R:
dir_to_watch <- "/tmp"
library(qtbase)
fsw <- Qt$QFileSystemWatcher()
fsw$addPath(dir_to_watch)
id <- qconnect(fsw, "directoryChanged", function(path) {
  message(sprintf("directory %s has changed", path))
})
cat("abc", file = "/tmp/deleteme.txt")  # writing a file in the watched directory triggers the handler
If your system provides an API for monitoring filesystem changes, then you should use that. I believe Macs come with this. Not sure about other platforms though.
Edit:
A quick Google search turned up:
Linux - http://wiki.linuxquestions.org/wiki/FAM
Win32 - http://msdn.microsoft.com/en-us/library/aa364417(VS.85).aspx
Obviously, these APIs will eliminate any polling that you require. On the other hand, they may not always be available.
Java has this: http://jnotify.sourceforge.net/ and http://java.sun.com/developer/technicalArticles/javase/nio/#6
I have a hack in mind: you can set up a cron job/scheduled task to run an R script every n seconds (or whatever). The R script checks the file hash, and if the hashes don't match, it runs the analysis. You can use the digest::digest function; just check out the manual.
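A minimal sketch of that hash check, meant to be run periodically from cron or the task scheduler; the file names and analysis.R are placeholders:
library(digest)
data_file  <- "data.csv"
state_file <- "last_hash.rds"

new_hash <- digest(data_file, file = TRUE)
old_hash <- if (file.exists(state_file)) readRDS(state_file) else ""

if (!identical(new_hash, old_hash)) {
  source("analysis.R")           # re-run the processing
  saveRDS(new_hash, state_file)  # remember what was last processed
}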
If you have lots of files that you want to monitor, then R may be too slow for this purpose. Go to your C:\ or / directory and see how long it takes to do file.info(dir(recursive = TRUE)). A DOS or bash script may be quicker.
Otherwise, the code looks fine.
You could use the tclTaskSchedule function in the tcltk2 package to set up a function that checks for updates and runs your code. This would then be run on a regular basis (you set the timing) but would still allow you to use your R session.
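A hedged sketch of that approach; the 60-second interval and the body of check_for_updates() are placeholders:
library(tcltk2)
check_for_updates <- function() {
  # e.g. compare file.info()$mtime or a hash here and re-run the processing if it changed
  message("checked at ", Sys.time())
}
# re-run every 60000 ms without blocking the interactive session
tclTaskSchedule(60000, check_for_updates(), id = "fileCheck", redo = TRUE)
# tclTaskDelete("fileCheck")  # to stop it later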
I'll offer another solution for Windows that I have been using in a production environment; it works perfectly and I find it very easy to set up. Under the hood it accesses the system API for monitoring folder changes, as others have mentioned, but all the "hard work" is taken care of for you. I use a freely available piece of software called Folder Monitor by Nodesoft, well described here. Once you run this program, it appears in your system tray, and from there you can specify a given directory to monitor. When files are written to the directory (or changed or modified; there are a few options from which you can choose), the program executes any program you like. I simply point it at a Windows batch file that calls my R script.
For example, I have Folder Monitor set up to monitor a "\\myservername\DropOff" UNC path for any new data files written to it. When Folder Monitor detects new files, it executes a RunBatch.bat file that simply runs an R script (see here for information on setting that up) which validates the format of the expected file based on an expected naming convention, then unzips and processes the data, creating a data frame, and ultimately loads that into a SQL Server database. It just doesn't get any easier.
One note if you decide to use this solution: take a look at the optional delay execution parameter, which might be important if files take a while to copy into the target directory from the source location.
