Question regarding RStudio. Suppose I am running some code in the console:
> code1()
Assume that code1() prints nothing to the console but takes an hour to complete. I want to work on something else while I wait for code1(). Is that possible? Is there a function like runInBackground that I could use as follows:
> runInBackground(code1())
> code2()
The alternatives are running two RStudio instances or writing a batch file that uses Rscript to run code1(), but I wanted to know if there is something easier that I can do without leaving the RStudio console. I browsed through R's help documentation but didn't come up with anything (or maybe I didn't use the proper keywords).
The future package (I'm the author) provides this:
library("future")
plan(multisession)
f <- future(code1())  # keep the handle so the result can be fetched later with value(f)
code2()
FYI, if you use
plan(cluster, workers = c("n1", "n3", "remote.server.org"))
then the future expression is resolved on one of those machines. Using
plan(future.BatchJobs::batchjobs_slurm)
will cause it to be resolved via a Slurm job scheduler queue. (The newer future.batchtools package provides an equivalent batchtools_slurm backend.)
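To make the non-blocking behaviour concrete, here is a minimal sketch of checking on a future and collecting its result. The body of code1() is a hypothetical stand-in for the hour-long job; resolved() and value() are the future package's functions for polling and for fetching the result.

```r
library(future)
plan(multisession)

# Hypothetical stand-in for the long-running job from the question.
code1 <- function() {
  Sys.sleep(2)
  42
}

f <- future(code1())   # returns immediately; code1() runs in a background R session

# The console is free here, so other work (code2() in the question) can run.
done <- resolved(f)    # non-blocking check: TRUE once the future has finished

result <- value(f)     # blocks until code1() has finished, then returns its value

plan(sequential)       # shut the background workers down again
```

If the result is never needed, the call to value() can be skipped, but keeping the handle f is what makes it possible to find out later whether the job succeeded.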
This question is closely related to "Run asynchronous function in R".
You can always do this, which is not ideal but works for most purposes (shell() exists only on Windows; on other systems use system() with wait = FALSE):
shell(cmd = 'Rscript.exe some_script.R', wait=FALSE)
RStudio as of version 1.2 provides this feature. To run a script in the background select "Start Job" in the "Jobs" panel. You also have the option of copying the background job result into the working environment.
The mcparallel() function in the parallel package will do the trick if you are on Linux or another Unix-like system (it relies on forking, so it is not available on Windows):
library(parallel)
Job1 = mcparallel(code1())
JobResult1 = mccollect(Job1)
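Since mccollect() blocks by default, a small sketch of polling without blocking may help. It assumes a Unix-like system, and code1() here is a hypothetical stand-in for the real job.

```r
library(parallel)

# Hypothetical stand-in for the long-running job.
code1 <- function() {
  Sys.sleep(1)
  "done"
}

job <- mcparallel(code1())              # forks a child process and returns at once

# Non-blocking poll: NULL while the child is still running.
partial <- mccollect(job, wait = FALSE)

# ... other work can happen here ...

collected <- mccollect(job)             # blocks until the child has finished
result <- collected[[1]]                # mccollect() returns a list, one entry per job
```

Forking from inside a GUI such as RStudio is generally discouraged in the parallel documentation, so this form is safest in R run from a terminal.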
I wrote an R function that updates the version number of a package in another question. I work a lot with GitHub and RStudio, and it would save me quite some time (plus be much more precise) if this function were run automatically every time I opened a certain project (or better yet, every time I make a git commit/push, but I assume that is harder to do). But I don't know how to do this or whether it is even possible.
I could use .Rprofile to run R code every time I start R, so I could just update versions whenever I start R (or build in a check so that it only updates the version if the date is not today, or something similar), but that seems like overdoing it.
You can make a separate .Rprofile for each project. You have to put it in the main directory of the project (http://www.rstudio.com/ide/docs/using/projects).
Well, I would use .Rprofile for that. There is something to be said for being independent of the tool chain around you: knitr works from RStudio as well as without it, ditto for Rcpp/RInside etc.
You can hook into commit hooks for svn, either explicitly via hooks in the back end or simply at your end by adding wrapper scripts. I presume you can do likewise with git, but I simply know much less about it. So to abstract this away, I would write myself a 'commitThis' or 'pushThis' or ... function that does the number increment, test run, code push and what have you.
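A wrapper of that kind could look roughly like the sketch below. Both function names (bump_version, commit_this) are hypothetical, the version scheme is assumed to be dot-separated integers with the last component incremented, and the git call is a plain system2() invocation that assumes git is on the PATH.

```r
# Increment the last component of the Version field in a DESCRIPTION file.
# Note: write.dcf() rewrites the whole file and may re-wrap long fields.
bump_version <- function(desc_path = "DESCRIPTION") {
  desc <- read.dcf(desc_path)
  old <- unname(desc[1, "Version"])
  parts <- as.integer(strsplit(old, "[.-]")[[1]])
  parts[length(parts)] <- parts[length(parts)] + 1L
  desc[1, "Version"] <- paste(parts, collapse = ".")
  write.dcf(desc, desc_path)
  invisible(unname(desc[1, "Version"]))
}

# Hypothetical 'commitThis'-style wrapper: bump the version, then commit.
commit_this <- function(message, desc_path = "DESCRIPTION") {
  new_version <- bump_version(desc_path)
  system2("git", c("commit", "-a", "-m", shQuote(message)))
  invisible(new_version)
}

# Demonstration on a throwaway DESCRIPTION file (no git involved).
tmp <- tempfile()
writeLines(c("Package: demo", "Version: 0.1.9"), tmp)
new_version <- bump_version(tmp)   # the file now says Version: 0.1.10
```

Test runs or a push step could be added to commit_this() in the same way; keeping everything in one function means it works the same from RStudio, a terminal, or a script.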
If your code needs RStudio to be already running (e.g. because it's relying on some rstudioapi:: function), putting it directly in .Rprofile won't work (.Rprofile is executed before RStudio is available).
Instead, you could set a hook for "rstudio.sessionInit":
setHook(
  hookName = "rstudio.sessionInit",
  # the hook function is passed as `value`; `action` says how to register it
  value = function(newSession) {
    if (newSession) {
      # your code goes here
    }
  },
  action = "append"
)
I'm running an R script in OpenCPU/Docker and looking for a way to have the script start a background process and immediately return a message saying that the process has started. Something like this:
fun1<-function(arg1,arg2){
#calling different function to run in background, using arg1/arg2, no need
#of result now, I will check the result through the file later
fun2(arg1,arg2)
return('Process has started')
}
How can I run fun2 in the background? All functions are inside the package, and I have tried Rscript but with no success. Any ideas or packages for this? I believe a lot of people have already gone through this.
I need to run (several times) my R script (script.R), which basically looks like this:
library(myLib)
cmd = commandArgs(TRUE)
args=myLib::parse.cmd(cmd)
myLib::exec(args)
myLib is my own package, which loads some dependencies (car, minpack.lm, plyr, ggplot2). The time required for loading the libraries is comparable to the run time of myLib::exec, so I'm looking for a method that lets me avoid loading them every time I call Rscript script.R
I know about Rserve, but it looks like a bit of overkill, though it could do exactly what I need. Are there any other solutions?
P.S: I call script.R from JVM using Scala.
Briefly:
on startup you need to load your libraries
if you call repeatedly and start repeatedly you repeatedly load the libraries
you already mentioned a stateful solution (Rserve) which allows you to start it once but connect and eval multiple times
so I think you answered your question.
Otherwise, I enjoy littler and have shown how it starts faster than either R or Rscript -- but the fastest approach is simply not to restart.
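As an illustration of "not restarting" without Rserve, one option is a single long-lived R process that loads the libraries once and then handles one request per input line. This is only a sketch: serve() is a hypothetical helper, and the one-line-per-request protocol is an assumption, not something littler or Rserve prescribes.

```r
# Read requests line by line from a connection and hand the
# whitespace-separated arguments of each line to `handler`.
serve <- function(con, handler) {
  n <- 0L
  repeat {
    line <- readLines(con, n = 1L)
    if (length(line) == 0L) break          # end of input: stop serving
    handler(strsplit(trimws(line), "[[:space:]]+")[[1]])
    n <- n + 1L
  }
  invisible(n)                             # number of requests handled
}

# In the real script the libraries would be loaded once up front, e.g.
#   library(myLib)
#   serve(file("stdin", open = "r"),
#         function(args) myLib::exec(myLib::parse.cmd(args)))

# Demonstration with an in-memory connection standing in for stdin.
requests <- textConnection(c("fit data1.csv", "fit data2.csv"))
handled <- serve(requests, function(args) cat("got:", args, "\n"))
close(requests)
```

The calling side (the JVM in the question) would then start this process once and write one line per job to its standard input, instead of spawning Rscript for every call.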
I tried littler. It seems amazing, but it doesn't want to work on R v4.0.
Rserve seems cool, but as you pointed out, it seems to be overkill.
I ended up limiting the import to the functions I need.
For example:
library(dplyr, include.only = c("select", "mutate","group_by", "summarise", "filter" , "%>%", "row_number", 'left_join', 'rename') )
I'm trying to call a simple python script from within R using system2(). I've read some information I found vague that said if 'too much' memory is used, it won't work.
If I load a large dataset and use some information in it as arguments to pass into system2(), it will only work if I manually click "Restart R" in RStudio.
What I want:
df <- read.csv('some_large_file.csv')
###extracting some info called 'args_vec'
for(arg in args_vec){
system2('python', arg)
}
This won't work as is. The for loop is simply passed over.
What I need:
df <- read.csv('some_large_file.csv')
###extracting some info called 'args_vec'
###something that 'restarts' R
for(arg in args_vec){
system2('python', arg)
}
This answer doesn't quite get what I want. Namely, it doesn't work for me within RStudio, and it calls "system" (which presents the same problem as "system2" in this case). In fact, when I put the answer referenced above in my Rprofile.site file, it just immediately closed RStudio.
I tried the suggestion as a normal function (rather than using "makeActiveBinding"), and it didn't quite work either.
##restart R in r session -- doesn't work
makeActiveBinding("refresh", function() { system("R --save"); q("no") }, .GlobalEnv)
##nor did this:
refresh <- function() { system("R --save"); q("no") }
I tried a number of variations of these two options above, but this is getting long for what feels like a simple question. There's a lot I don't yet understand about the startup process and "makeActiveBinding" is a bit mysterious. Can anyone point me in the right direction?
In RStudio, you can restart the R session with:
Command/Ctrl + Shift + F10
You can also use:
.rs.restartR()
RStudio has this undocumented .rs.restartR() which is supposed to do just that: restart R.
However, it does not unload the packages that were loaded, nor does it clean the environment, so I have some doubts about whether it restarts R at all.
If you use RStudio, use the menu item Session > Restart R or the associated keyboard shortcut Ctrl+Shift+F10 (Windows and Linux) or Command+Shift+F10 (Mac OS). Additional keyboard shortcuts make it easy to restart development where you left off, i.e. to say “re-run all the code up to HERE”:
In an R script, use Ctrl+Alt+B (Windows and Linux) or Command+Option+B (Mac OS)
In R markdown, use Ctrl+Alt+P (Windows and Linux) or Command+Option+P (Mac OS)
If you run R from the shell, use Ctrl+D or q() to quit, then restart R.
Have you tried embedding the function call within the apply function, rather than a for loop?
I've had some pieces of code that ran the system out of memory in a for loop run perfectly with apply. It might help?
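For reference, the loop from the question written with an apply-family function might look like the sketch below. run_all() is a hypothetical helper; whether this form avoids the memory problem is not established here, it only shows the shape of the call. The demonstration uses Rscript instead of python so that it runs on any machine with R installed.

```r
# Call an external command once per argument and collect the exit codes.
# Arguments containing spaces would need shQuote() before being passed on.
run_all <- function(args_vec, command = "python") {
  vapply(
    args_vec,
    function(arg) as.integer(system2(command, arg)),
    integer(1),
    USE.NAMES = FALSE
  )
}

# Demonstration: run "Rscript --version" twice and keep the exit statuses.
rscript <- file.path(R.home("bin"), "Rscript")
status <- run_all(c("--version", "--version"), command = rscript)
```

An exit status of 0 means the command ran successfully, so checking `status` afterwards also reveals whether the calls were actually made, which the plain for loop in the question does not report.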
For those not limited to a command that want something that actually resets the system (no prior state, no loaded packages, no variables, etc.) you can select Terminate R from the Session menu.
It is a bit awkward (asks you if you are sure). If anyone knows something like clear all or really clear classes in MATLAB let me know!