Calling an Rscript from node.js

I have been trying to execute an Rscript from my node.js server. I tried to follow an example online, but I keep getting a null object back, or sometimes the process keeps running forever. The code is below. Thank you.
example.js:
var R = require("r-script");
var out = R("scripts/testScript.R")
.data("hello world", 20)
.callSync(function(err,resp){
console.log(out);
});
testScript.R:
needs(magrittr)
set.seed(512)
do.call(rep, input) %>%
strsplit(NULL) %>%
sapply(sample) %>%
apply(2, paste, collapse = "")

For Windows users:
You need to add R to the Windows %PATH% environment variable. The r-script package calls the "R" command from CMD. If the folder containing R.exe is not on the PATH, the "R" command cannot be found and the call will never succeed.
Look up how to add environment variables on Windows, and remember: if the path to the folder containing the executables contains a space, it must be enclosed in double quotes, e.g. "C:\Program Files\R\R-3.3.2\bin\x64".
If you have already done this and the problem persists, I can think of only two reasons:
Something is wrong in your R code, and it is raising an exception inside the R session.
The system can't find the script file. Check the file path.
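A quick way to test the PATH explanation is to ask the operating system where it would find the executables. The following is a minimal sketch in Python (matching the other Python snippets on this page), not part of r-script; the helper names find_on_path and check_r_on_path are made up for illustration.

```python
import shutil


def find_on_path(name):
    """Return the full path of an executable found on PATH, or None if absent."""
    return shutil.which(name)


def check_r_on_path(names=("R", "Rscript")):
    """Map each executable name to its resolved path (None when it is missing)."""
    return {name: find_on_path(name) for name in names}


if __name__ == "__main__":
    for name, path in check_r_on_path().items():
        print(f"{name}: {path or 'NOT FOUND on PATH'}")
```

If either entry is reported as not found, the PATH fix described above is the first thing to try; if both resolve, the problem is more likely in the R script or its file path.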

You can use child processes in node to call other languages. I find it easiest to call Python from node, and use Python's subprocess module to then call R:
NODE
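var spawn = require("child_process").spawn;
// `child` (not `process`) avoids shadowing Node's global; call_r.py prints R's output to stdout.
var child = spawn("python", ["call_r.py", script_choice, function_choice]);
child.stdout.on("data", function (data) { console.log(data.toString()); });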
This calls our call_r.py file passing along our script and function choices:
PYTHON (call_r.py)
import subprocess
import sys
script_choice = sys.argv[1]
function_choice = sys.argv[2]
call_script = 'R_Scripts/' + script_choice + '.R'
cmd = ['Rscript', call_script] + [function_choice]
result = subprocess.check_output(cmd, universal_newlines=True)
print(result)
sys.stdout.flush()
This parses the passed script and function choices, calling R via Python's subprocess module.
R (script that was chosen)
myArgs <- commandArgs(trailingOnly = TRUE)
function_choice <- myArgs[1]
# add your R functions here
eval(parse(text=function_choice))
Here, R parses the passed function choice and evaluates it. Arguments can be passed to the chosen R function by including them in the call string (e.g. my_function('hey there')). Because eval(parse(text = ...)) runs whatever string it receives, only pass function choices that you control.
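Since the chain ends in R evaluating an arbitrary string, it may be worth validating both choices in call_r.py before building the command. The sketch below is one possible approach, not the answer's original code: build_rscript_command, the allowed_calls parameter, and the name pattern are all assumptions introduced for illustration.

```python
import re

# Assumption: script names are plain identifiers, so "../something" is rejected.
SAFE_NAME = re.compile(r"[A-Za-z0-9_]+")


def build_rscript_command(script_choice, function_choice, allowed_calls,
                          script_dir="R_Scripts"):
    """Return the Rscript argument list, or raise ValueError on unexpected input."""
    if not SAFE_NAME.fullmatch(script_choice):
        raise ValueError(f"unexpected script name: {script_choice!r}")
    if function_choice not in allowed_calls:
        raise ValueError(f"call not in allowlist: {function_choice!r}")
    # Same layout as the original call_r.py: R_Scripts/<script>.R <call string>
    return ["Rscript", f"{script_dir}/{script_choice}.R", function_choice]


if __name__ == "__main__":
    allowed = {"my_function('hey there')"}  # hypothetical allowlist
    print(build_rscript_command("analysis", "my_function('hey there')", allowed))
```

The returned list can be handed to subprocess.check_output exactly as in call_r.py. Passing a list rather than a single shell string also means no shell is involved in interpreting the arguments.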

Related

Calling a python function from R with passing the arguments

Is there a package for calling a Python function from R and passing the function's arguments from R? At the moment I call the Python file directly using system() in R:
a <- system('/home/anaconda3/bin/python /home/Desktop/myfile.py', intern = TRUE)
But myfile.py contains a function that takes a parameter. How do I specify the parameter from R?
I have tried system('/home/anaconda3/bin/python /home/Desktop/myfile.py argument', wait = FALSE, intern = TRUE), but it returns 0.
For example, to pass the number of cores that my Python script can use:
system(paste('/home/anaconda3/bin/python', '/home/Desktop/myfile.py', NCORE))
Then, in the Python script, I can read the parameter before calling the function:
import sys
n_core = int(sys.argv[1])
sys.argv is a list in Python that contains the command-line arguments passed to the script; sys.argv[0] is the script name, so the first argument is sys.argv[1].
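Reading sys.argv by index works, but it fails with an IndexError or ValueError when the argument is missing or not a number. A slightly more defensive sketch uses the standard argparse module; the default of 1 core and the function name parse_args are assumptions made for this example.

```python
import argparse


def parse_args(argv=None):
    """Parse the core count; argv=None means 'use the real command line'."""
    parser = argparse.ArgumentParser(description="Example worker script")
    # Optional positional argument so the script still runs without it.
    parser.add_argument("n_core", type=int, nargs="?", default=1,
                        help="number of cores the script may use")
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_args()
    print(f"Using {args.n_core} core(s)")
```

From R, the call stays the same as above: the value pasted after the script path arrives as the n_core argument.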
Alternatively, take a look at the reticulate package:
library(reticulate)
os <- import("os")
os$listdir(".")

How to get the output data from R script in node using r-script

I am trying to execute an R script from node.js using r-script because it looks pretty simple.
With the documentation example:
example.js
var R = require("r-script");
var out = R("ex-sync.R")
.data("hello world", 20)
.callSync();
console.log(out);
ex-sync.R
needs(magrittr)
set.seed(512)
do.call(rep, input) %>%
strsplit(NULL) %>%
sapply(sample) %>%
apply(2, paste, collapse = "")
My out variable, which is supposed to hold the result of the last line of the R script, is always null, and I have no idea why.
For Windows users:
You need to add R to the Windows %PATH% environment variable. The r-script package calls the "R" command from CMD. If the folder containing R.exe is not on the PATH, the "R" command cannot be found from anywhere.
Look up how to add environment variables on Windows, and remember: if the path to the folder containing the executables contains a space, it must be enclosed in double quotes, e.g. "C:\Program Files\R\R-version\bin\x64" (replace version with your installed R version).
If you have already done this and the problem persists, I can think of only two reasons:
Something is wrong in your R code, and it is raising an exception inside the R session.
The system can't find the script file. Check the file path.
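When the result is null, it usually helps to look at what the child process wrote to stderr and what exit code it returned. The following Python sketch runs a command with a timeout (so a hanging process is cut off) and returns all three pieces; run_and_capture is a made-up helper name, and the demo uses the Python interpreter as a stand-in child so that it runs without R installed.

```python
import subprocess
import sys


def run_and_capture(cmd, timeout=60):
    """Run cmd and return (returncode, stdout, stderr).

    Raises subprocess.TimeoutExpired if the process runs longer than timeout
    seconds, and FileNotFoundError if the executable is not on PATH.
    """
    completed = subprocess.run(cmd, capture_output=True, text=True,
                               timeout=timeout)
    return completed.returncode, completed.stdout, completed.stderr


if __name__ == "__main__":
    # Stand-in child process (assumption); swap in e.g. ["Rscript", "--version"]
    # to check that R starts at all on your machine.
    demo = [sys.executable, "-c",
            "import sys; sys.stderr.write('boom'); sys.exit(3)"]
    code, out, err = run_and_capture(demo)
    print("exit code:", code)
    print("stdout:", repr(out))
    print("stderr:", repr(err))
```

A FileNotFoundError points to the PATH problem, while a non-zero exit code with text on stderr points to an error inside the R session.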

R Script as a Function

I have a long script that involves data manipulation and estimation. I have it set up to use a set of parameters, but I would like to be able to run it multiple times with different sets of inputs, much like a function.
Running the script produces plots and saves estimates to a CSV file; I am not particularly concerned with the objects it creates.
I would rather not wrap the script in a function, as it is meant to be used interactively.
How do people go about doing something like this?
I found this for command-line arguments: "How to pass command-line arguments when source() an R file", but it still doesn't solve the interactive problem.
I have dealt with something similar before. Below is the solution I came up with.
I use list2env to push variables into either the global environment or the function's local environment, and then source the script in that environment.
This is especially useful when coupled with exists(), as shown in the example below, because it lets the script remain stand-alone.
These two questions may also be of help:
Source-ing an .R script within a function and passing a variable through (RODBC)
How to pass command-line arguments when source() an R file
# Function ----------------------------------------------------------------
subroutine <- function(file, param = list(), local = TRUE, ...) {
  list2env(param, envir = if (local) environment() else globalenv())
  source(file, local = local, ...)
}
# Example -----------------------------------------------------------------
# Create an example script
tmp <- "test_subroutine.R"
cat("if (!exists('msg')) msg <- 'no argument provided'; print(msg)", file = tmp)
# Example of using exists in the script to keep it stand-alone
subroutine(tmp)
# Evaluate in the function's environment
subroutine(tmp, list(msg = "use function's environment"), local = TRUE)
exists("msg", envir = globalenv()) # FALSE
# Evaluate in global environment
subroutine(tmp, list(msg = "use global environment"), local = FALSE)
exists("msg", envir = globalenv()) # TRUE
unlink(tmp)
Just to clarify what was alluded to in Hansi's comment, here is one approach to this issue:
Wrap the script into a function, since this will let you go up one level of abstraction if needed, and will also make it easier to call the function whenever it is needed in any other script.
In cases where you want to use the script interactively, you can put a browser() call somewhere in your script. At the point where browser() is called, the function will pause and keep the environment as-is within the function, and you can then step through the function and use R interactively from within the function.
In the base package, see ?commandArgs; you can use it to read arguments passed on the command line.
If I have a script, test.R, containing the code:
args <- commandArgs(trailingOnly = TRUE)
for (arg in args) {
  print(arg)
}
and I call it from the command line with Rscript as follows:
Rscript test.R arg1 arg2 arg3
The output is:
[1] "arg1"
[1] "arg2"
[1] "arg3"
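To run such a script repeatedly with different inputs, a small driver can generate one Rscript command per parameter combination. This is a sketch in Python under stated assumptions: build_runs and the example grid (alpha, n) are hypothetical, and the R script is assumed to read its inputs positionally via commandArgs(trailingOnly = TRUE) as in test.R above.

```python
import itertools


def build_runs(script, grid):
    """Yield one Rscript command (as an argument list) per combination in grid.

    grid maps a parameter name to the list of values to try; values are passed
    positionally, in the order the names appear in grid.
    """
    names = list(grid)
    for values in itertools.product(*(grid[name] for name in names)):
        yield ["Rscript", script] + [str(value) for value in values]


if __name__ == "__main__":
    grid = {"alpha": [0.1, 0.5], "n": [10]}  # hypothetical parameters
    for cmd in build_runs("test.R", grid):
        print(" ".join(cmd))
        # To actually execute each run: subprocess.run(cmd, check=True)
```

Each run starts a fresh R process, so nothing leaks between runs; the trade-off is that the script is no longer interactive while driven this way.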

Different results from Rscript and R CMD BATCH

I have an inconsistency issue which I cannot explain when running an R script. I am not able to produce a reproducible example because there is a whole set of files/functions called by the entry script.
Using Rscript or RStudio with R v3.1.2 I get the results I expect; however, when calling R CMD BATCH from bash, my script does not produce identical output. From bash, R seems to read the command-line arguments correctly and reports them from the script, but only the Rscript and RStudio source methods seem to use the parameter correctly in my code.
The 2 command line calls are as follows:
Rscript ./script/forecast_category_script.R "category='razors'" "cores=4L"
R CMD BATCH --no-save "--args category='razors' cores=4L" ./script/forecast_category_script.R ~/data/output/out.out
Is there any obvious reason why these inconsistencies might be occurring? I'd prefer to use R CMD BATCH as it redirects output to a file and when I migrate my code to the university cluster as a batch job through the scheduler I'd like to be able to follow what it has done.
UPDATE: changing this line resolves it but why?
Previously I had the following line in there, basically so when I was testing I didn't keep reloading the huge dataset if it was already loaded in my RStudio environment:
if(!exists("spi")) spi = f_load.spi(category = category)
Replaced it with this:
spi = f_load.spi(category = category)
The underlying function f_load.spi remained the same, however:
f_load.spi = function(spi = NULL, category = "razors", n = NULL) {
  # check if the data is pre-loaded
  if (is.null(spi)) {
    fil = paste0(pth.data.storage, "categories/", category, "/", category, ".sp_ss.interp.rds")
    print(fil)
    spi = readRDS(fil)
  }
  # subset to a specific set of items
  if (!is.null(n)) {
    fc.items = unique(spi$fc.item)
    rnd = sample(1:length(fc.items), n)
    spi = spi[fc.item %in% fc.items[rnd]]
  }
  spi
}
For some reason the category variable was not being passed through properly into the function and it was loading a different category (beer rather than razors) which was an enormous file and not suitable for testing.
This still doesn't explain why Rscript and R CMD BATCH behaved differently.
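It is possible that one of them is loading a previously saved workspace and picking up global variables from it. R CMD BATCH restores a saved workspace by default (--no-save only stops it from writing one), whereas Rscript runs with --no-restore, so a stale spi object in an .RData file would make exists("spi") true under R CMD BATCH only, which would be consistent with what you saw. Have you checked whether it matters which directory you run from, or whether any .RData or .Rhistory files are present? One way to rule out hidden variables is to clear the workspace at the start of each script, for example with rm(list = ls()) as the first line, or to pass --no-restore (or --vanilla) to R CMD BATCH.
Also, you can send output to a file from an Rscript using sink().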
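If the main appeal of R CMD BATCH is the log file, another option is to launch Rscript from a small wrapper that writes stdout and stderr to a file. The Python sketch below illustrates the idea; run_logged is a hypothetical helper, and the demo uses the Python interpreter as a stand-in child so that it runs without R installed.

```python
import os
import subprocess
import sys
import tempfile


def run_logged(cmd, log_path):
    """Run cmd with stdout and stderr both written to log_path; return exit code."""
    with open(log_path, "w") as log:
        completed = subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT)
    return completed.returncode


if __name__ == "__main__":
    # Stand-in command (assumption). For the script in the question this could be:
    # ["Rscript", "--vanilla", "./script/forecast_category_script.R",
    #  "category='razors'", "cores=4L"]
    demo = [sys.executable, "-c", "print('hello from the child process')"]
    with tempfile.TemporaryDirectory() as tmp:
        log_file = os.path.join(tmp, "out.out")
        print("exit code:", run_logged(demo, log_file))
        with open(log_file) as fh:
            print(fh.read(), end="")
```

On a cluster, plain shell redirection in the job script achieves the same thing; the wrapper is mainly useful when the exit code also needs to be checked programmatically.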

Write access to commandArgs?

I know that I can use commandArgs to read the command-line arguments passed to a script in R, but I would like to debug a command-line script by sourcing it in R and running it with custom command-line arguments. Is there a way of modifying the command-line arguments without modifying the script file?
My scripts are normally using the optparse package for actual argument parsing, if that helps.
I'll try and expand what I said in a comment.
The Python way of writing scripts usually involves detecting whether the file is being run as a script, handling the arguments, and then calling functions defined in the file. Something like:
import sys

def foo(x):
    return x * 2

if __name__ == "__main__":
    v = float(sys.argv[1])  # arguments arrive as strings
    print(foo(v))
This has the advantage that you can import the file into an interactive python session and the code in the 'if' block doesn't run. You can then test the foo function interactively.
R lets you make the same check, i.e. whether the file is being run as a script or sourced from an interactive session, using interactive():
foo <- function(x) {
  return(x * 2)
}
if (!interactive()) {
  x <- as.numeric(commandArgs(trailingOnly = TRUE)[1])
  print(foo(x))
}
Running Rscript argtest.R 22 prints [1] 44, whereas running R interactively and calling source("argtest.R") does not run the code in the if block. It's a nice pattern.
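On the Python side, the same pattern is often taken one step further by letting the entry point accept the argument list as a parameter, which is what the question is after: custom arguments without touching the real command line. The sketch below is an illustration, not code from the answer; the main(argv=None) signature and the usage message are assumptions.

```python
import sys


def foo(x):
    return x * 2


def main(argv=None):
    """Entry point. argv defaults to the real command line but can be overridden."""
    if argv is None:
        argv = sys.argv[1:]
    if not argv:
        print("usage: argtest.py NUMBER")
        return None
    result = foo(float(argv[0]))  # arguments arrive as strings
    print(result)
    return result


if __name__ == "__main__":
    main()
```

From an interactive session, main(["22"]) then exercises the full script logic with custom arguments. The R analogue would be a main function that takes args = commandArgs(trailingOnly = TRUE) as its default argument.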
How about simply overwriting it with your own definition, e.g.
commandArgs <- function(trailingOnly = FALSE) {
  args <- c("/foo/bar", "baz")
  # copied from base:::commandArgs
  if (trailingOnly) {
    m <- match("--args", args, 0L)
    if (m)
      args[-seq_len(m)]
    else character()
  }
  else args
}
The simplest solution is to replace source() with system(). Try
system("Rscript file_to_source.R 1 2 3")
