Error when running (working) R script from command prompt - r

I am trying to run an R script from the Windows command prompt (the reason is that later on I would like to run the script by using VBA).
After having set up the R environment variable (see the end of the post), the following lines of code saved in R_code.R work perfectly:
library('xlsx')
x <- cbind(rnorm(10),rnorm(10))
write.xlsx(x, 'C:/Temp/output.xlsx')
(in order to run the script and get the resulting xlsx output, I simply type the following command in the Windows command prompt: Rscript C:\Temp\R_code.R).
Now, given that this "toy example" is working as expected, I tried to move to my main goal, which is indeed very similar (to run a simple R script from the command line), but for some reason I cannot succeed.
Again I have to use a specific R package (-copula-, used to sample some correlated random variables) and export the R output into a csv file.
The following script (R_code2.R) works perfectly in R:
library('copula')
par_1 <- list(mean=0, sd=1)
par_2 <- list(mean=0, sd=1)
myCop.norm <- ellipCopula(family='normal', dim=2, dispstr='un', param=c(0.2))
myMvd <- mvdc(myCop.norm,margins=c('norm','norm'),paramMargins=list(par_1,par_2))
x <- rMvdc(10, myMvd)
write.table(x, 'C:/Temp/R_output.csv', row.names=FALSE, col.names=FALSE, sep=',')
Unfortunately, when I try to run the same script from the command prompt as before (Rscript C:\Temp\R_code2.R) I get the following error:
Error in FUN(c("norm", "norm"))[[1L]], ...) :
cannot find the function "existsFunction"
Calls: mvdc -> mvdcCheckM -> mvd.has.marF -> vapply -> FUN
Do you have any idea idea on how to proceed to fix the problem?
Any help is highly appreciated, S.
Setting up the R environment variable (Windows)
For those of you that want to replicate the code, in order to set up the environment variable you have to:
Right click on Computer -> Properties -> Advanced System Settings -> Environment variables
Double click on 'PATH' and add ; followed by the path to your Rscript.exe folder. In my case it is ;C:\Program Files\R\R-3.1.1\bin\x64.

This is a tricky one that has bitten me before. According to the documentation (?Rscript),
Rscript omits the methods package as it takes about 60% of the startup time.
So your better solution IMHO is to add library(methods) near the top of your script.

For those interested, I solved the problem by simply typing the following in the command prompt:
R CMD BATCH C:\Temp\R_code2.R
It is still not clear to me why the previous command does not work. Anyway, once again searching into the R documentation (see here) proves to be an excellent choice!

Related

How to get the output data from R script in node using r-script

I am trying to execute a R script from node.js using r-script because it looks pretty simple.
With the documentation example:
example.js
var out = R("ex-sync.R")
.data("hello world", 20)
.callSync();
console.log(out);
ex-sync.R
needs(magrittr)
set.seed(512)
do.call(rep, input) %>%
strsplit(NULL) %>%
sapply(sample) %>%
apply(2, paste, collapse = "")
My out variable which supposed to be the last line of R script, is always null and I have no idea why this can happen.
For Windows users:
You need to add the environment variable to Windows's %PATH% variable. R-script package needs to call "R" command from the CMD. If R.exe is not set as an environment variable, then it will never be able to call the "R" command from anywhere.
Look up how to add environment variables to Windows, and remember: if the path to the folder containing the executables has a white space, it must be added to double quotes. "C:\Program Files\R\R-version\bin\x64"
**** replace version**
If you have already done this but the problem persists, I can only think of two reasons:
There's something wrong with your R method and it's giving an internal exception inside the R session.
The system can't find the file. Maybe check the file path.

Not able to run selected lines in REPL R in sublime text

Followed these instructions to set up REPL for sublime text
http://www.kevjohnson.org/using-r-in-sublime-text-3/
R console is running. But I am unable to push text to console using the shortcuts
Ctrl+Shift+,,l
I must be doing something wrong here, not able to figure it out on my own. Any help appreciated.
I get the following error:
Cannot find REPL for 'regexp'
Edit: Adding sample code
library("e1071")
data(iris)
m <- naiveBayes(Species ~ ., data = iris)
m
table(predict(m, iris), iris[,5])
REPL checks scopes to know what console to run, and returns an error if no console is associated to a language. It might have been that your file was not in the R syntax.
To change the syntax : Command Panl (ctrl+shift+p), Set Syntax: R

Different results from Rscript and R CMD BATCH

I have an inconsistency issue which I cannot explain when running an R script. I am not able to produce a reproducible example because there is a whole set of files/functions called by the entry script.
Using Rscript or RStudio with R v3.1.2 I obtain the results I'm expecting, however when calling R CMD BATCH from bash my script does not produce identical output. From bash, R seems to read the command line arguments correctly and reports them from the script, BUT in my code only the Rscript and RStudio source methods seem to use the parameter correctly in my code.
The 2 command line calls are as follows:
Rscript ./script/forecast_category_script.R "category='razors'" "cores=4L"
R CMD BATCH --no-save "--args category='razors' cores=4L" ./script/forecast_category_script.R ~/data/output/out.out
Is there any obvious reason why these inconsistencies might be occurring? I'd prefer to use R CMD BATCH as it redirects output to a file and when I migrate my code to the university cluster as a batch job through the scheduler I'd like to be able to follow what it has done.
UPDATE: changing this line resolves it but why?
Previously I had the following line in there, basically so when I was testing I didn't keep reloading the huge dataset if it was already loaded in my RStudio environment:
if(!exists("spi")) spi = f_load.spi(category = category)
Replaced it with this:
spi = f_load.spi(category = category)
The underlying function f_load_spi remained the same however:
f_load.spi = function(spi = NULL, category = "razors" , n=NULL) {
# check if the data is pre-loaded
if (is.null(spi)) {
fil = paste0(pth.data.storage, "categories/", category, "/", category, ".sp_ss.interp.rds")
print(fil)
spi = readRDS(fil)
}
# subset to a specific set of items
if (!is.null(n)) {
fc.items = unique(spi$fc.item)
rnd = sample(1:length(fc.items), n)
spi = spi[fc.item %in% fc.items[rnd]]
}
spi
}
For some reason the category variable was not being passed through properly into the function and it was loading a different category (beer rather than razors) which was an enormous file and not suitable for testing.
This still doesn't explain why Rscript and R CMD BATCH behaved differently.
It is possible that one of them is loading up a previously saved workspace and using global variables. Have you checked whether it matters which directory you are in or if there are any .Rhistory files present? One way to ensure that you don't have any hidden variables is to clear the worspace at the beginning of each script. For example, rm(list=ls()) as the first line of your Rscript.
Also, you can pipe output to a file with an Rscript using sink().

call to sapply() works in interactive mode, not in batch mode

I need to execute some commands in batch mode (e.g., via Rscript). They work in interactive mode, but not in batch mode. Here is a minimal example: sapply(1:3, is, "numeric"). Why does this work in interactive mode but return an error in batch mode? Is there a way to make a command like this work in batch mode?
More specifically, I need to write scripts and to run them in batch mode. They need to call a function (which I didn't write and can't edit) that looks like this:
testfun <- function (...)
{
args <- list(...)
if (any(!sapply(args, is, "numeric")))
stop("All arguments must be numeric.")
else
writeLines("All arguments look OK.")
}
I need to pass a list to this function. A command like testfun(list(1, 2, 3)) works in interactive mode. But in batch mode, it produces an error: Error in match.fun(FUN) : object 'is' not found. I tried debugger() to get a handle on the problem, but it didn't give me any insight. I also looked through r-help, the R FAQ, R Inferno, but I couldn't find anything that spoke to this problem.
Rscript doesn't load the methods package by default because it takes a lot of time. From the Details section of ?Rscript:
‘--default-packages=list’ where ‘list’ is a comma-separated list
of package names or ‘NULL’. Sets the environment variable
‘R_DEFAULT_PACKAGES’ which determines the packages loaded on
startup. The default for ‘Rscript’ omits ‘methods’ as it
takes about 60% of the startup time.
You can make it load methods by using the --default-packages argument.
> Rscript -e 'sapply(1:3, is, "numeric")' --default-packages='methods'
[1] TRUE TRUE TRUE

Getting path of an R script

Is there a way to programmatically find the path of an R script inside the script itself?
I am asking this because I have several scripts that use RGtk2 and load a GUI from a .glade file.
In these scripts I am obliged to put a setwd("path/to/the/script") instruction at the beginning, otherwise the .glade file (which is in the same directory) will not be found.
This is fine, but if I move the script in a different directory or to another computer I have to change the path. I know, it's not a big deal, but it would be nice to have something like:
setwd(getScriptPath())
So, does a similar function exist?
This works for me:
getSrcDirectory(function(x) {x})
This defines an anonymous function (that does nothing) inside the script, and then determines the source directory of that function, which is the directory where the script is.
For RStudio only:
setwd(dirname(rstudioapi::getActiveDocumentContext()$path))
This works when Running or Sourceing your file.
Use source("yourfile.R", chdir = T)
Exploit the implicit "--file" argument of Rscript
When calling the script using "Rscript" (Rscript doc) the full path of the script is given as a system parameter. The following function exploits this to extract the script directory:
getScriptPath <- function(){
cmd.args <- commandArgs()
m <- regexpr("(?<=^--file=).+", cmd.args, perl=TRUE)
script.dir <- dirname(regmatches(cmd.args, m))
if(length(script.dir) == 0) stop("can't determine script dir: please call the script with Rscript")
if(length(script.dir) > 1) stop("can't determine script dir: more than one '--file' argument detected")
return(script.dir)
}
If you wrap your code in a package, you can always query parts of the package directory.
Here is an example from the RGtk2 package:
> system.file("ui", "demo.ui", package="RGtk2")
[1] "C:/opt/R/library/RGtk2/ui/demo.ui"
>
You can do the same with a directory inst/glade/ in your sources which will become a directory glade/ in the installed package -- and system.file() will compute the path for you when installed, irrespective of the OS.
This answer works fine to me:
script.dir <- dirname(sys.frame(1)$ofile)
Note: script must be sourced in order to return correct path
I found it in: https://support.rstudio.com/hc/communities/public/questions/200895567-can-user-obtain-the-path-of-current-Project-s-directory-
But I still don´t understand what is sys.frame(1)$ofile. I didn´t find anything about that in R Documentation. Someone can explain it?
#' current script dir
#' #param
#' #return
#' #examples
#' works with source() or in RStudio Run selection
#' #export
z.csd <- function() {
# http://stackoverflow.com/questions/1815606/rscript-determine-path-of-the-executing-script
# must work with source()
if (!is.null(res <- .thisfile_source())) res
else if (!is.null(res <- .thisfile_rscript())) dirname(res)
# http://stackoverflow.com/a/35842176/2292993
# RStudio only, can work without source()
else dirname(rstudioapi::getActiveDocumentContext()$path)
}
# Helper functions
.thisfile_source <- function() {
for (i in -(1:sys.nframe())) {
if (identical(sys.function(i), base::source))
return (normalizePath(sys.frame(i)$ofile))
}
NULL
}
.thisfile_rscript <- function() {
cmdArgs <- commandArgs(trailingOnly = FALSE)
cmdArgsTrailing <- commandArgs(trailingOnly = TRUE)
cmdArgs <- cmdArgs[seq.int(from=1, length.out=length(cmdArgs) - length(cmdArgsTrailing))]
res <- gsub("^(?:--file=(.*)|.*)$", "\\1", cmdArgs)
# If multiple --file arguments are given, R uses the last one
res <- tail(res[res != ""], 1)
if (length(res) > 0)
return (res)
NULL
}
A lot of these solutions are several years old. While some may still work, there are good reasons against utilizing each of them (see linked source below). I have the best solution (also from source): use the here library.
Original example code:
library(ggplot2)
setwd("/Users/jenny/cuddly_broccoli/verbose_funicular/foofy/data")
df <- read.delim("raw_foofy_data.csv")
Revised code
library(ggplot2)
library(here)
df <- read.delim(here("data", "raw_foofy_data.csv"))
This solution is the most dynamic and robust because it works regardless of whether you are using the command line, RStudio, calling from an R script, etc. It is also extremely simple to use and is succinct.
Source: https://www.tidyverse.org/articles/2017/12/workflow-vs-script/
I have found something that works for me.
setwd(dirname(rstudioapi::getActiveDocumentContext()$path))
How about using system and shell commands? With the windows one, I think when you open the script in RStudio it sets the current shell directory to the directory of the script. You might have to add cd C:\ e.g or whatever drive you want to search (e.g. shell('dir C:\\*file_name /s', intern = TRUE) - \\ to escape escape character). Will only work for uniquely named files unless you further specify subdirectories (for Linux I started searching from /). In any case, if you know how to find something in the shell, this provides a layout to find it within R and return the directory. Should work whether you are sourcing or running the script but I haven't fully explored the potential bugs.
#Get operating system
OS<-Sys.info()
win<-length(grep("Windows",OS))
lin<-length(grep("Linux",OS))
#Find path of data directory
#Linux Bash Commands
if(lin==1){
file_path<-system("find / -name 'file_name'", intern = TRUE)
data_directory<-gsub('/file_name',"",file_path)
}
#Windows Command Prompt Commands
if(win==1){
file_path<-shell('dir file_name /s', intern = TRUE)
file_path<-file_path[4]
file_path<-gsub(" Directory of ","",file_path)
filepath<-gsub("\\\\","/",file_path)
data_directory<-file_path
}
#Change working directory to location of data and sources
setwd(data_directory)
Thank you for the function, though I had to adjust it a Little as following for me (W10):
#Windows Command Prompt Commands
if(win==1){
file_path<-shell('dir file_name', intern = TRUE)
file_path<-file_path[4]
file_path<-gsub(" Verzeichnis von ","",file_path)
file_path<-chartr("\\","/",file_path)
data_directory<-file_path
}
In my case, I needed a way to copy the executing file to back up the original script together with its outputs. This is relatively important in research. What worked for me while running my script on the command line, was a mixure of other solutions presented here, that looks like this:
library(scriptName)
file_dir <- gsub("\\", "/", fileSnapshot()$path, fixed=TRUE)
file.copy(from = file.path(file_dir, scriptName::current_filename()) ,
to = file.path(new_dir, scriptName::current_filename()))
Alternatively, one can add to the file name the date and our to help in distinguishing that file from the source like this:
file.copy(from = file.path(current_dir, current_filename()) ,
to = file.path(new_dir, subDir, paste0(current_filename(),"_", Sys.time(), ".R")))
None of the solutions given so far work in all circumstances. Worse, many solutions use setwd, and thus break code that expects the working directory to be, well, the working directory — i.e. the code that the user of the code chose (I realise that the question asks about setwd() but this doesn’t change the fact that this is generally a bad idea).
R simply has no built-in way to determine the path of the currently running piece of code.
A clean solution requires a systematic way of managing non-package code. That’s what ‘box’ does. With ‘box’, the directory relative to the currently executing code can be found trivially:
box::file()
However, that isn’t the purpose of ‘box’; it’s just a side-effect of what it actually does: it implements a proper, modern module system for R. This includes organising code in (nested) modules, and hence the ability to load code from modules relative to the currently running code.
To load code with ‘box’ you wouldn’t use e.g. source(file.path(box::file(), 'foo.r')). Instead, you’d use
box::use(./foo)
However, box::file() is still useful for locating data (i.e. OP’s use-case). So, for instance, to locate a file mygui.glade from the current module’s path, you would write.
glade_path = box::file('mygui.glade')
And (as long as you’re using ‘box’ modules) this always works, doesn’t require any hacks, and doesn’t use setwd.

Resources