Is there a package to process command-line options in R?
I know commandArgs, but it's too basic. Its result is basically the equivalent of argc and argv in C, but I need something on top of that, like boost::program_options in C++ or Getopt::Long in Perl.
In particular, I'd like to specify in advance what options are allowed and give an error message if the user specifies something else.
The call would be like this (with user options --width=32 --file=foo.txt):
R --vanilla --args --width=32 --file=foo.txt < myscript.R
or, if Rscript is used:
myscript.R --width=32 --file=foo.txt
(Please don't say, "why don't you write it yourself, it's not that hard". In other languages you don't have to write it yourself either. :)
getopt for R
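For example, a minimal sketch with getopt (the option names mirror the question's --width and --file; unknown options should make getopt() stop with an error, which is what is asked for):
library(getopt)
## columns: long name, short flag, argument flag (0 = none, 1 = required, 2 = optional), type
spec <- matrix(c(
  'width', 'w', 1, 'integer',
  'file',  'f', 1, 'character'
), byrow = TRUE, ncol = 4)
opt <- getopt(spec)
print(opt$width)
print(opt$file)
Run it as Rscript myscript.R --width=32 --file=foo.txt.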
How about commandArgs with eval for a built-in solution?
test.R
## 'trailingOnly=TRUE' means only parse args after '--args'
args = commandArgs(trailingOnly = TRUE)
## Supply default arguments
if (length(args) == 0) {
  print("No arguments supplied.")
  ## supply default values
  a = 1
  b = c(1, 1, 1)
} else {
  for (i in 1:length(args)) {
    eval(parse(text = args[[i]]))
  }
}
print(a * 2)
print(b * 3)
and to invoke it
R CMD BATCH --no-save --no-restore '--args a=1 b=c(2,5,6)' test.R test.out
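With those arguments (a=1, b=c(2,5,6)), the relevant lines of test.out should be:
[1] 2
[1] 6 15 18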
The usual caveats w.r.t. using eval apply, of course.
Shamelessly stolen from this blog post.
Related
I'm developing a bash program that executes an R one-liner to convert an R Markdown template into an HTML document.
The R one-liner looks like this:
R -e 'library(rmarkdown) ; rmarkdown::render( "template.Rmd", "html_document", output_file = "report.html", output_dir = "'${OUTDIR}'", params = list( param1 = "'${PARAM1}'", param2 = "'${PARAM2}'", ... ) )'
I have a long list of parameters, let's say 10 for the sake of the example, and it seems that R or bash has a command-line length limit.
When I execute the R one-liner with 10 parameters I get an error message like this:
WARNING: '-e library(rmarkdown)~+~;~+~rmarkdown::render(~+~"template.Rmd",~+~"html_document",~+~output_file~+~=~+~"report.html",~+~output_dir~+~=~+~"output/",~+~params~+~=~+~list(~+~param1~+~=~+~"param2", ...
Fatal error: you must specify '--save', '--no-save' or '--vanilla'
When I execute the R one-liner with 9 parameters it's OK (I tried different combinations to verify that the problem was not the last parameter).
When I execute the R one-liner with 10 parameters but with all spaces removed, it's OK too, so I guess that R or bash enforces a command-line length limit.
R -e 'library(rmarkdown);rmarkdown::render("template.Rmd","html_document",output_file="report.html",output_dir="'${OUTDIR}'",params=list(param1="'${PARAM1}'",param2="'${PARAM2}'",...))'
Is it possible to increase this limit?
This will break in a number of ways, including when your arguments contain spaces or quotes.
Instead, try passing the values as positional arguments. Something like this should give you an idea of how it works:
# create a script file
tee arguments.r << 'EOF'
argv <- commandArgs(trailingOnly=TRUE)
arg1 <- argv[1]
print(paste("Argument 1 was", arg1))
EOF
# set some values
param1="foo bar"
param2="baz"
# run the script with arguments
Rscript arguments.r "$param1" "$param2"
Expected output:
[1] "Argument 1 was foo bar"
Always quote your variables and always use lowercase variable names to avoid conflicts with system or application variables.
I'm calling an R function from Perl, passing variables from the Perl program via the system command.
#!/usr/bin/perl
$file1= "Test1.txt"
$file2= "Test2.txt"
$val="Rscript Test.R ".$file1." ".$file2;
print($val,"\n");
system('Rscript Test.R', $file1, $file2);
But this does not call the R script or pass the file1 and file2 values. How can I fix this?
When using the system LIST syntax, pass each argument as a separate list element; otherwise Rscript Test.R is taken as a single command name.
system('Rscript', 'Test.R', $file1, $file2);
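On the R side, Test.R can then pick the file names up from commandArgs; a minimal sketch (what you do with the files is up to you):
## Test.R
args <- commandArgs(trailingOnly = TRUE)
file1 <- args[1]
file2 <- args[2]
cat("Got files:", file1, file2, "\n")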
I'm trying to run R from the command line with command-line arguments. This includes passing in some file paths as arguments for use inside the script. It all works most of the time, but sometimes the paths have spaces in them and R doesn't parse them correctly.
I'm running something of the form:
R CMD BATCH --slave "--args inputfile='C:/Work/FolderWith SpaceInName/myinputfile.csv' outputfile='C:/Work/myoutputfile.csv'" RScript.r ROut.txt
And R throws out a file saying
Fatal error: cannot open file 'C:\Work\FolderWith': No such file or directory
So evidently my single quotes aren't enough to tell R to take everything inside the quotes as the argument value. I'm thinking this means I should find a way to delimit my --args using a comma, but I can't find a way to do this. I'm sure it's simple but I've not found anything in the documentation.
The current script is very basic:
ca = commandArgs(trailingOnly=TRUE)
eval(parse(text=ca))
tempdata = read.csv(inputFile)
tempdata$total = apply(tempdata[,4:18], 1, sum)
write.csv(tempdata, outputFile, row.names = FALSE)
In case it's relevant I'm using windows for this, but it seems like it's not a cmd prompt problem.
Using eval(parse()) is probably not the best or most robust way to parse command-line arguments. I recommend using a package like optparse to do the parsing for you; parsing command-line arguments is a solved problem, so there is no need to reimplement it. I suspect this would solve your problem, although spaces in path names are a bad idea to begin with.
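A minimal optparse sketch for the two paths in the question (the option names here are my choice, nothing prescribed by the package):
library(optparse)
option_list <- list(
  make_option("--inputFile",  type = "character"),
  make_option("--outputFile", type = "character")
)
opt <- parse_args(OptionParser(option_list = option_list))
tempdata <- read.csv(opt$inputFile)
tempdata$total <- apply(tempdata[, 4:18], 1, sum)
write.csv(tempdata, opt$outputFile, row.names = FALSE)
called as Rscript RScript.r --inputFile="C:/Work/FolderWith SpaceInName/myinputfile.csv" --outputFile="C:/Work/myoutputfile.csv".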
Alternatively, you could take a very simple approach and pass the arguments like this:
R CMD BATCH --slave "--args arg1 arg2" RScript.r
Where you can retrieve them like:
ca = commandArgs(TRUE)
arg1 = ca[1]
arg2 = ca[2]
This avoids the eval(parse()) call, which I think is causing the issues. Finally, you could try escaping the space like this:
R CMD BATCH --slave "C:/spam\ bla"
You could also give Rscript a try; R CMD BATCH seems to be less favored than Rscript.
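For example, with Rscript and plain positional arguments the quoting problem largely goes away (a sketch reusing the paths from the question):
Rscript RScript.r "C:/Work/FolderWith SpaceInName/myinputfile.csv" "C:/Work/myoutputfile.csv"
Inside the script, commandArgs(trailingOnly = TRUE)[1] and [2] then hold the two paths, spaces included.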
As an enhancement of @PaulHimestra's answer, here is how you can use Rscript:
Create a launcher.bat:
echo off
C:
PATH R_PATH;%path%
cd DEMO_PATH
Rscript yourscript.R arg1 arg2
exit
with R_PATH something like C:/Program Files/R/R-version
There are many similarities with this post:
R command line passing a filename to script in arguments (Windows)
Also this post is very OS related. My answer applies only to Windows.
What you are probably looking for is Rscript.exe instead of R.exe. Rscript has no problem with spaces: path\to\Rscript "My script.r".
One annoyance is finding or setting the path to Rscript, and having to do it again every time R is updated.
Among the convenience scripts I keep in my search path, I wrote a small wrapper to run Rscript without bothering with paths. In case it is of interest to someone:
#echo off
setlocal
::Get change to file dir par (-CD must be 1st par)
::================================================
Set CHANGEDIR="F"
If /I %1 EQU -cd (
Set CHANGEDIR="T"
SHIFT
)
::No args given
::=============
If [%1] EQU [] GoTo :USAGE
::Get R path from registry
::========================
:: may check http://code.google.com/p/batchfiles for updates on R reg keys
Call :CHECKSET hklm\software\R-core\R InstallPath
Call :CHECKSET hklm\software\wow6432Node\r-core\r InstallPath
if not defined RINSTALLPATH echo "Error: R not found" & goto:EOF
::Detect filepath when arg not starting with "-"
::==============================================
::Note the space after ARGS down here!!!
Set ARGS=
:LOOP
if [%1]==[] (GoTo :ELOOP)
Set ARGS=%ARGS% %1
::Echo [%ARGS%]
Set THIS=%~1
if [%THIS:~0,1%] NEQ [-] (Set FPATH=%~dp1)
SHIFT
GoTo :LOOP
:ELOOP
::echo %FPATH%
::Run Rscript script, changing to its path if asked
::=================================================
If %CHANGEDIR%=="T" (CD %FPATH%)
Echo "%RINSTALLPATH%\bin\Rscript.exe" %ARGS%
"%RINSTALLPATH%\bin\Rscript.exe" %ARGS%
endlocal
:: ==== Subroutines ====
GoTo :EOF
:USAGE
Echo USAGE:
Echo R [-cd] [RScriptOptions] Script [ScriptArgs]
Echo.
Echo -cd changes to script dir. Must be first par.
Echo To get RScript help on options etc.:
Echo R --help
GoTo :EOF
:CHECKSET
if not defined RINSTALLPATH for /f "tokens=2*" %%a in ('reg query %1 /v %2 2^>NUL') do set RINSTALLPATH=%%~b
GoTo :EOF
The script prints the actual Rscript invocation line before running it.
Note the added argument, -cd, which automatically changes to the script's directory. It is not easy to discover the script's own path from inside R (and set it with setwd()) in order to call other scripts or read/write data files located in the same directory (or a relative one).
This -cd option may make some of your other command arguments superfluous, as you may find it convenient to handle them straight from inside the script.
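For reference, a commonly used way to recover the script's own directory from inside R (when run under Rscript) is to parse the --file= argument that Rscript adds; a sketch:
full_args  <- commandArgs(trailingOnly = FALSE)
file_arg   <- grep("^--file=", full_args, value = TRUE)
script_dir <- dirname(normalizePath(sub("^--file=", "", file_arg)))
setwd(script_dir)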
I want to submit an R job to the grid. I have saved the main R code in MGSA_rand.r.
In the file callmgsa.r I have written
print('here')
source('/home/users/pegah/MGSA_rand.r')
mgsalooprand($SGE_TASK_ID,382)
And I use the file Rscript.sh to call the job (with the -t parameter I send the value corresponding to $SGE_TASK_ID):
R CMD BATCH --no-save callmgsa.r
I submit the job like this:
qsub -t 1 -cwd -b y -l vf=1000m /home/users/pegah/Rscript.sh
I get neither an error nor any output. The job terminates as soon as I submit it, without any output. Could you please help me?
thanks, Pegah
The variable $SGE_TASK_ID is a shell variable. Referring to it with that syntax inside R is not going to work. What you could do is use Rscript instead. From the shell script you call:
Rscript callmgsa.r $SGE_TASK_ID
In the R script you can catch the command line arguments like:
args <- commandArgs(trailingOnly = TRUE)
print('here')
source('/home/users/pegah/MGSA_rand.r')
mgsalooprand(args[1],382)
This should work...
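One caveat: commandArgs() returns character strings, so if mgsalooprand() expects a numeric task index (an assumption here), convert it first:
mgsalooprand(as.numeric(args[1]), 382)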
If I have an R script:
print("hi")
commandArgs()
And I run it using:
R CMD BATCH --slave --no-timing test.r output.txt
The output will contain:
[1] "hi"
[1] "/Library/Frameworks/R.framework/Resources/bin/exec/x86_64/R"
[2] "-f"
[3] "test.r"
[4] "--restore"
[5] "--save"
[6] "--no-readline"
[7] "--slave"
How can I suppress the line numbers [1]..[7] in the output so that only the output of the script appears?
Use cat instead of print if you want to suppress the line numbers ([1], [2], ...) in the output.
You are probably also going to want to pass command-line arguments. I think the easiest way to do that is to create a file with an Rscript shebang:
For example, create a file called args.r:
#!/usr/bin/env Rscript
args <- commandArgs(TRUE)
cat(args, sep = "\n")
Make it executable with chmod +x args.r and then you can run it with ./args.r ARG1 ARG2
FWIW, passing command-line parameters with the R CMD BATCH ... syntax is a pain. Here is how you do it: R CMD BATCH "--args ARG1 ARG2" args.r. Note the quotes. More discussion here
UPDATE: changed shebang line above from #!/usr/bin/Rscript to #!/usr/bin/env Rscript in response to #mbq's comment (thanks!)
Yes, mbq is right -- use Rscript, or, if it floats your boat, littler:
$ cat /tmp/tommy.r
#!/usr/bin/r
cat("hello world\n")
print(argv[])
$ /tmp/tommy.r a b c
hello world
[1] "a" "b" "c"
$
You probably want to look at the CRAN packages getopt and optparse for argument parsing, as you would in other scripting languages.
Use commandArgs(TRUE) and run your script with Rscript.
EDIT: Ok, I've misread your question. David has it right.
Stop Rscript from command-numbering the output from print
By default, R's print(...) prepends element indices (the [1]) to its output, like this:
print("we get signal")
Produces:
[1] "we get signal"
Rscript lets the user change the definition of functions like print, which serves our purpose here:
print = cat
print("we get signal")
Produces:
we get signal
Notice that the index prefix and the double quoting are gone.
Get more control over print by using R's first-class functions:
my_print <- function(x, ...) {
  # extra shenanigans for when the wind blows from the east on tuesdays, go here.
  cat(x)
}
print = my_print
print("we get signal")
Prints:
we get signal
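One detail to watch: unlike print, cat does not add a trailing newline, so consecutive calls run together on one line. A variant that keeps line-oriented output:
my_print <- function(x, ...) {
  cat(x, "\n")  # cat() space-separates its arguments, so this also appends a newline
}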
If you're using print as a poor man's debugger... We're not laughing at you, we're laughing with you.