batch mode quoted parameters parsing - r

I'm trying to use my R script in batch mode, but R doesn't seem able to parse quoted parameters properly:
args <- commandArgs(TRUE)
for (i in seq_along(args)) {
  print(paste("ARG", i, args[[i]], sep = " "))
}
Then if a parameter with spaces and quotes is supplied, like:
R CMD BATCH "--args foo=2 bar=3 's=string with spaces'" test-parameters.R output
the output is:
[1] "ARG 1 foo=2"
[1] "ARG 2 bar=3"
[1] "ARG 3 's=string"
[1] "ARG 4 with"
[1] "ARG 5 spaces'"
Of course, I'd like the third parameter to be s='string with spaces'. Is there a way to obtain that?
Thank you!

Yeah, R CMD BATCH acts a little weird.
Try this instead:
R --slave --vanilla --file=test-parameters.R --args foo=2 bar=3 "s=string with spaces" > output
The --slave and --vanilla options might be replaced with more suitable options as needed.
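Once the arguments arrive intact, splitting them into names and values is straightforward. Here is a minimal base-R sketch, assuming every argument has the form name=value. The helper name parse_key_value is made up for illustration and is not part of R:

```r
# Hypothetical helper: turn c("foo=2", "s=string with spaces") into a named list.
# Assumes every element contains at least one "=".
parse_key_value <- function(args) {
  keys   <- sub("=.*$", "", args)      # text before the first "="
  values <- sub("^[^=]*=", "", args)   # text after the first "="
  setNames(as.list(values), keys)
}

# In a real script this vector would come from commandArgs(trailingOnly = TRUE).
opts <- parse_key_value(c("foo=2", "bar=3", "s=string with spaces"))
print(opts$s)
```

All values stay character strings, so numeric ones would still need as.numeric(opts$foo) or similar.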

To stick with R CMD BATCH, I ended up passing the argument with a caret wherever a space is needed, and then restoring the spaces in R with gsub("^"," ",argumentPassed,fixed=TRUE).
Batch Script
R CMD BATCH --no-restore "--args automationRDS='//networkLocation/folder/folder/my^filename^has^spaces.RDS'" "\\networkLocation\folder\folder\RScript.R"
Within R
automationRDS <- gsub("^"," ",automationRDS,fixed=TRUE)
I like the logging feature of R CMD BATCH compared with Rscript.
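As a small illustration of the caret trick, here is a sketch that wraps the gsub call in a function. The name restore_spaces is hypothetical, and the approach assumes the real value never contains a literal caret:

```r
# Hypothetical helper wrapping the gsub() call from the answer above.
restore_spaces <- function(x) {
  # fixed = TRUE treats "^" as a literal caret rather than a regex anchor.
  gsub("^", " ", x, fixed = TRUE)
}

path <- restore_spaces("my^filename^has^spaces.RDS")
print(path)
```

The fixed = TRUE part matters: without it, "^" is read as the start-of-string anchor and the carets are left in place.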

Related

How to read in parameters to R script from commandline when doing GNU parallel jobs on HPC

I am very new to GNU parallel, and I have some very basic questions. I need to do a large parameter sweep, and I have 6 different parameters. Following the documentation provided by the HPC cluster I am using, my command for submitting GNU jobs is the following:
parallel -j 2 --colsep '\t' R CMD BATCH Rscript.R --a {1} --b {2} --c {3} --d {4} --e {5} --f {6} :::: parameter.txt
parameter.txt contains the tab-separated values of the 6 parameters (a-f). Say a=1, b=2, ..., f=6. After googling, I assumed that the R script would read in the parameters via commandArgs(). But this apparently does not work. Simply setting Rscript.R to be a single line:
args<-commandArgs(trailingOnly=TRUE)
the result for args is NULL. What is the correct way to read the parameters into Rscript.R in this case? Thank you very much!
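This related question has no answer here, so what follows is only a hedged sketch. Going by the R CMD BATCH "--args ..." form shown elsewhere on this page (or a launch via Rscript), the flags would reach commandArgs(trailingOnly=TRUE) as a flat character vector, which is empty (character(0)) when nothing is passed. Assuming the flags always come as "--name value" pairs, they could be paired up like this. The helper name parse_flag_pairs is made up for illustration:

```r
# Hypothetical helper: c("--a", "1", "--b", "2") -> list(a = "1", b = "2").
# Assumes strictly alternating "--name value" pairs.
parse_flag_pairs <- function(args) {
  stopifnot(length(args) %% 2 == 0)
  idx    <- seq_len(length(args) %/% 2) * 2   # positions of the values
  flags  <- args[idx - 1]
  values <- args[idx]
  stopifnot(all(startsWith(flags, "--")))
  setNames(as.list(values), sub("^--", "", flags))
}

# In the script this would be commandArgs(trailingOnly = TRUE).
params <- parse_flag_pairs(c("--a", "1", "--b", "2", "--c", "3"))
a <- as.numeric(params$a)
print(a)
```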

R or bash command line length limit

I'm developing a bash program that executes an R one-liner to convert an RMarkdown template into an HTML document.
This R one-liner looks like:
R -e 'library(rmarkdown) ; rmarkdown::render( "template.Rmd", "html_document", output_file = "report.html", output_dir = "'${OUTDIR}'", params = list( param1 = "'${PARAM1}'", param2 = "'${PARAM2}'", ... ) )
I have a long list of parameters, say 10 for the sake of explanation, and it seems that R or bash has a command-line length limit.
When I execute the R one-liner with 10 parameters, I get an error message like this:
WARNING: '-e library(rmarkdown)~+~;~+~rmarkdown::render(~+~"template.Rmd",~+~"html_document",~+~output_file~+~=~+~"report.html",~+~output_dir~+~=~+~"output/",~+~params~+~=~+~list(~+~param1~+~=~+~"param2", ...
Fatal error: you must specify '--save', '--no-save' or '--vanilla'
When I execute the R one-liner with 9 parameters, it works (I tried different combinations to verify that the problem was not the last parameter).
When I execute the R one-liner with 10 parameters but with all spaces removed, it works too, so I guess that R or bash imposes a command-line length limit.
R -e 'library(rmarkdown);rmarkdown::render("template.Rmd","html_document",output_file="report.html",output_dir="'${OUTDIR}'",params=list(param1="'${PARAM1}'",param2="'${PARAM2}'",...))
Is it possible to increase this limit?
This will break in a number of ways, including when your arguments have spaces or quotes in them.
Instead, try passing the values as arguments. Something like this should give you an idea of how it works:
# create a script file
tee arguments.r << 'EOF'
argv <- commandArgs(trailingOnly=TRUE)
arg1 <- argv[1]
print(paste("Argument 1 was", arg1))
EOF
# set some values
param1="foo bar"
param2="baz"
# run the script with arguments
Rscript arguments.r "$param1" "$param2"
Expected output:
[1] "Argument 1 was foo bar"
Always quote your variables and always use lowercase variable names to avoid conflicts with system or application variables.

Using rscript for expression with dash

I am using rscript to run some expressions but I'm having an issue with some cases with dashes. A simple example would be:
$ rscript -e '-1'
ERROR: option '-e' requires a non-empty argument
Adding parentheses works (rscript -e (-1)), but I'm not always sure that the expressions will be properly parenthesized.
In the documentation it says
When using -e options be aware of the quoting rules in the shell used
So I tried using different quoting rules for bash, escaping the dashes or using single quotes but it still doesn't work.
$ rscript -e "\-1"
Error: unexpected input in "\"
Execution halted
Is there something I'm missing?
You misunderstand one part here. An "expression" is something R can parse, i.e.:
$ R --slave -e '1+1'
[1] 2
$
What you hit with -1 is a corner case. You can do
$ R --slave -e 'a <- -1; a'
[1] -1
$
or
$ R --slave -e 'print(-1)'
[1] -1
$
For actual argument parsing, you want a package like docopt (which I like and use a lot), getopt (which I used before), or optparse. All are on CRAN.

How can I suppress the line numbers output using R CMD BATCH?

If I have an R script:
print("hi")
commandArgs()
And I run it using:
r CMD BATCH --slave --no-timing test.r output.txt
The output will contain:
[1] "hi"
[1] "/Library/Frameworks/R.framework/Resources/bin/exec/x86_64/R"
[2] "-f"
[3] "test.r"
[4] "--restore"
[5] "--save"
[6] "--no-readline"
[7] "--slave"
How can I suppress the line numbers [1]..[7] in the output so that only the output of the script appears?
Use cat instead of print if you want to suppress the line numbers ([1], [2], ...) in the output.
I think you are also going to want to pass command line arguments. I think the easiest way to do that is to create a file with the Rscript shebang.
For example, create a file called args.r:
#!/usr/bin/env Rscript
args <- commandArgs(TRUE)
cat(args, sep = "\n")
Make it executable with chmod +x args.r and then you can run it with ./args.r ARG1 ARG2
FWIW, passing command line parameters with the R CMD BATCH ... syntax is a pain. Here is how you do it:
R CMD BATCH "--args ARG1 ARG2" args.r
Note the quotes.
UPDATE: changed shebang line above from #!/usr/bin/Rscript to #!/usr/bin/env Rscript in response to @mbq's comment (thanks!)
Yes, mbq is right -- use Rscript, or, if it floats your boat, littler:
$ cat /tmp/tommy.r
#!/usr/bin/r
cat("hello world\n")
print(argv[])
$ /tmp/tommy.r a b c
hello world
[1] "a" "b" "c"
$
You probably want to look at the CRAN packages getopt and optparse for argument parsing, as you'd do in other scripting languages.
Use commandArgs(TRUE) and run your script with Rscript.
EDIT: Ok, I've misread your question. David has it right.
Stop Rscript from command-numbering the output from print
By default, print(...) prefixes each line of output with the index of its first element, like this:
print("we get signal")
Produces:
[1] "we get signal"
R lets you redefine functions like print (this is not specific to Rscript), so you can simply point print at cat:
print = cat
print("we get signal")
Produces:
we get signal
Notice that the [1] prefix and the double quotes are gone. Note that cat, unlike print, does not append a trailing newline.
Get more control over print by using R's first-class functions:
my_print <- function(x, ...) {
  # extra shenanigans for when the wind blows from the east on Tuesdays go here
  cat(x)
}
print = my_print
print("we get signal")
Prints:
we get signal
If you're using print as a poor man's debugger... We're not laughing at you, we're laughing with you.
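If masking print feels too invasive, cat or writeLines can be called directly instead. A small sketch, using a made-up vector in place of real script output:

```r
values <- c("a", "b", "c")   # stand-in data for illustration

print(values)                # prints with the [1] index prefix and quotes
cat(values, sep = "\n")      # one element per line, no prefix, no quotes
cat("\n")                    # cat() does not add a trailing newline itself
writeLines(values)           # one element per line, each ending in a newline
```

This leaves print untouched for the cases where the index prefix is actually useful, such as long vectors that wrap over several lines.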

Is there a package to process command line options in R?

Is there a package to process command-line options in R?
I know commandArgs, but it's too basic. Its result is basically the equivalent of argc and argv in C, but I need something on top of that, just like boost::program_options in C++ or Getopt::Long in Perl.
In particular, I'd like to specify in advance what options are allowed and give an error message if the user specifies something else.
The call would be like this (with user options --width=32 --file=foo.txt):
R --vanilla --args --width=32 --file=foo.txt < myscript.R
or, if Rscript is used:
myscript.R --width=32 --file=foo.txt
(Please don't say, "why don't you write it yourself, it's not that hard". In other languages you don't have to write it yourself either. :)
getopt for R
How about commandArgs with eval for a built-in solution?
test.R
## 'trailingOnly=TRUE' means only parse args after '--args'
args <- commandArgs(trailingOnly = TRUE)
## Supply default arguments
if (length(args) == 0) {
  print("No arguments supplied.")
  ## supply default values
  a <- 1
  b <- c(1, 1, 1)
} else {
  ## each argument is evaluated as R code, e.g. "a=1" assigns a
  for (i in seq_along(args)) {
    eval(parse(text = args[[i]]))
  }
}
print(a * 2)
print(b * 3)
and to invoke it
R CMD BATCH --no-save --no-restore '--args a=1 b=c(2,5,6)' test.R test.out
The usual caveats w.r.t using eval apply of course.
Shamelessly stolen from this blog post.
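For the original request (declare the allowed options up front and reject anything else) without resorting to eval, here is a rough base-R sketch. It assumes options always look like --name=value, and parse_long_options is a hypothetical helper name, not a function from getopt or optparse:

```r
# Hypothetical helper: validate "--name=value" options against an allowed set.
parse_long_options <- function(args, allowed) {
  well_formed <- grepl("^--[^=]+=", args)
  if (!all(well_formed)) {
    stop("expected --name=value, got: ",
         paste(args[!well_formed], collapse = ", "))
  }
  opt_names <- sub("^--([^=]+)=.*$", "\\1", args)  # text between "--" and "="
  values    <- sub("^--[^=]+=", "", args)          # text after the first "="
  unknown <- setdiff(opt_names, allowed)
  if (length(unknown) > 0) {
    stop("unknown option(s): ", paste0("--", unknown, collapse = ", "))
  }
  setNames(as.list(values), opt_names)
}

# In a script, the first argument would be commandArgs(trailingOnly = TRUE).
opts <- parse_long_options(c("--width=32", "--file=foo.txt"),
                           allowed = c("width", "file"))
width <- as.integer(opts$width)
print(width)
```

Values are returned as strings and never evaluated, so a malicious or mistyped argument cannot run arbitrary R code. For anything beyond this, the CRAN packages mentioned above are the better route.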
