How can I submit a batch job knitting an Rscript on LSF?

How to knit an Rmd document in a job on an LSF cluster?
I tried this:
bsub -q normal -E 'test -e /nfs/users/nfs_c/username' -R "select[mem>20000] rusage[mem=20000]" -M20000 -o DOWNLOAD.o -e DOWNLOAD.e -J DOWNLOAD Rscript -e "rmarkdown::render('code/12downloadSeq.Rmd')"
The above command does not work. I get the following error:
Error in strsplit(version_info, "\n")[[1]] : subscript out of bounds
Calls: <Anonymous> ... pandoc_available -> find_pandoc -> lapply -> FUN -> get_pandoc_version
Execution halted
On the other hand, a similar command that runs a plain R script works:
bsub -q normal -E 'test -e /nfs/users/nfs_c/cr7' -R "select[mem>20000] rusage[mem=20000]" -M20000 -o RSCRIPT.o -e RSCRIPT.e -J RSCRIPT Rscript code/setup.R
I imagine there is a problem with the quotation marks but I do not know how to solve it.
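The quoting appears to survive bsub (R got as far as rmarkdown's pandoc check, per the traceback); the error is raised while rmarkdown probes for pandoc (get_pandoc_version), which suggests the compute node has no working pandoc visible to R. A possible workaround, sketched under the assumption that a pandoc install exists somewhere the node can see (the path below is a placeholder; loading your site's pandoc module, if one exists, would serve the same purpose), is to put the render call into a small wrapper script, which also sidesteps the nested quotation marks:
#!/bin/bash
# knit_download.sh -- hypothetical wrapper script
# rmarkdown looks for pandoc on the PATH or in the directory named by the
# RSTUDIO_PANDOC environment variable, so point it at a pandoc the node can run
export RSTUDIO_PANDOC=/path/to/pandoc/bin   # placeholder path, adjust for your cluster
Rscript -e "rmarkdown::render('code/12downloadSeq.Rmd')"
and submit the wrapper instead of the inline command:
bsub -q normal -E 'test -e /nfs/users/nfs_c/username' -R "select[mem>20000] rusage[mem=20000]" -M20000 -o DOWNLOAD.o -e DOWNLOAD.e -J DOWNLOAD bash knit_download.sh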

Related

How to prevent R system command syntax error

I am trying to run a command similar to
> system("cat <(echo $PATH)")
which fails when run from within R or Rstudio with the following error message:
sh: -c: line 0: syntax error near unexpected token `('
sh: -c: line 0: `cat <(echo $PATH)'
However, if I run this on the command line it works fine:
$ cat <(echo $PATH)
[...]/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin
I checked that the shell I am using is bash using system("echo $SHELL"). Can anyone help me solve this?
This syntax (process substitution) works in bash but not in sh, which is what R's system() uses to run the command. The $SHELL environment variable doesn't necessarily reflect the shell that actually runs the command; echo $0 will show it.
system("echo $0")
#> sh
You could force bash to be used like this
system("bash -c 'cat <(echo $PATH)'")
#> /usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin

Running programmatically created Rscript call at command line

I have the following script:
rstest
text=$1
cmd="Rscript -e \"a='$1'; print(a)\""
echo $cmd
$cmd
This is the output I get when I run it:
balter@spectre3:~$ bash rstest hello
Rscript -e "a='hello'; print(a)"
Error: unexpected end of input
Execution halted
However, if I run the echoed command directly, it runs fine:
balter@spectre3:~$ Rscript -e "a='hello'; print(a)"
[1] "hello"
I would like to understand why this happens. I've tried various combinations of quoting the bash variables and adding eval, but that doesn't seem to be the issue.
EDIT
I tried the answer given below, but get a different result!
balter@spectre3:~$ cat rstest
text=$1
cmd="Rscript -e \"a=$1; print(a)\""
echo $cmd
eval $cmd
balter@spectre3:~$ bash rstest
Rscript -e "a=; print(a)"
Error in cat("pointing to conda env:", env_name, "and lib location", lib, :
argument "env_name" is missing, with no default
Calls: startCondaEnv -> cat
Execution halted
The script below worked for me. The key is eval: when you expand $cmd on its own, bash splits it into words but does not re-parse the embedded quotes, so Rscript receives the quote characters literally and R hits an unterminated string, hence "unexpected end of input". eval makes the shell re-parse the whole string so the quoting takes effect. (In your edit you also dropped the single quotes around $1 and ran bash rstest without an argument, so R was handed a=;, and the startCondaEnv error appears to come from your R startup profile rather than from the script itself.)
text=$1
cmd="Rscript -e \"a='$1'; print(a)\""
echo $cmd
eval $cmd
Removing eval gave the same error you posted.
Rscript -e "a='Hello'; print(a)"
Error: unexpected end of input
Execution halted
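If you would rather avoid eval, a bash array keeps the command words intact without re-parsing a string; here is a minimal sketch of the same rstest script with that one change (same behaviour, different quoting mechanism):
# run with: bash rstest hello
# store the command as an array so the whole R expression stays one argument;
# "${cmd[@]}" expands each element as its own word, no eval needed
text=$1
cmd=(Rscript -e "a='$1'; print(a)")
echo "${cmd[@]}"
"${cmd[@]}"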

Where is my log output going?

I'm executing an R-script from within ruby using the following command:
country = "de"
system("cd ~/myrep/production && Rscript --vanilla main.r --country #{country} --environment production &> /home/myuser/dir/shared/log/log_#{country}.log &")
The log file is created, but I find no output in it, whereas if I execute the command directly in the shell like so:
cd ~/myrep/production && Rscript --vanilla main.r --country de --environment production
Then I see lots of output in the terminal.
update
As suggested in the comments, I tried changing the redirect. It didn't work. Looking at ps aux in a terminal, the command actually being executed appears to be:
/usr/lib/R/bin/exec/R --slave --no-restore --vanilla --file=main.r --args --country de --environment production
So without the output redirection to file...?
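A likely explanation, offered as a guess since only the symptom is shown: Ruby's system() hands a command string containing shell metacharacters to /bin/sh, and &> is a bash-ism. POSIX sh parses cmd &> file as cmd & (run in the background) followed by a separate > file, which creates an empty log file and leaves the real output unredirected. (Redirections are handled by the shell before the command starts, so they would never show up in ps aux output in any case.) The portable spelling of the redirection behaves the same in sh and bash:
cd ~/myrep/production && Rscript --vanilla main.r --country de --environment production > /home/myuser/dir/shared/log/log_de.log 2>&1 &
In the Ruby call, only the redirection part of the interpolated string would need to change.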

Submitting R scripts with command_line_arguments to PBS HPC clusters

Please could you advise me on the following: I've written an R script that reads 3 arguments from the command line, i.e.:
args <- commandArgs(TRUE)
TUMOR <- args[1]
GERMLINE <- args[2]
CHR <- args[3]
When I submit the R script to a PBS HPC scheduler, I do the following (below), but I get an error message.
(I am not posting the error message, because the R script works fine when it is run from a regular terminal.)
How do you usually submit R scripts with command-line arguments to PBS HPC schedulers?
qsub -d $PWD -l nodes=1:ppn=4 -l vmem=10gb -m bea -M tanasa@gmail.com \
-v TUMOR="tumor.bam",GERMLINE="germline.bam",CHR="chr22" \
-e script.efile.chr22 \
-o script.ofile.chr22 \
script.R
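One detail that may explain the failure, hedged since the error message isn't shown: qsub -v passes TUMOR, GERMLINE and CHR as environment variables, not as command-line arguments, so commandArgs(TRUE) never sees them. A common pattern is a small wrapper script (submit.sh is a made-up name) that forwards the environment variables to Rscript as positional arguments:
#!/bin/bash
# submit.sh -- hypothetical wrapper; the -v values arrive as environment
# variables, so pass them on to the R script as command-line arguments
Rscript script.R "$TUMOR" "$GERMLINE" "$CHR"
submitted with the same qsub options but targeting the wrapper:
qsub -d $PWD -l nodes=1:ppn=4 -l vmem=10gb -m bea -M tanasa@gmail.com \
-v TUMOR="tumor.bam",GERMLINE="germline.bam",CHR="chr22" \
-e script.efile.chr22 -o script.ofile.chr22 submit.sh
Alternatively, keep qsub -v as it is and read the values inside R with Sys.getenv("TUMOR") and so on instead of commandArgs().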

R markdown files overlap figures when parallelized using Makefile

I've created a simple example showing the problem I currently have.
I have an R Markdown file named example.Rmd containing the following code:
```{r}
plot(rnorm(10000))
```
and a Makefile with the following content:
all : example01.html example02.html
example01.html : example.Rmd
	Rscript -e "library(knitr); knit2html(input='example.Rmd', output='example01.html')"
example02.html : example.Rmd
	Rscript -e "library(knitr); knit2html(input='example.Rmd', output='example02.html')"
If I run the Makefile sequentially
make
there is no problem.
If I run the makefile in parallel
make -j 2
the figure files generated by the two knit2html calls overlap and both HTML files contain the same image.
Any suggestions? I've been searching for a solution but have found nothing.
Using Karl's idea, I've written a possible solution.
all : example01.html example02.html
example01.html : example.Rmd
	mkdir -p dir_$@
	Rscript -e 'library(knitr); opts_knit$$set(base.dir = "dir_$@"); knit2html(input="example.Rmd", output="dir_$@/$@")'
	mv dir_$@/$@ .
	rm -r dir_$@
example02.html : example.Rmd
	mkdir -p dir_$@
	Rscript -e 'library(knitr); opts_knit$$set(base.dir = "dir_$@"); knit2html(input="example.Rmd", output="dir_$@/$@")'
	mv dir_$@/$@ .
	rm -r dir_$@
There are two modifications with respect to the initial code.
As Karl commented, I've included the line opts_knit$set(base.dir = "dir_example0?.html") so that each target's figure folder is created under its own path (by default both jobs write into the same figure directory, which is why the images clashed).
I've swapped the " and ' symbols in the Rscript -e command, as commented here: with single quotes on the outside, the shell does not expand the $ left over from make's $$ escaping, and the R strings can keep their double quotes.
Parallel execution
make -j 2
works fine.
