Running a programmatically created Rscript call at the command line

I have the following script:
rstest:
text=$1
cmd="Rscript -e \"a='$1'; print(a)\""
echo $cmd
$cmd
This is the output I get when I run it:
balter#spectre3:~$ bash rstest hello
Rscript -e "a='hello'; print(a)"
Error: unexpected end of input
Execution halted
However, if I run the echoed command directly, it runs fine:
balter#spectre3:~$ Rscript -e "a='hello'; print(a)"
[1] "hello"
I would like to understand why this is. I've tried various combinations of quoting the bash variables and adding eval. But that doesn't seem to be the issue.
EDIT
I tried the answer given below, but get a different result!
balter#spectre3:~$ cat rstest
text=$1
cmd="Rscript -e \"a=$1; print(a)\""
echo $cmd
eval $cmd
balter#spectre3:~$ bash rstest
Rscript -e "a=; print(a)"
Error in cat("pointing to conda env:", env_name, "and lib location", lib, :
argument "env_name" is missing, with no default
Calls: startCondaEnv -> cat
Execution halted

The script below worked for me.
text=$1
cmd="Rscript -e \"a='$1'; print(a)\""
echo $cmd
eval $cmd
Removing eval gave the same error you posted.
Rscript -e "a='Hello'; print(a)"
Error: unexpected end of input
Execution halted
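For what it's worth, the likely cause: when bash expands an unquoted $cmd, the result is split into words, but the embedded quote characters are never re-parsed as quoting operators. Rscript therefore receives "a='hello'; and print(a)" as two separate arguments containing literal double-quote characters, so the expression handed to -e begins with an unterminated string and R reports "unexpected end of input". eval re-parses the expanded string, which is why the version above works. (The different result in the EDIT appears to come from running bash rstest without an argument, so $1 was empty and the expression became a=; the conda message looks like output from that machine's R startup profile.) If you would rather avoid eval, one option is to build the command as an array; this is a minimal sketch, not the original rstest:
#!/bin/bash
text=$1
# Each array element stays a single argument, so no re-quoting or eval is needed.
cmd=(Rscript -e "a='$text'; print(a)")
printf '%q ' "${cmd[@]}"; echo    # show the command that will actually run
"${cmd[@]}"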

Related

Grep R error message in bash to halt a pipeline

I have a pipeline I am working on. I have a wrappper.sh that pipes together various .R scripts. However, this pipeline runs straight through an error message. I want to add a way to grep for the word Error and, if it is found, shut down the pipeline. I know I need an if/else statement, but I don't know how to grep this information out of an .R script running inside a bash script. See the example error below.
Current script:
#!/bin/bash
#Bash script for running GeoMx Pipeline
####
# Install required R packages for pipeline
echo "installing R packages"
Rscript installPackages.R
echo "DONE! R packages installed"
#####
# Created required folders
echo "Creating Folders"
Rscript CreateFolder.R
echo "DONE! Folders created"
####
# Copy data over
cp -u -p Path/Initial\ Dataset.xlsx /PATO_TO
####
# Run Statistical Models
echo "Running Statistical Analysis"
Rscript GLM_EdgeR.R
echo "DONE! Statistical Models completed"
Example error:
Error in glmLRT(glmfit, coef = coef, contrast = contrast) :
contrast vector of wrong length, should be equal to number of coefficients in the linear model.
Calls: glmQLFTest -> glmLRT
Execution halted
What I want:
#!/bin/bash
#Bash script for running GeoMx Pipeline
####
# Install required R packages for pipeline
echo "installing R packages"
Rscript installPackages.R
if grep error == TRUE
then
echo "Fatal Error, STOP Pipeline"
STOP
else
echo "DONE! R packages installed"
#####
# Created required folders
echo "Creating Folders"
Rscript CreateFolder.R
if grep error == TRUE
then
echo "Fatal Error, STOP Pipeline"
STOP
else
echo "DONE! Folders created"
####
# Copy data over
cp -u -p Path/Initial\ Dataset.xlsx /PATO_TO
####
# Run Statistical Models
echo "Running Statistical Analysis"
Rscript GLM_EdgeR.R
if grep error == TRUE
then
echo "Fatal Error, STOP Pipeline"
STOP
else
echo "DONE! Statistical Models completed"
You don't need to grep for errors; you can test whether the last exit status was non-zero:
#!/bin/bash
Rscript CreateFolder.R
exit_code=$?
if test $exit_code -ne 0
then
echo "Fatal Error, STOP Pipeline"
exit $exit_code
else
echo "DONE! Folders created"
fi
If Rscript CreateFolder.R fails, the bash script will exit with the same status code.
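For a single check you can also test the command directly; a minimal sketch (note that this shorter form exits with 1 rather than forwarding the original status code):
#!/bin/bash
if ! Rscript CreateFolder.R
then
    echo "Fatal Error, STOP Pipeline"
    exit 1
fi
echo "DONE! Folders created"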
If you have several of these conditions to check, it makes sense to use set -e instead.
Exit immediately if a pipeline, which may consist of a
single simple command, a list, or a
compound command returns a non-zero status.
https://www.gnu.org/software/bash/manual/html_node/The-Set-Builtin.html
Basically it makes your script run until something fails:
#!/bin/bash
set -e
Rscript installPackages.R
echo "DONE! R packages installed"
Rscript CreateFolder.R
echo "DONE! Folders created"
Rscript GLM_EdgeR.R
echo "DONE! Statistical Models completed"
With the second script, CreateFolder.R, failing, it looks like this:
~/r# ./wrappper.sh
[1] "OK"
DONE! R packages installed
Error: object 'will_fail' not found
Execution halted
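If you also want the "Fatal Error, STOP Pipeline" message from your original plan, one possibility (a sketch building on the answer above, not something taken from it) is to pair set -e with a bash ERR trap:
#!/bin/bash
set -e
# Print the message to stderr whenever a command fails; set -e then stops the script.
trap 'echo "Fatal Error, STOP Pipeline" >&2' ERR
Rscript installPackages.R
echo "DONE! R packages installed"
Rscript CreateFolder.R
echo "DONE! Folders created"
Rscript GLM_EdgeR.R
echo "DONE! Statistical Models completed"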

How to prevent R system command syntax error

I am trying to run a command similar to
> system("cat <(echo $PATH)")
which fails when run from within R or Rstudio with the following error message:
sh: -c: line 0: syntax error near unexpected token `('
sh: -c: line 0: `cat <(echo $PATH)'
However, if I run this on the command line it works fine:
$ cat <(echo $PATH)
[...]/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin
I checked that the shell I am using is bash using system("echo $SHELL"). Can anyone help me solve this?
This syntax works in bash but not in sh. The $SHELL environment variable holds your login shell, which is not necessarily the shell being used; echo $0 will show the shell actually running the command.
system("echo $0")
#> sh
You could force bash to be used like this
system("bash -c 'cat <(echo $PATH)'")
#> /usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin

How to get an Rscript to return a status code in non-interactive bash mode

I am trying to get the status code out of an Rscript run non-interactively from a bash script. This step is part of a larger data-processing cycle that involves db2 scripts, among other things.
So I have the following contents in a script sample.sh:
Rscript --verbose --no-restore --no-save /home/R/scripts/sample.r >> sample.rout
When sample.sh is run, it always returns a status code of 0, irrespective of whether sample.r ran to completion or errored out in an intermediate step.
I tried the following things, but with no luck.
1 - In the sample.sh file, I added an if/else condition on the return code, like the one below, but it again wrote back 0 despite sample.r failing in one of its functions.
if Rscript --verbose --no-restore --no-save /home/R/scripts/sample.r >> sample.rout
then
echo -e "0"
else
echo -e "1"
fi
2 - I also tried a wrapper script, sample.wrapper.sh:
r=0
a=$(./sample.sh)
r=$?
echo -e "\n return code of the script is: $a\n"
echo -e "\n The process completed with status: $r"
Here too, neither variable a nor r held the expected '1' when sample.r failed in an intermediate step. Ideally, I would like a way to capture the error (as '1') in a.
Could someone please advise how to get Rscript to return '0' only when the entire script completes without errors and '1' in all other cases?
I greatly appreciate the input, thank you!
I solved the problem by returning the status code in addition to echoing it. Below is the code snippet from the sample.sh script. In addition, in the sample.r code I added tryCatch to catch errors and call quit(status = 1).
function fun {
if Rscript --verbose --no-restore --no-save /home/R/scripts/sample.r > sample.rout 2>&1
then
echo -e "0"
return 0
else
echo -e "1"
return 1
fi
}
fun
thanks everyone for your inputs.
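For reference, the tryCatch/quit pattern mentioned above can be demonstrated in one line; the stop("boom") call below is just a stand-in for whatever fails inside sample.r:
Rscript -e 'tryCatch(stop("boom"), error = function(e) quit(save = "no", status = 1))'
echo $?   # prints 1, which the if/else in sample.sh then picks up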
The above code works for me. I modified it so that I could reuse the function and have it exit when there's an error:
Rscript_with_status () {
rscript=$1
if Rscript --vanilla $rscript
then
return 0
else
exit 1
fi
}
Run R scripts with:
Rscript_with_status /path/to/script/sample.r
Your remote script needs to provide a proper exit status.
You can make a first test by putting, e.g., "exit 1" at the end of the remote script and see that it makes a difference.
remote.sh:
#!/bin/sh
exit 1
From local machine:
ssh -l username remoteip /home/username/remote.sh
echo $?
1
But the remote script should also pass back the exit status of its last executed command. Experiment further by modifying your remote script:
#!/bin/sh
#exit 1
/bin/false
The exit status of the remote command will now also be 1.

How can I submit a batch job knitting an Rscript on LSF?

How to knit an Rmd document in a job on an LSF cluster?
I tried this:
bsub -q normal -E 'test -e /nfs/users/nfs_c/username' -R "select[mem>20000] rusage[mem=20000]" -M20000 -o DOWNLOAD.o -e DOWNLOAD.e -J DOWNLOAD Rscript -e "rmarkdown::render('code/12downloadSeq.Rmd')"
The above command does not work. I get the following error:
Error in strsplit(version_info, "\n")[[1]] : subscript out of bounds
Calls: <Anonymous> ... pandoc_available -> find_pandoc -> lapply -> FUN -> get_pandoc_version
Execution halted
On the other hand, a script like the following to run a normal R script works:
bsub -q normal -E 'test -e /nfs/users/nfs_c/cr7' -R "select[mem>20000] rusage[mem=20000]" -M20000 -o RSCRIPT.o -e RSCRIPT.e -J RSCRIPT Rscript code/setup.R
I imagine there is a problem with the quotation marks but I do not know how to solve it.
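One way to sidestep bsub quoting issues (a sketch only; knit.sh is a hypothetical wrapper name, and this is not a confirmed fix for the pandoc error above) is to put the Rscript call into a small shell script and submit that; the find_pandoc call in the traceback also suggests checking that pandoc is visible on the compute node, either on PATH or via the RSTUDIO_PANDOC environment variable:
#!/bin/bash
# knit.sh -- hypothetical wrapper; bsub then runs a single plain command with no nested quotes.
# The pandoc location below is an assumption; adjust or drop it if pandoc is already on PATH.
# export RSTUDIO_PANDOC=/path/to/pandoc/bin
Rscript -e "rmarkdown::render('code/12downloadSeq.Rmd')"
Submitted with:
bsub -q normal -E 'test -e /nfs/users/nfs_c/username' -R "select[mem>20000] rusage[mem=20000]" -M20000 -o DOWNLOAD.o -e DOWNLOAD.e -J DOWNLOAD ./knit.sh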

Passing arguments from a call to a bash script to an Rscript

I have a bash script that does some things and then calls Rscript. Here a simple example to illustrate:
test.sh:
Rscript test.r
test.r:
args <- commandArgs()
print(args)
How can I make ./test.sh hello on the command line result in R printing "hello"?
You can have bash pass all the arguments to the R script using something like this for a bash script:
#!/bin/bash
Rscript /path/to/R/script --args "$*"
exit 0
You can then choose how many of the arguments from $* need to be discarded inside of R.
I found that the way to deal with this is:
test.sh:
Rscript test.r $1
test.r:
args <- commandArgs(TRUE)
print(args)
The $1 represents the first argument passed to the bash script.
When calling commandArgs() instead of commandArgs(TRUE), you do not just get what was passed from bash; the result also includes the arguments R itself was called with (the executable, the script path, and any flags).
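To forward every argument instead of just the first one, the quoted "$@" keeps each argument intact even if it contains spaces; a minimal sketch of the bash side, with commandArgs(trailingOnly = TRUE) (equivalent to commandArgs(TRUE)) on the R side:
#!/bin/bash
# test.sh (sketch): forward all arguments to the R script, preserving word boundaries
Rscript test.r "$@"
Running ./test.sh hello "two words" would then hand R two arguments, "hello" and "two words".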
Regarding asb's answer:
Having "--args" on the bash script's command line doesn't work for me; "--args" is passed through as a literal argument to my R script. Taking it out works, i.e. Rscript /path/to/my/rfile.R arg1 arg2.
bash version: GNU bash, version 3.2.25(1)-release (x86_64-redhat-linux-gnu)
Rscript version: R scripting front-end version 3.0.1 (2013-05-16)
