Calling external program from R with multiple commands in system - r

I am new to programming and mainly I am able to do some scripts within R, but for my work I need to call an external program. For this program to work on the ubuntu's terminal I have to first use setenv and then execute the program. Googling I've found the system () and Sys.setenv() functions, but unfortunately I can make it function.
This is the code that does work in the ubuntu terminal:
$ export PATH=/home/meme/bin:$PATH
$ mast "/home/meme/meme.txt" "/home/meme/seqs.txt" -o "/home/meme/output" -comp
Where the first two arguments are input files, the -o argument is the output directory and the -comp is another parameter for the program to run.
The reason that I need to do it in R despite it already works in the terminal is because I need to run the program 1000 times with 1000 different files so I want to make a for loop where the input name changes in every loop and then analyze every output in R.
I have already tried to use:
Sys.setenv(PATH="/home/meme/bin"); system(mast "/home/meme/meme.txt" "/home/meme/seqs.txt" -o "/home/meme/output" -comp )
and
system(Sys.setenv(PATH="/home/meme/bin") && mast "/home/meme/meme.txt" "/home/meme/seqs.txt" -o "/home/meme/output" -comp )
but always received:
Error: unexpected constant string in "system(mast "/home/meme/meme.txt""
or
Error: unexpected symbol in "system(Sys.setenv(PATH="/home/meme/bin") && mast "/home/meme/meme.txt""
At this point I have run out of ideas to make this work. If this has already been answered, then my googling have just been poor and I would appreciate any links to its response.
Thank you very much for your time.
Carlos
Additional details:
I use Ubuntu 12.04 64-bits version, RStudio version 0.97.551, R version 3.0.2 (2013-09-25) -- "Frisbee Sailing" Platform: x86_64-pc-linux-gnu (64-bit).
The program I use (MAST) finds a sequence pattern in a list of letters and is part of the MEME SUIT version 4.9.1 found in http://meme.nbcr.net/meme/doc/meme-install.html and run through command line. The command-line usage for mast is:
mast <motif file> <sequence file> [options]

Construct the string you want to execute with paste and feed that to system:
for(i in 1:10){
cmd=paste("export FOO=",i," ; echo \"$FOO\" ",sep='')
system(cmd)
}
Note the use of sep='' to stop paste putting spaces in, and back-quoting quote marks in the string to preserve them.
Test before running by using print(cmd) instead of system(cmd) to make sure you are getting the right command built. Maybe do:
if(TESTING){print(cmd)}else{system(cmd)}
and set TESTING=TRUE or FALSE in R before running.
If you are going to be running more than one shell command per system call, it might be better to put them all in one shell script file and call that instead, passing parameters from R. Something like:
cmd = paste("/home/me/bin/dojob.sh ",i,i+1)
system(cmd)
and then dojob.sh is a shell script that parses the args. You'll need to learn a bit more shell scripting.

Related

How can I activate a '.js' script from R?

I have a '.js' script that I usually activate from the terminal using the command node script.js. As this is part of a process where I first do some data analysis in R, I want to avoid the manual step of opening the terminal and typing the command by simply having R do it for me. My goal would be something like this:
...R analysis
write.csv(df, "data.csv")
system('node script.js')
However, when I use that specific code, I get the error:
sh: 1: node: not found
Warning message:
In system("node script.js") : error in running command
Of course, the same command runs without problem if I type it directly on the terminal.
About my Software
I am using:
Linux computer with the PopOS!
RStudio 2021.09.1+372 "Ghost Orchid"
R version 4.0.4.
The error message node: not found indicates that it couldn't find the program node. It's likely in PATH in your terminal's shell, but not in system()'s shell (sh).
In your terminal, locate node by executing which node. This will show the full path to the executable. Use that full path in system() instead.
Alternatively, run echo $PATH in your terminal, and run system('echo $PATH') or Sys.getenv('PATH') in R. Add any missing directories to R's path with Sys.setenv(PATH = <your new path string>)
Note that system2() is recommended over system() nowadays - but for reasons unimportant for your case. See ?system and ?system2 for a comparison.
Examples
Terminal
$ which node
/usr/bin/node
R
system('/usr/bin/node script.js')
# alternatively:
system2('/usr/bin/node', c('script.js'))
or adapt your PATH permanently:
Terminal
% echo $PATH
/usr/local/bin:/home/caspar/programs:/usr/bin:/bin:/usr/sbin:/sbin:/opt/X11/bin
R
> Sys.getenv('PATH')
[1] "/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/opt/X11/bin:/Applications/RStudio.app/Contents/MacOS/postback"
> Sys.setenv(PATH = "/home/caspar/programs:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/opt/X11/bin:/Applications/RStudio.app/Contents/MacOS/postback")

How to invoke a system command in R on shared computing cluster?

I am attempting to call a system command from within an R script. The system command I want to call is a name of a software that I load as a module on a shared computing cluster (using singularity). The problem I am having is that the software is not able to run when I use the system command.
system_trial.R) has only one line:
system('STAR')
hostname[1] Rscript system_trial.R
sh: STAR: command not found
Warning message:
In system("STAR") : error in running command
Of course, the software works if I call it from the shell directly.
hostname[2] STAR
Usage: STAR [options]... --genomeDir /path/to/genome/index/ --readFilesIn R1.fq R2.fq
Spliced Transcripts Alignment to a Reference (c) Alexander Dobin, 2009-2020
If I run which STAR, I get singularity exec /apps/singularity-3/star/star-2.7.5a--0.sif STAR $#
Replacing system('STAR') with system('singularity exec /apps/singularity-3/star/star-2.7.5a--0.sif STAR $#') actually executes the software.
Replacing system('STAR') with system('which STAR'), returns which: no STAR in (/bin:etc...)
Using system2('STAR') gives sh: STAR: command not found.
I would like to simply use system('STAR'). How can I achieve this?
Related post without an answer: R: calling a system command

responding yes to terminal prompt via system2() in R

tl;dr: How can I invoke the system command y | conda create --name gee_interface from an R console, e.g. via system2()? I'm comfortable enough with system2('conda', c('create', '--name', 'gee_interface')), but I don't know how to handle piping in the 'y' via system2().
Details
I am trying to use an R console to run the bash command conda create --name gee_interface (OSX Mojave with Anaconda installed).
In terminal, that command executes just fine, but prompts me to answer with Proceed ([y]/n)? (I answer 'y' and everything works smoothly).
In R, I run
Sys.setenv(PATH = paste(c("/Applications/anaconda3/bin", Sys.getenv("PATH")), collapse = .Platform$path.sep)) # ensures that system2() finds conda
system2('conda', c('create', '--name', 'gee_interface')) # This is the key line for the purposes of this question
When running the second line [i.e. system2('conda', c('create', '--name', 'gee_interface'))], the process never finishes, but quickly falls to zero CPU usage. Presumably the system is waiting for my response to the prompt, but I don't know how to provide it. How does one do this via an R script? Note also that in my particular case, the number of times that I need to respond 'y' is variable, depending on whether an environment of the name gee_interface already exists or not.
The fix to your first problem is to tell conda not to ask for confirmation using -y:
system2('conda', c('create', '--name', 'gee_interface', '-y'))
As to the second part (variable times that your input is required), I'm guessing it's to overwrite the environment if it exists? In that case, you could check for its existence first with conda info --envs, and run conda remove --name gee_interface --all if necessary before creating it.
See:
https://docs.conda.io/projects/conda/en/latest/commands/create.html
https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#removing-an-environment
You could also try your system2 call, with the argument input = "y", but that doesn't fix your second problem of needing to affirm multiple times.
See: Invoke a system command and pipe a variable as an argument

Using python to execute .R script

After seven hours of googling and rereading through somewhat similar questions, and then lots of trial and error, I'm now comfortable asking for some guidance.
To simplify my actual task, I created a very basic R script (named test_script):
x <- c(1,2,3,4,5)
avg <- mean(x)
write.csv(avg, file = "output.csv")
This works as expected.
I'm new to python and I'm just trying to figure out how to execute the R script so that the same .csv file is created.
Notable results come from:
subprocess.call(["C:/Program Files/R/R-2.15.2/bin/R", 'C:/Users/matt/Desktop/test_script.R'])
This opens a cmd window with the typical R start-up verbiage, except there is a message which reads, "ARGUMENT 'C:/Users/matt/Desktop/test_script.R' __ ignored __"
And:
subprocess.call(['C:/Program Files/R/R-2.15.2/bin/Rscript', 'C:/Users/matt/Desktop/test_script.r'])
This flashes a cmd window and returns a 0, but no .csv file is created.
Otherwise, I've tried every suggestion I could identify on this site or any other. Any insight will be greatly appreciated. Thanks in advance for your time and efforts.
Running R --help at the command prompt prints:
Usage: R [options] [< infile] [> outfile]
or: R CMD command [arguments]
Start R, a system for statistical computation and graphics, with the
specified options, or invoke an R tool via the 'R CMD' interface.
Options:
-h, --help Print short help message and exit
--version Print version info and exit
...
-f FILE, --file=FILE Take input from 'FILE'
-e EXPR Execute 'EXPR' and exit
FILE may contain spaces but not shell metacharacers.
Commands:
BATCH Run R in batch mode
COMPILE Compile files for use with R
...
Try
call(["C:/Program Files/R/R-2.15.2/bin/R", '-f', 'C:/Users/matt/Desktop/test_script.R'])
There are also some other command-line arguments you can pass to R that may be helpful. Run R --help to see the full list.
It might be too late, but hope it helps for others:
Just add --vanilla in the call list.
subprocess.call(['C:/Program Files/R/R-2.15.2/bin/Rscript', '--vanilla', 'C:/Users/matt/Desktop/test_script.r'])

R and System calls

I have used R in the past to do very basic calls to the commmand line. The example can be found here.
This time around, I am looking to mimic this code which runs successfully from the command line in Windows:
> cd C:\Documents and Settings\BTIBERT\My Documents\My Dropbox\Eclipse\Projects\R\MLB\retrosheet\rawdata
> bgame -y 2010 2010bos.eva >2010bos.txt
This is the code I am trying to run inside of R. I have already set the working directory inside of R.
dir <- paste("cd", getwd(), sep=" ")
system(dir)
system("bgame -y 2010 2010bos.eva >2010bos.txt")
I am sure this is user error, but what am I doing wrong? It appears to work initially, but returns the following error. I very well could be doing something wrong, but I believe I am using the same commands.
Expanded game descriptor, version 109(185) of 05/08/2008.
Type 'bgame -h' for help.
Copyright (c) 2001 by DiamondWare.
[Processing file 2010bos.eva.]
>2010bos.txt: can't open.
Warning message:
running command 'bgame -y 2010 2010bos.eva >2010bos.txt' had status 2
Any help you can provide will be appreciated.
You need to issue all commands in one system() call:
system(paste("cd",getwd() "&& bgame -y 2010 2010bos.eva >2010bos.txt",sep=" "))
You should already be in your working directory, so I'm not sure the cd getwd() is necessary. And you may need quotes around your path because it contains spaces. The error may be resolved by putting spaces around >.
If I were in your shoes, I would try this:
system("bgame -y 2010 2010bos.eva > 2010bos.txt")
UPDATE:
And you should probably heed this advice in the "Differences between Unix and Windows" section of ?system that says you should use shell:
• The most important difference is that on a Unix-alike
‘system’ launches a shell which then runs ‘command’. On
Windows the command is run directly - use ‘shell’ for an
interface which runs ‘command’ _via_ a shell (by default the
Windows shell ‘cmd.exe’, which has many differences from the
POSIX shell).
This means that it cannot be assumed that redirection or
piping will work in ‘system’ (redirection sometimes does, but
we have seen cases where it stopped working after a Windows
security patch), and ‘system2’ (or ‘shell’) must be used on
Windows.
Has no-one else found that system("dir", intern = T) for example doesn't work, but that you need system("cmd.exe /c dir", intern = T)? Only the latter works for me. I found this at the discussion site here (William Dunlap's post, about a third of the way down).
Also, it doesn't work with the "cd" command, but you can use the setwd() function within R and then the command will be executed within that directory.
I created the following functions for convenience, for executing programmes and running commands:
#the subject is an input file that a programme might require
execute <- function(programme, subject.spec = "", intern = FALSE, wait = FALSE){
if(!identical(subject.spec, "")){subject.spec <- paste0(" ", subject.spec)} #put space before the subject if it exists
system(paste0("cmd.exe /c ", programme, subject.spec), intern = intern, wait = wait)
}
command <- function(command, intern = TRUE, wait = FALSE){
system(paste("cmd.exe /c", command), intern = T, wait = wait)
}
Does it break your code when you get error 1 or does execution continue?
Whenever executing system commands through another language it is useful to print the system call before you call it to see exactly what is happening, pull up the shell you are intending to use and check for the same error. As the command is executed correctly this could be a hickup in bgame or R.
If you look at http://astrostatistics.psu.edu/datasets/R/html/base/html/shell.html you can see the variable flag passed to the system call."flag the switch to run a command under the shell. If the shell is bash or tcsh the default is changed to "-c"."
Also "the shell to be used can be changed by setting the configure variable R_SHELL to a suitable value (a full path to a shell, e.g. /usr/local/bin/bash)."

Resources