Running multiple R scripts sequentially from shell in the same R session

Is it possible to run multiple .R files from the shell or a bash script, in sequence, in the same R session (so without having to write intermediate results to disk)?
E.g. if file1.R contains a=1 and file2.R contains print(a+1),
then do something like
$ Rscript file1.R file2.R
[1] 2
(of course a workaround would be to stitch the scripts together or have a master script sourcing 1 and 2)

You could write a wrapper script that calls each script in turn:
source("file1.R")
source("file2.R")
Call this source_files.R and then run Rscript source_files.R. Of course, with something this simple you can also just pass the statements on the command line:
Rscript -e 'source("file1.R"); source("file2.R")'
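If the list of scripts is only known when the command is run, the -e expression can be generated instead of typed. A minimal sketch, assuming a hypothetical helper called build_source_expr (it is not part of R or Rscript) and file names that contain no double quotes or backslashes:

```shell
#!/bin/sh
# build_source_expr is a hypothetical helper: it turns a list of file names
# into one R expression that sources them in order.
# Assumption: file names contain no double quotes or backslashes.
build_source_expr() {
    expr=""
    for f in "$@"; do
        # Append one source() call per file, separated by "; "
        expr="${expr}${expr:+; }source(\"${f}\")"
    done
    printf '%s\n' "$expr"
}

# Print the expression that would be handed to Rscript -e
build_source_expr file1.R file2.R
```

It would then be used as Rscript -e "$(build_source_expr file1.R file2.R)", which runs both files in one R session, so a from file1.R is still defined when file2.R runs.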

Related

Pass output of R script to bash script

I would like to pass the output of my R file to a bash script.
The output of the R script being the title of a video: "Title of Video"
And my bash script being simply:
youtube-dl title_of_video.avi https://www.youtube.com/watch?v=w82a1FTere5o88
Ideally I would like the downloaded video to be named "Title of Video".avi
I know I can use Rscript to launch an R script with a bash command but I don't think Rscript can help me here.
In bash you can call a command and use its output further via the $(my_command) syntax.
Make sure your script only outputs the title.
E.g.
# in getTitle.R
cat('This is my title') # note, print would have resulted in an extra "[1]"
# in your bash script:
youtube-dl "$(Rscript getTitle.R)" http://blablabla
If you want to pass arguments to your R script as well, do so inside the $() syntax. If you want to pass arguments to your bash script and have it forward them to the R script, use the special bash variables $1 (the first argument), $2 and so on, or $* for all arguments passed to the bash script, e.g.
#!/bin/bash
youtube-dl "$(Rscript getTitle.R "$1")" http://blablablabla
(this assumes that getTitle.R does something to those arguments internally via commandArgs etc to produce the wanted title output)
You can call Rscript in a bash script and assign the output of the R script to a variable in the bash script. After that you can execute
youtube-dl "$outputFromRScript" https://www.youtube.com/watch?v=w82a1FTere5o88
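Because the title contains spaces, quoting around $() matters. The sketch below shows the difference; get_title and count_args are made-up helpers, with get_title standing in for Rscript getTitle.R so that it runs without R:

```shell
#!/bin/sh
# get_title is a stand-in for `Rscript getTitle.R` so the sketch runs without R.
get_title() {
    printf 'Title of Video'
}

# count_args reports how many separate arguments a command would receive.
count_args() {
    echo "$#"
}

# Unquoted: the shell splits the title on spaces into three arguments.
count_args $(get_title)

# Quoted: the title, with the extension appended, stays one argument.
count_args "$(get_title).avi"

# Storing the output in a variable works the same way.
title="$(get_title)"
printf '%s\n' "${title}.avi"
```

If the title is meant to become the file name, it would presumably go to youtube-dl's -o (output template) option, e.g. youtube-dl -o "$(Rscript getTitle.R).avi" followed by the URL.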

Can I force Rscript to pass wildcard in command line arguments?

As best as I can sort out (since I haven't had much luck finding documentation of it), when one runs Rscript with a command argument that includes a wildcard *, the argument is expanded to a character vector of the file paths that match, or passed through unchanged if there are no matches. Is there a way to pass the wildcard all the time, so I can handle it myself within the script (using things like Sys.glob, for example)?
Here's a minimal example, run from the terminal:
ls
## foo.csv bar.csv baz.txt
Rscript -e "print(commandArgs(T))" *.csv
## [1] "foo.csv" "bar.csv"
Rscript -e "print(commandArgs(T))" *.txt
## [1] "baz.txt"
Rscript -e "print(commandArgs(T))" *.rds
## [1] "*.rds"
EDIT: I have learned that this behavior is from bash, not Rscript. Is there some way to work around this behavior from within R, or to suppress wildcard expansion for a particular R script but not the Rscript command? In my particular case, I want to run a function with two arguments, Rscript collapse.R *.rds out.rds that concatenates the contents of many individual RDS files into a list and saves the result in out.rds. But since the wildcard gets expanded before being passed to R, I have no way of checking whether the second argument has been supplied.
If I understand correctly, you don't want bash to glob the wildcard for you, you want to pass the expression itself, e.g. *.csv. Some options include:
Pass the expression as a quoted string and process it within R:
Rscript -e "list.files(pattern = commandArgs(TRUE))" "\.csv$"
Pass just the extension and build the rest of the pattern within R:
Rscript -e "list.files(pattern = paste0('\\\\.', commandArgs(TRUE)))" "csv$"
Disable globbing for that command, which is complicated and unnecessary here: Stop shell wildcard character expansion?
Note: I've changed the argument to a regex to prevent it matching too greedily.
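The shell side of this can be checked without R. In the sketch below, show_args is a made-up stand-in for Rscript -e "print(commandArgs(TRUE))", and the files are created in a temporary directory:

```shell
#!/bin/bash
# Work in a throwaway directory so the demonstration is self-contained.
demo_dir="$(mktemp -d)"
cd "$demo_dir" || exit 1
touch foo.csv bar.csv baz.txt

# show_args prints each argument it receives on its own line,
# standing in for Rscript -e "print(commandArgs(TRUE))".
show_args() {
    printf '%s\n' "$@"
}

# Unquoted: the shell expands the pattern before the command starts.
show_args *.csv

# Quoted: the pattern reaches the command literally.
show_args '*.csv'

# set -f turns expansion off for the following commands; set +f restores it.
set -f
show_args *.csv
set +f
```

With the pattern quoted, as in Rscript collapse.R '*.rds' out.rds, the script always receives exactly two arguments and can expand the first one itself with Sys.glob.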

How To Run Different, Multiple Rscripts on SGE Cluster

I am trying to run different Rscripts on a SGE cluster, each Rscript only changes by one variable (e.g. cancer <- "UVM" or "ACC", etc.).
I have attempted two ways: either run a Single Rscript that gets command line arguments for the 30 different cancer names
OR
run each Rscript (i.e. UVM.r, ACC.r, etc.)
Either way, I am having a lot of difficulty figuring out how to submit these jobs so I can run either one Rscript 30 times with a different argument each time OR run multiple Rscripts with no command line arguments.
You can use a while loop in bash for this.
Set up an input file of arguments, e.g. args.txt:
UVM
ACC
Run qsub in a while loop to submit the script once for each argument:
while read -r arg
do
    echo "Rscript script.R ${arg}" | qsub <options>
done < args.txt
The loop uses echo to pass the command to qsub on standard input.
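Alternatively, submit a single array job with a job script like this:
#!/bin/bash
#$ -t 1-30
shift "${SGE_TASK_ID}"
exec Rscript script.R "$1"
Submit it like this: qsub job_script dummy UVM ACC ...
Each task shifts its arguments by its task ID, so the dummy first argument and the names belonging to earlier tasks are dropped, leaving $1 as the name for this task.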
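A variant of the array job can read the names from a file rather than from the qsub command line. This is only a sketch: it assumes one name per line and picks the line whose number equals the task ID. The args_file variable and the temporary sample file are made up for the demonstration; on the cluster args_file would point at the real args.txt.

```shell
#!/bin/bash
#$ -t 1-30
# Hypothetical variant of an array job script: pick this task's argument
# from a file with one name per line.
# SGE sets SGE_TASK_ID for array jobs; default to 1 so the sketch also
# runs outside the scheduler.
task_id="${SGE_TASK_ID:-1}"

# Sample data for the sketch; on the cluster, set args_file=args.txt instead.
args_file="$(mktemp)"
printf 'UVM\nACC\n' > "$args_file"

# Print only line number task_id of the file.
cancer="$(sed -n "${task_id}p" "$args_file")"

# echo only shows the command; on the cluster, replace echo with exec.
echo Rscript script.R "$cancer"
```

Submitted with qsub job_script and no further arguments, each task would then run script.R with its own line of the file.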

How to pass parameters in system call in Perl?

I'm calling an R script from a Perl program with the system command, passing Perl variables as its arguments.
#!/usr/bin/perl
$file1 = "Test1.txt";
$file2 = "Test2.txt";
$val="Rscript Test.R ".$file1." ".$file2;
print($val,"\n");
system('Rscript Test.R', $file1, $file2);
But it does not call the R script and pass the file1 and file2 values. How can I fix this?
When using the system LIST syntax, put every argument in the list as a separate element. Otherwise, 'Rscript Test.R' is taken as the name of a single command.
system('Rscript', 'Test.R', $file1, $file2);

Executing expressions in Rscript.exe

I'd like to put some expressions that write stuff to a file directly into a call to Rscript.exe (without specifying file in Rscript [options] [-e expression] file [args] and thus without an explicit R script that is run).
Everything seems to work except the fact that the desired file is not created. What am I doing wrong?
# Works:
shell("Rscript -e print(Sys.time())")
# Works:
write(Sys.time(), file='c:/temp/systime.txt')
# No error, but no file created:
shell("Rscript -e write(Sys.time(), file='c:/temp/systime.txt')")
Rscript parses its command line using spaces as separators. If your R string contains embedded spaces, you need to wrap it within quotes to make sure it gets sent as a complete unit to the R parser.
You also don't need to use shell unless you specifically need features of cmd.exe.
system("Rscript.exe -e \"write(Sys.time(), file='c:/temp/systime.txt')\"")
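The same splitting can be observed in a POSIX shell, which is only an analogue here since cmd.exe quoting rules differ in detail. In this sketch show_args is a made-up helper that prints each argument in brackets so the boundaries are visible:

```shell
#!/bin/sh
# show_args prints each argument in brackets, one per line, so the
# boundaries between arguments are visible.
show_args() {
    for a in "$@"; do
        printf '[%s]\n' "$a"
    done
}

# The R expression from the question, held in a shell variable.
r_expr="write(Sys.time(), file='c:/temp/systime.txt')"

# Unquoted: split at the space, so -e would only get "write(Sys.time(),".
show_args -e $r_expr

# Quoted: the whole expression arrives as the single argument after -e.
show_args -e "$r_expr"
```

The unquoted call shows three arguments and the quoted call two, which is why the expression has to be wrapped in quotes before it is handed to Rscript.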
