Have an Rscript read or take input from stdin - r

I see how to have an Rscript perform the operations I want when given a filename as an argument, e.g. if my Rscript is called script and contains:
#!/usr/bin/Rscript
path <- commandArgs(trailingOnly = TRUE)[1]  # first user-supplied argument; commandArgs()[1] is the R executable itself
writeLines(readLines(path))
Then I can run from the bash command line:
Rscript script filename.Rmd --args dev='svg'
and successfully get the contents of filename.Rmd echoed back out to me. If, instead of passing a filename like filename.Rmd, I want to pass the script text on stdin, I try modifying my script to read from stdin:
#!/usr/bin/Rscript
writeLines(file("stdin"))
but I do not know how to construct the commandline call for this case. I tried piping in the contents:
cat filename.Rmd | Rscript script --args dev='svg'
and also tried redirect:
Rscript script --args dev='svg' < filename.Rmd
and either way I get the error:
Error in writeLines(file("stdin")) : invalid 'text' argument
(I've also tried open(file("stdin"))). I'm not sure if I'm constructing the Rscript incorrectly, or the commandline argument incorrectly, or both.

writeLines() expects a character vector for its text argument, but file("stdin") only returns a connection object. You need to read the text from that connection with readLines() before passing it on. This should work:
#!/usr/bin/Rscript
writeLines(readLines(file("stdin")))

Related

Pass output of R script to bash script

I would like to pass the output of my R file to a bash script.
The output of the R script being the title of a video: "Title of Video"
And my bash script being simply:
youtube-dl title_of_video.avi https://www.youtube.com/watch?v=w82a1FTere5o88
Ideally, I would like the downloaded video to be saved as "Title of Video".avi
I know I can use Rscript to launch an R script with a bash command but I don't think Rscript can help me here.
In bash you can call a command and use its output further via the $(my_command) syntax.
Make sure your script only outputs the title.
E.g.
# in getTitle.R
cat('This is my title') # note, print would have resulted in an extra "[1]"
# in your bash script:
youtube-dl "$(Rscript getTitle.R)" http://blablabla  # the quotes keep a title containing spaces as one argument
If you want to pass arguments to your R script as well, do so inside the $() syntax. If you want to pass those arguments to your bash script and have it delegate them to the R script, you can use the special bash variables $1 (the first argument; $2 is the second, and so on) or $* (all arguments passed to the bash script), e.g.
#!/bin/bash
youtube-dl "$(Rscript getTitle.R "$1")" http://blablablabla
(this assumes that getTitle.R does something to those arguments internally via commandArgs etc to produce the wanted title output)
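To see why quoting the $() substitution matters when the title contains spaces, here is a minimal sketch. The function get_title is a hypothetical stand-in for `Rscript getTitle.R`, so the example runs without R, and count_args is only a helper for illustration. It assumes the default IFS.

```shell
#!/bin/bash
# Hypothetical stand-in for `Rscript getTitle.R`: prints a title containing spaces.
get_title() { printf 'Title of Video'; }

# Illustrative helper: reports how many arguments it received.
count_args() { echo "$#"; }

# Unquoted substitution: the shell splits the output on whitespace.
unquoted_count=$(count_args $(get_title))

# Quoted substitution: the whole title is passed as a single argument.
quoted_count=$(count_args "$(get_title)")

echo "unquoted: $unquoted_count, quoted: $quoted_count"
```

With the unquoted form, the receiving command sees each word of the title as a separate argument, which is probably not what a download command expecting one filename wants.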
You can call Rscript from within a bash script and assign the output of the R script to a bash variable. Check this question. After that you can execute
youtube-dl $outputFromRScript https://www.youtube.com/watch?v=w82a1FTere5o88

arguments defaults for docopt

I'm using docopt in R but I expect a python solution will work for me as well.
library(docopt)
doc = 'Usage:
script.r [<filename>]
Arguments:
<filename> The input filename [default: file.txt]
'
docopt(doc)$filename
gives me NULL when what I expect is file.txt. Or to put it another way, I want these two commands to have the same behavior:
Rscript script.r
Rscript script.r file.txt
Default values are only for options with an argument like this:
--filename=FILENAME The input filename [default: file.txt]
Default values cannot be specified for positional arguments. Could you change your program to use an option with an argument instead?

Can I force Rscript to pass wildcard in command line arguments?

As best as I can sort out (since I haven't had much luck finding documentation of it) when one runs Rscript with a command argument that includes a wildcard *, the argument is expanded to a character vector of the filepaths that match, or passed if there are no matches. Is there a way to pass the wildcard all the time, so I can handle it myself within the script (using things like Sys.glob, for example)?
Here's a minimal example, run from the terminal:
ls
## foo.csv bar.csv baz.txt
Rscript -e "print(commandArgs(T))" *.csv
## [1] "foo.csv" "bar.csv"
Rscript -e "print(commandArgs(T))" *.txt
## [1] "baz.txt"
Rscript -e "print(commandArgs(T))" *.rds
## [1] "*.rds"
EDIT: I have learned that this behavior is from bash, not Rscript. Is there some way to work around this behavior from within R, or to suppress wildcard expansion for a particular R script but not the Rscript command? In my particular case, I want to run a function with two arguments, Rscript collapse.R *.rds out.rds that concatenates the contents of many individual RDS files into a list and saves the result in out.rds. But since the wildcard gets expanded before being passed to R, I have no way of checking whether the second argument has been supplied.
If I understand correctly, you don't want bash to glob the wildcard for you, you want to pass the expression itself, e.g. *.csv. Some options include:
Pass the expression in quoted text and process it within R, either by evaluating that in another command or otherwise
Rscript -e "list.files(pattern = commandArgs(T))" "*\.csv$"
Pass just the extension and process the * within R by context
Rscript -e "list.files(pattern = paste0('*\\\\.', commandArgs(T)))" "csv$"
Through complicated and unnecessary means, disable globbing for that command: Stop shell wildcard character expansion?
Note: I've changed the argument to a regex to prevent it matching too greedily.
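The shell-side behaviour behind these options can be sketched without R. The example below is an illustration under assumptions: count_args is a hypothetical helper standing in for the Rscript call, and the files are created in a temporary directory that mirrors the listing in the question. It shows that an unquoted pattern is expanded by the shell, that quoting passes the pattern through literally, and that `set -f` turns filename expansion off for the current shell.

```shell
#!/bin/bash
# Throwaway directory with the files from the question's listing.
workdir=$(mktemp -d)
touch "$workdir/foo.csv" "$workdir/bar.csv" "$workdir/baz.txt"

# Illustrative helper: reports how many arguments it received.
count_args() { echo "$#"; }

# Unquoted glob: the shell expands it to the two matching files.
expanded=$(count_args "$workdir"/*.csv)

# Quoted glob: the literal pattern arrives as one argument.
quoted=$(count_args "$workdir/*.csv")

# With filename expansion disabled, the unquoted pattern is also passed as-is.
set -f
noglob=$(count_args "$workdir"/*.csv)
set +f   # restore normal globbing

echo "expanded=$expanded quoted=$quoted noglob=$noglob"
rm -r "$workdir"
```

When the literal pattern reaches the R script as a single argument, the script can expand it itself, for example with Sys.glob as mentioned in the question.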

Executing expressions in Rscript.exe

I'd like to put some expressions that write stuff to a file directly into a call to Rscript.exe (without specifying file in Rscript [options] [-e expression] file [args] and thus without an explicit R script that is run).
Everything seems to work except the fact that the desired file is not created. What am I doing wrong?
# Works:
shell("Rscript -e print(Sys.time())")
# Works:
write(Sys.time(), file='c:/temp/systime.txt')
# No error, but no file created:
shell("Rscript -e write(Sys.time(), file='c:/temp/systime.txt')")
Rscript parses its command line using spaces as separators. If your R string contains embedded spaces, you need to wrap it within quotes to make sure it gets sent as a complete unit to the R parser.
You also don't need to use shell unless you specifically need features of cmd.exe.
system("Rscript.exe -e \"write(Sys.time(), file='c:/temp/systime.txt')\"")

submit R script to grid in form of batch

I want to submit an R job to the grid. I have saved the main R code in MGSA_rand.r
In the file callmgsa.r I have written
print('here')
source('/home/users/pegah/MGSA_rand.r')
mgsalooprand($SGE_TASK_ID,382)
And I use the file Rscript.sh to call the job (with the -t parameter I send the value corresponding to $SGE_TASK_ID)
R CMD BATCH --no-save callmgsa.r
I submit the job like this:
qsub -t 1 -cwd -b y -l vf=1000m /home/users/pegah/Rscript.sh
I get neither an error nor any output. The job terminates as soon as I submit it, without any output. Could you please help me?
thanks, Pegah
The variable $SGE_TASK_ID is a shell script variable. Referring to it in R with the same syntax is not going to work. What you could do is use Rscript instead. From the shell script you call:
Rscript callmgsa.r $SGE_TASK_ID
In the R script you can catch the command line arguments like:
args <- commandArgs(trailingOnly = TRUE)
print('here')
source('/home/users/pegah/MGSA_rand.r')
mgsalooprand(as.integer(args[1]), 382)  # command-line arguments arrive as character strings
This should work...
