What does the `input` argument do in the system() function in R? - r

What does the input argument do in the system() function in R? For example in the code below
authentication_test <- "authentication_test aws s3 ls s3://test-bucket/ > /dev/null"
system(authentication_test, input = "q")
I don't understand what purpose the letter q serves.
Looking at the help file, input is described as
input: if a character vector is supplied, this is copied
one string per line to a temporary file, and the standard
input of command is redirected to the file.
but I still have trouble understanding what exactly it is doing.

input creates a temporary file which is used as STDIN for the system shell command.
Take for example the cat command:
system("cat", input = "Line1\nLine2")
#Line1
#Line2
In your bash shell this would be the same as
echo -e "File1\nFile2" > file
cat < file
#Line1
#Line2

Related

Pass output of R script to bash script

I would like to pass the output of my R file to a bash script.
The output of the R script being the title of a video: "Title of Video"
And my bash script being simply:
youtube-dl title_of_video.avi https://www.youtube.com/watch?v=w82a1FTere5o88
Ideally I would like the output of the video being "Title of Video".avi
I know I can use Rscript to launch an R script with a bash command but I don't think Rscript can help me here.
In bash you can call a command and use its output further via the $(my_command) syntax.
Make sure your script only outputs the title.
E.g.
# in getTitle.R
cat('This is my title') # note, print would have resulted in an extra "[1]"
# in your bash script:
youtube-dl $(Rscript getTitle.R) http://blablabla
If you want to pass arguments to your R script as well, do so inside the $() syntax; if you want to pass those arguments to your bash script, and delegate them to the R script to handle them, you can use the special bash variables $1 (denote the 1st argument, etc), or $* to denote all arguments passed to the bash script, e.g.
#!/bin/bash
youtube-dl $(Rscript getTitle.R $1) http://blablablabla
(this assumes that getTitle.R does something to those arguments internally via commandArgs etc to produce the wanted title output)
You can call Rscript in bash script and you might assign the output of the R script to a variable in bash script. Check this question. After that you can execute
youtube-dl $outputFromRScript https://www.youtube.com/watch?v=w82a1FTere5o88

R or bash command line length limit

I'm developing a bash program that execute a R oneliner command to convert a RMarkdown template into a HTML document.
This R oneliner command looks like:
R -e 'library(rmarkdown) ; rmarkdown::render( "template.Rmd", "html_document", output_file = "report.html", output_dir = "'${OUTDIR}'", params = list( param1 = "'${PARAM1}'", param2 = "'${PARAM2}'", ... ) )
I have a long list of parameters, let's say 10 to explain the problem, and it seems that the R or bash has a command line length limit.
When I execute the R oneliner with 10 parameters I obtain a error message like this:
WARNING: '-e library(rmarkdown)~+~;~+~rmarkdown::render(~+~"template.Rmd",~+~"html_document",~+~output_file~+~=~+~"report.html",~+~output_dir~+~=~+~"output/",~+~params~+~=~+~list(~+~param1~+~=~+~"param2", ...
Fatal error: you must specify '--save', '--no-save' or '--vanilla'
When I execute the R oneliner with 9 parameters it's ok (I tried different combinations to verify that the problem was not the last parameter).
When I execute the R oneliner with 10 parameters but with removing all spaces in it, it's ok too so I guess that R or bash use a command line length limit.
R -e 'library(rmarkdown);rmarkdown::render("template.Rmd","html_document",output_file="report.html",output_dir="'${OUTDIR}'",params=list(param1="'${PARAM1}'",param2="'${PARAM2}'",...))
Is it possible to increase this limit?
This will break a number of ways – including if your arguments have spaces or quotes in them.
Instead, try passing the values as arguments. Something like this should give you an idea how it works:
# create a script file
tee arguments.r << 'EOF'
argv <- commandArgs(trailingOnly=TRUE)
arg1 <- argv[1]
print(paste("Argument 1 was", arg1))
EOF
# set some values
param1="foo bar"
param2="baz"
# run the script with arguments
Rscript arguments.r "$param1" "$param2"
Expected output:
[1] "Argument 1 was foo bar"
Always quote your variables and always use lowercase variable names to avoid conflicts with system or application variables.

arguments defaults for docopt

I'm using docopt in R but I expect a python solution will work for me as well.
library(docopt)
doc = 'Usage:
script.r [<filename>]
Arguments:
<filename> The input filename [default: file.txt]
'
docopt(doc)$filename
gives me NULL when what I expect is file.txt. Or to put it another way, I want these two commands to have the same behavior:
Rscript script.r
Rscript script.r file.txt
Default values are only for options with an argument like this:
--filename=FILENAME The input filename [default: file.txt]
Default values can not be specified for arguments. You could maybe change the you program to use an option with an argument instead?

Can I force Rscript to pass wildcard in command line arguments?

As best as I can sort out (since I haven't had much luck finding documentation of it) when one runs Rscript with a command argument that includes a wildcard *, the argument is expanded to a character vector of the filepaths that match, or passed if there are no matches. Is there a way to pass the wildcard all the time, so I can handle it myself within the script (using things like Sys.glob, for example)?
Here's a minimal example, run from the terminal:
ls
## foo.csv bar.csv baz.txt
Rscript -e "print(commandArgs(T))" *.csv
## [1] "foo.csv" "bar.csv"
Rscript -e "print(commandArgs(T))" *.txt
## [1] "baz.txt"
Rscript -e "print(commandArgs(T))" *.rds
## [1] "*.rds"
EDIT: I have learned that this behavior is from bash, not Rscript. Is there some way to work around this behavior from within R, or to suppress wildcard expansion for a particular R script but not the Rscript command? In my particular case, I want to run a function with two arguments, Rscript collapse.R *.rds out.rds that concatenates the contents of many individual RDS files into a list and saves the result in out.rds. But since the wildcard gets expanded before being passed to R, I have no way of checking whether the second argument has been supplied.
If I understand correctly, you don't want bash to glob the wildcard for you, you want to pass the expression itself, e.g. *.csv. Some options include:
Pass the expression in quoted text and process it within R, either by evaluating that in another command or otherwise
Rscript -e "list.files(pattern = commandArgs(T))" "*\.csv$"
Pass just the extension and process the * within R by context
Rscript -e "list.files(pattern = paste0('*\\\\.', commandArgs(T)))" "csv$"
Through complicated and unnecessary means, disable globbing for that command: Stop shell wildcard character expansion?
Note: I've changed the argument to a regex to prevent it matching too greedily.

Converting this code from R to Shell script?

So I'm running a program that works but the issue is my computer is not powerful enough to handle the task. I have a code written in R but I have access to a supercomputer that runs a Unix system (as one would expect).
The program is designed to read a .csv file and find everything with the unit ft3(monthly total) in the "Units" column and select the value in the column before it. The files are charts that list things in multiple units.
To convert this program in R:
getwd()
setwd("/Users/youruserName/Desktop")
myData= read.table("yourFileName.csv", header=T, sep=",")
funData= subset(myData, units="ft3(monthly total)", select=units:value)
write.csv(funData, file="funData.csv")
To a program in Shell Script, I tried:
pwd
cd /Users/yourusername/Desktop
touch RunThisProgram
nano RunThisProgram
(((In nano, I wrote)))
if
grep -r yourFileName.csv ft3(monthly total)
cat > funData.csv
else
cat > nofun.csv
fi
control+x (((used control x to close nano)))
chmod -x RunThisProgram
./RunThisProgram
(((It runs for a while)))
We get a funData.csv file output but that file is empty
What am I doing wrong?
It isn't actually running, because there are a couple problems with your script.
grep needs the pattern first, and quoted; -r is for recursing a
directory...
if without a then
cat is called wrong so it is actually reading from stdin.
You really only need one line:
grep -F "ft3(monthly total)" yourFileName.csv > funData.csv

Resources