How to specify input arguments to Rscript by name from command line? - r

I am new to command line usage and don't think this question has been asked elsewhere. I'm trying to adapt an Rscript to be run from the command line in a shell script. Basically, I'm using some tools in the immcantation framework to read and annotate some antibody NGS data, and then to group sequences into their clonal families. To set the similarity threshold, the creators recommend using a function in their shazam package to set an appropriate threshold.
I've made the simple script below to read and validate the arguments:
#!/usr/bin/env Rscript
params <- commandArgs(trailingOnly=TRUE)
### read and validate mode argument
mode <- params[1]
modeAllowed <- c("ham","aa","hh_s1f","hh_s5f")
if(!(mode %in% modeAllowed)){
stop(paste("illegal mode argument supplied. acceptable values are",
paste(paste(modeAllowed, collapse = ", "), ".", sep = ""), "\nmode should be supplied first",
sep = " "))
}
### execute function
cat(threshold)
The script works. However, since each parameter has only a finite number of options, I was wondering whether there is a way to pass the arguments by name, like --mode aa, from the terminal. All the information I've seen online uses code like my mode <- params[1] above, which I guess only works if the mode argument is supplied first?
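One way to accept arguments like --mode aa in base R is to pair each "--name" flag with the value that follows it. The sketch below is only an illustration: parse_named_args is a hypothetical helper (not part of R or of the immcantation tools), and it assumes every flag takes exactly one value. CRAN packages such as optparse offer a fuller command-line parser if you need defaults, help text, or boolean switches.

```r
# Hypothetical helper: turn c("--mode", "aa") into list(mode = "aa").
# Assumes arguments always arrive as "--name value" pairs.
parse_named_args <- function(args) {
  if (length(args) == 0) {
    return(list())
  }
  if (length(args) %% 2 != 0) {
    stop("arguments must be supplied as '--name value' pairs")
  }
  flags  <- args[c(TRUE, FALSE)]  # odd positions: the names
  values <- args[c(FALSE, TRUE)]  # even positions: the values
  if (!all(grepl("^--", flags)) || any(grepl("^--", values))) {
    stop("arguments must be supplied as '--name value' pairs")
  }
  setNames(as.list(values), sub("^--", "", flags))
}

# In the real script you would pass commandArgs(trailingOnly = TRUE);
# a literal vector is used here so the example runs on its own.
opts <- parse_named_args(c("--mode", "aa"))

modeAllowed <- c("ham", "aa", "hh_s1f", "hh_s5f")
mode <- opts[["mode"]]
if (is.null(mode) || !(mode %in% modeAllowed)) {
  stop("illegal or missing --mode. acceptable values are ",
       paste(modeAllowed, collapse = ", "), ".")
}
cat("mode:", mode, "\n")
```

Because the values are looked up by name, the order in which the flags are given on the command line no longer matters.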

Related

How can I pass the names of a list of files from bash to an R program?

I have a long list of files with names like: file-typeX-sectorY.tsv, where X and Y get values from 0-100. I process each of those files with an R program, but read them one by one like this:
data <- read.table(file='my_info.tsv', sep = '\t', header = TRUE, fill = TRUE)
This is impractical. I want to build a bash program that does something like
#!/bin/bash
for i in {0..100..1}
do
for j in {1..100..1)
do
Rscript program.R < file-type$i-sector$j.tsv
done
done
My problem is not with the bash script but with the R program. How can I receive the files one by one? I have googled and tried instructions like:
args <- commandArgs(TRUE)
or
data <- commandArgs(trailingOnly = TRUE)
but I can't find the way. Could you please help me?
At the simplest level, your problem may be the (possibly accidental?) redirect you have, so remove the <.
Then a minimal R 'program' to take a command-line argument and do something with it would be
#!/usr/bin/env Rscript
args <- commandArgs(trailingOnly = TRUE)
stopifnot("require at least one arg" = length(args) > 0)
cat("We were called with '", args[1], "'\n", sep="")
We use a 'shebang' line and run chmod 0755 basicScript.R to make the script executable. Then your shell double loop, reduced here (and correcting one typo), becomes
#!/bin/bash
for i in {0..2..1}; do
for j in {1..2..1}; do
./basicScript.R file-type${i}-sector${j}.tsv
done
done
and this works as we hope with the inner program reflecting the argument:
$ ./basicCaller.sh
We were called with 'file-type0-sector1.tsv'
We were called with 'file-type0-sector2.tsv'
We were called with 'file-type1-sector1.tsv'
We were called with 'file-type1-sector2.tsv'
We were called with 'file-type2-sector1.tsv'
We were called with 'file-type2-sector2.tsv'
$
Of course, this is horribly inefficient as you have N x M external processes. The two outer loops could be written in R, and instead of calling the script you would call your script-turned-function.
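As a rough sketch of that last suggestion, the two loops can be moved into a single R session. Here process_file is a hypothetical stand-in for whatever the per-file script does (it just reads the table and returns the row count), and the reduced index ranges mirror the shell example above.

```r
# Hypothetical stand-in for the body of the per-file script.
process_file <- function(path) {
  data <- read.table(file = path, sep = "\t", header = TRUE, fill = TRUE)
  nrow(data)
}

# All (type, sector) combinations, mirroring the reduced shell loops.
grid <- expand.grid(i = 0:2, j = 1:2)
files <- sprintf("file-type%d-sector%d.tsv", grid$i, grid$j)

# Only process names that actually exist on disk.
existing <- files[file.exists(files)]
results <- lapply(existing, process_file)
names(results) <- existing
```

This starts R once instead of once per file, which is where most of the overhead of the shell loop comes from.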

Ask for user multiple-line input during R-script execution [duplicate]

I am trying to use
var <- as.numeric(readline(prompt="Enter a number: "))
and later use this in a calculation.
It works fine when running in RStudio, but I need to be able to pass this input from the command line in Windows 10.
I am using a batch file with a single line:
Rscript.exe "C:\My Files\R_scripts\my_script.R"
When it gets to the user input part, it freezes and doesn't provide the expected output.
From the documentation of readline():
This can only be used in an interactive session. [...] In non-interactive use the result is as if the response was RETURN and the value is "".
For non-interactive use - when calling R from the command line - I think you've got two options:
Use readLines(con = "stdin", n = 1) to read user input from the terminal.
Use commandArgs(trailingOnly = TRUE) to supply the input as an argument from the command line when calling the script instead.
More information on each option follows.
1. Using readLines()
readLines() looks very similar to readline(), which you're using, but is meant to read files line by line. If we point it at the standard input (con = "stdin") instead of a file, it will read user input from the terminal. We set n = 1 so that it stops reading when you press Enter (that is, it reads only one line).
Example
Use readLines() in an R script:
# some-r-file.R
# This is our prompt, since readLines doesn't provide one
cat("Please write something: ")
args <- readLines(con = "stdin", n = 1)
writeLines(args[[1]], "output.txt")
Call the script:
Rscript.exe "some-r-file.R"
It will now ask you for your input. In my PowerShell run, I supplied "Any text!".
Then the output.txt will contain:
Any text!
2. Using commandArgs()
When calling Rscript.exe from the terminal, you can add extra arguments. With commandArgs() you can capture these arguments and use them in your code.
Example:
Use commandArgs() in an R script:
# some-r-file.R
args <- commandArgs(trailingOnly = TRUE)
writeLines(args[[1]], "output.txt")
Call the script:
Rscript.exe "some-r-file.R" "Any text!"
Then the output.txt will contain:
Any text!
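Since the question needs a number rather than raw text, either option can be followed by a conversion step. The sketch below is one possible way to do that; to_number and read_number are hypothetical helper names, and a textConnection stands in for the terminal so the example runs on its own.

```r
# Hypothetical helper: convert one piece of text to a number, or fail loudly.
to_number <- function(text) {
  value <- suppressWarnings(as.numeric(text))
  if (length(value) != 1 || is.na(value)) {
    stop("expected a single number, got: ", paste(text, collapse = " "))
  }
  value
}

# Hypothetical helper: read one line from a connection and convert it.
# In a script run with Rscript you would call read_number("stdin").
read_number <- function(con) {
  to_number(readLines(con = con, n = 1))
}

# Demo: a text connection plays the role of the user's typed input.
con <- textConnection("42")
var <- read_number(con)
close(con)
print(var * 2)
```

With the commandArgs() option, the same check would be to_number(commandArgs(trailingOnly = TRUE)[1]).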

R, passing variables to a system command

Using R, I am looking to create a QR code and embed it into an Excel spreadsheet (hundreds of codes and spreadsheets). The obvious way seems to be to create a QR code using the command line, and use the "system" command in R. Does anyone know how to pass R variables through the "system" command? Google is not too helpful as "system" is a bit generic, ?system does not contain any examples of this.
Note - I am actually using data matrices rather than QR codes, but using the term "data matrix" in an R question will lead to havoc, so let's talk QR codes instead. :-)
system("dmtxwrite my_r_variable -o image.png")
fails, as do the variants I have tried with "paste". Any suggestions gratefully received.
Say we have a variable x that we want to pass on to dmtxwrite. You can interpolate it into the command string like this:
x = 10
system(sprintf("dmtxwrite %s -o image.png", x))
or alternatively using paste:
system(paste("dmtxwrite", x, "-o image.png"))
but I prefer sprintf in this case.
It may also be worth considering base::system2, which provides an args argument that can be used for this purpose. Using echo as a stand-in for dmtxwrite:
my_r_variable <- "a"
system2(
'echo',
args = c(my_r_variable, '-o image.png')
)
would return:
a -o image.png
which is equivalent to running echo a -o image.png in the terminal. You may also want to redirect output to text files:
system2(
'echo',
args = c(my_r_variable, '-o image.png'),
stdout = 'stdout.txt',
stderr = 'stderr.txt'
)
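If the R variable can contain spaces or other characters the shell treats specially, it is safer to quote it before building the command. The sketch below uses base R's shQuote for that; build_command is a hypothetical helper, the command shape is copied from the question rather than checked against dmtxwrite, and the command is only printed, not run.

```r
# Hypothetical helper: assemble a command string with shell-quoted values.
# type = "sh" forces POSIX-style single quotes regardless of platform.
build_command <- function(program, value, output) {
  paste(program,
        shQuote(value, type = "sh"),
        "-o",
        shQuote(output, type = "sh"))
}

my_r_variable <- "hello world"
cmd <- build_command("dmtxwrite", my_r_variable, "image.png")
print(cmd)
# To actually run it (assuming dmtxwrite is installed): system(cmd)
```

With system2 the same idea applies: pass args = c(shQuote(my_r_variable), "-o", shQuote("image.png")).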

Retrieving Variable Declaration

How can I find out how I first declared a certain variable when I am a few hundred lines down from where I declared it? For example, I have declared the following:
a <- c(vectorA,vectorB,vectorC)
and now I want to see how I declared it. How can I do that?
Thanks.
You could try using the history command:
history(pattern = "a <-")
to try to find lines in your history where you assigned something to the variable a. I think this matches exactly, though, so you may have to watch out for spaces.
Indeed, if you type history at the command line, it doesn't appear to be doing anything fancier than saving the current history in a tempfile, loading it back in using readLines and then searching it using grep. It ought to be fairly simple to modify that function to include more functionality...for example, this modification will cause it to return the matching lines so you can store it in a variable:
myHistory <- function (max.show = 25, reverse = FALSE, pattern, ...)
{
    file1 <- tempfile("Rrawhist")
    savehistory(file1)
    rawhist <- readLines(file1)
    unlink(file1)
    if (!missing(pattern))
        rawhist <- unique(grep(pattern, rawhist, value = TRUE,
            ...))
    nlines <- length(rawhist)
    if (nlines) {
        inds <- max(1, nlines - max.show):nlines
        if (reverse)
            inds <- rev(inds)
    }
    else inds <- integer()
    #file2 <- tempfile("hist")
    #writeLines(rawhist[inds], file2)
    #file.show(file2, title = "R History", delete.file = TRUE)
    rawhist[inds]
}
I will assume you're using the default R console. If you're on Windows, you can use File -> Save history and open the file in your favorite text editor, or you can use the function savehistory() (see help(savehistory)).
What you need to do is get a (good) IDE, or at least a decent text editor. You will benefit from code folding, syntax coloring and much more. There's a plethora of choices: Tinn-R, VIM, ESS, Eclipse+StatET, RStudio or RevolutionR, among others.
You can run grep 'a<-' .Rhistory from the terminal (assuming that you've cd'd to your working directory). ESS has several very useful history-searching functions, like (comint-history-isearch-backward-regexp), bound to M-r by default.
For further info, consult ESS manual: http://ess.r-project.org/Manual/ess.html
When you define a function, R stores the source code of the function (preserving formatting and comments) in an attribute (named "source" in older versions of R and "srcref" in current ones, when the keep.source option is enabled). When you type the name of the function, you will get this content printed.
But it doesn't do this with variables. You can deparse a variable, which generates an expression that will produce the variable's value but this doesn't need to be the original expression. For example when you have b <- c(17, 5, 21), deparse(b) will produce the string "c(17, 5, 21)".
In your example, however, the result wouldn't be "c(vectorA,vectorB,vectorC)", it would be an expression that produces the combined result of your three vectors.
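A small illustration of that difference, using made-up values for the three vectors (the originals are not given in the question):

```r
# A literal assignment: deparse() happens to reproduce the original call.
b <- c(17, 5, 21)
print(deparse(b))

# Hypothetical values for the vectors from the question.
vectorA <- c(1, 2)
vectorB <- c(3, 4)
vectorC <- c(5, 6)
a <- c(vectorA, vectorB, vectorC)

# deparse() only sees the combined values, not the names used to build them.
print(deparse(a))
```

The second call prints an expression listing the six values, so the variable names on the right-hand side of the original assignment cannot be recovered this way.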
