Better string interpolation in R

I need to build up long command lines in R and pass them to system(). Building each command line with paste0()/paste(), or even sprintf(), is inconvenient. Is there a simpler way?
Instead of this hard-to-read version with too many quotes:
cmd <- paste("command", "-a", line$elem1, "-b", line$elem3, "-f", df$Colum5[4])
or:
cmd <- sprintf("command -a %s -b %s -f %s", line$elem1, line$elem3, df$Colum5[4])
Can I have this:
cmd <- buildcommand("command -a %line$elem1 -b %line$elem3 -f %df$Colum5[4]")

For a tidyverse solution, see https://github.com/tidyverse/glue. Example:
name <- "Foo Bar"
glue::glue("How do you do, {name}?")

With version 1.1.0 (CRAN release on 2016-08-19), the stringr package has gained a string interpolation function str_interp() which is an alternative to the gsubfn package.
# sample data
line <- list(elem1 = 10, elem3 = 30)
df <- data.frame(Colum5 = 1:4)
# do the string interpolation
stringr::str_interp("command -a ${line$elem1} -b ${line$elem3} -f ${df$Colum5[4]}")
#[1] "command -a 10 -b 30 -f 4"

This comes pretty close to what you are asking for. When any function f is prefaced with fn$, i.e. fn$f, character interpolation is performed on its string arguments: each backtick-quoted `...` is replaced with the result of evaluating ... as an R expression.
library(gsubfn)
cmd <- fn$identity("command -a `line$elem1` -b `line$elem3` -f `df$Colum5[4]`")
Here is a self contained reproducible example:
library(gsubfn)
# test inputs
line <- list(elem1 = 10, elem3 = 30)
df <- data.frame(Colum5 = 1:4)
fn$identity("command -a `line$elem1` -b `line$elem3` -f `df$Colum5[4]`")
## [1] "command -a 10 -b 30 -f 4"
system
Since any function can be used we could operate directly on the system call like this. We have used echo here to make it executable but any command could be used.
exitcode <- fn$system("echo -a `line$elem1` -b `line$elem3` -f `df$Colum5[4]`")
## -a 10 -b 30 -f 4
Variation
This variation would also work. fn$f also performs substitution of $whatever with the value of variable whatever. See ?fn for details.
with(line, fn$identity("command -a $elem1 -b $elem3 -f `df$Colum5[4]`"))
## [1] "command -a 10 -b 30 -f 4"

Another option would be to use whisker.render from https://github.com/edwindj/whisker which is a {{Mustache}} implementation in R. Usage example:
require(dplyr); require(whisker)
bedFile="test.bed"
whisker.render("processing {{bedFile}}") %>% print

This is not really a string interpolation solution, but a very good option for this problem is to use the processx package instead of system(). Arguments are passed as a character vector rather than through a shell, so you don't need to quote anything.

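The GetoptLong package offers qq() and qqcat(), which interpolate #{...} expressions:
library(GetoptLong)
# sample values, assumed here so that the example runs
region <- c(1, 2)
value <- 4
name <- "name"
str <- qq("region = (#{region[1]}, #{region[2]}), value = #{value}, name = '#{name}'")
cat(str)
qqcat("region = (#{region[1]}, #{region[2]}), value = #{value}, name = '#{name}'")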
https://cran.r-project.org/web/packages/GetoptLong/vignettes/variable_interpolation.html

Related

How to call exe program and input parameters using R?

I want to call an .exe program (spi_sl_6.exe) from R using system(), but I can't pass input parameters to the program this way. The following is my command:
system("D:\\working\\spi_sl_6.exe")
I have searched the net for a long time without success. Please help or suggest how to achieve this. Thanks in advance.
This is using the Standardized Precipitation Index software from
http://drought.unl.edu/MonitoringTools/DownloadableSPIProgram.aspx.
This seems to give a working solution using Windows (but not without warnings!)
First, download the software and example files:
# Create directory to download software
mydir <- "C:\\Users\\david\\spi"
dir.create(mydir)
url <- "http://drought.unl.edu/archive/Programs/SPI"
download.file(file.path(url, "spi_sl_6.exe"), file.path(mydir, "spi_sl_6.exe"), mode="wb")
# Download example files
download.file(file.path(url, "SPI_samplefiles.zip"), file.path(mydir, "SPI_samplefiles.zip"))
# extract one example file, and write out
temp <- unzip(file.path(mydir, "SPI_samplefiles.zip"), "wymo.cor")
dat <- read.table(temp)
# Use this file as an example input
write.table(dat, file.path(mydir,"wymo.cor"), col.names = FALSE, row.names = FALSE)
From page 3 of the help file basic-spi-program-information.pdf at the above link, the command line should be of the form spi 3 6 12 <infile.dat >outfile.dat. However, neither of the following worked, nor did various other ways of passing the parameters (tried from the command line, not in R):
C:\Users\david\spi\spi_sl_6 3 <C:\Users\david\spi\wymo.cor >C:\Users\david\spi\out.dat
cd C:\Users\david\spi && spi_sl_6 3 <wymo.cor >out.dat
However, the approach from the accepted answer to "Running .exe file with multiple parameters in c#" seems to work. Again from the command line:
cd C:\Users\david\spi && (echo 2 && echo 3 && echo 6 && echo wymo.cor && echo out1.dat) | spi_sl_6
So to run this in R you can wrap this in a shell (you will need to change the path to where you have saved the exe)
shell("cd C:\\Users\\david\\spi && (echo 2 && echo 3 && echo 6 && echo wymo.cor && echo out2.dat) | spi_sl_6", intern=TRUE)
out1.dat and out2.dat should be the same.
This throws warning messages (in R but not from the command line), I think from the echo calls, but the output file is produced.
You can automate the echo calls slightly, so all you need to do is input the time parameters.
timez <- c(2, 3, 6)
stime <- paste("echo", timez, collapse =" && ")
infile <- "wymo.cor"
outfile <- "out3.dat"
spiCall <- paste("cd", mydir, "&& (" , stime, "&& echo", infile, "&& echo", outfile, " ) | spi_sl_6")
shell(spiCall)
You can construct the command using sprintf:
cmd_name <- "D:\\working\\spi_sl_6.exe"
param1 <- "a"
param2 <- "b"
system(sprintf("%s %s %s", cmd_name, param1, param2))
Or using system2() (I prefer this option):
system2(cmd_name, args = c(param1,param2))

R: pass variable from R to unix

I am running an R script via bash script and want to return the output of the R script to the bash script to keep working with it there.
The bash script is something like this:
#!/bin/bash
Rscript MYRScript.R
a=OUTPUT_FROM_MYRScript.R
do something with a
and the R script is something like this:
for(i in 1:5){
i
sink(type="message")
}
I want bash to work with one variable from R at a time, meaning: bash receives i=1 and works with that; when that task is done, it receives i=2, and so on.
Any ideas how to do that?
One option is to make your R script executable with #!/usr/bin/env Rscript (setting the executable bit; e.g. chmod 0755 myrscript.r, chmod +x myrscript.r, etc...), and just treat it like any other command, e.g. assigning the results to an array variable below:
myrscript.r
#!/usr/bin/env Rscript
cat(1:5, sep = "\n")
mybashscript.sh
#!/bin/bash
RES=($(./myrscript.r))
for elem in "${RES[@]}"
do
echo elem is "${elem}"
done
nrussell$ ./mybashscript.sh
elem is 1
elem is 2
elem is 3
elem is 4
elem is 5
Here is MYRScript.R:
for(iter in 1:5) {
cat(iter, ' ')
}
and here is your bash script:
#!/bin/bash
r_output=`Rscript ~/MYRScript.R`
for iter in `echo $r_output`
do
echo Here is some output from R: $iter
done
Here is some output from R: 1
Here is some output from R: 2
Here is some output from R: 3
Here is some output from R: 4
Here is some output from R: 5
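If each value should be handled as it is read, rather than after everything has been collected into one variable, a while read loop is another option. The following is only a sketch: produce_values is a hypothetical stand-in for the R script (something like ./myrscript.r printing one value per line), and process_stream is an illustrative name, not part of either answer above.

```shell
# Stand-in for the R script (assumption): prints 1..5, one value per line.
produce_values() {
  printf '%s\n' 1 2 3 4 5
}

# Read standard input line by line and handle each value in turn.
process_stream() {
  while IFS= read -r elem; do
    echo "elem is ${elem}"
  done
}

produce_values | process_stream
```

With a real script, produce_values would be replaced by the Rscript call; the loop body is where the per-value work would go.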

How to turn strings on the command line into individual positional parameters

My main question is how to split strings on the command line into parameters using a terminal command in Linux?
For example
on the command line:
./my_program hello world "10 20 30"
The parameters are set as:
$1 = hello
$2 = world
$3 = 10 20 30
But I want:
$1 = hello
$2 = world
$3 = 10
$4 = 20
$5 = 30
How can I do it correctly?
You can reset the positional parameters $@ by using the set builtin. If you do not double-quote $@, the shell will word-split it, producing the behavior you desire:
$ cat my_program.sh
#! /bin/sh
i=1
for PARAM; do
echo "$i = $PARAM";
i=$(( $i + 1 ));
done
set -- $@
echo "Reset \$@ with word-split params"
i=1
for PARAM; do
echo "$i = $PARAM";
i=$(( $i + 1 ));
done
$ sh ./my_program.sh foo bar "baz buz"
1 = foo
2 = bar
3 = baz buz
Reset $@ with word-split params
1 = foo
2 = bar
3 = baz
4 = buz
As an aside, I find it mildly surprising that you want to do this. Many shell programmers are frustrated by the shell's easy, accidental word-splitting — they get "John", "Smith" when they wanted to preserve "John Smith" — but it seems to be your requirement here.
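The difference between the quoted and unquoted forms can be seen in a few lines. This is a minimal sketch; count_args is a hypothetical helper that only reports how many arguments it received.

```shell
# Hypothetical helper: print the number of arguments received.
count_args() {
  echo "$#"
}

name="John Smith"

count_args $name     # unquoted: the shell splits the value into two words
count_args "$name"   # quoted: the value stays a single argument
```

The first call prints 2 and the second prints 1, which is the same mechanism the set -- line relies on.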
Use xargs:
echo "10 20 30" | xargs ./my_program hello world
xargs is a command on Unix and most Unix-like operating systems used
to build and execute command lines from standard input. Commands such as
grep and awk can accept the standard input as a parameter, or argument
by using a pipe. However, others such as cp and echo disregard the
standard input stream and rely solely on the arguments found after the
command. Additionally, under the Linux kernel before version 2.6.23,
and under many other Unix-like systems, arbitrarily long lists of
parameters cannot be passed to a command, so xargs breaks the list
of arguments into sublists small enough to be acceptable.
(source)
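To see what the target program would receive, the xargs call can be tried with a small inline script in place of ./my_program. This is a sketch under that assumption; show_params is an illustrative name, and the word my_program after the script only fills $0 of the inline shell.

```shell
# Inline script that numbers and prints its positional parameters.
show_params='i=1; for p in "$@"; do echo "$i = $p"; i=$((i + 1)); done'

# xargs appends the words read from standard input after "hello world".
result=$(echo "10 20 30" | xargs sh -c "$show_params" my_program hello world)
echo "$result"
```

The inline script receives five parameters: hello, world, 10, 20 and 30.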

two layers of quotes around a bash variable

I've written a bash script that generates a series of R scripts. However, I've run into difficulty quoting a bash variable that is echoed into the R script as the name of a file to be read into R. I have
echo "loadings_file <- $loadings ; calls_file <- $file" | cat - template.R > temp && mv temp $scriptname
$loadings and $file are files I want R to read in. But when I run it as is, they end up in the R script with no quotes around them, so R does not treat them as strings. How do I make sure they're quoted in R but still expanded in bash first?
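Put single quotes around the variables. Inside a double-quoted bash string they are literal characters, so the variables are still expanded:
echo "loadings_file <- '$loadings' ; calls_file <- '$file'"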
If you specifically need double quoting:
echo "loadings_file <- \"$loadings\" ; calls_file <- \"$file\""
You have to escape your quotes (\") around the variables:
echo "loadings_file <- \"$loadings\" ; calls_file <- \"$file\"" | cat - template.R > temp && mv temp $scriptname
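An alternative that avoids backslash escapes is printf with a single-quoted format string, so the double quotes meant for R are written literally and the bash variables are supplied as arguments. This is a sketch only; the file names are hypothetical placeholders for whatever $loadings and $file hold, and header is an illustrative variable name.

```shell
# Hypothetical values standing in for the real file names.
loadings="loadings.txt"
file="calls.txt"

# The format string is single-quoted, so the double quotes need no escaping.
header=$(printf 'loadings_file <- "%s" ; calls_file <- "%s"' "$loadings" "$file")
echo "$header"
```

File names that themselves contain double quotes or backslashes would still need escaping before R could parse the line.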

Combining R + awk + bash commands

I want to combine awk and R language. The thing is that I have a set of *.txt files in a specified directory and that I don't know the length of the header from the files. In some cases I have to skip 25 lines while in others I have to skip 27 and etc. So I want to type some awk commands to get the number of lines to skip. Once I have this value, I can begin processing the data with R.
Furthermore, in the R file I combine R and bash, so my code looks like this:
#!/usr/bin/env Rscript
...
argv <- commandArgs(TRUE)
# error checking...
import_file <- argv[1]
export_file <- argv[2]
# your function call
format_windpro(import_file, export_file)
format_windpro(import_file, export_file)
Where and how can I type my awk command? Thanks!
I tried to do what you told me about awk commands and I still get an error. The program doesn't recognize my command, so I cannot pass the number of lines to skip to my function. Here is my code:
nline <- paste('$(grep -n 'm/s' import_file |awk -F":" '{print $1}')')
nline <- scan(pipe(nline),quiet=T)
I look for the pattern m/s in the first column in order to find where my header text is. I use R under Windows 7.
Besides Vincent's hint of using system("awk ...", intern=TRUE), you can also use the pipe() function that is part of the usual text connections:
R> sizes <- read.table(pipe("ls -l /tmp | awk '!/^total/ {print $5}'"))
R> summary(sizes)
V1
Min. : 0
1st Qu.: 482
Median : 4096
Mean : 98746
3rd Qu.: 13952
Max. :27662342
R>
Here I pipe a command into awk and then read all of the output from awk. The output could also be a single line:
R> cmd <- "ls -l /tmp | awk '!/^total/ {sum = sum + $5} END {print sum}'"
R> totalsize <- scan(pipe(cmd), quiet=TRUE)
R> totalsize
[1] 116027050
R>
You can use system to run an external program from R.
system("gawk --version", intern=TRUE)
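For the header length the question asks about, awk alone can report the line number of the first line matching m/s, so the grep step is not strictly needed. The following is a sketch under assumptions: the sample file and its contents are invented for illustration, and sample_file and header_lines are hypothetical names.

```shell
# Hypothetical sample file: two header lines, a units line containing "m/s", then data.
sample_file=$(mktemp)
printf '%s\n' 'Site: example' 'Height: 50' 'speed m/s' '5.1' '6.3' > "$sample_file"

# Print the line number of the first line matching m/s, then stop reading.
header_lines=$(awk '/m\/s/ { print NR; exit }' "$sample_file")
echo "$header_lines"

rm -f "$sample_file"
```

From R, the same awk command string could presumably be passed to pipe() or system(..., intern=TRUE) as in the answers above, provided awk is available on the system.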
