R, passing variables to a system command

Using R, I am looking to create a QR code and embed it into an Excel spreadsheet (hundreds of codes and spreadsheets). The obvious way seems to be to create the QR code from the command line, using the "system" command in R. Does anyone know how to pass R variables through the "system" command? Google is not too helpful, as "system" is a bit generic, and ?system does not contain any examples of this.
Note - I am actually using data matrices rather than QR codes, but using the term "data matrix" in an R question will lead to havoc, so let's talk QR codes instead. :-)
system("dmtxwrite my_r_variable -o image.png")
fails, as do the variants I have tried with "paste". Any suggestions gratefully received.

Let's say we have a variable x that we want to pass to dmtxwrite. You can pass it along like this:
x = 10
system(sprintf("dmtxwrite %s -o image.png", x))
or alternatively using paste:
system(paste("dmtxwrite", x, "-o image.png"))
but I prefer sprintf in this case.
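One caveat: if the value being passed can contain spaces or shell metacharacters, it is safer to wrap it with shQuote() before splicing it into the command. A minimal sketch (the value of x here is made up for illustration):
x <- "some value with spaces"
## shQuote() adds shell-safe quoting around the value before interpolation
cmd <- sprintf("dmtxwrite %s -o image.png", shQuote(x))
system(cmd)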

Making use of base::system2 may also be worth considering, as system2 provides an args argument that can be used for this purpose. In your example:
my_r_variable <- "a"
system2(
  'echo',
  args = c(my_r_variable, '-o image.png')
)
would return:
a -o image.png
which is equivalent to running echo in the terminal. You may also want to redirect output to text files:
system2(
  'echo',
  args = c(my_r_variable, '-o image.png'),
  stdout = 'stdout.txt',
  stderr = 'stderr.txt'
)
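If you would rather capture the output in R than in files, passing stdout = TRUE makes system2 return it as a character vector. A small sketch using the same echo example:
my_r_variable <- "a"
## stdout = TRUE captures the command's output instead of printing it
out <- system2('echo', args = c(my_r_variable, '-o image.png'), stdout = TRUE)
out
# [1] "a -o image.png"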

Related

How can I pass the names of a list of files from bash to an R program?

I have a long list of files with names like file-typeX-sectorY.tsv, where X and Y take values from 0 to 100. I process each of those files with an R program, but I read them one by one like this:
data <- read.table(file='my_info.tsv', sep = '\t', header = TRUE, fill = TRUE)
This is impractical. I want to build a bash program that does something like:
#!/bin/bash
for i in {0..100..1}
do
    for j in {1..100..1)
    do
        Rscript program.R < file-type$i-sector$j.tsv
    done
done
My problem is not with the bash script but with the R program. How can I receive the files one by one? I have googled and tried instructions like:
args <- commandArgs(TRUE)
or
data <- commandArgs(trailingOnly = TRUE)
but I can't find the way. Could you please help me?
At the simplest level, your problem may be the (possibly accidental?) redirect you have, so remove the <.
Then a minimal R 'program' to take a command-line argument and do something with it would be
#!/usr/bin/env Rscript
args <- commandArgs(trailingOnly = TRUE)
stopifnot("require at least one arg" = length(args) > 0)
cat("We were called with '", args[1], "'\n", sep="")
We use a 'shebang' line and chmod 0755 basicScript.R to make it runnable. Your shell double loop, reduced here (and with one typo corrected), becomes
#!/bin/bash
for i in {0..2..1}; do
    for j in {1..2..1}; do
        ./basicScript.R file-type${i}-sector${j}.tsv
    done
done
and this works as we hope with the inner program reflecting the argument:
$ ./basicCaller.sh
We were called with 'file-type0-sector1.tsv'
We were called with 'file-type0-sector2.tsv'
We were called with 'file-type1-sector1.tsv'
We were called with 'file-type1-sector2.tsv'
We were called with 'file-type2-sector1.tsv'
We were called with 'file-type2-sector2.tsv'
$
Of course, this is horribly inefficient as you have N x M external processes. The two outer loops could be written in R, and instead of calling the script you would call your script-turned-function.
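As a rough sketch of that idea (process_file is a hypothetical stand-in for whatever program.R does with one file):
## all-R version of the double loop, avoiding N x M external processes
process_file <- function(path) {
  data <- read.table(file = path, sep = '\t', header = TRUE, fill = TRUE)
  ## ... process 'data' as program.R would ...
  invisible(data)
}
for (i in 0:100) {
  for (j in 1:100) {
    process_file(sprintf("file-type%d-sector%d.tsv", i, j))
  }
}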

How to specify input arguments to Rscript by name from command line?

I am new to command line usage and don't think this question has been asked elsewhere. I'm trying to adapt an Rscript to be run from the command line in a shell script. Basically, I'm using some tools in the immcantation framework to read and annotate some antibody NGS data, and then to group sequences into their clonal families. To set the similarity threshold, the creators recommend using a function in their shazam package to set an appropriate threshold.
I've made the simple script below to read and validate the arguments:
#!/usr/bin/env Rscript
params <- commandArgs(trailingOnly=TRUE)
### read and validate mode argument
mode <- params[1]
modeAllowed <- c("ham","aa","hh_s1f","hh_s5f")
if (!(mode %in% modeAllowed)) {
  stop(paste("illegal mode argument supplied. acceptable values are",
             paste(paste(modeAllowed, collapse = ", "), ".", sep = ""),
             "\nmode should be supplied first",
             sep = " "))
}
### execute function (the shazam call that computes threshold is omitted here)
cat(threshold)
The script works. However, since each parameter has only a finite number of allowed values, I was wondering if there was a way of passing in the arguments by name, like --mode aa (for example), from the terminal? All the information I've seen online seems to use code like my mode <- params[1] from above, which I guess only works if the mode argument comes first?
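One way to support named flags like --mode aa without any extra packages is to scan commandArgs() for the flag and take the element that follows it; add-on packages such as optparse generalize this idea. A minimal base-R sketch (the fallback default "ham" is an assumption):
params <- commandArgs(trailingOnly = TRUE)
## find "--mode" and take the value after it, regardless of position
idx <- which(params == "--mode")
mode <- if (length(idx) == 1 && idx < length(params)) params[idx + 1] else "ham"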

How to pass bash variable into R script

I have a couple of R scripts that processes data in a particular input folder. I have a few folders I need to run this script on, so I started writing a bash script to loop through these folders and run those R scripts.
I'm not familiar with R at all (the script was written by a previous worker and is basically a black box for me), and I'm inexperienced with passing variables between scripts, especially across multiple languages. There's also an issue when I call source("$SWS_output/Step_1_Setup.R") here: R isn't reading my $SWS_output as a variable, but rather as a literal string.
Here's my bash script:
#!/bin/bash
# Inputs
workspace="`pwd`"
preprocessed="$workspace/6_preprocessed"
# Output
SWS_output="$workspace/7_SKSattempt4_results/"
# create output directory
mkdir -p $SWS_output
# Copy data from preprocessed to SWS_output
cp -a $preprocessed/* $SWS_output
# Loop through folders in the output and run the R code on each folder
for qdir in $SWS_output/*/; do
    qdir_name=`basename $qdir`
    echo -e 'source("$SWS_output/Step_1_Setup.R") \n source("$SWS_output/Step_2_data.R") \n q()' | R --no-save
done
I need to pass the variable "qdir" into the second R script (Step_2_data.R) to tell it which folder to process.
Thanks!
My previous answer was incomplete. Here is a better effort to explain command line parsing.
It is pretty easy to use R's commandArgs function to process command line arguments. I wrote a small tutorial https://gitlab.crmda.ku.edu/crmda/hpcexample/tree/master/Ex51-R-ManySerialJobs. In cluster computing this works very well for us. The whole hpcexample repo is open source/free.
The basic idea is that in the command line you can run R with command line arguments, as in:
R --vanilla -f r-clargs-3.R --args runI=13 parmsC="params.csv" xN=33.45
In this case, my R program is the file r-clargs-3.R, and the arguments that the file will import are three space-separated elements: runI, parmsC, xN. You can add as many of these space-separated parameters as you like. It is completely at your discretion what they are called, but they must be separated by spaces and there must be NO SPACE around the equals signs. Character string variables should be quoted.
My habit is to name the arguments with suffix "I" to hint that it is an integer, "C" is for character, and "N" is for floating point numbers.
In the file r-clargs-3.R, include some code to read the arguments and sort through them. For example, from my tutorial:
cli <- commandArgs(trailingOnly = TRUE)
args <- strsplit(cli, "=", fixed = TRUE)
The rest of the work is sorting through the args. This is my most evolved stanza for that: it looks for the suffixes "I", "N", "C", and "L" (for logical), and then coerces the inputs to the correct variable types (all input variables are characters unless we coerce with as.integer(), etc.):
for (e in args) {
  argname <- e[1]
  if (!is.na(e[2])) {
    argval <- e[2]
    ## regular expression to delete initial \" and trailing \"
    argval <- gsub("(^\\\"|\\\"$)", "", argval)
  } else {
    ## if arg specified without value, assume it is bool type and TRUE
    argval <- TRUE
  }
  ## infer type from last character of argname, cast val
  type <- substring(argname, nchar(argname), nchar(argname))
  if (type == "I") {
    argval <- as.integer(argval)
  }
  if (type == "N") {
    argval <- as.numeric(argval)
  }
  if (type == "L") {
    argval <- as.logical(argval)
  }
  assign(argname, argval)
  cat("Assigned", argname, "=", argval, "\n")
}
That will create variables in the R session named paramsC, runI, and xN.
The convenience of this approach is that the same base R code can be run with 100s or 1000s of command parameter variations. Good for Monte Carlo simulation, etc.
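As a sketch of that convenience, the invocations themselves can be generated from R (the parameter values below are made up):
## write one command line per run, e.g. to feed a scheduler or a shell loop
cmds <- sprintf('R --vanilla -f r-clargs-3.R --args runI=%d parmsC="params.csv" xN=%.2f',
                1:3, c(33.45, 12.10, 9.99))
writeLines(cmds, "jobs.txt")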
Thanks for all the answers; they were very helpful. I was able to get a solution that works. Here's my completed script.
#!/bin/bash
# Inputs
workspace="`pwd`"
preprocessed="$workspace/6_preprocessed"
# Output
SWS_output="$workspace/7_SKSattempt4_results"
# create output directory
mkdir -p $SWS_output
# Copy data from preprocessed to SWS_output
cp -a $preprocessed/* $SWS_output
cd $SWS_output
# Loop through folders in the output and run the R code on each folder
for qdir in $SWS_output/*/; do
    qdir_name=`basename $qdir`
    echo $qdir_name
    export VARIABLENAME=$qdir
    echo -e 'source("Step_1_Setup.R") \n source("Step_2_Data.R") \n q()' | R --no-save --slave
done
And then the R script looks like this:
qdir <- Sys.getenv("VARIABLENAME")
pathname <- qdir[1]
As a couple of comments have pointed out, this isn't best practice, but this worked exactly as I wanted it to. Thanks!
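For reference, the command-line-argument route hinted at in those comments would look something like this sketch: the bash loop calls Rscript Step_2_Data.R "$qdir" instead of exporting VARIABLENAME, and the R side reads
args <- commandArgs(trailingOnly = TRUE)
pathname <- args[1]  # the folder path passed by the bash loop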

How to combine two Vim commands into one (command not keybinding)

I've found a few Stack Overflow questions talking about this, but they are all regarding only the :nmap or :noremap commands.
I want a command, not just a keybinding. Is there any way to accomplish this?
Use-case:
When I run :make, it doesn't save automatically. So I'd like to combine :make and :w. I'd like to create a command :Compile/:C or :Wmake to achieve this.
The general information about concatenating Ex commands via | can be found at :help cmdline-lines.
You can apply this for interactive commands, in mappings, and in custom commands as well.
Note that you only need to use the special <bar> in mappings, to avoid prematurely concluding the mapping definition and executing the remainder immediately (a frequent beginner's mistake: :nnoremap <F1> :write | echo "This causes an error during Vim startup!"<CR>). For custom commands, you can just write |, but keep in mind that some commands parse it as part of their own argument.
:help line-continuation will help with overly long command definitions. Moving multiple commands into a separate :help :function can help, too (but note that this subtly changes the error handling).
arguments
If you want to pass custom command-line arguments, you can add -nargs=* to your :command definition and then specify the insertion point on the right-hand side via <args>. For example, to allow arguments to be passed through to your :write command, you could use
:command -nargs=* C w <args> | silent make | redraw!
You can combine commands with |, see help for :bar:
command! C update | silent make | redraw!
However, there is a cleaner way to achieve what you want.
Just enable the 'autowrite' option to automatically write
modified files before a :make:
'autowrite' 'aw' 'noautowrite' 'noaw'
'autowrite' 'aw' boolean (default off)
global
Write the contents of the file, if it has been modified, on each
:next, :rewind, :last, :first, :previous, :stop, :suspend, :tag, :!,
:make, CTRL-] and CTRL-^ command; and when a :buffer, CTRL-O, CTRL-I,
'{A-Z0-9}, or `{A-Z0-9} command takes one to another file.
Note that for some commands the 'autowrite' option is not used, see
'autowriteall' for that.
This option is mentioned in the help for :make.
I have found a solution after a bit of trial and error.
Solution for my usecase
command C w <bar> silent make <bar> redraw!
This is for compiling using make; because of silent and redraw!, output shows up only when make actually produces any.
General solution
command COMMAND_NAME COMMAND_TO_RUN
Where COMMAND_TO_RUN can be built from more than one command using the following construct.
COMMAND_1_THEN_2 = COMMAND_1 <bar> COMMAND_2
You can chain this multiple times; it is very similar to pipes in the shell.

Tensorflow: How to convert .meta, .data and .index model files into one graph.pb file

In TensorFlow, training from scratch produced the following six files:
events.out.tfevents.1503494436.06L7-BRM738
model.ckpt-22480.meta
checkpoint
model.ckpt-22480.data-00000-of-00001
model.ckpt-22480.index
graph.pbtxt
I would like to convert them (or only the needed ones) into one file graph.pb to be able to transfer it to my Android application.
I tried the script freeze_graph.py, but it requires an input.pb file as input, which I do not have (I only have the six files mentioned before). How do I proceed to get this one frozen_graph.pb file? I saw several threads, but none worked for me.
You can use this simple script to do that. But you must specify the names of the output nodes.
import tensorflow as tf

meta_path = 'model.ckpt-22480.meta' # Your .meta file
output_node_names = ['output']      # Output node names, without the ':0' tensor suffix

with tf.Session() as sess:
    # Restore the graph
    saver = tf.train.import_meta_graph(meta_path)

    # Load weights
    saver.restore(sess, tf.train.latest_checkpoint('path/of/your/.meta/file'))

    # Freeze the graph
    frozen_graph_def = tf.graph_util.convert_variables_to_constants(
        sess,
        sess.graph_def,
        output_node_names)

    # Save the frozen graph
    with open('output_graph.pb', 'wb') as f:
        f.write(frozen_graph_def.SerializeToString())
If you don't know the name of the output node or nodes, there are two options:
You can explore the graph and find the name with Netron or with the console summarize_graph utility.
You can use all the nodes as output ones as shown below.
output_node_names = [n.name for n in tf.get_default_graph().as_graph_def().node]
(Note that you have to put this line just before the convert_variables_to_constants call.)
But I think this is an unusual situation, because if you don't know the output nodes, you cannot actually use the graph.
As it may be helpful for others, I am also answering here after the answer on GitHub ;-).
I think you can try something like this (with the freeze_graph script in tensorflow/python/tools):
python freeze_graph.py --input_graph=/path/to/graph.pbtxt --input_checkpoint=/path/to/model.ckpt-22480 --input_binary=false --output_graph=/path/to/frozen_graph.pb --output_node_names="the nodes that you want to output e.g. InceptionV3/Predictions/Reshape_1 for Inception V3 "
The important flag here is --input_binary=false, as the file graph.pbtxt is in text format; I think it corresponds to the required graph.pb, which is the equivalent in binary format.
Concerning output_node_names, that's really confusing for me, as I still have some problems with this part, but you can use the summarize_graph script in TensorFlow, which can take the pb or the pbtxt as input.
Regards,
Steph
I tried the freeze_graph.py script, but the output_node_names parameter was totally confusing and the job failed.
So I tried the other one: export_inference_graph.py.
And it worked as expected!
python -u /tfPath/models/object_detection/export_inference_graph.py \
--input_type=image_tensor \
--pipeline_config_path=/your/config/path/ssd_mobilenet_v1_pets.config \
--trained_checkpoint_prefix=/your/checkpoint/path/model.ckpt-50000 \
--output_directory=/output/path
The tensorflow installation package I used is from here:
https://github.com/tensorflow/models
First, use the following code to generate the graph.pb file.
with tf.Session() as sess:
    # Restore the graph
    _ = tf.train.import_meta_graph(args.input)

    # Save the graph file
    g = sess.graph
    gdef = g.as_graph_def()
    tf.train.write_graph(gdef, ".", args.output, True)
Then, use summarize_graph to get the output node names.
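(For reference, summarize_graph is a tool shipped in the TensorFlow source tree; it is typically built and invoked along these lines, with the path to your graph file being an assumption here:
bazel build tensorflow/tools/graph_transforms:summarize_graph
bazel-bin/tensorflow/tools/graph_transforms/summarize_graph --in_graph=/path/to/graph.pb)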
Finally, use
python freeze_graph.py --input_graph=/path/to/graph.pbtxt --input_checkpoint=/path/to/model.ckpt-22480 --input_binary=false --output_graph=/path/to/frozen_graph.pb --output_node_names="the nodes that you want to output e.g. InceptionV3/Predictions/Reshape_1 for Inception V3 "
to generate the frozen graph.
