How to correctly escape system calls from inside R - r

I have several shell commands that I want to run in R.
I have tried system(), but I have not found out how to do the right escaping, even using shQuote.
# works OK
system('ls -a -l')
But how do I execute a command like perl -e 'print "test\n"' or curl --data-urlencode query@biomart.xml http://biomart.org/biomart/martservice/results inside R?
Update:
In the case of commands like the perl example, I do not know how to escape the quotes: the command has to be passed as a quoted string, but it already uses both types of quotes.
In the case of curl, the problem seems to be in the RESTful call that passes the XML file with the @, which works in the shell but not in the system() call:
dat <-system('curl --data-urlencode query@biomart.xml http://biomart.org/biomart/martservice/results', intern=F)
Warning: Couldn't read data from file "query@biomart.xml", this makes an empty
Warning: POST.
The file is biomart.xml, not query@biomart.xml.
Update 2:
The xml file I am using for test is:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Query>
<Query virtualSchemaName = "default" formatter = "TSV" header = "0" uniqueRows = "0" count = "" datasetConfigVersion = "0.6" >
<Dataset name = "hsapiens_gene_ensembl" interface = "default" >
<Filter name = "hgnc_symbol" value = "LDLR"/>
<Attribute name = "external_gene_id" />
</Dataset>
</Query>

Strings in R may be enclosed in either single (') or double (") quotes.
If you want to execute a command with both single and double quotes, such as:
perl -e 'print "test\n"'
then it is of little consequence which you choose for your R string - since one pair needs to be escaped either way.
Let's say you choose single quotes:
system('')
Then we need to escape the single quotes in the same way as for the newline character, with the escape character, \:
command <- 'perl -e \'print "test\n"\''
system(command)
It is also possible to encode Unicode characters in this way with \Unnnnnnnn or \unnnn. Alternatively, use octal (\nnn) or hex (\xnn, one or two hex digits).
Thus:
atSymbol <- '\u0040' # '\x40' '\100'
If the @ in your curl command is causing the problem, encoding it like this should fix it.

In this example, in addition to escaping ', I had to escape \\ (with \\\\), \., \G, \K, but not \n
perl -0777 -pi -e ' s{ \\usage.*?\n\.\.\.\n} { ($r = $&) =~ s/\n//g; $r =~ s/\G.{0,79}(,|.$)\K/\n/g; $r }gse' filename.txt
system('perl -0777 -pi -e \' s{\\\\usage.*?\n\\.\\.\\.\n}{ ($r = $&) =~ s/\n//g; $r =~ s/\\G.{0,79}(,|.$)\\K/\n/g; $r }gse\' filename.txt')

Related

How to escape double quotes when using Rscript function in Bash

How can I escape double quotes to run this code block on the command line? The part of the code that is problematic is paste('! LaTeX Error: File', paste0("`", x, "'.sty'"), 'not found.')).
Rscript \
-e "biosketch_pkgs <- list('microtype', 'tabu', 'ulem', 'enumitem', 'titlesec')" \
-e "lapply(biosketch_pkgs, function (x) {tinytex::parse_install(text=paste('! LaTeX Error: File', paste0("`", x, "'.sty'"), 'not found.'))})"
I have tried to escape the double quote using the below but it returns unexpected EOF while looking for matching ''`
Rscript \
-e "biosketch_pkgs <- list('microtype', 'tabu', 'ulem', 'enumitem', 'titlesec')" \
-e "lapply(biosketch_pkgs, function (x) {tinytex::parse_install(text=paste('! LaTeX Error: File', paste0("\`"\, x, "\'.sty'"\), 'not found.'))})"
Instead of worrying about how to escape specific strings, think about handling them in a way that avoids the need to do any escaping at all. Running Rscript - tells Rscript to read the source code to run on stdin, while <<'EOF' makes all content up to the next line containing only EOF be fed on stdin. (You can replace the characters EOF with any other sigil you choose, but it's important that this sigil be quoted or escaped to instruct the shell not to modify the content in any way.)
Rscript - <<'EOF'
biosketch_pkgs <- list('microtype', 'tabu', 'ulem', 'enumitem', 'titlesec')
lapply(biosketch_pkgs, function (x) {
tinytex::parse_install(text=paste('! LaTeX Error: File',
paste0("`", x, "'.sty'"),
'not found.'))
})
EOF
If you did have a compelling reason to use Rscript -e, though, a correct escaping would put backslashes before backticks, double quotes, or other syntax that has meaning to the shell within a double-quoted string, as follows:
Rscript \
-e "biosketch_pkgs <- list('microtype', 'tabu', 'ulem', 'enumitem', 'titlesec')" \
-e "lapply(biosketch_pkgs, function (x) {tinytex::parse_install(text=paste('! LaTeX Error: File', paste0(\"\`\", x, \"'.sty'\"), 'not found.'))})"
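To see why the quoted delimiter matters, here is a minimal sketch that compares a quoted and an unquoted here-document. It assumes cat as a stand-in for Rscript - so that it runs without R installed; the function names and the pkg variable are hypothetical and only serve the illustration.

```shell
#!/bin/sh
pkg=microtype

# Quoted delimiter: the shell passes the body through untouched,
# so backticks, quotes and $pkg arrive exactly as typed.
literal_source() {
cat <<'EOF'
paste0("`", x, "'.sty'")  # $pkg is left alone
EOF
}

# Unquoted delimiter: the shell still expands $pkg (and would run
# anything inside unescaped backticks) before the reader sees it.
expanded_source() {
cat <<EOF
package: $pkg
EOF
}

literal_source
expanded_source
```

With the quoted form, the R source can be pasted in as-is, which is why no backslashes are needed in the here-document version above.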

Unix script AWK/SED help to add double quotes / how to match double quotes?

I have the records below in a text file.
000D3A|"RiFR Botnets" AD||83634C|dk
000D3|Ries Bidvest" AD||8364A3C|dhh
000D3A|"Ra Boots D"||83634C|gft
Here I want to wrap in double quotes every field that contains a " character, using the Unix awk command.
Expected output, which I want to write to a file:
000D3A|""RiFR Botnets" AD"||83634C|dk
000D3|"Ries Bidvest" AD"||8364A3C|dhh
000D3A|""Ra Boots""||83634C|gft
I have tried awk -F "|", but how do I search for double quotes in every line of the file?
You can try:
awk -F"|" -v OFS="|" '{
for (i=1;i<=NF;i++) # for every field...
if (match($i,"\"")) # check for "
$i="\"" $i "\"" # add quotes to that field
}
1 # print' file
000D3A|""RiFR Botnets" AD"||83634C|dk
000D3|"Ries Bidvest" AD"||8364A3C|dhh
000D3A|""Ra Boots D""||83634C|gft
Or, you could use this sed:
sed -E 's/\|([^|]*"[^|]*)\|/|"\1"|/g' file
# same output
(As noted in comments, the result is not valid csv. You requested this, but "Ries Bidvest" AD" is not valid quoting and will break csv parsing...)
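As a self-contained sketch of the awk approach, the same idea can be wrapped in a shell function. The function name quote_fields is hypothetical, and two details are substitutions rather than part of the answer above: index() replaces match(), and the quote character is passed in through -v so that the awk program needs no backslashes.

```shell
#!/bin/sh
# quote_fields is a hypothetical helper: it reads |-separated records
# from the named files (or stdin) and wraps in double quotes every
# field that contains a double quote.
quote_fields() {
    awk -F'|' -v OFS='|' -v q='"' '{
        for (i = 1; i <= NF; i++)       # for every field...
            if (index($i, q))           # ...that contains a quote
                $i = q $i q             # wrap the field in quotes
        print
    }' "$@"
}

printf '%s\n' '000D3A|"Ra Boots D"||83634C|gft' | quote_fields
```

Records without any quote character pass through unchanged, since no field is reassigned.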

substring before and substring after in shell script

I have a string:
//host:/dir1/dir2/dir3/file_name
I want to extract the host and the directory path into separate variables in a Unix shell script.
Example:
host_name = host
dir_path = /dir1/dir2/dir3
Note: the string length and the number of directories are not fixed.
Could you please help me extract these values from the string in a Unix shell script?
Using bash string operations:
str='//host:/dir1/dir2/dir3/file_name'
host_name=${str%%:*}
host_name=${host_name##*/}
dir_path=${str#*:}
dir_path=${dir_path%/*}
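The expansions above can also be sketched as a small function with each step commented. The name split_remote_path and the extra file_name variable are hypothetical; only POSIX parameter expansion is used, so the same sketch should also work in sh and ksh.

```shell
#!/bin/sh
# split_remote_path is a hypothetical helper; it sets host_name, dir_path
# and file_name from a string shaped like //host:/dir/.../file.
split_remote_path() {
    rest=${1#//}                 # drop the leading "//"
    host_name=${rest%%:*}        # keep everything before the first ":"
    full_path=${rest#*:}         # keep everything after the first ":"
    dir_path=${full_path%/*}     # strip the last "/component"
    file_name=${full_path##*/}   # keep only the last component
}

split_remote_path '//host:/dir1/dir2/dir3/file_name'
printf 'host_name=%s\ndir_path=%s\nfile_name=%s\n' \
    "$host_name" "$dir_path" "$file_name"
```

A single # or % removes the shortest match and a doubled ## or %% removes the longest, which is what lets the same pattern pick either the first or the last separator.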
I would do it using regular expressions:
if [[ $path =~ ^//(.*):(.*)/(.*)$ ]]; then
host="${BASH_REMATCH[1]}"
dir_path="${BASH_REMATCH[2]}"
filename="${BASH_REMATCH[3]}"
else
echo "Invalid format" >&2
exit 1
fi
If you are sure that the format will match, you can do simply
[[ $path =~ ^//(.*):(.*)/(.*)$ ]]
host="${BASH_REMATCH[1]}"
dir_path="${BASH_REMATCH[2]}"
filename="${BASH_REMATCH[3]}"
Edit: Since you seem to be using ksh rather than bash (though bash was indicated in the question), the syntax is a bit different:
match=(${path/~(E)^\/\/(.*):(.*)\/(.*)$/\1 \2 \3})
host="${match[0]}"
dir_path="${match[1]}"
filename="${match[2]}"
This will break if there are spaces in the file name, though. In that case, you can use the more cumbersome
host="${path/~(E)^\/\/(.*):(.*)\/(.*)$/\1}"
dir_path="${path/~(E)^\/\/(.*):(.*)\/(.*)$/\2}"
filename="${path/~(E)^\/\/(.*):(.*)\/(.*)$/\3}"
Perhaps there are more elegant ways of doing it in ksh, but I'm not familiar with it.
The shortest way I can think of is to assign two variables in one statement:
$ read host_name dir_path <<< $(echo $string | sed -e 's,^//,,;s,:, ,')
Complete script:
string="//host:/dir1/dir2/dir3/file_name"
read host_name dir_path <<< $(echo $string | sed -e 's,^//,,;s,:, ,')
echo "host_name = " $host_name
echo "dir_path = " $dir_path
Output:
host_name =  host
dir_path =  /dir1/dir2/dir3/file_name

How can I set a default value when incorrect/invalid input is entered in Unix?

I want to set the value of inputLineNumber to 20. I tried checking whether the user gave no value with [[-z "$inputLineNumber"]] and then setting the value with inputLineNumber=20. The code prints ./t.sh: [-z: not found on the console. How do I resolve this? Here's my full script as well.
#!/bin/sh
cat /dev/null>copy.txt
echo "Please enter the sentence you want to search:"
read "inputVar"
echo "Please enter the name of the file in which you want to search:"
read "inputFileName"
echo "Please enter the number of lines you want to copy:"
read "inputLineNumber"
[[-z "$inputLineNumber"]] || inputLineNumber=20
for N in `grep -n $inputVar $inputFileName | cut -d ":" -f1`
do
LIMIT=`expr $N + $inputLineNumber`
sed -n $N,${LIMIT}p $inputFileName >> copy.txt
echo "-----------------------" >> copy.txt
done
cat copy.txt
I changed the script after the suggestion from @Kevin. Now the error message is ./t.sh: syntax error at line 11: `$' unexpected
#!/bin/sh
truncate copy.txt
echo "Please enter the sentence you want to search:"
read inputVar
echo "Please enter the name of the file in which you want to search:"
read inputFileName
echo Please enter the number of lines you want to copy:
read inputLineNumber
[ -z "$inputLineNumber" ] || inputLineNumber=20
for N in $(grep -n $inputVar $inputFileName | cut -d ":" -f1)
do
LIMIT=$((N+inputLineNumber))
sed -n $N,${LIMIT}p $inputFileName >> copy.txt
echo "-----------------------" >> copy.txt
done
cat copy.txt
Try changing this line from:
[[-z "$inputLineNumber"]] || inputLineNumber=20
To this:
if [[ -z "$inputLineNumber" ]]; then
inputLineNumber=20
fi
Hope this helps.
Where to start...
You are running as /bin/sh but trying to use [[. [[ is a bash command that sh does not recognize. Either change the shebang to /bin/bash (preferred) or use [ instead.
You do not have a space between [[-z. That causes bash to read it as a command named [[-z, which clearly doesn't exist. You need [[ -z $inputLineNumber ]] (note the space at the end too). Quoting within [[ doesn't matter, but if you change to [ (see above), you will need to keep the quotes.
Your code says [[-z but your error says [-z. Pick one.
Use $(...) instead of `...`. Backticks are the legacy form; $() nests cleanly and handles quoting more predictably.
You don't need to cat /dev/null >copy.txt, certainly not twice without writing to it in-between. Use truncate -s 0 copy.txt (GNU truncate requires a size option) or just plain >copy.txt.
You seem to have inconsistent quoting. Quote or escape (\x) anything with special characters (~, `, !, #, $, &, *, ^, (), [], \, <, >, ?, ', ", ;) or whitespace and any variable that could have whitespace. You don't need to quote string literals with no special characters (e.g. ":").
Instead of LIMIT=`expr...`, use limit=$((N+inputLineNumber)).
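The title also asks about invalid input, which none of the points above covers. Here is a minimal sketch, assuming "invalid" means an empty value or one containing anything other than digits. The helper name count_or_default is hypothetical, and the hard-coded abc stands in for what read would return. Note also that [ -z "$x" ] || x=20 assigns the default when the value is not empty; && is the operator that matches the intent.

```shell
#!/bin/sh
# count_or_default is a hypothetical helper: it echoes its first argument
# when that is a non-empty string of digits, and the fallback otherwise.
count_or_default() {
    case $1 in
        ''|*[!0-9]*) printf '%s\n' "$2" ;;  # empty, or contains a non-digit
        *)           printf '%s\n' "$1" ;;  # digits only: keep as given
    esac
}

inputLineNumber=abc   # stands in for the value obtained with read
inputLineNumber=$(count_or_default "$inputLineNumber" 20)
echo "$inputLineNumber"
```

If only the empty case matters, the POSIX expansion inputLineNumber=${inputLineNumber:-20} does the same job in one line and works under /bin/sh.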

Quoting command-line arguments in shell scripts

The following shell script takes a list of arguments, turns Unix paths into WINE/Windows paths and invokes the given executable under WINE.
#! /bin/sh
if [ "${1+set}" != "set" ]
then
echo "Usage; winewrap EXEC [ARGS...]"
exit 1
fi
EXEC="$1"
shift
ARGS=""
for p in "$@";
do
if [ -e "$p" ]
then
p=$(winepath -w $p)
fi
ARGS="$ARGS '$p'"
done
CMD="wine '$EXEC' $ARGS"
echo $CMD
$CMD
However, there's something wrong with the quotation of command-line arguments.
$ winewrap '/home/chris/.wine/drive_c/Program Files/Microsoft Research/Z3-1.3.6/bin/z3.exe' -smt /tmp/smtlib3cee8b.smt
Executing: wine '/home/chris/.wine/drive_c/Program Files/Microsoft Research/Z3-1.3.6/bin/z3.exe' '-smt' 'Z: mp\smtlib3cee8b.smt'
wine: cannot find ''/home/chris/.wine/drive_c/Program'
Note that:
The path to the executable is being chopped off at the first space, even though it is single-quoted.
The literal "\t" in the last path is being transformed into a tab character.
Obviously, the quotations aren't being parsed the way I intended by the shell. How can I avoid these errors?
EDIT: The "\t" is being expanded through two levels of indirection: first, "$p" (and/or "$ARGS") is being expanded into Z:\tmp\smtlib3cee8b.smt; then, \t is being expanded into the tab character. This is (seemingly) equivalent to
Y='y\ty'
Z="z${Y}z"
echo $Z
which yields
zy\tyz
and not
zy yz
UPDATE: eval "$CMD" does the trick. The "\t" problem seems to be echo's fault: "If the first operand is -n, or if any of the operands contain a backslash ( '\' ) character, the results are implementation-defined." (POSIX specification of echo)
bash’s arrays are unportable but the only sane way to handle argument lists in shell
The number of arguments is in ${#}
Bad stuff will happen with your script if there are filenames starting with a dash in the current directory
If the last line of your script just runs a program, and there are no traps on exit, you should exec it
With that in mind
#! /bin/bash
# push ARRAY arg1 arg2 ...
# adds arg1, arg2, ... to the end of ARRAY
function push() {
local ARRAY_NAME="${1}"
shift
for ARG in "${@}"; do
eval "${ARRAY_NAME}[\${#${ARRAY_NAME}[@]}]=\${ARG}"
done
}
PROG="$(basename -- "${0}")"
if (( ${#} < 1 )); then
# Error messages should state the program name and go to stderr
echo "${PROG}: Usage: winewrap EXEC [ARGS...]" 1>&2
exit 1
fi
EXEC=("${1}")
shift
for p in "${@}"; do
if [ -e "${p}" ]; then
p="$(winepath -w -- "${p}")"
fi
push EXEC "${p}"
done
exec "${EXEC[@]}"
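For a script that has to stay on /bin/sh, one portable alternative to bash arrays is to rebuild the positional parameters in place. This is a sketch of that different technique, not the array approach above. The helpers convert_path and print_args are hypothetical stand-ins for winepath -w -- and wine, so that the example runs without WINE installed.

```shell
#!/bin/sh
# convert_path is a hypothetical stand-in for `winepath -w --`:
# it prefixes Z: and turns forward slashes into backslashes.
convert_path() {
    printf 'Z:%s\n' "$1" | tr '/' '\\'
}

# print_args is a hypothetical stand-in for wine: one argument per line.
print_args() {
    for a in "$@"; do printf '[%s]\n' "$a"; done
}

# run_converted LAUNCHER EXEC [ARGS...]
run_converted() {
    launcher=$1
    exe=$2
    shift 2
    n=$#
    for p in "$@"; do            # the list is expanded once, before the loop
        if [ -e "$p" ]; then
            p=$(convert_path "$p")
        fi
        set -- "$@" "$p"         # append the processed copy
    done
    shift "$n"                   # drop the unprocessed originals
    "$launcher" "$exe" "$@"
}

run_converted print_args '/opt/Program Files/z3.exe' -smt 'no such file.smt'
```

Because every argument stays a separate word inside "$@", spaces and backslashes survive without any eval or nested quoting. In a real wrapper the last line of run_converted would become exec wine "$exe" "$@".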
If you do want to have the assignment to CMD, you should use
eval $CMD
instead of just $CMD in the last line of your script. This should solve your problem with spaces in the paths, I don't know what to do about the "\t" problem.
replace the last line from $CMD to just
wine '$EXEC' $ARGS
You'll note that the error is ''/home/chris/.wine/drive_c/Program' and not '/home/chris/.wine/drive_c/Program'
The single quotes are not being interpolated properly, and the string is being split by spaces.
You can try preceding the spaces with \ like so:
/home/chris/.wine/drive_c/Program\ Files/Microsoft\ Research/Z3-1.3.6/bin/z3.exe
You can also do the same with your \t problem - replace it with \\t.
