Pass variable from bash to R with commandArgs - r

I'm having a terrible go trying to pass some variables from the shell to R. I can't figure out a reasonable way to make this reproducible, since it involves a tool that has to be downloaded, but it's really more of a general methodology issue, so please bear with me for a quick minute.
I have arguments that are defined in a bash script: $P, $G, and $O.
I have some if/then statements and everything is fine until I get to the $O options.
This is the first part of the $O section and it works fine. It grabs data from $P and passes it to the twoBitToFa utility from UCSC's genome project and outputs the data correctly in a .fa file. Beautiful. (Although I think using 'stdout' and '>' is perhaps redundant?)
if [ "$O" = "fasta" ]
then
awk '{print $0" "$1":"$2"-"$3}' "$P" |
twoBitToFa -bed=stdin -udcDir=. "$twobit" stdout > "${P%.bed}".fa
fi
The next section is where I am stuck. If the $O option is "bed", then I want to invoke the Rscript command and pass my stuff over to R. I am able to pass my $P, $G, and $O variables without issue, but now I also need to pass the output from the twoBitToFa function. I could add a step and make the .fa file and then pick that up in R, but I am trying to skip the .fa file creation step and output a different file type instead (.bed). Here are some things I have tried:
# try saving twoBitToFa output to variable and including it in the variables passed to R:
if [ "$O" = "bed" ]
then
awk '{print $0" "$1":"$2"-"$3}' "$P" |
myvar=$(twoBitToFa -bed=stdin -udcDir=. "$twobit" stdout) \
Rscript \
GetSeq_R.r \
$P \
$G \
$O \
$myvar
fi
To check what variables come through, my GetSeq_R.r script starts with:
args = commandArgs(trailingOnly=TRUE)
print(args)
and with the above code, the output only includes my $P, $G, and $O variables. $myvar doesn't make it. $P is the TAD-1 file, $G is "hg38", and $O is "bed".
[1] "TAD-1_template.bed" "hg38" "bed"
I am not sure if the way I am trying to pass the data in the variable is wrong. From everything I've read, it seems like it should work. I've also tried using tee to see what is in my stdout at that step like so:
if [ "$O" = "bed" ]
then
awk '{print $0" "$1":"$2"-"$3}' "$P" |
twoBitToFa -bed=stdin -udcDir=. "$twobit" stdout | tee \
Rscript \
GetSeq_R.r \
$P \
$G \
$O
fi
And the data I want to pass to R is correctly shown in my console by using tee. I've tried saving stdout and tee to a variable and passing that variable to R, thinking maybe it's something about twoBitToFa that refuses to be put inside a variable, but was unsuccessful. I've spent hours looking up info about tee, stdout, and passing variables from bash to R. I feel like I'm missing something fundamental, or trying to do something impossible, and would really appreciate some other eyes on this.
Here's the whole bash script, in case that's illuminating. Do I need to define a variable in "$@" for what I am trying to pass to R, even though it's not something I want the user to be aware of? Am I capturing the variable with $myvar incorrectly? Can I get the contents of stdout or tee to show up in R?
Thanks in advance.
for arg in "$@"; do
  shift
  case "$arg" in
    "--path") set -- "$@" "-P" ;;
    "--genome") set -- "$@" "-G" ;;
    "--output") set -- "$@" "-O" ;;
    "--help") set -- "$@" "-h" ;;
    *) set -- "$@" "$arg"
  esac
done
while getopts ":P:G:O:h" OPT
do
case $OPT in
P) P=$OPTARG;;
G) G=$OPTARG;;
O) O=$OPTARG;;
h) help ;;
\?)
echo "Invalid option: -$OPTARG" >&2
usage
exit 1
;;
:)
echo "Option -$OPTARG requires an argument." >&2
usage
exit 1
;;
esac
done
num_col=$(cat "$P" | awk "{print NF; exit}")
if [ "$num_col" = 3 ]
then
echo -e "\n\n3 column bed file detected; no directional considerations for sequences \n\n"
if [ "$G" = "hg38" ]
then
twobit="https://hgdownload.cse.ucsc.edu/goldenpath/hg38/bigZips/hg38.2bit"
fi
if [ "$G" = "hg19" ]
then
twobit="https://hgdownload.cse.ucsc.edu/goldenpath/hg19/bigZips/hg19.2bit"
fi
if [ "$O" = "fasta" ]
then
awk '{print $0" "$1":"$2"-"$3}' "$P" |
twoBitToFa -bed=stdin -udcDir=. "$twobit" stdout > "${P%.bed}".fa
fi
if [ "$O" = "bed" ]
then
awk '{print $0" "$1":"$2"-"$3}' "$P" |
#myvar=$(twoBitToFa -bed=stdin -udcDir=. "$twobit" stdout) \
Rscript \
GetSeq_R.r \
$P \
$G \
$O \
$myvar
fi
fi
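A minimal sketch of why $myvar probably never reaches R in the first attempt. It uses stand-ins that are not part of the original script: printf and cat replace the awk | twoBitToFa stage, and a hypothetical show_args function replaces Rscript GetSeq_R.r. Because of the trailing backslash, myvar=$(...) becomes a temporary assignment prefixed to the Rscript command, and the shell expands $myvar in the argument list before that assignment takes effect.

```shell
# Stand-ins (assumptions for illustration only):
#   printf ... | cat   replaces   awk ... | twoBitToFa ...
#   show_args          replaces   Rscript GetSeq_R.r
show_args() { printf '%s\n' "$#"; }   # prints how many arguments arrived

unset myvar

# Shape of the attempt in the question: the assignment is only a prefix of
# the command, so $myvar expands to nothing and the fourth argument vanishes.
broken=$(printf 'chr1:1-5\n' | myvar=$(cat) show_args P G O $myvar)

# Capture first as a separate command, then pass the quoted variable.
myvar=$(printf 'chr1:1-5\n' | cat)
fixed=$(show_args P G O "$myvar")

echo "broken: $broken arguments, fixed: $fixed arguments"
```

For large sequence output, a single command-line argument may run into the system's argument-length limit. Piping the data into Rscript and reading it with file("stdin") on the R side is an alternative worth testing.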

Related

zsh function case condition parse error near `)'

I'm following this blog post to set up a zsh function to switch AWS CLI profiles: https://mads-hartmann.com/2017/04/27/multiple-aws-profiles.html
This is the zsh function in the blog:
function aws-switch() {
case ${1} in
"")
clear)
export AWS_PROFILE=""
;;
*)
export AWS_PROFILE="${1}"
;;
esac
}
#compdef aws-switch
#description Switch the AWS profile
_aws-switch() {
local -a aws_profiles
aws_profiles=$( \
grep '\[profile' ~/.aws/config \
| awk '{sub(/]/, "", $2); print $2}' \
| while read -r profile; do echo -n "$profile "; done \
)
_arguments \
':Aws profile:($(echo ${aws_profiles}) clear)'
}
_aws-switch "$@"
I added these lines to my ~/.zshrc. When I run source ~/.zshrc, it gives:
/.zshrc:4: parse error near `)'
I read the zsh function docs, but I still don't understand the syntax well enough to fix this.
Have a look at the zsh man page (man zshmisc):
case word in [ [(] pattern [ | pattern ] ... ) list (;;|;&|;|) ] ... esac
As you can see, multiple patterns have to be separated by |, not placed on separate lines:
case $1 in
""|clear)
....
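A minimal sketch of the corrected function, assuming the blog's intent was to treat both an empty argument and the word clear as "reset the profile". The underscore name aws_switch is a stand-in chosen here so the snippet also runs in plain sh; the original uses aws-switch.

```shell
# aws_switch is a hypothetical stand-in name for the blog's aws-switch.
aws_switch() {
  case "${1}" in
    ""|clear)                      # two patterns, separated by |
      export AWS_PROFILE=""
      ;;
    *)
      export AWS_PROFILE="${1}"
      ;;
  esac
}

aws_switch staging
first=$AWS_PROFILE                 # profile name is stored as given

aws_switch clear
second=$AWS_PROFILE                # "clear" resets it to the empty string

echo "first=$first second=$second"
```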

echo does not display proper output

The following code reads the contents of test.txt and, based on the first field, writes the third field to result.txt:
src_fld=s1
type=11
Logic_File=`cat /home/script/test.txt`
printf '%s\n' "$Logic_File" |
{
while IFS=',' read -r line
do
fld1=`echo $line | cut -d ',' -f 1`
if [[ $type -eq $fld1 ]];then
query=`echo $line | cut -d ',' -f 3-`
echo $query >> /home/stg/result.txt
fi
done
}
Following is the contents of test.txt:
6,STRING TO DECIMAL WITHOUT DEFAULT,cast($src_fld as DECIMAL(15,2) $tgt_fld
7,STRING TO INTERGER WITHOUT DEFAULT,cast($src_fld as integer) $tgt_fld
11,DEFAULT NO RULE,$src_fld
Everything works fine, except that the output in result.txt is $src_fld instead of s1. Can anyone please tell me what is wrong with the code?
Try replacing the below line
echo $query >> /home/stg/result.txt
with this one
eval "echo $query" >> /home/stg/result.txt
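The text read from test.txt is plain data, so the shell does not scan it a second time for variables; eval forces that second pass. Note that eval also executes whatever else is in the file, and the rules containing parentheses are likely to trip its parser. As a swapped-in alternative (not the method from the answer above), here is a minimal sketch that replaces only the literal token $src_fld with sed. It assumes the value of src_fld contains no / or & characters.

```shell
src_fld=s1
query='cast($src_fld as integer) $tgt_fld'   # a rule exactly as read from test.txt

# Plain echo prints the text as-is: data is not re-scanned for variables.
plain=$(echo $query)

# Replace only the literal token $src_fld; [$] stops sed treating $ as an anchor.
substituted=$(printf '%s\n' "$query" | sed "s/[\$]src_fld/$src_fld/g")

echo "$plain"
echo "$substituted"
```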

Unix — run one script when wc of the file not matched

I want to run the script with different parameters depending on whether the wc of the text file matches or not.
My Script:
#!/bin/sh
x= echo `wc -l "/scc/ftp/mrdr_rpt/yet_to_load.txt"`
if [ $x -gt 0 ]
then
sh /scc/ftp/mrdr_rpt/eam.ksh /scc/ftp/mrdr_rpt/vinu_mrdr_rpt.txt /scc/ftp/mrdr_rpt/yet_to_load.txt from@from.com to.name@to.com
elif
sh /scc/ftp/mrdr_rpt/eam.ksh /scc/ftp/mrdr_rpt/vinu_mrdr_rpt.txt /scc/ftp/mrdr_rpt/yet_to_load.txt from@from.com to.name@to.com, hi.name@hi.com
fi
You need to capture the output of wc accurately, and you need to avoid getting a file name in its output. You have:
x= echo `wc -l "/scc/ftp/mrdr_rpt/yet_to_load.txt"`
if [ $x -gt 0 ]
The space after the = is wrong. The echo is not wanted. You should use input redirection with wc. (wc is a little peculiar. If you give it a file name to process, it includes the file name in the output; if you have it process standard input, it doesn't include a file name in the output.) You should use $(…) in preference to back-quotes.
x=$(wc -l < "/scc/ftp/mrdr_rpt/yet_to_load.txt")
if [ $x -gt 0 ]
If you want to check if the file is not empty (rather than being a file with data but no newlines), then you can use a more direct test:
if [ -s "/scc/ftp/mrdr_rpt/yet_to_load.txt" ]
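A minimal sketch of the difference between the two wc invocations, using a temporary file as a stand-in for yet_to_load.txt:

```shell
f=$(mktemp)                  # temporary stand-in for yet_to_load.txt
printf 'one\ntwo\n' > "$f"

with_name=$(wc -l "$f")      # count followed by the file name
x=$(wc -l < "$f")            # count only (some systems pad it with blanks)

# $x is left unquoted on purpose so any padding blanks from wc are dropped.
if [ $x -gt 0 ]; then state="has lines"; else state="no lines"; fi

echo "$with_name | $x | $state"
rm -f "$f"
```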
You should probably be using a name such as
DIR="/scc/ftp/mrdr_rpt"
and then referencing it to reduce the ugly repetitions in your code:
if [ "$x" -gt 0 ]
then
    sh "$DIR/eam.ksh" "$DIR/vinu_mrdr_rpt.txt" "$DIR/yet_to_load.txt" \
        from@from.com to.name@to.com
else
    sh "$DIR/eam.ksh" "$DIR/vinu_mrdr_rpt.txt" "$DIR/yet_to_load.txt" \
        from@from.com to.name@to.com, hi.name@hi.com
fi
However, I think the comma in the second line is probably not needed, and it might be better to use:
who="from@from.com to.name@to.com"
if [ ! -s "$DIR/yet_to_load.txt" ]
then who="$who hi.name@hi.com"
fi
sh "$DIR/eam.ksh" "$DIR/vinu_mrdr_rpt.txt" "$DIR/yet_to_load.txt" $who
Then you've only one line with all the names in it. And, in a shell that supports arrays (such as bash, not plain sh), you might do even better with an array instead of a string:
who=("from@from.com" "to.name@to.com")
if [ ! -s "$DIR/yet_to_load.txt" ]
then who+=("hi.name@hi.com" "Firstname Lastname <someone@example.com>")
fi
sh "$DIR/eam.ksh" "$DIR/vinu_mrdr_rpt.txt" "$DIR/yet_to_load.txt" "${who[@]}"
Using arrays means you can handle blanks in the names correctly where a simple string doesn't.

SAS Unix Shell Script - Print Contents of Table or Macro Variables

I figured it out.
GREPOUT=`grep "NOTE: Table $TABLE created," $LOGFILE | awk '{print $6}'`
NIW=`grep "SYMBOLGEN: Macro variable NIW resolves to" $LOGFILE | awk '{print $0}'`
if [ "$GREPOUT" -gt "0" ]; then
echo "$NIW" |\
$MAILX -s "SUCESSFUL BATCH RUN: $PROG $RPTDATE" $MAILLIST
fi
The body of the sent email then contains:
SYMBOLGEN: Macro variable NIW resolves to 8
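If only the value is wanted in the email rather than the whole log line, awk can print the last field. This is a minimal sketch that assumes the log line has exactly the shape shown above; the temporary file is a stand-in for the real $LOGFILE.

```shell
LOGFILE=$(mktemp)   # stand-in for the real SAS log
printf '%s\n' 'SYMBOLGEN: Macro variable NIW resolves to 8' > "$LOGFILE"

# $NF is the last whitespace-separated field of the matching line.
NIW=$(grep "SYMBOLGEN: Macro variable NIW resolves to" "$LOGFILE" | awk '{print $NF}')

echo "NIW=$NIW"
rm -f "$LOGFILE"
```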
My script runs a SAS program and sends out an email after it completes.
I'm looking to print the contents of a table or a list of macro variables in the email.
The SAS code has a %put _all_; statement at the end, so all macro variables are listed in the log.
Thanks.
#If it's gotten this far, we can safely grab the number of rows
#of output from $LOGFILE.
GREPOUT=`grep "NOTE: Table $TABLE created," $LOGFILE | awk '{print $6}'`
NIW=`grep "GLOBAL NIW" $LOGFILE | '(print $6)'`
if [ "$GREPOUT" -gt "0" ]; then
#echo "$GREPOUT rows found in $TABLE." |\
echo "$NIW NIW" |\
$MAILX -s "SUCESSFUL BATCH RUN: $PROG $RPTDATE" $MAILLIST
else
echo "$GREPOUT rows found in $TABLE." |\
$MAILX -s "SUCESSFUL BATCH RUN: $PROG $RPTDATE" $MAILLIST
fi

awk passing a variable

I am struggling with an awk problem in my bash shell script. In the snippet below I pass a variable, var_awk, to awk for use as a regular expression. The idea is to get lines above a regular expression, but the echo below does not display any data:
echo `ls -ltr $date*$f* | /usr/xpg4/bin/awk -v reg=$var_awk '/reg/ {print $0}'`
I am unable to use reg as the regex: when I do print reg the value is printed, but the match does not work as expected.
if [ $GE == "HBCA" ] || [ $GE == "HBUS" ] || [ $GE == "HBEU" ]; then
for f in `ls -ltr $date*GEN*REVAL*log|grep -v LPD | awk '{split($9,a,"_")}{print a[3]}'`; do
echo $f
var_awk="$date"_RESET_CALC_"$f"
echo $var_awk
echo `ls -ltr $date*$f* | /usr/xpg4/bin/awk -v reg=$var_awk '/reg/ {print $0}'`
You cannot use a variable in a regex that way. You need to do:
/usr/xpg4/bin/awk -v reg="$var_awk" '$0~reg{ print $0 }'
or simply
/usr/xpg4/bin/awk -v reg="$var_awk" '$0~reg'
Inside / /, reg is treated as the literal word "reg", not as your variable.
Quote your shell variables.
Try this:
...whatever you had already..|awk -v reg="$var_awk" '$0~reg'
It is better to wrap shell variables in quotes, e.g. in case your variable contains spaces.
/pattern/ in awk is called a regex constant. It cannot be used with a variable; that's why it is called a constant. We need to use a dynamic regex in this example.
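A minimal sketch of the difference between the regex constant and the dynamic regex. The value of var_awk and the three sample lines are made up for illustration.

```shell
var_awk="20240101_RESET_CALC_GEN"   # hypothetical value, for illustration only
lines='20240101_RESET_CALC_GEN.log
a line containing the word reg
unrelated.log'

# /reg/ is a regex constant: it matches the three letters "reg".
literal=$(printf '%s\n' "$lines" | awk -v reg="$var_awk" '/reg/')

# $0 ~ reg uses the contents of the awk variable as a dynamic regex.
dynamic=$(printf '%s\n' "$lines" | awk -v reg="$var_awk" '$0 ~ reg')

echo "literal: $literal"
echo "dynamic: $dynamic"
```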
