AWK Include Whitespaces in Command - unix

I have the string "./Delivery Note.doc 1", which awk splits into:
$1 = ./Delivery
$2 = Note.doc
$3 = 1
I need to run the sum command on $1 and $2 joined back together with the whitespace preserved (./Delivery Note.doc). I tried this, but it drops the whitespace:
| '{ command="sum -r "$1 $2"
Result: ./DeliveryNote.doc

To execute the sum command, wrap the file name in escaped double quotes and pipe the generated command to bash:
echo "./Delivery Note.doc 1" | awk '{ command="sum -r \""$1" "$2"\""; print command}' | bash

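Without the escaped quotes, the command is printed with the space intact, but bash would split the file name into two arguments:
$ echo "./Delivery Note.doc 1" | awk '{ command="sum -r "$1" "$2; print command}'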
sum -r ./Delivery Note.doc
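The quoting step can be sketched on its own, without running sum. This is a minimal example, assuming the line always has the file name split across exactly $1 and $2; the variable names line and cmd are illustrative only.

```shell
# Build the command text with the file name wrapped in double quotes, so
# that a shell running it later sees "./Delivery Note.doc" as one argument.
line='./Delivery Note.doc 1'
cmd=$(printf '%s\n' "$line" | awk '{ printf "sum -r \"%s %s\"\n", $1, $2 }')
printf '%s\n' "$cmd"
```

Piping that text to bash, as in the answer above, would then run sum on the single file name.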

Related

UNIX shell script reading csv

I have a csv file. I would like to put the fields into different variables. Suppose there are three fields in each line of the csv file. I have this code:
csvfile=test.csv
while read inline; do
var1=`echo $inline | awk -F',' '{print $1}'`
var2=`echo $inline | awk -F',' '{print $2}'`
var3=`echo $inline | awk -F',' '{print $3}'`
.
.
.
done < $csvfile
This code works for simple input. However, it fails if a field contains an embedded comma. Any suggestions? For example:
how,are,you
I,"am, very",good
this,is,"a, line"
This may not be the perfect solution but it will work in your case.
[cloudera@quickstart Documents]$ cat cd.csv
a,b,c
d,"e,f",g
The file content is shown above. The script (a.sh):
csvfile=cd.csv
while read inline; do
var1=`echo $inline | awk -F'"' -v OFS='' '{ for (i=2; i<=NF; i+=2) gsub(",", "*", $i) }1' | awk -F',' '{print $1}' | sed 's/*/,/g'`
var2=`echo $inline | awk -F'"' -v OFS='' '{ for (i=2; i<=NF; i+=2) gsub(",", "*", $i) }1' | awk -F',' '{print $2}' | sed 's/*/,/g'`
var3=`echo $inline | awk -F'"' -v OFS='' '{ for (i=2; i<=NF; i+=2) gsub(",", "*", $i) }1' | awk -F',' '{print $3}' | sed 's/*/,/g'`
echo $var1 " " $var2 " " $var3
done< $csvfile
Output:
[cloudera@quickstart Documents]$ sh a.sh
a b c
d e,f g
In short: first mask the commas inside quoted fields by replacing them with "*", then split on "," with awk to pick the desired column, and finally turn each "*" back into "," to recover the actual field value. This assumes the data itself never contains "*".
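The same mask, split, restore idea can be sketched in a single awk pass per line instead of three pipelines per field. This is only a sketch under assumptions: parse_csv_line is a hypothetical helper name, the mask is the control character \001 rather than "*", the output separator "|" is assumed not to occur in the data, and quoted fields are assumed not to contain escaped quotes or newlines.

```shell
parse_csv_line() {
  awk -F'"' -v OFS='' '
    {
      # With FS set to a double quote, even-numbered fields were inside quotes.
      for (i = 2; i <= NF; i += 2) gsub(",", "\001", $i)
      $1 = $1                      # force $0 to be rebuilt without the quotes
      n = split($0, f, ",")
      for (j = 1; j <= n; j++) {
        gsub("\001", ",", f[j])    # restore the masked commas
        printf "%s%s", f[j], (j < n ? "|" : "\n")
      }
    }'
}

out=$(printf 'a,b,c\nd,"e,f",g\n' | parse_csv_line)
printf '%s\n' "$out"
```

Each output line can then be read back with IFS='|' read -r var1 var2 var3 inside the while loop.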

Unix File Merge - If key exists replace line with value

I have 2 files table.cols and table.rules
table.cols:
column_1
column_2
column_3
column_4
column_5
table.rules:
column_3 my_function(column_3, another_parameter)
column_4 my_other_function(column_3, a_different_parameter)
I want to merge these files to produce:
column_1,
column_2,
my_function(column_3, another_parameter),
my_other_function(column_3, a_different_parameter),
column_5
Notice the commas at the end of each line except the last.
Try this:
$ head file?
==> file1 <==
column_3 my_function(column_3, another_parameter)
column_4 my_other_function(column_3, a_different_parameter)
==> file2 <==
column_1
column_2
column_3
column_4
column_5
$ awk -v line=$(wc -l < file2) 'NR==FNR{a[$1]=$2FS$3;next} {print (a[$1] ?a[$1]OFS :((FNR<line)?$0 OFS:$0))}' OFS=, file1 file2
column_1,
column_2,
my_function(column_3, another_parameter),
my_other_function(column_3, a_different_parameter),
column_5
Explanation:
NR==FNR{a[$1]=$2FS$3;next} : stores each rule from file1, keyed by its column name.
((FNR<line)?$0 OFS:$0) : omits the trailing comma on the last line.
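A variant of the same idea that does not need the line count from wc can be sketched by delaying each output line by one, so the comma is only added once a following line is known to exist. The function name merge_cols and the temporary files are assumptions for illustration; the rules file is assumed to be non-empty and to separate the key from the expression with spaces.

```shell
merge_cols() {  # hypothetical helper: $1 = rules file, $2 = columns file
  awk '
    # First file: map each column name to the rest of its line.
    NR == FNR { key = $1; sub(/^[^ ]+ +/, ""); rule[key] = $0; next }
    # Second file: a later line exists, so the previous one gets a comma.
    FNR > 1   { print prev "," }
              { prev = ($1 in rule) ? rule[$1] : $0 }
    # The last line is printed without a comma.
    END       { if (prev != "") print prev }
  ' "$1" "$2"
}

tmp=$(mktemp -d)
printf 'column_1\ncolumn_2\ncolumn_3\ncolumn_4\ncolumn_5\n' > "$tmp/table.cols"
printf 'column_3 my_function(column_3, another_parameter)\n' > "$tmp/table.rules"
printf 'column_4 my_other_function(column_3, a_different_parameter)\n' >> "$tmp/table.rules"
merged=$(merge_cols "$tmp/table.rules" "$tmp/table.cols")
printf '%s\n' "$merged"
rm -rf "$tmp"
```

Taking everything after the first field as the expression also keeps rules that contain more than two spaces intact.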
This script works as expected:
#!/bin/bash
OIFS="$IFS"
IFS=$'\n'
NL=$(cat table.cols| wc -l)
CL=1
rm -f /tmp/tablecols /tmp/tablerules
for LINE in $(cat table.cols)
do
echo -ne "${LINE}" >> /tmp/tablecols
if [ $CL -lt $NL ]; then
echo "," >> /tmp/tablecols
CL=$((CL + 1))
else
echo "" >> /tmp/tablecols
fi
done
for LINE in $(cat table.rules)
do
KEY=$(echo $LINE|cut -f 1 -d " ")
VAL=$(echo $LINE| cut -f 2-5 -d " ")
sed -i "s/${KEY}/${VAL}/g" /tmp/tablecols
done
mv /tmp/tablecols result
Hope it helps! :)
join -a1 table.cols table.rules | sed -e 1,4s/$/,/
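join combines the two files on their first field
-a FILENUM also prints the unpairable lines from file FILENUM
sed commands accept an optional line range (here 1,4)
s/$/,/ appends a comma at the end of each line in that range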

How can I extract all repeated pattern in a line to comma separated format

I am extracting a pattern of interest from a file. The pattern repeats within each line, and I want to collect all matches on a line into a comma-separated list. For example, each line contains a string like this:
Line1: InterPro:IPR000504 InterPro:IPR003954 InterPro:IPR012677 Pfam:PF00076 PROSITE:PS50102 SMART:SM00360 SMART:SM00361 EMBL:CP002684 Proteomes:UP000006548 GO:GO:0009507 GO:GO:0003723 GO:GO:0000166 Gene3D:3.30.70.330 SUPFAM:SSF54928 eggNOG:KOG0118 eggNOG:COG0724 InterPro:IPR003954
Line2: InterPro:IPR000306 InterPro:IPR002423 InterPro:IPR002498 Pfam:PF00118 Pfam:PF01363 Pfam:PF01504 PROSITE:PS51455 SMART:SM00064 SMART:SM00330 InterPro:IPR013083 Proteomes:UP000006548 GO:GO:0005739 GO:GO:0005524 EMBL:CP002686 GO:GO:0009555 GO:GO:0046872 GO:GO:0005768 GO:GO:0010008 Gene3D:3.30.40.10 InterPro:IPR017455
I want to extract all InterPro IDs for each line, like this:
IPR000504,IPR003954,IPR012677,IPR003954
IPR000306,IPR002423,IPR002498,IPR013083,IPR017455
I have used this script:
while read line; do
NUM=$(echo $line | grep -oP 'InterPro:\K[^ ]+' | wc -l)
if [ $NUM -eq 0 ];then
echo "NA" >> InterPro.txt;
fi;
if [ ! $NUM -eq 0 ];then
echo $line | grep -oP 'InterPro:\K[^ ]+' | tr '\n' ',' >> InterPro.txt;
fi;
done <./File.txt
The problem is that when I run this script, all the matched values from File.txt are printed on a single line. I want the matches from each input line printed on a separate line.
Thank you in advance.
With awk:
awk '{for (i=1; i<=NF; ++i) {if ($i~/^InterPro:/) {gsub(/InterPro:/, "", $i); x=x","$i}} gsub (/^,/, "", x); print x; x=""}' file
Output:
IPR000504,IPR003954,IPR012677,IPR003954
IPR000306,IPR002423,IPR002498,IPR013083,IPR017455
With indent and more meaningful variable names:
awk '
{
for (column=1; column<=NF; ++column)
{
if ($column~/^InterPro:/)
{
gsub(/InterPro:/, "", $column)
line=line","$column
}
}
gsub (/^,/, "",line)
print line
line=""
}' file
With bash builtin commands:
while IFS= read -r line; do
for column in $line; do
[[ $column =~ ^InterPro:(.*) ]] && new+=",${BASH_REMATCH[1]}"
done
echo "${new#,*}"
unset new
done < file
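The original script also writes "NA" for lines with no InterPro entry, which the answers above print as an empty line. A minimal awk sketch that keeps that behaviour follows; extract_interpro is a hypothetical helper name, and the sample input is shortened for illustration.

```shell
extract_interpro() {
  awk '{
    out = ""
    for (i = 1; i <= NF; i++)
      if ($i ~ /^InterPro:/)
        # "InterPro:" is 9 characters long, so the ID starts at position 10.
        out = out (out == "" ? "" : ",") substr($i, 10)
    print (out == "" ? "NA" : out)   # one output line per input line
  }'
}

ids=$(printf '%s\n' \
  'InterPro:IPR000504 Pfam:PF00076 InterPro:IPR003954' \
  'Pfam:PF00118 SMART:SM00064' | extract_interpro)
printf '%s\n' "$ids"
```

Because awk prints exactly once per input record, the output stays aligned line by line with the input file.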
Finally, I changed the script and got the desired results:
while read line; do
NUM=$(echo $line | grep -oP 'InterPro:\K[^ ]+' | wc -l)
if [ $NUM -eq 0 ];then
echo "NA" >> InterPro.txt;
fi;
if [ ! $NUM -eq 0 ];then
echo $line | grep -oP 'InterPro:\K[^ ]+' | sed -n -e 'H;${x;s/\n/,/g;s/^,//;p;}' | sed 's/ /,/g' >> InterPro.txt;
fi;
done <./File.txt

Date validation in Unix shell script (ksh)

I am validating the date in a Unix shell script as follows:
CheckDate="2010-04-09"
regex="[1-9][0-9][0-9][0-9]-[0-9][0-9]-[0-3][0-9]"
if [[ $CheckDate = *#$regex ]]
then
echo "ok"
else
echo "not ok"
fi
But ksh prints "not ok". Please help; I want the output to be "ok".
Here is my little script (written in Solaris 10, nawk is mandatory... sorry...). I know if you try to trick it by sending an alphanumeric you get an error on the let statements. Not perfect, but it gets you there...
#!/usr/bin/ksh
# checks for "-" or "/" separated 3 field parameter...
if [[ `echo $1 | nawk -F"/|-" '{print NF}'` -ne 3 ]]
then
echo "invalid date!!!"
exit 1
fi
# typeset trickery...
typeset -Z4 YEAR
typeset -Z2 MONTH
typeset -Z2 DATE
let YEAR=`echo $1 | nawk -F"/|-" '{print $3}'`
let MONTH=`echo $1 | nawk -F"/|-" '{print $1}'`
let DATE=`echo $1 | nawk -F"/|-" '{print $2}'`
let DATE2=`echo $1 | nawk -F"/|-" '{print $2}'`
# validating the year
# if the year passed contains letters or is "0" the year is invalid...
if [[ $YEAR -eq 0 ]]
then
echo "Invalid year!!!"
exit 2
fi
# validating the month
if [[ $MONTH -eq 0 || $MONTH -gt 12 ]]
then
echo "Invalid month!"
exit 3
fi
# Validating the date
if [[ $DATE -eq 0 ]]
then
echo "Invalid date!"
exit 4
else
CAL_CHECK=`cal $MONTH $YEAR | grep $DATE2 > /dev/null 2>&1 ; echo $?`
if [[ $CAL_CHECK -ne 0 ]]
then
echo "invalid date!!!"
exit 5
else
echo "VALID DATE!!!"
fi
fi
You can try this and adapt it:
echo "04/09/2010" | awk -F '/' '{ print ($1 <= 04 && $2 <= 09 && match($3, /^[0-9][0-9][0-9][0-9]$/)) ? "good" : "bad" }'
echo "2010/04/09" | awk -F '/' '{ print ( match($1, /^[0-9][0-9][0-9][0-9]$/) && $2 <= 04 && $3 <= 09 ) ? "good" : "bad" }'
The code below works as you expect:
export checkdate="2010-04-09"
echo ${checkdate} | grep '^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]$'
if [ $? -eq 0 ]; then
echo "Date is valid"
else
echo "Date is not valid"
fi
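A pattern check alone still accepts dates such as 2010-02-30. As a minimal sketch, assuming the input is always in YYYY-MM-DD form, the function below checks the shape with a case pattern and then the month and day ranges, including leap years. The name is_valid_date is hypothetical; the code uses only POSIX shell features, so it should also run under ksh.

```shell
is_valid_date() {  # hypothetical helper; expects YYYY-MM-DD
  case $1 in
    [1-9][0-9][0-9][0-9]-[0-1][0-9]-[0-3][0-9]) ;;   # shape is plausible
    *) return 1 ;;
  esac
  y=${1%%-*}; rest=${1#*-}; m=${rest%-*}; d=${rest#*-}
  m=${m#0}; d=${d#0}            # drop one leading zero so 08 and 09 are not read as octal
  [ "$m" -ge 1 ] && [ "$m" -le 12 ] || return 1
  case $m in
    4|6|9|11) max=30 ;;
    2) if [ $((y % 4)) -eq 0 ] && { [ $((y % 100)) -ne 0 ] || [ $((y % 400)) -eq 0 ]; }
       then max=29; else max=28; fi ;;
    *) max=31 ;;
  esac
  [ "$d" -ge 1 ] && [ "$d" -le "$max" ]
}

if is_valid_date "2010-04-09"; then echo "ok"; else echo "not ok"; fi
```

The function returns status 0 for a valid date and 1 otherwise, so it can be used directly in an if statement.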

sub and gsub function?

I have this command:
$ find $PWD -name "*.jpg" | awk '{system( "echo " $(sub(/\//, "_")) ) }'
_home/mol/Pulpit/test/1.jpg
Now the same thing, but using gsub:
$ find $PWD -name "*.jpg" | awk '{system( "echo " $(gsub(/\//, "_")) ) }'
mol@mol:~
I want to get the result:
_home_mol_Pulpit_test_1.jpg
Thank you for your help.
EDIT:
I put 'echo' to test the command:
$ find $PWD -name "*.jpg" | awk '{gsub("/", "_")} {system( "echo " mv $0 " " $0) }'
_home_mol_Pulpit_test_1.jpg _home_pic_Pulpit_test_1.jpg
mol@mol:~
I want to get the result:
$ find $PWD -name "*.jpg" | awk '{gsub("/", "_")} {system( "echo " mv $0 " " $0) }'
/home/pic/Pulpit/test/1.jpg _home_pic_Pulpit_test_1.jpg
That won't work if the string contains more than one match... try this:
echo "/x/y/z/x" | awk '{ gsub("/", "_") ; system( "echo " $0) }'
or better (if the echo isn't a placeholder for something else):
echo "/x/y/z/x" | awk '{ gsub("/", "_") ; print $0 }'
In your case you want to make a copy of the value before changing it:
echo "/x/y/z/x" | awk '{ c=$0; gsub("/", "_", c) ; system( "echo " $0 " " c )}'
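The behaviour in the question follows from what sub and gsub return: both modify their target in place and return the number of replacements, not the new string. So $(sub(...)) is $1, while $(gsub(...)) becomes $5 for a path with five slashes, which is an empty field. A small sketch that makes the return value visible, using the path from the question (the variable names path, result, orig and n are illustrative):

```shell
path='/home/mol/Pulpit/test/1.jpg'
result=$(printf '%s\n' "$path" | awk '{
  orig = $0                 # keep the unmodified path
  n = gsub("/", "_")        # n is the replacement count; $0 is edited in place
  print n, orig, $0
}')
printf '%s\n' "$result"
```

This prints the count 5, followed by the original path and the flattened name, which is the pair the question asks for.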
