How to reverse a string in ksh - unix

please help me with this problem, i have an array witch includes 1000 lines with number which are treated as strings and i want for all of them to reverse them one by one, my problem is how to reverse them because i have to use ksh or else with bash or something it would be so easy..... what i have now is this, but
rev="$rev${copy:$y:1}" doesnt work in ksh.
i=0
while [[ $i -lt 999 ]]
do
rev=""
var=${xnumbers[$i]}
copy=${var}
len=${#copy}
y=$(expr $len - 1)
while [[ $y -ge 0 ]]
do
rev="$rev${copy:$y:1}"
echo "y = " $y
y=$(expr $y - 1)
done
echo "i = " $i
echo "rev = " $rev
#xnumbers[$i]=$(expr $xnumbers[$i] "|" $rev)
echo "xum = " ${xnumbers[$i]}
echo "##############################################"
i=$(expr $i + 1)
done

I am not sure why we cannot use built in rev function.
$ echo 798|rev
897
You can also try:
$ echo 798 | awk '{ for(i=length;i!=0;i--)x=x substr($0,i,1);}END{print x}'
897

If, you can print the contents of the array to a file, you can then process the file with this awk oneliner.
awk '{s1=split($0,A,""); line=""; for (i=s1;i>0;i--) line=line A[i];print line}' file
Check this!!
other_var=`echo ${xnumbers[$i]} | awk '{s1=split($0,A,""); line=""; for (i=s1;i>0;i--) line=line A[i];print line}'`
I have tested this on Ubuntu with ksh, same results:
number="789"
other_var=`echo $number | awk '{s1=split($0,A,""); line=""; for (i=s1;i>0;i--) line=line A[i];print line}'`
echo $other_var
987

You could use cut, paste and rev together, just change printf to cat file.txt:
paste -d' ' <(printf "%s data\n" {1..100} | cut -d' ' -f1) <(printf "%s data\n" {1..100} | cut -d' ' -f2 |rev)
Or rev alone if, it's not a numbered file as clarified by the OP.

Related

How to coerce AWK to evaluate string as math expression?

Is there a way to evaluate a string as a math expression in awk?
balter#spectre3:~$ echo "sin(0.3) 0.3" | awk '{print $1,sin($2)}'
sin(0.3) 0.29552
I would like to know a way to also have the first input evaluated to 0.29552.
You can just create your own eval function which calls awk again to execute whatever command you want it to:
$ cat tst.awk
{ print eval($1), sin($2) }
function eval(str, cmd,line,ret) {
cmd = "awk \047BEGIN{print " str "; exit}\047"
if ( (cmd | getline line) > 0 ) {
ret = line
}
close(cmd)
return ret
}
$ echo 'sin(0.3) 0.3' | awk -f tst.awk
0.29552 0.29552
$ echo '4*7 0.3' | awk -f tst.awk
28 0.29552
$ echo 'tolower("FOO") 0.3' | awk -f tst.awk
foo 0.29552
awk lacks an eval(...) function. This means that you cannot do string to code translation based on input after the awk program initializes. Ok, perhaps it could be done, but not without writing your own parsing and evaluation engine in awk.
I would recommend using bc for this effort, like
[edwbuck#phoenix ~]$ echo "s(0.3)" | bc -l
.29552020666133957510
Note that this would require sin to be shortened to s as that's the bc sine operation.
Here's a simple one liner!
math(){ awk "BEGIN{printf $1}"; }
Examples of use:
math 1+1
Yields "2"
math 'sqrt(25)'
Yeilds "5"
x=100; y=5; math "sqrt($x) + $y"
Yeilds "15"
With gawk version 4.1.2 :
echo "sin(0.3) 0.3" | awk '{split($1,a,/[()]/);f=a[1];print #f(a[2]),sin($2)}'
It's ok with tolower(FOO) too.
You can try Perl as it has eval() function.
$ echo "sin(0.3)" | perl -ne ' print eval '
0.29552020666134
$
For the given input,
$ echo "sin(0.3) 0.3" | perl -ne ' /(\S+)\s+(\S+)/ and print eval($1), " ", $2 '
0.29552020666134 0.3
$

How can I extract all repeated pattern in a line to comma separated format

I am extracting an interested pattern in a file. In each line I have repeated pattern and I want to order all repeated pattern for each line in a comma separated format. For example: In each line I have a string like this:
Line1: InterPro:IPR000504 InterPro:IPR003954 InterPro:IPR012677 Pfam:PF00076 PROSITE:PS50102 SMART:SM00360 SMART:SM00361 EMBL:CP002684 Proteomes:UP000006548 GO:GO:0009507 GO:GO:0003723 GO:GO:0000166 Gene3D:3.30.70.330 SUPFAM:SSF54928 eggNOG:KOG0118 eggNOG:COG0724 InterPro:IPR003954
Line2: InterPro:IPR000306 InterPro:IPR002423 InterPro:IPR002498 Pfam:PF00118 Pfam:PF01363 Pfam:PF01504 PROSITE:PS51455 SMART:SM00064 SMART:SM00330 InterPro:IPR013083 Proteomes:UP000006548 GO:GO:0005739 GO:GO:0005524 EMBL:CP002686 GO:GO:0009555 GO:GO:0046872 GO:GO:0005768 GO:GO:0010008 Gene3D:3.30.40.10 InterPro:IPR017455
I want to extract all InterPro IDs for each line as like as this :
IPR000504,IPR003954,IPR012677,IPR003954
IPR000306,IPR002423,IPR002498,IPR013083,IPR017455
I have used this script:
while read line; do
NUM=$(echo $line | grep -oP 'InterPro:\K[^ ]+' | wc -l)
if [ $NUM -eq 0 ];then
echo "NA" >> InterPro.txt;
fi;
if [ ! $NUM -eq 0 ];then
echo $line | grep -oP 'InterPro:\K[^ ]+' | tr '\n' ',' >> InterPro.txt;
fi;
done <./File.txt
The problem is once I run this script, all the pattern's values in the File.txt print in one line. I want all interested pattern's values of each line print in separated line.
Thank you in advance
With awk:
awk '{for (i=1; i<=NF; ++i) {if ($i~/^InterPro:/) {gsub(/InterPro:/, "", $i); x=x","$i}} gsub (/^,/, "", x); print x; x=""}' file
Output:
IPR000504,IPR003954,IPR012677,IPR003954
IPR000306,IPR002423,IPR002498,IPR013083,IPR017455
With indent and more meaningful variable names:
awk '
{
for (column=1; column<=NF; ++column)
{
if ($column~/^InterPro:/)
{
gsub(/InterPro:/, "", $column)
line=line","$column
}
}
gsub (/^,/, "",line)
print line
line=""
}' file
With bash builtin commands:
while IFS= read -r line; do
for column in $line; do
[[ $column =~ ^InterPro:(.*) ]] && new+=",${BASH_REMATCH[1]}"
done
echo "${new#,*}"
unset new
done < file
Finally, I changed the script and could get the interested results:
while read line; do
NUM=$(echo $line | grep -oP 'InterPro:\K[^ ]+' | wc -l)
if [ $NUM -eq 0 ];then
echo "NA" >> InterPro.txt;
fi;
if [ ! $NUM -eq 0 ];then
echo $line | grep -oP 'InterPro:\K[^ ]+' | sed -n -e 'H;${x;s/\n/,/g;s/^,//;p;}' | sed 's/ /,/g' >> InterPro.txt;
fi;
done <./File.txt

Date validation in Unix shell script (ksh)

I am validating the date in Unix shell script as follow:
CheckDate="2010-04-09"
regex="[1-9][0-9][0-9][0-9]-[0-9][0-9]-[0-3][0-9]"
if [[ $CheckDate = *#$regex ]]
then
echo "ok"
else
echo "not ok"
fi
But ksh it is giving output as not okay.. pls help.. i want output as ok
Here is my little script (written in Solaris 10, nawk is mandatory... sorry...). I know if you try to trick it by sending an alphanumeric you get an error on the let statements. Not perfect, but it gets you there...
#!/usr/bin/ksh
# checks for "-" or "/" separated 3 field parameter...
if [[ `echo $1 | nawk -F"/|-" '{print NF}'` -ne 3 ]]
then
echo "invalid date!!!"
exit 1
fi
# typeset trickery...
typeset -Z4 YEAR
typeset -Z2 MONTH
typeset -Z2 DATE
let YEAR=`echo $1 | nawk -F"/|-" '{print $3}'`
let MONTH=`echo $1 | nawk -F"/|-" '{print $1}'`
let DATE=`echo $1 | nawk -F"/|-" '{print $2}'`
let DATE2=`echo $1 | nawk -F"/|-" '{print $2}'`
# validating the year
# if the year passed contains letters or is "0" the year is invalid...
if [[ $YEAR -eq 0 ]]
then
echo "Invalid year!!!"
exit 2
fi
# validating the month
if [[ $MONTH -eq 0 || $MONTH -gt 12 ]]
then
echo "Invalid month!"
exit 3
fi
# Validating the date
if [[ $DATE -eq 0 ]]
then
echo "Invalid date!"
exit 4
else
CAL_CHECK=`cal $MONTH $YEAR | grep $DATE2 > /dev/null 2>&1 ; echo $?`
if [[ $CAL_CHECK -ne 0 ]]
then
echo "invalid date!!!"
exit 5
else
echo "VALID DATE!!!"
fi
fi
You can try this and manipulate
echo "04/09/2010" | awk -F '/' '{ print ($1 <= 04 && $2 <= 09 && match($3, /^[0-9][0-9][0-9][0-9]$/)) ? "good" : "bad" }'
echo "2010/04/09" | awk -F '/' '{ print ( match($1, /^[0-9][0-9][0-9][0-9]$/) && $2 <= 04 && $3 <= 09 ) ? "good" : "bad" }'
Please find the below code works as your exception.
export checkdate="2010-04-09"
echo ${checkdate} | grep '^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]$'
if [ $? -eq 0 ]; then
echo "Date is valid"
else
echo "Date is not valid"
fi

comparing floating numbers in unix

I am facing problem in comparing big floating variables in unix
Code:
error message: syntax error on line 1 teletype
I got to know from one of the old posts in the forum this is because
"the script is trying to do a calculation with bc by echoing an expression into it. But one of the variables has an illegal number"
Below is the script which is giving the error
Code:
#! /bin/bash -xv
a=`cat abc.csv | sed '1d' | tr -s ' ' | cut -d, -f3`
echo $a
-180582621617.24
b=`sed '1d' def.csv | cut -d',' -f7 | awk '{s+=$1}END{ printf("%.2f\n",s)}'`
echo $b
-180582621617.37
Result=`echo "if($a !=$b) 1" | bc `
if [ $Result -eq 1 ]; then
echo "both values not equal"
else
echo " both values equal"
fi
But I was able to compare it when hard-coded
Code:
a=`echo "-180582621617.24,222.555,333.333" | awk -F"," '{print $1}'`
b=`echo "-180582621617.24,222.555,333.333" | awk -F"," '{print $1}'`
Result=`echo "if($a !=$b) 1" | bc `
if [ $Result -eq 1 ]; then
echo "both values not equal"
else
echo " both values equal"
fi
Your test in bc is return 1 if true and nothing when false.
$Result will be then either undefined or numeric (1). test with -eq only works with two operands both numeric. Just return 0 for the else case
Result=`echo "if($a !=$b) 1 else 0" | bc `
if [ $Result -eq 1 ] ; then
echo "both values not equal"
else
echo " both values equal"
fi
Use bc for dealing with floating numbers in shell:
$ bc <<< '-180582621617.24 == -180582621617.37'
0
$ bc <<< '-180582621617.24 != -180582621617.37'
1
In your case, it is going to be bc <<< "$a != $b", e.g.:
[[ bc <<< "$a != $b" ]] && Result=1 || Result=0
Thanks for all the suggestions.
I was able to compare by creating two temp files and using the diff -w command.
#! /bin/bash -xv
rm -f triger_cksum.txt data_cksum.txt
a=`cat ab.csv | sed '1d' | tr -s ' ' | cut -d, -f3`
echo $a > triger_cksum.txt
b=`sed '1d' cd.csv | cut -d',' -f61 | awk '{s+=$1}END{ printf("%.6f\n",s)}'`
echo $b > data_cksum.txt
diff_files=`diff -w triger_cksum.txt data_cksum.txt | wc -l | tr -s ' '`
if [ $diff_files -eq 0 ]
then
echo "cksum equal"
else
echo "cksum not equal"
fi

How to find the distinct values in unix

I need distinct values from the below columns:
AA|BB|CC
a#gmail.com,c#yahoo.co.in|a#gmail.com|a#gmail.com
y#gmail.com|x#yahoo.in,z#redhat.com|z#redhat.com
c#gmail.com|b#yahoo.co.in|c#uix.xo.in
Here records are '|' seperated and in the 1st column, we can two email id's which are ',' seperated. so, I want to consider that also. I want distinct email id's in the AA,BB,CC column, whether it is '|' seperated or ',' seperated.
Expected output:
c#yahoo.co.in|a#gmail.com|
y#gmail.com|x#yahoo.in|z#redhat.com
c#gmail.com|b#yahoo.co.in|c#uix.xo.in
is awk unix enough for you?
{
for(i=1; i < NF; i++) {
if ($i ~ /#/) {
mail[$i]++
}
}
}
END {
for (x in mail) {
print mail[x], x
}
}
output:
$ awk -F'[|,]' -f v.awk f1
2 z#redhat.com
3 a#gmail.com
1 x#yahoo.in
1 c#yahoo.co.in
1 c#gmail.com
1 y#gmail.com
1 b#yahoo.co.in
Using awk :
cat file | tr ',' '|' | awk -F '|' '{ line=""; for (i=1; i<=NF; i++) {if ($i != "" && list[NR"#"$i] != 1){line=line $i "|"}; list[NR"#"$i]=1 }; print line}'
Prints :
a#gmail.com|c#yahoo.co.in|
y#gmail.com|x#yahoo.in|z#redhat.com|
c#gmail.com|b#yahoo.co.in|c#uix.xo.in|
Edit :
Now works properly with inputs such as :
a#gmail.com|c#yahoo.co.in|
y#gmail.com|x#yahoo.in|a#gmail.com|
c#gmail.com|c#yahoo.co.in|c#uix.xo.in|
Prints :
a#gmail.com|c#yahoo.co.in|
y#gmail.com|x#yahoo.in|a#gmail.com|
c#gmail.com|c#yahoo.co.in|c#uix.xo.in|
The following python code will solve your problem:
#!/usr/bin/env python
while True:
try:
addrs = raw_input()
except EOFError:
break
print '|'.join(set(addrs.replace(',', '|').split('|')))
In Bash only:
while read s; do
IFS='|,'
for e in $s; do
echo "$e"
done | sort | uniq
unset IFS
done
This seems to work, although I'm not sure what to do if there are more than three unique mails. Run with awk -f filename.awk dataname.dat
BEGIN {IFS=/[,|]/}
NF {
delete uniqmails;
for (i=1; i<=NF; i++)
uniqmails[$i] = 1;
sep="";
n=0;
for (m in uniqmails) {
printf "%s%s", sep, m;
sep="|";
n++;
}
for (;n<3;n++) printf "|";
print ""; // EOL
}
There's also this "one-liner" that doesn't need awk:
while read line; do
echo $line | tr ",|" "\n" | sort -u |\
paste <( seq 3) - | cut -f 2 |\
tr "\n" "|" |\
rev | cut -c 2- | rev;
done
With perl:
perl -lane '$s{$_}++ for split /[|,]/; END { print for keys %s;}' input
I have edited this post, Hope it will work
while read line
do
val1=`echo $line|awk -F"|" '{print $1}'`
val2=`echo $line|awk -F"|" '{print $2}'`
val3=`echo $line|awk -F"|" '{print $3}'`
a=`echo $line|awk -F"|" '{print $2,"|",$3}'|sed 's/'$val1'//g'`
aa=`echo "$val1|$a"`
b=`echo $aa|awk -F"|" '{print $1,"|",$3}'|sed 's/'$val2'//g'`
b1=`echo $b|awk -F"|" '{print $1}'`
b2=`echo $b|awk -F"|" '{print $2}'`
bb=`echo "$b1|$val2|$b2"`
c=`echo $bb|awk -F"|" '{print $1,"|",$2}'|sed 's/'$val3'//g'`
cc=`echo "$c|$val3"|sed 's/,,/,/;s/,|/|/;s/|,/|/;s/^,//;s/ //g'`
echo "$cc">>abcd
done<ab.dat
cat abcd
c#yahoo.co.in||a#gmail.com
y#gmail.com|x#yahoo.in|z#redhat.com
c#gmail.com|b#yahoo.co.in|c#uix.xo.in
You can subtract all "," separated values and parse in the same way...if your all values are having "," separated.

Resources