Find all words starting with a fixed string in a file? - unix

How can I find all the words in my csv file starting with $?
My file is like:
As an output I want

Try the following (grep -oE '\$\w+' filename):
$ cat 1.csv
$ grep -oE '\$\w+' 1.csv
Using awk:
$ awk -F, '{ for(i=1;i<=NF;i++) if ($i ~ /^\$/) print $i; }' 1.csv

Use tr and grep:
$ tr ',' '\n' < inputfile | grep "^[$]"

Using perl:
perl -ne 'for (m/\$\w+/g) { print $_, "\n" }' < inputfile
Or even shorter:
perl -ne 'print map("$_\n", m/\$\w+/g)' < inputfile
The regular expression \$\w+ matches a $ followed by one or more word characters.
The m//g expression returns a list of matches.
perl -ne runs the expression for each line of the input, placing the line in $_, which the m//g expression then matches against.
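The same pattern can be tried outside perl. Below is a minimal shell sketch, not taken from the answers above: the function name extract_dollar_words and the sample rows are made up for illustration (the question's file is not shown), and [[:alnum:]_] is used as a portable spelling of \w. It assumes a grep that supports -o, as GNU and BSD grep do.

```shell
# Print every "$word" token, one per line.
# [[:alnum:]_] is the POSIX spelling of \w; -o prints only the matched text.
extract_dollar_words() {
  grep -oE '\$[[:alnum:]_]+' "$@"
}

# Hypothetical sample rows; the question's real file is not shown.
printf '%s\n' 'id,$alpha,plain' '$beta,7,$gamma_2' | extract_dollar_words
```

With no file argument the function reads standard input, so it can sit at the end of a pipeline as shown.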


How to coerce AWK to evaluate string as math expression?

Is there a way to evaluate a string as a math expression in awk?
balter@spectre3:~$ echo "sin(0.3) 0.3" | awk '{print $1,sin($2)}'
sin(0.3) 0.29552
I would like to know a way to also have the first input evaluated to 0.29552.
You can just create your own eval function which calls awk again to execute whatever command you want it to:
$ cat tst.awk
{ print eval($1), sin($2) }

# Evaluate str by running it in a child awk; cmd, line and ret are locals.
function eval(str,    cmd,line,ret) {
    cmd = "awk \047BEGIN{print " str "; exit}\047"
    if ( (cmd | getline line) > 0 ) {
        ret = line
    }
    close(cmd)
    return ret
}
$ echo 'sin(0.3) 0.3' | awk -f tst.awk
0.29552 0.29552
$ echo '4*7 0.3' | awk -f tst.awk
28 0.29552
$ echo 'tolower("FOO") 0.3' | awk -f tst.awk
foo 0.29552
awk lacks an eval(...) function, so you cannot translate input strings into code after the awk program has started. Perhaps it could be done, but only by writing your own parsing and evaluation engine in awk.
I would recommend using bc for this, like:
[edwbuck@phoenix ~]$ echo "s(0.3)" | bc -l
Note that this requires sin to be shortened to s, as that is bc's sine function.
Here's a simple one liner!
math(){ awk "BEGIN{printf $1}"; }
Examples of use:
math 1+1
Yields "2"
math 'sqrt(25)'
Yields "5"
x=100; y=5; math "sqrt($x) + $y"
Yields "15"
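A variant of the one-liner above, offered only as a sketch: it uses print instead of printf, so the output ends with a newline and a result that happens to contain % is not interpreted as a format string. The name calc is made up here. Like the original, it pastes its argument straight into an awk program, so it should only be given trusted input.

```shell
# Evaluate an awk expression passed as the first argument.
# Caution: the argument is inserted into the awk program text unchanged.
calc() {
  awk "BEGIN { print $1 }"
}

calc 1+1                # prints 2
calc 'sqrt(25)'         # prints 5
x=100; y=5
calc "sqrt($x) + $y"    # prints 15
```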
With gawk version 4.1.2:
echo "sin(0.3) 0.3" | awk '{split($1,a,/[()]/);f=a[1];print @f(a[2]),sin($2)}'
It also works with tolower(FOO).
You can try Perl, as it has an eval() function.
$ echo "sin(0.3)" | perl -ne ' print eval '
For the given input,
$ echo "sin(0.3) 0.3" | perl -ne ' /(\S+)\s+(\S+)/ and print eval($1), " ", $2 '
0.29552020666134 0.3

How can I extract all repeated patterns in a line into comma-separated format?

I am extracting a pattern of interest from a file. The pattern occurs several times in each line, and I want to collect all occurrences from each line in comma-separated format. For example, each line contains a string like this:
Line1: InterPro:IPR000504 InterPro:IPR003954 InterPro:IPR012677 Pfam:PF00076 PROSITE:PS50102 SMART:SM00360 SMART:SM00361 EMBL:CP002684 Proteomes:UP000006548 GO:GO:0009507 GO:GO:0003723 GO:GO:0000166 Gene3D: SUPFAM:SSF54928 eggNOG:KOG0118 eggNOG:COG0724 InterPro:IPR003954
Line2: InterPro:IPR000306 InterPro:IPR002423 InterPro:IPR002498 Pfam:PF00118 Pfam:PF01363 Pfam:PF01504 PROSITE:PS51455 SMART:SM00064 SMART:SM00330 InterPro:IPR013083 Proteomes:UP000006548 GO:GO:0005739 GO:GO:0005524 EMBL:CP002686 GO:GO:0009555 GO:GO:0046872 GO:GO:0005768 GO:GO:0010008 Gene3D: InterPro:IPR017455
I want to extract all InterPro IDs for each line, like this:
I have used this script:
while read line; do
    NUM=$(echo $line | grep -oP 'InterPro:\K[^ ]+' | wc -l)
    if [ $NUM -eq 0 ];then
        echo "NA" >> InterPro.txt;
    fi
    if [ ! $NUM -eq 0 ];then
        echo $line | grep -oP 'InterPro:\K[^ ]+' | tr '\n' ',' >> InterPro.txt;
    fi
done <./File.txt
The problem is that when I run this script, the matches from every line of File.txt are printed on a single line. I want the matches for each input line printed on a separate line.
Thank you in advance.
With awk:
awk '{for (i=1; i<=NF; ++i) {if ($i~/^InterPro:/) {gsub(/InterPro:/, "", $i); x=x","$i}} gsub (/^,/, "", x); print x; x=""}' file
With indentation and more meaningful variable names:
awk '
{
  for (column=1; column<=NF; ++column)
    if ($column~/^InterPro:/) {
      gsub(/InterPro:/, "", $column)
      line=line","$column
    }
  gsub(/^,/, "", line)
  print line
  line=""
}' file
With bash builtin commands:
while IFS= read -r line; do
  for column in $line; do
    [[ $column =~ ^InterPro:(.*) ]] && new+=",${BASH_REMATCH[1]}"
  done
  echo "${new#,}"
  unset new
done < file
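The question's script also writes NA for lines that contain no InterPro entry. Here is a sketch of an awk variant that keeps that behavior. The function name interpro_ids and the two shortened sample lines are assumptions made for illustration; they are not part of the original data.

```shell
# For each input line, print its InterPro IDs joined by commas,
# or "NA" when the line has none.
interpro_ids() {
  awk '{
    out = ""
    for (i = 1; i <= NF; i++)
      if ($i ~ /^InterPro:/)
        # "InterPro:" is 9 characters long, so the ID starts at position 10.
        out = out (out == "" ? "" : ",") substr($i, 10)
    print (out == "" ? "NA" : out)
  }' "$@"
}

# Shortened, hypothetical sample lines.
printf '%s\n' \
  'InterPro:IPR000504 Pfam:PF00076 InterPro:IPR003954' \
  'Pfam:PF00118 GO:GO:0005739' | interpro_ids
```

For the two sample lines this prints IPR000504,IPR003954 and then NA. Calling interpro_ids File.txt > InterPro.txt would mirror the file names used in the question.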
Finally, I changed the script and got the desired results:
while read line; do
    NUM=$(echo $line | grep -oP 'InterPro:\K[^ ]+' | wc -l)
    if [ $NUM -eq 0 ];then
        echo "NA" >> InterPro.txt;
    fi
    if [ ! $NUM -eq 0 ];then
        echo $line | grep -oP 'InterPro:\K[^ ]+' | sed -n -e 'H;${x;s/\n/,/g;s/^,//;p;}' | sed 's/ /,/g' >> InterPro.txt;
    fi
done <./File.txt

Remove duplicated string stored in variable

I have a variable $var with this content:
and I need to delete the duplicate words. The result must be stored in the same variable $var.
list=$(echo $var | tr "," "\n")
var=($(printf "%s\n" "${list[@]}" | sort | uniq -c | sort -rnk1 | awk '{ print $2 }'))
echo "${var[@]}"
If you are open to perl, then:
$ var="word1,word2,word3,word1,word3"
$ var=$(perl -F, -lane'{$h{$_}++ or push @a, $_ for @F; print join ",", @a}' <<< "$var")
$ echo "$var"
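If perl is not available, the same order-preserving de-duplication can be sketched with awk. The function name dedupe_csv is made up for this example, and the sample value is the one from the perl answer. One assumption to be aware of: awk -v interprets backslash escapes, so values containing backslashes would need different handling.

```shell
# Remove repeated comma-separated words, keeping the first occurrence of each.
dedupe_csv() {
  awk -v s="$1" 'BEGIN {
    n = split(s, parts, ",")
    out = ""
    for (i = 1; i <= n; i++)
      if (!(parts[i] in seen)) {
        seen[parts[i]] = 1
        out = out (out == "" ? "" : ",") parts[i]
      }
    print out
  }'
}

var="word1,word2,word3,word1,word3"
var=$(dedupe_csv "$var")
echo "$var"    # word1,word2,word3
```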

How to reverse a string in ksh

Please help me with this problem. I have an array that holds 1000 lines of numbers, which are treated as strings, and I want to reverse each of them one by one. My problem is how to reverse them, because I have to use ksh; with bash or something else it would be easy. This is what I have now, but
rev="$rev${copy:$y:1}" doesn't work in ksh.
while [[ $i -lt 999 ]]
do
    y=$(expr $len - 1)
    while [[ $y -ge 0 ]]
    do
        echo "y = " $y
        y=$(expr $y - 1)
    done
    echo "i = " $i
    echo "rev = " $rev
    #xnumbers[$i]=$(expr $xnumbers[$i] "|" $rev)
    echo "xum = " ${xnumbers[$i]}
    echo "##############################################"
    i=$(expr $i + 1)
done
I am not sure why you cannot use the rev utility.
$ echo 798|rev
You can also try:
$ echo 798 | awk '{ for(i=length;i!=0;i--)x=x substr($0,i,1);}END{print x}'
If you can print the contents of the array to a file, you can then process the file with this awk one-liner:
awk '{s1=split($0,A,""); line=""; for (i=s1;i>0;i--) line=line A[i];print line}' file
Check this!!
other_var=`echo ${xnumbers[$i]} | awk '{s1=split($0,A,""); line=""; for (i=s1;i>0;i--) line=line A[i];print line}'`
I have tested this on Ubuntu with ksh, same results:
other_var=`echo $number | awk '{s1=split($0,A,""); line=""; for (i=s1;i>0;i--) line=line A[i];print line}'`
echo $other_var
You could use cut, paste and rev together; just replace printf with cat file.txt:
paste -d' ' <(printf "%s data\n" {1..100} | cut -d' ' -f1) <(printf "%s data\n" {1..100} | cut -d' ' -f2 |rev)
Or rev alone if it's not a numbered file, as clarified by the OP.
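If neither rev nor awk is wanted, the reversal can also be sketched with nothing but the ${var#pattern} and ${var%pattern} expansions, which do not depend on the ${copy:$y:1} substring syntax that failed in the question. The function name reverse_string and its variables are made up for this example; they are global because plain POSIX sh has no local.

```shell
# Reverse a string one character at a time using only prefix/suffix stripping.
reverse_string() {
  s=$1
  rev=
  while [ -n "$s" ]; do
    rest=${s#?}           # the string without its first character
    first=${s%"$rest"}    # the first character (s minus that remainder)
    rev=$first$rev        # prepend, so the order ends up reversed
    s=$rest
  done
  printf '%s\n' "$rev"
}

reverse_string 798    # prints 897
```

Inside the question's loop this could be used as something like other_var=$(reverse_string "${xnumbers[$i]}").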

If the last column is equal to "R" then... Is it possible? In unix

I need to find the last column from a variable that contains some fields. I need to write something like:
if [ #the last column = "R" ];
then
value=`echo "'$value'"`
fi
Is it possible?
With awk you can try:
awk '$NF=="R"' <<< "$var"
$ var="this is a var with last as R"
$ awk '$NF=="R"' <<< "$var"
this is a var with last as R
$ var1="This should not be printed"
$ awk '$NF=="R"' <<< "$var1"
The condition can be:
if [[ $value == *' 'R ]]
then
    echo $value
fi
No need for an external language, like awk.
Using the =~ binary operator:
$ var="Some arbitrary string ending in R"
$ unset value
$ [[ "$var" =~ $'R$' ]] && value=${var}
$ echo $value
Some arbitrary string ending in R
$ var="Some arbitrary string ending in Q"
$ unset value
$ [[ "$var" =~ $'R$' ]] && value=${var}
$ echo $value
More universal code assuming separation by spaces:
case $var in
  (*\ R) printf "%s\n" "$var";;
esac
or:
if [ "${var##* }" = R ]; then
  printf "%s\n" "$var"
fi
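The pattern tests above assume a single space before the last column and no trailing whitespace. Where that cannot be guaranteed, here is a sketch that lets the shell split the string into words first. The function name last_field_is is made up for this example, and it assumes the default IFS; set -f is used only to stop wildcard characters in the value from being expanded during splitting.

```shell
# Succeed when the last whitespace-separated word of $1 equals $2.
last_field_is() {
  last=
  set -f                        # no filename expansion while splitting
  for word in $1; do            # unquoted on purpose: split on whitespace
    last=$word
  done
  set +f
  [ "$last" = "$2" ]
}

var="this is a var with last as R"
if last_field_is "$var" R; then
  printf '%s\n' "$var"
fi
```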
