file format change : top to bottom <> left to right
input file format:
100
150
200
300
500
output file format should be:
100,150,200,300,500
I need to apply this in reverse, too.
Just replace the linefeeds with commas:
$ tr '\n' ',' < input.txt > output.txt
And the reverse:
$ tr ',' '\n' < input.txt > output.txt
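#!/bin/sh
# Join the lines of "filename" with commas, without a trailing comma.
# printf is used because "echo -e ... \c" is not portable under /bin/sh.
i=0
while IFS= read -r line ; do
    i=$((i + 1))
    if [ "$i" -eq 1 ] ; then
        printf '%s' "$line"
    else
        printf ',%s' "$line"
    fi
done < filename
echo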
Use this shell script to convert the newlines to commas.
The drawback of the tr command is that it leaves a trailing comma at the end of the line;
this script avoids that.
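Another option, as a minimal sketch: paste -s joins all lines of a file using the delimiter given to -d, and it does not leave a trailing comma. The helper names join_lines and split_fields and the temporary files are made up for illustration.

```shell
# join_lines / split_fields are hypothetical helper names for this sketch.

# Join every line of the file into one comma-separated line.
join_lines() {
    paste -s -d, "$1"
}

# Reverse direction: put each comma-separated value on its own line.
split_fields() {
    tr ',' '\n' < "$1"
}

# Demo with a throwaway copy of the input from the question.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
printf '100\n150\n200\n300\n500\n' > "$tmp/input.txt"

join_lines "$tmp/input.txt" > "$tmp/output.txt"
cat "$tmp/output.txt"            # one line: 100,150,200,300,500
split_fields "$tmp/output.txt"   # back to one value per line
```

Since paste ends its output with a newline, feeding that line back through tr gives the original file again.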
Related
I have 3 files at temp dir as below :
test1.txt -- It has 4 lines
test2.txt -- It has 5 lines
test3.txt -- It has 6 lines
I need to write each file's name and line count to a separate file (LIST.txt), like below:
'test1.txt','4'
'test2.txt','5'
'test3.txt','6'
Code Tried :
FileDir=/temp/test*.txt
for file in ${FileDir}
do
filename=$(basename $file) & awk 'END {print NR-1}' ${file} >> /temp/LIST.txt
done
This is not giving me the name, it only gives me the line counts.
Also, how to get the output of those 2 commands separated by ',' ?
Perhaps this would suit?
FileDir=/temp/test*.txt
for file in ${FileDir}
do
awk 'END{print FILENAME "," NR}' "$file"
done > LIST.txt
cat LIST.txt
/temp/test1.txt,4
/temp/test2.txt,5
/temp/test3.txt,6
Remove "/temp/" and include single quotes:
cd /temp
FileDir=test*.txt
for file in ${FileDir}
do
awk 'END{q="\047"; print q FILENAME q "," q NR q}' "$file"
done > ../LIST.txt
cd ../
cat LIST.txt
'test1.txt','4'
'test2.txt','5'
'test3.txt','6'
An alternative approach:
FileDir=/temp/test*.txt
for file in ${FileDir}
do
awk 'END{q="\047"; n = split(FILENAME, a, "/"); print q a[n] q "," q NR q}' "$file"
done > LIST.txt
cat LIST.txt
'test1.txt','4'
'test2.txt','5'
'test3.txt','6'
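For comparison, a sketch that uses only wc and basename instead of awk. The function name list_counts and the demo directory are assumptions for illustration; point it at /temp to process the files from the question.

```shell
# list_counts is a hypothetical helper: print 'name','count' for each
# test*.txt file in the directory given as $1.
list_counts() {
    for file in "$1"/test*.txt; do
        [ -f "$file" ] || continue            # skip if the glob matched nothing
        count=$(( $(wc -l < "$file") ))       # arithmetic strips any padding from wc
        printf "'%s','%s'\n" "$(basename "$file")" "$count"
    done
}

# Demo with throwaway files of 4, 5 and 6 lines.
dir=$(mktemp -d)
trap 'rm -rf "$dir"' EXIT
printf 'a\nb\nc\nd\n'       > "$dir/test1.txt"
printf 'a\nb\nc\nd\ne\n'    > "$dir/test2.txt"
printf 'a\nb\nc\nd\ne\nf\n' > "$dir/test3.txt"

list_counts "$dir" > "$dir/LIST.txt"
cat "$dir/LIST.txt"
```

Reading the file on standard input (wc -l < file) keeps wc from printing the file name, so only the number has to be handled.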
I am extracting a pattern of interest from a file. Each line contains the pattern several times, and I want to collect all occurrences on each line in comma-separated format. For example, each line has a string like this:
Line1: InterPro:IPR000504 InterPro:IPR003954 InterPro:IPR012677 Pfam:PF00076 PROSITE:PS50102 SMART:SM00360 SMART:SM00361 EMBL:CP002684 Proteomes:UP000006548 GO:GO:0009507 GO:GO:0003723 GO:GO:0000166 Gene3D:3.30.70.330 SUPFAM:SSF54928 eggNOG:KOG0118 eggNOG:COG0724 InterPro:IPR003954
Line2: InterPro:IPR000306 InterPro:IPR002423 InterPro:IPR002498 Pfam:PF00118 Pfam:PF01363 Pfam:PF01504 PROSITE:PS51455 SMART:SM00064 SMART:SM00330 InterPro:IPR013083 Proteomes:UP000006548 GO:GO:0005739 GO:GO:0005524 EMBL:CP002686 GO:GO:0009555 GO:GO:0046872 GO:GO:0005768 GO:GO:0010008 Gene3D:3.30.40.10 InterPro:IPR017455
I want to extract all InterPro IDs for each line, like this:
IPR000504,IPR003954,IPR012677,IPR003954
IPR000306,IPR002423,IPR002498,IPR013083,IPR017455
I have used this script:
while read line; do
NUM=$(echo $line | grep -oP 'InterPro:\K[^ ]+' | wc -l)
if [ $NUM -eq 0 ];then
echo "NA" >> InterPro.txt;
fi;
if [ ! $NUM -eq 0 ];then
echo $line | grep -oP 'InterPro:\K[^ ]+' | tr '\n' ',' >> InterPro.txt;
fi;
done <./File.txt
The problem is that when I run this script, all the matched values in File.txt are printed on one line. I want the matched values for each input line printed on a separate line.
Thank you in advance
With awk:
awk '{for (i=1; i<=NF; ++i) {if ($i~/^InterPro:/) {gsub(/InterPro:/, "", $i); x=x","$i}} gsub (/^,/, "", x); print x; x=""}' file
Output:
IPR000504,IPR003954,IPR012677,IPR003954
IPR000306,IPR002423,IPR002498,IPR013083,IPR017455
With indent and more meaningful variable names:
awk '
{
  for (column=1; column<=NF; ++column)
  {
    if ($column~/^InterPro:/)
    {
      gsub(/InterPro:/, "", $column)
      line=line","$column
    }
  }
  gsub(/^,/, "", line)
  print line
  line=""
}' file
With bash builtin commands:
while IFS= read -r line; do
  for column in $line; do
    [[ $column =~ ^InterPro:(.*) ]] && new+=",${BASH_REMATCH[1]}"
  done
  echo "${new#,}"
  unset new
done < file
Finally, I changed the script and got the results I wanted:
while read line; do
NUM=$(echo $line | grep -oP 'InterPro:\K[^ ]+' | wc -l)
if [ $NUM -eq 0 ];then
echo "NA" >> InterPro.txt;
fi;
if [ ! $NUM -eq 0 ];then
echo $line | grep -oP 'InterPro:\K[^ ]+' | sed -n -e 'H;${x;s/\n/,/g;s/^,//;p;}' | sed 's/ /,/g' >> InterPro.txt;
fi;
done <./File.txt
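The awk answers above print an empty line when a line has no InterPro ID, while the original script writes NA. A minimal sketch that keeps the NA fallback in a single awk pass; extract_interpro is a hypothetical helper name and the two demo lines are shortened stand-ins for the real data.

```shell
# extract_interpro is a hypothetical helper: for each input line, print the
# InterPro IDs joined by commas, or NA when the line has none.
extract_interpro() {
    awk '{
        out = ""
        for (i = 1; i <= NF; i++)
            if ($i ~ /^InterPro:/) {
                id = substr($i, 10)              # drop the 9-character "InterPro:" prefix
                out = (out == "" ? id : out "," id)
            }
        print (out == "" ? "NA" : out)
    }' "$1"
}

# Demo: the first line has two IDs, the second has none.
tmp=$(mktemp)
trap 'rm -f "$tmp"' EXIT
cat > "$tmp" <<'EOF'
InterPro:IPR000504 Pfam:PF00076 InterPro:IPR003954
Pfam:PF00118 SMART:SM00064
EOF

extract_interpro "$tmp"
```

Redirect the output to InterPro.txt to get the same file the loop in the question produces.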
Please help me with this problem. I have an array which holds 1000 lines of numbers that are treated as strings, and I want to reverse each of them one by one. My problem is how to reverse them: I have to use ksh, otherwise with bash or something else it would be easy. What I have now is this, but
rev="$rev${copy:$y:1}" doesn't work in ksh.
i=0
while [[ $i -lt 999 ]]
do
rev=""
var=${xnumbers[$i]}
copy=${var}
len=${#copy}
y=$(expr $len - 1)
while [[ $y -ge 0 ]]
do
rev="$rev${copy:$y:1}"
echo "y = " $y
y=$(expr $y - 1)
done
echo "i = " $i
echo "rev = " $rev
#xnumbers[$i]=$(expr $xnumbers[$i] "|" $rev)
echo "xum = " ${xnumbers[$i]}
echo "##############################################"
i=$(expr $i + 1)
done
I am not sure why we cannot use the rev utility (an external command, not a shell built-in).
$ echo 798|rev
897
You can also try:
$ echo 798 | awk '{ for(i=length;i!=0;i--)x=x substr($0,i,1);}END{print x}'
897
If you can print the contents of the array to a file, you can then process the file with this awk one-liner (splitting on an empty separator is a common awk extension, not required by POSIX).
awk '{s1=split($0,A,""); line=""; for (i=s1;i>0;i--) line=line A[i];print line}' file
Check this!!
other_var=`echo ${xnumbers[$i]} | awk '{s1=split($0,A,""); line=""; for (i=s1;i>0;i--) line=line A[i];print line}'`
I have tested this on Ubuntu with ksh, same results:
number="789"
other_var=`echo $number | awk '{s1=split($0,A,""); line=""; for (i=s1;i>0;i--) line=line A[i];print line}'`
echo $other_var
987
You could use cut, paste and rev together, just change printf to cat file.txt:
paste -d' ' <(printf "%s data\n" {1..100} | cut -d' ' -f1) <(printf "%s data\n" {1..100} | cut -d' ' -f2 |rev)
Or use rev alone if it's not a numbered file, as clarified by the OP.
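If the ${copy:$y:1} substring syntax is not available, here is a sketch that reverses a string using only the ${var#pattern} and ${var%pattern} expansions, which POSIX shells provide and which ksh should also accept. reverse_string is a hypothetical name, and it is only intended for plain strings such as the digits in the question.

```shell
# reverse_string is a hypothetical helper: reverse $1 one character at a time
# using only ${var#pattern} and ${var%pattern}.
reverse_string() {
    str=$1
    rev=
    while [ -n "$str" ]; do
        rest=${str#?}            # everything after the first character
        first=${str%"$rest"}     # the first character itself
        rev=$first$rev           # prepend, so the order ends up reversed
        str=$rest
    done
    printf '%s\n' "$rev"
}

reverse_string 798
reverse_string 12345
```

Inside the loop from the question this could be called as rev=$(reverse_string "${xnumbers[$i]}"), which avoids starting awk or rev for every element.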
How can I find all the words in my csv file starting with $
My file is like:
Test1,$Var1,$varCab1,$Vargab1,Comment1
Test2,$Var2,$varCab2,$Vargab2,Comment2
Test3,$Var3,$varCab3,$Vargab3,Comment3
As an output I want
$Var1
$varCab1
$Vargab1
$Var2
$varCab2
$Vargab2
$Var3
$varCab3
$Vargab3
Try following (grep -oE '\$\w+' filename):
$ cat 1.csv
Test1,$Var1,$varCab1,$Vargab1,Comment1
Test2,$Var2,$varCab2,$Vargab2,Comment2
Test3,$Var3,$varCab3,$Vargab3,Comment3
$ grep -oE '\$\w+' 1.csv
$Var1
$varCab1
$Vargab1
$Var2
$varCab2
$Vargab2
$Var3
$varCab3
$Vargab3
Using awk:
$ awk -F, '{ for(i=1;i<=NF;i++) if ($i ~ /^\$/) print $i; }' 1.csv
$Var1
$varCab1
$Vargab1
$Var2
$varCab2
$Vargab2
$Var3
$varCab3
$Vargab3
Use tr and grep:
$ tr ',' '\n' < inputfile | grep "^[$]"
$Var1
$varCab1
$Vargab1
$Var2
$varCab2
$Vargab2
$Var3
$varCab3
$Vargab3
Using perl:
perl -ne 'for (m/\$\w+/g) { print $_, "\n" }' < inputfile
Or even shorter:
perl -ne 'print map("$_\n", m/\$\w+/g)' < inputfile
Explanation:
The regular expression \$\w+ matches a $ followed by one or more word characters.
The m//g expression returns a list of matches.
perl -ne runs the expression for each line of the input, placing the line in $_, which is then used by the m//g expression.
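The same regular expression can be written without \w, which is not part of POSIX extended regular expressions. A sketch using the bracket class [[:alnum:]_] instead, assuming a grep that supports -o (GNU and BSD grep do); dollar_words is a hypothetical helper name.

```shell
# dollar_words is a hypothetical helper: print every $-prefixed word in the
# file, one per line. [[:alnum:]_] stands in for \w.
dollar_words() {
    grep -oE '\$[[:alnum:]_]+' "$1"
}

# Demo with two of the lines from the question.
tmp=$(mktemp)
trap 'rm -f "$tmp"' EXIT
cat > "$tmp" <<'EOF'
Test1,$Var1,$varCab1,$Vargab1,Comment1
Test2,$Var2,$varCab2,$Vargab2,Comment2
EOF

dollar_words "$tmp"
```

The quoted 'EOF' delimiter keeps the shell from expanding $Var1 and friends while the demo file is written.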
i have a text file like this:
********** time1 **********
line of text1
line of text1.1
line of text1.2
********** time2 **********
********** time3 **********
********** time4 **********
line of text2.1
line of text2.2
********** time5 **********
********** time6 **********
line of text3.1
I want to extract each line of text together with the time (without the stars) above it and store the result in a file. (Times with no line of text beneath them have to be ignored.) I would prefer to do this with grep and awk.
So for example, my output for the above code should be
time1 : line of text1
time1 : line of text1.1
time1 : line of text1.2
time4 : line of text2.1
time4 : line of text2.2
time6 : line of text3.1
How do I go about it?
This assumes that there are no spaces in the time; any number of text lines may follow a time marker.
awk '$1 ~ /\*+/ {prev = $2} $1 !~ /\*+/ {print prev, ":", $0}' inputfile
Works with spaces in the time:
awk '/^\*/ { gsub(/^\*+ *| *\*+$/, ""); t = $0; next } { print t " : " $0 }' data.txt
You can do it like this with vim:
:%s_\*\+ \(YOUR TIME PATTERN\) \*\+\_.\([^*].*\)$_\1 : \2_ | g_\*\+ YOUR TIME PATTERN \*\+_d
That is, it searches for TIME PATTERN lines, saving the time pattern and the next line if that line does not start with *, and builds the new line from them. It then deletes every remaining TIME PATTERN line.
Note that this assumes the time pattern lines end with *, etc.
With GNU awk (gensub is a gawk extension):
awk '/\*+ YOUR TIME PATTERN \*+/ { time=gensub(/\*+ (YOUR TIME PATTERN) \*+/,"\\1","g") }
! /\*+ YOUR TIME PATTERN \*+/ { print time " : " $0 }' INPUTFILE
And there are other ways to do it.
In awk, see :
#!/bin/bash
awk '
BEGIN {
    t = 0
}
{
    if ($0 ~ " time[0-9]+ ") {
        v = $2
        t = 1
    }
    else if ($0 ~ "line of text") {
        if (t == 1) {
            printf("%s : %s\n", v, $0)
        } else {
            t = 0
        }
    }
}
' FILE
Just replace FILE by your filename.
This might work for you (GNU sed):
sed '/^\*\+ \S\+.*/!d;s/[ *]//g;$!N;/\n[^*]/!D;s/\n/ : /' file
Explanation:
Look for lines beginning with *'s if not delete. /^\*\+ \S\+.*/!d
Got a time line. Delete *'s and spaces (leaving time). s/[ *]//g
Get next line $!N
Check the second line doesn't begin with *'s otherwise delete first line /\n[^*]/!D
Got intended pattern, replace \n with spaced : and print. s/\n/ : /
awk '{ if( $0 ~ /^\*+ time[0-9] \*+$/ ) { time = $2 } else { print time " : " $0 } }' file
$ uniq -f 2 input-file | awk '{getline n; print $2 " : " n}'
If your timestamp has spaces in it, change the argument to the -f option so that uniq is only comparing the final string of *. Eg, use -f X where X-2 is the number of spaces in the timestamp. Also if there are spaces in the timestamp, the awk will need to change. Either of these will work:
$ uniq -f 3 input-file | awk -F '**********' '{getline n; print $2 " : " n}'
$ uniq -f 3 input-file | awk '{getline n; $1=""; $NF=""; print $0 ": " n }'
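For completeness, a sketch in plain shell without awk or sed. It remembers the most recent marker line, strips the stars and surrounding spaces with parameter expansion, and prefixes every following text line with it, so spaces in the time and several text lines per marker are both handled. label_lines and the variable stamp are hypothetical names.

```shell
# label_lines is a hypothetical helper: prefix each text line with the most
# recent time marker, with the surrounding stars and spaces removed.
label_lines() {
    stamp=
    while IFS= read -r line; do
        case $line in
            \**)    # marker line: trim leading, then trailing stars and spaces
                stamp=${line#"${line%%[!* ]*}"}
                stamp=${stamp%"${stamp##*[!* ]}"}
                ;;
            *)      # text line: print it under the current marker
                printf '%s : %s\n' "$stamp" "$line"
                ;;
        esac
    done < "$1"
}

# Demo with the sample input from the question.
tmp=$(mktemp)
trap 'rm -f "$tmp"' EXIT
cat > "$tmp" <<'EOF'
********** time1 **********
line of text1
line of text1.1
line of text1.2
********** time2 **********
********** time3 **********
********** time4 **********
line of text2.1
line of text2.2
********** time5 **********
********** time6 **********
line of text3.1
EOF

label_lines "$tmp"
```

Markers with no text beneath them are simply overwritten by the next marker, so they never reach the output.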