How to change a file format from rows to columns? - unix

file format change : top to bottom <> left to right
input file format:
100
150
200
300
500
output file format should be:
100,150,200,300,500
I need to apply this in reverse, too.

Just replace the linefeeds with commas:
$ tr '\n' ',' < input.txt > output.txt
and reverse
$ tr ',' '\n' < input.txt > output.txt

#!/bin/sh
# Print the lines of "filename" on one line, comma-separated.
# printf is used instead of the non-portable `echo -e "...\c"`.
i=0
while read line; do
    i=$((i + 1))
    if [ "$i" -eq 1 ]; then
        printf '%s' "$line"
    else
        printf ',%s' "$line"
    fi
done < filename
echo
Use this shell script to convert the newlines to commas. The drawback of the tr command is that it leaves a trailing comma at the end of the line; this script avoids that.
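The trailing comma can also be avoided entirely with paste (a sketch, assuming a paste that supports -s and -d, as both GNU and BSD do):

```shell
# Sample input from the question:
printf '100\n150\n200\n300\n500\n' > input.txt

# paste -s serializes the file onto a single line and -d sets
# the join delimiter, so no trailing comma is produced.
paste -sd, input.txt > output.txt
cat output.txt
```

Reversing is still just `tr ',' '\n'`.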

Related

Count file line along with file name in unix

I have 3 files at temp dir as below :
test1.txt -- It has 4 lines
test2.txt -- It has 5 lines
test3.txt -- It has 6 lines
I need to count the lines in each file, along with its name, into a separate file (LIST.txt), like below:
'test1.txt','4'
'test2.txt','5'
'test3.txt','6'
Code Tried :
FileDir=/temp/test*.txt
for file in ${FileDir}
do
filename=$(basename $file) & awk 'END {print NR-1}' ${file} >> /temp/LIST.txt
done
This is not giving me the name; it only gives me the line counts.
Also, how do I get the output of those two commands separated by ','?
Perhaps this would suit?
FileDir=/temp/test*.txt
for file in ${FileDir}
do
awk 'END{print FILENAME "," NR}' "$file"
done > LIST.txt
cat LIST.txt
/temp/test1.txt,4
/temp/test2.txt,5
/temp/test3.txt,6
Remove "/temp/" and include single quotes:
cd /temp
FileDir=test*.txt
for file in ${FileDir}
do
awk 'END{q="\047"; print q FILENAME q "," q NR q}' "$file"
done > ../LIST.txt
cd ../
cat LIST.txt
'test1.txt','4'
'test2.txt','5'
'test3.txt','6'
An alternative approach:
FileDir=/temp/test*.txt
for file in ${FileDir}
do
awk 'END{q="\047"; n = split(FILENAME, a, "/"); print q a[n] q "," q NR q}' "$file"
done > LIST.txt
cat LIST.txt
'test1.txt','4'
'test2.txt','5'
'test3.txt','6'
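A sketch of an alternative using wc -l instead of awk (the file contents here are hypothetical stand-ins for the question's data, and the loop assumes filenames without spaces or newlines):

```shell
# Hypothetical sample files matching the question's line counts:
printf 'a\nb\nc\nd\n'       > test1.txt
printf 'a\nb\nc\nd\ne\n'    > test2.txt
printf 'a\nb\nc\nd\ne\nf\n' > test3.txt

# Print 'name','count' per file; reading via stdin redirection
# keeps wc from echoing the filename, and $((...)) strips the
# leading whitespace some wc implementations print.
for file in test*.txt; do
    printf "'%s','%s'\n" "$(basename "$file")" "$(($(wc -l < "$file")))"
done > LIST.txt
cat LIST.txt
```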

How can I extract all repeated pattern in a line to comma separated format

I am extracting a pattern of interest from a file. Each line contains the pattern repeated several times, and I want to collect all the repeated patterns on each line in comma-separated format. For example, each line contains a string like this:
Line1: InterPro:IPR000504 InterPro:IPR003954 InterPro:IPR012677 Pfam:PF00076 PROSITE:PS50102 SMART:SM00360 SMART:SM00361 EMBL:CP002684 Proteomes:UP000006548 GO:GO:0009507 GO:GO:0003723 GO:GO:0000166 Gene3D:3.30.70.330 SUPFAM:SSF54928 eggNOG:KOG0118 eggNOG:COG0724 InterPro:IPR003954
Line2: InterPro:IPR000306 InterPro:IPR002423 InterPro:IPR002498 Pfam:PF00118 Pfam:PF01363 Pfam:PF01504 PROSITE:PS51455 SMART:SM00064 SMART:SM00330 InterPro:IPR013083 Proteomes:UP000006548 GO:GO:0005739 GO:GO:0005524 EMBL:CP002686 GO:GO:0009555 GO:GO:0046872 GO:GO:0005768 GO:GO:0010008 Gene3D:3.30.40.10 InterPro:IPR017455
I want to extract all the InterPro IDs from each line, like this:
IPR000504,IPR003954,IPR012677,IPR003954
IPR000306,IPR002423,IPR002498,IPR013083,IPR017455
I have used this script:
while read line; do
NUM=$(echo $line | grep -oP 'InterPro:\K[^ ]+' | wc -l)
if [ $NUM -eq 0 ];then
echo "NA" >> InterPro.txt;
fi;
if [ ! $NUM -eq 0 ];then
echo $line | grep -oP 'InterPro:\K[^ ]+' | tr '\n' ',' >> InterPro.txt;
fi;
done <./File.txt
The problem is that once I run this script, all the matched values from File.txt are printed on one line. I want the matches from each input line printed on a separate line.
Thank you in advance
With awk:
awk '{for (i=1; i<=NF; ++i) {if ($i~/^InterPro:/) {gsub(/InterPro:/, "", $i); x=x","$i}} gsub (/^,/, "", x); print x; x=""}' file
Output:
IPR000504,IPR003954,IPR012677,IPR003954
IPR000306,IPR002423,IPR002498,IPR013083,IPR017455
With indent and more meaningful variable names:
awk '
{
for (column=1; column<=NF; ++column)
{
if ($column~/^InterPro:/)
{
gsub(/InterPro:/, "", $column)
line=line","$column
}
}
gsub (/^,/, "",line)
print line
line=""
}' file
With bash builtin commands:
while IFS= read -r line; do
for column in $line; do
[[ $column =~ ^InterPro:(.*) ]] && new+=",${BASH_REMATCH[1]}"
done
echo "${new#,*}"
unset new
done < file
Finally, I changed the script and got the results I wanted:
while read line; do
NUM=$(echo $line | grep -oP 'InterPro:\K[^ ]+' | wc -l)
if [ $NUM -eq 0 ];then
echo "NA" >> InterPro.txt;
fi;
if [ ! $NUM -eq 0 ];then
echo $line | grep -oP 'InterPro:\K[^ ]+' | sed -n -e 'H;${x;s/\n/,/g;s/^,//;p;}' | sed 's/ /,/g' >> InterPro.txt;
fi;
done <./File.txt
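A shorter per-line variant of the same idea, joining each line's matches with paste so nothing carries over between input lines (a sketch; -oP needs GNU grep, and the File.txt built here is a shortened, hypothetical version of the real data):

```shell
# Hypothetical two-line sample (the real lines are much longer):
cat > File.txt <<'EOF'
InterPro:IPR000504 Pfam:PF00076 InterPro:IPR003954
Pfam:PF00118 SMART:SM00064
EOF

# For each input line, join its InterPro IDs with commas,
# or print NA when the line has none.
while IFS= read -r line; do
    ids=$(printf '%s\n' "$line" | grep -oP 'InterPro:\K\S+' | paste -sd,)
    printf '%s\n' "${ids:-NA}"
done < File.txt > InterPro.txt
cat InterPro.txt
```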

How to reverse a string in ksh

Please help me with this problem. I have an array containing 1000 lines of numbers treated as strings, and I want to reverse each of them one by one. My problem is how to reverse them, because I have to use ksh; with bash or something else it would be easy. What I have now is below, but
rev="$rev${copy:$y:1}" doesn't work in ksh.
i=0
while [[ $i -lt 999 ]]
do
    rev=""
    var=${xnumbers[$i]}
    copy=${var}
    len=${#copy}
    y=$(expr $len - 1)
    while [[ $y -ge 0 ]]
    do
        rev="$rev${copy:$y:1}"
        echo "y = " $y
        y=$(expr $y - 1)
    done
    echo "i = " $i
    echo "rev = " $rev
    #xnumbers[$i]=$(expr $xnumbers[$i] "|" $rev)
    echo "xum = " ${xnumbers[$i]}
    echo "##############################################"
    i=$(expr $i + 1)
done
I am not sure why we cannot use the built-in rev command.
$ echo 798|rev
897
You can also try:
$ echo 798 | awk '{ for(i=length;i!=0;i--)x=x substr($0,i,1);}END{print x}'
897
If you can print the contents of the array to a file, you can then process the file with this awk one-liner.
awk '{s1=split($0,A,""); line=""; for (i=s1;i>0;i--) line=line A[i];print line}' file
Check this!!
other_var=`echo ${xnumbers[$i]} | awk '{s1=split($0,A,""); line=""; for (i=s1;i>0;i--) line=line A[i];print line}'`
I have tested this on Ubuntu with ksh, same results:
number="789"
other_var=`echo $number | awk '{s1=split($0,A,""); line=""; for (i=s1;i>0;i--) line=line A[i];print line}'`
echo $other_var
987
You could use cut, paste and rev together; just change printf to cat file.txt:
paste -d' ' <(printf "%s data\n" {1..100} | cut -d' ' -f1) <(printf "%s data\n" {1..100} | cut -d' ' -f2 |rev)
Or rev alone, if it's not a numbered file, as the OP clarified.
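Building on that, each array element can be reversed by piping it through rev in a loop (a sketch; the compound array assignment works in ksh93 and bash, the elements are assumed to contain no newlines, and rev must be installed):

```shell
# Sample data; this array syntax works in ksh93 and bash.
xnumbers=(123 456 789)

i=0
while [ "$i" -lt "${#xnumbers[@]}" ]; do
    # rev reverses its input line; the command substitution
    # strips the trailing newline again.
    xnumbers[$i]=$(printf '%s\n' "${xnumbers[$i]}" | rev)
    i=$((i + 1))
done
echo "${xnumbers[@]}"
```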

Find all words starting with a fixed string in a file?

How can I find all the words in my CSV file starting with $?
My file is like:
Test1,$Var1,$varCab1,$Vargab1,Comment1
Test2,$Var2,$varCab2,$Vargab2,Comment2
Test3,$Var3,$varCab3,$Vargab3,Comment3
As an output I want
$Var1
$varCab1
$Vargab1
$Var2
$varCab2
$Vargab2
$Var3
$varCab3
$Vargab3
Try the following (grep -oE '\$\w+' filename):
$ cat 1.csv
Test1,$Var1,$varCab1,$Vargab1,Comment1
Test2,$Var2,$varCab2,$Vargab2,Comment2
Test3,$Var3,$varCab3,$Vargab3,Comment3
$ grep -oE '\$\w+' 1.csv
$Var1
$varCab1
$Vargab1
$Var2
$varCab2
$Vargab2
$Var3
$varCab3
$Vargab3
Using awk:
$ awk -F, '{ for(i=1;i<=NF;i++) if ($i ~ /^\$/) print $i; }' 1.csv
$Var1
$varCab1
$Vargab1
$Var2
$varCab2
$Vargab2
$Var3
$varCab3
$Vargab3
Use tr and grep:
$ tr ',' '\n' < inputfile | grep "^[$]"
$Var1
$varCab1
$Vargab1
$Var2
$varCab2
$Vargab2
$Var3
$varCab3
$Vargab3
Using perl:
perl -ne 'for (m/\$\w+/g) { print $_, "\n" }' < inputfile
Or even shorter:
perl -ne 'print map("$_\n", m/\$\w+/g)' < inputfile
Explanation:
The regular expression \$\w+ matches a $ followed by one or more word characters.
The m//g expression returns a list of matches.
perl -ne runs the expression for each line of the input, placing the line in $_, which is then used by the m//g expression.
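Note that \w is a PCRE/GNU notation; a sketch of a more portable spelling uses a POSIX character class instead (grep -o itself is still a common extension rather than strict POSIX):

```shell
# Sample file from the question (first two lines):
cat > 1.csv <<'EOF'
Test1,$Var1,$varCab1,$Vargab1,Comment1
Test2,$Var2,$varCab2,$Vargab2,Comment2
EOF

# [[:alnum:]_] is the portable spelling of \w, and \{1,\}
# is the BRE form of +.
grep -o '\$[[:alnum:]_]\{1,\}' 1.csv
```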

extracting a pattern and a certain field from the line above it using awk and grep preferably

i have a text file like this:
********** time1 **********
line of text1
line of text1.1
line of text1.2
********** time2 **********
********** time3 **********
********** time4 **********
line of text2.1
line of text2.2
********** time5 **********
********** time6 **********
line of text3.1
I want to extract each line of text together with the time (without the stars) above it, and store the result in a file (times with no line of text beneath them must be ignored). I want to do this preferably with grep and awk.
So for example, my output for the above input should be
time1 : line of text1
time1 : line of text1.1
time1 : line of text1.2
time4 : line of text2.1
time4 : line of text2.2
time6 : line of text3.1
How do I go about it?
This assumes that there are no spaces in the time.
awk '$1 ~ /\*+/ {prev = $2} $1 !~ /\*+/ {print prev, ":", $0}' inputfile
Works with spaces in the time:
awk '/^[^*]+/ { gsub(/\*+ | \*+/, "", x); printf "%s : %s\n", x, $0 } { x = $0 }' data.txt
You can do it like this with vim:
:%s_\*\+ \(YOUR TIME PATTERN\) \*\+\_.\(\[^*\].*\)$_\1 : \2_ | g_\*\+ YOUR TIME PATTERN \*\+_d
That is: search for TIME PATTERN lines, and save the time pattern together with the next line if that line does not start with *. Then create the new line from them, and delete every remaining TIME PATTERN line.
Note this assumes that the time pattern lines end with *'s, etc.
With awk:
awk '/\*+ YOUR TIME PATTERN \*+/ { time = gensub(/\*+ (YOUR TIME PATTERN) \*+/, "\\1", "g") }
! /\*+ YOUR TIME PATTERN \*+/ { print time " : " $0 }' INPUTFILE
And there are other ways to do it.
In awk, see :
#!/bin/bash
awk '
BEGIN {
    t = 0
}
{
    if ($0 ~ / time[0-9]+ /) {
        v = $2
        t = 1
    } else if ($0 ~ /line of text/) {
        if (t == 1) {
            printf("%s : %s\n", v, $0)
        }
    }
}
' FILE
Just replace FILE by your filename.
This might work for you (GNU sed):
sed '/^\*\+ \S\+.*/!d;s/[ *]//g;$!N;/\n[^*]/!D;s/\n/ : /' file
Explanation:
Look for lines beginning with *'s; delete all other lines. /^\*\+ \S\+.*/!d
Got a time line. Delete *'s and spaces (leaving time). s/[ *]//g
Get next line $!N
Check the second line doesn't begin with *'s otherwise delete first line /\n[^*]/!D
Got intended pattern, replace \n with spaced : and print. s/\n/ : /
awk '{ if( $0 ~ /^\*+ time[0-9] \*+$/ ) { time = $2 } else { print time " : " $0 } }' file
$ uniq -f 2 input-file | awk '{getline n; print $2 " : " n}'
If your timestamp has spaces in it, change the argument to the -f option so that uniq compares only the final string of *'s. E.g., use -f X, where X-2 is the number of spaces in the timestamp. If there are spaces in the timestamp, the awk will also need to change. Either of these will work:
$ uniq -f 3 input-file | awk -F '**********' '{getline n; print $2 " : " n}'
$ uniq -f 3 input-file | awk '{getline n; $1=""; $NF=""; print $0 ": " n }'
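Putting the simplest of the approaches above against a reduced sample input (a sketch; it assumes the time field contains no spaces):

```shell
# Build a reduced sample, then extract "time : text" pairs:
# lines of stars update the remembered time, and every other
# line is printed with it, which skips empty time markers.
cat > data.txt <<'EOF'
********** time1 **********
line of text1
********** time2 **********
********** time3 **********
line of text2.1
EOF
awk '/^\*/ { t = $2; next } { print t " : " $0 }' data.txt
```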
