How can I print from the 3rd column through the last column using awk in Unix, when the file has 'n' columns? I can do it with cut, but I need an awk command. I tried awk -F " " '{ for(i=3;i<=NF;i++) print $i}'. It produces output, but not in the correct format. Can anyone suggest the proper command?
Combining Ed Morton's answers in:
Print all but the first three columns
delete a column with awk or sed
We get something like this:
awk '{sub(/^(\S+\s*){2}/,""); sub(/(\s*\S+){2}$/,"")}1'
#     ^^^^^^^^^^^^^^^^^^^^^^  ^^^^^^^^^^^^^^^^^^^^^^
#     remove 2 first cols     remove 2 last cols
Which you can adapt to your exact needs in terms of columns.
See an example given this input:
$ paste -d ' ' <(seq 5) <(seq 2 6) <(seq 3 7) <(seq 4 8) <(seq 5 9)
1 2 3 4 5
2 3 4 5 6
3 4 5 6 7
4 5 6 7 8
5 6 7 8 9
Let's just print the 3rd column:
$ awk '{sub(/^(\S+\s*){2}/,""); sub(/(\s*\S+){2}$/,"")}1' <(paste -d ' ' <(seq 5) <(seq 2 6) <(seq 3 7) <(seq 4 8) <(seq 5 9))
3
4
5
6
7
Your attempt was close, but it prints every column on its own line.
To correct this we create a variable called 'line' and initialize it to an empty string. The first time we are in the loop we just add the column to 'line'. From that point on we will append to 'line' with the field separator and the next column. Finally, we print 'line'. This will happen for each line in the file.
awk '{line="";for(i=3;i<=NF;i++) if(i==3) line=$i; else line=line FS $i; print line}'
This example assumes awk's default field separator. Also, any line with fewer than three fields prints as a blank line.
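As a quick check of the loop described above, here is a minimal sketch run against made-up sample input (three lines with 5, 2, and 3 fields). The sample data and the variable name `result` are illustrative assumptions, not part of the original question.

```shell
# Hypothetical sample input: the middle line has fewer than three fields.
result=$(printf '%s\n' 'a b c d e' 'i j' 'f g h' |
  awk '{
    line = ""
    for (i = 3; i <= NF; i++)
      line = (i == 3 ? $i : line FS $i)   # first field as-is, later ones joined with FS
    print line                            # empty when the line has < 3 fields
  }')
printf '%s\n' "$result"
```

The short middle line comes out as an empty line, which matches the caveat above.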
Assuming your fields are space-separated (and, as in this sample, a single character each; for longer fields use \S+\s+ in place of \S\s), then with GNU awk for gensub():
$ cat file
a b c d e f
g h i j k l
$ awk '{print gensub(/(\S\s){2}/,"",1)}' file
c d e f
i j k l
In general to print from, say, field 3 to field 5 if they are blank separated using GNU awk again with gensub():
$ awk '{print gensub(/(\S\s){2}((\S\s){2}\S).*/,"\\2",1)}' file
c d e
i j k
or the 3rd arg to match():
$ awk 'match($0,/(\S\s){2}((\S\s){2}\S)/,a){print a[2]}' file
c d e
i j k
or in general if they are separated by any single character:
$ awk '{print gensub("([^"FS"]"FS"){2}(([^"FS"]"FS"){2}[^"FS"]).*","\\2",1)}' file
c d e
i j k
$ awk 'match($0,"([^"FS"]"FS"){2}(([^"FS"]"FS"){2}[^"FS"])",a){print a[2]}' file
c d e
i j k
If the fields are separated by a string instead of a single-character but the RS is a single character then you should temporarily change FS to RS (which by definition you KNOW can't be present in the record) so you can negate it in the bracket expressions:
$ cat file
aSOMESTRINGbSOMESTRINGcSOMESTRINGdSOMESTRINGeSOMESTRINGf
gSOMESTRINGhSOMESTRINGiSOMESTRINGjSOMESTRINGkSOMESTRINGl
$ awk -F'SOMESTRING' '{gsub(FS,RS)} match($0,"([^"RS"]"RS"){2}(([^"RS"]"RS"){2}[^"RS"])",a){gsub(RS,FS,a[2]); print a[2]}' file
cSOMESTRINGdSOMESTRINGe
iSOMESTRINGjSOMESTRINGk
If both the FS and the RS are multi-char then there are various options, but the simplest is to use the NUL character, or some other character you know can't appear in your input file, instead of RS as the temporary replacement FS:
$ awk -F'SOMESTRING' '{gsub(FS,"\0")} match($0,/([^\0]\0){2}(([^\0]\0){2}[^\0])/,a){gsub("\0",FS,a[2]); print a[2]}' file
cSOMESTRINGdSOMESTRINGe
iSOMESTRINGjSOMESTRINGk
Obviously change FS to OFS in the final gsub()s above if desired.
If the FS was a regexp instead of a string and you want to retain it in the output then you need to look at GNU awk for the 4th arg for split().
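Where gensub() or the 3rd arg to match() is not available, a plain loop gives the same field range. This is a minimal sketch, assuming whitespace-separated fields; the helper name `print_fields` is made up for illustration and is not part of awk. Unlike the regexp approaches above, it re-joins the fields with OFS, so the original spacing is not preserved.

```shell
# print_fields FROM TO: hypothetical helper, prints fields FROM..TO of each input line.
print_fields() {
  awk -v from="$1" -v to="$2" '{
    out = ""
    last = (to < NF ? to : NF)            # do not run past the last field
    for (i = from; i <= last; i++)
      out = out (i > from ? OFS : "") $i  # join the selected fields with OFS
    print out
  }'
}

range=$(printf '%s\n' 'a b c d e f' 'g h i j k l' | print_fields 3 5)
printf '%s\n' "$range"
```

This works in any POSIX awk and handles multi-character fields, at the cost of normalizing the separators.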
If you don't mind normalizing the space, the most straightforward way is
$ awk '{$1=$2=""}1' | sed -r 's/^ +//'
in action
$ seq 11 40 | pr -6ts' ' | awk '{$1=$2=""}1' | sed -r 's/^ +//'
21 26 31 36
22 27 32 37
23 28 33 38
24 29 34 39
25 30 35 40
for the input
$ seq 11 40 | pr -6ts' '
11 16 21 26 31 36
12 17 22 27 32 37
13 18 23 28 33 38
14 19 24 29 34 39
15 20 25 30 35 40
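The leading blanks left behind by emptying the first two fields can also be trimmed inside awk itself, without the extra sed process. A small sketch under the same "normalizing the space is fine" assumption, using two of the sample lines above:

```shell
# Empty fields 1 and 2, then strip the leading spaces that the rebuilt record starts with.
trimmed=$(printf '%s\n' '11 16 21 26 31 36' '12 17 22 27 32 37' |
  awk '{ $1 = $2 = ""; sub(/^ +/, "") } 1')
printf '%s\n' "$trimmed"
```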
To print from the third column to the end (the emptied fields leave leading spaces on each line):
awk '{for(i=1;i<3;i++) $i=""; print $0}' filename
Related
Basically, I want to store the sorted data in another array and use that data to print something else.
When I add a footer, the sorted data is printed after the footer.
printf (" %-25s %-20s %d\n", employee_name[working_employee_id[y]], title[employee_name[working_employee_id[y]]], salary[employee_name[working_employee_id[y]]]) | "sort -nr -k2"
I want to print other things after this line executes, instead of having sort print its output at the end.
You need to close() the pipe at the end of your input before printing anything else if you want to make sure the command you're piping to finishes displaying all its output before your footer text.
Example:
$ paste <(seq 10 | shuf) <(seq 10 | shuf) |
awk '{ printf "%d\t%d\t%d\n", $1, $2, $1 + $2 | "sort -k1,1n" }
END { close("sort -k1,1n"); print "a\tb\tc" }'
1 8 9
2 3 5
3 6 9
4 4 8
5 2 7
6 10 16
7 9 16
8 1 9
9 7 16
10 5 15
a b c
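close() has to be given exactly the same command string that was used for the pipe, so one way to avoid a mismatch is to keep the command in a variable. A minimal sketch with made-up names and salaries; the variable `cmd` and the footer text are illustrative only:

```shell
# Hypothetical data: name and salary, sorted by salary in descending order.
report=$(printf '%s\n' 'carol 300' 'alice 100' 'bob 200' |
  awk '
    BEGIN { cmd = "sort -k2,2nr" }        # single source of truth for the command string
    { print $1, $2 | cmd }                # every record goes to the same sort process
    END {
      close(cmd)                          # wait for sort to finish and emit its output
      print "end of report"               # footer now comes after the sorted rows
    }')
printf '%s\n' "$report"
```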
Add a column before column n:
awk 'BEGIN{FS=OFS="fs"}{$n = value OFS $n}1' filename
I have tried this command but it doesn't work. What does the "n" represent here? Do I have to change the n to a value?
All together I have a file with 17 columns. I would like to add a new column in between column 6 and 7.
This can be achieved by looping over the fields:
Input file:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
Then adding "18" in between 6 and 7:
awk '{ for (i=1;i<=6;i++) { printf "%s ",$i } printf "%s","18"; for (i=7;i<=NF;i++) { printf " %s",$i } printf "\n" }' file
Explanation:
awk '{
    for (i=1;i<=6;i++) {
        printf "%s ",$i    # Loop through the first 6 space-delimited fields, printing a space after each to replicate the delimiter
    }
    printf "%s","18";      # Print "18" with no spaces
    for (i=7;i<=NF;i++) {
        printf " %s",$i    # Loop through the remaining fields, printing a space and then the field (NF is the number of fields, i.e. the index of the last one)
    }
    printf "\n"            # Print a newline
}' file
Output:
1 2 3 4 5 6 18 7 8 9 10 11 12 13 14 15 16 17
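As for the question about "n": in the command quoted in the question, n and value are placeholders that you have to supply yourself, for example with -v. A minimal sketch, assuming space-separated input and that the new value 18 should land before field 7:

```shell
# n = field to insert before, value = text of the new column (both passed in with -v).
inserted=$(echo '1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17' |
  awk -v n=7 -v value=18 '{ $n = value OFS $n } 1')
printf '%s\n' "$inserted"
```

For a file with another delimiter, set FS and OFS to that delimiter in a BEGIN block, as the quoted command does with its "fs" placeholder.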
I have a 2 column file that I am trying to find gaps of greater than 10 in consecutive numbers:
the file is in ascending order as follows:
a 12
b 16
c 19
d 25
e 28
f 38
g **40**
h **55**
i 56
j 59
k 62
What I would like is to be able to print every 1st column identifier (a-k) for each occurrence where two ADJACENT numbers from the 2nd column have a value greater than 10.
For example, the output I am looking for here is: g, h
(as the difference between the 2nd column associated with g and h is greater than 10)
Would very much appreciate your help :)
There are many ways to do it, using perl, awk, or even only a shell script, e.g.:
while read ID num
do  if expr "$num" - "$prenum" ">" 10 >/dev/null 2>&1   # discard expr output
    then echo $preID, $ID
    fi
    preID=$ID prenum=$num
done <2_column_file   # your 2 column file
I assumed your 2 column file doesn't actually contain empty lines, but rather the above are artifacts of SO formatting problems (you could have used ``` lines around).
Using awk
awk ' { c=$2; if(c-p>10 && NR>1 ) { print a,p; print $0 } p=c;a=$1 } '
with inputs
$ awk ' { c=$2; if(c-p>10 && NR>1 ) { print a,p; print $0 } p=c;a=$1 } ' waheed.txt
g 40
h 55
$ cat waheed.txt
a 12
b 16
c 19
d 25
e 28
f 38
g 40
h 55
i 56
j 59
k 62
$
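If only the identifiers are wanted, in the "g, h" form the question asks for, the same idea can be written as a pattern-action pair. A minimal sketch using the question's sample data; the variable names `prev` and `prev_id` are illustrative:

```shell
gaps=$(printf '%s\n' 'a 12' 'b 16' 'c 19' 'd 25' 'e 28' 'f 38' \
                     'g 40' 'h 55' 'i 56' 'j 59' 'k 62' |
  awk '
    NR > 1 && $2 - prev > 10 { print prev_id ", " $1 }  # gap to the previous line exceeds 10
    { prev = $2; prev_id = $1 }                         # remember this line for the next one
  ')
printf '%s\n' "$gaps"
```

The step from e (28) to f (38) is exactly 10, so it is not reported.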
I have two files. The first has a single column of IDs (with repeats). The second has three columns; its first column holds the same IDs, but each appears only once. I want to print the remaining two columns of the second file for each ID in the first file.
Example:
First file:
IDs
1
3
6
7
11
13
13
14
18
20
Second file:
IDs Freq Status
1 1 JD611
2 1 QD51
3 2
5
6
7 2
11 2
13 2
14 2
Desired OUTPUT
1 1 JD611
3 2
6
7 2
11 2
13 2
13 2
14 2
18
20
You can use this awk:
awk 'NR==FNR{a[$1]=$2 FS $3; next} {print $1, a[$1]}' f2 f1
To skip the header line,
awk 'FNR==1{next} NR==FNR{a[$1]=$2 FS $3; next} {print $1, a[$1]}' f2 f1
If the second file has more than three columns,
awk 'NR==FNR{c=$1; $1=""; a[c]=$0; next} {print $1, a[$1]}' f2 f1
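The commands above print a trailing separator for IDs that have no data in the second file. A sketch of one way to avoid that, using a cut-down version of the sample data written to temporary files (the file names f1 and f2 follow the answer; the array name `rest` and the reduced data are assumptions for illustration):

```shell
tmp=$(mktemp -d)
printf '%s\n' 1 3 13 13 18 > "$tmp/f1"                               # IDs, with a repeat
printf '%s\n' '1 1 JD611' '2 1 QD51' '3 2' '13 2' > "$tmp/f2"        # ID, Freq, Status

joined=$(awk '
  NR == FNR { rest[$1] = $2 (NF > 2 ? FS $3 : ""); next }            # first file given: f2
  {
    extra = (($1 in rest) && rest[$1] != "") ? OFS rest[$1] : ""     # nothing added if unknown
    print $1 extra
  }' "$tmp/f2" "$tmp/f1")
rm -rf "$tmp"
printf '%s\n' "$joined"
```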
I would like to know how I could transform the following ('Old') to 'New1' and 'New2' using awk:
Old:
5
21
31
4
5
11
12
15
5
19
5
12
5
.
.
New1:
5 21 31 4
5 11 12 15
5 19
5 12
.
.
New2:
521314
5111215
519
512
.
.
Thanks so much!
Requires gawk for multi-character RS:
$ awk 'BEGIN {RS="\n5\n"} {$1=$1; print (NR>1 ? 5 OFS $0 : $0)}' file
5 21 31 4
5 11 12 15
5 19
5 12
For the second version, just set OFS to the empty string:
$ awk -v OFS="" 'BEGIN {RS="\n5\n"} {$1=$1; print (NR>1 ? 5 OFS $0 : $0)}' file
521314
5111215
519
512
To get new1:
awk '/^5$/{printf "%s", (NR>1?RS:"")$0;next}{printf " %s",$0}END{print ""}' file
To get new2:
awk '/^5$/{printf "%s", (NR>1?RS:"")$0;next}{printf "%s",$0}END{print ""}' file
A variation of @jas's script:
$ awk -v RS="(^|\n)5\n" -v OFS='' 'NR>1{$1=$1; print 5,$0}' file
521314
5111215
519
512
$ awk -v RS="(^|\n)5\n" -v OFS=' ' 'NR>1{$1=$1; print 5,$0}' file
5 21 31 4
5 11 12 15
5 19
5 12
In the second one you don't have to set OFS explicitly, since a single space is its default value; otherwise both scripts are the same (and essentially the same as the other referenced answer).
With any awk:
$ awk -v ORS= '{print ($0==5 ? ors : OFS) $0; ors=RS} END{print ors}' file
5 21 31 4
5 11 12 15
5 19
5 12
$ awk -v ORS= -v OFS= '{print ($0==5 ? ors : OFS) $0; ors=RS} END{print ors}' file
521314
5111215
519
512
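The same grouping can also be spelled out with an explicit accumulator, which makes the group-start value and the joiner easy to change. A minimal sketch that runs in any POSIX awk, assuming the input begins with the start value; the variable names `start`, `sep` and `line` are made up for illustration:

```shell
grouped=$(printf '%s\n' 5 21 31 4 5 11 12 15 5 19 5 12 |
  awk -v start=5 -v sep=' ' '
    $0 == start { if (NR > 1) print line; line = $0; next }  # a new group begins here
    { line = line sep $0 }                                   # append to the current group
    END { if (NR > 0) print line }                           # flush the last group
  ')
printf '%s\n' "$grouped"
```

Passing -v sep='' instead produces the New2 form (521314, 5111215, and so on).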