how to get NF from AWK argument? - unix

I've been trying to get NF from two arguments (not files) in awk, without success.
This is the command line:
awk -f the_program 12/12/2013 11/11/2014
Is it possible to somehow pipe ARGV[1] or ARGV[2] to getline to get NF?
I want NF so I can easily validate the arguments before doing anything else with them.

You can do it in pure awk:
$ awk -F/ 'BEGIN{for (i=1;i<ARGC;i++) {print split(ARGV[i], a) }}' 12/12/2013 11/11/2014
3
3
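Since the goal is to validate the arguments, the same split() call can drive an exit status. This is only a sketch: check_dates is a hypothetical helper name, and "exactly three /-separated parts" is an assumed validity rule, not something stated in the question. Because the program has only a BEGIN block, awk never tries to open the arguments as files.

```shell
# Hypothetical helper: succeed only if every argument splits into
# exactly three "/"-separated parts (assumed validation rule).
check_dates() {
    awk 'BEGIN {
        for (i = 1; i < ARGC; i++) {
            # split() returns the number of pieces, i.e. what NF would be
            n = split(ARGV[i], parts, "/")
            if (n != 3) {
                printf "bad argument: %s\n", ARGV[i] > "/dev/stderr"
                exit 1
            }
        }
    }' "$@"
}
```

Something like check_dates 12/12/2013 11/11/2014 && echo ok could then guard the rest of the script.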

I am not sure whether it can be done using awk alone. Try a wrapper bash script around it.
myscript.sh
#!/bin/bash
awk 'BEGIN {
RS=" "; FS="/";
}
{
print NF;
}' <(echo $*)
Test
% myscript.sh 12/12/2013 11/11/2014
3
3
Or, eliminating the use of echo with <<< as suggested by @fedorqui:
#!/bin/bash
awk 'BEGIN {RS=" ";FS="/";} {print NF}' <<<$*

Is this all you want?
$ awk 'BEGIN{ print ARGC - 1 }' 12/12/2013 11/11/2014
2
If not, post some expected output and explain why.

Related

cut command --complement flag equivalent in AWK

I am new to writing shell scripts.
I am trying to write an AWK command that does exactly the following:
cut --complement -c $IGNORE_RANGE file.txt > tmp
$IGNORE_RANGE can be any range, say 1-5 or 5-10.
I cannot use cut because I am on AIX, and AIX's cut does not support --complement. Is there any way to achieve this with AWK?
Example:
file.txt
abcdef
123456
Output
cut --complement -c 1-2 file.txt > tmp
cdef
3456
cut --complement -c 4-5 file.txt > tmp
abcf
1236
cut --complement -c 1-5 file.txt > tmp
f
6
Try the following, written and tested with the samples shown. The awk variable range holds the positions to drop, in the form start_position-end_position, and you can pass whatever range you need.
awk -v range="4-5" '
BEGIN{
split(range,array,"-")
}
{
print substr($0,1,array[1]-1) substr($0,array[2]+1)
}
' Input_file
Or, to make it easier to follow, try:
awk -v range="4-5" '
BEGIN{
split(range,array,"-")
start=array[1]
end=array[2]
}
{
print substr($0,1,start-1) substr($0,end+1)
}
' Input_file
Explanation: a detailed, commented version of the above.
awk -v range="4-5" ' ##Starting awk program from here creating range variable which has range value of positions which we do not want to print in lines.
BEGIN{ ##Starting BEGIN section of this program from here.
split(range,array,"-") ##Splitting range variable into array with delimiter of - here.
start=array[1] ##Assigning 1st element of array to start variable here.
end=array[2] ##Assigning 2nd element of array to end variable here.
}
{
print substr($0,1,start-1) substr($0,end+1) ##Printing sub-string of current line from 1 to till value of start-1 and then printing from end+1 which basically means will skip that range of characters which OP does not want to print.
}
' Input_file ##Mentioning Input_file name here.
You can do this in awk:
awk -v st=1 -v en=2 '{print substr($0, 1, st-1) substr($0, en+1)}' file
cdef
3456
Or:
awk -v st=4 -v en=5 '{print substr($0, 1, st-1) substr($0, en+1)}' file
abcf
1236
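To stay close to the original cut call, the answers above can be wrapped so that the range arrives as a single shell string such as $IGNORE_RANGE. This is a minimal sketch: cut_complement is a hypothetical name, and it assumes a single START-END range (no lists like 1-2,4-5 and no open-ended ranges).

```shell
# Hypothetical wrapper: drop the characters in positions START-END
# from each line. Assumes one range of the form "START-END".
cut_complement() {
    range=$1
    shift
    awk -v range="$range" '
        BEGIN { split(range, r, "-"); st = r[1]; en = r[2] }
        # keep everything before START, then everything after END
        { print substr($0, 1, st - 1) substr($0, en + 1) }
    ' "$@"
}
```

Usage would then look like cut_complement "$IGNORE_RANGE" file.txt > tmp.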

awk change shell variable

I would like to modify several shell variables within awk:
echo "$LINE_IN" | awk '/pattern1/ {print $0; WRITTEN=1; REC=$REC+1}' >> $FILE1
I tried to put eval, but still does not work:
eval $( echo "$LINE_IN" | awk '/pattern1/ {print $0; WRITTEN=1; REC=$REC+1}' >> $FILE1 )
Any suggestion?
I would like to use k-shell script, thanks!
Count the hits when you are finished:
echo "${LINE_IN}" | grep -E 'pattern1' > "${FILE1}"
REC=$(wc -l < "${FILE1}")
if (( REC > 0 )); then
WRITTEN=1
fi
When you really want to use awk, you must let awk write the results to stdout and parse stdout:
echo "${LINE_IN}" | awk '/echo/ {print $0 > "x3"; WRITTEN=1; REC++}
END { print "WRITTEN=" WRITTEN; print "REC=" REC}'
WRITTEN=1
REC=6
And when you want the variables really set, wrap it:
source <(echo "${LINE_IN}" | awk '/echo/ {print $0 > "x3"; WRITTEN=1; REC++}
END { print "WRITTEN=" WRITTEN; print "REC=" REC}')
Note: Get used to using lowercase variable names like written, file and rec.
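If process substitution is not available in your shell, the same idea can be written with eval and command substitution, which is plain POSIX sh. The sketch below is an illustration only: the sample line, the mktemp output file, and the lowercase names line_in, out_file, written and rec are assumptions made for the example. awk prints only numeric assignments, which keeps the eval input predictable.

```shell
# Sample input and output file, assumed for this illustration.
line_in='a line containing pattern1'
out_file=$(mktemp)

# awk writes matching lines to the file itself and prints shell
# assignments on stdout; eval turns those into real shell variables.
eval "$(printf '%s\n' "$line_in" | awk -v out="$out_file" '
    /pattern1/ { print > out; written = 1; rec++ }
    END { printf "written=%d\nrec=%d\n", written, rec }
')"

echo "written=$written rec=$rec"
```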

Extract file string from left side but following 2nd delimiter from right

Below are the full file names.
qwertyuiop.abcdefgh.1234567890.txt
qwertyuiop.1234567890.txt
I am trying to use
awk -F'.' '{print $1}'
How can I use awk to extract the output below?
qwertyuiop.abcdefgh
qwertyuiop
Edit
I have a list of files in a directory.
I am trying to extract the time, size, owner and filename into separate variables.
For the filenames:
NAME=$(ls -lrt /tmp/qwertyuiop.1234567890.txt | awk -F'/' '{print $3}' | awk -F'.' '{print $1}')
$ echo $NAME
qwertyuiop
$
NAME=$(ls -lrt /tmp/qwertyuiop.abcdefgh.1234567890.txt | awk -F'/' '{print $3}' | awk -F'.' '{print $1}')
$ echo $NAME
qwertyuiop
$
expected
qwertyuiop.abcdefgh
With GNU awk and other versions that allow manipulation of NF
$ awk -F. -v OFS=. '{NF-=2} 1' ip.txt
qwertyuiop.abcdefgh
qwertyuiop
NF-=2 will effectively delete last two fields
1 is an awk idiom to print contents of $0
Note that this assumes there are at least two fields in every line, otherwise you'd get an error
Similar concept with perl, prints empty line if number of fields in the line is less than 3
$ perl -F'\.' -lane 'print join ".", @F[0..$#F-2]' ip.txt
qwertyuiop.abcdefgh
qwertyuiop
With sed, you can preserve lines if number of fields is less than 3
$ sed 's/\.[^.]*\.[^.]*$//' ip.txt
qwertyuiop.abcdefgh
qwertyuiop
EDIT: Taking inspiration from Sundeep's solution, adding the following to the mix as well.
awk 'BEGIN{FS=OFS="."} {$(NF-1)=$NF="";sub(/\.+$/,"")} 1' Input_file
Try the following.
awk -F'.' '{for(i=(NF-1);i<=NF;i++){$i=""};sub(/\.+$/,"")} 1' OFS="." Input_file
OR
awk 'BEGIN{FS=OFS="."} {for(i=(NF-1);i<=NF;i++){$i=""};sub(/\.+$/,"")} 1' Input_file
Explanation: a commented version of the above code.
awk '
BEGIN{ ##Mentioning BEGIN section of awk program here.
FS=OFS="." ##Setting FS and OFS variables for awk to DOT here as per OPs sample Input_file.
} ##Closing BEGIN section here.
{
for(i=(NF-1);i<=NF;i++){ ##Starting for loop from i value from (NF-1) to NF for all lines.
$i="" ##Setting value of respective field to NULL.
} ##Closing for loop block here.
sub(/\.+$/,"") ##Substituting all DOTs till end of line with NULL in current line.
}
1 ##Mentioning 1 here to print edited/non-edited current line here.
' Input_file ##Mentioning Input_file name here.
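For awk versions where assigning to NF is not reliable, the leading fields can be joined by hand instead. This is a sketch rather than any of the answers' own methods: strip_last_two is a hypothetical name, and leaving lines with fewer than three fields unchanged is an assumption that mirrors the sed behaviour mentioned above.

```shell
# Hypothetical helper: print each line without its last two
# "."-separated fields; lines with fewer than three fields pass through.
strip_last_two() {
    awk -F. '{
        out = $1
        # append fields 2 .. NF-2, restoring the dots between them
        for (i = 2; i <= NF - 2; i++) out = out "." $i
        print (NF > 2 ? out : $0)
    }' "$@"
}
```

For the directory listing case, something like NAME=$(basename "$path" | strip_last_two) would apply it to a single file name, where path is whatever variable holds the full path.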

BASH SHELL print columns with specific order

I have this file :
933|Mahinda|Perera|male|1989-12-03|2010-03-17T13:32:10.447+0000|192.248.2.123|Firefox
1129|Carmen|Lepland|female|1984-02-18|2010-02-28T04:39:58.781+0000|81.25.252.111|Internet Explorer
4194|Hồ Chí|Do|male|1988-10-14|2010-03-17T22:46:17.657+0000|103.10.89.118|Internet Explorer
8333|Chen|Wang|female|1980-02-02|2010-03-15T10:21:43.365+0000|1.4.16.148|Internet Explorer
8698|Chen|Liu|female|1982-05-29|2010-02-21T08:44:41.479+0000|14.103.81.196|Firefox
8853|Albin|Monteno|male|1986-04-09|2010-03-19T21:52:36.860+0000|178.209.14.40|Internet Explorer
10027|Ning|Chen|female|1982-12-08|2010-02-22T17:59:59.221+0000|1.2.9.86|Firefox
and with this command
./tool.sh --browsers -f <file>
I want to count the occurrences of each browser and print them in a specific order, for example:
Chrome 143
Firefox 251
Internet Explorer 67
I use this command:
if [ "$1" == "--browsers" -a "$2" == "-f" -a "$4" == "" ]
then
awk -F'|' '{print $8}' $3 | sort | uniq -c | awk ' {print $2 , $3 , $1} '
fi
but it only handles three fields. How can I make it work for more, for example a browser name with four or more words?
Seems like an awk one-liner to count your browsers:
$ awk -F'|' '{a[$8]++} END{for(i in a){printf("%s %d\n",i,a[i])}}' inputfile
Firefox 3
Internet Explorer 4
This increments elements of an array, then at the end of the file steps through the array and prints the totals. If you want the output sorted, you can just pipe it through sort. I don't see a problem with multiple words in a browser name.
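As a small sketch of the sorted variant just described (count_browsers is a hypothetical wrapper name; the field number 8 and the | separator come from the sample file):

```shell
# Hypothetical wrapper: count occurrences of field 8 and sort by name.
count_browsers() {
    awk -F'|' '
        { n[$8]++ }                                   # tally each browser name
        END { for (b in n) printf "%s %d\n", b, n[b] }
    ' "$@" | sort
}
```

Because the whole eighth field is used as the array key, names with any number of words are counted as one unit.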
try this:
awk -F"|" '{print $8}' in | sort | uniq -c | awk '{print $2,$1}'
where in is the input file.
output
[myShell] ➤ awk -F"|" '{print $8}' in | sort | uniq -c | awk '{print $2,$1}'
Firefox 3
Internet 4
Also, for parsing arguments it is better to use getopts, e.g.:
#!/bin/bash
function usage {
echo "usage: ..."
}
while getopts b:o:h opt; do
case $opt in
b)
fileName=$OPTARG
echo "filename[$fileName]"
awk -F"|" '{print $8}' $fileName | sort | uniq -c | awk '{print $2,$1}'
;;
o)
otherargs=$OPTARG
echo "otherargs[$otherargs]"
;;
h)
usage && exit 0
;;
?)
usage && exit 2
;;
esac
done
output
[myShell] ➤ ./arg -b in
filename[in]
Firefox 3
Internet 4
Your final Awk hard-codes two fields; just continue with $4, $5, $6 etc to print more fields. However, this will add a spurious space for each comma.
Better yet, since the first field is fixed width (because that's the output format from uniq -c), you can do print substr($0,8), $1
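If you would rather not depend on the exact width of the count column, which can differ between uniq implementations, one option is to strip the leading count with sub() and print it last. A minimal sketch, with count_by_name as a hypothetical helper that reads one name per line on stdin:

```shell
# Hypothetical helper: "name count" per distinct input line,
# without assuming how wide uniq -c pads its count column.
count_by_name() {
    sort | uniq -c | awk '{
        count = $1                  # remember the count before editing $0
        sub(/^ *[0-9]+ /, "")       # drop leading padding, count and one space
        print $0, count
    }'
}
```

It would slot into the original pipeline as awk -F'|' '{print $8}' "$3" | count_by_name.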
I'd do it in perl:
#!/bin/perl
use strict;
use warnings;
use Data::Dumper;
my %count_of;
while ( <> ) {
chomp;
$count_of{(split /\|/)[7]}++;
}
print Dumper \%count_of;
This can be cut down to a one liner:
perl -F'\|' -lane '$c{$F[7]}++; END{ print "$_ => $c{$_}" for keys %c }'

Why my awk string match not working?

$ echo foooobazbarrrrr |
> gawk 'match($0, /(fo+).+(bar*)/, arr)
> {print arr[1], arr[2] }'
The output of this code should be foooo barrrrr, but on my Ubuntu machine it does not work.
If I write this code instead:
> gawk 'match($0, /(fo+).+(bar*)/)
> {print }'
Then it works. Why is the first version not working?
Your command is slightly different from the example in the GNU manual. The manual's example has the opening { at the very start, so there is no pattern to match, and the newline is required to separate the two awk commands.
$ echo foooobazbarrrrr | gawk '{ match($0, /(fo+).+(bar*)/, arr)
> print arr[1], arr[2] }'
foooo barrrrr
Alternatively, you could use a semi-colon instead of a newline to separate the commands:
$ echo foooobazbarrrrr | gawk '{ match($0, /(fo+).+(bar*)/, arr); print arr[1], arr[2] }'
foooo barrrrr
Your version of the command will work if it’s entered as one line:
$ echo foooobazbarrrrr | gawk 'match($0, /(fo+).+(bar*)/, arr) {print arr[1], arr[2] }'
foooo barrrrr
