dealing with % symbol in a file using awk command - unix

I am using this command
awk '{printf $1; for (i=2;i<=10;i++) {printf OFS $i} printf "\n"}' FS='|' OFS='|' file.txt >>new_file.txt
It works fine as long as no record in the file contains a % symbol.
My requirement:
Input:
A|B|C|D|E
A|B
Output:
A|B|C|D|E|||||
A|B||||||||
Sample value which is giving the error: '20% OFF ONLINE PRICE MATCH'
How do I handle this issue?
Error - awk: There are not enough parameters in printf statement |20% OFF ONLINE PRICE MATCH.

The first argument to printf is actually the format string, which only gets printed as-is if it contains no format specifiers (%x) and it is the only argument. Unless you are in control of the string, you should always give printf an explicit format, exactly for this reason: to guard against format specifiers (expected or otherwise) occurring in the supplied strings. In your case, change the printf statements to
printf "%s", $1
and
printf "%s", OFS $i
and you should be fine.
By way of illustration:
$ echo '20% OFF ONLINE PRICE MATCH' | awk -F\| '{ printf $1 }'
awk: weird printf conversion % O
input record number 1, file
source line number 1
awk: not enough args in printf(20% OFF ONLINE PRICE MATCH)
input record number 1, file
source line number 1
$ echo '20% OFF ONLINE PRICE MATCH' | awk -F\| '{ printf "%s\n", $1 }'
20% OFF ONLINE PRICE MATCH
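Putting the two changes back into the original one-liner, the fixed command would look like this:
awk '{printf "%s", $1; for (i=2;i<=10;i++) printf "%s", OFS $i; printf "\n"}' FS='|' OFS='|' file.txt >> new_file.txt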

Related

Problems with filtering: awk: syntax error at source line 1

I am trying to filter a .tsv file by selecting relevant rows. I had done this a few days ago with the same file and had no problems. However, today, as I was filtering the file, I got the error below:
awk: syntax error at source line 1
context is
>>> /Users/rbs/Desktop/results.e <<< ntries.tsv
awk: bailing out at source line 1
Here is the code I typed into my terminal:
awk -F '{ if ($5 == 20004 || $5 == 41200) print $0; }' ~/Desktop/results.entries.tsv > ~/filtered2.tsv
Other details:
I am using Mac OSX
I apologise if the question is unclear - I am a beginner!
You did not include the separator argument for the -F option, so awk took your program text as the separator and the file name as the program (which is why the error context shows the path). Since the file is tab-separated (.tsv), pass a tab:
awk -F '\t' '{ if ($5 == 20004 || $5 == 41200) print $0; }' ~/Desktop/results.entries.tsv > ~/filtered2.tsv
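Since a bare pattern in awk prints matching lines by default, the same filter could also be written without the if, for example:
awk -F '\t' '$5 == 20004 || $5 == 41200' ~/Desktop/results.entries.tsv > ~/filtered2.tsv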

awk — getting minus instead of FILENAME

I am trying to add the filename to the end of each line as a new field. It works, except that instead of the filename I get -.
Base file:
070323111|Hudson
What I want:
070323111|Hudson|20150106.csv
What I get:
070323111|Hudson|-
This is my code:
mv $1 $1.bak
cat $1.bak | awk '{print $0 "|" FILENAME}' > $1
- is how awk presents the filename when there is no such information. Since you are doing cat $1.bak | awk ..., awk is not reading from a file but from stdin.
Instead, just do:
awk '...' file
in your case:
awk '{print $0 "|" FILENAME}' $1.bak > $1
From man awk:
FILENAME
The name of the current input file. If no files are specified on the
command line, the value of FILENAME is “-”. However, FILENAME is
undefined inside the BEGIN rule (unless set by getline).
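Note that with this fix FILENAME will be the .bak name (e.g. 20150106.csv.bak). If you want the original name in the output instead, one option is to pass it in as an awk variable rather than relying on FILENAME:
awk -v name="$1" '{print $0 "|" name}' "$1.bak" > "$1"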

awk syntax to invoke function with argument read from a file

I have a function
xyz()
{
    x=$(( $1 * 2 ))
    echo "$x"
}
and then I want to use it with awk to replace a particular column in a CSV file.
File input.csv:
abc,2,something
def,3,something1
I want output like:
abc,4,something
def,6,something1
Command used:
cat input.csv|awk -F, -v v="'"`xyz "$2""'" 'BEGIN {FS=","; OFS=","} {$2=v1; print $0}'
Open input.csv, call the function xyz with the file's 2nd field as its argument, and store the result back into position 2 of the file. But it is not working!
If I put a constant in place of $2 when calling the function, it works:
cat input.csv|awk -F, -v v="'"`xyz "14""'" 'BEGIN {FS=","; OFS=","} {$2=v1; print $0}'
This line works, calling the xyz function and putting the result back into the 2nd column of input.csv, but only as 14*2, since 14 is taken as a constant. Please help me to do this.
There's a back-quote missing from your command line, and a UUOC (Useless Use of Cat), and a mismatch between variable v on the command line and v1 in the awk program:
cat input.csv|awk -F, -v v="'"`xyz "$2""'" 'BEGIN {FS=","; OFS=","} {$2=v1; print $0}'
(the cat, the unclosed back-quote around xyz "$2", and v versus v1)
That should be written using $(…) instead:
awk -F, -v v="'$(xyz "$2")'" 'BEGIN {FS=","; OFS=","} {$2=v; print $0}' input.csv
This leaves you with a problem, though: the function xyz is invoked once by the shell before your awk script starts running, and is never invoked by awk (and the $2 it is given is the shell's positional parameter, not awk's second field). You simply can't do it that way. However, you can define your function in awk (and on the fly):
awk -F, 'BEGIN { FS = ","; OFS = "," }
function xyz(a) { return a * 2 }
{ $2 = xyz($2); print $0 }' \
input.csv
For your two-line input file, it produces your desired output.
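If you really did need to call an external shell function per line, a minimal (and much slower) sketch, assuming the corrected xyz above, would be to loop in the shell rather than in awk:
while IFS=, read -r c1 c2 rest; do
    # c1 and c2 are the first two columns; rest keeps anything after the second comma
    printf '%s,%s,%s\n' "$c1" "$(xyz "$c2")" "$rest"
done < input.csv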

How to connect words in a text file

I have a file in following format:
B: that
I: White
I: House
B: the
I: emergency
I: rooms
B: trauma
I: centers
What I need to do is read the file line by line from the top: if a line begins with B:, remove the B:.
If it begins with I:, remove the I: and join it to the previous line (which has been processed by the same rule).
Expected Output:
that White House
the emergency rooms
trauma centers
What I tried:
while read line
do
    string=$line
    echo $string | grep "^B:" 1>/dev/null
    if [ $? -eq 0 ]               # if the line starts with "B: "
    then
        newstring=${string:4}     # cut the first 4 characters, including "B:" and the space
    fi
    echo $string | grep "^I:" 1>/dev/null
    if [ $? -eq 0 ]               # if the line starts with "I: "
    then
        newstring=${string:4}     # cut the first 4 characters, including "I:" and the space
    fi
done < file.txt
What I don't know is how to write the result back to the file and how to join a line to the previously processed one.
Using awk, print the second field of the I: and B: records. The variable first is used to control the newline output.
/B:/ matches the B: pattern, which marks the start of a group. If the group is NOT the first, a newline is printed (terminating the previous group), then the data $2 is printed.
If the pattern found is I:, the data $2 (the second field, which follows I:) is printed.
awk 'BEGIN{first=1}
/B:/ { if (first) first=0; else print ""; printf("%s ", $2); }
/I:/ { printf("%s ", $2) }
END {print ""}' filename
awk -F":" '{a[NR]=$0}
/^ B:/{print line;line=$2}
/^ I:/{line=line" "$2}
END{
if(a[NR]!~/^B/)
{print line}
}' Your_file
awk '/^B/ {printf "\n%s",$2} /^I/ {printf " %s",$2}' file
that White House
the emergency rooms
trauma centers
Shorten it some
awk '/./ {printf /^B/?"\n%s":" %s",$2}' file
There is an interesting solution using awk auto-split on RS patterns. Note that this is a bit sensitive to variations in the input format:
<infile awk 1 RS='(^|\n)B: ' | awk 1 RS='\n+I: ' ORS=' ' | grep -v '^ *$'
Output:
that White House
the emergency rooms
trauma centers
This works at least with GNU awk and Mike's awk (mawk).
This might work for you (GNU sed):
sed -r ':a;$!N;s/\n$//;s/\n\s*I://;ta;s/B://g;s/^\s*//;P;D' file
or:
sed -e ':a' -e '$!N' -e 's/\n$//' -e 's/\n\s*I://' -e 'ta' -e 's/B://g' -e 's/^\s*//' -e 'P' -e 'D' file

AWK to print field $2 first, then field $1

Here is the input (sample):
name1#gmail.com|com.emailclient.account
name2#msn.com|com.socialsite.auth.account
I'm trying to achieve this:
Emailclient name1#gmail.com
Socialsite name2#msn.com
If I use AWK like this:
cat foo | awk 'BEGIN{FS="|"} {print $2 " " $1}'
it messes up the output by overlaying field 1 on top of field 2.
Any tips/suggestions? Thank you.
A couple of general tips (besides the DOS line ending issue):
cat is for concatenating files; it's not the only tool that can read files! If a command doesn't take file names as arguments, use redirection like command < file instead.
You can set the field separator with the -F option so instead of:
cat foo | awk 'BEGIN{FS="|"} {print $2 " " $1}'
Try:
awk -F'|' '{print $2" "$1}' foo
This will output:
com.emailclient.account name1#gmail.com
com.socialsite.auth.account name2#msn.com
To get the desired output you could do a variety of things. I'd probably split() the second field:
awk -F'|' '{split($2,a,".");print a[2]" "$1}' file
emailclient name1#gmail.com
socialsite name2#msn.com
Finally, converting the first character to uppercase is a bit of a pain in awk, as you don't have a nice built-in ucfirst() function:
awk -F'|' '{split($2,a,".");print toupper(substr(a[2],1,1)) substr(a[2],2),$1}' file
Emailclient name1#gmail.com
Socialsite name2#msn.com
If you want something more concise (at the cost of an extra sub-process) you could do:
awk -F'|' '{split($2,a,".");print a[2]" "$1}' file | sed 's/^./\U&/'
Emailclient name1#gmail.com
Socialsite name2#msn.com
Use a dot or a pipe as the field separator:
awk -v FS='[.|]' '{
printf "%s%s %s.%s\n", toupper(substr($4,1,1)), substr($4,2), $1, $2
}' << END
name1#gmail.com|com.emailclient.account
name2#msn.com|com.socialsite.auth.account
END
gives:
Emailclient name1#gmail.com
Socialsite name2#msn.com
Maybe your file has CRLF line terminators, i.e. every line ends with \r\n.
awk then sees $2 as actually being $2\r. When printed, the \r moves the cursor back to the start of the line.
So print $2 " " $1 prints $2, returns to the start of the line, then prints $1, which is why field 2 is overlaid by field 1.
The awk is OK. I'm guessing the file is from a Windows system and has a CR (^M, ASCII 0x0d) at the end of each line.
This causes the cursor to go to the start of the line after $2 is printed.
Use dos2unix, or vi with :se ff=unix, to get rid of the CRs.
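If you'd rather handle it inside awk without converting the file, a small sketch is to strip the trailing CR before printing:
awk -F'|' '{sub(/\r$/, ""); print $2, $1}' foo
The sub() rewrites $0, so the fields are re-split without the stray carriage return.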
