awk syntax to invoke function with argument read from a file - unix

I have a function
xyz()
{
x=$1*2
echo x
}
then I want to use it to replace a particular column in a csv file by awk.
File input.csv:
abc,2,something
def,3,something1
I want output like:
abc,4,something
def,6,something1
Command used:
cat input.csv|awk -F, -v v="'"`xyz "$2""'" 'BEGIN {FS=","; OFS=","} {$2=v1; print $0}'
This should read input.csv, call the function xyz with the file's 2nd field as its argument, and store the result back in column 2, but it is not working!
If I put a constant in place of $2 when calling the function, it works:
cat input.csv|awk -F, -v v="'"`xyz "14""'" 'BEGIN {FS=","; OFS=","} {$2=v1; print $0}'
The line above calls the xyz function and puts the result back in the 2nd column of input.csv, but the value is always 14*2, because 14 is a constant.
Please help me to do this.

There's a back-quote missing from your command line, and a UUOC (Useless Use of Cat), and a mismatch between variable v on the command line and v1 in the awk program:
cat input.csv|awk -F, -v v="'"`xyz "$2""'" 'BEGIN {FS=","; OFS=","} {$2=v1; print $0}'
^ UUOC                                 ^ missing back-quote             ^ v1, not v
That should be written using $(…) instead:
awk -F, -v v="'$(xyz "$2")'" 'BEGIN {FS=","; OFS=","} {$2=v; print $0}' input.csv
This leaves you with a problem, though; the function xyz is invoked once by the shell before you start your awk script running, and is never invoked by awk. You simply can't do it that way. However, you can define your function in awk (and on the fly):
awk -F, 'BEGIN { FS = ","; OFS = "," }
function xyz(a) { return a * 2 }
{ $2 = xyz($2); print $0 }' \
input.csv
For your two-line input file, it produces your desired output.
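As a minimal sketch of the approach above, the awk program can be wrapped in a shell function so it is easy to reuse. The wrapper name double_col2 is made up for illustration; the awk function xyz and the sample data come from the question.

```shell
# double_col2 is a hypothetical wrapper name, not part of the original post.
double_col2() {
    awk 'BEGIN { FS = OFS = "," }            # read and write comma-separated fields
         function xyz(a) { return a * 2 }    # the doubling logic lives inside awk
         { $2 = xyz($2); print }             # replace column 2 on every line
    ' "$@"
}

# Sample input from the question, fed on stdin.
printf 'abc,2,something\ndef,3,something1\n' | double_col2
# prints:
# abc,4,something
# def,6,something1
```

Because xyz is an awk function, it runs once per input line, which is what the shell command substitution could not do.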

Related

AWK print unexpected newline or end of string inside shell

I have a shell script which is trying to trim a file from end of the line but I always get some error.
Shell Script:
AWK_EXPRESSION='{if(length>'"$RANGE1"'){ print substr('"$0 "',0, length-'"$RANGE2"'}) } else { print '"$0 "'} }'
for report in ${ACTUAL_TARGET_FOLDER}/* ; do
awk $AWK_EXPRESSION $report > $target_file
done
If I trigger the AWK command, I get unexpected newline or end of string near print.
What am I missing?
Why are you trying to store the awk body in a shell variable? Just use awk and the -v option to pass a shell value into an awk variable:
awk -v range1="$RANGE1" -v range2="$RANGE2" '{
if (length > range1) {
print substr($0, 1, length - range2)
} else {
print
}
}' "$ACTUAL_TARGET_FOLDER"/* > "$target_file"
Add a few newlines to help readability.
Get out of the habit of using ALLCAPS variable names; leave those as reserved by the shell. One day you'll write PATH=something and then wonder why your script is broken.
Unquoted variables are subject to word splitting and glob expansion. Use double quotes for all your variables unless you know what specific side-effect you want to use.
I would recommend writing the AWK program using AWK variables instead of interpolating shell variables into it. You can pass variables into awk on the command line using the -v option.
Also, awk permits using white space to make the program readable, just like other programming languages. Like this:
AWK_EXPRESSION='{
if (length > RANGE1) {
print substr($0, 1, length-RANGE2)
} else {
print
}
}'
for report in "${ACTUAL_TARGET_FOLDER}"/* ; do
awk -v RANGE1="$RANGE1" -v RANGE2="$RANGE2" "$AWK_EXPRESSION" "$report" > "$target_file"
done
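A small self-contained sketch of the same idea, assuming the goal is to drop the last few characters from lines longer than a threshold. The function name trim_tail and the variable names min_len and n are made up for illustration.

```shell
# trim_tail is a hypothetical helper: trim_tail MIN_LEN N [file...]
# Lines longer than MIN_LEN lose their last N characters; others pass through.
trim_tail() {
    min_len=$1
    n=$2
    shift 2
    awk -v min_len="$min_len" -v n="$n" '{
        if (length($0) > min_len) {
            # awk strings are 1-indexed, so start substr at position 1
            print substr($0, 1, length($0) - n)
        } else {
            print
        }
    }' "$@"
}

printf 'abcdefgh\nabc\n' | trim_tail 5 2
# prints:
# abcdef
# abc
```

The awk program is a fixed single-quoted string, so no shell quoting problems can leak into it; only the values travel through -v.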

cut command --complement flag equivalent in AWK

I am new to writing shell scripts.
I am trying to write an AWK command that does exactly what the command below does:
cut --complement -c $IGNORE_RANGE file.txt > tmp
$IGNORE_RANGE can be any range, say 1-5 or 5-10.
I cannot use cut for this because I am on AIX, and cut on AIX does not support --complement. Is there any way to achieve this with an AWK command?
Example:
file.txt
abcdef
123456
Output
cut --complement -c 1-2 file.txt > tmp
cdef
3456
cut --complement -c 4-5 file.txt > tmp
abcf
1236
cut --complement -c 1-5 file.txt > tmp
f
6
Could you please try the following, written and tested with the shown samples. The awk variable range takes the form start_position-end_position and can be set as needed.
awk -v range="4-5" '
BEGIN{
split(range,array,"-")
}
{
print substr($0,1,array[1]-1) substr($0,array[2]+1)
}
' Input_file
Or, to make it easier to follow, try this:
awk -v range="4-5" '
BEGIN{
split(range,array,"-")
start=array[1]
end=array[2]
}
{
print substr($0,1,start-1) substr($0,end+1)
}
' Input_file
Explanation: a commented version of the above.
awk -v range="4-5" ' ##Start awk and set range to the character positions that should be dropped from each line.
BEGIN{ ##Start of the BEGIN section.
split(range,array,"-") ##Split range on - into array.
start=array[1] ##The 1st element of array is the start position.
end=array[2] ##The 2nd element of array is the end position.
}
{
print substr($0,1,start-1) substr($0,end+1) ##Print the line up to start-1, then from end+1 onward, which skips the unwanted range of characters.
}
' Input_file ##Name of the input file.
You can do this in awk:
awk -v st=1 -v en=2 '{print substr($0, 1, st-1) substr($0, en+1)}' file
cdef
3456
Or:
awk -v st=4 -v en=5 '{print substr($0, 1, st-1) substr($0, en+1)}' file
abcf
1236
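To get closer to the original cut call, where the range arrives as a single string such as $IGNORE_RANGE, the awk program can be wrapped in a shell function. This is only a sketch: the name cut_complement is invented here, and it assumes a single START-END range with both ends given (no lists such as 1-2,5-6 and no open-ended ranges).

```shell
# cut_complement is a hypothetical helper: cut_complement START-END [file...]
# Assumes exactly one closed range; prints each line without those characters.
cut_complement() {
    range=$1
    shift
    awk -v range="$range" '
        BEGIN { split(range, r, "-"); start = r[1]; end = r[2] }
        { print substr($0, 1, start - 1) substr($0, end + 1) }
    ' "$@"
}

printf 'abcdef\n123456\n' | cut_complement 4-5
# prints:
# abcf
# 1236
```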

awk — getting minus instead of FILENAME

I am trying to add the filename to the end of each line as a new field. It works except instead of getting the filename I get -.
Base file:
070323111|Hudson
What I want:
070323111|Hudson|20150106.csv
What I get:
070323111|Hudson|-
This is my code:
mv $1 $1.bak
cat $1.bak | awk '{print $0 "|" FILENAME}' > $1
- is how the filename is presented when there is no such information. Since you are doing cat $1.bak | awk ..., awk is not reading from a file but from stdin.
Instead, just do:
awk '...' file
in your case:
awk '{print $0 "|" FILENAME}' $1.bak > $1
From man awk:
FILENAME
The name of the current input file. If no files are specified on the
command line, the value of FILENAME is “-”. However, FILENAME is
undefined inside the BEGIN rule (unless set by getline).
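A minimal sketch of the fix, using a scratch directory so that the printed name is predictable. The helper name append_filename is made up for illustration; the sample line and file name come from the question. Only the file-argument case is shown, since what awk reports for standard input may differ between implementations.

```shell
# append_filename is a hypothetical helper name, not from the original post.
append_filename() {
    # Passing the file as an argument (instead of piping from cat)
    # lets awk know the name and set FILENAME.
    awk '{ print $0 "|" FILENAME }' "$1"
}

# Demo in a temporary directory.
demo_dir=$(mktemp -d)
printf '070323111|Hudson\n' > "$demo_dir/20150106.csv"
(cd "$demo_dir" && append_filename 20150106.csv)
# prints: 070323111|Hudson|20150106.csv
rm -rf "$demo_dir"
```

FILENAME holds the name exactly as it was given on the command line, so passing a path with directories would append the whole path.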

Specifying the order of output using awk

I have the following file called testfile, with the following contents:
{"items":[{"ogit\/_created on":1413388512511,"\/environmentType":"PROD","\/soxRelevant":"true","\/environmentName":"dbVertical"}]}
I used the following awk command to get the values for soxRelevant, environment type and environment name.
cat testfile | tr -d '"' | awk 'BEGIN {RS=","; FS=":"; ORS=",";} /soxRelevant/ {print $2}; /environmentType/ {print $2}; /environmentName/ {print $2};'
The output was as follows
PROD,true,dbVertical,
However I want the soxRelevant output first followed by environment type then environment name, as specified in the awk command:
I want the output to be:
true, PROD, dbVertical
How do I do this?
You could push the elements to an array then print them in the END block:
awk 'BEGIN {RS=ORS=","; FS=":"}
/soxRelevant/ {a[1]=$2}
/environmentType/ {a[2]=$2}
/environmentName/ {a[3]=$2}
END{for(n=1;n<=3;++n)print a[n]}'
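Here is a self-contained sketch of the array approach, run against the sample line from the question. It makes two assumptions that are not in the original answer: the values are purely alphanumeric, so a small clean() function strips everything else (such as the trailing }]} and newline on the last record), and the output is joined with ", " as in the desired output. The wrapper name ordered_fields is invented for illustration.

```shell
# ordered_fields is a hypothetical wrapper; it reads the JSON-like line on stdin.
ordered_fields() {
    tr -d '"' |
    awk 'BEGIN { RS = ","; FS = ":" }
         # Assumption: values are alphanumeric, so drop any other characters.
         function clean(s) { gsub(/[^A-Za-z0-9]/, "", s); return s }
         /soxRelevant/     { a[1] = clean($2) }
         /environmentType/ { a[2] = clean($2) }
         /environmentName/ { a[3] = clean($2) }
         # Print in the slot order chosen above, not in input order.
         END { printf "%s, %s, %s\n", a[1], a[2], a[3] }'
}

printf '%s\n' '{"items":[{"ogit\/_created on":1413388512511,"\/environmentType":"PROD","\/soxRelevant":"true","\/environmentName":"dbVertical"}]}' |
    ordered_fields
# prints: true, PROD, dbVertical
```

The output order is fixed by the array indices, so it no longer depends on where each key appears in the input.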

AWK to print field $2 first, then field $1

Here is the input(sample):
name1#gmail.com|com.emailclient.account
name2#msn.com|com.socialsite.auth.account
I'm trying to achieve this:
Emailclient name1#gmail.com
Socialsite name2#msn.com
If I use AWK like this:
cat foo | awk 'BEGIN{FS="|"} {print $2 " " $1}'
it messes up the output by overlaying field 1 on the top of field 2.
Any tips/suggestions? Thank you.
A couple of general tips (besides the DOS line ending issue):
cat is for concatenating files, it's not the only tool that can read files! If a command doesn't read files then use redirection like command < file.
You can set the field separator with the -F option so instead of:
cat foo | awk 'BEGIN{FS="|"} {print $2 " " $1}'
Try:
awk -F'|' '{print $2" "$1}' foo
This will output:
com.emailclient.account name1#gmail.com
com.socialsite.auth.account name2#msn.com
To get the desired output you could do a variety of things. I'd probably split() the second field:
awk -F'|' '{split($2,a,".");print a[2]" "$1}' file
emailclient name1#gmail.com
socialsite name2#msn.com
Finally, converting the first character to uppercase is a bit of a pain in awk, as there is no built-in ucfirst() function:
awk -F'|' '{split($2,a,".");print toupper(substr(a[2],1,1)) substr(a[2],2),$1}' file
Emailclient name1#gmail.com
Socialsite name2#msn.com
If you want something more concise (at the cost of an extra process, and relying on the \U extension of GNU sed), you could do:
awk -F'|' '{split($2,a,".");print a[2]" "$1}' file | sed 's/^./\U&/'
Emailclient name1#gmail.com
Socialsite name2#msn.com
Use a dot or a pipe as the field separator:
awk -v FS='[.|]' '{
printf "%s%s %s.%s\n", toupper(substr($4,1,1)), substr($4,2), $1, $2
}' << END
name1#gmail.com|com.emailclient.account
name2#msn.com|com.socialsite.auth.account
END
gives:
Emailclient name1#gmail.com
Socialsite name2#msn.com
Maybe your file has CRLF line terminators, so that every line ends with \r\n.
awk then sees $2 as $2\r. The \r moves the cursor back to the start of the line.
So print $2 " " $1 prints $2 first, returns to the start of the line, and then prints $1 over it. That is why field 2 is overlaid by field 1.
The awk is OK. I'm guessing the file comes from a Windows system and has a CR (^M, ASCII 0x0d) at the end of each line.
This will cause the cursor to go to the start of the line after $2.
Use dos2unix or vi with :se ff=unix to get rid of the CRs.
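If converting the file first is not convenient, one alternative (a different technique from the dos2unix and vi suggestions above) is to strip the trailing CR inside awk before using the fields. This is a sketch only; the function name swap_fields is made up, and the rest reuses the split() and toupper() approach from the earlier answer.

```shell
# swap_fields is a hypothetical helper name, not from the original answers.
swap_fields() {
    awk -F'|' '{
        sub(/\r$/, "")              # drop a trailing CR; awk re-splits the fields
        split($2, a, ".")           # a[2] is the middle part, e.g. "emailclient"
        print toupper(substr(a[2], 1, 1)) substr(a[2], 2), $1
    }' "$@"
}

# Sample input from the question, with DOS (CRLF) line endings.
printf 'name1#gmail.com|com.emailclient.account\r\nname2#msn.com|com.socialsite.auth.account\r\n' |
    swap_fields
# prints:
# Emailclient name1#gmail.com
# Socialsite name2#msn.com
```

Lines without a CR are left untouched by the sub() call, so the same function works on Unix-style files too.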
