I have a scenario where I am trying to add three fields in a line that I receive in my input file. What I have written doesn't feel like best Unix practice, so I'd like suggestions on how to write it better. Sample lines are attached below.
My questions are:
Is it possible to add all three fields using one awk command?
The input file may not contain some of these fields (depending on the scenario). Can awk handle this, or should I add a check?
Sometimes the values have a trailing "~" at the end; how do I consider only the numeric part?
Input File1 Sample Line
CLP*4304096*20181231*BH*0*AH>444*158330.97~*FB*0*SS>02*0*SS>03*0*J1*0~
Input File2 sample line
CLP*4304096*20181231*BH*0*AH>444*158330.97*FB*0
Script I have written
clp=$(awk -F'[*]' '/CLP/{print $7}' $file)
ss02=$(awk -F'[*]' '/CLP/{print $11}' $file)
ss03=$(awk -F'[*]' '/CLP/{print $13}' $file)
clpsum=$(echo "$clp + $ss02 + $ss03" | bc)
I know it's not the best way. Please let me know how I can handle both the file 1 scenario (where the value is 158330.97~) and the file 2 scenario.
Thanks!
In one awk command:
awk 'BEGIN{FS="*"}{var1=$7;var2=$11;var3=$13; var4=var1+var2+var3; printf("var4 = %.2f\n",var4)}' file.txt
This works as long as the values are always in the same fields. If files can come in with the numbers in different fields, you might want someone with a more robust answer. Hope this helps in any way.
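A minimal sketch of the same idea, assuming the fields are always 7, 11, and 13 as in the sample lines. Awk's numeric coercion (`+ 0`) stops at the first non-numeric character, so a trailing `~` (as in `158330.97~`) is ignored, and a missing/empty field simply counts as 0, which covers both input files:

```shell
# Sum fields 7, 11, 13 of the CLP line in one awk call.
# "+ 0" coerces to a number: "158330.97~" -> 158330.97, "" -> 0.
printf '%s\n' 'CLP*4304096*20181231*BH*0*AH>444*158330.97~*FB*0*SS>02*0*SS>03*0*J1*0~' |
awk -F'*' '/^CLP/ { printf "%.2f\n", ($7 + 0) + ($11 + 0) + ($13 + 0) }'
```

Running the second sample line (which stops after field 9) through the same command gives the same result, since the absent $11 and $13 evaluate to 0.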
Related
I have a huge table I want to extract information from. Firstly, I want to extract a certain line based on a pattern -> I've done that successfully with grep. However this line has loads of columns and I'm interested only in a couple of them that have a certain pattern in them (partial match - beginning of the string). Is it possible to extract only the columns and the number of the column (the nth column) for some partial matches? Hope I was clear enough.
Languages: Preferably in bash but I can also work in R, alternatively I'm open to suggestions if you think another language can be more helpful.
Thanks!
Awk is perfect for stuff like this. To help you write a script I think we need more details, but I'm guessing you'll want to use awk's print feature. To print the nth column (say n=2) of a file "your_file", do:
awk -v n=2 '{print $n}' your_file
In solving your problem you may also want to loop over all N columns which you can do via:
for i in $(seq 1 "$N")
do
    awk -v col="$i" '{print $col}' your_file
done
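For the actual question (print only the columns that partially match, plus their column numbers), a single awk loop over the fields is simpler than calling awk once per column. A sketch, where the prefix `chr` and the sample line are made up for illustration:

```shell
# Print the index and value of every column whose value starts with "chr".
# "chr" and the input line are hypothetical examples.
printf '%s\n' 'id chr1:100 score chr2:250' |
awk '{ for (i = 1; i <= NF; i++) if ($i ~ /^chr/) print i, $i }'
```

The `^` anchor restricts the match to the beginning of the field, which is what "partial match - beginning of the string" asks for.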
I am trying to read a parameter from a URL. I am able to read it for a single line, but I don't know how to loop in awk. Can someone help?
I have a file with 1000+ entries like:
http://projectreporter.nih.gov/project_info_details.cfm?aid=7714687&icde=0
http://projectreporter.nih.gov/project_info_description.cfm?aid=7896503&icde=0
http://projectreporter.nih.gov/project_info_details.cfm?aid=7895320&icde=0
http://projectreporter.nih.gov/project_info_details.cfm?aid=2675186&icde=9195637
I am trying to retrieve only "aid=xxxxxxx". I used the following command, but I only get the "aid" from the last line:
awk '{match($0,"aid=([^ &]+)",a)}END{print a[1]}' file1.txt > outputFile.txt
How do I do the same in a loop so I can get all the occurrences?
Any help would be appreciated.
This should work with a little fine-tuning of your attempted code:
awk 'match($0,/aid[^&]*/){print substr($0,RSTART,RLENGTH)}' Input_file
In case a single line can have multiple occurrences of aid and you want to print all of them, try the following:
awk '{
  while (match($0, /aid[^&]*/)) {
    print substr($0, RSTART, RLENGTH)
    $0 = substr($0, RSTART + RLENGTH)
  }
}' Input_file
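For the one-occurrence-per-line case, the first command run against two of the sample URLs above looks like this:

```shell
# Each input line yields its aid=... value (match stops at the "&").
printf '%s\n' \
  'http://projectreporter.nih.gov/project_info_details.cfm?aid=7714687&icde=0' \
  'http://projectreporter.nih.gov/project_info_description.cfm?aid=7896503&icde=0' |
awk 'match($0, /aid[^&]*/) { print substr($0, RSTART, RLENGTH) }'
```

Note this plain `match(string, regex)` form is POSIX awk; the three-argument `match($0, regex, array)` used in your attempt is a gawk extension, and putting it in an `END` block is why only the last line's value survived.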
I am trying to separate fields using awk but have met some problems when I have multiple separators each of which appears multiple times.
For example, if I type
echo "aa#######=#3413.5" | awk -F "#+|#+|=" '{print $1","$2","$3","$4","$5}'
then the results are:
aa,,,,3413.5
but what I want is
aa,3413.5
I have searched online for a long time, but other questions are related to either multiple separators appearing one time for each, i.e. "#|#", or a single separator appearing multiple times, i.e. "#+".
Anyone has ideas about how to separate fields in my case?
Thanks a lot!
awk -F '[##=]+'
seems to work.
awk -F "#+|#+|="
this one treats each alternative as a whole separator on its own: a run of "#" or a single "=". Adjacent separators of different kinds therefore leave empty fields between them, whereas the bracket expression [##=]+ treats any run of those characters as a single separator.
See the following URL for details:
http://www.math.utah.edu/docs/info/gawk_5.html#SEC28
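A quick demonstration of the fix on the sample input (the bracket class is written `[#=]+` here, which is equivalent once duplicates are removed):

```shell
# A bracket class treats any run of the listed characters as ONE separator,
# so no empty fields appear between the "#" run and the "=".
printf '%s\n' 'aa#######=#3413.5' |
awk -F '[#=]+' '{ print $1 "," $2 }'
```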
I've been going through an online UNIX course and have come across this question which I'm stuck on. Would appreciate any help!
You are provided with a set of files each one of which contains personal details about an individual. Each file is laid out in the following format, with one file per individual:
name:Niko Tanaka
age:41
occupation:Doctor
I know the answer has to be in the form:
n=$(awk -F: ' / /{print }' filename)
n=$(awk -F: '/name/{print $2}' infile)
Whatever is inside the / / is a regular expression. In this case you just want to match the line that contains 'name'.
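Putting it together with the sample file contents (the file is fed via a pipe here just so the example is self-contained):

```shell
# Extract the value after "name:" from the key:value layout above.
n=$(printf 'name:Niko Tanaka\nage:41\noccupation:Doctor\n' |
    awk -F: '/^name/ { print $2 }')
echo "$n"
```

Anchoring the pattern as `/^name/` avoids accidentally matching a 'name' substring elsewhere in the file (e.g. in an occupation value).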
I have two files ...
file1:
002009092312291100098420090922111
010555101070002956200453T+00001190.81+00001295.920010.87P
010555101070002956200449J+00003128.85+00003693.90+00003128
010555101070002956200176H+00000281.14+00000300.32+00000281
file2:
002009092410521000098420090709111
010560458520002547500432M+00001822.88+00001592.96+00001822
010560458520002547500432D+00000106.68+00000114.77+00000106
In both files, in every record starting with 01, the key is the string from the 3rd to the 25th character, i.e. up to the alphabetic character.
Based on this key I have to compare the two files: if a record in file2 matches one in file1, I have to replace that record in file1; otherwise I append it.
Well, this is a fairly unspecific (and basic) programming question. We'll be better able to help you if you explain exactly what you did and where you got stuck.
Also, it looks a bit like homework, and people are wary of giving too much help on homework problems, as it might look like cheating.
To get you started:
I'd recommend Perl to solve this, but awk or another scripting language will also do. I'd recommend against sh/bash, as they are weak on text manipulation; also combining grep et al will become rather cumbersome.
First write a Perl program that filters records starting with 01. Then extract the key and put it into a hash (a Perl structure). Then output a new, combined file as required.
Using awk, get characters 3-25 of the records starting with 01 by doing something like
awk '/^01/{print}' file_name | cut -c 3-25
Do that for both files, collect the matching lines into two buffers, and compare the buffers with a for loop in a shell script. Whenever a line in the second buffer matches one in the first, grep for that line in the first file and replace it with the line from the second. I think you need to work the logic out a bit.
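The whole replace-or-append step can also be done in a single awk pass over both files, which avoids the grep-per-line loop. A sketch under the stated key rule (characters 3-25 of "01" records); the tiny sample files written here are illustrative, not the real data:

```shell
# Build tiny illustrative sample files in the current directory.
cat > file1 <<'EOF'
002009092312291100098420090922111
010555101070002956200453T+00001190.81
EOF
cat > file2 <<'EOF'
010555101070002956200453T+00001822.88
010560458520002547500432D+00000106.68
EOF

# Pass 1 (file2): remember each "01" record by its key (chars 3-25).
# Pass 2 (file1): print the file2 version if the key matches, else the
# original line; leftover file2 records are appended at the end.
awk '
  NR == FNR { if (/^01/) repl[substr($0, 3, 23)] = $0; next }
  /^01/ {
    k = substr($0, 3, 23)
    if (k in repl) { print repl[k]; delete repl[k]; next }
  }
  { print }
  END { for (k in repl) print repl[k] }
' file2 file1
```

The `NR == FNR` idiom is true only while the first named file (file2) is being read, which is what lets one script handle both passes.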