How to trim duplicated values in a string in KQL? - azure-data-explorer

I have the following data sample:
let data = datatable(Name:string, Value:string)
[
"Device_1", "60.12.12 %",
"Device_2", "40.12.12 %",
"Device_3", "50.12.12 %",
"Device_4", "48.33.33 %"
];
As you can see in the Value column, the digits after the decimal point are duplicated.
How can we trim the second decimal point and everything after it?
Expected result:

As you wish :-)
P.S.
Added some use-cases to test it
let data = datatable(Name:string, Value:string)
[
"Device_1" ,"60.12.12 %"
,"Device_2" ,"40.12.12 %"
,"Device_3" ,"50.12.12 %"
,"Device_4" ,"48.33.33 %"
,"Device_5" ,"48.11.22.33 %"
,"Device_6" ,"48.11.22 %"
,"Device_7" ,"48.11 %"
,"Device_8" ,"48 %"
,"Device_9" ,".11 %"
,"Device_10" ,".11"
,"Device_11" ,".11.22"
];
data
| extend replace_regex(Value, @"(\.\d+).*", @"\1 %")
Name        Value           Column1
Device_1    60.12.12 %      60.12 %
Device_2    40.12.12 %      40.12 %
Device_3    50.12.12 %      50.12 %
Device_4    48.33.33 %      48.33 %
Device_5    48.11.22.33 %   48.11 %
Device_6    48.11.22 %      48.11 %
Device_7    48.11 %         48.11 %
Device_8    48 %            48 %
Device_9    .11 %           .11 %
Device_10   .11             .11 %
Device_11   .11.22          .11 %
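To see what the pattern does outside of Kusto, the same substitution can be sketched in Python. This is only an illustration: the helper name trim_value is made up here, and Python's re module stands in for the regex engine behind replace_regex, so edge-case behaviour may differ. The pattern keeps the first ".digits" group, drops everything after it, and re-appends " %".

```python
import re

# Same pattern as in the KQL query: capture the first ".digits" group,
# then swallow the rest of the string.
PATTERN = re.compile(r"(\.\d+).*")


def trim_value(value: str) -> str:
    """Keep everything up to the first decimal group and append ' %'.

    Hypothetical helper for illustration only. Values without a decimal
    point do not match the pattern and are returned unchanged.
    """
    return PATTERN.sub(r"\1 %", value)


if __name__ == "__main__":
    for sample in ["60.12.12 %", "48.11.22.33 %", "48 %", ".11", ".11.22"]:
        print(f"{sample!r:18} -> {trim_value(sample)!r}")
```

Note that a value such as "48 %" passes through untouched only because it contains no decimal point at all.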

let data = datatable(Name:string, Value:string)
[
"Device_1", "60.12.12 %",
"Device_2", "40.12.12 %",
"Device_3", "50.12.12 %",
"Device_4", "48.33.33 %"
];
data
| parse kind=regex Value with Percent:real
Name        Value         Percent
Device_1    60.12.12 %    60.12
Device_2    40.12.12 %    40.12
Device_3    50.12.12 %    50.12
Device_4    48.33.33 %    48.33


gnuplot/ awk: plotting bar graph for filtered data

I am using gnuplot combined with AWK to plot 2D bar plot from the following input data:
#Acceptor DonorH Donor Frames Frac AvgDist AvgAng
lig_608#O3 HIE_163#HE2 HIE_163#NE2 498 0.5304 2.8317 153.0580
lig_608#O GLU_166#H GLU_166#N 476 0.5069 2.8858 161.7174
lig_608#O1 HIE_41#HE2 HIE_41#NE2 450 0.4792 2.8484 158.5193
THR_26#O lig_608#H9 lig_608#N1 399 0.4249 2.8312 149.9578
lig_608#O2 THR_26#H THR_26#N 312 0.3323 2.9029 164.8033
lig_608#O1 ASN_142#HD21 ASN_142#ND2 14 0.0149 2.8445 158.4224
lig_608#O1 GLN_189#HE22 GLN_189#NE2 2 0.0021 2.8562 149.7421
lig_608#O1 GLN_189#HE21 GLN_189#NE2 1 0.0011 2.7285 158.4377
lig_608#O3 GLY_143#H GLY_143#N 1 0.0011 2.7421 147.8213
My script takes the data from the 3rd and 5th columns, keeps only the lines where the value in the 5th column is > 0.05, and produces a bar graph:
cat <<EOS | gnuplot > graph.png
set term pngcairo size 800,600
set xtics noenhanced
set xlabel "Fraction, %"
set ylabel "H-bond donor, residue"
set key off
set style fill solid 0.5
set boxwidth 0.9
plot "<awk 'NR == 1 || \$5 > 0.05' $file" using 0:5:xtic(3) with boxes
EOS
Edited: within my bash workflow the script looks like this:
for file in "${output}"/${target}*.log ; do
file_name3=$(basename "$file")
file_name2="${file_name3/.log/}"
file_name="${file_name2/${target}_/}"
echo "vizualisation with Gnuplot!"
cat <<EOS | gnuplot > ${output}/${file_name2}.png
set title "$file_name" font "Century,22" textcolor "#b8860b"
set tics font "Helvetica,12"
#set term pngcairo size 1280,960
set term pngcairo size 800,600
set yrange [0:1]
set xtics noenhanced
set xlabel "Fraction, %"
set ylabel "H-bond donor, residue"
set key off
set style fill solid 0.5
set boxwidth 0.9
plot "<awk 'NR == 1 || \$5 > 0.05' $file" using 0:5:xtic(3) with boxes
EOS
done
This is the image produced from the following filtered data:
HIE_163#NE2 0.5304
GLU_166#N 0.5069
HIE_41#NE2 0.4792
lig_608#N1 0.4249
THR_26#N 0.3323
I need to modify the awk expression embedded in the gnuplot command
that selects the two columns from the whole data.
Instead of always taking the label from the third column (Donor), I
need to take it from either the first (#Acceptor) or the third (Donor)
column, depending on the lig_* pattern. E.g. if the value in the
#Acceptor column starts with lig, I need to take the label from the
third column (Donor) of the same line, and vice versa. (The lig*
pattern is present either in the 1st column or in the 3rd, but never in both.)
Taking my example, the filtered data with the updated selection should become:
HIE_163#NE2 0.5304 # the first index from the third column
GLU_166#N 0.5069 # the first index from the third column
HIE_41#NE2 0.4792 # the first index from the third column
THR_26#O 0.4249 # !!!! the first index from the first column !!
THR_26#N 0.3323 # the first index from the third column
No need for awk, you can do it all in gnuplot (hence platform-independent).
This would be my first attempt. You will filter by plotting the unwanted data at x-value of NaN, however, this will give some warnings: warning: add_tic_user: list sort error which you can ignore.
But this can probably be avoided by some changes.
Edit: the original script would have failed when the first line had a value <0.05 in column 5. Here are two versions which don't have this problem. There will also be no warnings. Maybe these attempts can be further simplified.
For creating an output file simply add this to your script: (check help output)
set term pngcairo size 800,600
set output "myOutputFile.png"
...<your script>...
set output
Data: SO73961783.dat
#Acceptor DonorH Donor Frames Frac AvgDist AvgAng
lig_608#O3 HIE_163#HE2 HIE_163#NE2 498 0.5304 2.8317 153.0580
lig_608#O GLU_166#H GLU_166#N 476 0.5069 2.8858 161.7174
lig_608#O1 HIE_41#HE2 HIE_41#NE2 450 0.4792 2.8484 158.5193
THR_26#O lig_608#H9 lig_608#N1 399 0.4249 2.8312 149.9578
lig_608#O2 THR_26#H THR_26#N 312 0.3323 2.9029 164.8033
lig_608#O1 ASN_142#HD21 ASN_142#ND2 14 0.0149 2.8445 158.4224
lig_608#O1 GLN_189#HE22 GLN_189#NE2 2 0.0021 2.8562 149.7421
lig_608#O1 GLN_189#HE21 GLN_189#NE2 1 0.0011 2.7285 158.4377
lig_608#O3 GLY_143#H GLY_143#N 1 0.0011 2.7421 147.8213
Script 1:
Filter your data and write it into a new table. If condition >0.05 is not met, write an empty line. Probably the easiest to understand and gives the shortest final plot command.
### conditional xtic labels
reset session
set termoption noenhanced
FILE = "SO73961783.dat"
set xlabel "Fraction, %"
set ylabel "H-bond donor, residue"
set key off
set style fill solid 0.5
set boxwidth 0.9
set grid y
set xrange[-1:5]
set table $Filtered
myTic(col1,col2) = strcol(col1)[1:3] eq 'lig' ? strcol(col2) : strcol(col1)
plot FILE u ((y0=column(5))>0.05 ? sprintf("%g %s",y0,myTic(1,3)) : '') w table
unset table
plot $Filtered u 0:1:xtic(2) w boxes
### end of script
Script 2:
Without an extra table, but with a more complex plot command. Increase the x-position x0 if a value >0.05 is found (except for the first time), and keep the previous position and label (i.e. overwrite it) if a value <=0.05 is found.
### conditional xtic labels
reset session
set termoption noenhanced
FILE = "SO73961783.dat"
set xlabel "Fraction, %"
set ylabel "H-bond donor, residue"
set key off
set style fill solid 0.5
set boxwidth 0.9
set grid y
set xrange[-1:5]
myTic(col1,col2) = strcol(col1)[1:3] eq 'lig' ? strcol(col2) : strcol(col1)
plot x0=c=(t0='',0) FILE u ((y0=column(5))>0.05 ? (c==0 ? (c=1,t0=myTic(1,3)) : (x0=x0+1,t0=myTic(1,3))) : (y0=NaN),x0):(y0):xtic(t0) w boxes
### end of script
Result:
As you potentially want to do more complicated processing with awk, I would
suggest an alternative way of mixing awk and gnuplot.
Gnuplot supports including inline data in its script files, so you could have awk generate the inline data while supplying the plot-configuration with bash, all done in a sub-shell. For example:
(
printf '$data << EOD\n'
awk 'NR>1 && $5>0.05 { print $1 ~ /^lig/ ? $3 : $1, $5 }' infile
cat << EOS
EOD
set term pngcairo size 1280,960 font ",20"
set output "output.png"
set xtics noenhanced
set ytics 0.02
set grid y
set key off
set style fill solid 0.5
set boxwidth 0.9
set xlabel "Fraction, %"
set ylabel "H-bond donor, residue"
plot "\$data" using 0:2:xtic(1) with boxes, "" using 0:2:2 with labels offset 0,1
EOS
)
This would produce the following gnuplot script:
$data << EOD
HIE_163#NE2 0.5304
GLU_166#N 0.5069
HIE_41#NE2 0.4792
THR_26#O 0.4249
THR_26#N 0.3323
EOD
set term pngcairo size 1280,960 font ",20"
set output "output.png"
set xtics noenhanced
set ytics 0.02
set grid y
set key off
set style fill solid 0.5
set boxwidth 0.9
set xlabel "Fraction, %"
set ylabel "H-bond donor, residue"
plot "$data" using 0:2:xtic(1) with boxes, "" using 0:2:2 with labels offset 0,1
Pipe it to Gnuplot, i.e. (...) | gnuplot and get this in output.png:
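
The selection rule used in the awk one-liner above (skip the header, keep rows with Frac > 0.05, label the bar with the Donor when the Acceptor starts with lig and with the Acceptor otherwise) can also be sketched in Python, for instance to check the filtered data before plotting. The function name and the threshold and prefix parameters are assumptions made for this sketch; unlike the awk version, it skips the header by its leading # rather than by line number.

```python
def select_bars(lines, threshold=0.05, prefix="lig"):
    """Return (label, fraction) pairs for rows whose Frac exceeds threshold.

    Columns: Acceptor DonorH Donor Frames Frac AvgDist AvgAng.
    The label is the Donor (3rd column) when the Acceptor (1st column)
    starts with `prefix`, otherwise the Acceptor itself.
    """
    bars = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and the header line
        cols = line.split()
        frac = float(cols[4])
        if frac > threshold:
            label = cols[2] if cols[0].startswith(prefix) else cols[0]
            bars.append((label, frac))
    return bars


# First rows of the sample data from the question.
SAMPLE = """\
#Acceptor DonorH Donor Frames Frac AvgDist AvgAng
lig_608#O3 HIE_163#HE2 HIE_163#NE2 498 0.5304 2.8317 153.0580
lig_608#O GLU_166#H GLU_166#N 476 0.5069 2.8858 161.7174
lig_608#O1 HIE_41#HE2 HIE_41#NE2 450 0.4792 2.8484 158.5193
THR_26#O lig_608#H9 lig_608#N1 399 0.4249 2.8312 149.9578
lig_608#O2 THR_26#H THR_26#N 312 0.3323 2.9029 164.8033
lig_608#O1 ASN_142#HD21 ASN_142#ND2 14 0.0149 2.8445 158.4224
""".splitlines()

if __name__ == "__main__":
    for label, frac in select_bars(SAMPLE):
        print(label, frac)
```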

MomentJS TimeAgo function, when passed a non-empty string

I needed moment("2000/03/23", "YYYY/MM/DD").fromNow() to return "20 years old" instead of "20 years ago", so I decided to pass in a string. It turns out that passing any non-empty string (i.e. moment("2000/03/23", "YYYY/MM/DD").fromNow("blah")) removes "ago" from the returned string, so you can then just append " old" to the result.
let age = moment("2000/03/23", "YYYY/MM/DD").fromNow("blah") + " old";
I am assuming this is an undocumented, unreliable result? Why does it do this?
EDITED: replaced timeAgo with fromNow, which is the function I was actually using in my code, but mixed it up.
If you want to change the format of fromNow's output, you can update the relativeTime strings with updateLocale.
moment.updateLocale("en", {
relativeTime: {
future: "in %s",
past: "%s old",
s: "a few seconds",
ss: "%d seconds",
m: "a minute",
mm: "%d minutes",
h: "an hour",
hh: "%dh",
d: "a day",
dd: "%d days",
M: "a month",
MM: "%d months",
y: "a year",
yy: "%d years"
}
});
let age = moment("2000/03/23 8:15:00", "YYYY/MM/DD hh:mm:ss").fromNow();
console.log(age);
<script src="https://cdnjs.cloudflare.com/ajax/libs/moment.js/2.18.1/moment.min.js"></script>
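As for why the string argument drops "ago": fromNow accepts an optional boolean argument (documented as fromNow(true)) that returns the value without the suffix, and a non-empty string is simply truthy. The toy Python function below is not moment's implementation. It handles whole years and past dates only, and every name in it is invented for this sketch. It merely illustrates the two mechanisms involved: a truthy flag that skips the suffix, and a past template such as "%s old" that replaces it.

```python
from datetime import date


def from_now(then: date, now: date, without_suffix=False,
             past_template: str = "%s ago") -> str:
    """Toy relative-time formatter (whole years, past dates only)."""
    # Whole years elapsed; subtract one if this year's anniversary
    # has not been reached yet.
    years = now.year - then.year - ((now.month, now.day) < (then.month, then.day))
    text = "a year" if years == 1 else f"{years} years"
    # Any truthy value, including a non-empty string, suppresses the suffix.
    if without_suffix:
        return text
    return past_template.replace("%s", text)


if __name__ == "__main__":
    born = date(2000, 3, 23)
    today = date(2020, 6, 1)  # fixed date so the result is reproducible
    print(from_now(born, today))
    print(from_now(born, today, "blah"))
    print(from_now(born, today, past_template="%s old"))
```

Changing the template is the more robust of the two routes, since it does not rely on how a non-boolean argument happens to be interpreted.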

Why does XQuery add an extra space?

XQuery adds a space and I don't understand why. I have the following simple query :
declare option saxon:output "method=text";
for $i in 1 to 10
return concat(".", $i, "&#x9;", 100, "&#xA;", ".")
I ran it with Saxon (SaxonEE9-5-1-8J and SaxonHE9-5-1-8J):
java net.sf.saxon.Query -q:query.xq -o:result.txt
The result is the following:
.1 100
. .2 100
. .3 100
. .4 100
. .5 100
. .6 100
. .7 100
. .8 100
. .9 100
. .10 100
.
My question is about the extra space between the dots. The first line is OK, but the following lines (2 to 10) have that space and I don't understand why. What looks like spaces between the digits is in fact a tab inserted by the character reference.
Could you enlighten me about that behavior?
PS: I have added saxon as a tag for the question even if the question is not specific to Saxon.
I think your query returns a sequence of string values which are then by default concatenated with a space (see http://www.w3.org/TR/xslt-xquery-serialization/#sequence-normalization where it says "For each subsequence of adjacent strings in S2, copy a single string to the new sequence equal to the values of the strings in the subsequence concatenated in order, each separated by a single space"). If you don't want that then you can use
string-join(for $i in 1 to 10
return concat(".", $i, "&#x9;", 100, "&#xA;", "."), '')
The space between the dots is basically a separator introduced between the items in the sequence that you are constructing. It would seem that Saxon's text serializer where it outputs to the console inserts that space character to allow you to make sense of the output items.
Considering your code:
declare option saxon:output "method=text";
for $i in 1 to 10
return
concat(".", $i, "&#x9;", 100, "&#xA;", ".")
The result of for $i in 1 to 10 return is a sequence of 10 xs:string items. From your output you can determine that the space is interspersed between each evaluation of concat(".", $i, "&#x9;", 100, "&#xA;", ".").
If you want to check that, you can rewrite your query as:
for $i in 1 to 10
return
<x>{concat(".", $i, "&#x9;", 100, "&#xA;", ".")}</x>
And you will see your 10 distinct items with no spaces between.
If you are trying to create a single text string, as you are already controlling the line-breaks, then you could also join all of the 10 xs:string items together yourself, which would have the effect of eliminating the spaces you are seeing between the sequence items. For example:
declare option saxon:output "method=text";
string-join(
for $i in 1 to 10
return
(".", string($i), "&#x9;", "100", "&#xA;", ".")
, "")
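
The effect of the sequence-normalization rule quoted above can be reproduced in a few lines of Python. This is only an analogy, not Saxon's serializer: the two helper names are invented here, and str.join stands in for the serialization step. Each item ends with a dot and the next one starts with a dot, so the single-space separator is what shows up between the dots.

```python
def serialize_default(items):
    """Mimic the default rule: adjacent strings separated by one space."""
    return " ".join(items)


def serialize_joined(items):
    """Mimic string-join($items, ''): no separator between the items."""
    return "".join(items)


# Three items shaped like the ones the query builds: ".<i><tab>100<newline>."
items = [f".{i}\t100\n." for i in range(1, 4)]

if __name__ == "__main__":
    print(serialize_default(items))
    print("---")
    print(serialize_joined(items))
```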

MiniZinc: output for five days, is there a better, more flexible way?

I have to extend the output and the solution of my project (an exam scheduling model):
- Extend the structure to five days (so far I have always worked on one day).
I thought about multiplying the number of days by the slots per day (5*10) and then tuning the output. Is there a better way?
Here is the whole code:
include "globals.mzn";include "alldifferent.mzn";
%------------------------------Scalar_data----------------------
int: Students; % number of students
int: Exams; % number of exams
int: Rooms; % number of rooms
int: Slotstime; % number of slots
int: Days; % a period i.e. five days
int: Exam_max_duration; % the maximum length of any exam (in slots)
%------------------------------Vectors--------------------------
array[1..Rooms] of int : Rooms_capacity;
array[1..Exams] of int : Exams_duration; % the duration of written test
array[1..Slotstime, 1..Rooms] of 0..1: Unavailability;
array[1..Students,1..Exams] of 0..1: Enrollments;
% Enrollments keeps track of the registrations for every student;
% from this I obtain the number of students who will sit the exam,
% in order to choose the right room according to its capacity
%---------------------------Decision_variables------------------
array[1..Slotstime,1..Rooms] of var 0..Exams: Timetable_exams;
array[1..Exams] of var 1..Rooms: ExamsRoom;
array[1..Exams] of var 1..Slotstime: ExamsStart;
%---------------------------Constraints--------------------------
% Calculate the number of subscribers and assign classroom
% according to time and capacity
constraint forall (e in 1..Exams,r in 1..Rooms,s in 1..Slotstime)
(if Rooms_capacity[r] <= sum([bool2int(Enrollments[st,e]>0)| st in 1..Students])
then Timetable_exams[s,r] != e
else true
endif
);
% Unavailability OK
constraint forall(c in 1..Slotstime, p in 1..Rooms)
(if Unavailability[c,p] == 1
then Timetable_exams[c,p] = 0
else true
endif
);
% Assignment exams according with rooms and slotstimes (Thanks Hakan)
constraint forall(e in 1..Exams) % for each exam
(exists(r in 1..Rooms) % find a room
( ExamsRoom[e] = r /\ % assign the room to the exam
forall(t in 0..Exams_duration[e]-1)
% assign the exam to the slotstimes and room in the timetable
(Timetable_exams[t+ExamsStart[e],r] = e)
)
)
/\ % ensure that we have the correct number of exam slots
sum(Exams_duration) = sum([bool2int(Timetable_exams[t,r]>0) | t in 1..Slotstime,
r in 1..Rooms]);
%---------------------------Solver--------------------------
solve satisfy;
% solve::int_search([Timetable_exams[s, a] | s in 1..Slotstime, a in
% 1..Rooms],first_fail,indomain_min,complete) satisfy;
And now the output, extremely heavy and full of strings.
%---------------------------Output--------------------------
output ["\n" ++ "MiniZinc paper: Exams schedule " ++ "\n" ]
++["\nDay I \n"]++
[
if r=1 then "\n" else " " endif ++
show(Timetable_exams[t,r])
| t in 1..Slotstime div Days, r in 1..Rooms
]
++["\n\nDay II \n"]++
[
if r=1 then "\n" else " " endif ++
show(Timetable_exams[t,r])
| t in 11..((Slotstime div Days)*2), r in 1..Rooms
]
++["\n\nDay III \n"]++
[
if r=1 then "\n" else " " endif ++
show(Timetable_exams[t,r])
| t in 21..((Slotstime div Days)*3), r in 1..Rooms
]
++["\n\nDay IV \n"]++
[
if r=1 then "\n" else " " endif ++
show(Timetable_exams[t,r])
| t in 31..((Slotstime div Days)*4), r in 1..Rooms
]
++["\n\nDay V \n"]++
[
if r=1 then "\n" else " " endif ++
show(Timetable_exams[t,r])
| t in 41..Slotstime, r in 1..Rooms
]
++[ "\n"]++
[
"\nExams_Room: ", show(ExamsRoom), "\n",
"Exams_Start: ", show(ExamsStart), "\n",
]
++["Participants: "]++
[
if e=Exams then " " else " " endif ++
show (sum([bool2int(Enrollments[st,e]>0)| st in 1..Students]))
|e in 1..Exams
];
I finish with data:
%Data
Slotstime=10*Days;
Students=50;
Days=5;
% Exams
Exams = 5;
Exam_max_duration=4;
Exams_duration = [4,1,2,3,2];
% Rooms
Rooms = 4;
Rooms_capacity = [20,30,40,50];
Unavailability = [|0,0,0,0 % Rooms rows % Slotstime columns
|0,0,0,0
|0,0,0,0
|0,0,0,0
|1,1,1,1
|1,1,1,1
|0,0,0,0
|0,0,0,0
|0,0,0,0
|0,0,0,0
% End first day
|0,0,0,0
|0,0,0,0
|0,0,0,0
|0,0,0,0
|1,1,1,1
|1,1,1,1
|0,0,0,0
|0,0,0,0
|0,0,0,0
|0,0,0,0
% End second day
|0,0,0,0
|0,0,0,0
|0,0,0,0
|0,0,0,0
|1,1,1,1
|1,1,1,1
|0,0,0,0
|0,0,0,0
|0,0,0,0
|0,0,0,0
% End third day
|0,0,0,0
|0,0,0,0
|0,0,0,0
|0,0,0,0
|1,1,1,1
|1,1,1,1
|0,0,0,0
|0,0,0,0
|0,0,0,0
|0,0,0,0
% End fourth day
|0,0,0,0
|0,0,0,0
|0,0,0,0
|0,0,0,0
|1,1,1,1
|1,1,1,1
|0,0,0,0
|0,0,0,0
|0,0,0,0
|0,0,0,0
%End fifth day
|];
Enrollments= [|1,0,1,0,1 % Exams rows %Students columns
|1,0,1,0,1
|0,1,0,0,0
|1,0,0,1,0
|0,1,0,0,0
|0,0,1,1,0
|1,0,0,1,0
|0,0,0,0,1
|1,0,0,0,1
|0,0,0,0,1
|0,1,0,0,0
|0,0,0,0,0
|0,1,0,0,1
|0,0,1,0,1
|1,0,1,0,1
|1,0,1,0,1
|0,1,0,0,0
|1,0,0,1,0
|0,1,0,0,0
|0,0,1,1,0
|1,0,0,1,0
|0,0,0,0,1
|1,0,0,0,1
|0,0,0,0,1
|0,1,0,0,0
|0,0,0,0,0
|0,1,0,0,1
|0,0,1,0,1
|1,0,1,0,1
|1,0,1,0,1
|0,1,0,0,0
|1,0,0,1,0
|0,1,0,0,0
|0,0,1,1,0
|1,0,0,1,0
|0,0,0,0,1
|1,0,0,0,1
|0,0,0,0,1
|0,1,0,0,0
|0,0,0,0,0
|0,1,0,0,1
|0,0,1,0,1
|1,0,1,0,1
|1,0,1,0,1
|0,1,0,0,0
|1,0,0,1,0
|0,1,0,0,0
|0,0,1,1,0
|1,0,0,1,0
|0,0,0,0,1
|];
Thanks in advance
For the output section, the following code should work. I only changed the Day schedule, the rest is unchanged.
output ["\n" ++ "MiniZinc paper: Exams schedule " ++ "\n" ]
++
[
if t mod 10 = 1 /\ r = 1 then
"\n\nDay " ++ show(d) ++ " \n"
else "" endif ++
if r=1 then "\n" else " " endif ++
show(Timetable_exams[t,r])
| d in 1..Days, t in 1+(d-1)*10..(Slotstime div Days)*d, r in 1..Rooms,
]
++[ "\n"]++
[
"\nExams_Room: ", show(ExamsRoom), "\n",
"Exams_Start: ", show(ExamsStart), "\n",
]
++["Participants: "]++
[
if e=Exams then " " else " " endif ++
show (sum([bool2int(Enrollments[st,e]>0)| st in 1..Students]))
|e in 1..Exams
];
If it's a requirement that the days should be numbered with "I","II", etc then you can define a string array with the day names, e.g.
array[1..Days] of string: DaysStr = ["I","II","III","IV","V"];
and then use it in the output loop:
% ....
if t mod 10 = 1 /\ r = 1 then
"\n\nDay " ++ DaysStr[d] ++ " \n" % <---
else "" endif ++
% ....
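The idea behind the generator above, walking the flat slot index day by day and emitting a header at the start of each day, can be sketched in Python as follows. The function name and the list-of-rows input are assumptions made for this illustration. The number of slots per day is derived from the number of rows instead of being hard-coded as 10.

```python
def format_timetable(timetable, days, day_names=None):
    """Render a flat slots-by-rooms timetable grouped by day.

    timetable: list of rows (one per slot), each a list with one entry
    per room. Assumes len(timetable) is a multiple of `days`.
    day_names: optional labels such as ["I", "II"]; defaults to 1, 2, ...
    """
    slots_per_day = len(timetable) // days
    lines = []
    for d in range(days):
        name = day_names[d] if day_names else str(d + 1)
        lines.append(f"Day {name}")
        # Slice out the rows that belong to day d.
        for row in timetable[d * slots_per_day:(d + 1) * slots_per_day]:
            lines.append(" ".join(str(exam) for exam in row))
        lines.append("")  # blank line after each day
    return "\n".join(lines)


if __name__ == "__main__":
    # Tiny made-up example: 2 days, 2 slots per day, 2 rooms.
    demo = [[1, 0], [1, 0], [0, 2], [0, 0]]
    print(format_timetable(demo, 2, ["I", "II"]))
```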
Later update:
One other thing to make the model a little more general (and smaller) is to replace the huge Unavailability matrix (and the constraint using it) with this:
set of int: UnavailabilitySlots = {5,6};
% ....
constraint
forall(c in 1..Slotstime, p in 1..Rooms) (
if c mod 10 in UnavailabilitySlots then
Timetable_exams[c,p] = 0
else
true
endif
);
Yet another comment:
The original model has a flaw in that it allows exams that span two days, e.g. the last 2 hours of day I and the first 2 hours of day II. I think the following extra (and not so pretty) constraint will fix that. Again, the magic "10" is used.
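constraint
% do not pass over a day limit (slots are 1-based with 10 per day,
% so a day boundary lies between slots 10 and 11, 20 and 21, ...)
forall(e in 1..Exams) (
not(exists(t in 1..Exams_duration[e]-1) (
(ExamsStart[e]+t-2) mod 10 > (ExamsStart[e]+t-1) mod 10
))
)
;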
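The day-limit rule can also be stated by comparing the day of an exam's first slot with the day of its last slot, using integer division. Below is a minimal Python sketch of that check, assuming 1-based slots and 10 slots per day as in the data above; the function names are invented for this example. In MiniZinc the same test could presumably be written with div instead of mod.

```python
SLOTS_PER_DAY = 10  # assumption taken from the data (Slotstime = 10*Days)


def day_of(slot: int, slots_per_day: int = SLOTS_PER_DAY) -> int:
    """Return the 1-based day index of a 1-based slot."""
    return (slot - 1) // slots_per_day + 1


def fits_in_one_day(start: int, duration: int,
                    slots_per_day: int = SLOTS_PER_DAY) -> bool:
    """True if an exam occupying `duration` slots from `start` stays in one day."""
    last = start + duration - 1
    return day_of(start, slots_per_day) == day_of(last, slots_per_day)


if __name__ == "__main__":
    for start, duration in [(7, 4), (8, 4), (9, 2), (10, 2), (11, 4)]:
        print(start, duration, fits_in_one_day(start, duration))
```

With this formulation the slots-per-day value appears in one place only, which makes it easier to replace the magic 10 by a parameter.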

Maximum and Minimum using awk

How would you find the maximum and minimum a,b,c values for the rows that start with MATH from the following file?
TITLE a b c
MATH 12.3 -0.42 5.5
ENGLISH 70.45 3.21 6.63
MATH 3.32 2.43 9.42
MATH 3.91 -1.56 7.22
ENGLISH 89.21 4.66 5.32
It can not be just 1 command line. It has to be a script file using BEGIN function and END.
I get the wrong minimum value and I end up getting a string for max when I run my program. Please help!
Here is my code for the column a:
BEGIN { x=1 }
{
if ($1 == "MATH") {
min=max=$2;
for ( i=0; i<=NF; i++) {
min = (min < $i ? min : $i)
max = (max > $i ? max : $i)
}
}
}
END { print "max a value is ", max, " min a value is ", min }
Thanks!
This code could demonstrate a concept of what you want:
awk '$1!="MATH"{next}1;!i++{min=$2;max=$2;}{for(j=2;j<=NF;++j){min=(min<$j)?min:$j;max=(max>$j)?max:$j}}END{printf "Max value is %.2f. Min value is %.2f.\n", max, min}' file
Output:
MATH 12.3 -0.42 5.5
MATH 3.32 2.43 9.42
MATH 3.91 -1.56 7.22
Max value is 12.30. Min value is -1.56.
Remove the 1 to suppress printing of the matched lines:
awk '$1!="MATH"{next};...
Script version:
#!/usr/bin/awk -f
$1 != "MATH" {
# Move to next record if not about "MATH".
next
}
!i++ {
# Only does this on first match.
min = $2; max = $2
}
{
for (j = 2; j <= NF; ++j) {
min = (min < $j) ? min : $j
max = (max > $j) ? max : $j
}
}
END {
printf "Max value is %.2f. Min value is %.2f.\n", max, min
}
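The question can be read either as asking for one overall maximum and minimum or for one pair per column a, b and c. The awk script above computes the overall pair. As a cross-check, here is a short Python sketch (the function name and the return shape are choices made for this example) that computes both, converting every field to a number explicitly so that no string comparison can sneak in.

```python
def math_min_max(lines, title="MATH"):
    """Return (per_column, overall) minima and maxima for matching rows.

    per_column maps the 1-based value column (1 = a, 2 = b, 3 = c) to a
    (min, max) tuple; overall is one (min, max) tuple over all values,
    or None if no row starts with `title`.
    """
    columns = {}  # column index -> list of numeric values
    for line in lines:
        fields = line.split()
        if not fields or fields[0] != title:
            continue  # skip the header and rows for other titles
        for idx, raw in enumerate(fields[1:], start=1):
            columns.setdefault(idx, []).append(float(raw))
    per_column = {idx: (min(vals), max(vals)) for idx, vals in columns.items()}
    all_values = [x for vals in columns.values() for x in vals]
    overall = (min(all_values), max(all_values)) if all_values else None
    return per_column, overall


DATA = """\
TITLE a b c
MATH 12.3 -0.42 5.5
ENGLISH 70.45 3.21 6.63
MATH 3.32 2.43 9.42
MATH 3.91 -1.56 7.22
ENGLISH 89.21 4.66 5.32
""".splitlines()

if __name__ == "__main__":
    per_column, overall = math_min_max(DATA)
    print(per_column)
    print(overall)
```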
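Look at your for loop. It starts from i=0, and $0 is the whole line, so a string such as "MATH 12.3 -0.42 5.5" gets compared as well; that is why max ends up being a string. Start from the first numeric field and keep i<=NF:
for (i=2; i<=NF; i++) {
Also, min=max=$2 is executed for every MATH line, so both values are reset on each row; initialise them only once.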
The i variable in the for loop should begin at 2 (the 2nd field), not 0 ($0 represents the whole line), and end at NF.
BEGIN { x=1;min=2147483647;max=-2147483648}
{
if ($1 == "MATH") {
for ( i=2; i<=NF; i++) {
min = (min < $i ? min : $i)
max = (max > $i ? max : $i)
}
}
}
END { print "max a value is ", max, " min a value is ", min }
Run it with the following command (testawk.script is the awk script above, test.data is the input data file):
cat test.data | awk -f testawk.script
Output:
max a value is  12.3  min a value is  -1.56
I don't have a terminal handy right now, but something along these lines will get the smallest value of each line.
grep "^MATH" YOURFILE | \
while read -r TITLE A B C; do
# sort the three values numerically and print the smallest one
printf '%s\n' "$A" "$B" "$C" | sort -n | head -n 1
done
