I have a netcdf file containing 4-D variables:
variables:
double maxvegetfrac(time_counter, veget, lat, lon) ;
maxvegetfrac:_FillValue = 1.00000002004088e+20 ;
maxvegetfrac:history = "From Topo.115MaCTRL_WAM_360_180" ;
maxvegetfrac:long_name = "Vegetation types" ;
maxvegetfrac:missing_value = 1.e+20f ;
maxvegetfrac:name = "maxvegetfrac" ;
maxvegetfrac:units = "-" ;
double mask_veget(time_counter, veget, lat, lon) ;
mask_veget:missing_value = -1.e+34 ;
mask_veget:_FillValue = -1.e+34 ;
mask_veget:long_name = "IF MYVEG4 EQ 10 AND I GE 610 AND J GT 286 THEN 16 ELSE MYVEG4" ;
mask_veget:history = "From desert_115Ma_3" ;
I'd like to use the variable "mask_veget" as a mask to alter values of the variable "maxvegetfrac" over specific regions, and over chosen values of its "veget" dimension.
To do so I am using ncap2. For example, if I want to set maxvegetfrac values over the 5th rank of veget dimension to 500 where mask_veget equals 6, I do :
ncap2 -s "where (mask_veget(:,:,:,:)== 6) maxvegetfrac(:,5,:,:) = 500" test.nc
My problem is that in the resulting test.nc file, maxvegetfrac has been modified at the first rank of "veget" dimension, not the 5th one. And I get the same result if I run the script over the entire veget dimension:
ncap2 -s "where (mask_veget(:,:,:,:)== 6) maxvegetfrac(:,:,:,:) = 500" test.nc
So I am making a mistake somewhere, but where?
Any help appreciated !
A couple of things you may not be aware of:
You shouldn't hyperslab a variable in the where body; it makes no sense at the moment.
It is OK to hyperslab in the where statement, provided it is a single index,
as a dim with a single value collapses.
Try this:
/*** hyper.nco *****/
maxvegetfrac5=maxvegetfrac(:,5,:,:);
where( mask_veget(:,5,:,:)== 6 )
maxvegetfrac5=500.0;
/* put the hyperslab back in */
maxvegetfrac(:,5,:,:)=maxvegetfrac5;
/* script end *****/
Now run the script with the command:
ncap2 -v -O -S hyper.nco test.nc out.nc
...Henry
I would like to extract every line of the first file (the gzipped Input.csv.gz) whose 4th field falls within a range defined in the
second file (Slab.csv), where the first field is the start of the range and the second field is the end, and then report, per slab, the count of matching rows and the sums of the 4th and 5th fields of the first file.
Input.csv.gz (GunZip)
Desc,Date,Zone,Duration,Calls
AB,01-06-2014,XYZ,450,3
AB,01-06-2014,XYZ,642,3
AB,01-06-2014,XYZ,0,0
AB,01-06-2014,XYZ,205,3
AB,01-06-2014,XYZ,98,1
AB,01-06-2014,XYZ,455,1
AB,01-06-2014,XYZ,120,1
AB,01-06-2014,XYZ,0,0
AB,01-06-2014,XYZ,193,1
AB,01-06-2014,XYZ,0,0
AB,01-06-2014,XYZ,161,2
Slab.csv
StartRange,EndRange
0,0
1,10
11,100
101,200
201,300
301,400
401,500
501,10000
Expected Output:
StartRange,EndRange,Count,Sum-4,Sum-5
0,0,3,0,0
1,10,NotFound,NotFound,NotFound
11,100,1,98,1
101,200,3,474,4
201,300,1,205,3
301,400,NotFound,NotFound,NotFound
401,500,2,905,4
501,10000,1,642,3
I am using the two commands below to get the above output, except for the "NotFound" cases.
awk -F, 'NR==FNR{s[NR]=$1;e[NR]=$2;c[NR]=$0;n++;next} {for(i=1;i<=n;i++) if($4>=s[i]&&$4<=e[i]) {print $0,","c[i];break}}' Slab.csv <(gzip -dc Input.csv.gz) >Op_step1.csv
cat Op_step1.csv | awk -F, '{key=$6","$7;++a[key];b[key]=b[key]+$4;c[key]=c[key]+$5} END{for(i in a)print i","a[i]","b[i]","c[i]}' >Op_step2.csv
Op_step2.csv
101,200,3,474,4
501,10000,1,642,3
0,0,3,0,0
401,500,2,905,4
11,100,1,98,1
201,300,1,205,3
Any suggestions for turning this into a one-liner that produces the Expected Output? I don't have access to perl or python.
Here is another option using perl, which takes advantage of multi-dimensional arrays and hashes.
perl -F, -lane'
BEGIN {
$x = pop;
## Create array of arrays from start and end ranges
## @range = ( [0,0] , [1,10] ... )
(undef, @range)= map { chomp; [split /,/] } <>;
@ARGV = $x;
}
## Skip the first line
next if $. ==1;
## Create hash of hash
## %line = '[0,0]' => { "count" => counts , "sum4" => sum_of_col4 , "sum5" => sum_of_col5 }
for (@range) {
if ($F[3] >= $_->[0] && $F[3] <= $_->[1]) {
$line{"@$_"}{"count"}++;
$line{"@$_"}{"sum4"} +=$F[3];
$line{"@$_"}{"sum5"} +=$F[4];
}
}
}{
print "StartRange,EndRange,Count,Sum-4,Sum-5";
print join ",", @$_,
$line{"@$_"}{"count"} //"NotFound",
$line{"@$_"}{"sum4"} //"NotFound",
$line{"@$_"}{"sum5"} //"NotFound"
for @range
' slab input
StartRange,EndRange,Count,Sum-4,Sum-5
0,0,3,0,0
1,10,NotFound,NotFound,NotFound
11,100,1,98,1
101,200,3,474,4
201,300,1,205,3
301,400,NotFound,NotFound,NotFound
401,500,2,905,4
501,10000,1,642,3
Here is one way using awk and sort:
awk '
BEGIN {
FS = OFS = SUBSEP = ",";
print "StartRange,EndRange,Count,Sum-4,Sum-5"
}
FNR == 1 { next }
NR == FNR {
ranges[$1,$2]++;
next
}
{
for (range in ranges) {
split(range, tmp, SUBSEP);
if ($4 >= tmp[1] && $4 <= tmp[2]) {
count[range]++;
sum4[range]+=$4;
sum5[range]+=$5;
next
}
}
}
END {
for(range in ranges)
print range, ((range in count)?count[range]:"NotFound"), ((range in count)?sum4[range]:"NotFound"), ((range in count)?sum5[range]:"NotFound") | "sort -t, -nk1,2"
}' slab input
StartRange,EndRange,Count,Sum-4,Sum-5
0,0,3,0,0
1,10,NotFound,NotFound,NotFound
11,100,1,98,1
101,200,3,474,4
201,300,1,205,3
301,400,NotFound,NotFound,NotFound
401,500,2,905,4
501,10000,1,642,3
Set the Input, Output Field Separators and SUBSEP to ,. Print the Header line.
If it is the first line skip it.
Load the entire slab file into an array called ranges.
For every range in the ranges array, split the field to get start and end range. If the 4th column is in the range, increment the count array and add the value to sum4 and sum5 array appropriately.
In the END block, iterate through the ranges and print them.
Pipe the output to sort to get the output in order.
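The steps above can also be arranged so that the slabs stay in file order and no sort is needed. The following is a minimal sketch, not a tested production script: the function name slab_summary is hypothetical, and it assumes both files have a header line and that the slabs do not overlap. Whether a slab was matched is decided with the `in` operator rather than by the truthiness of its counters, so a matched slab whose sums are 0 still prints 0.

```shell
#!/bin/sh
# slab_summary SLAB_FILE [INPUT_FILE]   (hypothetical helper name)
# Reads the data rows from INPUT_FILE, or from standard input when it is omitted.
slab_summary() {
    awk -F, -v OFS=, '
        FNR == 1 { next }                  # skip the header line of each file
        NR == FNR {                        # first file: the slabs, kept in file order
            n++; lo[n] = $1; hi[n] = $2
            next
        }
        {                                  # second file: the data rows
            for (i = 1; i <= n; i++)
                if ($4 + 0 >= lo[i] + 0 && $4 + 0 <= hi[i] + 0) {
                    count[i]++; sum4[i] += $4; sum5[i] += $5
                    break                  # slabs are assumed not to overlap
                }
        }
        END {
            print "StartRange,EndRange,Count,Sum-4,Sum-5"
            for (i = 1; i <= n; i++)
                if (i in count)            # matched at least once, even if the sums are 0
                    print lo[i], hi[i], count[i], sum4[i], sum5[i]
                else
                    print lo[i], hi[i], "NotFound", "NotFound", "NotFound"
        }
    ' "$1" "${2:--}"
}
```

With the gzipped input from the question, it could be called as: gzip -dc Input.csv.gz | slab_summary Slab.csv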
I am attempting to automate a graph process using SAS macros. Since this will be used for several different subsets, the axes of the graph must be adjusted accordingly. I have tried a few different ways and feel that I'm going the wrong way down the rabbit hole.
Here is my dataset.
data want;
input A B C D;
cards;
100 5 6 1
200 5 5 2
150 5.5 5.5 3
457 4.2 6.2 4
500 3.7 7.0 5
525 3.5 7.2 6
;
run;
What I want is a graph that has the following axis specs:
x-axis from min(D) to max(D) by some reasonable increment
left-axis from min(A) to max(A)
right-axis from min (B,C) to max(B,C)
Here is my latest attempt:
proc sql;
select roundz((max(A)+100), 100),
roundz(min(A), 100),
(&maxA.-&minA.)/10,
roundz(max(B, C)+1, 1),
roundz(min(B, C), 1),
(&maxBC.-&minBC.)/10,
roundz(max(D), 1),
roundz(min(D), 1),
(&maxD.-&minD.+1)/3
into :maxA, :minA, :Ainc,
:maxBC, :minBC, :BCinc,
:maxD, :minD, :Dinc
from want;
run;
goptions reset=all ftext=SWISS htext=2.5 ;
axis1 order=(&minA to &maxA by &Ainc) minor=none label=(angle=90 'A label' ) offset=(1) ;
axis2 order=(&minBC to &maxBC by &BCinc) minor=(number=1) label=(angle=90 'BC Label') offset=(1);
axis3 order=(&minD to &maxD by &Dinc) minor=(number=2) label=('D') offset=(1) ;
symbol1 color=black i=join value=circle height=2 width=2 ;
symbol2 color=black i=join value=square height=2 width=2 ;
symbol3 color=black i=join value=triangle height=2 width=2 ;
legend1 label=none mode=reserve position=(top center outside) value=('Label here' ) shape=symbol(5,1) ;
legend2 label=none mode=reserve position=(top center outside) value=('label 1' 'label 2') shape=symbol(3,1) ;
proc gplot data=want;
plot A*D=1 /overlay legend=legend1 vaxis=axis1 haxis=axis3 ;
plot2 B*D=2 &var_C*D=3 /overlay legend=legend2 vaxis=axis2 ;
run ;
Any help would be greatly appreciated. Even if that means a completely different way of doing it (though I'd also be interested to see where I am going wrong here).
Thanks, Pyll
What you're doing is sort-of writing a macro without writing a macro. Write the macro and this is easier. Also, if you're going to have the INCs always be 1/10ths, put that in let statements (although if they might vary in their conception, then leave them as parameters).
%macro graph_me(minA=,maxA=, minBC=,maxBC=, minD=, maxD=);
%let incA = %sysevalf((&maxA.-&minA.)/10);
%let incBC = %sysevalf((&maxBC.-&minBC.)/10);
%let incD = %sysevalf((&maxD.-&minD.)/10);
goptions reset=all ftext=SWISS htext=2.5 ;
axis1 order=(&minA to &maxA by &incA) minor=none label=(angle=90 'A label' ) offset=(1) ;
axis2 order=(&minBC to &maxBC by &incBC) minor=(number=1) label=(angle=90 'BC Label') offset=(1);
axis3 order=(&minD to &maxD by &incD) minor=(number=2) label=('D') offset=(1) ;
symbol1 color=black i=join value=circle height=2 width=2 ;
symbol2 color=black i=join value=square height=2 width=2 ;
symbol3 color=black i=join value=triangle height=2 width=2 ;
legend1 label=none mode=reserve position=(top center outside) value=('Label here' ) shape=symbol(5,1) ;
legend2 label=none mode=reserve position=(top center outside) value=('label 1' 'label 2') shape=symbol(3,1) ;
%mend graph_me;
Now write your SQL call to grab those parameters into the macro call itself.
proc sql NOPRINT;
select
cats('%graph_me(minA=',roundz(min(A), 100),
',maxA=', roundz((max(A)+100), 100),
... etc. ...
into :mcall
from want;
quit;
This gives you the advantage that you may be able to generate multiple calls if you, for example, want to do this grouped by some variable (having one graph per variable value).
Two things in the SQL:
You cannot use the macro variables you are creating within the same query, and you need just one value: max(B,C) with two arguments creates as many values as there are observations in the dataset, so you need another max around it.
I cannot check the SAS/GRAPH part as I do not have it, but the SQL can be written as:
proc sql NOPRINT;
select roundz((max(A)+100), 100) as maxA,
roundz(min(A), 100) as minA,
((calculated maxA)-(calculated minA))/10,
roundz(max(max(B, C))+1, 1) as maxBC,
roundz(min(min(B, C)), 1) as minBC,
((calculated maxBC)-(calculated minBC))/10,
roundz(max(D), 1) as maxD,
roundz(min(D), 1) as minD,
((calculated maxD)-(calculated minD)+1)/3
into :maxA, :minA, :Ainc,
:maxBC, :minBC, :BCinc,
:maxD, :minD, :Dinc
from want;
quit;
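The numbers this query should yield for the sample dataset can be cross-checked outside SAS. The sketch below is awk, not SAS, and is only an illustration: the function names axis_bounds and round_to are hypothetical, and round_to assumes non-negative values and stands in for ROUNDZ by rounding to the nearest multiple. It makes the two-step logic explicit: a row-wise max(B, C) first, then an aggregate over all rows.

```shell
#!/bin/sh
# axis_bounds (hypothetical name): reads rows "A B C D" on standard input and
# prints, per axis, the rounded minimum, the rounded maximum and the increment.
axis_bounds() {
    awk '
        # round x to the nearest multiple of m (non-negative x only)
        function round_to(x, m) { return int(x / m + 0.5) * m }
        {
            rowmax = ($2 + 0 > $3 + 0) ? $2 + 0 : $3 + 0   # row-wise max(B, C): one value per row
            rowmin = ($2 + 0 < $3 + 0) ? $2 + 0 : $3 + 0   # row-wise min(B, C)
            if (NR == 1) {
                maxA = minA = $1 + 0
                maxBC = rowmax; minBC = rowmin
                maxD = minD = $4 + 0
            }
            # aggregate step: collapse the per-row values into a single value
            if ($1 + 0 > maxA) maxA = $1 + 0
            if ($1 + 0 < minA) minA = $1 + 0
            if (rowmax > maxBC) maxBC = rowmax
            if (rowmin < minBC) minBC = rowmin
            if ($4 + 0 > maxD) maxD = $4 + 0
            if ($4 + 0 < minD) minD = $4 + 0
        }
        END {
            maxA  = round_to(maxA + 100, 100); minA  = round_to(minA, 100)
            maxBC = round_to(maxBC + 1, 1);    minBC = round_to(minBC, 1)
            maxD  = round_to(maxD, 1);         minD  = round_to(minD, 1)
            print "A",  minA,  maxA,  (maxA - minA) / 10
            print "BC", minBC, maxBC, (maxBC - minBC) / 10
            print "D",  minD,  maxD,  (maxD - minD + 1) / 3
        }
    '
}
```

Fed the six data rows of the want dataset, this prints "A 100 600 50", "BC 4 8 0.4" and "D 1 6 2". Note that the rounded BC minimum (4) lies above the smallest value of B (3.5), so rounding the minimum to the nearest unit may be worth revisiting.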