I am attempting to automate a graph process using SAS macros. Since this will be used for several different subsets, the axes of the graph must be adjusted accordingly. I have tried a few different ways and feel that I'm going the wrong way down the rabbit hole.
Here is my dataset.
data want;
input A B C D;
cards;
100 5 6 1
200 5 5 2
150 5.5 5.5 3
457 4.2 6.2 4
500 3.7 7.0 5
525 3.5 7.2 6
;
run;
What I want is a graph that has the following axis specs:
x-axis from min(D) to max(D) by some reasonable increment
left-axis from min(A) to max(A)
right-axis from min (B,C) to max(B,C)
Here is my latest attempt:
proc sql;
select roundz((max(A)+100), 100),
roundz(min(A), 100),
(&maxA.-&minA.)/10,
roundz(max(B, C)+1, 1),
roundz(min(B, C), 1),
(&maxBC.-&minBC.)/10,
roundz(max(D), 1),
roundz(min(D), 1),
(&maxD.-&minD.+1)/3
into :maxA, :minA, :Ainc,
:maxBC, :minBC, :BCinc,
:maxD, :minD, :Dinc
from want;
run;
goptions reset=all ftext=SWISS htext=2.5 ;
axis1 order=(&minA to &maxA by &Ainc) minor=none label=(angle=90 'A label' ) offset=(1) ;
axis2 order=(&minBC to &maxBC by &BCinc) minor=(number=1) label=(angle=90 'BC Label') offset=(1);
axis3 order=(&minD to &maxD by &Dinc) minor=(number=2) label=('D') offset=(1) ;
symbol1 color=black i=join value=circle height=2 width=2 ;
symbol2 color=black i=join value=square height=2 width=2 ;
symbol3 color=black i=join value=triangle height=2 width=2 ;
legend1 label=none mode=reserve position=(top center outside) value=('Label here' ) shape=symbol(5,1) ;
legend2 label=none mode=reserve position=(top center outside) value=('label 1' 'label 2') shape=symbol(3,1) ;
proc gplot data=want;
plot A*D=1 /overlay legend=legend1 vaxis=axis1 haxis=axis3 ;
plot2 B*D=2 &var_C*D=3 /overlay legend=legend2 vaxis=axis2 ;
run ;
Any help would be greatly appreciated. Even if that means a completely different way of doing it (though I'd also be interested to see where I am going wrong here).
Thanks, Pyll
What you're doing is sort of writing a macro without writing a macro. Write the macro and this gets easier. Also, if the increments will always be one tenth of the range, compute them in %let statements inside the macro (if the way they are derived might vary, leave them as parameters instead).
%macro graph_me(minA=,maxA=, minBC=,maxBC=, minD=, maxD=);
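%let incA = %sysevalf((&maxA.-&minA.)/10);
%let incBC = %sysevalf((&maxBC.-&minBC.)/10); /* same rule as incA */
%let incD = %sysevalf((&maxD.-&minD.)/10); /* same rule as incA */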
goptions reset=all ftext=SWISS htext=2.5 ;
axis1 order=(&minA to &maxA by &incA) minor=none label=(angle=90 'A label' ) offset=(1) ;
axis2 order=(&minBC to &maxBC by &incBC) minor=(number=1) label=(angle=90 'BC Label') offset=(1);
axis3 order=(&minD to &maxD by &incD) minor=(number=2) label=('D') offset=(1) ;
symbol1 color=black i=join value=circle height=2 width=2 ;
symbol2 color=black i=join value=square height=2 width=2 ;
symbol3 color=black i=join value=triangle height=2 width=2 ;
legend1 label=none mode=reserve position=(top center outside) value=('Label here' ) shape=symbol(5,1) ;
legend2 label=none mode=reserve position=(top center outside) value=('label 1' 'label 2') shape=symbol(3,1) ;
%mend graph_me;
Now write your SQL call to grab those parameters into the macro call itself.
proc sql NOPRINT;
select
cats('%graph_me(minA=',roundz(min(A), 100),
',maxA=', roundz((max(A)+100), 100),
... etc. ...
into :mcall
from want;
quit;
This gives you the advantage that you may be able to generate multiple calls if you, for example, want to do this grouped by some variable (having one graph per variable value).
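A minimal sketch of that grouped variant, assuming a hypothetical grouping variable named grp in want (it is not in the sample data) and the %graph_me macro defined above. The rounding rules are copied from the question; SEPARATED BY collects one macro call per group into a single macro variable.

```sas
/* Assumption: want contains a grouping variable called grp (hypothetical). */
proc sql noprint;
  select cats('%graph_me(minA=', roundz(min(A), 100),
              ',maxA=',          roundz(max(A) + 100, 100),
              ',minBC=',         roundz(min(min(B, C)), 1),
              ',maxBC=',         roundz(max(max(B, C)) + 1, 1),
              ',minD=',          roundz(min(D), 1),
              ',maxD=',          roundz(max(D), 1),
              ')')
    into :mcall separated by ' '
    from want
    group by grp;
quit;

/* Resolving the macro variable executes one %graph_me call per group. */
&mcall
```

The single quotes keep the macro processor from running %graph_me while the query is being built; the calls only execute when &mcall is resolved. As written, the macro only sets up axes, symbols and legends, so to get one graph per group it would also need to contain the PROC GPLOT step and take the group value as an extra parameter.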
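Two things are wrong in the SQL:
First, you cannot reference the macro variables (&maxA., &minA. and so on) in the same SELECT that creates them; they do not exist until the query has run. Second, each INTO target needs a single value, but max(B, C) with two arguments is evaluated row by row, so it returns as many values as there are observations in the dataset. Wrap it in another max() to aggregate across the rows.
I cannot check the SAS/GRAPH part as I do not have it, but the SQL should look like this: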
proc sql NOPRINT;
select roundz((max(A)+100), 100) as maxA,
roundz(min(A), 100) as minA,
((calculated maxA)-(calculated minA))/10,
roundz(max(max(B, C))+1, 1) as maxBC,
roundz(min(min(B, C)), 1) as minBC,
((calculated maxBC)-(calculated minBC))/10,
roundz(max(D), 1) as maxD,
roundz(min(D), 1) as minD,
((calculated maxD)-(calculated minD)+1)/3
into :maxA, :minA, :Ainc,
:maxBC, :minBC, :BCinc,
:maxD, :minD, :Dinc
from want;
quit;
So I am generating a bar-line chart in SAS with a dataset which looks like this:
id date default var1 log_var1 square_var1 ... cubic_var1
1 1 1 5 -3.3 0.9 1.2
1 2 0 15 -9.9 2.7 3.6
2 1 1 10 -6.6 1.8 2.4
...
Note, the transformations are not simply log(var1) but the fitted transformation from the regression, so log_var1 = alpha + beta log(var1).
Now I use the following code, generated by the SAS task for bar-line chart:
SYMBOL1
INTERPOL=JOIN
HEIGHT=10pt
VALUE=SQUARE
LINE=1
WIDTH=2
CI=WHITE
CV = _STYLE_
;
SYMBOL2
INTERPOL=JOIN
HEIGHT=10pt
VALUE=SQUARE
LINE=1
WIDTH=2
CV = _STYLE_
;
SYMBOL3
INTERPOL=JOIN
HEIGHT=10pt
VALUE=SQUARE
LINE=1
WIDTH=2
CV = _STYLE_
;
SYMBOL4
INTERPOL=JOIN
HEIGHT=10pt
VALUE=SQUARE
LINE=1
WIDTH=2
CV = _STYLE_
;
SYMBOL5
INTERPOL=JOIN
HEIGHT=10pt
VALUE=SQUARE
LINE=1
WIDTH=2
CV = _STYLE_
;
SYMBOL6
INTERPOL=JOIN
HEIGHT=10pt
VALUE=SQUARE
LINE=1
WIDTH=2
CI=WHITE
CV = _STYLE_
;
Legend2
FRAME
;
Legend1
FRAME
;
Axis1
STYLE=1
WIDTH=1
MINOR=NONE
;
Axis2
STYLE=1
WIDTH=1
;
Axis3
STYLE=1
WIDTH=1
MINOR=NONE
;
TITLE;
TITLE1 "Bar-Line Chart";
FOOTNOTE;
FOOTNOTE1 "Generated by the SAS System (&_SASSERVERNAME, &SYSSCPL) on %TRIM(%QSYSFUNC(DATE(), NLDATE20.)) at %TRIM(%SYSFUNC(TIME(), TIMEAMPM12.))";
PROC GBARLINE DATA=WORK.SORTTempTableSorted
;
BAR var1
/
FRAME LEVELS=25
COUTLINE=BLACK
RAXIS=AXIS1
MAXIS=AXIS2
LEGEND=LEGEND2
;
PLOT / SUMVAR=default
TYPE=MEAN
AXIS=AXIS3
LEGEND=LEGEND1
;
PLOT / SUMVAR=lin_var1
TYPE=MEAN
AXIS=AXIS3
;
PLOT / SUMVAR=sigmoid_var1
TYPE=MEAN
AXIS=AXIS3
;
PLOT / SUMVAR=square_var1
TYPE=MEAN
AXIS=AXIS3
;
PLOT / SUMVAR=cubic_var1
TYPE=MEAN
AXIS=AXIS3
;
PLOT / SUMVAR=log_var1
TYPE=MEAN
AXIS=AXIS3
;
/* -------------------------------------------------------------------
End of task code
------------------------------------------------------------------- */
RUN; QUIT;
%_eg_conditional_dropds(WORK.SORTTempTableSorted);
TITLE; FOOTNOTE;
GOPTIONS RESET = SYMBOL;
My question is:
Can I somehow store or save the input to create this histogram?
I.e. a table that contains the mean value for default,
var1, square_var1, cubic_var1 for the 25 equally spaced bins?
The premise of doing this is that all the inputs are on different scales and so I'd like to standardise the inputs and then plot the graphs
Note: I can take the time to code up the binning myself but this would truly be a trick of a lazy programmer!
There is no option on the GBARLINE procedure for outputting the plotting parameters it computes. Your default graphics options probably create a PNG image for an HTML page that is used to present the chart for viewing.
Change the graphics device to SVG and ODS will create HTML source that contains the drawing instructions for the image. The instructions will be in the <g> tag. So, if you are truly motivated to be lazy and not hand-code the midpoints and axis values, you can write code to parse the HTML and scrape the computed midpoints and axis ticks from within the <g> tag.
ods html5 file="c:\temp\gbarline.html";
goptions reset=all;
goptions device=svg;
… gbarline …
ods html5 close;
… parse the ODS created c:\temp\gbarline.html …
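If scraping the SVG turns out to be more work than it saves, the binning mentioned in the question can be coded directly. A minimal sketch, assuming the input data set is called have (a hypothetical name) and that 25 equal-width bins over the range of var1 are wanted; the midpoints GBARLINE picks for LEVELS=25 may not coincide exactly with these bins.

```sas
/* Assumption: the input data set is called have (hypothetical name). */
proc sql;
  create table binned as
  select h.*,
         /* equal-width bins 1..25; the maximum of var1 is folded into bin 25 */
         min(floor((h.var1 - s.lo) / ((s.hi - s.lo) / 25)) + 1, 25) as bin
    from have as h,
         (select min(var1) as lo, max(var1) as hi from have) as s
    where h.var1 is not missing;
quit;

/* One row per bin holding the mean of each analysis variable. */
proc means data=binned noprint nway;
  class bin;
  var default var1 log_var1 square_var1 cubic_var1;
  output out=bin_means mean=;
run;
```

The resulting bin_means table can then be standardised and plotted with whatever procedure is convenient. The sketch assumes var1 takes more than one distinct value, otherwise the bin width is zero.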
I have a netcdf file containing 4-D variables:
variables:
double maxvegetfrac(time_counter, veget, lat, lon) ;
maxvegetfrac:_FillValue = 1.00000002004088e+20 ;
maxvegetfrac:history = "From Topo.115MaCTRL_WAM_360_180" ;
maxvegetfrac:long_name = "Vegetation types" ;
maxvegetfrac:missing_value = 1.e+20f ;
maxvegetfrac:name = "maxvegetfrac" ;
maxvegetfrac:units = "-" ;
double mask_veget(time_counter, veget, lat, lon) ;
mask_veget:missing_value = -1.e+34 ;
mask_veget:_FillValue = -1.e+34 ;
mask_veget:long_name = "IF MYVEG4 EQ 10 AND I GE 610 AND J GT 286 THEN 16 ELSE MYVEG4" ;
mask_veget:history = "From desert_115Ma_3" ;
I'd like to use the variable "mask_veget" as a mask to alter values of the variable "maxvegetfrac" over specific regions, and over chosen values of its "veget" dimension.
To do so I am using ncap2. For example, if I want to set maxvegetfrac values over the 5th rank of veget dimension to 500 where mask_veget equals 6, I do :
> ncap2 -s "where (mask_veget(:,:,:,:)== 6) maxvegetfrac(:,5,:,:) = 500" test.nc
My problem is that in the resulting test.nc file, maxvegetfrac has been modified at the first rank of "veget" dimension, not the 5th one. And I get the same result if I run the script over the entire veget dimension:
ncap2 -s "where (mask_veget(:,:,:,:)== 6) maxvegetfrac(:,:,:,:) = 500" test.nc
So I am making a mistake somewhere, but where?
Any help appreciated !
A couple of things you may not be aware of:
You shouldn't hyperslab a variable in the body of a where statement; ncap2 cannot make sense of that at the moment.
It is OK to hyperslab in the where condition, provided it uses a single index, since a dimension with a single value collapses.
Try this:
/*** hyper.nco *****/
maxvegetfrac5=maxvegetfrac(:,5,:,:);
where( mask_veget(:,5,:,:)== 6 )
maxvegetfrac5=500.0;
/* put the hyperslab back in */
maxvegetfrac(:,5,:,:)=maxvegetfrac5;
/* script end *****/
run the script now with the command
ncap2 -v -O -S hyper.nco test.nc out.nc
...Henry
Hi, I have a question about making a spaghetti plot. I don't want each subject to have different symbols or colors. I just need them to each have a black segmented line. I have been able to do it successfully with fewer subjects, by just creating the same symbol statement for everyone and using gplot, but when I do it with more than 255 subjects, SAS complains that I can't have more than 255 symbols. Is there a way to do this?
data _null_;
set ptdata&trtn. end=eof;
retain patcount 0;
by usubjid;
if first.usubjid then patcount+1;
if last.usubjid then lastgfr='Y';
call symput('sym'||trim(left(patcount)),
'symbol'||trim(left(patcount))
|| ' '|| 'c=black'|| ' '||'v=Dot'||' '
|| 'i=join'|| ' ' || 'line=1' || ' ' || 'width=1' ||';');
if eof then call symput('total',patcount);
run;
%macro symbol;
%do j=1 %to &total;
&&sym&j
%end;
%mend symbol;
%symbol
proc gplot data = ptdata&trtn. ;
plot change_since_bl*FUPTIME=usubjid /haxis=axis3 vaxis=axis4 href=0 nolegend;
format change_since_bl 8. ;
run ;
I would use PROC SGPLOT; it is not limited to 255 symbols like GPLOT, and it is easier to use.
Try this:
data test;
do person=1 to 256;
value = 100;
do time=0 to 10;
value = value + rannor(1);
output;
end;
end;
run;
proc sgplot data=test noautolegend;
series x=time y=value / group=person lineattrs=(color=black pattern=dash) ;
run;
I think this is what you are looking for.
I need to draw a histogram to make comparison between two series. I have the following code, but the proc gchart is not working.
data test;
input date $ irate ppi savings income cpi;
datalines;
JUN1990 8.43 114.3 2.412 83.83 129.9
JUL1990 8.76 114.5 2.473 68.147 130.4
AUG1990 8.94 116.5 4.594 84.205 131.6
SEP1990 8.85 118.4 3.893 84.016 132.7
OCT1990 8.67 120.8 3.816 52.269 133.5
NOV1990 8.51 120.1 5.35 97.008 133.8
DEC1990 8.13 118.7 4.253 81.292 133.8
JAN1991 7.98 119 3.872 57.779 134.6
FEB1991 7.92 117.2 4.249 62.566 134.8
MAR1991 8.09 116.2 6.117 77.929 135
APR1991 8.31 116 3.69 92.044 135.2
MAY1991 8.22 116.5 3.798 59.509 135.6
JUN1991 8.02 116.3 1.812 59.549 136
JUL1991 7.68 116 2.951 49.197 136.2
;
run;
proc reg data=test;
model irate = ppi savings income cpi /p;
output out=b p=py;
run;
quit;
axis1 minor=none major=(h=1) label=none
order=(0 to 120000 by 10000) ;
axis2 major=(height=1) value=none
label=none offset=(5, 5)pct ;
axis3 label=none nobrackets ;
axis4 minor=none major=(h=1) label=none
order=(0 to 120000 by 60000) ;
axis5 minor=none major=(h=1) label=none
order=(0 to 120000 by 20000) ;
axis6 minor=none major=(h=1) label=none
order=(0 to 119000 by 17000) ;
pattern1 c=ligr ;
pattern2 c=gray ;
proc gchart data=test ;
title 'Too Many' ;
vbar group /
sumvar=value2 group=date
noframe nolegend
subgroup=group
raxis=axis1 maxis=axis2 gaxis=axis3
width=12 space=0 gspace=4
coutline=same ;
format date monname3. value2 comma10.0;
run ;
title 'Odd Tick Mark Intervals' ;
vbar group /
sumvar=value2 group=date
subgroup=group
noframe nolegend
raxis=axis6 maxis=axis2 gaxis=axis3
width=12 space=0 gspace=4
coutline=same ;
format date monname3. value2 comma10.0;
run ;
quit ;
I want to make the final graph like this:
Can someone help me to change the proc gchart code or you can use your own method to do this?
As someone else mentioned, your test data does not contain the variables GROUP and VALUE2 that you are trying to use in your PROC GCHART. I think that to match your example you will need to separate the date into month and year in order to chart year in side-by-side bars. Below is some GCHART code that creates a histogram similar to your example. You will need to change the response variable to whatever you are trying to chart.
Hope this helps.
*** CREATE MONTH AND YEAR AS SEPARATE VARIABLES ***;
data test_fix;
set test;
*** FIRST CONVERT DATE FROM CHARACTER STRING TO NUMERIC SAS DATE VARIABLE ***;
date_sas=input(date, ANYDTDTE.);
*** USE SAS DATE VARIABLE TO GET MONTH AND YEAR AS NUMERIC VARIABLES ***;
month=month(date_sas);
year=year(date_sas);
run;
proc print data=test_fix;
format date_sas mmddyy10.;
run;
axis1 label=('MONTH') offset=(5,5);
axis2 label=none value=none;
axis3 label=(a=90 'PPI') ;
pattern1 v=solid color=greyc0; *** LIGHT GREY ***;
pattern2 v=solid color=grey40; *** DARK GREY ***;
proc gchart data=test_fix;
vbar year /
type=sum sumvar=ppi
group=month subgroup=year
discrete
space=0
gaxis=axis1 /* GROUP AXIS (X-AXIS) - MONTH */
maxis=axis2 /* MID POINT AXIS (X-AXIS) - YEAR */
raxis=axis3 /* RESPONSE AXIS (Y-AXIS) - PPI */
;
run;
quit;
I'm looking for an elegant solution to the below issue that will help avoid code duplication. You can see that this line:
put auction_id= potential_buyer= ;* THIS GETS REPEATED;
Gets repeated in this code:
data results;
attrib potential_buyer length=$1;
set auction;
if _n_ eq 1 then do;
declare hash ht1(dataset:'buyers', multidata: 'y');
ht1.definekey('auction_id');
ht1.definedata('potential_buyer');
ht1.definedone();
call missing (potential_buyer);
end;
**
** LOOP THROUGH EACH POTENTIAL BUYER AND PROCESS THEM
*;
if ht1.find() eq 0 then do;
put auction_id= potential_buyer= ;* THIS GETS REPEATED;
ht1.has_next(result: ht1_has_more);
do while(ht1_has_more);
rc = ht1.find_next();
put auction_id= potential_buyer= ;* THIS GETS REPEATED;
ht1.has_next(result: ht1_has_more);
end;
end;
run;
I've simplified the above example to a single line as the real code block is quite long and complex. I'd like to avoid using a %macro snippet or a %include if possible as I'd like to keep the logic "within" the data step.
Here's some sample data:
data auction;
input auction_id;
datalines;
111
222
333
;
run;
data buyers;
input auction_id potential_buyer $;
datalines;
111 a
111 c
222 a
222 b
222 c
333 d
;
run;
I figured it out. It turned out to be pretty simple in the end; I just had a little trouble wrapping my brain around it:
data results;
attrib potential_buyer length=$1;
set auction;
if _n_ eq 1 then do;
declare hash ht1(dataset:'buyers', multidata: 'y');
ht1.definekey('auction_id');
ht1.definedata('potential_buyer');
ht1.definedone();
call missing (potential_buyer);
end;
**
** LOOP THROUGH EACH POTENTIAL BUYER AND PROCESS THEM
*;
if ht1.find() eq 0 then do;
keep_processing = 1;
do while(keep_processing);
put auction_id= potential_buyer= ;* THIS DOESNT GET REPEATED ANYMORE =);
ht1.has_next(result: keep_processing);
rc = ht1.find_next();
end;
end;
run;
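A variant of the same loop, offered as a sketch rather than a tested replacement: it drives the loop with the return code instead of has_next, relying on find() and find_next() returning a non-zero code once there are no more matching items. It uses the same auction and buyers data sets as above.

```sas
data results;
  attrib potential_buyer length=$1;
  set auction;
  if _n_ eq 1 then do;
    declare hash ht1(dataset:'buyers', multidata: 'y');
    ht1.definekey('auction_id');
    ht1.definedata('potential_buyer');
    ht1.definedone();
    call missing (potential_buyer);
  end;
  /* find() retrieves the first match for the current auction_id */
  rc = ht1.find();
  do while (rc eq 0);
    put auction_id= potential_buyer= ; /* the processing block appears once */
    /* find_next() retrieves the next match, or returns non-zero at the end */
    rc = ht1.find_next();
  end;
run;
```

This avoids the extra find_next() call that the has_next version makes after the last item, and it needs no separate flag variable.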
You can solve it this way....but Rob's answer is better.
data results;
%Macro NoDuplicate;
Put auction_id= potential_buyer= ; * No Longer Duplicated;
%Mend noduplicate;
attrib potential_buyer length=$1;
set auction;
if _n_ eq 1 then do;
declare hash ht1(dataset:'buyers', multidata: 'y');
ht1.definekey('auction_id');
ht1.definedata('potential_buyer');
ht1.definedone();
call missing (potential_buyer);
end;
**
** LOOP THROUGH EACH POTENTIAL BUYER AND PROCESS THEM
*;
if ht1.find() eq 0 then do;
%NoDuplicate
ht1.has_next(result: ht1_has_more);
do while(ht1_has_more);
rc = ht1.find_next();
%NoDuplicate
ht1.has_next(result: ht1_has_more);
end;
end;
run;