I have TB case data with the following variables: age, case_status, city, year, month, and race from the past 11 years (2006-2017). I would like to make a line graph of cases by month, with the case count on the y axis, month on the x axis, and a separate line for each year. But I am unsure how to count the observations. I am new to SAS, so I apologize in advance if I am doing this wrong, but this is what I had in mind:
PROC SORT DATA= TB
BY YEAR;
RUN;
DATA YEAR;
SET TB;
COUNT + 1;
BY YEAR;
IF FIRST.YEAR THEN COUNT =1;
RUN;
PROC SGPLOT DATA = YEAR;
SERIES X = Month Y = COUNT;
SERIES X = Year Y = COUNT;
TITLE 'Temporal Patterns of TB from 2006-2017';
RUN;
However, I am getting blank output with this code.
Any feedback/help would be greatly appreciated!
Thank you in advance!
This is much easier to answer if you provide sample data. Assuming you want a line chart of counts, I'll use the SASHELP.PRDSALE data set as a demo. I use the WEIGHT statement here to sum the amounts, but you won't need that to count cases.
First run PROC FREQ to get the counts, then use its output data set in SGPLOT to draw the graph. Your SGPLOT code is close.
proc freq data=sashelp.prdsale noprint;
table year*month / out=summary_stats;
weight actual;
run;
proc sgplot data=summary_stats;
series x=month y=count / group=year;
run;
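Applied to the data described in the question, the same two steps might look like the sketch below. It assumes the case data set is named TB, has one row per case, and has numeric YEAR and MONTH variables; TB_COUNTS is a hypothetical name for the output data set.

```sas
/* Count cases per year and month. There is no WEIGHT statement, */
/* so each row (case) counts once. The counts land in COUNT.     */
proc freq data=tb noprint;
  tables year*month / out=tb_counts;
run;

/* One line per year, with the months along the x axis */
proc sgplot data=tb_counts;
  series x=month y=count / group=year;
  xaxis type=discrete;
  title 'Temporal Patterns of TB from 2006-2017';
run;
title;
```

Year and month combinations with no cases do not appear in the output by default. If they should plot as zero, adding the SPARSE option to the TABLES statement is one way to include them.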
There are a variety of GROUP=, BY, or PANELBY arrangements that might communicate the year-to-year variation in counts better.
For example, with simulated data:
data have (label="Disease cases");
do casedate = '01jan2006'd to '31dec2017'd;
year = year(casedate);
month = month(casedate);
do day_n = 1 to 30*ranuni(123);
case_status = floor(4*ranuni(123));
city_id = ceil(100*ranuni(123));
race = ceil(6*ranuni(123));
output;
end;
end;
run;
proc means noprint data=have;
class year month;
types year*month;
output out=counts n=freq;
run;
data counts;
set counts;
yearmo = mdy(month,1,year);
format yearmo yymmd7.;
run;
options orientation=landscape papersize=a4;
ods html close;
ods html;
proc sgplot data=counts;
series x=month y=freq / group=year;
xaxis type=discrete;
where _type_ = 3;
run;
proc sgplot data=counts;
series x=yearmo y=freq / group=year;
xaxis type=discrete display=(novalues);
where _type_ = 3;
run;
proc sgplot data=counts;
by year;
series x=yearmo y=freq / group=year;
xaxis type=discrete ;
where _type_ = 3;
run;
proc sgpanel data=counts;
panelby year;
series x=month y=freq ;
colaxis type=discrete;
where _type_ = 3;
run;
proc sgpanel data=counts;
panelby year / columns=4 ;
series x=month y=freq ;
colaxis type=discrete display=(novalues);
where _type_ = 3;
run;
I have a statistical table (df) organised by year, for example 5 years, and I would like to export it to 5 different Excel workbooks. How can I do this? At first I used SAS and did it with a macro like this:
%let elenco = gen feb mar apr giu;
%macro export;
%local i;
%let i = 0;
%do %until (%scan(&elenco,&i+1) = );
%let i = %eval(&i+1);
%let ele=%scan(&elenco,&i);
data month_&i;
set tot;
where month="&ele";
run;
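proc export data = month_&i
/* Two dots are needed: the first one only ends the macro variable reference */
outfile = "C:\prova_&i..xls"
dbms = excelcs replace;
sheet="month_&i";
run;
%end;
%mend export;
%export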
Thanks in advance
ODS EXCEL does not work well with big data sets, but it could be worth a try.
* Sort by your sheet-variable ;
proc sort data=sashelp.class out=work.class;
by age;
run;
ods excel
file='c:\temp\mult_sheet.xlsx'
options(sheet_interval='output' sheet_name='#byval1');
proc print data=work.class label noobs;
by age;
run;
ods excel close;
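The ODS EXCEL example is meant to put each BY group on its own sheet of a single workbook. If separate workbooks are really required, a macro loop around PROC EXPORT is one alternative. The sketch below is only an illustration under stated assumptions: the macro name EXPORT_BY_AGE, the OUTDIR parameter, and the output file names are hypothetical, and DBMS=XLSX assumes that SAS/ACCESS Interface to PC Files is licensed.

```sas
%macro export_by_age(outdir=c:\temp);
  %local ages i age;

  /* Collect the distinct values of the split variable in one list */
  proc sql noprint;
    select distinct age into :ages separated by ' '
    from sashelp.class;
  quit;

  /* Write one workbook per value */
  %do i = 1 %to %sysfunc(countw(&ages, %str( )));
    %let age = %scan(&ages, &i, %str( ));
    proc export data=sashelp.class(where=(age=&age))
      outfile="&outdir.\class_age_&age..xlsx"
      dbms=xlsx replace;
    run;
  %end;
%mend export_by_age;

%export_by_age()
```

The WHERE= data set option is written for a numeric split variable; a character variable would need its value quoted.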
I am trying to create a plot in SAS that shows the number of lab results individual laboratories submit each week over the course of the year. I have managed to plot this out, but the plot skips weeks in which the laboratory submitted zero lab results, i.e. the count would be zero.
data testlabs;
input labdate:datetime22.3 labname$;
cards;
08JAN2019:09:40:37.000 A
07AUG2019:09:36:16.000 A
08AUG2019:13:16:51.000 B
21APR2019:09:33:54.000 B
22APR2016:12:47:51.000 B
08JUN2019:09:25:50.000 B
09JAN2019:13:48:24.000 A
10JAN2019:12:21:02.000 C
19FEB2019:14:40:39.000 C
09MAR2019:09:38:48.000 C
20NOV2019:09:50:30.000 A
07AUG2019:14:03:55.000 A
09MAR2019:09:31:39.000 B
09JUN2019:12:11:29.000 B
04APR2019:17:00:00.000 B
26NOV2019:13:05:28.000 C
09JUN2019:09:38:50.000 C
06MAY2019:12:44:20.000 C
08MAY2019:10:14:52.000 A
08JUN2019:08:43:17.000 A
02DEC2019:12:26:51.000 A
05MAY2019:12:53:17.000 B
06SEP2019:09:52:36.000 C
10MAR2019:09:31:41.000 A
08MAR2019:09:40:40.000 C
14JUL2019:09:38:59.000 B
08JAN2019:10:40:37.000 A
;
run;
proc sql;
create table testlabs1 as
select distinct count(*) as lab_count,
labname,
put(datepart(labdate),weeku6.)as wk
from testlabs
where year(datepart(labdate))>2018
group by wk, labname
order by labname, wk
;quit;
symbol color=blue interpol=join;
proc gplot data=testlabs1;
plot lab_count*(wk);
by labname;
run;quit;
This creates three plots with points only on weeks with at least one lab. I would like to plot all 52 weeks of the year, including weeks where the count is zero.
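You need a process that can create something from nothing. The COMPLETETYPES option in PROC SUMMARY/MEANS will do that: it outputs every combination of the CLASS variable values found in the data, with a frequency of zero for combinations that have no observations.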
data testlabs;
input labdate:datetime22.3 labname$;
lbdate = datepart(labdate);
format lbdate weeku6.;
cards;
08JAN2019:09:40:37.000 A
07AUG2019:09:36:16.000 A
08AUG2019:13:16:51.000 B
21APR2019:09:33:54.000 B
22APR2016:12:47:51.000 B
08JUN2019:09:25:50.000 B
09JAN2019:13:48:24.000 A
10JAN2019:12:21:02.000 C
19FEB2019:14:40:39.000 C
09MAR2019:09:38:48.000 C
20NOV2019:09:50:30.000 A
07AUG2019:14:03:55.000 A
09MAR2019:09:31:39.000 B
09JUN2019:12:11:29.000 B
04APR2019:17:00:00.000 B
26NOV2019:13:05:28.000 C
09JUN2019:09:38:50.000 C
06MAY2019:12:44:20.000 C
08MAY2019:10:14:52.000 A
08JUN2019:08:43:17.000 A
02DEC2019:12:26:51.000 A
05MAY2019:12:53:17.000 B
06SEP2019:09:52:36.000 C
10MAR2019:09:31:41.000 A
08MAR2019:09:40:40.000 C
14JUL2019:09:38:59.000 B
08JAN2019:10:40:37.000 A
;;;;
run;
proc print;
run;
proc summary data=testlabs completetypes nway;
class labname lbdate / mlf;
output out=testlabs2(drop=_type_ rename=(_freq_=lab_count));
run;
proc print;
run;
You will want to left join your aggregate onto a cross join of all wk x labname combinations.
The cross join supplies every combination, and COALESCE fills in a count of zero wherever the aggregate has no matching row.
Example:
data weeks;
do week = intnx ('week', '01jan2019'd, 0) by 7 while (year(week) <= 2019);
output;
end;
format week weeku6.;
run;
data labs;
do labname = 'A', 'B', 'C', 'D'; output; end;
run;
proc sql;
create table testlabs1 as
select
labs.labname,
year(weeks.week) as year,
weeks.week,
coalesce(aggregate.lab_count,0) as lab_count
from
labs
cross join
weeks
left join
(
select distinct count(*) as lab_count,
labname,
intnx('year', datepart(labdate), 0) as yr format=year4.,
intnx('week', datepart(labdate), 0) as wk format=weeku6.
from testlabs
where year(datepart(labdate))>2018
group by yr, wk, labname
) aggregate
on aggregate.labname = labs.labname
& aggregate.wk = weeks.week
order by
year, labname, week
;
quit;
symbol color=blue interpol=join;
proc gplot data=testlabs1;
plot lab_count*(week);
by year labname;
where year = 2019;
run;quit;
I currently have a macro which moves my data set from one branch to another. With each move, the date is updated. However, I am having trouble removing the old date automatically before adding the new date. Any advice?
The code is:
%Macro Merge_Branch(Branch = , Filename = , Library = );
%Let Timestamp = %sysfunc(putn(%sysfunc(date()),yymmddn8.));
%if &Branch. = Latest_Commit %then %do;
Data LC._&Timestamp._&Filename.;
set &Library..&Filename.;
run;
%end;
%else %if &Branch. = Staging %then %do;
%Let Filename = _20180909_Financial_Input;
%Let Filename_temp = %scan(&Filename.,2,'_');
%Let Date_String = %scan(&Filename.,1,'_');
/* this is the section where I get stuck the dash is a subtraction i.e. I want to remove the date and just have the string*/
%Let Filename_Stg = &Filename_temp - &Date_String;
Data Stg._&Timestamp._&Filename_Stg.;
set LC.&Filename.;
run;
%end;
%mend;
Input data can be made like this:
data LC._20180909_Financial_Input;
input var1;
datalines;
1
1
1
1
;
run;
%Merge_Branch(Branch = Staging, Filename = Financial_Input, Library = LC);
/*Note, in this macro the library is not important because it will always move from LC to STG*/
Your new data set name is resolving to this:
Data Stg._20181009_Financial - 20180909;
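That is not a valid SAS name. The macro language treats the dash as plain text, not as a subtraction, and %SCAN with underscore delimiters returns 20180909 as the first word and Financial as the second.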
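One way to drop the old date is to cut the prefix off by position instead of trying to subtract it. This is a minimal sketch, assuming the name always starts with an underscore, an 8-digit date, and another underscore, as in the question's example:

```sas
%let Filename = _20180909_Financial_Input;

/* The underscore, the 8-digit date, and the next underscore take up */
/* 10 characters, so the name without the date starts at position 11 */
%let Filename_Stg = %substr(&Filename., 11);

/* Write the result to the log to check it */
%put &=Filename_Stg;
```

Inside the macro, the existing DATA statement `Data Stg._&Timestamp._&Filename_Stg.;` can then be used unchanged. If the prefix length can vary, the fixed position would need to be replaced by a search for the second underscore.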
I need to do a loop over a list of variables but excluding some of those.
I wanted to add a prefix to each variable except for those.
I wrote a macro:
%macro addprefijo(tabla);
proc contents data = labo2.&tabla.;
title 'before renaming'; run;
proc sql;
select nvar into :num_vars
from dictionary.tables
where libname='LABO2' and memname="&tabla";
%put 'num_vars' &num_vars;
select distinct(name) into :var1-:var%trim(%left(&num_vars))
from dictionary.columns
where libname='LABO2' and memname="&tabla" /*and name not in ('cid', 'COUNTY', 'ESTADO') */;
quit;
proc datasets library=LABO2;
modify &tabla;
rename
%do i=1 %to &num_vars.;
&&var&i = &tabla._&&var&i.
%end;
;
quit;
run;
proc contents data=LABO2.&tabla.;
title' after renaming';
run;
%mend;
%addprefijo(A_CLI);
I tried the commented-out condition, but it crashes; without it, the prefix is added to all the variables. Those 3 variables are not present in every table.
How can I solve it?
Thanks
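The following should work. It uses PROC CONTENTS with OUT= rather than the dictionary tables, and the SQL SEPARATED BY syntax to create a space-separated list of variables rather than individual indexed macro variables. (The original fails with the filter because NVAR still counts the excluded variables, so the loop references macro variables that were never created.)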
%macro addprefijo(tabla);
proc contents
data = labo2.&tabla.
out = _a_contents
noprint;
run;
proc sql noprint;
select NAME
into :vars separated by " "
from _a_contents
where NAME not in ('cid', 'COUNTY', 'ESTADO');
quit;
proc datasets library = labo2;
modify &tabla.;
rename
%do i = 1 %to %sysfunc(countw(&vars., %str( )));
%let var = %scan(&vars., &i., %str( ));
&var. = &tabla._&var.
%end;;
quit;
%mend;
%addprefijo(A_CLI);
Reformulation: I want to overlay the following two graphs:
data mystates;
set maps.states;
myvar = state<30;
run;
pattern1 c=black v=m3n0;
pattern2 c=black v=m3n90;
%let except = (where=(state not in (72,2,15)));
proc gmap map=mystates &except. data=mystates &except.;
id state;
choro myvar;
run;
quit;
And
goptions reset=all;
%let no = 48;
proc gmap map=maps.counties (where =(state=&no.)) data=maps.counties (where =(state=&no.));
id county;
choro county;
run;
quit;
As the granularity is different, I cannot simply use two CHORO statements in PROC GMAP. Note how the order of the two CHORO statements matters: one always overdraws the other.
data mytry;
set mystates &except. maps.county (where =(state=&no.));
run;
pattern1 c=black v=m3n0;
pattern2 c=black v=m3n90;
proc gmap map=mytry data=mytry all;
id state county;
choro myvar;
choro county;
run;
quit;
How can I display both the lines and the colors at the same time?