I am currently working on converting a SAS script to R. As I am relatively new to SAS, I am having a hard time understanding the following statement -
%let VARS=date id sales units;
/* create lag event variable names to be used in the RETAIN statement */
%let vars_l = lag_%sysfunc(tranwrd(&vars,%str( ),%str( lag_)));
Here, date, id, etc. are all variables present in my current data set. I understand that the tranwrd function replaces one value with another in a character variable. In this case, it creates new items as follows:
vars_l = lag_date lag_id lag_sales lag_units
Am I right? What is vars_l? Is it a list? Or are these variables that are added to my dataset?
Also, what is the purpose of the lag_ before %sysfunc in the code below?
%let vars_l = lag_%sysfunc(tranwrd(&vars,%str( ),%str( lag_)));
Are lagged variables actually created, or does this only create empty variables whose names are prefixed with lag_?
I don't have access to SAS or the datasets to try and check the result. Any help on this would be great. Thanks!
The following code is basically creating macro variables to hold a list of variables to process. (Note: macros in SAS are mere text replacements!)
%let VARS=date id sales units ;
/* create lag event variable names to be used in the RETAIN statement */
%let vars_l = lag_%sysfunc(tranwrd(&vars,%str( ),%str( lag_)));
If you run the following code (to see exactly what is stored in the VARS and vars_l macro variables), you see the following in the log:
31 %put VARS::&VARS.;
SYMBOLGEN: Macro variable VARS resolves to date id sales units
VARS::date id sales units
32 %put VARS_l::&VARS_L.;
SYMBOLGEN: Macro variable VARS_L resolves to lag_date lag_id lag_sales lag_units
VARS_l::lag_date lag_id lag_sales lag_units
In R the equivalent would be the following:
VARS<-c("date", "id", "sales", "units" )
vars_l<-paste("lag_",VARS, sep="")
The second macro assignment, vars_l, just adds lag_ to the beginning of each space-delimited value in the VARS macro variable. Since the first value does not begin with a space, if you omit the lag_ at the beginning of %let vars_l = lag_%sysfunc(tranwrd(&vars,%str( ),%str( lag_))); you will get the following stored in vars_l: date lag_id lag_sales lag_units
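The substitution described above can be mimicked outside SAS. The following is a minimal Python sketch, used purely for illustration; the names vars_text and replaced are made up for this example and are not part of the SAS code.

```python
# Text held by the VARS macro variable
vars_text = "date id sales units"

# tranwrd(&vars, ' ', ' lag_') swaps every space for " lag_",
# so the first name is left without a prefix
replaced = vars_text.replace(" ", " lag_")

# The literal lag_ written before %sysfunc supplies the missing first prefix
vars_l = "lag_" + replaced

print(replaced)  # date lag_id lag_sales lag_units
print(vars_l)    # lag_date lag_id lag_sales lag_units
```

As in SAS, the result is only a piece of text; no variables or lagged values exist until some later step uses these names.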
From the code I can see there are no variables created just yet, but you should find a data step later on which does that. The mention of RETAIN statement in the comments suggests just that.
All,
I am trying to nest a macro within a macro but am unsuccessful. The Start_Cycle variable is set every few months and updated manually. I want to create a start_point variable that goes back 6 months and I successfully created it, however, the output includes a space after the %STR as seen below
%let start_cycle = '01JUL2022:00:00:00'dt; /* set to beginning month of this cycle */
%let start_point = %STR(%')%sysfunc(intnx(DTMONTH,&start_cycle.,-6,b),datetime19.)%STR(%')dt;
%put &start_point;
Output below
%let start_cycle = '01JUL2022:00:00:00'dt; /* set to beginning month of this cycle */
%let start_point =
%STR(%')%sysfunc(intnx(DTMONTH,&start_cycle.,-6,b),datetime19.)%STR(%')dt;
%put &start_point;
' 01JAN2022:00:00:00'dt
^^Does anyone know why there is a space after the single quote? ' 01JAN2022:00:00:00'dt
Since it runs without issues, I decided to create another macro variable that does the same thing, but instead, the output needs to be converted to a character string in this format below (current Macro)
%let start_pointSales = '2022/01';
I have tried many different approaches and spent hours looking through forums, from SAS Communities to StackOverflow, and even SAS YouTube videos, with no luck. Has anyone managed to solve this?
To-Be Macro:
%let NEW_start_pointSales = %sysfunc(intnx(month,&start_cycle.,-6,b),yymms.);
%put &NEW_start_pointSales;
The NEW_start_pointSales will be used in the WHERE clause with Data type Varchar (21) using PROC SQL.
left join EDWSALE.DSCOE_LLD (where=( &NEW_start_pointSales. <= SALES_MONTH < &end_pointSales.
Output Error below:
NOTE: Writing TAGSETS.SASREPORT13(EGSR) Body file: EGSR
24
25 GOPTIONS ACCESSIBLE;
WARNING: An argument to the function INTNX referenced by the %SYSFUNC or %QSYSFUNC macro function is out of range.
NOTE: Mathematical operations could not be performed during %SYSFUNC function execution. The result of the operations have been set
to a missing value.
Any help is appreciated!
You cannot apply a format, YYMMS, designed to work on DAYS to a value that is in SECONDS. If you want to use a date format on a datetime value you need to convert the value from seconds to days. You could use the DATEPART() function or just divide by the number of seconds in a day.
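To make the seconds-versus-days point concrete, here is a small Python sketch of the arithmetic. It assumes the standard SAS conventions (datetime values count seconds since 01JAN1960, date values count days since 01JAN1960); the helper names are invented for this illustration, and the quoting at the end simply mirrors the '2022/01' style of the existing macro variable.

```python
from datetime import datetime, timedelta

SAS_EPOCH = datetime(1960, 1, 1)   # origin of SAS date and datetime values
SECONDS_PER_DAY = 86400

def datetime_seconds_to_date_days(seconds):
    """Same idea as DATEPART(): drop the time of day, keep whole days."""
    return seconds // SECONDS_PER_DAY

def year_month_label(days):
    """Render a day count as YYYY/MM, the layout a YYMMS-style format gives."""
    d = SAS_EPOCH + timedelta(days=days)
    return f"{d.year:04d}/{d.month:02d}"

# 01JAN2022:00:00:00 expressed as seconds since the SAS epoch
seconds = int((datetime(2022, 1, 1) - SAS_EPOCH).total_seconds())

days = datetime_seconds_to_date_days(seconds)
label = year_month_label(days)
quoted = "'" + label + "'"   # quoted so it can be compared with a character column

print(seconds, days, quoted)
```

Feeding the raw seconds value straight into a date format treats it as a count of days, which lands far outside the supported date range; that is consistent with the out-of-range warning in the log.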
Why are you trying to compare a variable that is CHARACTER to numbers? If you generate a macro variable with a string like 2022/01 and then use it to generate code like:
&NEW_start_pointSales. <= SALES_MONTH
that will result in code like:
2022/01 <= SALES_MONTH
which is comparing the number 2022 (any number divided by 1 is itself) to the value SALES_MONTH, which you just said was a character string.
What types of strings does SALES_MONTH contain? That will determine how (or whether) you can make inequality tests against it.
PS: There is a space in the output of DATETIME19. because that is how that particular format works. Note that there is a bug in the DATETIME format: you cannot use DATETIME18. to produce a string with four digits for the year, even though only 18 characters would be used. The extra space does not matter when the string is used in a datetime literal, because the DATETIME informat recognizes the string even with the extra space.
I'm using a SQLite3 database, and I have a table that looks like this:
The database is quite big and running queries is very slow. I'm trying to speed up the process by indexing some of the columns. One of the columns that I want to index is the QUOTE_DATETIME column.
Problem: I want to index by date (YYYY-MM-DD) only, not by date and time (YYYY-MM-DD HH:MM:SS), which is the data I currently have in QUOTE_DATETIME.
Question: How can I use CREATE INDEX to create an index that uses only dates in the format YYYY-MM-DD? Should I split QUOTE_DATETIME into 2 columns: QUOTE_DATE and QUOTE_TIME? If so, how can I do that? Is there an easier solution?
Thanks for helping! :D
Attempt 1: I tried running CREATE INDEX id ON DATA (date(QUOTE_DATETIME)) but I got the error Error: non-deterministic functions prohibited in index expressions.
Attempt 2: I ran ALTER TABLE data ADD COLUMN QUOTE_DATE TEXT to create a new column to hold the date only. And then INSERT INTO data(QUOTE_DATE) SELECT date(QUOTE_DATETIME) FROM data. The date(QUOTE_DATETIME) should convert the date + time to only date, and the INSERT INTO should add the new values to QUOTE_DATE. However, it doesn't work and I don't know why. The new column ends up not having anything added to it.
Expression indexes must not use functions that might change their return value based on data not mentioned in the function call itself. The date() function is such a function because it might use the current time zone setting.
However, in SQLite 3.20 or later, you can use date() in indexes as long as you are not using any time zone modifiers.
INSERT adds new rows. To modify existing rows, use UPDATE:
UPDATE Data SET Quote_Date = date(Quote_DateTime);
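A minimal end-to-end sketch of the separate-column approach, using Python's built-in sqlite3 module with an in-memory database. The table layout, the sample rows, and the index name idx_quote_date are assumptions made for the example; only QUOTE_DATETIME and QUOTE_DATE come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical stand-in for the real table
conn.execute("CREATE TABLE data (quote_datetime TEXT, price REAL)")
conn.executemany(
    "INSERT INTO data (quote_datetime, price) VALUES (?, ?)",
    [
        ("2012-05-18 09:30:00", 10.0),
        ("2012-05-18 16:00:00", 10.5),
        ("2012-05-19 09:30:00", 11.0),
    ],
)

# Add the date-only column, then fill it for the rows that already exist
conn.execute("ALTER TABLE data ADD COLUMN quote_date TEXT")
conn.execute("UPDATE data SET quote_date = date(quote_datetime)")

# Index the plain column; no function appears in the index definition
conn.execute("CREATE INDEX idx_quote_date ON data (quote_date)")
conn.commit()

rows = conn.execute(
    "SELECT quote_date, COUNT(*) FROM data "
    "GROUP BY quote_date ORDER BY quote_date"
).fetchall()
print(rows)
```

Queries that filter on quote_date can then use the index, whereas a filter written as date(quote_datetime) = ... would not match this particular index.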
This is an incredibly simple question. If I have a variable defined by a word such as:
variable1 = 'returns'
This would be something that I could later edit (i.e. change 'returns' to 'revenue').
I then want to execute a particular block of code provided that the variable is 'returns'. Why does R tell me that I cannot use an = sign when I try:
if(variable1 = 'returns'){
#code would go here
}
I want to create a Unix utility that inserts one row into a SAS dataset. When run, this script will ask the user to enter a value for each variable in the dataset (preferably telling them the type and length of the variable). It will then pass these values to SAS using the EXPORT command; SAS will create macro variables for them and insert the values into the dataset using 'proc sql; insert into'.
data raw_str;
/* init PDV */
if 0 then
set tracking_data;
/* programmatic structure to enable addressing of vars */
array a_c(*) _character_;
array a_n(*) _numeric_;
run;
Now raw_str will have variables whose type and length are the same as those of tracking_data.
proc sql noprint;
select distinct name
into : varlist separated by ' '
from dictionary.columns
where libname='WORK'
and memname='RAW_STR'; /* dictionary.columns stores member names in uppercase */
quit;
Then I want to pass this list to Unix. From there I will ask the user to enter values for these variables, and then append those values to tracking_data.
The problem is passing the values from Unix to SAS and creating macro variables for them.
I can also pass the length and type of each variable to the front end, telling the user to enter values that match the type and length in the raw_str dataset.
proc sql;
insert into raw_str
values (&val1, &val2, &val3...);
quit;
Finally, I can use proc append to append it to the original data.
Here's one possible approach for getting user-entered values from UNIX into SAS:
1. Have your UNIX shell script write out the user-entered values into a (consistently formatted) temporary text file, e.g. CSV.
2. Write a SAS data step that can read the text file and import the values into the required formats. You can run proc import and look at the log to get an idea of the sort of code to use.
3. Have the script call SAS via the command line, telling it to run a program containing the data step you wrote.
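The script side of this approach might look roughly like the Python sketch below (a shell script would follow the same pattern). Everything named here is hypothetical: the file name new_row.csv, the program name insert_row.sas, and the sample values. It also assumes a sas executable on the PATH that accepts -sysin to name the program to run, so the actual call is left commented out.

```python
import csv
import os
import tempfile

def write_row_csv(values, path):
    """Write one header line and one data line in a fixed, predictable layout."""
    with open(path, "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(list(values.keys()))
        writer.writerow(list(values.values()))

def build_sas_command(program_path):
    """Command line for running a SAS program in batch mode (assumed install)."""
    return ["sas", "-sysin", program_path]

# In the real utility these would come from prompting the user, e.g. input()
values = {"id": "42", "sales": "19.99", "units": "3"}

csv_path = os.path.join(tempfile.mkdtemp(), "new_row.csv")
write_row_csv(values, csv_path)

command = build_sas_command("insert_row.sas")  # hypothetical SAS program
# import subprocess; subprocess.run(command, check=True)
print(csv_path, command)
```

Keeping the file layout fixed (same column order, always a header line) is what lets the SAS data step read it without guessing.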
So, I have created a variable "batch" with data type DateTime. My OLE DB source has a column "addDate" with values such as 2012-05-18 11:11:17.470, and so does the empty destination that is to be populated.
This addDate column holds many dates, and I want to copy only the rows whose date is "2012-05-18 11:11:17.470".
When I set the variable's value to this date, it automatically changes to mm/dd/yyyy hh:mm AM format, so my Conditional Split transformation cannot match the date against the variable and no records are copied to the destination!
Where exactly is the problem?
Thanks!
I had this issue and the best solution I found is not “pretty”.
Basically you need to change the “expression” of the variable and the “evaluate as expression” to true (otherwise it will ignore the value on expression).
The secret (and the reason I said it is not a pretty solution) is to create a second variable whose expression is built from the first variable, because you can't change the value of a variable based on an expression that refers to itself.
So let’s say your variable is called “DateVariable” and you have 23/05/2012, create a variable called “DateVar2” for example and set its expression to
(DT_WSTR,4)YEAR(@[User::DateVariable]) + "/" + RIGHT("0" +
(DT_WSTR,2)MONTH(@[User::DateVariable]),2) + "/" + RIGHT("0" +
(DT_WSTR,2)DAY(@[User::DateVariable]),2)
That will give you 2012/05/23
Just keep going in the same way to get the date in the format you want.
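For readers unfamiliar with the RIGHT("0" + ..., 2) idiom, the same zero-padding logic is sketched below in Python. This only illustrates how the string is assembled; it is not SSIS code, and the function name is invented for the example.

```python
from datetime import datetime

def format_date_parts(value):
    """Build YYYY/MM/DD piece by piece, as the SSIS expression does."""
    year = str(value.year)
    # Prepend "0" and keep the last two characters: 5 -> "05", 23 -> "23"
    month = ("0" + str(value.month))[-2:]
    day = ("0" + str(value.day))[-2:]
    return year + "/" + month + "/" + day

formatted = format_date_parts(datetime(2012, 5, 23, 11, 11, 17))
print(formatted)  # 2012/05/23
```

Hours, minutes, and seconds can be appended with the same pad-and-slice pattern.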
I found an easier solution. Set the variable's data type to String and enter the desired value.
Before the Conditional Split, you need a Data Conversion transformation.
Convert the column to DT_DBTIMESTAMP, then run the package.
It works!