How to add constraints in PROC MI using SAS

I'd like to use variable A to impute variable B (which has 40 missing values) in SAS using multiple imputation. However, the imputed value of B has to be smaller than variable A. Could someone give me an idea of how to add this constraint in PROC MI? My thought is to use "do while (B < A)", but I don't know where to add it or how.
Many thanks.
My basic code:
proc mi data=test seed=432156 nimpute=5 out=MI
        minimum=0 .
        maximum=40 .   /* range of B; the missing value (.) leaves A unbounded */
        minmaxiter=400;
    mcmc;
    var B A;
run;

Related

Rounding two values together to a "fancy" value

I have a total price T and a customer total price C. I have an amount A. I have a single piece price S.
Let us say my total price T is 11. The amount A, i.e. how many items a package includes, is 6. So for 11, I get a package of 6. The single piece price S would be T/A, 11/6 = 1.83333. 1.83333 is not a fancy value. Here, I want to set S manually to 1.84.
C is the total price I give to customers. So C is the fancy value of T.
Well, with S = 1.84 I get a C of S x A = 1.84 x 6 = 11.04. 11.04 is again an ugly value. Here, I want to set C to 11.10. So S would be 1.85 now.
As a formula: C/A = S, where C should be rounded to a multiple of 0.1 and S to a multiple of 0.01 at the same time.
Is there a way to create a formula to calculate my C and S based on the inputs T and A?
I'll use Python:
from math import ceil
# 'round' is available in Python's default workspace.
# 'ceil' and 'floor' must be imported from the math library.
# Try each of those, to see which one you want.
#
# From the question, I assume you want 'ceil'.
T = 11
A = 6
S_fancy = ceil(T/A*100)/100
print('S_fancy =', S_fancy)
C_fancy = ceil(S_fancy*A*10)/10
print('C_fancy =', C_fancy)
S_even_fancier = ceil(C_fancy/A*100)/100
print('S_even_fancier =', S_even_fancier)
Running that, you'll get:
S_fancy = 1.84
C_fancy = 11.1
S_even_fancier = 1.85
I hope this helps. Again, try among these: round, ceil, and floor.
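Because prices are decimal quantities, binary floating point can occasionally push a value just past a rounding boundary. As a hedged variant of the code above, here is a minimal sketch using Python's decimal module. The function name fancy_prices is made up for this illustration, and always rounding up is an assumption carried over from the answer above.

```python
from decimal import Decimal, ROUND_CEILING

def fancy_prices(total, amount):
    """Return (C, S): C rounded up to a multiple of 0.1, S rounded up to 0.01.

    Hypothetical helper for illustration; rounding up (ceiling) is assumed.
    """
    t = Decimal(str(total))
    a = Decimal(str(amount))
    # First pass: unit price rounded up to whole cents.
    s = (t / a).quantize(Decimal("0.01"), rounding=ROUND_CEILING)
    # Customer total: rounded up to the next multiple of 0.1.
    c = (s * a).quantize(Decimal("0.1"), rounding=ROUND_CEILING)
    # Unit price recomputed from the customer total.
    s = (c / a).quantize(Decimal("0.01"), rounding=ROUND_CEILING)
    return c, s

c, s = fancy_prices(11, 6)
print("C =", c)  # C = 11.1
print("S =", s)  # S = 1.85
```

For T = 11 and A = 6 this gives the same values as the float version. For other inputs the recomputed S times A is not guaranteed to equal C exactly, so check that against your own requirements.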

PROC SQL with GROUP command extremely slow. Why? Workaround possible?

I have a MACRO which takes a data set D and essentially outputs k disjoint datasets, D_1,...,D_k. The value k is not fixed and depends on properties of the data that are not known in advance. We can assume that k is not larger than 10, though.
The dataset D contains the variables x and y, and I want to overlay the line/scatter plots of x and y for each D_i on a single graph. In my particular case x is time, and I want to see the output y for each D_i and compare them to each other.
Hopefully that was clear.
How can I do this? I don't know k in advance, so I need some sort of %do loop. But it doesn't seem that I can put a do loop inside "proc sgplot".
I might be able to make a macro that includes a very long series of commands, but I'm not sure.
How can I overlay these plots in SAS?
EDIT: I am including for reference why I am trying to avoid doing a PROC SGPLOT with the GROUP clause. I tried the following code and it is taking over 30 minutes to compute (I canceled the calculation after this, so I don't know how long it will actually take). PROC SQL runs quite quickly, the program is stuck on PROC SGPLOT.
proc sql;
    create table dataset as select
        date, product_code, sum(num_of_records) as total_rec
    from &filename
    group by product_code, date
    order by product_code, date
    ;
quit;
PROC SGPLOT Data = dataset;
    scatter x = date y = total_rec / group=product_code;
    title "Total records by product code";
run;
The number of observations in the file is 76,000,000.
You should either change your macro to produce a single dataset with a variable d_i (or whatever name makes sense) that identifies which dataset each row would have gone to, or combine the datasets after the macro has run.
Then, you can use group to overlay your plots. So for example:
data my_data;
    call streaminit(7);
    do d_i = 1 to 5;
        y = 10;
        x = 0;
        output;
        do x = 1 to 10;
            y + round(rand('Uniform')*3, .1) - 1.5;
            output;
        end;
    end;
run;
proc sgplot data=my_data;
    series x=x y=y / group=d_i;
run;

NetLogo: the meaning of TO-REPORT explained for dummies?

I have trouble understanding the role of to-report and report in NetLogo, even though they seem pretty useful, and I can't really find any help written in plain, "human style" language.
In the NetLogo dictionary (http://ccl.northwestern.edu/netlogo/docs/dictionary.html#report) I can find the definition for to-report:
to-report procedure-name
to-report procedure-name [input1 ...]
Used to begin a reporter procedure.
The body of the procedure should use report to report a value for the procedure. See report.
and for report:
report value
Immediately exits from the current to-report procedure and reports value as the result of that procedure. report and to-report are always used in conjunction with each other. See to-report for a discussion of how to use them.
So it seems that to-report and report calculate some value and report it.
Thus, when I try to add
to-report average [a b c]
report (a + b + c) / 2
end
to my code and then use average somewhere in my code, e.g.:
to go
...
print average
tick
end
I get the error "AVERAGE expected 3 inputs". When I try to declare my variables with globals [a b c], I get the error "There is already a global variable called A".
If I set my variables a, b and c within the to-report procedure:
to-report average [a b c]
set a 1
set b 2
set c 3
report (a + b + c) / 2
end
My error is again "AVERAGE expected 3 inputs".
So how can I simply test what a to-report procedure does? And where should I place it in my code to see what it is really doing? From Urban Suite - Economic Disparity (http://ccl.northwestern.edu/netlogo/models/UrbanSuite-EconomicDisparity) I see that to-report is used to calculate values related to each patch:
to-report patch-utility-for-poor
report ( ( 1 / (sddist / 100 + 0.1) ) ^ ( 1 - poor-price-priority ) ) * ( ( 1 / price ) ^ ( 1 + poor-price-priority ) )
end
However, this reported value is not directly defined as a patch variable, which increases my confusion...
Thank you !
A function can take some input (usually one or more variables or values) and return some output (usually a single value). In NetLogo, to-report in the procedure header declares that the procedure returns a value, and report returns the actual value.
Your error is due to the fact that you never passed any arguments to your average function:
to go
...
print average
tick
end
should be
to go
...
print average 5 2 3 ;;a = 5, b = 2, c =3
tick
end
Inside your average function, you should not reassign the values of a, b, and c. They are inputs, so their values are supplied by the caller.
You should use report whenever you want to return a result from a function.
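For readers who know other languages, a rough analogy (not NetLogo code) may help: a to-report procedure behaves like an ordinary function with parameters, and report plays the role of return. The Python sketch below is only an illustration of that idea. It divides by 3 to get the arithmetic mean, whereas the code in the question divides by 2.

```python
def average(a, b, c):
    # Rough counterpart of: to-report average [a b c] ... report ... end
    # The caller supplies a, b and c; the function only hands back a result.
    return (a + b + c) / 3

# Calling with three inputs works, like `print average 5 2 3` in NetLogo.
print(average(5, 2, 3))

# Calling with no inputs fails, much like "AVERAGE expected 3 inputs".
try:
    average()
except TypeError as err:
    print("error:", err)
```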

Upon defining a new constant update the names of other constants

I have some logic that tests for changes in values. When a certain threshold is reached, a new constant claims the first spot, which is s0, and the rest are "pushed up", meaning the first becomes the second, the second becomes the third, and so on. Here is an example:
the initial state of my data might look like this:
s3 <- 7
s2 <- 5
s1 <- 4
s0 <- 2
Some test is run and s0 is redefined to a lower value, like s0 = 1. At that point my variables need to be shifted up and a new "level" added, as follows:
s4 <- 7
s3 <- 5
s2 <- 4
s1 <- 2
s0 <- 1
I know how to redefine s0 but I am not sure how to adjust the name of the other constants accordingly. Any help would be greatly appreciated.
You should have all these values in one vector, instead of stored as separate objects.
Initial state:
state <- c(2, 4, 5, 7)
Update state if new_value is less than all previous values:
if (new_value < min(state)) state <- sort(c(state, new_value))
Then you can always reference the current minimum value by state[1].
This is not very efficient and I don't recommend this method. As commented/answered, you should put your variables in the same structure (a list or a vector). I show it only because the solution uses some useful functions for dealing with variables defined in the global environment (switching from separate variables to a list and vice versa).
That said, here I define a function that does the job. It defines a new s0 and shifts the names of the other variables. Internally, the function creates a list (by gathering the variables whose names match a pattern), shifts its names, and writes the result back to the global environment as separate variables.
push <- function(value){
  ## .GlobalEnv is referenced twice here, once for ls and once for mget
  ## not really elegant!
  oo = mget(ls(pattern='s[0-9]+',envir=.GlobalEnv),envir=.GlobalEnv)
  list2env(setNames(c(value,oo),c(names(oo),paste0('s',length(oo)))),
           envir=.GlobalEnv)
}
Then you can define a new s0 like this:
push(1)
You can check the result:
unlist(mget(ls(pattern='s[0-9]+')))
s0 s1 s2 s3 s4
1 2 4 5 7

SAS - select n equally spaced values between a and b

How would you translate the following R command into SAS?
sequence <- seq(from=a, to=b, length.out=n)
In other words, how would you select n equally spaced values between a and b in SAS?
You could easily replicate this in SAS with a DO loop, having previously stored the required values in macro variables. I'm not sure in what context you are using this, but the code below will create a dataset with the required number of rows and equally spaced values. Hopefully this will point you in the right direction.
%let n=5;
%let a=1;
%let b=2;
%let x=%sysevalf((&b.-&a.)/(&n.-1));
%put n = &n.
a = &a.
b = &b.
x = &x.;
data test;
    do i=&a. to &b. by &x.;
        output;
    end;
run;
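One caveat with a fractional BY increment: the step is generally not exactly representable in floating point, so repeated addition may drift and the loop can stop just short of b for some choices of a, b and n. A common way around this is to loop over an integer index and compute each value directly. The sketch below shows that idea in Python rather than SAS; the function name seq is simply borrowed from R for illustration.

```python
def seq(a, b, n):
    """Return n equally spaced values from a to b inclusive.

    Illustrative stand-in for R's seq(from=a, to=b, length.out=n).
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    a, b = float(a), float(b)
    if n == 1:
        return [a]
    step = (b - a) / (n - 1)
    # Compute each value from its integer index instead of accumulating step.
    values = [a + k * step for k in range(n)]
    # Pin the endpoint so the last value is exactly b.
    values[-1] = b
    return values

print(seq(1, 2, 5))  # [1.0, 1.25, 1.5, 1.75, 2.0]
```

The same pattern carries over to a DATA step by looping over an integer counter k from 0 to n-1 and computing a + k*x inside the loop.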
