evaluate one variable in an expression with two variables - r

I want to evaluate f with the mean=7
mean=7
f <- expression(-(x-mean)^2/2)
then get a new expression:
-(x-7)^2/2
How could I do it? Thanks.

Here is one way.
# f[[1]] extracts the call stored inside the expression vector
eval(substitute(substitute(expr, list(mean=7)), list(expr = f[[1]])))
# -(x - 7)^2/2
If that construction feels mind-bending, you don't need to feel alone: even the guys who wrote the R manual call the problem you've posed here "a puzzle".
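A more compact variant of the same trick, as a sketch only: do.call() evaluates its arguments before building the call, so substitute() receives the unevaluated call stored in f rather than the symbol f. The names g and g_expr are illustrative, and the example assumes f holds exactly one expression.

```r
f <- expression(-(x - mean)^2 / 2)

# f[[1]] extracts the call from the expression vector; do.call() hands that
# call to substitute() as its first argument
g <- do.call(substitute, list(f[[1]], list(mean = 7)))

# Wrap the result back into an expression vector if that is the form you need
g_expr <- as.expression(g)

# Evaluate the new expression for a chosen value of x
eval(g_expr[[1]], list(x = 9))
```

After the substitution the only free variable left in g is x.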

How about gsub?
avg <- 7
f <- expression(-(x-avg)^2/2)
# parse() turns the edited text back into an expression instead of a string
f.new <- parse(text = gsub('avg', avg, as.character(f)))
# expression(-(x - 7)^2/2)
On a side note, you should avoid defining variables with names like mean or data, since they are built-in R functions.

In S-Plus, the substitute function has an extra evaluate argument, so there it is rather easy. Unfortunately, R is missing that argument...
# in S-Plus:
x <- expression(-(x-mean)^2/2)
substitute(x, list(mean=7), evaluate=TRUE)
#-(x - 7)^2/2
...so you must resort to something like what @JoshO'Brien suggests. Consider logging this as a feature request with R core ;-)

Related

post-processing in mice, replace one variable with another

I'm trying to perform multiple imputation on a dataset in R where I have two variables, one of which needs to be the same or greater than the other one. I have set up the method and the predictive matrix, but I am having trouble understanding how to configure the post-processing. The manual (or main paper - van Buuren and Groothuis-Oudshoorn, 2011) states (section 3.5): "The mice() function has an argument post that takes a vector of strings of R commands. These commands are parsed and evaluated just after the univariate imputation function returns, and thus provide a way to post-process the imputed values." There are a couple of examples, of which the second one seems most useful:
R> post["gen"] <- "imp[[j]][p$data$age[!r[,j]]<5,i] <- levels(boys$gen)[1]"
this suggests to me that I could do:
R> ini <- mice(cbind(boys), max = 0, print = FALSE)
R> post["A"] <- "imp[[j]][p$data$B[!r[,j]]>p$data$A[!r[,j]],i] <- levels(boys$A)[boys$B]"
However, this doesn't work (when I plot A v B, I get random scatter rather than the points being confined to one half of the graph where A >= B).
I have also tried using the ifdo() function, as suggested in another sx post:
post["A"] <- "ifdo(A < B), B"
However, it seems the ifdo() function is not yet implemented. I tried running the code suggested for inspiration, but I'm afraid my R programming skills are not that brilliant.
So, in summary, has anyone any advice about how to implement post-processing in mice such that value A >= value B in the final imputed datasets?
Ok, so I've found an answer to my own question - but maybe this isn't the best way to do it.
In FIMD, there is a suggestion to do this kind of thing outside the imputation process, which thus gives:
R> long <- mice::complete(imp, "long", include = TRUE)
R> long$A <- with(long, ifelse(A < B, B, A))
This seems to work, so I'm happy.
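As a rough sketch of that post-hoc fix, the snippet below uses a small made-up data frame in place of the real output of mice::complete(imp, "long"); the column names A, B and .imp and all values are assumptions for illustration. pmax() is used here as an equivalent, shorter way to raise A to B wherever A falls below it.

```r
# Toy stand-in for the long-format output of mice::complete(imp, "long")
long <- data.frame(
  .imp = c(1, 1, 2, 2),
  A    = c(3, 1, 5, 2),
  B    = c(2, 4, 5, 6)
)

# pmax() takes the element-wise maximum, so A is raised to B wherever A < B
long$A <- pmax(long$A, long$B)

# Check the constraint on every row
stopifnot(all(long$A >= long$B))
```

With real data, mice's as.mids() can turn such a long data frame back into a mids object for pooling, provided the original data were included when calling complete().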

Function to rename values in r doesn't work [duplicate]

How do I modify an argument being passed to a function in R? In C++ this would be pass by reference.
g=4
abc <- function(x) {x<-5}
abc(g)
I would like g to be set to 5.
There are ways as @Dason showed, but really - you shouldn't!
The whole paradigm of R is to "pass by value". @Rory just posted the normal way to handle it - just return the modified value...
Environments are typically the only objects that can be passed by reference in R.
But lately new objects called reference classes have been added to R (they use environments). They can modify their values (but in a controlled way). You might want to look into using them if you really feel the need...
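To make those two options concrete, here is a minimal sketch. The names state, set_to_five, Box and modify are made up for this example; it is one possible way to get reference-like behaviour, not a recommendation over simply returning the value.

```r
library(methods)  # provides setRefClass(); loaded explicitly to be safe

# 1. Environments are not copied when passed to a function
state <- new.env()
state$g <- 4
set_to_five <- function(env) {
  env$g <- 5        # modifies the caller's environment in place
  invisible(NULL)
}
set_to_five(state)
state$g

# 2. A reference class wraps the same idea in a class definition
Box <- setRefClass("Box",
  fields  = list(value = "numeric"),
  methods = list(
    set = function(v) {
      value <<- v   # inside a method, <<- updates the object's field
      invisible(.self)
    }
  )
)
b <- Box$new(value = 4)
modify <- function(box) box$set(5)
modify(b)
b$value
```

In both cases the function changes the object the caller holds, because only a reference to it is passed.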
There has got to be a better way to do this but...
abc <- function(x){eval(parse(text = paste(substitute(x), "<<- 5")))}
g <- 4
abc(g)
g
gives the output
[1] 5
I have a solution similar to @Dason's, and I am curious if there are good reasons not to use this or if there are important pitfalls I should be aware of:
changeMe = function(x){
  assign(deparse(substitute(x)), "changed", envir=.GlobalEnv)
}
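One pitfall that is easy to demonstrate: the function always writes to the global environment, so it does not do what you might expect when the variable lives inside another function. The sketch below uses a hypothetical wrapper() and local_var to show this.

```r
changeMe <- function(x) {
  assign(deparse(substitute(x)), "changed", envir = .GlobalEnv)
}

wrapper <- function() {
  local_var <- "original"
  changeMe(local_var)   # writes to the global environment, not to this frame
  local_var
}

result <- wrapper()
result                                   # the local variable is untouched
exists("local_var", envir = .GlobalEnv)  # a global of the same name now exists
```

It also only makes sense when the argument is a plain variable name; something like changeMe(lst$a) would create a global variable literally named "lst$a".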
I think that @Dason's method is the only way to do it theoretically, but practically I think R's way already does it.
For example, when you do the following:
y <- c(1,2)
x <- y
x is really just a pointer to the value c(1,2). Similarly, when you do
abc <- function(x) {x <- 5; x}
g <- abc(g)
It is not that you are spending time copying g to the function and then copying the result back into g. I think what R does with the code
g <- abc(g)
is:
The right side is looked at first. An environment for the function abc is set up.
A pointer is created in that environment called x.
x points to the same value that g points to.
Then x points to 5
The function returns the pointer x
g now points to the same value that x pointed to at the time of return.
Thus, it is not that there is a whole bunch of unnecessary copying of large objects.
I hope that someone can confirm/correct this.
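The observable side of this can be checked with a small sketch (all variable names are illustrative). R defers copying until a value is modified, so assignment and argument passing are cheap, while the original object never changes behind your back.

```r
y <- c(1, 2)
x <- y          # no copy yet: both names refer to the same vector
x[1] <- 99      # modifying through x triggers a copy; y is left alone

abc <- function(v) {
  v[1] <- 5     # the argument is copied here, on first modification
  v
}
g <- c(4, 4)
g_new <- abc(g) # g itself is unchanged; the modified copy is returned
```

Assigning the result back with g <- abc(g) then simply rebinds the name g to the returned vector.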
Am I missing something as to why you can't just do this?
g <- abc(g)

R loops+predict()

Hi I am a beginner with R (beginner programmer in general) and the help documents are absolutely killing me.
Suppose I have a matrix
[a,b,c,d]
I complete 2 regression of some kind a~b+c+d. My goal is to do a predict() for the variable "a" in test data set but c is full of NAs. How do I replace the NAs in c using the model I have created?
If it helps this is the kind of loop I would do in Octave,
for i = 1:length(c)
  if isna(c(i))
    c(i) = some_function(b, d);
  end
end
Thanks
It's even easier than Seb suggests.
c[is.na(c)] <- mean(c, na.rm = TRUE)
Here, the mean function returns a single number (namely the mean of all the values in c that weren't NA). The assignment operator <- then assigns this number to every element in c where is.na returns TRUE.
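A tiny worked example of that one-liner, using a made-up vector named c_vals (chosen to avoid masking R's c() function):

```r
c_vals <- c(1, NA, 3, NA, 5)

# Logical index of the missing entries
is_missing <- is.na(c_vals)

# The mean of the observed values (1, 3, 5) is recycled into every NA slot
c_vals[is_missing] <- mean(c_vals, na.rm = TRUE)
c_vals
```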
As an alternative, try passing the argument na.action = na.omit to the predict function.
The direct translation of your Octave script is something like
for(i in seq_along(c))
{
if(is.na(c[i]))
{
c[i] <- some_function(b[i], d[i])
}
}
Note however that in R, just as in Octave, loops are usually inferior to operating directly on vectors.
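If some_function is meant to be a model-based prediction, a vectorised version might look like the sketch below. The data are simulated and the choice of lm(c ~ b + d) is an assumption; substitute whatever model fits your problem.

```r
set.seed(1)

# Simulated data: c depends on b and d, with three values knocked out
dat <- data.frame(b = rnorm(20), d = rnorm(20))
dat$c <- 2 * dat$b - dat$d + rnorm(20, sd = 0.1)
dat$c[c(3, 8, 15)] <- NA

# lm() drops rows with a missing response under the default na.action
fit <- lm(c ~ b + d, data = dat)

# Predict only for the rows where c is missing and fill them in
miss <- is.na(dat$c)
dat$c[miss] <- predict(fit, newdata = dat[miss, ])
```

No explicit loop is needed: the logical index miss selects the rows to fill, and predict() returns one value per selected row.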
Do you mean something like
c <- ifelse(is.na(c), mean(c, na.rm=TRUE), c)
You may want to check the help files ?ifelse and ?is.na.

Creating formulas in R involving an arbitrary number of variables

I'm using the library poLCA. To use the main command of the library one has to create a formula as follows:
f <- cbind(V1,V2,V3)~1
After this a command is invoked:
poLCA(f,data0,...)
V1, V2, V3 are the names of variables in the dataset data0. I'm running a simulation and I need to change the formula several times. Sometimes it has 3 variables, sometimes 4, sometimes more.
If I try something like:
f <- cbind(get(names(data0)[1]),get(names(data0)[2]),get(names(data0)[3]))~1
it works fine. But then I have to know in advance how many variables I will use. I would like to define an arbitrary vector
vars0 <- c(1,5,17,21)
and then create the formula as follows
f<- cbind(get(names(data0)[vars0]))
Unfortunately I get an error. I suspect the answer may involve some form of apply, but I still don't understand very well how these functions work. Thanks in advance for any help.
Using data from the examples in ?poLCA this (possibly hackish) idiom seems to work:
library(poLCA)
data(values)  # example data set shipped with poLCA
vec <- c(1,3,4)
M4 <- poLCA(do.call(cbind,values[,vec])~1,values,nclass = 1)
Edit
As Hadley points out in the comments, we're making this a bit more complicated than we need. In this case values is a data frame, not a matrix, so this:
M1 <- poLCA(values[,c(1,2,4)]~1,values,nclass = 1)
generates an error, but this:
M1 <- poLCA(as.matrix(values[,c(1,2,4)])~1,values,nclass = 1)
works fine. So you can just subset the columns as long as you wrap it in as.matrix.
@DWin mentioned building the formula with paste and as.formula. I thought I'd show you what that would look like using the election dataset.
library("poLCA")
data(election)
vec <- c(1,3,4)
f <- as.formula(paste("cbind(",paste(names(election)[vec],collapse=","),")~1",sep=""))
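For a simulation that changes the variable set repeatedly, the same idea can be wrapped in a small helper. This is only a sketch: make_lca_formula is a made-up name, and data0 here is a toy data frame so that the example runs without poLCA installed.

```r
# Build cbind(<selected columns>) ~ 1 from a vector of column indices
make_lca_formula <- function(data, idx) {
  vars <- names(data)[idx]
  as.formula(
    paste0("cbind(", paste(vars, collapse = ", "), ") ~ 1"),
    env = parent.frame()
  )
}

# Toy data frame standing in for the real data0
data0 <- data.frame(V1 = 1:3, V2 = 1:3, V3 = 1:3, V4 = 1:3)

f <- make_lca_formula(data0, c(1, 3, 4))
f
```

The resulting f can then be passed to poLCA(f, data0, ...) in the usual way, with a different index vector on each simulation run.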
