I am looking to make a function where the user can enter their own model selection function as an input to be used. I'm having trouble with finding the answer as I keep getting search results about how to make a simple R function, as opposed to an input much like the apply family.
Here is an example similar to what I am looking for but not quite:
simple<- function(mod, FUN){
switch(FUN,
AIC = AIC(mod),
BIC = BIC(mod))
}
simple(lm(rnorm(100) ~ rnorm(100,4)), "AIC")
The above code runs but I must plan for all of the possible functions and write them within switch. I also am forced to write "AIC" as opposed to simply AIC.
Any thoughts to how I can create the function I am looking for? Let me know if you need additional information.
Something like this:
simple<- function(mod, FUN){
FUN <- match.fun(FUN)
FUN(mod)
}
simple(lm(rnorm(100) ~ rnorm(100,4)),FUN = "BIC")
match.fun accepts a function, symbol or character, so there is some flexibility in how the FUN argument is passed.
An option for passing multiple functions, as mentioned in the comments:
simple <- function(mod, FUN){
FUNS <- lapply(FUN,match.fun)
lapply(FUNS,function(fun) fun(mod))
}
Related
I'm running some Surv() functions, and one thing I do not like, or understand, is why this function does not take a "data=" argument. This is annoying because I want to perform the same Surv() function on the same data frame but filtered by different criteria each time.
So for example, my data frame is called "ikt" and I want to filter by "donor_type2=='LD'" and also use a strata variable "plan 2". I tried the following but it didn't work:
library(survival)
library(dplyr)
ikt<-data.frame(organ_yrs=(seq(1,20)),
organ_status=rep(c(0,0,1,1),each=5),
plan2=rep(c('A','B','A','B'),each=5),
donor_type2=rep(c('LD','DD'),each=10) )
organ_surv_func<-function(data,criteria,strata) {
data2<-filter(data,criteria)
Surv(data2$organ_yrs,data2$organ_status)~data2$strata
}
organ_surv_func(ikt,donor_type2=='LD',plan2)
Error in filter_impl(.data, quo) : object 'donor_type2' not found
I'm coming from a SAS background so that's probably why I'm thinking this should work and it doesn't...
I looked up something about sapply(), but I don't think that works when the function doesn't have the data= option.
Also the reason I need the Surv() object and not just survfit(Surv()) (which would let me use data=) is because I'm also using survdiff() for log-rank tests, which takes in the Surv() object as it's main argument:
lr<-function (surv) {
round(1-pchisq(survdiff(surv)$chisq,length(survfit(surv)$strata)-1),3)
}
Thanks for any help you can provide.
I'm writing this "answer" to caution you against proceeding down the path you seem to be following. The Surv function is really intended to be used as the LHS of a formula defined within one of the survival package functions. You should avoid using constructions like:
Surv(data2$organ_yrs,data2$organ_status)~data2$strata
For one thing it's needlessly verbose, but more importantly, it will prevent the use of predict when it comes time to match up names to formals. The survdiff and the other survival functions all have both a "data" argument as well as a "subset" argument. The subset function should allow you to avoid using filter.
organ_surv_func<-function(data, covar) {
form = as.formula(substitute( Surv(organ_yrs, organ_status) ~ covar, list(covar=covar) ) )
survdiff(form, data=data)
}
# although I think running surdiff in a for-loop might be easier,
# as it would involve fewer tricky language constructs
organ_surv_func( subset(ikt, (donor_type2=='LD')), covar=quote(plan2))
If you assign the output of survfit to a named variable, you will be able to more economically access chisq and strata:
myfit <- organ_surv_func( subset(ikt, (donor_type2=='LD')), covar=quote(plan2))
my.lr.test<-function (myfit) {
round(1-pchisq(myfit$chisq, length(myfit$strata)-1), 3)
}
my.lr.test(myfit) # not going to be useful with that dataset.
Can anyone help me understand what a wrapper function in r is? I would really appreciate if you could explain it with the help of examples on building one's own wrapper function and when to use one.
Thanks in advance.
Say I want to use mean() but I want to set some default arguments and my usecase doesn't allow me to add additional arguments when I'm actually calling mean().
I could create a wrapper function:
mean_noNA <- function(x) {
return(mean(x, na.rm = T))
}
mean_noNA is a wrapper for mean() where we have set na.rm to TRUE.
Now we could use mean_noNA(x) the same as mean(x, na.rm = T).
Wrapper functions occur in any programming language, and they just mean that you are "wrapping" one function inside another function that alters how it works in some useful way. When we refer to a "wrapper" function we mean a function that the main purpose of the function is to call some internal function; there may be some alteration or additional computation in the wrapper, but this is sufficiently minor that the original function constitutes the bulk of the computation.
As an example, consider the following wrapper function for the log function in R. One of the drawbacks of the original function is that it does not work properly for negative numeric inputs (it gives NaN with a warning message). We can remedy this by creating a "wrapper" function that turns it into the complex logarithm:
Log <- function(x, base = exp(1)) {
LOG <- base::log(as.complex(x), base = base)
if (all(Im(LOG) == 0)) { LOG <- Re(LOG) }
LOG }
The function Log is a "wrapper" for log that adjusts it so that it will now accept numeric or complex inputs, including negative numeric inputs. In the event that it receives a non-negative numeric or a complex input it gives the same output the original log function. However, if it is given a negative numeric input it gives the complex output that should be returned by the complex logarithm.
Good evening,
I have a quick question about mle2() syntax. I have an optim() routine that optimizes a log-Likelihood function of the following form (and this runs thousands of times, so i don't want to change much):
ObjFun <- function(p, X, y, ModelFunction, CostFunction)
where p is a vector of 1-8 parameters, X is the input matrix, y is the response/independent variable vector, ModelFunction is a function specifying the shape of a model, and CostFunction specifies the cost/loss function the model likelihood should incorporate during the optimization. The code works fine with optim() or maxLik [maxLik] wit the following code:
maxLik(ObjFun, method="NM", start=c(1,2,3,4,5),
X=conc, y=y, ModelFunction=Model1, CostFunction=GCost)
constrOptim(init.par, ObjFun, ui = Ui, ci = Ci, method = "Nelder-Mead",
control = control1, X=X, y=y, ModelFunction= get(Model1),
CostFunction= get(GCost))
##i'm obviously using constrained optimization in my actual problem.
But I can't get it to work easily with mle() or mle2(). I just want to run it in mle2 to compare the likelihood profile with my own profiling function. Before i go digging through the mle2() code, does anyone know if it's my unnamed parameter vector or the additional arguments that make the function crash? I thought it was the former, but i am confused because the error it's giving me is:
mle2(ObjFun, method="Nelder-Mead"", start=c(1,2,3,4,5),
X=X, y=y, ModelFunction=Model1, CostFunction=GCost)
"minuslogl() : argument "ModelFunction" is missing, with no default"
and that argument is clearly specified. I couldn't really find any examples with additional parameters in the vignettes.
PS:
I would have just commented on this post as it's obviously related:
Creating function arguments from a named list (with an application to stats4::mle)
But I don't have enough points to comment.
UPDATE:
mle2() has options vecpar and parnames options that should allow one to specify "for compatibility with Optim", according to Ben's vignette. I simplified the objective function (the log-likelihood) to include hard-coded loss and model examples. The result looks like this:
mod2 <- mle2(ObjFun2, method="Nelder-Mead", start=list(1,2,3,4,5),
vecpar=T, parnames=c("A", "B", "C", "D", "E"))
However this still doesn't work. I have a hard time troubleshooting it because i don't know how to refer to the parameters inside the objective function after the call from mle2. If i include debugging commands such as print(p[2]) inside the ObjFun2, it returns NULL. So the parameter is no longer in a vector form. However print(A) forces the function to crush. Again, I can't find any working examples of this online, so maybe I'm missing something very simple.
I can't use the parameters argument as Ben suggested in the above link because my models are not linear.
I attempted to look inside the mle2() but got stuck on a call to calc_mle2_function().
As in other functional languages, returning a function is a common case in R. for example, after training a model you'd like to return a "predictor" object, which is essentially a function, that given new data, returns predictions. There are other cases when this is useful, of course.
My question is when does the binding (e.g. evaluation) of values within the returned function occur.
As a simple example, suppose I want to have a list of three functions, each is slightly different based on a parameter whose value I set at the time of the creation of the function. Here is a simple code for this:
function.list = list()
for (i in 1:3) function.list[[i]] = function(x) x+i
So now I have three functions. Ideally, the first one returns x+1, the second computes x+2 and the third computes x+3
so I would expect:
function.list[[1]] (3) = 4
function.list[[2]] (3) = 5
etc.
Unfortunately, this doesn't happen and all the functions in the list above compute the same x+3. my question is why? why does the binding of the value of i is so late, and hence the same for all the functions in the list? How can I work around this?
EDIT:
rawr's link to a similar question was insightful, and I thought it solved the problem. Here is the link:
Explain a lazy evaluation quirk
however, I checked the code I gave above, with the fix suggested there, and it still doesn't work. Certainly, I miss something very basic here. can anyone tell me what is it? here is the "fixed" code (that still doesn't work)
function.list = list()
for (i in 1:3) { force(i); function.list[[i]] = function(x) x+i}
Still function.list[[1]] (3) gives 6 and not 4 as expected.
I also tried the following (e.g. putting the force() inside the function)
function.list = list()
for (i in 1:3) function.list[[i]] = function(x) {force(i);x+i}
what's going on?
Here's a solution with a for loop, using R 3.1:
> makeadd=function(i){force(i);function(x){x+i}}
> for (i in 1:3) { function.list[[i]] = makeadd(i)}
> rm(i) # not necessary but causes errors if we're actually using that `i`
> function.list[[1]](1)
[1] 2
> function.list[[2]](1)
[1] 3
The makeadd function creates the adding function in a context with a local i, which is why this works. It would be interesting to know if this works without the force in R 3.2. I always use the force, Luke....
I've been trying to learn more about environments in R. Through reading, it seemed that I should be able to use functions like with() and transform() to modify variables in a data.frame as if I was operating within that object's environment. So, I thought the following might work:
X <- expand.grid(
Cond=c("baseline","perceptual","semantic"),
Age=c("child","adult"),
Gender=c("male","female")
)
Z <- transform(X,
contrasts(Cond) <- cbind(c(1,0,-1)/2, c(1,-2,1))/4,
contrasts(Age) <- cbind(c(-1,1)/2),
contrasts(Gender) <- cbind(c(-1,1)/2)
)
str(Z)
contrasts(Z$Cond)
But it does not. I was hoping someone could explain why. Of course, I understand that contrasts(X$Cond) <- ... would work, but I'm curious about why this does not.
In fact, this does not work either [EDIT: false, this does work. I tried this quickly before posting originally and did something wrong]:
attach(X)
contrasts(Cond) <- cbind(c(1,0,-1)/2, c(1,-2,1))/4
contrasts(Age) <- cbind(c(-1,1)/2)
contrasts(Gender) <- cbind(c(-1,1)/2)
detach(X)
I apologize if this is a "RTFM" sort of thing... it's not that I haven't looked. I just don't understand. Thank you!
[EDIT: Thank you joran---within() instead of with() or transform() does the trick! The following syntax worked.]
Z <- within(X, {
contrasts(Cond) <- ...
contrasts(Age) <- ...
contrasts(Gender) <- ...
}
)
transform is definitely the wrong tool, I think. And you don't want with, you probably want within, in order to return the entire object:
X <- within(X,{contrasts(Cond) <- cbind(c(1,0,-1)/2, c(1,-2,1))/4
contrasts(Age) <- cbind(c(-1,1)/2)
contrasts(Gender) <- cbind(c(-1,1)/2)})
The only tricky part here is to remember the curly braces to enclose multiple lines in a single expression.
Your last example, using attach, works just fine for me.
transform is only set up to evaluate expressions of the form tag = value, and because of the way it evaluates those expressions, it isn't really set up to modify attributes of a column. It is more intended for direct modifications to the columns themselves. (Scaling, taking the log, etc.)
The difference between with and within is nicely summed up by the Value section of ?within:
Value For with, the value of the evaluated expr. For within, the modified object.
So with only returns the result of the expression. within is for modifying an object and returning the whole thing.
While I agree with #Jornan that within is the best strategy here, I will point out it is possible to use transform you just need to do so in a different way
Z <- transform(X,
Cond = `contrasts<-`(Cond, value=cbind(c(1,0,-1)/2, c(1,-2,1))/4),
Age = `contrasts<-`(Age, value=cbind(c(-1,1)/2)),
Gender= `contrasts<-`(Gender, value=cbind(c(-1,1)/2))
)
Here we are explicitly calling the magic function that is used when you run contrasts(a)=b. This actually returns a value that can be used with the a=b format that transform expects. And of course it leaves X unchanged.
The within solution looks much cleaner of course.