Error including correlation structure in function with gamm - r

I am trying to create my own function that contains 1.) the mgcv gamm function and 2.) a nested autocorrelation (ARMA) argument. I am getting an error when I try to run the function like this:
df <- AirPassengers
df <- as.data.frame(df)
df$month <- rep(1:12)
df$yr <- rep(1949:1960,each=12)
df$datediff <- 1:nrow(df)
try_fxn1 <- function(dfz, colz){gamm(dfz[[colz]] ~ s(month, bs="cc",k=12)+s(datediff,bs="ts",k=20), data=dfz,correlation = corARMA(form = ~ 1|yr, p=2))}
try_fxn1(df,"x")
Error in eval(predvars, data, env) : object 'dfz' not found
I know the issue is with the correlation portion of the formula, as when I run the same function without the correlation structure included (as seen below), the function behaves as expected.
try_fxn2 <- function(dfz, colz){gamm(dfz[[colz]] ~ s(month, bs="cc",k=12)+ s(datediff,bs="ts",k=20), data=dfz)}
try_fxn2(df,"x")
Any ideas on how I can modify try_fxn1 to make the function behave as expected? Thank you!

You are confusing a vector with the symbolic representation of that vector when building a formula.
You don't want dfz[[colz]] as the response in the formula; you want x, or whatever you set colz to. What you are getting is
dfz[[colz]] ~ ...
when what you really want is the variable colz:
colz ~ ...
And you don't want a literal colz but whatever colz evaluates to. To do this you can create a formula by pasting the parts together:
fml <- paste(colz, '~ s(month, bs="cc", k=12) + s(datediff,bs="ts",k=20)')
This turns colz into whatever it was storing, not the literal colz:
> fml
[1] "x ~ s(month, bs=\"cc\", k=12) + s(datediff,bs=\"ts\",k=20)"
Then convert the string into a formula object using formula() or as.formula().
The final solution then is:
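fit_fun <- function(dfz, colz) {
  # Build the formula as text so the response is the column name itself
  fml <- paste(colz, '~ s(month, bs="cc", k=12) + s(datediff,bs="ts",k=20)')
  fml <- formula(fml)
  # Use the function's own data argument, dfz
  gamm(fml, data = dfz, correlation = corARMA(form = ~ 1|yr, p=2))
}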
This really is not an issue with the corARMA() part, other than that it triggers somewhat different evaluation code for the formula. The guiding mantra here is to always end up with a formula that looks the way you would type it if you were not programming with formulas. You would never (or should never) write a formula like
gamm(df[[var]] ~ x + s(z), ....)
While this might work in some settings, it will fail miserably if you ever want to use predict(), and it fails when you have to do something a little more complicated.
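To see the difference between the two styles, here is a minimal sketch that uses lm() from base R as a stand-in for gamm(), with the built-in mtcars data. The function names fit_indexed and fit_built are made up for illustration; reformulate() is a base R helper that builds a formula from character strings.

```r
# Hypothetical helpers for illustration; lm() stands in for gamm().
fit_indexed <- function(d, col) lm(d[[col]] ~ wt, data = d)
fit_built <- function(d, col) lm(reformulate("wt", response = col), data = d)

bad <- fit_indexed(mtcars, "mpg")
good <- fit_built(mtcars, "mpg")

# The indexed version stores symbols that only exist inside the function.
all.vars(formula(bad))   # "d" "col" "wt"

# The built version stores the formula as you would have typed it.
all.vars(formula(good))  # "mpg" "wt"
```

Both fits give the same coefficients here, but any code that later re-evaluates the stored formula outside the function has to find d and col, which is where the trouble tends to start.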

Related

using lm function in R with a variable name in a loop

I am trying to create a simple linear model in R inside a for loop, where one of the variables is specified as a parameter and looped through, creating a different model on each pass of the loop. The following does NOT work:
model <- lm(test_par[i] ~ weeks, data=all_data_plant)
If I tried the same model with the "test_par[i]" replaced with the variable's explicit name, it works just as expected:
model <- lm(weight_dry ~ weeks, data=all_data_plant)
I tried reformulate and paste ineffectively. Any thoughts?
Maybe try something like this:
n <- #add the column position of first variable
m <- #add the column position of last variable
lm_models <- lapply(n:m, function(x) lm(all_data_plant[,x] ~ weeks, data=all_data_plant))
You can pass the formula argument of lm() as a character string built with paste(). Here is a working example:
data("trees")
test_par <- names(trees)
model <- lm(Girth ~ Height, data = trees)
model <- lm("Girth ~ Height", data = trees) # character formula works
model <- lm(paste(test_par[1], "~ Height"), data=trees)
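If the goal is one model per response column, a sketch along these lines builds each formula from the column name. It uses reformulate() from base R and the built-in trees data; the choice of response columns is only for illustration.

```r
data("trees")

# Response columns to loop over (chosen for illustration only).
responses <- c("Girth", "Volume")

# One model per response; naming the input vector names the resulting list.
models <- lapply(setNames(responses, responses), function(resp) {
  lm(reformulate("Height", response = resp), data = trees)
})

# Each stored formula names the actual column.
sapply(models, function(m) deparse(formula(m)))
```

Because the list is named, an individual fit can be pulled out with models$Volume or models[["Girth"]].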

lme within a user defined function in r

I need to use the mixed model function lme many times in my code, but I do not know how to use it within a function. Used on its own, lme works just fine, but when used within a function it throws errors:
myfunc<- function(cc, x, y, z)
{
model <- lme(fixed = x ~1 , random = ~ 1|y/z,
data=cc,
method="REML")
}
on calling this function:
myfunc (dbcon2, birthweight, sire, dam)
I get the error :
Error in model.frame.default(formula = ~x + y + z, data = list(animal
= c("29601/9C1", : invalid type (list) for variable 'x'
I think, there is a different procedure for using this which I am unaware of. Any help would be greatly appreciated.
Thanks in advance
Not sure if this is what you are looking for, but as correctly pointed out by @akrun, you can use paste. I am using paste0 here, which is a special case of paste that concatenates strings with no separator.
The idea is to concatenate the variable names into the formula text. Since paste returns a character string rather than a formula object, you then convert that string with as.formula, wrapped around the paste0 call.
To understand the above, try building a formula with paste like this:
formula <-paste0("mpg~", paste0("hp","+", "am"))
print(formula)
[1] "mpg~hp+am"
class(formula)
[1] "character" ##This should ideally be a formula rather than character
formula <- as.formula(formula) ##conversion of character string to formula
class(formula)
[1] "formula"
Inside a model call it is safest to supply a formula object. Also, learn about the collapse and sep options of paste; they are very handy.
I don't have your data, so I have used the mtcars data to illustrate.
library("nlme")
myfunc <- function(cc, x, y, z) {
  model <- lme(fixed = as.formula(paste0(x, " ~1")),
               random = as.formula(paste0("~", "1|", y, "/", z)),
               data = cc,
               method = "REML")
}
models <- myfunc(cc=mtcars, x="hp", y="mpg", z="am")
summary(models)
You can read more about paste by typing ?paste in your console.
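As a small illustration of the sep and collapse options mentioned above, the following sketch builds a formula from a vector of predictor names. The variable names come from mtcars and are chosen only for the example.

```r
predictors <- c("hp", "am", "wt")

# collapse joins the elements of ONE vector into a single string.
rhs <- paste(predictors, collapse = " + ")
rhs        # "hp + am + wt"

# sep is placed BETWEEN the separate arguments given to paste().
fml_chr <- paste("mpg", rhs, sep = " ~ ")
fml_chr    # "mpg ~ hp + am + wt"

# Convert the string to a formula object before handing it to a model.
fml <- as.formula(fml_chr)
class(fml) # "formula"
```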

metafor coupled with glmulti: How to exclude undesirable interactions

I am fitting the following:
rma.glmulti.ran <- function (formula, data, random, ...) {
rma.mv(as.formula(paste(deparse(formula))), Variance, random= ~1 | Experiment, data = msa, method="REML", ...)
}
msa_res <- glmulti(MSA ~ MAPl+MAT_e+Duration.yrl+Fert+Naddl+Ndepl,
data=msa,
level=2,
exclude=c("MAPl:MAT_e","MAPl:Duration.yrl","MAPl:Fert",
"MAPl:Naddl","MAPl:Ndepl","MAT_e:Duration.yrl","MAT_e:Fert","MAT_e:depl",
"Duration.yrl:Fert","Duration.yrl:Ndepl","Duration.yrl:Naddl","Fert:Ndepl","Naddl:Ndepl"),
fitfunction=rma.glmulti.ran, crit="aicc")
The aim of this code is to include only these two interactions: "Naddl:MAT_e" and "Naddl:Fert". Thus, I am using exclude=c() to filter out all the other undesirable pairwise interactions from the full model (level=2).
This should, in theory, be the same as:
MSA ~ MAPl + MAT_e + ... + Naddl:MAT_e + Naddl:Fert
However I get this error when I add exclude=c() in the formula:
Error in glmulti(MSA ~ MAPl + MAT_e + Duration.yrl + Fert + Naddl + Ndepl, :
Improper call of glmulti.
Am I missing something about exclude=c()? Is there a more elegant way to specify interactions terms in 'glmulti'?
I couldn't find the data you were referring to, so I generated fictitious variables below.
The exclude argument didn't work for me either, so I hard-coded the exclusion into the formula. The following code worked for me, and no undesired interactions went into glmulti.
library('utils')
retain_var <- c('DOW_MON','DOW_TUE','DOW_WED','DOW_THU') #set variables interested for testing
excl_inter_effects <- c('DOW_FRI','DOW_SAT','DOW_SUN') #set variables to exclude
set_depth <- 2 #generate 2 way interaction to exclude explicitly
excl_inter_form <- t(combn(excl_inter_effects,set_depth))
Paste <- function(x) paste(x, collapse = ":")
excl_inter_form <- apply(excl_inter_form, 1, Paste)
rm(list = c('Paste','set_depth','excl_inter_effects')) #clean up
# ----- Generate formula incorporating exclusion -----
glm_formula <- as.formula(paste("y ~",paste(retain_var, collapse= "+"),"-",paste(excl_inter_form, collapse= "-")))
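The list of pairwise interactions can also be generated rather than typed by hand, which avoids spelling slips in long term names. The sketch below uses only base R (combn(), setdiff(), reformulate()) with the variable names from the question. It only builds the strings and the formula; whether glmulti accepts them in a given version is not something this sketch checks.

```r
vars <- c("MAPl", "MAT_e", "Duration.yrl", "Fert", "Naddl", "Ndepl")

# All 15 pairwise interactions, each written in the order the variables appear in vars.
all_pairs <- combn(vars, 2, FUN = paste, collapse = ":")

# Interactions to keep, written in that same order so setdiff() can match them.
keep <- c("MAT_e:Naddl", "Fert:Naddl")

# Everything else is a candidate for exclusion.
drop <- setdiff(all_pairs, keep)
length(drop)  # 13

# A formula with all main effects plus only the wanted interactions.
fml <- reformulate(c(vars, keep), response = "MSA")
```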

Save an object in R using a function parameter as name

I looked all over the website and could not get the correct answer for this dilemma:
I have a UDF for evaluating some classification models with different datasets, and I wanted a single function for evaluating them. I want something like the following: given the name of the model and the data, it computes some metrics (a confusion matrix, for example) and saves them to an object outside the function.
The problem here is that I want to create this object using the name of the model I am evaluating.
I ended up with something like this:
foo <- function(x) {return(as.character(substitute(x)))}
model1 <- lm(Sepal.Width ~ Sepal.Length, iris)
Validation.func <- function(model_name, dataset){
Pred_Train = predict(model_name, dataset)
assign(paste("Pred_Train_",foo(model_name), sep=''), Pred_Train, envir=globalenv())
Pred_Train_prob = predict(model_name, dataset, type = "prob")
MC_Train = confusionMatrix(Pred_Train, dataset$target_salto)
}
Running Validation.func(model1, iris), we would want the variable stored as "Pred_Train_model1".
Since model_name is not a string, we tried to convert it with the foo function (the answer I found on here: foo = function(x) deparse(substitute(x))). I do not get what I want, since it saves the object as "Pred_Train_model_name" instead of "Pred_Train_model1".
Does anyone know how to solve it?
model_name in your function must be a model object, so it cannot be used in the paste function, which expects character strings.
I think you want your function to know that the model object is actually called "model1" in the environment it comes from. This is quite tricky, since the same model object may be bound to various names.
The easiest implementation is to pass both the model object and its name separately, then use the former for prediction and the latter for naming the outcome.
func1 <- function(model, model_str, dataset)
{
  p <- predict(model, dataset)
  assign(paste("predict_", model_str, sep=""), p, envir=globalenv())
}
model1 <- lm(mpg ~ cyl, data=mtcars)
func1(model1, "model1", mtcars)
predict_model1
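Another implementation, tricky but workable if used with care, is to pass only the character name of the model and retrieve the model object with get(). Here parent.env(environment()) is the environment in which func2 was defined, which is the global environment when func2 is defined at top level.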
func2 <- function(model_str, dataset)
{
  p <- predict(get(model_str, envir=parent.env(environment())), dataset)
  assign(paste("predict_", model_str, sep=""), p, envir=globalenv())
}
model2 <- lm(mpg ~ cyl, data=mtcars)
func2("model2", mtcars)
predict_model2
Finally, to pass the model object to the function and let the function find the variable name, you can use match.call() to recover how the function was called.
func3 <- function(model, dataset)
{
  s <- match.call()
  model_str <- as.character(s)[2]
  p <- predict(model, dataset)
  assign(paste("predict_", model_str, sep=""), p, envir=globalenv())
}
model3 <- lm(mpg ~ cyl, data=mtcars)
func3(model3, mtcars)
predict_model3
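As for why the foo() helper in the question produces "Pred_Train_model_name": substitute() reports the expression that was supplied to the function in which it is called, and inside Validation.func that expression is simply the symbol model_name. A minimal sketch, with hypothetical function names:

```r
# Helper in the style of the question's foo().
name_of <- function(x) deparse(substitute(x))

# Calling the helper from inside another function sees that function's
# argument name, not the name used by the original caller.
wrapper_indirect <- function(model) name_of(model)

# Calling substitute() directly in the function that received the argument
# sees the expression the caller actually typed.
wrapper_direct <- function(model) deparse(substitute(model))

model1 <- lm(mpg ~ cyl, data = mtcars)
wrapper_indirect(model1)  # "model"
wrapper_direct(model1)    # "model1"
```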
So here's a suggestion, that does not exactly solve the problem, but does make the function work.
Validation.func <- function(model_name, dataset){
model_name_obj<- eval(parse(text = model_name))
Pred_Train = predict(model_name_obj, dataset)
assign(paste("Pred_Train_",model_name, sep=''), Pred_Train, envir=globalenv())
Pred_Train_prob = predict(model_name_obj, dataset, type = "prob")
MC_Train = confusionMatrix(Pred_Train, dataset$target_salto)
}
Validation.func("model1", data)
What I did is pretty much the opposite of what you were trying. I passed model_name as a string and then evaluated it using eval(parse(text = model_name)). Note that the evaluated object is now called model_name_obj, and it is what gets passed to predict.
I got some errors later on in the function, but they are irrelevant to the issue at hand. They had to do with the type argument in predict and about not recognizing the confusionMatrix, because I assume I didn't load the corresponding package.

entering variable name as argument of a function

I'm trying to make a function that will run and compare a set of models given a dataset and a variable name (essentially to be able to change just one model set and have them apply to all relevant dependent variables--selecting an a priori modelset to compare rather than using a data-dredging existing function like glmulti). A simple example:
RunModelset<- function(dataset, response)
{
m1 <- lm(formula=response ~ 1, data=dataset)
m2 <- lm(formula=response ~ 1 + temperature, data=dataset)
comp <- AICctab(m1,m2, base = T, weights = T, nobs=length(data))
return(comp)
}
If I manually enter a specific variable name within the function, it runs the models correctly. However, using the code above and entering a text value for the response argument doesn't work:
RunModel(dataset=MyData,response="responsevariablename")
yields an error: invalid type (NULL) for variable 'dataset$response', which I interpret to mean it isn't finding the column I'm telling it to use. My problem must be in how R inserts a text value as an argument in the function.
How do I enter the response variable name so R knows that "formula=response ~" becomes "formula=dataset$responsevariablename ~"?
ETA Working answer based on this solution:
RunModel<- function(dataset, response)
{
resvar <- eval(substitute(response),dataset)
m1 <- lm(formula=resvar ~ 1, data=dataset)
m2 <- lm(formula=resvar ~ 1 + R.biomass, data=dataset)
comp <- AICctab(m1,m2, base = T, weights = T, nobs=length(data))
return(comp)
}
RunModel(dataset=MyData,response=responsevariablename)
NB - this didn't work when I had quotes on the response argument.
You should be able to use match.call() to achieve this.
See this post
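If the response is passed as a quoted column name, another option is to build the formulas from that string. The sketch below is not the asker's code: the function name run_models and its predictor argument are made up, AIC() from base R stands in for AICctab(), and mtcars stands in for MyData.

```r
# Hypothetical helper: compare an intercept-only model with a one-predictor model.
run_models <- function(dataset, response, predictor) {
  # Build "response ~ 1" and "response ~ predictor" from the strings.
  f1 <- as.formula(paste(response, "~ 1"))
  f2 <- reformulate(predictor, response = response)
  m1 <- lm(f1, data = dataset)
  m2 <- lm(f2, data = dataset)
  AIC(m1, m2)  # data frame with one row per model
}

tab <- run_models(mtcars, response = "mpg", predictor = "wt")
tab
```

With this version the response argument has to be given in quotes, which is the opposite of the eval(substitute()) solution above.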
