Creating multiple GLMs in a for loop, skipping models in the loop where the coefficients do not work in R

I have created a list that contains all possible combinations of my independent variables, and I am trying to fit a separate glm for each of those combinations.
combinations = list()
models = list()
# seq_along() visits every index; length() alone would only give the last one
for (i in seq_along(combinations)) {
  models[[i]] <- glm(as.formula(paste("(x) ~", combinations[[i]])), family = 'Gamma', data = df)
}
I get the error message:
Error: no valid set of coefficients has been found: please supply starting values
I know this happens because, for one or a few models, the combination of independent variables causes problems. How can I skip the models that cause those problems (or leave them blank) and keep the loop going?
As a side comment: I tried fitting a Gaussian glm on exp(x) instead, which worked, but I would really like to stay with the Gamma family.
combinations = list()
models = list()
# Same loop as above, but with a Gaussian family on the exponentiated response
for (i in seq_along(combinations)) {
  models[[i]] <- glm(as.formula(paste("exp(x) ~", combinations[[i]])), family = 'gaussian', data = df)
}
Many thanks!
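One common way to keep such a loop going is to wrap each fit in tryCatch() and store NULL when glm() throws an error. The sketch below is only an illustration: the data frame df, the response x, the predictors a and b, and the deliberately broken entry not_a_column are all made up, and the log link is chosen just for this toy data (the question uses the default link).

```r
# Hypothetical data: a positive response x and two predictors a, b
set.seed(1)
df <- data.frame(a = runif(50), b = runif(50))
df$x <- rgamma(50, shape = 2, rate = 2 / exp(0.5 + df$a))

# Right-hand sides to try; the last one is deliberately invalid
combinations <- list("a", "b", "a + b", "not_a_column")

models <- vector("list", length(combinations))
for (i in seq_along(combinations)) {
  fit <- tryCatch(
    glm(as.formula(paste("x ~", combinations[[i]])),
        family = Gamma(link = "log"),  # assumption: log link for the toy data
        data = df),
    error = function(e) {
      message("Skipping combination ", i, ": ", conditionMessage(e))
      NULL  # placeholder for a model that could not be fitted
    }
  )
  # models[[i]] <- NULL would delete the element, so assign via list()
  models[i] <- list(fit)
}

# Indices of the combinations that were fitted successfully
ok <- which(!vapply(models, is.null, logical(1)))
```

Failed fits stay in the list as NULL, so the positions in models still line up with the positions in combinations.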

Related

Using a For Loop to Run Multiple Response Variables through a Train function to create multiple separate models in R

I am trying to create a for loop to index through each individual response variable I have and train a model using the train() function within the caret package. I have about 30 response variables and 43 predictor variables. I can train each model individually, but I would like to automate the process and have a for loop run through a model (I would like to eventually scale up to multiple models if possible, i.e. lm, rf, cubist, etc.). I then want to save each model to a data frame along with its R-squared and RMSE values. The individual models that I currently have, and that run for me, are as follows, with column 11 being the response variable and columns 35-68 being the predictor variables.
data_Mg <- subset(data_3, !is.na(Mg))
mg.lm <- train(Mg~., data=data_Mg[,c(11,35:68)], method="lm", trControl=control)
mg.cubist <- train(Mg~., data=data_Mg[,c(11,35:68)], method="cubist", trControl=control)
mg.rf <- train(Mg~., data=data_Mg[,c(11,35:68)], method="rf", trControl=control, na.action = na.roughfix)
max(mg.lm$results$Rsquared)
min(mg.lm$results$RMSE)
max(mg.cubist$results$Rsquared)
min(mg.cubist$results$RMSE)
max(mg.rf$results$Rsquared) #Highest R squared
min(mg.rf$results$RMSE)
This gives me 3 models with all the relevant information that I need. Now for the for loop. I've only tried the lm model so far for this.
bucket <- list()
for(i in 1:ncol(data_4)) { # for-loop response variables, need to end it at response variables, rn will run through all variables
data_y<-subset(data_4, !is.na(i))#get rid of NA's in the "i" column
predictors_i <- colnames(data_4)[i] # Create vector of predictor names
predictors_1.1 <- noquote(predictors_i)
i.lm <- train(predictors_1.1~., data=data_4[,c(i,35:68)], method="lm", trControl=control)
bucket <- i.lm
#mod_summaries[[i - 1]] <- summary(lm(y ~ ., data_y[ , c("i.lm", predictors_i)]))
#data_y <- data_4
}
In the error I am getting, Bulk_Densi is the first variable in predictors_1.1. The error says that variable lengths differ, so I originally thought my issue was that quotes were being added around "Bulk_Densi", but after trying the noquote() function I have not gotten anywhere, so I am unsure where I am going wrong.
Please let me know if I can provide any extra info and thanks in advance for the help! I've already tried the info in How to train several models within a loop for and was struggling with that as well.
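Two things in the loop above look suspicious: !is.na(i) tests the loop index rather than the column, and predictors_1.1 ~ . uses the name of the R variable instead of the column name it holds. A minimal sketch of building the formula from the column name is below. It is not a caret solution as such: the data frame dat and its columns are made up, and lm() stands in for train() so the example runs in base R. As far as I know the same formula object can be passed to train()'s formula interface.

```r
# Hypothetical data: two response columns (y1, y2) followed by two predictors
set.seed(42)
dat <- data.frame(y1 = rnorm(30), y2 = rnorm(30), p1 = rnorm(30), p2 = rnorm(30))
dat$y1[c(3, 7)] <- NA          # a couple of missing responses

response_cols  <- 1:2          # loop only over the response columns
predictor_cols <- 3:4

bucket <- list()
for (i in response_cols) {
  resp <- colnames(dat)[i]                 # e.g. "y1"
  rows <- !is.na(dat[[i]])                 # test the column, not the index i
  form <- as.formula(paste(resp, "~ ."))   # e.g. y1 ~ .
  # lm() is a stand-in for train(form, data = ..., method = "lm", ...)
  bucket[[resp]] <- lm(form, data = dat[rows, c(i, predictor_cols)])
}
```

Storing each fit under bucket[[resp]] keeps every model, whereas bucket <- i.lm overwrites the whole list on each iteration.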

for loop in train(caret) to select different predictors in a lm model

I'm just a beginner, so I hope you can help with a problem I have with a model fitted via train in caret in R.
I tried this:
models.list = as.list(vector(length = ncol(FIFA21_db)))
for(i in 1:ncol(mtcars)) {
models.list[[i]] <- train(x = mtcars[,i], y = mtcars[,1], method = "lm")
}
This causes the error " Please use column names for x". Do you know how I can use the column names instead of the observations in a for loop? My goal is to use different variables for an lm regression.
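The message suggests that train() wants x to carry column names, which mtcars[, i] loses because a single column is simplified to a plain vector. Here is a small base-R sketch of the difference, using lm() as a stand-in for train() so that it runs without caret. Passing x_df as x to train() is an untested assumption here.

```r
models.list <- list()
for (i in 2:ncol(mtcars)) {            # column 1 (mpg) is the response
  x_vec <- mtcars[, i]                 # plain numeric vector, column name lost
  x_df  <- mtcars[, i, drop = FALSE]   # one-column data frame, name kept
  predictor <- colnames(x_df)
  # lm() stands in for train(); reformulate() builds e.g. mpg ~ cyl
  models.list[[predictor]] <- lm(reformulate(predictor, response = "mpg"),
                                 data = mtcars)
}

names(models.list)   # one model per predictor column
```

Starting the loop at column 2 also avoids regressing mpg on itself.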

Fitting Step functions

AIM: The aim here was to find a suitable fit, using step functions, which uses age to describe wage, in the Wage dataset in the library ISLR.
PLAN:
To find a suitable fit, I'll try multiple fits with different cut points. I'll use the glm() function for fitting. To check which fit is best, I'll use the cv.glm() function (from the boot library) to cross-validate each fitted model.
PROBLEM:
In order to do so, I did the following:
all.cvs = rep(NA, 10)
for (i in 2:10) {
lm.fit = glm(wage~cut(Wage$age,i), data=Wage)
all.cvs[i] = cv.glm(Wage, lm.fit, K=10)$delta[2]
}
But this gives an error:
Error in model.frame.default(formula = wage ~ cut(Wage$age, i), data =
list( : variable lengths differ (found for 'cut(Wage$age, i)')
Whereas, when I run the code given below, it runs.(It can be found here)
all.cvs = rep(NA, 10)
for (i in 2:10) {
Wage$age.cut = cut(Wage$age, i)
lm.fit = glm(wage~age.cut, data=Wage)
all.cvs[i] = cv.glm(Wage, lm.fit, K=10)$delta[2]
}
Hypotheses and Results:
Well, it might be possible that cut() and glm() might not work together. But this works:
glm(wage~cut(age,4),data=Wage)
Question:
So, basically we're using the cut() function, saving its results in a variable, then using that variable in the glm() function. But we can't put the cut() call inside the glm() formula, and only when the code is in a loop followed by cv.glm().
So, why is the first version of the code not working?
This is confusing. Any help appreciated.
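A likely explanation is that cv.glm() refits the model on subsets of the data. wage is then looked up in the subset, while cut(Wage$age, i) still refers to the full Wage data frame, so the two have different lengths. The toy example below reproduces the message without boot or ISLR. The data frame d and the manual 80-row training fold are made up for illustration.

```r
# Hypothetical data standing in for Wage
set.seed(1)
d <- data.frame(x = runif(100), y = rnorm(100))
train <- d[1:80, ]   # stands in for one cross-validation training fold

# y comes from train (80 rows), cut(d$x, 4) from the full data (100 rows)
msg <- tryCatch({
  glm(y ~ cut(d$x, 4), data = train)
  "no error"
}, error = function(e) conditionMessage(e))
msg

# Storing the cut as a column means it is subset together with y
d$x.cut <- cut(d$x, 4)
train <- d[1:80, ]
fit <- glm(y ~ x.cut, data = train)
```

Precomputing the column also keeps the cut points identical across folds, which is probably what you want for cross-validation anyway.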

For loops regression in R

I'm fitting a GARCH model to the residuals of an ARIMA, and trying to apply ARCH(p) for p from 1 to 10 to compare the fits. Here is my code. Errors are returned in the for loop part, but I cannot figure out why. Could anyone give some tips?
For the single value p=1 the code is as below, and it works without problems.
fitone<- garchFit(~garch(1,0),data=logprice)
coef(fitone)
summary(fitone)
And for the for loop my code goes like
for (n in 1:10) {
fit [[n]]<- garchFit(~garch(n,0),data=logprice)
coef(fit[[n]])
summary(fit[[n]])
}
Error in .garchArgsParser(formula = formula, data = data, trace = FALSE) :
Formula and data units do not match.
I have never written a loop before. Can someone help me with the code?
The problem is that the variables in a formula are generally evaluated in the context of the data= parameter, but your n variable isn't coming from logprice; it's coming from the global environment. You will need to create the formula dynamically. Here's one way to run all the models, using lapply rather than a for loop:
library(fGarch)
#sample data
x.vec = as.vector(garchSim(garchSpec(rseed = 1985), n = 200)[,1])
fits <- lapply(1:10, function(n) {
garchFit(bquote(~garch(.(n),0)), data = x.vec, trace = FALSE)
})
and then we can get the coefs with
lapply(fits, coef)
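To see what bquote() is doing there, here is a tiny base-R illustration (no fGarch needed; the value 3 is arbitrary). The .(n) part is replaced by the current value of n before the formula is handed to the fitting function, so nothing named n is left to look up later.

```r
n <- 3

plain <- quote(~garch(n, 0))      # n stays a symbol
built <- bquote(~garch(.(n), 0))  # n is replaced by its value

all.vars(plain)   # still refers to a variable called n
deparse(built)    # the literal value is baked into the expression
```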

How to make a list of lmer model objects to use in a for loop in R?

I'm trying to write a for loop in R (my first!) in order to produce and save diagnostic plots of several mixed effects models fitted using the function lmer in the package lme4. This is what I've done so far exemplified with the sleepstudy data:
require(lme4)
mod1<-lmer(Reaction ~ Days + (1|Subject),sleepstudy)
mod2<-lmer(Reaction ~ 1 + (1|Subject),sleepstudy)
List<-c(mod1,mod2)
names<-c("mod1","mod2")
i=1
for (i in 1:length(List)) {
jpeg(file = paste("modelval_", names[i], ".jpg", sep=""))
par(mfrow=c(2,2))
plot(resid(List[i]) ~ fitted(List[i]),main="residual plot")
abline(h=0)
qqnorm(resid(List[i]), main="Q-Q plot of residuals")
qqnorm(ranef(List[i])$Subject$"(Intercept)", main="Q-Q plot of random effect" )
dev.off()
}
I get the following error message when typing into the R console:
Error in function (formula, data = NULL, subset = NULL, na.action = na.fail, :
invalid type (NULL) for variable 'resid(list[i])'
I've got a feeling the problem is related to the list of models I've created and not the for loop itself and I think it might be related to the model objects being of class S4. Is it possible to make such a list?
I've also tried to make the list like below, with no improvements (still get the same error message)
List<-list(mod1,mod2)
First, using c() risks losing the class structure of the objects you've created. To make a list containing your models, use list(mod1, mod2).
Second, List[i] is a list of length 1 containing the i'th element of List. Use List[[i]] to extract the element itself (your model).
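The difference between the two kinds of indexing can be checked quickly. The sketch below uses lm() fits on mtcars as a stand-in for the lmer models, purely so that it runs without lme4. The same indexing should apply to a list of lmer fits.

```r
# Stand-in models (assumption: lm instead of lmer, mtcars instead of sleepstudy)
mod1 <- lm(mpg ~ wt, data = mtcars)
mod2 <- lm(mpg ~ 1, data = mtcars)
List <- list(mod1 = mod1, mod2 = mod2)

class(List[1])     # a list of length 1 that wraps the model
class(List[[1]])   # the model itself

# Loop over the models; [[i]] hands the actual model to resid() and fitted()
res <- list()
for (i in seq_along(List)) {
  res[[names(List)[i]]] <- data.frame(fitted = fitted(List[[i]]),
                                      resid  = resid(List[[i]]))
}
```

Naming the list elements also means the separate names vector is no longer needed for building the file names.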
