nlme error with lme function: function evaluation limit reached without convergence

I am fitting a linear mixed-effects model with random intercepts and slopes using the lme function from the `nlme` package in R.
library(readr)
library(nlme)
#Data
post = read_csv("C:/data.csv")
post$regions <- as.numeric(as.factor(post$regions))
post.sc = data.frame(scale(post))
post.sc = na.omit(post.sc)
#Regressions
lm.reg <- lm(V1 ~ V2+V3+V4+V5+V6+V7+V8+V9+V10, data = post.sc)
lmm.null <- lme(V1 ~ 1, data = post.sc, random = ~1 | regions, method = 'ML')
lmm.reg <- lme(V1 ~ V2+V3+V4+V5+V6+V7+V8+V9+V10, data = post.sc, random = ~1 | regions, method = 'ML')
lmm.reg.slope <- lme(V1 ~ V2+V3+V4+V5+V6+V7+V8+V9+V10, data = post.sc, random = ~1+V1 ~ V2+V3+V4+V5+V6+V7+V8+V9+V10| regions, method = 'ML', control = lmeControl(maxIter = 10000, msMaxIter = 10000))
I get the following error message:
Error in lme.formula(V1 ~ V2+V3+V4+V5+V6+V7+ :
nlminb problem, convergence error code = 1
message = function evaluation limit reached without convergence (9)
I have not encountered questions with similar problems.
Is there a way to fix this?
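One thing worth checking, independent of the iteration limits: the random argument of lme expects a one-sided formula (random-effect terms to the left of the bar, grouping factor to the right), whereas the call above passes a second two-sided formula. A minimal sketch under that assumption, using the Orthodont data shipped with nlme rather than the poster's data. The msMaxEval value is an illustrative choice for raising the optimizer's function-evaluation limit, not a confirmed fix for this data set:

```r
library(nlme)

# Random intercept and random slope for age within Subject.
# Note the one-sided random formula: ~ terms | group
fit <- lme(distance ~ age,
           data = Orthodont,
           random = ~ 1 + age | Subject,
           method = "ML",
           control = lmeControl(msMaxEval = 1000))  # illustrative limit

summary(fit)
```

With nine predictors, a full set of correlated random slopes requires estimating a large covariance matrix, so starting from one or two slopes and adding terms gradually may be easier to fit.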

Related

Error in model.frame.default(form = classvariable ~ ., data = trainingDataset, : variable lengths differ (found for 'Sepal.Length')

I've tried to look at similar questions but can't figure out my problem.
I was already able to complete my analysis with random forest (using caret), tuning parameters separately. Now I'm trying to create a function that will perform my analysis all at once.
I created a function with two inputs, the dataset, and variable to be classified.
For now I'm using the iris dataset for simplicity.
RF <- function(data, classvariable) {
  # Best mtry
  trControl <- trainControl(method = "cv", number = 10,
                            search = "grid")
  set.seed(1234)
  tuneGrid <- expand.grid(.mtry = c(1:3))
  RF_mtry <- train(classvariable ~ .,
                   data = dataset,
                   method = "rf",
                   metric = "Accuracy",
                   tuneGrid = tuneGrid,
                   trControl = trControl,
                   importance = TRUE,
                   ntree = 100)
  print(RF_mtry)
  mtry = 0
  for (i in 1:nrow(RF_mtry$results)) {
    if (RF_mtry$results[i,2] > mtry) mtry <- RF_mtry$results[i,2]
  }
  trial_mtry <- c(1:3)
  best_mtry <- trial_mtry[i]
  best_mtry
}
Once I run the function
RF(data = iris, classvariable = Species)
I get the error
Error in `[.data.frame`(data, , all.vars(Terms), drop = FALSE) :
undefined columns selected
I tried running the code outside a function, writing iris directly instead of dataset and Species instead of classvariable, and it works.
Previously I was getting the error
Error in model.frame.default(form = classvariable ~ ., data = trainingDataset, :
variable lengths differ (found for 'Sepal.Length')
Anybody have an idea why it does not work?
Thank you very much.
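Two things in the function look likely to cause this, offered as a guess rather than a confirmed diagnosis: the body refers to dataset while the argument is named data, and classvariable ~ . asks for a column literally called classvariable. A minimal sketch of one workaround, assuming the column name is passed as a string. Here build_formula is a hypothetical helper, and base R's model.frame stands in for caret's train:

```r
# Hypothetical helper: turn a column name (string) into "<name> ~ ."
build_formula <- function(classvariable) {
  reformulate(".", response = classvariable)
}

f <- build_formula("Species")
print(f)

# model.frame() resolves the formula against the data, much as a
# formula-based fitting function would
mf <- model.frame(f, data = iris)
str(mf)
```

Inside RF, the resulting formula would then be passed to train together with data = data, and the function called as RF(data = iris, classvariable = "Species").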

How to fix "variable length differ" error in cv.zipath?

I am trying to run a cross-validation of a zero-inflated Poisson model using cv.zipath from the mpath package.
Fitting the LASSO
fit.lasso = zipath(estimation_sample_nomiss ~ .| .,
data = missings,
nlambda = 100,
family = "poisson",
link = "logit")
Cross validation
n <- dim(docvisits)[1]
K <- 10
set.seed(197)
foldid <- split(sample(1:n), rep(1:K, length = n))
fitcv <- cv.zipath(F_time_unemployed~ . | .,
data = estimation_sample_nomiss, family = "poisson",
nlambda = 100, lambda.count = fit.lasso$lambda.count[1:30],
lambda.zero = fit.lasso$lambda.zero[1:30], maxit.em = 300,
maxit.theta = 1, theta.fixed = FALSE, penalty = "enet",
rescale = FALSE, foldid = foldid)
I encounter the following error:
Error in model.frame.default(formula = F_time_unemployed ~ . + ., data = list(: variable lengths differ (found for '(weights)')
I have cleaned the sample of all NAs but still encounter the error message.
The solution turns out to be that cv.zipath() does not accept tibbles, at least in this instance (no guarantee as to how far this generalises). After using dplyr commands, one needs to convert the result back to a plain data frame. Thus, the solution is as simple as calling as.data.frame() on the data.
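A minimal sketch of that conversion, assuming the tibble package is installed. The iris data stands in for estimation_sample_nomiss:

```r
library(tibble)

tb <- as_tibble(iris)      # the kind of object dplyr pipelines typically return
class(tb)                  # includes "tbl_df"

df <- as.data.frame(tb)    # plain data frame to pass on to the fitting function
class(df)
```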

Modeling beta-binomial distributed data using glmmTMB

I'm trying to fit a mixed-effects model to assess effects on the rate of germinated pollen grains. I started with a binomial distribution and a model structure like this:
glmer(cbind(NGG,NGNG) ~ RH3*Altitude + AbH + Date3 + (1 | Receptor/Code/Plant) +
(1 | Mountain/Community), data=database, family="binomial",
control = glmerControl(optimizer="bobyqa"))
Where NGG is the number of successes (germinated grains per stigma, can vary from 0 to e.g. 55), NGNG the number of failures (non-germinated grains 0 to e.g. 80). The issue is, after seeing the results, data seems to be over-dispersed, as indicated by the function (found in http://rstudio-pubs-static.s3.amazonaws.com/263877_d811720e434d47fb8430b8f0bb7f7da4.html):
overdisp_fun <- function(model) {
  vpars <- function(m) {
    nrow(m)*(nrow(m)+1)/2
  }
  model.df <- sum(sapply(VarCorr(model), vpars)) + length(fixef(model))
  rdf <- nrow(model.frame(model))-model.df
  rp <- residuals(model, type = "pearson") # computes pearson residuals
  Pearson.chisq <- sum(rp^2)
  prat <- Pearson.chisq/rdf
  pval <- pchisq(Pearson.chisq, df = rdf, lower.tail = FALSE)
  c(chisq = Pearson.chisq, ratio = prat, rdf = rdf, p = pval)
}
The output was:
chisq = 1.334567e+04, ratio = 1.656201e+00, rdf = 8.058000e+03, p = 3.845911e-268
So I decided to try a beta-binomial model in glmmTMB as follows (it's important to keep this hierarchical structure):
glmmTMB(cbind(NGG,NGNG) ~ RH3*Altitude + AbH + Date3 + (1 | Receptor/Code/Plant) +
(1 | Mountain/Community), data=database,
family=betabinomial(link = "logit"), na.action = na.omit, weights=NGT)
When I run it, I get:
Error in nlminb(start = par, objective = fn, gradient = gr, control = control$optCtrl) : (converted from warning) NA/NaN function evaluation
Is there something wrong with how the model is specified? I already checked for possible issues in (http://rstudio-pubs-static.s3.amazonaws.com/263877_d811720e434d47fb8430b8f0bb7f7da4.html) but have not found a solution yet.
Thanks.

Multilevel models in R using nlme package

I am using the nlme package to learn multilevel models, following examples from the textbook "Discovering Statistics Using R", when I ran into an error.
Mixed Models Code
The data set is Honeymoon Period.dat, also downloadable under their companion website.
Data Set - Multilevel Models
require(nlme)
require(reshape2)
satisfactionData = read.delim("Honeymoon Period.dat", header = TRUE)
restructuredData<-melt(satisfactionData, id = c("Person", "Gender"), measured = c("Satisfaction_Base", "Satisfaction_6_Months", "Satisfaction_12_Months", "Satisfaction_18_Months"))
names(restructuredData)<-c("Person", "Gender", "Time", "Life_Satisfaction")
#print(restructuredData)
#restructuredData.sorted<-restructuredData[order(Person),]
intercept <-gls(Life_Satisfaction~1, data = restructuredData, method = "ML", na.action = na.exclude)
randomIntercept <-lme(Life_Satisfaction ~1, data = restructuredData, random = ~1|Person, method = "ML", na.action = na.exclude, control = list(opt="optim"))
anova(intercept, randomIntercept)
timeRI<-update(randomIntercept, .~. + Time)
timeRS<-update(timeRI, random = ~Time|Person)
ARModel<-update(timeRS, correlation = corAR1(0, form = ~Time|Person))
The error occurred at this point, when I tried to update the timeRS model.
The error message is as follows:
Error in as.character.factor(X[[i]], ...) : malformed factor
Do any stats people or programmers here know what this means?
I have looked at this book, and it appears that its code is wrong. The error arises because your Time variable is a factor, and it needs to be numeric. In the author's first figure, time is represented as numeric (0 - 3), but his code for the models does not do this. I've recoded it for you:
## First, Time needs to be recoded as a numeric
restructuredData$Time.Numeric <- with(restructuredData, ifelse(Time == "Satisfaction_Base", 0,
ifelse(Time == "Satisfaction_6_Months", 1,
ifelse(Time == "Satisfaction_12_Months", 2,
ifelse(Time == "Satisfaction_18_Months", 3, NA)))))
## Baseline Model
intercept <-gls(Life_Satisfaction~1, data = restructuredData, method = "ML", na.action = na.exclude)
summary(intercept)
## Model where intercept can vary for Individuals
randomIntercept <- lme(Life_Satisfaction ~ 1, data = restructuredData, random = ~1|Person, method = "ML", na.action = na.exclude, control = list(opt = "optim"))
summary(randomIntercept)
## Add time as a fixed effect
timeRI <- lme(Life_Satisfaction ~ Time.Numeric, data = restructuredData, random = ~1|Person, method = "ML", na.action = na.exclude, control = list(opt = "optim"))
summary(timeRI)
## Add a random slope for time within each Person
timeRS <- lme(Life_Satisfaction ~ Time.Numeric, data = restructuredData, random = ~Time.Numeric|Person, method = "ML", na.action = na.exclude, control = list(opt = "optim"))
summary(timeRS)
## Model the covariance structure of the errors with a first-order autoregressive structure
ARModel <- update(timeRS, correlation = corAR1(0, form = ~Time.Numeric|Person))
summary(ARModel)
anova(intercept, randomIntercept, timeRI, timeRS, ARModel)
The anova readout for the model comparisons is now exactly as shown in the book.
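The nested ifelse recoding can also be written more compactly by fixing the factor level order and converting to integers. A small sketch on a toy vector (the toy data are made up for illustration and stand in for restructuredData$Time):

```r
time_levels <- c("Satisfaction_Base", "Satisfaction_6_Months",
                 "Satisfaction_12_Months", "Satisfaction_18_Months")

# Toy stand-in for restructuredData$Time
Time <- factor(c("Satisfaction_6_Months", "Satisfaction_Base",
                 "Satisfaction_18_Months", "Satisfaction_12_Months"))

# Levels in measurement order map to 1..4; subtract 1 to get 0..3
Time.Numeric <- as.integer(factor(Time, levels = time_levels)) - 1L
Time.Numeric
```

Any value not listed in time_levels becomes NA, which matches the fallback in the nested ifelse version.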

How to apply filter based feature selection for logistic regression in R's caret package?

I am trying to apply filter-based feature selection for logistic regression with the caret package. I was successful at using the sbf() function for random forest and LDA models (using rfSBF and ldaSBF respectively).
The way I modified lmSBF is as follows:
# custom lmSBF
logisticRegressionWithPvalues <- lmSBF
logisticRegressionWithPvalues$score <- pScore
logisticRegressionWithPvalues$summary <- fiveStats
logisticRegressionWithPvalues$filter <- pCorrection
logisticRegressionWithPvalues$fit <- glmFit
# my training control parameters for sbf (selection by filter)
myTrainControlSBF = sbfControl(method = "cv",
number = 10,
saveDetails = TRUE,
verbose = FALSE,
functions = logisticRegressionWithPvalues)
# fit the logistic regression model
logisticRegressionModelWithSBF <- sbf(x = input_predictors,
y = input_labels,
sbfControl = myTrainControlSBF)
Here, glmFit function (mentioned above) is as follows:
# fit function for logistic regression
glmFit <- function(x, y, ...) {
  if (ncol(x) > 0) {
    tmp <- as.data.frame(x)
    tmp$y <- y
    glm(y ~ ., data = tmp, family = binomial)
  }
  else nullModel(y = y)
}
But when fitting logisticRegressionModelWithSBF I get the following error:
Error in { : task 1 failed - "inputs must be factors"
What am I doing wrong?
