Error variable lengths differ (found for' (weights)') with Effects package - r

I am having an error when trying to extract values form a mixed effect model using the 'effects' package. A reproducible example below.
#create database.
df<-NULL
df$col1<-as.numeric(c(4,18,10,41,30,40,47,30,41,18,11,16,47,26,20,10,11,11,
10,22,0,11,20,11,23,17,8,27,17,47,40,28,10,10,40,17,17,
20,17,11,28,41,10,20,37,22,31,18,17,30,16,11,27,20,10,10,
28,41,20,20,20,13,11,41,13,10,27,13,18,17,24,6,11,17,10,24,
17,10,22))
df$col2<-as.factor(c('location01','location01','location01','location01','location01','location01','location01','location01','location01','location01','location01','location01','location01',
'location01','location01','location03','location03','location03','location03','location03','location03','location03','location03','location05','location05','location05',
'location05','location05','location08','location08','location08','location08','location08','location08','location08','location08','location08','location17','location17',
'location17','location17','location20','location20','location20','location20','location20','location20','location20','location20','location20','location23','location23',
'location23','location23','location23','location23','location23','location23','location23','location26','location26','location26','location26','location26','location31',
'location31','location31','location31','location31','location31','location31','location31','location31','location34','location34','location34','location34','location34','location34'))
df$col3<-as.factor(c('a','a','a','a','c','c','c','c','c','b','b','b','b','b','b','a','a','a',
'a','a','a','a','a','c','c','c','c','c','a','a','a','c','c','c','b','b',
'b','c','c','c','c','a','a','a','a','a','b','b','b','b','a','a','a','a',
'c','c','c','c','c','b','b','b','b','b','c','c','c','c','b','b','b','b',
'b','a','a','a','b','b','b'))
df$col4<-as.factor(c('x','x','x','x','y','y','y','y','y','y','y','y','y','y','y','y','y',
'y','y','y','y','y','y','x','x','x','x','x','x','x','x','y','y',
'y','y','y','y','x','x','x','x','x','x','x','x','x','y','y','y',
'y','y','y','y','y','x','x','x','x','x','x','x','x','x','x','y',
'y','y','y','x','x','x','x','x','y','y','y','x','x','x'))
df<-as.data.frame(df)
#run two potential models
M1<-lme(col1 ~ col3*col4,random = ~ 1 | col2, data = df)
M2<-lme(col1 ~ col3*col4, random = ~ 1 | col2,data = df,weights = varIdent(form = ~ 1 | col3), method = "REML")
#both models work, and their results make sense. So now extract values with effects package
effsM1 <- effect("col3*col4", M1) #works!
effsM2 <- as.data.frame(effect("col3*col4", M2))# doesn't work
Error in model.frame.default(formula = col1 ~ col3 * col4, data = df,
: variable lengths differ (found for '(weights)')
I have a feeling that it is because I do not have the same number replicates for each treatment combination but this didn't used to give me any problems in the past.

Related

Removing completely separated observations from glm()

I'm doing a bit of exploratory data analysis using HMDA data from the AER package; however, the variables that I used to fit the model seem to contain some observations that perfectly determine the outcomes, an issue known as "separation." So I tried to remedy this using the solution recommended by this thread, yet when I tried to execute the first set of source code from glm.fit(), R returned an error message:
Error in family$family : object of type 'closure' is not subsettable
so I could not proceed any further to remove those fully determined observations from my data with this code. I am wondering if anyone could help me fix this?
My current code is provided at below for your reference.
# load the AER package and HMDA data
library(AER)
data(HMDA)
# fit a 2-degree olynomial probit model
probit.fit <- glm(deny ~ poly(hirat, 2), family = binomial, data = HMDA)
# using the revised source code from that stackexchage thread to find out observations that received a warning message
library(tidyverse)
library(dplyr)
library(broom)
eps <- 10 * .Machine$double.eps
if (family$family == "binomial") {
if (any(mu > 1 - eps) || any(mu < eps))
warning("glm.fit: fitted probabilities numerically 0 or 1 occurred",
call. = FALSE)
}
# this return the following error message
# Error in family$family : object of type 'closure' is not subsettable
probit.resids <- augment(probit.fit) %>%
mutate(p = 1 / (1 + exp(-.fitted)),
warning = p > 1-eps)
arrange(probit.resids, desc(.fitted)) %>%
select(2:5, p, warning) %>%
slice(1:10)
HMDA.nwarning <- filter(HMDA, !probit.resids$warning)
# using HMDA.nwarning should solve the problem...
probit.fit <- glm(deny ~ poly(hirat, 2), family = binomial, data = HMDA.nwarning)
This chunk of code
if (family$family == "binomial") {
if (any(mu > 1 - eps) || any(mu < eps))
warning("glm.fit: fitted probabilities numerically 0 or 1 occurred",
call. = FALSE)
}
there is a function, binomial() called when you run glm with family == "binomial". If you look under glm (just type glm):
if (is.character(family))
family <- get(family, mode = "function", envir = parent.frame())
if (is.function(family))
family <- family()
if (is.null(family$family)) {
print(family)
stop("'family' not recognized")
}
And the glm function checks binomial()$family during the fit, and if any of the predicted values differ from 1 or 0 by eps, it raises that warning.
You don't need to run that part, and yes, you need to set eps <- 10 * .Machine$double.eps . So let's run the code below, and if you run a probit, you need to specify link="probit" in binomial, otherwise the default is logit:
library(AER)
library(tidyverse)
library(dplyr)
library(broom)
data(HMDA)
probit.fit <- glm(deny ~ poly(hirat, 2), family = binomial(link="probit"), data = HMDA)
eps <- 10 * .Machine$double.eps
probit.resids <- augment(probit.fit) %>%
mutate(p = 1 / (1 + exp(-.fitted)),
warning = p > 1-eps)
The column warning indicates if the observations raises a warning, in this dataset, there's one:
table(probit.resids$warning)
FALSE TRUE
2379 1
We can use the next step to filter it
HMDA.nwarning <- filter(HMDA, !probit.resids$warning)
dim(HMDA.nwarning)
[1] 2379 14
And rerun the regression:
probit.fit <- glm(deny ~ poly(hirat, 2), family = binomial(link="probit"), data = HMDA.nwarning)
coefficients(probit.fit)
(Intercept) poly(hirat, 2)1 poly(hirat, 2)2
-1.191292 8.708494 6.884404

nlmeODE does not work when an additional parameter was added to the Theoph model example in R documentation for nlmeODE

I tried running the Theoph model provided in the R documentation for nlmeODE package with my own dataset and it was able to return the parameter estimates. However, when I added an additional parameter that I wanted to estimate, I received an error message as shown below. Could anyone advise what went wrong and how I could fix this?
The example model provided in the R documentation looks like:
data(Theoph)
TheophODE <- Theoph
TheophODE$Dose[TheophODE$Time!=0] <- 0
TheophODE$Cmt <- rep(1,dim(TheophODE)[1])
OneComp <- list(DiffEq=list(
dy1dt = ~ -ka*y1 ,
dy2dt = ~ ka*y1-ke*y2),
ObsEq=list(
c1 = ~ 0,
c2 = ~ y2/CL*ke),
Parms=c("ka","ke","CL"),
States=c("y1","y2"),
Init=list(0,0))
TheophModel <- nlmeODE(OneComp,TheophODE)
Theoph.nlme <- nlme(conc ~ TheophModel(ka,ke,CL,Time,Subject),
data = TheophODE, fixed=ka+ke+CL~1, random = pdDiag(ka+CL~1),
start=c(ka=0.5,ke=-2.5,CL=-3.2),
control=list(returnObject=TRUE,msVerbose=TRUE),
verbose=TRUE)
plot(augPred(Theoph.nlme,level=0:1))
My modified model looks like:
group1 <- groupedData(conc ~ Time | Subject,
data=myowndata)
OneComp2 <- list(DiffEq=list(
dy1dt = ~ -ka*y1 + kr*y2,
dy2dt = ~ ka*y1 - kr*y2 -ke*y2), #I added in a reabsorption rate kr
ObsEq=list(
c1 = ~ 0,
c2 = ~ y2/CL*ke),
Parms=c("ka","ke","kr","CL"),
States=c("y1","y2"),
Init=list(0,0))
Model2 <- nlmeODE(OneComp2,group1)
nlme2 <- nlme(conc ~ Model2(ka,ke,kr,CL,Time,Subject),
data = group1, fixed=ka+ke+kr+CL~1, random = pdDiag(ka+CL~1),
start=c(ka=0.5,ke=-2.5,kr=-2.5,CL=-3.2),
control=list(returnObject=TRUE,msVerbose=TRUE),
verbose=TRUE)
The error message reads:
Error in numericDeriv(form[[3L]], names(ind), env) :
Missing value or an infinity produced when evaluating the model

R: cant get a lme{nlme} to fit when using self-constructed interaction variables

I'm trying to get a lme with self constructed interaction variables to fit. I need those for post-hoc analysis.
library(nlme)
# construct fake dataset
obsr <- 100
dist <- rep(rnorm(36), times=obsr)
meth <- dist+rnorm(length(dist), mean=0, sd=0.5); rm(dist)
meth <- meth/dist(range(meth)); meth <- meth-min(meth)
main <- data.frame(meth = meth,
cpgl = as.factor(rep(1:36, times=obsr)),
pbid = as.factor(rep(1:obsr, each=36)),
agem = rep(rnorm(obsr, mean=30, sd=10), each=36),
trma = as.factor(rep(sample(c(TRUE, FALSE), size=obsr, replace=TRUE), each=36)),
depr = as.factor(rep(sample(c(TRUE, FALSE), size=obsr, replace=TRUE), each=36)))
# check if all factor combinations are present
# TRUE for my real dataset; Naturally TRUE for the fake dataset
with(main, all(table(depr, trma, cpgl) >= 1))
# construct interaction variables
main$depr_trma <- interaction(main$depr, main$trma, sep=":", drop=TRUE)
main$depr_cpgl <- interaction(main$depr, main$cpgl, sep=":", drop=TRUE)
main$trma_cpgl <- interaction(main$trma, main$cpgl, sep=":", drop=TRUE)
main$depr_trma_cpgl <- interaction(main$depr, main$trma, main$cpgl, sep=":", drop=TRUE)
# model WITHOUT preconstructed interaction variables
form1 <- list(fixd = meth ~ agem + depr + trma + depr*trma + cpgl +
depr*cpgl +trma*cpgl + depr*trma*cpgl,
rndm = ~ 1 | pbid,
corr = ~ cpgl | pbid)
modl1 <- nlme::lme(fixed=form1[["fixd"]],
random=form1[["rndm"]],
correlation=corCompSymm(form=form1[["corr"]]),
data=main)
# model WITH preconstructed interaction variables
form2 <- list(fixd = meth ~ agem + depr + trma + depr_trma + cpgl +
depr_cpgl + trma_cpgl + depr_trma_cpgl,
rndm = ~ 1 | pbid,
corr = ~ cpgl | pbid)
modl2 <- nlme::lme(fixed=form2[["fixd"]],
random=form2[["rndm"]],
correlation=corCompSymm(form=form2[["corr"]]),
data=main)
The first model fits without any problems whereas the second model gives me following error:
Error in MEEM(object, conLin, control$niterEM) :
Singularity in backsolve at level 0, block 1
Nothing i found out about this error so far helped me to solve the problem. However the solution is probably pretty easy.
Can someone help me? Thanks in advance!
EDIT 1:
When i run:
modl3 <- lm(form1[["fixd"]], data=main)
modl4 <- lm(form2[["fixd"]], data=main)
The summaries reveal that modl4 (with the self constructed interaction variables) in contrast to modl3 shows many more predictors. All those that are in 4 but not in 3 show NA as coefficients. The problem therefore definitely lies within the way i create the interaction variables...
EDIT 2:
In the meantime I created the interaction variables "by hand" (mainly paste() and grepl()) - It seems to work now. However I would still be interested in how i could have realized it by using the interaction() function.
I should have only constructed the largest of the interaction variables (combining all 3 simple variables).
If i do so the model gets fit. The likelihoods then are very close to each other and the number of coefficients matches exactly.

correlation in multivariate mixed model in r

I am running multivariate mixed model in R by using nlme package. Suppose that x and y are responses variables for longitudinal data which assumed that the error within group is correlated. The residual error matrix is presented as:
So my question is how to involve the correlation into lme function?
I tried commands corr = corComSymm(from =~ 1 | x) or corr = corAR1(from =~ 1 | x) but did not work!
here en example:
# visiting time by months
time = rep(c(0,3,6,9),time = 4, 200)
# subjects
subject = rep(1:50, each = 4)
# first response variable "identity"
x = c(rep(0, 100), rep(1,100))
# second response variable "identity"
y = c(rep(1, 100), rep(0,100))
# values of both reponses variables (x_1, x_2)
value = c(rnorm(100,20,1),rnorm(100,48,1))
# variables refer to reponses variables (x_1, x_2)
variable = factor(c(rep(0,150),rep(1,50)), label=c("X","Y"))
df = data.frame(subject , time, x,y,value, variable)
library(nlme)
# fit the model that each response variable has intercept and slope (time) for each random and fixed effects
# as well as fixed effects slopes for sex and lesion, and each response has different variance
f= lme(value ~ -1 + x + y + x:time + y:time , random = ~ -1 + (x + y) + time:( x + y)|subject ,
weights = varIdent(form=~1| x),corr = corAR1(from = ~ 1|x), control=lmeControl(opt="optim"), data =df)
Error in corAR1(from = ~1 | x) : unused argument (from = ~1 | x)
Any suggestions?
I found this website (below) which helpful and useful, I posted here in case someone might has this problem in future.
https://rpubs.com/bbolker/3336

Can the boxTidwell function handle binary outcome variables?

I initially wanted to run a boxTidwell() (found in the "car" package) analysis on my prospective Logistic Regression model (BinaryOutcomeVar ~ ContinuousPredVar + ContinuousPredVar^2 + ContinuousPredVar^3). I ran into issues:
Error in x - xbar : non-numeric argument to binary operator
In addition: Warning message:
In mean.default(x) : argument is not numeric or logical: returning NA
So, I created a reproducable example for demonstrating the error:
Doesn't work:
boxTidwell(formula = Treatment ~ uptake, other.x = ~ poly(x = colnames(CO2)[c(1,2,4)], degree = 2), data = CO2)
boxTidwell(y = CO2$Treatment, x = CO2$uptake)
Works:
boxTidwell(formula = prestige ~ income + education, other.x = ~ poly(x = women , degree = 2), data = Prestige)
I've been goofing around with the other.x parameter and am guessing that's the issue.
Question
So, does anyone know if 1. the boxTidwell() function works with binary outcome variables 2. the logic behind the other.x, because I can't get my dummy example to work either.
After further searching, it looks like the car:::boxTidwell can't handle the binary outcome variable in the formula, but it can be hand coded:
require(MASS)
require(car)
d1<-read.csv("path for your csv file",sep=',',header=TRUE)
x<-d1$explanatory variable name
y<-d1$dependent variable name
#FIT IS DONE USING THE glm FUNCTION
m1res <- glm(y ~ x,family=binomial(link = "logit"))
coeff1<- coefficients(summary(m1res))
lnx<-x*log(x)
m2res <- glm(y ~ x+lnx ,family=binomial(link = "logit"))
coeff2<- coefficients(summary(m2res))
alpha0<-1.0
pvalue<-coeff2[3,4]
pvalue
beta1<-coeff1[2,1]
beta2<-coeff2[3,1]
iter<-0
err<-1
while (pvalue<0.1) {
alpha <-(beta2/beta1)+alpha0
err<-abs(alpha-alpha0)
alpha0<-alpha
mx<-x^alpha
m1res <- glm(y ~ mx,family=binomial(link = "logit"))
coeff1<- coefficients(summary(m1res))
mlnx<-mx*log(x)
m2res <- glm(y ~ mx+mlnx ,family=binomial(link = "logit"))
coeff2<- coefficients(summary(m2res))
pvalue<-coeff2[3,4]
beta1<-coeff1[2,1]
beta2<-coeff2[3,1]
iter<- iter+1
}
# PRINT THE POWER TO CONSOLE
alpha
above code taken from:
https://sites.google.com/site/ayyalaprem/box-tidwelltransform

Resources