Fixing random slope and random intercept values in lme4 (R)

I'm currently trying to calculate the effect size (Cohen's f^2) of a given effect, which requires running a null and a partial model with the random effects pre-specified (following Selya et al., 2012, on dealing with continuous predictors). Selya et al. outline the necessary code in SAS; I'm trying to figure out how to do the same in R.
When I used the last piece of code from a prior, similar question, I kept getting an error that the theta sizes do not match ("3 != 4"). I think the issue is that my original model has a cross-level interaction term, while the model in the prior post has only random intercepts: I'm trying to hold constant not just the random effects on the intercept but also the random effects on a slope. Calling getME() with "theta" confirms that my mod3b model does have 4 theta values, so I imagine I need to add another parameter term to the code; I just cannot figure out how to get the variance of the fourth theta term to show in my original output. How can I amend the code to make this run?
Here's my original cross-classified model with random slopes & random intercepts:
mod3b <- lmer(FitBelong ~ Condition*Gender +
                (1 + Condition | ResponseID) + (1 | Stimuli),
              data = LFS1Ensemble, REML = TRUE)
summary(mod3b)
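For reference, this is the check described above; the four components are lme4's residual-scaled Cholesky parameterization (intercept SD, intercept-slope covariance term, and slope SD for ResponseID, plus the Stimuli intercept SD):
getME(mod3b, "theta")  # length 4 for (1 + Condition | ResponseID) + (1 | Stimuli)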
I've adapted the code from the linked answer as follows:
#Effect size of interaction#
buildMM <- function(theta) {
  dd <- as.function(mod3b)
  ff <- dd(theta)
  opt <- list(par = c(0, 0, 0), fval = ff, conv = 0)
  mm <- mkMerMod(environment(dd), opt, lmod$reTrms, fr = lmod$fr,
                 mc = quote(hacked_lmer()))
  return(mm)
}
objfun <- function(x, target = c(3.92244, 0.08805, 0.09683)) {
  mm <- buildMM(sqrt(x))
  return(sum((unlist(VarCorr(mm)) - target)^2))
}
s0 <- c(3.92244, 0.08805, 0.09683)/sigma(mod3b)^2
opt <- optim(fn = objfun, par = s0)
mm_final <- buildMM(sqrt(opt$par))
summary(mm_final)
The error it throws is: "Error: theta size mismatch"
When I do the traceback it gives me:
6 stop(structure(list(message = "theta size mismatch", call = NULL,
cppstack = NULL), .Names = c("message", "call", "cppstack"
), class = c("std::invalid_argument", "C++Error", "error", "condition"
)))
5 dd(theta)
4 buildMM(sqrt(x))
3 fn(par, ...)
2 (function (par)
fn(par, ...))(c(2.1571475240413, 0.0484231344499436, 0.0532516991344468
))
1 optim(fn = objfun, par = s0)
Any and all help is much appreciated!
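One way to extend the code (a sketch, not a verified fix): since mod3b has four theta components, the parameter vector being optimized needs four entries too, and because the off-diagonal Cholesky term can be negative, it is safer to optimize over theta directly rather than over sqrt(x). Here lmod is rebuilt with lFormula(), since the original snippet assumes it already exists, and a placeholder of 0 for the intercept-slope covariance is added to the target:
library(lme4)
## rebuild the model structure that mkMerMod() needs
lmod <- lFormula(formula(mod3b), data = LFS1Ensemble, REML = TRUE)

buildMM <- function(theta) {
  dd <- as.function(mod3b)
  ff <- dd(theta)                       # deviance at the fixed theta
  opt <- list(par = theta, fval = ff, conv = 0)
  mkMerMod(environment(dd), opt, lmod$reTrms, fr = lmod$fr,
           mc = quote(hacked_lmer()))
}

objfun <- function(theta, target) {
  mm <- buildMM(theta)
  ## variance components in as.data.frame() order: ResponseID intercept
  ## variance, slope variance, their covariance, Stimuli intercept variance
  vc <- as.data.frame(VarCorr(mm))$vcov
  sum((vc[seq_along(target)] - target)^2)
}

## the three variances from above plus a placeholder covariance of 0 --
## substitute the values you actually want to hold fixed
target <- c(3.92244, 0.08805, 0, 0.09683)
opt <- optim(fn = objfun, par = getME(mod3b, "theta"), target = target)
mm_final <- buildMM(opt$par)
summary(mm_final)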

Related

Getting R^2 from dredge() when the global model has an optimizer

Goal: get marginal and conditional R^2 in dredge results when the original model uses an optimizer.
This branches off of this question: "dredge doesn't work when specifying glmer optimizer" and the two solutions provided there.
Solution 1: change r.squaredLR.R package code
Solution 2: add a function into the dredge function to call r.squaredGLMM instead of r.squaredLR
I tried Solution 2 first, which works perfectly on the simulated data, but when I try it on my model I get the error:
Error in r.squaredGLMM(x, null = nullmodel)["delta", ] :
subscript out of bounds
I then tried Solution 1, altering the source code of r.squaredLR.R as described, saving it as an R script, and calling the edited null.fit function with source() so as to avoid editing r.squaredLR.R permanently (I load MuMIn before sourcing the edited function). Yet this doesn't work.
Back to Solution 2...
I tried to simulate data similar to mine and was able to reproduce the same error (the lmercontrol argument is disregarded in this global model, but since I get the error in question anyway, I didn't try to adjust the data so that lmercontrol would be needed).
#Solution 2 attempt
set.seed(101)
dd <- data.frame(x1 = rnorm(1920), x2 = rnorm(1920), x3 = rnorm(1920),
                 x4 = rnorm(1920),
                 treatment = factor(rep(1:2, each = 3)),
                 replicate = factor(rep(1:3, each = 1)),
                 stage = factor(rep(1:5, each = 384)),
                 country = factor(rep(1:4, each = 96)),
                 plot = factor(rep(1:10, each = 24)),
                 chamber = factor(rep(1:6, each = 1)),
                 n = 1920)
library(lme4)
dd$y <- simulate(~ x1 + x2 + x3 + (1|plot),
                 family = binomial,
                 weights = dd$n,
                 newdata = dd,
                 newparams = list(beta = c(1, 1, 1, 1),
                                  theta = 1))[[1]]
# my real response variable 'y' has a Poisson distribution, but I had difficulty
# figuring out how to simulate from a Poisson, so I left the binomial.
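Since the real response is Poisson, here is a sketch of the Poisson equivalent (the rate comes from the log link, so no weights argument is needed; the parameter values are the same placeholders as above):
dd$y_pois <- simulate(~ x1 + x2 + x3 + (1|plot),
                      family = poisson,
                      newdata = dd,
                      newparams = list(beta = c(1, 1, 1, 1),
                                       theta = 1))[[1]]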
m0 <- lmer(y ~ x1 + x2 + x3 + x4 + treatment*replicate*stage +
             (1|chamber) + (1|country/plot),
           data = dd,
           na.action = "na.fail",
           REML = FALSE,
           lmercontrol = glmerControl(optimizer = "bobyqa"))
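(This is probably why lmercontrol is disregarded: lmer() has no argument by that name, so it is ignored. The optimizer is set through control, using lmerControl() for a linear mixed model; a sketch of the standard spelling:)
m0 <- lmer(y ~ x1 + x2 + x3 + x4 + treatment*replicate*stage +
             (1|chamber) + (1|country/plot),
           data = dd, na.action = "na.fail", REML = FALSE,
           control = lmerControl(optimizer = "bobyqa"))  # not lmercontrol =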
nullmodel <- MuMIn:::.nullFitRE(m0)
dredge(m0, m.lim = c(0, 5), rank = "AIC",
       extra = list(R2 = function(x) {
         r.squaredGLMM(x, null = nullmodel)["delta", ]
       }))
A suggested reason for the "subscript out of bounds" error was that "the data being put into the algorithm are not in the format that the function expects."
Indeed, the function works when I remove ["delta", ] and I get the columns R21 and R22, but without taking the delta row into account these values are probably incorrect, and I'm not sure which one is the marginal and which the conditional R^2.
If you have any ideas, I'm all ears! Thanks in advance for all help.
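One possible explanation (an untested sketch): for a gaussian lmer fit, r.squaredGLMM() can return a matrix without a "delta" row, because the delta method only applies to non-gaussian GLMMs, and indexing a missing row name is exactly a "subscript out of bounds" error. Indexing the last row works in both cases, and the columns are marginal R^2 first (R2m) and conditional R^2 second (R2c):
r2fun <- function(x) {
  r2 <- r.squaredGLMM(x, null = nullmodel)
  r2[nrow(r2), ]   # "delta" row for GLMMs; the only row for gaussian LMMs
}
dredge(m0, m.lim = c(0, 5), rank = "AIC", extra = list(R2 = r2fun))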

How to solve the impacts() neighbors length error after running spdep::lagsarlm (spatial autoregressive model)?

I have 9,150 polygons in my dataset. I was trying to run a spatial autoregressive model (SAR) in spdep to test spatial dependence of my outcome variable. After running the model, I wanted to examine the direct/indirect impacts, but encountered an error that seems to have something to do with the length of neighbors in the weights matrix not being equal to n.
I tried running the very same equation as an SLX model (spatially lagged X), and impacts() worked fine, even though some polygons in my set had no neighbors. I googled and looked through the spdep documentation, but couldn't find a clue on how to solve this error.
# Defining queen contiguity neighbors for polyset and storing the matrix as list
q.nbrs <- poly2nb(polyset)
listweights <- nb2listw(q.nbrs, zero.policy = TRUE)
# Defining the model
model.equation <- TIME ~ A + B + C
# Run SAR model
reg <- lagsarlm(model.equation, data = polyset, listw = listweights, zero.policy = TRUE)
# Run impacts() to show direct/indirect impacts
impacts(reg, listw = listweights, zero.policy = TRUE)
Error in intImpacts(rho = rho, beta = beta, P = P, n = n, mu = mu, Sigma = Sigma, :
length(listweights$neighbours) == n is not TRUE
I know this is a question from 2019, but maybe it can help people dealing with the same problem. I found out that in my case the problem was the type of dataset: data = polyset should be of class "SpatialPolygonsDataFrame", which can be achieved by converting your data:
polyset_spatial_sf <- sf::as_Spatial(polyset, IDs = polyset$ID)
Then rerun your code.
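For example (a sketch, assuming the conversion above):
class(polyset_spatial_sf)   # should now be "SpatialPolygonsDataFrame"
reg <- lagsarlm(model.equation, data = polyset_spatial_sf,
                listw = listweights, zero.policy = TRUE)
impacts(reg, listw = listweights, zero.policy = TRUE)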

Parameter estimates using FME ODE model fitting in R

I have a system of ODEs that I am trying to fit to generated data, synthetic or lab. The final product I am interested in is the parameters and their estimated errors. We use the R package FME with modCost and modFit. As an example, a system of ODEs may be defined as:
eqs <- function(time, y, parms, ...) {
  with(as.list(c(parms, y)), {
    dP <- k2*PA - k1*A*P  # concentration of nucleic acid
    dA <- dP              # concentration of free protein
    dPA <- -dP
    list(c(dA, dP, dPA))
  })
}
with parameters k1 and k2 and variables A, P, and PA. I import the data (not shown) and define the cost function used in modFit:
cost <- function(p, data, ...) {
  yy <- p[c("A", "P", "PA")]
  pp <- p[c("k1", "k2")]
  out <- ode(yy, time, eqs, pp)
  modCost(out, data, ...)
}
I set some initial conditions with a parms vector and then do the fitting with
fit <- modFit(f = cost, p = parms, data = dat, weight = "std",
              lower = rep(0, 8), upper = c(600, 100, 600, 0.01, 0.01),
              method = "Marq")
I then run a final ode() with the best-fit parameters to get the fitted curves, Bob's your uncle, and boom, estimated parameters. The specific input numbers don't matter; I hope my process outline is legible for those who use this package.
My issue and question center on the following: I'm a scientist, a physicist, and the error of the estimated parameters is important to report. Can I generate the estimated error from FME somehow, or is there a separate package for that kind of output?
I don't get your point. You can just use:
summary(fit)
to see the Std. Error.
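And if you want the numbers programmatically rather than printed, a sketch (assuming the coefficient table is stored in $par, as in current FME):
s <- summary(fit)
s$par[, c("Estimate", "Std. Error")]   # estimates with their standard errors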

Fit state space model using dlm

I am about to fit a state space model to a univariate time series (y_t). The model I am trying to fit is:
y_t = F x_t + ε_t,     ε_t ~ N(0, V)
x_{t+1} = G x_t + w_t,   w_t ~ N(0, W)
x_0 ~ N(m_0, C_0)
I use the following R-code:
# Create function of unknown parameters, which returns a dlm object
Build <- function(theta) {
  dlm(FF = theta[1], GG = theta[2], V = theta[3], W = theta[4],
      m0 = theta[5], C0 = theta[6])
}
# Fit model to data using MLE
f1 <- dlmMLE(y, parm = c(1, 1, 0.1, 0.1, 0, 0.1), Build)
But I get the following error message (after running f1):
Error in dlm(FF = theta[1], GG = theta[2], V = theta[3], W = theta[4], :
  V is not a valid variance matrix
My problem is that I don't understand why V is not a valid variance matrix. Does anyone know what is wrong?
Thank you in advance.
Regards, fuente
EDIT:
I tried doing the same, but instead of my real data I used:
y <- rnorm(72,6.44,1.97)
This produced, however, an error involving W (and not V...):
Error in dlm(FF = theta[1], GG = theta[2], V = theta[3], W = theta[4], :
W is not a valid variance matrix
I'm confused. Does it have something to do with the starting values passed to parm=...?
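A plausible culprit (a sketch, not a verified diagnosis): dlmMLE() optimizes over an unconstrained parameter vector, so the optimizer is free to step to negative values for V, W, or C0, which dlm() then rejects as invalid variance matrices; which one fails first just depends on the data and starting values. The usual remedy is to optimize the variances on the log scale:
Build <- function(theta) {
  dlm(FF = theta[1], GG = theta[2],
      V = exp(theta[3]), W = exp(theta[4]),   # exp() keeps variances positive
      m0 = theta[5], C0 = exp(theta[6]))
}
f1 <- dlmMLE(y, parm = c(1, 1, log(0.1), log(0.1), 0, log(0.1)), Build)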

Error when using msmFit in R

I'm trying to replicate this paper (Point Forecast Markov Switching Model for U.S. Dollar/Euro Exchange Rate, by Hamidreza Mostafei) in R. The table I'm trying to reproduce is on page 483. Here is a link to a pdf.
I wrote the following code and got an error at the last line:
mydata <- read.csv("C:\\Users\\User\\Downloads\\EURUSD_2.csv", header=T)
mod <- lm(EURUSD~EURUSD.1, mydata)
mod.mswm = msmFit(mod, k=2, p=1, sw=c(T,T,T,T), control=list(parallel=F))
Error in if ((max(abs(object["Fit"]["logLikel"] - oldll))/(0.1 + max(abs(object["Fit"]["logLikel"]))) < :
missing value where TRUE/FALSE needed
Basically, the data being used is EURUSD, the level change at monthly frequency. EURUSD.1 is the one-lag variable. Both EURUSD and EURUSD.1 are in my csv file. (I'm not sure how to attach the csv file here; if someone could point that out, that would be great.)
When I changed the EURUSD.1 values to something random, msmFit seemed to work. But whenever I used the original values, i.e. the lag values, the error came back.
Something degenerate is happening when one variable is simply lagged from the other. Consider:
Sample data frame where Y is lagged X:
> d = data.frame(X=runif(100))
> d$Y=c(.5, d$X[-100])
> mod <- lm(X~Y,d)
> mod.mswm = msmFit(mod, k=2, p=1, sw=c(T,T,T,T), control=list(parallel=F))
Error in if ((max(abs(object["Fit"]["logLikel"] - oldll))/(0.1 + max(abs(object["Fit"]["logLikel"]))) < :
missing value where TRUE/FALSE needed
that gives your error. Let's add a tiny tiny bit of noise to Y and see what happens:
> d$Y=d$Y+rnorm(100,0,.000001)
> mod <- lm(X~Y,d)
> mod.mswm = msmFit(mod, k=2, p=1, sw=c(T,T,T,T), control=list(parallel=F))
> mod.mswm
Markov Switching Model
Call: msmFit(object = mod, k = 2, sw = c(T, T, T, T), p = 1, control = list(parallel = F))
AIC BIC logLik
4.3109 47.45234 3.84455
Coefficients:
(Intercept)(S) Y(S) X_1(S) Std(S)
Model 1 0.8739622 -22948.89 22948.83 0.08194545
Model 2 0.4220748 77625.21 -77625.17 0.21780764
Transition probabilities:
Regime 1 Regime 2
Regime 1 0.3707261 0.3886715
Regime 2 0.6292739 0.6113285
It works! Now either:
Having perfectly lagged variables causes some "divide by zero" error because it's a purely degenerate case (like having perfectly collinear variables in a linear model). A little experimenting shows that the resulting output is very sensitive to how much noise you add, so I think it's on a knife-edge here: perfectly lagged variables seem to lead to some singularity or degeneracy.
or
There's some bug in the function.
I have no idea what msmFit does, so that's for you to sort out.
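One way to see the degeneracy concretely (a sketch): with p = 1, msmFit adds the lagged response as the X_1 regressor, and in this construction Y already is the lagged response, so the two columns are (nearly) identical; the huge, almost equal-and-opposite Y(S) and X_1(S) coefficients above are the classic signature of that collinearity:
X_1 <- c(NA, d$X[-100])    # the lag msmFit builds internally when p = 1
cor(d$Y[-1], X_1[-1])      # essentially 1: Y and X_1 carry the same information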