Calculating the standard error of parameters in nlme - r

I am running a non-linear mixed model in nlme, and I am having trouble calculating the standard errors of the three parameters. We have our final model here:
shortG.nlme9 <- update(shortG.nlme6,
                       fixed = Asym + xmid + scal ~ Treatment * Breed + Environment,
                       start = c(shortFix6[1:16], rep(0, 2),
                                 shortFix6[17:32], rep(0, 2),
                                 shortFix6[33:48], rep(0, 2)),
                       control = nlmeControl(pnlsTol = 0.02, msVerbose = TRUE))
When we plug it into the summary statement, we get the standard errors of each of the treatments, breeds, treatment-by-breed interactions, and environments. However, we want to build growth curves for specific combinations (treatment1/breed1, treatment2/breed1, treatment3/breed1, etc.), so we need to combine the effects of treatment, breed, and environment into each parameter value, and combine their standard errors accordingly to get the SE of the full parameter. To do this, is there either a way to get R to compute the full SE on its own, or an easy way to have R give us the variance-covariance matrix so we can calculate the values by hand? When we look at the basic output from summary(shortG.nlme9), we are automatically given a correlation matrix, so is there something we could call to get a covariance matrix instead?
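For reference, one route we are considering (a sketch, not yet tested on this model; the coefficient names below are placeholders): vcov() on an nlme fit should return the approximate variance-covariance matrix of the fixed effects, and the SE of any linear combination of coefficients follows from it.

## Sketch: approximate variance-covariance matrix of the fixed effects
V <- vcov(shortG.nlme9)

## Contrast vector selecting one treatment/breed combination; the
## coefficient names here are placeholders for the actual rownames(V)
k <- setNames(rep(0, nrow(V)), rownames(V))
k[c("Asym.(Intercept)", "Asym.Treatment2")] <- 1

est <- sum(k * fixef(shortG.nlme9))          # combined parameter estimate
se  <- sqrt(as.numeric(t(k) %*% V %*% k))    # its standard error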

Related

Syntax error when fitting a Bayesian logistic regression

I am attempting to model binary species traits, where presence is represented by 1 and absence by 0, as a function of some sampling variables. To accomplish this, I have constructed a brms model and added a phylogenetic structure to it. Here is the model I used:
model <- brms::brm(male_head | trials(1 + 0) ~
                     PC1 + PC2 + PC3 +
                     (1 | gr(phylo, cov = covariance_matrix)),
                   data = data,
                   family = binomial(),
                   prior = prior,
                   data2 = list(covariance_matrix = covariance_matrix))
Each line of my df represents one observation with a binary outcome.
Initially, I was unsure about which arguments to use in the trials() function. Since my species are non-repeated and some have the traits I'm modeling while others do not, I thought that trials(1 + 0) might be appropriate. I recall seeing a vignette that suggested this, but I can't find it now. Is this syntax correct?
Furthermore, for a reason I'm unaware of, the model is producing one estimate value for each line of my predictors. As my df has 362 lines, the model summary displays a lengthy list of 362 estimate values. I would prefer to have one estimate value for each sampling variable instead. Although I have managed to achieve this by making the treatment effect a random effect (i.e., (1|PC1) + (1|PC2) + (1|PC3)), I don't think this is the appropriate approach. I also tried bernoulli(), but with no success either. Do you have any suggestions for how I can address this issue?
EDIT:
For some reason the values of my sampling variables/principal components were being read as factors; with that fixed, the second part of this question was solved.
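For the first part, a sketch of how such a model is often written once the outcome is a plain 0/1 per row (hedged, untested; data, prior, and covariance_matrix are the objects from the question): with family = bernoulli() there is one trial per row by definition, so no trials() term is needed at all, and the binomial equivalent would be male_head | trials(1), not trials(1 + 0).

## Sketch: Bernoulli likelihood, one 0/1 trial per row, no trials() term;
## data, prior, and covariance_matrix are the objects from the question
model <- brms::brm(male_head ~ PC1 + PC2 + PC3 +
                     (1 | gr(phylo, cov = covariance_matrix)),
                   data = data,
                   family = bernoulli(),
                   prior = prior,
                   data2 = list(covariance_matrix = covariance_matrix))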

Heteroskedasticity- and autocorrelation-robust standard errors for Least Squares Dummy Variables (LSDV)

I have a panel data set with N = 17 Spanish regions and T = 32 years, and I want to fit a fixed-effects model that controls for individual heterogeneity. However, as I have two time-invariant independent variables, I can't use the within estimator from plm() because it drops them. Thus, I must use LSDV, like this:
mcorr <- lm(subv ~ preelec + elec + postelec + ideo + ali + crec_pib + pob + pob16 + pob64 + factor(ccaa) - 1, data = datos)
where ccaa is the variable that identifies the individual (region). Of course, the coefficient estimates are the same as if I fit the same model using plm() with the within estimator.
Nevertheless, when I use robust standard errors to correct for heteroskedasticity and autocorrelation in panel data, I get different values for the errors, for example with coeftest(mcorr, vcovHC(mcorr), method = "arellano"). When I use another alternative for the LSDV model, the command vcovHAC(), the errors are similar but still not identical.
What is the best way to account for heteroskedasticity and autocorrelation while using the LSDV method?
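One common approach is cluster-robust (Arellano-type) standard errors clustered by region, which for an lm fit can be obtained from the sandwich package. A sketch, assuming the mcorr object above:

library(sandwich)
library(lmtest)

## Cluster-robust covariance matrix, clustered on the region factor ccaa
coeftest(mcorr, vcov = vcovCL(mcorr, cluster = ~ ccaa, type = "HC1"))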

Gamma distribution in a GLMM

I am trying to create a GLMM in R. I want to find out how the emergence time of bats depends on different factors. Here I take the time difference between the departure of the respective bat and the sunset of the day as the dependent variable (metric). As fixed factors I would like to include different weather data (metric) as well as the reproductive state (categorical) of the bats. Additionally, there is the transponder number (an individual identification code) as a random factor, to account for inter-individual differences between the bats.
I first worked in R with a linear mixed model (package lme4), but the QQ plot of the residuals deviates very strongly from the normal distribution, and a histogram of the data suggests a gamma distribution instead. As a result, I implemented a GLMM with a gamma distribution. Here is an example with one weather parameter:
model <- glmer(difference_in_min ~ repro + precipitation + (1 + repro | `transponder number`),
               data = trip, control = ctrl, family = Gamma(link = "log"))
However, since there was no change in the QQ plot this way, I looked at the residual diagnostics of the DHARMa package. But the distribution assumption still doesn't seem to be correct, because the data in the QQ plot deviates very much here, too.
Residual diagnostics from DHARMa
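For reference, a minimal sketch of the DHARMa check described above (assuming the fitted model object from the question):

library(DHARMa)

## Simulate scaled residuals and draw the QQ and residual-vs-predicted plots
sim <- simulateResiduals(fittedModel = model, n = 1000)
plot(sim)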
But if the data also do not correspond to a gamma distribution, what alternative is there? Or maybe the problem lies somewhere else entirely.
Does anyone have an idea where the error might lie?
But if the data also do not correspond to a gamma distribution, what alternative is there?
One alternative is the lognormal distribution (https://en.wikipedia.org/wiki/Log-normal_distribution).
Gaussian (or normal) models assume residuals that are normally distributed around zero, which it sounds like you do not have. The lognormal distribution does not carry the same requirement. Following your previous code, you would fit it like this:
model <- glmer(log(difference_in_min) ~ repro + precipitation + (1 + repro | `transponder number`),
               data = trip, control = ctrl, family = gaussian(link = "identity"))
Or, instead of glmer, you can call lmer directly, where you don't need to specify the distribution (which glmer may tell you to do in a warning message anyway):
model <- lmer(log(difference_in_min) ~ repro + precipitation + (1 + repro | `transponder number`),
              data = trip, control = ctrl)

How to specify random effects names in a newdata data.frame used in predict() function? - lme4

I have a problem using the predict() function in lme4.
More precisely, I am not clear on how to declare the names of the random effects in the newdata data frame that I feed to the predict() function in order to get predictions.
I will try and describe my problem in detail.
Data
The data I am working with are longitudinal. I have 119 observations, and for each of them several (6-7) measurements, which represent the size of proteins that aggregate over time and grow bigger (let's call it LDL).
Model
The model used to describe this process is a Richards curve (generalized logistic function), which can be written as

LDL(t) = 15 + (alpha - 15) / (1 + exp((gamma - t) / delta))
Now, I fit a separate curve to the group of measurements for each observation, using the following fixed effects, random effects, and variables:
alpha_fix - a fixed effect for alpha
alpha|Obs - a random effect for alpha, which varies among observations
gamma_fix - a fixed effect for gamma
gamma|Obs - a random effect for gamma, which varies among observations
delta_f - a fixed effect for delta
Time - a continuous variable, time in hours
LDL - response variable, continuous, representing size of proteins at time point t.
Predictions
Once I fit the model, I want to use it to predict the value of LDL at a specific time point for each observation. In order to do this, I need to use the predict() function and supply a data frame as newdata. Reading through the documentation here, it says the following:
If any random effects are included in re.form (see below), newdata
must contain columns corresponding to all of the grouping variables
and random effects used in the original model, even if not all are
used in prediction; however, they can be safely set to NA in this case
Now, the way I understand this, I need a newdata data frame which in my case contains the following columns: "Time", "Obs", "alpha_f", "gamma_f", "delta_f", as well as two columns for the random effects of alpha and gamma, respectively. However, I am not sure how these two columns with random effects should be named in order for the predict() function to understand them. I tried "alpha|Obs" and "gamma|Obs", as well as "Obs$alpha" and "Obs$gamma", but all of these throw the error
Error in FUN(X[[i]], ...) : random effects specified in re.form
that were not present in original model.
I was wondering whether anyone has any idea what the correct way to do this is.
For completeness, the code used to fit the model is provided below:
ModelFunction <- function(alpha, gamma, delta, Time) {
  15 + (alpha - 15) / (1 + exp((gamma - Time) / delta))
}

ModelGradient <- deriv(body(ModelFunction)[[2]],
                       namevec = c("alpha", "gamma", "delta"),
                       function.arg = ModelFunction)

starting_conditions <- c(alpha = 5000, gamma = 1.5, delta = 0.2)  # based on visual observation

fit <- nlmer(LDL ~ ModelGradient(alpha, gamma, delta, Time) ~ (gamma | Obs) + (alpha | Obs),
             start = starting_conditions,
             control = nlmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 100000)),
             data = ldlData)
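For illustration, a sketch of what I assume newdata should look like for this fit (untested): only the covariate (Time) and the grouping factor (Obs) get columns; the random-effect terms themselves get none.

## Sketch (untested): predict at Time = 24 for an existing observation;
## Obs must use the factor levels from the original data
newObs <- data.frame(Time = 24,
                     Obs  = factor("1", levels = levels(ldlData$Obs)))

predict(fit, newdata = newObs)                # conditional on the random effects
predict(fit, newdata = newObs, re.form = NA)  # population-level prediction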
I would really appreciate it if someone could give me some advice.
Thanks for your time!

Mixed Modelling - Different Results between lme and lmer functions

I am currently working through Andy Field's book, Discovering Statistics Using R. Chapter 14 is on Mixed Modelling and he uses the lme function from the nlme package.
The model he creates, using speed dating data, is such:
speedDateModel <- lme(dateRating ~ looks + personality +
gender + looks:gender + personality:gender +
looks:personality,
random = ~1|participant/looks/personality)
I tried to recreate a similar model using the lmer function from the lme4 package; however, my results are different. I thought I had the proper syntax, but maybe not?
speedDateModel.2 <- lmer(dateRating ~ looks + personality + gender +
looks:gender + personality:gender +
(1|participant) + (1|looks) + (1|personality),
data = speedData, REML = FALSE)
Also, when I inspect the coefficients of these models, I notice that they only produce random intercepts for each participant. I was trying to create a model that produces both random intercepts and slopes, but I can't seem to get the syntax correct in either function. Any help would be greatly appreciated.
The only difference between the lme and the corresponding lmer formula should be that the random and fixed components are aggregated into a single formula:
dateRating ~ looks + personality +
gender + looks:gender + personality:gender +
looks:personality+ (1|participant/looks/personality)
using (1|participant) + (1|looks) + (1|personality) is only equivalent if looks and personality have unique values at each nested level.
It's not clear which continuous variable you want your slopes to vary over: if you have a continuous variable x and groups g, then (x|g), or equivalently (1+x|g), will give you a random-slopes model (x should also be included in the fixed-effects part of the model, i.e. the full formula should be y ~ x + (x|g) ...)
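As a concrete sketch (y, x, g, and mydata are placeholders, not variables from the speed-dating data):

library(lme4)

## Random intercepts and random slopes of x across groups g;
## y, x, g, and mydata are placeholders for illustration only
m <- lmer(y ~ x + (x | g), data = mydata)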
Update: I got the data, or rather a script file that allows one to reconstruct the data, from here. Field makes a common mistake in his book, one I have made several times myself: since there is only a single observation in the data set for each participant/looks/personality combination, the three-way interaction has one level per observation. In a linear mixed model, this means the variance at the lowest level of nesting is confounded with the residual variance.
You can see this in two ways:
lme appears to fit the model just fine, but if you try to calculate confidence intervals via intervals(), you get
intervals(speedDateModel)
## Error in intervals.lme(speedDateModel) :
## cannot get confidence intervals on var-cov components:
## Non-positive definite approximate variance-covariance
If you try this with lmer you get:
## Error: number of levels of each grouping factor
## must be < number of observations
In both cases, this is a clue that something's wrong. (You can overcome this in lmer if you really want to: see ?lmerControl.)
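The override mentioned above would look something like the following sketch (usually inadvisable here, for the confounding reason given):

## Relax the checks that trigger the "number of levels" error (not recommended)
ctrl <- lmerControl(check.nobs.vs.nlev = "ignore",
                    check.nobs.vs.nRE  = "ignore")
sd_full <- lmer(dateRating ~ looks + personality + gender +
                  looks:gender + personality:gender + looks:personality +
                  (1 | participant/looks/personality),
                data = speedData, control = ctrl)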
If we leave out the lowest grouping level, everything works fine:
sd2 <- lmer(dateRating ~ looks + personality +
              gender + looks:gender + personality:gender +
              looks:personality +
              (1 | participant/looks),
            data = speedData)
Compare lmer and lme fixed effects:
all.equal(fixef(sd2), fixef(speedDateModel))  ## TRUE
The starling example here gives another example and further explanation of this issue.
