Using lme4 for random effect [closed] - r

I have 3 random variables, x, y, and z (all random effects).
x is nested in y, but y is crossed with z.
I use the following call in lme4, but it does not work:
fit <- lmer(A ~ 1 + (1 | x/y) + (1 | y*z) + (1 | x/y*z), data = mydata)
Can anyone help me? Many thanks.

I'm afraid this is still very unclear. More context would be useful. My guess is that you want
A ~ 1 + (1|y)+ (1|z) + (1|y:z) + (1|y:x)
or equivalently
A ~ 1 + (1|y*z) + (1|y:x)
but it's almost impossible to know for sure.
The first two random-effects terms give the among-y and among-z variances.
The third term gives the variance among combinations of y and z; you will only want this if you have multiple observations for each {y,z} combination.
The last term gives the effect of x nested within y.
The expression A ~ 1 + (1|y/x) + (1|z/y) should give you the same results, because a/b expands in general to a + a:b (order matters for / but not for :), but it's less clear.
Crossed random effects are generally denoted by (1|y) + (1|z), or by (1|y*z) (which expands to (1|y) + (1|z) + (1|y:z)) if as discussed above there are multiple observations per {y,z} combination.
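As a concrete illustration of the expansion above, here is a minimal sketch on simulated data (the grouping factors, group sizes, and variance components are all made up for illustration, not taken from the question):
library(lme4)
set.seed(1)
## hypothetical structure: x nested in y, y crossed with z, three replicates per row of the grid
dat <- expand.grid(y = factor(1:10), z = factor(1:6), rep = 1:3)
dat$x <- interaction(dat$y, sample(3, nrow(dat), replace = TRUE))  # x labels are unique within y, so x is nested in y
## simulate among-y, among-z, y:z, and y:x variation plus residual noise
b_y  <- rnorm(nlevels(dat$y), sd = 1)
b_z  <- rnorm(nlevels(dat$z), sd = 1)
b_yz <- rnorm(nlevels(dat$y) * nlevels(dat$z), sd = 0.5)
b_yx <- rnorm(nlevels(dat$x), sd = 0.5)
dat$A <- b_y[dat$y] + b_z[dat$z] +
         b_yz[as.integer(interaction(dat$y, dat$z))] +
         b_yx[dat$x] + rnorm(nrow(dat), sd = 0.3)
## crossed y and z, their interaction, and x nested in y
fit <- lmer(A ~ 1 + (1 | y) + (1 | z) + (1 | y:z) + (1 | y:x), data = dat)
VarCorr(fit)  # one variance component per grouping factor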

Related

Fitting random factors for a linear model using lme4

I have 4 random factors and I want to fit a linear mixed model with them using lme4, but I have struggled to fit the model.
Assume A is nested within B (2 levels), which in turn is nested within each of xx preceptors (P). All responded to xx Ms (M).
I want to fit the model to get variances for each factor and their interactions.
I have used the following code to fit the model, but I was unsuccessful.
lme4::lmer(value ~ A +
             (1 + A | B) +
             (1 + P | A) +
             (1 + P | M),
           data = myData, na.action = na.exclude)
I also read some interesting material here, but I still struggle to fit the model. Any help?
At a guess, if the nesting structure is P (teachers) / B (occasions) / A (participants), meaning that the occasions for one teacher are assumed to be completely independent of the occasions for any other teacher, that participants in turn are never shared across occasions or teachers, and that questions (M) are shared across all teachers, occasions, and participants, then:
value ~ 1 + (1| P / B / A) + (1|M)
Some potential issues:
As you hint in the comments, it may not be practical to fit random effects for factors with small numbers of levels (say, < 5); this is likely to lead to the dreaded "singular model" message (see the GLMM FAQ for more detail).
If all of the questions (M) are answered by every participant, then in principle it is possible to fit a model that takes account of the among-question correlation within participants: the maximal model would be ~ 1 + (M | P / B / A) (which would look for among-question correlations at the level of teacher, occasion within teacher, and participant within occasion within teacher). However, this is very unlikely to work in practice (especially if each participant answers each question only once, in which case the teacher:occasion:participant:question variance will be confounded with the residual variance in a linear model). In this case, you will get an error about "probably unidentifiable": see e.g. this question for more explanation/detail.
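Here is a minimal sketch of that specification on simulated data (the group sizes and variance components below are invented purely for illustration; they are not the questioner's data):
library(lme4)
set.seed(42)
## hypothetical sizes: 8 teachers, 3 occasions each, 4 participants per occasion, 12 questions answered by everyone
dat <- expand.grid(P = factor(1:8), B = factor(1:3), A = factor(1:4), M = factor(1:12))
b_P   <- rnorm(8, sd = 1.0)
b_PB  <- rnorm(8 * 3, sd = 0.7)
b_PBA <- rnorm(8 * 3 * 4, sd = 0.5)
b_M   <- rnorm(12, sd = 0.8)
dat$value <- b_P[dat$P] +
             b_PB[as.integer(interaction(dat$P, dat$B))] +
             b_PBA[as.integer(interaction(dat$P, dat$B, dat$A))] +
             b_M[dat$M] +
             rnorm(nrow(dat), sd = 0.3)
fit <- lmer(value ~ 1 + (1 | P/B/A) + (1 | M), data = dat)
isSingular(fit)  # TRUE would signal the "singular model" problem mentioned above
VarCorr(fit)     # variances for P, B within P, A within B within P, and M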

maximizing with two functions in R [closed]

I would like to find a maximum economic stress scenario, restricted by a limit on the Mahalanobis distance of the scenario. For this, I have to consider two functions in the optimization.
To make it easier, we can work with a simplified problem: we have a simple linear model, y = a + b*x, for which I want to minimize sum((a + b*x - y)^2). But I also have, for example, the restriction that (a*b*5)/2 < 30, i.e. a*b < 12.
Solving this problem with the Excel Solver is not a problem, but how do I do it in R?
You could try to incorporate the constraint into the objective function, like this
# example data whose exact solution lies outside the constraint
x <- runif(100, 1, 10)
y <- 3 + 5*x + rnorm(100, mean=0, sd=.5)
# big but not too big
bigConst <- sum(y^2) * 100
# if the variables lie outside the feasible region, add bigConst
f <- function(par, x, y) {
  sum((par["a"] + par["b"]*x - y)^2) +
    if (par["a"] * par["b"] > 12) bigConst else 0
}
# simulated annealing can deal with non-continuous objective functions
sol <- optim(par=c(a=1, b=1), fn=f, method="SANN", x=x, y=y)
# this is what it looks like
plot(x,y)
abline(a=sol$par["a"], b=sol$par["b"])
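To sanity-check the result (using the objects defined above), you can compare against the unconstrained least-squares fit, which for these simulated data should violate the constraint, and confirm that the penalized solution respects it:
coef(lm(y ~ x))  # unconstrained fit; the product of intercept and slope should be around 15, outside the feasible region
prod(sol$par)    # a*b for the penalized solution; should be at or below 12
sol$par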

Continuous independent variable using Lm function in R [duplicate]

This question already has an answer here: Datatype for linear model in R (1 answer). Closed 5 years ago.
I ran a regression on the data set cwm. My intention was to:
Test the correlation between cm14 (continuous) and medran (continuous)
Control for type, disint, leverage, and end, which affect cm14
ranking <- lm(cm14 ~ medran + cm10a + type + disint + leverage + end, data=cwm)
summary(ranking)
The summary results gave a separate regression coefficient for every category of medran.
How can I obtain a single figure for the overall impact of medran on cm14?
If any of your variables is categorical, you can use anova(ranking) to do this.
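A minimal illustration of that suggestion on made-up data (not the cwm data set): summary() reports one coefficient per non-reference level of a categorical predictor, while anova() gives a single overall test for the whole term.
set.seed(1)
d <- data.frame(medran = factor(sample(letters[1:4], 80, replace = TRUE)),
                type   = rnorm(80))
d$cm14 <- as.integer(d$medran) + rnorm(80)
fit <- lm(cm14 ~ medran + type, data = d)
summary(fit)  # one coefficient per non-reference level of medran
anova(fit)    # a single F test for medran as a whole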

code for anova model with nesting and crossing in R [closed]

The model I want to construct/test is: dependent variable = factor A + factor B + factor C + interaction between factors A and C + interaction between factors B and C + factor B nested within factor A
An example I came across online is described in the file "ANOVA: advanced designs" (http://web.grinnell.edu/individuals/kuipers/stat2labs/Handouts/DOE%20Advancede.pdf) (thanks to the author(s) for sharing this file online). In this file, the example described in Split Plot/Repeated Measures Designs (slides 9-10) is similar to my case. Here, factor A is brand, factor B is box, and factor C is temp. If we assume (1) box is a fixed effect (i.e. those 3 boxes represent all possible levels of the factor "box"), (2) all bags within each box are assigned to a temp, and (3) there are more than two levels of temperatures (e.g. there are four levels of temperature, 10, 20, 30, 40) and the number of bags within each box assigned to a certain temp is randomly determined (i.e. the numbers of bags assigned to different temperatures are not equal and it could be that no bag is assigned to a certain temperature in some boxes), then this example is almost the same as what I am trying to describe. Also, my design is not balanced.
I want to test which factors contribute to the dependent variable and how. The hypotheses are those of a 3-way ANOVA (in the popcorn example: brand, temperature, box). In the popcorn example, the null hypothesis might be that brand, temp and/or box do not influence % popped kernels; the alternative hypothesis is just the opposite. Also, box in my case could probably also be treated as a random effect, as in the example, but I would like to consider both situations (box as a fixed effect and as a random effect).
What is the appropriate way to address this question?
Thanks.
I'm not 100% sure we agree on terminology, but I'll take a shot ...
You say you want
factor A + factor B + factor C + interaction between factors A and C + interaction between factors B and C + factor B nested within factor A
The main thing to note is that "B nested within A" is equivalent, at least in the world that I'm familiar with, to "include the main effect of A and the interaction between A and B, but not the main effect of B" (i.e. ~A/B == ~A+A:B). But then you say you do want the main effect of factor B, so this seems a little strange. Following your specification exactly would give
~ A + B + C + A:C + B:C + A/B
but this is equivalent to
~ A + B + C + A:C + B:C + A + A:B
R automatically discards the redundant A term, so this is also equivalent to
~ A + B + C + A:C + B:C + A:B
But since this is essentially the main effects plus all two-way interactions, you could also write it as
~(A+B+C)^2
Because redundant terms are discarded you could write this equivalently in many different ways: ~A*B+A*C+B*C (A*B is equivalent to A+B+A:B) or ~A*C+B*C+A/B ... if you want to check what R has actually produced, you can use colnames(model.matrix(my_formula,my_data)).
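As a quick check that these formulas do expand to the same design matrix, one can compare the columns of model.matrix() on some made-up factors (my_formula and my_data above are placeholders from the text; A, B, C below are invented levels):
d <- expand.grid(A = factor(1:2), B = factor(1:3), C = factor(1:2))
cols <- function(f) sort(colnames(model.matrix(f, d)))
identical(cols(~ A + B + C + A:C + B:C + A:B), cols(~ (A + B + C)^2))  # TRUE
identical(cols(~ (A + B + C)^2), cols(~ A*C + B*C + A/B))              # TRUE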
This is all assuming we're working in the lm()/fixed-effect context ...

Adjusting regression weight based on feedback [closed]

Let's say I want to predict a dependent variable D, where:
D<-rnorm(100)
I cannot observe D, but I know the values of three predictor variables:
I1<-D+rnorm(100,0,10)
I2<-D+rnorm(100,0,30)
I3<-D+rnorm(100,0,50)
I want to predict D by using the following regression equation:
I1*w1 + I2*w2 + I3*w3 ≈ D
however, I do not know the correct values of the weights (w), but I would like to fine-tune them by repeating my estimate:
in the first step I use equal weights:
w1= .33, w2=.33, w3=.33
and I estimate D using these weights:
EST = I1 * .33 + I2 * .33 + I3 * .33
I receive feedback, which is a difference score between D and my estimate (diff=D-EST)
I use this feedback to modify my original weights and fine-tune them to eventually minimize the difference between D and EST.
My question is:
Is the difference score sufficient for being able to fine-tune the weights?
What are some ways of manually fine-tuning the weights? (e.g. can I look at the correlation between diff and I1, I2, I3 and use that as a weight?)
The following command,
coefficients(lm(D ~ I1 + I2 + I3))
will give you the ideal weights to minimize diff.
Your defined diff will not tell you enough to manually manipulate the weights correctly as there is no way to isolate the error component of each I.
The correlation between D and the I's is not sufficient either, as it only tells you the strength of each predictor, not its weight. Since the error components of the I's are independent (each is its own rnorm draw), you could try manipulating one weight at a time and observing how diff changes, but fitting a linear regression model is the simplest way to do it.
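As an illustration of the manual fine-tuning idea (a sketch, not part of the original answer): if the full diff vector is available at each step, the weights can be nudged by gradient descent on the mean squared difference; a single summary difference score is not enough to recover the update direction for each weight, which is the point made above.
set.seed(1)
D  <- rnorm(100)
I1 <- D + rnorm(100, 0, 10)
I2 <- D + rnorm(100, 0, 30)
I3 <- D + rnorm(100, 0, 50)
X  <- cbind(I1, I2, I3)
w  <- c(1/3, 1/3, 1/3)                       # starting weights
lr <- 1e-4                                   # small step size; the I's have large variances
for (i in 1:5000) {
  diff <- D - X %*% w                        # feedback: difference between D and the current estimate
  w <- w + lr * (t(X) %*% diff) / length(D)  # gradient step on the mean squared difference
}
round(drop(w), 4)
round(coef(lm(D ~ X - 1)), 4)                # least-squares weights (no intercept), for comparison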
