Pairwise comparisons after glm.nb in R

I am studying behavioral data in rabbits collected with the scan sampling method (the number of occurrences of each behavior recorded out of the total number of occurrences of all behaviors). I study two main effects: the age of the animals (3 levels) and the time they spend outside on pasture (2 levels).
I have used this model for let's say the Grooming behavior:
glm.nb.Grooming = glm.nb(Grooming ~ Age * Time, data = sa)
The Anova showed an effect of both Age and Time on the expression of this behavior (p < 0.05 for each, with no interaction effect). I want to present my data in a table like the one below, because there is a Time x Age interaction effect on some other behaviors.
When I run "pairs" to find out which values differ from the others, I run into the following problem (T3 and T8 refer to the time outside; the numbers are the ages):
Grooming.em = emmeans(glm.nb.Grooming, ~ Time * Age, type = "response") ; Grooming.em ; pairs(Grooming.em)
None of the pairwise comparisons has a p-value below 0.05, despite the Age and Time effects shown by the Anova.
I suspect it is because some of the SEs are very high... or the log(0) needed to make these comparisons, but I have no idea how to fix this. Can you help me? Thanks a lot.
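One common explanation (a hedged sketch, not a diagnosis of this exact dataset): pairs() on the full Time * Age grid runs 15 Tukey-adjusted contrasts, so it can easily lose the significance the omnibus tests found. Since the interaction was not significant, comparing the marginal means of each factor separately involves far fewer contrasts. A runnable sketch with simulated stand-in data (the variable names and levels are assumptions based on the question):

```r
library(MASS)     # glm.nb
library(emmeans)

## Simulated stand-in for the rabbit data: Age has 3 levels, Time has 2.
set.seed(1)
sa <- expand.grid(Age = factor(1:3), Time = factor(c("T3", "T8")), rep = 1:20)
sa$Grooming <- rnbinom(nrow(sa), size = 2,
                       mu = exp(1 + 0.4 * as.numeric(sa$Age) +
                                0.5 * (sa$Time == "T8")))

fit <- glm.nb(Grooming ~ Age * Time, data = sa)

## Marginal comparisons per factor: 3 + 1 contrasts instead of 15,
## so the multiplicity adjustment is much milder.
pairs(emmeans(fit, ~ Age,  type = "response"))
pairs(emmeans(fit, ~ Time, type = "response"))
```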

Related

Lmer for longitudinal design

I have a longitudinal dataset where I have the following variables for each subject:
IVs: four factors (factorA, factorB, factorC, factorD), each measured twice, at the beginning and at the end of an intervention.
DV: one outcome variable (behavior), also measured twice, at the beginning and at the end of the intervention.
I would like to create a model that uses the change in factorA, factorB, factorC, factorD (change from beginning to end of the intervention) to predict the change in behavior (again from beginning to end).
I thought to use the delta values of factorA, factorB, factorC, factorD (from pre to post intervention) to predict the delta value of the DV (behavior). I would also like to covary out the absolute value of each factor (A, B, C and D) (e.g. using only its value at the beginning of the intervention) to make sure I account for any effect the absolute values (rather than the change) of these IVs may have on the DV.
Here is my dataset:
[dataset screenshot omitted]
Here is my model so far:
Model <- lmer(Delta_behavior ~ Absolute_factorA + Absolute_factorB +
                Absolute_factorC + Absolute_factorD + Delta_factorA +
                Delta_factorB + Delta_factorC + Delta_factorD +
                (1 | Subject),
              data = a)
I think I am doing something wrong because I get this error:
Error: number of levels of each grouping factor must be < number of observations
What am I doing wrong? Is the data set structured weirdly? Should I not use the delta values? Should I use another test (not lmer)?
Because you have reduced your data to a single observation per subject, you don't need to use a multi-level/mixed model. The reason that lmer is giving you an error is that in this situation the between-subject variance is confounded with the residual variance.
You can probably go ahead and use a linear model (lm) for this analysis.
More technical detail
The equation for the distribution of the ith observation is something like [fixed-effect predictors] + eps(subject(i)) + eps(i) where eps(subject(i)) is the Normal error term of the subject associated with the ith observation, and eps(i) is the Normal residual error associated with the ith observation. If we only have one observation per subject, then each observation has two error terms that are unique to it. The sum of two Normal variables with zero means and variances of V1 and V2 is also Normal with mean zero and variance V1+V2 ... therefore V1 and V2 are jointly unidentifiable. You can use lmerControl to override the error if you really want to; lmer will return some arbitrary combination of V1, V2 estimates that sum to the total variance.
There's a similar example illustrated here.
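Concretely, dropping the random term from the model in the question gives a plain lm() fit. A runnable sketch with simulated stand-in data (the column names follow the question; the data are made up):

```r
## One row per subject, so lm() replaces lmer() and the (1 | Subject)
## term is simply dropped.
set.seed(1)
a <- data.frame(matrix(rnorm(50 * 9), ncol = 9))
names(a) <- c("Delta_behavior",
              paste0("Absolute_factor", c("A", "B", "C", "D")),
              paste0("Delta_factor",    c("A", "B", "C", "D")))

model <- lm(Delta_behavior ~ Absolute_factorA + Absolute_factorB +
              Absolute_factorC + Absolute_factorD +
              Delta_factorA + Delta_factorB +
              Delta_factorC + Delta_factorD,
            data = a)
summary(model)
```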

How would I devise code to get both within subject and between subject comparisons when attempting to carry out a repeated measures ANOVA?

I understand I can use lmer but I would like to undertake a repeated measures anova in order to carry out both a within group and a between group analysis.
So I am trying to compare the difference in metabolite levels between three groups ( control, disease 1 and disease 2) over time ( measurements collected at two timepoints), and to also make a within group comparison, comparing time point 1 with time point 2.
Important to note: these are subjects sending in samples, not timed trial visits where samples would have been taken on the same day or thereabouts. For instance, time point 1 for one subject could be 1995 and time point 1 for another subject 1996; the gap between time point 1 and time point 2 is also not consistent. The average is around 5 years, but the max is 15 and the min is 0.5 years.
I have 43, 45, and 42 subjects respectively in each group. My response variable would be, say, metabolite 1, and the predictor would be Group. I also have covariates I would like accounted for, such as age, BMI, and gender. I would also need to account for family ID (which I have as a random effect in my lmer model). My Time column has 0 marking time point 1 and 1 marking time point 2. I understand I must separate the within- and between-subjects commands; however, I am unsure how to do this. My understanding so far:
If I am using anova_test, the formula to specify for between subjects would be:
Metabolite1 ~ Group*Time
Whilst for within subjects (seeing whether there is any difference within each group at TP1 vs TP2), I am unsure how to specify this (the formula below is not correct):
Metabolite1 ~ Time + Error(ID/Time)
The question is, how do I combine this altogether to specify the between and within subject comparisons I would like and accounting for the covariates such as gender, age and BMI? I am assuming if I specify covariates it will become an ANCOVA not an ANOVA?
Some example code that I found that had both a between and within subject comparison design (termed mixed anova).
aov1 <- aov(Recall~(Task*Valence*Gender*Dosage)+Error(Subject/(Task*Valence))+(Gender*Dosage),ex5)
Where he specifies that the within subject comparison is within the Error term. Also explained here https://rpkgs.datanovia.com/rstatix/reference/anova_test.html
However, here is mine, which I realise is currently very wrong (it is missing a correct within-subject comparison):
repmes <- anova_test(data = mets, Metabolite1 ~ Group * Time + Error(ID/Time),
                     covariate = c("Age", "BMI", "Gender", "FamilyID"))
From this I would ultimately like to determine, with appropriate post hoc tests (if p < 0.05), whether there are any significant differences in Metabolite 1 expression between groups across the two time points (i.e. over time), and whether there are any significant differences within subjects comparing TP1 with TP2. Can anybody help?
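Given the irregular sampling times described above, one option the question already hints at is a mixed model instead of a classical repeated-measures ANOVA: Group * Time gives the between-group-over-time test, Time within subject gives the TP1 vs TP2 comparison, covariates enter as fixed effects, and family and subject become random intercepts. A hedged sketch with simulated stand-in data (all names are assumptions taken from the question):

```r
library(lme4)

## Simulated stand-in for the metabolite data, just to make the sketch run:
## 130 subjects (43/45/42 per group), two time points each.
set.seed(1)
mets <- expand.grid(ID = factor(1:130), Time = factor(0:1))
mets$Group    <- factor(rep(rep(c("control", "disease1", "disease2"),
                                times = c(43, 45, 42)), 2))
mets$FamilyID <- factor(rep(sample(1:60, 130, replace = TRUE), 2))
mets$Age      <- rep(rnorm(130, 50, 10), 2)
mets$BMI      <- rep(rnorm(130, 25, 4), 2)
mets$Gender   <- factor(rep(sample(c("F", "M"), 130, replace = TRUE), 2))
mets$Metabolite1 <- rnorm(nrow(mets))

## Covariates as fixed effects; random intercepts for subject and family
## replace the Error(ID/Time) stratum of the aov formulation.
m <- lmer(Metabolite1 ~ Group * Time + Age + BMI + Gender +
            (1 | FamilyID) + (1 | ID), data = mets)
anova(m)
```

Within-group TP1 vs TP2 contrasts could then come from emmeans on this fit (e.g. pairwise ~ Time | Group).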

Fitting random factors for a linear model using lme4

I have 4 random factors and want to write down the corresponding linear mixed model using lme4, but I have struggled to fit it.
Assume A is nested within B (2 levels), which in turn is nested within each of xx preceptors (P). All responded to xx Ms (M).
I want to fit my model to get variances for each factor and their interactions.
I used the following code to fit the model, but I was unsuccessful:
lme4::lmer(value ~ A +
(1 + A|B) +
(1 + P|A),
(1+ P|M),
data = myData, na.action = na.exclude)
I also read some interesting materials here, but I still struggle to fit the model. Any help?
At a guess, if the nesting structure is ( P (teachers) / B (occasions) / A (participants) ), meaning that the occasions for one teacher are assumed to be completely independent of the occasions for any other teacher, and that participants in turn are never shared across occasions or teachers, but questions (M) are shared across all teachers and occasions and participants:
value ~ 1 + (1| P / B / A) + (1|M)
Some potential issues:
as you hint in the comments, it may not be practical to fit random effects for factors with small numbers of levels (say, < 5); this is likely to lead to the dreaded "singular model" message (see the GLMM FAQ for more detail).
if all of the questions (M) are answered by every participant, then in principle it's possible to fit a model that takes account of the among-question correlation within participants: the maximal model would be ~ 1 + (M | P / B / A) (which would look for among-question correlations at the level of teacher, occasion within teacher, and participant within occasion within teacher). However, this is very unlikely to work in practice (especially if each participant answers each question only once, in which case the teacher:occasion:participant:question variance will be confounded with the residual variance in a linear model). In this case, you will get an error about "probably unidentifiable": see e.g. this question for more explanation/detail.
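To make the suggested formula concrete, here is a runnable sketch on simulated stand-in data (the factor sizes are made up; with no true B or A variance the fit may well print the "singular" message discussed above, which is itself instructive):

```r
library(lme4)

## Simulated data with the assumed nesting P/B/A and crossed questions M.
set.seed(1)
d <- expand.grid(P = factor(1:10), B = factor(1:2), A = factor(1:5),
                 M = factor(1:8))
re_P <- rnorm(10)   # teacher effects
re_M <- rnorm(8)    # question effects
d$value <- re_P[d$P] + re_M[d$M] + rnorm(nrow(d))

## (1 | P/B/A) expands to (1|P) + (1|P:B) + (1|P:B:A); M is crossed.
fit <- lmer(value ~ 1 + (1 | P/B/A) + (1 | M), data = d)
VarCorr(fit)   # one variance component per factor, plus the residual
```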

mixed models: r spss difference

I want to do a mixed-model analysis on my data. I used both R and SPSS to verify whether my R results were correct, but the results differ enormously for one variable. I can't figure out why there is such a large difference myself; your help would be appreciated! I have already done various checks on the dataset.
DV: score on questionnaire (QUES)
IV: time (after intervention, 3 month follow-up, 9 month follow-up)
IV: group (two different interventions)
IV: score on questionnaire before the intervention (QUES_pre)
random intercept for participants
SPSS code:
MIXED QUES BY TIME GROUP WITH QUES_pre
/CRITERIA=CIN(95) MXITER(100) MXSTEP(10) SCORING(1) SINGULAR(0.000000000001) HCONVERGE(0,
ABSOLUTE) LCONVERGE(0, ABSOLUTE) PCONVERGE(0.000001, ABSOLUTE)
/FIXED=TIME GROUP QUES_pre TIME*GROUP | SSTYPE(3)
/METHOD=REML
/PRINT=SOLUTION TESTCOV
/RANDOM=INTERCEPT | SUBJECT(ID) COVTYPE(AR1)
/REPEATED=Index1 | SUBJECT(ID) COVTYPE(AR1).
R code:
model1 <- lme(QUES ~ group + time + time:group + QUES_pre,
              random = ~ 1 | ID,
              correlation = corAR1(0, form = ~ 1 | Onderzoeksnummer),
              data = data, na.action = na.omit, method = "REML")
The biggest difference lies in the effect of group. For the SPSS code, the p value is .045, for the R code the p-value is .28. Is there a mistake in my code, or has anyone a suggestion of something else that might go wrong?
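One thing worth checking (a guess, not a diagnosis): in the lme call the AR(1) correlation is grouped by Onderzoeksnummer while the random intercept is grouped by ID; unless these are the same subject identifier, the repeated-measures structure is not applied within subjects the way SPSS's SUBJECT(ID) is. A runnable sketch with simulated stand-in data in which both are grouped by ID:

```r
library(nlme)

## Simulated stand-in: 40 subjects, three time points, two groups.
set.seed(1)
data <- expand.grid(ID = factor(1:40), time = factor(c("post", "3mo", "9mo")))
data$group    <- factor(rep(sample(1:2, 40, replace = TRUE), 3))
data$QUES_pre <- rep(rnorm(40), 3)
data$QUES     <- data$QUES_pre + rnorm(nrow(data))

## Random intercept and AR(1) residual correlation both grouped by ID,
## mirroring SPSS's /RANDOM=INTERCEPT | SUBJECT(ID) plus /REPEATED ... SUBJECT(ID).
model1 <- lme(QUES ~ group + time + time:group + QUES_pre,
              random = ~ 1 | ID,
              correlation = corAR1(form = ~ 1 | ID),
              data = data, na.action = na.omit, method = "REML")
summary(model1)
```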

3 way anova nested in r

I'm trying to figure out the model for a fully factorial experiment.
I have the following factors
Treatment, Day, Hour, Subject, ResponseVariable
10 days of measurements, 4 different time points within each day, 2 different treatments, and 12 subjects (6 subjects in treatment 1, and 6 different subjects in treatment 2).
On each day I measured 6 subjects in treatment 1 and the other 6 in treatment 2, at 4 different time points.
For Subjects, I have 12 different subjects, but Subjects 1-6 are in Treatment-1 and Subjects 7-12 are in Treatment-2. The subjects did not change treatments, thus I measured the same set of subjects for each treatment each of the 10 days
So what's tripping me up is specifying the correct error term.
I thought I had the general model down but R is giving me "Error() model is singular"
aov(ResponseVariable ~ T + R + S + T:R + T:S + R:S + Error(T/S))
any thoughts would help?
I've gotten the same error, and I think my problem was missing observations. Are you missing any observations? I believe they're less of a problem for linear mixed effects, and I've read that some people use lme instead of repeated-measures ANOVA for those cases.
Your error term can be interpreted as "the S effect within each T". It sounds from your description as though that's what you want, so I don't think that's what's causing your error message.
One note: I see you've got a variable named "T". R let you do that? T is normally reserved for meaning "TRUE". That might be part of your problem.
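As the comment about lme suggests, the same design can also be fitted as a mixed model, which sidesteps the Error() stratum (and tolerates missing observations better). A hedged sketch with simulated stand-in data, with the variables spelled out to avoid the T/TRUE clash:

```r
library(lme4)

## Simulated stand-in: 12 subjects, 10 days, 4 time points per day;
## subjects 1-6 in treatment 1, subjects 7-12 in treatment 2.
set.seed(1)
d <- expand.grid(Subject = factor(1:12), Day = factor(1:10), Hour = factor(1:4))
d$Treatment <- factor(ifelse(as.integer(d$Subject) <= 6, "t1", "t2"))
d$ResponseVariable <- rnorm(nrow(d)) + (d$Treatment == "t2")

## Treatment varies between subjects; Day and Hour vary within. The random
## intercept per subject plays the role of the Error(Subject) stratum.
fit <- lmer(ResponseVariable ~ Treatment * Day * Hour + (1 | Subject), data = d)
anova(fit)
```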
