ANOVA with block design and repeated measures in R

I'm attempting to run some statistical analyses on a field trial that was conducted at 2 sites over the same growing season.
At both sites (Site, levels: HF|NW) the experimental design was a RCBD with 4 (n=4) blocks (Block, levels: 1|2|3|4 within each Site).
There were 4 treatments - 3 different forms of nitrogen fertiliser and a control (no nitrogen fertiliser) (Treatment, levels: AN, U, IU, C).
During the field trial there were 3 distinct periods that commenced with fertiliser addition and ended with harvesting of the grass. These periods have been given the levels 1|2|3 under the factor N_app.
There is a range of measurements on which I would like to test the following null hypothesis (H0):
Treatment had no effect on the measurement.
Two of the measurements I am particularly interested in are: grass yield and ammonia emissions.
Starting with grass yield (Dry_tonnes_ha), as shown here; this is a nice balanced data set.
The data can be downloaded in R using the following code:
library(tidyverse)
download.file('https://www.dropbox.com/s/w5ramntwdgpn0e3/HF_NW_grass_yield_data.csv?raw=1', destfile = "HF_NW_grass_yield_data.csv", method = "auto")
raw_data <- read.csv("HF_NW_grass_yield_data.csv", stringsAsFactors = FALSE)
HF_NW_grass <- raw_data %>%
  mutate_at(vars(Site, N_app, Block, Plot, Treatment), as.factor) %>%
  mutate(Date = as.Date(Date, format = "%d/%m/%Y"),
         Treatment = factor(Treatment, levels = c("AN", "U", "IU", "C")))
I have had a go at running an ANOVA on this using the following approach:
model_1 <- aov(formula = Dry_tonnes_ha ~ Treatment * N_app + Site/Block, data = HF_NW_grass, projections = TRUE)
I have a few concerns with this.
Firstly, what is the best way to test assumptions? For a simple one-way ANOVA I would use shapiro.test() and bartlett.test() on the dependent variable (Dry_tonnes_ha) to assess normality and homogeneity of variance. Can I use the same approach here?
Secondly, I am concerned that N_app is a repeated measure, as the same measurement is taken from the same plot over 3 different periods. What is the best way to build this repeated-measures structure into the model?
Thirdly, I'm not sure of the best way to nest Block within Site. At both sites the levels of Block are 1:4. Do I need to have unique Block levels for each site?
I have another data set for NH3 emissions here. R code to download:
download.file('https://www.dropbox.com/s/0ax16x95m2z3fb5/HF_NW_NH3_emissions.csv?raw=1', destfile = "HF_NW_NH3_emissions.csv", method = "auto")
raw_data_1 <- read.csv("HF_NW_NH3_emissions.csv", stringsAsFactors = FALSE)
HF_NW_NH3 <- raw_data_1 %>% mutate_at(vars(Site, N_app, Block, Plot, Treatment), as.factor) %>%
mutate(Treatment = factor(Treatment, levels = c("AN", "U", "IU", "C")))
For this I have all the concerns above with the addition that the data set is unbalanced.
At HF for N_app 1 n=3, but for N_app 2 & 3 n=4
At NW n=4 for all N_app levels.
At HF, measurements were only made on the Treatment levels U and IU.
At NW, measurements were made on the Treatment levels AN, U and IU.
I'm not sure how to deal with this added level of complexity. I am tempted to just analyse the 2 sites separately (the fact that the N_app periods are not the same at each site may encourage this approach).
Can I use a Type III sum of squares ANOVA here?
It has been suggested to me that a linear mixed modelling approach may be the way forward but I'm not familiar with using these.
I would welcome your thoughts on any of the above. Thanks for your time.
Rory

To answer your first question on the best way of testing assumptions: while using formal statistical tests, as implemented in R, is reasonable, I would actually just visualize the distribution and see whether the data meet the ANOVA assumptions. This approach may seem somewhat subjective, but it works in most cases.
Independently and identically distributed (i.i.d.) data: you may already have an answer to this based on how much you know about your data. It is also possible to use a chi-squared test to check independence (or not).
Normally distributed data: use a histogram or QQ plot to check. Based on the distribution, I think it is reasonable to use aov despite the slight bimodality.
(It appears that a log transformation helps to better meet the normality assumption. This is something you may consider, especially for downstream analyses.)
par(mfrow = c(2, 2))
# raw data: density and QQ plot
plot(density(HF_NW_grass$Dry_tonnes_ha), col = "red", main = "Density")
qqnorm(HF_NW_grass$Dry_tonnes_ha, col = "red", main = "qqplot")
qqline(HF_NW_grass$Dry_tonnes_ha)
# log-transformed data: density and QQ plot
DTH_trans <- log10(HF_NW_grass$Dry_tonnes_ha)
plot(density(DTH_trans), col = "blue", main = "transformed density")
qqnorm(DTH_trans, col = "blue", main = "transformed qqplot")
qqline(DTH_trans)
Regarding your second question, on the best way to build repeated measures into the model: unfortunately, it is difficult to pinpoint a single "best" model, but based on my experience (mostly with genomics data), you may want to use a linear mixed-effects model. This can be implemented through the lme4 R package, for example. Since you already know how to construct a linear model in R, you should have no problem applying the lme4 functions.
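For example, a minimal sketch of what such a model could look like for your grass-yield data. The random-effects structure here is just one plausible choice, and mm1 is an illustrative name; the assumptions are flagged in the comments:
library(lme4)
library(lmerTest)  # optional: approximate p-values for the fixed effects
# Site is kept fixed (only 2 levels); Block is a random intercept within Site;
# Plot, the unit measured repeatedly across the 3 N_app periods, gets its own
# intercept. This assumes Plot labels are unique within each Site:Block.
mm1 <- lmer(Dry_tonnes_ha ~ Treatment * N_app + Site +
              (1 | Site:Block) + (1 | Site:Block:Plot),
            data = HF_NW_grass)
summary(mm1)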
Your third question, regarding whether to nest the two variables, is tricky. If I were you, I would start by treating Site and Block as independent factors. However, if you know they are not independent, you should nest them; a quick sketch of the nesting mechanics is below.
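On the nesting mechanics specifically (Block_u is an illustrative name):
# In an R formula, Site/Block expands to Site + Site:Block, so Block "1" at HF
# and Block "1" at NW are already treated as distinct; unique labels are not
# strictly required. An explicitly unique block factor is equivalent:
HF_NW_grass$Block_u <- interaction(HF_NW_grass$Site, HF_NW_grass$Block, drop = TRUE)
table(HF_NW_grass$Block_u)  # the 8 site-block combinations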
I think your questions and concerns are quite open-ended. My recommendation is that as long as you have a plausible justification, go ahead and proceed.

I agree with @David C on the use of visual diagnostics. Simple QQ plots of the dependent variable should work (dt here appears to be a data.table version of the data with lower-case column names):
par(mfrow = c(1, 2))
# QQ plots of the raw and log-transformed dependent variable
qqnorm(dt[, dry_tonnes_ha]); qqline(dt[, dry_tonnes_ha], probs = c(0.15, 0.85))
qqnorm(log(dt[, dry_tonnes_ha])); qqline(log(dt[, dry_tonnes_ha]), probs = c(0.15, 0.85))
The log transformation looks reasonable to me. You can also see this from the density plot, which is long-tailed and somewhat bimodal:
par(mfrow=c(1,1))
plot(density(dt[,dry_tonnes_ha]))
You could alternatively use lineup plots (Buja et al., 2009) if you wish; I'm not sure they're needed in this case. A vignette is provided with the nullabor package.
library(nullabor)
library(ggplot2)  # for qplot
# this may not be the best x variable; I'm not familiar with your data
dt_l <- lineup(null_permute("dry_tonnes_ha"), dt)
qplot(dry_tonnes_ha, treatment, data = dt_l) + facet_wrap(~ .sample)
For the other assumptions, you can just use the standard diagnostic plots from the lm fit:
lm2 <- lm(log(dry_tonnes_ha) ~ treatment * n_app + site/block, data = dt)
plot(lm2)
I don't see anything too troublesome in these plots.


Latent class growth modelling in R/flexmix with multinomial outcome variable

How to run Latent Class Growth Modelling (LCGM) with a multinomial response variable in R (using the flexmix package)?
And how to stratify each class by a binary/categorical dependent variable?
The idea is to let gender shape the growth curve by cluster (cf. Mikolai and Lyons-Amos (2017, p. 194/3), where the stratification is done by education; they used Mplus).
I think I might have come close with the following syntax:
lcgm_formula <- as.formula(rel_stat ~ age + I(age^2) + gender + gender:age)
lcgm <- flexmix::stepFlexmix(. ~ . | id,
                             data = d,
                             k = nr_of_classes, # would be 1:12 in real analysis
                             nrep = 1,          # would be 50 in real analysis to avoid local maxima
                             control = list(iter.max = 500, minprior = 0),
                             model = flexmix::FLXMRmultinom(lcgm_formula, varFix = TRUE, fixed = ~0))
which is close to what Wardenaar (2020, p. 10) suggests in his methodological paper for a continuous outcome:
stepFlexmix(. ~ . | ID, k = 1:4, nrep = 50, model = FLXMRglmfix(y ~ time, varFix = TRUE), data = mydata, control = list(iter.max = 500, minprior = 0))
The only difference is that FLXMRmultinom probably does not support the varFix and fixed parameters, although adding them does produce different results. The binomial equivalent of FLXMRmultinom in flexmix might be FLXMRglm (with family = "binomial") rather than FLXMRglmfix, so I suspect that the restrictions of the LCGM (e.g. fixed slope and intercept per class) are not specified the way they should be.
The results are otherwise sensible, but the model fails to put men and women with similar trajectories into the same classes (the plot of fitted probabilities for each relationship status in each class by gender is not reproduced here).
We should have the following matches by cluster and gender...
1<->1
2<->2
3<->3
...but instead we have
1<->3
2<->1
3<->2
That is, if for example the men in class one and the women in class three were forced into the same group, the resulting group would be more homogeneous than the current first row of the plot grid.
Here is the full MVE to reproduce the issue.
I got similar results with another dataset, with a different number of classes and up to 50 iterations per class. I have tried two alternative ways to predict the probabilities, with identical results. I conclude that the problem is most likely in the model specification (stepFlexmix(..., model = FLXMRmultinom(...))) or that this is some sort of label-switching issue.
If the model is specified correctly and the issue is just that similar trajectories for men and women end up in different classes, is there a way to fix that, for example by restricting the parameters?
Any assistance will be highly appreciated.
This seems to be an identifiability issue, apparently common in mixture modelling. In other words, the labels are switched: while there might not be a problem with the modelling as such, men and women end up in different groups, and that has to be dealt with one way or another.
In the newly linked code, I have swapped the order manually and calculated the predictions by hand.
I will be happy to hear if someone has an alternative approach to dealing with the label-switching issue (such as restricting parameters or switching labels algorithmically; a rough sketch of the latter is below). I am also curious whether the model could or should be specified in some other way.
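For the algorithmic route, a rough sketch, assuming you can extract the fitted class trajectories into matrices (traj_men and traj_women are hypothetical k x t objects here, k classes by t time points); the matching uses the Hungarian algorithm from the clue package:
library(clue)  # for solve_LSAP (Hungarian algorithm)
match_classes <- function(traj_a, traj_b) {
  k <- nrow(traj_a)
  # cost matrix: squared distance between every pair of class trajectories
  cost <- matrix(0, k, k)
  for (i in seq_len(k)) {
    for (j in seq_len(k)) {
      cost[i, j] <- sum((traj_a[i, ] - traj_b[j, ])^2)
    }
  }
  solve_LSAP(cost)  # optimal one-to-one class assignment
}
# e.g. match_classes(traj_men, traj_women) might return 1->3, 2->1, 3->2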
A few remarks:
I believe that this is indeed performing an LCGM, as we do not specify random effects for the slopes or intercepts. Therefore I assume that intercepts and slopes are fixed within classes for both sexes, which would mean that the model performs LCGM as intended. By the same token, it seems that running a GMM with a random intercept, slope, or both is not possible.
Since we are calculating the predictions by hand, we need to be able to separate parameters between the sexes. Therefore I also added an interaction term gender x age^2. The calculations seem to slow down somewhat, but the estimates are similar to the original. It also makes conceptual sense to include the interaction for age^2 if we already have it for age.
varFix = TRUE, fixed = ~0 seem to be redundant: specifying them does not change anything. The subsampling procedure (of my real data) was unaffected by the set.seed() command for some reason.
The new model specification becomes:
lcgm_formula <- as.formula(rel_stat ~ age + I(age^2) + gender + age:gender + I(age^2):gender)
lcgm <- flexmix::flexmix(. ~ . | id,
                         data = d,
                         k = nr_of_classes, # would be 1:12 in real analysis
                         # nrep = 1,        # would be 50 in real analysis to avoid local maxima
                         #                  # (and we would use the stepFlexmix function instead)
                         control = list(iter.max = 500, minprior = 0),
                         model = flexmix::FLXMRmultinom(lcgm_formula))
And the plots (not reproduced here).

Visualizing the regression coefficients of a regression

I am trying to figure out the best way to display a list of 30+ coefficients on a regression of a continuous variable.
(This may belong more in CrossValidated, I am not sure.)
Here is my example:
library("nycflights13")
library(dplyr)
flights <- nycflights13::flights
flights<- sample_n (flights, 3000)
m1<- glm(formula = arr_delay ~ . , data = flights)
summary(m1)
One option is dwplot from the dotwhisker package:
library(dotwhisker)
dwplot(m1)
As @BenBolker commented, by default dwplot scales the regression coefficients by 2 standard deviations of the predictor variable.
Or, if we need a data.frame/tibble, use tidy from broom:
library(broom)
tidy(m1)
This may help. You could select a specific coefficient with the following:
str(flights) # print the list of data features
coef(m1)["age"] # here I just suppose you have a column called "age"; you could select as many feature coefficients as you want by passing a vector of relevant names
You could have a look at:
extract coefficients from glm in R
tl;dr dwplot is still (a) right answer, but there's a lot to say about the details of how you're fitting this model (and why it takes a really really long time).
glm vs lm
You're using glm() to fit a linear model, which isn't incorrect (and which would allow you to generalize to problems with count or binary responses). However, it's overkill in this case — lm() will work just fine, and be faster [considerably faster when it comes to generating confidence intervals etc.]
system.time(m1 <- glm(formula = arr_delay ~ . , data = flights)) ## 6 seconds
system.time(m2 <- lm(formula = arr_delay ~ . , data = flights, x=TRUE)) ## 13 seconds
(the reason for including x=TRUE will be discussed below)
The time difference becomes more stark when tidying/computing confidence intervals:
setTimeLimit(elapsed=600)
system.time(tidy(m1, conf.int=TRUE)) ## gave up after 10 minutes
system.time(tt <- tidy(m2, conf.int=TRUE)) ## 3.2 seconds
Tidying glms by default uses MASS::confint.glm() to compute confidence intervals by likelihood profiling, which is more accurate than Wald (mean +/- 1.96*SE) intervals for non-Gaussian responses, but much slower.
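If you want to see the difference directly, base R exposes both kinds of interval (Wald via confint.default(), profile via confint()); for a Gaussian model like this one they should essentially agree:
# Wald intervals: fast, mean +/- 1.96*SE
head(confint.default(m1))
# profile-likelihood intervals: more accurate for non-Gaussian fits, but very slow here
# head(confint(m1))  # commented out; can take many minutes for this model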
modeling choices
One of the reasons that everything is so slow is that there are lots of parameters (length(coef(m2)) is 1761). Why?
Although there are only 19 columns in the input data frame (so we might naively expect 18 coefficients), 4 of them are categorical, so get expanded to indicator variables:
catvars <- names(flights)[sapply(flights,is.character)]
sapply(catvars, function(x) length(unique(flights[[x]])))
## carrier tailnum origin dest
## 15 1653 3 94
So, most of the coefficients come from modeling the departures of individual planes (tailnum) [table(table(flights$tailnum)) shows that in this subsample of the data, more than half of the planes are recorded only once ...] It might not make sense to include this variable (if I were going to use tailnum, I would treat it as a random effect, although that would add a lot of modeling complexity).
Let's proceed without tailnum (we will still have plenty of coefficients to worry about).
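One way to do the refit is with update(), dropping the term (the resulting model is called m3 below):
# refit without tailnum; m3 is used in the tidying step below
system.time(m3 <- update(m2, . ~ . - tailnum))
length(coef(m3))  # far fewer coefficients than m2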
plotting
At this point we're doing approximately what dotwhisker::dwplot does, but doing it by hand for more flexibility (in particular, ordering the terms by value).
The next step (1) extracts coefficients/conf int etc.; (2) scales non-binary variables by 2SD (using an internal function from dotwhisker); (3) drops the intercept; (4) makes term a factor ordered by the coefficient value and computes whether the term is significant (i.e., whether the lower and upper CI limits are both above or both below zero).
tt <- (tidy(m3, conf.int = TRUE)
    %>% dotwhisker::by_2sd(flights)
    %>% filter(term != "(Intercept)")
    %>% mutate(term = reorder(factor(term), estimate),
               sig = (conf.low * conf.high) > 0)  # TRUE when the CI excludes zero
)
Plot:
library(ggplot2)
(ggplot(tt, aes(x = estimate, y = term, xmin = conf.low, xmax = conf.high))
    + geom_pointrange(aes(colour = sig))
    + geom_vline(xintercept = 0, lty = 2)
    + scale_colour_manual(values = c("black", "red"))
)

How to run a multinomial logit regression with both individual and time fixed effects in R

Long story short:
I need to run a multinomial logit regression with both individual and time fixed effects in R.
I thought I could use the mlogit and survival packages for this purpose, but I cannot find a way to include fixed effects.
Now the long story:
I have found many questions on this topic on various stack-related websites, but none of them provided an answer. I have also noticed a lot of confusion regarding what a multinomial logit regression with fixed effects is (people use different names for it) and about the R packages implementing it.
So I think it would be beneficial to provide some background before getting to the point.
Consider the following.
In a multiple-choice question, each respondent makes one choice.
Respondents are asked the same question every year. There is no a priori assumption on the extent to which the choice at time t is affected by the choice at t-1.
Now imagine having panel data recording these choices. The data would look like this:
set.seed(123)
# number of observations
n <- 100
# possible choices
possible_choice <- letters[1:4]
# number of years
years <- 3
# individual characteristics
x1 <- runif(n * 3, 5.0, 70.5)
x2 <- sample(1:n^2, n * 3, replace = FALSE)
# actual choices in years 1 to 3
actual_choice_year_1 <- possible_choice[sample(1:4, n, replace = TRUE, prob = rep(1/4, 4))]
actual_choice_year_2 <- possible_choice[sample(1:4, n, replace = TRUE, prob = c(0.4, 0.3, 0.2, 0.1))]
actual_choice_year_3 <- possible_choice[sample(1:4, n, replace = TRUE, prob = c(0.2, 0.5, 0.2, 0.1))]
# create long dataset
df <- data.frame(choice = c(actual_choice_year_1, actual_choice_year_2, actual_choice_year_3),
                 x1 = x1, x2 = x2,
                 individual_fixed_effect = as.character(rep(1:n, years)),
                 time_fixed_effect = as.character(rep(1:years, each = n)),
                 stringsAsFactors = FALSE)
I am new to this kind of analysis. But if I understand correctly, if I want to estimate the effects of respondents' characteristics on their choice, I may use a multinomial logit regression.
In order to take advantage of the longitudinal structure of the data, I want to include in my specification individual and time fixed effects.
To the best of my knowledge, the multinomial logit regression with fixed effects was first proposed by Chamberlain (1980, Review of Economic Studies 47: 225–238). Recently, Stata users have been provided with the routines to implement this model (femlogit).
In the vignette of the femlogit package, the author refers to the R function clogit, in the survival package.
According to the help page, clogit requires data to be rearranged in a different format:
library(mlogit)
# create wide dataset
data_mlogit <- mlogit.data(df, id.var = "individual_fixed_effect",
                           group.var = "time_fixed_effect",
                           choice = "choice",
                           shape = "wide")
Now, if I understand correctly how clogit works, fixed effects can be passed via the strata() function (see this tutorial for additional details). However, I am afraid it is not clear to me how to use this function, as no coefficient values are returned for the individual characteristic variables (i.e. I get only NAs).
library(survival)
fit <- clogit(formula("choice ~ alt + x1 + x2 + strata(individual_fixed_effect, time_fixed_effect)"), as.data.frame(data_mlogit))
summary(fit)
Since I was not able to find a reason for this (there must be something I am missing about the way these functions are estimated), I looked for a solution using other packages in R: e.g., glmnet, VGAM, nnet, globaltest, and mlogit.
Only the latter seems able to deal explicitly with panel structures using an appropriate estimation strategy, so I decided to give it a try. However, I was only able to run a multinomial logit regression without fixed effects.
# state formula
formula_mlogit <- formula("choice ~ 1| x1 + x2")
# run multinomial regression
fit <- mlogit(formula_mlogit, data_mlogit)
summary(fit)
If I understand correctly how mlogit works, here's what I have done.
By using the function mlogit.data, I have created a dataset compatible with the function mlogit. Here, I have also specified the id of each individual (id.var = "individual_fixed_effect") and the group to which individuals belong (group.var = "time_fixed_effect"). In my case, the group represents the observations registered in the same year.
My formula specifies that there are no variables specific to each alternative (i.e., the part before the |); choices are motivated only by individual characteristics (i.e., x1 and x2). A sketch of the formula convention is below.
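For reference, a comment sketch of the three-part formula convention, as I understand it from the mlogit documentation:
# an mlogit formula has up to three parts, separated by "|":
#   choice ~ alternative-specific vars (generic coefficients) |
#            individual-specific vars |
#            alternative-specific vars (alternative-specific coefficients)
# so "choice ~ 1 | x1 + x2" puts x1 and x2 in the individual-specific part
# and leaves the alternative-specific parts empty (intercepts only).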
In the help of the function mlogit, it is specified that one can use the argument panel to use panel techniques. To set panel = TRUE is what I am after here.
The problem is that panel can be set to TRUE only if another argument of mlogit, i.e. rpar, is not NULL.
The argument rpar is used to specify the distribution of the random variables, i.e. the variables before the |.
However, since these variables do not exist in my case, I cannot use the rpar argument and therefore cannot set panel = TRUE.
An interesting question related to this is here. A few suggestions were given, and one seems to go in my direction. Unfortunately, no examples that I can replicate are provided, and I do not understand how to follow this strategy to solve my problem.
Moreover, I am not particularly attached to mlogit; any efficient way to perform this task would be fine for me (e.g., I am fine with survival or other packages).
Do you know any solution to this problem?
Two caveats for those interested in answering:
I am interested in fixed effects, not random effects. However, if you believe there is no other way to take advantage of the longitudinal structure of my data in R (there is indeed one in Stata, but I don't want to use it), please feel free to share your code.
I am not interested in going Bayesian. So if possible, please do not suggest this approach.

Multinomial logit: estimation on a subset of alternatives in R

As McFadden (1978) showed, if the number of alternatives in a multinomial logit model is so large that computation becomes impossible, it is still feasible to obtain consistent estimates by randomly subsetting the alternatives, so that the estimated probabilities for each individual are based on the chosen alternative and C other randomly selected alternatives. In this case, the size of the subset of alternatives is C+1 for each individual.
My question is about the implementation of this algorithm in R. Is it already embedded in any multinomial logit package? If not - which seems likely based on what I know so far - how would one go about including the procedure in pre-existing packages without recoding extensively?
I'm not sure whether the question is more about doing the sampling of alternatives or about estimating MNL models after sampling of alternatives. To my knowledge, no R packages do the former so far, but the latter is possible with existing packages such as mlogit. I believe the reason is that the sampling process varies depending on how your data are organized, but it is relatively easy to do with a bit of your own code. Below is code adapted from what I used for this paper.
library(tidyverse)
# create artificial data
set.seed(6)
# data frame of chooser id and chosen alt_id
id_alt <- data.frame(
  id = 1:1000,
  alt_chosen = sample(1:30, 1000, replace = TRUE)  # one chosen alternative per chooser
)
# data frame for the universal choice set, with an alternative-specific attribute (alt_x2)
alts <- data.frame(
  alt_id = 1:30,
  alt_x2 = runif(30)
)
# conduct sampling of 9 non-chosen alternatives
id_alt <- id_alt %>%
  mutate(.alts_all   = list(alts$alt_id),
         # use weights to avoid including the chosen alternative in the sample
         .alts_wtg   = map2(.alts_all, alt_chosen, ~ ifelse(.x == .y, 0, 1)),
         .alts_nonch = map2(.alts_all, .alts_wtg, ~ sample(.x, size = 9, prob = .y)),
         # combine chosen & sampled non-chosen alternatives
         alt_id      = map2(alt_chosen, .alts_nonch, c))
# unnest above data.frame to create a long format data frame
# with rows varying by choser id and alt_id
id_alt_lf <- id_alt %>%
select(-starts_with(".")) %>%
unnest(alt_id)
# join long format df with alts to get alt-specific attributes
id_alt_lf <- id_alt_lf %>%
left_join(alts, by="alt_id") %>%
mutate(chosen=ifelse(alt_chosen==alt_id, 1, 0))
library(mlogit)
# convert to an mlogit data frame before estimating
id_alt_mldf <- mlogit.data(id_alt_lf,
                           choice = "chosen",
                           chid.var = "id",
                           alt.var = "alt_id", shape = "long")
mlogit(chosen ~ 0 + alt_x2, id_alt_mldf) %>%
  summary()
It is, of course, possible to do this without the purrr::map functions, using apply variants or looping through each row of id_alt; a rough base-R sketch follows.
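For example (an untested sketch, reusing the id_alt and alts objects from above):
# for each chooser, draw 9 non-chosen alternatives and prepend the chosen one
id_alt$alt_id <- lapply(seq_len(nrow(id_alt)), function(i) {
  chosen <- id_alt$alt_chosen[i]
  c(chosen, sample(setdiff(alts$alt_id, chosen), size = 9))
})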
Sampling of alternatives is not currently implemented in the mlogit package. As stated previously, the solution is to generate a data.frame with a subset of alternatives and then use mlogit (importantly, with a formula that has no intercepts). Note that mlogit can deal with unbalanced data, i.e. the number of alternatives does not have to be the same for all the choice situations.
My recommendation would be to review the mlogit package.
Vignette:
https://cran.r-project.org/web/packages/mlogit/vignettes/mlogit2.pdf
the package has a set of example exercises that (in my opinion) are worth looking at:
https://cran.r-project.org/web/packages/mlogit/vignettes/Exercises.pdf
You may also want to take a look at the gmnl package (I have not used it)
https://cran.r-project.org/web/packages/gmnl/index.html
Multinomial Logit Models with Continuous and Discrete Individual Heterogeneity in R: The gmnl Package
Mauricio Sarrias (package author): gmnl web page
Question: what specific problem(s) are you trying to apply a multinomial logit model to? Suitably intrigued.
Aside from the above question, I hope the above points you in the right direction.

R - linear model does not match experimental data

I am trying to perform a linear regression on experimental data consisting of replicate measures of the same condition (for several conditions) to check for the reliability of the experimental data. For each condition I have ~5k-10k observations stored in a data frame df:
        cond1_repA    cond1_repB    cond2_repA  cond2_repB ...
[2]     4.158660e+06  4454400.703   ...
[3]     1.458585e+06  4454400.703   ...
[4]     NA            887776.392    ...
...
[5024]  9571785.382   9.679092e+06  ...
I use the following code to plot the scatterplot, the lm fit, and the R^2 values (stored in rdata) for the different conditions:
# rdata is assumed to be pre-allocated to collect the adjusted R^2 values
for (i in seq(1, 13, 2)) {
  vec <- matrix(0, nrow = nrow(df), ncol = 2)
  vec[, 1] <- df[, i]      # replicate A
  vec[, 2] <- df[, i + 1]  # replicate B
  vec <- na.exclude(vec)
  plot(log10(vec[, 1]), log10(vec[, 2]), xlab = 'rep A', ylab = 'rep B', col = "#00000033")
  abline(fit <- lm(log10(vec[, 2]) ~ log10(vec[, 1])), col = 'red')
  legend("topleft", bty = "n",
         legend = paste("R2 is", rdata[1, ((i + 1) / 2)] <- format(summary(fit)$adj.r.squared, digits = 4)))
}
However, the lm fit seems to be shifted so that it does not match the trend I see in the experimental data (plot not reproduced here). This occurs consistently for every condition. I unsuccessfully tried to find an explanation by looking at the source code and browsing different forums and posts (this or here).
I would have liked to simply comment and ask a few questions, but I can't.
From what I've understood, both repA and repB are measured with error. Hence, you cannot fit your data using an ordinary least squares procedure, which only takes into account the error in Y (some might argue a weighted OLS may work; I'm not skilled enough to discuss that). Your question seems linked to this one.
What you can use is a total least squares (TLS) procedure: it takes into account the error in both X and Y. In the example below, I've used a "normal" TLS, assuming the error is the same in X and Y (thus error.ratio = 1). If it is not, you can specify the error ratio by entering error.ratio = var(y1)/var(x1) (at least I think it's var(Y)/var(X); check the documentation to make sure).
library(mcr)
# x1 and y1 are the two replicate measurement vectors
MCR_reg <- mcreg(x1, y1, method.reg = "Deming", error.ratio = 1, method.ci = "analytical")
MCR_intercept <- getCoefficients(MCR_reg)[1, 1]
MCR_slope <- getCoefficients(MCR_reg)[2, 1]
# CI for predicted values
x_to_predict <- seq(0, 35)
predicted_values <- MCResultAnalytical.calcResponse(MCR_reg, x_to_predict, alpha = 0.05)
CI_low <- predicted_values[, 4]
CI_up <- predicted_values[, 5]
Please note that, in Deming/TLS regressions, your x- and y-errors are supposed to follow normal distributions, as explained here. If that's not the case, go for a Passing-Bablok regression (the R code is here).
Also note that the R2 isn't defined for Deming or Passing-Bablok regressions (see here). A correlation coefficient is a good proxy, although it does not provide exactly the same information. Since you're studying a linear correlation between two factors, see Pearson's product-moment correlation coefficient and use e.g. the rcorr function; a sketch of both suggestions is below.
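For completeness, a minimal sketch reusing the x1/y1 vectors from above ("PaBa" is mcr's Passing-Bablok method; I am assuming bootstrap CIs are available for it, so check the documentation):
# Passing-Bablok fit via mcr
PB_reg <- mcreg(x1, y1, method.reg = "PaBa", method.ci = "bootstrap")
getCoefficients(PB_reg)
# Pearson's product-moment correlation as a goodness-of-fit proxy
cor.test(x1, y1, method = "pearson")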
