I am trying to create a smooth spline for each sample in a grouped dataframe. For this I am using a nest and map approach and mgcv gam (following this example https://smu095.github.io/2019/02/16/2019-02-16-tidytuesday-fitting-multiple-time-series-models-using-purrr/).
After running the gam I would like to use broom::augment to extract the fitted data and calculate confidence intervals.
This code works with broom 0.5.6 but throws an error with the new broom 0.7 release. broom::tidy and broom::glance still work with this format, but augment stops with "Error: Problem with mutate() input augment_spline.
x object 'year' not found"
Example code below:
library(tidyverse)
library(dslabs)
#Use the gapminder dataset that comes with dslabs as an example
glimpse(gapminder)
gapminder_nest <- gapminder %>%
  group_by(country) %>%
  nest() %>%
  mutate(splined = map(data, ~mgcv::gam(population ~ s(year, k=5, bs="tp"), data=.x))) %>%
  mutate(augment_spline = map(splined, broom::augment)) %>%
  unnest(augment_spline) %>%
  dplyr::select(country, population, .fitted, .se.fit)
The same code runs with broom 0.5.6:
devtools::install_version("broom", version = "0.5.6", repos = "http://cran.us.r-project.org")
All the online tutorials I could find present similar code, which doesn't seem to work with broom 0.7.
In the newer version, I think augment also needs the data passed through its newdata argument. You can pass the data as a separate argument with map2:
library(tidyverse)
library(dslabs)
gapminder_nest <- gapminder %>%
  group_by(country) %>%
  nest() %>%
  mutate(splined = map(data, ~mgcv::gam(population ~ s(year, k=5, bs="tp"), data=.x))) %>%
  mutate(augment_spline = map2(splined, data, ~broom::augment(.x, newdata = .y))) %>%
  unnest(augment_spline)
This works, but it doesn't return all the columns that broom 0.5.6 does (i.e. .se.fit, .resid, .hat, .sigma and .cooksd); it only returns the .fitted column.
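One way to recover the standard errors without relying on augment is to call predict() on each fitted gam directly. The sketch below is only an illustration, not what broom does internally: it uses a small simulated data set in place of gapminder, add_gam_fit is a hypothetical helper name, and the interval is a rough normal approximation (fit plus or minus 1.96 standard errors).

```r
library(dplyr)
library(tidyr)
library(purrr)
library(mgcv)

# Simulated stand-in for the gapminder data (values are made up)
set.seed(1)
toy <- expand.grid(country = c("A", "B"), year = 1960:2010) %>%
  as_tibble() %>%
  mutate(population = 1e6 + 2e4 * (year - 1960) + rnorm(n(), sd = 5e3))

# Hypothetical helper: append fitted values, standard errors and an
# approximate 95% interval to the data the model was fitted on
add_gam_fit <- function(model, data) {
  pred <- predict(model, newdata = data, se.fit = TRUE)
  data %>%
    mutate(.fitted = as.numeric(pred$fit),
           .se.fit = as.numeric(pred$se.fit),
           .lower  = .fitted - 1.96 * .se.fit,
           .upper  = .fitted + 1.96 * .se.fit)
}

toy_fit <- toy %>%
  group_by(country) %>%
  nest() %>%
  mutate(splined = map(data, ~ gam(population ~ s(year, k = 5, bs = "tp"), data = .x)),
         fit     = map2(splined, data, add_gam_fit)) %>%
  select(country, fit) %>%
  unnest(fit) %>%
  ungroup()
```

The same pattern should carry over to the nested gapminder data by swapping toy for gapminder; residuals and other diagnostics would still need to be computed separately.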
Related
I am using a dataset from an online practice tutorial and the code can be found at the bottom of Page 4 (https://tomhouslay.files.wordpress.com/2017/02/indivvar_mv_tutorial_asreml.pdf)
In the tutorial, they get the function to work using the code listed below, but in my R session, I get an error that says:
No tidy method for objects of class lmerMod.
I have tried using the package "parsnip" as well as restarting my R session and I have tried requiring broom as suggested in other answers to similar questions.
The haggis practice csv file can be downloaded from here: https://figshare.com/articles/Haggis_data_behavioural_syndromes/4702540
library(asreml)
library(nadiv)
library(tidytext)
library(tidyverse)
library(broom)
require(broom)
library(lme4)
library(data.table)
library(parsnip)
HData <- read_csv("haggis practice.csv")
lmer_b <- lmer(boldness ~ scale(assay_rep, scale=FALSE) +
                 scale(body_size) +
                 (1|ID),
               data = HData)
plot(lmer_b)
qqnorm(residuals(lmer_b))
hist(residuals(lmer_b))
summary(lmer_b)
rep_bold <- tidy(lmer_b, effects = "ran_pars", scales = "vcov") %>%
  select(group, estimate) %>%
  spread(group, estimate) %>%
  mutate(repeatability = ID/(ID + Residual))
Providing an answer (from the comments).
The tidy methods for multilevel/mixed-type models (e.g. from lme4, brms, MCMCglmm, ...) were moved to broom.mixed. You can either install/load the broom.mixed package, or use the broomExtra package, which is a "meta-package" that looks for methods in both broom and broom.mixed ...
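As a minimal sketch of the first option, assuming broom.mixed is installed: the example below uses the sleepstudy data that ships with lme4 rather than the haggis file, so the grouping factor is Subject instead of ID, and rep_est is just an illustrative name.

```r
library(lme4)
library(broom.mixed)  # supplies tidy() methods for merMod objects
library(dplyr)
library(tidyr)

# sleepstudy ships with lme4 and stands in for the haggis data here
fit <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)

# Variance components on the variance scale, one column per group
rep_est <- tidy(fit, effects = "ran_pars", scales = "vcov") %>%
  select(group, estimate) %>%
  pivot_wider(names_from = group, values_from = estimate) %>%
  mutate(repeatability = Subject / (Subject + Residual))

rep_est
```

With the haggis model, the last line would use ID in place of Subject, as in the question.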
Using the Owls data and the glmmTMB package, I want to visually compare the regression coefficients from different zero-inflated models that differ in the family used (ZIPOISS, ZINB1, ZINB2) and in whether they include the offset (logBroodSize).
However my first problem is to get the coefficients. The tidy function from package broom should provide you with the coefficients to plot them later with ggplot, but I get the following error when I try to get them:
modList = list(zipoiss, zinb1, zinb2, zinb1_bs, zinb2_bs)
coefs = ldply(modList, tidy, effect="fixed", conf.int=TRUE,
              .id="model") %>%
  mutate(term=abbfun(term)) %>%
  select(model,term,estimate,conf.low,conf.high) %>%
  filter(!term %in% c("Intercept","Intercept.1","NCalls","zi_NCalls"))
Error in as.data.frame.default(x) :
cannot coerce class "glmmTMB" to a data.frame
Also: Warning message:
In tidy.default(X[[i]], ...) :
No method for tidying an S3 object of class glmmTMB , using as.data.frame
Any idea what could be wrong? I was told that not having the right version of broom could be the reason; however, I have installed the right version. Code for a reproducible example follows:
# Packages and dataset
library(glmmTMB)
library(broom) # devtools::install_github("bbolker/broom")
library(plyr)
library(dplyr)
data(Owls,package="glmmTMB")
Owls = plyr::rename(Owls, c(SiblingNegotiation="NCalls"))
Owls = transform(Owls,
ArrivalTime=scale(ArrivalTime, center=TRUE, scale=FALSE),
obs=factor(seq(nrow(Owls))))
# Models
zipoiss <- glmmTMB(NCalls~(FoodTreatment+ArrivalTime)*SexParent+
                     offset(logBroodSize)+(1|Nest),
                   data=Owls,
                   ziformula = ~ 1,
                   family="poisson")
zinb2 <- glmmTMB(NCalls~(FoodTreatment+ArrivalTime)*SexParent+
                   offset(logBroodSize)+(1|Nest),
                 data=Owls,
                 ziformula = ~ 1,
                 family="nbinom2")
zinb1 <- glmmTMB(NCalls~(FoodTreatment+ArrivalTime)*SexParent+
                   offset(logBroodSize)+(1|Nest),
                 data=Owls,
                 ziformula = ~ 1,
                 family="nbinom1")
zinb1_bs <- glmmTMB(NCalls~(FoodTreatment+ArrivalTime)*SexParent+
                      BroodSize+(1|Nest),
                    data=Owls,
                    ziformula = ~ 1,
                    family="nbinom1")
zinb2_bs <- glmmTMB(NCalls~(FoodTreatment+ArrivalTime)*SexParent+
                      BroodSize+(1|Nest),
                    data=Owls,
                    ziformula = ~ 1,
                    family="nbinom2")
# Get coefficients ("coefs" does not work yet...)
modList = list(zipoiss, zinb1, zinb2, zinb1_bs, zinb2_bs)
coefs = ldply(modList, tidy, effect="fixed", conf.int=TRUE,
              .id="model") %>%
  mutate(term=abbfun(term)) %>%
  select(model,term,estimate,conf.low,conf.high) %>%
  filter(!term %in% c("Intercept","Intercept.1","NCalls","zi_NCalls"))
This now works (as of today) with the new/under-development broom.mixed package, e.g.
devtools::install_github("bbolker/broom.mixed")
Hopefully this will be on CRAN sometime soon, but it's only medium priority for me, and I don't want to release it to CRAN until it's at least 90% baked. Pull requests welcome!
The last step changed a little bit (for one thing, I don't seem to have abbfun):
modList = lme4:::namedList(zipoiss, zinb1, zinb2, zinb1_bs, zinb2_bs)
coefs = ldply(modList, tidy, effect="fixed", conf.int=TRUE,
              .id="model") %>%
  select(model,term,component,estimate,conf.low,conf.high) %>%
  filter(!term %in% c("(Intercept)","NCalls"))
Caveat: The development of these tools for glmmTMB models is pretty new/experimental; you should (1) sanity-check your results carefully and (2) report an issue if something doesn't work as expected.
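For readers who prefer purrr to plyr, the same collection step could be sketched roughly as follows. This assumes an installed broom.mixed that provides a tidy method for glmmTMB fits; the two models are deliberately simplified (a single fixed effect, hypothetical list names) so the example runs quickly, and they are not the models from the question.

```r
library(glmmTMB)
library(broom.mixed)
library(dplyr)
library(purrr)

data(Owls, package = "glmmTMB")

# Two small zero-inflated fits; a named list gives readable model labels
fits <- list(
  zipoiss = glmmTMB(SiblingNegotiation ~ FoodTreatment + (1 | Nest),
                    data = Owls, ziformula = ~ 1, family = poisson),
  zinb2   = glmmTMB(SiblingNegotiation ~ FoodTreatment + (1 | Nest),
                    data = Owls, ziformula = ~ 1, family = nbinom2)
)

# Tidy each fit and stack the results, keeping the list names as a column
coefs <- map_dfr(fits, tidy, effects = "fixed", conf.int = TRUE, .id = "model") %>%
  select(model, component, term, estimate, conf.low, conf.high) %>%
  filter(term != "(Intercept)")
```

The component column separates conditional from zero-inflation terms, which is useful when faceting a coefficient plot.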
I'm trying to run regressions within a nested data frame as described here. For my purposes, I'm using felm from the lfe package because I have many levels of fixed effects.
If I re-do the example in the link above using felm instead of lm, it works for the most part until I try to use broom::augment.
library(tidyverse)
library(broom)
library(gapminder)
library(lfe)
by_country <- gapminder %>%
  group_by(continent, country) %>%
  nest()
country_felm <- function(data){
  felm(lifeExp ~ year, data = data)
}
by_country <- by_country %>%
  mutate(model = purrr::map(data, country_felm))
Everything works up to this point, except that I had to use a function instead of a formula in purrr::map in the last line of code (possibly another felm quirk).
Now if I try to use broom to extract the model output, it works for glance and tidy, but not for augment.
by_country %>% unnest(model %>% purrr::map(broom::glance))
by_country %>% unnest(model %>% purrr::map(broom::tidy))
by_country %>% unnest(model %>% purrr::map(broom::augment))
Trying to use augment results in the following error message:
Error in mutate_impl(.data, dots) :
argument must be coercible to non-negative integer
In addition: Warning message:
In seq_len(nrow(x)) : first element used of 'length.out' argument
It looks like augment is having trouble finding the data for the data argument, which is generally the dataset used for fitting.
The problem is easier to see if working with a single one of these models rather than all of them at once.
This doesn't work, with your given error:
augment(by_country$model[[1]])
But explicitly passing the data to the data argument does:
augment(by_country$model[[1]], data = by_country$data[[1]])
A work-around is therefore to pass the dataset to augment as the second argument. This can be done via purrr::map2, looping through both the model and data columns at the same time.
by_country %>%
unnest(model %>% purrr::map2(., data, augment))
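The same work-around can also be written with an explicit list column, which avoids putting an expression inside unnest(). This is a sketch under the assumption that lfe, gapminder and a broom version with an augment method for felm are installed; aug and augmented are illustrative names.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(broom)
library(gapminder)
library(lfe)

country_felm <- function(data) {
  felm(lifeExp ~ year, data = data)
}

augmented <- gapminder %>%
  group_by(continent, country) %>%
  nest() %>%
  mutate(model = map(data, country_felm),
         # pass each model together with the data it was fitted on
         aug   = map2(model, data, ~ augment(.x, data = .y))) %>%
  select(continent, country, aug) %>%
  unnest(aug) %>%
  ungroup()
```

Naming the data argument explicitly makes it clear which data set augment is using.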
I would like to run each variable in a dataset as a univariate glmer model using the lme4 package in R. I would like to prepare the data with the dplyr/tidyr packages, and organize the results from each model with the broom package (i.e. do(glance(glmer(...)))). I would most appreciate help that sticks within that framework. I'm not that great in R, but was able to produce a dataset that throws an error and has the same structure as the data I'm using:
library(lme4)
library(dplyr)
library(tidyr)
library(broom)
Bird<-c(rep(c(0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0),10))
Stop<-c(rep(seq(1,10), 20))
Count<-c(rep(c(rep(c(1,2), each=10)), each=10))
Route<-c(rep(seq(1,10), each=20))
X1<-rnorm(200, 50, 10)
X2<-rnorm(200, 10, 1)
X3<-c(rep(c(0),200))#trouble maker variable
Data<-data.frame(cbind(Bird, Stop, Count, Route, X1, X2, X3))
Data %>%
  gather(Variable, Value, 5:7) %>%
  group_by(Variable) %>%
  do(glance(glmer(Bird~Value+Stop+(1+Stop|Route/Count), data=., family=binomial)))
The last variable produces an error so there is no output. What I would like is it to produce NA values in the output if this occurs, or just skip that variable. I've tried using 'try' to blow past the trouble maker variable:
do(try(glance(glmer(Bird~Value+Stop+(1+Stop|Route/Count), data=., family=binomial))))
which it does, but still an output is not produced because it can't coerce a 'try-error' to a data.frame. Unfortunately there is no tryharder function. I've tried some if statements which make sense to me but not the computer. I'm sure I'm not doing it right, but if for example I use:
try(glance(glmer(Bird~Value+Stop+(1+Stop|Route/Count), data=., family=binomial)))->mod
if(is.data.frame(mod)){do(mod)}
I get subscript out of bounds errors. Thanks very much for any input you can provide!
Use tryCatch before the call to glance:
zz = Data %>%
  gather(Variable, Value, 5:7) %>%
  group_by(Variable) %>%
  do(aa = tryCatch(glmer(Bird~Value+Stop+(1+Stop|Route/Count), data=.,
                         family=binomial),
                   error = function(e) data.frame(NA)))
zz %>%
  glance(aa)
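The same idea (catch the error, return a placeholder, keep going) can be sketched without do(). To keep the example small and deterministic, it swaps glmer for lm and uses a made-up data set in which X3 is a single-level factor, which lm is known to reject; safe_glance is a hypothetical helper name. The failing variable ends up as a row of NA values rather than aborting the whole run.

```r
library(dplyr)
library(purrr)
library(broom)

# Made-up data: X3 is a single-level factor, so lm() stops with a contrasts error
set.seed(42)
toy <- tibble(
  y  = rnorm(40),
  X1 = rnorm(40),
  X2 = rnorm(40),
  X3 = factor(rep("a", 40))
)

# Hypothetical helper: glance() the model, or return an NA row if fitting fails
safe_glance <- function(v) {
  tryCatch(
    glance(lm(reformulate(v, response = "y"), data = toy)),
    error = function(e) tibble(r.squared = NA_real_)
  )
}

vars <- c("X1", "X2", "X3")
results <- map_dfr(set_names(vars), safe_glance, .id = "Variable")
```

Rows for variables that failed can then be dropped with filter(!is.na(r.squared)) if skipping them is preferred over keeping NA values.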