Goodness of Fit statistic Tobit model - r

I have estimated a Tobit model using the censReg function from the censReg package. Alternatively, the same Tobit model can be estimated with the tobit function in the AER package.
I would really like to have a goodness-of-fit statistic, such as a pseudo-R2. However, whenever I try to compute one, the output is NA. For example:
Tobit <- censReg(Listing$occupancy_rate ~ ., left = -Inf, right = 1, data = Listing)
PseudoR2(Tobit, which = "McFadden")
[1] NA
So far, I have only seen reported Pseudo-R2's when people use Stata. Does anyone know how to estimate it in R?
Alternatively, Tobit estimates the (log)Sigma, which is basically the standard deviation of the residuals. Could I use this to calculate the R2?
All help is really appreciated.

You can use the DescTools package to calculate a pseudo-R2. You have not provided any sample data, so I cannot run your model; here is an example with a built-in dataset:
library(DescTools)
r.glm <- glm(Survived ~ ., data=Untable(Titanic), family=binomial)
PseudoR2(r.glm, c("McFadden"))
For your model, you can use something like
library(AER)
data("Affairs", package = "AER")
fm.tobit <- tobit(affairs ~ age + yearsmarried + religiousness + occupation + rating,
                  data = Affairs)
#Create a function for pseudoR2 calculation
pseudoR2 <- function(obj) 1 - as.vector(logLik(obj)/logLik(update(obj, . ~ 1)))
pseudoR2(fm.tobit)
#>[1] 0.05258401
Or using censReg as you have used
library(censReg)
data("Affairs", package = "AER")
estResult <- censReg(affairs ~ age + yearsmarried + religiousness +
                     occupation + rating, data = Affairs)
summary(estResult)
pseudoR2(estResult)
#>[1] 0.05258401
You can find more details about pseudo-R2 measures in the following link:
R squared in logistic regression
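The helper above is McFadden's formula, 1 - logLik(model) / logLik(intercept-only model), so it should work for any fitted object that supports logLik() and update(). A minimal base-R sketch makes the two ingredients explicit; the built-in mtcars data and the logit model are stand-ins chosen only for illustration, not taken from the question:

```r
# McFadden's pseudo-R2 computed from two log-likelihoods, base R only.
full_fit <- glm(am ~ wt, data = mtcars, family = binomial)
null_fit <- update(full_fit, . ~ 1)   # intercept-only model on the same data

ll_full <- as.numeric(logLik(full_fit))
ll_null <- as.numeric(logLik(null_fit))

mcfadden <- 1 - ll_full / ll_null
mcfadden
```

For a continuous or censored response such as a Tobit model, the log-likelihood is built from densities and can be positive, so the ratio is not guaranteed to fall between 0 and 1. It is probably best read as a relative measure when comparing models fitted to the same data.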


Probing interactions in nlme using the "interactions" package in R

I am fitting a linear mixed-effects model with the "nlme" package, looking at stress and lifestyle as predictors of change in cognition over 4 years in a longitudinal dataset. All variables in the model are continuous.
I am able to create the model and get the summary statistics using this code:
mod1 <- lme(MS ~ age + sex + edu + GDST1*Time + HLI*Time + GDST1*HLI*Time, random= ~ 1|ID, data=NuAge_long, na.action=na.omit)
summary(mod1)
I am trying to use the "interactions" package to probe the 3-way interaction:
sim_slopes(model = mod1, pred = Time, modx = GDST1, mod2 = HLI, data = NuAge_long)
but am receiving this error:
Error in if (tcol == "df") tcol <- "t val." : argument is of length zero
I am also trying to plot the interaction using the same "interactions" package:
interact_plot(model = mod1, pred = Time, modx = GDST1, mod2 = HLI, data = NuAge_long)
and am receiving this error:
Error in UseMethod("family") : no applicable method for 'family' applied to an object of class "lme"
I can't seem to find what these errors mean and why I'm getting them. Any help would be appreciated!

From ?interactions::sim_slopes:
The function is tested with ‘lm’, ‘glm’,
‘svyglm’, ‘merMod’, ‘rq’, ‘brmsfit’, ‘stanreg’ models. Models
from other classes may work as well but are not officially
supported. The model should include the interaction of
interest.
Note this does not include lme models. On the other hand, merMod models are those generated by lme4::[g]lmer(), and as far as I can tell you should be able to fit this model equally well with lmer():
library(lme4)
mod1 <- lmer(MS ~ age + sex + edu + GDST1*Time + HLI*Time + GDST1*HLI*Time
+ (1|ID), data=NuAge_long)
(things will get harder if you want to specify correlation structures, e.g. correlation = corAR1(), which works for lme() but not lmer() ...)

How to estimate coefficients for linear regression model on CEM matched data using `R` `cem` package?

I'm working on a project where we'd like to run a follow-up linear regression model on treatment-control data where the treatments have been matched to the controls using the cem package to perform coarsened exact matching:
match <- cem(treatment="cohort", data=df, drop=c("member_id","period","cohort_period"))
est <- att(match, total_cost ~ cohort + period + cohort_period, data = df)
where I'd like to estimate the coefficient and 95% CI on the "cohort_period" interaction term. It seems the att function in the cem package only estimates the coefficient for the specified treatment variable (in this case, "cohort") while adjusting for other variables in the regression.
Is there a way to return the coefficients and 95% CIs for the other regression terms?

Figured it out! I was using the wrong package. Instead of cem, I found that the MatchIt and Zelig packages let me perform both exact matching and parametric regression on the matched data:
library(MatchIt)
library(Zelig)
matched_df <- matchit(cohort ~ age_catg + sex + market_code + risk_score_catg, method="exact", data=df)
matched_df_reg <- zelig(total_cost ~ cohort + period + cohort_period, data = match.data(matched_df), model = "ls")
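If the goal is coefficients and 95% confidence intervals for every term, one option is to fit an ordinary lm() on the matched data and call confint(). The sketch below uses simulated data in place of match.data(matched_df); the column names mirror the question, and the weights column is an assumption standing in for the matching weights that match.data() returns.

```r
# Simulated stand-in for the matched data frame (all values are made up).
set.seed(1)
n <- 200
matched <- data.frame(
  cohort  = rbinom(n, 1, 0.5),
  period  = rbinom(n, 1, 0.5),
  weights = runif(n, 0.5, 1.5)   # hypothetical matching weights
)
matched$cohort_period <- matched$cohort * matched$period
matched$total_cost <- 100 + 10 * matched$cohort + 5 * matched$period +
  8 * matched$cohort_period + rnorm(n, sd = 20)

# Weighted least squares on the matched sample.
fit <- lm(total_cost ~ cohort + period + cohort_period,
          data = matched, weights = weights)

# One row per term: point estimate plus lower and upper 95% limits.
coef_table <- cbind(estimate = coef(fit), confint(fit, level = 0.95))
coef_table
```

The intervals are the usual model-based ones from lm(), so they treat the matched sample as given.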

Comparing script for random intercept and slope independent between nlme and lme4

I am putting together a side-by-side list of equivalent code for random effects in the two packages.
For an independent random intercept and slope, if I use the following code in the lme4 package, what is the corresponding code in nlme?
model1 <- lmer(y~A + (1+site) + (0+A|site), data, REML = FALSE)
Also, for nested mixed effects, which calculate the random effect in a different way from the above, are my scripts correct?
model2 <- lme(y~A, random = ~1+site/A, data, method="REML")
and
model3 <- lmer(y~A + (1|site) + (1|site:A), data, method=FALSE)
Thank you so much!

I hope this answer is not too late!
For your first model the version in nlme would be:
model1 <- lme(y ~ A,
              random = list(site = pdDiag(~ A)),
              data = data, method = "ML")
Your second and third models are equivalent. Model 3 can also be written in lme4 as:
model3 <- lmer(y~A + (1|site/A), data, method=FALSE)
I found this link, which might help you a lot when comparing the nlme and lme4 packages:
https://rpsychologist.com/r-guide-longitudinal-lme-lmer#conditional-growth-model-dropping-intercept-slope-covariance
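As a self-contained illustration of an independent random intercept and slope in nlme, here is a minimal sketch on the Orthodont data that ships with nlme. Subject plays the role of site and age the role of A; both substitutions are assumptions made only so the example runs. pdDiag() gives the random-effects covariance matrix a diagonal structure, which is what (1 | Subject) + (0 + age | Subject) expresses in lme4.

```r
library(nlme)

# Random intercept and random slope for age within Subject,
# with the covariance between them fixed at zero by pdDiag().
fit_indep <- lme(distance ~ age,
                 random = list(Subject = pdDiag(~ age)),
                 data = Orthodont, method = "ML")

# Variance components: one variance each for (Intercept) and age,
# and no correlation column.
VarCorr(fit_indep)
```

method = "ML" is used here to match REML = FALSE in the lmer() call; drop it to get the REML default.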

Plot Effects of Variables in Interaction Terms

I would like to plot the effects of variables in interaction terms, using panel data and a FE model.
I have various interaction effects in my equation, for example this one here:
FIXED1 <- plm(GDPPCgrowth ~ FDI * PRIVCR, data = dfp)
I can only find solutions for lm, but not for plm.
So on the x-axis there should be PRIVCR and on the y-axis the effect of FDI on growth.
Thank you for your help!
Lisa

I am not aware of a package that supports plm objects directly. As you are asking for FE models, you can just take an LSDV approach for FE and do the estimation by lm to get an lm object which works with the effects package. Here is an example for the Grunfeld data:
library(plm)
library(effects)
data("Grunfeld", package = "plm")
mod_fe <- plm(inv ~ value + capital + value:capital, data = Grunfeld, model = "within")
Grunfeld[ , "firm"] <- factor(Grunfeld[ , "firm"]) # needs to be factor in the data NOT in the formula [required by package effects]
mod_lsdv <- lm(inv ~ value + capital + value:capital + firm, data = Grunfeld)
coefficients(mod_fe) # estimates are the same
coefficients(mod_lsdv) # estimates are the same
eff_obj <- effects::Effect(c("value", "capital"), mod_lsdv)
plot(eff_obj)
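If what is wanted is the effect of one variable plotted against the other (the effect of FDI on growth across values of PRIVCR), the same LSDV fit can be used to compute it by hand. The following is only a sketch on simulated data: x stands in for FDI, m for PRIVCR, and unit for the panel identifier, all hypothetical names. The marginal effect is b_x + b_xm * m, and its standard error follows from the coefficient covariance matrix.

```r
# Simulated panel; x stands in for FDI, m for PRIVCR (hypothetical names).
set.seed(42)
panel <- data.frame(
  unit = factor(rep(1:8, each = 12)),
  x    = rnorm(96),
  m    = runif(96, 0, 2)
)
panel$y <- 0.4 * panel$x + 0.1 * panel$m + 0.3 * panel$x * panel$m +
  as.numeric(panel$unit) + rnorm(96)

# LSDV fixed-effects fit: the unit dummies absorb the individual effects.
fit <- lm(y ~ x * m + unit, data = panel)
b <- coef(fit)
V <- vcov(fit)

# Marginal effect of x at each moderator value, with its standard error:
# Var(b_x + m * b_xm) = V_xx + m^2 * V_xm,xm + 2 * m * V_x,xm
m_grid <- seq(min(panel$m), max(panel$m), length.out = 50)
me     <- b[["x"]] + b[["x:m"]] * m_grid
me_se  <- sqrt(V["x", "x"] + m_grid^2 * V["x:m", "x:m"] +
               2 * m_grid * V["x", "x:m"])
crit   <- qt(0.975, df.residual(fit))

# Moderator on the x-axis, effect of x on the y-axis, dashed 95% band.
plot(m_grid, me, type = "l",
     ylim = range(me - crit * me_se, me + crit * me_se),
     xlab = "moderator (m)", ylab = "marginal effect of x on y")
lines(m_grid, me - crit * me_se, lty = 2)
lines(m_grid, me + crit * me_se, lty = 2)
abline(h = 0, col = "grey")
```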

Does season package support splines?

I was running the casecross function from the season package with this code:
library(season)
library(splines)
data(CVDdaily)
CVDdaily<-subset(CVDdaily,date<=as.Date('1987-12-31'))
# Effect of ozone on CVD death
model1<- casecross(cvd ~ o3mean+tmpd+Mon+Tue+Wed+Thu+Fri+Sat, data=CVDdaily)
But when I use the natural splines I get an error message "Error in ns(tmpd, df = 6) : object 'tmpd' not found"
model2<-casecross(cvd ~ o3mean + ns(tmpd, df=6) +Mon+Tue+Wed+Thu+Fri+Sat, data=CVDdaily)
Does this mean that the package has no support for splines? If so, I would appreciate any ideas on how to adjust for the nonlinear effect of temperature using the season package.

You can fit splines in season using the dlnm package. (I've used 4 degrees of freedom rather than 6).
library(season)
library(dlnm)
data(CVDdaily)
CVDdaily = subset(CVDdaily,date<=as.Date('1987-12-31'))
t.spline = crossbasis(CVDdaily$tmpd,lag=0,argvar=list(fun="ns",df=4))
model2 = casecross(cvd ~ o3mean + t.spline +Mon+Tue+Wed+Thu+Fri+Sat, data=CVDdaily)
summary(model2)
To get a nice plot of the estimated spline you need to extract the coefficients and the variance-covariance matrix.
coef = coefficients(model2$c.model)[2:5]
var.cov = model2$c.model$var[2:5,2:5]
pred.t = crosspred(basis=t.spline, at=45:86, vcov=var.cov, coef=coef, model.link='identity')
plot(pred.t,"overall")
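The [2:5] indexing above presumably reflects the layout of the fitted coefficients: o3mean comes first, followed by one coefficient per spline basis column. A natural spline with df = 4 is just a matrix with four columns, which the base splines package can show directly. The temperatures below are simulated for illustration and are not the CVDdaily values.

```r
library(splines)

set.seed(1)
temp  <- runif(100, 45, 86)   # simulated temperatures, same range as the plot grid
basis <- ns(temp, df = 4)     # natural cubic spline basis, built outside any formula

dim(basis)                    # 100 rows, 4 columns: one coefficient per column
```

With df = 6 there would be six basis columns, so the index range for the coefficients and the variance-covariance matrix would need to widen accordingly.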
