Using coef and summary.lm with robcov in R (extracting p-values)

I can extract the p-values for my slope & intercept from an ols object this way:
library(rms)
m1 <- ols(wt ~ cyl, data= mtcars, x= TRUE, y= TRUE)
coef(summary.lm(m1))
But when I try the same thing with a robcov object, summary.lm gives me the p-values from the original model (m1), not the robcov model:
m2 <- robcov(m1)
m2
coef(summary.lm(m2))
I think this must be related to the Warning from the robcov help page,
Warnings
Adjusted ols fits do not have the corrected standard errors printed
with print.ols. Use sqrt(diag(adjfit$var)) to get this, where adjfit
is the result of robcov.
but I'm not sure how.
Is there a way to extract the p-values from a robcov object? (I'm really only interested in the one for the slope, if that makes a difference...)

Hacking through print.ols and prModFit, I came up with this.
errordf <- m2$df.residual
beta <- m2$coefficients
se <- sqrt(diag(m2$var))
Z <- beta/se
P <- 2 * (1 - pt(abs(Z), errordf))
Substitute your own robcov object for m2.
Try it for yourself by comparing P to the p-values shown by print(m2).
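For completeness, here is the whole calculation assembled into one runnable snippet (my own sketch, using the mtcars example from the question):
library(rms)
m1 <- ols(wt ~ cyl, data = mtcars, x = TRUE, y = TRUE)
m2 <- robcov(m1)
se   <- sqrt(diag(m2$var))                    # robust (sandwich) standard errors
tval <- m2$coefficients / se                  # t statistics using the robust SEs
pval <- 2 * pt(abs(tval), m2$df.residual, lower.tail = FALSE)
pval["cyl"]                                   # p-value for the slope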


Fixest and Standardized Coefficients from Easystats in R

I was wondering what would be the best way to calculate and present standardized coefficients using fixest. Here is what I tried using easystats
library(parameters)
library(effectsize)
library(fixest)
m <- lm(rating ~ complaints, data = attitude)
standardize_parameters(m, method = "basic")  # works
m <- feols(rating ~ complaints, data = attitude)
standardize_parameters(m, method = "basic")  # Error in stats::model.frame(model)[[1]] : subscript out of bounds
I also tried the modelsummary approach, but it shows unstandardized coefficients with no error.
library(parameters)
library(effectsize)
library(modelsummary)
m <- lm(rating ~ complaints, data = attitude)
modelsummary(m, standardize = "refit")  # works, coeffs are different
m <- feols(rating ~ complaints, data = attitude)
modelsummary(m, standardize = "refit")  # doesn't work, coeffs are the same
Any insight or advice on how to elegantly and easily pull standardized coefficients out of fixest estimation results would be greatly appreciated. My goal is to replicate the venerable listcoef package in Stata. Many thanks to the authors of the packages mentioned in this post!
Edit:
> packageVersion("modelsummary")
[1] ‘1.1.0.9000’
One potential solution is to just manually calculate the standardized coefficients yourself, as [detailed here][1]. As an example, below I scale your predictor and outcome, then calculate the standardized beta coefficient of the only predictor in your model.
#### Scale Predictor and Outcome ####
scale.x <- sd(attitude$complaints)
scale.y <- sd(attitude$rating)
#### Obtain Standardized Coefficients ####
sb <- coef(m)[2] * scale.x / scale.y
sb
Which gives you this (you can ignore the column name, as it is just borrowing it from the original coef vector):
complaints
0.8254176
[1]: https://www.sciencedirect.com/topics/mathematics/standardized-regression-coefficient#:~:text=The%20standardized%20regression%20coefficient%2C%20found,one%20of%20its%20standardized%20units%20(
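If you want refit-style standardization for feols without going through modelsummary, another workaround (my own sketch, not a fixest or easystats feature) is to z-score the variables and re-estimate; the slope is then the standardized coefficient:
library(fixest)
# Refit on standardized variables; the coefficients are then standardized betas.
att_std <- attitude
att_std$rating     <- as.numeric(scale(att_std$rating))
att_std$complaints <- as.numeric(scale(att_std$complaints))
m_std <- feols(rating ~ complaints, data = att_std)
coef(m_std)["complaints"]   # matches the manual calculation above (~0.825)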

Confusion matrix for multinomial logistic regression & ordered logit

I would like to create confusion matrices for a multinomial logistic regression as well as a proportional odds model but I am stuck with the implementation in R. My attempt below does not seem to give the desired output.
This is my code so far:
library(nnet)   # multinom()
library(MASS)   # polr()
CH <- read.table("http://data.princeton.edu/wws509/datasets/copen.dat", header=TRUE)
CH$housing <- factor(CH$housing)
CH$influence <- factor(CH$influence)
CH$satisfaction <- factor(CH$satisfaction)
CH$contact <- factor(CH$contact)
CH$satisfaction <- factor(CH$satisfaction,levels=c("low","medium","high"))
CH$housing <- factor(CH$housing,levels=c("tower","apartments","atrium","terraced"))
CH$influence <- factor(CH$influence,levels=c("low","medium","high"))
CH$contact <- relevel(CH$contact,ref=2)
model <- multinom(satisfaction ~ housing + influence + contact, weights=n, data=CH)
summary(model)
preds <- predict(model)
table(preds,CH$satisfaction)
omodel <- polr(satisfaction ~ housing + influence + contact, weights=n, data=CH, Hess=TRUE)
preds2 <- predict(omodel)
table(preds2,CH$satisfaction)
I would really appreciate some advice on how to correctly produce confusion matrices for my 2 models!
You can refer to this question:
Predict() - Maybe I'm not understanding it
The point there is that predict() needs the data you want predictions for passed via newdata; without newdata it simply returns fitted values for the training data.
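As an additional sketch (my own suggestion, not from the linked answer): since the Copenhagen data are grouped and each row stands for n households, you can weight the confusion matrix by n rather than counting rows:
# Predicted class for every covariate pattern, tabulated with weights n
CH$pred_multinom <- predict(model, newdata = CH)
xtabs(n ~ pred_multinom + satisfaction, data = CH)
CH$pred_polr <- predict(omodel, newdata = CH)
xtabs(n ~ pred_polr + satisfaction, data = CH)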

Categorical Regression with Centered Levels

R's standard way of doing regression on categorical variables is to select one factor level as a reference level and constrain the effect of that level to be zero. Instead of constraining a single level's effect to be zero, I'd like to constrain the sum of the coefficients to be zero.
I can hack together coefficient estimates for this manually after fitting the model the standard way:
x <- lm(data = mtcars, mpg ~ factor(cyl))
z <- c(coef(x), "factor(cyl)4" = 0)
y <- mean(z[-1])
z[-1] <- z[-1] - y
z[1] <- z[1] + y
z
## (Intercept) factor(cyl)6 factor(cyl)8 factor(cyl)4
## 20.5021645 -0.7593074 -5.4021645 6.1614719
But that leaves me without standard error estimates for the former reference level that I just added as an explicit effect, and I need to have those as well.
I did some searching and found the contrasts functions, and tried
lm(data = mtcars, mpg ~ C(factor(cyl), contr = contr.sum))
but this still only produces two effect estimates. Is there a proper way to change which constraint R uses in linear regression on categorical variables?
Think I've figured it out. Using contrasts actually is the right way to go about it; you just need to do a little work to get the results into a convenient-looking form. Here's the fit:
fit <- lm(data = mtcars, mpg ~ C(factor(cyl), contr = contr.sum))
Then the contrast matrix cs <- contr.sum(levels(factor(mtcars$cyl))) (one row per cylinder level) is used to recover the effect estimates and their standard errors.
The centered effect estimates come from multiplying the contrast matrix by the coefficients lm spits out, like so:
cs %*% coef(fit)[-1]
The variances come from sandwiching the variance-covariance matrix of the coefficients between the contrast matrix, so the standard errors are:
sqrt(diag(cs %*% vcov(fit)[-1,-1] %*% t(cs)))
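Putting the pieces together into one self-contained block (a sketch using the mtcars example from the question):
# Sum-to-zero coding: the fitted coefficients are the deviations of the first
# k-1 levels from the grand mean; cs maps them back to all k centered effects.
fit <- lm(mpg ~ C(factor(cyl), contr = contr.sum), data = mtcars)
cs  <- contr.sum(levels(factor(mtcars$cyl)))            # 3 x 2 contrast matrix
eff <- cs %*% coef(fit)[-1]                             # centered effect per level
se  <- sqrt(diag(cs %*% vcov(fit)[-1, -1] %*% t(cs)))   # matching standard errors
cbind(effect = drop(eff), se = se)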

change null hypothesis in lmtest in R

I have a linear model generated using lm. I use the coeftest function in the lmtest package to test a hypothesis with my desired vcov from the sandwich package. The default null hypothesis is beta = 0. What if I want to test beta = 1, for example? I know I can simply take the estimated coefficient, subtract 1, and divide by the provided standard error to get the t-stat for my hypothesis. However, there must be functionality for this already in R. What is the right way to do this?
MWE:
require(lmtest)
require(sandwich)
set.seed(123)
x = 1:10
y = x + rnorm(10)
mdl = lm(y ~ x)
z = coeftest(mdl, df=Inf, vcov=NeweyWest)
b = z[2,1]
se = z[2,2]
mytstat = (b-1)/se
print(mytstat)
The formally correct way to do this:
require(multcomp)
zed = glht(model=mdl, linfct=matrix(c(0,1), nrow=1, ncol=2), rhs=1, alternative="two.sided", vcov.=NeweyWest)
summary(zed)
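summary(zed) then reports the Newey-West test of beta = 1 directly. For comparison, the manual statistic from the question can be turned into a two-sided p-value using the normal reference implied by df = Inf in the coeftest call (a small sketch of my own):
# Two-sided p-value for H0: beta = 1 from the hand-computed statistic
mypval = 2 * pnorm(abs(mytstat), lower.tail = FALSE)
print(mypval)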
Use an offset of x (equivalent to regressing y - x on x); the reported slope is then beta - 1, so its t-test has the null hypothesis beta = 1:
mdl <- lm(y ~ x)
mdl2 <- lm(y ~ x + offset(x))
> mdl
Call:
lm(formula = y ~ x)
Coefficients:
(Intercept) x
0.5255 0.9180
> mdl2
Call:
lm(formula = y ~ x + offset(x))
Coefficients:
(Intercept) x
0.52547 -0.08197
You can look at summary(mdl2) to see the p-value; the standard error for x is the same as in mdl, but the test now refers to the null hypothesis beta = 1.
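Since the question already uses a Newey-West covariance, the offset trick combines naturally with coeftest: in the offset model the x coefficient equals beta - 1, so its robust test is the test of beta = 1 (a sketch reusing the MWE objects):
require(lmtest)
require(sandwich)
# The x row now reports the Newey-West test of H0: beta = 1
coeftest(mdl2, df = Inf, vcov = NeweyWest)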
As far as I know, there is no default function to test the model coefficients against an arbitrary value (1 in your case). There is the offset trick presented in the other answer, but it's not that straightforward (and always be careful with such model modifications). So your expression (b-1)/se is actually a good way to do it.
I have two notes on your code:
You can use summary(mdl) to get the t-test for 0.
You are using lmtest with a covariance structure (which will change the t-test values), but your original lm model doesn't have one. Perhaps this could be a problem? Perhaps you should use glm and specify the correlation structure from the start.

How to update summary when using NeweyWest?

I am using NeweyWest standard errors to correct my lm() / dynlm() output. E.g.:
fit1 <- dynlm(depvar ~ covariate1 + covariate2)
coeftest(fit1, vcov = NeweyWest)
Coefficients are displayed the way I'd like, but unfortunately I lose all the regression output information (R squared, F-test, etc.) that summary displays. So I wonder how I can show the robust SEs and all the other information in the same summary output.
Is there a way to either get everything in one call or to overwrite the 'old' estimates?
I bet I just missed something badly, but this is really relevant when Sweaving the output.
Test example, taken from ?dynlm.
require(dynlm)
require(sandwich)
require(lmtest)
data("UKDriverDeaths", package = "datasets")
uk <- log10(UKDriverDeaths)
dfm <- dynlm(uk ~ L(uk, 1) + L(uk, 12))
#shows R-squared, etc.
summary(dfm)
#no such information
coeftest(dfm, vcov = NeweyWest)
BTW: the same applies to vcovHC.
coefficients is just a matrix in the lm (or dynlm) summary object, so all you need to do is unclass the coeftest() output.
library(dynlm)
library(sandwich)
library(lmtest)
temp.lm <- dynlm(runif(100) ~ rnorm(100))
temp.summ <- summary(temp.lm)
temp.summ$coefficients <- unclass(coeftest(temp.lm, vcov. = NeweyWest))
If you specify the covariance matrix, the F-statistic changes and you need to compute it again using waldtest(), right? Because
temp.summ$coefficients <- unclass(coeftest(temp.lm, vcov. = NeweyWest))
only overwrites the coefficient table.
The F-statistic changes, but R^2 remains the same.
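Right: the F-test shown by summary() still uses the classical covariance. A sketch of recomputing it with the Newey-West covariance (testing all slopes jointly against the intercept-only model, as the summary F-test does):
library(lmtest)
library(sandwich)
# Robust Wald analogue of the overall F-test, against the intercept-only model
waldtest(temp.lm, . ~ 1, vcov = NeweyWest)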
