I am experimenting with a simple autoregressive model and need to perform a simple linear regression in Julia, but I am running into an error that says
ERROR: MethodError: no method matching fit(::Type{LinearModel}, ::Matrix{Float64}, ::Matrix{Float64}, ::Nothing)
My code is
using XLSX, DataFrames, GLM, StatsBase
Quakes = DataFrame(XLSX.readtable("/Users/jjtan/Downloads/QUAKE.xlsx", "quakes"))
for i in 1:98
    Quakes[!, Symbol("t", i)] = [zeros(i); Quakes[1:end-i, :x]]  # lag i of :x, zero-padded at the top
end
ols = lm(@formula(x ~ t1), Quakes)
Note that
typeof(Quakes)
returns DataFrame, and the frame itself is 99×99.
I have additionally tried the regression from the documentation to ensure that everything is working properly, i.e.
using DataFrames, GLM
data = DataFrame(X=[1,2,3], Y=[2,4,7])
ols = lm(@formula(Y ~ X), data)
Which has produced the expected
StatsModels.TableRegressionModel{LinearModel{GLM.LmResp{Vector{Float64}}, GLM.DensePredChol{Float64, LinearAlgebra.CholeskyPivoted{Float64, Matrix{Float64}, Vector{Int64}}}}, Matrix{Float64}}
Y ~ 1 + X
Coefficients:
─────────────────────────────────────────────────────────────────────────
Coef. Std. Error t Pr(>|t|) Lower 95% Upper 95%
─────────────────────────────────────────────────────────────────────────
(Intercept) -0.666667 0.62361 -1.07 0.4788 -8.59038 7.25704
X 2.5 0.288675 8.66 0.0732 -1.16797 6.16797
─────────────────────────────────────────────────────────────────────────
I have tried referring to the variables with both df.var notation and :var notation, to no avail. I am quite confused as to why, with pretty much identical lines of code, one runs into an error and the other does not.
I'm running a series of coxph models in R and compiling the output into LaTeX tables using the modelsummary package. coxph reports both an SE and a robust SE, and the p-value is based on the robust SE. Here is a quick example to illustrate the output:
library(survival)
test_data <- list(time=c(4,3,1,1,2,2,3)
, status=c(1,1,1,0,1,1,0)
, x=c(0,2,1,1,1,0,0)
, sex=c(0,0,0,0,1,1,1))
model <- coxph(Surv(time, status) ~ x + cluster(sex), test_data)
Call:
coxph(formula = Surv(time, status) ~ x, data = test_data, cluster = sex)
coef exp(coef) se(coef) robust se z p
x 0.460778 1.585306 0.562800 0.001052 437.9 <2e-16
Likelihood ratio test=0.66 on 1 df, p=0.4176
n= 7, number of events= 5
Next, I'm trying to create LaTeX tables from these models, displaying the robust SE.
modelsummary(model, output = "markdown", fmt = 3, estimate = "{estimate}{stars}", statistic = "std.error")
| | Model 1 |
|:---------------------|:----------:|
|x | 0.461*** |
| | (0.563) |
|Num.Obs. | 7 |
|R2 | 0.090 |
|AIC | 13.4 |
As we can see, only the non-adjusted SE is displayed. I could not find a suitable alternative for the statistic = "std.error" parameter, and something like vcov = "robust" does not work either.
How can I display any kind of robust standard errors using modelsummary for coxph models?
Thanks for reading and any help is appreciated.
The next version of modelsummary (now available on GitHub) will produce a more informative error message in those cases:
library(survival)
library(modelsummary)
test_data <- list(time=c(4,3,1,1,2,2,3)
, status=c(1,1,1,0,1,1,0)
, x=c(0,2,1,1,1,0,0)
, sex=c(0,0,0,0,1,1,1))
model <- coxph(Surv(time, status) ~ x + cluster(sex), test_data)
modelsummary(model, vcov = "robust")
# Error: Unable to extract a variance-covariance matrix for model object of class
# `coxph`. Different values of the `vcov` argument trigger calls to the `sandwich`
# or `clubSandwich` packages in order to extract the matrix (see
# `?insight::get_varcov`). Your model or the requested estimation type may not be
# supported by one or both of those packages, or you were missing one or more
# required arguments in `vcov_args` (like `cluster`).
As you can see, the problem is that modelsummary does not compute robust standard errors itself. Instead, it delegates this task to the sandwich or clubSandwich packages. Unfortunately, this coxph model does not appear to be supported by those packages:
sandwich::vcovHC(model)
#> Error in apply(abs(ef) < .Machine$double.eps, 1L, all): dim(X) must have a positive length
sandwich is the main package in the R ecosystem to compute robust standard errors. AFAICT, all the other table-making packages available (e.g., stargazer, texreg) also use sandwich, so you are unlikely to have success by looking at those. If you find another package which can compute robust standard errors for Cox models, please file a report on the modelsummary Github repository. I will investigate to see if it’s possible to add support then.
If the info you want is available in the summary object, you can add this information by following the instructions here:
https://vincentarelbundock.github.io/modelsummary/articles/modelsummary.html#adding-new-information-to-existing-models
tidy_custom.coxph <- function(x, ...) {
    s <- summary(x)$coefficients
    data.frame(
        term = row.names(s),
        robust.se = s[, "robust se", drop = FALSE])
}
modelsummary(model, statistic = "robust.se")
|          |  Model 1  |
|:---------|:---------:|
|x         |   0.461   |
|          |  (0.001)  |
|Num.Obs.  |     7     |
|AIC       |   13.4    |
|BIC       |   13.4    |
|RMSE      |   0.61    |
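If you would rather display both the classical and the robust standard errors side by side, modelsummary's statistic argument also accepts a vector of statistic names. A small sketch, assuming the tidy_custom.coxph() helper above is already defined:
modelsummary(model, statistic = c("std.error", "robust.se"))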
I've run an individual fixed-effects panel model in R using the plm package. I now want to plot the marginal effects.
However, neither plot_model() nor effect_plot() works for plm objects. plot_model() works for type = "est" but not for type = "pred".
My online search so far only suggests using ggplot (which, however, only displays OLS regressions, not fixed effects) or outdated functions (e.g., sjp.lm()).
Does anyone have recommendations on how I can visualize the effects of plm objects?
IFE_Aut_uc <- plm(LoC_Authorities_rec ~ Compassion_rec, index = c("id","wave"), model = "within", effect = "individual", data = D3_long2)
summary(IFE_Aut_uc)
plot_model(IFE_Aut_uc, type = "pred")
Error in data.frame(..., check.names = FALSE) :
arguments imply differing number of rows: 50238, 82308
and:
effect_plot(IFE_Pol_uc, pred = Compassion_rec)
Error in `stop_wrap()`:
! ~ does not appear to be a one- or two-sided formula.
LoC_Politicians_rec does not appear to be a one- or two-sided formula.
Compassion_rec does not appear to be a one- or two-sided formula.
Backtrace:
1. jtools::effect_plot(IFE_Pol_uc, pred = Compassion_rec)
2. jtools::get_data(model, warn = FALSE)
4. jtools:::get_lhs(formula)
Edit 2022-08-20: The latest version of plm on CRAN now includes a predict() method for within models. In principle, the commands illustrated below using fixest should now work with plm as well.
In my experience, plm models are kind of tricky to deal with, and many of the packages which specialize in “post-processing” fail to handle these objects properly.
One alternative would be to estimate your “within” model using the fixest package and to plot the results using the marginaleffects package. (Disclaimer: I am the marginaleffects author.)
Note that many of the models estimated by plm are officially supported and tested with marginaleffects (e.g., random effects, Amemiya, Swamy-Arora). However, this is not the case for this specific "within" model, which is even trickier than the others to support.
First, we estimate two models to show that the plm and fixest versions are equivalent:
library(plm)
library(fixest)
library(marginaleffects)
library(modelsummary)
data("EmplUK")
mod1 <- plm(
    emp ~ wage * capital,
    index = c("firm", "year"),
    model = "within",
    effect = "individual",
    data = EmplUK)
mod2 <- feols(
    emp ~ wage * capital | firm,
    se = "standard",
    data = EmplUK)
models <- list("PLM" = mod1, "FIXEST" = mod2)
modelsummary(models)
|                |   PLM   | FIXEST  |
|:---------------|:-------:|:-------:|
| wage           |  0.000  |  0.000  |
|                | (0.034) | (0.034) |
| capital        |  2.014  |  2.014  |
|                | (0.126) | (0.126) |
| wage × capital | -0.043  | -0.043  |
|                | (0.004) | (0.004) |
| Num.Obs.       |  1031   |  1031   |
| R2             |  0.263  |  0.986  |
| R2 Adj.        |  0.145  |  0.984  |
| R2 Within      |         |  0.263  |
| R2 Within Adj. |         |  0.260  |
| AIC            | 4253.9  | 4253.9  |
| BIC            | 4273.7  | 4273.7  |
| RMSE           |  1.90   |  1.90   |
| Std.Errors     |         |   IID   |
| FE: firm       |         |    X    |
Now, we use the marginaleffects package to plot the results. There are two main functions for this:
plot_cap(): plot conditional adjusted predictions. How does my predicted outcome change as a function of a covariate?
plot_cme(): plot conditional marginal effects. How does the slope of my model with respect to one variable (i.e., a derivative or “marginal effect”) change with respect to another variable?
See the website for definitions and details: https://vincentarelbundock.github.io/marginaleffects/
plot_cap(mod2, condition = "capital")
plot_cme(mod2, effect = "wage", condition = "capital")
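Per the 2022-08-20 edit above, a recent plm now provides a predict() method for "within" models, so the same plotting calls should in principle also work on the plm object. An untested sketch, assuming an up-to-date plm:
plot_cap(mod1, condition = "capital")
plot_cme(mod1, effect = "wage", condition = "capital")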
I have the following regression in R using the 'fixest' package. Yields are a function of N, N^2, P, K and S with producer by year fixed effects.
yield <- feols(yield ~ N + N_square + P + K + S | producer*year, data = data, se = "hetero")
I need to use the delta method from the 'car' package to estimate the optimal rate of N and obtain the standard error. In the below example, using the marginal effect of nitrogen from my regression, I am finding the optimal rate of N at an input:output price ratio = 4.
deltaMethod(yield, "(4 - b1)/(2*b2)", parameterNames= paste("b", 0:2, sep=""))
My issue is I am unable to run the deltaMethod with the feols regression. I am given the following error:
Warning: In vcov.fixest(object, complete = FALSE): 'complete' is not a valid argument of function vcov.fixest (fyi, some of its main arguments are 'vcov' and 'ssc').
Error in eval(g., envir) : object 'b1' not found
deltaMethod works with lm objects, but that does not help me: I cannot rerun my regression as an lm with the fixed effects as factors, because with my chosen data set and fixed-effect variables it is extremely slow.
Are there any alternatives to the deltaMethod function that work with feols regressions?
@g-grothendieck's answer covers the main issue. (Which is to say that car::deltaMethod does work with fixest objects; you just have to specify the coefficient names in a particular way.) I'd also recommend updating your version of fixest, since you appear to be using an old release.
But for posterity, let me quickly tackle the subsidiary question:
Are there any alternatives to the deltaMethod function that work with feols regressions?
You can use the "hypothesis" functionality of the (excellent) marginaleffects package. Please note that, at the time of writing, this is a relatively new feature, so you'll need to install the development version of marginaleffects.
Here's an example that replicates Gabor's example from the other answer.
library(fixest)
fm <- feols(conc ~ uptake + Treatment | Type, CO2, vcov = "hetero")
# remotes::install_github("vincentarelbundock/marginaleffects") # dev version
library(marginaleffects)
marginaleffects(
    fm,
    newdata = "mean",
    hypothesis = "(4 - uptake)/(2 * Treatment) = 0"
) |>
    summary() ## optional
#> Average marginal effects
#> Term Effect Std. Error z value Pr(>|z|) 2.5 % 97.5 %
#> 1 hypothesis -0.06078 0.01845 -3.295 0.00098423 -0.09693 -0.02463
#>
#> Model type: fixest
#> Prediction type: response
PS. For anyone else reading this, marginaleffects is something of a spiritual successor to margins, but is basically superior in every way (speed, model coverage, etc.)
In the absence of a reproducible example we will use the built-in CO2 data frame. (In the future please provide a reproducible example that can be used in answers -- see the info at the top of the r tag page.)
1) The default method of deltaMethod does not support the parameterNames argument, so use the original names.
library(fixest)
library(car)
fm <- feols(conc ~ uptake + Treatment | Type, CO2, vcov = "hetero")
deltaMethod(fm, "(4 - uptake)/(2 * Treatmentchilled)")
## Estimate SE 2.5 % 97.5 %
## (4 - uptake)/(2 * Treatmentchilled) -0.060780 0.018446 -0.096934 -0.0246
2) Alternatively, it can work with just the coefficients and the variance matrix, so try this:
co <- setNames(coef(fm), c("b1", "b2"))
deltaMethod(co, "(4 - b1)/(2*b2)", vcov(fm))
## Estimate SE 2.5 % 97.5 %
## (4 - b1)/(2 * b2) -0.060780 0.018446 -0.096934 -0.0246
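For intuition, here is what deltaMethod computes under the hood: a first-order Taylor (delta-method) approximation, with variance grad' V grad. A minimal base-R sketch, continuing with the fm object from above; the gradient of g(b) = (4 - b1)/(2*b2) is derived by hand:
b <- coef(fm)                              # b[1] = uptake, b[2] = Treatmentchilled
V <- vcov(fm)                              # heteroskedasticity-robust, as estimated
g <- (4 - b[1]) / (2 * b[2])               # the transformed quantity
grad <- c(-1 / (2 * b[2]),                 # dg/db1
          -(4 - b[1]) / (2 * b[2]^2))      # dg/db2
se <- sqrt(drop(t(grad) %*% V %*% grad))   # delta-method standard error
c(Estimate = unname(g), SE = se)
## should match the deltaMethod output above (about -0.060780 and 0.018446)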
I can't figure out how to reconstruct the results or the formula from the predict function of a linear model. I get the same results when using this data in ggplot with geom_smooth(method = 'lm', formula = y ~ exp(x)).
Here's some sample data
x=c(1,10,100,1000,10000,100000,1000000,3000000)
y=c(1,1,10,15,20,30,40,60)
I would like to use an exponential function so (ignore for the moment that I log the x value, because exp() fails for very large values):
model = lm( y ~ exp(log10(x)))
mypred = predict(model)
plot(log(x),mypred)
I have tried
lm_coef <- coef(model)
plot(log10(x),lm_coef[1]*exp(-lm_coef[2]*x))
However, this gives me a decreasing exponential instead of an increasing one. My goal is to extract the equation of the exponential function so I can reuse the coefficients in another context. What equation is predict() using, and is there a way to see it?
I did something along the lines of:
Df <- data.frame(x = c(1, 10, 100, 1000, 10000, 100000, 1000000, 3000000),
                 y = c(1, 1, 10, 15, 20, 30, 40, 60))
model <- lm(data = Df, formula = y ~ log(x))
predict(model)
plot(log(Df$x), predict(model))
summary(model)
summary(model)
The relevant output you get is:
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -6.0700 4.7262 -1.284 0.246386
log(x) 3.5651 0.5035 7.081 0.000398 ***
---
Your equation therefore is y = 3.5651*log(x) - 6.0700, where log is the natural logarithm.
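As a quick sanity check (a small sketch using the Df and model objects above), you can rebuild the fitted values by hand from coef(model) and confirm they match predict(model):
eq <- function(x) coef(model)["(Intercept)"] + coef(model)["log(x)"] * log(x)
all.equal(unname(eq(Df$x)), unname(predict(model)))
## [1] TRUE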
I have an lm object and I would like to bootstrap only its standard errors. In practice, I want to use only part of the sample (with replacement) at each replication and get a distribution of standard errors. Then, if possible, I would like to display the summary of the original linear regression but with the bootstrapped standard errors and the corresponding p-values (in other words, the same beta coefficients but different standard errors).
Edited: In summary, I want to "modify" my lm object so that it keeps the beta coefficients from the original regression on the original data, but carries the bootstrapped standard errors (and associated t-stats and p-values) obtained by running the regression many times on different subsamples (with replacement).
So my lm object looks like
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.812793 0.095282 40.016 < 2e-16 ***
x -0.904729 0.284243 -3.183 0.00147 **
z 0.599258 0.009593 62.466 < 2e-16 ***
x:z 0.091511 0.029704 3.081 0.00208 **
but the associated standard errors are wrong, and I would like to estimate them by replicating this linear regression 1000 times on different subsamples (with replacement).
Is there a way to do this? Can anyone help me?
Thank you for your time.
Marco
What you ask can be done following the line of the code below.
Since you have not posted an example dataset or the model to fit, I will use the built-in dataset mtcars and a simple formula with two continuous predictors.
library(boot)
boot_function <- function(data, indices, formula){
    d <- data[indices, ]                  # bootstrap sample
    obj <- lm(formula, d)                 # refit the model on the subsample
    coefs <- summary(obj)$coefficients
    coefs[, "Std. Error"]                 # return the standard errors
}
set.seed(8527)
fmla <- as.formula("mpg ~ hp * cyl")
seboot <- boot(mtcars, boot_function, R = 1000, formula = fmla)
colMeans(seboot$t)
##[1] 6.511530646 0.068694001 1.000101450 0.008804784
I believe that it is possible to use the code above for most needs with numeric response and predictors.
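If you also want the final step the question asks for (same coefficients, bootstrapped standard errors, and the corresponding t-stats and p-values), here is a hedged sketch building on the objects above; whether averaging the bootstrap replicates is the right summary of the SE distribution is a judgment call:
fit <- lm(fmla, data = mtcars)            # original fit: keep these betas
beta <- coef(fit)
se_boot <- colMeans(seboot$t)             # bootstrapped standard errors
t_boot <- beta / se_boot
p_boot <- 2 * pt(abs(t_boot), df = df.residual(fit), lower.tail = FALSE)
cbind(Estimate = beta, `Boot SE` = se_boot, `t value` = t_boot, `Pr(>|t|)` = p_boot)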