Get model estimates from another reference level, without running a new model?

I am wondering if there is a simple way to change what values are in the intercept, perhaps mathematically, without re-running large models. As an example:
mtcars$cyl <- as.factor(mtcars$cyl)
summary(
  lm(mpg ~ cyl + hp, data = mtcars)
)
Output:
Call:
lm(formula = mpg ~ cyl + hp, data = mtcars)

Residuals:
   Min     1Q Median     3Q    Max 
-4.818 -1.959  0.080  1.627  6.812 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 28.65012    1.58779  18.044  < 2e-16 ***
cyl6        -5.96766    1.63928  -3.640  0.00109 ** 
cyl8        -8.52085    2.32607  -3.663  0.00103 ** 
hp          -0.02404    0.01541  -1.560  0.12995    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 3.146 on 28 degrees of freedom
Multiple R-squared:  0.7539, Adjusted R-squared:  0.7275 
F-statistic: 28.59 on 3 and 28 DF,  p-value: 1.14e-08
Now I can change the reference level to 6 cyl, and can see how 8 cyl now compares to 6 cyl, rather than 4 cyl:
mtcars$cyl <- relevel(mtcars$cyl, "6")
summary(
  lm(mpg ~ cyl + hp, data = mtcars)
)
Output:
Call:
lm(formula = mpg ~ cyl + hp, data = mtcars)

Residuals:
   Min     1Q Median     3Q    Max 
-4.818 -1.959  0.080  1.627  6.812 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 22.68246    2.22805   10.18 6.48e-11 ***
cyl4         5.96766    1.63928    3.64  0.00109 ** 
cyl8        -2.55320    1.97867   -1.29  0.20748    
hp          -0.02404    0.01541   -1.56  0.12995    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 3.146 on 28 degrees of freedom
Multiple R-squared:  0.7539, Adjusted R-squared:  0.7275 
F-statistic: 28.59 on 3 and 28 DF,  p-value: 1.14e-08
What I am wondering is: is there a way to get these values without re-running the model? You can see that the 4 cyl vs 6 cyl comparison is the same in each model (-5.96 and 5.96), but how would I get the estimate for the 'other' comparison from a single model (e.g., the -2.55 for 8 cyl vs 6 cyl, using only the first model)? Of course in this case it takes a fraction of a second to run the other model, but with very large models it would be convenient to change the reference level without refitting. Are there relatively simple ways to convert all of the estimates and standard errors to be based on a different reference level, or is that too complicated to do?
Any solutions for lme4, glmmTMB, or rstanarm models would be appreciated.

Here's a function that will give you the coefficients for every rearrangement of a given factor variable, without having to run the model again or specify contrasts:
rearrange_model_factors <- function(model, var) {
  var <- deparse(substitute(var))
  coefs <- coef(model)

  # Turn the level coefficients from differences into absolute values
  # by adding the intercept (the baseline level's value) to each
  level_coefs <- grep(paste0("^", var), names(coefs))
  coefs[level_coefs] <- coefs[1] + coefs[level_coefs]

  # Work out which level is the baseline and rename the intercept to it
  used_levels <- gsub(var, "", names(coefs[level_coefs]))
  all_levels <- levels(model$model[[var]])
  names(coefs)[1] <- paste0(var, setdiff(all_levels, used_levels))

  # For every permutation of the levels, rebuild the coefficient vector
  # with the first level of that permutation acting as the baseline
  level_coefs <- grep(paste0("^", var), names(coefs))
  levs <- coefs[level_coefs]
  perms <- gtools::permutations(length(levs), length(levs))
  perms <- lapply(seq(nrow(perms)), function(i) levs[perms[i, ]])
  lapply(perms, function(x) {
    x[-1] <- x[-1] - x[1]
    coefs[level_coefs] <- x
    names(coefs)[level_coefs] <- names(x)
    names(coefs)[1] <- "(Intercept)"
    coefs
  })
}
Suppose you had a model like this:
iris_mod <- lm(Sepal.Width ~ Species + Sepal.Length, data = iris)
To see how your coefficients would change if Species were in a different order, you would just do:
rearrange_model_factors(iris_mod, Species)
#> [[1]]
#> (Intercept) Speciesversicolor Speciesvirginica Sepal.Length
#> 1.6765001 -0.9833885 -1.0075104 0.3498801
#>
#> [[2]]
#> (Intercept) Speciesvirginica Speciesversicolor Sepal.Length
#> 1.6765001 -1.0075104 -0.9833885 0.3498801
#>
#> [[3]]
#> (Intercept) Speciessetosa Speciesvirginica Sepal.Length
#> 0.69311160 0.98338851 -0.02412184 0.34988012
#>
#> [[4]]
#> (Intercept) Speciesvirginica Speciessetosa Sepal.Length
#> 0.69311160 -0.02412184 0.98338851 0.34988012
#>
#> [[5]]
#> (Intercept) Speciessetosa Speciesversicolor Sepal.Length
#> 0.66898976 1.00751035 0.02412184 0.34988012
#>
#> [[6]]
#> (Intercept) Speciesversicolor Speciessetosa Sepal.Length
#> 0.66898976 0.02412184 1.00751035 0.34988012
Or with your own example:
mtcars$cyl <- as.factor(mtcars$cyl)
rearrange_model_factors(lm(mpg ~ cyl + hp, data = mtcars), cyl)
#> [[1]]
#> (Intercept) cyl6 cyl8 hp
#> 28.65011816 -5.96765508 -8.52085075 -0.02403883
#>
#> [[2]]
#> (Intercept) cyl8 cyl6 hp
#> 28.65011816 -8.52085075 -5.96765508 -0.02403883
#>
#> [[3]]
#> (Intercept) cyl4 cyl8 hp
#> 22.68246309 5.96765508 -2.55319567 -0.02403883
#>
#> [[4]]
#> (Intercept) cyl8 cyl4 hp
#> 22.68246309 -2.55319567 5.96765508 -0.02403883
#>
#> [[5]]
#> (Intercept) cyl4 cyl6 hp
#> 20.12926741 8.52085075 2.55319567 -0.02403883
#>
#> [[6]]
#> (Intercept) cyl6 cyl4 hp
#> 20.12926741 2.55319567 8.52085075 -0.02403883
We need a bit of exposition to see why this works.
Although the function above only runs the model once, let's start by creating a list containing 3 versions of mtcars, where the baseline factor levels of cyl are all different.
df_list <- list(mtcars_4 = within(mtcars, cyl <- factor(cyl, c(4, 6, 8))),
                mtcars_6 = within(mtcars, cyl <- factor(cyl, c(6, 4, 8))),
                mtcars_8 = within(mtcars, cyl <- factor(cyl, c(8, 4, 6))))
Now we can extract the coefficients of your model for all three versions at once using lapply. For clarity, we will remove the hp coefficient, which remains static across all three versions anyway:
coefs <- lapply(df_list, function(x) coef(lm(mpg ~ cyl + hp, data = x))[-4])
coefs
#> $mtcars_4
#> (Intercept) cyl6 cyl8
#> 28.650118 -5.967655 -8.520851
#>
#> $mtcars_6
#> (Intercept) cyl4 cyl8
#> 22.682463 5.967655 -2.553196
#>
#> $mtcars_8
#> (Intercept) cyl4 cyl6
#> 20.129267 8.520851 2.553196
Now, we remind ourselves that the coefficient for each factor level is given relative to the baseline level. For the non-intercept coefficients, we can therefore simply add the intercept to get their absolute values. These numbers represent the expected value of mpg when hp equals 0, for each of the three levels of cyl:
coefs <- lapply(coefs, function(x) c(x[1], x[-1] + x[1]))
coefs
#> $mtcars_4
#> (Intercept) cyl6 cyl8
#> 28.65012 22.68246 20.12927
#>
#> $mtcars_6
#> (Intercept) cyl4 cyl8
#> 22.68246 28.65012 20.12927
#>
#> $mtcars_8
#> (Intercept) cyl4 cyl6
#> 20.12927 28.65012 22.68246
Since we now have all three values as absolutes, let's rename "Intercept" to the appropriate factor level:
coefs <- mapply(function(x, y) { names(x)[1] <- y; x },
                x = coefs, y = c("cyl4", "cyl6", "cyl8"), SIMPLIFY = FALSE)
coefs
#> $mtcars_4
#> cyl4 cyl6 cyl8
#> 28.65012 22.68246 20.12927
#>
#> $mtcars_6
#> cyl6 cyl4 cyl8
#> 22.68246 28.65012 20.12927
#>
#> $mtcars_8
#> cyl8 cyl4 cyl6
#> 20.12927 28.65012 22.68246
Finally, let's rearrange the order so we can compare the absolute values of all three factor levels:
coefs <- lapply(coefs, function(x) x[order(names(x))])
coefs
#> $mtcars_4
#> cyl4 cyl6 cyl8
#> 28.65012 22.68246 20.12927
#>
#> $mtcars_6
#> cyl4 cyl6 cyl8
#> 28.65012 22.68246 20.12927
#>
#> $mtcars_8
#> cyl4 cyl6 cyl8
#> 28.65012 22.68246 20.12927
We can see they are all the same. This is why the ordering of factors is arbitrary in lm: changing the order of the factor levels gives the same numerical predictions in the end, even if the summary appears different.
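If you want to verify this yourself, a quick sanity check (not part of the reprex above) is to compare the fitted values from two of the refits; the choice of baseline is pure reparameterisation, so they are identical:
fit4 <- lm(mpg ~ cyl + hp, data = df_list$mtcars_4)
fit8 <- lm(mpg ~ cyl + hp, data = df_list$mtcars_8)
# Fitted values do not depend on which level is the baseline
all.equal(unname(fitted(fit4)), unname(fitted(fit8)))  # TRUE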
TL;DR
So the answer to your question of where you get the -2.55, if you only have the first model, is: take the difference between the non-intercept coefficients. In this case:
(-8.520851) - (-5.967655)
#> [1] -2.553196
Alternatively, add the intercept to the non-intercept coefficients: this shows you what the intercept would be if any of the levels were the baseline, and you can then get the coefficient for any level relative to any other by simple subtraction. That's how my function rearrange_model_factors works.
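The same trick works for the standard errors you asked about, since a comparison between two non-baseline levels is just a linear combination of coefficients: its variance comes straight out of the model's variance-covariance matrix. A minimal sketch using the first model (4 cyl as baseline):
fit <- lm(mpg ~ cyl + hp, data = mtcars)
V <- vcov(fit)
# Var(b8 - b6) = Var(b8) + Var(b6) - 2 * Cov(b8, b6)
sqrt(V["cyl8", "cyl8"] + V["cyl6", "cyl6"] - 2 * V["cyl8", "cyl6"])
# should reproduce the 1.97867 std. error reported for cyl8 when 6 is the baseline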
Created on 2020-10-05 by the reprex package (v0.3.0)
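For the lme4, glmmTMB or rstanarm models mentioned at the end of the question, the same contrast arithmetic applies to the fixed effects. Rather than hand-rolling it, a package such as emmeans (a suggestion, not used in the reprex above; it supports all three model classes) can produce every pairwise level comparison, with standard errors, from a single fit:
library(emmeans)
fit <- lm(mpg ~ cyl + hp, data = mtcars)
# All pairwise cyl comparisons, estimated from one fitted model
emmeans(fit, pairwise ~ cyl)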

Related

Can I parallelize a function which by default returns a list? (in R)

I have tried with the parLapply function of the parallel package, but I did not succeed.
Yes, parLapply can return a list of lists. If the function called by parLapply returns a list, that's what you get. (See the note after the example for post-processing the result.)
library(parallel)

# data
data(mtcars)

# function - fit a model on a bootstrapped sample of mtcars
# (x is just the replicate index and is not used inside the function)
model_fit <- function(x) {
  n <- nrow(mtcars)
  i <- sample(n, n, replace = TRUE)
  fit <- lm(mpg ~ ., data = mtcars[i, ])
  fit
}

# detect the number of cores
n.cores <- detectCores() - 2L

# number of bootstrap replicates
Repl <- 100L

cl <- makeCluster(n.cores)
clusterExport(cl, "mtcars")
model_list <- parLapply(cl, 1:Repl, model_fit)
stopCluster(cl)
# a list of "lm" objects
summary(model_list[[1]])
#>
#> Call:
#> lm(formula = mpg ~ ., data = mtcars[i, ])
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -2.86272 -0.80106 -0.08815 0.68233 2.79325
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) -19.353679 20.403857 -0.949 0.353648
#> cyl -9.469350 1.921317 -4.929 7.10e-05 ***
#> disp 0.008405 0.014370 0.585 0.564876
#> hp 0.037101 0.020395 1.819 0.083185 .
#> drat 9.573765 2.289820 4.181 0.000421 ***
#> wt -2.543876 2.111203 -1.205 0.241630
#> qsec 3.360474 1.091440 3.079 0.005692 **
#> vs -32.223824 6.698522 -4.811 9.39e-05 ***
#> am -26.326478 5.534103 -4.757 0.000107 ***
#> gear 11.903213 2.420963 4.917 7.30e-05 ***
#> carb -4.022473 1.047176 -3.841 0.000949 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 1.745 on 21 degrees of freedom
#> Multiple R-squared: 0.9437, Adjusted R-squared: 0.9169
#> F-statistic: 35.2 on 10 and 21 DF, p-value: 7.204e-11
Created on 2022-09-02 by the reprex package (v2.0.1)
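Since model_list is an ordinary list of "lm" objects, post-processing it needs no parallel machinery. For example, a sketch of how you might turn it into bootstrap standard errors (assuming that is the goal of the resampling):
# Stack the coefficient vectors into a replicates-by-terms matrix
coef_mat <- t(sapply(model_list, coef))
# Column-wise standard deviations are the bootstrap SEs
apply(coef_mat, 2, sd)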

How do I extract variables that have a low p-value in R

I have a logistic model with plenty of interactions in R.
I want to extract only the terms that are significant, whether they are interactions or plain predictor variables.
It's fine if I can just look at every interaction that's significant, as I can still see which non-significant fields were used to build them.
Thank you.
This is the most I have:
broom::tidy(logmod)[,c("term", "estimate", "p.value")]
Here is a way. After fitting the logistic model, use a logical condition to get the significant predictors and a regex (via grepl) to get the interactions. These two index vectors can be combined with &; in the case below, that returns no significant interactions at the alpha == 0.05 level. (A broom-based variant follows after the reprex.)
fit <- glm(am ~ hp + qsec*vs, mtcars, family = binomial)
summary(fit)
#>
#> Call:
#> glm(formula = am ~ hp + qsec * vs, family = binomial, data = mtcars)
#>
#> Deviance Residuals:
#> Min 1Q Median 3Q Max
#> -1.93876 -0.09923 -0.00014 0.05351 1.33693
#>
#> Coefficients:
#> Estimate Std. Error z value Pr(>|z|)
#> (Intercept) 199.02697 102.43134 1.943 0.0520 .
#> hp -0.12104 0.06138 -1.972 0.0486 *
#> qsec -10.87980 5.62557 -1.934 0.0531 .
#> vs -108.34667 63.59912 -1.704 0.0885 .
#> qsec:vs 6.72944 3.85348 1.746 0.0808 .
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> (Dispersion parameter for binomial family taken to be 1)
#>
#> Null deviance: 43.230 on 31 degrees of freedom
#> Residual deviance: 12.574 on 27 degrees of freedom
#> AIC: 22.574
#>
#> Number of Fisher Scoring iterations: 8
alpha <- 0.05
pval <- summary(fit)$coefficients[,4]
sig <- pval <= alpha
intr <- grepl(":", names(coef(fit)))
coef(fit)[sig]
#> hp
#> -0.1210429
coef(fit)[sig & intr]
#> named numeric(0)
Created on 2022-09-15 with reprex v2.0.2
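Since the question already uses broom::tidy, the same filtering can be done on its tidy data frame instead of on the summary matrix (a sketch, not part of the reprex above):
library(broom)
library(dplyr)
# Significant terms, then significant interactions specifically
tidy(fit) %>% filter(p.value <= alpha)
tidy(fit) %>% filter(p.value <= alpha, grepl(":", term))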

How to automatically insert a default index in Prais-Winsten estimation?

I got an error message saying:
Error in prais_winsten(agvprc.lm1, data = data) :
  argument "index" is missing, with no default
How can I avoid inputting all the features?
agvprc.lm1 <- lm(log(avgprc) ~ mon + tues + wed + thurs + t + wave2 + wave3)
summary(agvprc.lm1)
agvprc.lm.pw <- prais_winsten(agvprc.lm1, data = data)
summary(agvprc.lm.pw)
Not sure if I've understood the question correctly, but to avoid the "argument "index" is missing, with no default" error, you need to provide an index to the function: a character vector naming the "ID" and "time" variables. Using the inbuilt mtcars dataset as an example, with "cyl" as "time":
library(tidyverse)
#install.packages("prais")
library(prais)
#> Loading required package: sandwich
#> Loading required package: pcse
#>
#> Attaching package: 'pcse'
#> The following object is masked from 'package:sandwich':
#>
#> vcovPC
ggplot(mtcars, aes(x = cyl, y = mpg, group = cyl)) +
  geom_boxplot() +
  geom_jitter(aes(color = hp), width = 0.2)
agvprc.lm1 <- lm(log(mpg) ~ cyl + hp, data = mtcars)
summary(agvprc.lm1)
#>
#> Call:
#> lm(formula = log(mpg) ~ cyl + hp, data = mtcars)
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -0.35699 -0.09882 0.01111 0.11948 0.24118
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) 3.7829495 0.1062183 35.615 < 2e-16 ***
#> cyl -0.1072513 0.0279213 -3.841 0.000615 ***
#> hp -0.0011031 0.0007273 -1.517 0.140147
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 0.1538 on 29 degrees of freedom
#> Multiple R-squared: 0.7503, Adjusted R-squared: 0.7331
#> F-statistic: 43.57 on 2 and 29 DF, p-value: 1.83e-09
agvprc.lm.pw <- prais_winsten(formula = agvprc.lm1,
                              data = mtcars,
                              index = c("hp", "cyl"))
#> Iteration 0: rho = 0
#> Iteration 1: rho = 0.6985
#> Iteration 2: rho = 0.7309
#> Iteration 3: rho = 0.7285
#> Iteration 4: rho = 0.7287
#> Iteration 5: rho = 0.7287
#> Iteration 6: rho = 0.7287
#> Iteration 7: rho = 0.7287
summary(agvprc.lm.pw)
#>
#> Call:
#> prais_winsten(formula = agvprc.lm1, data = mtcars, index = c("hp",
#> "cyl"))
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -0.33844 -0.08166 0.03109 0.13612 0.25811
#>
#> AR(1) coefficient rho after 7 iterations: 0.7287
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) 3.7643405 0.1116876 33.704 <2e-16 ***
#> cyl -0.1061198 0.0298161 -3.559 0.0013 **
#> hp -0.0011470 0.0007706 -1.489 0.1474
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 0.1077 on 29 degrees of freedom
#> Multiple R-squared: 0.973, Adjusted R-squared: 0.9712
#> F-statistic: 523.3 on 2 and 29 DF, p-value: < 2.2e-16
#>
#> Durbin-Watson statistic (original): 0.1278
#> Durbin-Watson statistic (transformed): 0.4019
Created on 2022-02-28 by the reprex package (v2.0.1)
# To provide the index without having to write out all of the variables,
# perhaps you could use:
agvprc.lm.pw <- prais_winsten(formula = agvprc.lm1,
                              data = mtcars,
                              index = names(agvprc.lm1$coefficients)[3:2])
summary(agvprc.lm.pw)
#>
#> Call:
#> prais_winsten(formula = agvprc.lm1, data = mtcars, index = names(agvprc.lm1$coefficients)[3:2])
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -0.33844 -0.08166 0.03109 0.13612 0.25811
#>
#> AR(1) coefficient rho after 7 iterations: 0.7287
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) 3.7643405 0.1116876 33.704 <2e-16 ***
#> cyl -0.1061198 0.0298161 -3.559 0.0013 **
#> hp -0.0011470 0.0007706 -1.489 0.1474
#> ---
#> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#>
#> Residual standard error: 0.1077 on 29 degrees of freedom
#> Multiple R-squared: 0.973, Adjusted R-squared: 0.9712
#> F-statistic: 523.3 on 2 and 29 DF, p-value: < 2.2e-16
#>
#> Durbin-Watson statistic (original): 0.1278
#> Durbin-Watson statistic (transformed): 0.4019
NB: there are a number of assumptions here that may not apply to your actual data; with more information, such as a minimal reproducible example, you will likely get a better/more informed answer on Stack Overflow.

Is it possible to add multiple variables in the same regression model?

These are the regression models that I want to obtain. I want to select many variables at the same time to develop a multivariate model, since my data frame has 357 variables.
summary(lm(formula = bci_bci ~ bti_acp, data = qog))
summary(lm(formula = bci_bci ~ wdi_pop, data = qog))
summary(lm(formula = bci_bci ~ ffp_sl, data = qog))
Instead of listing all your variables using + signs, you can also use the shorthand notation . to add all variables in data as explanatory variables (except the target variable on the left-hand side, of course).
data("mtcars")
mod <- lm(mpg ~ ., data = mtcars)
summary(mod)
#>
#> Call:
#> lm(formula = mpg ~ ., data = mtcars)
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -3.4506 -1.6044 -0.1196 1.2193 4.6271
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) 12.30337 18.71788 0.657 0.5181
#> cyl -0.11144 1.04502 -0.107 0.9161
#> disp 0.01334 0.01786 0.747 0.4635
#> hp -0.02148 0.02177 -0.987 0.3350
#> drat 0.78711 1.63537 0.481 0.6353
#> wt -3.71530 1.89441 -1.961 0.0633 .
#> qsec 0.82104 0.73084 1.123 0.2739
#> vs 0.31776 2.10451 0.151 0.8814
#> am 2.52023 2.05665 1.225 0.2340
#> gear 0.65541 1.49326 0.439 0.6652
#> carb -0.19942 0.82875 -0.241 0.8122
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 2.65 on 21 degrees of freedom
#> Multiple R-squared: 0.869, Adjusted R-squared: 0.8066
#> F-statistic: 13.93 on 10 and 21 DF, p-value: 3.793e-07
par(mfrow=c(2,2))
plot(mod)
par(mfrow=c(1,1))
Created on 2021-12-21 by the reprex package (v2.0.1)
If you want to include all two-way interactions, the notation would be this:
lm(mpg ~ (.)^2, data = mtcars)
If you want to include all three-way interactions, the notation would be this:
lm(mpg ~ (.)^3, data = mtcars)
If you create very large models (with many variables or interactions), make sure you also perform some model-size reduction afterwards, e.g. using the function step() (see the sketch below). It's very likely that not all your predictors are actually informative, and many could be correlated, which causes problems in multivariable models. One way out of this is to remove predictors that are highly correlated with other predictors from the model.
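As a minimal sketch of such a reduction, backward elimination by AIC on the full model looks like this:
# Backward elimination by AIC, starting from the full model
mod <- lm(mpg ~ ., data = mtcars)
mod_reduced <- step(mod, direction = "backward", trace = 0)
formula(mod_reduced)  # the predictors that survive the reduction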

Different independent variables and table from summary-values

I have a problem that I have been trying to solve for a couple of hours now, but I simply can't figure it out (I'm new to R, btw).
Basically, what I'm trying to do (using mtcars to illustrate) is to make R test different independent variables (while adjusting for "cyl" and "disp") against the same dependent variable ("mpg"). The best solution I have been able to come up with is:
lm <- lapply(mtcars[,4:6], function(x) lm(mpg ~ cyl + disp + x, data = mtcars))
summary <- lapply(lm, summary)
... where 4:6 corresponds to columns "hp", "drat" and "wt".
This actually works OK, but the problem is that the summary appears with an "x" instead of, for instance, "hp":
$hp

Call:
lm(formula = mpg ~ cyl + disp + x, data = mtcars)

Residuals:
    Min      1Q  Median      3Q     Max 
-4.0889 -2.0845 -0.7745  1.3972  6.9183 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 34.18492    2.59078  13.195 1.54e-13 ***
cyl         -1.22742    0.79728  -1.540   0.1349    
disp        -0.01884    0.01040  -1.811   0.0809 .  
x           -0.01468    0.01465  -1.002   0.3250    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 3.055 on 28 degrees of freedom
Multiple R-squared:  0.7679, Adjusted R-squared:  0.743 
F-statistic: 30.88 on 3 and 28 DF,  p-value: 5.054e-09
Questions:
Is there a way to fix this? And have I done this in the smartest way using lapply, or would it be better to use, for instance, for loops or other options?
Ideally, I would also very much like to make a table showing, for instance, only the estimate and p-value for each independent variable. Can this somehow be done?
Best regards
One approach to get the name of the variable displayed in the summary is by looping over the names of the variables and setting up the formula using paste and as.formula:
lm <- lapply(names(mtcars)[4:6], function(x) {
  formula <- as.formula(paste0("mpg ~ cyl + disp + ", x))
  lm(formula, data = mtcars)
})
summary <- lapply(lm, summary)
summary
#> [[1]]
#>
#> Call:
#> lm(formula = formula, data = mtcars)
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -4.0889 -2.0845 -0.7745 1.3972 6.9183
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) 34.18492 2.59078 13.195 1.54e-13 ***
#> cyl -1.22742 0.79728 -1.540 0.1349
#> disp -0.01884 0.01040 -1.811 0.0809 .
#> hp -0.01468 0.01465 -1.002 0.3250
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 3.055 on 28 degrees of freedom
#> Multiple R-squared: 0.7679, Adjusted R-squared: 0.743
#> F-statistic: 30.88 on 3 and 28 DF, p-value: 5.054e-09
Concerning the second part of your question: one way to achieve this is by making use of broom::tidy from the broom package, which gives you a summary of the regression results as a tidy data frame (a combined table across models follows below):
lapply(lm, broom::tidy)
#> [[1]]
#> # A tibble: 4 x 5
#> term estimate std.error statistic p.value
#> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 (Intercept) 34.2 2.59 13.2 1.54e-13
#> 2 cyl -1.23 0.797 -1.54 1.35e- 1
#> 3 disp -0.0188 0.0104 -1.81 8.09e- 2
#> 4 hp -0.0147 0.0147 -1.00 3.25e- 1
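To condense this into the estimate/p-value table you asked for, you can bind the tidy data frames together and keep only the row for the variable that changes between models (a sketch, assuming the tested variable is always the last term in each formula):
library(dplyr)
bind_rows(lapply(lm, broom::tidy), .id = "model") %>%
  group_by(model) %>%
  slice_tail(n = 1) %>%   # the tested variable is the last term
  ungroup() %>%
  select(term, estimate, p.value)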
We could use reformulate to create the formula for the lm
lst1 <- lapply(names(mtcars)[4:6], function(x) {
  fmla <- reformulate(c("cyl", "disp", x),
                      response = "mpg")
  model <- lm(fmla, data = mtcars)
  model$call <- deparse(fmla)
  model
})
Then, get the summary
summary1 <- lapply(lst1, summary)
summary1[[1]]
#Call:
#"mpg ~ cyl + disp + hp"
#Residuals:
# Min 1Q Median 3Q Max
#-4.0889 -2.0845 -0.7745 1.3972 6.9183
#Coefficients:
# Estimate Std. Error t value Pr(>|t|)
#(Intercept) 34.18492 2.59078 13.195 1.54e-13 ***
#cyl -1.22742 0.79728 -1.540 0.1349
#disp -0.01884 0.01040 -1.811 0.0809 .
#hp -0.01468 0.01465 -1.002 0.3250
#---
#Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#Residual standard error: 3.055 on 28 degrees of freedom
#Multiple R-squared: 0.7679, Adjusted R-squared: 0.743
#F-statistic: 30.88 on 3 and 28 DF, p-value: 5.054e-09
