R summary output not printing formula, but only its variable - r

I my R code, I have the folllowing:
input.formula = 'some_valid_formula'
glm.model <- glm(inp.formula,data=data, family='quasipoisson')
summary <- capture.output(summary(glm.model))
print(summary)
The problem is that the summary prints:
Call: [INFO] - model coefficients: glm(formula = input.formula, family =
\"quasipoisson\")
...
Note that the formula is printed as the formula variable name and not the formula itself. What am I missing here?

One option is to use do.call to construct and execute the glm call. This will return the evaluated glm call with the formula displayed. For example:
data <- data.frame(group = gl(3,3), type = gl(3,1,9), counts = rpois(9, 2))
input.formula = "counts ~ group + type"
glm.model <- do.call("glm", list(input.formula, 'quasipoisson', data))
summary(glm.model)

Related

How to run regression when formula is given by a string?

Let's consider data following:
set.seed(42)
y <- runif(100)
df <- data.frame("Exp" = rexp(100), "Norm" = rnorm(100), "Wei" = rweibull(100, 1))
I want to perform linear regression but when formula is a string in format:
form <- "Exp + Norm + Wei"
I thought that I only have to use:
as.formula(lm(y~form, data = df))
However it's not working. The error is about variety in length of variables. (it seems like it still treats form as a string vector of length 1, but I have no idea why).
Do you know how I can do it ?
We can use paste to construct the formula, and use it directly on lm
lm(paste('y ~', form), data = df)
-output
#Call:
#lm(formula = paste("y ~", form), data = df)
#Coefficients:
#(Intercept) Exp Norm Wei
# 0.495861 0.026988 0.046689 0.003612

Can't seem to figure out how to use lapply with gls() [duplicate]

I am trying to use lme function from nlme package inside a lapply loop. This works for lmer function from lme4 package, but produces an error message for lme. How can I loop lme functions similarly to the lmer function in the example below?
library("nlme")
library("lme4")
set.seed(1)
dt <- data.frame(Resp1 = rnorm(100, 50, 23), Resp2 = rnorm(100, 80, 15), Pred = rnorm(100,10,2), group = factor(rep(LETTERS[1:10], each = 10)))
## Syntax:
lmer(Resp1 ~ Pred + (1 |group), data = dt)
lme(Resp1 ~ Pred, random = ~1 | group, data = dt)
## Works for lme4
lapply(c("Resp1", "Resp2"), function(k) {
lmer(substitute(j ~ Pred + (1 | group), list(j = as.name(k))), data = dt)})
## Does not work for nlme
lapply(c("Resp1", "Resp2"), function(k) {
lme(substitute(j ~ Pred, list(j = as.name(k))), random = ~1 | group, data = dt)})
# Error in UseMethod("lme") :
# no applicable method for 'lme' applied to an object of class "call"
PS. I am aware that this solution exists, but I would like to use a method substituting response variable directly in the model function instead of subsetting data using an additional function.
Instead of fiddling around with substitute and eval you also could do the following:
lapply(c("Resp1", "Resp2"), function(r) {
f <- formula(paste(r, "Pred", sep = "~"))
m <- lme(fixed = f, random = ~ 1 | group, data = dt)
m$call$fixed <- f
m})
You could use the same trick if you want to provide different data sets to a modelling function:
makeModel <- function(dat) {
l <- lme(Resp1 ~ Pred, random = ~ 1 | group, data = dat)
l$call$data <- as.symbol(deparse(substitute(dat)))
l
}
I use this snippet quite a bit, when I want to generate a model from within a function and want to update it afterwards.
As #CarlWitthoft suggested, adding eval into the function will solve the issue:
lapply(c("Resp1", "Resp2"), function(k) {
lme(eval(substitute(j ~ Pred, list(j = as.name(k)))), random = ~1 | group, data = dt)})
Also see #thothal's alternative.

R order lapply output from a function with multiple outputs by variable (column) rather than by function

I have a function in R which includes multiple other functions, including a custom one. I then use lapply to run the combined function across multiple variables. However, when the output is produced it is in the order of
function1: variable a, variable b, variable c
function2: variable a, variable b, variable c
When what I would like is for it to be the other way around:
variable a: function 1, function 2...
variable b: function 1, function 2...
I have recreated an example below using the mtcars dataset, with number of cylinders as a predictor variable, and vs and am as outcome variables.
library(datasets)
library(tidyverse)
library(skimr)
library(car)
data(mtcars)
mtcars_binary <- mtcars %>%
dplyr::select(cyl, vs, am)
# logistic regression function
logistic.regression <- function(logmodel) {
dev <- logmodel$deviance
null.dev <- logmodel$null.deviance
modelN <- length(logmodel$fitted.values)
R.lemeshow <- 1 - dev / null.dev
R.coxsnell <- 1 - exp ( -(null.dev - dev) / modelN)
R.nagelkerke <- R.coxsnell / ( 1 - ( exp (-(null.dev / modelN))))
cat("Logistic Regression\n")
cat("Hosmer and Lemeshow R^2 ", round(R.lemeshow, 3), "\n")
cat("Cox and Snell R^2 ", round(R.coxsnell, 3), "\n")
cat("Nagelkerke R^2" , round(R.nagelkerke, 3), "\n")
}
# all logistic regression results
log_regression_tests1 <- function(df_vars, df_data) {
glm_summary <- glm(df_data[,df_vars] ~ df_data[,1], data = df_data, family = binomial, na.action = "na.omit")
glm_print <- print(glm_summary)
log_results <- logistic.regression(glm_summary)
blr_coefficients <- exp(glm_summary$coefficients)
blr_confint <- exp(confint(glm_summary))
list(glm_summary = glm_summary, glm_print = glm_print, log_results = log_results, blr_coefficients = blr_coefficients, blr_confint = blr_confint)
}
log_regression_results1 <- sapply(colnames(mtcars_binary[,2:3]), log_regression_tests1, mtcars_binary, simplify = FALSE)
log_regression_results1
When I do this, the output is being produced as:
glm_summary: vs, am
log_results: vs, am
etc. etc.
When what I would like for the output to be ordered is:
vs: all function outputs
am: all function outputs
In addition, when I run this line of code, log_regression_results1 <- sapply(colnames(mtcars_binary[,2:3]), log_regression_tests1, mtcars_binary, simplify = FALSE) I get only the results of the logistic regression function, but when I print the overall results log_regression_results1 I get the remaining output, could anyone explain why?
Finally, the glm_summary function is not producing all of the output which it should. When I run the functions independently on a single variable, like so
glm_vs <- glm(vs ~ cyl, data = mtcars_binary, family = binomial, na.action = "na.omit")
summary(glm_vs)
logistic.regression(glm_vs)
exp(glm_vs$vs)
exp(confint(glm_vs))
it also produces the standard error, z value, and p value for summary(glm_vs) which it does not do embedded in the function, even though I have ```glm_print <- print(glm_summary)' included. Is there a way to get the output for the full summary function within the log_regression_tests1 function?
when I run your code up to log_regression_results1 I got exactly what you ask for:
summary(log_regression_results1)
Length Class Mode
vs 5 -none- list
am 5 -none- list
maybe you meant to ask the other way round?

update() does not work for models created via lapply()

I would like to use lapply() to compute several models in R, but it seems that the update() function can't handle models generated through lapply().
A minimal example:
d1 <- data.frame(y = log(1:9), x = 1:9, trt = rep(1:3, each = 3))
f <- list(y ~ 1, y ~ x, y ~ trt)
modsa <- lapply(f, function(formula) glm(formula, data = d1))
modsb <- lapply(f, glm, data = d1)
update(modsa[[1]], data = d1[1:7, ])
#> Error: object of type 'closure' is not subsettable
update(modsb[[1]], data = d1[1:7, ])
#> Error in FUN(formula = X[[i]], data = d1[1:7, ]): could not find function "FUN"
Is there a way that allows update() to deal with models generated through lapply()?
The error is occurring because the call elements of the glm objects are being overwritten by the argument name passed to the anonymous function
modsa <- lapply(f, function(x) glm(x, data = d1))
modsa[[1]]$call
glm(formula = x, data = d1)
#compare with a single instance of the model
moda1<-glm(y ~ 1, data=d1)
moda1$call
glm(formula = y ~ 1, data = d1)
If you add back in the formula, it will correctly recreate the call
update(modsa[[1]], data = d1[1:7, ], formula=f[[1]])
This doesn't work for the second instance, but you can see that if you manually update the call element the update functionality is rescued.
modsb[[1]]$call<-getCall(moda1)
update(modsb[[1]], data = d1[1:7, ])
Esther is correct, the problem is with the call element of glm. From ?update:
‘update’ will update and (by default) re-fit a model. It does this by
extracting the call stored in the object, updating the call and (by
default) evaluating that call.
As already mentioned, one can update including the formula as well:
update(modsa[[1]], data = d1[1:7, ], formula=f[[1]])
If for some reason this is not convenient, here is how to run your lapply() and have it assign directly the correct formula to the call element:
modsa <- lapply(f, function(formula) eval(substitute(glm(F, data = d1), list(F=formula))))
This substutites the respective formula to the glm call, and then evaluates it. With this long one-liner you can run update(modsa[[1]], data = d1[1:7, ]) with no problem.

How to use substitute() to loop lme functions from nlme package?

I am trying to use lme function from nlme package inside a lapply loop. This works for lmer function from lme4 package, but produces an error message for lme. How can I loop lme functions similarly to the lmer function in the example below?
library("nlme")
library("lme4")
set.seed(1)
dt <- data.frame(Resp1 = rnorm(100, 50, 23), Resp2 = rnorm(100, 80, 15), Pred = rnorm(100,10,2), group = factor(rep(LETTERS[1:10], each = 10)))
## Syntax:
lmer(Resp1 ~ Pred + (1 |group), data = dt)
lme(Resp1 ~ Pred, random = ~1 | group, data = dt)
## Works for lme4
lapply(c("Resp1", "Resp2"), function(k) {
lmer(substitute(j ~ Pred + (1 | group), list(j = as.name(k))), data = dt)})
## Does not work for nlme
lapply(c("Resp1", "Resp2"), function(k) {
lme(substitute(j ~ Pred, list(j = as.name(k))), random = ~1 | group, data = dt)})
# Error in UseMethod("lme") :
# no applicable method for 'lme' applied to an object of class "call"
PS. I am aware that this solution exists, but I would like to use a method substituting response variable directly in the model function instead of subsetting data using an additional function.
Instead of fiddling around with substitute and eval you also could do the following:
lapply(c("Resp1", "Resp2"), function(r) {
f <- formula(paste(r, "Pred", sep = "~"))
m <- lme(fixed = f, random = ~ 1 | group, data = dt)
m$call$fixed <- f
m})
You could use the same trick if you want to provide different data sets to a modelling function:
makeModel <- function(dat) {
l <- lme(Resp1 ~ Pred, random = ~ 1 | group, data = dat)
l$call$data <- as.symbol(deparse(substitute(dat)))
l
}
I use this snippet quite a bit, when I want to generate a model from within a function and want to update it afterwards.
As #CarlWitthoft suggested, adding eval into the function will solve the issue:
lapply(c("Resp1", "Resp2"), function(k) {
lme(eval(substitute(j ~ Pred, list(j = as.name(k)))), random = ~1 | group, data = dt)})
Also see #thothal's alternative.

Resources