How to save summary(lm) to a file? - r

I'm using R for a pharmacodynamic analysis and I'm fairly new to programming.
The thing is, I'm carrying out linear regression analyses and in the future I will use more advanced methods. Because I'm performing a large number of analyses (and I'm too lazy to copy and paste manually every time I run the script), I would like to save the summaries of the analyses to a file. I've tried different methods, but nothing seems to work.
What I'm looking for is the following as (preferably) a text file:
X_Y <- lm(X ~ Y)
sum1 <- summary(X_Y)
> sum1
Call:
lm(formula = AUC_cumulative ~ LVEF)
Residuals:
Min 1Q Median 3Q Max
-910.59 -434.11 -89.17 349.39 2836.81
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1496.4215 396.5186 3.774 0.000268 ***
LVEF 0.8243 7.3265 0.113 0.910640
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 619.9 on 104 degrees of freedom
(32 observations deleted due to missingness)
Multiple R-squared: 0.0001217, Adjusted R-squared: -0.009493
F-statistic: 0.01266 on 1 and 104 DF, p-value: 0.9106
I've searched for methods to save summary output to a .csv or .txt file, but those files don't present the data in a way I can understand.
Things I've tried:
fileConn <- file("output.txt")
writeLines(sum1, fileConn)
close(fileConn)
This returns:
Error in writeLines(sum1, fileConn) : invalid 'text' argument
An attempt using the write.table command gave:
> write.table(sum1, 'output.csv', sep=",", row.names=FALSE, col.names=TRUE, quote=FALSE)
Error in as.data.frame.default(x[[i]], optional = TRUE, stringsAsFactors = stringsAsFactors) : cannot coerce class "summary.lm" to a data.frame
Using the write command:
> write(sum1, 'output.txt')
Error in cat(list(...), file, sep, fill, labels, append) : argument 1 (type 'list') cannot be handled by 'cat'
Then I was getting closer with the following:
> write.table(sum1, 'output.csv', sep=",", row.names=FALSE, col.names=TRUE, quote=FALSE)
But this file did not contain the same readable information as the printed summary.
I hope someone can help, because this is way too advanced programming for me.

I think one option could be sink() which will output the results to a text file rather than the console. In the absence of your dataset I've used cars for an example:
sink("lm.txt")
print(summary(lm(cars$speed ~ cars$dist)))
sink() # returns output to the console
lm.txt now looks like this:
Call:
lm(formula = cars$speed ~ cars$dist)
Residuals:
Min 1Q Median 3Q Max
-7.5293 -2.1550 0.3615 2.4377 6.4179
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 8.28391 0.87438 9.474 1.44e-12 ***
cars$dist 0.16557 0.01749 9.464 1.49e-12 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 3.156 on 48 degrees of freedom
Multiple R-squared: 0.6511, Adjusted R-squared: 0.6438
F-statistic: 89.57 on 1 and 48 DF, p-value: 1.49e-12
@Roland's suggestion of knitr is a bit more involved, but could be worth it because you can easily knit input, text output, and figures into one report or HTML file.
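If you would rather not redirect the whole console, a possible alternative is capture.output(), which returns the printed summary as a character vector that writeLines() accepts (this is what the writeLines(sum1, fileConn) attempt in the question was missing). A minimal sketch using the cars data; the file name and the use of tempdir() are only for illustration:

```r
# Fit a model on the built-in cars data
fit <- lm(speed ~ dist, data = cars)

# Hypothetical output location; replace with your own path
out_file <- file.path(tempdir(), "lm_summary.txt")

# capture.output() turns the printed summary into a character vector
writeLines(capture.output(summary(fit)), out_file)

# Further summaries can be appended to the same file
cat(capture.output(summary(lm(dist ~ speed, data = cars))),
    file = out_file, sep = "\n", append = TRUE)
```

Unlike sink(), nothing needs to be switched back afterwards, which can be convenient inside a loop over many models.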

The suggestions above work great. Depending on what you need, you can also use broom's tidy() function for the coefficient table and glance() for the model-level statistics.
library( broom )
a <- lm(cars$speed ~ cars$dist)
write.csv( tidy( a ) , "coefs.csv" )
write.csv( glance( a ) , "an.csv" )
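If installing broom is not an option, roughly the same coefficient table can be written with base R only. This is a sketch, not a replacement for tidy(): the column names are those of the summary's coefficient matrix, and the term column and file name are my own additions for illustration:

```r
fit <- lm(speed ~ dist, data = cars)

# coef(summary(fit)) is the coefficient matrix shown in the printed summary
coefs <- as.data.frame(coef(summary(fit)))
coefs$term <- rownames(coefs)   # keep the term names as a regular column

out_csv <- file.path(tempdir(), "coefs.csv")  # hypothetical path
write.csv(coefs, out_csv, row.names = FALSE)
```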

Should you want to re-import the data into R but still want to have it in a text file, there is also dput, e.g.,
dput(summary(lm(cars$speed~cars$dist)),file="summary_lm.txt",control="all")
This allows you to re-import the summary object via
res=dget("summary_lm.txt")
Let's check the class of res
class(res)
[1] "summary.lm"
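If the file does not have to be human-readable text, saveRDS() and readRDS() are another way to store and restore the summary object. A small sketch, with a temporary file name chosen only for illustration:

```r
s <- summary(lm(speed ~ dist, data = cars))

rds_file <- file.path(tempdir(), "summary_lm.rds")  # hypothetical path
saveRDS(s, rds_file)          # binary R serialization of the object

res2 <- readRDS(rds_file)     # restore it, e.g. in a later session
class(res2)
```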

Try the apaStyle package:
library(apaStyle)
apa.regression(reg1, variables = NULL, number = "1", title = " title ",
filename = "APA Table1 regression.docx", note = NULL, landscape = FALSE, save = TRUE, type = "wide")

Related

How to fit a Biphasic Dose Response Curve using R?

I am trying to process NanoBRET assay data to analyze competition between Ternary Complex (TC) formation and binary binding between Chimeric Targeted Molecule and weaker affinity interacting species using R. I could not locate the correct library function that helps perform the biphasic dose-response curve fit using the following formula. Can someone direct me to the appropriate R Library if available?
Concn CompoundX CompoundX
0.00001 0.309967 0.28848
0.000004 0.239756 0.386004
0.0000015 0.924346 0.924336
0.00000075 1.409483 1.310479
0.00000025 2.128796 2.007222
0.0000001 2.407227 2.371517
3.75E-08 2.300768 2.203162
1.63E-08 1.826203 1.654133
6.25E-09 0.978104 1.06907
2.5E-09 0.483403 0.473238
1.06E-09 0.235191 0.251971
4.06E-10 0.115721 0.114867
1.56E-10 0.06902 0.053681
6.25E-11 0.031384 0.054416
2.66E-11 0.023007 0.028945
1.09E-11 0.003956 0.020866
Plot generated in GraphPad PRISM using biphasic dose-response equation.
I had to answer my own questions by following further links in the article suggested by @I_O. Apparently the response curve, which I thought looked more like the "bell-shaped" model described in the skimpy Prism documentation, is precisely what the referenced article calls "biphasic". See: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4660423/pdf/srep17200.pdf
The R code to do the fitting is in the supplemental material referenced at https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4660423/#S1
dput(dat)
structure(list(Concn = c(1e-05, 4e-06, 1.5e-06, 7.5e-07, 2.5e-07,
1e-07, 3.75e-08, 1.63e-08, 6.25e-09, 2.5e-09, 1.06e-09, 4.06e-10,
1.56e-10, 6.25e-11, 2.66e-11, 1.09e-11), CompoundX = c(0.309967,
0.239756, 0.924346, 1.409483, 2.128796, 2.407227, 2.300768, 1.826203,
0.978104, 0.483403, 0.235191, 0.115721, 0.06902, 0.031384, 0.023007,
0.003956), CompoundX.2 = c(0.28848, 0.386004, 0.924336, 1.310479,
2.007222, 2.371517, 2.203162, 1.654133, 1.06907, 0.473238, 0.251971,
0.114867, 0.053681, 0.054416, 0.028945, 0.020866)), class = "data.frame", row.names = c(NA,
-16L))
> library(drc)
> m0 <- drm(CompoundX ~ log(Concn), data = dat, fct = gaussian())
> summary(m0)
Model fitted: Gaussian (5 parms)
Parameter estimates:
Estimate Std. Error t-value p-value
b:(Intercept) 2.031259 0.086190 23.567 9.128e-11 ***
c:(Intercept) 0.012121 0.040945 0.296 0.7727
d:(Intercept) 2.447918 0.067136 36.462 7.954e-13 ***
e:(Intercept) -16.271552 0.045899 -354.509 < 2.2e-16 ***
f:(Intercept) 2.095870 0.195703 10.709 3.712e-07 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error:
0.07894641 (11 degrees of freedom)
> plot(m0, type = "all", col= "black", log = "")
Warning message:
In min(dose[dose > 0]) : no non-missing arguments to min; returning Inf
That Supplement goes on to compare a variety of model variations and constraints, so it should be read, digested and followed more closely than space permits here.

How to change decimal mark in papaja::apa_print?

I wanted to use papaja::apa_print() to make things easier in my report, but I need to adjust the output for my language's adaptation of APA style. That means displaying leading zeros and using a comma as the decimal mark. This is not a problem with most papaja functions, but with apa_print() it is not that straightforward.
library(papaja)
model <- lm(Sepal.Length ~ Species, data = iris)
anova(model) |> apa_print(decimal.mark = ",", gt1 = TRUE)
The options are applied to most of the output but not to p-values and F-ratios (at least in ANOVA). I thought it might apply to all LaTeX parts of the output but eta-squared seems to be OK. I believe it might be a bug. So how do I deal with it? Should I report it as a bug? Part of the output for reference:
$estimate
$estimate$Species
[1] "$\\hat{\\eta}^2_G = 0,401$, 90\\% CI $[0,300, 0,485]$"
$statistic
$statistic$Species
[1] "$F(2, 147) = 49.16$, $p < .001$"
$full_result
$full_result$Species
[1] "$F(2, 147) = 49.16$, $p < .001$, $\\hat{\\eta}^2_G = 0,401$, 90\\% CI $[0,300, 0,485]$"
$table
A data.frame with 7 labelled columns:
term estimate conf.int statistic df df.residual p.value
1 Species 0,401 [0,300, 0,485] 49.16 2 147 < .001

Stargazer for ARDL package's output: 'Error: Unrecognized object type'

The ARDL package in R builds on dynlm, which is an accepted input for stargazer as per this question and answer.
However, I am unable to get a stargazer table from ardl or auto_ardl. It throws the "Unrecognized object type" error. Is there a way around this?
Here's a reproducible example:
set.seed(10)
library(ARDL)
library(stargazer)
x=rnorm(100,mean = 5,sd=2)
y=rnorm(100,mean = 7,sd=3)
df=cbind(x,y)
model1=auto_ardl(y~x,data = df,max_order = 4)
class(model1)
[1] "list"
stargazer(model1)
% Error: Unrecognized object type.
class(model1$best_model)
[1] "dynlm" "lm" "ardl"
stargazer(model1$best_model)
% Error: Unrecognized object type.
I'm sorry, I don't know how to do this in stargazer, but this model type is supported out of the box by the latest version of the modelsummary package (disclaimer: I am the maintainer).
set.seed(10)
library(ARDL)
library(modelsummary)
x=rnorm(100,mean = 5,sd=2)
y=rnorm(100,mean = 7,sd=3)
df=cbind(x,y)
model1=auto_ardl(y~x,data = df,max_order = 4)
modelsummary(model1$best_model)
             Model 1
(Intercept)  6.849
             (1.705)
L(y, 1)      0.061
             (0.106)
x            -0.103
             (0.166)
L(x, 1)      -0.027
             (0.167)
L(x, 2)      -0.075
             (0.166)
L(x, 3)      0.043
             (0.167)
L(x, 4)      0.048
             (0.169)
Num.Obs.     96
R2           0.013
R2 Adj.      -0.054
AIC          492.8
BIC          513.3
Log.Lik.     -238.398

R Markdown, output test results in loop

I'm looking for a nicely formatted markdown output of test results that are produced within a for loop and structured with headings. For example
df <- data.frame(x = rnorm(1000),
y = rnorm(1000),
z = rnorm(1000))
for (v in c("y","z")) {
cat("##", v, " (model 0)\n")
summary(lm(x~1, df))
cat("##", v, " (model 1)\n")
summary(lm(as.formula(paste0("x~1+",v)), df))
}
whereas the output should be
y (model 0)
Call:
lm(formula = x ~ 1, data = df)
Residuals:
Min 1Q Median 3Q Max
-3.8663 -0.6969 -0.0465 0.6998 3.1648
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.05267 0.03293 -1.6 0.11
Residual standard error: 1.041 on 999 degrees of freedom
y (model 1)
Call:
lm(formula = as.formula(paste0("x~1+", v)), data = df)
Residuals:
Min 1Q Median 3Q Max
-3.8686 -0.6915 -0.0447 0.6921 3.1504
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.05374 0.03297 -1.630 0.103
y -0.02399 0.03189 -0.752 0.452
Residual standard error: 1.042 on 998 degrees of freedom
Multiple R-squared: 0.0005668, Adjusted R-squared: -0.0004346
F-statistic: 0.566 on 1 and 998 DF, p-value: 0.452
z (model 0)
and so on...
There are several results discussing parts of the question like here or here suggesting the asis-tag in combination with the cat-statement. This one includes headers.
Closest to my request seems to be this question from two years ago. However, even though they are highly appreciated, some of the suggestions are deprecated, like asis_output, or I can't get them to work in general conditions, like the formattable suggestion (e.g. with lm output). I just wonder, as two years have passed since then, whether there is a modern approach that facilitates what I'm looking for.
Solution Type 1
You could do a capture.output(cat(.)) approach with some lapply-looping. Send the output to a file and use rmarkdown::render(.).
This is the R code producing a *.pdf.
capture.output(cat("---
title: 'Test Results'
author: 'Tom & co.'
date: '11 10 2019'
output: pdf_document
---\n\n```{r setup, include=FALSE}\n
knitr::opts_chunk$set(echo = TRUE)\n
mtcars <- data.frame(mtcars)\n```\n"), file="_RMD/Tom.Rmd") # here of course your own data
lapply(seq(mtcars), function(i)
capture.output(cat("# Model", i, "\n\n```{r chunk", i, ", comment='', echo=FALSE}\n\
print(summary(lm(mpg ~ ", names(mtcars)[i] ,", mtcars)))\n```\n"),
file="_RMD/Tom.Rmd", append=TRUE))
rmarkdown::render("_RMD/Tom.Rmd")
Produces:
Solution Type 2
When we want to automate the output of multiple model summaries within the R Markdown document itself, we can choose between two chunk settings: 1. without results='asis', the summaries keep their code formatting, but a line such as # Model 1 is printed literally instead of being rendered as a headline; 2. with results='asis', Model 1 is rendered as a headline, but the code formatting of the summaries is destroyed. The solution is to use results='asis' and wrap every output line in inline-code backticks, which we paste() together in another sapply() loop nested inside the sapply() over the models.
In the main sapply() we apply @G.Grothendieck's venerable solution to nicely substitute the Call: line of the output using do.call("lm", list(.)). We need to wrap an invisible(.) around it to avoid the unnecessary sapply() output [[1]] [[2]]... of the empty lists produced.
I included a ". " in the cat(), because leading white space like ` this` would otherwise be rendered as this in lines 6 and 10 of the summary outputs.
This is the R Markdown script producing a *.pdf; it can also be executed normally, line by line:
---
title: "Test results"
author: "Tom & co."
date: "15 10 2019"
output: pdf_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
# Overview
This is an example of an ordinary code block with output that had to be included.
```{r mtcars, fig.width=3, fig.height=3}
head(mtcars)
```
# Test results in detail
The test results follow fully automated in detail.
```{r mtcars2, echo=FALSE, message=FALSE, results="asis"}
invisible(sapply(tail(seq(mtcars), -2), function(i) {
fo <- reformulate(names(mtcars)[i], response="mpg")
s <- summary(do.call("lm", list(fo, quote(mtcars))))
cat("\n## Model", i - 2, "\n")
sapply(1:19, function(j)
cat(paste0("`", ". ", capture.output(s)[j]), "` \n"))
cat(" \n")
}))
```
***Note:*** This is a concluding remark to show that we still can do other stuff afterwards.
Produces:
(Note: Site 3 omitted)
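For the loop in the original question, a more direct variant of the results='asis' idea is to emit each heading as markdown and wrap the captured summary in its own fenced block, so the heading is rendered and the output keeps its code formatting. The following is only a sketch: emit_section() is a hypothetical helper name, the code is meant to sit in a chunk with results='asis', and the Call: line will show the formula variable rather than the expanded formula.

```r
# Emit a markdown heading followed by a fenced block holding the printed object.
# Intended for a knitr chunk with results='asis'; emit_section is a made-up name.
emit_section <- function(title, object) {
  fence <- strrep("`", 3)   # built at run time so no literal fence appears here
  cat("\n## ", title, "\n\n", sep = "")
  cat(fence, "\n", sep = "")
  cat(capture.output(print(object)), sep = "\n")
  cat("\n", fence, "\n", sep = "")
}

set.seed(1)
df <- data.frame(x = rnorm(100), y = rnorm(100), z = rnorm(100))

for (v in c("y", "z")) {
  emit_section(paste(v, "(model 0)"), summary(lm(x ~ 1, data = df)))
  fo <- reformulate(v, response = "x")   # builds the formula x ~ y, then x ~ z
  emit_section(paste(v, "(model 1)"), summary(lm(fo, data = df)))
}
```

Because print() is called explicitly, the summaries appear even inside the for loop, where auto-printing does not happen.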
Context
I was hit by the same need as that of OP when trying to generate multiple plots in a loop, but one of them would apparently crash the graphical device (because of unpredictable bad input) even when called using try() and prevent all the remaining figures from being generated. I needed really independent code blocks, like in the proposed solution.
Solution
I've thought of preprocessing the source file before it was passed to knitr, preferably inside R, and found that the jinjar package was a good candidate. It uses a dynamic template syntax based on the Jinja2 templating engine from Python/Django. There are no syntax clashes with document formats accepted by R Markdown, but the tricky part was integrating it nicely with its machinery.
My hackish solution was to create a wrapper rmarkdown::output_format() that executes some code inside the rmarkdown::render() call environment to process the source file:
preprocess_jinjar <- function(base_format) {
if (is.character(base_format)) {
base_format <- rmarkdown:::create_output_format_function(base_format)
}
function(...) {
# Find the rmarkdown::render() environment.
callers <- sapply(sys.calls(), function(x) deparse(as.list(x)[[1]]))
target <- grep('^(rmarkdown::)?render$', callers)
target <- target[length(target)] # render may be called recursively
render_envir <- sys.frames()[[target]]
# Modify input with jinjar.
input_paths <- evalq(envir = render_envir, expr = {
original_knit_input <- sub('(\\.[[:alnum:]]+)$', '.jinjar\\1', knit_input)
file.rename(knit_input, original_knit_input)
input_lines <- jinjar::render(paste(input_lines, collapse = '\n'))
writeLines(input_lines, knit_input)
normalize_path(c(knit_input, original_knit_input))
})
# Add an on_exit hook to revert the modification.
rmarkdown::output_format(
knitr = NULL,
pandoc = NULL,
on_exit = function() file.rename(input_paths[2], input_paths[1]),
base_format = base_format(...)
)
}
}
Then I can call, for example:
rmarkdown::render('input.Rmd', output_format = preprocess_jinjar('html_document'))
Or, more programmatically, with the output format specified in the source file metadata as usual:
html_jinjar <- preprocess_jinjar('html_document')
rmarkdown::render('input.Rmd')
Here is a minimal example for input.Rmd:
---
output:
html_jinjar:
toc: false
---
{% for n in [1, 2, 3] %}
# Section {{ n }}
```{r block-{{ n }}}
print({{ n }}**2)
```
{% endfor %}
Caveats
It's a hack. This code depends on the internal logic of rmarkdown::render() and there are likely edge cases where it won't work. Use at your own risk.
For this solution to work, the output format constructor must be called by render(). Therefore, evaluating it before passing it to render() will fail:
render('input.Rmd', output_format = 'html_jinjar') # works
render('input.Rmd', output_format = html_jinjar) # works
render('input.Rmd', output_format = html_jinjar()) # fails
This second limitation could be circumvented by putting the preprocessing code inside the pre_knit() hook, but then it would only run after other output format hooks, like intermediates_generator() and other pre_knit() hooks of the format.

Display subset of R output with knitR

Is there a way to display only part of the R output with knitR? I want to display only part of the summary output from an lm model in a beamer presentation so that it doesn't run off the slide. (As a side note, why is my code not wrapping?) A minimal example is provided below.
\documentclass{beamer}
\begin{document}
\title{My talk}
\author{Me}
\maketitle
\begin{frame}[fragile, t]{Slide 1}
<<setup, include=FALSE, cache=FALSE, tidy=TRUE>>=
options(width=60, digits=5, show.signif.stars=FALSE)
@
<<mod1, tidy=TRUE>>==
data(cars) # load data
g <- lm(dist ~ speed + I(speed^2) + I(speed^3), data = cars)
summary(g)
@
\end{frame}
\end{document}
To be very specific, say that I wanted to return only the following output:
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -19.50505 28.40530 -0.687 0.496
speed 6.80111 6.80113 1.000 0.323
I(speed^2) -0.34966 0.49988 -0.699 0.488
I(speed^3) 0.01025 0.01130 0.907 0.369
Residual standard error: 15.2 on 46 degrees of freedom
Multiple R-squared: 0.6732, Adjusted R-squared: 0.6519
F-statistic: 31.58 on 3 and 46 DF, p-value: 3.074e-11
There's probably a better way to do this, but the following should work for you. It uses capture.output to select what parts of the printed output to display:
\documentclass{beamer}
\begin{document}
\title{My talk}
\author{Me}
\maketitle
\begin{frame}[fragile, t]{Slide 1}
<<setup, include=FALSE, cache=FALSE, tidy=TRUE>>=
options(width=60, digits=5, show.signif.stars=FALSE)
@
<<mod1, tidy=TRUE>>==
data(cars) # load data
g <- lm(dist ~ speed + I(speed^2) + I(speed^3), data = cars)
tmp <- capture.output(summary(g))
cat(tmp[9:length(tmp)], sep='\n')
@
\end{frame}
\end{document}
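The hard-coded index 9 above depends on how many lines the Call and Residuals parts occupy. A slightly more robust sketch of the same capture.output idea looks up the line where the coefficient table starts instead (the variable name start is my own):

```r
g <- lm(dist ~ speed + I(speed^2) + I(speed^3), data = cars)
tmp <- capture.output(summary(g))

# Locate the first line of the coefficient block instead of hard-coding 9
start <- grep("^Coefficients:", tmp)[1]

# Print everything from the coefficient table to the end of the summary
cat(tmp[start:length(tmp)], sep = "\n")
```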
The summary.lm() method being invoked here returns a list of relevant outputs formatted nicely with print.summary.lm. If you want individual components of the list, try double brackets:
Input:
summary(g)[[4]]
summary(g)[[6]]
summary(g)[[7]]
summary(g)[[8]]
Output:
> summary(g)[[4]]
Estimate Std. Error t value Pr(>|t|)
(Intercept) -19.50504910 28.40530273 -0.6866693 0.4957383
speed 6.80110597 6.80113480 0.9999958 0.3225441
I(speed^2) -0.34965781 0.49988277 -0.6994796 0.4877745
I(speed^3) 0.01025205 0.01129813 0.9074113 0.3689186
> summary(g)[[6]]
[1] 15.20466
> summary(g)[[7]]
[1] 4 46 4
> summary(g)[[8]]
[1] 0.6731808
There must be a better way to combine the niceness of the summary method with list indexing, though.
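One way to get closer to that, sketched here under the assumption that only the coefficient table and a few statistics are wanted, is to use the component names instead of positions and to format the coefficient matrix with printCoefmat(), the function the print method itself relies on for that table:

```r
g <- lm(dist ~ speed + I(speed^2) + I(speed^3), data = cars)
s <- summary(g)

# Named access is equivalent to the positional indexing used above:
# s$coefficients is s[[4]], s$sigma is s[[6]], s$df is s[[7]], s$r.squared is s[[8]]
printCoefmat(coef(s), digits = 5, signif.stars = FALSE)

cat("Residual standard error:", format(s$sigma, digits = 4),
    "on", s$df[2], "degrees of freedom\n")
cat("Multiple R-squared:", format(s$r.squared, digits = 4), "\n")
```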

Resources