Display subset of R output with knitR - r

Is there a way to display only part of the R output with knitR? I want to display only part of the summary output from an lm model in a beamer presentation so that it doesn't run off the slide. (As a side note, why is my code not wrapping?) A minimal example is provided below.
\documentclass{beamer}
\begin{document}
\title{My talk}
\author{Me}
\maketitle
\begin{frame}[fragile, t]{Slide 1}
<<setup, include=FALSE, cache=FALSE, tidy=TRUE>>=
options(width=60, digits=5, show.signif.stars=FALSE)
@
<<mod1, tidy=TRUE>>=
data(cars) # load data
g <- lm(dist ~ speed + I(speed^2) + I(speed^3), data = cars)
summary(g)
@
\end{frame}
\end{document}
To be very specific, say that I wanted to return only the following output:
Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) -19.50505   28.40530  -0.687    0.496
speed         6.80111    6.80113   1.000    0.323
I(speed^2)   -0.34966    0.49988  -0.699    0.488
I(speed^3)    0.01025    0.01130   0.907    0.369

Residual standard error: 15.2 on 46 degrees of freedom
Multiple R-squared: 0.6732, Adjusted R-squared: 0.6519
F-statistic: 31.58 on 3 and 46 DF, p-value: 3.074e-11

There's probably a better way to do this, but the following should work for you. It uses capture.output() to store the printed summary as a character vector (one element per line) and then prints only the lines you want:
\documentclass{beamer}
\begin{document}
\title{My talk}
\author{Me}
\maketitle
\begin{frame}[fragile, t]{Slide 1}
<<setup, include=FALSE, cache=FALSE, tidy=TRUE>>=
options(width=60, digits=5, show.signif.stars=FALSE)
@
<<mod1, tidy=TRUE>>=
data(cars) # load data
g <- lm(dist ~ speed + I(speed^2) + I(speed^3), data = cars)
tmp <- capture.output(summary(g))
cat(tmp[9:length(tmp)], sep='\n')
@
\end{frame}
\end{document}
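If the number of lines printed before the coefficient table is not known in advance, one possible variant (a sketch, not part of the answer above) is to search the captured output for the "Coefficients:" header instead of hard-coding index 9. The names out, start and shown are only illustrative.

```r
# Fit the model from the question.
g <- lm(dist ~ speed + I(speed^2) + I(speed^3), data = cars)

# Capture the printed summary as a character vector, one element per line.
out <- capture.output(summary(g))

# Locate the "Coefficients:" header rather than assuming it is line 9.
start <- grep("^Coefficients:", out)[1]

# Keep everything from that header to the end and print it.
shown <- out[start:length(out)]
cat(shown, sep = "\n")
```

Inside the knitr chunk, these lines would take the place of the tmp and cat() lines.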

The summary.lm() method being invoked here returns a list of relevant outputs formatted nicely with print.summary.lm. If you want individual components of the list, try double brackets:
Input:
summary(g)[[4]]
summary(g)[[6]]
summary(g)[[7]]
summary(g)[[8]]
Output:
> summary(g)[[4]]
                Estimate  Std. Error    t value  Pr(>|t|)
(Intercept) -19.50504910 28.40530273 -0.6866693 0.4957383
speed         6.80110597  6.80113480  0.9999958 0.3225441
I(speed^2)   -0.34965781  0.49988277 -0.6994796 0.4877745
I(speed^3)    0.01025205  0.01129813  0.9074113 0.3689186
> summary(g)[[6]]
[1] 15.20466
> summary(g)[[7]]
[1] 4 46 4
> summary(g)[[8]]
[1] 0.6731808
There must be a better way to combine the niceness of the summary method with list indexing, though.
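One way to get closer to that, sketched here under the assumption that only base R is available, is to access the components by name and pass the coefficient matrix to printCoefmat(), the stats function for printing coefficient tables in the style of the summary output. The variable names s, coefs, sigma_hat and r2 are illustrative.

```r
g <- lm(dist ~ speed + I(speed^2) + I(speed^3), data = cars)
s <- summary(g)

# Named access returns the same objects as the positional indices above.
coefs     <- coef(s)       # same matrix as s[[4]] (s$coefficients)
sigma_hat <- s$sigma       # residual standard error, s[[6]]
r2        <- s$r.squared   # multiple R-squared, s[[8]]

# Print the coefficient table with summary-style rounding, without stars.
printCoefmat(coefs, digits = 5, signif.stars = FALSE)
```

Named access is also more robust than [[4]] or [[8]] because it does not depend on the order of the list elements.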

Related

Stargazer for ARDL package's output: 'Error: Unrecognized object type'

So ARDL package in R implements dynlm which is an accepted input in stargazer as per this question and answer.
However, I am unable to get stargazer table from ardl or auto_ardl. It throws the unrecognized object type error. Is there a way out of this?
Here's a reproducible example:
set.seed(10)
library(ARDL)
library(stargazer)
x=rnorm(100,mean = 5,sd=2)
y=rnorm(100,mean = 7,sd=3)
df=cbind(x,y)
model1=auto_ardl(y~x,data = df,max_order = 4)
class(model1)
[1] "list"
stargazer(model1)
% Error: Unrecognized object type.
class(model1$best_model)
[1] "dynlm" "lm" "ardl"
stargazer(model1$best_model)
% Error: Unrecognized object type.
I'm sorry, I don't know how to do this in stargazer, but this model type is supported out of the box by the latest version of the modelsummary package (disclaimer: I am the maintainer).
set.seed(10)
library(ARDL)
library(modelsummary)
x=rnorm(100,mean = 5,sd=2)
y=rnorm(100,mean = 7,sd=3)
df=cbind(x,y)
model1=auto_ardl(y~x,data = df,max_order = 4)
modelsummary(model1$best_model)
              Model 1
(Intercept)   6.849
              (1.705)
L(y, 1)       0.061
              (0.106)
x             -0.103
              (0.166)
L(x, 1)       -0.027
              (0.167)
L(x, 2)       -0.075
              (0.166)
L(x, 3)       0.043
              (0.167)
L(x, 4)       0.048
              (0.169)
Num.Obs.      96
R2            0.013
R2 Adj.       -0.054
AIC           492.8
BIC           513.3
Log.Lik.      -238.398

RMarkdown: Statamarkdown produces undesired output when collectcode=TRUE

I'm using Statamarkdown to produce HTML documents using RMarkdown and Stata.
As documented here, each code chunk is executed as a separate Stata session. collectcode=TRUE is a chunk option to collect Stata code across chunks.
While this works neatly, the outputs of the second (and any further) chunks following the first with collectcode=TRUE contain an undesired echo at the top:
Running .......\profile.do
For instance, when running a second chunk with {stata stata2, echo = T,collectcode=TRUE}
reg mpg price i.foreign , noheader
yields this output:
reg mpg price i.foreign , noheader
Running C:\Cloud\Methods\prog\profile.do . reg mpg price i.foreign , noheader
------------------------------------------------------------------------------
         mpg |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       price |   -.000959   .0001815    -5.28   0.000     -.001321    -.000597
             |
     foreign |
    Foreign  |   5.245271   1.163592     4.51   0.000     2.925135    7.565407
       _cons |   25.65058   1.271581    20.17   0.000     23.11512    28.18605
------------------------------------------------------------------------------
Here's my RMarkdown reprex:
---
title: "Statamarkdown output problem"
output: html_document
---
```{r setup, include = F}
library(Statamarkdown)
```
First chunk is clean:
```{stata stata1,collectcode=TRUE}
sysuse auto
su mpg price
```
Second Stata Output contains undesired `Running .......\profile.do` output:
```{stata stata2, echo = T,collectcode=TRUE}
reg mpg price i.foreign , noheader
```
Problem persists even in chunks with `collectcode=FALSE`:
```{stata new_data, echo = T,collectcode=F}
webuse bpwide, clear
su sex agegrp
```
`cleanlog = F` does not do the trick:
```{stata new_data2, echo = T,collectcode=F, cleanlog = FALSE}
webuse bpwide, clear
su sex agegrp
```
Avoiding collectcode=T altogether, i.e. loading and preparing the data in each chunk, would of course be a workaround, but an extremely tedious one.
I'm using R 3.6.3 and Stata 16.1 on a Windows machine.
Any ideas are very much appreciated!
It turns out Stata changed from
running .......\profile.do
to
Running .......\profile.do
A new version of the Statamarkdown package (0.5.0) now accommodates this.

R Markdown, output test results in loop

I'm looking for a nicely formatted markdown output of test results that are produced within a for loop and structured with headings. For example
df <- data.frame(x = rnorm(1000),
                 y = rnorm(1000),
                 z = rnorm(1000))
for (v in c("y","z")) {
  cat("##", v, " (model 0)\n")
  summary(lm(x~1, df))
  cat("##", v, " (model 1)\n")
  summary(lm(as.formula(paste0("x~1+",v)), df))
}
whereas the output should be
y (model 0)

Call:
lm(formula = x ~ 1, data = df)

Residuals:
    Min      1Q  Median      3Q     Max
-3.8663 -0.6969 -0.0465  0.6998  3.1648

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.05267    0.03293    -1.6     0.11

Residual standard error: 1.041 on 999 degrees of freedom

y (model 1)

Call:
lm(formula = as.formula(paste0("x~1+", v)), data = df)

Residuals:
    Min      1Q  Median      3Q     Max
-3.8686 -0.6915 -0.0447  0.6921  3.1504

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.05374    0.03297  -1.630    0.103
y           -0.02399    0.03189  -0.752    0.452

Residual standard error: 1.042 on 998 degrees of freedom
Multiple R-squared: 0.0005668, Adjusted R-squared: -0.0004346
F-statistic: 0.566 on 1 and 998 DF, p-value: 0.452

z (model 0)
and so on...
There are several results discussing parts of the question like here or here suggesting the asis-tag in combination with the cat-statement. This one includes headers.
The closest to my request seems to be this question from two years ago. However, even though the suggestions are highly appreciated, some of them are deprecated, like asis_output, and others I can't get to work in general conditions, like the formattable suggestion (e.g. with lm output). I just wonder, as two years have passed since then, whether there is a modern approach that facilitates what I'm looking for.
Solution Type 1
You could do a capture.output(cat(.)) approach with some lapply-looping. Send the output to a file and use rmarkdown::render(.).
This is the R code producing a *.pdf.
capture.output(cat("---
title: 'Test Results'
author: 'Tom & co.'
date: '11 10 2019'
output: pdf_document
---\n\n```{r setup, include=FALSE}\n
knitr::opts_chunk$set(echo = TRUE)\n
mtcars <- data.frame(mtcars)\n```\n"), file="_RMD/Tom.Rmd") # here of course your own data
lapply(seq(mtcars), function(i)
capture.output(cat("# Model", i, "\n\n```{r chunk", i, ", comment='', echo=FALSE}\n\
print(summary(lm(mpg ~ ", names(mtcars)[i] ,", mtcars)))\n```\n"),
file="_RMD/Tom.Rmd", append=TRUE))
rmarkdown::render("_RMD/Tom.Rmd")
Produces:
Solution Type 2
When we want to automate the output of multiple model summaries in the rmarkdown itself, we could choose between 1. setting the chunk option results='asis', which renders headlines such as # Model 1 but destroys the formatting of the code output, or 2. not setting it, which keeps the code formatting but prints # Model 1 literally instead of as a headline. The solution is to use the option and combine it with inline code that we can paste() together with another sapply()-loop within the sapply() for the models.
In the main sapply() we apply @G.Grothendieck's venerable solution to nicely substitute the Call: line of the output using do.call("lm", list(.)). We need to wrap an invisible(.) around it to avoid the unnecessary sapply() output [[1]] [[2]]... of the empty lists produced.
I included a ". " in the cat() because leading white space would otherwise be stripped when rendered, which affects lines 6 and 10 of the summary outputs.
This is the rmarkdown script producing a *.pdf; it can also be executed ordinarily, line by line:
---
title: "Test results"
author: "Tom & co."
date: "15 10 2019"
output: pdf_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
# Overview
This is an example of an ordinary code block with output that had to be included.
```{r mtcars, fig.width=3, fig.height=3}
head(mtcars)
```
# Test results in detail
The test results follow fully automated in detail.
```{r mtcars2, echo=FALSE, message=FALSE, results="asis"}
invisible(sapply(tail(seq(mtcars), -2), function(i) {
fo <- reformulate(names(mtcars)[i], response="mpg")
s <- summary(do.call("lm", list(fo, quote(mtcars))))
cat("\n## Model", i - 2, "\n")
sapply(1:19, function(j)
cat(paste0("`", ". ", capture.output(s)[j]), "` \n"))
cat(" \n")
}))
```
***Note:*** This is a concluding remark to show that we still can do other stuff afterwards.
Produces:
(Note: Site 3 omitted)
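A shorter variant of the same idea, offered only as a sketch: inside a chunk with results='asis', print each heading with cat() and wrap the printed summary in a plain markdown fence, so the heading is rendered and the output stays monospaced. The helper emit_model() and the variable fence are hypothetical names, not part of knitr or rmarkdown.

```r
set.seed(1)
df <- data.frame(x = rnorm(100), y = rnorm(100), z = rnorm(100))

# Three backticks, built programmatically so the fence is easy to spot.
fence <- strrep("`", 3)

# Emit a level-2 heading, then the model summary inside a fenced block.
emit_model <- function(title, model) {
  cat("\n\n## ", title, "\n\n", sep = "")
  cat(fence, "\n", sep = "")
  print(summary(model))  # explicit print() is required inside loops/functions
  cat(fence, "\n", sep = "")
}

for (v in c("y", "z")) {
  emit_model(paste(v, "(model 0)"), lm(x ~ 1, df))
  emit_model(paste(v, "(model 1)"), lm(reformulate(v, response = "x"), df))
}
```

The explicit print() matters: in the loop from the question, summary() is evaluated but never printed, because auto-printing only happens at top level.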
Context
I was hit by the same need as that of OP when trying to generate multiple plots in a loop, but one of them would apparently crash the graphical device (because of unpredictable bad input) even when called using try() and prevent all the remaining figures from being generated. I needed really independent code blocks, like in the proposed solution.
Solution
I've thought of preprocessing the source file before it was passed to knitr, preferably inside R, and found that the jinjar package was a good candidate. It uses a dynamic template syntax based on the Jinja2 templating engine from Python/Django. There are no syntax clashes with document formats accepted by R Markdown, but the tricky part was integrating it nicely with its machinery.
My hackish solution was to create a wrapper rmarkdown::output_format() that executes some code inside the rmarkdown::render() call environment to process the source file:
preprocess_jinjar <- function(base_format) {
if (is.character(base_format)) {
base_format <- rmarkdown:::create_output_format_function(base_format)
}
function(...) {
# Find the rmarkdown::render() environment.
callers <- sapply(sys.calls(), function(x) deparse(as.list(x)[[1]]))
target <- grep('^(rmarkdown::)?render$', callers)
target <- target[length(target)] # render may be called recursively
render_envir <- sys.frames()[[target]]
# Modify input with jinjar.
input_paths <- evalq(envir = render_envir, expr = {
original_knit_input <- sub('(\\.[[:alnum:]]+)$', '.jinjar\\1', knit_input)
file.rename(knit_input, original_knit_input)
input_lines <- jinjar::render(paste(input_lines, collapse = '\n'))
writeLines(input_lines, knit_input)
normalize_path(c(knit_input, original_knit_input))
})
# Add an on_exit hook to revert the modification.
rmarkdown::output_format(
knitr = NULL,
pandoc = NULL,
on_exit = function() file.rename(input_paths[2], input_paths[1]),
base_format = base_format(...)
)
}
}
Then I can call, for example:
rmarkdown::render('input.Rmd', output_format = preprocess_jinjar('html_document'))
Or, more programmatically, with the output format specified in the source file metadata as usual:
html_jinjar <- preprocess_jinjar('html_document')
rmarkdown::render('input.Rmd')
Here is a minimal example for input.Rmd:
---
output:
html_jinjar:
toc: false
---
{% for n in [1, 2, 3] %}
# Section {{ n }}
```{r block-{{ n }}}
print({{ n }}**2)
```
{% endfor %}
Caveats
It's a hack. This code depends on the internal logic of rmarkdown::render(), and there are likely edge cases where it won't work. Use at your own risk.
For this solution to work, the output format constructor must be called by render(). Therefore, evaluating it before passing it to render() will fail:
render('input.Rmd', output_format = 'html_jinjar') # works
render('input.Rmd', output_format = html_jinjar) # works
render('input.Rmd', output_format = html_jinjar()) # fails
This second limitation could be circumvented by putting the preprocessing code inside the pre_knit() hook, but then it would only run after other output format hooks, like intermediates_generator() and other pre_knit() hooks of the format.

How to save summary(lm) to a file?

I'm using R for a pharmacodynamic analysis and I'm fairly new to programming.
The thing is, I'm carrying out linear regression analysis and in the future I will perform more advanced methods. Because I'm performing a large number of analyses (and I'm too lazy to manually copy and paste every time I run the script), I would like to save the summaries of the analyses to a file. I've tried different methods, but nothing seems to work.
What I'm looking for is the following as (preferably) a text file:
X_Y <- lm(X ~ Y)
sum1 <- summary(X_Y)
> sum1
Call:
lm(formula = AUC_cumulative ~ LVEF)

Residuals:
    Min      1Q  Median      3Q     Max
-910.59 -434.11  -89.17  349.39 2836.81

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) 1496.4215   396.5186   3.774 0.000268 ***
LVEF           0.8243     7.3265   0.113 0.910640
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 619.9 on 104 degrees of freedom
(32 observations deleted due to missingness)
Multiple R-squared: 0.0001217, Adjusted R-squared: -0.009493
F-statistic: 0.01266 on 1 and 104 DF, p-value: 0.9106
I've searched for methods to save summary functions to a .csv or .txt, but those files don't represent the data in a way I can understand it.
Things I've tried:
fileConn <- file("output.txt")
writeLines(sum1, fileConn)
close(fileConn)
This returns:
Error in writeLines(sum1, fileConn) : invalid 'text' argument
An attempt using the write.table command gave:
> write.table(sum1, 'output.csv', sep=",", row.names=FALSE, col.names=TRUE, quote=FALSE)
Error in as.data.frame.default(x[[i]], optional = TRUE, stringsAsFactors = stringsAsFactors) : cannot coerce class "summary.lm" to a data.frame
Using the write command:
> write(sum1, 'output.txt')
Error in cat(list(...), file, sep, fill, labels, append) : argument 1 (type 'list') cannot be handled by 'cat'
Then I was getting closer with the following:
> write.table(sum1, 'output.csv', sep=",", row.names=FALSE, col.names=TRUE, quote=FALSE)
But this file did not have the same readable information as the printed summary.
I hope someone can help, because this is way too advanced programming for me.
I think one option could be sink() which will output the results to a text file rather than the console. In the absence of your dataset I've used cars for an example:
sink("lm.txt")
print(summary(lm(cars$speed ~ cars$dist)))
sink() # returns output to the console
lm.txt now looks like this:
Call:
lm(formula = cars$speed ~ cars$dist)

Residuals:
    Min      1Q  Median      3Q     Max
-7.5293 -2.1550  0.3615  2.4377  6.4179

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  8.28391    0.87438   9.474 1.44e-12 ***
cars$dist    0.16557    0.01749   9.464 1.49e-12 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 3.156 on 48 degrees of freedom
Multiple R-squared: 0.6511, Adjusted R-squared: 0.6438
F-statistic: 89.57 on 1 and 48 DF, p-value: 1.49e-12
@Roland's suggestion of knitr is a bit more involved, but could be worth it because you can easily knit input, text output, and figures into one report or HTML file.
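The writeLines() attempt in the question fails because a summary.lm object is a list, not a character vector. A possible alternative to sink(), sketched here with base R only, is to convert the printed summary to text with capture.output() first. The file name lm_summary.txt and the use of tempdir() are placeholders for illustration.

```r
# Example model on the built-in cars data, standing in for the real analysis.
fit <- lm(speed ~ dist, data = cars)

# Hypothetical output location; replace with your own path.
out_file <- file.path(tempdir(), "lm_summary.txt")

# capture.output() returns the printed summary as a character vector,
# which is exactly what writeLines() expects.
txt <- capture.output(summary(fit))
writeLines(txt, out_file)

# Read the file back to check that it matches what was printed.
saved <- readLines(out_file)
```

Unlike sink(), this does not redirect the console, so an error in the middle of the script cannot leave output diverted to the file.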
The suggestions above work great. Depending on what you need, you can also use the broom functions tidy() for the coefficients and glance() for the model-level statistics.
library( broom )
a <- lm(cars$speed ~ cars$dist)
write.csv( tidy( a ) , "coefs.csv" )
write.csv( glance( a ) , "an.csv" )
Should you want to re-import the data into R but still want to have it in a text file, there is also dput, e.g.,
dput(summary(lm(cars$speed~cars$dist)),file="summary_lm.txt",control="all")
This lets you re-import the summary object via
res=dget("summary_lm.txt")
Let's check the class of res
class(res)
[1] "summary.lm"
Try the apaStyle package:
library(apaStyle)
apa.regression(reg1, variables = NULL, number = "1", title = " title ",
filename = "APA Table1 regression.docx", note = NULL, landscape = FALSE, save = TRUE, type = "wide")

Combine texreg, knitr, booktabs & dcolumn

I'm trying to compile a LaTeX report using RStudio and knitr. I'm having a hard time getting the packages booktabs and dcolumn to work with my texreg-generated table.
As an example, I am trying to recreate Table 2 in this example.
My attempt as a .Rnw file is below:
\documentclass{article}
\usepackage{booktabs}
\usepackage{dcolumn}
<<setup, include=FALSE >>=
library(texreg)
@
\begin{document}
<<analysis, include=FALSE>>=
ctl <- c(4.17,5.58,5.18,6.11,4.50,4.61,5.17,4.53,5.33,5.14)
trt <- c(4.81,4.17,4.41,3.59,5.87,3.83,6.03,4.89,4.32,4.69)
group <- gl(2,10,20, labels=c("Ctl","Trt"))
weight <- c(ctl, trt)
m1 <- lm(weight ~ group)
m2 <- lm(weight ~ group - 1) # omitting intercept
@
<<table, results='asis'>>=
texreg(m2)
@
\end{document}
However, the generated LaTeX table (below) includes neither the booktabs horizontal lines nor the dcolumn alignment. How can I incorporate them? Many thanks for your help!
\begin{table}
\begin{center}
\begin{tabular}{l c }
\hline
& Model 1 \\
\hline
groupCtl & $5.03^{***}$ \\
& $(0.22)$ \\
groupTrt & $4.66^{***}$ \\
& $(0.22)$ \\
\hline
R$^2$ & 0.98 \\
Adj. R$^2$ & 0.98 \\
Num. obs. & 20 \\
\hline
\multicolumn{2}{l}{\scriptsize{$^{***}p<0.001$, $^{**}p<0.01$, $^*p<0.05$}}
\end{tabular}
\caption{Statistical models}
\label{table:coefficients}
\end{center}
\end{table}
Just to clarify: the crucial part in Robert's solution is the argument use.packages=FALSE, not the separation of code into two chunks.
The reason is the following: The way you call texreg() now makes it include the following in the tex output it produces:
\usepackage{booktabs}
\usepackage{dcolumn}
Saving the output in an object and then using cat() does not solve this.
You cannot use \usepackage{} outside the preamble. knitr still compiles a PDF, but apparently this use of \usepackage{} in the document body breaks booktabs and dcolumn even if you've loaded them in the preamble.
Add the argument use.packages=FALSE in texreg() - if set to FALSE, the use package statements are omitted from the output. Write the use package statements into the preamble of your document yourself and you'll have the regression table with booktabs and aligned numbers.
Try this:
\begin{document}
<<analysis, include=FALSE>>=
ctl <- c(4.17,5.58,5.18,6.11,4.50,4.61,5.17,4.53,5.33,5.14)
trt <- c(4.81,4.17,4.41,3.59,5.87,3.83,6.03,4.89,4.32,4.69)
group <- gl(2,10,20, labels=c("Ctl","Trt"))
weight <- c(ctl, trt)
m1 <- lm(weight ~ group)
m2 <- lm(weight ~ group - 1) # omitting intercept
table = texreg(m2,booktabs = TRUE,dcolumn = TRUE,use.packages=FALSE)
table2=texreg(list(m1,m2),booktabs = TRUE,dcolumn = TRUE,use.packages=FALSE)
@
<<table, results='asis',echo=FALSE>>=
cat(table)
cat(table2)
@
\end{document}
