If I run a linear regression with significance stars, render it through pander, and "Knit PDF", like this:
pander(lm(crimerate ~ conscripted + birthyr + indigenous + naturalized, data = data), add.significance.stars = TRUE)
I occasionally get output with odd spacing between rows of the output table.
I've tried setting pander options to report fewer digits with panderOptions('digits', 2), but the problem persists.
Does anybody have any ideas?
I had the same problem. Something is wrong with the cell alignment; the error disappeared when I changed the style to rmarkdown.
library(pander)      # provides pandoc.table()
library(data.table)
dt <- data.table(Test = c("0 - 10 000"),
ALDT = "99.18 %")
First (spacing problem in the table):
pandoc.table(dt, justify = c("left", "right"))
# From pandoc below
------------------
Test ALDT
---------- -------
0 - 10 000 99.18 %
------------------
Second (good formatting):
pandoc.table(dt, style = "rmarkdown", justify = c("left", "right"))
# From pandoc below
| Test | ALDT |
|:--------------|--------:|
| 0 - 10 000 | 99.18 % |
The first attempt doesn't work: something is wrong with the formatting pandoc gives us. If you specify the style as rmarkdown, the table is formatted as it should be.
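Applied to the regression table from the question, the same idea might look like the sketch below. It assumes the pander package is installed and uses the built-in mtcars data as a stand-in for the original data set. Setting the style globally through panderOptions('table.style', 'rmarkdown') is one way to do it, not necessarily the only one.

```r
library(pander)

# Use pipe-style (rmarkdown) tables for every subsequent pander call.
panderOptions('table.style', 'rmarkdown')

# Stand-in model: mtcars replaces the data from the question.
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Capture the markdown that pander emits, then show it.
out <- capture.output(pander(fit, add.significance.stars = TRUE))
cat(out, sep = "\n")
```

In an R Markdown document the pander() call would go into a chunk with results='asis'.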
I'm looking for nicely formatted markdown output of test results that are produced within a for loop and structured with headings. For example:
df <- data.frame(x = rnorm(1000),
y = rnorm(1000),
z = rnorm(1000))
for (v in c("y","z")) {
cat("##", v, " (model 0)\n")
summary(lm(x~1, df))
cat("##", v, " (model 1)\n")
summary(lm(as.formula(paste0("x~1+",v)), df))
}
whereas the output should be
y (model 0)
Call:
lm(formula = x ~ 1, data = df)
Residuals:
Min 1Q Median 3Q Max
-3.8663 -0.6969 -0.0465 0.6998 3.1648
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.05267 0.03293 -1.6 0.11
Residual standard error: 1.041 on 999 degrees of freedom
y (model 1)
Call:
lm(formula = as.formula(paste0("x~1+", v)), data = df)
Residuals:
Min 1Q Median 3Q Max
-3.8686 -0.6915 -0.0447 0.6921 3.1504
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.05374 0.03297 -1.630 0.103
y -0.02399 0.03189 -0.752 0.452
Residual standard error: 1.042 on 998 degrees of freedom
Multiple R-squared: 0.0005668, Adjusted R-squared: -0.0004346
F-statistic: 0.566 on 1 and 998 DF, p-value: 0.452
z (model 0)
and so on...
Several existing answers cover parts of the question, like here or here, suggesting the asis option in combination with cat(). This one includes headers.
The closest to my request seems to be this question from two years ago. However, even though they are highly appreciated, some of the suggestions are deprecated, like asis_output, or I can't get them to work under general conditions, like the formattable suggestion (e.g. with lm output). As two years have passed since then, I wonder if there is a modern approach that facilitates what I'm looking for.
Solution Type 1
You could do a capture.output(cat(.)) approach with some lapply-looping. Send the output to a file and use rmarkdown::render(.).
This is the R code producing a *.pdf.
capture.output(cat("---
title: 'Test Results'
author: 'Tom & co.'
date: '11 10 2019'
output: pdf_document
---\n\n```{r setup, include=FALSE}\n
knitr::opts_chunk$set(echo = TRUE)\n
mtcars <- data.frame(mtcars)\n```\n"), file="_RMD/Tom.Rmd") # here of course your own data
lapply(seq(mtcars), function(i)
capture.output(cat("# Model", i, "\n\n```{r chunk", i, ", comment='', echo=FALSE}\n\
print(summary(lm(mpg ~ ", names(mtcars)[i] ,", mtcars)))\n```\n"),
file="_RMD/Tom.Rmd", append=TRUE))
rmarkdown::render("_RMD/Tom.Rmd")
Produces:
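A variation on the same idea, sketched with base R only: build the chunks as a character vector and write them to a temporary file in one go. The predictor list, the chunk labels, and the file name models.Rmd are assumptions made for illustration. The rmarkdown::render() call is left commented out so that the snippet runs without a LaTeX installation.

```r
# Build the three-backtick fence programmatically so the chunk delimiters
# do not clash with the fence around this example.
fence <- strrep("`", 3)

predictors <- c("wt", "hp", "qsec")  # hypothetical choice of predictors

# One heading plus one code chunk per model.
chunks <- vapply(seq_along(predictors), function(i) {
  paste0(
    "# Model ", i, "\n\n",
    fence, "{r chunk-", i, ", comment='', echo=FALSE}\n",
    "print(summary(lm(mpg ~ ", predictors[i], ", data = mtcars)))\n",
    fence, "\n"
  )
}, character(1))

header <- c("---", "title: 'Test Results'", "output: pdf_document", "---", "")

# Write header and chunks to a temporary .Rmd file.
rmd_path <- file.path(tempdir(), "models.Rmd")
writeLines(c(header, chunks), rmd_path)

# rmarkdown::render(rmd_path)  # uncomment to produce the PDF
```

Writing the whole file once avoids the repeated append = TRUE calls, at the cost of holding the document in memory first.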
Solution Type 2
When we want to automate the output of multiple model summaries within the R Markdown document itself, we have to choose between two options: 1. setting the chunk option results='asis', which renders # Model 1 as a headline but destroys the code formatting of the output, or 2. not setting it, which keeps the code formatting but prints # Model 1 literally instead of as a headline. The solution is to set the option and combine it with inline code formatting that we paste() together in another sapply() loop nested within the sapply() over the models.
In the main sapply() we apply @G.Grothendieck's venerable solution to nicely substitute the Call: line of the output using do.call("lm", list(.)). We need to wrap invisible(.) around it to suppress the unnecessary sapply() output [[1]] [[2]]... of the empty lists produced.
I included a ". " in the cat(), because otherwise leading white space like ` this` would be rendered as this in lines 6 and 10 of the summary outputs.
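The effect of the do.call("lm", list(.)) step on the Call: line can be checked in isolation. The following is a small sketch in base R; the variable names fit_plain and fit_docall are made up for this illustration.

```r
# Build the formula programmatically, as in a loop over predictors.
fo <- reformulate("wt", response = "mpg")

# Direct call: the stored call keeps the symbol `fo`.
fit_plain <- lm(fo, data = mtcars)

# do.call() evaluates `fo` first, so the stored call shows the real formula.
# quote(mtcars) keeps the data argument as a name instead of the full data.
fit_docall <- do.call("lm", list(fo, quote(mtcars)))

call_plain  <- paste(deparse(fit_plain$call), collapse = "")
call_docall <- paste(deparse(fit_docall$call), collapse = "")

print(call_plain)
print(call_docall)
```

The second call string contains mpg ~ wt rather than fo, which is what makes the printed summaries readable when they are generated in a loop.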
This is the R Markdown script producing a *.pdf; it can also be executed line by line like ordinary code:
---
title: "Test results"
author: "Tom & co."
date: "15 10 2019"
output: pdf_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
# Overview
This is an example of an ordinary code block with output that had to be included.
```{r mtcars, fig.width=3, fig.height=3}
head(mtcars)
```
# Test results in detail
The test results follow fully automated in detail.
```{r mtcars2, echo=FALSE, message=FALSE, results="asis"}
invisible(sapply(tail(seq(mtcars), -2), function(i) {
fo <- reformulate(names(mtcars)[i], response="mpg")
s <- summary(do.call("lm", list(fo, quote(mtcars))))
cat("\n## Model", i - 2, "\n")
sapply(1:19, function(j)
cat(paste0("`", ". ", capture.output(s)[j]), "` \n"))
cat(" \n")
}))
```
***Note:*** This is a concluding remark to show that we still can do other stuff afterwards.
Produces:
(Note: Site 3 omitted)
Context
I was hit by the same need as that of OP when trying to generate multiple plots in a loop, but one of them would apparently crash the graphical device (because of unpredictable bad input) even when called using try() and prevent all the remaining figures from being generated. I needed really independent code blocks, like in the proposed solution.
Solution
I've thought of preprocessing the source file before it was passed to knitr, preferably inside R, and found that the jinjar package was a good candidate. It uses a dynamic template syntax based on the Jinja2 templating engine from Python/Django. There are no syntax clashes with document formats accepted by R Markdown, but the tricky part was integrating it nicely with its machinery.
My hackish solution was to create a wrapper rmarkdown::output_format() that executes some code inside the rmarkdown::render() call environment to process the source file:
preprocess_jinjar <- function(base_format) {
if (is.character(base_format)) {
base_format <- rmarkdown:::create_output_format_function(base_format)
}
function(...) {
# Find the rmarkdown::render() environment.
callers <- sapply(sys.calls(), function(x) deparse(as.list(x)[[1]]))
target <- grep('^(rmarkdown::)?render$', callers)
target <- target[length(target)] # render may be called recursively
render_envir <- sys.frames()[[target]]
# Modify input with jinjar.
input_paths <- evalq(envir = render_envir, expr = {
original_knit_input <- sub('(\\.[[:alnum:]]+)$', '.jinjar\\1', knit_input)
file.rename(knit_input, original_knit_input)
input_lines <- jinjar::render(paste(input_lines, collapse = '\n'))
writeLines(input_lines, knit_input)
normalize_path(c(knit_input, original_knit_input))
})
# Add an on_exit hook to revert the modification.
rmarkdown::output_format(
knitr = NULL,
pandoc = NULL,
on_exit = function() file.rename(input_paths[2], input_paths[1]),
base_format = base_format(...)
)
}
}
Then I can call, for example:
rmarkdown::render('input.Rmd', output_format = preprocess_jinjar('html_document'))
Or, more programmatically, with the output format specified in the source file metadata as usual:
html_jinjar <- preprocess_jinjar('html_document')
rmarkdown::render('input.Rmd')
Here is a minimal example for input.Rmd:
---
output:
html_jinjar:
toc: false
---
{% for n in [1, 2, 3] %}
# Section {{ n }}
```{r block-{{ n }}}
print({{ n }}**2)
```
{% endfor %}
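To see what knitr receives after preprocessing, the template logic can be run on its own. This is a minimal sketch that assumes the jinjar package is installed. The template is a reduced version of the example above, and the variable name sections is introduced here for illustration only.

```r
library(jinjar)

fence <- strrep("`", 3)  # avoids literal chunk fences inside this example

# Same structure as input.Rmd, with the loop values passed in as data.
template <- paste(
  "{% for n in sections %}",
  "# Section {{ n }}",
  paste0(fence, "{r block-{{ n }}}"),
  "print({{ n }}**2)",
  fence,
  "{% endfor %}",
  sep = "\n"
)

# Expand the loop; each iteration yields its own independent chunk.
expanded <- jinjar::render(template, sections = 1:3)
cat(expanded)
```

Because every iteration becomes a separate chunk in the expanded source, a failure in one chunk is handled by knitr's usual per-chunk error options rather than aborting a single long loop.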
Caveats
It's a hack. This code depends on the internal logic of rmarkdown::render() and there are likely edge cases where it won't work. Use at your own risk.
For this solution to work, the output format constructor must be called by render(). Therefore, evaluating it before passing it to render() will fail:
render('input.Rmd', output_format = 'html_jinjar') # works
render('input.Rmd', output_format = html_jinjar) # works
render('input.Rmd', output_format = html_jinjar()) # fails
This second limitation could be circumvented by putting the preprocessing code inside the pre_knit() hook, but then it would only run after other output format hooks, like intermediates_generator() and other pre_knit() hooks of the format.
I'm creating an R Markdown document using knitr and am running into trouble using xtable to create a table. My table is very large and I'm trying to reduce the size using the size argument in the print statement. The issue I'm running into is that the argument seems to add two extra curly braces which show up in the PDF, one before the table and one after.
Does anyone know a way to fix this?
MWE:
---
output:
pdf_document:
keep_tex: yes
tables: true
---
```{r, results='asis', echo=FALSE}
library(xtable)
my.df <- data.frame(matrix(c(1:18),nrow=2))
glossaryprint <- xtable(my.df, caption="Summary of Terms")
print(glossaryprint,
comment=FALSE,
floating=FALSE,
size="footnotesize"
)
```
Note: The issue that is subject of the question and this answer has been resolved in xtable 1.8-2.
Although the question has been self-answered with a workaround, I think for other users some more details might be helpful.
What happens?
To understand what is happening here, we need to take a close look at the conversion steps the document undergoes on its way from RMD to PDF. The steps are:
RMD --> MD --> TEX --> PDF
Let's look at the files in reverse order:
PDF: (generated from TEX by pdflatex)
TEX: (generated from MD by pandoc)
% …
\{\footnotesize
\begin{tabular}{rrrr}
\hline
& X1 & X2 & X3 \\
\hline
1 & 1 & 3 & 5 \\
2 & 2 & 4 & 6 \\
\hline
\end{tabular}
\}
% …
MD: (generated from RMD by knitr)
---
output:
pdf_document:
keep_tex: yes
---
{\footnotesize
\begin{tabular}{rrrr}
\hline
& X1 & X2 & X3 \\
\hline
1 & 1 & 3 & 5 \\
2 & 2 & 4 & 6 \\
\hline
\end{tabular}
}
RMD: (source file)
---
output:
pdf_document:
keep_tex: yes
---
```{r, results='asis', echo=FALSE}
library(xtable)
mytable <- xtable(data.frame(matrix(c(1:6), nrow = 2)))
print(mytable,
comment = FALSE,
floating = FALSE,
size = "footnotesize"
)
```
Recall: The problem is that there are visible curly braces in the PDF. Where do they come from?
They are the result of the escaped curly braces in the TEX file (\{ and \}).
These curly braces also exist in the MD file, but there they are not escaped.
So we know two things by now: We see the curly braces because they are escaped and they are escaped by pandoc.
But why do these curly braces exist at all? print.xtable outputs them when a size is specified. The goal is to create a group and to apply size only within that group. (With floating = TRUE, no grouping by curly braces is required because there is a table environment within which the size is set. The curly braces are printed anyway.)
And why does pandoc escape that pair of curly braces but leave all the other curly braces (e.g. in \begin{tabular}) unescaped? This is because pandoc is supposed to escape special characters that are meant literally but leave raw LaTeX unescaped. However, pandoc does not know that the outer curly braces are LaTeX grouping syntax and not text.
(With floating = TRUE the problem does not occur because the curly braces are within a table environment which is recognized as LaTeX.)
Solutions
After having understood the problem, what can we do about it? One solution has already been posted by the OP: Abstain from specifying size in print.xtable and insert the footnotesize command manually:
---
output:
pdf_document:
keep_tex: yes
---
```{r, results='asis', echo=FALSE}
library(xtable)
mytable <- xtable(data.frame(matrix(c(1:6), nrow = 2)))
cat("\\begin{footnotesize}")
print(mytable,
comment = FALSE,
floating = FALSE
)
cat("\\end{footnotesize}")
```
However, in the long run it would be nice if xtable (current version: 1.8-0) generated LaTeX code that survives the pandoc conversion. This is quite simple: print.xtable checks if size is set and if so, inserts { before the size specification and } at the end of the table:
if (is.null(size) || !is.character(size)) {
BSIZE <- ""
ESIZE <- ""
}
else {
if (length(grep("^\\\\", size)) == 0) {
size <- paste("\\", size, sep = "")
}
BSIZE <- paste("{", size, "\n", sep = "")
ESIZE <- "}\n"
}
This small modification replaces { with \begingroup and } with \endgroup:
if (is.null(size) || !is.character(size)) {
BSIZE <- ""
ESIZE <- ""
}
else {
if (length(grep("^\\\\", size)) == 0) {
size <- paste("\\", size, sep = "")
}
BSIZE <- paste("\\begingroup", size, "\n", sep = "")
ESIZE <- "\\endgroup\n"
}
For LaTeX, this makes no difference, but as pandoc recognizes \begingroup (as opposed to {) it should solve the problem. I reported this as a bug in xtable and hopefully the issue will be fixed in future versions.
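The two variants can be compared side by side outside of xtable. The helper below is a hypothetical stand-alone function written for this illustration (size_wrappers and its group argument are not part of xtable); it only mirrors the BSIZE/ESIZE logic quoted above.

```r
# Hypothetical helper: returns the strings placed before and after the table.
size_wrappers <- function(size, group = c("begingroup", "braces")) {
  group <- match.arg(group)
  # No usable size: nothing is wrapped around the table.
  if (is.null(size) || !is.character(size)) {
    return(list(begin = "", end = ""))
  }
  # Accept both "footnotesize" and "\\footnotesize".
  if (!grepl("^\\\\", size)) {
    size <- paste0("\\", size)
  }
  if (group == "braces") {
    list(begin = paste0("{", size, "\n"), end = "}\n")
  } else {
    list(begin = paste0("\\begingroup", size, "\n"), end = "\\endgroup\n")
  }
}

w <- size_wrappers("footnotesize")
cat(w$begin, "% ... tabular goes here ...\n", w$end, sep = "")
```

With group = "braces" the function reproduces the original behaviour, which makes it easy to feed both variants through pandoc and compare the resulting TEX files.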
I was able to fix this by not including the size parameter in the print statement but rather directly before and after the chunk.
\begin{footnotesize}
#chunk
\end{footnotesize}
I would like to add symbols and letters before and after some numbers when using knitr's kable function, but do not know how to do this efficiently. I am, however, also willing to consider pandoc/pander if it is better or more efficient.
The end result should be an HTML table, or a very good graphic of one.
Please see the following code as a mock reproducible example that is in a .Rmd file:
### Notional and Cumulative P&L
```{r echo=FALSE}
Notional <- 10000
yday_pnl <- -2942
wtd_pnl <- 2300
mtd_pnl <- -3334
ytd_pnl <- 5024
yday_rtn <- (yday_pnl/Notional)*10000
wtd_rtn <- (wtd_pnl/Notional)*10000
mtd_rtn <- (mtd_pnl/Notional)*10000
ytd_rtn <- (ytd_pnl/Notional)*10000
Value <- c(Notional,yday_pnl,wtd_pnl,mtd_pnl,ytd_pnl)
rtn <- c(NA,yday_rtn,wtd_rtn,mtd_rtn,ytd_rtn)
COB.basics <- as.data.frame(cbind(Value,rtn))
rownames(COB.basics) <- c('Notional','yday pnl','wtd_pnl','mtd_pnl','ytd_pnl')
```
```{r results='asis',echo=FALSE}
kable(COB.basics,digits=2)
```
So, similar to Excel's currency or accounting format, I would like the Value column to have the $ sign, and the rtn column to have the string bps after the numbers. Also, for readability, is it possible to have a comma after every three digits before the decimal point, i.e. as a thousands separator?
Also, is it possible to colour the cells, and to colour the text/numbers too, e.g. red for negative values?
Partial solution with pander:
Set "big mark" for pander so that it would be used for all numbers:
panderOptions('big.mark', ',')
You can also set the table syntax to rmarkdown (optional, as rmarkdown v2 now also uses Pandoc, where the multiline format has some cool features compared to what the rmarkdown format offered before):
panderOptions('table.style', 'rmarkdown')
You can highlight some cells with e.g. which and some custom R expression:
emphasize.strong.cells(which(COB.basics > 0, arr.ind = TRUE))
Simply call pander on your data.frame:
> library(pander)
> emphasize.strong.cells(which(COB.basics > 0, arr.ind = TRUE))
> panderOptions('big.mark', ',')
> pander(COB.basics)
-----------------------------------
Value rtn
-------------- ---------- ---------
**Notional** **10,000** NA
**yday pnl** -2,942 -2,942
**wtd_pnl** **2,300** **2,300**
**mtd_pnl** -3,334 -3,334
**ytd_pnl** **5,024** **5,024**
-----------------------------------
> panderOptions('table.style', 'rmarkdown')
> pander(COB.basics)
| | Value | rtn |
|:--------------:|:-------:|:------:|
| **Notional** | 10,000 | NA |
| **yday pnl** | -2,942 | -2,942 |
| **wtd_pnl** | 2,300 | 2,300 |
| **mtd_pnl** | -3,334 | -3,334 |
| **ytd_pnl** | 5,024 | 5,024 |
To color the cells, you could add some custom HTML/CSS markup manually (or LaTeX if working with PDF in the long run), and the same holds for adding % or other symbols/strings to your cells with e.g. paste and apply. Please feel free to submit a feature request at https://github.com/Rapporter/pander
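The manual route mentioned above can be sketched in base R by turning the numbers into strings before the table is built. The helper names fmt_money, fmt_bps and red_if_negative are made up for this example, and the red colouring relies on raw HTML spans, so it is only meant for HTML output.

```r
Value <- c(10000, -2942, 2300, -3334, 5024)
rtn   <- c(NA, Value[-1] / Value[1] * 10000)

# "$" prefix, thousands separator, minus sign in front of the currency symbol.
fmt_money <- function(x) {
  paste0(ifelse(x < 0, "-", ""), "$",
         formatC(abs(x), format = "f", digits = 0, big.mark = ","))
}

# " bps" suffix; missing values become empty cells.
fmt_bps <- function(x) {
  ifelse(is.na(x), "",
         paste0(formatC(x, format = "f", digits = 0, big.mark = ","), " bps"))
}

# Wrap the text of negative entries in a red HTML span.
red_if_negative <- function(text, x) {
  neg <- !is.na(x) & x < 0
  ifelse(neg, paste0('<span style="color:red">', text, "</span>"), text)
}

COB.fmt <- data.frame(
  Value = red_if_negative(fmt_money(Value), Value),
  rtn   = red_if_negative(fmt_bps(rtn), rtn),
  row.names = c("Notional", "yday pnl", "wtd_pnl", "mtd_pnl", "ytd_pnl")
)
print(COB.fmt)

# In the .Rmd, inside a results='asis' chunk:
# knitr::kable(COB.fmt, align = "r")
```

Since the columns are character vectors at this point, the digits argument of kable no longer applies; rounding has to happen inside the formatting helpers.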
I am currently using xtable to generate LaTeX tables from R. It works fine, but in one of the tables I have significance stars attached to some of the numbers. Something like this data frame X:
1 2 3 4 5 Test1 Test2 Test3
a "1.34" "0.43" "-0.26" "0.13" "0.05" "3.35^{.}" "343^{***}" "3244^{***}"
b "2.02" "2.17" "-3.19" "4.43" "1.43" "390.1^{***}" "31.23^{***}" "24^{***}"
c "23.07" "32.1" "24.3" "3.89" "0.4" "429.38^{***}" "17.04^{***}" "2424^{***}"
d "21.48" "14.45" "14.19" "22.04" "0.15" "385.17^{***}" "2424^{***}" "2424^{***}"
I am using '^' before the stars because in LaTeX significance stars look better in that format. The other option would be:
a "1.34" "0.43" "-0.26" "0.13" "0.05" "3.35." "343***" "3244***"
b "2.02" "2.17" "-3.19" "4.43" "1.43" "390.1***" "31.23***" "24***"
# etc.
If I use xtable via:
print(xtable(X, label="X"),
size="normalsize",
include.rownames=FALSE,
include.colnames=TRUE,
caption.placement="top",
hline.after=NULL
)
I get an output like the following:
\begin{table}[ht]
\centering
{\normalsize
\begin{tabular}{llllllll}
1 & 2 & 3 & 4 & 5 & Test1 & Test2 & Test3 \\
242 & 123 & -42.3 & 0.43 & 34 & 3.35\verb|^|\{.\} # Here is the problem: \verb
& 242.58\verb|^|\{***\} & 0.06\verb|^|\{***\} \\ # etc. etc.
\end{tabular}
}
\end{table}
The problem here is the \verb which was added. If xtable didn't add it, the table would be fine for me. So my question is: Is there a way around that? I just want significance stars in the format:
^{***}
in the LaTeX table, but produced in R already, so I can quickly produce new tables in the right format.
Right now I am using the following function to create the stars, then I use 'paste' in a different function (not shown) to add them to the tests in the respective cases:
symnum(s[[p]], corr = FALSE, cutpoints = c(0, .001,.01,.05, .1, 1),
symbols = c("^{***}","^{**}","^{*}","^{.}"," "))
But maybe there is a better solution. Let me know.
Try setting sanitize.text.function = function(x) x to turn off the sanitizing of non-numeric values.
However, I would also recommend not using stars at all.
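A minimal sketch of the first suggestion, assuming the xtable package is installed. The data frame is a cut-down stand-in for X. Wrapping the superscripts in $...$ is an addition made here, since ^ is normally only valid in LaTeX math mode; it is not part of the original question.

```r
library(xtable)

# Cut-down stand-in for X; superscripts are wrapped in math mode.
X <- data.frame(
  Test1 = c("3.35$^{.}$", "390.1$^{***}$"),
  Test2 = c("343$^{***}$", "31.23$^{***}$"),
  stringsAsFactors = FALSE
)

# Pass cell contents through unchanged instead of escaping ^, { and }.
tex <- capture.output(
  print(xtable(X),
        include.rownames = FALSE,
        sanitize.text.function = function(x) x)
)
cat(tex, sep = "\n")
```

With sanitizing switched off, any other special characters in the cells (such as % or &) have to be escaped by hand.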
At the moment, I'm working with RMarkdown and Pandoc. My data.frames in R look like this:
3.538e+01 3.542e+01 3.540e+01
9.583e+00 9.406e+00 9.494e+00
2.601e+05 2.712e+05 5.313e+05
After I ran pandoc, the result looks like this:
35.380 35.420 35.400
9.583 9.406 9.494
260116.000 271217.000 531333.000
What it should look like is:
35,380 35,420 35,400
9,583 9,406 9,494
260.116 271.217 531.333
So I want commas instead of dots as the decimal mark, and no comma or dot after 260116 (numbers in the thousands). Dots to separate the thousands would be nice. Is there a way to change the appearance directly in R, or do I have to set options in knitr/markdown?
Thanks
Here's an example of some of the conversions that can be done with format():
x <- c(35.38, 35.42, 35.4, 9.583, 9.406, 9.494, 260100, 271200, 531300)
format(x, decimal.mark=",", big.mark=".", scientific=FALSE)
# [1] " 35,380" " 35,420" " 35,400" " 9,583" " 9,406"
# [6] " 9,494" "260.100,000" "271.200,000" "531.300,000"
There are several other options, such as trim, justify, and so on that might be of interest in getting your output ready for pandoc.
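When format() is applied to a whole vector, all elements share the same number of decimals, which is why the large values above end in ,000. One possible way around this, sketched here in base R, is to format each value on its own. The helper name fmt_one is made up for this example.

```r
x <- c(35.38, 9.583, 260116)

# Format a single number: comma as decimal mark, dot as thousands separator,
# no padding (trim = TRUE) and no scientific notation.
fmt_one <- function(v) {
  format(v, decimal.mark = ",", big.mark = ".", scientific = FALSE, trim = TRUE)
}

# Element-wise formatting keeps each value's own number of decimals.
formatted <- vapply(x, fmt_one, character(1))
print(formatted)
```

The result is a character vector, so it should be created as the last step before the table is handed to pandoc.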
As this question was really inspiring, I recently introduced a big.mark feature in my pander package, which can return markdown-formatted tables from R objects with predefined options, building on format by the way. Small demo:
Load the package (installed from GH until this feature gets to CRAN):
> library(pander)
Create a demo data.frame:
> x <- matrix(c(35.38, 35.42, 35.4, 9.583, 9.406, 9.494, 260100, 271200, 531300), 3, byrow = TRUE)
Set your default options: (values for US context may need to be switched)
> panderOptions('decimal.mark', ',')
> panderOptions('big.mark', '.')
Let pander do the rest:
> pander(x)
------- ------- -------
35,38 35,42 35,4
9,583 9,406 9,494
260.100 271.200 531.300
------- ------- -------
You can find and use even more options there (like the markdown syntax for the table).