My data frame has ugly column names, but when displaying the table in my report, I want to show their "real" names, including special characters such as '(', new lines, Greek letters, repeated names, etc.
Is there an easy way of replacing the names in knitr to allow such formatting?
Proposed solution
What I have tried to do is suppress the printing of the data frame names and use add_header_above for better names and names that span several columns. Some advice I've seen says to use:
x <- kable(df)
gsub("<thead>.*</thead>", "", x)
to remove the column names. That's fine, but the issue is that when I subsequently add_header_above, the original column names come back. If I use col.names=rep('',times=ncol(d.df)) in kable(...) the names are gone but the row remains, leaving a gap between my new column names and the table body. Here's a code chunk to illustrate:
```{r functions,echo=T}
drawTable <- function(d.df,caption='Given',hdr.above){
  require(knitr)
  require(kableExtra)
  require(dplyr)
  hdr.2 <- rep(c('Value','Rank'),times=ncol(d.df)/2)
  x <- knitr::kable(d.df,format='latex',align='c',
                    col.names=rep('',times=ncol(d.df))) %>%
    kable_styling(bootstrap_options=c('striped','hover',
                                      'condensed','responsive'),position='center',
                  font_size = 9,full_width=F)
  x %>% add_header_above(hdr.2) %>%
    add_header_above(hdr.above)
}
```
```{r}
df <- data.frame(A=c(1,2),B=c(4,2),C=c(3,4),D=c(8,7))
hdr.above <- c('A2','B2','C2','D2')
drawTable(df,hdr.above = hdr.above)
```
I am not sure where you got the advice to remove the column names, but it seems excessively complex. It is much easier to use the built-in col.names argument of kable. This solution works for both HTML and LaTeX outputs:
---
output:
  pdf_document: default
  html_document: default
---
```{r functions,echo=T}
require(knitr)
df <- data.frame(A=c(1,2),B=c(4,2),C=c(3,4),D=c(8,7))
knitr::kable(df,
             col.names = c("Space in name",
                           "(Special Characters)",
                           "$\\delta{m}_1$",
                           "Space in name"))
```
PDF output:
HTML output:
If you're targeting HTML, then Δ is an option too.
I couldn't get the accepted answer to work on HTML, so used the above.
Related
I have a table in a Rmd file to print to pdf in which I need to add a dagger symbol into a column header. The basic test code is:
---
title: "Untitled"
author: "L. G. Hunsicker"
date: "4/29/2022"
output: pdf_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = F)
library(magrittr)
library(knitr)
library(kableExtra)
```
```{r test}
df <- data.frame(x = 1:10)
names(df)[1] <- 'maxCpep (ng/mL) †'
df %>% kbl()
```
I have tried several ways to get the dagger symbol to print, but with each effort, the code for the dagger symbol is either printed literally as text or raises an error (with the backslash). I have tried using bquote, \dagger, \\dagger, etc. I also tried to use a superscript "a" as an alternative. Again, kbl just prints the literal code text without substituting the required symbol. There must be a straightforward way to insert special characters or math strings into a kable header, but I have been unable to find it.
Thanks in advance for any help with this issue.
Larry Hunsicker
You can directly add a Unicode escape:
df <- data.frame(x = 1:10)
names(df)[1] <- 'maxCpep (ng/mL) \u2020'
df %>% kbl()
Use intToUtf8 to get the special character.
names(df)[1] <- paste('maxCpep (ng/mL)', intToUtf8(8224))
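The two suggestions above should produce the same character. A minimal base-R check, assuming a UTF-8 session (the variable names are only for illustration):

```r
# U+2020 is the dagger; 8224 is the same code point written in decimal.
from_escape    <- "\u2020"
from_codepoint <- intToUtf8(8224)

# Hex 2020 converted to decimal, to show where 8224 comes from.
code_point <- strtoi("2020", base = 16L)

# Build the header text either way; both strings should be equal.
header_a <- paste("maxCpep (ng/mL)", from_escape)
header_b <- paste("maxCpep (ng/mL)", from_codepoint)
print(header_a == header_b)
```

Whether the PDF actually shows the glyph still depends on the LaTeX engine and font in use, which this check does not cover.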
I have a really wide table (300+ columns) and would like to display it by wrapping the columns. In the example I will just use 100 columns.
What I have in mind is repetitively using kable to display the subset of the table:
library(kableExtra)
set.seed(1)
data = data.frame(matrix(rnorm(300, 10, 1), ncol = 100))
kable(data[, 1:5], 'latex', booktabs = T)
kable(data[, 6:10], 'latex', booktabs = T)
kable(data[, 11:15], 'latex', booktabs = T)
But this is obviously tedious... I know there are options for scaling the table down, but with this many columns that won't be practical.
Is there any parameter I can tweak in kable to make it happen?
Updated:
@jay.sf's answer seems to work well, but it didn't yield the same result here. Instead I got some plain code. Could you please have a second look and let me know what I can improve? Thanks!
my sessionInfo() is: R version 3.5.1 (2018-07-02) with rmarkdown::pandoc_version() of 1.19.2.1.
This question is actually trickier than I thought at first glance. I used some tidyverse functions, specifically dplyr::select to get columns and purrr::map to move along groups of column indices.
My thinking with this was to make a list of vectors of column indices to choose, such that the first list item is 1:5, the second is 6:10, and so on, in order to break the data into 20 tables of 5 columns each (the number you use can be a different factor of ncol(data)). I underestimated the work to do that, but got ideas from an old SO post to rep the numbers 1 to 20 along the number of columns, sort the result, and use that as the grouping to split the columns.
Then each of those vectors becomes the column indices in select. The resulting list of data frames each gets passed to knitr::kable and kableExtra::kable_styling. Leaving things off there would get map's default of printing names as well, which isn't ideal, so I added a call to purrr::walk to print them neatly.
Note also that making the kable'd tables this way meant putting results="asis" in the chunk options.
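To see what the grouping step produces on its own, here is a small base-R sketch without any table rendering. The names n_cols, n_groups, grouping and col_groups are only for illustration:

```r
n_cols   <- 100  # number of columns in the example data
n_groups <- 20   # number of sub-tables wanted

# Repeat 1..20 along the columns, then sort so equal labels are adjacent.
grouping <- sort(rep_len(seq_len(n_groups), n_cols))

# Split the column indices by label: one integer vector per sub-table.
col_groups <- split(seq_len(n_cols), grouping)

print(length(col_groups))
print(col_groups[[1]])
print(col_groups[[2]])
```

If n_groups does not divide n_cols evenly, the groups will simply differ in size by one column.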
---
title: "knitr chunked"
output: pdf_document
---
```{r include=FALSE}
library(knitr)
library(kableExtra)
library(dplyr)
library(purrr)
set.seed(1)
data = data.frame(matrix(rnorm(300, 10, 1), ncol = 100))
```
```{r results='asis'}
split(1:ncol(data), sort(rep_len(1:20, ncol(data)))) %>%
  map(~select(data, .)) %>%
  map(kable, booktabs = T) %>%
  map(kable_styling) %>%
  walk(print)
```
Top of the PDF output:
You could build a matrix of column numbers and loop over its columns in a for loop, printing each sub-table with the cat function.
---
output: pdf_document
---
```{r, results="asis", echo=FALSE}
library(kableExtra)
set.seed(1)
dat <- data.frame(matrix(rnorm(300, 10, 1), ncol=100))
m <- matrix(1:ncol(dat), 5)
for (i in 1:ncol(m)) {
  cat(kable(dat[, m[, i]], 'latex', booktabs=TRUE), "\\newline")
}
```
Result
I have the following dataframe:
site_name | site_url
--------------------| ------------------------------------
3D Printing | https://3dprinting.stackexchange.com
Academia | https://academia.stackexchange.com
Amateur Radio | https://ham.stackexchange.com
I want to generate a third column with the link integrated with the text. In HTML I came up with the following pseudo code:
df$url_name <- "[content of site_name](content of site_url)"
resulting in the following working program code:
if (knitr::is_html_output()) {
df <- df %>% dplyr::mutate(url_name = paste0("[", df[[1]], "](", df[[2]], ")"))
knitr::kable(df)
}
Is there a way to do this in LaTeX with knitr as well?
(I would prefer a solution compatible with kableExtra, but if that is not possible I am ready to learn whatever table package can do this.)
*** ADDED: I just noticed that the above code works within a normal .Rmd document with the yaml header output: pdf_document. But not in my bookdown project.
The problem is with knitr::kable. It doesn't recognize that the bookdown project needs Markdown output, so you need to tell it that explicitly:
df <- df %>% dplyr::mutate(url_name = paste0("[", df[[1]], "](", df[[2]], ")"))
knitr::kable(df, format = "markdown")
This will work for any kind of Markdown output: html_document, pdf_document, bookdown::pdf_book, etc.
Alternatively, if you need LaTeX output for some other part of the table, you could write the LaTeX equivalent. This won't work for HTML output, of course, but should be okay for the PDF targets:
df <- df %>% dplyr::mutate(urlName = paste0("\\href{", df[[2]], "}{", df[[1]], "}"))
knitr::kable(df, format = "latex", escape = FALSE)
For this one I had to change the column name; underscores are special in LaTeX. You could probably get away without doing that if you left it as format = "markdown", but then you'd probably be better off using the first solution.
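The two link formats can be compared side by side in plain R. In this sketch, make_link() is a hypothetical helper (not part of knitr or kableExtra), and the data frame just repeats two rows from the question:

```r
sites <- data.frame(
  site_name = c("3D Printing", "Academia"),
  site_url  = c("https://3dprinting.stackexchange.com",
                "https://academia.stackexchange.com"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: build a link string for the chosen table format.
# It does no escaping, so LaTeX-special characters (%, #, _, &) in the
# text or URL would still need handling.
make_link <- function(text, url, format = c("markdown", "latex")) {
  format <- match.arg(format)
  if (format == "latex") {
    paste0("\\href{", url, "}{", text, "}")
  } else {
    paste0("[", text, "](", url, ")")
  }
}

md_links  <- make_link(sites$site_name, sites$site_url, "markdown")
tex_links <- make_link(sites$site_name, sites$site_url, "latex")
print(md_links)
print(tex_links)
```

The markdown strings would go to kable(..., format = "markdown") and the LaTeX strings to kable(..., format = "latex", escape = FALSE), as in the answer above.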
I am creating a pandoc.table with a long column name that I want to wrap so my table does not run off the PDF page. I know you can use split.tables, but that takes away the clarity of the table. Using split.cells doesn't seem to do anything, even when supplied as a vector.
---
output : pdf_document
---
```{r,echo=FALSE, results="asis"}
library(pander)
mytab = data.frame(ReallySuperExtraLongColumnNameThatIWantToWrap=1:2, col2=2001:2002)
pandoc.table(mytab)
```
The following will produce a table with a line break in the header:
```{r,echo=FALSE, results="asis"}
library(pander)
mytab = data.frame("ReallySuperExtraLongColumn\nNameThatIWantToWrap"=1:2,
col2=2001:2002,
check.names = FALSE)
pandoc.table(mytab)
```
The line break is encoded with \n. This is not an allowed character in a column name, and the data.frame() function would normally replace it with a dot. You can use check.names = FALSE to suppress this behaviour and keep the column names exactly as you entered them. Alternatively, you could redefine the column name on a separate line:
mytab = data.frame(ReallySuperExtraLongColumnNameThatIWantToWrap=1:2, col2=2001:2002)
names(mytab)[1] = "ReallySuperExtraLongColumn\nNameThatIWantToWrap"
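The effect of check.names can be checked in isolation, without pander. A minimal sketch with a shortened, made-up column name:

```r
# Default behaviour: names are passed through make.names(), so the
# line break is turned into a dot.
checked <- data.frame("Long\nName" = 1:2, col2 = 3:4)

# With check.names = FALSE the name is kept exactly as typed.
unchecked <- data.frame("Long\nName" = 1:2, col2 = 3:4,
                        check.names = FALSE)

print(names(checked))
print(names(unchecked))
```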
You can also set the width of the cells with split.cells. The line breaks are then generated automatically, but they only appear where there is a space in your column header. An example:
```{r,echo=FALSE, results="asis"}
library(pander)
mytab = data.frame("Really Super Extra Long Column Name That I Want To Wrap"=1:2,
col2=2001:2002,
check.names = FALSE)
pandoc.table(mytab, split.cells = 15)
```
This gives breaks after "Extra" and "Name". Note that you still need check.names = FALSE, because spaces are also not allowed in data frame names.
I am trying to combine two tables in R Markdown into a single table, one below the other & retaining the header. The figure below shows the desired output. After putting my markdown code I will show the actual output. I realize that the way I have structured the pander statements will not allow me to get the output I want but searching SO I was unsuccessful in finding the right way to do so.
I can do some post processing in Word to get the output exactly as I want but I am trying to avoid that overhead.
The testdat.RData file is here: https://drive.google.com/file/d/0B0hTmthiX5dpWDd5UTdlbWhocVE/view?usp=sharing
The R Markdown RMD file is here: https://drive.google.com/file/d/0B0hTmthiX5dpSEFIcGRNQ1MzM1E/view?usp=sharing
Desired Output
```{r,echo=FALSE,message = FALSE, tidy=TRUE}
library(pander)
load("testdat.RData")
pander::pander(t1,big.mark=',', justify=c('left','right','right','right'))
pander::pander(t2,big.mark=',', justify=c('left','right','right','right'))
```
Actual Output
Thanks,
Krishnan
Here's my attempt using the xtable package:
```{r,echo=FALSE, message = FALSE, results="asis"}
library(xtable)
# Add thousands separator
t1[-1] = sapply(t1[-1], formatC, big.mark=",")
t2[-1] = sapply(t2[-1], formatC, big.mark=",")
t1$Mode = as.character(t1$Mode)
# Bind together t1, extra row of column names, and t2
t1t2 = rbind(t1, names(t1), t2)
# Render the table using xtable
print(xtable(t1t2, align="rrrrr"), # Right-align all columns (includes extra value for row names)
      include.rownames=FALSE, # Don't print rownames
      hline.after=NULL,
      # Add midrules before/after each set of column names
      add.to.row = list(pos = list(-1,0,4,5),
                        command = rep("\\midrule \n",4)))
```
And here's the output:
Allow me to make a formal answer since my comment seemed to work for you.
pander(rbind(t1,names(t2),t2))
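The original testdat.RData is not reproduced here, so the following sketch uses two made-up tables with character columns (numbers already formatted as text) to show what the rbind step does before the result is handed to pander:

```r
# Made-up stand-ins for t1 and t2; the real data are not shown in the post.
t1 <- data.frame(Mode = c("Air", "Rail"), Count = c("1,200", "3,400"),
                 stringsAsFactors = FALSE)
t2 <- data.frame(Mode = c("Road", "Sea"), Count = c("5,600", "7,800"),
                 stringsAsFactors = FALSE)

# The column names of t2 become an ordinary data row between the two tables.
combined <- rbind(t1, names(t2), t2)
print(combined)
```

With numeric columns the inserted header row would force those columns to character, so formatting the numbers first (as the xtable answer does with formatC) is probably the safer order.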