How can I include hyperlinks in a table within a Sweave document?

I have a data frame containing hyperlinks that I would like to present as clickable links using Sweave. I know about xtable, but am not sure how to use it to treat the contents of a data frame as LaTeX commands.

One strategy is to use the sanitize.text.function argument of the print method for xtable objects.
Setting sanitize.text.function = function(x){x} makes print pass the contents of the data frame through unescaped, so that LaTeX later interprets them as commands:
\documentclass{article}
\usepackage{hyperref}
\begin{document}
\title{Example of how to include hyperlinks in Sweave with \texttt{xtable}}
\author{David R. Lovell}
\maketitle
<<load-packages, include=FALSE>>=
require(xtable)
@
<<read-data, tidy=FALSE>>=
hits <- read.table(textConnection(
"Count,Link,Title
1031,http://australianbioinformatics.net/jobs,Jobs
796,http://australianbioinformatics.net/,Home"),
stringsAsFactors=FALSE, sep=",", header=TRUE)
@
<<print-xtable, echo = FALSE, results = 'asis'>>=
print(
xtable(
hits,
align="rrll",
caption="Top content on \\href{http://australianbioinformatics.net}{AustralianBioinformatics.net} in May 2014."
),
include.rownames=FALSE
)
@
<<print-xtable-href, echo = FALSE, results = 'asis'>>=
linkedHits <- transform(hits, href=paste("\\href{", Link, "}{", Title, "}", sep=""))
print(
xtable(
subset(linkedHits, select=c(Count, href)),
align="rrl",
caption="Top content on \\href{http://australianbioinformatics.net}{AustralianBioinformatics.net} in May 2014,
now with added hyperlinks."
),
include.rownames=FALSE,
sanitize.text.function = function(x){x}
)
@
\end{document}
...which produces a PDF in which the titles in the second table are clickable links.
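Because sanitize.text.function = function(x){x} switches off all escaping, any LaTeX special character in a title (for example & or _) reaches LaTeX unescaped. A minimal sketch of one way to guard against that is shown here; escape_latex and make_href are hypothetical helpers that are not part of xtable, and the sketch assumes the URLs themselves contain no special characters:

```r
# Sketch: escape common LaTeX special characters in link text.
# `escape_latex` and `make_href` are hypothetical helpers, not part of xtable.
escape_latex <- function(x) {
  # Prefix each of # $ % & _ { } with a backslash.
  gsub("([#$%&_{}])", "\\\\\\1", x)
}

make_href <- function(url, text) {
  # The URL is passed through unchanged; only the visible text is escaped.
  paste0("\\href{", url, "}{", escape_latex(text), "}")
}

make_href("http://australianbioinformatics.net/jobs", "Jobs & Careers")
```

In the example above, the href column could then be built with make_href(Link, Title) instead of the bare paste() call. Backslashes, ~ and ^ are not handled by this sketch.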

Related

Run R Markdown on many different datasets and save each knitted word document separately

I created an R Markdown to check for errors in a series of datasets (e.g., are there any blanks in a given column? If so, then print a statement that there are NAs and which rows have the NAs). I have setup the R Markdown to output a bookdown::word_document2. I have about 100 datasets that I need to run this same R Markdown on and get a word document output for each separately.
Is there a way to run this same R Markdown across all of the datasets and get a new word document for each (and so they are not overwritten)? All the datasets are in the same directory. I know that the output is overwritten each time you knit the document; thus, I need to be able to save each word document according to the dataset/file name.
Minimal Example
Create a Directory with 3 .xlsx Files
library(openxlsx)
setwd("~/Desktop")
dir.create("data")
dataset <-
structure(
list(
name = c("Andrew", "Max", "Sylvia", NA, "1"),
number = c(1, 2, 2, NA, NA),
category = c("cool", "amazing",
"wonderful", "okay", NA)
),
class = "data.frame",
row.names = c(NA,-5L)
)
write.xlsx(dataset, './data/test.xlsx')
write.xlsx(dataset, './data/dataset.xlsx')
write.xlsx(dataset, './data/another.xlsx')
RMarkdown
---
title: Hello_World
author: "Somebody"
output:
  bookdown::word_document2:
    fig_caption: yes
    number_sections: FALSE
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
setwd("~/Desktop")
library(openxlsx)
# Load data for one .xlsx file. The other datasets are all in "/data".
dataset <- openxlsx::read.xlsx("./data/test.xlsx")
```
# Test for Errors
```{r test, echo=FALSE, comment=NA}
# Are there any NA values in the column?
suppressWarnings(if (TRUE %in% is.na(dataset$name)) {
na.index <- which(is.na(dataset$name))
cat(
paste(
"– There are NAs/blanks in the name column. There should be no blanks in this column. The following row numbers in this column need to be corrected:",
paste(na.index, collapse = ', ')
),
".",
sep = "",
"\n",
"\n"
)
})
```
So, I would run this R Markdown with the first .xlsx dataset (test.xlsx) in the /data directory and save the Word document. Then I would want to do the same for every other dataset in the directory (i.e., list.files(path = "./data")) and save a new Word document each time. The only thing that would change in each run is this line: dataset <- openxlsx::read.xlsx("./data/test.xlsx"). I know that I need to set up some parameters, which I can use in rmarkdown::render, but I am unsure how to do it.
I have looked at some other SO entries (e.g., How to combine two RMarkdown (.Rmd) files into a single output? or Is there a way to generate a cached version of an RMarkdown document and then generate multiple outputs directly from the cache?), but most focus on combining .Rmd files, and not running different iterations of the same file. I've also looked at Passing Parameters to R Markdown.
I have also tried the following from this. Here, all the additions were added to the example R Markdown above.
Added this to the YAML header:
params:
  directory:
    value: x
Added this to the setup code chunk:
# Pull in the data
dataset <- openxlsx::read.xlsx(file.path(params$directory))
Then, finally I run the following code to render the document.
rmarkdown::render(
input = 'Hello_World.Rmd'
, params = list(
directory = "./data"
)
)
However, I get the following error, although I only have .xlsx files in /data:
Quitting from lines 14-24 (Hello_World.Rmd) Error: openxlsx can only
read .xlsx files
I also tried this on my full .Rmd file and got the following error, although the paths are exactly the same.
Quitting from lines 14-24 (Hello_World.Rmd) Error in file(con,
"rb") : cannot open the connection
*Note: Lines 14–24 are essentially the setup section of the .Rmd.
I'm unsure of what I need to change. I also need to generate multiple output files, using the original filename (like "test" from test.xlsx, "another" from another.xlsx, etc.)
You could call render in a loop and pass each file to the document as a parameter:
dir_in <- 'data'
dir_out <- 'result'
files <- file.path(getwd(),dir_in,list.files(dir_in))
for (file in files) {
print(file)
rmarkdown::render(
input = 'Hello_World.Rmd',
output_file = tools::file_path_sans_ext(basename(file)),
output_dir = dir_out,
params = list(file = file)
)
}
Rmarkdown:
---
title: Hello_World
author: "Somebody"
output:
  bookdown::word_document2:
    fig_caption: yes
    number_sections: FALSE
params:
  file: ""
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
library(openxlsx)
# Load the .xlsx file passed in through the `file` parameter of render().
dataset <- openxlsx::read.xlsx(params$file)
```
# Test for Errors
```{r test, echo=FALSE, comment=NA}
# Are there any NA values in the column?
suppressWarnings(if (TRUE %in% is.na(dataset$name)) {
na.index <- which(is.na(dataset$name))
cat(
paste(
"– There are NAs/blanks in the name column. There should be no blanks in this column. The following row numbers in this column need to be corrected:",
paste(na.index, collapse = ', ')
),
".",
sep = "",
"\n",
"\n"
)
})
```
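The loop above avoids overwriting because each output name is derived from the input file name. A minimal base-R sketch of that naming step follows, assuming a made-up temporary directory and file names; the pattern argument is an addition that skips any stray non-.xlsx files:

```r
# Sketch: derive one output name per input file (base R only).
# The temporary directory and file names are made up for illustration.
dir_in <- file.path(tempdir(), "data_demo")
dir.create(dir_in, showWarnings = FALSE)
invisible(file.create(file.path(dir_in, c("test.xlsx", "another.xlsx", "notes.txt"))))

# full.names = TRUE returns paths that already include the directory;
# the pattern keeps only files ending in .xlsx.
files <- list.files(dir_in, pattern = "\\.xlsx$", full.names = TRUE)

# Strip the directory and the extension to get a distinct name per report.
out_names <- tools::file_path_sans_ext(basename(files))
print(out_names)
```

Each element of files is a complete path to a single workbook, which is what read.xlsx expects; passing the directory itself, as in the question, is a likely cause of the "can only read .xlsx files" error.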
An alternative using purrr rather than the for loop, but using the exact same setup as @Waldi.
Render
dir_in <- 'data'
dir_out <- 'result'
files <- file.path(getwd(),dir_in,list.files(dir_in))
purrr::map(.x = files, .f = function(file){
rmarkdown::render(
input = 'Hello_World.Rmd',
output_file = tools::file_path_sans_ext(basename(file)),
output_dir = dir_out,
params = list(file = file)
)
})
Rmarkdown
---
title: Hello_World
author: "Somebody"
output:
  bookdown::word_document2:
    fig_caption: yes
    number_sections: FALSE
params:
  file: ""
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
library(openxlsx)
# Load the .xlsx file passed in through the `file` parameter of render().
dataset <- openxlsx::read.xlsx(params$file)
```
# Test for Errors
```{r test, echo=FALSE, comment=NA}
# Are there any NA values in the column?
suppressWarnings(if (TRUE %in% is.na(dataset$name)) {
na.index <- which(is.na(dataset$name))
cat(
paste(
"– There are NAs/blanks in the name column. There should be no blanks in this column. The following row numbers in this column need to be corrected:",
paste(na.index, collapse = ', ')
),
".",
sep = "",
"\n",
"\n"
)
})
```

Create sections through a loop with knitr

See this reproducible example:
---
title: "test"
output: html_document
---
## foo
```{r}
plot(1:3)
```
## bar
```{r}
plot(4:7)
```
## baz
```{r}
plot(8:12)
```
I want to be able to automate the creation of these sections as I can't know how many they will be before going further in my analysis.
My input to get this would be :
my_list <- list(foo = 1:3, bar = 4:7, baz = 8:12)
my_fun <- plot
my_depth <- 2
And the ideal answer (though I'm welcoming any improvement) would help me build a mdapply function so that I could just run:
```{r}
mdapply(X = my_list, FUN = my_fun, title_depth = my_depth)
```
And get the same output.
The R package pander can generate Pandoc's markdown on the fly.
The key is to use the chunk option results='asis' to tell R Markdown to render pander's output as Markdown.
You just need to be careful to generate valid Markdown!
Try this:
---
title: "Test sections"
output: html_document
---
## A function that generates sections
```{r}
library(pander)
create_section <- function() {
# Inserts "## Title (auto)"
pander::pandoc.header('Title (auto)', level = 2)
# Section contents
# e.g. a random plot
plot(sample(1000, 10))
# a list, formatted as Markdown
# adding also empty lines, to be sure that this is valid Markdown
pander::pandoc.p('')
pander::pandoc.list(letters[1:3])
pander::pandoc.p('')
}
```
## Generate sections
```{r, results='asis'}
n_sections <- 3
for (i in seq_len(n_sections)) {
create_section()
}
```
It still looks hackish, but Markdown has its limits...
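If pander is not available, the same idea can be sketched with base R alone: build the Markdown text yourself and emit it with cat() inside a results='asis' chunk. The helper md_section below is hypothetical and is not part of pander or knitr:

```r
# Sketch: build Markdown section text with base R only.
# `md_section` is a hypothetical helper, not part of pander or knitr.
md_section <- function(title, level = 2, body = "") {
  # One "#" per heading level, then blank lines so the Markdown stays valid.
  paste0(strrep("#", level), " ", title, "\n\n", body, "\n\n")
}

# Inside a chunk with results='asis', cat() emits the text as Markdown.
for (nm in c("foo", "bar", "baz")) {
  cat(md_section(nm, level = 2, body = paste("Contents of section", nm)))
}
```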
It seems like I found a way!
The whole idea is to pass what would be typed by hand as a string inside of knit(text=the_string) used in inline code.
So the function basically pastes a bunch of strings together, with a bit of substitute magic to have a function that feels like it's part of the apply family.
Parameter depth decides how many # you want.
Parameter options contains the chunk options, as a vector.
A vector cannot hold logical and character values together, but that does not matter here because everything is coerced to character anyway, so c(echo= FALSE, results="hide") is fine.
I expect that it's easy to break but seems to work fine when treated gently.
---
title: "test"
output: html_document
---
```{r setup, include = FALSE}
library(knitr)
mdapply <- function(X, FUN, depth, options=""){
FUN <- as.character(substitute(FUN))
list_name <- as.character(substitute(X))
if(!identical(options, "")) # a plain != test fails in recent R versions when options has more than one element
options <- paste(",",names(options),"=",options,collapse="")
build_chunk <- function(nm)
{
paste0(
paste0(rep("#",depth), collapse=""),
" ",
nm,
"\n\n```{r", options, "}\n",
FUN,
"(", list_name, "[['", nm, "']])\n```")
}
parts <- sapply(names(X), build_chunk)
whole <- paste(parts, collapse="\n\n")
knit(text=whole)
}
```
```{r code}
my_list <- list(foo = 1:3, bar = 4:7, baz = 8:12)
```
`r mdapply(my_list, plot, 2, c(echo=FALSE))`
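To check what ends up in the chunk header, the option-formatting step can be tried on its own. The sketch below uses a hypothetical helper, format_chunk_options, that mirrors the paste() call in mdapply. Since knitr treats chunk options as R expressions, it is probably safer to give character values their own quotes inside the string, as assumed here:

```r
# Sketch: turn a named vector of chunk options into header text.
# `format_chunk_options` is a hypothetical helper mirroring mdapply's paste() call.
format_chunk_options <- function(options) {
  if (length(options) == 0) return("")
  # Mixed logical/character input has already been coerced to character by c().
  paste0(", ", names(options), " = ", options, collapse = "")
}

# Character values carry their own quotes so they survive into the header.
header_opts <- format_chunk_options(c(echo = FALSE, results = "'hide'"))
cat("```{r", header_opts, "}\n", sep = "")
```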
I would actually suggest a solution that works a little differently, i.e. create the R Markdown file from an R script and then render it from the same R script:
# function that creates the markdown header
rmd_header <- function(title){
paste0(
"---
title: \"", title, "\"
output: html_document
---
"
)
}
# function that creates the Rmd code for the plots
rmd_plot <- function(my_list, my_fun){
paste0(
"
## ", names(my_list), "
```{r}
", deparse(substitute(my_fun)), "(", deparse(substitute(my_list)), "[[", seq_along(my_list), "]])
```
"
)
}
# your objects
my_list <- list(foo = 1:3, bar = 4:7, baz = 8:12)
my_fun <- plot
my_depth <- 2 # I actually don't get what this is for
# now write everything into an rmd file
cat(rmd_header("Your Title")
, rmd_plot(my_list, plot)
, file = "test.rmd")
# and then create the html from that
rmarkdown::render("test.rmd", output_file = "test.html")
One thing to mention here: the indentation in the Rmd file does matter, so when you copy the code here, make sure that RStudio inserts it in the R script as intended (because often it doesn't).
Taking a similar approach to @Georgery... but in a somewhat over-engineered fashion (also somewhat more general?). Anyway, here it goes.
make_template <- function(my_list, my_fun, my_depth, my_title, my_output_type, my_template_file){
require(glue)
n <- length(my_list)
# --- Rmd header ---
make_header <- function(my_title, my_output_type){
#
my_header <- glue(
"---", "\n",
"title: ", deparse({my_title}), "\n",
"output: ", deparse({my_output_type}), "\n",
"---", "\n",
"\n",
"\n"
)
return(my_header)
}
# --- one section only ---
make_section <- function(i){
one_section <- glue(
"\n",
"\n",
paste0(rep("#", times = {my_depth}), collapse = ""), " ", names({my_list})[[i]], "\n",
"\n",
"```{{r}}", "\n",
paste0({my_fun}, "(", deparse({my_list}[[i]]), ")"), "\n",
"```", "\n",
"\n",
"\n"
)
return(one_section)
}
# --- produce whole template ---
my_header <- make_header(my_title, my_output_type)
all_my_sections <- ""
for (i in seq_along(my_list)) {
all_my_sections <- paste0(all_my_sections, make_section(i))
}
my_template <- paste0(my_header, "\n", "\n", all_my_sections)
# --- write out
cat(my_template, file = my_template_file)
}
# --- try it
make_template(my_list = list(foo = 1:3, bar = 4:7, baz = 8:12, glop = 1:7),
my_fun = "plot",
my_depth = 4,
my_title = "super cool title",
my_output_type = "html_document",
my_template_file = "my_template_file.Rmd"
)

R Markdown - SparkTables not rendering

I am trying to render a sparktable using Rmarkdown. But the output always comes out in raw html or tex format. This depends on whether I am rendering a PDF or an HTML. Not sure what to do here?
library(sparkTable)
data("AT_Soccer")
content <- list(
function(x) {sum(x)},
function(x) {round(sum(x),2)},
function(x) {round(sum(x), 2)},
newSparkLine(lineWidth = 2,pointWidth = 6),
newSparkBar()
)
names(content) <- c("Points","ShotGoal","GetGoal","GoalDiff","winLose")
vars <- c("points","shotgoal","getgoal","goaldiff","wl")
stab <- newSparkTable(AT_Soccer,content,vars)
export(stab, outputType = "html") ### For HTML R-Markdown files
export(stab, outputType = "tex") #### For PDF R-Markdown files
My output (for html files) looks like:
The pdf output is:
I am trying to get the actual sparktable. I have been able to render the actual table like this:
showSparkTable(stab)
However, that opens the spark table within the Shiny framework. I'm trying to produce multiple rmarkdown documents with spark tables.
I took this example from: https://journal.r-project.org/archive/2015-1/templ-kowarik-meindl.pdf. Page 29.
Solution for HTML
Setting this worked for me. Thanks to Martin. Still stuck on the pdf one though.
knitr::opts_chunk$set(results = 'asis')
After studying the documentation a bit I summarize what I learned about including sparkTables inside Rmd documents:
1. For HTML documents (outputType = 'html'):
Just as I said use the chunk option results = 'asis'.
2. For PDF documents (outputType = 'tex'):
You also need the option above for PDF documents. But if you don't use it, you will see the plain LaTeX that is generated by export().
At the very bottom of that output you will find an important hint:
## Information: please do not forget to add the following command before \begin{document} in your tex-fi
##
## \newcommand{\graph}[3]{ \raisebox{-#1mm}{\includegraphics[height=#2em]{#3}}}
So what we have to do here is to
include that line of LaTeX in our preamble,
add results = 'asis' to the code chunk,
and set the argument infonote of export() to FALSE.
The last point prevents another error that the LaTeX compiler would throw (namely that we already have defined the command \graph).
What follows is a working example for a PDF document:
---
title: "Plotting Plots Under Code"
author: "Martin"
date: "February 1, 2017"
output: pdf_document
header-includes:
- \newcommand{\graph}[3]{ \raisebox{-#1mm}{\includegraphics[height=#2em]{#3}}}
---
```{r setup, echo = F, warning = F, message = F, results = 'asis'}
library(sparkTable)
data('AT_Soccer')
content <- list(
function(x) {sum(x)},
function(x) {round(sum(x), 2)},
function(x) {round(sum(x), 2)},
newSparkLine(lineWidth = 2, pointWidth = 6),
newSparkBar()
)
names(content) <- c('Points', 'ShotGoal', 'GetGoal', 'GoalDiff', 'winLose')
vars <- c('points', 'shotgoal', 'getgoal', 'goaldiff', 'wl')
stab <- newSparkTable(AT_Soccer, content, vars)
export(stab, outputType = 'tex', infonote = F)
```
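When the same Rmd has to be rendered to both HTML and PDF, the outputType could be chosen at knit time instead of being edited by hand. This is only a sketch: spark_output_type is a hypothetical helper, and it relies on knitr::is_latex_output() to detect a PDF/LaTeX target.

```r
# Sketch: choose the sparkTable export type from the output format.
# `spark_output_type` is a hypothetical helper; its default calls
# knitr::is_latex_output(), which is only evaluated when `latex` is not supplied.
spark_output_type <- function(latex = knitr::is_latex_output()) {
  if (isTRUE(latex)) "tex" else "html"
}

# In a chunk with results = 'asis' one would then call, for example:
# export(stab, outputType = spark_output_type())
spark_output_type(latex = TRUE)
```

For the PDF case, the \graph command in the preamble and infonote = FALSE are still needed as described above.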

Iteratively producing latex tables in knitr

I'm working on iteratively producing LaTeX tables using knitr. All is well except I'm left with extra markup before each table. Here's a simple example, though this would ideally work as a template for more complex problems, ie different-size tables, varying data sets etc.
What can I do to get rid of the extra text before each table?
\documentclass{article}
\usepackage{setspace, relsize}
\usepackage[margin=.5in, landscape]{geometry}
\usepackage{pdfpages}
\begin{document}
<<setup, include=FALSE>>=
opts_chunk$set(echo=FALSE, warning = FALSE, message = FALSE, cache = FALSE, error = FALSE)
library("ggplot2")
library("knitr")
library("Hmisc")
mytable_function = function(mydata){
foo = as.matrix(head(mydata))
colnames(foo) = names(mydata)
rownames(foo) = c("First", "Second", "Third", "Fourth", "Fifth", "Sixth")
return(foo)
}
template <- "<<thisthing-{{i}}>>=
mytable = mytable_function(iris[iris$Species == unique(iris$Species)[i],])
latex(mytable, file = '',
title = '',
where = '!h',
caption = 'This is a table',
col.just = rep('r', ncol(mytable)))
@"
for(i in 1:3){
cat(knit(text = knit_expand(text = template, i = i, quiet = TRUE)))
}
@
\end{document}
Fwiw here's a similar question I asked a while ago but because I'm producing tables here and not figures I think this is a slightly different solution.
Print a list of dynamically-sized plots in knitr
The provided code does not match the output you presented. Actually, it produces no output whatsoever.
Step 0: Reproduce output from the question
include=FALSE on the only chunk in the document is quite fatal … replace by echo=FALSE.
The main chunk (setup) as well as the template chunk need results="asis".
message=FALSE should be a chunk option of setup. Setting it as default options within setup won't affect messages from the current chunk.
Step 1: Immediate issue
This line
cat(knit(text = knit_expand(text = template, i = i, quiet = TRUE)))
should be
cat(knit(text = knit_expand(text = template, i = i), quiet = TRUE))
quiet is an argument of knit, not knit_expand.
Step 2: A better solution
Although this works, it is overkill. The answer you linked to generated chunks dynamically because fig.height is not vectorized in the way that case required. Here, we can just use a single chunk:
\documentclass{article}
\begin{document}
<<setup, echo = FALSE, results='asis', message = FALSE>>=
mytable_function = function(mydata){
foo = as.matrix(head(mydata))
colnames(foo) = names(mydata)
rownames(foo) = c("First", "Second", "Third", "Fourth", "Fifth", "Sixth")
return(foo)
}
for(i in 1:3){
mytable = mytable_function(iris[iris$Species == unique(iris$Species)[i],])
Hmisc::latex(mytable,
file = '',
title = '',
where = '!h',
caption = 'This is a table',
col.just = rep('r', ncol(mytable)))
}
@
\end{document}
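Since the question mentions varying data sets, the hard-coded 1:3 could be avoided by splitting the data first. The following is a rough sketch under that assumption; it only builds the matrices, and each one could then be passed to Hmisc::latex() inside the results='asis' chunk:

```r
# Sketch: build one table per species without hard-coding the number of groups.
mytable_function <- function(mydata) {
  foo <- as.matrix(head(mydata))
  # Label only as many rows as head() actually returned.
  rownames(foo) <- c("First", "Second", "Third",
                     "Fourth", "Fifth", "Sixth")[seq_len(nrow(foo))]
  foo
}

# split() returns a named list with one data frame per level of Species.
tables <- lapply(split(iris, iris$Species), mytable_function)

# Show the dimensions of each generated table.
str(lapply(tables, dim))
```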

Control multiple column header formatting parameters xtable

I have this example below where I create a basic data frame then output it to PDF through LaTeX using knitr. I am controlling some of the basic formatting and have used the function bold to bold the column headers and have redefined my column names through colnames().
However, there is a function called spaces which on its own will effectively remove spaces from the column headers. Is there a way to combine bold and spaces and pass it into sanitize.colnames.function to avoid renaming the column headers manually?
The intent is to basically manipulate the data frame in R to make it as dynamic as possible when data frames (printed to tables) change during analysis.
Thanks
\documentclass{article}
\begin{document}
%\SweaveOpts{concordance=TRUE}
hello here is a table - Table \ref{tab:mydataframe}
<<testtable, cache=FALSE, eval=TRUE, echo=FALSE, results = 'asis', warning=FALSE, error=FALSE, message=FALSE>>=
library(xtable)
library(printr)
library(knitr)
library(dplyr)
df_one = data.frame(column_one = c(1:10), column_two = c(11:20))
bold = function(x) {paste('{\\textbf{',x,'}}', sep ='')}
spaces = function(x){gsub("\\."," ",x)} # THIS IS NOT USED
colnames(df_one) = c('Column one', 'Column two')
print(xtable(df_one, type = 'latex', caption ='My test dataframe', label ='tab:mydataframe', align='lll', digits = 5), caption.placement ='top', include.rownames = FALSE, table.placement = '!h', sanitize.colnames.function = bold)
@
\end{document}
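One possible approach, sketched here rather than offered as a definitive answer, is to wrap the two helpers in a single function and pass that to sanitize.colnames.function. The name bold_spaces is made up for illustration:

```r
# Sketch: compose the two helpers from the question into one sanitizer.
bold   <- function(x) paste0("{\\textbf{", x, "}}")
spaces <- function(x) gsub("\\.", " ", x)

# Apply spaces() first, then bold(), so dots are replaced before wrapping.
bold_spaces <- function(x) bold(spaces(x))

bold_spaces(c("column.one", "column.two"))
```

It would then be used as print(xtable(df_one), sanitize.colnames.function = bold_spaces). Note that the example data frame uses underscores (column_one) rather than dots, so the pattern in spaces would need to be something like "[._]" for this to take effect there.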
