How to replicate Knit HTML in a command line?

I know this question is similar to this one, but I couldn't get a solution there, so I am posting it here again.
I want to get exactly the same output as I get by clicking "Knit HTML", but via a command. I tried using knit2html, but it messes up the formatting: the title is not included, kable does not work, etc.
Example:
This is my test.Rmd file,
---
title: "test"
output: html_document
---
This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see <http://rmarkdown.rstudio.com>.
When you click the **Knit** button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. You can embed an R code chunk like this:
```{r}
library(knitr,quietly=T)
kable(summary(cars))
```
You can also embed plots, for example:
```{r, echo=FALSE}
plot(cars)
```
Note that the `echo = FALSE` parameter was added to the code chunk to prevent printing of the R code that generated the plot.
Output: [screenshots: the document as rendered by "Knit HTML" and by knit2html]

The documentation tells us:
If you are not using RStudio then you simply need to call the rmarkdown::render function, for example:
rmarkdown::render("input.Rmd")
Note that in the case using the “Knit” button in RStudio the basic mechanism is the same (RStudio calls the rmarkdown::render function under the hood).
In essence, rmarkdown::render does a lot more setup than knitr::knit2html, although I don’t have an exhaustive list of all differences.
The most flexible way of rendering the output is, at any rate, to supply your own stylesheet to format the output according to your wishes.
Please note that you need to set up Pandoc manually to work with rmarkdown::render on the command line.
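A minimal sketch of a command-line wrapper, assuming a script called `render.R` (a hypothetical name) and that Pandoc can be found, for instance because the `RSTUDIO_PANDOC` environment variable points at the directory of RStudio's bundled Pandoc:

```r
# render.R (hypothetical file name); run as: Rscript render.R test.Rmd
render_html <- function(input) {
  # rmarkdown::render() needs Pandoc; fail early with a readable message.
  if (!rmarkdown::pandoc_available()) {
    stop("Pandoc not found; install it or set RSTUDIO_PANDOC to its directory.")
  }
  # Same entry point the Knit button uses; returns the output file path.
  rmarkdown::render(input, output_format = "html_document")
}

# Only render when a file name was passed on the command line.
args <- commandArgs(trailingOnly = TRUE)
if (length(args) >= 1) {
  message("Wrote ", render_html(args[1]))
}
```

The same call also works as a one-liner from a shell: `Rscript -e 'rmarkdown::render("test.Rmd")'`.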
That said, here are two remarks that would improve the knitr::knit2html output, and that are superior to using rmarkdown::render in my opinion:
To include the title, use a Markdown title tag, not a YAML tag:
# My title
To format tables, don’t use the raw kable function. In fact, this is also true when using rmarkdown::render: the alignment of the table cells is completely off. Rmarkdown apparently uses centering as the default alignment but this option is almost never correct. Instead, you should left-align text and (generally) right-align numbers. As of this writing, Knitr cannot do this automatically (as far as I know) but it’s fairly easy to include a filter to do this for you:
```{r echo=FALSE}
library(knitr)
library(pander)
# Use this option if you don’t want tables to be split
panderOptions('table.split.table', Inf)
# Auto-adjust the table column alignment depending on data type.
alignment = function (...) UseMethod('alignment')
alignment.default = function (...) 'left'
alignment.integer = function (...) 'right'
alignment.numeric = function (...) 'right'
# Enable automatic table reformatting.
opts_chunk$set(render = function (object, ...) {
    if (is.data.frame(object) ||
        is.matrix(object)) {
        # Replicate pander’s behaviour concerning row names
        rn = rownames(object)
        # all() reduces the comparison to the single value that || requires
        justify = c(if (is.null(rn) || length(rn) == 0 ||
                        all(rn == seq_len(nrow(object)))) NULL else 'left',
                    sapply(object, alignment))
        pander(object, style = 'rmarkdown', justify = justify)
    }
    else if (isS4(object))
        show(object)
    else
        print(object)
})
```
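To see what the dispatch in that chunk computes, here is a small standalone sketch. The data frame `df` is a made-up example; the generic and its methods are copied from the chunk above.

```r
# Same S3 generic and methods as in the chunk above.
alignment <- function(...) UseMethod("alignment")
alignment.default <- function(...) "left"
alignment.integer <- function(...) "right"
alignment.numeric <- function(...) "right"

# Toy data frame (hypothetical): one text, one integer, one double column.
df <- data.frame(name = c("a", "b"), n = 1:2, x = c(1.5, 2.5))

# One alignment per column, named after the column.
justify <- sapply(df, alignment)
print(justify)
```

Text columns fall through to the default method, while integer and double columns pick up the right-aligned methods.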

Related

Using Markdown formatting in table using kable in quarto

Using Quarto's HTML output functionality, I am trying to produce a kable from a data.frame that contains some Markdown-style formatting that should show up in the final document. In the actual use case, I have a number of documents already formatted this way and I would like to re-use these commands to render the output correctly.
Here's my example.qmd:
---
title: "example"
format:
  html
---
```{r setup}
library(kableExtra)
```
```{r}
#| echo: false
data.frame(Function = "`read_delim()`",
Formula = "$\\leftarrow$",
Break = "this continues on a<br>new line",
Link = "[Google](www.google.com)") |>
kbl(format = "html")
```
After running the chunk, the preview in RStudio does display the arrow and line break correctly, but ` ` and the link fail to have an effect:
When rendering the qmd to HTML, the result looks like this, i.e. ignores the formatting:
What am I missing? Is there a way to include such formatting commands into a kable when rendering a quarto document to HTML?
When creating a table in Quarto, you can't mix Markdown with HTML - the Markdown syntax won't be processed within the HTML table.
This R code would work
data.frame(Function = "`read_delim()`",
Formula = "$\\leftarrow$",
Break = "this continues on a<br>new line",
Link = "[Google](www.google.com)") |>
kbl(format = "markdown")
So if you can, output only a Markdown table, which knitr::kable() should do by default.
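As a rough illustration of why this works, the sketch below builds the table with `knitr::kable()` and the `"pipe"` format (available in recent knitr versions) and shows that the result is plain Markdown text, which Pandoc then interprets. The shortened data frame is only for illustration.

```r
library(knitr)

df <- data.frame(
  Function = "`read_delim()`",
  Link     = "[Google](www.google.com)"
)

# "pipe" produces a Pandoc pipe table, i.e. plain Markdown text.
md_table <- kable(df, format = "pipe")

# The cell contents are passed through untouched, so Pandoc still gets
# to interpret the backticks and the link syntax.
print(md_table)
```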
If you need to output an HTML table (e.g. for specific HTML features), you need to use a framework that will render the Markdown for you while creating the HTML table:
gt with fmt_markdown() and md()
flextable with ftExtra and colformat_md() or as_paragraph_md()
It is possible that this limitation of not being able to include raw Markdown inside an HTML table will be lifted in the future (https://github.com/quarto-dev/quarto-cli/discussions/957#discussioncomment-2807907).

Can an rmarkdown code chunk switch on output type? [duplicate]

I would like to include specific content based on which format is being created. In this specific example, my tables look terrible in MS word output, but great in HTML. I would like to add in some test to leave out the table depending on the output.
Here's some pseudocode:
output.format <- opts_chunk$get("output")
if(output.format != "MS word"){
print(table1)
}
I'm sure this is not the correct way to use opts_chunk, but this is the limit of my understanding of how knitr works under the hood. What would be the correct way to test for this?
Short Answer
In most cases, opts_knit$get("rmarkdown.pandoc.to") delivers the required information.
Otherwise, query rmarkdown::all_output_formats(knitr::current_input()) and check if the return value contains word_document:
if ("word_document" %in% rmarkdown::all_output_formats(knitr::current_input())) {
  # Word output
}
Long answer
I assume that the source document is RMD because this is the usual/most common input format for knitting to different output formats such as MS Word, PDF and HTML.
In this case, knitr options cannot be used to determine the final output format because it doesn't matter from the perspective of knitr: For all output formats, knitr's job is to knit the input RMD file to a MD file. The conversion of the MD file to the output format specified in the YAML header is done in the next stage, by pandoc.
Therefore, we cannot use the package option knitr::opts_knit$get("out.format") to learn about the final output format but we need to parse the YAML header instead.
So far in theory. Reality is a little bit different. RStudio's "Knit PDF"/"Knit HTML" button calls rmarkdown::render which in turn calls knit. Before this happens, render sets a (undocumented?) package option rmarkdown.pandoc.to to the actual output format. The value will be html, latex or docx respectively, depending on the output format.
Therefore, if (and only if) RStudio's "Knit PDF"/"Knit HTML" button is used, knitr::opts_knit$get("rmarkdown.pandoc.to") can be used to determine the output format. This is also described in this answer and that blog post.
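A minimal sketch of how this option could answer the original question. `show_table_for` is a hypothetical helper name, and `table1` is the object from the question.

```r
# Hypothetical helper: TRUE unless the document is being converted to Word.
# `to` defaults to the option that rmarkdown::render() sets before knitting;
# it is NULL when knit() is called directly, in which case the table is shown.
show_table_for <- function(to = knitr::opts_knit$get("rmarkdown.pandoc.to")) {
  !identical(to, "docx")
}

# Inside a chunk one could then write:
# if (show_table_for()) print(table1)

show_table_for("html")   # TRUE
show_table_for("docx")   # FALSE
```

Passing the target as an argument with a default keeps the helper easy to try out outside of a knitting session.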
The problem remains unsolved for the case of calling knit directly because then rmarkdown.pandoc.to is not set. In this case we can exploit the (unexported) function parse_yaml_front_matter from the rmarkdown package to parse the YAML header.
[Update: As of rmarkdown 0.9.6, the function all_output_formats has been added (thanks to Bill Denney for pointing this out). It makes the custom function developed below obsolete – for production, use rmarkdown::all_output_formats! I leave the remainder of this answer as originally written for educational purposes.]
---
output: html_document
---
```{r}
knitr::opts_knit$get("out.format") # Not informative.
knitr::opts_knit$get("rmarkdown.pandoc.to") # Works only if knit() is called via render(), i.e. when using the button in RStudio.
rmarkdown:::parse_yaml_front_matter(
readLines(knitr::current_input())
)$output
```
The example above demonstrates the use(lessness) of opts_knit$get("rmarkdown.pandoc.to") (opts_knit$get("out.format")), while the line employing parse_yaml_front_matter returns the format specified in the "output" field of the YAML header.
The input of parse_yaml_front_matter is the source file as character vector, as returned by readLines. To determine the name of the file currently being knitted, current_input() as suggested in this answer is used.
Before parse_yaml_front_matter can be used in a simple if statement to implement behavior that is conditional on the output format, a small refinement is required: The statement shown above may return a list if there are additional YAML parameters for the output like in this example:
---
output:
  html_document:
    keep_md: yes
---
The following helper function should resolve this issue:
getOutputFormat <- function() {
  output <- rmarkdown:::parse_yaml_front_matter(
    readLines(knitr::current_input())
  )$output
  if (is.list(output)){
    return(names(output)[1])
  } else {
    return(output[1])
  }
}
It can be used in constructs such as
if(getOutputFormat() == 'html_document') {
# do something
}
Note that getOutputFormat uses only the first output format specified, so with the following header only html_document is returned:
---
output:
  html_document: default
  pdf_document:
    keep_tex: yes
---
However, this is not very restrictive. When RStudio's "Knit HTML"/"Knit PDF" button is used (along with the dropdown menu next to it to select the output type), RStudio rearranges the YAML header such that the selected output format will be the first format in the list. Multiple output formats are (AFAIK) only relevant when using rmarkdown::render with output_format = "all". And: In both of these cases rmarkdown.pandoc.to can be used, which is easier anyways.
Since knitr 1.18, you can use the two functions
knitr::is_html_output()
and
knitr::is_latex_output()
Just want to add a bit of clarification here, since I often render the same R Markdown file (*.Rmd) into multiple formats (*.html, *.pdf, *.docx). Rather than wanting to know whether the format of interest is listed among those specified in the YAML front matter (i.e. "word_document" %in% rmarkdown::all_output_formats(knitr::current_input())), I want to know which format is currently being rendered. To do this you can either:
Get the first element of the formats listed in the front matter: rmarkdown::all_output_formats(knitr::current_input())[1]; or
Get the default output format name: rmarkdown::default_output_format(knitr::current_input())$name
For example...
---
title: "check format"
output:
  html_document: default
  pdf_document: default
  word_document: default
---
```{r}
rmarkdown::all_output_formats(knitr::current_input())[1]
```
```{r}
rmarkdown::default_output_format(knitr::current_input())$name
```
```{r}
fmt <- rmarkdown::default_output_format(knitr::current_input())$name
if (fmt == "pdf_document"){
#...
}
if (fmt == "word_document"){
#...
}
```
One additional point: the above answers don't work for an html_notebook, since code is being executed directly there and knitr::current_input() doesn't respond. If you know the document name you can call all_output_formats as above, specifying the name explicitly. I don't know if there's another way to do this.
This is what I use
library(stringr)
first_output_format <-
names(rmarkdown::metadata[["output"]])[1]
if (!is.null(first_output_format)) {
my_output <- str_split(first_output_format,"_")[[1]][1]
} else {
my_output = "unknown"
}

R Markdown – a concise way to print all code snippets used in the document

I'm writing a report in R Markdown in which I don't want to print any of my R code in the main body of the report – I just want to show plots, calculate variables that I substitute into the text inline, and sometimes show a small amount of raw R output. Therefore, I write something like this:
In the following plot, we see that blah blah blah:
```{r snippetName, echo=F}
plot(df$x, df$y)
```
Now...
That's all well and good. But I would also like to provide the R code at the end of the document for anybody curious to see how it was produced. Right now I have to manually write something like this:
Here is snippet 1, and a description of what section of the report
this belongs to and how it's used:
```{r snippetName, eval=F}
```
Here is snippet 2:
```{r snippetTwoName, eval=F}
```
<!-- and so on for 20+ snippets -->
This gets rather tedious and error-prone once there are more than a few code snippets. Is there any way I could loop over the snippets and print them out automatically? I'm hoping I could do something like:
```{r snippetName, echo=F, comment="This is snippet 1:"}
# the code for this snippet
```
and somehow substitute the following result into the document at a specified point when it's knitted:
This is snippet 1:
```{r snippetName, eval=F}
```
I suppose I could write some post-processing code to scan through the .Rmd file, find all the snippets, and pull out the code with a regex or something (I seem to remember there's some kind of options file you can use to inject commands into the pandoc process?), but I'm hoping there might be something simpler.
Edit: This is definitely not a duplicate – if you read my question thoroughly, the last code block shows me doing exactly what the answer to the linked question suggests (with a slight difference in syntax, which could have been the source of the confusion?). I'm looking for a way to not have to write out that last code block manually for all 20+ snippets in the document.
This is do-able within knitr, no need to use pandoc. Based on an example posted by Yihui at https://github.com/yihui/knitr-examples/blob/master/073-code-appendix.Rnw
Set echo=FALSE throughout your document: opts_chunk$set(echo = FALSE)
Then put this chunk at the end to print all code:
```{r show-code, ref.label=all_labels(), echo = TRUE, eval=FALSE}
```
This will print code for all chunks. Currently they all show up in a single block; I'd love to figure out how to put in the chunk label or some other header... For now I start my chunks with comments (probably not a bad idea in any case).
Updated: to show only the chunks that were evaluated, use:
ref.label = all_labels(!exists('engine')) - see question 40919201
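One possible way to get a header per chunk, sketched under the assumption that `knitr::knit_code$get()` returns the chunk code as a named list keyed by chunk label while the document is being knitted. `format_code_appendix` is a hypothetical helper, not part of knitr.

```r
# Hypothetical helper: turn a named list of chunk code (label -> lines)
# into Markdown with one heading line and one fenced block per chunk.
format_code_appendix <- function(chunks) {
  fence <- strrep("`", 3)
  out <- character()
  for (label in names(chunks)) {
    out <- c(out,
             paste0("**Chunk: ", label, "**"), "",
             paste0(fence, "r"), chunks[[label]], fence, "")
  }
  out
}

# In a final chunk with results='asis' and echo=FALSE one could then write:
# cat(format_code_appendix(knitr::knit_code$get()), sep = "\n")

# Standalone demonstration with made-up chunk contents.
demo <- list(setup = "library(knitr)", plot = c("x <- 1:10", "plot(x)"))
cat(format_code_appendix(demo), sep = "\n")
```

Because the helper only formats text, the appendix chunk itself may also show up in the list and would need to be filtered out by label.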
Since this is quite difficult if not impossible to do with knitr, we can take advantage of the next step, the pandoc compilation, and of pandoc's ability to manipulate content with filters. So we write a normal Rmd document with echo=TRUE and the code chunks are printed as usual when they are called.
Then, we write a filter that finds every code block of language R (this is how a code chunk will be coded in pandoc), removes it from the document (replacing it, here, with an empty paragraph) and stores it in a list. We then add the list of all code blocks at the end of the document. For this last step, the problem is that there really is no way to tell a Python filter to add content at the end of a document (there might be a way in Haskell, but I don't know it). So we need to add a placeholder at the end of the Rmd document to tell the filter to add the R code at this point. Here, I consider that the placeholder will be a CodeBlock with code lastchunk.
Here is the filter, which we could save as postpone_chunks.py.
#!/usr/bin/env python
from pandocfilters import toJSONFilter, Str, Para, CodeBlock
chunks = []
def postpone_chunks(key, value, format, meta):
    if key == 'CodeBlock':
        [[ident, classes, keyvals], code] = value
        if "r" in classes:
            chunks.append(CodeBlock([ident, classes, keyvals], code))
            return Para([Str("")])
        elif code == 'lastchunk':
            return chunks
if __name__ == "__main__":
    toJSONFilter(postpone_chunks)
Now, we can have pandoc run the filter by passing it through pandoc_args. Note that we need to remember to add the placeholder at the end of the document.
---
title: A test
output:
  html_document:
    pandoc_args: ["--filter", "postpone_chunks.py"]
---
Here is a plot.
```{r}
plot(iris)
```
Here is a table.
```{r}
table(iris$Species)
```
And here are the code chunks used to make them:
    lastchunk
There is probably a better way to write this in haskell, where you won't need the placeholder. One could also customize the way the code chunks are returned at the end to add a title before each one for instance.

Format text inside R code chunk

I am making some slides inside RStudio following instructions here:
http://rmarkdown.rstudio.com/beamer_presentation_format.html
How do I define text size, colors, and "flow" following numbers into two columns?
```{r,results='asis', echo=FALSE}
rd <- sample(x=1e6:1e7, size = 10, replace = FALSE)
cat(rd, sep = "\n")
```
Output is either HTML (ioslides) or PDF (Beamer)
Update:
Currently the code above will only give something like the following
6683209
1268680
8412827
9688104
6958695
9655315
3255629
8754025
3775265
2810182
I can't do anything to change text size, color or put them into a table. The output of R codechunk is just plain text. Maybe it is possible to put them in a table indeed, as mentioned at this post:
http://tex.aspcode.net/view/635399273629833626273734/dynamically-format-labelscolumns-of-a-latex-table-generated-in-rknitrxtable
But I don't know about text size and color.
Update 2:
The idea of weaving native HTML code into the R output is very useful; I hadn't thought of that. However, this only works if I want to output HTML. For PDF output, I have to weave native LaTeX code into the R output. For example, the following code works with the "Knit PDF" output:
```{r,results='asis', echo=FALSE}
cat("\\textcolor{blue}{")
rd <- sample(x=1e6:1e7, size = 10, replace = FALSE)
for (n in rd) {
  cat(paste0(n, '\\newline \n'))
}
cat("}")
```
You are using results='asis', hence you can simply use cat() to emit formatting markup around the output (print() would add the [1] prefix and quotes). If you want your text to be red, and assuming your stylesheet defines a red2 class, simply do:
```{r,results='asis', echo=FALSE}
cat("<div class='red2'>\n")
rd <- sample(x=1e6:1e7, size = 10, replace = FALSE)
cat(rd, sep = "\n")
cat("\n</div>\n")
```
Hope it helps.
It sounds as if you want the output to be either PDF or HTML.
One possibility is the xtable package. It produces tables in either LaTeX (for PDF) or HTML format. There's no (output-independent) way to specify colour, however. Here's an example.
library(xtable)
xt <- xtable(data.frame(a=1:10))
print(xt, type="html")
print(xt) # LaTeX is the default
Another option is the pandoc.table function from the pander package. You need the pandoc binary installed. If you have RStudio, you have this already. The function spits out some markdown which then can be converted to HTML or PDF by pandoc.
Here's how you could use this from RStudio. Create an RMarkdown document like this:
---
title: "Untitled"
author: "You"
date: "20 November 2014"
output: html_document
---
```{r, results='asis'}
library(pander)
tmp <- data.frame(a=1:10,b=1:10)
pandoc.table(tmp)
```
When you click "knit HTML", it will spit out a nice HTML document. If you change output to pdf_document, it will spit out a nice PDF. You can edit the options to change output - e.g.
pandoc.table(tmp, emphasize.strong.rows=c(2,4,6,8,10))
and this will work both in PDF or HTML. (Still no options to change colour though. Homework task: fix pandoc.table to allow arbitrary colours.)
Under the hood, knitr is writing markdown, and pandoc is converting the markdown to whatever you like.
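Since colour markup has to match the output format, one option is a small helper that picks the markup from the Pandoc target. This is only a sketch: `colorize` is a hypothetical function, and it assumes the document is rendered through rmarkdown::render so that the `rmarkdown.pandoc.to` option is set.

```r
# Hypothetical helper: wrap `text` in colour markup suited to the target.
# `to` defaults to the option set by rmarkdown::render(); it is NULL when
# the code runs outside of a render, in which case the text is left alone.
colorize <- function(text, color,
                     to = knitr::opts_knit$get("rmarkdown.pandoc.to")) {
  if (is.null(to)) {
    text
  } else if (to %in% c("latex", "beamer")) {
    sprintf("\\textcolor{%s}{%s}", color, text)
  } else if (to %in% c("html", "html4", "html5")) {
    sprintf("<span style='color: %s;'>%s</span>", color, text)
  } else {
    text
  }
}

# Inside a chunk with results='asis':
# cat(colorize(paste(rd, collapse = " "), "blue"))

colorize("6683209", "blue", to = "beamer")
colorize("6683209", "blue", to = "html")
```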

Is there a way to prevent line break in the HTML when results='hide' and echo=FALSE?

In R, using knitr, is there a way to prevent line breaks in the HTML when results='hide' and echo=FALSE?
In this case:
First I do this,
```{r results='hide', echo=FALSE}
x=4;x
```
then I do that.
I get:
First I do this,
then I do that.
with both a break and an extra line between.
I'd like to get:
First I do this, then I do that.
instead.
Generally speaking, I'd like for code chunks to not insert new lines so that markdown is free to eat the one after the first line of text.
Thanks,
I assume you're creating an HTML document from an R Markdown document. In that case, you can use the inline R code capability offered by knitr by using the ` characters starting with the letter r.
Example:
In your R Markdown, write:
First I do this,`r x=4` then I do that. I can call x by doing `r x`.
And as output, you get:
First I do this, then I do that. I can call x by doing 4.
Note that in my example I evaluated the variable x, but if you do not want to evaluate it, you do not have to. The variable x is still assigned the value 4 by the `r x=4` part of the R Markdown.
This is Inline R Code, and is documented here under the section "Inline R Code".
EDIT:
Note that Inline R Code has properties that are analogous to "echo=FALSE". And if you want to hide the results from inline R code, you can use base R functions to hide the output. See this question.
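As far as I can tell, the reason the assignment prints nothing is that inline code only inserts a value when that value is visible. The base R function `withVisible()` shows the distinction; the variable names below are just for illustration.

```r
# An assignment returns its value invisibly ...
vis_assign <- withVisible(x <- 4)$visible
# ... while a bare name is visible ...
vis_name <- withVisible(x)$visible
# ... unless it is wrapped in invisible().
vis_hidden <- withVisible(invisible(x))$visible

print(c(assign = vis_assign, name = vis_name, hidden = vis_hidden))
```

So wrapping an inline expression in `invisible()` is one way to run it without leaving output in the text.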
Try something like:
``` {r , results="asis", echo=F, eval=T}
if(showMe){
cat("printed")
} else {
cat("<!-- no break line -->")
}
```
