Is there a convenient way to convert R Markdown to pandoc markdown? - r

I use RStudio to write R Markdown, but sometimes it is not compatible with the markdown supported by pandoc (math, for example). If there were a way to convert R Markdown to pandoc markdown, it would be convenient to export my articles to pdf, org, rts, latex...
https://pandoc.org/ also doesn't seem to mention R Markdown support.
I have tried exporting .html from RStudio and using pandoc to convert the HTML file back to markdown, but it doesn't seem to work.

Actually pandoc is used to create pdf and other formats from r-markdown. Therefore there is an intermediate file with pandoc compatible markdown. You could retain this file by:
rmarkdown::render("document.Rmd", output_format = "pdf_document", run_pandoc = FALSE)
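An alternative worth trying is to render directly to pandoc's own markdown dialect with rmarkdown::md_document. The following is a minimal sketch, not the answerer's method: it writes a small stand-in for document.Rmd to a temporary directory (the file contents are made up for illustration) and assumes rmarkdown and pandoc are installed.

```r
library(rmarkdown)

# Stand-in for document.Rmd, written to a temp dir so the sketch runs on its own.
rmd <- file.path(tempdir(), "document.Rmd")
fence <- strrep("`", 3)  # build the chunk fence without typing literal backticks
writeLines(c(
  "---",
  "title: \"Demo\"",
  "---",
  "",
  "Inline math $x^2$ and a chunk:",
  "",
  paste0(fence, "{r}"),
  "1 + 1",
  fence
), rmd)

# variant = "markdown" asks pandoc for its own markdown dialect
# rather than the default "markdown_strict".
out <- render(rmd, output_format = md_document(variant = "markdown"), quiet = TRUE)

# Show the resulting pandoc markdown.
cat(readLines(out), sep = "\n")
```

The resulting .md file can then be passed to pandoc for whichever target format is needed.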

Related

Render Rmd file with pipe formatting for tables

Problem
I have a .Rmd template file which I currently use to render PDFs using rmarkdown::render.
I need the option to also render the .Rmd as a markdown file. I have tried setting output_format to md_document in rmarkdown::render; however, this produces a markdown file containing lots of HTML tags for the tables etc. I don't want any HTML; I need it to be readable in a console, with tables formatted using the markdown pipe styling.
What I've tried
I tried setting output_format to
rmarkdown::output_format(
  knitr = rmarkdown::knitr_options(opts_knit = list(knitr.table.format = "pipe")),
  pandoc = rmarkdown::pandoc_options(to = rmarkdown::rmarkdown_format()))
but this also produces a markdown file with HTML tags instead of piped tables.
How do I set the markdown render settings to render as I need?
Perhaps try adding results='asis' to knitr opts
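If the tables in the template are produced with knitr::kable (an assumption; the question does not say how they are generated), another thing to try is requesting pipe output per table. A minimal sketch using the built-in mtcars data; format = "pipe" is available in recent knitr versions, while older versions used "markdown" for the same style:

```r
library(knitr)

# Ask kable for a pipe table explicitly instead of relying on the
# output format to choose a table style.
tbl <- kable(head(mtcars[, 1:3], 3), format = "pipe")

# Each element of tbl is one line of the table.
print(tbl)
```

Inside an .Rmd chunk the same call would sit in an ordinary R chunk; whether results='asis' is also needed depends on how the table is printed.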

How do I keep title & subtitle when using pandoc to convert .docx to .md in R?

I'm downloading a Google Doc as .docx and then converting to markdown for manipulation and export to multiple formats.
Problem: When I convert using pandoc, it strips title (and subtitle) and does not add any YAML header information. I could add title manually in the header, but I need it to be scripted, so need to not lose the title (ideally) or extract title from docx and add to YAML header, which would then be concatenated to the converted markdown file.
Example Code, where title is lost on conversion from docx to markdown:
library(rmarkdown)
examplefile <- file.path(tempdir(), "example.docx")
# mode = "wb" keeps the binary .docx intact on Windows
download.file("https://file-examples.com/wp-content/uploads/2017/02/file-sample_100kB.docx", destfile = examplefile, mode = "wb")
# Convert to markdown (the title is lost at this step)
pandoc_convert(examplefile, to = "markdown", output = "example.rmd", options = c("--extract-media=."))
render(file.path(tempdir(), "example.rmd"), "html_document")
browseURL(file.path(tempdir(), "example.html"))
When converting from docx to markdown (or another markup format like rst) you need to include the -s or --standalone option.
From the pandoc documentation:
-s, --standalone
Produce output with an appropriate header and footer (e.g. a standalone HTML, LaTeX, TEI, or RTF file, not a fragment). This option is set automatically for pdf, epub, epub3, fb2, docx, and odt output. For native output, this option causes metadata to be included; otherwise, metadata is suppressed.
Without the -s this data is suppressed.
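Applied to the pandoc_convert call from the question, that could look like the sketch below. To avoid depending on the download, it first builds a small .docx with a title by rendering a throwaway .Rmd (the file names titled.Rmd and titled.md are made up for illustration); it assumes rmarkdown and pandoc are installed.

```r
library(rmarkdown)

# Build a small .docx with a title so the sketch runs without a download
# (a stand-in for the Google Doc export in the question).
src <- file.path(tempdir(), "titled.Rmd")
writeLines(c("---", "title: \"My Title\"", "---", "", "Body text."), src)
docx <- render(src, output_format = word_document(), quiet = TRUE)

# Same kind of call as in the question, plus --standalone so that
# pandoc writes the document metadata as a YAML header.
md <- file.path(tempdir(), "titled.md")
pandoc_convert(docx, to = "markdown", output = md,
               options = c("--standalone"))

# Inspect the converted markdown, including any YAML header.
cat(readLines(md), sep = "\n")
```

The other options from the question, such as --extract-media=., can be passed in the same options vector.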

Word to R Markdown Conversion

I have received a file stored in Microsoft Word that includes formatted words (italics, bold). I would like to do some work with the file (extracting sections, inserting words, etc.) and was planning to do this work with R Markdown. I need to keep the formatting (italics, bold) from Word during this conversion. I know I can convert from Markdown to Word, but is the reverse conversion from Word to Markdown also possible? If not, does anyone have any suggestions of how to bring Word into Markdown (relatively) painlessly while maintaining the italics and bold formatting?
From the pandoc manual under "Demos": pandoc -s example30.docx -t markdown -o example35.md
For rmarkdown, please see this answer, Convert docx to Rmarkdown
You could use pandoc first, then RStudio.
With pandoc, run pandoc -o output.md originFile.docx; the output is a markdown file generated from the Word document.
Then open the file in RStudio. The file-type selector at the bottom of the source pane lets you choose "Markdown" or "R Markdown", after which you can edit the markdown file.
There is also Writage, a plugin for Word that converts markdown to Word.

R Markdown - no ODT and LaTeX options as an output

I find R Markdown/knitr a useful tool to document my work and generate summary documents.
I work with .Rmd (R Markdown) files in RStudio.
It seems that knitr provides the functionality needed to generate .odt (Open Document Text) and .tex (LaTeX) documents from .Rmd.
However, RStudio only offers the .docx, .html and .pdf formats.
I would like to avoid the MS Word format since I prefer open standards and work under Linux.
Is it possible to add .odt and .tex options to the RStudio menu?
It doesn't seem possible to output odt directly in RStudio, but you can always use knitr::knit to produce a markdown document and pandoc to produce the odt:
library(knitr)
knit("myDoc.Rmd")
system("pandoc myDoc.md -o myDoc.odt")
You may have to adjust the pandoc options and adapt the template to get a nice looking result.
As for latex, you can keep the tex sources when compiling to pdf with the following option in your yaml front matter:
---
output:
  pdf_document:
    keep_tex: true
---
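Newer versions of the rmarkdown package also ship odt_document and latex_document output formats, which may remove the need for the manual pandoc call. A minimal sketch, assuming a sufficiently recent rmarkdown plus pandoc; the contents of myDoc.Rmd below are a made-up stand-in so the example runs on its own:

```r
library(rmarkdown)

# Throwaway stand-in for myDoc.Rmd in a temp dir.
rmd <- file.path(tempdir(), "myDoc.Rmd")
writeLines(c("---", "title: \"My Doc\"", "---", "", "Some text with *emphasis*."), rmd)

# Render the same source to OpenDocument Text and to a LaTeX source file.
odt <- render(rmd, output_format = odt_document(), quiet = TRUE)
tex <- render(rmd, output_format = latex_document(), quiet = TRUE)

# render() returns the path of each output file.
print(c(odt, tex))
```

These formats can also be named in the YAML front matter under output:, in the same way as pdf_document above.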

knitr html to Word docx using pandoc

I have been saving some example R markdown html output to Word using pandoc. I actually only do this so I can add some page breaks for easier printing:
system("pandoc -s Exercise1.html -o Exercise1.docx")
Although the output is acceptable I was wondering if there is a way to keep the original syntax highlighting of the R chunks (just as they are in the original knit HTML document)?
Also, I seem to be losing all images in the conversion process and have to insert them into Word by hand. Is that normal?
Using the rmarkdown package (baked into RStudio Version 0.98.682, the current preview release) it's very simple to convert Rmd to docx, and code highlighting is included in the docx file.
You just need to include this at the top of your markdown text:
---
title: "Untitled" # obviously you can change this
output: word_document # specifies docx output
---
However, it seems that page breaks are still not supported in this conversion.
Why not convert the markdown directly to Word format?
Anyway, Pandoc does not support syntax highlighting in Word: "Currently, the only output formats that uses this information are HTML and LaTeX."
About the images: the Word file would definitely include them if you converted the markdown to Word directly. I am not sure about the HTML source, but I suppose you might have a path issue.
