Proper R Markdown Code Organization - r

I have been reading about R Markdown (here, here, and here) and using it to create solid reports. I would like to take the small amount of code I currently run for ad hoc analyses and turn it into more scalable data reports.
My question is rather broad: Is there a proper way to organize your code around an R Markdown project? Say, have one script that generates all of the data structures?
For example: Let's say that I have the cars data set and I have brought in commercial data on the manufacturer. What if I wanted to attach the manufacturer to the current cars data set, and then produce a separate summary table for each company using a manipulated data set cars.by.name as well as plot a certain sample using cars.import?
EDIT: Right now I have two files open. One is an R script file that holds all of the data manipulation: subsetting and re-categorizing values. The other is the R Markdown file where I am building out text to accompany the various tables and plots of interest. When I reference an object created in the R script file, like:
```{r}
table(cars.by.name$make)
```
I get an error saying Error in summary(cars.by.name$make) : object 'cars.by.name' not found
EDIT 2: I found this older thread to be helpful. Link
---
title: "Untitled"
author: "Jeb"
date: "August 4, 2015"
output: html_document
---
This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see <http://rmarkdown.rstudio.com>.
When you click the **Knit** button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. You can embed an R code chunk like this:
```{r}
table(cars.by.name$make)
```
```{r}
summary(cars)
summary(cars.by.name)
```
```{r}
table(cars.by.name)
```
You can also embed plots, for example:
```{r, echo=FALSE}
plot(cars)
plot(cars.import)
```
Note that the `echo = FALSE` parameter was added to the code chunk to prevent printing of the R code that generated the plot.

There is a solution for this sort of problem, explained here.
Basically, if you have an .R file containing your code, there is no need to repeat that code in the .Rmd file; you can include it from the .R file instead. For this to work, the chunks of code must be labeled in the .R file, and they can then be included by label in the .Rmd file.
test.R:
## ---- chunk-1 ----
table(cars.by.name$make)
test.Rmd:
Just once, at the top of the .Rmd file:
```{r echo=FALSE, cache=FALSE}
knitr::read_chunk('test.R')
```
For every chunk you're including (replace chunk-1 with the label of that specific chunk in your .R file):
```{r chunk-1}
```
Note that the chunk should be left empty (as shown); when the document is knitted, the code from the .R file is inserted there and run.
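To see what `read_chunk()` does with a labeled script, the sketch below writes a throwaway script to a temporary file and inspects the chunks that knitr registers. The file contents and the label `chunk-2` are made up for illustration; only `read_chunk()` and the `knit_code` object come from knitr itself.

```r
library(knitr)

# Hypothetical script with two labeled chunks, written to a temp file
script <- tempfile(fileext = ".R")
writeLines(c(
  "## ---- chunk-1 ----",
  "x <- c(1, 2, 3)",
  "",
  "## ---- chunk-2 ----",
  "mean(x)"
), script)

# Register the labeled chunks; this is what the first chunk of the .Rmd does
read_chunk(script)

# knitr now knows the code stored under each label
chunk_labels <- names(knit_code$get())
chunk_1_code <- knit_code$get("chunk-1")
print(chunk_labels)
print(chunk_1_code)

# Clear the registry so the demo leaves no state behind
knit_code$restore()
```

An empty chunk in the .Rmd whose label matches one of these names is then filled with the stored code at knit time.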

Oftentimes I have many reports that need to run the same code with slightly different parameters. What I typically do is run all my "stats" functions separately, save the results, and then just reference them from the report. The way to do this is as follows:
---
title: "Untitled"
author: "Author"
date: "August 4, 2015"
output: html_document
---
```{r, echo=FALSE, message=FALSE}
directoryPath <- "rawPath" ##Something like /Users/userid/RDataFile
fullPath <- file.path(directoryPath,"myROutputFile.RData")
load(fullPath)
```
Some Text, headers whatever
```{r}
summary(myStructure$value1) #Where myStructure was saved to the .RData file
```
You can save an RData file with the save.image() command, which writes the entire workspace; save() writes only the objects you name.
Hope that helps!
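As a minimal sketch of that round trip, the example below saves one object with `save()` and loads it back into a separate environment. The contents of `myStructure` and the temporary path are placeholders, not the real output of any "stats" script.

```r
# Hypothetical results object standing in for the output of the "stats" script
myStructure <- list(value1 = c(2, 4, 6, 8), value2 = "demo")

# save() writes only the named objects; save.image() would write everything
rdata_path <- file.path(tempdir(), "myROutputFile.RData")
save(myStructure, file = rdata_path)

# In the report, load into a dedicated environment to avoid name clashes
report_env <- new.env()
loaded_names <- load(rdata_path, envir = report_env)

print(loaded_names)
print(summary(report_env$myStructure$value1))
```

Loading into its own environment is optional; a plain `load(fullPath)` as in the answer puts the objects straight into the knitting environment.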

Related

Run R markdown (.Rmd) from inside other R script to produce HTML

As an example, suppose you create a new R Markdown file and save it as 'test'. Can one then run or deploy this test.Rmd file from within a normal R script? The purpose is to generate the HTML output without having to open the .Rmd file.
I'm hoping to create one master file that does this for many R Markdown files in one go, which would save considerable time because you don't have to open each file and wait for it to complete.
You are looking for rmarkdown::render().
Contents of "test.Rmd"
---
title: "Untitled"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
## R Markdown
This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see <http://rmarkdown.rstudio.com>.
When you click the **Knit** button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. You can embed an R code chunk like this:
```{r cars}
summary(cars)
```
Contents of script.R
# provided test.Rmd is in the working directory
rmarkdown::render("test.Rmd")
A Way to Render Multiple Rmd
cwd_rmd_files <- list.files(pattern = "\\.Rmd$")
lapply(cwd_rmd_files, rmarkdown::render)
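A slightly more defensive version of the same idea is sketched below. The helper names `find_rmd()` and `render_all()` are hypothetical; the pattern escapes the dot so that only a literal `.Rmd` extension matches, and `tryCatch()` keeps one failing document from stopping the rest. The demo at the end exercises only the file matching, since rendering needs rmarkdown and pandoc.

```r
# Find R Markdown files; "\\.Rmd$" matches a literal dot, unlike ".Rmd$"
find_rmd <- function(dir = ".") {
  list.files(dir, pattern = "\\.Rmd$", full.names = TRUE, ignore.case = TRUE)
}

# Render each file, collecting failures instead of stopping at the first one
render_all <- function(dir = ".") {
  files <- find_rmd(dir)
  results <- lapply(files, function(f) {
    tryCatch(
      rmarkdown::render(f, quiet = TRUE),
      error = function(e) paste("FAILED:", conditionMessage(e))
    )
  })
  names(results) <- basename(files)
  results
}

# Demo of the file matching only, using empty placeholder files
demo_dir <- file.path(tempdir(), "rmd_demo")
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir, c("a.Rmd", "b.rmd", "notesXRmd", "c.R")))
found <- basename(find_rmd(demo_dir))
print(found)
```

`ignore.case = TRUE` is a design choice here so that files saved with a lowercase `.rmd` extension are picked up as well.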
Thanks, the-mad-statter, your answer was very helpful. The issue I faced required me to prepare the markdown dynamically. By adapting your code, that is easily possible:
Contents of "test_dyn.rmd"
---
title: "Untitled"
output: html_document
---
The chunk below adds formatted text, based on your inputs.
```{r text, echo=FALSE, results="asis"}
cat(text)
```
The chunk below uses your input as code.
```{r results}
y
```
Contents of "script_dyn.r"
in_text <- c("**Test 1**", "*Test 2*")
in_y <- 1:2
lapply(1:2, function(x) {
  text <- in_text[[x]]
  y <- in_y[[x]]
  rmarkdown::render(input = "test_dyn.rmd", output_file = paste0("test", x))
})
This way you can create files with different text and different variable values in your code.
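The script above works because `rmarkdown::render()` evaluates chunks in its `envir` argument, which defaults to the calling frame, so `text` and `y` defined inside the anonymous function are visible to the document. A sketch that makes this explicit follows; `make_report_env()` is a hypothetical helper, and the render loop only runs if `test_dyn.rmd` is present in the working directory.

```r
# Build an isolated environment holding the inputs for one report
make_report_env <- function(text, y) {
  env <- new.env(parent = globalenv())
  env$text <- text
  env$y <- y
  env
}

in_text <- c("**Test 1**", "*Test 2*")
in_y <- 1:2
envs <- Map(make_report_env, in_text, in_y)

# Render only if the template from the answer and rmarkdown are available
if (file.exists("test_dyn.rmd") && requireNamespace("rmarkdown", quietly = TRUE)) {
  for (i in seq_along(envs)) {
    rmarkdown::render("test_dyn.rmd",
                      output_file = paste0("test", i),
                      envir = envs[[i]], quiet = TRUE)
  }
}
```

Passing a fresh environment per report keeps variables from one render from leaking into the next.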

Is there a way to add line breaks ONLY when exporting to PDF in R Markdown?

I think the question is quite self-explanatory but for avoidance of doubt I'll explain with more detail below:
I have an R Markdown document that works well if converted to HTML or uploaded to GitHub. When converting to PDF (using LaTeX), the results are not so pretty. I find that the biggest problem in a LaTeX PDF document is line breaks. I can fix the line break issue in the PDF document by adding "\ " characters, but that throws my HTML document out of whack too.
Is there a way to manually add line breaks (or "space before/after paragraphs") for the PDF output only?
Thank you!
You can redefine the relevant spacings in the YAML header. \parskip controls the paragraph spacing. Code blocks are shaded using a snugshade environment from the framed package. We can also redefine the shaded environment for code blocks to have some vertical space at the start. Here's a reproducible example. Note: I also added the keep_tex parameter so you can see exactly what the generated tex file looks like, in case this is useful:
---
title: "test"
author: "A.N. Other"
header-includes:
- \setlength{\parskip}{\baselineskip}
- \renewenvironment{Shaded}{\vspace{\parskip}\begin{snugshade}}{\end{snugshade}}
output:
  pdf_document:
    keep_tex: true
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
## R Markdown
This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see <http://rmarkdown.rstudio.com>.
When you click the **Knit** button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. You can embed an R code chunk like this:
```{r cars}
summary(cars)
```
## Including Plots
You can also embed plots, for example:
```{r pressure, echo=FALSE}
plot(pressure)
```
Note that the `echo = FALSE` parameter was added to the code chunk to prevent printing of the R code that generated the plot.
Once you output to HTML, you can just print the HTML page to PDF. That might be an easy way to keep the original formatting.
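Another way to restrict extra spacing to PDF output is to make it conditional on the output format. The sketch below relies on `knitr::is_latex_output()`; the helper name `pdf_break()` is made up for illustration, and it assumes that raw LaTeX emitted by inline R code is passed through to the PDF, which is worth checking with `keep_tex` on your own setup.

```r
# Return extra vertical space for LaTeX/PDF output and nothing otherwise.
# pdf_break() is a hypothetical helper name, not part of knitr.
pdf_break <- function(space = "\\baselineskip") {
  if (knitr::is_latex_output()) {
    paste0("\\vspace{", space, "}")
  } else {
    ""
  }
}

# Outside of a PDF knit this yields an empty string
outside_knit <- pdf_break()
print(outside_knit)
```

In the document, an inline call such as `` `r pdf_break()` `` on its own line between two paragraphs would then add space in the PDF while leaving the HTML untouched.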

Calling functions in a second file when compiling .Rmd files with knitr

I want to use knitr to format an R Markdown file, let's call it Main.rmd. Some code in Main.rmd relies on helper functions in a second file, let's call it Functions.rmd.
When I first run Functions.rmd and then Main.rmd, the code in Main.rmd runs fine. When I first run Functions.rmd and then try to knit Main.rmd, I receive an evaluation error:
Error: Object 'myfunction' not found
How can I fix this without combining Main.rmd and Functions.rmd into a single document, which I would like to avoid doing?
Edit: I've added a toy example below. There are very useful suggestions so far for how to call the functions in Functions.rmd from Main.rmd, but they all require converting Functions.rmd to a .R file. However, for my current purpose, it is important that Functions.rmd can also be read as a standalone markdown document.
First, Main.rmd:
---
title: "Main_test"
author: "Matt Nolan"
date: "25/06/2018"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
## Background.
This is the main body of text and code used to display results of analyses, some of which are created by calling functions in Functions.Rmd.
```{r cars}
myexamplefunction(1,2)
```
And, here is Functions.rmd:
---
title: "Functions_test"
author: "Matt Nolan"
date: "25/06/2018"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
## Background
This is a document containing functions used in the document "Main_test".
Because it contains functions and formatted text to explain the functions for an interested reader, it should be usable as a standalone markdown document.
For example, this is a function that adds two numbers.
```{r cars}
myexamplefunction <- function(a, b) {a + b}
```
30Jun2018 Update: R Markdown does not support combining Rmd files
Matt's 25Jun2018 update clarifies the question, asking how to embed one Rmd document in another Rmd document. Per the R Markdown website, R Markdown requires a single Rmd file. It does not currently support the embedding of one Rmd file within another Rmd document.
That said, with the bookdown package, you could structure the Rmd files as chapters in a book, where each Rmd file is a chapter in the book. For details, see Bookdown: Authoring Books with R Markdown 1.4 - Two Rendering Approaches, the Getting Started page, and the Bookdown Demo github repository for an example book built in bookdown.
25Jun2018 Update: Printing the code in an Appendix
Per the comments from the OP, the reason for including the functions in an Rmd file instead of an R file was to obtain a formatted printout of the code in an Appendix. This is possible with the technique I originally posted plus a few changes.
Use named chunks to put the code in an appendix, and use the arguments echo=TRUE and eval=FALSE to avoid executing it multiple times.
Execute code from the Appendix in main flow of the document by way of the ref.label= argument, and keep the code from printing in the main document with the echo=FALSE argument.
In addition to using the source() function, one must print each function in another chunk in the appendix in order to obtain formatted print of each function.
An updated version of my example Rmd file is listed below.
---
title: "TestIncludedFiles"
author: "Len Greski"
date: "June 24, 2018"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
## Background
A question was posted on [Stackoverflow](https://stackoverflow.com/questions/51013924/calling-functions-in-a-second-file-when-compiling-rmd-files-with-knitr) about how to include functions from one Rmd file while knitting another.
If the second file contains R functions to be accessed in the main Rmd file, they're best included as R files rather than Rmd. In this example we'll include three files of functions from the Johns Hopkins University *R Programming* course: `pollutantmean()`, `corr()`, and `complete()`. We'll execute them in a subsequent code block.
After an update to the original post where the original poster noted that he included the functions in an Rmd file in order to provide a formatted printout of the code in the report as an appendix, I've modified this example to account for this additional requirement.
```{r ref.label="sourceCode",echo=FALSE}
# execute sourceCode chunk from appendix
```
## Executing the sourced files
Now that the required R functions have been sourced, we'll execute them.
```{r runCode, echo=TRUE}
pollutantmean("specdata","nitrate",70:72)
complete("specdata",1:10)
corr("specdata",threshold=500)
```
# Appendix
```{r sourceCode,echo=FALSE,eval=FALSE}
# use source() function to source the functions we want to execute
source("./rprogramming/oneLine_pollutantmean.r")
source("./rprogramming/oneLine_complete.r")
source("./rprogramming/oneLine_corr.r")
```
The following is an inventory of the functions used in this Rmd file.
```{r }
pollutantmean
complete
corr
```
...and the output for the Appendix section of the document (with redactions to avoid publishing answers to a class programming assignment).
Original Answer
If the second Rmd file only contains functions, you're better off saving them as an R file and using source() to include them in Main.Rmd. For example:
---
date: "June 24, 2018"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
## Background
A question was posted on [Stackoverflow](https://stackoverflow.com/questions/51013924/calling-functions-in-a-second-file-when-compiling-rmd-files-with-knitr) about how to include functions from one Rmd file while knitting another.
If the second file contains R functions to be accessed in the main Rmd file, they're best included as R files rather than Rmd. In this example we'll include three files of functions from the Johns Hopkins University *R Programming* course: `pollutantmean()`, `corr()`, and `complete()`. We'll execute them in a subsequent code block.
```{r sourceCode,echo=TRUE}
# use source() function to source the functions we want to execute
source("./rprogramming/pollutantmean.r")
source("./rprogramming/complete.r")
source("./rprogramming/corr.r")
```
## Executing the sourced files
Now that the required R functions have been sourced, we'll execute them.
```{r runCode, echo=TRUE}
pollutantmean("specdata","nitrate",70:72)
complete("specdata",1:10)
corr("specdata",threshold=500)
```
...produces the following output:
DISCLOSURE: This answer includes techniques that I previously posted as a blog article in 2016, ToothGrowth Assignment: Accessing R Code from an Appendix in Knitr.
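If Functions.rmd has to stay a standalone R Markdown document, one commonly used workaround is to extract its code with `knitr::purl()` and `source()` the result. The sketch below uses a stand-in file written to a temporary location; the chunk label `add-fun` and the file contents are assumptions made for the demo.

```r
library(knitr)

fence <- strrep("`", 3)  # three backticks, built here to keep the demo readable

# Stand-in for Functions.rmd: prose plus one chunk that defines a function
functions_rmd <- tempfile(fileext = ".Rmd")
writeLines(c(
  "---",
  "title: \"Functions_test\"",
  "output: html_document",
  "---",
  "",
  "This function adds two numbers.",
  "",
  paste0(fence, "{r add-fun}"),
  "myexamplefunction <- function(a, b) {a + b}",
  fence
), functions_rmd)

# Extract only the R code into a temporary script, then source it
functions_r <- tempfile(fileext = ".R")
purl(functions_rmd, output = functions_r, quiet = TRUE, documentation = 0)
source(functions_r)

result <- myexamplefunction(1, 2)
print(result)
```

When this is called from a chunk inside Main.rmd, chunk labels shared by both files (such as `setup` and `cars` in the toy example) may need to be made unique, or `options(knitr.duplicate.label = "allow")` set; treat that as something to verify against your knitr version.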

Estimating read time for an R Markdown document and printing it possibly below the title

System: Windows 10, RStudio 1.0.153, R 3.4.1
I write blog posts in R Markdown. I would like to use Medium's approach of including an estimated reading time for the blog post just underneath the title, because research has suggested that including reading time in your blog post increases the likelihood that people will read it all. To estimate reading time I would like to use a simplified version of the algorithm used on Medium:
1. compute the number N of words in the document, preferably excluding the YAML preface and the R code chunks with echo=FALSE (i.e., all the text that my visitors won't see)
2. compute the number m of imported images and R-generated plots. If counting plots makes the problem too complicated, it's OK to count only imported images. I import each image separately in its own R chunk with the include_graphics function, so even just counting the number of occurrences of the include_graphics keyword would work.
3. compute the estimated time in minutes as
seconds <- N/200*60+12*m
minutes <- round(seconds/60)
fin <- ifelse(minutes>1, " minutes", " minute")
estimation <- paste0("Estimated reading time: ", minutes, fin)
The code for the estimation should be executed each time the R Markdown doc is knitted, so ideally inside an R chunk in the R Markdown doc itself, or in an external function which is then sourced in an R chunk. This way, the estimation would automatically update as I edit the R Markdown document. How can I do that? For example, given the sample R document below, which has N=106 words and m=2 images, the text
"Estimated reading time: 1 minute"
should be included in the HTML output, just below the title, author and date.
Sample R Markdown document:
---
title: "Testy test"
author: "anonymous"
date: "`r Sys.Date()`"
output: html_document
---
```{r setup, include=FALSE}
library(knitr)
opts_chunk$set(echo = TRUE)
```
## R Markdown
This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see <http://rmarkdown.rstudio.com>.
When you click the **Knit** button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. You can embed an R code chunk like this:
```{r load_data, echo=FALSE}
include_graphics("doge.jpg")
```
## Including Plots
You can also embed plots, for example:
```{r pressure, echo=FALSE}
plot(pressure)
```
Note that the `echo = FALSE` parameter was added to the code chunk to prevent printing of the R code that generated the plot.
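The algorithm described in the question can be sketched in base R as follows. This is a simplified take rather than a definitive solution: it excludes all chunk code from the word count (not only `echo=FALSE` chunks), counts chunk lines containing `include_graphics(` as images, ignores plots, and adds a floor of one minute that is not part of the original formula. The function name and its arguments are hypothetical.

```r
# Estimate reading time for an R Markdown file (simplified: all chunk code is
# excluded from the word count, whether or not it is echoed).
estimate_reading_time <- function(path, wpm = 200, secs_per_image = 12) {
  lines <- readLines(path, warn = FALSE)

  # Drop the YAML header, i.e. everything up to the second '---' line
  yaml <- grep("^---\\s*$", lines)
  if (length(yaml) >= 2 && yaml[1] == 1) {
    lines <- lines[-seq_len(yaml[2])]
  }

  # Flag fence lines and everything between an opening and a closing fence
  is_fence <- grepl("^`{3}", lines)
  in_chunk <- is_fence | (cumsum(is_fence) %% 2 == 1)

  # m: image imports, counted as chunk lines containing include_graphics(
  m <- sum(grepl("include_graphics(", lines[in_chunk], fixed = TRUE))

  # N: whitespace-separated tokens in the remaining prose
  words <- unlist(strsplit(lines[!in_chunk], "\\s+"))
  n <- sum(nzchar(words))

  seconds <- n / wpm * 60 + secs_per_image * m
  minutes <- max(1, round(seconds / 60))  # floor of 1 minute is an added assumption
  unit <- if (minutes > 1) " minutes" else " minute"
  list(words = n, images = m,
       label = paste0("Estimated reading time: ", minutes, unit))
}

# Small demo document written to a temp file
fence <- strrep("`", 3)
demo <- tempfile(fileext = ".Rmd")
writeLines(c(
  "---", "title: \"Testy test\"", "---",
  "one two three four",
  paste0(fence, "{r, echo=FALSE}"),
  "knitr::include_graphics(\"doge.jpg\")",
  fence,
  "five six"
), demo)
estimate <- estimate_reading_time(demo)
print(estimate$label)
```

Inside the post itself, a chunk near the top with `echo=FALSE, results='asis'` could call the function on `knitr::current_input()` and `cat()` the label, so the estimate updates on every knit; placing it directly under the title block may need additional template work.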

Why does the test-RMarkdown in R-Studio give an html that does not show any results after pressing the knit-button?

I opened a new R Markdown file in RStudio and got the default small working example.
---
title: "test"
author: "Katharina Zweig"
date: "30. Januar 2016"
output: html_document
---
This is an R Markdown document. Markdown is a simple formatting syntax
for authoring HTML, PDF, and MS Word documents. For more details on using
R Markdown see <http://rmarkdown.rstudio.com>.
When you click the **Knit** button a document will be generated that
includes both content as well as the output of any embedded R code chunks
within the document. You can embed an R code chunk like this:
```{r}
summary(cars)
```
You can also embed plots, for example:
```{r, echo=FALSE}
plot(cars)
```
Note that the `echo = FALSE` parameter was added to the code chunk to
prevent printing of the R code that generated the plot.
It says you only need to press the Knit button to create an HTML file
containing the text, the code, and the results of the code. Instead, I got
some long error logs that were hardly helpful. Changing the output to PDF
or Word did not help either; the result was the same: the text was there,
the code was there, but there were no results from running the code. When
the output was produced, the original file vanished.
What is wrong?
When the Knit button is used on a file that has not yet been saved, RStudio asks under which name to save it. The file needs to be saved as an Rmd file: just give no extension and RStudio will do it right. Then the file does not vanish and the resulting document contains the results of the R commands. I had thought it was asking where to save the output, so I gave it the extension of the output file, i.e., myfile.html, myfile.pdf, or myfile.doc.
Try this chunk option:
{r, results='asis'}
summary(cars)
You can also embed plots, for example:
{r, echo=FALSE, results='asis'}
plot(cars)
The results = 'asis' chunk option should output the tables and graphs; if not, please let me know.

Resources