R package to knit a markdown document given some data - r

I am writing a basic R package that reads in data from a user-specified database and produces a markdown report with predefined graphs, tables, etc. I have placed the .Rmd file in the R folder, and have a user-level function that reads in the data and knits it.
# create_doc.R
create_doc <- function(directory = NULL,
                       database_name_name) {
  # Ask the user for a folder when none is supplied
  if (is.null(directory)) {
    directory <- tcltk::tclvalue(tcltk::tkchooseDirectory(
      title = "Choose Folder for Input and Output"))
  }
  rmarkdown::render("R/doc_generator.Rmd", output_dir = directory)
}
This works fine on my computer, but when I build the package, the .Rmd file has been deleted. This means I can't give it to other users for use on other computers. I realise that the R folder may not be the correct place for this file (I guess it deletes any files not ending in .R), but I'm not sure where else to put it. It is not package documentation, it creates the end result when using the package.
Googling has not helped so far. Is it possible to knit a document using a function in an R package? If yes, what am I doing wrong? If no, are there any other suggestions on how to achieve this?
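One way to approach this, sketched under assumptions: files under inst/ are copied into the installed package, so the .Rmd could live in inst/rmd/ (a folder name chosen here for illustration) and be located at run time with system.file(). The package name "mypkg" and the helper find_template() are hypothetical.

```r
# Locate a file shipped with an installed package.
# system.file() returns "" when nothing matches, so fail loudly in that case.
find_template <- function(name, package, subdir = "rmd") {
  path <- system.file(subdir, name, package = package)
  if (!nzchar(path)) {
    stop("File '", name, "' not found in '", subdir,
         "' of package '", package, "'")
  }
  path
}

# Hypothetical user-level function: "mypkg" and inst/rmd/ are assumptions.
create_doc <- function(directory, package = "mypkg") {
  template <- find_template("doc_generator.Rmd", package)
  rmarkdown::render(template, output_dir = directory)
}

# The lookup itself can be tried against a file every installed package has:
basename(find_template("package.rds", "stats", subdir = "Meta"))
```

Failing early when system.file() returns an empty string avoids a less obvious error later inside rmarkdown::render().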

Related

Sourcing R files housed within an R project and maintaining relative paths

Okay, so I like to use R projects in Rstudio for scripts and data that I'm working with. However, let's say I want to source those scripts in another directory...R does not detect the .Rproj file unless the script is called from the directory where it is housed. Is there any way to source an R script that is part of an R project from another directory?
This is relevant as I have a system where I perform analyses and make figures in one directory, but then produce LaTeX documents that use those figures in another directory. I like to be able to source the R scripts that make the figures and save them to the directory where I'm writing in LaTeX.
Here's an MRE:
With an R project already created in directory (done via Rstudio)...let's call it ~/test.
Create some data:
a <- 1:10
dat <- data.frame(a = a, b = a + rnorm(length(a), 10, 2))
save(dat, file = "test.RData")
Place the following script in ~/test. Let's call it test.R.
load("test.RData")
pdf(file = "plot.pdf")
plot(b ~ a, data = dat)
dev.off()
Works great, right? But if we try the following from any other directory R can't figure it out.
cd ~
Rscript ~/test/test.R
Any thoughtful solutions? I suppose it's easy enough to just setwd() in the script that I'm sourcing the original script from, but this sort of defeats the whole purpose of using R projects.
You could use setwd("~/test/") at the beginning of the script and if necessary change it back later on.
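If the script is run from another R script rather than via Rscript, source() has a chdir argument that temporarily switches the working directory to the script's folder and restores it afterwards. A minimal sketch, using a throwaway folder under tempdir() in place of ~/test (the file names are illustrative):

```r
# Stand-in for the project folder
proj <- file.path(tempdir(), "test_project")
dir.create(proj, showWarnings = FALSE)

# Data and a script that refer to each other by relative path only
saveRDS(data.frame(a = 1:3), file.path(proj, "dat.rds"))
writeLines(
  c('dat <- readRDS("dat.rds")',
    'write.csv(dat, "out.csv", row.names = FALSE)'),
  file.path(proj, "test.R")
)

# Run the script from wherever we are; chdir = TRUE makes the relative
# paths inside test.R resolve against the script's own folder
old_wd <- getwd()
source(file.path(proj, "test.R"), chdir = TRUE)

# The working directory is restored once source() returns
identical(getwd(), old_wd)
file.exists(file.path(proj, "out.csv"))
```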

Put figure directly into Knitr document (without saving file of it in folder) Part 2

I am extending a question I recently posted here (Put figure directly into Knitr document (without saving file of it in folder)).
I am writing an R package that generates a .pdf file for users that outputs summarizations of data. I have a .Rnw script in the package (here, my MWE of it is called test.Rnw). The user can do:
1) knit("test.Rnw") to create a test.tex file
2) "pdflatex test.tex" to create the test.pdf summary output file.
The .Rnw file generates many images. Originally, these were all saved in the current directory. Having these images saved to the directory (along with the .aux and .log files created when pdflatex is run on the .tex file) is not as tidy as it could be, since users must remember to delete the image files. I also worry that this untidiness may cause issues when scripts are run multiple times.
So, in my previous post, we improved the .Rnw file by saving the images to a temporary folder. I have been told the files in the temporary folder get deleted each time a new R session is opened. However, I still worry about certain things:
1) I feel I may need to insert a line, like the one on line 19:
system(sprintf("%s", paste0("rm -r ", temppath, "/*")))
to automatically delete the files in the temporary folder each time the .Rnw file is run (so that the images do not only get deleted each time R gets restarted). This will keep the current directory clean of the images, and the user will not have to remember to manually delete the images. However, I do not know if this "solution" will pass CRAN standards to have a line to delete files in the temporary folder. The reason is that it deletes files in the user's system, which could cause problems if other programs are writing files to the temporary folder. I feel I have read about CRAN not allowing files to be written/deleted from the user's computer for obvious reasons. How strict would CRAN be about such a practice? Is there a safe way to go about it?
2) If writing and deleting the image files in a temporary file will not work, what is another way to accomplish the same effect (run the script without having cumbersome image files created in the folder)? Is it possible to instead have the images directly embedded in the output file (not needing to be saved to any directory)? I am pretty sure this is not possible. However, I have been told it is possible to do so with .Rmd, and that I could convert my .Rnw to .Rmd. This may be difficult because the .Rnw file must follow certain formats (text and margins) for the correct output, and it is very long. Is it possible to make use of the .Rmd capability (of inserting images directly into the output) only for the chunks that generate images, without rewriting the entire .Rnw file?
Below is my MWE:
\documentclass[nohyper]{tufte-handout}
\usepackage{tabularx}
\usepackage{longtable}
\setcaptionfont{% changes caption font characteristics
\normalfont\footnotesize
\color{black}% <-- set color here
}
\begin{document}
<<setup, echo=FALSE>>=
library(knitr)
library(xtable)
library(ggplot2)
# Specify directory for figure output in a temporary directory
temppath <- tempdir()
# Erase all files in this temp directory first?
#system(sprintf("%s", paste0("rm -r ", temppath, "/*")))
opts_chunk$set(fig.path = temppath)
@
<<diamondData, echo=FALSE, fig.env = "marginfigure", out.width="0.95\\linewidth", fig.cap = "The diamond dataset has variables depth and price.",fig.lp="mar:">>=
print(qplot(depth,price,data=diamonds))
@
<<echo=FALSE,results='asis'>>=
myDF <- data.frame(a = rnorm(1:10), b = letters[1:10])
print(xtable(myDF, caption= 'This data frame shows ten random variables from the distribution and a corresponding letter', label='tab:dataFrame'), floating = FALSE, tabular.environment = "longtable", include.rownames=FALSE)
@
Figure \ref{mar:diamondData} shows the diamonds data set, with the
variables price and depth. Table \ref{tab:dataFrame} shows letters a through j
corresponding to a random variable from a normal distribution.
\end{document}
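A possible alternative to running rm -r on the whole temp folder, offered as a sketch rather than a statement about CRAN policy: create a dedicated subfolder inside tempdir() for the figures and remove only that subfolder with unlink(). The cleanup would have to run after pdflatex has finished, since the .tex file still refers to the images. The dummy PDF below stands in for what knitr would write.

```r
# A private figure folder inside this session's temp directory
fig_dir <- tempfile(pattern = "figs_")
dir.create(fig_dir)

# In the .Rnw setup chunk one would then use something like the line below;
# the trailing slash keeps the figure files inside the folder:
# knitr::opts_chunk$set(fig.path = paste0(fig_dir, "/"))

# Stand-in for a figure that knitr would write there
pdf(file.path(fig_dir, "diamondData-1.pdf"))
plot(1:10)
dev.off()
list.files(fig_dir)

# After the final PDF has been compiled, remove only this folder
unlink(fig_dir, recursive = TRUE)
dir.exists(fig_dir)
```

Because the folder name comes from tempfile(), nothing written to the temp directory by other programs is touched.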

Specify output directory for R script with knit_hooks$set(purl = hook_purl)

I understand that we shouldn't purl() a chunk with knitr but instead use knit_hooks$set(purl = hook_purl). That works, but it puts the R script in the working directory. I would like to put it in an R/ directory. It's probably due to my own incompetence, but I couldn't find anything about specifying the directory for the R script (I looked in the R documentation as well as several places online). Anyone have any ideas? I'm knitting from within RStudio, by the way.
You can generate the script under the current directory, and file.rename() it to the R/ directory.
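A sketch of that suggestion. It assumes the purl hook has written a script such as report.R (a hypothetical name) into the working directory; a placeholder file is created here so the snippet runs on its own, and move_script() is an illustrative helper, not part of knitr.

```r
# Move a generated script into a target directory, creating it if needed
move_script <- function(script, dest_dir = "R") {
  dir.create(dest_dir, showWarnings = FALSE, recursive = TRUE)
  target <- file.path(dest_dir, basename(script))
  if (!file.rename(script, target)) {
    stop("Could not move ", script, " to ", target)
  }
  invisible(target)
}

# Demo in a throwaway folder so nothing in a real project is touched
demo_dir <- tempfile("purl_demo_")
dir.create(demo_dir)
script <- file.path(demo_dir, "report.R")   # stands in for the purled script
writeLines("x <- 1", script)

moved <- move_script(script, dest_dir = file.path(demo_dir, "R"))
file.exists(moved)
```

In a real project the call would be closer to move_script("report.R") after knitting, with the default dest_dir = "R".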

Including Rnw files within a package

I am writing a package and the sole purpose of this package is to create reports. I am using knit to generate the reports from a .Rnw file. This all happens within a function in the package. e.g.
create_report <- function(data) {
knit2pdf(input = "myreport.Rnw", output = "myreport.tex")
# The Rnw in the knit2pdf function uses the data passed to this function
}
My question is simple. Where within my package folders do I store the .Rnw file? Currently my package has the following folders:
.Rproj.user
data
man
R
I am just not sure where my Rnw scripts should go? Do I need another folder called LaTeX for example? This is like having a separate folder for C++ scripts, for example.
Note, I am not looking to create a vignette. I know how to do this. This package is used to do some data manipulation and then generate a report on the data.
I have tried to lay everything out as clearly as I can as some questions I have asked on here before have been misinterpreted. Please ask if anything is unclear.
To answer this question:
Put the .Rnw files in ./pkgname/inst/latex. When the package is installed, everything under inst/ is copied to the top level of the installed package, so the files end up in its latex folder. You can then get the path to a template with system.file("latex", "mytemplate.Rnw", package = "pkgname").
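Building on that answer, one cautious pattern (an assumption on my part, not something the answer states) is to copy the template out of the installed package into a working folder before knitting, since the library folder may not be writable. stage_template() is a hypothetical helper, and a dummy file stands in for the real template so the sketch runs on its own.

```r
# Copy a template into a working directory and return the path of the copy
stage_template <- function(template, workdir = tempfile("report_")) {
  dir.create(workdir, showWarnings = FALSE, recursive = TRUE)
  staged <- file.path(workdir, basename(template))
  if (!file.copy(template, staged, overwrite = TRUE)) {
    stop("Could not copy ", template, " to ", workdir)
  }
  staged
}

# Dummy template; in the package this path would come from
# system.file("latex", "mytemplate.Rnw", package = "pkgname")
template <- tempfile(fileext = ".Rnw")
writeLines("\\documentclass{article}", template)

staged <- stage_template(template)
file.exists(staged)

# The staged copy can then be passed to knitr::knit2pdf()
```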

How to display images in Markdown on github generated from knitr without using external image hosting?

I like uploading repositories to github that include multiple R Markdown and Markdown files.
Here is an example of such a markdown file on github. And here's a screen grab.
The problem is that images do not display. You can click on the image, and you will go to where the file is stored.
The file referenced is:
https://github.com/... /blob/.../myfigure.png
whereas I presume it needs to reference
https://github.com/... /raw/.../myfigure.png
Things I considered:
imgur: I could use external image hosting (e.g., see this example) by adding the following code:
```{r setup}
opts_knit$set(upload.fun = imgur_upload) # upload all images to imgur.com
```
However, for various reasons I don't want to do this (I have trouble uploading when behind a firewall; it's slow; it creates an unnecessary dependency)
Rpubs: There's also RPubs which is quite cool. However, at time of posting it seems more suited to single markdown documents rather than multiple R markdown documents. And it doesn't provide such a close link between source R Markdown and the Markdown document.
Question
Is there a workflow for using R Markdown and knitr to produce Markdown files which when uploaded to github permit the Markdown file to display images stored in the github repository?
This used to be part of the minimal example, use
opts_knit$set(base.url='https://github.com/.../raw/.../')
See the changes here and here.
Also see http://yihui.name/knitr/options.
EDIT (with an update to restore base.url to its former value)
Regarding switching, you could define a function as
create_gitpath <- function(user, repo, branch = 'master'){
paste0(paste('https://github.com', user, repo, 'raw', branch, sep = '/'),'/')
}
my_repo <- create_gitpath(user, repo)
knit.github <- function(..., git_url ){
old_url <- opts_knit$get('base.url')
on.exit(opts_knit$set(base.url = old_url))
opts_knit$set(base.url = git_url)
knit(..., envir = parent.frame())
}
Run with knit until you want to push to github then run knit.github(..., git_url = my_repo)
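To make the helper concrete, here is what it returns for placeholder values ("someuser" and "somerepo" are made up). The function is repeated only so the snippet runs on its own.

```r
# Same helper as above, repeated so the snippet is self-contained
create_gitpath <- function(user, repo, branch = "master") {
  paste0(paste("https://github.com", user, repo, "raw", branch, sep = "/"), "/")
}

# "someuser" and "somerepo" are placeholders
my_repo <- create_gitpath("someuser", "somerepo")
my_repo

# base.url is prefixed to each figure path, so a figure saved as
# figure/myfigure.png would be referenced roughly as:
paste0(my_repo, "figure/myfigure.png")
```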
What about the following code at the beginning of your markdown file?
``` {r setup,echo=FALSE,message=FALSE}
gitsubdir <- paste(tail(strsplit(getwd(),"/")[[1]],1),"/",sep="")
gitrep <- "https://github.com/mpiktas/myliuduomenis.lt"
gitbranch <- "master"
opts_knit$set(base.url=paste(gitrep,"raw",gitbranch,gitsubdir,sep="/"))
```
It is possible to tweak it so that gitrep and gitbranch will be reported by git. Here I assumed that I am one directory level below the main git repository directory. Again this might be tweaked to accommodate more complicated scenarios.
I've tested on github, here is the Rmd file and corresponding md file.
