Blogdown and Data Files - r

I'm just starting with Blogdown and am running into an issue. I wrote a post where I've read a .CSV file and then generated a Kable table from its contents. No sweat. However, when I build the site that CSV file is copied to "public/post/data". I do not want that file to be made available to the public since it contains proprietary information (the columns I've pulled for this table are public info, but other columns are not). I do not think that the CSV file is needed on the public webpage once it is built, but I cannot find a way to not upload it.
I've tried putting the .CSV file in different locations (the "resources" folder and another folder I created named "data_files"). I added .CSV to the ignoreFiles param in config.toml: ignoreFiles = ["\.Rmd$", "\.Rmarkdown$", "_files$", "_cache$", "\.csv$"] Nothing, though, seems to help.
Is there some way to use a .CSV file to generate a table, ggplot, etc. without having that file uploaded to the public website?
Here's the code chunk where I read the CSV file in case that will help.
```{r peerList, echo=FALSE, results='asis'}
dataloc <- here("resources","peers.csv")
df <- read.csv(dataloc)
names <- select(df,Name,City,State,County,Zip) %>%
mutate(Zip = str_sub(Zip,1,5)) %>% # change the 9-digit zips to 5
mutate(County = str_replace(County,"County","")) %>% # del the word 'county'
arrange(State,Name)
kable(names, "html") %>%
kable_styling(bootstrap_options = "striped",
full_width = F,
position = "center"
)
```

Related

How to extract images from word and powerpoint using media_extract in r?

I am working in rmarkdown to produce a report that extracts and displays images extracted from word and powerpoint.
To do this, I am using the officer package. It has a function called media_extract which can 'extract files from an rdocx or rpptx object'.
I have two issues:
How to view or use the image after I have located it.
In Word, how to locate the image without the media_file column.
I have been able to locate an image in pptx as follows: the pptx_summary function creates a data frame with a media_file column, which holds a file path for image elements. That media_file value is then passed as the path argument of media_extract to locate the image. See the example code from the package documentation below:
example_pptx <- system.file(package = "officer",
"doc_examples/example.pptx")
doc <- read_pptx(example_pptx)
content <- pptx_summary(doc)
image_row <- content[content$content_type %in% "image", ]
media_file <- image_row$media_file
png_file <- tempfile(fileext = ".png")
media_extract(doc, path = media_file, target = png_file)
However, when I run media_extract it returns 'TRUE', which is the example output, but I am unsure how to add the image to my report. I've tried assigning the media_extract as a value eg
image <- media_extract(doc, path = media_file, target = png_file)
but this returns 'FALSE'.
How do I include the image as an image in my report?
The second issue I'm having is how to locate an image in Word. The documentation for media_extract says it can be used to extract images from both .docx and .pptx, but I have only managed to get it to work for the latter. I haven't been able to create a file path for .docx.
The file path is generated using either docx_summary or pptx_summary, depending on the file type; each creates a data frame summarising the file. The pptx_summary data frame includes a media_file column, which holds a file path for the image. The docx_summary data frame doesn't include this column. Another Stack Overflow post proposed a solution using the word/media/ subdirectory, which seemed to work, but I'm not sure what this means or how to use it.
How do I extract an image from a word doc, using word/media/ subdir as the media path?
media_extract() copies a media file from the document to the location you specify. The extracted image can then be shown in R Markdown in at least 3 ways:
knitr::include_graphics()
regular markdown
magick::image_read()
They are illustrated below:
---
title: "media_extract usage"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
library(officer)
library(flextable)
example_pptx <- system.file(package = "officer",
"doc_examples/example.pptx")
doc <- read_pptx(example_pptx)
content <- pptx_summary(doc)
image_row <- content[content$content_type %in% "image", ]
media_file <- image_row$media_file
png_file <- tempfile(fileext = ".png")
media_extract(doc, path = media_file, target = png_file)
```
## include_graphics
```{r out.width="200px"}
knitr::include_graphics(png_file)
```
## markdown
You can't use `tempfile()` here - path is better when defined as relative.
Let's write it to "./file.png".
```{r results='hide'}
media_extract(doc, path = media_file, target = "file.png")
```
![](file.png){style="width:200px;"}
## magick
```{r out.width="200px"}
magick::image_read(png_file)
```
I have continued to research the second issue and found an answer, so thought I would share!
The difficulty I was having extracting images from docx was due to the absence of a media_file column in the summary data frame (produced using docx_summary), which is used to locate the desired image. This column is present in the data frame that pptx_summary produces for pptx files and is used in the example code from the package documentation.
In the absence of this column you instead need to locate the image using its path inside the document's internal folder structure (a docx is an archive of XML and media files), which looks like: media_path <- "/word/media/image3.png"
If you want to see what this structure looks like, right-click your document > 7-Zip > Extract files... and a folder containing the document contents will be created. Otherwise, just change the image number to select the desired image.
Note: sometimes images have names that do not follow the image.png format so you may need to extract the files to find the name of the desired image.
Example using media_extract with docx.
#extracting image from word doc using officer package
report <- read_docx("/Users/user.name/Documents/mydoc.docx")
png_file <- tempfile(fileext = ".png")
media_file <- "/word/media/image3.png"
media_extract(report, path = media_file, target = png_file)
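As an alternative to extracting the document with 7-Zip, the image names can also be listed from R, since a .docx file is a zip archive. The following is a minimal sketch, not part of the original answer: the helper names `keep_media_entries()` and `list_docx_media()` and the example path are made up for illustration. The archive lists entries without a leading slash, so add one if you want the same form as the path used above.

```r
# Keep only the archive entries that live under word/media/.
keep_media_entries <- function(entries) {
  entries[startsWith(entries, "word/media/")]
}

# List the media files stored inside a .docx (hypothetical helper).
# utils::unzip(list = TRUE) returns a data frame with a Name column.
list_docx_media <- function(docx_path) {
  entries <- utils::unzip(docx_path, list = TRUE)$Name
  keep_media_entries(entries)
}

# Usage (path is an example only):
# media <- list_docx_media("/Users/user.name/Documents/mydoc.docx")
# media_file <- paste0("/", media[1])
```

This only reads the archive index, so nothing is written to disk until media_extract() is called.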

Knit with parameters in R Markdown by selecting shapefile (.shp) as file input

I am trying to render an R Markdown script to a PDF using Knit with parameters. I want other people to be able to render the report using a UI generated by the YAML header. I would like to use a shiny control (file) as a parameter input instead of the generic text one (i.e. the UI opens up a window in which the user can select the file from a File Explorer).
Minimal reproducible example:
I first create a copy of the sf package's nc.shp so that I can easily find it when testing the UI:
library(sf)
sf_nc <- sf::st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)
sf::st_write(sf_nc, 'C:/Temp/nc_temp.shp')
Here is the R Markdown (.Rmd) file
---
title: "Params_Test"
output: pdf_document
params:
  shp_program:
    input: file
    label: 'NC Shapefile'
    value: 'C:/Temp/nc_temp.shp'
    multiple: FALSE
    buttonLabel: 'browse shapefiles'
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
```{r, eval = TRUE, include = TRUE}
library(sf)
library(ggplot2)
sf_nc_temp <- sf::st_read(params$shp_program)
plot <- ggplot2::ggplot(sf_nc_temp) +
geom_sf(aes(color = NAME)) +
geom_sf_text(aes(label = NAME))
plot
```
The tool runs fine when I just Knit using the default (Knit drop down icon > Knit with parameters > Knit). This uses the string to the shapefile path as text.
However I get the following error message when I try to select the shapefile from the UI: Line 20 Error: Cannot open 'C:\Users\username\AppData\Local\Temp\1\Rtmp8gVT2L\file2784148636a\0.shp"; The source could be corrupt or not supported. See st_drivers() for a list of supported formats.
I tried replacing the chunk based on: How do I access the data from a file passed as parameters in a RMarkdown document?
library(sf)
library(ggplot2)
cat(params$shp_program)
c <- sf::st_read(params$shp_program)
c
plot <- ggplot2::ggplot(c) +
geom_sf(aes(color = NAME)) +
geom_sf_text(aes(label = NAME))
plot
As @lbusett mentioned in their comment, you're selecting only one part of the shapefile. The *.shp file is only one component of a shapefile, which comprises several files (.shp, .shx, .dbf, etc.).
One way to work around this is to adjust your parameters so that multiple = TRUE, which will allow you to select all of the files associated with a particular shapefile (i.e. place.shp, place.shx, place.dbf, etc.)
---
title: "Params_Test"
output: pdf_document
params:
  shp_program:
    input: file
    label: 'NC Shapefile'
    value: 'C:/Temp/nc_temp.shp'
    multiple: TRUE
    buttonLabel: 'browse shapefiles'
---
Later in your code, you will need to identify the respective file paths of each file and copy them to your working directory. This will ensure that they all share the same name and location.
Set the working directory and then use str_which() to identify the appropriate index of params$shp_program for each respective filetype, as follows:
```{r, eval = TRUE, include = TRUE}
library(sf)
library(ggplot2)
library(stringr)  # provides str_which()
setwd("C:/temp")
# Find the position of each component among the uploaded file paths
shp_index <- str_which(params$shp_program, "\\.shp$")
shx_index <- str_which(params$shp_program, "\\.shx$")
dbf_index <- str_which(params$shp_program, "\\.dbf$")
prj_index <- str_which(params$shp_program, "\\.prj$")
# Copy the components side by side under a common base name
file.copy(params$shp_program[shp_index], "temp_shape.shp")
file.copy(params$shp_program[shx_index], "temp_shape.shx")
file.copy(params$shp_program[dbf_index], "temp_shape.dbf")
file.copy(params$shp_program[prj_index], "temp_shape.prj")
sf_nc_temp <- sf::st_read("temp_shape.shp")
plot <- ggplot2::ggplot(sf_nc_temp) +
  geom_sf(aes(color = NAME)) +
  geom_sf_text(aes(label = NAME))
plot
```
When using parameters to load files through Shiny, R copies the selected files over to a temporary directory and renames them. Thus, if you selected "place.shp", "place.shx", and "place.dbf" they would be copied to separate subfolders in your local temp directory as "0.shp", "1.shx", and "2.dbf". The original file path is lost in this process, so it prevents people after you from seeing which files you selected. If your workflow requires peer review, this can be a deal breaker.
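Because the uploaded components keep their extensions even though they are renamed, the copy step can also be driven by the extension alone. The sketch below is one possible way to do that without changing the working directory; `collect_shapefile_parts()`, `dest_dir` and `base` are illustrative names that are not part of the original answer, and `paths` stands in for params$shp_program.

```r
# Copy every uploaded shapefile component into one folder under one base name,
# so that the .shp, .shx, .dbf, ... files sit side by side again.
collect_shapefile_parts <- function(paths, dest_dir = tempdir(), base = "temp_shape") {
  ext <- tolower(tools::file_ext(paths))  # "shp", "shx", "dbf", ...
  targets <- file.path(dest_dir, paste0(base, ".", ext))
  copied <- file.copy(paths, targets, overwrite = TRUE)
  if (!all(copied)) {
    stop("Could not copy: ", paste(paths[!copied], collapse = ", "))
  }
  targets[ext == "shp"]  # the path to pass to sf::st_read()
}

# Usage inside the report chunk:
# sf_nc_temp <- sf::st_read(collect_shapefile_parts(params$shp_program))
```

This handles whichever components were selected, rather than a fixed list of four extensions.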
In addition, you may run into the upload size limit, which defaults to 5 MB and has to be raised in code. For example, placing the following line at the top raises the limit to 30 MB:
options(shiny.maxRequestSize = 30*1024^2)
As a result of these issues, I find it easier to use the file.choose() function instead of parameters. Doing so will allow you to select just the .shp file while preserving the original file path, so that R will know where the rest of the shapefile's component files are located.

How can I make the output of my function (several ggplot2 graphs) an html file (displaying those graphs)?

I'm writing a personal use package which trains/tests models, and finally runs a myriad of LIME and DALEX explanations on them. I save these as their own ggplot2 objects (say lime_plot_1), and at the end of the function these are all returned to the global environment.
However, what I would like to have happen is that, at the end of the function, not only would I have these graphs in the environment but a small html report would also be rendered - containing all the graphs that were made.
I would like to point out that while I do know I could do this by simply using the function within an Rmarkdown or Rnotebook, I would like to avoid that as I plan on using it as an .R script to streamline the whole process (since I'll be running this with a certain frequency), and from my experience running big chunks in .Rmd tends to crash R.
Ideally, I'd have something like this:
s_plot <- function(...){
1. constructs LIME explanations
2. constructs DALEX explanations
3. saves explanations as ggplot2 objects, and list them under graphs_list
4. render graphs_list as an html file
}
1, 2, and 3 all work but I haven't found a way to tackle 4. that doesn't include doing the whole process in a .Rmd file.
EDIT: Thanks to @Richard Telford's and @Axeman's comments, I figured it out. Below is the function:
s_render <- function(graphs_list = graphs_list, meta = NULL, cacheable = NA){
currentDate <- Sys.Date()
rmd_file <- paste("/path/to/folder",currentDate,"/report.Rmd", sep="")
file.create(rmd_file)
graphs_list <- c(roc_plot, prc_plot, mp_boxplot, vi_plot, corr_plot)
c(Yaml file headers here, just like in a regular .Rmd) %>% write_lines(rmd_file)
rmarkdown::render(rmd_file,
params = list(
output_file = html_document(),
output_dir = rmd_file))}
First, create a simple Rmarkdown file, that takes a parameter. The only objective of this file is to create the report. You can for instance pass a file name:
---
title: "test"
author: "Axeman"
date: "24/06/2019"
output: html_document
params:
  file: 'test.RDS'
---
```{r}
plot_list <- readRDS(params$file)
lapply(plot_list, print)
```
I saved this as test.Rmd.
Then in your main script, write the plot list to a temporary file on disk, and pass the file name to your markdown report:
library(ggplot2)
plot_list <- list(
qplot(1:10, 1:10),
qplot(1:10)
)
file <- tempfile()
saveRDS(plot_list, file)
rmarkdown::render('test.Rmd', params = list(file = file))
An .html file with the plots is now on your disk.
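The two steps above (save the plot list, then render) can be wrapped into a single function so that the report is produced at the end of a script. This is only a sketch under assumptions: `save_plot_list()` and `render_plot_report()` are hypothetical names, and the report template is assumed to be the test.Rmd shown earlier, which reads params$file.

```r
# Write the list of plots to a temporary .rds file and return its path.
save_plot_list <- function(plot_list, dir = tempdir()) {
  rds_path <- tempfile(pattern = "plots_", tmpdir = dir, fileext = ".rds")
  saveRDS(plot_list, rds_path)
  rds_path
}

# Render the parameterised report for a list of plots (hypothetical wrapper).
# `rmd` is assumed to be a template that reads params$file, like test.Rmd.
render_plot_report <- function(plot_list, rmd = "test.Rmd",
                               output_file = "report.html") {
  rds_path <- save_plot_list(plot_list)
  on.exit(unlink(rds_path), add = TRUE)  # remove the temporary file afterwards
  rmarkdown::render(rmd, output_file = output_file,
                    params = list(file = rds_path), quiet = TRUE)
}

# Usage at the end of the main function:
# render_plot_report(graphs_list)
```

Passing a file path rather than the plots themselves keeps the parameters simple, which is the same design choice the answer makes.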

knitr: exporting to html file but keeping style

I just found out the awesome knitr library in R, when viewing the result in the viewer it seems nice. However, when I write this to a html file the style is lost.
Code
library(knitr)
library(kableExtra)
some.table <-
data.frame (
x = rep(1,3),
y = rep(1,3)
)
some.table
x <- kable(some.table, format = "html") %>%
kable_styling(bootstrap_options = "striped", full_width = F, position = "left")
x
file <- file('test.html')
write(x, file)
Table in viewer
Table in browser
How can I export the table with the same style to a html file?
Note that I have more data in the html file, so I should be able to append it.
Response to comment(s)
User: @Hao
When I use 'inspect element' in the Rstudio viewer, I can find this link to a stylesheet:
However the code herein seems to be huge as it is 582.298 characters.
The typical way of doing this is to put the code inside a rmarkdown document. It will handle everything for you.
The only case where you need the save_kable function from kableExtra is when you have lots of tables and want to save them as fragments. In that case, you can use
library(kableExtra)
cars %>%
  kable() %>%
  kable_styling() %>%
  # the file name is only an example; save_kable() needs a target file
  save_kable(file = "table1.html", self_contained = TRUE)

rmarkdown shiny user input in chunk?

I have a shiny app that allows the user to download an HTML file (knitted from a .Rmd file) that includes the code used to run the analysis based on all the user inputs. I am trying to write the base .Rmd file that gets altered when user inputs vary. I am having trouble including user input variables (e.g. input$button1) into R code chunks. Say the user input for input$button1 = "text1".
```{r}
results <- someFun(input$button1)
```
And I'd like to have it knitted like this:
```{r}
results <- someFun('text1')
```
Every time I download the knitted HTML though, I get input$button1 getting written to file. I would also like to be able to produce an .Rmd file that is formatted with this substitution. It seems like knit_expand() might be the key, but I can't seem to relate available examples to my specific problem. Is the proper way to knit_expand() the whole .Rmd file and specify explicitly all the parameters you want subbed in, or is there a more elegant way within the .Rmd file itself? I would prefer a method similar to this, except that instead of using the asis engine, I could use the r one. Any help would be greatly appreciated. Thanks!
Got it. Solution below. Thanks to Yihui for the guidance. The trick was to knit_expand() the whole .Rmd file, then writeLines() to a new one, then render. With hindsight, the whole process makes sense. With hindsight.
For the example, p1 is a character param 'ice cream' and p2 is an integer param 10. There is a user-defined param in ui.R called input$mdType that is used to decide on the format provided for download.
Rmd file:
Some other text.
```{r}
results <- someFun("{{p1}}", {{p2}})
```
in the downloadHandler() within server.R:
content = function(file) {
  src <- normalizePath('userReport.Rmd')
  # temporarily switch to the temp dir, in case you do not have write
  # permission to the current working directory
  owd <- setwd(tempdir())
  on.exit(setwd(owd))
  file.copy(src, 'userReport.Rmd', overwrite = TRUE)
  # substitute the {{p1}} and {{p2}} placeholders with the user's inputs
  exp <- knitr::knit_expand('userReport.Rmd', p1 = input$p1, p2 = input$p2)
  writeLines(exp, 'userReport2.Rmd')
  out <- rmarkdown::render('userReport2.Rmd', switch(input$mdType,
    PDF = rmarkdown::pdf_document(),
    HTML = rmarkdown::html_document(),
    Word = rmarkdown::word_document()))
  file.rename(out, file)
}
Resulting userReport2.Rmd before rendering:
```{r}
results <- someFun("ice cream", 10)
```
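To see the substitution on its own, outside of Shiny, knit_expand() can be run on a single line of template text through its text argument. This is a small sketch using the example values from above; `template` and `expanded` are just illustrative variable names, and someFun is never called because the result is plain text.

```r
library(knitr)

# One line of the Rmd template, with {{ }} placeholders for the parameters.
template <- 'results <- someFun("{{p1}}", {{p2}})'

# knit_expand() evaluates each placeholder and pastes the value into the text.
expanded <- knit_expand(text = template, p1 = "ice cream", p2 = 10)

cat(expanded, "\n")
```

The quotes around {{p1}} belong to the template, which is why the character parameter ends up quoted in the generated code while the number does not.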
