In R flextable can complex symbols appear in column headings - r

Using flextable to make a table, I would like a column heading to be pi (with a hat) and an i subscript. With flextable 6 I see that I can add basic formatting to titles using the compose() function. I can get pi sub i easily but I don't see how to put the hat over the pi.
test <- data.frame(hat_pi_i= c(0.1, 0.9))
# devtools::install_github("davidgohel/flextable") # version 6.x
# devtools::install_github("davidgohel/officer") # for flextable 6 to work
library(flextable)
# compose conflicts with purrr & igraph
flextable(test) %>%
compose(part = "header", j = "hat_pi_i",
value = as_paragraph("hat π", as_sub("i")))
Is it possible to tweak the title above to put the hat on the pi instead of as a word next to it?

There's a hackish way to do it:
flextable(test) %>%
compose(part = "header", j = "hat_pi_i",
value = as_paragraph("\U1D70B\U0302", as_sub("i")))
This uses Unicode characters: \U1D70B is the math style pi, and \U0302 says to put a circumflex on the previous character. You could also do the first char with π, but it doesn't look as good, because (at least on my system) the circumflex ends up misplaced:

Related

Using R/exams in bookdown document (especially for HTML output)

I have created a "book" using bookdown. I would love to be able to add interactive quizzes, without needing shiny etc.
Is it possible using R/exams (http://www.R-exams.org/) with bookdown? I'm mainly interested in the HTML output; PDF output a bonus but hardly essential. The web page offers promise:
Based on (potentially) dynamic exercise templates large numbers of personalized exams/quizzes/tests can be created for various systems: [...] and the possibility to create custom output (in PDF, HTML, Docx, ...).
Exercise types include multiple-choice or single-choice questions, numeric or text answers, or combinations of these. Formatting can be done either in Markdown or LaTeX with the possibility to generate dynamic content using R, e.g., random numbers, graphics, data sets, or shuffled text blocks.
It sounds great. Does anyone know if it is possible to use exams with bookdown (even if just some features)?
If it is possible: how? Any pointers?
If it is not possible: does anyone know a way to do something similar?
General remarks
R/exams is indeed extensible leveraging its building blocks is relatively easy. The workhorse function underlying all the exams2xyz() interfaces is called xexams(). It proceeds in four steps:
sweave: The exercise files are copied to a temporary directory and then run through R, by default using xweave() which provides a unified convenience interface to utils::Sweave() (for Rnw files) and knitr::knit() (for Rmd files).
read: The resulting weaved files are read into R, by default using read_exercise(). For each exercise this yields a list of question, questionlist, solution, solutionlist, metainfo, and supplements. All elements are always there but may be empty, e.g., when there is no solution environment provided in the exercise or when there are no supplementary files.
transform: By default this is empty but can be used to transform the exercise list elements above to a desired format, e.g., HTML.
write: By default this is empty, but can be used to write out results for each of the n replications of the exam.
Embedding exercise texts in Markdown
When you write your exercises in R/Markdown (Rmd) files you can easily run them through xexams() to get some randomized version of them. As an example, let's consider the numeric (num) and single-choice (schoice) version of the derivative exercise, see: deriv, deriv2. Using 1 as the random seed, the numeric exercise has the following question along with the correct solution and tolerance:
set.seed(1)
d1 <- xexams("deriv.Rmd")[[1]][[1]]
d1$question
## [1] "What is the derivative of $f(x) = x^{2} e^{2.3 x}$, evaluated at $x = 0.56$?"
d1$metainfo$solution
## [1] 6.68
d1$metainfo$tolerance
## [1] 0.01
The reason for the [[1]][[1]] index is that this is from the first (and only) exam, the first (and only) exercise. If you generate, say, xexams(..., n = 3) then the first index could be in 1, 2, 3. Similarly, you could inlude more than one exercise if you want.
The single-choice version has
set.seed(1)
d2 <- xexams("deriv2.Rmd")[[1]][[1]]
d2$question
## [1] "What is the derivative of $f(x) = x^{2} e^{2.3 x}$, evaluated at $x = 0.66$?"
## [2] ""
d2$questionlist
## [1] "$8.01$" "$14.09$" "$10.59$" "$15.35$" "$6.02$"
d2$metainfo$solution
## [1] FALSE FALSE TRUE FALSE FALSE
Both of these would be very easy to integrate as static text into any R/Markdown document.
Embedding exercise texts in webex
To turn the static text into a dynamic element in HTML, e.g., a text field where readers could enter a number which is then compared with the reference value from the solution, it is possible to employ Javascript for example. One lightweight R-based framework for generating such output is the webex package by Dale Barr and Lisa DeBruine.
In webex you can create fill-in-the-blank interactions via fitb() for numeric solutions with an optional tolerance (num in R/exams) or for character solutions (string in R/exams). Also, you can create drop-down menu interactions via mcq() for single-choice questions (schoice in R/exams). (Note: The jargon regarding choice questions is not unified: What R/exams calls single-choice is also referred to as multiple-choice. In this case multiple-answer is often used for what R/exams calls multiple-choice.)
At the moment, webex does not support radio buttons as an alternative to drop-down menus. Also, check-boxes for multiple-choice (aka multiple-answer) questions is not available.
Below, I illustrate how to embed simple schoice, num, and string questions in webex. For more elaborate examples with supplementary files, see the comments below. Also, cloze would also be doable but take some more work.
---
title: "Web Exercises with R/exams & webex"
output: webex::webex_default
---
```{r setup, include = FALSE}
knitr::opts_chunk$set(echo = TRUE)
library("webex")
library("exams")
```
`r style_widgets("#DF536B", "#61D04F")`
## `schoice`
```{r swisscapital, echo = FALSE, results = "asis"}
x <- xexams("swisscapital.Rmd")[[1]][[1]]
names(x$questionlist) <- ifelse(x$metainfo$solution, "answer", "")
x <- c(
x$question,
"",
mcq(x$questionlist),
"",
hide("Correct solution"),
"",
x$solution,
"",
paste("*", x$solutionlist),
"",
unhide()
)
writeLines(x)
```
## `num`
```{r deriv, echo = FALSE, results = "asis"}
x <- xexams("deriv.Rmd")[[1]][[1]]
x <- c(
x$question,
"",
fitb(x$metainfo$solution, tol = x$metainfo$tol,
width = min(100, max(20, nchar(x$metainfo$solution)))),
"",
hide("Correct solution"),
"",
x$solution,
"",
unhide()
)
writeLines(x)
```
## `string`
```{r function, echo = FALSE, results = "asis"}
x <- xexams("function.Rmd")[[1]][[1]]
x <- c(
x$question,
"",
fitb(x$metainfo$solution, width = min(100, max(20, nchar(x$metainfo$solution)))),
"",
hide("Correct solution"),
"",
x$solution,
"",
unhide()
)
writeLines(x)
```
Rendering this with rmarkdown::render() gives you a file like shown in the screenshot below. When embedding this in bookdown you need to make sure to embed the webex.css and webex.js from the package.
Further variations
Some extra work is involved when processing exercises that contain images such as boxplots. The default in xexams() is set up for PDF output but the driver$sweave can be tweaked to produce PNG output. In either case, the supplements is then a vector of file paths to the supplementary files:
set.seed(1)
b1 <- xexams("boxplots.Rmd", driver = list(sweave = list(png = TRUE)))[[1]][[1]]
b1$question
## [1] "In the following figure the distributions of a variable"
## [2] "given by two samples (A and B) are represented by parallel boxplots."
## [3] "Which of the following statements are correct? _(Comment: The"
## [4] "statements are either about correct or clearly wrong.)_"
## [5] "\\"
## [6] "![](boxplot-1.png)"
## [7] ""
b1$supplements
## boxplot-1.png
## "/tmp/RtmpA07Hau/file11d77d212e69bf/exam1/exercise1/boxplot-1.png"
## attr(,"dir")
## [1] "/tmp/RtmpA07Hau/file11d77d212e69bf/exam1/exercise1"
Additionally, you can set up a transform driver that converts the R/Markdown already to HTML (rather than having bookdown doing this later on). Here I'm selecting pandoc as the converter, using MathJax for the rendering of mathematical content (like bookdown does as well). Using base64 = TRUE instead of the FALSE below would embed the supplementary PNG image directly in the HTML code using a Base 64 encoding.
set.seed(1)
htmltrafo <- make_exercise_transform_html(converter = "pandoc-mathjax", base64 = FALSE)
b2 <- xexams("boxplots.Rmd", driver = list(sweave = list(png = TRUE), transform = htmltrafo))[[1]][[1]]
b2$question
## [1] "<p>In the following figure the distributions of a variable given by two samples (A and B) are represented by parallel boxplots. Which of the following statements are correct? <em>(Comment: The statements are either about correct or clearly wrong.)</em><br />"
## [2] "<img src=\"boxplot-1.png\" /></p>"
This is Great Achim. I was struggling to find a way to make bookdown talk to exam exercise files and found similar solution before reaching this post. The main difference is that I'm using bootstrap 4 (bookdown::bs4_book) in the html, which looks nicer.
This is how it looks in the Rmarkdown chapter file:
f_in <- fs::dir_ls('00-EOCE-Rmd/Cap08-Programação/',
type = 'file')
build_exercises(f_in, type_doc = my_engine)
The result:

Creating a dataframe from paragraph text scraped from website in R

I'm trying to scrape a website that has numerous different information I want in paragraphs. I got this to work perfect... However, I don't understand how to break the text up and create a dataframe.
Website :Website I want Scraped
Code:
library(rvest)
url <- "https://www.state.nj.us/treasury/administration/statewide-support/motor-fuel-locations.shtml"
#Reading the HTML code from the website
webpage <- read_html(url)
p_nodes<-webpage%>%
html_nodes(xpath = '//p')%>%
html_text()
#replace multiple whitespaces with single space
p_nodes<- gsub('\\s+',' ',p_nodes)
#trim spaces from ends of elements
p_nodes <- trimws(p_nodes)
#drop blank elements
p_nodes <- p_nodes[p_nodes != '']
How I want the dataframe to look:
I'm not sure if this is even possible. I tried to extract each piece of information separately and then make the dataframe like that but it doesn't work since most of the info is stored in the p tag. I would appreciate any guidance. Thanks!
Proof-of-concept (based on what I wrote in the comment):
Code
lapply(c('data.table', 'httr', 'rvest'), library, character.only = T)
tags <- 'tr:nth-child(6) td , tr~ tr+ tr p , td+ p'
burl <- 'https://www.state.nj.us/treasury/administration/statewide-support/motor-fuel-locations.shtml'
url_text <- read_html(burl)
chunks <- url_text %>% html_nodes(tags) %>% html_text()
coordFunc <- function(chunk){
patter_lat <- 'Longitude:.*(-[[:digit:]]{1,2}.[[:digit:]]{0,15})'
ret <- regmatches(x = chunk, m = regexec(pattern = patter_lat, text = chunk))
return(ret[[1]][2])
}
longitudes <- as.numeric(unlist(lapply(chunks, coordFunc)))
Output
# using 'cat' to make the output easier to read
> cat(chunks[14])
Mt. Laurel DOT
Rt. 38, East
1/4 mile East of Rt. 295
Mt. Laurel Open 24 Hrs
Unleaded / Diesel
856-235-3096Latitude: 39.96744662Longitude: -74.88930386
> longitudes[14]
[1] -74.8893
If you do not coerce longitudes to be numeric, you get:
longitudes <- (unlist(lapply(chunks, coordFunc)))
> longitudes[14]
[1] "-74.88930386"
I chose the longitude as a proof-of-concept but you can modify your function to extract all relevant bits in a single call. For getting the right tag you can use SelectorGadget extension (works well in Chrome for me). Alliteratively most browsers let you 'inspect element' to get the html tag. The function could return the extracted values in a data.table which can then be combined into one using rbindlist.
You could even advance pages programatically to scrape the entire website - be sure to check with the usage policy (it's generally frowned upon or restricted to scrape websites).
Edit
the text is not structured the same way throughout the webpage so you'll need to spend more time examining what exceptions can take place.
Here's a new function to resolve each chunk into separate lines and then you can try to use additional regular expressions to get what you want.
newfunc <- function(chunk){
# Each chunk is a couple of lines. First, we split at '\r\n' using strsplit
# the output is a list so we use 'unlist' to get a vector
# then use 'trimws' to remove whitespace around it - try out each of these functions
# separately to understand what is going on. The final output here is a vector.
txt <- trimws(unlist(strsplit(chunk, '\r\n')))
return(txt)
}
This returns the 'text' contained in each chunk as a vector of separate lines. Taking a look at the number of lines in the first 20 chunks, you can see it is not the same:
> unlist(lapply(chunks[1:20], function(z) length(newfunc(z))))
[1] 5 6 5 7 5 5 5 5 5 4 1 6 6 6 5 1 1 1 5 6
A good way to resolve this would be to put in a conditional statement based on the number of lines of text in each chunk, e.g. in newfunc you could add:
if(length(txt) == 1){
return(NULL)
}
This is because that is for the entries that don't have any text in them. since this a proof of concept I haven't checked all entries but there's some simple logic:
The first line is typically the name
the coordinates are in the last line
The fuel can be either unleaded or diesel. You can grep on these two strings to see what each depot offers. e.g. grepl('diesel', newfunc(chunks[12]))
Another approach would be to use a different set of html tags e.g. all coorindates and opening hours are in boldface and have the tag strong. You can extract those separately and then use regular expressions to get what you want.
You could search for 24(Hrs|Hours) to first extract all sites that are open 24 hours and then use selective regex on the remainder to get their operating times.
There is no simple easy answer with most web-scraping, you have to find patterns and then apply some logic based on that. Only on the most structured websites will you find something that works for the entire page/range.
You can use tidyverse package (stringr, tibble, purrr)
library(rvest)
library(tidyverse)
url <- "https://www.state.nj.us/treasury/administration/statewide-support/motor-fuel-locations.shtml"
#Reading the HTML code from the website
webpage <- read_html(url)
p_nodes<-webpage%>%
html_nodes(xpath = '//p')%>%
html_text()
# Split on new line
l = p_nodes %>% stringr::str_split(pattern = "\r\n")
var1 = sapply(l, `[`, 1) # replace var by the name you want
var2 = sapply(l, `[`, 2)
var3 = sapply(l, `[`, 3)
var4 = sapply(l, `[`, 4)
var5 = sapply(l, `[`, 5)
t = tibble(var1,var2,var3,var4,var5) # make tibble
t = t %>% filter(!is.na(var2)) # delete useless lines
purrr::map_dfr(t,trimws) # delete blanks

How to color one cell in my formattable in R

I'm trying to create a table in R with formattable. I can have already used formattable to create a table where the color depended on the values. But there is one cell that I just want to be red, no matter what is in the cell, and I can't figure out how to do this.
result_table <- cbind(Normal = c(1,2,3), Fraud = c(4,5,6))
row.names(result_table) <- c('Normal', "Suspicious", "Fraud")
my_df <- as.data.frame(result_table)
formattable(my_df)
I want to color normal vs normal green, fraud vs fraud green, normal vs fraud red, fraud vs normal red. But since I'm using formattable also for the other tables in my report, I'd like to use it here as well (so that all tables in my report have the same style.)
OK, so here is a solution, although not a perfect one. First, we define the formatters – functions that take the data and turn it into HTML code:
red.f <- formatter("span", style=x~style(color="red"))
green.f <- formatter("span", style=x~style(color="green"))
It is possible to make these functions conditional, but unfortunately neither is formatter well documented nor the authors foresaw your problem.
Now then.
my_df2 <- sapply(colnames(my_df), function(cn) {
sprintf(
ifelse(cn == rownames(my_df), green.f("%s"), red.f("%s")),
my_df[,cn]
)
})
my_df2 <- data.frame(my_df2)
rownames(my_df2) <- rownames(my_df)
formattable(my_df2)
Explanation: I cannot find a way to put the ifelse inside of the formatter. Therefore, I create a character vector with %s as a placeholder and fill it out with values from the column using sprintf. Then, I combine the vectors to a matrix using sapply, turn it into a data frame, add row names and presto.

printable table with partial bolding/italics within a cell

I'm looking for a way to print out a table from R, but with formatting within a cell, like having some of the text within a cell be bold or italic. Normally I would first make the appropriate data.frame/tibble and then I'd format and print it using a package like huxtable or kable. Looking over documentation for huxtable or kableExtra, it seems as though both packages treat formatting as properties of cells, implying that within-cell formatting is either unsupported or must be implemented some other way.
If I was making a ggplot, I'd use expression for text formatting, e.g.
library(tidyverse)
ggplot(data=mtcars) +
ggtitle(expression(paste(bold("bold part"), " not bold part")))
I thought I could be clever by putting expressions into a data.frame, but this doesn't seem to be supported:
data.frame(var = c(expression(paste(bold("bold part"), "not bold part")),
expression(paste(bold("bold part"), "not bold part"))
))
#> Error in as.data.frame.default(x[[i]], optional = TRUE): cannot coerce class ""expression"" to a data.frame
If you want to make changes to data tables, I recommend you use the grid and gridExtra packages to construct your table and then make changes to the theme parameters.
Without any data to play with I can't see exactly what you want but here's a general idea of what you could do (see below). I've included other aesthetic parameters, for future reference.
You could then generate a pdf output to your C drive, which could then be printed.
d <- data.frame(A = c(1,2,3,4,5),
B = c(6,7,8,9,10),
C = c(11,12,13,14,15))
pdf("Test.pdf", height = 11, width = 10)
grid.table(d, rows = NULL, theme = ttheme_minimal(
core=list(fg_params=list(
hjust=0,
x=0.1,
fontface=matrix(c(1,2,3))))))
dev.off()
Re huxtable, you're correct, but you can get round it. Here's a 1 row, 1 column example, assuming you are printing to HTML:
my_hux <- huxtable("<b>Bold part</b> Not bold part")
escape_contents(my_hux)[1, 1] <- FALSE
You can include arbitrary HTML. Something similar would work for TeX, obviously with TeX formatting instead.

grid.table and tableGrob in gridExtra package

I am trying to format the table using gridExtra package. The gridExtra package I have is 2.0 and R version is 3.2.1
I was going through answers here on stackoverflow about the formatting and the suggested options seem to work only with older version of the package. For example,
grid.table(data, h.even.alpha = 1, h.odd.alpha = 0,
v.even.alpha = 1, v.odd.alpha = 1,
gpar.corefill, gpar.coretext)
All of these options are shown as "unused arguments" in the latest version.
Searching further, I found that in new gridExtra package, formatting is defined probably inside theme, example -
tt <- ttheme_default(core=list(fg_params=list(hjust=1, x=0.95)),
colhead=list(fg_params=list(col="brown"))
and then doing
grid.table(data, theme=tt).
What I could not found was how these options inside theme is defined and how all the formatting which was possible in older version can now be done.
In particular, I am looking to do -
Left justification of columns
commas for big.marks (10000 as 10,000)
different row colors for even and odd row numbers
column header color
not showing row names (something like row.names=FALSE)
This recent answer shows how to alter the parameters, and Baptiste gives a link to further examples. As you notice in your question, to alter the formatting you use the theme argument; you can see what parameters to alter by looking at the output of ttheme_default()
# New theme paramters
myt <- ttheme_default(
# Use hjust and x to left justify the text
# Alternate the row fill colours
core = list(fg_params=list(hjust = 1, x=1),
bg_params=list(fill=c("yellow", "pink"))),
# Change column header to white text and red background
colhead = list(fg_params=list(col="white"),
bg_params=list(fill="red"))
)
# Example data - create some large numbers
dat <- mtcars[1:5,1:5]
dat$mpg <- dat$mpg*1000
grid.newpage()
grid.draw(tableGrob(format(dat, big.mark=","), theme=myt, rows=NULL))
The big.mark argument of format is used to add the comma separator, and rownames are removed using the rows=NULL argument.

Resources