I am using officer to create a Word document that consists of one large table. I want to insert some images into this table, and I am using flextable to do so. The following code inserts an image into the flextable.
pupil.tbl <- tribble(
~col1, ~col2,
paste("Name:", pupil$name), paste("Class:", pupil.class),
"attendance_graph", "boxall_graph"
)
# add attendance plot
pupil.ft <- flextable(as.data.frame(pupil.tbl))
pupil.ft <- display(
pupil.ft, i=2, col_key = "col1", pattern = "{{att_tbl}}",
formatters = list(
att_tbl ~ as_image(
col1,
src = "attendance.png",
width = 3.3,
height = 1.65)
)
)
)
This works fine, but I have quite a few images to add, so I thought I would abstract it into a function. However, when I try to do this I get:
Error in data.frame(image_src = src, width = width, height = height, stringsAsFactors = FALSE) :
object 'image_file' not found
Here is the function and a call to it (at the moment it uses global variables for everything except the path to the image):
# define the function first, then call it
add_img_to_flextable <- function(image_file){
  return(
    display(
      pupil.ft, i=2, col_key = "col2", pattern = "{{att_tbl}}",
      formatters = list(
        att_tbl ~ as_image(
          col1,
          src = image_file,
          width = 3.3,
          height = 1.65)
      )
    )
  )
}
pupil.ft <- add_img_to_flextable("attendance.png")
If you store the image path (src) in a column of the input data.frame, it should work as expected. I can't reproduce everything, as I don't have your data or your images.
library(flextable)
library(tibble)
# mode = "wb" keeps the PNG intact on Windows, where the default mode is text
download.file("https://www.r-project.org/logo/Rlogo.png", destfile = "Rlogo.png", mode = "wb")
pupil.tbl <- tribble(
~col1, ~col2, ~col3,
"A", "B", "Rlogo.png",
"C", "D", "Rlogo.png"
)
pupil.tbl <- as.data.frame(pupil.tbl)
# display only col1 and col2
pupil.ft <- flextable(pupil.tbl, col_keys = c("col1", "col2") )
add_img_to_flextable <- function(ft, i, j){
display(
ft, i=i, col_key = j, pattern = "{{att_tbl}}",
formatters = list(# use col3 even if not displayed
att_tbl ~ as_image(col3, src = col3, width = 1.29, height = 1)
)
)
}
pupil.ft <- add_img_to_flextable(pupil.ft, i = 2, j = "col2")
pupil.ft
Note that I am not satisfied with the display function: its usage is too complex, and I might improve that later.
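The error in the question is consistent with the formula in formatters being evaluated against the table's data rather than inside the calling function, which would explain why a column reference works where a local variable does not. The base-R sketch below only illustrates that kind of scoping. eval_rhs() is a hypothetical helper, not part of flextable, and flextable's internals may differ.

```r
# Hypothetical helper: evaluate the right-hand side of a formula using only
# the columns of `data` (plus base R), ignoring the caller's local variables.
eval_rhs <- function(f, data) {
  eval(f[[3]], envir = data, enclos = baseenv())
}

tbl <- data.frame(col1 = c("A", "C"),
                  col3 = c("Rlogo.png", "Rlogo.png"),
                  stringsAsFactors = FALSE)

wrapper <- function(image_file) {
  # `image_file` exists only inside this function, not as a column of `tbl`,
  # so the lookup fails; the error message is captured instead of thrown
  tryCatch(eval_rhs(x ~ paste0("img:", image_file), tbl),
           error = function(e) conditionMessage(e))
}

local_result  <- wrapper("Rlogo.png")                      # lookup fails
column_result <- eval_rhs(x ~ paste0("img:", col3), tbl)   # lookup succeeds
```

Here local_result holds the captured "object not found" message, while column_result is built from the col3 values, mirroring the difference between the question's function and the one above.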
I am trying to add sparklines to multiple columns of a DT datatable in R. I am able to get a single sparkline column to render correctly, but when I try to render a sparkline in a second column based on different underlying data than the first, the same sparkline is displayed in the second column. I imagine some snippet of code is needed in the fnDrawCallback option, but I have not been able to find much out there addressing my issue.
Here is a reproducible example for both a single sparkline column (works) and two sparkline columns (just repeats the first sparkline in both columns):
# create data with single sparkline column
library(sparkline)
library(DT)
spark_data <- data.frame(
id = c('spark1', 'spark2'),
spark = c(
spk_chr(values = 1:3, elementId = 'spark1'),
spk_chr(values = 3:1, elementId = 'spark2')
)
)
tbl <- datatable(spark_data, escape = F,
rownames = F
, options = list(fnDrawCallback = htmlwidgets::JS('function(){
HTMLWidgets.staticRender();
}'))
)
spk_add_deps(tbl)
# create data with two sparkline columns
library(sparkline)
library(DT)
spark_data <- data.frame(
id = c('spark1', 'spark2'),
spark = c(
spk_chr(values = 1:3, elementId = 'spark1'),
spk_chr(values = 3:1, elementId = 'spark2')
),
second_spark = c(spk_chr(values = 3:1, elementId = 'spark1'),
spk_chr(values = 1:3, elementId = 'spark2'))
)
tbl <- datatable(spark_data, escape = F,
rownames = F
, options = list(fnDrawCallback = htmlwidgets::JS('function(){
HTMLWidgets.staticRender();
}'))
)
spk_add_deps(tbl)
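One thing that may be worth checking in the two-column example: both columns reuse the element IDs 'spark1' and 'spark2', and HTML element IDs are expected to be unique within a page. Below is a minimal base-R sketch of building one distinct ID per cell. make_spark_ids() is a hypothetical helper, and each ID would then be passed as elementId to spk_chr(). Whether this resolves the rendering issue is an assumption, not something verified here.

```r
# Hypothetical helper: one unique ID per (row, column) cell
make_spark_ids <- function(columns, n_rows) {
  ids <- outer(seq_len(n_rows), columns,
               function(row, col) paste0(col, "_", row))
  colnames(ids) <- columns
  ids
}

ids <- make_spark_ids(c("spark", "second_spark"), n_rows = 2)
# ids[1, "spark"] would be used for the first row of the `spark` column,
# ids[1, "second_spark"] for the first row of `second_spark`, and so on
```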
I would like to create a gt table where I display numeric values from two columns together in a single cell, but color the cells based on the values of just one of those columns.
For example using the ToothGrowth example data I'd like to put the len and dose columns together in a single cell but color the cell backgrounds by the value of dose.
I tried to manually create a vector of colors to color the len_dose column but this does not work because it seems like it is reapplying the color vector to each different level of len_dose, not dose. I guess you could manually format the cells with tab_style() but that seems inefficient and does not give you the nice feature where the text color changes to maximize contrast with background. I don't know an efficient way to do this.
What I tried:
library(gt)
library(dplyr)
library(scales)
library(glue)
# Manually map dose to color
dose_colors <- col_numeric(palette = 'Reds', domain = range(ToothGrowth$dose))(ToothGrowth$dose)
ToothGrowth %>%
mutate(len_dose = glue('{len}: ({dose})')) %>%
gt(rowname_col = 'supp') %>%
cols_hide(c(len, dose)) %>%
data_color(len_dose, colors = dose_colors)
Output (not good because not colored by dose):
Not sure if you found a solution to this yet but here is what I did:
If you use tab_style() you don't need to try and create the vector of colors and can instead set the background color you want based on the dose column. If you want to color values differently based on dose, in addition to what I've colored here, then create another tab_style() for the desired value.
library(gt)
library(dplyr)
library(scales)
library(glue)
ToothGrowth %>%
mutate(len_dose = glue('{len}: ({dose})')) %>%
gt(rowname_col = 'supp') %>%
tab_style(
style = cell_fill(color = "palegreen"),
locations = cells_body(
columns = len_dose,
rows = dose >= 1.0
)
) %>%
cols_hide(c(len, dose))
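If one tab_style() call per dose level is used, the fill color for each level still has to come from somewhere. A minimal base-R sketch of mapping dose to one color per level follows. The white-to-red palette and the name dose_fill are only illustrative assumptions.

```r
dose <- ToothGrowth$dose

# Linear white-to-red ramp over the observed dose range
ramp   <- grDevices::colorRamp(c("white", "red"))
scaled <- (dose - min(dose)) / (max(dose) - min(dose))
fill   <- grDevices::rgb(ramp(scaled), maxColorValue = 255)

# One color per distinct dose level, named by the dose value
dose_fill <- setNames(fill[!duplicated(dose)], unique(dose))
```

Each entry of dose_fill could then be used as the color in cell_fill(), with rows = dose == <that level> in cells_body(), giving one tab_style() call per level.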
I faced the same issue and adjusted the gt::data_color function to accept separate source and target columns - with that, the following should work to produce your desired output.
# Distinguish SOURCE_columns and TARGET_columns
my_data_color <- function (data, SOURCE_columns, TARGET_columns, colors, alpha = NULL, apply_to = c("fill",
"text"), autocolor_text = TRUE)
{
stop_if_not_gt(data = data)
apply_to <- match.arg(apply_to)
colors <- rlang::enquo(colors)
data_tbl <- dt_data_get(data = data)
colors <- rlang::eval_tidy(colors, data_tbl)
resolved_source_columns <- resolve_cols_c(expr = {
{
SOURCE_columns
}
}, data = data)
resolved_target_columns <- resolve_cols_c(expr = {
{
TARGET_columns
}
}, data = data)
rows <- seq_len(nrow(data_tbl))
data_color_styles_tbl <- dplyr::tibble(locname = character(0),
grpname = character(0), colname = character(0), locnum = numeric(0),
rownum = integer(0), colnum = integer(0), styles = list())
for (i in seq_along(resolved_source_columns)) {
data_vals <- data_tbl[[resolved_source_columns[i]]][rows]
if (inherits(colors, "character")) {
if (is.numeric(data_vals)) {
color_fn <- scales::col_numeric(palette = colors,
domain = data_vals, alpha = TRUE)
}
else if (is.character(data_vals) || is.factor(data_vals)) {
if (length(colors) > 1) {
nlvl <- if (is.factor(data_vals)) {
nlevels(data_vals)
}
else {
nlevels(factor(data_vals))
}
if (length(colors) > nlvl) {
colors <- colors[seq_len(nlvl)]
}
}
color_fn <- scales::col_factor(palette = colors,
domain = data_vals, alpha = TRUE)
}
else {
cli::cli_abort("Don't know how to map colors to a column of class {class(data_vals)[1]}.")
}
}
else if (inherits(colors, "function")) {
color_fn <- colors
}
else {
cli::cli_abort("The `colors` arg must be either a character vector of colors or a function.")
}
color_fn <- rlang::eval_tidy(color_fn, data_tbl)
color_vals <- color_fn(data_vals)
color_vals <- html_color(colors = color_vals, alpha = alpha)
color_styles <- switch(apply_to, fill = lapply(color_vals,
FUN = function(x) cell_fill(color = x)), text = lapply(color_vals,
FUN = function(x) cell_text(color = x)))
data_color_styles_tbl <- dplyr::bind_rows(data_color_styles_tbl,
generate_data_color_styles_tbl(column = resolved_target_columns[i], rows = rows,
color_styles = color_styles))
if (apply_to == "fill" && autocolor_text) {
color_vals <- ideal_fgnd_color(bgnd_color = color_vals)
color_styles <- lapply(color_vals, FUN = function(x) cell_text(color = x))
data_color_styles_tbl <- dplyr::bind_rows(data_color_styles_tbl,
generate_data_color_styles_tbl(column = resolved_target_columns[i],
rows = rows, color_styles = color_styles))
}
}
dt_styles_set(data = data, styles = dplyr::bind_rows(dt_styles_get(data = data),
data_color_styles_tbl))
}
# Add function into gt namespace (so that internal gt functions can be called)
library(gt)
tmpfun <- get("data_color", envir = asNamespace("gt"))
environment(my_data_color) <- environment(tmpfun)
library(dplyr)
library(glue)
# Map dose to color
ToothGrowth %>%
mutate(len_dose = glue('{len}: ({dose})')) %>%
gt(rowname_col = 'supp') %>%
cols_hide(c(len, dose)) %>%
my_data_color(SOURCE_columns = "dose", TARGET_columns = "len_dose",
colors = scales::col_numeric(palette = c("red", "green"), domain = c(min(ToothGrowth$dose), max(ToothGrowth$dose))))
Created on 2022-11-03 with reprex v2.0.2
I am wondering how I could generate a data object like the one returned by stats::density(df$variable) for each numeric column of a Spark DataFrame.
I am looking into SparkR::spark.lapply but think I am missing something. I have created a little example below. If someone knows how and is willing to help me out, I'd be very thankful.
Best,
NF
Example:
df<- iris
gen_density_data <- function(df) {
  col_types <- sapply(df, class)
  good_cols <- which(col_types %in% c("numeric", "integer"))
  # index the columns directly instead of building code strings for eval(parse())
  tres <- lapply(good_cols, function(x) stats::density(df[[x]]))
  return(tres)
}
res<- gen_density_data(df)
# And for Spark:
sdf<- SparkR::createDataFrame(iris)
gen_spark_density_data<- function(sdf){
tmp_types<- SparkR::coltypes(sdf)
good_cols_idx<- which(tmp_types %in% setdiff(tmp_types, c("character", "POSIXct", "POSIXlt", "logical")))
if(length(good_cols_idx)>=1){
tres<- SparkR::spark.lapply(good_cols_idx, function(x){
eval(parse(text=paste0("stats::density(sdf$", colnames(sdf)[x], ")")))
})
return(tres)
}
}
tst<- gen_spark_density_data(sdf=sdf) # This is where it throws errors.
I came up with a solution that works decently well. I use highcharter for plotting. I think I could further improve how I manage the partitions of the data. Right now, this might not be the most scalable solution for large data sets whose columns have large differences between the minimum and maximum values. Some conditional checking is probably in order, but for the purposes of just getting an example out there, this is what I made. Note: I adapted the example from https://rpubs.com/mcocam12/KDF_byHand. Many thanks to Marc for the example.
Data:
df<- do.call("rbind", replicate(10, iris, simplify = FALSE))
sdf<- SparkR::createDataFrame(df)
sdf<- SparkR::repartition(sdf, nrow(sdf))
Functions:
gen_sdf_kernel_density_points<- function(sdf=sdf,num_values, h=1){
x_col<- SparkR::colnames(sdf)[1]
min_max_sdf<- eval(parse(text=paste0("SparkR::agg(sdf, min=min(sdf$", x_col, "), max=max(sdf$", x_col,")) %>% SparkR::collect()")))
Range = seq(min_max_sdf$min-5, min_max_sdf$max+5, 0.01)
Range<- data.frame(Range)
RangeSDF<- SparkR::createDataFrame(Range)
# this is where I think I could be better with partitions, ideas welcomed
#RangeSDF<- SparkR::repartition(RangeSDF, nrow(RangeSDF))
# if(nrow(Range)>1000){
# RangeSDF<- SparkR::repartition(RangeSDF, 200L)
# } else if(nrow(Range) > 64){
# RangeSDF<- SparkR::repartition(RangeSDF, 64L)
# }
tst<- SparkR::crossJoin(sdf, RangeSDF)
tst$density<- eval(parse(text=paste0("exp(-(tst$Range-tst$", x_col,")^2/(2*h^2))/(h*sqrt(2*pi))")))
## Now group by range and get the sum of the density, normalize by the number of values
gb_df<- SparkR::groupBy(tst, tst$Range)
densities2<- SparkR::agg(gb_df, bell_sum=sum(tst$density))
densities2<- SparkR::withColumn(densities2, "kernel_density", densities2$bell_sum / num_values)
densities2<- SparkR::arrange(densities2, asc(densities2$Range))
return(densities2)
}
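For reference, the function above computes, for each grid point, the average of Gaussian kernels centred on the data values. The same calculation on a local vector can be sketched in base R as below. kde_by_hand() is a hypothetical name, and h = 1 simply mirrors the default above rather than a tuned bandwidth.

```r
# Hypothetical local equivalent of the Spark computation: for every grid
# point r, average the Gaussian kernel exp(-(r - x)^2 / (2 h^2)) / (h sqrt(2 pi))
kde_by_hand <- function(x, grid, h = 1) {
  vapply(grid, function(r) {
    mean(exp(-(r - x)^2 / (2 * h^2)) / (h * sqrt(2 * pi)))
  }, numeric(1))
}

x    <- iris$Sepal.Length
step <- 0.01
grid <- seq(min(x) - 5, max(x) + 5, by = step)  # same padding and step as above
dens <- kde_by_hand(x, grid, h = 1)

# Riemann-sum check: a density estimate should integrate to roughly 1
area <- sum(dens) * step
```

Comparing dens against the collected Spark result for the same column is one way to sanity-check the distributed version on a small data set.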
gen_den_plots_from_spark_res<- function(res){
big_out<- lapply(seq_along(res), function(x){
var_name<- names(res)[x]
rdf<- res[[x]]
tmp<- data.frame(cbind(x = rdf$Range, y = rdf$kernel_density))
x<- highcharter::list_parse(tmp)
hc<- highcharter::highchart() %>%
hc_series(
list(
name="Density Estimate",
data = x,
type = "areaspline",
marker = list(enabled = FALSE),
color = list(
linearGradient = list(x1 = 0, y1 = 1, x2 = 0, y2 = 0),
stops = list(
list(0, "transparent"),
list(0.33, "#0000FF1A"),
list(0.66, "#0000FF33"),
list(1, "#ccc")
)
),
fillColor = list(
linearGradient = list(x1 = 0, y1 = 1, x2 = 0, y2 = 0),
stops = list(
list(0, "transparent"),
list(0.1, "#0000FF1A"),
list(0.5, "#0000FF33"),
list(1, "#0000FF80")
)
)
)
)
hc<- hc %>%
highcharter::hc_title(text=paste0("Density Plot For: ", snakecase::to_title_case(var_name)))# %>% hc_add_series(data =tmp, hcaes(x= tmp$x, y = tmp$y),name="Bars", type="column" )
return(hc)
})
return(big_out)
}
make_hc_grid<- function(tres_out, ncol=2){
hc<- tres_out %>%
highcharter::hw_grid(rowheight = 450, ncol = ncol) %>%htmltools::browsable()
hc
}
Usage:
tmp_types<- SparkR::coltypes(sdf)
good_cols_idx<- which(tmp_types %in% setdiff(tmp_types, c("character", "POSIXct", "POSIXlt", "logical")))
nrows_sdf<- SparkR::count(sdf)
if(length(good_cols_idx)>=1){
out<- lapply(seq_along(good_cols_idx), function(z){
# Need to select a single column for the sdf, otherwise the cross join becomes too big
tmpz<- SparkR::select(sdf, SparkR::colnames(sdf)[good_cols_idx[z]])
out<- gen_sdf_kernel_density_points(sdf = tmpz, num_values = nrows_sdf)
out<- SparkR::collect(out)
return(out)
}) %>% stats::setNames(SparkR::colnames(sdf)[good_cols_idx])
}
Plotting:
tres<- gen_den_plots_from_spark_res(res=out)
all_plots<- make_hc_grid(tres_out = tres)
# View Result
all_plots
Expected Result:
This could all probably be improved...if you have ideas, I'd love to hear them.
Best,
NF
I am using RMarkdown to create a word document (I need the output to be in .docx format).
I'd like to use flextable (or any other package) to format my headers properly.
I'm trying to get the Greek symbol delta (∆) to display properly. It seems possible, because in the help pages (https://davidgohel.github.io/flextable/articles/format.html#display-function) the author successfully uses \u03BC to insert the "μ" symbol, and I can too if I use his code, shown below. However, I can't get it to work for delta: replacing \u03BC with either \u2206 or \u0394 does not give the expected symbol. The code I'm using produces this table, but I want to replace the highlighted bit with delta.
This is what I get when I try, for example, \u2206.
Any suggestions?
library(flextable)
if( require("xtable") ){
mat <- round(matrix(c(0.9, 0.89, 200, 0.045, 2.0), c(1, 5)), 4)
mat <- xtable(mat)
ft <- xtable_to_flextable(x = mat, NA.string = "-")
print(ft$col_keys)
ft <- flextable::display(ft, i = 1, col_key = "X1",
pattern = "{{val}}{{pow}}", part = "header",
formatters = list(val ~ as.character("R"), pow ~ as.character("2") ),
fprops = list(pow = fp_text(vertical.align = "superscript", font.size = 8))
)
ft <- flextable::display(ft, i = 1, col_key = "X2",
pattern = "{{val}}{{pow}}", part = "header",
formatters = list(val ~ as.character("\u03BC"), pow ~ as.character("x") ),
fprops = list(pow = fp_text(vertical.align = "superscript", font.size = 8))
)
ft <- flextable::display(ft, i = 1, col_key = "rowname",
pattern = "{{val}}{{pow}}", part = "body",
formatters = list(val ~ as.character("y"), pow ~ as.character("t-1") ),
fprops = list(pow = fp_text(vertical.align = "subscript", font.size = 8))
)
ft <- set_header_labels(ft, X3 = "F-stat", X4 = "S.E.E", X5 = "DW", rowname = "")
ft <- autofit(ft)
ft
}
Update
I am getting closer thanks to a helpful suggestion from David, but (not being very familiar with flextable) I am getting strange behaviour when I try to modify the header in the way suggested:
library(magrittr)
library(flextable)
library(officer)
AICtable <- data.frame(Model = "test", Parameters = 9, AICc = 4000, dAICc = 0, w = 1)
v.epi.aic <- flextable(AICtable) %>%
font(fontname = "Times New Roman", part = "all") %>%
flextable::display(col_key = "dAICc", part = "header",
pattern = "{{D}}{{A}}{{cbit}}",
formatters = list(D ~ as.character("D"),
A ~ as.character("AIC"),
cbit ~ as.character("c") ),
fprops = list(D = fp_text(font.family = "Symbol"),
A = fp_text(font.family = "Times New Roman"),
cbit = fp_text(vertical.align = "subscript")))
v.epi.aic
Notice that column headers are now duplicated, and "AIC" appears before the "∆". The column names should be:
Model, Parameters, AICc, ∆AICc, w (and the "c" in the ∆AICc should be a subscript).
Please use "\u394" instead of "\u0394" to generate the capital delta symbol
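As far as I know, R's \u escape accepts one to four hexadecimal digits, so it may be worth checking in your own session which code point each spelling actually produces. The sketch below does that with utf8ToInt(). The variable names are only illustrative.

```r
# Code points produced by each escape (916 is U+0394, 8710 is U+2206)
code_points <- c(
  short_delta  = utf8ToInt("\u394"),
  padded_delta = utf8ToInt("\u0394"),
  increment    = utf8ToInt("\u2206")
)

# TRUE if the three-digit and four-digit spellings are the same string
same_delta <- identical("\u394", "\u0394")
```

Note that U+2206 (INCREMENT) and U+0394 (GREEK CAPITAL LETTER DELTA) are different characters that look alike, so which one renders can depend on the font used in the Word document.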
After running this code
library(XLConnect)
template <- loadWorkbook ( filename = "template.xlsx" , create = T )
createSheet ( template , c("sheet1","sheet2") )
# setStyleAction(template,XLC$"STYLE_ACTION.NONE")
Data <- data.frame(
a = 1:10,
b = 11:20
)
setDataFormatForType(template, type = XLC$DATA_TYPE.NUMERIC, format = "0.00" )
# list22$`Brand Equity` <- as.numeric(list22$`Brand Equity`)
# list22$`Purchase Intent` <- as.numeric(list22$`Purchase Intent`)
csHeader <- createCellStyle(template, name = "header10")
setFillPattern(csHeader, fill = XLC$BORDER.DOUBLE)
setFillForegroundColor(csHeader, color = XLC$COLOR.DARK_RED)
# setCellFormula(object = template, sheet = (paste0("sheet",i)), row = c(2:4),col = c(1:3), formula = )
setCellStyle(template, sheet = "sheet1", row = 1,
col = c(1:2), cellstyle = csHeader)
setCellStyle(template, sheet = "sheet2", row = 1,
col = c(1:2), cellstyle = csHeader)
for (i in 1:2)
{
setColumnWidth(template, sheet = (paste0("sheet",i)), column = c(1:3), width = 15800)
writeWorksheet ( template , data = Data, sheet = (paste0("sheet",i)), startRow = 1 , startCol = 1 ,
header = TRUE )
}
saveWorkbook ( template )
I obtain the following (two screenshots, not reproduced):
It does not seem to apply the cell color I specified. Any ideas? Also, is there a way to display the numbers as percentages, so that, for instance, 1 would be 100%, 2 would be 200%, etc.?
For converting the numbers into percentage, you can write a function similar to this one:
addformatperc<-function(num,roundlevel){
betternum<-paste(prettyNum(round(num*100,roundlevel),big.mark = ","),"%",sep="")
return(betternum)
}
#Output
addformatperc(1,0)
[1] "100%"
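To convert a whole data frame before writing it, the function can be applied column by column. The following is a minimal sketch using a shortened stand-in for the question's Data (three rows instead of ten), with Data_pct as an illustrative name.

```r
addformatperc <- function(num, roundlevel) {
  paste(prettyNum(round(num * 100, roundlevel), big.mark = ","), "%", sep = "")
}

# Shortened stand-in for the question's data frame
Data <- data.frame(a = 1:3, b = 11:13)

# Apply the formatter to every column, keeping the data frame structure
Data_pct <- Data
Data_pct[] <- lapply(Data, addformatperc, roundlevel = 0)
```

Keep in mind that the converted columns are character vectors, so they would be written to the workbook as text rather than as numbers.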