I'm trying to use kable and kableExtra to format tables created using pipes, and I can't get the conditional formatting arguments (row_spec, column_spec) to accept variables piped from the generated code.
In the toy example below I create a variable called bg within the dataframe that I want to use to create bands of background colour, but row_spec and column_spec don't seem to recognize that as a variable. Note that creating the variable outside of the pipes isn't an option - the actual use case is much more complicated than that, and the variables used in the process don't exist before that.
library(dplyr)       # provides mutate()
library(kableExtra)
set.seed(111)
df = data.frame(var1 = sort(sample(LETTERS[1:3],10,TRUE)),
var2 = sample(1:4,10,TRUE),
var3 = runif(10,0,1))
df %>%
mutate(bg = cumsum(!duplicated(var1))%%2)%>%
kable() %>%kable_styling()%>%
column_spec(1,color=bg)
Error in ensure_len_html(color, nrows, "color") : object 'bg' not found
EDIT: You can do it in two steps easily enough, so I'll include that here, along with the resulting table that I'm looking for:
d = df %>%
mutate(bg = cumsum(!duplicated(var1))%%2)
kable(d) %>% kable_styling(full_width=FALSE) %>%
row_spec(which(d$bg==1),background=grey(0.75))
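The two-step version can also be kept inside one pipe by handing the piped data frame to an anonymous function, which gives it a name that row_spec can refer to. This is only a sketch: the name d is arbitrary, format = "html" is set explicitly as an assumption about the desired output, and knitr is loaded directly for kable().

```r
library(dplyr)
library(knitr)
library(kableExtra)

set.seed(111)
df <- data.frame(var1 = sort(sample(LETTERS[1:3], 10, TRUE)),
                 var2 = sample(1:4, 10, TRUE),
                 var3 = runif(10, 0, 1))

tbl <- df %>%
  mutate(bg = cumsum(!duplicated(var1)) %% 2) %>%
  # The anonymous function names the piped data frame (d), so row_spec()
  # can look up d$bg even though bg never exists outside the pipe.
  (function(d) {
    kable(d, format = "html") %>%
      kable_styling(full_width = FALSE) %>%
      row_spec(which(d$bg == 1), background = grey(0.75))
  })

tbl
```

The piped object is only visible to kable() as data, so row_spec() needs some other handle on it; the function argument provides that handle without creating a variable in the global environment.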
This might work:
library(tidyverse)
library(kableExtra)
set.seed(111)
df = data.frame(var1 = sort(sample(LETTERS[1:3],10,TRUE)),
var2 = sample(1:4,10,TRUE),
var3 = runif(10,0,1))
df %>%
mutate(bg = cumsum(!duplicated(var1))%%2) %>%
mutate(bg = cell_spec(bg, color = spec_color(bg))) %>%
kable(escape = F) %>% kable_styling()
Output:
You may find more information here (page 16): https://haozhu233.github.io/kableExtra/awesome_table_in_pdf.pdf
Version with background color:
df %>%
mutate(bg = cumsum(!duplicated(var1))%%2) %>%
mutate(bg = cell_spec(bg, color = "white", bold = T,
background = spec_color(bg))) %>%
kable(escape = F) %>% kable_styling()
Output with background color:
While working on a task I ran into a problem that I don't know how to solve.
I have a data frame that has 2 columns:
date
value
And it has a total of 1303 rows.
Each year has 12 values (one per month), except the last year, which has only 7.
My task is to create a 'drilldown' style chart using the 'highcharter' library. The problem is that I don't know how to do it efficiently.
The only solution I have come up with is not very efficient; I show it below so you can see what I mean.
# Load packages
library(tidyverse)
library(highcharter)
library(lubridate)
# Load dataset
df <- read.csv('example.csv')
# Prepare df to use
dfDD <- tibble(name = year(df$date),
y = round(df$value, digits = 2),
drilldown = name)
# Create a data frame to use in 'drilldown' (for each year)
df1913 <- df %>%
filter(year(date) == 1913) %>%
data.frame()
df1914 <- df %>%
filter(year(date) == 1914) %>%
data.frame()
# Create a drilldown chart using Highcharter library
highchart() %>%
hc_chart(type = "column") %>%
hc_title(text = "Example Drilldown") %>%
hc_xAxis(type = "category") %>%
hc_legend(enabled = FALSE) %>%
hc_plotOptions(series = list(borderWidth = 2,
dataLabels = list(enabled = TRUE))) %>%
hc_add_series(data = dfDD,
name = "Mean",
colorByPoint = TRUE) %>%
hc_drilldown(allowPointDrilldown = TRUE,
series = list(list(id = 1913,
data = list_parse2(df1913)),
list(id = 1914,
data = list_parse2(df1914))))
Looking at this solution, I realized that to complete the chart I would have to create a separate subset of values for every year. I then tried to find a more efficient solution using a 'for loop', but so far I can't get it to work.
Is there a more efficient way to create this chart using a loop?
If it can be done some other way than with loops, I would also like to know.
Thank you for reading my question; I hope I explained myself well.
Using split and purrr::imap, you can split your data by year and loop over the resulting list to convert it into the nested list object required by hc_drilldown. Note: it's important to make the id numeric and to pass an unnamed list.
library(tidyverse)
library(highcharter)
library(lubridate)
series <- split(df, year(df$date)) %>%
purrr::imap(function(x, y) list(id = as.numeric(y), data = list_parse2(x)))
# Unname list
names(series) <- NULL
highchart() %>%
hc_chart(type = "column") %>%
hc_title(text = "Example Drilldown") %>%
hc_xAxis(type = "category") %>%
hc_legend(enabled = FALSE) %>%
hc_plotOptions(series = list(borderWidth = 2,
dataLabels = list(enabled = TRUE))) %>%
hc_add_series(data = dfDD,
name = "Mean",
colorByPoint = TRUE) %>%
hc_drilldown(allowPointDrilldown = TRUE,
series = series)
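To see what split and imap actually build, here is a minimal sketch on a made-up data frame (toy and its values are hypothetical, not the data from the question). It only constructs the drilldown list and inspects its shape; no chart is drawn.

```r
library(purrr)

# Hypothetical toy data: two years with two monthly values each
toy <- data.frame(
  date  = as.Date(c("1913-01-01", "1913-02-01", "1914-01-01", "1914-02-01")),
  value = c(1.5, 2.5, 3.5, 4.5)
)
toy_year <- as.numeric(format(toy$date, "%Y"))

# One data frame per year; the list names are the years as strings
by_year <- split(toy, toy_year)

# imap() passes each element (x) together with its name (y)
series <- imap(by_year, function(x, y) {
  list(id = as.numeric(y), data = highcharter::list_parse2(x))
})

# Drop the names so the result is a plain, unnamed list of series
series <- unname(series)

str(series, max.level = 2)
```

Each element carries an id that has to match the drilldown value of the corresponding point in the top-level series, which is why the id is converted back to a number.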
Problem
I would like to produce a good looking table which has ggplots within the cells of one column. One key element is that I would like to create a pdf output of this table eventually.
What I have tried so far
Hopefully the example below is understandable. Essentially, I found that I can achieve what I want using the gt package. The problem is that this creates an HTML widget, which then has to be exported as a PDF using phantomJS and webshot.
library(dplyr)
library(purrr)
library(gt)
library(ggplot2)
dat = tibble(
RowLabel = letters[1:5],
Numeric = seq(100,500,100)
) %>%
mutate(
plotData = RowLabel %>% map(function(pos){
tibble(y=runif(10)*100) %>%
arrange(desc(y)) %>%
mutate(x=row_number())
}),
plot_obj = plotData %>% map(function(df){
df %>%
ggplot(aes(x=x,y=y))+
geom_col()
}),
plot_grob = plot_obj %>% map(cowplot::as_grob)
)
tab = dat %>%
select(RowLabel, Numeric) %>%
mutate(
ggplot = NA
) %>%
gt() %>%
text_transform(
locations = cells_body(vars(ggplot)),
fn = function(x) {
dat$plot_obj %>%
map(ggplot_image, height = px(50))
}
)
tab
What I want
I would like an output which is similar to the above example. However, I would like a solution which does not require me to use html widgets and can be saved directly as a pdf without the use of other programs. Is this possible to do using ggplot? I have started to learn more about grids/grobs/gtables etc but have not made any meaningful progress.
Thanks in advance!
Perhaps you could tweak the gtsave() function to suit? E.g.
library(dplyr)
library(purrr)
library(gt)
library(ggplot2)
dat = tibble(
RowLabel = letters[1:5],
Numeric = seq(100,500,100)
) %>%
mutate(
plotData = RowLabel %>% map(function(pos){
tibble(y=runif(10)*100) %>%
arrange(desc(y)) %>%
mutate(x=row_number())
}),
plot_obj = plotData %>% map(function(df){
df %>%
ggplot(aes(x=x,y=y))+
geom_col()
}),
plot_grob = plot_obj %>% map(cowplot::as_grob)
)
tab = dat %>%
select(RowLabel, Numeric) %>%
mutate(
ggplot = NA
) %>%
gt() %>%
text_transform(
locations = cells_body(vars(ggplot)),
fn = function(x) {
dat$plot_obj %>%
map(ggplot_image, height = px(50))
}
)
tab %>%
gt::gtsave(filename = "test.pdf", vwidth = 180, vheight = 250)
(R v4.0.3 / gt v0.2.2)
Hi,
To compare several variants of data I produced an HTML report.
Given a particular category, some indexes in the database should be the same. To detect errors / incorrect entries in the database, I compare the different categories in a table.
For better readability, it would be nice to have coloured tables. This can be done easily with the formattable package.
My dataset:
require(tidyverse)
require(formattable)
require(kableExtra)
require(knitr)
df1 <- data.frame(V1 = c(68,sample(c("J","N"),size=15,replace = TRUE)),
V2 = c(10,sample(c("J","N"),size=15,replace = TRUE)),
V3 = c(1,sample(c("J","N"),size=15,replace = TRUE))
)
It has - in this example - 3 different variants. Only one is recommended. It is assumed that the variant with the highest N (= first entry in each Vx column) is the real one.
My formatted table is produced with this code:
df1 %>%
mutate(
V2 = ifelse((as.character(V2) == as.character(V1)) == FALSE,
cell_spec(V2, color = "red",bold = TRUE),
cell_spec(V2, color = "black",bold = FALSE)),
V3 = ifelse((as.character(V3) == as.character(V1)) == FALSE,
cell_spec(V3, color = "red",bold = TRUE),
cell_spec(V3, color = "black",bold = FALSE))
) %>%
kable(format = "html", escape = FALSE) %>%
kable_styling(c("striped", "condensed"), full_width = FALSE) %>%
row_spec(1, bold = T, color = "white", background = "#D7261E")
Two questions:
How can I mutate in a loop?
This is necessary because the different categories I have to investigate can have up to 18 different variants. In each dataset, V1 is always the reference variant.
As you can see (run the code!), the first line (the "N"s) is coded in the wrong manner. Is it possible to compare from the second line onwards only (with the first line set to TRUE by default)?
This would be good, because the first line is currently formatted in a way that does not really make sense.
Thank you!
To answer your two questions:
Instead of looping over the columns, you can use mutate_all
Just take a copy of the first column and mutate it back in later
I have first made your cell_spec calls into functions to reduce clutter in the code.
red <- function(x) cell_spec(x, color = "red", bold = TRUE)
black <- function(x) cell_spec(x, color = "black", bold = FALSE)
c1 <- as.character(df1[[1]])
Now we can do this:
df1 %>%
select(-V1) %>%
mutate_all(function(x) ifelse(as.character(x) != c1, red(x), black(x))) %>%
mutate(V1 = black(c1)) %>%
mutate_all(function(x) `[<-`(x, 1, " ")) %>%
select(V1, V2, V3) %>%
kable(format = "html", escape = FALSE) %>%
kable_styling(c("striped", "condensed"), full_width = FALSE) %>%
row_spec(1, bold = T, color = "white", background = "#D7261E")
Which gives this result:
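In current dplyr the same idea can also be written with across(), which has superseded mutate_all. The sketch below is an alternative under stated assumptions, not the code from this answer: toy is a hypothetical stand-in for df1, and the helper vectors ref and is_data_row are names introduced here to leave the first (count) row out of the comparison.

```r
library(dplyr)
library(kableExtra)

# Hypothetical data in the same shape as df1:
# the first row holds the counts, the other rows hold "J"/"N" flags
toy <- data.frame(V1 = c("68", "J", "N", "J"),
                  V2 = c("10", "J", "J", "J"),
                  V3 = c("1",  "N", "N", "J"),
                  stringsAsFactors = FALSE)

ref <- toy$V1                          # reference variant
is_data_row <- seq_len(nrow(toy)) > 1  # FALSE for the count row

styled <- toy %>%
  # Apply the same comparison to every column except V1
  mutate(across(-V1, ~ ifelse(
    is_data_row & .x != ref,
    cell_spec(.x, format = "html", color = "red", bold = TRUE),
    cell_spec(.x, format = "html", color = "black", bold = FALSE)
  )))

styled %>%
  kable(format = "html", escape = FALSE) %>%
  kable_styling(c("striped", "condensed"), full_width = FALSE) %>%
  row_spec(1, bold = TRUE, color = "white", background = "#D7261E")
```

Because the column selection is -V1, the same call works unchanged whether there are 3 variants or 18.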
Thank you, @AllanCameron!
I'm not familiar with the purrr package - I really should study it more.
Your idea with purrr::map_dfc solved the problem.
Instead of the first column, it is the first row (the digit row) that I need to handle, and of course this can be solved with grepl. The condition in the ifelse statement is then a little longer.
My final solution is:
df1 %>%
map_dfc(function(x) ifelse(as.character(x) != as.character(df1$V1) & !grepl("[[:digit:]]",x),
mark_true(x), mark_false(x))) %>%
select(V1, everything()) %>%
kable(format = "html", escape = FALSE) %>%
kable_styling(c("striped", "condensed"), full_width = FALSE) %>%
row_spec(1, bold = T, color = "white", background = "#D7261E")
Thank you very much!
I'm using R version 3.6.1 in RStudio. I have flextable version 0.5.5 and officer version 0.3.5.
I'm having difficulty with formatting my numbers in flextables within RMarkdown. By default, all numbers show up with 3 decimal places. For some of my numbers, this is fine (and actually preferred), but for others, I want to remove the decimals.
Using the advice found here I was able to adjust my table so that all numbers are rounded to the nearest whole number. My code is below (example table used for reproducibility; otherwise formatting is the same as my current code).
ft_test <- head(iris) %>% flextable() %>%
hline(part = 'header', border = fp_border(color = "black", width = 3)) %>%
align(align ='center', part = 'all') %>%
align(j = 1, align ='left', part = 'all') %>%
set_formatter_type(fmt_double = "%.0f")
ft_test
However, I only want certain columns to be whole numbers, and other columns to still have decimals. I've tried using the j argument to call certain columns:
ft_test <- head(iris) %>% flextable() %>%
hline(part = 'header', border = fp_border(color = "black", width = 3)) %>%
align(align ='center', part = 'all') %>%
align(j = 1, align ='left', part = 'all') %>%
set_formatter_type(fmt_double = "%.0f", j = 2)
ft_test
... but then I get an error telling me j = 2 is an unused argument.
Any suggestions for how to adjust the numbers of only some columns? Thanks in advance for your help!
You cannot use the argument j because it is not an argument of set_formatter_type. That function sets formatters for one or several data types. In your case, it's better to use colformat_num.
library(flextable)
library(officer)
library(magrittr)
ft_test <- head(iris) %>% flextable() %>%
hline(part = 'header', border = fp_border(color = "black", width = 3)) %>%
align(align ='center', part = 'all') %>%
align(j = 1, align ='left', part = 'all') %>%
colformat_num(j = c("Sepal.Length", "Sepal.Width",
"Petal.Length", "Petal.Width"), digits = 1)
ft_test
You can learn more about formatting content here: https://davidgohel.github.io/flextable/articles/display.html
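For readers on a newer flextable than the 0.5.5 mentioned in the question: as far as I know, later releases moved the digits argument to colformat_double(), so the sketch below assumes that function is available in your installed version. The choice of which columns get 0 or 2 decimals is made up for illustration.

```r
library(flextable)

ft_test <- flextable(head(iris))

# Whole numbers for the sepal columns (illustrative choice)
ft_test <- colformat_double(ft_test,
                            j = c("Sepal.Length", "Sepal.Width"),
                            digits = 0)

# Two decimals for the petal columns (illustrative choice)
ft_test <- colformat_double(ft_test,
                            j = c("Petal.Length", "Petal.Width"),
                            digits = 2)

ft_test
```

Calling the formatter once per group of columns is what allows different columns to have different numbers of decimals.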
I am trying to generate a flex table in HTML using R Markdown that has several cells that are merged across multiple rows using ReporteRs::spanFlexTableRows(). However, when I merge the cells together, the numbers displayed in the merged cells appear at the top of the cells. Moreover, the border between the two sets of (adjacent) merged cells disappears. I would like to know:
How to center the contents of each merged cell vertically, and
How to retain the border between the two merged cells
The following R script illustrates the issue I am having:
library(tidyverse)
library(ReporteRs)
mtcars[1:4,] %>%
mutate(
x = row.names(mtcars[1:4,]),
var1 = 1,
var2 = 2
) %>%
select(
x, mpg, cyl, disp, var1, var2
) %>%
make_ft %>%
spanFlexTableRows(j=5, from=1, to= 4) %>%
spanFlexTableRows(j=6, from=1, to= 4)
Thanks for your help!
I have a solution to a closely related question: merging and vertical alignment using officer/flextable rather than ReporteRs. Since officer/flextable are intended to replace ReporteRs, I think it is reasonable to post it here.
Vertical alignment can be done intuitively in the GitHub version of flextable (as of 2019-02-10) using valign():
library(tidyverse)
library(officer)
library(flextable)
mtcars[1:4,] %>%
mutate(
x = row.names(mtcars[1:4,]),
var1 = 1,
var2 = 2
) %>%
select(
x, mpg, cyl, disp, var1, var2
) %>%
flextable() %>%
flextable::merge_at(j=5, i=1:4) %>%
flextable::merge_at(j=6, i=1:4) %>%
valign(j = 5:6, valign = 'top') -> myft
myft
Vertical alignment in flextable was historically (as of 2019-02-10 CRAN version) part of the flextable::rotate() function. Here's your example:
library(tidyverse)
# library(ReporteRs)
library(officer)
library(flextable)
# turn into flextable, merge and apply a vertical alignment
# (note that center vertical alignment appears to be the default in flextables
# so despite your goals i do top alignment here)
mtcars[1:4,] %>%
mutate(
x = row.names(mtcars[1:4,]),
var1 = 1,
var2 = 2
) %>%
select(
x, mpg, cyl, disp, var1, var2
) %>%
flextable() %>%
flextable::merge_at(j=5, i=1:4) %>%
flextable::merge_at(j=6, i=1:4) %>%
rotate(j = 5:6, align = 'top', rotation = 'tblr') -> myft
# write to a docx using officer
doc = read_docx()
doc = flextable::body_add_flextable( doc, myft )
print(doc, target = "ftex.docx" )
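For the two goals stated in the question (vertically centred content and a visible border between the merged cells), a possible sketch follows. It assumes a flextable version that exports valign(), vline() and fix_border_issues(); the border colour and width are arbitrary choices, and I have not checked the rendering in every output format.

```r
library(flextable)
library(officer)

# Same shape of data as in the example above, built with base R
dat <- mtcars[1:4, c("mpg", "cyl", "disp")]
dat <- cbind(x = row.names(dat), dat, var1 = 1, var2 = 2)

myft <- flextable(dat)

# Merge rows 1 to 4 in columns 5 and 6
myft <- merge_at(myft, i = 1:4, j = 5)
myft <- merge_at(myft, i = 1:4, j = 6)

# Centre the merged content vertically
myft <- valign(myft, j = 5:6, valign = "center", part = "body")

# Draw a vertical line to the right of column 5 so the two merged cells stay separated
myft <- vline(myft, j = 5,
              border = fp_border(color = "black", width = 1),
              part = "body")

# Re-apply borders that can get lost when cells are merged
myft <- fix_border_issues(myft)

myft
```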