I’m wondering if there is a way to do the following in R:
Produce separate .xlsx workbooks from a single dataset based on a column value
Apply conditional formatting to rows in each .xlsx file based on a column value
I can do each of these separately, but efforts to combine them haven't been successful and I can't find an exact use-case match online. Any help would be greatly appreciated.
I can't share my specific data, but here is a sample that replicates the data I have.
df <- data.frame (
assign = c("YES", "NO", "NO", "YES", "NO", "YES", "YES", "NO"),
dept = c("HIST","HIST", "PSYC", "PSYC", "PSYC", "ENGL", "ENGL", "ENGL"),
class = c(1009, 1330, 1001, 1015, 2190, 1001, 3001, 4390))
I can successfully create separate workbooks by generating a list of the dept variable and then using lapply(), but attempts to incorporate conditional formatting are unsuccessful:
# create a list of dept values to split into separate workbooks
li <- split(df, with(df, df$dept), drop = FALSE)
# using lapply to generate .xlsx docs
lapply(names(li), function(x){write.xlsx(li[[x]], "report", file = paste0("report_", x, ".xlsx"), row.names = FALSE)})
With the following code, I can generate a .xlsx file with conditional formatting, but can only produce a single file with all rows rather than multiple files:
# create style for classes that haven’t finished the assignment
noadmin <- createStyle(fontColour = "#FF0000", fontSize = 10)
# create style for top row
Heading <- createStyle(textDecoration = "bold", fgFill = "#FFFFCC", border = "TopBottomLeftRight")
# workbook call begins here
assign_all <- createWorkbook()
addWorksheet(assign_all, 1, gridLines = TRUE)
writeData(assign_all, 1, df, withFilter = TRUE)
# identify which rows didn’t complete (e.g., need to be formatted)
noRows = data.frame(which(df$assign == "NO", arr.ind=FALSE))
# freeze top row
freezePane(assign_all, 1, firstActiveRow = 2, firstActiveCol = 1)
# add style to header
addStyle(assign_all, 1, cols = 1:ncol(df), rows = 1, style = Heading)
# add style to "NO" rows
addStyle(assign_all, 1, cols = 1:ncol(df), rows = noRows[,1]+1, style = noadmin, gridExpand = TRUE)
saveWorkbook(assign_all, paste0("report.xlsx"), overwrite = TRUE)
This produces the output I want, but with all rows in one file:
Thanks in advance for any guidance you can provide. I've been working on this problem for a few weeks and have run out of ideas.
You could put the code that creates the workbook inside a function, then loop over the list of split data frames to create your .xlsx files. Instead of lapply I use mapply, so the function receives both each data frame and its name:
li <- split(df, df$dept)
library(openxlsx)
# create style for classes that haven’t finished the assignment
noadmin <- createStyle(fontColour = "#FF0000", fontSize = 10)
# create style for top row
Heading <- createStyle(textDecoration = "bold", fgFill = "#FFFFCC", border = "TopBottomLeftRight")
make_xl <- function(x, y) {
assign_all <- createWorkbook()
addWorksheet(assign_all, 1, gridLines = TRUE)
writeData(assign_all, 1, x, withFilter = TRUE)
# identify which rows didn’t complete (e.g., need to be formatted)
noRows = data.frame(which(x$assign == "NO", arr.ind=FALSE))
# freeze top row
freezePane(assign_all, 1, firstActiveRow = 2, firstActiveCol = 1)
# add style to header
addStyle(assign_all, 1, cols = 1:ncol(x), rows = 1, style = Heading)
# add style to "NO" rows
addStyle(assign_all, 1, cols = 1:ncol(x), rows = noRows[,1]+1, style = noadmin, gridExpand = TRUE)
saveWorkbook(assign_all, paste0("report_", y, ".xlsx"), overwrite = TRUE)
}
mapply(make_xl, li, names(li))
#> ENGL HIST PSYC
#> 1 1 1
list.files(pattern = "^report")
#> [1] "report_ENGL.xlsx" "report_HIST.xlsx" "report_PSYC.xlsx"
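To see what the function computes for each department, here is a minimal base-R sketch of the row-index step on its own. It reuses the sample data from the question; the name no_rows is only illustrative. The + 1 accounts for the header that writeData puts in row 1.

```r
df <- data.frame(
  assign = c("YES", "NO", "NO", "YES", "NO", "YES", "YES", "NO"),
  dept = c("HIST", "HIST", "PSYC", "PSYC", "PSYC", "ENGL", "ENGL", "ENGL"),
  class = c(1009, 1330, 1001, 1015, 2190, 1001, 3001, 4390)
)

# One data frame per department, named by department
li <- split(df, df$dept)

# For each department, the worksheet rows that hold a "NO":
# which() gives positions within that department's data frame,
# and + 1L shifts them past the header row
no_rows <- lapply(li, function(x) which(x$assign == "NO") + 1L)

no_rows$ENGL  # 4
no_rows$HIST  # 3
no_rows$PSYC  # 2 4
```

Because the positions are computed inside each split data frame, they restart at 1 for every department, which is why the styling has to happen inside the function rather than once on the full df.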
How do I use openxlsx::conditionalFormatting on a set of column indexes that are not necessarily consecutive? In the documentation (?conditionalFormatting), all the examples fill the cols argument with a range such as cols = 1:5, meaning 1, 2, 3, 4, 5. However, I want my columns to be color coded according to whether their index is in a given vector, and those indexes are not necessarily consecutive: they could be 1, 2, 4, 6, 8 or something like that.
As an example:
library(tidyverse)
# install.packages("openxlsx")
library(openxlsx)
library(writexl)
library(glue)
data_format <- data.frame(vals = c(5,6,2,12,5,12,5,4.5,12,13,3,15,17,30,7,19),
vals1 = c(2,6,2,12,13,12,5,4.5,12,13,3,15,19,30,7,9),
vals2 = c(2,7,2,7,13,12,5,4.5,12,1,3,15,20,30,7,6),
vals3 = c(1,20,2,8,12,1,1,9,4.2,16,11,3,14,10,28,5),
vals4 = c(5,13,2,12,13,12,1,4.5,12,10,3,15,20,29,7,9),
vals5 = c(5,15,2,10,18,11,3,4.5,12,13,2,15,86,90,9,11),
thresh1 = c(4,11,9,13.5,12,12,6,4.8,10,14,3,17,22,80,8,13),
thresh2 = c(6,12,1,13,16,11,5,3,16,12,1,13,19,20,6,10))
data_format <-
data_format %>%
relocate(thresh1, .before=vals)
data_format <-
data_format %>%
relocate(thresh2, .after=thresh1)
wb <- createWorkbook()
addWorksheet(wb, "data_format")
writeData(wb, "data_format",data_format)
Colpink1 <- c(4,5,6,8) # I would expect these columns to be pink when they are less than column A
Colpurple2 <- c(3,7) # I would expect these columns to be purple when they are less than column B
pinkStyle <- createStyle(fontColour = "#FA977C")
purpleStyle <- createStyle(fontColour = "#9B7CFA")
conditionalFormatting(wb, "data_format",
cols = Colpink1,
rows = 2:nrow(data_format), rule = "<$A2", style = pinkStyle
)
conditionalFormatting(wb, "data_format",
cols = Colpurple2,
rows = 2:nrow(data_format), rule = "<$B2", style = purpleStyle
)
filepath <-
glue("PATH/format_coloring.xlsx")
saveWorkbook(wb, file = filepath)
Column C is purple as expected but G has pink and purple values. I would want G to just be color coded according to purple. The other columns have a mixture of pink and purple where I would expect pink. Does anyone have an idea on how to conditionally format according to index not in order?
If anyone has ideas that would be appreciated.
The cols argument appears to be treated as a range: even though you specify c(4,5,6,8), it is interpreted as columns 4-8 (including column 7). You could instead loop through Colpink1 and Colpurple2, applying the conditional formatting to one column at a time:
wb <- createWorkbook()
addWorksheet(wb, "data_format")
writeData(wb, "data_format",data_format)
Colpink1 <- c(4,5,6,8)
Colpurple2 <- c("C","G") #just testing that column letters worked too
pinkStyle <- createStyle(fontColour = "#FA977C")
purpleStyle <- createStyle(fontColour = "#9B7CFA")
# writeData() puts the header in row 1, so the data occupy rows 2 to nrow + 1
data_rows <- 2:(nrow(data_format) + 1)
for(i in Colpink1){
conditionalFormatting(wb, "data_format",
cols = i,
rows = data_rows, rule = "<$A2", style = pinkStyle)
}
for(i in Colpurple2){
conditionalFormatting(wb, "data_format",
cols = i,
rows = data_rows, rule = "<$B2", style = purpleStyle)
}
openXL(wb)
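If there are many columns, looping one column at a time creates one rule per column. A possible refinement, sketched below under the assumption (taken from the observation above) that cols is treated as a range, is to group the indexes into runs of consecutive values and apply the rule once per run. The helper contiguous_runs is hypothetical, not part of openxlsx.

```r
# Hypothetical helper: split column indexes into runs of consecutive values
contiguous_runs <- function(cols) {
  cols <- sort(unique(cols))
  # a new run starts wherever the gap to the previous index is not 1
  run_id <- cumsum(c(1, diff(cols) != 1))
  unname(split(cols, run_id))
}

pink_runs <- contiguous_runs(c(4, 5, 6, 8))    # runs: 4 5 6, then 8
purple_runs <- contiguous_runs(c(3, 7))        # runs: 3, then 7

# Each run could then be passed to conditionalFormatting() on its own, e.g.
# for (run in pink_runs) {
#   conditionalFormatting(wb, "data_format", cols = run,
#                         rows = 2:(nrow(data_format) + 1),
#                         rule = "<$A2", style = pinkStyle)
# }
```

For the example vectors this gives two calls for pink instead of four, and the rule never spills into column 7.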
I need to apply color conditional formatting to one particular column, format it as a percentage, and export the file as .xlsx. Note that I have 5 data frames to run this code on, and I will compile them into one workbook, each on a different sheet. I am stuck because I can't seem to set the conditional rule once the column has been formatted as a percentage. Conversely, if I apply the conditional formatting first, I'm not sure how to format that column as a percentage. Please refer to my code below.
## Dataframe
cost_table <- read.table(text = "FRUIT COST SUPPLY_RATE
1 APPLE 15 0.026377
2 ORANGE 14 0.01122
3 KIWI 13 0.004122
5 BANANA 11 0.017452
6 AVOCADO 10 0.008324 " , header = TRUE)
## This is the line where I label the %. However if I do that, conditional formatting will not recognize it in the rule
cost_table$SUPPLY_RATE <- label_percent(accuracy = 0.01)(cost_table$SUPPLY_RATE)
## Creating workbook and sheet
Fruits_Table <- createWorkbook()
addWorksheet(Fruits_Table,"List 1")
writeData(Fruits_Table,"List 1",cost_table)
## Style color for conditional formatting
posStyle <- createStyle(fontColour = "#006100", bgFill = "#C6EFCE")
negStyle <- createStyle(fontColour = "#9C0006", bgFill = "#FFC7CE")
## If Supply rate is above 1.5%, it will be green, if it's equivalent or below, it will be red
conditionalFormatting(Fruits_Table, "List 1",
cols = 3,
rows = 2:6, rule = "C2> 0.015", style = posStyle
)
conditionalFormatting(Fruits_Table, "List 1",
cols = 3,
rows = 2:6, rule = "C2<= 0.015", style = negStyle
)
The output should be as shown below.
Regarding Borderline info
What I'm looking at is to apply outside border for c2:c6.
To clarify my purpose, the final output will be shown as below. I have some other codes to format the borders for the headers and column A:B. Because of the percentage style, it affected my borderline.
You don't need to use label_percent from the scales package.
You can apply the percentage format alongside the color rules by using createStyle and then addStyle. Also, according to the documented examples of conditionalFormatting, you don't need to specify the column (such as C) in the rule argument if the rule applies to a single column and doesn't depend on values in another column.
Here is the code that I used:
Fruits_Table <- createWorkbook()
addWorksheet(Fruits_Table,"List 1")
writeData(Fruits_Table,"List 1",cost_table)
conditionalFormatting(Fruits_Table, "List 1",
cols = 3,
rows = 2:6, rule = "> 0.015", style = posStyle)
conditionalFormatting(Fruits_Table, "List 1",
cols = 3,
rows = 2:6, rule = "<= 0.015",
style = negStyle)
percent_style <- createStyle(numFmt = "PERCENTAGE")
addStyle(Fruits_Table,"List 1", style = percent_style, rows = 2:6, cols = 3)
saveWorkbook(Fruits_Table, "my_fruits_table.xlsx")
I tried that code and it works.
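As a sketch of why the numeric rules keep working: a number format only changes how a cell is displayed, so the stored values stay numeric. The snippet below assumes openxlsx is installed. The custom format string "0.00%" (two decimals, chosen to mirror label_percent(accuracy = 0.01)), the temporary file, and the names pct_style and round_trip are illustrative choices, not part of the answer above.

```r
library(openxlsx)

cost_table <- data.frame(
  FRUIT = c("APPLE", "ORANGE", "KIWI", "BANANA", "AVOCADO"),
  COST = c(15, 14, 13, 11, 10),
  SUPPLY_RATE = c(0.026377, 0.01122, 0.004122, 0.017452, 0.008324)
)

wb <- createWorkbook()
addWorksheet(wb, "List 1")
writeData(wb, "List 1", cost_table)

# Header is in row 1, so the data occupy rows 2 to nrow + 1
data_rows <- 2:(nrow(cost_table) + 1)

# Custom number format: display as a percentage with two decimals
pct_style <- createStyle(numFmt = "0.00%")
addStyle(wb, "List 1", style = pct_style, rows = data_rows, cols = 3)

out_file <- tempfile(fileext = ".xlsx")
saveWorkbook(wb, out_file, overwrite = TRUE)

# Reading the file back shows the column is still numeric, not text
round_trip <- read.xlsx(out_file, sheet = "List 1")
is.numeric(round_trip$SUPPLY_RATE)
```

By contrast, label_percent returns character strings, which is why a rule such as "> 0.015" has nothing numeric to compare against once the column has been converted.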
Updated to add borderline info
If you want to add a border along with the percentage format, you can use the border and borderStyle arguments as follows:
percent_border_style <- createStyle(numFmt = "PERCENTAGE",
border = "TopBottomLeftRight",
borderStyle = "medium")
addStyle(Fruits_Table,"List 1",
style = percent_border_style,
rows = 2:6, cols = 3)
saveWorkbook(Fruits_Table, "borderline_fruits_table.xlsx")
Here is the borderline result
If you want to apply different styles to different cells, as you explained in your comment, create each style with createStyle and then apply it to the relevant cells with addStyle, specifying the rows and columns for each style. To keep the percentage format, include numFmt in every style that you apply to those cells.
Here is an example code to apply outside borders to the targeted column. The code customizes borders to three groups of cells:
top_side_line <- createStyle(numFmt = "PERCENTAGE",
border = "TopLeftRight",
borderStyle = "medium")
side_line <- createStyle(numFmt = "PERCENTAGE",
border = "LeftRight",
borderStyle = "medium")
bottom_side_line <- createStyle(numFmt = "PERCENTAGE",
border = "BottomLeftRight",
borderStyle = "medium")
addStyle(Fruits_Table,"List 1",
style = top_side_line, rows = 2, cols = 3)
addStyle(Fruits_Table,"List 1",
style = side_line, rows = 3:5, cols = 3)
addStyle(Fruits_Table,"List 1",
style = bottom_side_line, rows = 6, cols = 3)
saveWorkbook(Fruits_Table, "newline_fruits_table.xlsx")
Here is the result:
I am attempting to create and save multiple formatted Excel files, each of which is a subset of a data frame, split by a factor.
This is an example of what I have tried so far
# Create data
df <- data.frame(category = rep(c("a","b","c","d"),times = 20),
values = rnorm(20,5,2))
# Create workbooks named after specific level of factor
l1 <- sapply(levels(df$category), assign, value = createWorkbook())
# Create styles
hs <- createStyle(fgFill = "#808080", border = "bottom", textDecoration = "bold")
lt8 <- createStyle(bgFill = "#ff0000")
gt30 <- createStyle(bgFill = "#00b0f0")
grn <- createStyle(bgFill = "#00b000")
# For loop
for (i in l1) {
addWorksheet(i, names(i))
writeData(i, names(i), df[df$category == names(i),], headerStyle = hs)
conditionalFormatting(i, names(i), cols = 1:2, rows = 2:nrow(df[df$category == names(i),]), rule = "$B2<2", type = "expression", style = lt8)
conditionalFormatting(i, names(i), cols = 1:2, rows = 2:nrow(df[df$category == names(i),]), rule = "$B2>=7", type = "expression", style = gt30)
conditionalFormatting(i, names(i), cols = 1:2, rows = 2:nrow(df[df$category == names(i),]), rule = "AND($B2>=4, $B2<5.5)", style = grn)
setColWidths(i, names(i), cols=1:2, widths = "auto")
saveWorkbook(paste(i, ".wb", sep = ""), file = paste(i, " Report ", ".xlsx", sep = ""))
}
Each time, I run into this error
Error in if (tolower(sheetName) %in% tolower(wb$sheet_names)) stop("A worksheet by that name already exists! Sheet names must be unique case-insensitive.")
This is the first time I've attempted to assign any sheets so I'm not exactly sure why I keep getting this error.
Ultimately, I would like to save the subsetted and formatted excel workbooks through a repetitive process because my real data would produce many more workbooks. The workbooks must be separate and placing these subsets in sheets won't work.
Any and all advice on how to achieve this would be greatly appreciated.
Your error is coming from this line:
addWorksheet(i, names(i))
because names(i) is empty:
> names(l1[['a']])
character(0)
You might be better off looping over the names of l1, so you have the categories you want, using that to pull the appropriate workbook from the list. Something like:
for (i in names(l1)) {
wb = l1[[i]]
addWorksheet(wb, i)
category_data <- df[df$category == i,]
writeData(wb, i, category_data, headerStyle = hs)
# header is in row 1, so the data occupy rows 2 to nrow + 1
conditionalFormatting(wb, i, cols = 1:2, rows = 2:(nrow(category_data) + 1), rule = "$B2<2", type = "expression", style = lt8)
conditionalFormatting(wb, i, cols = 1:2, rows = 2:(nrow(category_data) + 1), rule = "$B2>=7", type = "expression", style = gt30)
conditionalFormatting(wb, i, cols = 1:2, rows = 2:(nrow(category_data) + 1), rule = "AND($B2>=4, $B2<5.5)", style = grn)
setColWidths(wb, i, cols=1:2, widths = "auto")
saveWorkbook(wb, file = paste(i, " Report ", ".xlsx", sep = ""))
}
There's still one subtle error here:
l1 <- sapply(levels(df$category), assign, value = createWorkbook())
createWorkbook() is only being called once, so you have 4 copies of the same workbook. That means the final save will have all 4 tabs. Compare:
> identical(l1$a, l1$b)
[1] TRUE
with 2 separate calls to createWorkbook():
> identical(createWorkbook(), createWorkbook())
[1] FALSE
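The same effect can be reproduced in base R without openxlsx. In the sketch below, make_object is a hypothetical stand-in for createWorkbook() that counts how often it is called and returns an environment (which, like a workbook object, is passed by reference). An argument supplied through sapply's extra arguments is evaluated once and then reused for every element.

```r
calls <- 0
make_object <- function() {
  calls <<- calls + 1   # count how often the constructor runs
  new.env()
}

# Constructor call passed as an extra argument: evaluated once, then shared
shared <- sapply(c("a", "b"), function(name, value) value,
                 value = make_object(), simplify = FALSE)
calls_shared <- calls                      # 1

# Constructor called inside the function: evaluated once per element
calls <- 0
separate <- sapply(c("a", "b"), function(name) make_object(),
                   simplify = FALSE)
calls_separate <- calls                    # 2

identical(shared$a, shared$b)              # TRUE: the same object twice
identical(separate$a, separate$b)          # FALSE: two distinct objects
```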
Might be worth just looping over the distinct categories, and creating the workbook inside the loop. That is:
library(openxlsx)
# Create data
df <- data.frame(category = rep(c("a","b","c","d"),times = 20),
values = rnorm(20,5,2))
# Create styles
hs <- createStyle(fgFill = "#808080", border = "bottom", textDecoration = "bold")
lt8 <- createStyle(bgFill = "#ff0000")
gt30 <- createStyle(bgFill = "#00b0f0")
grn <- createStyle(bgFill = "#00b000")
# For loop over the distinct categories
# (unique() also works when category is a character column, where levels() returns NULL)
for (i in unique(as.character(df$category))) {
wb <- createWorkbook()
addWorksheet(wb, i)
category_data <- df[df$category == i,]
writeData(wb, i, category_data, headerStyle = hs)
# header is in row 1, so the data occupy rows 2 to nrow + 1
data_rows <- 2:(nrow(category_data) + 1)
conditionalFormatting(wb, i, cols = 1:2, rows = data_rows, rule = "$B2<2", type = "expression", style = lt8)
conditionalFormatting(wb, i, cols = 1:2, rows = data_rows, rule = "$B2>=7", type = "expression", style = gt30)
conditionalFormatting(wb, i, cols = 1:2, rows = data_rows, rule = "AND($B2>=4, $B2<5.5)", style = grn)
setColWidths(wb, i, cols=1:2, widths = "auto")
saveWorkbook(wb, file = paste(i, " Report ", ".xlsx", sep = ""))
}
I am using R to edit an xlsx worksheet. I would like to format the worksheet with colored rows based on character values in a specific column, then save the workbook in xlsx form. I successfully loaded the workbook using xlsx in R. I can run through the characters in the workbook and change that specific cell background color based on a condition, but am having trouble changing the color of that entire row.
My question is, how can I make the entire row a solid color, rather than just the cell? So far, I have followed the instructions and code located here:
Color cells with specific character values in r to export to xlsx
What would you have to add to the code in the above link to make the entire row the same color as the specific cell that was targeted?
greenStyle <- createStyle(fontColour = "#000000", bgFill = "green")
yellowStyle <- createStyle(fontColour = "#000000", bgFill = "yellow")
conditionalFormatting(wb, "entire report", cols=1:12, rows=1:2000, rule="Finished", style = greenStyle, type = "contains")
conditionalFormatting(wb, "entire report", cols=1:12, rows=1:2000, rule="In Process", style = yellowStyle, type = "contains")
saveWorkbook(wb, file, overwrite=TRUE)
This question is from some time ago, I was looking to do this, then found the answer and wanted to share (I am the author of the post referred in the question btw :) )
So, in order to color an entire row, I made a reproducible example:
Reproducible example:
dfX <- data.frame('a' = c(1:4),
'b' = c(1:2,2:1),
'c' = LETTERS[1:4],
'e' = LETTERS[1:2][2:1],
'f' = c('Finished', 'In Process', 'In Process', 'In Process'))
library(openxlsx)
wb <- createWorkbook()
addWorksheet(wb, "Sheet", gridLines = TRUE)
writeData(wb, "Sheet", dfX)
greenRows = data.frame(which(dfX == "Finished", arr.ind=TRUE))
yellowRows = data.frame(which(dfX == "In Process", arr.ind=TRUE))
## Here I create data frames where it states which rows and columns
## have 'Finished' and which have 'In Process'. From here I want to keep only the
## rows from these data frames.
# Create a heading style
Heading <- createStyle(textDecoration = "bold", border = "Bottom")
# Row styles
greenStyle <- createStyle(fontColour = "#000000", fgFill = "green")
yellowStyle <- createStyle(fontColour = "#000000", fgFill = "yellow")
Important Note: I use "fgFill" instead of "bgFill" because in order to do this, we will use addStyle (and not conditionalFormatting), and in the documentation, it states that bgFill is only for conditionalFormatting
# Apply header style:
addStyle(wb, "Sheet", cols = 1:ncol(dfX), rows = 1, style = Heading)
# Apply greenStyle:
addStyle(wb, "Sheet", cols = 1:ncol(dfX), rows = greenRows[,1]+1,
style = greenStyle, gridExpand = TRUE)
# Apply yellowStyle:
addStyle(wb, "Sheet", cols = 1:ncol(dfX), rows = yellowRows[,1]+1,
style = yellowStyle, gridExpand = TRUE)
saveWorkbook(wb, file, overwrite=TRUE)
Note that in "rows = " I input greenRows[,1]+1, which means only the first column of the greenRows data.frame, plus 1 (the first row would be the header, so skip this one)
Also note that in the last line, file should be the full path where the workbook will be saved, ending with the .xlsx extension, such as:
saveWorkbook(wb, file = "C:/Documents/newfile.xlsx", overwrite=TRUE)
This post, though not the same question, helped me.