Conditionally format openxlsx numbers stored as text - r

I am trying to format a mixed class vector in an xlsx output using openxlsx.
The code produces the excel file desired, but the numeric values are stored as test, how do I go about conditionally formatting the numbers?
wb <- openxlsx::createWorkbook() # create workbook
openxlsx::addWorksheet(wb, sheetName = 'data') # add sheet
openxlsx::writeData(wb, sheet ='data',
x=df, xy=c(1, 1),withFilter = T) # write data on workbook
# convert square brackets to superscript in xlsx output
for(i in grep("\\_\\[([A-z0-9\\s]*)\\]", wb$sharedStrings)){
# if empty string in superscript notation, then just remove the superscript notation
if(grepl("\\_\\[\\]", wb$sharedStrings[[i]])){
wb$sharedStrings[[i]] <- gsub("\\_\\[\\]", "", wb$sharedStrings[[i]])
next # skip to next iteration
}
# insert additioanl formatting in shared string
wb$sharedStrings[[i]] <- gsub("<si>", "<si><r>", gsub("</si>", "</r></si>", wb$sharedStrings[[i]]))
# find the "_[...]" pattern, remove brackets and udnerline and enclose the text with superscript format
wb$sharedStrings[[i]] <- gsub("\\_\\[([A-z0-9\\s]*)\\]",
"</t></r><r><rPr><vertAlign val=\"superscript\"/></rPr><t xml:space=\"preserve\">\\1</t></r><r><t xml:space=\"preserve\">",
wb$sharedStrings[[i]])
}
# write excel file
openxlsx::saveWorkbook(wb, file="test.xlsx",
overwrite = TRUE)
The output has the error triangles saying "Number stored as Text".
I tried adding, but no luck.
openxlsx::writeData(wb, sheet = 'data', names(df)[i], startCol = i, startRow = 1)
icol <- df[[i]]
for(j in seq_along(icol)){
x <- icol[[j]]
openxlsx::writeData(wb, sheet = 'data', x, startCol = i, startRow = j + 1)
}
}

Related

how can i stop excel showing 'number stored as text' for numbers with commas, when running R program?

I'm running this R code and the excel output file shows 'number stored as text' warning boxes against each of the numbers. I'm aware it's the commas causing this, but i want the commas in there.
Is there a way to stop these warning boxes showing?
df1 <- data.frame(col_1 = c('1,000', '1,500', '5,000'))
wb <- createWorkbook()
sheet.name <- 'test'
addWorksheet(wb, sheet.name)
writeData(wb, sheet = sheet.name, df1)
saveWorkbook(wb, file = "Test.xlsx")
Just save the numbers without the comma and then set the option to display a comma in Excel. You have the values as character string---not as numeric---in R. I don't think you can save them with a comma and simultaneously have it as number.
However, you can remove the comma in R before the export and set an additional style option to have the commas in Excel:
require("openxlsx")
df1 <- data.frame(col_1 = c("1,000", "1,500", "5,000"))
df1[, 1] <- as.numeric(gsub(",", "", df1[, 1]))
wb <- createWorkbook()
sheet.name <- 'test'
addWorksheet(wb, sheet.name)
writeData(wb, sheet = sheet.name, df1)
comma_thousand <- createStyle(numFmt = "COMMA")
addStyle(wb, sheet = sheet.name, style = comma_thousand, cols = 1, rows = seq(2, nrow(df1) + 1))
saveWorkbook(wb, file = "Test.xlsx", overwrite = TRUE)
I have the rows as seq(2, nrow(df1) + 1) since you have a header in the table.
Here, df1[, 1] <- as.numeric(gsub(",", "", df1[, 1])) is used to get rid of the commas before you save the dataframe as xlsy. You could also use other functions. In Rstudio, your dataframe will look like these before and after running this line:
The output looks like this in Excel:
So basically, use the values with the commas in R (for whatever reason), but remove them before you save the xlsx and use the code I provided in order to tell Excel you want the values to be displayed with commas.

Formatting numbers with openxlsx package in R

I am trying to format certain columns using openxlsx to write an r data frame to an excel file.
Here is a snippet of the R data frame:
The square bracket part in the "seed" column is used to superscript the excel output.
Here is the code I used to write the file:
openxlsx::addWorksheet(wb, sheetName = 'data') # add sheet
openxlsx::writeData(wb, sheet ='data',
x=df, xy=c(1, 1),withFilter = T) # write data on workbook
# make quality codes superscript
for(i in grep("\\_\\[([A-z0-9\\s]*)\\]", wb$sharedStrings)){
# if empty string in superscript notation, then just remove the superscript notation
if(grepl("\\_\\[\\]", wb$sharedStrings[[i]])){
wb$sharedStrings[[i]] <- gsub("\\_\\[\\]", "", wb$sharedStrings[[i]])
next # skip to next iteration
}
# insert additional formatting in shared string
wb$sharedStrings[[i]] <- gsub("<si>", "<si><r>", gsub("</si>", "</r></si>", wb$sharedStrings[[i]]))
# find the "_[...]" pattern, remove brackets and udnerline and enclose the text with superscript format
wb$sharedStrings[[i]] <- gsub("\\_\\[([A-z0-9\\s]*)\\]",
"</t></r><r><rPr><vertAlign val=\"superscript\"/></rPr><t xml:space=\"preserve\">\\1</t></r><r><t xml:space=\"preserve\">",
wb$sharedStrings[[i]])
}
openxlsx::modifyBaseFont(wb, fontSize = 10, fontName = 'Arial')
# right-justify data
openxlsx::addStyle(wb, sheet = 'data',
style = openxlsx::createStyle(halign = "right"), rows = 1:nrow(df)+1, cols = 3:12, gridExpand = TRUE)
#apply to rows with "All" in column B
openxlsx::conditionalFormatting(wb,sheet = 'data',
cols = 1:ncol(df),
rows = 1:nrow(df)+1,
rule = 'LEFT($B2,3)="ALL"',
style = openxlsx::createStyle(textDecoration = 'bold', bgFill = '#dad9d9'))
# format numbers
openxlsx::addStyle(wb = wb, sheet = 'data',
style = openxlsx::createStyle(numFmt = "#,###.0"),
rows = 1:nrow(df)+1, cols = c(5:7,10:12),gridExpand = T)
# write excel file
openxlsx::saveWorkbook(wb, file="file.xlsx",
overwrite = TRUE)
The output looks like this:
I am trying to conditionally format the numbers in the seed/harv/yield/prod columns to be numbers, but the "F" values are creating a mixed-class vector at best. (I need these Fs!)
Any ideas?
Thanks!

exporting a list from R to excel in a good format

I'm trying to use the library(xlsx) to write some data from R into excel in a readable format.
My dataset is formatted as:
tbl <- list("some_name"=head(mtcars),"some_name2"=head(iris))
I would like to write this table to excel, with each item in the list being identified and the data being next to the item. E.g. the excel file should look like
"some_name" in cell A1
paste the dataframe head(mtcars) in cell A3
"some_name2" in cell A11
paste the dataframe head(iris) in cell A13
or something similar, e.g. pasting each item into a new worksheet.
Using
write.xlsx(tbl,"output.xlsx")
will output it correctly however it is not formatted in a readable way.
Any help would be great
The following codes create a xlsx file with multiple sheets, each of which holds a list name as the sheet name and a title, and a data frame below the title. You can modify it as you like.
require(xlsx)
ls2xlsx <- function(x, wb){
for(i in 1:length(x)){
sh <- createSheet(wb, names(x[i]))
cl_title <- createCell(createRow(sh, 1), 1)
addDataFrame(x[i], sh, startRow = 2, startColumn = 1)
setCellValue(cl_title[[1, 1]], names(x[i]))
}
}
tbl <- list("some_name" = head(mtcars),"some_name2"=head(iris))
wb <- createWorkbook()
ls2xlsx(tbl, wb)
saveWorkbook(wb, 'test.xlsx')
The following function writes a list of dataframes to an .xlsx file.
It has two modes, given by argument beside.
beside = TRUE is the default. It writes just one sheet, with the dataframe name on the first row, then an empty cell, then the dataframe. And repeats this for all dataframes, written side by side.
beside = FALSE writes one dataframe per sheet. The sheets' names are the dataframes names. If the list members do not have a name, the name is given by argument sheetNamePrefix.
The .xlsx file is written in the directory given by argument file.
writeList_xlsx <- function(x, file, beside = TRUE, sheetNamePrefix = "Sheet"){
xnames <- names(x)
shNames <- paste0(sheetNamePrefix, seq_along(x))
if(is.null(xnames)) xnames <- shNames
if(any(xnames == "")){
xnames[xnames == ""] <- shNames[xnames == ""]
}
wb <- createWorkbook(type = "xlsx")
if(beside){
sheet <- createSheet(wb, sheetName = shNames[1])
row <- createRow(sheet, rowIndex = 1)
col <- 0
for(i in seq_along(x)){
col <- col + 1
cell <- createCell(row, colIndex = col)
setCellValue(cell[[1, 1]], xnames[i])
col <- col + 2
addDataFrame(x[[i]], sheet,
startRow = 1, startColumn = col,
row.names = FALSE)
col <- col + ncol(x[[i]])
}
}else{
for(i in seq_along(x)){
sheet <- createSheet(wb, sheetName = xnames[i])
addDataFrame(x[[i]], sheet, row.names = FALSE)
}
}
if(!grepl("\\.xls", file)) file <- paste0(file, ".xlsx")
saveWorkbook(wb, file = file)
}
writeList_xlsx(tbl, file = "test.xlsx")
writeList_xlsx(tbl, file = "test2.xlsx", beside = FALSE)

Extract Links from Excel Sheet using R-package 'openxlsx'

I'm working with an Excel sheet in which some columns contain hyperlinks that are represented as text that is completely different from the actual address the hyperlinks point to. I want to use some R code to modify and subset the Excel sheet but keep the hyperlinks. I think I can do this by extracting those hyperlinks as an indexed character vector then re-introducing them into a new Excel document using the makeHyperlinkString() and writeFormula() functions. But I cannot figure out how to get a vector of the links themselves.
In case it matters, my intention is to do all the modifying and subsetting on a data.frame version of the Excel sheet rather than a workbook object.
Oh, now I think I got your problem. I thought there were only normal hyperlinks not Excel-Hyperlinks.
I think this may help you to get a vector of the hyperlinks, although its a bit messy.
library(openxlsx)
pathtofile = "path to .xlsx file"
df1 <- read.xlsx(xlsxFile = pathtofile,
sheet = 1, skipEmptyRows = FALSE,
colNames = F, rowNames = F,
startRow = 1)
## Sheet or Tabelle
Sheet = "Sheet" ## Or "Tabelle"
## Get Names of rows from Hyperlink column
rowIndex <- sub(x = df1[,1], pattern = paste0("(#'",Sheet,"\\d'!)"), replacement = "")
## Get the Sheet, where Hyperlinks are saved
SheetName <- regmatches(df1[,1], regexpr(text = df1[,1], pattern = paste0("(",Sheet,"\\d)")))
## Extract only the Sheet number
SheetIndex <- as.numeric(sub(x = SheetName, pattern = Sheet, replacement = ""))
## Get the row Indexes as numeric
RowIndexNum <- as.numeric(regmatches(rowIndex, regexpr(text = rowIndex, pattern = "\\d")))
## Get the column name as character
RowIndexName <- sub(x = rowIndex, pattern = "\\d", "")
## Create uppercase Letters
myLetters <- toupper(letters[1:26])
## Convert Row Name (character) to numeric (based on alphabetical order)
RowIndexNameNum <- match(RowIndexName, myLetters)
## If Hyperlinks only in 1 Sheet or several sheets
if (length(unique(SheetIndex)) == 1) {
dfLinks <- read.xlsx(xlsxFile = pathtofile,
sheet = unique(SheetIndex),
skipEmptyRows = FALSE,
colNames = F, rowNames = F,
rows = RowIndexNum[1]:tail(RowIndexNum,1),
cols = unique(RowIndexNameNum),
startRow = 1
);
} else {
dfLinks <- data.frame()
for (i in unique(SheetIndex)){
dfTmp <- read.xlsx(xlsxFile = pathtofile,
sheet = i,
skipEmptyRows = FALSE,
colNames = F, rowNames = F,
rows = RowIndexNum[1]:tail(RowIndexNum,1),
cols = unique(RowIndexNameNum),
startRow = 1)
dfLinks <- rbind(dfLinks, dfTmp)
}
}
dfLinks
This is how my Excel File looks like:

Colouring cells in Excel output from R [duplicate]

I want to export data frames to Excel and highlight cells according to certain rules. I don't think this answer to a similar question is correct. I think it is possible, and I think I get close using the CellStyle functions of the xlsx package.
Below I outline what I've tried. Most of the ideas come from the package help files. I get all the way to the end and get an error when I try to apply the style I created to the cells that meet the criteria. I get the error: Error in .jcall(cell, "V", "setCellStyle", cellStyle$ref) : RcallMethod: invalid object parameter.
library(xlsx)
# create data
cols <- sample(c(1:5), 1) # number of columns to vary to mimic this unknown
label <- rep(paste0("label ", seq(from=1, to=10)))
mydata <- data.frame(label)
for (i in 1:cols) {
mydata[,i+1] <- sample(c(1:10), 10)
}
# exporting data.frame to excel is easy with xlsx package
sheetname <- "mysheet"
write.xlsx(mydata, "mydata.xlsx", sheetName=sheetname)
file <- "mydata.xlsx"
# but we want to highlight cells if value greater than or equal to 5
wb <- loadWorkbook(file) # load workbook
fo <- Fill(backgroundColor="yellow") # create fill object
cs <- CellStyle(wb, fill=fo) # create cell style
sheets <- getSheets(wb) # get all sheets
sheet <- sheets[[sheetname]] # get specific sheet
rows <- getRows(sheet) # get rows
cells <- getCells(rows) # get cells
values <- lapply(cells, getCellValue) # extract the values
# find cells meeting conditional criteria
highlight <- "test"
for (i in names(values)) {
x <- as.numeric(values[i])
if (x>=5 & !is.na(x)) {
highlight <- c(highlight, i)
}
}
highlight <- highlight[-1]
# apply style to cells that meet criteria
if (length(highlight)>0) { # proceed if any cells meet criteria
setCellStyle(cells[highlight], cs) # DOES NOT WORK
}
# save
saveWorkbook(wb, file)
Update:
I've also tried:
if (length(highlight)>0) { # proceed if any cells meet criteria
for (h in 1:length(highlight)) {
setCellStyle(cells[highlight[h]], cs) # DOES NOT WORK
}
}
But I get the error: Error in .jcall(cell, "V", "setCellStyle", cellStyle$ref) : RcallMethod: cannot determine object class
Try this out. I changed a few things, including the a slight change to the call to Fill and limiting the cells included for consideration to those with numeric data. I used lapply to apply the conditional formatting.
cols <- sample(c(1:5), 1) # number of columns to vary to mimic this unknown
label <- rep(paste0("label ", seq(from=1, to=10)))
mydata <- data.frame(label)
for (i in 1:cols) {
mydata[,i+1] <- sample(c(1:10), 10)
}
# exporting data.frame to excel is easy with xlsx package
sheetname <- "mysheet"
write.xlsx(mydata, "mydata.xlsx", sheetName=sheetname)
file <- "mydata.xlsx"
# but we want to highlight cells if value greater than or equal to 5
wb <- loadWorkbook(file) # load workbook
fo <- Fill(foregroundColor="yellow") # create fill object
cs <- CellStyle(wb, fill=fo) # create cell style
sheets <- getSheets(wb) # get all sheets
sheet <- sheets[[sheetname]] # get specific sheet
rows <- getRows(sheet, rowIndex=2:(nrow(mydata)+1) # get rows
# 1st row is headers
cells <- getCells(rows, colIndex = 3:(cols+3)) # get cells
# in the wb I import with loadWorkbook, numeric data starts in column 3
# and the first two columns are row number and label number
values <- lapply(cells, getCellValue) # extract the values
# find cells meeting conditional criteria
highlight <- "test"
for (i in names(values)) {
x <- as.numeric(values[i])
if (x>=5 & !is.na(x)) {
highlight <- c(highlight, i)
}
}
highlight <- highlight[-1]
lapply(names(cells[highlight]),
function(ii)setCellStyle(cells[[ii]],cs))
saveWorkbook(wb, file)
Old question, but for people that still research this topic:
In the package openxlsx, there is a function that makes this much easier- conditionalFormatting()
Below is an example:
#Load needed package
if (!require("pacman")
) install.packages("pacman")
pacman::p_load(
#add list of libraries here
openxlsx
)
##Create workbook and write in sample data
wb <- createWorkbook()
addWorksheet(wb, "Moving Row")
writeData(wb, "Moving Row", -5:5)
writeData(wb, "Moving Row", LETTERS[1:11], startCol = 2)
##Define how you want the cells to be formatted
negStyle <- createStyle(fontColour = "#9C0006", bgFill = "#FFC7CE")
posStyle <- createStyle(fontColour = "#006100", bgFill = "#C6EFCE")
## highlight row dependent on first cell in row
conditionalFormatting(wb, "Moving Row",
cols = 1:2,
rows = 1:11, rule = "$A1<0", style = negStyle
)
conditionalFormatting(wb, "Moving Row",
cols = 1:2,
rows = 1:11, rule = "$A1>0", style = posStyle
)
##Save workbook in default location
saveWorkbook(wb, "conditionalFormattingExample.xlsx", TRUE)
you can read about it here and see many other types of conditional highlighting it can do:
https://www.rdocumentation.org/packages/openxlsx/versions/4.2.5/topics/conditionalFormatting
It has been a while since I used this feature. Yes it should be possible to save conditional formatting. My (old) code is given below. Hope it helps you.
file.name <- paste('loadings.',state$data,'.xls', sep = "")
wb <- loadWorkbook(file.name, create = TRUE)
createSheet(wb, name = 'loadings')
clearSheet(wb, sheet = 'loadings')
Variables <- rownames(df)
df.loadings <- cbind(Variables,df)
df.loadings[,'Communality'] <- NULL
writeWorksheet(wb,df.loadings[,-1], sheet = 'loadings', rownames = 'Variables', startRow = 1, startCol = 1)
max.loading <- createCellStyle(wb)
setFillPattern(max.loading, fill = XLC$"FILL.SOLID_FOREGROUND")
setFillForegroundColor(max.loading, color = XLC$"COLOR.SKY_BLUE")
maxVal <- apply(abs(df.loadings[,-1]),1,max)
maxValIndex <- which(abs(df.loadings[,-1]) == maxVal, arr.ind = TRUE)
setCellStyle(wb, sheet = "loadings", row = maxValIndex[,'row']+1, col = maxValIndex[,'col']+1, cellstyle = max.loading)
df.corr <- data.frame(cor(f.data))
df.corr <- cbind(Variables,df.corr)
createSheet(wb, name = 'correlations')
clearSheet(wb, sheet = 'correlations')
writeWorksheet(wb, df.corr, sheet = 'correlations', startRow = 1, startCol = 1)
corr <- createCellStyle(wb)
setFillPattern(corr, fill = XLC$"FILL.SOLID_FOREGROUND")
setFillForegroundColor(corr, color = XLC$"COLOR.SKY_BLUE")
corrIndex <- which(abs(df.corr[,-1]) > .3 & df.corr[,-1] != 1 , arr.ind = TRUE)
setCellStyle(wb, sheet = "correlations", row = corrIndex[,'row']+1, col = corrIndex[,'col']+1, cellstyle = corr)
saveWorkbook(wb)
if(.Platform$OS.type == "unix") {
execute(paste("browseURL(\"",getwd(),'/',file.name,"\", browser = '/usr/bin/open')",sep=''))
} else {
execute(paste("browseURL(\"",getwd(),'/',file.name,"\", browser = NULL)",sep=''))
}

Resources