openxlsx not able to read from .xlsx file in R

I am trying to read a value from an .xlsx file using the openxlsx package in R. In short, I need to write a row of data, which then populates an output cell that has to be read back into R. An example explains the problem better.
Initial state of the .xlsx file:
I'm now trying to write new values to cells A2:A3 = c("c", 5), so I expect A6 = 15.
Below is the code used:
library(openxlsx)
path <- "C:/path_to_file/for_SO1.xlsx"
input_row <- c("c", 5)
# Load the existing workbook
wb <- loadWorkbook(path)
# createSheet(wb, name = "1")
# The header goes to A1, the two values to A2:A3
writeData(wb,
          sheet = "Sheet1",
          x = data.frame(input_row),
          startCol = 1,
          startRow = 1
)
# Read back rows 5 and 6 of column A
data_IM <- read.xlsx(wb,
                     sheet = "Sheet1",
                     rows = c(5, 6),
                     cols = c(1))
# Save workbook
saveWorkbook(wb, file = path, overwrite = TRUE)
#> data_IM
#  output_row
#1          3
But I get the initial value (3). However, if I open the .xlsx file, I can see the 15 sitting there:
What could be the reason this cell cannot be read? I tried saving the workbook after writing to it and then reading it again, but that failed too. openxlsx is the only option I have because of Java errors from XLConnect and similar packages.

?read.xlsx
Formulae written using writeFormula to a Workbook object will not get
picked up by read.xlsx(). This is because only the formula is written
and left to be evaluated when the file is opened in Excel. Opening,
saving and closing the file with Excel will resolve this.
So the file needs to be opened in Excel and then saved. I can verify that this works, although it may not be suitable for you.
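The behaviour described in the help page can be reproduced without Excel. The sketch below is only an illustration under stated assumptions: it uses a throwaway workbook in a temporary file, and the sheet name, the values and the SUM formula are made up for the demonstration rather than taken from the question's file.

```r
library(openxlsx)

# Hypothetical scratch file; nothing here touches the question's workbook.
tmp <- tempfile(fileext = ".xlsx")

wb <- createWorkbook()
addWorksheet(wb, "Sheet1")

# Header in A1, values in A2:A3.
writeData(wb, "Sheet1", x = data.frame(value = c(2, 3)))

# Only the formula text is stored in A4; openxlsx does not evaluate it.
writeFormula(wb, "Sheet1", x = "SUM(A2:A3)", startCol = 1, startRow = 4)
saveWorkbook(wb, tmp, overwrite = TRUE)

# The computed sum is not available until Excel opens and saves the file.
res <- read.xlsx(tmp, sheet = "Sheet1")
print(res)
```

The same reasoning applies when an input cell of an existing formula is overwritten: the formula cell keeps whatever value Excel last stored for it until the file is recalculated.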
XLConnect seems to have the desired functionality:
# rJava can run out of memory sometimes; this can help.
options(java.parameters = "-Xmx1G")
library(XLConnect)
file_path <- "test.xlsx"
input_row <- c("c", 5)
wb <- loadWorkbook(file_path, create = FALSE)
writeWorksheet(wb, data = data.frame(input_row), sheet = 1, startRow = 1, startCol = 1)
# Ask Excel to recalculate the formulas on sheet 1 when the file is opened
setForceFormulaRecalculation(wb, 1, TRUE)
saveWorkbook(wb)
# checking
wb <- loadWorkbook(file_path, create = FALSE)
readWorksheet(wb, 1)

The openxlsx manual (https://cran.r-project.org/web/packages/openxlsx/openxlsx.pdf) says:
Formulae written using writeFormula to a Workbook object will not get picked up by read.xlsx().
This is because only the formula is written and left to be evaluated when the file is opened in Excel.
Opening, saving and closing the file with Excel will resolve this.
So if you are using Windows, save the following VBScript to a file, for example opensaveexcel.vbs:
Set objExcel = CreateObject("Excel.Application")
Set objWorkbook = objExcel.Workbooks.Open("D:\Book2.xlsx")
objWorkbook.Save
objWorkbook.Close
objExcel.Quit
Set objExcel = Nothing
Set objWorkbook = Nothing
Then you can write the R code. Here cell A4 in Book1.xlsx holds the formula =A3*5:
mywritexlsx(fname = "d:/Book1.xlsx", data = 20, startCol = 1, startRow = 3)
# Copy Book1 to Book2; the script then opens and saves Book2 in Excel
file.copy("d:/Book1.xlsx", "d:/Book2.xlsx", overwrite = TRUE)
system("cscript //nologo d:\\opensaveexcel.vbs")
tdt1 <- read.xlsx(xlsxFile = "d:/Book1.xlsx", sheet = "Sheet1", colNames = FALSE)
tdt2 <- read.xlsx(xlsxFile = "d:/Book2.xlsx", sheet = "Sheet1", colNames = FALSE)
This works for me. By the way, mywritexlsx is defined as:
mywritexlsx <- function(fname = "temp.xlsx", sheetname = "Sheet1", data,
                        startCol = 1, startRow = 1, colNames = TRUE, rowNames = FALSE)
{
  if (!file.exists(fname))
  {
    # New file: start from an empty workbook
    wb <- openxlsx::createWorkbook()
    sheet <- openxlsx::addWorksheet(wb, sheetname)
  }
  else
  {
    # Existing file: add the sheet only if it is missing
    wb <- openxlsx::loadWorkbook(file = fname)
    if (!(sheetname %in% openxlsx::getSheetNames(fname)))
      sheet <- openxlsx::addWorksheet(wb, sheetname)
    else
      sheet <- sheetname
  }
  openxlsx::writeData(wb, sheet, data, startCol = startCol, startRow = startRow,
                      colNames = colNames, rowNames = rowNames)
  openxlsx::saveWorkbook(wb, fname, overwrite = TRUE)
}

Related

Why is read_xlsx reading in stale values from a different excel spreadsheet?

Given the nature of this question, I can't really give a fully reproducible example. But here is the nature of the issue:
1. I have spreadsheets in a folder.
2. I used the openxlsx package and wrote a script to write a new formula in a different part of the spreadsheet, and to write "YES" in cell B1. When this happens, the Revenue line updates in the spreadsheet (using an Excel IF formula). A version 2 of the file is then saved with the changes.
So far so good.
3. I then used readxl to read the financials of both spreadsheets in the directory.
The problem comes in when I look at the results. The financials in company_a2 (the updated file after running the script, i.e. the result of step 2) are being read in as the same as those of company_a, even though the numbers are different.
I will put the code in case that reveals something obvious I am missing, as well as a couple of screen shots of the output
The code:
# ===========================================================================================
# Load packages
library(tidyverse)
library(openxlsx)
library(readxl)
# Set working directory to excel folder location
setwd("C:/Users/davem/Sync/R files/excel_file_test")
# Create function to write formulas and data in excel file
edit_xl_fx <- function(input_wb, input_sheet, input_formula, input_data,
                       start_row_form, start_col_form, start_row_data, start_col_data) {
  # Load
  wb <- openxlsx::loadWorkbook(input_wb)
  # Write
  formula_cols <- LETTERS[4:8]
  lapply(seq_along(formula_cols), FUN = function(x) {
    writeFormula(wb, input_sheet, x = input_formula[x],
                 startCol = formula_cols[x], startRow = start_row_form)
  })
  writeData(wb, sheet = input_sheet, x = input_data, startCol = start_col_data, startRow = start_row_data)
  # Save
  orig_xl_name <- str_remove(input_wb, pattern = "\\.xlsx$")
  new_xl_name <- paste0(orig_xl_name, 2, ".xlsx")
  saveWorkbook(wb, new_xl_name, overwrite = TRUE)
}
# Get list of files
xl_files <- as.list(list.files(pattern = "\\.xlsx$"))
# Run function
lapply(
  xl_files,
  edit_xl_fx,
  input_sheet = "Sheet1",
  input_formula = c(
    "D21/(D20-D16)",
    "E21/(E20-E16)",
    "F21/(F20-F16)",
    "G21/(G20-G16)",
    "H21/(H20-H16)"
  ),
  input_data = "YES",
  start_row_form = 26,
  start_col_form = 4:8,
  start_row_data = 1,
  start_col_data = 2
)
This all works. But when I go to extract the financials, I get the same results even though the values in the files are different.
Values in the excel files:
Code to read the files:
# Get updated list of files
xl_files_updated <- as.list(list.files(pattern = "\\.xlsx$"))
# Read the same range from every file and combine the results
financials <- lapply(xl_files_updated, function(x) read_excel(path = x, sheet = "Sheet1", range = "B10:H21")) %>%
  bind_rows() %>%
  mutate(across(
    .cols = where(is.numeric),
    .fns = ~ round(.x, 1)
  ))
Data that is read into R:
Even if I just run readxl::read_xlsx(path = "company_a2.xlsx", sheet = "Sheet1", range = "B10:H21") I still get the numbers in company_a.xlsx as opposed to the updated numbers.
Thanks for helping clear up what I am missing here!

Change cell value in openxlsx workbook

I want to use openxlsx to change an individual cell in a workbook sheet and write it back out as the same .xlsx without losing the styling, validation, etc of the original .xlsx file. I stipulate openxlsx because it doesn't have a rJava dependency.
Here's a dummy workbook:
library(openxlsx)
## Make a dummy workbook to read in
write.xlsx(list(iris = iris, mtcars = mtcars), file = 'test.xlsx')
wb <- loadWorkbook('test.xlsx')
isS4(wb)
How can I change the value of cell [2,1] so that the result is essentially identical to the original .xlsx file but with one cell altered?
I can of course read in the workbook but I don't know what good that does me.
m <- readWorkbook(wb)
m[2, 1] <- 20
m[1:5,]
writeData can help you with this.
test.fpath <- 'test.xlsx'
openxlsx::write.xlsx(list(iris = iris, mtcars = mtcars), file = test.fpath)
.wb <- openxlsx::loadWorkbook(test.fpath)
openxlsx::writeData(
  wb = .wb,
  sheet = 1,
  x = 20,
  xy = c(2, 1)
)
openxlsx::saveWorkbook(
  .wb,
  test.fpath,
  overwrite = TRUE
)
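Note that writeData takes xy as c(startCol, startRow), so xy = c(2, 1) addresses column 2 of spreadsheet row 1, which is a header cell here. The sketch below is one way to target row 2, column 1 of the data instead. It assumes that this is the cell the question means, and it writes to a temporary file (an assumption made so that nothing existing is overwritten). Because row 1 holds the header, data row 2 lives in spreadsheet row 3.

```r
library(openxlsx)

# Hypothetical scratch copy of the dummy workbook.
path <- tempfile(fileext = ".xlsx")
write.xlsx(list(iris = iris, mtcars = mtcars), file = path)

wb <- loadWorkbook(path)

# Data row 2, column 1 of iris is spreadsheet cell A3 (row 1 is the header).
writeData(wb, sheet = "iris", x = 20, startCol = 1, startRow = 3, colNames = FALSE)
saveWorkbook(wb, path, overwrite = TRUE)

# Read the sheet back and look at the altered cell and its neighbour.
check <- read.xlsx(path, sheet = "iris")
check[1:3, 1:2]
```

Only the one cell is rewritten; the other cells and sheets of the loaded workbook are saved back as they were.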

How to use write.xlsx in R when the excel file exists or the sheet exists

I am trying to use openxlsx::write.xlsx to write results into Excel spreadsheet in R.
1. If the file exists and a new sheet is to be added, I can use append=T. Instead of using if to check for the file, is there a way to check automatically?
2. If the file and the sheet both exist and the sheet is to be updated, how do I overwrite the results? Thanks.
Here is an openxlsx answer. In order to demonstrate, we need some data.
## Create a simple test file
library(openxlsx)
hs <- createStyle(textDecoration = "Bold")
l <- list("IRIS" = iris, "MTCARS" = mtcars)
write.xlsx(l, file = "TestFile.xlsx", borders = "columns", headerStyle = hs)
Question 1
You can check whether or not the file exists with
## Check existence of file
file.exists("TestFile.xlsx")
You can check if the tab (sheet) exists within the workbook
## Check available sheets
getSheetNames("TestFile.xlsx")
Steps for question 2:
1. Read the file into a Workbook object.
2. Pull the data from the sheet you want to modify into a data.frame.
3. Modify the data.frame to taste
4. Save the data back into the Workbook
5. Save the Workbook out to disk
The example below works on the test file created above.
## Load existing file
wb = loadWorkbook("TestFile.xlsx")
## Pull all data from sheet 1
Data = read.xlsx(wb, sheet=1)
## Change a single element for demonstration
## ** Beware!! ** Because of the header,
## the 2,2 position in the data
## is row 3 column 2 in the spreadsheet
Data[2,2] = 1492
## Put the data back into the workbook
writeData(wb, sheet=1, Data)
## Save to disk
saveWorkbook(wb, "TestFile.xlsx", overwrite = TRUE)
You can open up the spreadsheet and check that the change has been made.
If you want to completely change the sheet (as in your comment),
you can just delete the old sheet and replace it with a new one
using the same name.
removeWorksheet(wb, "IRIS")
addWorksheet(wb, "IRIS")
NewData = data.frame(X1=1:4, X2= LETTERS[1:4], X3=9:6)
writeData(wb, "IRIS", NewData)
saveWorkbook(wb, "TestFile.xlsx", overwrite = TRUE)
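The existence checks and the delete-and-replace step above can be folded into one function. This is a minimal sketch, not part of openxlsx: the helper name replace_sheet and the temporary demo file are hypothetical, and it assumes that losing the old sheet's formatting is acceptable.

```r
library(openxlsx)

# Hypothetical helper: write `data` to `sheet` in `path`, creating the file
# if it is missing and replacing the sheet if it already exists.
replace_sheet <- function(path, sheet, data) {
  wb <- if (file.exists(path)) loadWorkbook(path) else createWorkbook()
  if (sheet %in% names(wb)) {
    removeWorksheet(wb, sheet)
  }
  addWorksheet(wb, sheet)
  writeData(wb, sheet, data)
  saveWorkbook(wb, path, overwrite = TRUE)
  invisible(path)
}

# Usage on a temporary file
demo_path <- tempfile(fileext = ".xlsx")
replace_sheet(demo_path, "IRIS", iris)        # file is created
replace_sheet(demo_path, "MTCARS", mtcars)    # sheet is added
replace_sheet(demo_path, "IRIS", head(iris))  # sheet is replaced
getSheetNames(demo_path)
```

A replaced sheet is added back at the end of the workbook, so the sheet order can change.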
With the xlsx package, you can check whether the sheet exists and remove it if so; otherwise the sheet is simply appended to the existing file. write.xlsx will also create the file if it does not exist.
library(xlsx)
path <- "testing.xlsx"
sheet_name <- "new_sheet"
data <-
  data.frame(
    B = c(1, 2, 3, 4)
  )
# Only inspect the workbook if the file is already there
if (file.exists(path)) {
  wb <- loadWorkbook(path)
  if (sheet_name %in% names(getSheets(wb))) {
    removeSheet(wb, sheetName = sheet_name)
    saveWorkbook(wb, path)
  }
}
write.xlsx(data, path, sheetName = sheet_name, append = TRUE)

Exporting the differences in xlsx sheet in desired cells using R

I have to compare multiple files (Prod1, Beta1, Prod2, Beta2, etc.) and export the differences, if any, to an Excel sheet. Each difference should go in a separate cell (column C). I'm trying the code below with library(xlsx), but I can store the data only in the first cell.
library(xlsx)
for (i in 1:No_of_files) {
  prod_file_res_name <- sprintf("R/Results/F_Query_Prod_%s.txt", i)
  beta_file_res_name <- sprintf("R/Results/F_Query_Beta_%s.txt", i)
  if (file.exists(prod_file_res_name) && file.exists(beta_file_res_name)) {
    res <- tools::Rdiff(prod_file_res_name, beta_file_res_name, Log = TRUE)
    if (res[2] != "character(0)") {
      write.xlsx(toString(res[2]), file = "C:/R/diff.xlsx", sheetName = "Sheet1", col.names = FALSE, row.names = FALSE, append = TRUE)
    } else {
      com <- "No Difference found"
      write.xlsx(com, file = "C:/R/diff.xlsx", sheetName = "ExtractFormulaHistory", col.names = FALSE, row.names = FALSE, append = TRUE)
    }
  } else {
    print("File doesnt exist")
  }
}
Can anyone help me save the differences in column 5 but in different rows (for example, rows 1 to X for X files)? Thanks in advance.
The easiest way would be to create a tibble or list to hold your output and then, in the last step of your code, write it to Excel using the xlsx library.
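A minimal sketch of that collect-then-write approach follows. Several things in it are assumptions rather than part of the question: the helper name collect_diffs, the temporary demo files, the choice of column C as the target, and the use of openxlsx for the final write (picked here only because it needs no Java; the xlsx package could be used for that step instead).

```r
library(openxlsx)

# Hypothetical helper: compare file pairs and return one row per pair.
collect_diffs <- function(prod_files, beta_files) {
  diffs <- vapply(seq_along(prod_files), function(i) {
    if (!file.exists(prod_files[i]) || !file.exists(beta_files[i])) {
      return("File does not exist")
    }
    # With Log = TRUE, Rdiff returns a list whose `out` element holds the differences.
    res <- tools::Rdiff(prod_files[i], beta_files[i], Log = TRUE)
    if (length(res$out) == 0) "No difference found" else paste(res$out, collapse = "\n")
  }, character(1))
  data.frame(difference = diffs, stringsAsFactors = FALSE)
}

# Demo on two made-up pairs of temporary text files.
demo_dir <- tempfile("diff_demo")
dir.create(demo_dir)
prod <- file.path(demo_dir, c("prod_1.txt", "prod_2.txt"))
beta <- file.path(demo_dir, c("beta_1.txt", "beta_2.txt"))
writeLines(c("a", "b"), prod[1]); writeLines(c("a", "b"), beta[1])  # identical pair
writeLines(c("a", "b"), prod[2]); writeLines(c("a", "c"), beta[2])  # differing pair

diffs <- collect_diffs(prod, beta)

# Write all results in one go: one row per file pair, starting at C1.
wb <- createWorkbook()
addWorksheet(wb, "Sheet1")
writeData(wb, "Sheet1", diffs, startCol = 3, startRow = 1, colNames = FALSE)
out_path <- file.path(demo_dir, "diff.xlsx")
saveWorkbook(wb, out_path, overwrite = TRUE)
```

Writing once at the end avoids reopening the workbook on every loop iteration, and the row position of each result follows directly from its position in the data frame.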
Alternatively, you could use writeWorksheet from the XLConnect package to write to an Excel file as you work out the differences.
From documentation of the XLConnect package:
# Load workbook (create if not existing)
wb <- loadWorkbook("writeWorksheet.xlsx", create = TRUE)
# Create a worksheet called 'CO2'
createSheet(wb, name = "CO2")
# Write built-in data set 'CO2' to the worksheet created above;
# offset from the top left corner and with default header = TRUE
writeWorksheet(wb, CO2, sheet = "CO2", startRow = 4, startCol = 2)
# Save workbook (this actually writes the file to disk)
saveWorkbook(wb)

Importing Excel xlsx data using XLConnect and readWorksheet causes incorrect format

I have an Excel file with the extension xlsx where Sheet1 and Sheet contain the following: 18:20, 10:10 (column A, rows 1:2). When I try to import them into R, I do not get the expected results.
library(XLConnect)
setwd("...")
my_book <- loadWorkbook("test.xlsx")
xlsx_import <- lapply(getSheets(my_book), readWorksheet, object = my_book)
xlsx_import
# Returns some kind of date format
xlsx_import <- lapply(getSheets(my_book), readWorksheet, object = my_book, colTypes = "character")
xlsx_import
# Same as above
Is it possible to fix this in R somehow? As I have quite a lot of sheets to go through.
Try this:
library(XLConnect)
wb <- loadWorkbook("Test.xlsx", create = TRUE)
setStyleAction(wb, XLC$"STYLE_ACTION.DATATYPE")
cs <- createCellStyle(wb, name = "myDateStyle")
setDataFormat(cs, format = "dd-mm-yyyy")
setCellStyleForType(wb, style = cs, type = XLC$"DATA_TYPE.DATETIME")
s <- readWorksheet(wb, sheet = "Sheet1")
# ... the operations you want to perform go here; wq is assumed to hold the resulting data frame ...
writeWorksheet(wb, wq, sheet = "Sheet1")
setForceFormulaRecalculation(wb, "*", TRUE)
saveWorkbook(wb)
