Exporting .XLS file from openxlsx - r

I have a function that receives a data frame, does a bunch of transformations with openxlsx, and exports the data from R to .xlsx:
export_workbook_from_df <- function(data, path) {
  wb <- openxlsx::createWorkbook()
  openxlsx::addWorksheet(wb, sheetName = "Sheet1")
  openxlsx::openxlsx_setOp("numFmt", "0,00")
  number_format <- openxlsx::createStyle(numFmt = "Number") # create thousands format
  wb |>
    openxlsx::addStyle(
      sheet = 1,
      number_format,
      rows = seq_len(nrow(data)) + 1, cols = 6, # data rows, skipping the header row
      gridExpand = TRUE
    )
  openxlsx::writeData(wb, sheet = 1, data)
  openxlsx::saveWorkbook(wb, paste0(path, ".xlsx"))
}
If I try to save as .xls using openxlsx::saveWorkbook(wb, paste0(path, ".xls")), I get the following error, which roughly translates to:
The format of the file and the extension don't correspond. The file may be corrupted or not be safe. Don't open it, unless you trust the source. Do you want to open anyway?
The file works fine if I save it as .xlsx and manually save it as .xls within Excel.
I also tried using XLConnect to load the file after it is saved and export it in a different format, like:
openxlsx::saveWorkbook(wb, paste0(path, ".xlsx"))
XLConnect::loadWorkbook(paste0(path, ".xlsx")) %>%
XLConnect::saveWorkbook(paste0(path, ".xls"))
While it does export the file as .xls, I get the same error.
It may be worth mentioning that when I open the file, I get exactly the same data as in the .xlsx file with either method (openxlsx or XLConnect).

The xls and xlsx file formats are not the same. XLSX is a zipped, XML-based file format, and Microsoft Excel 2007 and later uses it as the default when creating a new spreadsheet (support for loading and saving legacy XLS files is also included). XLS is the default binary format used with Office 97-2003. Changing the extension passed to saveWorkbook() does not change the format that gets written, so when you open the XLSX file that you saved with an .xls extension, Excel barfs as above because it is expecting the old binary format but encounters a zipped XML-based one instead.
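To see the mismatch for yourself, you can look at the first few bytes of the file instead of its extension. This is only a rough sketch in base R: `sniff_excel_format` is a made-up helper name, and the check is a heuristic (any ZIP archive starts with the same bytes as an .xlsx, and other legacy Office files share the OLE2 signature with .xls).

```r
# Guess the real container format from the leading bytes of a file,
# ignoring whatever extension the file happens to have.
# NOTE: sniff_excel_format() is a hypothetical helper, not part of any package.
sniff_excel_format <- function(path) {
  magic <- readBin(path, what = "raw", n = 8)
  zip_sig <- as.raw(c(0x50, 0x4b, 0x03, 0x04)) # "PK\003\004": ZIP, used by .xlsx
  ole2_sig <- as.raw(c(0xd0, 0xcf, 0x11, 0xe0, 0xa1, 0xb1, 0x1a, 0xe1)) # OLE2, used by .xls
  if (length(magic) >= 4 && identical(magic[1:4], zip_sig)) {
    "zip container (xlsx-style)"
  } else if (length(magic) >= 8 && identical(magic, ole2_sig)) {
    "OLE2 compound file (xls-style)"
  } else {
    "unknown"
  }
}
```

Calling it on a file written by openxlsx but named with an .xls extension should report the zip container, which is the mismatch Excel is warning about.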

Related

How to convert xlsx files to csv files in RStudio? Need to convert multiple workbooks all with multiple spreadsheets

I'm trying to write an R script that converts multiple xlsx workbook files within a folder, saving each sheet within each workbook as a separate csv file.
Looking for a single script to automatically apply code to all workbooks and their spreadsheets.
For reading Excel files, there are several packages.
I personally am happy with the xlsx package, which you can use to read Excel files, as well as their individual sheets. This article looks like it will give you the gist of it.
You should then be able to export each worksheet you read to a CSV file using R's built-in write.csv (or write.csv2) function.
Below is an example to convert a single xlsx workbook to multiple csv files.
Note that type conversions are not guaranteed to be correct.
xlsx_path <- "path_to_xlsx.xlsx"
sheet_names <- readxl::excel_sheets(xlsx_path)
# read all sheets into a list of data frames
xlsx_data <- purrr::map(
  sheet_names,
  ~ readxl::read_excel(xlsx_path, .x, col_types = "text", col_names = FALSE)
)
# write the list of data frames to csv files, dropping ".xlsx" from the file name prefix
out_prefix <- tools::file_path_sans_ext(xlsx_path)
purrr::walk2(
  xlsx_data, sheet_names,
  ~ readr::write_csv(.x, paste0(out_prefix, "-", .y, ".csv"), col_names = FALSE)
)
# csv files will be saved as:
# path_to_xlsx-<first sheet name>.csv, path_to_xlsx-<second sheet name>.csv, ...
If you need to apply this to many xlsx files, use list.files() to get the paths to all of them, then write a for loop or use another map function to iterate over them.
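One way the folder-level loop might look is sketched below. It assumes the readxl package is installed; `convert_workbook`, `convert_folder`, `xlsx_dir` and `out_dir` are names invented for this illustration, and sheet names are assumed to be usable in file names.

```r
# Sketch: write every sheet of one workbook to "<workbook>-<sheet>.csv".
# convert_workbook() and convert_folder() are illustrative names only.
convert_workbook <- function(xlsx_path, out_dir = dirname(xlsx_path)) {
  base <- tools::file_path_sans_ext(basename(xlsx_path))
  sheets <- readxl::excel_sheets(xlsx_path)
  vapply(sheets, function(sheet) {
    df <- readxl::read_excel(xlsx_path, sheet = sheet)
    out <- file.path(out_dir, paste0(base, "-", sheet, ".csv"))
    utils::write.csv(df, out, row.names = FALSE)
    out
  }, character(1))
}

# Apply the per-workbook conversion to every .xlsx file in a folder
# and return the paths of the csv files that were written.
convert_folder <- function(xlsx_dir, out_dir = xlsx_dir) {
  files <- list.files(xlsx_dir, pattern = "\\.xlsx$", full.names = TRUE)
  unlist(lapply(files, convert_workbook, out_dir = out_dir), use.names = FALSE)
}
```

Something like convert_folder("path/to/folder") would then process every workbook in that folder; as noted above, the guessed column types are not guaranteed to be correct.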
If you are using RStudio, it is possible that you already have the readxl package installed. Its documentation explains many workflows for common use cases here: https://readxl.tidyverse.org/articles/articles/readxl-workflows.html
They also provide this nice code snippet to do what you are asking for:
library(readxl)
read_then_csv <- function(sheet, path) {
  pathbase <- tools::file_path_sans_ext(basename(path))
  df <- read_excel(path = path, sheet = sheet)
  write.csv(df, paste0(pathbase, "-", sheet, ".csv"),
            quote = FALSE, row.names = FALSE)
  df
}
path <- readxl_example("datasets.xlsx")
sheets <- excel_sheets(path)
xl_list <- lapply(sheets, read_then_csv, path = path)
names(xl_list) <- sheets
If you go here and put "excel" and "xls" in the search bar, you'll get a list of packages and functions which might help.

R readxl::read_excel failed to open xls file

readxl_1.1.0
I'm trying to read the file from this link (US gov website)
https://www.cftc.gov/files/dea/history/dea_com_xls_2018.zip
When I unzip the xls file inside and read it with readxl::read_excel, it fails with the error message failed to open C:\path to file
I can open the file in Excel, save it as csv, and read it into R with fread, but there are a lot of those files, so that's tedious. By the way, some other xls files downloaded from the same webpage can be read by read_excel.
There's something odd about the xls file. I think it's because it contains some VBA code.
If you are happy to use XLConnect, here is an alternative that reads the file.
library(XLConnect)
extdir = tempdir()
unzip("dea_com_xls_2018.zip", exdir = extdir)
# match only files whose name ends in .xls
file = list.files(extdir, pattern = "\\.xls$", full.names = TRUE)
wb = loadWorkbook(file)
ws = readWorksheet(wb, sheet = 1)
dim(ws)
#[1] 11131 126

Dynamically converting a list of Excel files to csv files in R

I currently have a folder containing all Excel (.xlsx) files, and using R I would like to automatically convert all of these files to CSV files using the "openxlsx" package (or some variation). I currently have the following code to convert one of the files and place it in the same folder:
convert("team_order\\team_1.xlsx", "team_order\\team_1.csv")
I would like to automate the process so it does it to all the files in the folder, and also removes the current xlsx files, so only the csv files remain. Thanks!
You can try this using rio, since it seems like that's what you're already using:
library("rio")
xls <- dir(pattern = "\\.xlsx$") # only files whose name ends in .xlsx
created <- mapply(convert, xls, sub("\\.xlsx$", ".csv", xls))
unlink(xls) # delete xlsx files
library(readxl)
# Create a vector of Excel files to read
files.to.read = list.files(pattern = "\\.xlsx$")
# Read each file and write it to csv
lapply(files.to.read, function(f) {
  df = read_excel(f, sheet = 1)
  write.csv(df, sub("\\.xlsx$", ".csv", f), row.names = FALSE)
})
You can remove the files with the command below. However, this is dangerous to run automatically right after the previous code. If the previous code fails for some reason, the code below will still delete your Excel files.
lapply(files.to.read, file.remove)
You could wrap it in a tryCatch() block to be safe.
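A rough sketch of that idea follows: each file is converted inside tryCatch(), and the original is removed only when its own conversion succeeded. The name `convert_then_remove` and the `convert_one` argument are placeholders invented here; `convert_one` stands for whatever function does the real work (for example a small wrapper around read_excel and write.csv).

```r
# Convert each file and delete the original only if that conversion succeeded.
# convert_then_remove() and convert_one are hypothetical names for this sketch;
# convert_one(f, out) should read the xlsx file f and write the csv file out.
convert_then_remove <- function(files, convert_one) {
  vapply(files, function(f) {
    out <- sub("\\.xlsx$", ".csv", f)
    ok <- tryCatch({
      convert_one(f, out)
      file.exists(out) # treat a missing output file as a failure
    }, error = function(e) {
      message("Keeping ", f, ": ", conditionMessage(e))
      FALSE
    })
    if (ok) file.remove(f)
    ok
  }, logical(1))
}

# Possible usage, assuming readxl is installed:
# convert_then_remove(
#   list.files(pattern = "\\.xlsx$"),
#   function(f, out) write.csv(readxl::read_excel(f, sheet = 1), out, row.names = FALSE)
# )
```

The return value is a named logical vector, so you can see afterwards which files were converted and which were left in place.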

Can't open Excel File created in R language

I get the corruption error when I try to open the Excel workbook created in R.
I tried with both .xlsx and .xls extensions but neither worked!
The code that I used for doing all this is:
wb <- loadWorkbook("RCreated.xls", create = TRUE);
saveWorkbook(wb)
createSheet(wb, name = "First")
HELP!
Create the sheet BEFORE saving the workbook.
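In other words, the same three XLConnect calls from the question, just reordered. This is a minimal sketch that assumes XLConnect (and the Java runtime it depends on) is installed; the file name is the one from the question.

```r
library(XLConnect)

# create = TRUE makes a new workbook if the file does not exist yet
wb <- loadWorkbook("RCreated.xls", create = TRUE)
createSheet(wb, name = "First") # add the sheet first ...
saveWorkbook(wb)                # ... then write the workbook to disk
```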

Downloading Excel File from XLConnect with R Shiny

Has anyone tried using the download handler in R Shiny to download a freshly created Excel file with XLConnect?
In the ui.R there is the unremarkable line:
downloadButton('downloadData', 'Download')
In the server.R there is the handler:
output$downloadData <- downloadHandler(
  filename = function() { "output.xlsx" },
  content = function(file){
    wb <- loadWorkbook(file, create = TRUE)
    createSheet(wb, name = "Sheet1")
    writeWorksheet(wb, c(1:3), sheet = "Sheet1") # writes numbers 1:3 in file
    saveWorkbook(wb)
  }
)
I have no problem downloading a .csv and no problem creating the excel file with XLConnect. But when I run the code as above I get the following error in my Chrome browser:
IllegalArgumentException (Java): File extension "file1b683b9323bc" not
supported! Only *.xls and *.xlsx are allowed!
As far as I can see, XLConnect cannot write to a temporary file.
Has anyone got a solution or workaround?
One option would be to save the file in a specific location and then create a download link pointing to it. However, this is not very Shiny-esque, as multiple users would cause havoc.
Many Thanks
Marcus
Try using this for the content(...) function; it works for me...
content = function(file){
  fname <- paste(file,"xlsx",sep=".")
  wb <- loadWorkbook(fname, create = TRUE)
  createSheet(wb, name = "Sheet1")
  writeWorksheet(wb, c(1:3), sheet = "Sheet1") # writes numbers 1:3 in file
  saveWorkbook(wb)
  file.rename(fname,file)
}
The problem is that file is a randomly generated temp file without an extension, whereas XLConnect only accepts paths ending in .xls or .xlsx. So this just appends .xlsx to file, uses that name for all the XLConnect manipulations, and then renames the final file to the original name (i.e., strips off the extension).
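The same trick can be pulled out into a small reusable helper, sketched here in base R. `write_with_extension` and its `writer` argument are names made up for this illustration: `writer` stands in for the XLConnect calls above and is simply any function that writes to the path it is given. It copies instead of renaming so that an existing target is overwritten.

```r
# Write to "<file><ext>" and then move the result to the extension-less `file`.
# write_with_extension() is a hypothetical helper; `writer` is any function
# taking a single path argument and writing a file there.
write_with_extension <- function(file, writer, ext = ".xlsx") {
  tmp <- paste0(file, ext)
  on.exit(unlink(tmp), add = TRUE) # always clean up the intermediate file
  writer(tmp)
  if (!file.copy(tmp, file, overwrite = TRUE)) {
    stop("could not copy ", tmp, " to ", file)
  }
  invisible(file)
}

# Possible use inside downloadHandler(), assuming XLConnect is loaded:
# content = function(file) {
#   write_with_extension(file, function(path) {
#     wb <- loadWorkbook(path, create = TRUE)
#     createSheet(wb, name = "Sheet1")
#     writeWorksheet(wb, c(1:3), sheet = "Sheet1")
#     saveWorkbook(wb)
#   })
# }
```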
