How can I load an xlsx workbook into R?

I have an xlsx workbook on my desktop with existing sheets. I have a dataframe in R that I would like to add to this workbook as a new sheet.
Using library(openxlsx) I want to do:
wb <- loadWorkbook("workbook.xlsx", isUnzipped=TRUE)
addWorksheet(wb, "New_Sheet")
writeData(wb, "New_Sheet", df)
saveWorkbook(wb, "workbook.xlsx", overwrite=TRUE)
However, the program fails at the first line:
wb <- loadWorkbook("workbook.xlsx", isUnzipped=TRUE)
I get the error message:
Error in match(sheetrID, file_rIds): object sheetrId not found
I also created a dummy file (excel workbook with just two existing sheets and some dummy strings in each sheet) and I get the same error message.
I also tried uninstalling openxlsx and re-installing. No change.

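As @MrFlick pointed out in the comments (thank you!), Excel files are actually zipped archives of XML files. Setting isUnzipped=TRUE tells loadWorkbook that the path already points to an unzipped workbook, so for an ordinary .xlsx file either unzip it first or leave isUnzipped at its default of FALSE and let loadWorkbook unzip it.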
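A minimal sketch of the workflow from the question, assuming an ordinary (zipped) .xlsx file so that isUnzipped is left at its default. The temporary path, the starter sheet "Existing", and the sample df are placeholders invented for illustration, not part of the original question.

```r
library(openxlsx)

# Hypothetical setup: create a small workbook to stand in for the existing file.
path <- file.path(tempdir(), "workbook.xlsx")
wb0 <- createWorkbook()
addWorksheet(wb0, "Existing")
writeData(wb0, "Existing", data.frame(x = 1:3))
saveWorkbook(wb0, path, overwrite = TRUE)

# Hypothetical data frame to add as a new sheet.
df <- data.frame(id = 1:2, label = c("a", "b"))

# Load the zipped .xlsx directly; isUnzipped is not set, so it defaults to FALSE.
wb <- loadWorkbook(path)
addWorksheet(wb, "New_Sheet")
writeData(wb, "New_Sheet", df)
saveWorkbook(wb, path, overwrite = TRUE)

# Confirm that both the old and the new sheet are present.
sheets <- getSheetNames(path)
```

With a real file, only the loadWorkbook, addWorksheet, writeData, and saveWorkbook calls are needed.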

Related

Overwrite An Excel File through RDCOMClient Package in R

I am trying to manipulate an Excel file (.xls) in R with the RDCOMClient package.
I created an Excel object in R, opened a workbook saved in .xls format, and tried to save it as .xlsx while suppressing the pop-up dialog that appears when a file with the same name already exists. The code is below.
excel <- COMCreate("Excel.Application")
wb <- excel$Workbooks()$Open(Filename = "filepath.xls",Password = "xxxxx")
excel$DisplayAlerts(FALSE)
wb$SaveAs(Filename = "filepath.xlsx" ,FileFormat = 51,Password = "")
I got an error message when I executed the code:
excel$DisplayAlerts(FALSE)
<'checkErrorInfo'> 8002000E Error: invalid number of parameter.
DisplayAlerts is a property, not a method, so set it by assignment instead of calling it. Replace that line with the following:
excel[["DisplayAlerts"]]=FALSE

How do I modify an existing sheet in an Excel workbook using the openxlsx package in R?

I am using the "openxlsx" package to read and write Excel files. I have a fixed file with a sheet called "Data" that is used by formulas in other sheets. I want to update this Data sheet without touching the others.
I am trying the following code:
write.xlsx(x = Rev_4, file = "Revenue.xlsx", sheetName="Data")
But this erases the Excel file and creates a new one containing only the new data in the "Data" sheet; everything else is deleted. Any advice?
Try this:
wb <- loadWorkbook("Revenue.xlsx")
writeData(wb, sheet = "Data", Rev_4, colNames = FALSE)
saveWorkbook(wb, "Revenue.xlsx", overwrite = TRUE)
You need to load the complete workbook, then modify its data and then save it to disk. With writeData you can also specify the starting row and column. And you could also modify other sections before saving to disk.
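As a rough illustration of the starting row and column mentioned above, the sketch below appends rows beneath existing data in the "Data" sheet and leaves a second sheet alone. The temporary path, the "Summary" sheet, and the sample values are assumptions made up for the example.

```r
library(openxlsx)

# Hypothetical setup: a workbook with a "Data" sheet and one other sheet.
path <- file.path(tempdir(), "Revenue.xlsx")
wb0 <- createWorkbook()
addWorksheet(wb0, "Data")
addWorksheet(wb0, "Summary")
writeData(wb0, "Data", data.frame(month = c("Jan", "Feb"), revenue = c(10, 20)))
writeData(wb0, "Summary", data.frame(note = "untouched"))
saveWorkbook(wb0, path, overwrite = TRUE)

# Hypothetical new rows to place below the existing data.
Rev_4 <- data.frame(month = c("Mar", "Apr"), revenue = c(30, 40))

# Load the whole workbook, write into "Data" only, then save.
# Row 1 holds the header and rows 2-3 the old data, so the new rows start at row 4.
wb <- loadWorkbook(path)
writeData(wb, sheet = "Data", x = Rev_4,
          startRow = 4, startCol = 1, colNames = FALSE)
saveWorkbook(wb, path, overwrite = TRUE)

# Read both sheets back to check the result.
data_sheet <- read.xlsx(path, sheet = "Data")
summary_sheet <- read.xlsx(path, sheet = "Summary")
```

Because the workbook is loaded and saved as a whole, the "Summary" sheet is carried over unchanged.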
I found this package. It depends on openxlsx and helps with writing many sheets to an xlsx file, so it may make this easier:
Package documentation
library(xlsx2dfs)
# However, be careful, the function xlsx2dfs assumes
# that all sheets contain simple tables. If that is not the case,
# use the accepted answer!
dfs <- xlsx2dfs("Revenue.xlsx") # all sheets of file as list of dfs
dfs[["Data"]] <- Rev_4 # replace df of sheet "Data" by updated df Rev_4
dfs2xlsx(dfs, "Revenue.xlsx") # caution: this overwrites the existing file!

Can't open Excel File created in R language

I get a corruption error when I try to open an Excel workbook created in R.
I tried with both .xlsx and .xls extensions but neither worked!
The code that I used for doing all this is:
wb <- loadWorkbook("RCreated.xls", create = TRUE);
saveWorkbook(wb)
createSheet(wb, name = "First")
HELP!
Create the sheet BEFORE saving the workbook.
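A minimal sketch of the corrected order, assuming the question uses the XLConnect package (its loadWorkbook accepts create = TRUE). The temporary path is a placeholder chosen for illustration.

```r
library(XLConnect)

# Hypothetical output path; with create = TRUE a new workbook is started
# if the file does not exist yet.
path <- file.path(tempdir(), "RCreated.xlsx")
wb <- loadWorkbook(path, create = TRUE)

# Add at least one sheet first, then save.
createSheet(wb, name = "First")
saveWorkbook(wb)

# Reload the saved file and list its sheets.
sheet_names <- getSheets(loadWorkbook(path))
```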

List xlsx sheetnames with R

Is it possible to generate a list of sheetnames within an xlsx file? Or perhaps, can I check if a sheet name exists, and if not, proceed with some designated function?
With the xlsx package you can get the list of sheets in an existing workbook with getSheets():
wb <- loadWorkbook(your_xlsx_file)
sheets <- getSheets(wb)
Yes, I have done that with the xlsx package which (just like the XLConnect package) uses a Java backend with the Apache POI code -- so it is cross-platform.
You can also do this with the RODBC package:
h <- odbcConnectExcel2007("file.xlsx")
sqlTables(h)
A one-line solution using openxlsx would be
openxlsx::getSheetNames('your/file.xlsx')
Only this worked in my case:
library(openxlsx)
sheetNames <- getSheetNames("filename.xlsx")
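For the second part of the question (checking whether a sheet exists and acting only if it does not), one possible sketch with openxlsx follows. The temporary path and the sheet names "Sheet1" and "Report" are assumptions used purely for illustration.

```r
library(openxlsx)

# Hypothetical setup: a workbook that contains a single sheet.
path <- file.path(tempdir(), "filename.xlsx")
wb0 <- createWorkbook()
addWorksheet(wb0, "Sheet1")
saveWorkbook(wb0, path, overwrite = TRUE)

# List the sheet names without loading the whole workbook.
sheet_names <- getSheetNames(path)

# Add the target sheet only if it is missing.
target <- "Report"
if (!(target %in% sheet_names)) {
  wb <- loadWorkbook(path)
  addWorksheet(wb, target)
  saveWorkbook(wb, path, overwrite = TRUE)
}

final_names <- getSheetNames(path)
```

The body of the if block is where any other designated function would go.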
None of the above solutions worked for my big xlsx (>300 sheets):
Error in .jcall("RJavaTools", "Ljava/lang/Object;", "invokeMethod", cl, :
java.lang.OutOfMemoryError: Java heap space
To get the sheet names of an Excel workbook with the xlsx package, first load the workbook. In my case the file is named "input_4_r.xlsx":
> wb<-loadWorkbook("input_4_r.xlsx")
Then list the sheets. My example file has 2 sheets: I kept the default name for the first sheet and named the second one "name city", hence the output below.
> getSheets(wb)
$Sheet1
[1] "Java-Object{Name: /xl/worksheets/sheet1.xml - Content Type: application/vnd.openxmlformats-officedocument.spreadsheetml.worksheet+xml}"
$`name city`
[1] "Java-Object{Name: /xl/worksheets/sheet2.xml - Content Type: application/vnd.openxmlformats-officedocument.spreadsheetml.worksheet+xml}"
You can get just the sheet names as follows.
> names(getSheets(wb))
[1] "Sheet1" "name city"
To get the name of a sheet at a specific index, subset the result, e.g. [2] for the 2nd sheet in my case.
> names(getSheets(wb))[2]
[1] "name city"
The above assumes that the xlsx package is installed and loaded in R.

using xlsx package in R to adjust page setting in xlsx file

I am trying to change the page setup of an xlsx file so that it prints in landscape rather than portrait. I tried the following:
library(xlsx)
wb <- createWorkbook()
sheet <- createSheet(wb, "Sheet1")
ps <- printSetup(sheet, landscape=TRUE, copies=3)
This works if I create a new Excel workbook, but not when I use the loadWorkbook function to load an existing xls file. I wonder why.
Update: I am working with an xls file, not an xlsx file, and the answer below does not solve my problem. Any further suggestions? Thanks.
I can get it to work by using getSheets to select the sheet to set the print area for:
wb <- loadWorkbook("test.xlsx")
printSetup(getSheets(wb)[[1]], landscape=TRUE, copies=3)
saveWorkbook(wb,"test.xlsx")
