List xlsx sheetnames with R - r

Is it possible to generate a list of sheetnames within an xlsx file? Or perhaps, can I check if a sheet name exists, and if not, proceed with some designated function?

With the xlsx package, you can get the sheets of an existing workbook with getSheets(); the names of the returned list are the sheet names:
library(xlsx)
wb <- loadWorkbook(your_xlsx_file)
sheets <- getSheets(wb)
names(sheets)

Yes, I have done that with the xlsx package which (just like the XLConnect package) uses a Java backend with the Apache POI code -- so it is cross-platform.

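You can also do this with the RODBC package (odbcConnectExcel2007 is only available on Windows):
library(RODBC)
h <- odbcConnectExcel2007("file.xlsx")
sqlTables(h)  # one row per sheet; the names are in the TABLE_NAME column
odbcClose(h)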

A one-line solution using openxlsx would be:
openxlsx::getSheetNames('your/file.xlsx')
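The question also asks how to act when a sheet does not exist. A minimal sketch, assuming openxlsx is installed; the helper sheet_exists(), the sheet name "Summary", and the temporary workbook are hypothetical and only there to make the example self-contained:

```r
library(openxlsx)

# Build a small throwaway workbook so the example can run on its own.
path <- tempfile(fileext = ".xlsx")
wb <- createWorkbook()
addWorksheet(wb, "Sheet1")
addWorksheet(wb, "name city")
saveWorkbook(wb, path, overwrite = TRUE)

# Character vector of sheet names, in workbook order.
sheet_names <- getSheetNames(path)

# Hypothetical helper: TRUE if `sheet` is one of the workbook's sheet names.
sheet_exists <- function(path, sheet) sheet %in% getSheetNames(path)

# If the sheet is missing, run the designated fallback (here: add the sheet).
target <- "Summary"
if (!sheet_exists(path, target)) {
  wb <- loadWorkbook(path)
  addWorksheet(wb, target)
  saveWorkbook(wb, path, overwrite = TRUE)
}
```

The comparison is a plain %in% on a character vector, so it is case-sensitive and matches the full sheet name only.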

None of the other solutions worked for my big xlsx (>300 sheets); they failed with a Java error:
Error in .jcall("RJavaTools", "Ljava/lang/Object;", "invokeMethod", cl, :
java.lang.OutOfMemoryError: Java heap space
Only this worked in my case:
library(openxlsx)
sheetNames <- getSheetNames("filename.xlsx")

To get the sheet names of an Excel workbook with the xlsx package, first load the workbook. In my case the file is called "input_4_r.xlsx":
> wb<-loadWorkbook("input_4_r.xlsx")
getSheets() returns the list of sheets; my example has 2 sheets. I kept the default name for the first sheet and named the second one "name city", hence the output below.
> getSheets(wb)
$Sheet1
[1] "Java-Object{Name: /xl/worksheets/sheet1.xml - Content Type: application/vnd.openxmlformats-officedocument.spreadsheetml.worksheet+xml}"
$`name city`
[1] "Java-Object{Name: /xl/worksheets/sheet2.xml - Content Type: application/vnd.openxmlformats-officedocument.spreadsheetml.worksheet+xml}"
To get just the sheet names, call names() on that list:
> names(getSheets(wb))
[1] "Sheet1" "name city"
To get the name of the sheet at a specific index, subset the result, e.g. [2] for the 2nd sheet in my case:
> names(getSheets(wb))[2]
[1] "name city"
The above assumes that the xlsx package is installed and loaded in R.

Related

How can I load an xlsx workbook into R?

I have an xlsx workbook on my desktop with existing sheets. I have a dataframe in R that I would like to add to this workbook as a new sheet.
Using library(openxlsx) I want to do:
wb <- loadWorkbook("workbook.xlsx", isUnzipped=TRUE)
addWorksheet(wb, "New_Sheet")
writeData(wb, "New_Sheet", df)
saveWorkbook(wb, "workbook.xlsx", overwrite=TRUE)
However, the program fails at the first line:
wb <- loadWorkbook("workbook.xlsx", isUnzipped=TRUE)
I get the error message:
Error in match(sheetrID, file_rIds): object sheetrId not found
I also created a dummy file (excel workbook with just two existing sheets and some dummy strings in each sheet) and I get the same error message.
I also tried uninstalling openxlsx and re-installing. No change.
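As mentioned in the comments by @MrFlick (thank you!), Excel files are actually zip archives that contain XML files. Setting isUnzipped=TRUE tells loadWorkbook that the file has already been unzipped, which is not the case for a regular .xlsx file. Leave isUnzipped at its default (FALSE) so that openxlsx unzips the workbook itself before reading it.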
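A minimal sketch of the same workflow with isUnzipped left at its default (FALSE), so that openxlsx handles the zip archive itself. It assumes openxlsx is installed; the temporary file, the sheet name "Existing", and the contents of df are made up for illustration:

```r
library(openxlsx)

# Create a small workbook to stand in for the existing file on disk.
path <- tempfile(fileext = ".xlsx")
wb0 <- createWorkbook()
addWorksheet(wb0, "Existing")
writeData(wb0, "Existing", data.frame(x = 1:3))
saveWorkbook(wb0, path, overwrite = TRUE)

# Hypothetical data frame to append as a new sheet.
df <- data.frame(id = 1:2, value = c("a", "b"))

# Load the zipped .xlsx directly; isUnzipped is not set.
wb <- loadWorkbook(path)
addWorksheet(wb, "New_Sheet")
writeData(wb, "New_Sheet", df)
saveWorkbook(wb, path, overwrite = TRUE)

getSheetNames(path)
```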

Reading an .xlsx file into R

I'm trying to read an Excel file into R. The file is in my current working directory:
> list.files()
[1] "Keuren_Op_Afspraak.xlsx"
I installed XLConnect and am doing the following:
library(XLConnect)
demoExcelFile <- system.file("Keuren_Op_Afspraak.xlsx", package = "XLConnect")
wb <- loadWorkbook(demoExcelFile)
But this gives me the error:
Error: FileNotFoundException (Java): File '' could not be found - you may specify to automatically create the file if not existing.
But I don't understand where this is coming from. Any thoughts?
I prefer using the readxl package. It is built on C and C++ libraries and needs no Java, so it is faster. It also seems to handle large files better. The command would be:
library(readxl)
wb <- read_excel("Keuren_Op_Afspraak.xlsx")
You can also use the xlsx package.
library(xlsx)
wb <- read.xlsx("Keuren_Op_Afspraak.xlsx", sheetIndex = 1)
Edit: @Verena
You can also use read.xlsx2, which is much faster:
wb <- read.xlsx2("Keuren_Op_Afspraak.xlsx", sheetIndex = 1)
You have to change your code like this:
library(XLConnect)
demoExcelFile <- "Keuren_Op_Afspraak.xlsx"
wb <- loadWorkbook(demoExcelFile)
You probably took the example from here:
http://www.inside-r.org/packages/cran/XLConnect/docs/loadWorkbook
This line
system.file("demoFiles/mtcars.xlsx", package = "XLConnect")
is a way to get sample files that are part of a package. If you download the zip file of XLConnect and look into the folder structure, you will see that there is a folder demoFiles that contains mtcars.xlsx. The parameter package="XLConnect" tells the function to look for the file in this package.
If you type it into the command line it returns the absolute path to the file:
"C:/Users/Expecto/Documents/R/win-library/3.1/XLConnect/demoFiles/mtcars.xlsx"
To use loadWorkbook you simply need to pass the relative or absolute filepath.
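The empty file name in the error message can be reproduced in base R: system.file() returns an empty string when the requested file is not part of the package. A small sketch, using the stats package and a made-up file name purely for illustration:

```r
# A file that is not shipped with the package yields "".
missing_path <- system.file("no_such_file.xlsx", package = "stats")
print(missing_path)

# A file that is shipped with the package yields its absolute path.
found_path <- system.file("DESCRIPTION", package = "stats")
print(nzchar(found_path))

# For a file in the working directory, pass the path directly and
# check that it exists before handing it to loadWorkbook().
local_path <- "Keuren_Op_Afspraak.xlsx"
if (!file.exists(local_path)) {
  message("File not found in ", getwd())
}
```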

Convert .xlsm to .xlsx in R

I would like to convert an Excel file (say its name is "Jimmy") that is saved as a macro-enabled workbook (Jimmy.xlsm) to Jimmy.xlsx.
I need this to be done in a coding environment. I cannot simply change this by opening the file in Excel and assigning a different file-type. I am currently programming in R. If I use the function
file.rename("Jimmy.xlsm", "Jimmy.xlsx")
the file becomes corrupted.
In your framework you have to read in the sheet and write it back out. Suppose you have an XLSM file (with macros, I presume) called "testXLSMtoX.xlsm" containing one sheet with tabular columns of data. This will do the trick:
library(xlsx)
r <- read.xlsx("testXLSMtoX.xlsm", 1) # read the first sheet
# provides a data frame
# use the first column in the spreadsheet to create row names then delete that column from the data frame
# otherwise you will get an extra column of row index numbers in the first column
r2w<-data.frame(r[-1],row.names=r[,1])
w <- write.xlsx(r2w,"testXLSMtoX.xlsx") # write the sheet
The macros will be stripped out, of course.
That's an answer, but I would question what you are trying to accomplish. In general it is easier to control R from Excel than Excel from R. I use RExcel from http://rcom.univie.ac.at/, which is not open source but pretty robust.
Here is a function that converts XLSM files to XLSX files with the R package RDCOMClient (this requires Windows with Excel installed):
library(RDCOMClient)
convert_XLSM_File_To_XLSX <- function(path_XLSM_File, path_XLSX_File)
{
  xlApp <- COMCreate("Excel.Application")
  xlApp[["Visible"]] <- FALSE
  xlApp[["DisplayAlerts"]] <- FALSE
  xlWbk <- xlApp$Workbooks()$Open(path_XLSM_File)
  xlWbk$SaveAs(path_XLSX_File, 51)  # 51 = xlOpenXMLWorkbook (.xlsx, without macros)
  xlWbk$Close()
  xlApp$Quit()
}
# path_XLSM_File and path_XLSX_File must be set to the input and output paths first
convert_XLSM_File_To_XLSX(path_XLSM_File, path_XLSX_File)

How to extract sheet names from Excel file in R

I have loaded a workbook into R and read in the worksheets using XLConnect, but I was wondering if there is a way of extracting the names of the sheets, perhaps as a vector?
So far my code is:
dataIn<-loadWorkbook(file.path(filenames[1],sep=""))
lst = readWorksheet(dataIn, sheet = getSheets(dataIn), startRow=1, startCol=1, header=TRUE)
...and I want to extract the sheet names of the sheets in lst.
Another really nice package developed by the folks at RStudio is readxl. It's easy to get the excel sheet names with the excel_sheets() function.
library(readxl)
path <- "path/to/your/file.xlsx"
excel_sheets(path = path)
You are looking for getSheets, which returns all worksheet names in a workbook.
In the openxlsx package, the equivalent function is getSheetNames:
library(openxlsx)
path <- "path/to/your/file.xlsx"
getSheetNames(path)
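To keep the sheet names attached to the data, one option is to read every sheet into a list named after the sheets. A sketch using readxl and its bundled example workbook, readxl_example("datasets.xlsx"), which stands in for your own file:

```r
library(readxl)

# Bundled example workbook; replace with the path to your own file.
path <- readxl_example("datasets.xlsx")

# Character vector of sheet names.
sheets <- excel_sheets(path)

# Read each sheet and name the list elements after the sheets.
lst <- lapply(sheets, function(s) read_excel(path, sheet = s))
names(lst) <- sheets

names(lst)
```

Individual sheets can then be accessed by name, e.g. lst[[sheets[1]]].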

Using the xlsx package in R to adjust the page settings of an xlsx file

I am trying to change the page settings of an xlsx file so that it is printed in landscape rather than portrait orientation. I tried the following:
library(xlsx)
wb <- createWorkbook()
sheet <- createSheet(wb, "Sheet1")
ps <- printSetup(sheet, landscape=TRUE, copies=3)
This is OK if I create a new Excel workbook, but I can't use it when I am using the loadWorkbook function to load an xls file. I wonder why.
Update: I am working on an xls file, not an xlsx file, and found that the answer below does not solve my problem. Any further suggestions? Thanks.
I can get it to work by using getSheets to select the sheet to apply the print setup to:
wb <- loadWorkbook("test.xlsx")
printSetup(getSheets(wb)[[1]], landscape=TRUE, copies=3)
saveWorkbook(wb,"test.xlsx")

Resources