Convert .xlsm to .xlsx in R - r

I would like to convert an Excel file (say its name is "Jimmy") that is saved as a macro-enabled workbook (Jimmy.xlsm) to Jimmy.xlsx.
I need this to be done in a coding environment. I cannot simply change this by opening the file in Excel and assigning a different file-type. I am currently programming in R. If I use the function
file.rename("Jimmy.xlsm", "Jimmy.xlsx")
the file becomes corrupted.

In your framework you have to read in the sheet and write it back out. Suppose you have an XLSM file (with macros, I presume) called "testXLSMtoX.xlsm" containing one sheet with tabular columns of data. This will do the trick:
library(xlsx)
r <- read.xlsx("testXLSMtoX.xlsm", 1) # read the first sheet
# provides a data frame
# use the first column in the spreadsheet to create row names then delete that column from the data frame
# otherwise you will get an extra column of row index numbers in the first column
r2w <- data.frame(r[-1], row.names = r[, 1])
write.xlsx(r2w, "testXLSMtoX.xlsx") # write the sheet
The macros will be stripped out, of course.
That's an answer, but I would question what you are trying to accomplish. In general it is easier to control R from Excel than Excel from R. I use RExcel from http://rcom.univie.ac.at/, which is not open source but pretty robust.

Here is a function that converts XLSM files to XLSX files with the R package RDCOMClient (Windows only; Excel must be installed):
library(RDCOMClient)
convert_XLSM_File_To_XLSX <- function(path_XLSM_File, path_XLSX_File)
{
  xlApp <- COMCreate("Excel.Application")
  xlApp[["Visible"]] <- FALSE
  xlApp[["DisplayAlerts"]] <- FALSE
  xlWbk <- xlApp$Workbooks()$Open(path_XLSM_File)
  xlWbk$SaveAs(path_XLSX_File, 51) # 51 = xlOpenXMLWorkbook (.xlsx, no macros)
  xlWbk$Close()
  xlApp$Quit()
}
convert_XLSM_File_To_XLSX(path_XLSM_File, path_XLSX_File)
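Because Excel is driven through COM here, it is safer to hand it absolute paths; a relative path may be resolved against Excel's own working directory rather than R's. A minimal sketch of a helper that derives the target path, assuming the function above is available. The helper name xlsx_path_for and the folder "reports" are hypothetical and not part of RDCOMClient:

```r
# Hypothetical helper: derive the .xlsx target path for a given .xlsm file.
xlsx_path_for <- function(path_xlsm) {
  stopifnot(grepl("\\.xlsm$", path_xlsm, ignore.case = TRUE))
  # mustWork = FALSE: do not raise an error if the file cannot be resolved
  abs_path <- normalizePath(path_xlsm, mustWork = FALSE)
  # Swap the extension, whatever its case
  sub("\\.xlsm$", ".xlsx", abs_path, ignore.case = TRUE)
}

# Usage with the function above (Windows with Excel installed);
# "reports" is an assumed folder name:
# xlsm_files <- list.files("reports", pattern = "\\.xlsm$", full.names = TRUE)
# for (f in xlsm_files) {
#   convert_XLSM_File_To_XLSX(normalizePath(f), xlsx_path_for(f))
# }
```

The loop is left commented out because it only works on a machine where Excel can be started through COM.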

Related

Lock specific cells in an Excel file from R while preserving formatting

I'm trying to lock a block of cells in a series of Excel files, protect each file, and save them in their original location. I have the code working in every way except that locking the columns strips those cells of all formatting. I can't recreate the format manually within the xlsx package because each of the files is slightly different.
I know how to output data into Excel without formatting using XLConnect, but I can't get XLConnect to lock cells/protect workbooks. So I'm either looking for help using XLConnect to lock down the cells, or help using xlsx to lock the cells without overwriting the formatting.
Here is my current code (using xlsx package):
wb <- loadWorkbook(file.path)
sheets <- getSheets(wb)
sh <- sheets[[1]]
lock <- CellStyle(wb, cellProtection = CellProtection(locked = TRUE))
rows <- getRows(sh, rowIndex = 9:50)
cells <- getCells(rows, colIndex = 5:6)
lapply(names(cells), function(ii) setCellStyle(cells[[ii]], lock))
.jcall(sh, "V", "protectSheet", "p#ssword")
saveWorkbook(wb, file.path)
I think I may have eventually found my own answer by going around xlsx and XLConnect. Instead I wrote a VBA macro:
ActiveSheet.Unprotect ("p#ssword")
Range("E8:F50").Locked = True
ActiveSheet.Protect ("p#ssword")
and then called the macro in R (using RDCOMClient), cycling through the different sheets
xlApp <- COMCreate("Excel.Application")
xlWbk <- xlApp$Workbooks()$Open(paste0(temp.path))
xlApp$Run("LockColumns")
xlWbk$Close(TRUE)
xlApp$Quit()

writeWorksheet function in R not pasting values into Excel

I am trying to copy some data from an R data frame (Shipments) to an Excel file using the writeWorksheet function in the XLConnect package. However, it is not copying anything to the Excel file. The execution doesn't produce any error or warning in the console. It just doesn't copy.
I have loaded the library XLConnect and made sure I am not loading the library xlsx. The column to be copied has been cast to a data frame, as I thought that might be an issue.
wbnames is an extra step. I originally wrote the sheet name directly in the writeWorksheet call, which should have worked fine. Using wbnames instead hasn't changed the result.
I originally intended to copy the content to a macro-enabled file and then run the macro from R itself, but it wasn't working. I thought the macro file might be the cause, but the function does not work on a plain .xlsx file either.
So I am not sure what the issue is. I would be grateful for some help here. Am I missing something?
library(XLConnect)
library(RDCOMClient)
xlApp <- COMCreate("Excel.Application")
xlWbk <- xlApp$Workbooks()$Open(FILEPATH+FILENAME.XLSX)
xlWb <- loadWorkbook(FILEPATH+FILENAME.XLSX)
wbnames <- as.vector(getSheets(xlWb))
# Copy a column from the existing data frame and paste it to the first
# sheet of the FILENAME.XLSX, starting at Row#6, no headers and no rownames:
writeWorksheet(xlWb, as.data.frame(Shipments$SHIPMENT_ID),
               sheet = wbnames[1], startRow = 6, header = FALSE, rownames = NULL)
xlWb is the R object that holds the workbook in memory. writeWorksheet only changes that in-memory object, so it looks like the data has been written to the workbook, which is good. Nothing reaches the file on disk, however, until you save the workbook. Add this line after your code and you should see a file called your_file_name.xlsx, containing your data, in your working directory:
XLConnect::saveWorkbook(xlWb, "your_file_name.xlsx")
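The whole load, write, save cycle can be sketched as below. This is a minimal, self-contained example rather than the asker's actual setup: the Shipments data frame, the sheet name "Sheet1" and the temporary file path are hypothetical stand-ins.

```r
library(XLConnect)

# Hypothetical stand-ins for the asker's data and file
Shipments <- data.frame(SHIPMENT_ID = c(101, 102, 103))
path <- file.path(tempdir(), "shipments_demo.xlsx")

# create = TRUE makes a new workbook if the file does not exist yet
wb <- loadWorkbook(path, create = TRUE)
createSheet(wb, name = "Sheet1")

# This only changes the workbook object held in memory
writeWorksheet(wb, Shipments["SHIPMENT_ID"], sheet = "Sheet1",
               startRow = 6, header = FALSE)

# Nothing is written to disk until the workbook is saved
saveWorkbook(wb)
```

Calling saveWorkbook(wb) without a file argument saves to the path the workbook was loaded from; pass a second argument to write a copy elsewhere.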

How do I modify an existing sheet in an Excel Workbook using Openxlsx package in R?

I am using the "openxlsx" package to read and write Excel files. I have a fixed file with a sheet called "Data" which is used by formulas in other sheets. I want to update this Data sheet without touching the others.
I am trying the following code:
write.xlsx(x = Rev_4, file = "Revenue.xlsx", sheetName="Data")
But this erases the Excel file and creates a new one with just the new data in the "Data" sheet, while everything else gets deleted. Any advice?
Try this:
wb <- loadWorkbook("Revenue.xlsx")
writeData(wb, sheet = "Data", Rev_4, colNames = FALSE)
saveWorkbook(wb, "Revenue.xlsx", overwrite = TRUE)
You need to load the complete workbook, then modify its data and then save it to disk. With writeData you can also specify the starting row and column. And you could also modify other sections before saving to disk.
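As a minimal sketch of the starting row and column arguments, the example below first builds a small demo workbook and then overwrites only the values under the header row of the "Data" sheet. The file path, the "Summary" sheet and the contents of Rev_4 are hypothetical stand-ins for the asker's Revenue.xlsx:

```r
library(openxlsx)

# Hypothetical demo workbook standing in for Revenue.xlsx
path <- file.path(tempdir(), "revenue_demo.xlsx")
wb0 <- createWorkbook()
addWorksheet(wb0, "Data")
addWorksheet(wb0, "Summary")
writeData(wb0, "Data", data.frame(month = c("Jan", "Feb"), revenue = c(10, 20)))
saveWorkbook(wb0, path, overwrite = TRUE)

# Hypothetical updated data
Rev_4 <- data.frame(month = c("Jan", "Feb"), revenue = c(15, 25))

wb <- loadWorkbook(path)
# Keep the header in row 1 and overwrite the values from row 2, column 1 onward
writeData(wb, sheet = "Data", x = Rev_4,
          startRow = 2, startCol = 1, colNames = FALSE)
saveWorkbook(wb, path, overwrite = TRUE)

# Read the sheet back to check the result
updated <- read.xlsx(path, sheet = "Data")
```

The "Summary" sheet is never touched, which is the point of loading the whole workbook instead of calling write.xlsx.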
I've found the xlsx2dfs package. It depends on openxlsx and helps with reading and writing many sheets of an xlsx file at once. Maybe it makes this easier:
Package documentation
library(xlsx2dfs)
# However, be careful, the function xlsx2dfs assumes
# that all sheets contain simple tables. If that is not the case,
# use the accepted answer!
dfs <- xlsx2dfs("Revenue.xlsx") # all sheets of file as list of dfs
dfs[["Data"]] <- Rev_4 # replace df of sheet "Data" by updated df Rev_4 (double brackets, so the whole data frame is stored)
dfs2xlsx(dfs, "Revenue.xlsx") # careful: this overwrites the existing file!

Read excel file with formulas in cells into R

I was trying to read an Excel spreadsheet into an R data frame. However, some of the columns contain formulas or are linked to other external spreadsheets. Whenever I read the spreadsheet into R, many of those cells become NA. Is there a good way to fix this problem so that I can get the original values of those cells?
The R script I used to do the import is like the following:
options(java.parameters = "-Xmx8g")
library(XLConnect)
# Step 1 import the "raw" tab
path_cost = "..."
wb = loadWorkbook(...)
raw = readWorksheet(wb, sheet = '...', header = TRUE, useCachedValues = FALSE)
UPDATE: read_excel from the readxl package looks like a better solution. It's very fast (0.14 sec for the 1400 x 6 file I mentioned in the comments) and it imports the stored result of each formula rather than the formula itself. It doesn't use Java, so there is no need to set any Java options.
library(readxl)
# sheet can be a string (name of sheet) or integer (position of sheet)
raw = read_excel(file, sheet=sheet)
For more information and examples, see the short vignette.
ORIGINAL ANSWER: Try read.xlsx from the xlsx package. The help file implies that by default it evaluates formulas before importing (see the keepFormulas parameter). I checked this on a small test file and it worked for me. Formula results were imported correctly, including formulas that depend on other sheets in the same workbook and formulas that depend on other workbooks in the same directory.
One caveat: If an externally linked sheet has changed since the last time you updated the links on the file you're reading into R, then any values read into R that depend on external links will be the old values, not the latest ones.
The code in your case would be:
options(java.parameters = "-Xmx8g") # xlsx also uses Java; set this before loading the package
library(xlsx)
# Replace file and sheetName with appropriate values for your file
# keepFormulas=FALSE and header=TRUE are the defaults. I added them only for illustration.
raw = read.xlsx(file, sheetName=sheetName, header=TRUE, keepFormulas=FALSE)

r - read.xlsx from .xlsx with unknown number of sheets

Suppose I have an Excel file that I would like to read into R with the read.xlsx function. The file consists of several spreadsheets, and I do not know how many (there are about 200 such files, so checking the number of sheets manually would be a huge pain). Each spreadsheet is organized like a proper data frame.
I would like to stack those spreadsheets one on top of another.
I wrote something like:
columnsILike <- c(1,40)
for (i in 1:numberOfSheets) {
  dfInd <- read.xlsx("myfile.xlsx", i, # number of sheet
                     colIndex = columnsILike, endRow = 201, startRow = 2,
                     header = FALSE)
  PreviousEmptyDataFrame <- rbind(PreviousEmptyDataFrame, dfInd)
}
write.csv(PreviousEmptyDataFrame, "data.csv")
Question is, how do I know number of sheets in advance?
getSheets(loadWorkbook("file_path")) in the xlsx package returns a list of the sheets in the workbook, so you can take the length of that list to find the number of sheets.
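A minimal sketch of that approach with the xlsx package. To keep the example self-contained it first writes a small two-sheet demo file; the file path, sheet names and data are hypothetical stand-ins for "myfile.xlsx":

```r
library(xlsx)

# Hypothetical demo file with two sheets, standing in for "myfile.xlsx"
path <- file.path(tempdir(), "sheets_demo.xlsx")
write.xlsx(data.frame(id = 1:2, value = c(10, 20)), path,
           sheetName = "first", row.names = FALSE)
write.xlsx(data.frame(id = 3:4, value = c(30, 40)), path,
           sheetName = "second", row.names = FALSE, append = TRUE)

# Count the sheets instead of hard-coding the number
numberOfSheets <- length(getSheets(loadWorkbook(path)))

# Read each sheet by index and stack the results
sheets <- lapply(seq_len(numberOfSheets), function(i) {
  read.xlsx(path, sheetIndex = i, header = TRUE)
})
stacked <- do.call(rbind, sheets)
```

Collecting the sheets in a list and calling rbind once avoids growing a data frame inside a loop, and it needs no pre-existing empty data frame.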
This answer is rather late, but wouldn't this be simpler?
gdata::sheetCount("myworkbook.xlsx")
You can also use package XLConnect if the workbook isn't too large.
library(XLConnect)
wb <- loadWorkbook("myworkbook.xlsx")
result <- do.call(rbind,lapply(getSheets(wb),
function(sheet)readWorksheet(wb,sheet)))
