Write data to different sheets of the same xlsx - r

I would like to know if there is some way to write several variables to different sheets of the same xlsx file.
I know that I can append new sheets to an existing file:
write.xlsx(x = df,
file = "df.xlsx",
sheetName = "Data Sheet 2",
append = TRUE)
But, I wouldn't like to write this code for each sheet. Is there any command which allows to create an xlsx, adding different data to different sheets directly?

The openxlsx library can do this.
library(openxlsx)
# iniate workbook object
wb <- createWorkbook()
# add 2 worksheets to the workbook object, arbitrary names
addWorksheet(wb, "Sheet 1")
addWorksheet(wb, "Sheet 2")
# some arbitrary data, can also be data.frames
x <- matrix(1:10, 5, 2)
y <- matrix(11:20, 2, 5)
# write data to worksheets in the workbook object
writeData(wb, 1, x)
writeData(wb, 2, y)
# save the workbook to a file
saveWorkbook(wb, "2sheets.xlsx")

Yes, you can use the library writexl to perform that. See an example below on how to do that. To generate multiple sheets directly, you need to put elements into a list, and then call writexl::write_xlsx using that list as an argument. If you would like to customize the names of the sheets, you can pass the names() argument to the list.
library(writexl)
##create a list of data frames
list_of_data_frames <- lapply(1:10, function(i){
data.frame(rnorm(1000))
})
##Add names to the list: these will be converted to sheet names in the workbook
names(list_of_data_frames) <- sapply(1:10, function(i)paste0("sheet_",i))
##Write to file
writexl::write_xlsx(list_of_data_frames, "data_frames_to_excel.xlsx")

You can use the list function to grab the data into one "x" and just use write_xlsx as usual.
list_data <- list("sheet 1 name" = data1,"sheet 2 name" = data2, "sheet 3 name" = data3)
write_xlsx(list_data,directory path)

Related

Select specific data frames from global environment [duplicate]

I am surprised to find that there is no easy way to export multiple data.frame to multiple worksheets of an Excel file? I tried xlsx package, seems it can only write to one sheet (override old sheet); I also tried WriteXLS package, but it gives me error all the time...
My code structure is like this: by design, for each iteration, the output dataframe (tempTable) and the sheetName (sn) got updated and exported into one tab.
for (i in 2 : ncol(code)){
...
tempTable <- ...
sn <- ...
WriteXLS("tempTable", ExcelFileName = "C:/R_code/../file.xlsx",
SheetNames = sn);
}
I can export to several cvs files, but there has to be an easy way to do that in Excel, right?
You can write to multiple sheets with the xlsx package. You just need to use a different sheetName for each data frame and you need to add append=TRUE:
library(xlsx)
write.xlsx(dataframe1, file="filename.xlsx", sheetName="sheet1", row.names=FALSE)
write.xlsx(dataframe2, file="filename.xlsx", sheetName="sheet2", append=TRUE, row.names=FALSE)
Another option, one that gives you more control over formatting and where the data frame is placed, is to do everything within R/xlsx code and then save the workbook at the end. For example:
wb = createWorkbook()
sheet = createSheet(wb, "Sheet 1")
addDataFrame(dataframe1, sheet=sheet, startColumn=1, row.names=FALSE)
addDataFrame(dataframe2, sheet=sheet, startColumn=10, row.names=FALSE)
sheet = createSheet(wb, "Sheet 2")
addDataFrame(dataframe3, sheet=sheet, startColumn=1, row.names=FALSE)
saveWorkbook(wb, "My_File.xlsx")
In case you might find it useful, here are some interesting helper functions that make it easier to add formatting, metadata, and other features to spreadsheets using xlsx:
http://www.sthda.com/english/wiki/r2excel-read-write-and-format-easily-excel-files-using-r-software
You can also use the openxlsx library to export multiple datasets to multiple sheets in a single workbook.The advantage of openxlsx over xlsx is that openxlsx removes the dependencies on java libraries.
Write a list of data.frames to individual worksheets using list names as worksheet names.
require(openxlsx)
list_of_datasets <- list("Name of DataSheet1" = dataframe1, "Name of Datasheet2" = dataframe2)
write.xlsx(list_of_datasets, file = "writeXLSX2.xlsx")
There's a new library in town, from rOpenSci: writexl
Portable, light-weight data frame to xlsx exporter based on
libxlsxwriter. No Java or Excel required
I found it better and faster than the above suggestions (working with the dev version):
library(writexl)
sheets <- list("sheet1Name" = sheet1, "sheet2Name" = sheet2) #assume sheet1 and sheet2 are data frames
write_xlsx(sheets, "path/to/location")
Many good answers here, but some of them are a little dated. If you want to add further worksheets to a single file then this is the approach I find works for me. For clarity, here is the workflow for openxlsx version 4.0
# Create a blank workbook
OUT <- createWorkbook()
# Add some sheets to the workbook
addWorksheet(OUT, "Sheet 1 Name")
addWorksheet(OUT, "Sheet 2 Name")
# Write the data to the sheets
writeData(OUT, sheet = "Sheet 1 Name", x = dataframe1)
writeData(OUT, sheet = "Sheet 2 Name", x = dataframe2)
# Export the file
saveWorkbook(OUT, "My output file.xlsx")
EDIT
I've now trialled a few other answers, and I actually really like #Syed's. It doesn't exploit all the functionality of openxlsx but if you want a quick-and-easy export method then that's probably the most straightforward.
I'm not familiar with the package WriteXLS; I generally use XLConnect:
library(XLConnect)
##
newWB <- loadWorkbook(
filename="F:/TempDir/tempwb.xlsx",
create=TRUE)
##
for(i in 1:10){
wsName <- paste0("newsheet",i)
createSheet(
newWB,
name=wsName)
##
writeWorksheet(
newWB,
data=data.frame(
X=1:10,
Dataframe=paste0("DF ",i)),
sheet=wsName,
header=TRUE,
rownames=NULL)
}
saveWorkbook(newWB)
This can certainly be vectorized, as #joran noted above, but just for the sake of generating dynamic sheet names quickly, I used a for loop to demonstrate.
I used the create=TRUE argument in loadWorkbook since I was creating a new .xlsx file, but if your file already exists then you don't have to specify this, as the default value is FALSE.
Here are a few screenshots of the created workbook:
Incase data size is small, R has many packages and functions which can be utilized as per your requirement.
write.xlsx, write.xlsx2, XLconnect also do the work but these are sometimes slow as compare to openxlsx.
So, if you are dealing with the large data sets and came across java errors. I would suggest to have a look of "openxlsx" which is really awesome and reduce the time to 1/12th.
I've tested all and finally i was really impressed with the performance of openxlsx capabilities.
Here are the steps for writing multiple datasets into multiple sheets.
install.packages("openxlsx")
library("openxlsx")
start.time <- Sys.time()
# Creating large data frame
x <- as.data.frame(matrix(1:4000000,200000,20))
y <- as.data.frame(matrix(1:4000000,200000,20))
z <- as.data.frame(matrix(1:4000000,200000,20))
# Creating a workbook
wb <- createWorkbook("Example.xlsx")
Sys.setenv("R_ZIPCMD" = "C:/Rtools/bin/zip.exe") ## path to zip.exe
Sys.setenv("R_ZIPCMD" = "C:/Rtools/bin/zip.exe") has to be static as it takes reference of some utility from Rtools.
Note: Incase Rtools is not installed on your system, please install it first for smooth experience. here is the link for your reference: (choose appropriate version)
https://cran.r-project.org/bin/windows/Rtools/
check the options as per link below (need to select all the check box while installation)
https://cloud.githubusercontent.com/assets/7400673/12230758/99fb2202-b8a6-11e5-82e6-836159440831.png
# Adding a worksheets : parameters for addWorksheet are 1. Workbook Name 2. Sheet Name
addWorksheet(wb, "Sheet 1")
addWorksheet(wb, "Sheet 2")
addWorksheet(wb, "Sheet 3")
# Writing data in to respetive sheets: parameters for writeData are 1. Workbook Name 2. Sheet index/ sheet name 3. dataframe name
writeData(wb, 1, x)
# incase you would like to write sheet with filter available for ease of access you can pass the parameter withFilter = TRUE in writeData function.
writeData(wb, 2, x = y, withFilter = TRUE)
## Similarly writeDataTable is another way for representing your data with table formatting:
writeDataTable(wb, 3, z)
saveWorkbook(wb, file = "Example.xlsx", overwrite = TRUE)
end.time <- Sys.time()
time.taken <- end.time - start.time
time.taken
openxlsx package is really good for reading and writing huge data from/ in excel files and has lots of options for custom formatting within excel.
The interesting fact is that we dont have to bother about java heap memory here.
I had this exact problem and I solved it this way:
library(openxlsx) # loads library and doesn't require Java installed
your_df_list <- c("df1", "df2", ..., "dfn")
for(name in your_df_list){
write.xlsx(x = get(name),
file = "your_spreadsheet_name.xlsx",
sheetName = name)
}
That way you won't have to create a very long list manually if you have tons of dataframes to write to Excel.
I regularly use the packaged rio for exporting of all kinds. Using rio, you can input a list, naming each tab and specifying the dataset. rio compiles other in/out packages, and for export to Excel, uses openxlsx.
library(rio)
filename <- "C:/R_code/../file.xlsx"
export(list(sn1 = tempTable1, sn2 = tempTable2, sn3 = tempTable3), filename)
tidy way of taking one dataframe and writing sheets by groups:
library(tidyverse)
library(xlsx)
mtcars %>%
mutate(cyl1 = cyl) %>%
group_by(cyl1) %>%
nest() %>%
ungroup() %>%
mutate(rn = row_number(),
app = rn != 1,
q = pmap(list(rn,data,app),~write.xlsx(..2,"test1.xlsx",as.character(..1),append = ..3)))
For me, WriteXLS provides the functionality you are looking for. Since you did not specify which errors it returns, I show you an example:
Example
library(WriteXLS)
x <- list(sheet_a = data.frame(a=letters), sheet_b = data.frame(b = LETTERS))
WriteXLS(x, "test.xlsx", names(x))
Explanation
If x is:
a list of data frames, each one is written to a single sheet
a character vector (of R objects), each object is written to a single sheet
something else, then see also what the help states:
More on usage
?WriteXLS
shows:
`x`: A character vector or factor containing the names of one or
more R data frames; A character vector or factor containing
the name of a single list which contains one or more R data
frames; a single list object of one or more data frames; a
single data frame object.
Solution
For your example, you would need to collect all data.frames in a list during the loop, and use WriteXLS after the loop has finished.
Session info
R 3.2.4
WriteXLS 4.0.0
I do it in this way for openxlsx using following function
mywritexlsx<-function(fname="temp.xlsx",sheetname="Sheet1",data,
startCol = 1, startRow = 1, colNames = TRUE, rowNames = FALSE)
{
if(! file.exists(fname))
wb = createWorkbook()
else
wb <- loadWorkbook(file =fname)
sheet = addWorksheet(wb, sheetname)
writeData(wb,sheet,data,startCol = startCol, startRow = startRow,
colNames = colNames, rowNames = rowNames)
saveWorkbook(wb, fname,overwrite = TRUE)
}
I do this all the time, all I do is
WriteXLS::WriteXLS(
all.dataframes,
ExcelFileName = xl.filename,
AdjWidth = T,
AutoFilter = T,
FreezeRow = 1,
FreezeCol = 2,
BoldHeaderRow = T,
verbose = F,
na = '0'
)
and all those data frames come from here
all.dataframes <- vector()
for (obj.iter in all.objects) {
obj.name <- obj.iter
obj.iter <- get(obj.iter)
if (class(obj.iter) == 'data.frame') {
all.dataframes <- c(all.dataframes, obj.name)
}
obviously sapply routine would be better here
for a lapply-friendly version..
library(data.table)
library(xlsx)
path2txtlist <- your.list.of.txt.files
wb <- createWorkbook()
lapply(seq_along(path2txtlist), function (j) {
sheet <- createSheet(wb, paste("sheetname", j))
addDataFrame(fread(path2txtlist[j]), sheet=sheet, startColumn=1, row.names=FALSE)
})
saveWorkbook(wb, "My_File.xlsx")

Reading xlsx with multiple sheets in R for duplication removal

I have a excel file which has multiple sheets embedded in it. My main goal is to basically remove all rows which are appearing multiple times in a single sheet and have to do this for every sheet.
I have written the code below but the code is only reading the first sheet and also giving ' ...' in first row and column. Can someone help me out where I might be going wrong. Thank you in advanced
**config_file_name <- '/RBIAPI3tables.xlsx'
config_xl <- paste(currentPath,config_file_name,sep="")
config_xl_sheets_name <- excel_sheets(path = config_xl) # An array of sheets is created. To access the array use config_xl_sheets[1]
count_of_xl_sheets <- length(config_xl_sheets_name)
# Read all sheets in the file as separate lists
list_all_sheets <- lapply(config_xl_sheets_name, function(x) read_excel(path = config_xl, sheet = x))
names (list_all_sheets) <- config_xl_sheets_name # Change the name of all the lists to excel file sheets name
count_of_list_all_sheets <- length(list_all_sheets) # to get the data frame of each list use list_all_sheets[[Config]]
# Create data frame for each sheet Assign the sheet name to the data frame
for (i in 1:count_of_list_all_sheets)
{
assign(x= trimws(config_xl_sheets_name[i]), value = data.frame(list_all_sheets[[i]]))
updateddata = unique(list_all_sheets[[i]])
}
write.xlsx(updateddata,"Unique3tables.xlsx",showNA = FALSE)**
this is my approach
library(readxl)
library(data.table)
library(openxlsx)
file.to.read <- "./testdata.xlsx"
sheets.to.read <- readxl::excel_sheets(file.to.read)
# read sheets from the file to a list and remove duplicate rows
L <- lapply(sheets.to.read, function(x) {
data <- setDT(readxl::read_excel(file.to.read, sheet = x))
#remove puplicates
data[!duplicated(data), ]
})
# create a new workbook
wb <- createWorkbook()
# create new worksheets an write to them
for (i in seq.int(L)) {
addWorksheet(wb, sheets.to.read[i])
writeData(wb, i, L[[i]] )
}
# write the workbook to disk
saveWorkbook(wb, "testdata_new.xlsx")

Lapply with read_csv

I am having trouble understanding lapply with read_csv function. The question is if Lapply creates an array of dataframes where I can access each dataframe using data[i]?
What I did:
I have downloaded the 5 cities data set (found here: https://archive.ics.uci.edu/ml/machine-learning-databases/00394/FiveCitiePMData.rar) and wrote R code to extract the 5 csv files and save to a dataframe as follows:
cities <- list.files('FiveCities')
cities_df <- lapply(cities, read.csv)
My goal was to create a workbook and save each of the csv files into an xlsx file with each csv being a sheet in the workbook as follows:
wb <- createWorkbook()
for(i in 1:length(cities)){
sheet <- addWorksheet(wb , i)
writeData(wb, sheet, cities_df[i])
}
What I am confused on is accessing each csv like this cities_df[i]. I thought cities_df[i] accesses the ith row of the dataframe and not a separate dataframe as a whole. Does lapply create an array of dataframes called cities_df[i] or what happens? If it does create an array then how come I can simply call cities_df and receive a result without specifying which dataframe in the array to call?
Here is complete code to create the Excel workbook and save it to a file FiveCities/cities.xlsx.
cities <- list.files('FiveCities', full.names = TRUE)
cities_df <- lapply(cities, read.csv)
names(cities_df) <- sub("\\.csv", "", basename(cities))
wb <- createWorkbook()
for(i in names(cities_df)){
sheet <- addWorksheet(wb , i)
writeData(wb, i, cities_df[[i]])
}
saveWorkbook(wb, file = "FiveCities/cities.xlsx")
This code may help!
library(plyr)
library(readr)
library(tidyverse)
library(openxlsx)
mydir = "C:/Users/mouad/Desktop/assasins creed/new"
myfiles = list.files(path=mydir, pattern="*.csv", full.names=TRUE)
str_length(mydir)
mylist=lapply(1:5, function(j) read_csv(myfiles[[j]]))
setwd(mydir)
wb <- createWorkbook()
lapply(1:length(mylist), function(i){
addWorksheet(wb=wb, sheetName = substr(myfiles[i],str_length(mydir)+1,60))
writeData(wb, sheet = i, mylist[[i]][length(mylist[[i]])])
})
saveWorkbook(wb, "test.xlsx", overwrite = TRUE)
read.xlsx("test.xlsx", sheet = 1)

Saving Excel files in loop in r

I have data frame and created a subset of it. I split the data frame and its
subset by a variable factors. I want to save it in excel file. I want to
write a loop to create multiple excel files data frame and subset files are
in sheets by a variable factor.
I had written a code its just saving the last kind of variable workbook.
How to create all the workbooks.
rm(list = ls())
mtcars
split_mtcars <- split(mtcars, mtcars$cyl)
split_mtcars_subset <- split(mtcars[,2:4], mtcars$cyl)
cyl_type <- names(split_mtcars)
for(i in length(cyl_type)){
wb <- createWorkbook()
addWorksheet(wb, "raw")
addWorksheet(wb, "subset")
writeData(wb, 1, split_mtcars[[i]])
writeData(wb, 2, split_mtcars_subset[[i]])
saveWorkbook(wb, file = paste0(cyl_type[i],".xlsx"), overwrite = TRUE)
}
Thanks In advance
Consider by to split your data frame by factor(s) to avoid the need of intermediate objects and hide the loop. Below outputs your workbook and builds a list of data frames.
split_mtcars <- by(mtcars, mtcars$cyl, function(sub) {
wb <- createWorkbook()
addWorksheet(wb, "raw")
addWorksheet(wb, "subset")
writeData(wb, 1, sub)
writeData(wb, 2, sub[,2:5])
saveWorkbook(wb, file = paste0(sub$cyl[1],".xlsx"), overwrite = TRUE)
return(sub) # TO REPLICATE split()
})

List of data.frame's to individual excel worksheets - R

I have a list of data.frame's that I would like to output to their own worksheets in excel. I can easily save a single data frame to it's own excel file but I'm not sure how to save multiple data frames to the their own worksheet within the same excel file.
library(xlsx)
write.xlsx(sortedTable[1], "c:/mydata.xlsx")
Specify sheet name for each list element.
library(xlsx)
file <- paste("usarrests.xlsx", sep = "")
write.xlsx(USArrests, file, sheetName = "Sheet1")
write.xlsx(USArrests, file, sheetName = "Sheet2", append = TRUE)
Second approach as suggested by #flodel, would be to use addDataFrame. This is more or less an example from the help page of the said function.
file <- paste("usarrests.xlsx", sep="")
wb <- createWorkbook()
sheet1 <- createSheet(wb, sheetName = "Sheet1")
sheet2 <- createSheet(wb, sheetName = "Sheet2")
addDataFrame(USArrests, sheet = sheet1)
addDataFrame(USArrests * 2, sheet = sheet2)
saveWorkbook(wb, file = file)
Assuming you have a list of data.frames and a list of sheet names, you can use them pair-wise.
wb <- createWorkbook()
datas <- list(USArrests, USArrests * 2)
sheetnames <- paste0("Sheet", seq_along(datas)) # or names(datas) if provided
sheets <- lapply(sheetnames, createSheet, wb = wb)
void <- Map(addDataFrame, datas, sheets)
saveWorkbook(wb, file = file)
Here's the solution with openxlsx:
## create data;
dataframes <- split(iris, iris$Species)
# create workbook
wb <- createWorkbook()
#Iterate the same way as PavoDive, slightly different (creating an anonymous function inside Map())
Map(function(data, nameofsheet){
addWorksheet(wb, nameofsheet)
writeData(wb, nameofsheet, data)
}, dataframes, names(dataframes))
## Save workbook to excel file
saveWorkbook(wb, file = "file.xlsx", overwrite = TRUE)
.. however, openxlsx is also able to use it's function openxlsx::write.xlsx for this, so you can just give the object with your list of dataframes and the filepath, and openxlsx is smart enough to create the list as sheets within the xlsx-file. The code I post here with Map() is if you want to format the sheets in a specific way.
The following code works perfectly, which is from:https://rpubs.com/gbganalyst/RdatatoExcelworkbook
packages <- c("openxlsx", "readxl", "magrittr", "purrr", "ggplot2")
if (!require(install.load)) {
install.packages("install.load")
}
install.load::install_load(packages)
list_of_mydata
write.xlsx(list_of_mydata, "Excel workbook.xlsx")
lets say your list of data frames is called Lst and that the workbook you want to save to is called wb.xlsx. Then you can use:
library(xlsx)
counter <- 1
for (i in length(Lst)){
write.xlsx(x=Lst[[i]],file="wb.xlsx",sheetName=paste("sheet",counter,sep=""),append=T)
counter <- counter + 1
}
G
I think the simplest solution is still missing. Using the writexl package you can write a list of data frames easily:
list_of_dfs <- list(iris, iris)
writexl::write_xlsx(list_of_dfs, "output.xlsx")
Also, if you have a named list then those names become the sheet names:
names(list_of_dfs) <- c("a", "b")
writexl::write_xlsx(list_of_dfs, "output.xlsx")
Alternatively, the rio package allows for more export control and the syntax and handling of named lists is similar:
rio::export(list_of_dfs, "output.xlsx")
You can also easily output them to their own workbooks as well.

Resources