I have a data frame and created a subset of it. I split both the data frame
and its subset by a factor variable, and I want to save the results to Excel.
I want to write a loop that creates one Excel file per factor level, with the
full data and the subset as separate sheets.
The code I wrote saves only the workbook for the last factor level.
How can I create all the workbooks?
rm(list = ls())
mtcars
split_mtcars <- split(mtcars, mtcars$cyl)
split_mtcars_subset <- split(mtcars[,2:4], mtcars$cyl)
cyl_type <- names(split_mtcars)
for(i in length(cyl_type)){
wb <- createWorkbook()
addWorksheet(wb, "raw")
addWorksheet(wb, "subset")
writeData(wb, 1, split_mtcars[[i]])
writeData(wb, 2, split_mtcars_subset[[i]])
saveWorkbook(wb, file = paste0(cyl_type[i],".xlsx"), overwrite = TRUE)
}
Thanks In advance
Consider using by() to split your data frame by one or more factors. This avoids the intermediate objects and hides the loop. The code below writes one workbook per cyl value and returns a list of data frames.
library(openxlsx)
split_mtcars <- by(mtcars, mtcars$cyl, function(sub) {
wb <- createWorkbook()
addWorksheet(wb, "raw")
addWorksheet(wb, "subset")
writeData(wb, 1, sub)
writeData(wb, 2, sub[,2:4]) # same columns as the subset in the question
saveWorkbook(wb, file = paste0(sub$cyl[1],".xlsx"), overwrite = TRUE)
return(sub) # TO REPLICATE split()
})
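The original loop saved only the last workbook because length(cyl_type) is the single number 3, so for(i in length(cyl_type)) runs once with i = 3. A minimal sketch of the same loop with seq_along(), assuming the openxlsx package is installed; writing to tempdir() and the name out_files are choices made for this illustration, not part of the question:

```r
library(openxlsx)

split_mtcars <- split(mtcars, mtcars$cyl)
cyl_type <- names(split_mtcars)

# Assumption for this sketch: write into a temporary folder
out_files <- file.path(tempdir(), paste0(cyl_type, ".xlsx"))

# seq_along(cyl_type) yields 1, 2, 3, so every group gets its own workbook
for (i in seq_along(cyl_type)) {
  wb <- createWorkbook()
  addWorksheet(wb, "raw")
  addWorksheet(wb, "subset")
  writeData(wb, "raw", split_mtcars[[i]])
  writeData(wb, "subset", split_mtcars[[i]][, 2:4])
  saveWorkbook(wb, file = out_files[i], overwrite = TRUE)
}
```

Taking the subset inside the loop also removes the need for the second split object.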
I'm trying to understand why part of my appended list gets cut off when exporting to Excel. I can split a data frame by a grouping variable into a list of data frames:
data(iris)
split_tibble <- function(tibble, col = 'col') tibble %>% split(., .[, col])
spliris = split_tibble(iris,'Species') #creates list for each species having all variables
I have a separate list that looks like this:
mylist = list(cbind(col1 = c("val1","val2","val3","val4"),col2 = c("A","B","C","D")))
names(mylist) = "Table"
And I combine them into one list:
newlist = c(mylist,spliris) #looks correct so far
Then I write the list out to an Excel workbook:
#Create workbook with a sheet for each list element
library(openxlsx)
wb <- createWorkbook()
lapply(seq_along(newlist), function(i){
addWorksheet(wb=wb, sheetName = names(newlist[i]))
writeData(wb, sheet = i, newlist[[i]][-length(newlist[[i]])])
})
But when I save the workbook, the first sheet "Table" is incomplete and the two columns are just a single column.
Why does this happen? If I do not append the lists together and just write out the iris-lists it works perfectly:
#This works
wb <- createWorkbook()
lapply(seq_along(spliris), function(i){
addWorksheet(wb=wb, sheetName = names(spliris[i]))
writeData(wb, sheet = i, spliris[[i]][-length(spliris[[i]])])
})
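The likely cause is that the first list element is a matrix (built with cbind), not a data frame. For a matrix, length() counts every cell and a single negative index works on the flattened values, whereas for a data frame length() is the number of columns. A minimal sketch in base R, reusing the matrix from the question (the names m, flat and df are only for illustration):

```r
m <- cbind(col1 = c("val1", "val2", "val3", "val4"),
           col2 = c("A", "B", "C", "D"))

length(m)              # number of cells in the matrix, not columns
flat <- m[-length(m)]  # drops the last cell and returns a plain vector
dim(flat)              # NULL: the two-column shape is gone

df <- as.data.frame(m)
length(df)             # number of columns
kept <- df[-length(df)]  # drops the last column, stays a data frame
```

So the index that removes Species from each iris piece flattens the matrix into one short column. Writing the "Table" element without the negative index should keep both of its columns.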
I am having trouble understanding lapply combined with read.csv. Does lapply create an array of data frames in which I can access each data frame using data[i]?
What I did:
I have downloaded the 5 cities data set (found here: https://archive.ics.uci.edu/ml/machine-learning-databases/00394/FiveCitiePMData.rar) and wrote R code to extract the 5 csv files and save to a dataframe as follows:
cities <- list.files('FiveCities')
cities_df <- lapply(cities, read.csv)
My goal was to create a workbook and save each of the csv files into an xlsx file with each csv being a sheet in the workbook as follows:
wb <- createWorkbook()
for(i in 1:length(cities)){
sheet <- addWorksheet(wb , i)
writeData(wb, sheet, cities_df[i])
}
What I am confused on is accessing each csv like this cities_df[i]. I thought cities_df[i] accesses the ith row of the dataframe and not a separate dataframe as a whole. Does lapply create an array of dataframes called cities_df[i] or what happens? If it does create an array then how come I can simply call cities_df and receive a result without specifying which dataframe in the array to call?
Here is complete code to create the Excel workbook and save it to a file FiveCities/cities.xlsx.
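library(openxlsx)
# Only pick up CSV files, so the saved cities.xlsx is not read back in on a rerun
cities <- list.files('FiveCities', pattern = "\\.csv$", full.names = TRUE)
cities_df <- lapply(cities, read.csv)
# Name each list element after its file, without the extension
names(cities_df) <- sub("\\.csv$", "", basename(cities))
wb <- createWorkbook()
for(i in names(cities_df)){
addWorksheet(wb , i)
writeData(wb, i, cities_df[[i]])
}
saveWorkbook(wb, file = "FiveCities/cities.xlsx", overwrite = TRUE)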
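As for what lapply returns: it is a plain list with one data frame per element, not an array. Single brackets return a shorter list, double brackets return the element itself. A small sketch in base R, using made-up stand-in data frames instead of the CSV files:

```r
# Stand-in data frames (hypothetical values) in place of the five CSV files
cities_df <- lapply(1:3, function(i) data.frame(id = i, pm = i * 10))

class(cities_df)                # "list"

one_elem_list <- cities_df[2]   # still a list, of length 1
second_df <- cities_df[[2]]     # the data frame itself

# Rows are indexed inside one data frame, after extracting it with [[ ]]
first_row <- cities_df[[2]][1, ]
```

Typing cities_df on its own prints the whole list, which is why it returns a result without an index.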
This code may help. It reads every CSV file in a folder and writes each one to its own sheet:
library(readr)
library(openxlsx)
mydir <- "C:/Users/mouad/Desktop/assasins creed/new"
myfiles <- list.files(path = mydir, pattern = "\\.csv$", full.names = TRUE)
# Read every CSV file into a list of data frames
mylist <- lapply(myfiles, read_csv)
# Sheet names: file names without extension, cut to Excel's 31-character limit
sheet_names <- substr(tools::file_path_sans_ext(basename(myfiles)), 1, 31)
wb <- createWorkbook()
for (i in seq_along(mylist)) {
addWorksheet(wb = wb, sheetName = sheet_names[i])
writeData(wb, sheet = i, mylist[[i]])
}
saveWorkbook(wb, file.path(mydir, "test.xlsx"), overwrite = TRUE)
# Read the first sheet back to check the result
read.xlsx(file.path(mydir, "test.xlsx"), sheet = 1)
I would like to know if there is some way to write several variables to different sheets of the same xlsx file.
I know that I can append new sheets to an existing file:
write.xlsx(x = df,
file = "df.xlsx",
sheetName = "Data Sheet 2",
append = TRUE)
But I would rather not repeat this code for each sheet. Is there a command that creates an xlsx file and writes different data to different sheets directly?
The openxlsx library can do this.
library(openxlsx)
# initiate workbook object
wb <- createWorkbook()
# add 2 worksheets to the workbook object, arbitrary names
addWorksheet(wb, "Sheet 1")
addWorksheet(wb, "Sheet 2")
# some arbitrary data, can also be data.frames
x <- matrix(1:10, 5, 2)
y <- matrix(11:20, 2, 5)
# write data to worksheets in the workbook object
writeData(wb, 1, x)
writeData(wb, 2, y)
# save the workbook to a file
saveWorkbook(wb, "2sheets.xlsx")
Yes, you can use the writexl package for that; see the example below. To generate multiple sheets directly, put the data frames into a list and pass that list to writexl::write_xlsx. To customize the sheet names, assign names to the list with names().
library(writexl)
##create a list of data frames
list_of_data_frames <- lapply(1:10, function(i){
data.frame(rnorm(1000))
})
##Add names to the list: these will be converted to sheet names in the workbook
names(list_of_data_frames) <- sapply(1:10, function(i)paste0("sheet_",i))
##Write to file
writexl::write_xlsx(list_of_data_frames, "data_frames_to_excel.xlsx")
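You can use the list function to collect the data frames into one object and then call write_xlsx as usual. Here data1, data2 and data3 stand for your own data frames.
list_data <- list("sheet 1 name" = data1,"sheet 2 name" = data2, "sheet 3 name" = data3)
write_xlsx(list_data, path = "path/to/output.xlsx") # replace with your own file path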
I've been using the openxlsx package to format different excel files (ie. highlighting a row based on a condition).
I created two workbooks with this package and each workbook is formatted differently. Now, I am trying to combine these two workbooks into a single excel file where these individual workbooks are tabs. Is there a way to do that? I know you can do it with multiple dataframes, but if I do that, then I lose my formatting.
For example, I tried this:
wb <- createWorkbook()
addWorksheet(wb, sheetName="data")
writeData(wb, sheet="data", x=data)
wb2 <- createWorkbook()
addWorksheet(wb2, sheetName="data2")
writeData(wb2, sheet="data2", x=data2)
write.xlsx(wb, file = "combined.xlsx", sheetName="data", row.names=FALSE)
write.xlsx(wb2,file = "combined.xlsx", sheetName="data2", append = TRUE, row.names=FALSE)
but it seems to only work for dataframes.
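Following your example, I would save both workbooks, then load the first one and copy the sheets of the second one into it. Note that read.xlsx() returns only the cell values, so formatting applied in wb2 has to be reapplied to the combined workbook.
library(openxlsx)
wb <- createWorkbook()
addWorksheet(wb, sheetName="data")
writeData(wb, sheet="data", x=data)
saveWorkbook(wb, "combined.xlsx", overwrite = TRUE)
wb2 <- createWorkbook()
addWorksheet(wb2, sheetName="data2")
writeData(wb2, sheet="data2", x=data2)
saveWorkbook(wb2, "wb2.xlsx", overwrite = TRUE)
wb_comb <- loadWorkbook("combined.xlsx")
# Copy the data of every sheet in wb2 into the combined workbook
lapply(names(wb2), function(s) {
dt <- read.xlsx("wb2.xlsx", sheet = s)
addWorksheet(wb_comb , sheetName = s)
writeData(wb_comb, s, dt)
})
saveWorkbook(wb_comb, "combined.xlsx", overwrite = TRUE)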
I have a list of data frames that I would like to output to their own worksheets in Excel. I can easily save a single data frame to its own Excel file, but I'm not sure how to save multiple data frames to their own worksheets within the same Excel file.
library(xlsx)
write.xlsx(sortedTable[1], "c:/mydata.xlsx")
Specify a sheet name for each data frame and append to the same file.
library(xlsx)
file <- "usarrests.xlsx"
write.xlsx(USArrests, file, sheetName = "Sheet1")
write.xlsx(USArrests, file, sheetName = "Sheet2", append = TRUE)
A second approach, as suggested by @flodel, is to use addDataFrame. This is more or less an example from the help page of that function.
file <- "usarrests.xlsx"
wb <- createWorkbook()
sheet1 <- createSheet(wb, sheetName = "Sheet1")
sheet2 <- createSheet(wb, sheetName = "Sheet2")
addDataFrame(USArrests, sheet = sheet1)
addDataFrame(USArrests * 2, sheet = sheet2)
saveWorkbook(wb, file = file)
Assuming you have a list of data.frames and a list of sheet names, you can use them pair-wise.
wb <- createWorkbook()
datas <- list(USArrests, USArrests * 2)
sheetnames <- paste0("Sheet", seq_along(datas)) # or names(datas) if provided
sheets <- lapply(sheetnames, createSheet, wb = wb)
void <- Map(addDataFrame, datas, sheets)
saveWorkbook(wb, file = file)
Here's the solution with openxlsx:
## create data;
dataframes <- split(iris, iris$Species)
# create workbook
wb <- createWorkbook()
#Iterate the same way as PavoDive, slightly different (creating an anonymous function inside Map())
Map(function(data, nameofsheet){
addWorksheet(wb, nameofsheet)
writeData(wb, nameofsheet, data)
}, dataframes, names(dataframes))
## Save workbook to excel file
saveWorkbook(wb, file = "file.xlsx", overwrite = TRUE)
However, openxlsx can also do this with its function openxlsx::write.xlsx: pass it the list of data frames and the file path, and it writes each list element to its own sheet in the xlsx file. The Map() code above is useful if you want to format the sheets in a specific way.
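A minimal sketch of that shortcut, assuming openxlsx is installed; the output location in tempdir() and the file name are hypothetical choices for this example:

```r
library(openxlsx)

dataframes <- split(iris, iris$Species)

# Hypothetical output location for this sketch
out_file <- file.path(tempdir(), "iris_by_species.xlsx")

# A named list is written as one sheet per element, named after the list names
write.xlsx(dataframes, file = out_file, overwrite = TRUE)

# List the sheet names of the file that was just written
sheet_names <- getSheetNames(out_file)
```

Because no workbook object is built by hand, there is no place to add styles here, which is where the Map() version is the better fit.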
The following code, taken from https://rpubs.com/gbganalyst/RdatatoExcelworkbook, works well. Here list_of_mydata stands for your own list of data frames.
packages <- c("openxlsx", "readxl", "magrittr", "purrr", "ggplot2")
if (!require(install.load)) {
install.packages("install.load")
}
install.load::install_load(packages)
list_of_mydata
write.xlsx(list_of_mydata, "Excel workbook.xlsx")
Let's say your list of data frames is called Lst and that the workbook you want to save to is called wb.xlsx. Then you can use:
library(xlsx)
# seq_along(Lst) visits every element; length(Lst) alone would visit only the last one
for (i in seq_along(Lst)){
write.xlsx(x=Lst[[i]],file="wb.xlsx",sheetName=paste("sheet",i,sep=""),append=TRUE)
}
I think the simplest solution is still missing. Using the writexl package you can write a list of data frames easily:
list_of_dfs <- list(iris, iris)
writexl::write_xlsx(list_of_dfs, "output.xlsx")
Also, if you have a named list then those names become the sheet names:
names(list_of_dfs) <- c("a", "b")
writexl::write_xlsx(list_of_dfs, "output.xlsx")
Alternatively, the rio package allows for more export control and the syntax and handling of named lists is similar:
rio::export(list_of_dfs, "output.xlsx")
You can just as easily write each data frame to its own workbook.
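One way to write separate workbooks, sketched with writexl under the assumption that the list is named; the list contents, the tempdir() location and the name out_files are illustrative only:

```r
library(writexl)

# Example named list; the names become the file names
list_of_dfs <- list(a = iris, b = mtcars)

# Hypothetical output folder for this sketch
out_files <- file.path(tempdir(), paste0(names(list_of_dfs), ".xlsx"))

# One workbook per list element
for (i in seq_along(list_of_dfs)) {
  write_xlsx(list_of_dfs[[i]], path = out_files[i])
}
```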