I am trying to use the xlsx package to put different csv files into one excel workbook with multiple sheets. I found a routine that should work but it is not working for me.
So I have different csv files:
S:/productivity/R/Results/2008.csv
S:/productivity/R/Results/2009.csv
S:/productivity/R/Results/2010.csv
S:/productivity/R/Results/2011.csv
S:/productivity/R/Results/2012.csv
My R code looks like:
# loading the library
library(xlsx)
rm(list = ls())
# getting the path of all csv files
myfiles = system("S:/productivity/R/Results",intern = TRUE)
wb <- createWorkbook()
# going through each csv file
for (item in myfiles) {
# create a sheet in the workbook
sheet <- createSheet(wb, sheetName=strsplit(item,"/")[[1]][5])
# add the data to the new sheet
addDataFrame(read.csv(item), sheet)
}
# saving the workbook
saveWorkbook(wb, "2008_2012.xlsx")
I receive the following error:
myfiles = system('"S:/productivity/R/Results"',intern = TRUE)
Error in system("\"S:/productivity/R/Results\"", intern = TRUE) :
'"S:/productivity/R/Results"' not found
Personally, I use XLConnect for these tasks.
The steps for writing to multiple sheets are:
create a new workbook
create the sheet
output the data to the sheet
save the workbook
--
SAMPLE CODE:
library(data.table) ## for fast fread() function
library(XLConnect)
folder <- "folder/where/CSV_files_are_located"
f.out <- "path/to/file.xlsx"
## load in file
wb <- loadWorkbook(f.out, create=TRUE)
## get all files
pattern.ext <- "\\.csv$"
files <- dir(folder, full=TRUE, pattern=pattern.ext)
## Grab the base file names, you can use them as the sheet names
files.nms <- basename(files)
files.nms <- gsub(pattern.ext, "", files.nms)
## set the names to make them easier to grab
names(files) <- files.nms
for (nm in files.nms) {
## ingest the CSV file
temp_DT <- fread(files[[nm]])
## Create the sheet that the file will be written to
createSheet(wb, name=nm)
## output the csv contents
writeWorksheet(object=wb, data=temp_DT, sheet=nm, header=TRUE, rownames=NULL)
}
saveWorkbook(wb)
If you would like to see your file (note that open is a macOS command):
system(sprintf("open %s", dirname(f.out))) ## For the containing folder
system(sprintf("open %s", f.out)) ## for opening the file with default app, ie excel
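On the error in the question itself: system() runs an operating-system command, and a directory path is not a command, which is why it is reported as not found. A minimal sketch of collecting the CSV paths with base R's list.files() instead; the helper name list_csv is made up for illustration and is not part of any package:

```r
# Hypothetical helper: return the full paths of all .csv files in a folder,
# named by file name without the extension (handy as sheet names).
list_csv <- function(folder) {
  files <- list.files(folder, pattern = "\\.csv$", full.names = TRUE)
  names(files) <- tools::file_path_sans_ext(basename(files))
  files
}

# Example with the folder from the question (assumes it exists on your machine):
# myfiles <- list_csv("S:/productivity/R/Results")
# names(myfiles)
```

The named vector can then be looped over in the same way as files and files.nms in the sample code above.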
Related
So, I have many excel files in a folder, and each file has multiple sheets. If the name of the excel file is 'xyz', I want each sheet of each excel file to contain a 'new_column' such that each row of the new column will contain the excel file name (in this example, 'xyz').
Is there any direct way to do that? I would prefer to directly alter the files in the folder without creating new dataframes within rstudio.
Thanks.
You can use a nested lapply:
library(readxl)
library(writexl)
#Get a vector of xlsx filenames
filenames <- list.files(pattern = '\\.xlsx$', full.names = TRUE)
lapply(filenames, function(x) {
#Read the sheet names
sheetname <- excel_sheets(x)
#For each sheet read the data and create list of dataframe
lapply(sheetname, function(y) {
cbind(read_xlsx(x, y), new_column = tools::file_path_sans_ext(basename(x)))
}) -> res
#Assign names to the list
names(res) <- sheetname
#Write the data back
write_xlsx(res, x)
})
The code below asks the user for a path to .csv files, makes a list of the .csv filenames, then writes the contents of each .csv file to a sheet, in one .xlsx file. Each sheet is named after the original name of the .csv file.
My problem is that some of my .csv filenames are over 31 characters long, which is the limit for sheet names in Excel.
I want to put the sheet name in cell A1, write the contents of the file below that and give the sheets a name (1, 2, 3..., for instance), but I can't wrap my head around how I would accomplish this. Any suggestions would be greatly appreciated.
library(data.table) ## for fast fread() function
library(XLConnect)
library(svDialogs)
# Ask user for path to csv files
folder <- dlgInput(title = "Merge csv", "Enter path to csv files (use '/' instead of '\\'): ", Sys.info()["user"])$res
setwd(folder)
# Create and load Excel file
wb <- loadWorkbook("Output.xlsx", create=TRUE)
# Get list of csv files
pattern.ext <- "\\.csv$"
files <- dir(folder, full=TRUE, pattern=pattern.ext)
# Use file names for sheet names
files.nms <- basename(files)
files.nms <- gsub(pattern.ext, "", files.nms)
# Set the names to make them easier to grab
names(files) <- files.nms
# Iterate over each csv and output to sheet in Excel with its name
for (nm in files.nms) {
# Ingest csv file
temp_DT <- fread(files[[nm]])
# Create the sheet with the name
createSheet(wb, name=nm)
# Output the contents of the csv
writeWorksheet(object=wb, data=temp_DT, sheet=nm, header=TRUE, rownames=NULL)
}
# Remove default sheets
removeSheet(wb, sheet = "Sheet1")
removeSheet(wb, sheet = "Sheet2")
removeSheet(wb, sheet = "Sheet3")
saveWorkbook(wb)
# Check to see if file exists
if (file.exists("Output.xlsx")) {
dlg_message("Your Excel file has been created.")$res
} else {
dlg_message("Error: Your file was not created. Please try again.")$res
}
EDIT - ALTERNATE SOLUTION:
Modified code to initially take substring of the file names...
# Strip the extension first, then use a substring of the file names for sheet names (Excel sheet name limit is 31)
files.nms <- gsub(pattern.ext, "", basename(files))
files.nms <- substr(files.nms,1,31)
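One thing to watch with truncation is that two long file names can collapse to the same 31-character sheet name. A minimal sketch of one way to handle that in base R, assuming a numeric suffix is acceptable; the helper name make_sheet_names is hypothetical, and it does not guard against a suffixed name colliding with another existing name:

```r
# Hypothetical helper: derive Excel-safe sheet names from file paths.
# Strips the extension, truncates to max_len characters and, where truncated
# names clash, replaces the tail with a running suffix (_1, _2, ...).
make_sheet_names <- function(paths, max_len = 31) {
  nms <- tools::file_path_sans_ext(basename(paths))
  nms <- substr(nms, 1, max_len)
  # Flag every name that occurs more than once
  dup <- duplicated(nms) | duplicated(nms, fromLast = TRUE)
  if (any(dup)) {
    # Running index within each group of identical names
    suffix <- paste0("_", ave(seq_along(nms), nms, FUN = seq_along))
    keep <- max_len - nchar(suffix[dup])
    nms[dup] <- paste0(substr(nms[dup], 1, keep), suffix[dup])
  }
  nms
}

# Usage with the files vector from the code above:
# files.nms <- make_sheet_names(files)
```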
FINAL EDIT: Added a welcome message and a function that asks the user for the path and checks whether it exists. If it does not, it keeps asking until the user cancels or enters an existing directory.
library(data.table) # fread() function
library(XLConnect) # Excel and csv files
library(svDialogs) # Dialog boxes
# Welcome message
dlg_message("This program will merge one or more .csv files into one Excel file. When prompted, enter the path
where the .csv files are located. Sheet names in the Excel file will consist of a substring of the original filename.")$res
# Function to get user path
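getPath <- function() {
# Ask for path
path <- dlgInput("Enter path to .csv files: ", Sys.info()["user"])$res
# If the user cancelled, the result is empty, so stop here
if (length(path) == 0) stop("No path entered.")
if (dir.exists(path)) {
# If the directory exists, set it as the working directory
setwd(path)
# Return the path itself (setwd() returns the previous working directory)
path
} else {
# If not, issue an error and recall the getPath function
dlg_message("Error: The path you entered is not a valid directory. Please try again.")$res
getPath()
}
}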
# Call getPath function
folder <- getPath()
# Create and load Excel file
wb <- loadWorkbook("Combined.xlsx", create=TRUE)
# Get list of csv files in directory
pattern.ext <- "\\.csv$"
files <- dir(folder, full=TRUE, pattern=pattern.ext)
# Use substring of file names for sheet names (Excel limit) and remove extension
files.nms <- substr(basename(files),9,39)
files.nms <- gsub(pattern.ext, "", files.nms)
# Set the names
names(files) <- files.nms
# Iterate over each .csv and output to Excel sheet
for (nm in files.nms) {
# Read in .csv files
df <- fread(files[nm])
# Create the sheet and name as substr of file name
createSheet(object = wb, name = nm)
# Writes contents of the .csv To Excel
writeWorksheet(object = wb, data = df, sheet = nm, header = TRUE, rownames = NULL)
# Create a custom anonymous cell style
cs <- createCellStyle(wb)
# Wrap text
setWrapText(object = cs, wrap = TRUE)
# Set column width
setColumnWidth(object = wb, sheet = nm, column = 1:50, width = -1)
}
saveWorkbook(wb)
# Check to see if Excel file exists and is greater than default file size
if (file.exists("Combined.xlsx") && file.size("Combined.xlsx") > 8731) {
dlg_message("Your Excel file has been created.")$res
} else {
dlg_message("Error: Your file may not have been created or completed properly. Please verify and try again if necessary.")$res
}
Have you considered using openxlsx?
It doesn't depend on Java and it's quite full-featured.
# Load package
library(openxlsx)
# Create workbook
wb <- createWorkbook()
# Define data.frames you want to write (adapt to your scenario)
df <- c("mtcars", "iris")
# Loop over the data frame names defined above
for(i in seq_along(df)){
# Create a worksheet and name it after the loop index
addWorksheet(wb, as.character(i))
# In the first row, first column, write the df name
writeData(wb, i, df[i], startRow = 1, startCol = 1)
# Write the data.frame starting at the second row, first column
writeData(wb, i, get(df[i]), startRow = 2, startCol = 1)
}
# Save workbook
saveWorkbook(wb, "eg.xlsx", overwrite = TRUE)
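Applied to the CSV files from the question, the same idea might look like the sketch below. It assumes openxlsx is installed; the function name csv_to_numbered_sheets and its arguments are made up for illustration. Each sheet is named 1, 2, 3, ..., the original file name goes into cell A1, and the data starts in row 2, so the 31-character sheet name limit never comes into play.

```r
library(openxlsx)

# Hypothetical helper: write each CSV to its own numbered sheet,
# with the original file name (without extension) in cell A1.
csv_to_numbered_sheets <- function(csv_files, out_file) {
  wb <- createWorkbook()
  for (i in seq_along(csv_files)) {
    sheet <- as.character(i)
    addWorksheet(wb, sheet)
    # Row 1, column 1: the original file name
    label <- tools::file_path_sans_ext(basename(csv_files[i]))
    writeData(wb, sheet, label, startRow = 1, startCol = 1)
    # Row 2 onwards: the contents of the CSV, including its header
    writeData(wb, sheet, read.csv(csv_files[i]), startRow = 2, startCol = 1)
  }
  saveWorkbook(wb, out_file, overwrite = TRUE)
  invisible(out_file)
}

# Usage (paths are placeholders):
# files <- list.files("path/to/csv_folder", pattern = "\\.csv$", full.names = TRUE)
# csv_to_numbered_sheets(files, "Output.xlsx")
```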
I have multiple Excel files with different names, e.g. USA.xlsx, India.xlsx, etc. Each file has only one sheet. I want to rename the sheet of each file to Sheet 1.
Desired output: USA.xlsx should have Sheet 1, India.xlsx should have Sheet 1, and so on. I know renameWorksheet(wb, sheet, newName) will work for one file, but I have 1800 Excel files.
I think this should get the job done:
library(openxlsx)
# Loop over every .xlsx file in the working directory
for(file in list.files(pattern = '\\.xlsx$')){
wb <- loadWorkbook(file, xlsxFile = NULL)
names(wb)[1] <- 'Sheet1'
saveWorkbook(wb, file, overwrite = TRUE)
}
I have a list of data frames that I would like to output to their own worksheets in Excel. I can easily save a single data frame to its own Excel file, but I'm not sure how to save multiple data frames to their own worksheets within the same Excel file.
library(xlsx)
write.xlsx(sortedTable[1], "c:/mydata.xlsx")
Specify sheet name for each list element.
library(xlsx)
file <- "usarrests.xlsx"
write.xlsx(USArrests, file, sheetName = "Sheet1")
write.xlsx(USArrests, file, sheetName = "Sheet2", append = TRUE)
A second approach, as suggested by @flodel, would be to use addDataFrame. This is more or less an example from the help page of that function.
file <- "usarrests.xlsx"
wb <- createWorkbook()
sheet1 <- createSheet(wb, sheetName = "Sheet1")
sheet2 <- createSheet(wb, sheetName = "Sheet2")
addDataFrame(USArrests, sheet = sheet1)
addDataFrame(USArrests * 2, sheet = sheet2)
saveWorkbook(wb, file = file)
Assuming you have a list of data.frames and a list of sheet names, you can use them pair-wise.
wb <- createWorkbook()
datas <- list(USArrests, USArrests * 2)
sheetnames <- paste0("Sheet", seq_along(datas)) # or names(datas) if provided
sheets <- lapply(sheetnames, createSheet, wb = wb)
void <- Map(addDataFrame, datas, sheets)
saveWorkbook(wb, file = file)
Here's the solution with openxlsx:
## create data;
dataframes <- split(iris, iris$Species)
# create workbook
wb <- createWorkbook()
#Iterate the same way as PavoDive, slightly different (creating an anonymous function inside Map())
Map(function(data, nameofsheet){
addWorksheet(wb, nameofsheet)
writeData(wb, nameofsheet, data)
}, dataframes, names(dataframes))
## Save workbook to excel file
saveWorkbook(wb, file = "file.xlsx", overwrite = TRUE)
However, openxlsx can also do this with its function openxlsx::write.xlsx: you can just pass it your list of data frames and the file path, and openxlsx writes each list element to its own sheet in the xlsx file. The Map() code above is for when you want to format the sheets in a specific way.
The following code, which is from https://rpubs.com/gbganalyst/RdatatoExcelworkbook, works perfectly:
packages <- c("openxlsx", "readxl", "magrittr", "purrr", "ggplot2")
if (!require(install.load)) {
install.packages("install.load")
}
install.load::install_load(packages)
list_of_mydata
write.xlsx(list_of_mydata, "Excel workbook.xlsx")
Let's say your list of data frames is called Lst and the workbook you want to save to is called wb.xlsx. Then you can use:
library(xlsx)
# seq_along() visits every element; length(Lst) alone would only give the last index
for (i in seq_along(Lst)){
write.xlsx(x=Lst[[i]],file="wb.xlsx",sheetName=paste("sheet",i,sep=""),append=TRUE)
}
I think the simplest solution is still missing. Using the writexl package you can write a list of data frames easily:
list_of_dfs <- list(iris, iris)
writexl::write_xlsx(list_of_dfs, "output.xlsx")
Also, if you have a named list then those names become the sheet names:
names(list_of_dfs) <- c("a", "b")
writexl::write_xlsx(list_of_dfs, "output.xlsx")
Alternatively, the rio package allows for more export control and the syntax and handling of named lists is similar:
rio::export(list_of_dfs, "output.xlsx")
You can also easily output them to their own workbooks as well.
I have a set of csv files in different directories, I would like to put them all in one excel file, each table in one excel sheet.
I am using R and xlsx package.
# loading the library
library(xlsx)
rm(list = ls())
# getting the path of all reports (they are in csv format)
restab = system("ls /home/ubuntu/ibasruns/control/*/report",intern = TRUE)
# creating work book
wb <- createWorkbook()
# going through each csv file
for (item in restab)
{
# making each as a sheet
sheet <- createSheet(wb, sheetName=strsplit(item,"/")[[1]][6])
addDataFrame(read.csv(item), sheet)
# saving the workbook
saveWorkbook(wb, "AliceResultSummary.xlsx")
}
# finally writing it.
write.xlsx(wb, "AliceResultSummary.xlsx")
However, in the last line, I am getting the following error,
Error in as.data.frame.default(x[[i]], optional = TRUE) : cannot
coerce class "structure("jobjRef", package = "rJava")" to a data.frame
Is there anything that I am missing?
You're close:
# creating work book
wb <- createWorkbook()
# going through each csv file
for (item in restab)
{
# create a sheet in the workbook
sheet <- createSheet(wb, sheetName=strsplit(item,"/")[[1]][6])
# add the data to the new sheet
addDataFrame(read.csv(item), sheet)
}
# saving the workbook
saveWorkbook(wb, "AliceResultSummary.xlsx")
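write.xlsx is not needed here: it creates a workbook from a single data frame, so passing it the workbook object wb causes the coercion error. Note also that saveWorkbook only needs to be called once, after the loop.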