I am surprised to find that there is no easy way to export multiple data frames to multiple worksheets of an Excel file. I tried the xlsx package, but it seems it can only write to one sheet (overwriting the old sheet); I also tried the WriteXLS package, but it gives me an error every time...
My code structure is like this: by design, for each iteration, the output dataframe (tempTable) and the sheetName (sn) got updated and exported into one tab.
for (i in 2 : ncol(code)){
...
tempTable <- ...
sn <- ...
WriteXLS("tempTable", ExcelFileName = "C:/R_code/../file.xlsx",
SheetNames = sn);
}
I can export to several CSV files, but there has to be an easy way to do that in Excel, right?
You can write to multiple sheets with the xlsx package. You just need to use a different sheetName for each data frame and you need to add append=TRUE:
library(xlsx)
write.xlsx(dataframe1, file="filename.xlsx", sheetName="sheet1", row.names=FALSE)
write.xlsx(dataframe2, file="filename.xlsx", sheetName="sheet2", append=TRUE, row.names=FALSE)
Another option, one that gives you more control over formatting and where the data frame is placed, is to do everything within R/xlsx code and then save the workbook at the end. For example:
wb = createWorkbook()
sheet = createSheet(wb, "Sheet 1")
addDataFrame(dataframe1, sheet=sheet, startColumn=1, row.names=FALSE)
addDataFrame(dataframe2, sheet=sheet, startColumn=10, row.names=FALSE)
sheet = createSheet(wb, "Sheet 2")
addDataFrame(dataframe3, sheet=sheet, startColumn=1, row.names=FALSE)
saveWorkbook(wb, "My_File.xlsx")
In case you might find it useful, here are some interesting helper functions that make it easier to add formatting, metadata, and other features to spreadsheets using xlsx:
http://www.sthda.com/english/wiki/r2excel-read-write-and-format-easily-excel-files-using-r-software
You can also use the openxlsx library to export multiple datasets to multiple sheets in a single workbook. The advantage of openxlsx over xlsx is that openxlsx does not depend on Java libraries.
Write a list of data.frames to individual worksheets using list names as worksheet names.
require(openxlsx)
list_of_datasets <- list("Name of DataSheet1" = dataframe1, "Name of Datasheet2" = dataframe2)
write.xlsx(list_of_datasets, file = "writeXLSX2.xlsx")
There's a new library in town, from rOpenSci: writexl
Portable, light-weight data frame to xlsx exporter based on
libxlsxwriter. No Java or Excel required
I found it better and faster than the above suggestions (working with the dev version):
library(writexl)
sheets <- list("sheet1Name" = sheet1, "sheet2Name" = sheet2) #assume sheet1 and sheet2 are data frames
write_xlsx(sheets, "path/to/location")
Many good answers here, but some of them are a little dated. If you want to add further worksheets to a single file, this is the approach that works for me. For clarity, here is the workflow for openxlsx version 4.0:
# Create a blank workbook
OUT <- createWorkbook()
# Add some sheets to the workbook
addWorksheet(OUT, "Sheet 1 Name")
addWorksheet(OUT, "Sheet 2 Name")
# Write the data to the sheets
writeData(OUT, sheet = "Sheet 1 Name", x = dataframe1)
writeData(OUT, sheet = "Sheet 2 Name", x = dataframe2)
# Export the file
saveWorkbook(OUT, "My output file.xlsx")
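When the number of data frames is not known in advance, the same workflow can be driven by a loop over a named list. This is only a sketch: the list `sheets`, its names, and the temporary output file are illustrative assumptions, not part of the answer above.

```r
library(openxlsx)

# Hypothetical named list of data frames; the names become the sheet names
sheets <- list(cars = head(mtcars), flowers = head(iris))

wb <- createWorkbook()
for (nm in names(sheets)) {
  addWorksheet(wb, nm)                          # one worksheet per list element
  writeData(wb, sheet = nm, x = sheets[[nm]])   # write the matching data frame
}

# Save once, after all sheets have been filled
out <- tempfile(fileext = ".xlsx")
saveWorkbook(wb, out, overwrite = TRUE)

# List the sheet names that ended up in the file
getSheetNames(out)
```

Keep in mind that Excel limits sheet names to 31 characters, so long list names may need shortening first.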
EDIT
I've now trialled a few other answers, and I actually really like @Syed's. It doesn't exploit all the functionality of openxlsx, but if you want a quick-and-easy export method then that's probably the most straightforward.
I'm not familiar with the package WriteXLS; I generally use XLConnect:
library(XLConnect)
##
newWB <- loadWorkbook(
filename="F:/TempDir/tempwb.xlsx",
create=TRUE)
##
for(i in 1:10){
wsName <- paste0("newsheet",i)
createSheet(
newWB,
name=wsName)
##
writeWorksheet(
newWB,
data=data.frame(
X=1:10,
Dataframe=paste0("DF ",i)),
sheet=wsName,
header=TRUE,
rownames=NULL)
}
saveWorkbook(newWB)
This can certainly be vectorized, as @joran noted above, but just for the sake of generating dynamic sheet names quickly, I used a for loop to demonstrate.
I used the create=TRUE argument in loadWorkbook since I was creating a new .xlsx file, but if your file already exists then you don't have to specify this, as the default value is FALSE.
Here are a few screenshots of the created workbook:
If the data is small, R has many packages and functions that will do the job.
write.xlsx, write.xlsx2 and XLConnect also work, but they are sometimes slow compared to openxlsx.
So, if you are dealing with large data sets and run into Java errors, I would suggest having a look at openxlsx, which in my tests cut the run time to about 1/12th.
I've tested all of them and was really impressed with the performance of openxlsx.
Here are the steps for writing multiple datasets into multiple sheets.
install.packages("openxlsx")
library("openxlsx")
start.time <- Sys.time()
# Creating large data frame
x <- as.data.frame(matrix(1:4000000,200000,20))
y <- as.data.frame(matrix(1:4000000,200000,20))
z <- as.data.frame(matrix(1:4000000,200000,20))
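# Creating a workbook (the file name is supplied later, in saveWorkbook();
# the first argument of createWorkbook() is the creator, not a file name)
wb <- createWorkbook()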
Sys.setenv("R_ZIPCMD" = "C:/Rtools/bin/zip.exe") ## path to zip.exe
The R_ZIPCMD path must point to the zip.exe utility that ships with Rtools.
Note: If Rtools is not installed on your system, install it first. Here is the link for reference (choose the appropriate version):
https://cran.r-project.org/bin/windows/Rtools/
Check the options shown in the screenshot linked below (select all the check boxes during installation):
https://cloud.githubusercontent.com/assets/7400673/12230758/99fb2202-b8a6-11e5-82e6-836159440831.png
# Adding worksheets: the parameters for addWorksheet are 1. workbook object 2. sheet name
addWorksheet(wb, "Sheet 1")
addWorksheet(wb, "Sheet 2")
addWorksheet(wb, "Sheet 3")
# Writing data into the respective sheets: the parameters for writeData are 1. workbook object 2. sheet index / sheet name 3. data frame
writeData(wb, 1, x)
# If you would like the sheet to have filters for ease of access, pass withFilter = TRUE to writeData.
writeData(wb, 2, x = y, withFilter = TRUE)
## Similarly, writeDataTable is another way of presenting your data, with table formatting:
writeDataTable(wb, 3, z)
saveWorkbook(wb, file = "Example.xlsx", overwrite = TRUE)
end.time <- Sys.time()
time.taken <- end.time - start.time
time.taken
The openxlsx package is really good for reading and writing large data from and to Excel files, and it has lots of options for custom formatting within Excel.
A nice side effect is that we don't have to worry about Java heap memory here.
I had this exact problem and I solved it this way:
library(openxlsx) # loads library and doesn't require Java installed
your_df_list <- c("df1", "df2", ..., "dfn")
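# Calling write.xlsx() once per data frame would overwrite the file each time,
# leaving only the last sheet, so gather the data frames by name and write once
write.xlsx(x = mget(your_df_list),
file = "your_spreadsheet_name.xlsx")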
That way you won't have to create a very long list manually if you have tons of dataframes to write to Excel.
I regularly use the package rio for exports of all kinds. Using rio, you can pass in a list, naming each tab and specifying the dataset. rio wraps other import/export packages and, for export to Excel, uses openxlsx.
library(rio)
filename <- "C:/R_code/../file.xlsx"
export(list(sn1 = tempTable1, sn2 = tempTable2, sn3 = tempTable3), filename)
A tidy way of taking one data frame and writing one sheet per group:
library(tidyverse)
library(xlsx)
mtcars %>%
mutate(cyl1 = cyl) %>%
group_by(cyl1) %>%
nest() %>%
ungroup() %>%
mutate(rn = row_number(),
app = rn != 1,
q = pmap(list(rn,data,app),~write.xlsx(..2,"test1.xlsx",as.character(..1),append = ..3)))
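For comparison, the same one-sheet-per-group idea can be sketched with base R's `split()` and the list interface of openxlsx shown in an earlier answer. The `cyl_` prefix and the temporary output file are assumptions made for this illustration only.

```r
library(openxlsx)

# Split one data frame into a named list, one element per value of cyl
sheets <- split(mtcars, mtcars$cyl)

# Give the sheets descriptive names (hypothetical prefix)
names(sheets) <- paste0("cyl_", names(sheets))

# A named list is written as one worksheet per element
out <- tempfile(fileext = ".xlsx")
write.xlsx(sheets, file = out)
```

Note that the row names of `mtcars` (the car names) are not written unless row names are requested explicitly.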
For me, WriteXLS provides the functionality you are looking for. Since you did not specify which errors it returns, I show you an example:
Example
library(WriteXLS)
x <- list(sheet_a = data.frame(a=letters), sheet_b = data.frame(b = LETTERS))
WriteXLS(x, "test.xlsx", names(x))
Explanation
If x is:
a list of data frames, each one is written to a single sheet
a character vector (of R objects), each object is written to a single sheet
something else, then see also what the help states:
More on usage
?WriteXLS
shows:
`x`: A character vector or factor containing the names of one or
more R data frames; A character vector or factor containing
the name of a single list which contains one or more R data
frames; a single list object of one or more data frames; a
single data frame object.
Solution
For your example, you would need to collect all data.frames in a list during the loop, and use WriteXLS after the loop has finished.
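A minimal sketch of that collect-then-write pattern follows. The data frame `code`, the way `tempTable` and `sn` are derived from it, and the output path are all hypothetical stand-ins for the parts elided in the question; the final call mirrors the `WriteXLS(x, file, names(x))` form used in the example above and is skipped if WriteXLS is not installed (it also needs a working Perl).

```r
# Hypothetical input standing in for the question's `code` object
code <- data.frame(id = 1:3, a = c(10, 20, 30), b = c(5, 6, 7))

# Collect one data frame per iteration in a named list
tables <- list()
for (i in 2:ncol(code)) {
  tempTable <- data.frame(id = code$id, value = code[[i]])  # placeholder computation
  sn <- paste0("sheet_", names(code)[i])                    # placeholder sheet name
  tables[[sn]] <- tempTable
}

# Write all sheets in a single call after the loop has finished
if (requireNamespace("WriteXLS", quietly = TRUE)) {
  WriteXLS::WriteXLS(tables,
                     ExcelFileName = file.path(tempdir(), "file.xlsx"),
                     SheetNames = names(tables))
}
```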
Session info
R 3.2.4
WriteXLS 4.0.0
I do it this way with openxlsx, using the following function:
mywritexlsx<-function(fname="temp.xlsx",sheetname="Sheet1",data,
startCol = 1, startRow = 1, colNames = TRUE, rowNames = FALSE)
{
if(! file.exists(fname))
wb = createWorkbook()
else
wb <- loadWorkbook(file =fname)
sheet = addWorksheet(wb, sheetname)
writeData(wb,sheet,data,startCol = startCol, startRow = startRow,
colNames = colNames, rowNames = rowNames)
saveWorkbook(wb, fname,overwrite = TRUE)
}
I do this all the time, all I do is
WriteXLS::WriteXLS(
all.dataframes,
ExcelFileName = xl.filename,
AdjWidth = T,
AutoFilter = T,
FreezeRow = 1,
FreezeCol = 2,
BoldHeaderRow = T,
verbose = F,
na = '0'
)
and all those data frames come from here
all.dataframes <- character()
for (obj.iter in all.objects) {
obj.name <- obj.iter
obj.iter <- get(obj.iter)
if (is.data.frame(obj.iter)) {
all.dataframes <- c(all.dataframes, obj.name)
}
}
Obviously, an sapply routine would be better here.
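The sapply-style version alluded to here might look like the sketch below. It uses `vapply()` for a type-safe result, and the environment `env` with its three objects is a hypothetical setup so the example is self-contained; in the answer, `all.objects` would come from the user's own workspace.

```r
# Hypothetical workspace with two data frames and one other object
env <- new.env()
assign("df1", data.frame(a = 1:3), envir = env)
assign("df2", data.frame(b = letters[1:2]), envir = env)
assign("not_a_df", 42, envir = env)

all.objects <- ls(env)

# TRUE for every name that refers to a data frame
is.df <- vapply(all.objects,
                function(nm) is.data.frame(get(nm, envir = env)),
                logical(1))

# Keep only the names of data frames
all.dataframes <- all.objects[is.df]
```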
For an lapply-friendly version:
library(data.table)
library(xlsx)
path2txtlist <- your.list.of.txt.files
wb <- createWorkbook()
lapply(seq_along(path2txtlist), function (j) {
sheet <- createSheet(wb, paste("sheetname", j))
addDataFrame(fread(path2txtlist[j]), sheet=sheet, startColumn=1, row.names=FALSE)
})
saveWorkbook(wb, "My_File.xlsx")
I am writing a script to enter a formula into an openxlsx excel sheet using writeFormula. For the function, I need to specify a formula vector of length equal to number of cells.
Here is what I am trying to do:
for(i in 2:(nrow(data)+1)){formula_dep<-append(formula_dep, paste0("IFERROR(SEARCH(\"pack\",H",i,"))"))}
writeFormula(wb=file, sheet="data", x=formula_dep, startCol=9, startRow=2)
In the output, the escape characters are probably getting printed into the excel sheet and thus it is getting corrupted (I have to repair to open the file, where the column has nothing in it).
In R, the output is (as usual):
"IFERROR(SEARCH(\"pack\",H2))"
While the escape characters are not a problem in many other tasks, in this one, I cannot make this work. I cannot use single quote as for some unknown reason Excel does not allow that in FIND or SEARCH functions (regex issues maybe). Please help with the solution here.
Note: I cannot just compute the formula's result in the data frame itself (using R functions), as it is supposed to work on user inputs in the Excel file itself.
I am open to solutions both from Excel side (changing the formula while doing the same thing), or from R side.
The backslash is not actually in the string; print() only displays the double quote in escaped form. Compare with cat():
cat("IFERROR(SEARCH(\"pack\",H2))")
# IFERROR(SEARCH("pack",H2))
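A small base R check can confirm this. It is only an illustration using the string from the question; the variable names are made up.

```r
# The string from the question, typed with escaped double quotes
s <- "IFERROR(SEARCH(\"pack\",H2))"

print(s)       # shows the R source form, including the \" escapes
writeLines(s)  # shows the characters actually stored in the string

# Search for a literal backslash in the stored string
has_backslash <- grepl("\\", s, fixed = TRUE)

# Each double quote counts as a single character
n_chars <- nchar(s)
```

Here `has_backslash` is `FALSE`, so whatever corrupts the workbook, it is not a stray escape character.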
Here is a working example:
library(openxlsx)
wb <- createWorkbook()
addWorksheet(wb, "Sheet 1")
df <- data.frame(
a = letters[1:3]
)
writeData(wb, sheet = 1, x = df)
f <- paste0("FIND(\"b\",A", seq(nrow(df)) + 1L, ")")
f[ 1 ]
#[1] "FIND(\"b\",A2)"
cat(f[ 1 ])
# FIND("b",A2)
writeFormula(wb, sheet ="Sheet 1", x = f, startCol = 2, startRow = 2)
saveWorkbook(wb, "writeFormulaExample.xlsx", overwrite = TRUE)
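The formula vector from the question can be built the same way, without a loop, because `paste0()` is vectorized. One caveat worth hedging: Excel's IFERROR takes two arguments (the value and a fallback), so this sketch assumes a fallback of 0, which is not in the original question. The row range is also hypothetical.

```r
# Hypothetical target rows in the sheet (data starts in row 2)
rows <- 2:5

# One formula per row; the trailing ",0" is an assumed fallback value
formula_dep <- paste0("IFERROR(SEARCH(\"pack\",H", rows, "),0)")

# Inspect the first formula as Excel would receive it
writeLines(formula_dep[1])
```

The resulting character vector can then be passed as `x` to `writeFormula()` as in the example above.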
I want to read an xlsx file and convert the data in the file into one long text string. I want to format this string in an intelligent manner, such that each row is contained in parentheses “()” and the data is kept as a comma-separated string. So, for example, if the xlsx file looked like this:
one,two,three
x,x,x
y,y,y
z,z,z
after formatting the string would look like
header(one,two,three)row(x,x,x)row(y,y,y)row(z,z,z)
How would you accomplish this task with R?
My first instinct was something like this, but I can’t figure it out:
library(xlsx)
sheet1 <- read.xlsx("run_info.xlsx",1)
paste("(",sheet1[1,],")")
This works for me:
DF <- read.xlsx("run_info.xlsx",1)
paste0("header(", paste(names(DF), collapse = ","), ")",
paste(paste0("row(", apply(DF, 1, paste, collapse = ","), ")"),
collapse = ""))
# [1] "header(one,two,three)row(x,x,x)row(y,y,y)row(z,z,z)"
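To try the idea without an Excel file, here is a sketch on an in-memory data frame that stands in for the result of `read.xlsx`. It swaps `apply()` for `do.call(paste, ...)`, which pastes the columns element-wise and avoids the padding that `apply()` can introduce when numeric columns are converted to a character matrix.

```r
# Stand-in for the data read from run_info.xlsx
DF <- data.frame(one   = c("x", "y", "z"),
                 two   = c("x", "y", "z"),
                 three = c("x", "y", "z"))

# One comma-separated string per row
rows <- do.call(paste, c(DF, sep = ","))

# Wrap the header and each row, then join everything into one string
result <- paste0("header(", paste(names(DF), collapse = ","), ")",
                 paste0("row(", rows, ")", collapse = ""))
```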
I want to write a data frame from R into a CSV file. Consider the following toy example
df <- data.frame(ID = c(1,2,3), X = c("a", "b", "c"), Y = c(1,2,NA))
df[which(is.na(df[,"Y"])), 1]
write.table(t(df), file = "path to CSV/test.csv", col.names=F, sep=",", quote=F)
The output in test.csv looks as follows
ID,1,2,3
X,a,b,c
Y, 1, 2,NA
At first glance, this is exactly as I need it, BUT what cannot be seen in the code insertion above is that after the NA in the last line, there is another linebreak. When I pass test.csv to a Javascript chart on a website, however, the trailing linebreak causes trouble.
Is there a way to avoid this final linebreak within R?
This is a little convoluted, but obtains your desired result:
zz <- textConnection("foo", "w")
write.table(t(df), file = zz, col.names=F, sep=",", quote=F)
close(zz)
foo
# [1] "ID,1,2,3" "X,a,b,c" "Y, 1, 2,NA"
cat(paste(foo, collapse='\n'), file = 'test.csv', sep='')
You should end up with a file that has newline character after only the first two data rows.
You can use a command-line utility such as perl to remove the trailing newline from a file:
perl -pi -e 'chomp if eof' test.csv
Or, you could begin by writing a single row then using append.
An alternative in a similar vein to the answer by @Thomas, but with slightly less typing. Send the output from write.csv to a character vector (capture.output). Concatenate the elements (paste) and separate them with line breaks (collapse = "\n"). Write to file with cat.
x <- capture.output(write.csv(df, row.names = FALSE, quote = FALSE))
cat(paste(x, collapse = "\n"), file = "df.csv")
You may also use format_csv from package readr to create a character vector with line breaks (\n). Remove the last end-of-line \n with substr. Write to file with cat.
library(readr)
x <- format_csv(df)
cat(substr(x, 1, nchar(x) - 1), file = "df.csv")
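Whichever variant is used, it is easy to verify that the file really has no final line break. The sketch below applies the capture-and-collapse approach to the transposed data frame from the question and then reads the file back as one string; the temporary file is an assumption made for the example.

```r
df <- data.frame(ID = c(1, 2, 3), X = c("a", "b", "c"), Y = c(1, 2, NA))

# Capture the rows that write.table would print, one string per line
lines <- capture.output(
  write.table(t(df), col.names = FALSE, sep = ",", quote = FALSE)
)

# Join the lines with "\n" between them, so nothing follows the last line
out <- tempfile(fileext = ".csv")
cat(paste(lines, collapse = "\n"), file = out)

# Read the whole file back as a single string and check its last character
txt <- readChar(out, file.info(out)$size)
ends_with_newline <- endsWith(txt, "\n")
```

If `ends_with_newline` is `FALSE`, the file can be handed to the chart without the trailing line break.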
I want to read an R file or script, modify the name of the external data file being read and export the modified R code into a new R file or script. Other than the name of the data file being read (and the name of the new R file) I want the two R scripts to be identical.
I can come close, except that I cannot figure out how to retain the blank lines I use for readability and error reduction.
Here is the original R file being read. Note that some of the code in this file is non-sensical, but to me that is irrelevant. This code does not need to run.
# apple.pie.all.purpose.flour.arsc.Jun23.2013.r
library(my.library)
aa <- 10 # aa
bb <- c(1:7) # bb
my.data = convert.txt("../applepieallpurposeflour.txt",
group.df = data.frame(recipe =
c("recipe1", "recipe2", "recipe3", "recipe4", "recipe5")),
covariates = c(paste( "temp", seq_along(1:aa), sep="")))
ingredient <- c('all purpose flour')
function(make.pie){ make a pie }
Here is R code I use to read the above file, modify it and export the result. This R code runs and is the only code that needs to run to achieve the desired result (except that I cannot get the format of the new R script to match that of the original R script exactly, i.e., blank lines present in the original R script are not present in the new R script):
setwd('c:/users/mmiller21/simple r programs/')
# define new fruit
new.fruit <- 'peach'
# read flour file for original fruit
flour <- readLines('apple.pie.all.purpose.flour.arsc.Jun23.2013.r')
# create new file name
output.flour <- paste(new.fruit, ".pie.all.purpose.flour.arsc.Jun23.2013.r", sep="")
# add new file name
flour.a <- gsub("# apple.pie.all.purpose.flour.arsc.Jun23.2013.r",
paste("# ", output.flour, sep=""), flour)
# add line to read new data file
cat(file = output.flour,
gsub( "my.data = convert.txt\\(\"../applepieallpurposeflour.txt",
paste("my.data = convert.txt\\(\"../", new.fruit, "pieallpurposeflour.txt",
sep=""), flour.a),
sep=c("","\n"), fill = TRUE
)
Here is the resulting new R script:
# peach.pie.all.purpose.flour.arsc.Jun23.2013.r
library(my.library)
aa <- 10 # aa
bb <- c(1:7) # bb
my.data = convert.txt("../peachpieallpurposeflour.txt",
group.df = data.frame(recipe =
c("recipe1", "recipe2", "recipe3", "recipe4", "recipe5")),
covariates = c(paste( "temp", seq_along(1:aa), sep="")))
ingredient <- c('all purpose flour')
function(make.pie){ make a pie }
There is one blank line in the newly-created R file, but how can I insert all of the blank lines present in the original R script? Thank you for any advice.
EDIT: I cannot seem to duplicate the blank lines here on StackOverflow. They seem to be deleted automatically. StackOverflow is even deleting the indentation I am using and I cannot seem to replace it. Sorry about this. Automatic deletion of blank lines and indentation is problematic when the issue at hand is specifically about formatting. I cannot seem to fix the post to display the R code as formatted in my script. However, the code does display correctly when I am actively editing the post.
EDIT: June 27, 2013: The deletion of empty rows and indentation in the code for the original R file and in the code for the middle R file appears to be associated with my laptop rather than with StackOverflow. When I view this post and my answers on my office desktop the format is correct. When I view this post and my answers with my laptop the empty rows and indentation are gone. Perhaps my laptop monitor is malfunctioning. Sorry about assuming initially that the problem was with StackOverflow.
Here is a function that will create a new R file for every combination of two variables. Sorry the formatting of the code below is not better. The code does run and does work as intended (provided the name of the original R file ends in ".arsc.Jun26.2013.r" instead of in ".arsc.Jun23.2013.r" used in the original post):
setwd('c:/users/mmiller21/simple r programs/')
# define fruits of interest
fruits <- c('apple', 'pumpkin', 'pecan')
# define ingredients of interest
ingredients <- c('all.purpose.flour', 'sugar', 'ground.cinnamon')
# define every combination of fruit and ingredient
fruits.and.ingredients <- expand.grid(fruits, ingredients)
old.fruit <- as.character(rep('apple', nrow(fruits.and.ingredients)))
old.ingredient <- as.character(rep('all.purpose.flour', nrow(fruits.and.ingredients)))
fruits.and.ingredients2 <- cbind(old.fruit , as.character(fruits.and.ingredients[,1]),
old.ingredient, as.character(fruits.and.ingredients[,2]))
colnames(fruits.and.ingredients2) <- c('old.fruit', 'new.fruit', 'old.ingredient', 'new.ingredient')
# begin function
make.pie <- function(old.fruit, new.fruit, old.ingredient, new.ingredient) {
new.ingredient2 <- gsub('\\.', '', new.ingredient)
old.ingredient2 <- gsub('\\.', '', old.ingredient)
new.ingredient3 <- gsub('\\.', ' ', new.ingredient)
old.ingredient3 <- gsub('\\.', ' ', old.ingredient)
# file name
old.file <- paste(old.fruit, ".pie.", old.ingredient, ".arsc.Jun26.2013.r", sep="")
new.file <- paste(new.fruit, ".pie.", new.ingredient, ".arsc.Jun26.2013.r", sep="")
# read original fruit and original ingredient
flour <- readLines(old.file)
# add new file name
flour.a <- gsub(paste("# ", old.file, sep=""),
paste("# ", new.file, sep=""), flour)
# read new data file
old.data.file <- print(paste("my.data = convert.txt(\"../", old.fruit, "pie", old.ingredient2, ".txt\",", sep=""), quote=FALSE)
new.data.file <- print(paste("my.data = convert.txt(\"../", new.fruit, "pie", new.ingredient2, ".txt\",", sep=""), quote=FALSE)
flour.b <- ifelse(flour.a == old.data.file, new.data.file, flour.a)
flour.c <- ifelse(flour.b == paste('ingredient <- c(\'', old.ingredient3, '\')', sep=""),
paste('ingredient <- c(\'', new.ingredient3, '\')', sep=""), flour.b)
cat(flour.c, file = new.file, sep=c("\n"))
}
apply(fruits.and.ingredients2, 1, function(x) make.pie(x[1], x[2], x[3], x[4]))
Here is one solution that reproduces the original R script (except for the two desired changes) while also preserving the formatting of that original R script in the new R script.
setwd('c:/users/mmiller21/simple r programs/')
new.fruit <- 'peach'
flour <- readLines('apple.pie.all.purpose.flour.arsc.Jun23.2013.r')
output.flour <- paste(new.fruit, ".pie.all.purpose.flour.arsc.Jun23.2013.r", sep="")
flour.a <- gsub("# apple.pie.all.purpose.flour.arsc.Jun23.2013.r",
paste("# ", output.flour, sep=""), flour)
flour.b <- gsub( "my.data = convert.txt\\(\"../applepieallpurposeflour.txt",
paste("my.data = convert.txt\\(\"../", new.fruit, "pieallpurposeflour.txt", sep=""), flour.a)
for(i in 1:length(flour.b)) {
if(i == 1) cat(flour.b[i], file = output.flour, sep=c("\n"), fill=TRUE )
if(i > 1) cat(flour.b[i], file = output.flour, sep=c("\n"), fill=TRUE, append = TRUE)
}
Again, I apologize for my inability to format the above R code in a readable way. I have never encountered this problem on StackOverflow and do not know the solution. Regardless, the above R script solves the problem I described in the original post.
To see the formatting of the original R script you will have to click the edit button under the original post.
EDIT: June 25, 2013
I do not know what I was doing differently yesterday, but today I found that the following simple cat statement, in place of the for-loop immediately above, creates the new R script while preserving the formatting of the original R script.
cat(flour.b, file = output.flour, sep=c("\n"))
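The reason this works is that `readLines()` returns blank lines as empty strings, and writing the vector back with one line break per element restores them. A self-contained sketch with temporary files follows; the tiny script and the replacement are made up for the demonstration, and `writeLines()` is used as an equivalent of `cat(..., sep = "\n")` that also ends the last line with a line break.

```r
# Create a small stand-in for the original script, including blank lines
src <- tempfile(fileext = ".r")
writeLines(c("# apple script", "", "aa <- 10", "", "x <- read('apple.txt')"), src)

# Read the script; blank lines come back as "" elements
code <- readLines(src)

# Replace the fruit name everywhere, treating the pattern as a literal string
code <- gsub("apple", "peach", code, fixed = TRUE)

# Write the modified script; each element, blank or not, becomes one line
dst <- tempfile(fileext = ".r")
writeLines(code, dst)

# Read the new script back to compare its layout with the original
result <- readLines(dst)
```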