Avoid escape characters in paste0 when printing double quotation marks


I am writing a script to enter a formula into an Excel sheet with openxlsx, using writeFormula. The function needs a vector of formulas whose length equals the number of cells to fill.
Here is what I am trying to do:
for(i in 2:(nrow(data)+1)){formula_dep<-append(formula_dep, paste0("IFERROR(SEARCH(\"pack\",H",i,"))"))}
writeFormula(wb=file, sheet="data", x=formula_dep, startCol=9, startRow=2)
In the output, the escape characters are probably being written into the Excel sheet, which corrupts it (I have to repair the file to open it, and the column then has nothing in it).
In R, the output is (as usual):
"IFERROR(SEARCH(\"pack\",H2))"
While the escape characters are not a problem in many other tasks, I cannot make this one work. I cannot use single quotes because, for some unknown reason, Excel does not allow them in the FIND or SEARCH functions (regex issues, maybe). Please help with a solution.
Note: I cannot just compute the result in the data frame itself (using R), as the formula is supposed to work on user input in the Excel file.
I am open to solutions from either the Excel side (changing the formula while doing the same thing) or the R side.

"\" is not there, it just shows that print is escaping the ", see:
cat("IFERROR(SEARCH(\"pack\",H2))")
# IFERROR(SEARCH("pack",H2))
Here is a working example:
library(openxlsx)
wb <- createWorkbook()
addWorksheet(wb, "Sheet 1")
df <- data.frame(
a = letters[1:3]
)
writeData(wb, sheet = 1, x = df)
f <- paste0("FIND(\"b\",A", seq(nrow(df)) + 1L, ")")
f[ 1 ]
#[1] "FIND(\"b\",A2)"
cat(f[ 1 ])
# FIND("b",A2)
writeFormula(wb, sheet ="Sheet 1", x = f, startCol = 2, startRow = 2)
saveWorkbook(wb, "writeFormulaExample.xlsx", overwrite = TRUE)
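As a complement to the example above, here is a minimal base R sketch showing that the backslash never becomes part of the string, and that the same formulas can be written with no escapes at all by using single quotes as the R string delimiter. The row count n_rows is a hypothetical stand-in for nrow(data) from the question; the formula text is copied from the question unchanged.

```r
# Hypothetical row count standing in for nrow(data) in the question.
n_rows <- 3

# Single-quoted R strings may contain double quotes without escaping.
# sprintf() is vectorised over the row numbers 2..(n_rows + 1).
formula_dep <- sprintf('IFERROR(SEARCH("pack",H%d))', seq_len(n_rows) + 1L)

# The escaped spelling produces exactly the same characters.
escaped <- paste0("IFERROR(SEARCH(\"pack\",H", seq_len(n_rows) + 1L, "))")
identical(formula_dep, escaped)

# writeLines() shows the raw contents, without print()'s escaping.
writeLines(formula_dep)
```

If the saved file still needs repairing, the formula itself may be worth checking: as far as I know, Excel's IFERROR expects a second argument (the value to return on error), so something like IFERROR(SEARCH("pack",H2),0) might behave better. That is a guess about the cause, not something confirmed in the question.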

Related

Select specific data frames from global environment [duplicate]

I am surprised to find that there is no easy way to export multiple data frames to multiple worksheets of an Excel file. I tried the xlsx package, but it seems it can only write to one sheet (overwriting the old sheet); I also tried the WriteXLS package, but it gives me an error every time...
My code structure is like this: by design, for each iteration, the output dataframe (tempTable) and the sheetName (sn) got updated and exported into one tab.
for (i in 2 : ncol(code)){
...
tempTable <- ...
sn <- ...
WriteXLS("tempTable", ExcelFileName = "C:/R_code/../file.xlsx",
SheetNames = sn);
}
I can export to several csv files, but there has to be an easy way to do that in Excel, right?

You can write to multiple sheets with the xlsx package. You just need to use a different sheetName for each data frame and you need to add append=TRUE:
library(xlsx)
write.xlsx(dataframe1, file="filename.xlsx", sheetName="sheet1", row.names=FALSE)
write.xlsx(dataframe2, file="filename.xlsx", sheetName="sheet2", append=TRUE, row.names=FALSE)
Another option, one that gives you more control over formatting and where the data frame is placed, is to do everything within R/xlsx code and then save the workbook at the end. For example:
wb = createWorkbook()
sheet = createSheet(wb, "Sheet 1")
addDataFrame(dataframe1, sheet=sheet, startColumn=1, row.names=FALSE)
addDataFrame(dataframe2, sheet=sheet, startColumn=10, row.names=FALSE)
sheet = createSheet(wb, "Sheet 2")
addDataFrame(dataframe3, sheet=sheet, startColumn=1, row.names=FALSE)
saveWorkbook(wb, "My_File.xlsx")
In case you might find it useful, here are some interesting helper functions that make it easier to add formatting, metadata, and other features to spreadsheets using xlsx:
http://www.sthda.com/english/wiki/r2excel-read-write-and-format-easily-excel-files-using-r-software

You can also use the openxlsx library to export multiple datasets to multiple sheets in a single workbook. The advantage of openxlsx over xlsx is that openxlsx removes the dependency on Java libraries.
Write a list of data.frames to individual worksheets using list names as worksheet names.
require(openxlsx)
list_of_datasets <- list("Name of DataSheet1" = dataframe1, "Name of Datasheet2" = dataframe2)
write.xlsx(list_of_datasets, file = "writeXLSX2.xlsx")

There's a new library in town, from rOpenSci: writexl
Portable, light-weight data frame to xlsx exporter based on
libxlsxwriter. No Java or Excel required
I found it better and faster than the above suggestions (working with the dev version):
library(writexl)
sheets <- list("sheet1Name" = sheet1, "sheet2Name" = sheet2) #assume sheet1 and sheet2 are data frames
write_xlsx(sheets, "path/to/location")

Many good answers here, but some of them are a little dated. If you want to add further worksheets to a single file then this is the approach I find works for me. For clarity, here is the workflow for openxlsx version 4.0
# Create a blank workbook
OUT <- createWorkbook()
# Add some sheets to the workbook
addWorksheet(OUT, "Sheet 1 Name")
addWorksheet(OUT, "Sheet 2 Name")
# Write the data to the sheets
writeData(OUT, sheet = "Sheet 1 Name", x = dataframe1)
writeData(OUT, sheet = "Sheet 2 Name", x = dataframe2)
# Export the file
saveWorkbook(OUT, "My output file.xlsx")
EDIT
I've now trialled a few other answers, and I actually really like @Syed's. It doesn't exploit all the functionality of openxlsx but if you want a quick-and-easy export method then that's probably the most straightforward.

I'm not familiar with the package WriteXLS; I generally use XLConnect:
library(XLConnect)
##
newWB <- loadWorkbook(
filename="F:/TempDir/tempwb.xlsx",
create=TRUE)
##
for(i in 1:10){
wsName <- paste0("newsheet",i)
createSheet(
newWB,
name=wsName)
##
writeWorksheet(
newWB,
data=data.frame(
X=1:10,
Dataframe=paste0("DF ",i)),
sheet=wsName,
header=TRUE,
rownames=NULL)
}
saveWorkbook(newWB)
This can certainly be vectorized, as @joran noted above, but just for the sake of generating dynamic sheet names quickly, I used a for loop to demonstrate.
I used the create=TRUE argument in loadWorkbook since I was creating a new .xlsx file, but if your file already exists then you don't have to specify this, as the default value is FALSE.

If the data is small, R has many packages and functions that will do the job.
write.xlsx, write.xlsx2 and XLConnect also work, but they are sometimes slow compared to openxlsx.
So if you are dealing with large data sets and run into Java errors, I would suggest having a look at openxlsx, which is really awesome and reduces the time to 1/12th.
I've tested them all and was really impressed with the performance of openxlsx.
Here are the steps for writing multiple datasets into multiple sheets.
install.packages("openxlsx")
library("openxlsx")
start.time <- Sys.time()
# Creating large data frame
x <- as.data.frame(matrix(1:4000000,200000,20))
y <- as.data.frame(matrix(1:4000000,200000,20))
z <- as.data.frame(matrix(1:4000000,200000,20))
# Creating a workbook
wb <- createWorkbook("Example.xlsx")
Sys.setenv("R_ZIPCMD" = "C:/Rtools/bin/zip.exe") ## path to zip.exe
The path in Sys.setenv("R_ZIPCMD" = "C:/Rtools/bin/zip.exe") has to be fixed, as it points to the zip utility that ships with Rtools.
Note: If Rtools is not installed on your system, please install it first. Here is the link (choose the appropriate version):
https://cran.r-project.org/bin/windows/Rtools/
Check the options shown in the link below (select all the check boxes during installation):
https://cloud.githubusercontent.com/assets/7400673/12230758/99fb2202-b8a6-11e5-82e6-836159440831.png
# Adding worksheets: parameters for addWorksheet are 1. workbook object 2. sheet name
addWorksheet(wb, "Sheet 1")
addWorksheet(wb, "Sheet 2")
addWorksheet(wb, "Sheet 3")
# Writing data into the respective sheets: parameters for writeData are 1. workbook object 2. sheet index / sheet name 3. data frame
writeData(wb, 1, x)
# In case you would like the sheet written with a filter for ease of access, pass withFilter = TRUE to writeData.
writeData(wb, 2, x = y, withFilter = TRUE)
## Similarly writeDataTable is another way for representing your data with table formatting:
writeDataTable(wb, 3, z)
saveWorkbook(wb, file = "Example.xlsx", overwrite = TRUE)
end.time <- Sys.time()
time.taken <- end.time - start.time
time.taken
The openxlsx package is really good for reading and writing huge data from/to Excel files and has lots of options for custom formatting within Excel.
The nice thing is that we don't have to worry about Java heap memory here.

I had this exact problem and I solved it this way:
library(openxlsx) # loads library and doesn't require Java installed
your_df_list <- c("df1", "df2", ..., "dfn")
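# mget() collects the named data frames into a named list; write.xlsx()
# then writes one sheet per list element. (Calling write.xlsx() once per
# data frame in a loop would overwrite the file on every iteration.)
write.xlsx(x = mget(your_df_list),
file = "your_spreadsheet_name.xlsx")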
That way you won't have to create a very long list manually if you have tons of dataframes to write to Excel.

I regularly use the rio package for exporting of all kinds. Using rio, you can pass in a list, naming each tab and specifying the dataset. rio wraps other import/export packages and, for export to Excel, uses openxlsx.
library(rio)
filename <- "C:/R_code/../file.xlsx"
export(list(sn1 = tempTable1, sn2 = tempTable2, sn3 = tempTable3), filename)

A tidy way of taking one data frame and writing one sheet per group:
library(tidyverse)
library(xlsx)
mtcars %>%
mutate(cyl1 = cyl) %>%
group_by(cyl1) %>%
nest() %>%
ungroup() %>%
mutate(rn = row_number(),
app = rn != 1,
q = pmap(list(rn,data,app),~write.xlsx(..2,"test1.xlsx",as.character(..1),append = ..3)))

For me, WriteXLS provides the functionality you are looking for. Since you did not specify which errors it returns, I show you an example:
Example
library(WriteXLS)
x <- list(sheet_a = data.frame(a=letters), sheet_b = data.frame(b = LETTERS))
WriteXLS(x, "test.xlsx", names(x))
Explanation
If x is:
a list of data frames, each one is written to a single sheet
a character vector (of R objects), each object is written to a single sheet
something else, then see also what the help states:
More on usage
?WriteXLS
shows:
`x`: A character vector or factor containing the names of one or
more R data frames; A character vector or factor containing
the name of a single list which contains one or more R data
frames; a single list object of one or more data frames; a
single data frame object.
Solution
For your example, you would need to collect all data.frames in a list during the loop, and use WriteXLS after the loop has finished.
Session info
R 3.2.4
WriteXLS 4.0.0
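The "collect in a list during the loop, write after the loop" idea from the Solution above might look like the sketch below. The loop bounds and the contents of tempTable and sn are hypothetical stand-ins for the elided parts of the question's code; the final write is left as a comment and mirrors the WriteXLS(x, "test.xlsx", names(x)) call from the example.

```r
# Hypothetical stand-in for the loop in the question: each iteration
# produces one data frame (tempTable) and one sheet name (sn).
sheets <- list()
for (i in 2:4) {
  tempTable <- data.frame(id = seq_len(i), value = seq_len(i) * 10)
  sn <- paste0("sheet_", i)
  sheets[[sn]] <- tempTable   # store the data frame under its sheet name
}

names(sheets)

# Write once, after the loop has finished:
# WriteXLS::WriteXLS(sheets, "test.xlsx", names(sheets))
```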

I do it this way with openxlsx, using the following function:
mywritexlsx<-function(fname="temp.xlsx",sheetname="Sheet1",data,
startCol = 1, startRow = 1, colNames = TRUE, rowNames = FALSE)
{
if(! file.exists(fname))
wb = createWorkbook()
else
wb <- loadWorkbook(file =fname)
sheet = addWorksheet(wb, sheetname)
writeData(wb,sheet,data,startCol = startCol, startRow = startRow,
colNames = colNames, rowNames = rowNames)
saveWorkbook(wb, fname,overwrite = TRUE)
}

I do this all the time, all I do is
WriteXLS::WriteXLS(
all.dataframes,
ExcelFileName = xl.filename,
AdjWidth = T,
AutoFilter = T,
FreezeRow = 1,
FreezeCol = 2,
BoldHeaderRow = T,
verbose = F,
na = '0'
)
and all those data frames come from here
all.dataframes <- vector()
for (obj.iter in all.objects) {
obj.name <- obj.iter
obj.iter <- get(obj.iter)
if (is.data.frame(obj.iter)) {
all.dataframes <- c(all.dataframes, obj.name)
}
}
Obviously, an sapply routine would be better here.
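A vectorised version of that loop could be sketched as follows. The helper name data_frame_names and the throwaway demo environment are my own illustrations, not part of the answer; vapply is used rather than sapply so that the result is always a logical vector, even when the environment is empty.

```r
# Return the names of all data frames found in an environment.
data_frame_names <- function(env = globalenv()) {
  obj_names <- ls(env)
  is_df <- vapply(obj_names,
                  function(nm) is.data.frame(get(nm, envir = env)),
                  logical(1))
  obj_names[is_df]
}

# Small demonstration in a throwaway environment.
demo_env <- new.env()
assign("df1", data.frame(a = 1:2), envir = demo_env)
assign("not_df", 1:10, envir = demo_env)
assign("df2", data.frame(b = letters[1:3]), envir = demo_env)
data_frame_names(demo_env)
```

The resulting character vector can be passed on in the same way as all.dataframes above.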

For an lapply-friendly version:
library(data.table)
library(xlsx)
path2txtlist <- your.list.of.txt.files
wb <- createWorkbook()
lapply(seq_along(path2txtlist), function (j) {
sheet <- createSheet(wb, paste("sheetname", j))
addDataFrame(fread(path2txtlist[j]), sheet=sheet, startColumn=1, row.names=FALSE)
})
saveWorkbook(wb, "My_File.xlsx")

How can I stop Excel showing 'number stored as text' for numbers with commas when running an R program?

I'm running this R code and the Excel output file shows 'number stored as text' warning boxes against each of the numbers. I'm aware it's the commas causing this, but I want the commas in there.
Is there a way to stop these warning boxes showing?
library(openxlsx)
df1 <- data.frame(col_1 = c('1,000', '1,500', '5,000'))
wb <- createWorkbook()
sheet.name <- 'test'
addWorksheet(wb, sheet.name)
writeData(wb, sheet = sheet.name, df1)
saveWorkbook(wb, file = "Test.xlsx")

Just save the numbers without the comma and then set the option to display a comma in Excel. In R you have the values as character strings, not as numerics. I don't think you can save them with a comma and simultaneously have them stored as numbers.
However, you can remove the comma in R before the export and set an additional style option to have the commas in Excel:
require("openxlsx")
df1 <- data.frame(col_1 = c("1,000", "1,500", "5,000"))
df1[, 1] <- as.numeric(gsub(",", "", df1[, 1]))
wb <- createWorkbook()
sheet.name <- 'test'
addWorksheet(wb, sheet.name)
writeData(wb, sheet = sheet.name, df1)
comma_thousand <- createStyle(numFmt = "COMMA")
addStyle(wb, sheet = sheet.name, style = comma_thousand, cols = 1, rows = seq(2, nrow(df1) + 1))
saveWorkbook(wb, file = "Test.xlsx", overwrite = TRUE)
I have the rows as seq(2, nrow(df1) + 1) since you have a header in the table.
Here, df1[, 1] <- as.numeric(gsub(",", "", df1[, 1])) is used to get rid of the commas before you save the data frame as xlsx. You could also use other functions. After this line runs, the column holds the numeric values 1000, 1500 and 5000 instead of the character strings.
So basically, use the values with the commas in R (for whatever reason), but remove them before you save the xlsx and use the code I provided in order to tell Excel you want the values to be displayed with commas.
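To make the distinction between the stored value and its display concrete, here is a small base R sketch (the variable names are mine). It strips the commas to get real numbers, and shows that the comma form can always be regenerated for display on the R side with format().

```r
with_commas <- c("1,000", "1,500", "5,000")

# Remove the thousands separator, then convert to numeric.
as_numbers <- as.numeric(gsub(",", "", with_commas, fixed = TRUE))
as_numbers

# The commas are only a display choice and can be re-created on demand.
back_to_text <- format(as_numbers, big.mark = ",", trim = TRUE)
back_to_text
```

In the workbook, the same separation applies: the cell holds the number, and the numFmt style controls how it is shown.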

How can I read a table in a loosely structured text file into a data frame in R?

Take a look at the "Estimated Global Trend daily values" file on this NOAA web page. It is a .txt file with something like 50 header lines (identified with leading #s) followed by several thousand lines of tabular data. The link to download the file is embedded in the code below.
How can I read this file so that I end up with a data frame (or tibble) with the appropriate column names and data?
All the text-to-data functions I know get stymied by those header lines. Here's what I just tried, riffing off of this SO Q&A. My thought was to read the file into a list of lines, then drop the lines that start with # from the list, then do.call(rbind, ...) the rest. The downloading part at the top works fine, but when I run the function, I'm getting back an empty list.
temp <- paste0(tempfile(), ".txt")
download.file("ftp://aftp.cmdl.noaa.gov/products/trends/co2/co2_trend_gl.txt",
destfile = temp, mode = "wb")
processFile = function(filepath) {
dat_list <- list()
con = file(filepath, "r")
while ( TRUE ) {
line = readLines(con, n = 1)
if ( length(line) == 0 ) {
break
}
append(dat_list, line)
}
close(con)
return(dat_list)
}
dat_list <- processFile(temp)
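A likely reason for the empty list: append() returns a new list and does not modify dat_list in place, so its result has to be assigned back. The sketch below is the same function with that one change (plus on.exit() to close the connection); the small demo file is made up for illustration and is not the NOAA file.

```r
processFile <- function(filepath) {
  con <- file(filepath, "r")
  on.exit(close(con))                    # close the connection on exit
  dat_list <- list()
  while (TRUE) {
    line <- readLines(con, n = 1)
    if (length(line) == 0) break
    dat_list <- append(dat_list, line)   # assign the result back
  }
  dat_list
}

# Hypothetical three-line file to try the function on.
demo_file <- tempfile(fileext = ".txt")
writeLines(c("# header", "1 2 3", "4 5 6"), demo_file)
result <- processFile(demo_file)
length(result)
```

Growing a list line by line is slow for large files; readLines(filepath) without n reads every line in one call.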

Here's a possible alternative
processFile = function(filepath, header=TRUE, ...) {
lines <- readLines(filepath)
comments <- which(grepl("^#", lines))
header_row <- gsub("^#","",lines[tail(comments,1)])
data <- read.table(text=c(header_row, lines[-comments]), header=header, ...)
return(data)
}
processFile(temp)
The idea is that we read in all the lines, find the ones that start with "#" and ignore them except for the last one which will be used as the header. We remove the "#" from the header (otherwise it's usually treated as a comment) and then pass it off to read.table to parse the data.

Here are a few options that bypass your function and that you can mix & match.
In the easiest (albeit unlikely) scenario where you know the column names already, you can use read.table and enter the column names manually. The default option of comment.char = "#" means those comment lines will be omitted.
read.table(temp, col.names = c("year", "month", "day", "cycle", "trend"))
More likely is that you don't know those column names, but can get them by figuring out how many comment lines there are, then reading just the last of those lines. That saves you having to read more of the file than you need; this is a small enough file that it shouldn't make a huge difference, but in a larger file it might. I'm doing the counting by accessing the command line, only because that's the way I know how. Note also that I saved the file at an easier path; you could instead paste the command together with the temp variable.
Again, the comments are omitted by default.
n_comments <- as.numeric(system("grep '^# ' co2.txt | wc -l", intern = TRUE))
hdrs <- scan(temp, skip = n_comments - 1, nlines = 1, what = "character")[-1]
read.table(temp, col.names = hdrs)
Or with dplyr and stringr, read all the lines, separate out the comments to extract column names, then filter to remove the comment lines and separate into fields, assigning the column names you've just pulled out. Again, with a bigger file, this could become burdensome.
library(dplyr)
lines <- data.frame(text = readLines(temp), stringsAsFactors = FALSE)
comments <- lines %>%
filter(stringr::str_detect(text, "^#"))
hdrs <- strsplit(comments[nrow(comments), 1], "\\s+")[[1]][-1]
lines %>%
filter(!stringr::str_detect(text, "^#")) %>%
mutate(text = trimws(text)) %>%
tidyr::separate(text, into = hdrs, sep = "\\s+") %>%
mutate_all(as.numeric)

Superscript from R to Excel table?

There are several great R packages for reading and writing MS Excel spreadsheets. Exporting superscripts from R is easy to LaTeX tables (see also this), but is there a way to directly export superscripts from R to an Excel table?
An example:
library(openxlsx)
dt <- data.frame(a = 1:3, b = c("a", "b", ""))
dt$try1 <- paste0(dt$a, "^{", dt$b, "}") ## Base R, openxlsx does not seem to know how to handle expression()
dt$try2 <- paste0(dt$a, "\\textsuperscript{", dt$b, "}") # Should work in xtable
dt$try3 <- paste0("\\textsuperscript{", dt$b, "}") # This does not work either
write.xlsx(dt, "Superscript test.xlsx")
The code produces a nice Excel table, but does not process LaTeX code (understandable, as we are exporting to Excel). Maybe there is a superscript code for Excel to bypass this issue?

This question has been in here for a while and I imagine OP has found a solution. In any case, my solution is entirely based on this open git issue.
For this to work, you need to define a superscript notation and create a separate column, just like what you did in dt$try1. I enclosed the superscript characters in _[] in my example. Just try to avoid ambiguous notation that may be found elsewhere in your workbook.
dt <- data.frame(a = 1:3, b = c("a", "b", ""))
dt$sup <- paste0(dt$a, "_[", dt$b, "]") # create superscript col, enclosed in '_[]'
wb <- openxlsx::createWorkbook() # create workbook
openxlsx::addWorksheet(wb, sheetName = "data") # add sheet
openxlsx::writeData(wb, sheet=1, x=dt, xy=c(1, 1)) # write data on workbook
for(i in grep("\\_\\[([A-z0-9\\s]*)\\]", wb$sharedStrings)){
# if empty string in superscript notation, then just remove the superscript notation
if(grepl("\\_\\[\\]", wb$sharedStrings[[i]])){
wb$sharedStrings[[i]] <- gsub("\\_\\[\\]", "", wb$sharedStrings[[i]])
next # skip to next iteration
}
# insert additional formatting in the shared string
wb$sharedStrings[[i]] <- gsub("<si>", "<si><r>", gsub("</si>", "</r></si>", wb$sharedStrings[[i]]))
# find the "_[...]" pattern, remove the brackets and underscore, and enclose the text in superscript formatting
wb$sharedStrings[[i]] <- gsub("\\_\\[([A-z0-9\\s]*)\\]", "</t></r><r><rPr><vertAlign val=\"superscript\"/></rPr><t xml:space=\"preserve\">\\1</t></r><r><t xml:space=\"preserve\">", wb$sharedStrings[[i]])
}
openxlsx::saveWorkbook(wb, file="test.xlsx", overwrite = TRUE)
wb$sharedStrings contains the unique instances of strings in your workbook cells. The pattern chosen captures any instance of word, digit and space (or empty string) enclosed in _[]. The first part of the loop checks for an absence of characters in the superscript notation and it removes the notation if TRUE.
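To see what the substitutions do without opening a workbook, the same steps can be tried on a single string. This is only a sketch: the helper name to_superscript is mine, and sample_si is an assumption about what one shared-string entry looks like, not output captured from openxlsx. The character class is narrowed to letters, digits and spaces for the demo.

```r
to_superscript <- function(si) {
  pattern <- "_\\[([A-Za-z0-9 ]*)\\]"
  # Leave strings without the notation untouched.
  if (!grepl(pattern, si)) return(si)
  # Empty brackets: just drop the notation.
  if (grepl("_[]", si, fixed = TRUE)) {
    return(gsub("_[]", "", si, fixed = TRUE))
  }
  # Wrap the existing text in a rich-text run ...
  si <- gsub("<si>", "<si><r>", gsub("</si>", "</r></si>", si))
  # ... then split the run and mark the bracketed text as superscript.
  gsub(pattern,
       paste0("</t></r><r><rPr><vertAlign val=\"superscript\"/></rPr>",
              "<t xml:space=\"preserve\">\\1</t></r>",
              "<r><t xml:space=\"preserve\">"),
       si)
}

# Assumed shape of a shared-string entry for the cell text "1_[a]".
sample_si <- "<si><t xml:space=\"preserve\">1_[a]</t></si>"
cat(to_superscript(sample_si), "\n")
```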

Writing a csv file with fixed width in r

I read a file, change its content, and then I want to write the dataframe into a new file. The thing that bugs me is that the width of the columns isn't adjustable within Excel (it does not save changes).
I was wondering if it is possible to write the csv file with column width that fits the longest value.
dat <- read.csv("Input.csv")
# Do some processing
#Write the new file
write.csv(dat, "Output.csv", row.names=FALSE)
Edit 1:
dat <- read.csv("Input.csv")
createSheet(wb, "test")
writeWorksheet(wb, dat, "test")
colWidths <- sapply(1:ncol(dat), function(i) max(c(8, nchar(c(names(dat)[i], as.character(dat[, i]))))))
setColumnWidth(wb, "test", 1:ncol(dat), colWidths * 256)
saveWorkbook(wb)
What did I do wrong? It writes an empty file.

It doesn't matter what widths you write for your csv; Excel will always have its default column width when you open it.
Your options are:
Accept this behaviour.
Resave the file from Excel as something else (.xls or .xlsx)
Write the file from R using a package that directly exports Excel files. XLConnect will do this and even has a setColumnWidth function to set the column widths within R.
e.g.
dat <- data.frame(x = 1:24, `Long Column Name` = 25:48, `Wide Column` = paste(LETTERS, collapse = " "))
library("XLConnect")
wb <- loadWorkbook("Output.xlsx", create = TRUE)
createSheet(wb, "Output")
writeWorksheet(wb, dat, "Output")
colWidths <- sapply(1:ncol(dat), function(i) max(c(8, nchar(c(names(dat)[i], as.character(dat[, i])))))
setColumnWidth(wb, "Output", 1:ncol(dat), colWidths * 256)
saveWorkbook(wb)

You may need to close the .csv in Excel before you run write.csv in R because Excel locks the file.
It is also possible to pad the columns like this if that is what you want.
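A sketch of what such padding might look like in base R follows; it is not necessarily the approach this answer originally had in mind. The helper name pad_columns and the demo data are hypothetical. Every value is converted to text and right-padded with spaces to the width of the longest entry in its column (the header name included).

```r
pad_columns <- function(df) {
  out <- lapply(names(df), function(nm) {
    values <- as.character(df[[nm]])
    width <- max(nchar(c(nm, values)))   # longest of header and values
    format(values, width = width)        # pads on the right with spaces
  })
  names(out) <- names(df)
  as.data.frame(out, stringsAsFactors = FALSE)
}

dat <- data.frame(id = c(1, 20, 300), name = c("a", "bbb", "cc"),
                  stringsAsFactors = FALSE)
padded <- pad_columns(dat)
padded
# write.csv(padded, "Output.csv", row.names = FALSE)
```

Note that padded values are text, and Excel will still open the file with its default column widths, so this mainly helps when the csv is viewed in a plain-text editor.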
