I have to compare multiple pairs of files (Prod1 vs. Beta1, Prod2 vs. Beta2, etc.) and export any differences to an Excel sheet. Each difference should go in a separate cell (Column C). I'm trying the code below with library(xlsx), but I can only store the data in the first cell.
library(xlsx)
for(i in 1:No_of_files){
  prod_file_res_name <- sprintf("R/Results/F_Query_Prod_%s.txt", i)
  beta_file_res_name <- sprintf("R/Results/F_Query_Beta_%s.txt", i)
  if (file.exists(prod_file_res_name) && file.exists(beta_file_res_name)) {
    res <- tools::Rdiff(prod_file_res_name, beta_file_res_name, Log = TRUE)
    if(res[2] != "character(0)"){
      write.xlsx(toString(res[2]), file = "C:/R/diff.xlsx", sheetName = "Sheet1", col.names = FALSE, row.names = FALSE, append = TRUE)
    } else {
      com <- "No Difference found"
      write.xlsx(com, file = "C:/R/diff.xlsx", sheetName = "ExtractFormulaHistory", col.names = FALSE, row.names = FALSE, append = TRUE)
    }
  } else {
    print("File doesnt exist")
  }
}
Can anyone help me save each difference in column 5, one row per file (rows 1 to X for X files)? Thanks in advance.
The easiest way would be to create a tibble or list to hold your output and then, as the last step of your code, write it to Excel using the xlsx library.
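A minimal sketch of the collect-then-write idea, under a few assumptions: collect_diffs and the temporary demo files are hypothetical names made up for illustration, and the final write uses writeData from openxlsx (with startCol = 3 for column C) instead of the xlsx package, simply because it accepts a start column directly. The write step is skipped if openxlsx is not installed.

```r
# Collect one diff summary per file pair, then write everything in a single call.
# `collect_diffs` is a hypothetical helper, not part of any package.
collect_diffs <- function(prod_paths, beta_paths) {
  vapply(seq_along(prod_paths), function(i) {
    if (!file.exists(prod_paths[i]) || !file.exists(beta_paths[i])) {
      return("File does not exist")
    }
    # With Log = TRUE, Rdiff returns a list with `status` and `out`
    # (`out` is a character vector describing the differences)
    res <- tools::Rdiff(prod_paths[i], beta_paths[i], Log = TRUE)
    if (length(res$out) == 0) "No difference found" else paste(res$out, collapse = "\n")
  }, character(1))
}

# Demo data: two pairs of temporary files; the second pair differs on line 2
demo_dir <- tempfile("diffdemo")
dir.create(demo_dir)
prod <- file.path(demo_dir, sprintf("F_Query_Prod_%s.txt", 1:2))
beta <- file.path(demo_dir, sprintf("F_Query_Beta_%s.txt", 1:2))
writeLines(c("a", "b"), prod[1]); writeLines(c("a", "b"), beta[1])
writeLines(c("a", "b"), prod[2]); writeLines(c("a", "X"), beta[2])

diffs <- data.frame(difference = collect_diffs(prod, beta), stringsAsFactors = FALSE)

# One write at the end: rows 1..n of column C, one row per file pair
if (requireNamespace("openxlsx", quietly = TRUE)) {
  wb <- openxlsx::createWorkbook()
  openxlsx::addWorksheet(wb, "Sheet1")
  openxlsx::writeData(wb, "Sheet1", diffs, startCol = 3, startRow = 1, colNames = FALSE)
  openxlsx::saveWorkbook(wb, file.path(demo_dir, "diff.xlsx"), overwrite = TRUE)
}
```

Because the workbook is written once, there is no need for append = TRUE, and each file pair ends up in its own row.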
Alternatively, you could use writeWorksheet from the XLConnect package to write to an Excel file as you work out the differences.
From the documentation of the XLConnect package:
# Load workbook (create if not existing)
wb <- loadWorkbook("writeWorksheet.xlsx", create = TRUE)
# Create a worksheet called 'CO2'
createSheet(wb, name = "CO2")
# Write built-in data set 'CO2' to the worksheet created above;
# offset from the top left corner and with default header = TRUE
writeWorksheet(wb, CO2, sheet = "CO2", startRow = 4, startCol = 2)
# Save workbook (this actually writes the file to disk)
saveWorkbook(wb)
Related
I've looked all over and every solution I've found does not work for me. When I use the XLSX package with this syntax:
wb <- loadWorkbook(file path)
sheets <- getSheets(wb)
isotope <-sheets[[5]]
addDataFrame(mydataframe, isotope, row.names = FALSE, col.names = FALSE, startRow = 9)
saveWorkbook(wb, new file path & name)
All of the data for each column is saved into a single cell. However! If I set row.names and col.names to TRUE, the data copies over correctly (although I get a column and row I don't need). Unfortunately there are formulas in the Excel sheet I'm trying to put the data in so the data frame needs to come over all clean.
Given the nature of this question, I can't really give a fully reproducible example. But here is the nature of the issue:
1. I have spreadsheets in a folder.
2. I used the openxlsx package and wrote a script to write a new formula in a different part of the spreadsheet, and to write "YES" in cell B1. When this happens, the Revenue line updates in the spreadsheet (using an Excel IF formula). A version 2 of the file is then saved with the changes. So far so good.
3. I then used readxl to read the financials of both spreadsheets in the directory.
The problem comes in when I look at the results. The financials in company_a2 (the updated file produced by step 2) are read in as identical to those in company_a, even though the numbers in the two files differ.
I will include the code in case it reveals something obvious I am missing, as well as a couple of screenshots of the output.
The code:
# ===========================================================================================
# Load packages
library(tidyverse)
library(openxlsx)
library(readxl)
# Set working directory to excel folder location
setwd("C:/Users/davem/Sync/R files/excel_file_test")
# Create function to write formulas and data in excel file
edit_xl_fx <- function(input_wb, input_sheet, input_formula, input_data, start_row_form, start_col_form, start_row_data, start_col_data){
  # Load
  wb <- openxlsx::loadWorkbook(input_wb)
  # Write one formula per column D:H (start_col_form is accepted but not used here)
  formula_cols <- LETTERS[4:8]
  lapply(seq_along(formula_cols), FUN = function(x) writeFormula(wb, input_sheet, x = input_formula[x], startCol = formula_cols[x], startRow = start_row_form))
  writeData(wb, sheet = input_sheet, x = input_data, startCol = start_col_data, startRow = start_row_data)
  # Save as "<original name>2.xlsx"; the dot is escaped and the pattern anchored so only the extension is removed
  orig_xl_name <- str_remove(input_wb, pattern = "\\.xlsx$")
  new_xl_name <- paste0(orig_xl_name, 2, ".xlsx")
  saveWorkbook(wb, new_xl_name, overwrite = TRUE)
}
# Get list of files
xl_files <- as.list(list.files(pattern = "\\.xlsx$"))
# Run function
lapply(
xl_files,
edit_xl_fx,
input_sheet = "Sheet1",
input_formula = c(
"D21/(D20-D16)",
"E21/(E20-E16)",
"F21/(F20-F16)",
"G21/(G20-G16)",
"H21/(H20-H16)"
),
input_data = "YES",
start_row_form = 26,
start_col_form = 4:8,
start_row_data = 1,
start_col_data = 2
)
This all works. But then when I go to extract the financials, I get the same results even though the values in the files are different.
Values in the excel files:
Code to read the files:
# Get updated list of files
xl_files_updated <- as.list(list.files(pattern = "\\.xlsx$"))
# Read the same range from every file and round the numeric columns
financials <- lapply(xl_files_updated, function(x) read_excel(path = x, sheet = "Sheet1", range = "B10:H21")) %>%
  bind_rows() %>%
  mutate(across(
    .cols = where(is.numeric),
    .fns = ~ round(.x, 1)
  ))
Data that is read into R:
Even if I just run readxl::read_xlsx(path = "company_a2.xlsx", sheet = "Sheet1", range = "B10:H21") I still get the numbers in company_a.xlsx as opposed to the updated numbers.
Thanks for helping clear up what I am missing here!
I am trying to use openxlsx::write.xlsx to write results into an Excel spreadsheet from R.
1. If the file exists and a new sheet is to be added, I can use append=T. Instead of using if to check for the file, is there a way to check automatically?
2. If the file and sheet both exist and the sheet is to be updated, how do I overwrite the results? Thanks.
Here is an openxlsx answer. In order to demonstrate, we need some data.
## Create a simple test file
library(openxlsx)
hs <- createStyle(textDecoration = "Bold")
l <- list("IRIS" = iris, "MTCARS" = mtcars)
write.xlsx(l, file = "TestFile.xlsx", borders = "columns", headerStyle = hs)
Question 1
You can check whether or not the file exists with
## Check existence of file
file.exists("TestFile.xlsx")
You can check if the tab (sheet) exists within the workbook
## Check available sheets
getSheetNames("TestFile.xlsx")
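The two checks above can be combined into one call. This is only a sketch: sheet_exists is a hypothetical helper name, and the demo workbook is written to a temporary file so nothing in the working directory is touched.

```r
library(openxlsx)

# Hypothetical helper: TRUE only when the file exists and contains the named sheet.
# `&&` short-circuits, so getSheetNames() is never called on a missing file.
sheet_exists <- function(path, sheet) {
  file.exists(path) && sheet %in% getSheetNames(path)
}

# Demo workbook with a single sheet called "IRIS"
demo_path <- tempfile(fileext = ".xlsx")
write.xlsx(list("IRIS" = iris), file = demo_path)

has_iris   <- sheet_exists(demo_path, "IRIS")
has_mtcars <- sheet_exists(demo_path, "MTCARS")
no_file    <- sheet_exists(tempfile(fileext = ".xlsx"), "IRIS")
```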
Steps for question 2:
1. Read the file into a Workbook object.
2. Pull the data from the sheet you want to modify into a data.frame.
3. Modify the data.frame to taste
4. Save the data back into the Workbook
5. Save the Workbook out to disk
To keep the example simple, we will work with the test file created above.
## Load existing file
wb = loadWorkbook("TestFile.xlsx")
## Pull all data from sheet 1
Data = read.xlsx(wb, sheet=1)
## Change a single element for demonstration
## ** Beware!! ** Because of the header,
## the 2,2 position in the data
## is row 3 column 2 in the spreadsheet
Data[2,2] = 1492
## Put the data back into the workbook
writeData(wb, sheet=1, Data)
## Save to disk
saveWorkbook(wb, "TestFile.xlsx", overwrite = TRUE)
You can open up the spreadsheet and check that the change has been made.
If you want to completely change the sheet (as in your comment),
you can just delete the old sheet and replace it with a new one
using the same name.
removeWorksheet(wb, "IRIS")
addWorksheet(wb, "IRIS")
NewData = data.frame(X1=1:4, X2= LETTERS[1:4], X3=9:6)
writeData(wb, "IRIS", NewData)
saveWorkbook(wb, "TestFile.xlsx", overwrite = TRUE)
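Both questions can be folded into a single helper, sketched here under the assumption that replacing the whole sheet is acceptable. upsert_sheet is a hypothetical name; it only uses the openxlsx calls shown above plus createWorkbook and names(wb), which returns the sheet names of a Workbook object.

```r
library(openxlsx)

# Hypothetical helper: write `data` to `sheet` in `path`.
# Creates the file if it is missing, adds the sheet if it is missing,
# and replaces the sheet if it already exists.
upsert_sheet <- function(path, sheet, data) {
  wb <- if (file.exists(path)) loadWorkbook(path) else createWorkbook()
  if (sheet %in% names(wb)) {
    removeWorksheet(wb, sheet)
  }
  addWorksheet(wb, sheet)
  writeData(wb, sheet, data)
  saveWorkbook(wb, path, overwrite = TRUE)
  invisible(path)
}

demo_path <- tempfile(fileext = ".xlsx")
upsert_sheet(demo_path, "A", data.frame(x = 1:3))   # creates the file
upsert_sheet(demo_path, "B", data.frame(y = 1))     # adds a second sheet
upsert_sheet(demo_path, "A", data.frame(x = 4:5))   # replaces sheet "A"
result <- read.xlsx(demo_path, sheet = "A")
```

Note that a replaced sheet is re-added at the end of the workbook, so the sheet order can change.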
You can check beforehand whether the sheet exists and, if so, remove it. write.xlsx with append = TRUE then adds the sheet to the existing file, and creates the file if it does not exist.
library(xlsx)
path <- "testing.xlsx"
sheet_name <- "new_sheet"
data <- data.frame(B = c(1, 2, 3, 4))
# Only try to load the workbook when the file is already there
if (file.exists(path)) {
  wb <- loadWorkbook(path)
  if (sheet_name %in% names(getSheets(wb))) {
    removeSheet(wb, sheetName = sheet_name)
    saveWorkbook(wb, path)
  }
}
write.xlsx(data, path, sheetName = sheet_name, append = TRUE)
I am creating two data frames and one graph in RStudio. I wrote code to transfer them to separate sheets of an Excel file, but each time I have to choose the file path using file.choose(). Is it possible to assign the file path to a variable when saving the file for the first time? If so, how can it be done?
I would also welcome comments on how to export my data frames to an Excel file more easily. My code is below.
Thank you to everyone.
dataframe1 <- data.frame("A"=1, "B"=2)
dataframe2 <- data.frame("C"=3,"D"=4)
list_of_datasets <- list("Name of DataSheet1" = dataframe1, "Name of Datasheet2" = dataframe2)
write.xlsx(list_of_datasets, file = "writeXLSX2.xlsx")
dflist <- list("Sonuçlar"=yazılacakdosya0, "Frame"=dtf, "Grafik"="")
edc <- write.xlsx(dflist, file.choose(new = T), colNames = TRUE,
borders = "surrounding",
firstRow = T,
headerStyle = hs)
require(ggplot2)
q1 <- qplot(hist(yazılacakdosya0$Puan))
print(q1)
insertPlot(wb=edc, sheet = "Grafik")
saveWorkbook(edc, file = file.choose(), overwrite = T)
Just save the file path in a variable before you call saveWorkbook:
file = file.choose()
saveWorkbook(edc, file = file, overwrite = T)
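As a slightly fuller sketch of the same idea: ask for the path once, keep it in a variable, and reuse it for every later save. The variable name out_path and the sheet contents are illustrative only; the interactive() guard is an addition so the snippet also runs in a script, where it falls back to a temporary file.

```r
library(openxlsx)

# Ask for the destination once; fall back to a temporary file when not run interactively
out_path <- if (interactive()) file.choose(new = TRUE) else tempfile(fileext = ".xlsx")

wb <- createWorkbook()
addWorksheet(wb, "Frame")
writeData(wb, "Frame", data.frame(A = 1, B = 2))
saveWorkbook(wb, file = out_path, overwrite = TRUE)

# Later changes reuse the stored path, so no second dialog appears
addWorksheet(wb, "Grafik")
saveWorkbook(wb, file = out_path, overwrite = TRUE)
```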
I am trying to read a value from a .xlsx file using the openxlsx package in R. In simple words, I need to write a row of data, which then populates an output cell that has to be read back into R. I will share an example to better explain the problem.
Initial state of the .xlsx file:
I'm now trying to write new values to the cells A2:A3 = c("c", 5). So ideally, I'm expecting A6 = 15.
Below is the code used:
require(openxlsx)
path <- "C:/path_to_file/for_SO1.xlsx"
input_row <- c("c", 5)
# Load workbook; create if not existing
wb <- loadWorkbook(path)
# createSheet(wb, name = "1")
writeData(wb,
sheet = "Sheet1",
x = data.frame(input_row),
startCol=1,
startRow=1
)
data_IM <- read.xlsx(wb,
sheet = "Sheet1",
rows = c(5,6),
cols = c(1))
# Save workbook
saveWorkbook(wb, file = path, overwrite = TRUE)
#> data_IM
# output_row
#1 3
But I get the initial value (3). However, if I open the .xlsx file, I can see the 15 residing there:
What could be the reason this cell cannot be read? I tried saving the workbook after writing to it and then reading it again, but even that failed. openxlsx is the only option I have because of Java errors from XLConnect etc.
From the help page ?read.xlsx:
Formulae written using writeFormula to a Workbook object will not get
picked up by read.xlsx(). This is because only the formula is written
and left to be evaluated when the file is opened in Excel. Opening,
saving and closing the file with Excel will resolve this.
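The behaviour described in that help text can be reproduced with a small throwaway workbook. This is only an illustrative sketch: the sheet layout and the SUM formula are made up, and the file is written to a temporary location.

```r
library(openxlsx)

wb <- createWorkbook()
addWorksheet(wb, "Sheet1")
writeData(wb, "Sheet1", data.frame(x = c(2, 3)))   # header in A1, values in A2:A3
writeFormula(wb, "Sheet1", x = "SUM(A2:A3)", startCol = 1, startRow = 4)

demo_path <- tempfile(fileext = ".xlsx")
saveWorkbook(wb, demo_path)

# The formula text is stored in the file, but no computed result exists
# until a spreadsheet application recalculates and saves it,
# so the sum does not come back when the file is read
back <- read.xlsx(demo_path, sheet = "Sheet1")
print(back)
```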
So the file needs to be opened in Excel and then saved; I can verify that this works. However, this may not be suitable for you.
XLConnect seems to have the desired functionality
# rJava can run out of memory sometimes; raising the Java heap limit before loading the package can help.
options(java.parameters = "-Xmx1G")
library(XLConnect)
file_path = "test.xlsx"
input_row <- c("c", 5)
wb <- loadWorkbook(file_path, create=F)
writeWorksheet(wb, 1, startRow = 1, startCol = 1, data = data.frame(input_row))
setForceFormulaRecalculation(wb, 1, TRUE)
saveWorkbook(wb)
# checking
wb <- loadWorkbook(file_path, create=F)
readWorksheet(wb, 1)
The openxlsx manual (https://cran.r-project.org/web/packages/openxlsx/openxlsx.pdf) says:
Formulae written using writeFormula to a Workbook object will not get picked up by read.xlsx().
This is because only the formula is written and left to be evaluated when the file is opened in Excel.
Opening, saving and closing the file with Excel will resolve this.
So if you are using Windows, save the following VBScript to a file, for example opensaveexcel.vbs:
Set objExcel = CreateObject("Excel.Application")
Set objWorkbook = objExcel.Workbooks.Open("D:\Book2.xlsx")
objWorkbook.Save
objWorkbook.Close
objExcel.Quit
Set objExcel = Nothing
Set objWorkbook = Nothing
Then you can write R code like the following, where cell A4 in Book1.xlsx contains the formula =A3*5:
mywritexlsx(fname="d:/Book1.xlsx",data = 20,startCol = 1,startRow = 3)
file.copy("d:/Book1.xlsx", "d:/Book2.xlsx", overwrite = TRUE)
system("cscript //nologo d:\\opensaveexcel.vbs")
tdt1=read.xlsx(xlsxFile = "d:/Book1.xlsx",sheet = "Sheet1",colNames = FALSE)
tdt2=read.xlsx(xlsxFile = "d:/Book2.xlsx",sheet = "Sheet1",colNames = FALSE)
This works for me. For reference, mywritexlsx is defined as:
mywritexlsx <- function(fname = "temp.xlsx", sheetname = "Sheet1", data,
                        startCol = 1, startRow = 1, colNames = TRUE, rowNames = FALSE)
{
  if(!file.exists(fname))
  {
    wb = openxlsx::createWorkbook()
    sheet = openxlsx::addWorksheet(wb, sheetname)
  }
  else
  {
    wb <- openxlsx::loadWorkbook(file = fname)
    if(!(sum(openxlsx::getSheetNames(fname) == sheetname)))
      sheet = openxlsx::addWorksheet(wb, sheetname)
    else
      sheet = sheetname
  }
  openxlsx::writeData(wb, sheet, data, startCol = startCol, startRow = startRow,
                      colNames = colNames, rowNames = rowNames)
  openxlsx::saveWorkbook(wb, fname, overwrite = TRUE)
}