Splitting an Excel workbook with 50 sheets into CSV files in R

I am a very basic R user and am not comfortable with loops or advanced R. I have an Excel workbook with 50 worksheets, each containing about 1 million rows. The workbook is approximately 5 GB, and I cannot load it into R. I am looking for a fast way in R to split this workbook into multiple CSV files or a single consolidated one.
I have tried many of the solutions I found, but the system stops responding for hours.
Please help me out with this.

What about a function like this?
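library(readxl)
# Read one sheet (by position) and write it out as sheet_<n>.csv
csv_saver <- function(sheet_number) {
  sheet_data <- read_xlsx(path = "yr_file_name.xlsx", sheet = sheet_number)
  # row.names = FALSE keeps R's row numbers out of the CSV
  write.csv(sheet_data, file = paste0("sheet_", sheet_number, ".csv"), row.names = FALSE)
}
# Apply the function to sheets 1 through 50
lapply(1:50, csv_saver)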
This reads the sheet specified by sheet_number as a data frame and then writes that data frame out as a CSV file. You then apply the function to each number from 1 to 50.
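If the number of sheets is not known in advance, or the CSV files should carry the sheet names, the same idea can be driven by the sheet names instead of the numbers 1 to 50. The following is a minimal sketch, assuming the readxl package is installed; split_workbook and its out_dir argument are hypothetical names, not part of readxl. Each sheet is still read fully into memory, one at a time, so whether this copes with a 5 GB workbook depends on the machine.

```r
# Hypothetical helper (not part of readxl): write every sheet of the
# workbook at `path` to its own CSV file inside `out_dir`.
split_workbook <- function(path, out_dir = ".") {
  sheets <- readxl::excel_sheets(path)  # sheet names, in workbook order
  for (s in sheets) {
    sheet_data <- readxl::read_excel(path, sheet = s)
    # make.names() replaces spaces and other special characters with dots
    out_file <- file.path(out_dir, paste0(make.names(s), ".csv"))
    write.csv(sheet_data, out_file, row.names = FALSE)
  }
  invisible(sheets)
}

# Usage (file name taken from the answer above):
# split_workbook("yr_file_name.xlsx")
```

Looping over names rather than positions means the code does not need to change if sheets are added or removed.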

Related

Creating a loop for creating multiple sheets from multiple Excel files in R

I have multiple Excel files with data. I want to split the data in each Excel file into multiple sheets within that particular file. I have already managed to do that with the following code:
library(openxlsx)
data <- read.xlsx(file.choose())
# Split the rows into one data frame per value of the Assigned column
splitdata <- split(data, data$Assigned)
splitdata
workbook <- createWorkbook()
# Add one worksheet per group, named after the group
Map(function(data, name) {
  addWorksheet(workbook, name)
  writeDataTable(workbook, name, data)
}, splitdata, names(splitdata))
saveWorkbook(workbook, file = "WorkbookWithMultipleSheets.xlsx", overwrite = TRUE)
However, I have more than 50 Excel files for which I need to create multiple sheets using the code above. Is there a way to write a loop so that I don't have to repeat this code for each Excel file?
Any help is appreciated! Thank you!
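One way to approach this is to wrap the code above in a function that takes a file path, and then call that function for every file in a folder. This is only a sketch under assumptions: split_into_sheets and split_all are hypothetical names, every file is assumed to have an Assigned column, and the result is written to a new file with a _split suffix (a choice made here to avoid overwriting the originals) rather than back into the source file.

```r
# Hypothetical helper: split one workbook by the `Assigned` column and
# save the groups as separate sheets of a new workbook.
split_into_sheets <- function(path) {
  data <- openxlsx::read.xlsx(path)
  splitdata <- split(data, data$Assigned)
  workbook <- openxlsx::createWorkbook()
  for (name in names(splitdata)) {
    openxlsx::addWorksheet(workbook, name)
    openxlsx::writeDataTable(workbook, name, splitdata[[name]])
  }
  # Assumed naming scheme: "<original name>_split.xlsx" next to the source
  out_file <- paste0(tools::file_path_sans_ext(path), "_split.xlsx")
  openxlsx::saveWorkbook(workbook, file = out_file, overwrite = TRUE)
  invisible(out_file)
}

# Hypothetical driver: apply the helper to every .xlsx file in `dir`,
# skipping files produced by an earlier run.
split_all <- function(dir = ".") {
  files <- list.files(dir, pattern = "\\.xlsx$", full.names = TRUE)
  files <- files[!grepl("_split\\.xlsx$", files)]
  vapply(files, split_into_sheets, character(1))
}

# Usage: split_all("path/to/folder")
```

Keeping the per-file logic in its own function makes it easy to test on a single workbook before running it over all 50.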

Creating a variable using select portions of an excel column

Let's say I have an Excel column with 10 different cells with values. How do I create a variable in R that includes only the first four or first six cells in that column?
This question is very vague; please provide more information if you need specifics.
First of all, you'll want to use a package to import the contents of the Excel file. I recommend readxl (http://readxl.tidyverse.org).
You can then follow the documentation to read specific ranges from the Excel file, or just import all the contents and trim the resulting tibble.
Probably something like this:
# Install -readxl- package that loads in Excel spreadsheets
install.packages("readxl")
# Load -readxl- package for use
require(readxl)
# Change working directory to directory where spreadsheet is saved in
setwd("<Insert path here>")
# Save spreadsheet data to memory
myData <- read_excel("myData.xlsx", sheet = 1)
# Subset first four or six observations
firstFour <- myData[1:4,]
firstSix <- myData[1:6,]
Let me know if you don't understand.
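Since the question asks for a variable holding the first few cells of one column rather than whole rows, it may help to see the column extracted as a plain vector. The sketch below uses a small made-up data frame in place of the imported spreadsheet, and the column name value is an assumption. read_excel also accepts a range argument (for example range = "A1:A5") if only part of the sheet should be read in the first place.

```r
# Made-up stand-in for the data returned by read_excel();
# the column name `value` is hypothetical.
myData <- data.frame(value = c(12, 7, 3, 9, 15, 4, 8, 1, 6, 10))

# First four cells of the column, as a plain numeric vector
firstFour <- myData$value[1:4]

# First six cells; head() also works when the column has fewer than six values
firstSix <- head(myData$value, 6)

print(firstFour)
print(firstSix)
```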

Import and use data from multiple worksheets of an Excel workbook, one by one, in R

I am working on trading data and have generated output in an Excel workbook, d://ranks.xls, with 217 worksheets; each worksheet contains 125 columns and 338 rows. I now want to use the data in each worksheet as input, so I am using:
library(gdata)
dat <- list()  # initialise the list before filling it in the loop
for (i in 1:217) {
  dat[[i]] <- read.xls("d://ranks.xls", sheet = i, header = FALSE)
}
This builds a list, but I just want to call and use each sheet one by one. It is also very time-consuming. Is there another way to do this more quickly?

read.xlsx from .xlsx with unknown number of sheets

Suppose I have an Excel file that I would like to read into R with the read.xlsx function. The file consists of several sheets, and I do not know how many (there are about 200 such files, so manually checking the number of sheets would be a huge pain). Each sheet is organized like a proper data frame.
I would like to stack those sheets one on top of another.
I wrote something like:
columnsILike <- c(1, 40)
for (i in 1:numberOfSheets) {
  dfInd <- read.xlsx("myfile.xlsx", i,  # sheet index
                     colIndex = columnsILike, endRow = 201, startRow = 2,
                     header = FALSE)
  PreviousEmptyDataFrame <- rbind(PreviousEmptyDataFrame, dfInd)
}
write.csv(PreviousEmptyDataFrame, "data.csv")
The question is: how do I know the number of sheets in advance?
getSheets(loadWorkbook("file_path")) in the xlsx package returns a list of the sheets in the workbook, so the length of that list gives the number of sheets.
This answer is rather late, but wouldn't this be simpler?
gdata::sheetCount("myworkbook.xlsx")
You can also use package XLConnect if the workbook isn't too large.
library(XLConnect)
wb <- loadWorkbook("myworkbook.xlsx")
result <- do.call(rbind, lapply(getSheets(wb),
                                function(sheet) readWorksheet(wb, sheet)))

How to append different R outputs into one excel spreadsheet [duplicate]

Whenever I run my code separately for different datasets, I want the output to be saved in the same Excel spreadsheet but on different sheets. So if I run my code for 20 different datasets, I would have 20 worksheets in a single Excel spreadsheet. Is there a function in R that would let me do this? Let's say my existing spreadsheet is called 'Values.csv'. How would I append the rest of my output to this same spreadsheet?
I usually just use write.csv(data, 'Values.csv'), but I'm not sure how to append my output to this same worksheet.
You can use the XLConnect package to do this.
library(XLConnect)
# Some sample data
your.data <- data.frame(a = 1:10, b = 21:30)
# Create the .xls file
wb <- loadWorkbook("newfile.xls", create = TRUE)
# Create a sheet in the file
createSheet(wb, name = "name_one")
# Write the data
writeWorksheet(wb, your.data, sheet = "name_one")
# Save the .xls file
saveWorkbook(wb)
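A CSV file holds a single table, so the output has to go into an .xls or .xlsx workbook to get separate worksheets. Because loadWorkbook(..., create = TRUE) opens the file if it already exists and creates it otherwise, the same steps can be repeated for each dataset. A minimal sketch, assuming XLConnect (and the Java runtime it needs) is installed; append_sheet is a hypothetical helper name and Values.xlsx is an assumed file name:

```r
# Hypothetical helper: write `data` to the sheet `sheet_name` of the
# workbook at `path`, creating the workbook file if it does not exist yet.
append_sheet <- function(path, data, sheet_name) {
  wb <- XLConnect::loadWorkbook(path, create = TRUE)
  XLConnect::createSheet(wb, name = sheet_name)
  XLConnect::writeWorksheet(wb, data, sheet = sheet_name)
  XLConnect::saveWorkbook(wb)
  invisible(path)
}

# Usage: one call per dataset, each with its own sheet name
# append_sheet("Values.xlsx", data.frame(a = 1:10, b = 21:30), "run_01")
# append_sheet("Values.xlsx", data.frame(a = 11:20, b = 31:40), "run_02")
```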
