I have multiple Excel files that I need to merge into one, but only certain rows. The Excel files look like this...
The column headers are identical for all files. I also need to add a new column A to the newly generated file, so I created a separate Excel file with just the headers and the new column A. My script first reads in this file (below) and writes it to the workbook...
Next, I need to read each file, starting at row 9 and merge all the data, one after another. So the final result should look like this (minus the Member site column, I haven't attempted the logic for that yet, but thinking it will be a substring of the Specimen ID value)...
However, my current result is...
I am currently only using 3 files, each with a few dozen rows, to start, but the end goal is to merge about 15-30 files, each with 25 to 200 rows, give or take. So...
1) I know my code is incorrect, but I'm not sure how to get the intended results. For one, my loop overwrites data because it always starts writing at row 2, column 2, but I can't see how to restructure it.
2) The dates are coming back in General format ("43008" instead of "9/30/2017").
3) Some columns' data is being placed under different columns (e.g., Nucleic Acid Concentration has the values from Date of Tissue Content).
Any advice or help would be greatly appreciated!
My code...
library(openxlsx) # Excel and csv files
library(svDialogs) # Dialog boxes
setwd("C:/Users/Work/Combined Manifest")
# Create and load Excel file
wb <- createWorkbook()
# Add worksheet
addWorksheet(wb, "Template")
# Read in & write header file
df.headers <- read.xlsx("headers.xlsx", sheet = "Template")
writeData(wb, "Template", df.headers, colNames = TRUE)
# Function to get user path
getPath <- function() {
  # Ask for path
  path <- dlgInput("Enter path to files: ", Sys.info()["user"])$res
  if (dir.exists(path)) {
    # If the path exists, return it (the working directory is set by the caller)
    return(path)
  } else {
    # If not, issue an error and recall the getPath function
    dlg_message("Error: The path you entered is not a valid directory. Please try again.")$res
    getPath()
  }
}
# Call getPath function
folder <- getPath()
setwd(folder)
# Get list of files in directory
pattern.ext <- "\\.xlsx$"
files <- dir(folder, full=TRUE, pattern=pattern.ext)
# Get basenames and remove extension
files.nms <- basename(files)
files.nms <- gsub(pattern.ext, "", files.nms)
# Set the names
names(files) <- files.nms
# Iterate to read in files and write to new file
for (nm in files.nms) {
  # Read in files
  df <- read.xlsx(files[nm], sheet = "Template", startRow = 9, colNames = FALSE)
  # Write data to sheet (this overwrites the same cells on every pass)
  writeData(wb, "Template", df, startCol = 2, startRow = 2, colNames = FALSE)
}
saveWorkbook(wb, "Combined.xlsx", overwrite = TRUE)
EDIT:
So with the loop below, I am successfully reading in the files and merging them. Thanks for all the help!
# Create an empty data frame to append to
allData <- data.frame()
for (nm in files.nms) {
  # Read in files (detectDates converts Excel date serials; keeping empty
  # rows/cols prevents columns from shifting)
  df <- read.xlsx(files[nm], sheet = "Template", startRow = 8, colNames = TRUE,
                  detectDates = TRUE, skipEmptyRows = FALSE, skipEmptyCols = FALSE)
  # Append the data
  allData <- rbind(allData, df)
}
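(Two details worth flagging in this loop: detectDates = TRUE resolves problem 2 by converting Excel's date serials into real dates, and skipEmptyRows/skipEmptyCols = FALSE keep columns from shifting, which resolves problem 3. Without detectDates, a serial can also be converted by hand, since Excel on Windows counts days from 1899-12-30:)
as.Date(43008, origin = "1899-12-30")
#> [1] "2017-09-30"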
EDIT: FINAL SOLUTION
Thanks to everyone for the help!!
library(openxlsx) # Excel and csv files
library(svDialogs) # Dialog boxes
# Create and load Excel file
wb <- createWorkbook()
# Add worksheet
addWorksheet(wb, "Template")
# Function to get user path
getPath <- function() {
  # Ask for path
  path <- dlgInput("Enter path to files: ", Sys.info()["user"])$res
  if (dir.exists(path)) {
    # If the path exists, return it (the working directory is set by the caller)
    return(path)
  } else {
    # If not, issue an error and recall the getPath function
    dlg_message("Error: The path you entered is not a valid directory. Please try again.")$res
    getPath()
  }
}
# Call getPath function
folder <- getPath()
# Set working directory
setwd(folder)
# Get list of files in directory
pattern.ext <- "\\.xlsx$"
files <- dir(folder, full=TRUE, pattern=pattern.ext)
# Get basenames and remove extension
files.nms <- basename(files)
# Set the names
names(files) <- files.nms
# Create empty dataframe
allData <- data.frame()
# Create list (reserve memory)
f.List <- vector("list",length(files.nms))
# Loop and load files
for (nm in seq_along(files.nms)) {
  # Read in files
  f.List[[nm]] <- read.xlsx(files[nm], sheet = "Template", startRow = 8,
                            colNames = TRUE, detectDates = TRUE,
                            skipEmptyRows = FALSE, skipEmptyCols = FALSE)
}
# Append the data
allData <- do.call("rbind", f.List)
# Add a new column as 'Member Site'
allData <- data.frame('Member Site' = "", allData)
# Take the substring of the Specimen.ID column for Member Site
allData$Member.Site <- sapply(strsplit(allData$Specimen.ID, "-"), "[", 2)
# Write data to sheet
writeData(wb, "Template", allData, startCol = 1)
# Save workbook
saveWorkbook(wb, "Combined.xlsx", overwrite = TRUE)
First of all, you are providing a lot of information in your question, which is generally a good thing, but I’m wondering if you could make your problems easier to solve by recreating the problem using fewer and smaller files. Could you figure out how to merge two files, each containing a small amount of data first?
With regards to the first challenge you raise:
1) Yes you are overwriting the workbook in each loop. I would suggest you load the data and append it to a data.frame and then store the end result after loading all the files. Have a look at the example below. Please note that this example uses rbind, which is inefficient if you are combining a large number of files. So if you have many files you may need to use a different structure.
# Create an empty data frame
allData <- data.frame()
# Loop and load files
for (nm in files.nms) {
  # Read in files
  df <- read.xlsx(files[nm], sheet = "Template", startRow = 9, colNames = FALSE)
  # Append the data
  allData <- rbind(allData, df)
}
# Write the combined data to the sheet (note: allData, not the last df)
writeData(wb, "Template", allData, startCol = 2, startRow = 2, colNames = FALSE)
Hopefully this gets you closer to what you need!
Edit: Updating the answer to address the comments made
If you have more than a few files, rbind will get slow, like @Parfait mentioned, due to multiple copies of the data being made. The way to avoid this is to first reserve space in memory by creating an empty list with enough slots to hold your data, then fill in the list, and only at the end merge all the data together using do.call("rbind", ...). I've compiled some sample code below that's in line with what you provided in your question.
# Create list (reserve memory)
f.List <- vector("list", length(files.nms))
# Loop and load files
for (eNr in seq_along(files.nms)) {
  # Read in files (note: index into the full paths, not the bare names)
  f.List[[eNr]] <- read.xlsx(files[eNr], sheet = "Template", startRow = 9)
}
# Append the data all at once
allData <- do.call("rbind", f.List)
To illustrate this further, below is a small reproducible example. It uses just a couple of data frames, but it shows the process of creating a list, populating that list, and merging the data as the last step.
# Sample data
df1 <- data.frame(x = 1:3, y = 3:1)
df2 <- data.frame(y = 4:6, x = 3:1)
df.List <- list(df1, df2)
# Create list
d.List <- vector("list", length(df.List))
# Loop and add data
for (eNr in seq_along(df.List)) {
  d.List[[eNr]] <- df.List[[eNr]]
}
# Bind all at once
dfAll <- do.call("rbind", d.List)
print(dfAll)
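This prints the following; note that rbind() matches columns by name, so df2's swapped column order is handled correctly:
#   x y
# 1 1 3
# 2 2 2
# 3 3 1
# 4 3 4
# 5 2 5
# 6 1 6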
Hope this helps! Thanks!
Related
I have an Excel file with multiple sheets embedded in it. My main goal is to remove all rows that appear multiple times within a single sheet, and I have to do this for every sheet.
I have written the code below, but it is only reading the first sheet and is also giving '...' in the first row and column. Can someone help me figure out where I might be going wrong? Thank you in advance.
config_file_name <- '/RBIAPI3tables.xlsx'
config_xl <- paste(currentPath, config_file_name, sep = "")
config_xl_sheets_name <- excel_sheets(path = config_xl) # An array of sheet names; access with config_xl_sheets_name[1]
count_of_xl_sheets <- length(config_xl_sheets_name)
# Read all sheets in the file as separate lists
list_all_sheets <- lapply(config_xl_sheets_name, function(x) read_excel(path = config_xl, sheet = x))
names(list_all_sheets) <- config_xl_sheets_name # Name each list element after its sheet
count_of_list_all_sheets <- length(list_all_sheets) # to get the data frame of each list use list_all_sheets[[Config]]
# Create a data frame for each sheet and assign the sheet name to it
for (i in 1:count_of_list_all_sheets) {
  assign(x = trimws(config_xl_sheets_name[i]), value = data.frame(list_all_sheets[[i]]))
  updateddata = unique(list_all_sheets[[i]])
}
write.xlsx(updateddata, "Unique3tables.xlsx", showNA = FALSE)
This is my approach:
library(readxl)
library(data.table)
library(openxlsx)
file.to.read <- "./testdata.xlsx"
sheets.to.read <- readxl::excel_sheets(file.to.read)
# read sheets from the file into a list and remove duplicate rows
L <- lapply(sheets.to.read, function(x) {
  data <- setDT(readxl::read_excel(file.to.read, sheet = x))
  # remove duplicates
  data[!duplicated(data), ]
})
# create a new workbook
wb <- createWorkbook()
# create new worksheets and write to them
for (i in seq_along(L)) {
  addWorksheet(wb, sheets.to.read[i])
  writeData(wb, i, L[[i]])
}
# write the workbook to disk
saveWorkbook(wb, "testdata_new.xlsx")
I am supposed to load the data for my master's thesis into an R data frame; the data is stored in 74 Excel workbooks. Every workbook has 4 worksheets, called: animals, features, r_words, verbs. All of the worksheets have the same 12 variables (starttime, word, endtime, ID, etc.). I want to concatenate every worksheet under the one before, so the resulting data frame should have 12 columns, and the number of rows depends on how many answers the 74 subjects produced.
I want to use the readxl-package of the tidyverse and followed this article: https://readxl.tidyverse.org/articles/articles/readxl-workflows.html#csv-caching-and-iterating-over-sheets.
The first problem I face is how to read all 4 worksheets with read_excel(path, sheet = "animals", "features", "r_words", "verbs"). This only works for the first worksheet, so I tried to make a list with all the sheet names (the object sheet). That is also not working. And when I try to use the following code with just one worksheet, the next line throws an error:
Error in basename(.) : a character vector argument expected
So, here is a part of my code, hopefully fulfilling the requirements:
filenames <- list.files("data", pattern = '\\.xlsm',full.names = TRUE)
# indices
subfile_nos <- 1:length(filenames)
# function to read all the sheets in at once and cache to csv
read_then_csv <- function(sheet, path) {
  for (i in 1:length(filenames)) {
    sheet <- excel_sheets(filenames[i])
    len.sheet <- 1:length(sheet)
    path <- read_excel(filenames[i], sheet = sheet[i]) # only reading in the first sheet
    pathbase <- path %>%
      basename() %>% # Error in basename(.) : a character vector argument expected
      tools::file_path_sans_ext()
    path %>%
      read_excel(sheet = sheet) %>%
      write_csv(paste0(pathbase, "-", sheet, ".csv"))
  }
}
You should do a double loop or a nested map, like so:
library(dplyr)
library(purrr)
library(readr)   # provides write_csv
library(readxl)
# I suggest looking at
?purrr::map_df
# Function to read all the sheets in at once and save as csv
read_then_csv <- function(input_filenames, output_file) {
  # Iterate over files and concatenate results
  map_df(input_filenames, function(f) {
    # Iterate over sheets and concatenate results
    excel_sheets(f) %>%
      map_df(function(sh) {
        read_excel(f, sh)
      })
  }) %>%
    # Write csv
    write_csv(output_file)
}
# Test function
filenames <- list.files("data", pattern = '\\.xlsm',full.names = TRUE)
read_then_csv(filenames, 'my_output.csv')
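(If you also want a column recording which file each row came from, map_df() can add one via its .id argument when the input is named; a sketch, with read_then_csv_tagged as a made-up name:)
read_then_csv_tagged <- function(input_filenames, output_file) {
  # Naming the paths gives .id its labels
  map_df(setNames(input_filenames, basename(input_filenames)),
         function(f) {
           excel_sheets(f) %>% map_df(function(sh) read_excel(f, sh))
         },
         .id = "source_file") %>%
    write_csv(output_file)
}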
You say...'I want to concatenate every worksheet under the one before'... The script below will combine all sheets from all files. Test it on a COPY of your data, in case it doesn't do what you want/need it to do.
# load names of excel files
files <- list.files(path = "C:\\your_path_here\\", full.names = TRUE, pattern = "\\.xlsx$")
# create function to read multiple sheets per excel file
read_excel_allsheets <- function(filename) {
  sheets <- readxl::excel_sheets(filename)
  # return a named list with one data.frame per sheet
  sapply(sheets, function(f) as.data.frame(readxl::read_excel(filename, sheet = f)),
         simplify = FALSE)
}
# execute function for all excel files in "files"
all_data <- lapply(files, read_excel_allsheets)
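Since the goal is to concatenate every worksheet under the one before, the nested list can then be flattened and row-bound in one step (a sketch, assuming the 12 columns really are identical across all sheets):
# all_data is a list of files, each a named list of sheet data frames;
# drop one level of nesting, then stack everything into a single data frame
combined <- do.call(rbind, unlist(all_data, recursive = FALSE))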
The code below asks the user for a path to .csv files, makes a list of the .csv filenames, then writes the contents of each .csv file to a sheet, in one .xlsx file. Each sheet is named after the original name of the .csv file.
My problem is that some of my .csv filenames are over 31 characters long, which is the limit for sheet names in Excel.
I want to put the sheet name in cell A1, write the contents of the file below that and give the sheets a name (1, 2, 3..., for instance), but I can't wrap my head around how I would accomplish this. Any suggestions would be greatly appreciated.
library(data.table) ## for fast fread() function
library(XLConnect)
library(svDialogs)
# Ask user for path to csv files
folder <- dlgInput(title = "Merge csv", "Enter path to csv files (use '/' instead of '\\': ", Sys.info()["user"])$res
setwd(folder)
# Create and load Excel file
wb <- loadWorkbook("Output.xlsx", create=TRUE)
# Get list of csv files
pattern.ext <- "\\.csv$"
files <- dir(folder, full=TRUE, pattern=pattern.ext)
# Use file names for sheet names
files.nms <- basename(files)
files.nms <- gsub(pattern.ext, "", files.nms)
# Set the names to make them easier to grab
names(files) <- files.nms
# Iterate over each csv and output to sheet in Excel with its name
for (nm in files.nms) {
  # Ingest csv file
  temp_DT <- fread(files[[nm]])
  # Create the sheet with the name
  createSheet(wb, name = nm)
  # Output the contents of the csv
  writeWorksheet(object = wb, data = temp_DT, sheet = nm, header = TRUE, rownames = NULL)
}
# Remove default sheets
removeSheet(wb, sheet = "Sheet1")
removeSheet(wb, sheet = "Sheet2")
removeSheet(wb, sheet = "Sheet3")
saveWorkbook(wb)
# Check to see if file exists
if (file.exists("Output.xlsx")) {
dlg_message("Your Excel file has been created.")$res
} else {
dlg_message("Error: Your file was not created. Please try again.")$res
}
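One way to realize the name-in-A1 idea with the XLConnect calls already in use, reusing the wb, files, and files.nms objects from above (a sketch, untested):
for (i in seq_along(files.nms)) {
  # Ingest csv file
  temp_DT <- fread(files[[i]])
  # Sheet names are just "1", "2", "3", ... to stay under the 31-character limit
  createSheet(wb, name = as.character(i))
  # The full (possibly >31 character) file name goes into cell A1
  writeWorksheet(object = wb, data = data.frame(files.nms[i]), sheet = as.character(i),
                 startRow = 1, startCol = 1, header = FALSE)
  # The data itself starts in row 2
  writeWorksheet(object = wb, data = temp_DT, sheet = as.character(i),
                 startRow = 2, startCol = 1, header = TRUE)
}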
EDIT - ALTERNATE SOLUTION:
Modified the code to take a substring of the file names up front...
# Use a substring of the file names for sheet names (Excel's sheet-name limit is 31),
# removing the extension first so it isn't truncated mid-way
files.nms <- gsub(pattern.ext, "", basename(files))
files.nms <- substr(files.nms, 1, 31)
FINAL EDIT: Added a welcome message and a function to ask the user for a path, which checks whether the path exists. If it does not, it keeps asking until the user cancels or enters an existing directory.
library(data.table) # fread() function
library(XLConnect) # Excel and csv files
library(svDialogs) # Dialog boxes
# Welcome message
dlg_message("This program will merge one or more .csv files into one Excel file. When prompted, enter the path
where the .csv files are located. Sheet names in the Excel file will consist of a substring of the original filename.")$res
# Function to get user path
getPath <- function() {
  # Ask for path
  path <- dlgInput("Enter path to .csv files: ", Sys.info()["user"])$res
  if (dir.exists(path)) {
    # If it is, set it as the working directory and return it
    # (setwd() alone returns the *previous* directory, so folder would be wrong)
    setwd(path)
    return(path)
  } else {
    # If not, issue an error and recall the getPath function
    dlg_message("Error: The path you entered is not a valid directory. Please try again.")$res
    getPath()
  }
}
# Call getPath function
folder <- getPath()
# Create and load Excel file
wb <- loadWorkbook("Combined.xlsx", create=TRUE)
# Get list of csv files in directory
pattern.ext <- "\\.csv$"
files <- dir(folder, full=TRUE, pattern=pattern.ext)
# Use substring of file names for sheet names (Excel limit) and remove extension
files.nms <- substr(basename(files),9,39)
files.nms <- gsub(pattern.ext, "", files.nms)
# Set the names
names(files) <- files.nms
# Iterate over each .csv and output to Excel sheet
for (nm in files.nms) {
  # Read in .csv files
  df <- fread(files[nm])
  # Create the sheet and name as substr of file name
  createSheet(object = wb, name = nm)
  # Write the contents of the .csv to Excel
  writeWorksheet(object = wb, data = df, sheet = nm, header = TRUE, rownames = NULL)
  # Create a custom anonymous cell style
  cs <- createCellStyle(wb)
  # Wrap text
  setWrapText(object = cs, wrap = TRUE)
  # Set column width
  setColumnWidth(object = wb, sheet = nm, column = 1:50, width = -1)
}
saveWorkbook(wb)
# Check to see if Excel file exists and is greater than default file size
if (file.exists("Combined.xlsx") & file.size("Combined.xlsx") > 8731) {
dlg_message("Your Excel file has been created.")$res
} else {
dlg_message("Error: Your file may not have been created or compelted properly. Please verify and try again if necessary.")$res
}
Have you considered using openxlsx?
It's not Java dependent and it's pretty fully featured.
# Load package
library(openxlsx)
# Create workbook
wb <- createWorkbook()
# Define data.frames you want to write (adapt to your scenario)
df <- c("mtcars", "iris")
# Loop over the length of the dataframes defined above
for (i in 1:length(df)) {
  # Create a work sheet and call it the numeric value
  addWorksheet(wb, as.character(i))
  # In the first row, first column, specify the df name
  writeData(wb, i, df[i], startRow = 1, startCol = 1)
  # Write the data.frame to the second row, first column
  # (get() fetches the object whose name is stored in df[i])
  writeData(wb, i, get(df[i]), startRow = 2, startCol = 1)
}
# Save workbook
saveWorkbook(wb, "eg.xlsx", overwrite = TRUE)
I feel I am very close to the solution, but at the moment I can't figure out how to get there.
I've got the following problem.
In my folder "Test" I've got stacked data files with the names M1_1, M1_2, M1_3 and so on: Test/M1_1.dat, for example.
Now I want to separate the files, so that I get M1_1[1].dat, M1_1[2].dat, M1_1[3].dat and so on. I'd like to save these files in specific subfolders: Test/M1/M1_1[1], Test/M1/M1_1[2] and so on, and Test/M2/M1_2[1], Test/M2/M1_2[2] and so on.
I have already created the subfolders. And I've got the following command to split up the files so that I get M1_1.dat[1] and so on:
for (e in dir(path = "Test/", pattern = ".dat", full.names=TRUE, recursive=TRUE)){
data <- read.table(e, header=TRUE)
df <- data[ -c(2) ]
out <- split(df , f = df$.imp)
lapply(names(out),function(z){
write.table(out[[z]], paste0(e, "[",z,"].dat"),
sep="\t", row.names=FALSE, col.names = FALSE)})
}
Now the paste0 command gets me my desired split-up data (although it's M1_1.dat[1] instead of M1_1[1].dat), but I can't figure out how to get this data into my subfolders.
Maybe you've got an idea?
Thanks in advance.
I don't have any idea what your data looks like, so I am going to attempt to recreate the scenario with the gender datasets available at baby names.
Assuming all the files from the zip folder are stored in "inst/data":
Store all file paths in the all_fi variable:
all_fi <- list.files("inst/data",
full.names = TRUE,
recursive = TRUE,
pattern = "\\.txt$")
> head(all_fi, 3)
[1] "inst/data/yob1880.txt" "inst/data/yob1881.txt" "inst/data/yob1882.txt"
Define a function that will be applied to each file in the directory:
# These imports are implied by the pipeline below:
library(dplyr)     # %>%, select, mutate
library(jsonlite)  # rbind.pages (renamed rbind_pages in newer versions)
library(tools)     # file_path_sans_ext
f.it <- function(f_in = NULL){
  # Create the new folder based on the existing basename of the input file
  new_folder <- file_path_sans_ext(f_in)
  dir.create(new_folder)
  data.table::fread(f_in) %>%
    select(name = 1, gender = 2, freq = 3) %>%
    mutate(
      gender = ifelse(grepl("F", gender), "female", "male")
    ) %>% (function(x){
      # Dataset contains names for males and females,
      # so that's what I'm using to mimic your split
      out <- split(x, x$gender)
      o <- rbind.pages(
        lapply(names(out), function(i){
          # New filename for each iteration of the split dataframes
          ###### THIS IS WHERE YOU NEED TO TWEAK FOR YOUR NEEDS
          new_dest_file <- sprintf("%s/%s.txt", new_folder, i)
          # Write the sub-data-frame to the new file
          data.table::fwrite(out[[i]], new_dest_file)
          # For our purposes return a dataframe with file info on the new
          # files...
          data.frame(
            file_name = new_dest_file,
            file_size = file.size(new_dest_file),
            stringsAsFactors = FALSE)
        })
      )
      o
    })
}
Now we can just loop through:
NOTE: for my purposes I'm not going to spend time looping through each file; for your purposes this would apply to each of your initial files, i.e. in my case all_fi rather than all_fi[2:5].
> rbind.pages(lapply(all_fi[2:5], f.it))
                     file_name file_size
1 inst/data/yob1881/female.txt     16476
2   inst/data/yob1881/male.txt     15306
3 inst/data/yob1882/female.txt     18109
4   inst/data/yob1882/male.txt     16923
5 inst/data/yob1883/female.txt     18537
6   inst/data/yob1883/male.txt     15861
7 inst/data/yob1884/female.txt     20641
8   inst/data/yob1884/male.txt     17300
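Applied back to the filenames in the question, the marked tweak might look something like this (a sketch, untested, assuming the subfolder is derived from the number after the underscore, as in M1_2 -> Test/M2):
# Inside the lapply over names(out), replacing the marked new_dest_file line:
base <- tools::file_path_sans_ext(basename(f_in))      # e.g. "M1_2"
subfolder <- file.path("Test", sub(".*_", "M", base))  # e.g. "Test/M2"
dir.create(subfolder, showWarnings = FALSE, recursive = TRUE)
new_dest_file <- file.path(subfolder, sprintf("%s[%s].dat", base, i))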
After having searched for help in different threads on this topic, I still have not become wiser. Therefore: Here comes another question on looping through multiple data files...
OK. I have multiple CSV files in one folder containing 5 columns of data. The filenames are as follows:
Moist yyyymmdd hh_mm_ss.csv
I would like to create a script that reads and processes the CSV files one by one, doing the following steps:
1) load the file
2) check the number of rows and exclude the file if it has fewer than 3 registrations
3) calculate the mean value of all measurements (= rows) for column 2
4) calculate the mean value of all measurements (= rows) for column 4
5) output the filename timestamp, mean of column 2 and mean of column 4 to a data frame
I have written the following function
moist.each.mean <- function() {
  library("tcltk")
  directory <- tk_choose.dir("", "Choose folder for Humidity data files")
  setwd(directory)
  filelist <- list.files(path = directory)
  filetitles <- regmatches(filelist, regexpr("[0-9].*[0-9]", filelist))
  mdf <- data.frame(timestamp = character(), humidity = numeric(), temp = numeric())
  for (i in 1:length(filelist)) {
    file.in[[i]] <- read.csv(filelist[i], header = F)
    if (nrow(file.in[[i]] < 3)) {
      print("discard")
    } else {
      newrow <- c(filetitles[[i]], round(mean(file.in[[i]]$V2), 1), round(mean(file.in[[i]]$V4), 1))
      mdf <- rbind(mdf, newrow)
    }
  }
  names(mdf) <- c("timestamp", "humidity", "temp")
}
but I keep getting an error:
Error in `[[<-.data.frame`(`*tmp*`, i, value = list(V1 = c(10519949L, :
replacement has 18 rows, data has 17
Any ideas?
Thx, kruemelprinz
I'd also suggest using (l)apply... Here's my take:
getMeans <- function(fpath,
                     target_cols = c(2),
                     sep = ",",
                     dec = ".",
                     header = TRUE,
                     min_obs_threshold = 3){
  f <- list.files(fpath)
  # note the escaped dot: "\.csv" is not a valid R string
  fcsv <- f[grepl("\\.csv$", f)]
  fcsv <- paste0(fpath, fcsv)
  csv_list <- lapply(fcsv, read.table, sep = sep,
                     dec = dec, header = header)
  csv_rows <- sapply(csv_list, nrow)
  # drop files with fewer rows than the threshold
  rel_csv_list <- csv_list[!(csv_rows < min_obs_threshold)]
  lapply(rel_csv_list, function(x) colMeans(x[, target_cols]))
}
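A hypothetical example call (note that fpath needs a trailing separator, since the file names are pasted directly onto it, and header = FALSE matches the headerless files from the question):
means <- getMeans("C:/Users/Work/HumidityData/", target_cols = c(2, 4), header = FALSE)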
Also with that kind of error message, the debugger might be very helpful.
Just run debug(moist.each.mean) and execute the function stepwise.
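For example:
debug(moist.each.mean)    # flag the function for step-through debugging
moist.each.mean()         # now runs in the browser; 'n' steps to the next line, 'Q' quits
undebug(moist.each.mean)  # remove the flag when done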
Here's a slightly different approach. Use lapply to read each csv file, exclude it if necessary, otherwise create a summary. This gives you a list where each element is a data frame summary. Then use rbind to create the final summary data frame.
Without a sample of your data, I can't be sure the code below exactly matches your problem, but hopefully it will be enough to get you where you want to go.
# Get vector of filenames to read
filelist = list.files(path=directory, pattern="\\.csv$")
# Read all the csv files into a list and create summaries
df.list = lapply(filelist, function(f) {
  # header=FALSE so columns get the default names V1, V2, ... used below
  file.in = read.csv(f, header=FALSE, stringsAsFactors=FALSE)
  # Set to empty data frame if file has less than 3 rows of data
  if (nrow(file.in) < 3) {
    print(paste("Discard", f))
  # Otherwise, capture file timestamp and summarise data frame
  } else {
    data.frame(timestamp=substr(f, 7, 22),
               humidity=round(mean(file.in$V2),1),
               temp=round(mean(file.in$V4),1))
  }
})
# Bind list into final summary data frame (excluding the list elements
# that don't contain a data frame because they didn't have enough rows
# to be included in the summary)
result = do.call(rbind, df.list[sapply(df.list, is.data.frame)])
One issue with your original code is that you create a vector of summary results rather than a data frame of results:
c(filetitles[[i]], round(mean(file.in[[i]]$V2),1), round(mean(file.in[[i]]$V4),1)) is a vector with three elements. What you actually want is a data frame with three columns:
data.frame(timestamp=filetitles[[i]],
humidity=round(mean(file.in[[i]]$V2),1),
temp=round(mean(file.in[[i]]$V4),1))
Thanks for the suggestions using lapply. This is definitely of value, as it saves a whole lot of code! Meanwhile, I managed to fix my original code as well:
library("tcltk")
# directory: path to csv files
directory <-
tk_choose.dir("","Choose folder for Humidity data files")
setwd(directory)
filelist <- list.files(path = directory)
filetitles <-
regmatches(filelist, regexpr("[0-9].*[0-9]", filelist))
mdf <- data.frame()
for (i in 1:length(filelist)) {
file.in <- read.csv(filelist[i], header = F, skipNul = T)
if (nrow(file.in) < 3) {
print("discard")
} else {
newrow <-
matrix(
c(filetitles[[i]], round(mean(file.in$V2, na.rm=T),1), round(mean(file.in$V4, na.rm=T),1)), nrow = 1, ncol =
3, byrow = T
)
mdf <- rbind(mdf, newrow)
}
}
names(mdf) <- c("timestamp", "humidity", "temp")
Only I did not get it to work as a function: then I would only have one row in mdf, containing the last file's data. Somehow it did not add rows but overwrote row 1 with each iteration. Using it without a function wrapper worked fine, though...
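(A likely explanation, for anyone hitting the same thing: an R function returns the value of its last expression, and here that is the names(mdf) <- ... assignment, which yields the character vector rather than the data frame; mdf is also local to the function. A sketch of the fix, untested against the original data:)
  # ... same body as above, then end the function with:
  names(mdf) <- c("timestamp", "humidity", "temp")
  return(mdf)  # explicitly return the data frame as the last expression

mdf <- moist.each.mean()  # capture the result in the calling environment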