How to read multiple .xlsx files and generate multiple data frames in R?

I want to read three different files in xlsx and save them in three different dataframes called excel1, excel2 and excel3. How can I do that? I think it should be something like this:
files = list.files(pattern='[.]xlsx') #There are three files.
for (i in 1:files){
"excel" + i =read.xlsx(files[i])
}

I suggest using a list instead of creating three variables in the current workspace:
dfList <- list()
for (i in seq_along(files)){
dfList[[paste0("excel",i)]] <- read.xlsx(files[i])
}
Then you can access them like this:
dfList$excel1
dfList$excel2
dfList$excel3
or:
dfList[[1]]
dfList[[2]]
dfList[[3]]
But if you really want to create new variables, you can use the assign function:
for (i in seq_along(files)){
assign(paste0("excel",i), read.xlsx(files[i]))
}
# now the variables excel1, excel2, excel3 exist
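One detail worth checking in loops like these is how the index is built. The following is a minimal sketch, using a hypothetical empty files vector (as list.files returns when nothing matches the pattern), of why seq_along(files) is a safer index than 1:length(files):

```r
# Hypothetical case: no file matched the pattern, so the vector is empty.
files <- character(0)

# 1:length(files) counts down from 1 to 0, so a loop body would run twice.
idx_colon <- 1:length(files)

# seq_along(files) is empty, so a loop body would not run at all.
idx_safe <- seq_along(files)

print(idx_colon)  # 1 0
print(idx_safe)   # integer(0)
```

With a non-empty vector both forms give the same indices, so switching to seq_along costs nothing.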

You can also use plyr. It is good practice to specify the environment in which you want to create the variables:
library(plyr)
l_ply(seq_along(files), function(i) assign(paste0('excel',i),read.xlsx(files[i]), envir=globalenv()))

If someone tries to use this code, these parameters are really helpful:
library(xlsx)
files = list.files(pattern='[.]xlsx')
dfList <- list()
for (i in seq_along(files)){
dfList[[paste0("excel",i)]] <- read.xlsx(files[i],header=TRUE,stringsAsFactors=FALSE,sheetIndex = 1)
}

Related

Apply function to all dataframes

I work with SAS files (sas7bdat = dataframes) and SAS formats (sas7bcat).
My sas7bdat files are in a "data" folder, so I can get a list of them in the object files_names.
Here is the first part of my code, working perfectly
files_names <- list.files(here("data"))
nb_files <- length(files_names)
data_names <- vector("list",length=nb_files)
for (i in 1 : nb_files) {
data_names[i] <- strsplit(files_names[i], split=".sas7bdat")
}
for (i in 1:nb_files) {
assign(data_names[[i]],
read_sas(paste(here("data", files_names[i])), "formats/formats.sas7bcat")
)}
but I get some issues when trying to apply function as_factor from package haven (in order to apply labels on my new dataframes and get like SEX = "Male" instead of SEX = 1).
I can make it work dataframe by dataframe like the code below
df_labelled <- haven::as_factor(df, only_labelled = TRUE)
I would like to create a loop, but it didn't work because data_names[i] is a name rather than a dataframe, and as_factor requires a dataframe as its first argument.
I'm quite new to R, thank you very much if someone could help me.

You might want to think about using a different data structure: for example, a named list to hold your dataframes, which you can then loop through easily.
In fact you could do everything in one loop. I'm sure there's a more efficient way to do this, but here's an example that doesn't change your code too much:
library(haven)  # read_sas(), as_factor()
library(here)   # here()
files_names <- list.files(here("data"))
raw_dfs <- list()
labelled_dfs <- list()
for (file_name in files_names) {
# strsplit returns a list, so you could extract the first element like this:
# df_name <- strsplit(file_name, split = ".sas7bdat", fixed = TRUE)[[1]]
# or use something else, like gsub (fixed = TRUE treats the dot literally):
df_name <- gsub(".sas7bdat", "", file_name, fixed = TRUE)
# use [[ ]] so that the whole data frame is stored as a single list element
raw_dfs[[df_name]] <- read_sas(here("data", file_name), "formats/formats.sas7bcat")
labelled_dfs[[df_name]] <- haven::as_factor(raw_dfs[[df_name]], only_labelled = TRUE)
}
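Once the data frames sit in a named list, lapply can replace the explicit loop. The sketch below runs in base R only: the two small data frames are made-up stand-ins for the imported SAS tables, and label_sex is a hypothetical stand-in for haven::as_factor, so none of it reflects the asker's real data.

```r
# Made-up data frames standing in for the imported SAS tables.
raw_dfs <- list(
  patients = data.frame(ID = 1:3, SEX = c(1, 2, 1)),
  visits   = data.frame(ID = c(1, 1, 2), SEX = c(1, 1, 2))
)

# Hypothetical stand-in for haven::as_factor(): replace SEX codes with labels.
label_sex <- function(df) {
  df$SEX <- factor(df$SEX, levels = c(1, 2), labels = c("Male", "Female"))
  df
}

# lapply() applies the function to every element and keeps the list names.
labelled_dfs <- lapply(raw_dfs, label_sex)

print(labelled_dfs$patients)
```

With haven loaded, the same pattern would presumably read lapply(raw_dfs, haven::as_factor, only_labelled = TRUE), since lapply passes extra arguments on to the function.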

Convert a list into multiple data frame by list column

I have imported an Excel file with multiple worksheets. It is stored as a list.
names(mysheets)
#[1] "test_sheet1" "test_sheet2"
test_sheet1 and test_sheet2 hold different tables.
I need to turn each worksheet into an individual data frame.
Done manually, the code looks like this:
s_1 <- data.frame(mysheets[1])
s_2 <- data.frame(mysheets[2])
I tried to write a function to do it, because I have many Excel files and each file has multiple worksheets:
p_fun <- function (y) {
for (s_i in 1:2) {
for (i in 1:2) {
s_i<- data.frame(y[i])
return(s_i) }}}
It didn’t work correctly.
Appreciate if anyone can help.

mysheets is already a named list, so you can convert each element to a data.frame with lapply (mget would only be needed if the sheets existed as separate objects in the workspace):
list_df <- lapply(mysheets, data.frame)
If you want them as separate dataframes, we can do
names(list_df) <- paste0('s_', seq_along(list_df))
list2env(list_df, .GlobalEnv)

We can also use assign if we are doing this in a for loop:
for(i in seq_along(mysheets)) assign(paste0("s_", i), data.frame(mysheets[[i]]))
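The function in the question stops early because return() sits inside the loop, so it exits on the first iteration. Below is a minimal sketch of a reusable function that returns all sheets at once as a named list. The name sheets_to_dfs, its prefix argument, and the example mysheets list are all hypothetical, chosen only for illustration.

```r
# Made-up list standing in for the imported worksheets.
mysheets <- list(
  test_sheet1 = data.frame(a = 1:3, b = c("x", "y", "z")),
  test_sheet2 = data.frame(v = c(2.5, 3.5))
)

# Hypothetical helper: convert every sheet and return them together.
sheets_to_dfs <- function(sheets, prefix = "s_") {
  dfs <- lapply(sheets, as.data.frame)
  names(dfs) <- paste0(prefix, seq_along(dfs))
  dfs  # returned once, after all sheets have been converted
}

dfs <- sheets_to_dfs(mysheets)
print(names(dfs))

# Optional: copy the data frames into an environment as separate objects.
sheet_env <- new.env()
list2env(dfs, envir = sheet_env)
```

For several Excel files, calling the function with a different prefix per file keeps the resulting names from colliding.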

Iterative naming for a list created in a loop

I wrote a loop:
for(a in 1:100){
Code
list <- list("test1"=test1,"test2"=test2)
save(list, file = paste(paste("test",a,sep="_"),".RData",sep=""))
}
The iterative naming of the saved file works well, but I have not figured out how to do the same for the list. The problem is that when I load the files into R, every object is called list, so they clash.
I have tried mv(from = "list", to = paste("test", a, sep = "_")) but it does not work.
Can anybody help me with this?

Indeed this is a tricky point. save(eval(parse(text=paste0("list", a))), file = paste("test",a,".RData",sep="")) does not work because save expects the names of objects, not their values. Your best bet, IMO, would be to save one file only, which might be more convenient anyway, and access the objects by name in a list of lists:
test1 <- 1
test2 <- 2
mylist <- list()
for(a in 1:100){
#assign(paste0("list",a), list("test1"=test1,"test2"=test2), environment())
mylist[[a]] <- list("test1"=test1,"test2"=test2)
}
save(mylist, file = "mylist.RData")
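If one file per iteration is still wanted, a possible alternative is to create the object under its iterative name with assign and pass that name to save through its list argument. This is a minimal sketch under assumptions: it writes to a temporary directory, runs only three iterations, and the names out_dir, obj_name and check_env are made up for the example.

```r
test1 <- 1
test2 <- 2

# Temporary output directory so the sketch does not touch the working directory.
out_dir <- tempfile("rdata_demo_")
dir.create(out_dir)

env <- environment()  # the environment in which the objects are created
for (a in 1:3) {
  obj_name <- paste("test", a, sep = "_")
  # Create an object called test_1, test_2, ... holding the list.
  assign(obj_name, list(test1 = test1, test2 = test2), envir = env)
  # save() takes object names; list= accepts them as a character vector.
  save(list = obj_name, file = file.path(out_dir, paste0(obj_name, ".RData")), envir = env)
}

# Loading one file into a fresh environment restores the object under its own name.
check_env <- new.env()
load(file.path(out_dir, "test_2.RData"), envir = check_env)
print(ls(check_env))
```

saveRDS and readRDS are another option, since they store a single object without its name and let the reader choose the name on loading.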

merge tables in Loop using R

I have a simple question regarding a loop that I wrote. I want to access different files in different directories, extract data from these files, and combine it into one table. My problem is that my loop does not accumulate the results from the different files; the table only holds the species currently in the loop. Here is my code:
for(i in 1:length(splist.par))
{
results<-read.csv(paste(getwd(),"/ResultsR10arcabiotic/",splist.par[i],"/","maxentResults.csv",sep=""),h=T)
species <- splist.par[i]
AUC <- results$Test.AUC[1:10]
AUC_SD <- results$AUC.Standard.Deviation[1:10]
Variable <- "a"
Resolution <- "10arc"
table <-cbind(species,AUC,AUC_SD,Variable,Resolution)
}
This is probably an easy question but I am not an experienced programmer. Thanks for the attention
Gabriel

I'd use lapply to get the desired data from each file and add the Species information, and then combine with rbind. Something like this (untested):
do.call(rbind, lapply(splist.par, function(x) {
d <- read.csv(file.path("ResultsR10arcabiotic", x, "maxentResults.csv"))
d <- d[1:10, c("Test.AUC", "AUC.Standard.Deviation")]
names(d) <- c("AUC", "AUC_SD")
cbind(Species=x, d, stringsAsFactors=FALSE)
}))

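@Aaron's lapply answer is good, and clean. But to debug your code: you put a bunch of data into table but overwrite table on every iteration. You need to append the new rows to what is already there (after initializing the result, for example with NULL, before the loop):
table <- rbind(table, cbind(species,AUC,AUC_SD,Variable,Resolution))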
BTW, since table is a function in R, I'd avoid using it as a variable name. Imagine:
table(table)
:-)
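A common way to accumulate results without growing an object inside the loop is to collect one piece per file in a list and bind them once at the end. The sketch below is self-contained: it first writes made-up maxentResults.csv files for two hypothetical species into a temporary directory, so the values and the names base_dir, pieces and combined are assumptions for illustration only.

```r
# Build made-up input files for two hypothetical species.
base_dir <- tempfile("maxent_demo_")
splist.par <- c("sp_a", "sp_b")
for (sp in splist.par) {
  dir.create(file.path(base_dir, sp), recursive = TRUE)
  write.csv(data.frame(Test.AUC = 0.7 + (0:9) / 100,
                       AUC.Standard.Deviation = rep(0.02, 10)),
            file.path(base_dir, sp, "maxentResults.csv"),
            row.names = FALSE)
}

# Collect one data frame per species, then bind all rows in a single step.
pieces <- vector("list", length(splist.par))
for (i in seq_along(splist.par)) {
  results <- read.csv(file.path(base_dir, splist.par[i], "maxentResults.csv"))
  pieces[[i]] <- data.frame(species = splist.par[i],
                            AUC = results$Test.AUC[1:10],
                            AUC_SD = results$AUC.Standard.Deviation[1:10],
                            Variable = "a",
                            Resolution = "10arc",
                            stringsAsFactors = FALSE)
}
combined <- do.call(rbind, pieces)

print(dim(combined))
```

Using data.frame rather than cbind for each piece keeps AUC and AUC_SD numeric; cbind on mixed vectors would turn every column into character.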

Aggregate data from different files into data structure

I noticed I encounter this task quite often when programming in R, yet I don't think I implement it "pretty".
I get a list of file names, each containing a table or a simple vector. I want to read all the files into some construct (list of tables?) so I can later manipulate them in simple loops.
I know how to read each file into a table/vector, but I do not know how to put all these objects together in one structure (list?).
Anyway, I guess this is VERY routine so I'll be happy to hear about your tricks.

Do all the files have the same number of columns? If so, I think this should work to put them all into one dataframe.
library(plyr)
x <- c(FILENAMES)  # replace FILENAMES with your file names
df <- ldply(x, read.table, sep = "\t", header = TRUE)
If they don't all have the same columns, use llply() instead, which returns a list of data frames.

Or, without plyr:
filenames <- c("file1.txt", "file2.txt", "file3.txt")
mydata <- vector("list", length(filenames))  # pre-allocate an empty list
for (i in seq_along(filenames))
{
mydata[[i]] <- read.table(filenames[i])
}
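Naming the list elements after their files makes later loops easier to read. This is a minimal sketch assuming tab-separated files with a header row; the two input files are made up and written to a temporary directory so the example runs on its own.

```r
# Write two made-up tab-separated files to a temporary directory.
demo_dir <- tempfile("tables_demo_")
dir.create(demo_dir)
write.table(data.frame(x = 1:2, y = c(0.5, 1.5)),
            file.path(demo_dir, "file1.txt"), sep = "\t", row.names = FALSE)
write.table(data.frame(x = 3:5, y = c(2.5, 3.5, 4.5)),
            file.path(demo_dir, "file2.txt"), sep = "\t", row.names = FALSE)

# Read every file into a list and name each element after its file.
filenames <- list.files(demo_dir, pattern = "\\.txt$", full.names = TRUE)
mydata <- lapply(filenames, read.table, sep = "\t", header = TRUE)
names(mydata) <- tools::file_path_sans_ext(basename(filenames))

# Later manipulation: one result per table, labelled by file name.
row_counts <- sapply(mydata, nrow)
print(row_counts)
```

Individual tables can then be reached by name, for example mydata$file1 or mydata[["file2"]].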

You can have a look at my answer here: Merge several data.frames into one data.frame with a loop.
