I want to read three different files in xlsx and save them in three different dataframes called excel1, excel2 and excel3. How can I do that? I think it should be something like this:
files = list.files(pattern='[.]xlsx') #There are three files.
for (i in 1:files){
"excel" + i =read.xlsx(files[i])
}
I suggest using a list instead of creating three variables in the current workspace:
dfList <- list()
for (i in seq_along(files)){
dfList[[paste0("excel",i)]] <- read.xlsx(files[i])
}
Then you can access them by name:
dfList$excel1
dfList$excel2
dfList$excel3
or :
dfList[[1]]
dfList[[2]]
dfList[[3]]
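The same named list can also be built without an explicit loop. The sketch below is only an illustration: it writes three small CSV files to a temporary folder and reads them with read.csv as a stand-in for read.xlsx, so that it runs without any Excel package. The folder name, file names and values are made up for the example.

```r
# Create three small CSV files in a temporary folder (stand-ins for the xlsx files).
tmp_dir <- file.path(tempdir(), "excel_demo")
dir.create(tmp_dir, showWarnings = FALSE)
for (i in 1:3) {
  write.csv(data.frame(id = 1:2, value = i * c(10, 20)),
            file.path(tmp_dir, paste0("file", i, ".csv")),
            row.names = FALSE)
}

# full.names = TRUE returns paths that can be read from any working directory.
files <- list.files(tmp_dir, pattern = "[.]csv$", full.names = TRUE)

# lapply() returns one list element per file; setNames() labels them excel1, excel2, ...
dfList <- setNames(lapply(files, read.csv),
                   paste0("excel", seq_along(files)))

names(dfList)
dfList$excel2
```

With real workbooks, replacing read.csv with the reader you already use (and adjusting the pattern) should be the only change needed.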
But if you really want to create new variables, you can use the assign function:
for (i in seq_along(files)){
assign(paste0("excel",i), read.xlsx(files[i]))
}
# now excel1, excel2, excel3 variables exist...
You can also use plyr. It is good practice to specify the environment in which you want to create the variable:
library(plyr)
l_ply(1:length(files), function(i) assign(paste0('excel',i),read.xlsx(files[i]), envir=globalenv()))
If someone tries to use this code, these parameters are really helpful:
library(xlsx)
files = list.files(pattern='[.]xlsx')
dfList <- list()
for (i in 1:length(files)){
dfList[[paste0("excel",i)]] <- read.xlsx(files[i],header=T,stringsAsFactors=FALSE,sheetIndex = 1)
}
Related
I work with SAS files (sas7bdat = dataframes) and SAS formats (sas7bcat).
My sas7bdat files are in a "data" folder, so I can get a list of them in the object files_names.
Here is the first part of my code, working perfectly
files_names <- list.files(here("data"))
nb_files <- length(files_names)
data_names <- vector("list",length=nb_files)
for (i in 1 : nb_files) {
data_names[i] <- strsplit(files_names[i], split=".sas7bdat")
}
for (i in 1:nb_files) {
assign(data_names[[i]],
read_sas(paste(here("data", files_names[i])), "formats/formats.sas7bcat")
)}
but I get some issues when trying to apply function as_factor from package haven (in order to apply labels on my new dataframes and get like SEX = "Male" instead of SEX = 1).
I can make it work dataframe by dataframe like the code below
df_labelled <- haven::as_factor(df, only_labelled = TRUE)
I would like to do this in a loop, but it doesn't work because data_names[i] is a name rather than a dataframe, and as_factor requires a dataframe as its first argument.
I'm quite new to R, thank you very much if someone could help me.
You might want to think about using a different data structure. For example, you can store your dataframes in a named list and then easily loop through them.
In fact, you could do everything in one loop. I'm sure there's a more efficient way to do this, but here's one way that doesn't change your code too much:
library(haven)
library(here)
files_names <- list.files(here("data"))
raw_dfs <- list()
labelled_dfs <- list()
for (file_name in files_names) {
  # strsplit returns a list, so you could extract its first element like this:
  # df_name <- strsplit(file_name, split = ".sas7bdat", fixed = TRUE)[[1]]
  # or use sub; fixed = TRUE treats ".sas7bdat" as a literal string, not a regex
  df_name <- sub(".sas7bdat", "", file_name, fixed = TRUE)
  # use [[ ]] so that each whole dataframe is stored as a single list element
  raw_dfs[[df_name]] <- read_sas(here("data", file_name), "formats/formats.sas7bcat")
  labelled_dfs[[df_name]] <- haven::as_factor(raw_dfs[[df_name]], only_labelled = TRUE)
}
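Once the dataframes live in a named list, the labelling step can also be written with lapply. The sketch below assumes the haven package is installed and uses two toy dataframes (the names patients and visits and their values are made up) in place of files read by read_sas.

```r
library(haven)

# Two toy dataframes with a labelled column, standing in for files read by read_sas().
raw_dfs <- list(
  patients = data.frame(id = 1:3),
  visits   = data.frame(id = 1:2)
)
raw_dfs$patients$SEX <- labelled(c(1, 2, 1), labels = c(Male = 1, Female = 2))
raw_dfs$visits$SEX   <- labelled(c(2, 2),    labels = c(Male = 1, Female = 2))

# lapply() keeps the list names, so labelled_dfs$patients corresponds to raw_dfs$patients.
labelled_dfs <- lapply(raw_dfs, as_factor, only_labelled = TRUE)

labelled_dfs$patients$SEX
```

Keeping the raw and labelled versions in two parallel lists makes it easy to compare them by name later.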
I have imported an Excel file with multiple worksheets. The result is a list.
names(mysheets)
#[1] "test_sheet1" "test_sheet2"
test_sheet1 and test_sheet2 contain different matrices.
I have to turn each worksheet into an individual data frame.
If I do it manually, the code looks like this:
s_1 <- data.frame(mysheets[1])
s_2 <- data.frame(mysheets[2])
I tried to write a function to do it, because I have many Excel files and each file has multiple worksheets. Here is the function:
p_fun <- function (y) {
for (s_i in 1:2) {
for (i in 1:2) {
s_i<- data.frame(y[i])
return(s_i) }}}
It didn’t work correctly.
Appreciate if anyone can help.
Since mysheets is already a list, you could use lapply to convert each element to a data.frame
list_df <- lapply(mysheets, data.frame)
If you want them as separate dataframes, we can do
names(list_df) <- paste0('s_', seq_along(list_df))
list2env(list_df, .GlobalEnv)
We can use assign if we are doing this in a for loop
for(i in seq_along(mysheets)) assign(paste0("s", i), data.frame(mysheets[i]))
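For the case of many Excel files with several worksheets each, one option is to keep everything in a single flat, named list instead of creating separate variables. This is a minimal sketch under the assumption that each workbook has already been read into a named list of sheets. The workbooks object and the helper sheets_to_dfs are hypothetical and use toy matrices.

```r
# Toy stand-in for several workbooks, each a named list of sheets.
workbooks <- list(
  bookA = list(test_sheet1 = matrix(1:4, nrow = 2),
               test_sheet2 = matrix(1:6, nrow = 2)),
  bookB = list(test_sheet1 = matrix(1:2, nrow = 1))
)

# Convert every sheet of one workbook to a data frame, keeping the sheet names.
sheets_to_dfs <- function(sheets) lapply(sheets, as.data.frame)

# One list of data frames per workbook, then flatten by a single level.
# The resulting names combine workbook and sheet, e.g. "bookA.test_sheet1".
all_dfs <- unlist(lapply(workbooks, sheets_to_dfs), recursive = FALSE)

names(all_dfs)
```

Each element can then be reached as all_dfs[["bookA.test_sheet2"]], or passed to list2env if separate variables are really needed.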
I wrote a loop:
for(a in 1:100){
Code
list <- list("test1"=test1,"test2"=test2)
save(list, file = paste(paste("test",a,sep="_"),".RData",sep=""))
}
The iterative naming of the saved file works well, but I have not figured out a way to do the same for the list. The problem is that when I load the files into R, the objects are all called list, so they overwrite each other.
I have tried mv(from = "list", to = paste("test", a, sep = "_")) but it does not work.
Can anybody help me with this?
Indeed, this is a tricky point. save() expects the names of objects rather than their values, so save(eval(parse(text=paste0("list", a))), file = paste("test",a,".RData",sep="")) does not work. Your best bet, IMO, would be to save a single file, which might be more convenient anyway, and access the objects by name in a list of lists:
test1 <- 1
test2 <- 2
mylist <- list()
for(a in 1:100){
#assign(paste0("list",a), list("test1"=test1,"test2"=test2), environment())
mylist[[a]] <- list("test1"=test1,"test2"=test2)
}
save(mylist, file = "mylist.RData")
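If one file per iteration is still wanted, saveRDS and readRDS sidestep the naming problem, because they store a single value without its variable name. The following is a sketch only: the temporary output folder and the three iterations are chosen for illustration.

```r
test1 <- 1
test2 <- 2

# Write into a temporary folder so the example leaves the working directory untouched.
out_dir <- file.path(tempdir(), "rds_demo")
dir.create(out_dir, showWarnings = FALSE)

for (a in 1:3) {
  result <- list(test1 = test1 * a, test2 = test2 * a)
  # saveRDS() stores the value only, not the name "result".
  saveRDS(result, file = file.path(out_dir, paste0("test_", a, ".rds")))
}

# readRDS() returns the object, so the name is chosen when loading.
test_2 <- readRDS(file.path(out_dir, "test_2.rds"))
test_2$test1
```

Because the name is assigned on loading, several files can be read into the same session without overwriting each other.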
I have a simple question regarding a loop that I wrote. I want to access different files in different directories, extract data from these files and combine them into one table. My problem is that my loop does not accumulate the results of the different files; the table only contains the species that is currently in the loop. Here is my code:
for(i in 1:length(splist.par))
{
results<-read.csv(paste(getwd(),"/ResultsR10arcabiotic/",splist.par[i],"/","maxentResults.csv",sep=""),h=T)
species <- splist.par[i]
AUC <- results$Test.AUC[1:10]
AUC_SD <- results$AUC.Standard.Deviation[1:10]
Variable <- "a"
Resolution <- "10arc"
table <-cbind(species,AUC,AUC_SD,Variable,Resolution)
}
This is probably an easy question but I am not an experienced programmer. Thanks for the attention
Gabriel
I'd use lapply to get the desired data from each file and add the Species information, and then combine with rbind. Something like this (untested):
do.call(rbind, lapply(splist.par, function(x) {
d <- read.csv(file.path("ResultsR10arcabiotic", x, "maxentResults.csv"))
d <- d[1:10, c("Test.AUC", "AUC.Standard.Deviation")]
names(d) <- c("AUC", "AUC_SD")
cbind(Species=x, d, stringsAsFactors=FALSE)
}))
@Aaron's lapply answer is good and clean. But to debug your code: you put a bunch of data into table but overwrite table on every iteration. Initialize table <- NULL before the loop, then append the new rows inside it:
table <- rbind(table, cbind(species,AUC,AUC_SD,Variable,Resolution))
BTW, since table is a function in R, I'd avoid using it as a variable name. Imagine:
table(table)
:-)
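A common way to avoid overwriting the result in a loop is to collect each piece in a list and bind the pieces once at the end. The sketch below keeps the loop structure from the question but replaces read.csv with a made-up two-row data frame so that it runs on its own. The species names and AUC values are hypothetical.

```r
splist.par <- c("species_a", "species_b")  # hypothetical species names

# Preallocate one list slot per species.
pieces <- vector("list", length(splist.par))

for (i in seq_along(splist.par)) {
  # Stand-in for read.csv(): two rows of made-up values per species.
  results <- data.frame(Test.AUC = c(0.8, 0.9),
                        AUC.Standard.Deviation = c(0.01, 0.02))
  # Store this species' rows in its own slot instead of overwriting one object.
  pieces[[i]] <- data.frame(species = splist.par[i],
                            AUC = results$Test.AUC,
                            AUC_SD = results$AUC.Standard.Deviation,
                            Variable = "a",
                            Resolution = "10arc",
                            stringsAsFactors = FALSE)
}

# Bind all pieces row-wise in a single call.
auc_table <- do.call(rbind, pieces)
nrow(auc_table)
```

Using data.frame rather than cbind for each piece keeps the numeric columns numeric; cbind on mixed vectors would turn everything into character.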
I noticed I encounter this task quite often when programming in R, yet I don't think I implement it "pretty".
I get a list of file names, each containing a table or a simple vector. I want to read all the files into some construct (list of tables?) so I can later manipulate them in simple loops.
I know how to read each file into a table/vector, but I do not know how to put all these objects together in one structure (list?).
Anyway, I guess this is VERY routine so I'll be happy to hear about your tricks.
Do all the files have the same # of columns? If so, I think this should work to put them all into one dataframe.
library(plyr)
x <- c(FILENAMES)
df <- ldply(x, read.table, sep = "\t", header = T)
If they don't have all the same columns, then use llply() instead
Or, without plyr:
filenames <- c("file1.txt", "file2.txt", "file3.txt")
mydata <- vector("list", length(filenames))
for (i in seq_along(filenames))
{
  mydata[[i]] <- read.table(filenames[i])
}
You can have a look at my answer here: Merge several data.frames into one data.frame with a loop.