Combining dataframes into a list - r

I'm trying to store multiple dataframes in a list. However, at some point, the dataframes end up getting converted into lists, and so I end up with a list of lists.
All I'm really trying to do is keep all my dataframes together in some sort of structure.
Here's the code that fails:
all_dframes <- list() # initialise a list that will hold a dataframe as each item
for (file in filelist) { # load each file
  dframe <- read.csv(file) # read CSV file
  all_dframes[length(all_dframes)+1] <- dframe # add to the list
}
If I now call, for example, class(all_dframes[1]), I get 'list', whereas if I call class(dframe) I get 'data.frame'!

Of course, the class of all_dframes[1] is list since all_dframes is a list. The function [ returns a subset of the list. In this example, the length of the returned list is one. If you want to extract the data frame you have to use [[, i.e., all_dframes[[1]].
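A minimal sketch of the difference, using two small made-up data frames in place of the CSV files (the names df_a and df_b are illustrative only, not from the question). It also shows that assigning with [[ keeps each data frame whole:

```r
# Two toy data frames standing in for the files read by read.csv()
df_a <- data.frame(x = 1:3, y = c("a", "b", "c"))
df_b <- data.frame(x = 4:6, y = c("d", "e", "f"))

all_dframes <- list()
# Double brackets store the whole data frame as ONE list element
all_dframes[[length(all_dframes) + 1]] <- df_a
all_dframes[[length(all_dframes) + 1]] <- df_b

class(all_dframes[1])    # "list": a sub-list of length one
class(all_dframes[[1]])  # "data.frame": the element itself
```

With single brackets on the left-hand side, as in the question, R tries to spread the columns of the data frame over list elements instead of storing the data frame as a single item.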

May I suggest this:
library(data.table)
all_dframes <- vector("list", length(filelist))
for (i in seq_along(filelist)) { # load each file
  all_dframes[[i]] <- fread(filelist[i])
}
Is this what you need?
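For comparison, here is a base R sketch of the same idea without data.table. The temporary directory and the file names first.csv and second.csv are invented for the example so that it runs on its own; in practice filelist would be your own vector of paths.

```r
# Hypothetical setup: write two small CSV files to a temporary directory
tmp_dir <- tempfile("csv_demo_")
dir.create(tmp_dir)
write.csv(data.frame(id = 1:2, value = c(10, 20)),
          file.path(tmp_dir, "first.csv"), row.names = FALSE)
write.csv(data.frame(id = 3:4, value = c(30, 40)),
          file.path(tmp_dir, "second.csv"), row.names = FALSE)

filelist <- list.files(tmp_dir, pattern = "\\.csv$", full.names = TRUE)

# lapply() returns a list with one data frame per file
all_dframes <- lapply(filelist, read.csv)
# Name each element after its file so it can be looked up later
names(all_dframes) <- basename(filelist)

str(all_dframes[["first.csv"]])
```

Naming the elements is optional, but it makes it easier to tell which data frame came from which file.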

Related

In R, how to read a list of filenames into individual data frames?

I have a (long) list of .csv file names and want to read each .csv file into its own data frame in R.
... "./data/2019-Q2.csv"
"./data/2019-Q3.csv" ...
I thought this should work:
allDFs <- lapply(csvPath, read.csv)
But it just loops indefinitely and I have to stop it manually. Thanks for any help.
You can read in the data using list.files and lapply, as suggested in the OP comments. To make each list item a separate data frame, use the assign() function in a for loop:
d <- split(iris, f = iris$Species)
for (i in names(d)) {
  assign(i, d[[i]])
}
This uses the list names as the newly assigned variable name, so make sure this is set appropriately.
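As a possible alternative to the loop, base R's list2env() copies every element of a named list into an environment in one call. The sketch below writes into a scratch environment (called target here, a name chosen for the example) rather than the workspace:

```r
d <- split(iris, f = iris$Species)

# Copy every element of the named list into a new environment as its own
# variable; passing globalenv() instead would create them in the workspace
target <- new.env()
list2env(d, envir = target)

ls(target)
nrow(target$setosa)
```

As with assign(), the list must have names, since these become the variable names.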

R: How to get to data frame which is stored in vector?

I have 20 CSV files that I need to load.
I read them in a loop and added each data.frame to a vector.
In the end, the vector "list_df" has 20 elements holding the names of my 20 data frames.
Now I am trying to get at the data frames named in list_df, but it doesn't work.
Any ideas how I can reach the data frames named in that vector to do further calculations?
list_df[1][column_name]
or
list_df[1]$column_name
doesn't work
path<-'thats my path'
list_of_files<-list.files(path)
list_df<-c() #creating empty vector
for (i in 1:length(list_of_files)){
  assign(paste("dffile", list_of_files[i], sep=""),
         read.table(paste(path, list_of_files[i], sep=""), sep=",", header=TRUE))
  list_df[i] <- paste("dffile", list_of_files[i], sep="")
}
We can initialize list_df as a character vector
list_df <- character(length(list_of_files))
Now, the index based assignment should work.
As 'list_df' contains the object names as a string, if we need to get the values of those elements, use get (for single object) or mget (for all objects in a list)
get(list_df[1])
mget(list_df)
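A small self-contained sketch of that lookup, with two invented data frames instead of the 20 files. The scratch environment env and the column name score are assumptions made for the example; leaving out the envir argument makes assign, get and mget use the current environment, as in the calls above.

```r
env <- new.env()  # scratch environment so the sketch leaves the workspace alone

# Create two data frames under generated names and record the names
list_df <- character(2)
for (i in seq_along(list_df)) {
  obj_name <- paste0("dffile", i)
  assign(obj_name, data.frame(id = i, score = i * 10), envir = env)
  list_df[i] <- obj_name
}

first_df <- get(list_df[1], envir = env)   # one object, looked up by name
all_dfs  <- mget(list_df, envir = env)     # named list of all the objects

first_df$score               # column access on the retrieved data frame
all_dfs[["dffile2"]]$score   # same, via the list returned by mget()
```

Once the data frames are in the list returned by mget, the name vector is no longer needed for further calculations.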

Split dataframe in R by date

I have a data.frame that contains one Date type variable. I want to export 4 files, one containing a subset corresponding to each week. The following will divide my data in 4; however, I don't know how to store each of these in a new data.frame.
split(DataAir, sample(rep(1:4)))
Thanks
If you save your split data frames in a variable, you can access the elements with double-bracket subsetting (e.g. s[[1]]). To save them, create a vector of file names
as you'd like and write each to file.
s <- split(iris, iris$Species)
filenames <- paste0("my_path/file", 1:3, ".csv")
for(i in seq_along(s)) write.csv(s[[i]], filenames[i])
And for R users that get unnecessarily bugged out by for loops:
mapply(function(x,y) write.csv(x,y), s, filenames)
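To tie this back to the question, here is a hedged sketch of splitting by calendar week rather than by a factor column. The data frame below is a made-up stand-in for DataAir (28 consecutive days with a column assumed to be called Date), and the output goes to a temporary directory:

```r
# Made-up stand-in for DataAir: 28 daily readings starting on a Monday
DataAir <- data.frame(
  Date  = seq(as.Date("2024-01-01"), by = "day", length.out = 28),
  value = seq_len(28)
)

# cut() on a Date column with breaks = "week" labels each row by its week
weekly <- split(DataAir, cut(DataAir$Date, breaks = "week"))

# Write one CSV per week into a temporary directory
out_dir <- tempfile("weekly_")
dir.create(out_dir)
out_files <- file.path(out_dir, paste0("week_", seq_along(weekly), ".csv"))
for (i in seq_along(weekly)) {
  write.csv(weekly[[i]], out_files[i], row.names = FALSE)
}
```

By default cut() starts weeks on Monday, so data that begins mid-week will produce a shorter first group.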

Count the number of data frames beginning with prefix in R

I have a collection of data frames that I have generated in R. I need to count the number of data frames whose names begin with "entry_". I'd like to generate a number to then use for a function that rbinds all of these data frames and these data frames only.
So far, I have tried using grep to identify the data frames, however, this just returns where they are indexed in my object list (e.g., 16:19 --- objects 16-19 begin with "entry_"):
count_entry <- (grep("entry_", objects()))
Eventually I would like to rbind all of these data frames like so:
list.make <- function() {
  sapply(paste('entry_', seq(1:25), sep=''), get, environment(), simplify = FALSE)
}
all.entries <- list.make()
final.data <- rbind.fill(all.entries)
I don't want to have to enter the sequence manually every time (for example (1:25) in the code above), which is why I'm hoping to be able to automatically count the data frames beginning with "entry_".
If anyone has any ideas of how to solve this, or how to go about this in a better way, I'm all ears!
Per comment by docendo: The ls function will list objects in an environment that match a regex pattern. You can then use mget to retrieve those objects as a list:
mylist <- mget(ls(pattern = "^entry_"))
That will then work with rbind.fill. You can then remove the original objects using something similar; note that rm needs the names passed through its list argument: rm(list = ls(pattern = "^entry_"))
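A runnable sketch of the whole sequence follows. Two things differ from the answer and are assumptions of this example: the objects live in a scratch environment (env) instead of the workspace, and base R's do.call(rbind, ...) stands in for plyr's rbind.fill, which is only safe here because the toy data frames share the same columns.

```r
env <- new.env()  # scratch environment standing in for the workspace

# Three objects: two match the prefix, one should be ignored
assign("entry_1", data.frame(id = 1, val = "a"), envir = env)
assign("entry_2", data.frame(id = 2, val = "b"), envir = env)
assign("other",   data.frame(id = 9, val = "z"), envir = env)

entry_names <- ls(envir = env, pattern = "^entry_")
length(entry_names)                      # how many data frames matched

entries    <- mget(entry_names, envir = env)
final_data <- do.call(rbind, entries)    # base R; columns are identical here

rm(list = entry_names, envir = env)      # remove the originals by name
```

The count is available as length(entry_names), but it is not needed for the binding step, because mget already returns every matching object.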

R : how to append data frames in a list?

I am trying to produce data frames in a for loop.
How can I append these data frames to a list and then check whether any of them is empty?
I would like to remove the data frames with no rows from the list.
Any help is appreciated.
You should use lapply here rather than a for loop. The advantages are:
You want to create a list of data.frames, and lapply creates a list.
You do the job once; there is no need for two loops.
Something like:
lapply(seq_len(nbr_df), function(x) {
  ## code to create your data.frame dt
  ## dt = data.frame(...)
  if (nrow(dt) > 0) dt  ## an empty data frame yields NULL for that element
})
Second option: the data.frames have already been created in separate variables.
We assume that your variable names follow a certain pattern, say patt:
lapply(mget(ls(pattern=patt)),function(x)if(nrow(x)>0)x)
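To append to a list, you can do this:
Your_list <- list()
for (i in seq_len(numbOfPosibleDF)) { # assuming numbOfPosibleDF is a count
  k <- data.frame() # placeholder: build your data frame here
  if (nrow(k) != 0) {
    Your_list[[paste0("df", i)]] <- k # [[ ]] stores the whole data frame
  }
}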
I would just add valid data frames to the list instead of removing them afterwards. If you want or need to use a for loop (instead of lapply), you may use the following:
# init list
list.of.df <- list()
# start your loop to
# create data frame etc.
# ....
df <- data.frame(1,2)
# add to list
if (!is.null(df) && nrow(df)>0) list.of.df[[length(list.of.df)+1]] <- df
# ... end of loop here.
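If the list has already been built and may contain empty data frames, they can also be dropped afterwards. A minimal sketch with an invented three-element list (the element names a, b, c are illustrative):

```r
# Toy list: the middle element has zero rows
list.of.df <- list(
  a = data.frame(x = 1:2),
  b = data.frame(x = integer(0)),
  c = data.frame(x = 3L)
)

# TRUE for every element that has zero rows
is_empty <- vapply(list.of.df, function(d) nrow(d) == 0, logical(1))
is_empty

# Keep only the non-empty data frames
non_empty <- list.of.df[!is_empty]
names(non_empty)
```

The logical vector answers the "is any frame empty" part of the question, and the subsetting step does the removal.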
For the benefit of anyone finding this otherwise dead-end page by its title, the way to concatenate consistently formatted data.frames that are items of a list is with plyr:
rbind.fill.matrix(lst)
I would like to give a better picture of the scenario:
The frames may or may not have the same number of columns/rows.
The data frames are produced dynamically in a for loop.
The frames contain all data types: numeric, factor, character.
