Assign names to sublists in a list - r

I have a list that contains three sublists, each of those sublists containing two objects.
Now I am looking for an efficient way to assign names to those objects in the sublists. In this case, the single objects of each sublist are supposed to have the same names (Matrix1 and Matrix2).
Here is an easy reproducible example:
# create random matrices
matrix1 <- matrix(rnorm(36),nrow=6)
matrix2 <- matrix(rnorm(36),nrow=6)
# combine the matrices to three lists
sublist1 <- list(matrix1, matrix2)
sublist2 <- list(matrix1, matrix2)
sublist3 <- list(matrix1, matrix2)
# combine the lists to one top list
Toplist <- list(sublist1, sublist2, sublist3)
I can do this by using a for loop:
# assign names via for loop
for (i in 1:length(Toplist)) {
names(Toplist[[i]]) <- c("Matrix1", "Matrix2")
}
I am sure there must be a more elegant way using a nested lapply command. But I struggled to implement the names() command inside it.
Anybody with a hint?

Try Toplist <- lapply(Toplist, setNames, c("Matrix1", "Matrix2")). lapply returns a new list, so assign the result back to Toplist.
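A minimal sketch of that suggestion applied to data shaped like the question's, using the names the question asks for. The small 2 x 2 matrices are stand-ins for the random 6 x 6 ones.

```r
# small stand-in matrices (the question uses 6 x 6 random ones)
matrix1 <- matrix(1:4, nrow = 2)
matrix2 <- matrix(5:8, nrow = 2)
Toplist <- list(list(matrix1, matrix2),
                list(matrix1, matrix2),
                list(matrix1, matrix2))

# setNames(x, nm) returns x with names nm; lapply applies it to each sublist
Toplist <- lapply(Toplist, setNames, c("Matrix1", "Matrix2"))

# every sublist now carries the same two names
names(Toplist[[3]])
```

Because setNames takes the names as its second argument, they can be passed straight through lapply without writing an anonymous function.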

Related

List of lists with data frames [duplicate]

I know this topic has appeared on SO a few times, but the examples were often more complicated, and I would like an answer (or a set of possible solutions) to this simple situation. I am still wrapping my head around R and programming in general. Here I want to apply the lapply function or a simple loop to data, which is a list of three lists of vectors.
data1 <- list(rnorm(100),rnorm(100),rnorm(100))
data2 <- list(rnorm(100),rnorm(100),rnorm(100))
data3 <- list(rnorm(100),rnorm(100),rnorm(100))
data <- list(data1,data2,data3)
Now, I want to obtain the list of means for each vector. The result would be a list of three elements (lists).
I only know how to obtain a list of results for a single list of vectors, either with a loop:
for (i in 1:length(data1)){
means <- lapply(data1,mean)
}
or by:
lapply(data1,mean)
and I know how to get all the means using rapply:
rapply(data,mean)
The problem is that rapply does not maintain the list structure.
Help and possibly some tips/explanations would be much appreciated.
We can loop through the list of lists with a nested lapply/sapply:
lapply(data, sapply, mean)
It is otherwise written as
lapply(data, function(x) sapply(x, mean))
Or if you need the output with the list structure, a nested lapply can be used
lapply(data, lapply, mean)
Or with rapply, we can use the argument how to get what kind of output we want.
rapply(data, mean, how='list')
If we are using a for loop, we may need to create an object to store the results.
res <- vector('list', length(data))
for(i in seq_along(data)){
for(j in seq_along(data[[i]])){
res[[i]][[j]] <- mean(data[[i]][[j]])
}
}
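To make the difference between these variants concrete, here is a small sketch on a reduced version of the data (two sublists of two vectors each, an assumption made to keep the output short). It compares the shapes returned by the nested lapply, by rapply with how = "list", and by lapply with sapply.

```r
set.seed(1)  # only to make the sketch reproducible
data <- list(list(rnorm(10), rnorm(10)),
             list(rnorm(10), rnorm(10)))

nested  <- lapply(data, lapply, mean)        # list of lists of means
via_rap <- rapply(data, mean, how = "list")  # same nesting, via rapply
compact <- lapply(data, sapply, mean)        # list of numeric vectors

# inspect the nesting of the structure-preserving result
str(nested)
```

The first two keep one list level per level of the input, while the sapply variant collapses each inner list into a plain numeric vector.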

How can I run a for loop that subsets multiple lists all of which contain dataframes

I have 5 lists called "nightclub", "hospital", "bar", "attraction", "social_facility", all of which contain a data frame called osm_points. I want to create a new list of 5 dataframes, named after the original lists, that contain only the 3 columns "osm_id", "name", "addr.postcode" and no NA values in "addr.postcode". Below is my attempted code. I do not know another way to subset lists without $ (which gives me an error about an atomic vector) or without the square brackets. Let me know if you have some advice.
vectors <- c("osm_id","name","addr.postcode")
features <- c("nightclub", "hospital", "bar", "attraction", "social_facility")
datasets <- list()
n <- 0
for (i in features){
n <- n + 1
datasets[[n]] <- paste(i)[["osm_points"]][!is.na(paste(i)[["osm_points"]][["addr.postcode"]]), vectors]
}
I managed to do this operation without a for loop (below), but I want to be able to code better and do it all in one operation. Thanks so much for your help.
nightclub1 <- nightclub$osm_points[!is.na(nightclub$osm_points$addr.postcode), vectors]
Thanks !!
Try this code using lapply :
result <- lapply(mget(features), function(x)
x$osm_points[!is.na(x$osm_points$addr.postcode), vectors])
result should be a list of 5 dataframes, one for each element of features, containing only the columns in vectors and no NA values in addr.postcode.
You need to put your lists into a list to have something to iterate over.
As you have it, features is a character vector. In your first iteration i is "nightclub", a string. paste(i) changes nothing, it is still "nightclub". So your code becomes "nightclub"[["osm_points"]]..., but you need nightclub[["osm_points"]].
## make a list of lists
list_of_lists <- mget(features)
## then you can do
for(i in seq_along(list_of_lists)) {
datasets[[i]] <- list_of_lists[[i]][["osm_points"]]...
}
Substituting list_of_lists[[i]] wherever you currently have paste(i).
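The approach can be tried end to end on toy data. In the sketch below, the two small lists and their contents are hypothetical stand-ins for the real OSM results, and the envir argument is an addition so that the lookup also works when the code runs inside a function; the answers above call mget(features) directly.

```r
# hypothetical stand-ins for two of the OSM result lists
nightclub <- list(osm_points = data.frame(
  osm_id = 1:3, name = c("A", "B", "C"),
  addr.postcode = c("E1", NA, "E2"), other = 7:9))
bar <- list(osm_points = data.frame(
  osm_id = 4:5, name = c("D", "E"),
  addr.postcode = c(NA, "N1"), other = 1:2))

vectors  <- c("osm_id", "name", "addr.postcode")
features <- c("nightclub", "bar")

# mget() turns the character vector of names into a named list of the objects
datasets <- lapply(mget(features, envir = environment()), function(x) {
  pts <- x$osm_points
  # keep rows with a postcode, and only the requested columns
  pts[!is.na(pts$addr.postcode), vectors]
})

names(datasets)
```

Since mget returns a named list, the resulting data frames keep the names of the original lists without any extra bookkeeping.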

How to group set of data.frame objects in nested list with different order?

I have a set of data.frame objects in nested lists, and I want to group them by the name of the data.frame object. Because the data.frame objects are placed in a different order in each nested list, I have difficulty grouping them into a new list. I tried the transpose method from the purrr package on CRAN, but it did not give the answer I expected. Does anyone know a trick for doing this sort of grouping of data.frame objects more efficiently? Thanks a lot
example:
res_1 <- list(con=list(a.con_1=airquality[1:4,], b.con_1=iris[2:5,], c.con_1=ChickWeight[3:7,]),
dis=list(a.dis_1=airquality[5:7,], b.dis_1=iris[8:11,], c.dis_1=ChickWeight[12:17,]))
res_2 <- list(con=list(b.con_2=iris[7:11,], a.con_2=airquality[4:9,], c.con_2=ChickWeight[2:8,]),
dis=list(b.dis_2=iris[2:5,], a.dis_2=airquality[1:3,], c.dis_2=ChickWeight[12:15,]))
res_3 <- list(con=list(c.con_3=ChickWeight[10:15,], a.con_3=airquality[2:9,], b.con_3=iris[12:19,]),
dis=list(c.dis_3=ChickWeight[2:7,], a.dis_3=airquality[13:16,], b.dis_3=iris[2:7,]))
desired output:
group1_New <- list(con=list(a.con_1, a.con_2, a.con_3),
dis=list(a.dis_1, a.dis_2, a.dis_3))
group2_New <- list(con=list(b.con_1, b.con_2, b.con_3),
dis=list(b.dis_1, b.dis_2, b.dis_3))
group3_New <- list(con=list(c.con_1, c.con_2, c.con_3),
dis=list(c.dis_1, c.dis_2, c.dis_3))
Here is a twice nested for loop that creates the desired structure. There is likely a more efficient method.
# put the nested lists into a list:
myList <- list(res_1, res_2, res_3)
# make a copy of the list to preserve the structure for the new list
myList2 <- myList
for(i in seq_len(length(myList))) {
# get ordering of inner list names
myOrder <- rank(names(myList[[c(i,2)]]))
for(j in seq_len(length(myList[[i]]))) {
for(k in seq_len(length(myList[[c(i, j)]]))) {
# reorder content
myList2[[c(myOrder[k], j, i)]] <- myList[[c(i, j, k)]]
# rename element
names(myList2[[c(myOrder[k], j)]])[i] <- names(myList[[c(i, j)]])[k]
}
}
}
If desired, you could extract the list items after the loops.
The key to this solution is the realization that if you put these lists into a list, the result can be achieved by selectively reversing the indices of the list items. By selectively, I mean that I incorporate rank on the data.frame names to find the proper order for the inner-most loop.
In addition to reordering the data.frames as desired, I included a line to properly reset the names within the list.
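For comparison, here is a sketch of a different technique: matching on the name prefix with nested lapply calls instead of reversing indices. It is not the method described above. The helper df, the toy data frames, and the group1/group2 labels are all hypothetical, and only two res_* lists with two prefixes are used to keep it short.

```r
# toy stand-ins: each data frame just records its own label
df <- function(label) data.frame(label = label, stringsAsFactors = FALSE)
res_1 <- list(con = list(a.con_1 = df("a.con_1"), b.con_1 = df("b.con_1")),
              dis = list(a.dis_1 = df("a.dis_1"), b.dis_1 = df("b.dis_1")))
res_2 <- list(con = list(b.con_2 = df("b.con_2"), a.con_2 = df("a.con_2")),
              dis = list(b.dis_2 = df("b.dis_2"), a.dis_2 = df("a.dis_2")))

all_res  <- list(res_1, res_2)
prefixes <- c(group1 = "a", group2 = "b")   # hypothetical group labels
parts    <- c(con = "con", dis = "dis")

grouped <- lapply(prefixes, function(p) {
  lapply(parts, function(part) {
    # from every res_i, keep the element whose name starts with "<prefix>."
    picked <- lapply(all_res, function(res) {
      inner <- res[[part]]
      inner[startsWith(names(inner), paste0(p, "."))]
    })
    do.call(c, picked)  # flatten the one-element lists into one named list
  })
})

names(grouped$group2$con)
```

This relies on the naming convention in the question (a prefix followed by a dot) rather than on position, so the order inside each nested list does not matter.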

in R: execute function on dataframes whose names are in list

My global environment contains several dataframes. I want to execute functions on only those that contain a specific string in their name. So, I first create a list of these dataframes of interest:
dfs <- ls()[sapply(ls(), function(x) class(get(x))) == 'data.frame']
dfs <- as.data.frame(dfs)
dfs_lst <- agrep("stats", dfs$dfs, ignore.case=FALSE, value=TRUE,
max.distance=0.1, useBytes=FALSE)
dfs_lst correctly returns the names of all dataframes in my global environment containing the string "stats":
chr [1:3] "stats1" "stats2" "stats3"
Now, I want to execute functions on these 3 dataframes, however I do not know how to call them from the dfs_lst. I want something of the kind:
for(i in 1:length(dfs_lst)){
# Find dataframe name in dfs_lst, and then use the matching dataframe in
# global environment. So, something of the sort:
for(dfs_lst[i] in ls()){
result[i,] <- dfs_lst[i] %>%
summarise(. , <summarise stuff> )
}
}
For example, for i=1, dfs_lst[1] is dataframe "stats1", I would want to perform the following, and save it in the first row of "results":
for(stats1 in ls()){
result[1,] <- stats1 %>% summarise(. , <summarise stuff> )
}
As @lmo pointed out, it's probably best to store these data.frames together in a single list. Instead of having data.frame objects called "stats1", "stats2", etc., floating around in your environment, a (hacky) way to store all your data.frame objects in a list is this:
dfs <- ls()[sapply(ls(), function(x) class(get(x))) == 'data.frame']
##make an empty list
my_list <- list()
##populate the list
for (dfm_name in dfs) {
my_list[[dfm_name]] <- get(dfm_name)
}
Now you've got a list my_list containing every object of the class data.frame in your environment. This will probably be helpful when you want to work with all data.frames named "statsX":
##find all list objects whose name starts with "stats"
stats_objects <- substr(names(my_list),1,5)=="stats"
results <- matrix(NA, ncol = your_length, nrow = sum(stats_objects))
##now perform intended operations
for ( row_num in 1:nrow(results)) {
results[row_num,] <- my_list[stats_objects][[row_num]] %>%
summarise(. , <summarise stuff> )
}
This should perform as necessary after a couple of alterations to the code (e.g. your_length needs to be specified, and since you wanted all objects whose name contains "stats", you'll need to work with regular expressions).
What's nice about this is my_list contains all the data.frames, so if you choose to run analysis on data.frames not named "stats" you can still access them with a similar procedure. Hope this helps.
As discussed in the comments, if we have a list of the interesting data frames, it will be easier to deal with the elements as data frames. So the main issue here seems to be having just the object names and not the actual data.frame objects.
To follow the code and track the data types, I have decomposed it first:
1.
env.list <- ls() # chr vector
2.
env.classes <- sapply(env.list, function(x) class(get(x)))
# list of chr (containing classes), element names: data frame names
3.
dfs <- env.list[env.classes == 'data.frame'] # chr vector
4.
dfs <- as.data.frame(dfs)
# data frame with one column (named "dfs"), containing data.frame names
Now, we can get the list of data.frames:
3.
dfs <- env.list[env.classes == 'data.frame'] # chr vector
dfs.list <- sapply(dfs, get, simplify = FALSE) # named list of data frames; simplify = FALSE keeps sapply from collapsing them into a matrix
grep can be applied now to names(dfs.list) to get the interesting data frames.
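Putting the pieces together, a sketch of the whole flow might look like this. A private environment env stands in for the global environment so the example is self-contained, the objects in it are made up, and colMeans is only a placeholder for whatever summary is actually wanted.

```r
# a private environment stands in for the global one
env <- new.env()
env$stats1 <- data.frame(x = 1:3, y = 4:6)
env$stats2 <- data.frame(x = 7:9, y = 1:3)
env$other  <- data.frame(x = 0, y = 0)
env$n      <- 42                      # not a data frame; must be skipped

# fetch every object, keep the data frames, then keep names containing "stats"
all_objs  <- mget(ls(env), envir = env)
dfs.list  <- Filter(is.data.frame, all_objs)
stats.dfs <- dfs.list[grep("stats", names(dfs.list))]

# one summary row per data frame (column means as a placeholder summary)
result <- do.call(rbind, lapply(stats.dfs, colMeans))
result
```

Working on the list of data frames rather than on their names means the summary function receives the actual objects, which is what the loop in the question was missing.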

Converting specific parts of lists to a dataframe

I have a large list of 2 elements containing lists of species containing lists of 25 vectors, resembling a set like this:
l1 <- list(time=runif(100), space=runif(100))
l2 <- list(time=runif(100), space=runif(100))
list1 <- list(test1=list(species1=l1, species2=l2),test2=list(species1=l1, species2=l2))
I think it is essentially a list of lists of lists of vectors.
I want to create a data.frame from all space-vectors of all 'species' in just one of the two sublists:
final <- as.data.frame(cbind(unlist(list1[[2]]$species1$space), unlist(list1[[2]]$species2$space)))
names(final) <- names(list1[[2]])
Essentially, I need a loop/apply command that navigates through list1[[2]]$species and picks all vectors called space.
Thank you very much!
We can use nested lapply/sapply calls to extract the 'space' elements from every sublist:
data.frame(lapply(list1, function(x)
sapply(x, "[", 'space')))
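If only one of the two sublists is wanted, as in the question, a narrower sketch is to index that sublist first and extract with `[[`. The vectors are shortened to length 5 here purely for readability.

```r
l1 <- list(time = runif(5), space = runif(5))
l2 <- list(time = runif(5), space = runif(5))
list1 <- list(test1 = list(species1 = l1, species2 = l2),
              test2 = list(species1 = l1, species2 = l2))

# `[[` with "space" pulls the bare vector out of each species list
final <- as.data.frame(lapply(list1[[2]], `[[`, "space"))

# columns are named after the species
names(final)
```

Using `[[` rather than `[` returns the vector itself instead of a one-element list, so the columns come out named after the species without a ".space" suffix.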
