I have a list of subsets obtained through:
lapply(1:5, function(x) combn(5,x))
I would like to extract a specific vector from this list. For example, the 16th element of this list, which is (1,2,3). Any hints? Thanks.
The command produces all the subsets of (1,2,3,4,5), which is a list of 2^5=32 subsets. The 16th being (1,2,3). I want to know how to extract this by using its position (16th).
We could try by splitting (split) the matrix to a list of vectors for each list elements, concatenate c the output to flatten the list, and subset using the numeric index.
lst2 <- do.call(`c`,lapply(lst, function(x) split(x, col(x))))
lst2[[16]]
#[1] 1 2 3
Or instead of splitting the matrix output, we could use the FUN argument within combn to create list and then concatenate c using do.call
lst <- do.call(`c`,lapply(1:5, function(x) combn(5, x, FUN=list)))
lst[[16]]
#[1] 1 2 3
Or instead of do.call(c,..), we can use (contributed by #Marat Talipov)
lst <- unlist(lapply(1:5, function(x)
combn(5, x, FUN=list)), recursive=FALSE)
data
lst <- lapply(1:5, function(x) combn(5,x))
I would rather consider producing the right data instead of looping again on them :)
lst = Reduce('c', lapply(1:5, function(x) as.list(data.frame(combn(5,x)))))
> lst[[16]]
[1] 1 2 3
Related
I have many lists contained within a larger list. I want to take the mean of the same entry in each of the lists.
Since I have many lists (100 plus) I want to be able to take the mean of all these lists by using 1:100 like I would for vector entries.
Here is some example, non working code.
## Make two lists
list_1 <- list()
list_1[[1]] <- 2
list_2 <- list()
list_2[[1]] <- 4
##Make a larger list to put the lists in
big_list <- list()
## Add lists to a larger list
big_list[[1]] <- list_1
big_list[[2]] <- list_2
## This doesn't work
mean(big_list[[1]][1], big_list[[2]][1], na.rm=TRUE)
## This does not work, but it is what I want
mean(big_list[[1:2]][1])
You can use getElement inside sapply to pick out a vector comprising the first (or any specified) element from each sub-list:
sapply(big_list, getElement, 1)
#> [1] 2 4
and therefore
mean(sapply(big_list, getElement, 1))
#> [1] 3
We could combine flatten from purrr package with unlist:
library(purrr)
x <- unlist(flatten(big_list))
x
mean(x, na.rm = TRUE)
[1] 3
This approach will do it as well:
sapply(1:length(big_list[[1]]), function(i) mean(sapply(big_list, function(x) x[[i]]), na.rm = TRUE))
I'm a total noob at R and I've tried (and retried) to search for an answer to the following problem, but I've not been able to get any of the proposed solutions to do what I'm interested in.
I have two lists of named elements, with each element pointing to data frames with identical layouts:
(EDIT)
df1 <- data.frame(A=c(1,2,3),B=c("A","B","C"))
df2 <- data.frame(A=c(98,99),B=c("Y","Z"))
lst1 <- c(X=df1,Y=df2)
df3 <- data.frame(A=c(4,5),B=c("D","E"))
lst2 <- c(X=df3)
(EDIT 2)
So it seems like storing multiple data frames in a list is a bad idea, as it will convert the data frames to lists. So I'll go out looking for an alternative way to store a set of named data frames.
In general the names of the elements in the two elements might overlap partially, completely, or not at all.
I'm looking for a way to merge the two lists into a single list:
<some-function-sequence>(lst1, lst2)
->
c(X=rbind(df1,df3),Y=df2)
-resulting in something like this:
[EDIT: Syntax changed to correctly reflect desired result (list-of-data frames)]
$X
A B
1 1 A
2 2 B
3 3 C
4 4 D
5 5 E
$X.B
A B
1 98 Y
2 99 Z
I.e:
IF the lists contain identical element names, each pointing to a data frame, THEN I want to 'rbind' the rows from these two data frames and assign the resulting data frame to the same element name in the resulting list.
Otherwise the element names and data frames from both lists should just be copied into the resulting list.
I've tried the solutions from a number of discussions such as:
Can I combine a list of similar dataframes into a single dataframe?
Combine/merge lists by elements names
Simultaneously merge multiple data.frames in a list
Combine/merge lists by elements names (list in list)
Convert a list of data frames into one data frame
-but I've not been able to find the right solution. A general problem seems to be that the data frame ends up being converted into a list by the application of 'mapply/sapply/merge/...' - and usually also sliced and/or merged in ways which I am not interested in. :)
Any help with this will be much appreciated!
[SOLUTION]
The solution seems to be to change the use of c(...) when collecting data frames to list(...) after which the solution proposed by Pierre seems to give the desired result.
Here is a proposed solution using split and c to combine like terms. Please read the caveat at the bottom:
s <- split(c(lst1, lst2), names(c(lst1,lst2)))
lapply(s, function(lst) do.call(function(...) unname(c(...)), lst))
# $X.A
# [1] 1 2 3 4 5
#
# $X.B
# [1] "A" "B" "C" "D" "E"
#
# $Y.A
# [1] 98 99
#
# $Y.B
# [1] "Y" "Z"
This solution is based on NOT having factors as strings. It will not throw an error but the factors will be converted to numbers. Below I show how I transformed the data to remove factors. Let me know if you require factors:
df1 <- data.frame(A=c(1,2,3),B=c("A","B","C"), stringsAsFactors=FALSE)
df2 <- data.frame(A=c(98,99),B=c("Y","Z"), stringsAsFactors=FALSE)
lst1 <- c(X=df1,Y=df2)
df3 <- data.frame(A=c(4,5),B=c("D","E"), stringsAsFactors=FALSE)
lst2 <- c(X=df3)
If the data is stored in lists we can use:
lapply(split(c(lst1, lst2), names(c(lst1,lst2))), function(lst) do.call(rbind, lst))
The following solution is probably not the most efficient way. However, if I got your problem right this should work ;)
# Example data
# Some vectors
a <- 1:5
b <- 3:7
c <- rep(5, 5)
d <- 5:1
# Some dataframes, data1 and data3 have identical column names
data1 <- data.frame(a, b)
data2 <- data.frame(c, b)
data3 <- data.frame(a, b)
data4 <- data.frame(c, d)
# 2 lists
list1 <- list(data1, data2)
list2 <- list(data3, data4)
# Loop, wich checks for the dataframe names and rbinds dataframes with the same column names
final_list <- list1
used_lists <- numeric()
for(i in 1:length(list1)) {
for(j in 1:length(list2)) {
if(sum(colnames(list1[[i]]) == colnames(list2[[j]])) == ncol(list1[[i]])) {
final_list[[i]] <- rbind(list1[[i]], list2[[j]])
used_lists <- c(used_lists, j)
}
}
}
# Adding the other dataframes, which did not have the same column names
for(i in 1:length(list2)) {
if((i %in% used_lists) == FALSE) {
final_list[[length(final_list) + 1]] <- list2[[i]]
}
}
# Final list, which includes all other lists
final_list
I have a vector of strings like:
vector=c("a","hb","cd")
and also I have a matrix which has a column, each element of this column is a list of strings which separated by "|" separator, like:
1 "ab|hb"
2 "ab|hbc|cd"
I want to find each string of vector appears in which row of matrix completely.
For the above vector, the result is:
NA, 1, 2
You can use strsplit for splitting strings:
x <- strsplit("ab|hbc|cd", split="|", fixed=T)
and then check if values of vector appear in the data, e.g.
sapply(vector, function(x) x %in% strsplit("a|ab|cd|efg|bh",
split="|", fixed=T)[[1]])
Warning: strsplit outputs data as a list, so in the example above I extract only the first element of the list with [[1]], however you can deal with it in other way if you choose.
EDIT: answering to your question on data as a vector:
data <- c("ab|cd|ef", "aaa|b", "ab", "wf", "fg|hb|a", "cd|cd|df")
sapply(sapply(data, function(x) strsplit(x, split="|", fixed=T)[[1]]),
function(y) sapply(vector, function(z) z %in% y))
Here's an approach using regular expressions:
# Example data
vector <- c("a","hb","cd")
mat <- matrix(c("ab|hb", "ab|hbc|cd"), nrow = 2)
sapply(paste0("\\b", vector, "\\b"), function(x)
if(length(tmp <- grep(x, mat[ , 1]))) tmp else NA,
USE.NAMES = FALSE)
# [1] NA 1 2
This might be very trivial, but I haven't found an answer yet (and I'm quite new to R).
I have a list containing a bunch of matrices. Each matrix in the list has the same number of rows and rownames.
How to I access say, the second row of each matrix in the list?
Use lapply
x <-matrix(1:9, 3, dimnames=list(LETTERS[1:3], letters[24:26])) # creating a matrix
mylist <- list(x, 2*x, 3*x, 4*x) # creating the list
lapply(mylist, function(x) x['B',]) # By name
sapply(mylist, function(x) x['B',]) # alternative
lapply(mylist, function(x) x[2,]) # By index
sapply(mylist, function(x) x[2,])
I have a dataframe, df and a function process that returns a list of two dataframes, a and b. I use dlply to split up the df on an id column, and then return a list of lists of dataframes. Here's sample data/code that approximates the actual data and methods:
df <- data.frame(id1=rep(c(1,2,3,4), each=2))
process <- function(df) {
a <- data.frame(d1=rnorm(1), d2=rnorm(1))
b <- data.frame(id1=df$id1, a=rnorm(nrow(df)), b=runif(nrow(df)))
list(a=a, b=b)
}
require(plyr)
output <- dlply(df, .(id1), process)
output is a list of lists of dataframes, the nested list will always have two dataframes, named a and b. In this case the outer list has a length 4.
What I am looking to generate is a dataframe with all the a dataframes, along with an id column indicating their respective value (I believe this is left in the list as the split_labels attribute, see str(output)). Then similarly for the b dataframes.
So far I have in part used this question to come up with this code:
list <- unlist(output, recursive = FALSE)
list.a <- lapply(1:4, function(x) {
list[[(2*x)-1]]
})
all.a <- rbind.fill(list.a)
Which gives me the final a dataframe (and likewise for b with a different subscript into list), however it doesn't have the id column I need and I'm pretty sure there's got to be a more straightforward or elegant solution. Ideally something clean using plyr.
Not very clean but you can try something like this (assuming the same data generation process).
list.aID <- lapply(1:4, function(x) {
cbind(list[[(2*x) - 1]], list[[2*x]][1, 1, drop = FALSE])
})
all.aID <- rbind.fill(list.aID)
all.aID
all.aID
d1 d2 id1
1 0.68103 -0.74023 1
2 -0.50684 1.23713 2
3 0.33795 -0.37277 3
4 0.37827 0.56892 4