I wish to combine equivalent, deeply-nested columns from all elements of a reasonably long list. What I would like to do, though it's not possible in R, is this:
combined.columns <- my.list[[1:length(my.list)]]$my.matrix[,"my.column"]
The only thing I can think of is to manually type out all the elements in cbind() like this:
combined.columns <- cbind(my.list[[1]]$my.matrix[,"my.column"], my.list[[2]]$my.matrix[,"my.column"], . . . )
This answer is pretty close to what I need, but I can't figure out how to make it work for the extra level of nesting.
There must be a more elegant way of doing this, though. Any ideas?
Assuming all your matrices share the name of the column you wish to extract, you could use sapply:
set.seed(123)
my.list <- vector("list")
my.list[[1]] <- list(my.matrix = data.frame(A=rnorm(10,sd=0.3), B=rnorm(10,sd=0.3)))
my.list[[2]] <- list(my.matrix = data.frame(C=rnorm(10,sd=0.3), B=rnorm(10,sd=0.3)))
my.list[[3]] <- list(my.matrix = data.frame(D=rnorm(10,sd=0.3), B=rnorm(10,sd=0.3)))
sapply(my.list, FUN = function(x) x$my.matrix[,"B"])
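A minimal sketch of a stricter variant, assuming every extracted column has the same known length (10 here). vapply does the same job as sapply but fails loudly if an element returns something of a different type or length; the object name `combined` is only illustrative.

```r
set.seed(123)
# Rebuild a small list in the same shape as the example above
my.list <- lapply(1:3, function(i) {
  list(my.matrix = data.frame(A = rnorm(10, sd = 0.3),
                              B = rnorm(10, sd = 0.3)))
})

# FUN.VALUE = numeric(10) declares that each element must yield 10 doubles
combined <- vapply(my.list, function(x) x$my.matrix[, "B"], numeric(10))

dim(combined)  # one column per list element
```

Each column of `combined` corresponds to one element of `my.list`, in order.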
Free data:
myList <- list(list(myMat = matrix(1:10, 2, dimnames=list(NULL, letters[1:5])),
myVec = 1:10),
list(myMat = matrix(10:1, 2, dimnames=list(NULL, letters[1:5])),
myVec = 10:1))
We can get column a of myMat a few different ways. Here's one that uses with.
sapply(myList, with, myMat[,"a"])
# [,1] [,2]
# [1,] 1 10
# [2,] 2 9
This mapply one might be better for a more recursive type problem. It works too and might be faster than sapply.
mapply(function(x, y, z) x[[y]][,z] , myList, "myMat", "a")
# [,1] [,2]
# [1,] 1 10
# [2,] 2 9
Related
I have two lists
first = list(a = 1, b = 2, c = 3)
second = list(a = 2, b = 3, c = 4)
I want to merge these two lists so the final product is
$a
[1] 1 2
$b
[1] 2 3
$c
[1] 3 4
Is there a simple function to do this?
If lists always have the same structure, as in the example, then a simpler solution is
mapply(c, first, second, SIMPLIFY=FALSE)
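One caveat worth illustrating: mapply pairs elements by position, not by name. The sketch below assumes the two lists carry the same names but possibly in a different order (the reordered `second` is a hypothetical variation on the question's data), and aligns them by name before merging.

```r
first  <- list(a = 1, b = 2, c = 3)
second <- list(c = 4, a = 2, b = 3)  # same names, different order (hypothetical)

# Reorder `second` to follow the names of `first`, then merge position-wise
merged <- mapply(c, first, second[names(first)], SIMPLIFY = FALSE)
merged
```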
This is a very simple adaptation of the modifyList function by Sarkar. Because it is recursive, it will handle more complex situations than mapply would. It also copes with mismatched names: items in 'second' that are not in 'first' are simply added to the result (c(NULL, value) returns the value unchanged).
appendList <- function (x, val)
{
stopifnot(is.list(x), is.list(val))
xnames <- names(x)
for (v in names(val)) {
x[[v]] <- if (v %in% xnames && is.list(x[[v]]) && is.list(val[[v]]))
appendList(x[[v]], val[[v]])
else c(x[[v]], val[[v]])
}
x
}
> appendList(first,second)
$a
[1] 1 2
$b
[1] 2 3
$c
[1] 3 4
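As a quick check of how appendList behaves on nested lists and on names that occur only in the second list, here is a small sketch. The function is repeated so the block runs on its own; the input lists are made up for illustration.

```r
appendList <- function(x, val) {
  stopifnot(is.list(x), is.list(val))
  xnames <- names(x)
  for (v in names(val)) {
    # Recurse when both sides hold a list under the same name,
    # otherwise concatenate (c(NULL, value) just yields value)
    x[[v]] <- if (v %in% xnames && is.list(x[[v]]) && is.list(val[[v]]))
      appendList(x[[v]], val[[v]])
    else c(x[[v]], val[[v]])
  }
  x
}

first  <- list(a = 1, sub = list(p = 1))
second <- list(a = 2, sub = list(p = 2, q = 3), extra = "z")

res <- appendList(first, second)
str(res)
```

Here `sub$p` is merged recursively, while `sub$q` and `extra`, which exist only in `second`, are carried over as they are.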
Here are two options, the first:
both <- list(first, second)
n <- unique(unlist(lapply(both, names)))
names(n) <- n
lapply(n, function(ni) unlist(lapply(both, `[[`, ni)))
and the second, which works only if they have the same structure:
apply(cbind(first, second),1,function(x) unname(unlist(x)))
Both give the desired result.
Here's some code that I ended up writing, based upon @Andrei's answer but without the elegance/simplicity. The advantage is that it allows a more complex recursive merge and also distinguishes between elements that should be connected with rbind and those that are just connected with c:
# Decided to move this outside the mapply, not sure this is
# that important for speed but I imagine redefining the function
# might be somewhat time-consuming
mergeLists_internal <- function(o_element, n_element){
if (is.list(n_element)){
# Fill in non-existent elements with NA elements
if (length(n_element) != length(o_element)){
n_unique <- names(n_element)[! names(n_element) %in% names(o_element)]
if (length(n_unique) > 0){
for (n in n_unique){
if (is.matrix(n_element[[n]])){
o_element[[n]] <- matrix(NA,
nrow=nrow(n_element[[n]]),
ncol=ncol(n_element[[n]]))
}else{
o_element[[n]] <- rep(NA,
times=length(n_element[[n]]))
}
}
}
o_unique <- names(o_element)[! names(o_element) %in% names(n_element)]
if (length(o_unique) > 0){
for (n in o_unique){
# n exists only in o_element here, so test the old element
if (is.matrix(o_element[[n]])){
n_element[[n]] <- matrix(NA,
nrow=nrow(o_element[[n]]),
ncol=ncol(o_element[[n]]))
}else{
n_element[[n]] <- rep(NA,
times=length(o_element[[n]]))
}
}
}
}
# Now merge the two lists
return(mergeLists(o_element,
n_element))
}
if(length(n_element)>1){
new_cols <- ifelse(is.matrix(n_element), ncol(n_element), length(n_element))
old_cols <- ifelse(is.matrix(o_element), ncol(o_element), length(o_element))
if (new_cols != old_cols)
stop("Your length doesn't match on the elements,",
" new element (", new_cols , ") !=",
" old element (", old_cols , ")")
# Vectors and matrices of matching width are stacked row-wise
return(rbind(o_element,
n_element,
deparse.level=0))
}
# Single values are simply concatenated
return(c(o_element,
n_element))
}
mergeLists <- function(old, new){
if (is.null(old))
return (new)
m <- mapply(mergeLists_internal, old, new, SIMPLIFY=FALSE)
return(m)
}
Here's my example:
v1 <- list("a"=c(1,2), b="test 1", sublist=list(one=20:21, two=21:22))
v2 <- list("a"=c(3,4), b="test 2", sublist=list(one=10:11, two=11:12, three=1:2))
mergeLists(v1, v2)
This results in:
$a
[,1] [,2]
[1,] 1 2
[2,] 3 4
$b
[1] "test 1" "test 2"
$sublist
$sublist$one
[,1] [,2]
[1,] 20 21
[2,] 10 11
$sublist$two
[,1] [,2]
[1,] 21 22
[2,] 11 12
$sublist$three
[,1] [,2]
[1,] NA NA
[2,] 1 2
Yeah, I know - perhaps not the most logical merge but I have a complex parallel loop that I had to generate a more customized .combine function for, and therefore I wrote this monster :-)
library(purrr)
merged = map(names(first), ~c(first[[.x]], second[[.x]]))
merged = set_names(merged, names(first))
Using purrr. This also solves the problem of your lists not being in the same order.
In general one could,
merge_list <- function(...) by(v<-unlist(c(...)),names(v),base::c)
Note that the by() solution returns an attributed list, so it will print differently, but will still be a list. But you can get rid of the attributes with attr(x,"_attribute.name_")<-NULL. You can probably also use aggregate().
We can do a lapply with c(), and use setNames to assign the original name to the output.
setNames(lapply(1:length(first), function(x) c(first[[x]], second[[x]])), names(first))
$a
[1] 1 2
$b
[1] 2 3
$c
[1] 3 4
Following the answers by @Aaron left Stack Overflow and @Theo, the merged list's elements are vectors created with c().
If you want to bind rows or columns instead, use rbind or cbind.
merged = map(names(first), ~rbind(first[[.x]], second[[.x]]))
merged = set_names(merged, names(first))
Using dplyr, I found that this line works for named lists using the same names:
as.list(bind_rows(first, second))
I have created a list whose elements are themselves a list of matrices. I want to be able to extract the vectors of observations for each variable
p13 = 0.493;p43 = 0.325;p25 = 0.335;p35 = 0.574;p12 = 0.868
std_e2 = sqrt(1-p12^2)
std_e3 = sqrt(1-(p13^2+p43^2))
std_e5 = sqrt(1-(p25^2+p35^2+2*p25*p35*(p13*p12)))
set.seed(1234)
z1<-c(0,1)
z2<-c(0,1)
z3<-c(0,1)
z4<-c(0,1)
z5<-c(0,1)
s<-expand.grid(z1,z2,z3,z4,z5); s
s<-s[-1,];s
shift<-3
scenari<-s*shift;scenari
scenario_1<-scenari[1];scenario_1
genereting_fuction<-function(n){
sample<-list()
for (i in 1:nrow(scenario_1)){
X1=rnorm(n)+scenari[i,1]
X4=rnorm(n)+scenari[i,4]
X2=X1*p12+std_e2*rnorm(n)+scenari[i,2]
X3=X1*p13+X4*p43+std_e3*rnorm(n)+scenari[i,3]
X5=X2*p25+X3*p35+std_e5*rnorm(n)+scenari[i,5]
sample[[i]]=cbind(X1,X2,X3,X4,X5)
colnames(sample[[i]])<-c("X1","X2","X3","X4","X5")
}
sample
}
set.seed(123)
dati_fault<- lapply(rep(10, 100), genereting_fuction)
dati_fault[[1]]
[[1]]
X1 X2 X3 X4 X5
[1,] 2.505826 1.736593 1.0274581 -0.6038358 1.9967656
[2,] 4.127593 3.294344 2.8777777 1.2386725 3.0207723
[3,] 1.853050 1.312617 1.1875699 0.5994921 1.0471564
[4,] 4.481019 3.330629 2.1880050 -0.1087338 2.7331061
[5,] 3.916191 3.306036 0.7258404 -1.1388570 1.0293168
[6,] 3.335131 2.379439 1.2407679 0.3198553 1.6755424
[7,] 3.574675 3.769436 1.1084120 -1.0065481 2.0034434
[8,] 3.203620 2.842074 0.6550587 -0.8516120 -0.1433508
[9,] 2.552959 2.642094 2.5376430 2.0387860 3.5318055
[10,] 2.656474 1.607934 2.2760391 -1.3959822 1.0095796
I only want to save the elements of X1 in an object, and likewise for the other variables.
This gives you a list of matrices, with one row per scenario and n columns.
genereting_fuction <- function(n, scenario, scenari){
# added argument because you assume global variable use
nr <- nrow(scenario)
sample <- vector("list", length = nr) # sample<-list()
# creating a list is better than expanding it each iteration
for (i in 1:nr){
X1=rnorm(n)+scenari[i,1]
X4=rnorm(n)+scenari[i,4]
X2=X1*p12+std_e2*rnorm(n)+scenari[i,2]
X3=X1*p13+X4*p43+std_e3*rnorm(n)+scenari[i,3]
X5=X2*p25+X3*p35+std_e5*rnorm(n)+scenari[i,5]
sample[[i]]=cbind(X1,X2,X3,X4,X5)
colnames(sample[[i]])<-c("X1","X2","X3","X4","X5")
}
sample
}
set.seed(123)
dati_fault<- lapply(rep(3, 2), function(x) genereting_fuction(x, scenario_1, scenari))
dati_fault
lapply(dati_fault, function(x) {
tmp <- lapply(x, function(y) y[,"X1"])
tmp <- do.call(rbind, tmp)
})
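The same idea can be wrapped in a small helper so that any variable can be pulled out by name. This is only a sketch on made-up toy data (2 replications of 3 scenario matrices); `extract_var` and `toy` are hypothetical names, not part of the original code.

```r
set.seed(1)
# Toy data: a list of 2 replications, each a list of 3 matrices (4 rows, 2 variables)
toy <- replicate(2, lapply(1:3, function(i) {
  matrix(rnorm(8), 4, 2, dimnames = list(NULL, c("X1", "X2")))
}), simplify = FALSE)

# For each replication, take column `var` from every scenario matrix
# and stack the resulting vectors as rows (one row per scenario)
extract_var <- function(dat, var) {
  lapply(dat, function(rep_i) {
    do.call(rbind, lapply(rep_i, function(m) m[, var]))
  })
}

x1 <- extract_var(toy, "X1")
dim(x1[[1]])  # scenarios x observations
```

Calling `extract_var(toy, "X2")` would give the corresponding object for the second variable.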
If you want to assemble this list of matrices, for example with cbind, I suggest you simply use a single large n instead of the lapply with rep inside it.
I would also bet there is an easier way to simulate this number of scenarios, but it is difficult to judge without knowing the context of your code.
Finally, try to reproduce your issue with a minimal example: working with a list of 100 lists of 32 matrices of 5*10 is a bit messy!
Good luck!
I am having some trouble manipulating a matrix that I have created from a list of lists. I don't really understand why the resulting matrix doesn't act like a normal matrix. That is, I expect when I subset a column for it to return a vector, but instead I get a list. Here is a working example:
x = list()
x[[1]] = list(1, 2, 3)
x[[2]] = list(4, 5, 6)
x[[3]] = list(7, 8, 9)
y = do.call(rbind, x)
y
[,1] [,2] [,3]
[1,] 1 2 3
[2,] 4 5 6
[3,] 7 8 9
y is in the format that I expect. Ultimately I will have a list of these matrices that I want to average, but I keep getting an error which appears to be due to the fact that when you subset this matrix you get lists instead of vectors.
y[,1]
[[1]]
[1] 1
[[2]]
[1] 4
[[3]]
[1] 7
Does anyone know a) why this is happening? and b) How I could avoid / solve this problem?
Thanks in advance
This is just another instance of the "matrix of lists" problem. You need
do.call(rbind, lapply(x, unlist))
or even simpler:
matrix(unlist(x), nrow = length(x), byrow = TRUE)
If you need some background reading, see: Why a row of my matrix is a list. That is a more complex case than yours.
It looks like the problem is due to x being a list of lists, rather than a list of vectors. This is not great, but it'll work:
y = do.call(rbind, lapply(x, unlist))
do.call passes the argument for each list separately, so you are really just binding your three lists together. This explains why they are in list format when you call the elements.
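To make the difference visible, the sketch below builds both versions and inspects their storage type. It also shows one possible way to average a list of such numeric matrices with Reduce; the second matrix in that list is invented purely for the demonstration.

```r
x <- list(list(1, 2, 3), list(4, 5, 6), list(7, 8, 9))

y_list <- do.call(rbind, x)                 # matrix whose cells are list elements
y_num  <- do.call(rbind, lapply(x, unlist)) # ordinary numeric matrix

typeof(y_list)  # "list"
typeof(y_num)   # "double"

# Averaging a list of numeric matrices (second matrix is made up)
mats <- list(y_num, y_num * 3)
avg  <- Reduce(`+`, mats) / length(mats)
avg
```

Printing `y_list` and `y_num` looks the same, which is why the problem only shows up on subsetting.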
Unlist each element of x with sapply to create your matrix. Since sapply returns each element as a column, you'll need to transpose the result to get your desired matrix.
y <- t(sapply(x, unlist))
Suppose a list object and a vector:
lst <- list(a = matrix(1:9, 3), b = matrix(2:10, 3))
vec <- c(2, 3)
And I want to get the result like
2 * a + 3 * b
I solve this by
matrix(apply(mapply("*", lst, vec), 1, sum), 3, 3)
But this looks a little cumbersome.
Is there an efficient way to get same result?
Not sure if it's any more efficient, but here's an idea that's a little cleaner. You can use Map() for the multiplication and Reduce() to do the summing.
Reduce("+", Map("*", lst, vec))
# [,1] [,2] [,3]
# [1,] 8 23 38
# [2,] 13 28 43
# [3,] 18 33 48
Also, in your code, you could replace the apply() call with rowSums(). That would probably improve efficiency in what you've done.
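For comparison, here is a short sketch of the rowSums() variant next to the Map()/Reduce() one, using the data from the question. The object names `via_rowsums` and `via_reduce` are only illustrative.

```r
lst <- list(a = matrix(1:9, 3), b = matrix(2:10, 3))
vec <- c(2, 3)

# mapply() returns a 9 x 2 matrix (one column per scaled matrix);
# rowSums() adds the columns, matrix() restores the 3 x 3 shape
via_rowsums <- matrix(rowSums(mapply("*", lst, vec)), 3, 3)

# Map() scales each matrix, Reduce() sums them while keeping the shape
via_reduce <- Reduce("+", Map("*", lst, vec))

all.equal(via_rowsums, via_reduce)
```

The Reduce() form has the small advantage that the dimensions never need to be restated.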
Another option is to loop through the sequence of the list and then multiply:
Reduce(`+`,lapply(seq_along(lst), function(i) lst[[i]]*vec[i]))
For a list and vector of just two elements, this amounts to
lst[[1]]*vec[1] + lst[[2]] * vec[2]
I have an array in R, created by a function like this:
A <- array(data=NA, dim=c(2,4,4), dimnames=list(c("x","y"),NULL,NULL))
And I would like to select along one dimension, so for the example above I would have:
A["x",,]
dim(A["x",,]) #[1] 4 4
Is there a way to generalize if I do not know in advance how many dimensions (in addition to the named one I want to select by) my array might have? I would like to write a function that takes input that might be formatted as A above, or as:
B <- c(1,2)
names(B) <- c("x", "y")
C <- matrix(1, 2, 2, dimnames=list(c("x","y"),NULL))
Background
The general background is that I am working on an ODE model, so for deSolve's ODE function it must take a single named vector with my current state. For some other functions, like calculating phase-planes/direction fields, it would be more practical to have a higher-dimensional array to apply the differential equation to, and I would like to avoid having many copies of the same function, simply with different numbers of commas after the dimension I want to select.
I spent quite a lot of time figuring out the fastest way to do this for plyr, and the best I could come up with was manually constructing the call to [:
index_array <- function(x, dim, value, drop = FALSE) {
# Create list representing arguments supplied to [
# bquote() creates an object corresponding to a missing argument
indices <- rep(list(bquote()), length(dim(x)))
indices[[dim]] <- value
# Generate the call to [
call <- as.call(c(
list(as.name("["), quote(x)),
indices,
list(drop = drop)))
# Print it, just to make it easier to see what's going on
print(call)
# Finally, evaluate it
eval(call)
}
(You can find more information about this technique at https://github.com/hadley/devtools/wiki/Computing-on-the-language)
You can then use it as follows:
A <- array(data=NA, dim=c(2,4,4), dimnames=list(c("x","y"),NULL,NULL))
index_array(A, 2, 2)
index_array(A, 2, 2, drop = TRUE)
index_array(A, 3, 2, drop = TRUE)
It would also generalise in a straightforward way if you want to extract based on more than one dimension, but you'd need to rethink the arguments to the function.
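One possible way to rethink the arguments, offered only as a sketch: take a vector of dimension numbers and a list of matching index values. The name `index_array_multi` and its `dims`/`values` parameters are hypothetical, not part of plyr.

```r
index_array_multi <- function(x, dims, values, drop = FALSE) {
  # Start with a missing argument for every dimension of x
  indices <- rep(list(bquote()), length(dim(x)))
  # Fill in the requested dimensions; `values` is a list, one entry per dim
  indices[dims] <- values
  # Build and evaluate the call x[<indices>, drop = drop]
  call <- as.call(c(
    list(as.name("["), quote(x)),
    indices,
    list(drop = drop)))
  eval(call)
}

A <- array(1:32, dim = c(2, 4, 4), dimnames = list(c("x", "y"), NULL, NULL))

# Equivalent to A["x", , 2:3]
res <- index_array_multi(A, c(1, 3), list("x", 2:3), drop = TRUE)
dim(res)
```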
I wrote this general function. Not necessarily super fast but a nice application for arrayInd and matrix indexing:
extract <- function(A, .dim, .value) {
val.idx <- match(.value, dimnames(A)[[.dim]])
all.idx <- arrayInd(seq_along(A), dim(A))
keep.idx <- all.idx[all.idx[, .dim] == val.idx, , drop = FALSE]
array(A[keep.idx], dim = dim(A)[-.dim], dimnames = dimnames(A)[-.dim])
}
Example:
A <- array(data=1:32, dim=c(2,4,4),
dimnames=list(c("x","y"), LETTERS[1:4], letters[1:4]))
extract(A, 1, "x")
extract(A, 2, "D")
extract(A, 3, "b")
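To see why this works, it may help to look at arrayInd on its own. The sketch below uses a small made-up array: arrayInd turns linear positions into one row of subscripts each, and indexing an array with such a matrix picks out exactly those cells.

```r
A <- array(1:24, dim = c(2, 3, 4))

# One row per cell: its subscripts along dimensions 1, 2 and 3
idx <- arrayInd(seq_along(A), dim(A))
head(idx, 3)

# Indexing with the full subscript matrix returns the cells in storage order
same_order <- identical(A[idx], as.vector(A))

# Keeping only the rows whose 2nd subscript is 3 selects A[, 3, ]
keep  <- idx[idx[, 2] == 3, , drop = FALSE]
slice <- array(A[keep], dim = dim(A)[-2])
identical(slice, A[, 3, ])
```

Because the kept rows stay in storage order, refilling an array with the remaining dimensions reproduces the slice.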
The abind package has a function, asub, to do this in addition to other very useful array manipulation functions:
library(abind)
A <- array(data=1:32, dim=c(2,4,4),
dimnames=list(c("x","y"), LETTERS[1:4], letters[1:4]))
asub(A, 'x', 1)
asub(A, 'D', 2)
asub(A, 'b', 3)
And it allows indexing in multiple dimensions:
asub(A, list('x', c('C', 'D')), c(1,2))
Perhaps there is an easier way, but this works:
do.call("[",c(list(A,"x"),lapply(dim(A)[-1],seq)))
[,1] [,2] [,3] [,4]
[1,] NA NA NA NA
[2,] NA NA NA NA
[3,] NA NA NA NA
[4,] NA NA NA NA
Let's generalize it into a function that can extract from any dimension, not necessarily the first one:
extract <- function(A, .dim, .value) {
idx.list <- lapply(dim(A), seq_len)
idx.list[[.dim]] <- .value
do.call(`[`, c(list(A), idx.list))
}
Example:
A <- array(data=1:32, dim=c(2,4,4),
dimnames=list(c("x","y"), LETTERS[1:4], letters[1:4]))
extract(A, 1, "x")
extract(A, 2, "D")
extract(A, 3, "b")