Subsetting a matrix created from a list of lists - r

I am having some trouble manipulating a matrix that I have created from a list of lists. I don't really understand why the resulting matrix doesn't act like a normal matrix. That is, I expect subsetting a column to return a vector, but instead I get a list. Here is a working example:
x = list()
x[[1]] = list(1, 2, 3)
x[[2]] = list(4, 5, 6)
x[[3]] = list(7, 8, 9)
y = do.call(rbind, x)
y
     [,1] [,2] [,3]
[1,] 1    2    3
[2,] 4    5    6
[3,] 7    8    9
y looks the way I expect. Ultimately I will have a list of these matrices that I want to average, but I keep getting an error, apparently because subsetting this matrix returns lists instead of vectors.
y[,1]
[[1]]
[1] 1
[[2]]
[1] 4
[[3]]
[1] 7
Does anyone know a) why this is happening? and b) How I could avoid / solve this problem?
Thanks in advance

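This is the familiar "matrix of lists" problem: rbind on lists builds a matrix whose cells are list elements, so subsetting it returns a list. Unlist each inner list first: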
do.call(rbind, lapply(x, unlist))
or even simpler:
matrix(unlist(x), nrow = length(x), byrow = TRUE)
If you need some background reading, see: Why a row of my matrix is a list. That is a more complex case than yours.

It looks like the problem is due to x being a list of lists, rather than a list of vectors. This is not great, but it'll work:
y = do.call(rbind, lapply(x, unlist))

do.call passes each element of x to rbind as a separate argument, so you are really just binding your three lists together. This explains why the elements come back in list format when you subset.
Instead, unlist each element of x with sapply to create your matrix. Since sapply returns each result as a column, you'll need to transpose it to get your desired matrix.
y <- t(sapply(x, unlist))
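
To check the diagnosis and reach the averaging step mentioned in the question, here is a minimal sketch. The names y_list, y_num, mats and avg are made up for illustration, and the second matrix in mats is only a placeholder. Reduce with `+` is one common way to average equally sized numeric matrices, not the only one.

```r
x <- list(list(1, 2, 3), list(4, 5, 6), list(7, 8, 9))

# rbind on lists gives a matrix whose cells are list elements
y_list <- do.call(rbind, x)
typeof(y_list)   # "list"

# Unlisting each inner list first gives an ordinary numeric matrix
y_num <- do.call(rbind, lapply(x, unlist))
typeof(y_num)    # "double"
y_num[, 1]       # 1 4 7, a plain vector

# Averaging a list of such matrices (the second entry is a placeholder)
mats <- list(y_num, y_num * 2)
avg <- Reduce(`+`, mats) / length(mats)
```

If the matrices are built from nested lists, convert each one to numeric before averaging; arithmetic on the list-mode version is what triggers the error.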

Related

colMeans of a sparse matrix times a matrix stored as the last element of a list

I want to get the column means for the last list element, which is a sparse matrix multiplied by a regular matrix. Whenever I use colMeans, however, I get an error. For example:
# Use the igraph package to create a sparse matrix
library(igraph)
my.lattice <- get.adjacency(graph.lattice(length = 5, dim = 2))
# Create a conformable matrix of TRUE and FALSE values
start <- matrix(sample(c(TRUE, FALSE), 50, replace = TRUE), ncol = 2)
# Multiply the sparse matrix by the logical matrix, and save the results to a list
out <- list()
out[[1]] <- my.lattice %*% start
out[[2]] <- my.lattice %*% out[[1]]
# Try to get column means of the last element
colMeans(tail(out, 1)[[1]]) # Selecting first element because tail creates a list
# Error in colMeans(tail(out, 1)[[1]]) :
# 'x' must be an array of at least two dimensions
# But tail(out, 1)[[1]] seems to have two dimensions
dim(tail(out, 1)[[1]])
# [1] 25 2
Any idea what's causing this error, or what I can do about it?

It looks like explicitly calling the colMeans function from the Matrix package works. The product is an S4 Matrix object rather than a base array, so base colMeans rejects it:
> Matrix::colMeans(tail(out, 1)[[1]])
# [1] 4.48 5.48
Thanks to user20650 for this suggestion.
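
A minimal sketch of the same situation without igraph, assuming the Matrix package is installed (it ships with standard R distributions). The toy sparse matrix sp stands in for the lattice adjacency matrix, and all variable names are made up for illustration.

```r
# Toy sparse matrix standing in for the adjacency matrix (assumption)
sp <- Matrix::sparseMatrix(i = c(1, 2, 3), j = c(1, 2, 2),
                           x = c(2, 4, 6), dims = c(3, 2))
dense <- matrix(1, nrow = 2, ncol = 2)

# The product is an S4 object from the Matrix package, not a base array,
# which is why base::colMeans complains about its dimensions
res <- sp %*% dense
isS4(res)
is.array(res)

# Option 1: call the Matrix method explicitly
means_s4 <- Matrix::colMeans(res)

# Option 2: convert to a base matrix first, then use base colMeans
means_base <- colMeans(as.matrix(res))
```

Attaching the package with library(Matrix) should have the same effect as option 1, since its colMeans then masks the base version.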

Combining deeply-nested vectors from multiple lists

I wish to combine equivalent, deeply-nested columns from all elements of a reasonably long list. What I would like to do, though it's not possible in R, is this:
combined.columns <- my.list[[1:length(my.list)]]$my.matrix[,"my.column"]
The only thing I can think of is to manually type out all the elements in cbind() like this:
combined.columns <- cbind(my.list[[1]]$my.matrix[,"my.column"], my.list[[2]]$my.matrix[,"my.column"], . . . )
This answer is pretty close to what I need, but I can't figure out how to make it work for the extra level of nesting.
There must be a more elegant way of doing this, though. Any ideas?
Assuming all your matrices share the name of the column you wish to extract, you could use sapply:
set.seed(123)
my.list <- vector("list")
my.list[[1]] <- list(my.matrix = data.frame(A=rnorm(10,sd=0.3), B=rnorm(10,sd=0.3)))
my.list[[2]] <- list(my.matrix = data.frame(C=rnorm(10,sd=0.3), B=rnorm(10,sd=0.3)))
my.list[[3]] <- list(my.matrix = data.frame(D=rnorm(10,sd=0.3), B=rnorm(10,sd=0.3)))
sapply(my.list, FUN = function(x) x$my.matrix[,"B"])
Free data:
myList <- list(list(myMat = matrix(1:10, 2, dimnames=list(NULL, letters[1:5])),
                    myVec = 1:10),
               list(myMat = matrix(10:1, 2, dimnames=list(NULL, letters[1:5])),
                    myVec = 10:1))
We can get column a of myMat a few different ways. Here's one that uses with.
sapply(myList, with, myMat[,"a"])
#      [,1] [,2]
# [1,]    1   10
# [2,]    2    9
This mapply version may suit a more deeply nested problem. It gives the same result and might be faster than sapply.
mapply(function(x, y, z) x[[y]][,z] , myList, "myMat", "a")
#      [,1] [,2]
# [1,]    1   10
# [2,]    2    9
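
As a hedged variation on the answers above, vapply does the same extraction but checks that every element returns a column of the expected type and length, which can catch a malformed list element early. The data repeats the sample above; the name cols and the anonymous function are illustrative only.

```r
myList <- list(
  list(myMat = matrix(1:10, 2, dimnames = list(NULL, letters[1:5])),
       myVec = 1:10),
  list(myMat = matrix(10:1, 2, dimnames = list(NULL, letters[1:5])),
       myVec = 10:1)
)

# FUN.VALUE declares that each extracted column is an integer vector of length 2;
# vapply stops with an error if any element returns something else
cols <- vapply(myList, function(el) el$myMat[, "a"], FUN.VALUE = integer(2))
cols
#      [,1] [,2]
# [1,]    1   10
# [2,]    2    9
```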

Select elements from a matrix in R all at once

I have a vector of row indices and a vector of column indices, say c(1,2) and c(7,100). I want to extract the elements at (1,7) and (2,100).
But I find that Matrix[row, column] returns the whole cross-product of rows and columns, not just a vector of two numbers.
What should I do?
You want to exploit the feature that if m is a matrix containing the row/col indices required, then subsetting by passing m as argument i of [ gives the desired behaviour. From ?'['
i, j, ...: indices specifying elements to extract or replace.
.... snipped ....
When indexing arrays by ‘[’ a single argument ‘i’ can be a
matrix with as many columns as there are dimensions of ‘x’;
the result is then a vector with elements corresponding to
the sets of indices in each row of ‘i’.
Here is an example
rv <- 1:2
cv <- 3:4
mat <- matrix(1:25, ncol = 5)
mat[cbind(rv, cv)]
R> cbind(rv, cv)
     rv cv
[1,]  1  3
[2,]  2  4
R> mat[cbind(rv, cv)]
[1] 11 17
You can use a two-column index matrix inside [:
mx <- matrix(1:200, nrow=2)
mx[cbind(c(1, 2), c(7, 100))]
produces:
[1] 13 200
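
To make the difference between the two indexing forms concrete, here is a small sketch using the same mat, rv and cv as the earlier example. The names block and pairs are made up for illustration.

```r
mat <- matrix(1:25, ncol = 5)
rv <- 1:2
cv <- 3:4

# Two separate indices select every row/column combination (a 2 x 2 block)
block <- mat[rv, cv]

# A single two-column index matrix selects one element per row of the index
pairs <- mat[cbind(rv, cv)]
pairs
# [1] 11 17

# The paired elements are the diagonal of the block
identical(pairs, diag(block))
# [1] TRUE
```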

Consistently subset matrix to a vector and avoid colnames?

I would like to know if there is R syntax to extract a column from a matrix and always have no name attribute on the returned vector (I wish to rely on this behaviour).
My problem is the following inconsistency:
when a matrix has more than one row and I do myMatrix[, 1] I will get the first column of myMatrix with no name attribute. This is what I want.
when a matrix has exactly one row and I do myMatrix[, 1], I will get the first column of myMatrix but it has the first colname as its name.
I would like to be able to do myMatrix[, 1] and consistently get something with no name.
An example to demonstrate this:
# make a matrix with more than one row,
x <- matrix(1:2, nrow=2)
colnames(x) <- 'foo'
#      foo
# [1,]   1
# [2,]   2
# extract first column. Note no 'foo' name is attached.
x[, 1]
# [1] 1 2
# now suppose x has just one row (and is a matrix)
x <- x[1, , drop = FALSE]
# extract first column
x[, 1]
# foo # <-- we keep the name!!
# 1
Now, the documentation for [ (?'[') mentions this behaviour, so it's not a bug or anything (although, why?! why this inconsistency?!):
A vector obtained by matrix indexing will be unnamed unless ‘x’ is one-dimensional when the row names (if any) will be indexed to provide names for the result.
My question is, is there a way to do x[, 1] such that the result is always unnamed, where x is a matrix?
Is my only hope unname(x[, 1]) or is there something analogous to ['s drop argument? Or is there an option I can set to say "always unname"? Some trick I can use (somehow override ['s behaviour when the extracted result is a vector?)
Update on why the code below works (as far as I can tell)
Subsetting with [ is handled using functions contained in the R source file subset.c in src/main. When using matrix indexing to subset a matrix, the function VectorSubset is called. When more than one index is used (i.e., one each for rows and columns as in x[,1]), MatrixSubset is called instead.
The function VectorSubset only assigns names to 1-dimensional arrays being subsetted. Since a matrix is a 2-D array, no names are assigned to the result when using matrix indexing. The function MatrixSubset, however, does attempt to pass on dimnames under certain circumstances.
Therefore, the matrix indexing you refer to in the quote from the help page seems to be the key:
x <- matrix(1)
colnames(x) <- "foo"
x[, 1] ## 'Normal' indexing
# foo
# 1
x[matrix(c(1, 1), ncol = 2)] ## Matrix indexing
# [1] 1
And with a wider 1-row matrix:
xx <- matrix(1:10, nrow = 1)
colnames(xx) <- sprintf('foo%i', seq_len(ncol(xx)))
xx[, 6] ## 'Normal' indexing
# foo6
# 6
xx[matrix(c(1, 6), ncol = 2)] ## Matrix indexing
# [1] 6
With a matrix with both dimensions > 1:
yy <- matrix(1:10, nrow = 2, dimnames = list(NULL,
                                             sprintf('foo%i', 1:5)))
yy[cbind(seq_len(nrow(yy)), 3)] ## Matrix indexing
# [1] 5 6
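
The matrix-indexing idea can be wrapped in a small helper. This is only a sketch: col_unnamed is a hypothetical name, not an existing R function, and it assumes a matrix with at least one row and a single numeric column index.

```r
# Hypothetical helper: extract column j of matrix m via matrix indexing,
# so the result carries no names regardless of the number of rows
col_unnamed <- function(m, j) {
  stopifnot(is.matrix(m), nrow(m) >= 1, length(j) == 1)
  m[cbind(seq_len(nrow(m)), j)]
}

one_row <- matrix(1:3, nrow = 1, dimnames = list(NULL, c("foo", "bar", "baz")))
two_row <- matrix(1:6, nrow = 2, dimnames = list(NULL, c("foo", "bar", "baz")))

names(one_row[, 2])      # "bar": ordinary indexing keeps the column name
col_unnamed(one_row, 2)  # 2, with no names
col_unnamed(two_row, 2)  # 3 4, with no names
```

Wrapping the ordinary form in unname(), as suggested in the question, is the simpler alternative and also handles the zero-row case.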

How can I make processing of matrices and vectors regular (as, e.g., in Matlab)

Suppose I have a function that takes an argument x of dimension 1 or 2. I'd like to do something like
x[1, i]
regardless of whether I got a vector or a matrix (or a table of one variable, or two).
For example:
x = 1:5
x[1,2] # this won't work...
Of course I can check to see which class was given as an argument, or force the argument to be a matrix, but I'd rather not do that. In Matlab, for example, vectors are matrices with all but one dimension of size 1 (and can be treated as either row or column, etc.). This makes code nice and regular.
Also, does anyone have an idea why in R vectors (or in general one dimensional objects) aren't special cases of matrices (or multidimensional objects)?
Thanks
In R, it is the other way round; matrices are vectors. The matrix-like behaviour comes from some extra attributes on top of the atomic vector part of the object.
To get the behaviour you want, you'd need to make the vector be a matrix, by setting dimensions on the vector using dim() or explicit coercion.
> vm <- 1:5
> dim(vm) <- c(1,5)
> vm
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    2    3    4    5
> class(vm)
[1] "matrix"
Next you'll need to maintain the dimensions when subsetting; by default R will drop empty dimensions, which in the case of vm above is the row dimension. You do that using drop = FALSE in the call to '['(). The behaviour by default is drop = TRUE:
> vm[, 2:4]
[1] 2 3 4
> vm[, 2:4, drop = FALSE]
     [,1] [,2] [,3]
[1,]    2    3    4
You could add a class to your matrices and write methods for [ for that class where the argument drop is set to FALSE by default
class(vm) <- c("foo", class(vm))
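`[.foo` <- function(x, i, j, ..., drop = FALSE) {
    ## temporarily strip the "foo" class so the default `[` is used
    clx <- class(x)
    class(x) <- clx[clx != "foo"]
    ## subset with drop = FALSE unless the caller says otherwise
    x[i, j, ..., drop = drop]
}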
which in use gives:
> vm[, 2:4]
     [,1] [,2] [,3]
[1,]    2    3    4
i.e. maintains the empty dimension.
Making this fool-proof and pervasive will require a lot more effort but the above will get you started.
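
If a class-based approach is more than you need, coercing at the top of the function is a lighter alternative. The sketch below uses a hypothetical helper, as_row_matrix (not part of base R), which turns anything with fewer than two dimensions into a one-row matrix and leaves matrices untouched.

```r
# Hypothetical helper: treat plain vectors and 1-D arrays as one-row matrices
as_row_matrix <- function(x) {
  if (length(dim(x)) < 2L) matrix(x, nrow = 1) else x
}

v <- 1:5
m <- matrix(1:10, nrow = 2)

as_row_matrix(v)[1, 2]   # 2
as_row_matrix(m)[1, 2]   # 3

# drop = FALSE keeps the result a matrix after subsetting
sub <- as_row_matrix(v)[1, 2:4, drop = FALSE]
dim(sub)                 # 1 3
```

This does mean forcing the argument to a matrix, which the question hoped to avoid, but it keeps the rest of the function body uniform.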
