Select elements from a matrix in R all at once

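I have a vector of row indices and a vector of column indices, say c(1, 2) and c(7, 100). I want to extract the elements at positions (1, 7) and (2, 100).
However, I find that Matrix[row, column] returns the full cross-product of those rows and columns (a submatrix), not just a vector of two numbers.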
What should I do?

You can exploit the fact that if m is a matrix containing the required row/column indices (one pair per row), then passing m as argument i of [ gives the desired behaviour. From ?'[':
i, j, ...: indices specifying elements to extract or replace.
.... snipped ....
When indexing arrays by ‘[’ a single argument ‘i’ can be a
matrix with as many columns as there are dimensions of ‘x’;
the result is then a vector with elements corresponding to
the sets of indices in each row of ‘i’.
Here is an example
rv <- 1:2
cv <- 3:4
mat <- matrix(1:25, ncol = 5)
mat[cbind(rv, cv)]
R> cbind(rv, cv)
rv cv
[1,] 1 3
[2,] 2 4
R> mat[cbind(rv, cv)]
[1] 11 17

You can use a two-column index matrix inside [:
mx <- matrix(1:200, nrow=2)
mx[cbind(c(1, 2), c(7, 100))]
produces:
[1] 13 200
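To make the difference between the two indexing forms in this thread concrete, here is a minimal sketch that runs both on the same matrix. The variable names rows and cols are illustrative only.

```r
mx <- matrix(1:200, nrow = 2)
rows <- c(1, 2)
cols <- c(7, 100)

# Two separate index vectors select every row/column combination,
# so the result is a 2 x 2 submatrix
mx[rows, cols]

# A two-column index matrix selects one element per (row, column) pair
mx[cbind(rows, cols)]

# The paired elements are the diagonal of the submatrix; drop = FALSE keeps
# the submatrix a matrix even when only one pair is requested
diag(mx[rows, cols, drop = FALSE])
```

The diag() route builds the whole submatrix first, so the index-matrix form is usually preferable when the index vectors are long.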

Related

Why does R not remove elements properly over an empty set of indices?

I have encountered some strange behaviour in R. Suppose I have a matrix and I want to remove a specified set of rows and columns. Here is an example where this works perfectly well.
#Create a matrix
MATRIX <- matrix(1:20, nrow = 4, ncol = 5)
rownames(MATRIX) <- c('a', 'b', 'c', 'd')
colnames(MATRIX) <- c('a', 'b', 'c', 'd', 'e')
#Specify rows and columns to remove
REMOVE.ROW <- 3
REMOVE.COL <- 2
#Print the matrix without these rows or columns
MATRIX[-REMOVE.ROW, -REMOVE.COL]
a c d e
a 1 9 13 17
b 2 10 14 18
d 4 12 16 20
However, when one or both of the objects REMOVE.ROW or REMOVE.COL are empty, instead of removing nothing (and therefore giving back the original matrix), it gives me back an empty matrix.
#Specify rows and columns to remove
REMOVE.ROW <- integer(0)
REMOVE.COL <- integer(0)
#Print the matrix without these rows or columns
MATRIX[-REMOVE.ROW, -REMOVE.COL]
<0 x 0 matrix>
Intuitively, I would have expected the removal of an empty set of indices to leave me with the original set of indices, and so I would have expected the full matrix back from this command. For some reason, R removes all rows and columns from the matrix in this case. As far as I can make out, this appears to be a bug in R, but perhaps there is some good reason for it that I am unaware of.
Question: Can someone explain why R is doing things this way? Aside from using if-then statements to deal with the special cases, is there any simple adjustment I can make to have R behave as I want it to?
Empty objects have the strange property that they are not NULL and have length 0, but cannot be subsetted. A possible workaround is to consider every possible combination and use the fact that length(integer(0)) is equal to zero. I understand that this solution might not be ideal.
is.na(integer(0))
#> logical(0)
is.null(integer(0))
#> [1] FALSE
length(integer(0))
#> [1] 0
integer(0)[[1]]
#> Error in integer(0)[[1]]: subscript out of bounds
integer(0)[[0]]
#> Error in integer(0)[[0]]: attempt to select less than one element in get1index <real>
MATRIX <- matrix(1:20, nrow = 4, ncol = 5)
REMOVE.ROW <- integer(0)
REMOVE.COL <- integer(0)
if (length(REMOVE.ROW) > 0 && length(REMOVE.COL) > 0) {
  MATRIX[-REMOVE.ROW, -REMOVE.COL]
} else if (length(REMOVE.ROW) > 0) {
  MATRIX[-REMOVE.ROW, ]
} else if (length(REMOVE.COL) > 0) {
  MATRIX[, -REMOVE.COL]
} else {
  MATRIX
}
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 1 5 9 13 17
#> [2,] 2 6 10 14 18
#> [3,] 3 7 11 15 19
#> [4,] 4 8 12 16 20
Created on 2021-11-27 by the reprex package (v2.0.1)
The problem is that R is using arithmetic negation, not set negation
Based on a helpful comment (hat tip to IceCreamToucan) it appears that this is occurring because of the two-step process involved in indexing matrices using negative indices, which are constructed using arithmetic negation instead of set negation. This appears to be one of those cases where the standard mathematical interpretation of an operation is different to the computational interpretation.
In the mathematical interpretation of indexing a matrix over a set of indices we view set negation as producing a new set composed of elements that are in the original 'sample space' but outside the negated set. In the computational interpretation in R the application of the negative sign is instead producing negative arithmetic values, and these are subsequently interpreted as elements to remove when calling the matrix.
What is happening in this case: For the usual case where we have a non-empty set of indices, using the negation sign simply turns the indices into negative values and then when we call the matrix it looks over all the indices other than the negative values.
#Specify rows and columns to remove
REMOVE.ROW <- 3
REMOVE.COL <- 2
#See negatives of the removed indices
identical(MATRIX[-REMOVE.ROW, -REMOVE.COL], MATRIX[-3, -2])
[1] TRUE
However, when we use an empty vector of indices, the negative of that vector is still the empty vector of indices; i.e., the vector integer(0) is identical to its negative -integer(0). Consequently, when we try to remove the empty vector of indices, we are actually asking to call the matrix over the negative of the empty vector, which is still the empty vector.
#The empty vector is equivalent to its negative
identical(integer(0), -integer(0))
[1] TRUE
#Therefore, calling over these vectors is equivalent
identical(MATRIX[-integer(0), -integer(0)], MATRIX[integer(0), integer(0)])
[1] TRUE
So, the problem here is that you are interpreting -REMOVE.ROW and -REMOVE.COL as if they were using set negation when actually they are just taking the initial vectors of values and turning them negative (i.e., multiplying them by negative one).
Fixing the problem: There does not seem to be a standard function to call the matrix in a way that interprets the indices using set negation, so you will need to use conditional logic to construct the solution for a specific case or for a custom function. Here is a custom function sub.matrix to remove particular rows and columns, where these are interpreted in the sense of set negation.
sub.matrix <- function(x, remove.rows = integer(0), remove.cols = integer(0)) {
  #Check that input x is a matrix
  if (!('matrix' %in% class(x))) {
    stop('This function is only for objects of class \'matrix\'') }
  #Create output matrix (index the argument x, not a global object)
  R <- length(remove.rows)
  C <- length(remove.cols)
  if ((R > 0) && (C > 0)) { OUT <- x[-remove.rows, -remove.cols] }
  if ((R == 0) && (C > 0)) { OUT <- x[, -remove.cols] }
  if ((R > 0) && (C == 0)) { OUT <- x[-remove.rows, ] }
  if ((R == 0) && (C == 0)) { OUT <- x }
  #Return the output matrix
  OUT }
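As an alternative that avoids the conditional branches, one can compute the indices to keep with setdiff() and subset with positive indices, which behaves like set negation even when nothing is removed. This is only a sketch: the helper name drop_rc is hypothetical, and it assumes the indices to remove are numeric positions rather than names.

```r
# Hypothetical helper: remove rows/columns by keeping the set difference
drop_rc <- function(x, remove.rows = integer(0), remove.cols = integer(0)) {
  keep.rows <- setdiff(seq_len(nrow(x)), remove.rows)
  keep.cols <- setdiff(seq_len(ncol(x)), remove.cols)
  # drop = FALSE keeps the result a matrix even if one row or column remains
  x[keep.rows, keep.cols, drop = FALSE]
}

MATRIX <- matrix(1:20, nrow = 4, ncol = 5)
drop_rc(MATRIX)                                  # nothing removed: full matrix
drop_rc(MATRIX, remove.rows = 3, remove.cols = 2)  # same elements as MATRIX[-3, -2]
```

Because setdiff() of the full index range and an empty vector returns the full range, the empty case needs no special handling.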

Values of matrix based on their column and row number in R

I have a 30 x 30 matrix in R and I want each value to be the product of its row and column numbers. For example, the first value is [1] * [1] = 1.
mat2 <- matrix(nrow= 30, ncol = 30)
We can use row and col to get the index of row/columns and use that to multiply
mat2 <- row(mat2) *col(mat2)
If you know the size of the matrix, you can also use outer() and construct the matrix directly in one step.
mat2 <- outer(seq(30), seq(30))
# other simple variations:
outer(1:30, 1:30)
seq(30) %o% seq(30)
1:30 %o% 1:30
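A quick way to convince yourself that the two approaches agree is to run both on a small non-square matrix. This is a minimal sketch; the names m, by_index and by_outer are illustrative.

```r
# Small 3 x 4 example so the result is easy to inspect
m <- matrix(nrow = 3, ncol = 4)

by_index <- row(m) * col(m)                              # uses an existing matrix
by_outer <- outer(seq_len(nrow(m)), seq_len(ncol(m)))    # built from the dimensions

by_index
# all.equal() compares values and ignores the integer/double storage difference
all.equal(by_index, by_outer)
```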

Learning R - What is this Function Doing?

I am learning R and reading the book Guide to programming algorithms in r.
The book gives an example function:
# MATRIX-VECTOR MULTIPLICATION
matvecmult = function(A,x){
m = nrow(A)
n = ncol(A)
y = matrix(0,nrow=m)
for (i in 1:m){
sumvalue = 0
for (j in 1:n){
sumvalue = sumvalue + A[i,j]*x[j]
}
y[i] = sumvalue
}
return(y)
}
How do I call this function in the R console? And what exactly is being passed into this function as A and x?
The function takes an argument A, which should be a matrix, and x, which should be a numeric vector whose length equals the number of columns of A (the number of values per row).
If
A <- matrix(c(1,2,3,4,5,6), nrow = 2, ncol = 3)
[,1] [,2] [,3]
[1,] 1 3 5
[2,] 2 4 6
then you have 3 values (number of columns, ncol) per row, thus x needs to be something like
x <- c(4,5,6)
The function itself iterates over all rows, and in each row, each value is multiplied by a value from x: the value in the first column is multiplied by the first value in x, the value in A's second column is multiplied by the second value in x, and so on. This is repeated for each row, and the sum for each row is returned by the function.
matvecmult(A, x)
[,1]
[1,] 49 # 1*4 + 3*5 + 5*6
[2,] 64 # 2*4 + 4*5 + 6*6
To run this function, you first have to define it (run or source its definition) and then run these three lines in order:
A <- matrix(c(1,2,3,4,5,6), nrow = 2, ncol = 3)
x <- c(4,5,6)
matvecmult(A, x)
This function is designed to return the product of a matrix A with a vector x; i.e., the result will be the matrix product A x (where, as is usual in R, the vector is treated as a column vector). An example should make things clear.
# define a matrix
mymatrix <- matrix(sample(12), nrow = 4)
# see what the matrix looks like
mymatrix
# [,1] [,2] [,3]
# [1,] 2 10 9
# [2,] 3 1 12
# [3,] 11 7 5
# [4,] 8 4 6
# define a vector where multiplication of our matrix times the vector will be defined
vec3 <- c(-1,0,1)
# apply the function to our matrix and vector
result <- matvecmult(mymatrix, vec3)
result
# [,1]
# [1,] 7
# [2,] 9
# [3,] -6
# [4,] -2
class(result)
# [1] "matrix"
So matvecmult(mymatrix, vec3) is how you would call this function, and the result is an n by 1 matrix, where n is the number of rows in the matrix argument.
You can also get some insight by playing around and seeing what happens when you pass something other than a matrix-vector pair where the product is defined. In some cases, you will get an error; sometimes you get nonsense; and sometimes you get something you might not expect just from the function name. See what happens when you call matvecmult(mymatrix, mymatrix).
The function calculates the product of a matrix and a column vector. It assumes that the number of columns of the matrix is equal to the number of elements in the vector.
It stores the number of columns of A in n and the number of rows in m.
It then initializes a single-column matrix y with m rows and all values set to 0.
It iterates along the rows of A, multiplies each value in a row by the corresponding value in x, and sums the products.
Each row's sum is then stored in y, and finally the function returns the single-column matrix y.
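For comparison, the same computation can be written without explicit loops. The sketch below uses R's built-in matrix product operator %*% and, separately, a vectorised form of the double loop; it reuses the A and x values from the example earlier in this thread.

```r
A <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2, ncol = 3)
x <- c(4, 5, 6)

# Built-in matrix product: x is treated as a column vector,
# and the result is a 2 x 1 matrix
A %*% x

# Vectorised version of the double loop:
# scale column j of A by x[j], then sum across each row
rowSums(sweep(A, 2, x, "*"))
```

Both expressions give 49 and 64, matching the output of matvecmult(A, x) shown above; the second returns a plain vector rather than a one-column matrix.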

colMeans of a sparse matrix times a matrix stored as the last element of a list

I want to get the column means for the last list element, which is a sparse matrix multiplied by a regular matrix. Whenever I use colMeans, however, I get an error. For example:
# Use the igraph package to create a sparse matrix
library(igraph)
my.lattice <- get.adjacency(graph.lattice(length = 5, dim = 2))
# Create a conformable matrix of TRUE and FALSE values
start <- matrix(sample(c(TRUE, FALSE), 50, replace = TRUE), ncol = 2)
# Multiply the sparse matrix by the regular matrix, and save the results to a list
out <- list()
out[[1]] <- my.lattice %*% start
out[[2]] <- my.lattice %*% out[[1]]
# Try to get column means of the last element
colMeans(tail(out, 1)[[1]]) # Selecting first element because tail creates a list
# Error in colMeans(tail(out, 1)[[1]]) :
# 'x' must be an array of at least two dimensions
# But tail(out, 1)[[1]] seems to have two dimensions
dim(tail(out, 1)[[1]])
# [1] 25 2
Any idea what's causing this error, or what I can do about it?
It looks like explicitly calling the colMeans function from the Matrix package works:
> Matrix::colMeans(tail(out, 1)[[1]])
# [1] 4.48 5.48
Thanks to user20650 for this suggestion.
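The likely reason is that the product is an S4 object from the Matrix package rather than a base array, so base colMeans does not recognise it unless the Matrix methods are in scope. The sketch below reproduces the situation without igraph, assuming the Matrix package is installed; the object names sp, dense and prod are illustrative.

```r
# Build a small sparse matrix without attaching the Matrix package,
# mimicking the case where Matrix is loaded only as a dependency
sp <- Matrix::Matrix(c(0, 1, 1, 0), nrow = 2, sparse = TRUE)
dense <- matrix(c(1, 2, 3, 4), nrow = 2)

# The product is a Matrix-class object, not a base matrix
prod <- sp %*% dense
class(prod)

# Option 1: call the Matrix method explicitly
Matrix::colMeans(prod)

# Option 2: convert to a base matrix first, then use base colMeans
colMeans(as.matrix(prod))
```

Attaching the package with library(Matrix) before calling colMeans should have the same effect as option 1.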

Consistently subset matrix to a vector and avoid colnames?

I would like to know if there is R syntax to extract a column from a matrix and always have no name attribute on the returned vector (I wish to rely on this behaviour).
My problem is the following inconsistency:
when a matrix has more than one row and I do myMatrix[, 1] I will get the first column of myMatrix with no name attribute. This is what I want.
when a matrix has exactly one row and I do myMatrix[, 1], I will get the first column of myMatrix but it has the first colname as its name.
I would like to be able to do myMatrix[, 1] and consistently get something with no name.
An example to demonstrate this:
# make a matrix with more than one row,
x <- matrix(1:2, nrow=2)
colnames(x) <- 'foo'
# foo
# [1,] 1
# [2,] 2
# extract first column. Note no 'foo' name is attached.
x[, 1]
# [1] 1 2
# now suppose x has just one row (and is a matrix)
x <- x[1, , drop = FALSE]
# extract first column
x[, 1]
# foo # <-- we keep the name!!
# 1
Now, the documentation for [ (?'[') mentions this behaviour, so it's not a bug or anything (although, why?! why this inconsistency?!):
A vector obtained by matrix indexing will be unnamed unless ‘x’ is one-dimensional when the row names (if any) will be indexed to provide names for the result.
My question is, is there a way to do x[, 1] such that the result is always unnamed, where x is a matrix?
Is my only hope unname(x[, 1]) or is there something analogous to ['s drop argument? Or is there an option I can set to say "always unname"? Some trick I can use (somehow override ['s behaviour when the extracted result is a vector?)
Update on why the code below works (as far as I can tell)
Subsetting with [ is handled using functions contained in the R source file subset.c in ~/src/main. When using matrix indexing to subset a matrix, the function VectorSubset is called. When there is more than one index used (i.e., one each for rows and columns as in x[,1]), then MatrixSubset is called.
The function VectorSubset only assigns names to 1-dimensional arrays being subsetted. Since a matrix is a 2-D array, no names are assigned to the result when using matrix indexing. The function MatrixSubset, however, does attempt to pass on dimnames under certain circumstances.
Therefore, the matrix indexing you refer to in the quote from the help page seems to be the key:
x <- matrix(1)
colnames(x) <- "foo"
x[, 1] ## 'Normal' indexing
# foo
# 1
x[matrix(c(1, 1), ncol = 2)] ## Matrix indexing
# [1] 1
And with a wider 1-row matrix:
xx <- matrix(1:10, nrow = 1)
colnames(xx) <- sprintf('foo%i', seq_len(ncol(xx)))
xx[, 6] ## 'Normal' indexing
# foo6
# 6
xx[matrix(c(1, 6), ncol = 2)] ## Matrix indexing
# [1] 6
With a matrix with both dimensions > 1:
yy <- matrix(1:10, nrow = 2, dimnames = list(NULL,
sprintf('foo%i', 1:5)))
yy[cbind(seq_len(nrow(yy)), 3)] ## Matrix indexing
# [1] 5 6
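Putting the pieces together, the matrix-indexing trick can be wrapped in a small helper so that a column is always returned without names, whatever the number of rows. This is a sketch only: the name col_unnamed is hypothetical, and it assumes x has at least one row.

```r
# Hypothetical helper: extract column j via matrix indexing,
# which never attaches names to the result
col_unnamed <- function(x, j) {
  x[cbind(seq_len(nrow(x)), j)]
}

one_row <- matrix(1, nrow = 1, dimnames = list(NULL, "foo"))
two_row <- matrix(1:2, nrow = 2, dimnames = list(NULL, "foo"))

names(one_row[, 1])              # ordinary indexing keeps the name "foo"
names(col_unnamed(one_row, 1))   # NULL
names(col_unnamed(two_row, 1))   # NULL
```

The simpler unname(x[, 1]) mentioned in the question gives the same values; the helper just avoids creating the names in the first place.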
