How to use which() on a matrix to get unique indices - r

Suppose I have a symmetric matrix:
> mat <- matrix(c(1,0,1,0,0,0,1,0,1,1,0,0,0,0,0,0), ncol=4, nrow=4)
> mat
[,1] [,2] [,3] [,4]
[1,] 1 0 1 0
[2,] 0 0 1 0
[3,] 1 1 0 0
[4,] 0 0 0 0
which I would like to analyse:
> which(mat==1, arr.ind=T)
row col
[1,] 1 1
[2,] 3 1
[3,] 3 2
[4,] 1 3
[5,] 2 3
Now the question is: how do I avoid reporting duplicated cells? As the resulting index matrix shows, rows 2 and 4 point to (3,1) and (1,3) respectively, which are the same cell in a symmetric matrix.
How do I avoid such a situation? I only need a reference for each cell, even though the matrix is symmetric. Is there an easy way to deal with such situations?
EDIT:
I was thinking about using upper.tri or lower.tri, but in that case what I get is a vector version of the matrix, and I am not able to get back to the (row, col) notation.
> which(mat[upper.tri(mat)]==1, arr.ind=T)
[1] 2 3
EDIT II
The expected output would be something like a unique set over the pairs (row, col) and (col, row):
row col
[1,] 1 1
[2,] 3 1
[3,] 3 2

Since you have a symmetric matrix, you could do
which(mat == 1 & upper.tri(mat, diag = TRUE), arr.ind = TRUE)
# row col
#[1,] 1 1
#[2,] 1 3
#[3,] 2 3
OR
which(mat == 1 & lower.tri(mat, diag = TRUE), arr.ind = TRUE)
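The lower-triangle variant reproduces the (row, col) pairs listed in the question's expected output. The following is a minimal sketch run against the example matrix, assuming the matrix really is symmetric; the name `idx` is only illustrative.

```r
mat <- matrix(c(1,0,1,0,0,0,1,0,1,1,0,0,0,0,0,0), ncol = 4, nrow = 4)

# Sanity check: masking one triangle only makes sense for a symmetric matrix
stopifnot(isSymmetric(mat))

# Keep cells on or below the diagonal, so each symmetric pair appears once
idx <- which(mat == 1 & lower.tri(mat, diag = TRUE), arr.ind = TRUE)
idx
#      row col
# [1,]   1   1
# [2,]   3   1
# [3,]   3   2
```

Combining the logical mask with the condition before calling which() is what keeps the (row, col) notation; subsetting with mat[upper.tri(mat)] first drops the dimensions.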

Related

Calculations within a matrix with internal references that are anchored to columns within the matrix in R

I have a matrix and I would like to perform a calculation on each number in the matrix so that I get another matrix with the same dimensions only with the results of the calculation. This should be easy except that part of the equation is dependent on which column I am accessing because I will need to have an internal reference to the number at row [3,] within that column.
The equation I would like to apply is:
output matrix value = input_matrix value at a given position + (1- (matrix value at [3,] and in the same column as the input matrix value))
For example, for position (1,1) in the matrix the calculation would be 1+(1-3)
For position (1,2) in the matrix, the calculation would be 5+(1-7)
input_matrix<- matrix(1:12, nrow = 4, ncol = 3)
[,1] [,2] [,3]
[1,] 1 5 9
[2,] 2 6 10
[3,] 3 7 11
[4,] 4 8 12
The output matrix should end up looking like this:
[,1] [,2] [,3]
[1,] -1 -1 -1
[2,] 0 0 0
[3,] 1 1 1
[4,] 2 2 2
I have tried doing something like this:
output_matrix<-apply(input_matrix,c(1,2), function(x) x+(1-(input_matrix[3,])))
but that gives me three matrices with the wrong dimensions as the output.
I am thinking that perhaps I can just modify the function in the above calculation to get this to work, or alternatively write something that iterates over each column of the matrix, but I am not sure exactly how to do this in a way that gives me the output matrix that I want.
Any help would be greatly appreciated.
I think this should work for you:
apply(input_matrix, MARGIN = 2, function(x) x + (1 - x[3]))
[,1] [,2] [,3]
[1,] -1 -1 -1
[2,] 0 0 0
[3,] 1 1 1
[4,] 2 2 2
We could also do this in a vectorized way
input_matrix + (1 - input_matrix[3,][col(input_matrix)])
# [,1] [,2] [,3]
#[1,] -1 -1 -1
#[2,] 0 0 0
#[3,] 1 1 1
#[4,] 2 2 2
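Another base R option, not mentioned in the answers above, is sweep(), which applies a per-column statistic across the matrix. This is a sketch under the assumption that the reference row is always row 3; the name `output_matrix` is taken from the question.

```r
input_matrix <- matrix(1:12, nrow = 4, ncol = 3)

# For each column (MARGIN = 2), subtract (row-3 value - 1),
# which is the same as adding (1 - row-3 value)
output_matrix <- sweep(input_matrix, 2, input_matrix[3, ] - 1)
output_matrix
#      [,1] [,2] [,3]
# [1,]   -1   -1   -1
# [2,]    0    0    0
# [3,]    1    1    1
# [4,]    2    2    2
```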

Subtract vector from one column of a matrix

I'm a complete R novice, and I'm really struggling on this problem. I need to take a vector, evens, and subtract it from the first column of a matrix, top_secret. I tried to call up only that column using top_secret[,1] and subtract the vector from that, but then it only returns the column. Is there a way to do this inside the matrix so that I can continue to manipulate the matrix without creating a bunch of separate columns?
Sure, you can. Here is an example:
m <- matrix(c(1,2,3,4),4,4, byrow = TRUE)
> m
[,1] [,2] [,3] [,4]
[1,] 1 2 3 4
[2,] 1 2 3 4
[3,] 1 2 3 4
[4,] 1 2 3 4
m[,4] <- m[,4] - c(5,5,5,5)
which gives:
> m
[,1] [,2] [,3] [,4]
[1,] 1 2 3 -1
[2,] 1 2 3 -1
[3,] 1 2 3 -1
[4,] 1 2 3 -1
Or another option is replace
replace(m, cbind(seq_len(nrow(m)), 4), m[,4] - 5)
data
m <- matrix(c(1,2,3,4),4,4, byrow = TRUE)
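Applied to the names used in the question, the same idea looks like the sketch below. The contents of `top_secret` and `evens` are made up for illustration, since the question does not show them.

```r
# Hypothetical stand-ins for the objects named in the question
top_secret <- matrix(1:12, nrow = 4, ncol = 3)
evens <- c(2, 4, 6, 8)

# Assign the result back into column 1; the other columns are untouched
top_secret[, 1] <- top_secret[, 1] - evens
top_secret
#      [,1] [,2] [,3]
# [1,]   -1    5    9
# [2,]   -2    6   10
# [3,]   -3    7   11
# [4,]   -4    8   12
```

The key step is the assignment back into top_secret[, 1]; evaluating top_secret[, 1] - evens on its own only returns the modified column.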

Filling a matrix in R

I am trying to fill some rows of a (500,2) matrix with the row vector (1,0) using this code, last line is to verify the result:
data<-matrix(ncol=2,nrow=500)
data[41:150,]<-matrix(c(1,0),nrow=1,ncol=2,byrow=TRUE)
data[41:45,]
But the result is
> data[41:45,]
[,1] [,2]
[1,] 1 1
[2,] 0 0
[3,] 1 1
[4,] 0 0
[5,] 1 1
instead of
> data[41:45,]
[,1] [,2]
[1,] 1 0
[2,] 1 0
[3,] 1 0
[4,] 1 0
[5,] 1 0
(1) What am I doing wrong?
(2) Why aren't the row indices in the result 41, 42, 43, 44 and 45?
You're trying to fill a part of the matrix, so the block you're trying to drop in there should be of the right size:
data[41:150,]<-matrix(c(1,0),nrow=110,ncol=2,byrow=TRUE)
# nrow = 110, instead of 1 !!!!
Otherwise your piece-to-be-added will be treated as a plain vector and recycled column-wise. Try, for example, this:
data[41:150,] <- matrix(c(1,2,3,4,5), nrow=5, ncol=2, byrow=TRUE)
data[41:45,]
[,1] [,2]
[1,] 1 1
[2,] 3 3
[3,] 5 5
[4,] 2 2
[5,] 4 4
Can one complain? Yes and no. No, because R behaves as documented (matrices are vectors with dimension attributes, and recycling works on vectors). Yes, because although recycling can be convenient, it may create false expectations.
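The recycling can be seen on a smaller scale. The sketch below uses a hypothetical 4 x 2 matrix called `target` instead of the 500-row one, and shows that the 1 x 2 block is used as the plain vector 1, 0 and repeated down each column.

```r
block <- matrix(c(1, 0), nrow = 1, ncol = 2, byrow = TRUE)

# Dropping the dimensions shows what the assignment actually recycles
as.vector(block)
# [1] 1 0

target <- matrix(NA_real_, nrow = 4, ncol = 2)
target[1:4, ] <- block   # 8 cells filled column by column with 1, 0, 1, 0, ...
target
#      [,1] [,2]
# [1,]    1    1
# [2,]    0    0
# [3,]    1    1
# [4,]    0    0
```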
Why aren't the row indices 41, 42, 43, ...? The matrix has no row names, so R labels the rows of the result by their position in the subset. Vectors behave the same way:
> (1:10)[5:6]
[1] 5 6
(Notice there's [1] in the output, not [5].)
Data frames behave differently, so you would see the original line numbers for slices:
as.data.frame(data)[45:50,]
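If the original row numbers should survive subsetting while staying with a matrix, one option is to set them as row names first. This is a sketch, assuming it is acceptable for the labels to be stored as character strings; `sub` is an illustrative name.

```r
data <- matrix(0, nrow = 500, ncol = 2)

# Without row names, a subset carries no record of the original positions
rownames(data[41:45, ])
# NULL

# Assumption for illustration: use the original row numbers as row names
rownames(data) <- seq_len(nrow(data))
sub <- data[41:45, ]
rownames(sub)
# [1] "41" "42" "43" "44" "45"
```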
It will be cleaner to just do this column-wise:
data[41:150, 1L] = 1
data[41:150, 2L] = 0
You could also accomplish this in one line with matrix indexing like so:
data[cbind(rep(41:150, each = 2L), 1:2)] = 1:0
You could use rep.
data[41:150,] <- rep(1:0, each=150-41+1)
#> data[41:45,]
# [,1] [,2]
#[1,] 1 0
#[2,] 1 0
#[3,] 1 0
#[4,] 1 0
#[5,] 1 0
I think MichaelChirico's approach is the cleanest/safest to use.

Write a value for maximum/minimum between two values

I have a two-column matrix and I want to produce a new matrix/data.frame where each column has 1 if it holds the row maximum and 0 otherwise (the two values are never equal). This is my attempt:
testM <- matrix(c(1,2,3, 1,1,5), ncol = 2, byrow = T)
>testM
V1 V2
1 1 2
2 3 1
3 1 5
apply(data.frame(testM), 1, function(row) ifelse(max(row[1],row[2]),1,0))
I expect to have:
0 1
1 0
0 1
because of the 0,1 parameters in max() function, but I just get
[1] 1 1 1
Any ideas?
Or using pmax
testM <- matrix(c(1,2,3, 1,1,5), ncol = 2, byrow = T)
--(testM==pmax(testM[,1],testM[,2]))
[,1] [,2]
[1,] 0 1
[2,] 1 0
[3,] 0 1
You can perform arithmetic on Booleans in R! Just check whether each element in a row is equal to that row's max value and multiply by 1.
t(apply(testM, 1, function(row) 1*(row == max(row))))
You can use max.col and col to produce a logical matrix:
res <- col(testM) == max.col(testM)
res
[,1] [,2]
[1,] FALSE TRUE
[2,] TRUE FALSE
[3,] FALSE TRUE
If you want it as 0/1, you can do:
res <- as.integer(col(testM) == max.col(testM)) # this removes the dimension
dim(res) <- dim(testM) # puts the dimension back
res
[,1] [,2]
[1,] 0 1
[2,] 1 0
[3,] 0 1
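As for why the original attempt returned 1 1 1: max() returns the largest value itself, and ifelse() treats any non-zero number as TRUE, so every row ends up as 1. The sketch below separates the two steps; `row_max` and `flags` are illustrative names.

```r
testM <- matrix(c(1, 2, 3, 1, 1, 5), ncol = 2, byrow = TRUE)

# max() returns the largest value itself, not a TRUE/FALSE flag
row_max <- apply(testM, 1, max)
row_max
# [1] 2 3 5

# ifelse() coerces its test to logical; any non-zero number counts as TRUE
flags <- ifelse(row_max, 1, 0)
flags
# [1] 1 1 1
```

Comparing each element against the row maximum, as the answers above do, is what produces a separate 0/1 value per column.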

How to print row index and occurrences count of zeros in rows in R data.frame

I want to print the row index and the number of zeros present in each row of an R data.frame.
The input matrix is like this:
A B
rowIndex1 0 1
rowIndex2 1 1
I thought to use this:
print(which(rowSums(matrix == 0) != 0))
I want that it prints something like this:
rowIndex1
1
However it does not print the number of zeros in the rows but a different number (I checked it) - like this:
rowIndex1
2400
How to achieve it?
Thanks
As mentioned in my comment, perhaps arr.ind would be of use.
Using @bartektartanus's sample data:
m <- diag(5) + c(0:6,0,0)
table(which(m == 0, arr.ind=TRUE)[, "row"])
#
# 2 3 4 5
# 1 2 1 1
The "names" (in this case, 2, 3, 4, and 5) are your row numbers and the values (in this case, 1, 2, 1, 1) are the counts.
Here is the output of which, so you can understand what is going on:
which(m == 0, arr.ind=TRUE)
# row col
# [1,] 3 2
# [2,] 4 2
# [3,] 5 2
# [4,] 2 4
# [5,] 3 4
This works as expected: you get the numbers of the rows that contain a zero.
> m <- diag(5) + c(0:6,0,0)
Warning message:
In diag(5) + c(0:6, 0, 0) :
longer object length is not a multiple of shorter object length
> m
[,1] [,2] [,3] [,4] [,5]
[1,] 1 5 1 6 2
[2,] 1 7 2 0 3
[3,] 2 0 4 0 4
[4,] 3 0 4 1 5
[5,] 4 0 5 1 7
> which(rowSums(m == 0) != 0)
[1] 2 3 4 5
To obtain what you want, use this:
> x <- rowSums(m==0)
> cbind(which(x!=0),x[x!=0])
[,1] [,2]
[1,] 2 1
[2,] 3 2
[3,] 4 1
[4,] 5 1
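With named rows, as in the question, rowSums() keeps the row names, so the counts can be printed next to them directly. Note that which() returns the positions of the matching rows rather than the number of zeros they contain. The sketch below assumes a small data frame shaped like the one shown in the question; `df` and `zeros` are illustrative names.

```r
# Small data frame mirroring the layout shown in the question
df <- data.frame(A = c(0, 1), B = c(1, 1),
                 row.names = c("rowIndex1", "rowIndex2"))

# Number of zeros per row; rowSums() keeps the row names
zeros <- rowSums(df == 0)

# Keep only rows that contain at least one zero
zeros[zeros != 0]
# rowIndex1
#         1
```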
