I have a matrix like so:
m <- matrix(c(1,1,1,1,1,1,1,1,0,1,1,1,1,1,1,1,1,1,
1,1,1,1,1,1,1,2,0,1,1,1,1,1,1,1,1,1), nrow = 12, ncol = 12)
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
[1,] 1 1 1 1 1 1 1 1 1 1 1 1
[2,] 1 1 2 1 1 2 1 1 2 1 1 2
[3,] 1 1 0 1 1 0 1 1 0 1 1 0
[4,] 1 1 1 1 1 1 1 1 1 1 1 1
[5,] 1 1 1 1 1 1 1 1 1 1 1 1
[6,] 1 1 1 1 1 1 1 1 1 1 1 1
[7,] 1 1 1 1 1 1 1 1 1 1 1 1
[8,] 1 1 1 1 1 1 1 1 1 1 1 1
[9,] 0 1 1 0 1 1 0 1 1 0 1 1
[10,] 1 1 1 1 1 1 1 1 1 1 1 1
[11,] 1 1 1 1 1 1 1 1 1 1 1 1
[12,] 1 1 1 1 1 1 1 1 1 1 1 1
and I want to find the index locations where a 0 is completely surrounded by 1s in a 3x3 window. I can find all the zeros with which(m == 0), but this also returns zeros that have a 2 in their 3x3 window, such as the one at m[3,3]:
w <- which(m == 0, arr.ind = TRUE)
w
# row col
# [1,] 9 1
# [2,] 3 3
# [3,] 9 4
# [4,] 3 6
# [5,] 9 7
# [6,] 3 9
# [7,] 9 10
# [8,] 3 12
We don't need the zeros that sit on a boundary (they have no complete 3x3 window), so filter those out:
w <- w[ w[,1] > 1 & w[,1] < nrow(m) & w[,2] > 1 & w[,2] < ncol(m), , drop = FALSE]
w
# row col
# [1,] 3 3
# [2,] 9 4
# [3,] 3 6
# [4,] 9 7
# [5,] 3 9
# [6,] 9 10
Now we can take those inner indices and build 3x3 submatrices into a list. Here are the first couple (of six):
Map(function(rn,cn) m[rn+(-1:1),cn+(-1:1)], w[,1], w[,2])[1:2]
# [[1]]
# [,1] [,2] [,3]
# [1,] 1 2 1
# [2,] 1 0 1
# [3,] 1 1 1
# [[2]]
# [,1] [,2] [,3]
# [1,] 1 1 1
# [2,] 1 0 1
# [3,] 1 1 1
Now we can keep only the submatrices that have exactly one non-1 entry (the center 0).
Filter(function(m3) sum(m3 != 1) == 1, Map(function(rn,cn) m[rn+(-1:1),cn+(-1:1)], w[,1], w[,2]))
# [[1]]
# [,1] [,2] [,3]
# [1,] 1 1 1
# [2,] 1 0 1
# [3,] 1 1 1
# [[2]]
# [,1] [,2] [,3]
# [1,] 1 1 1
# [2,] 1 0 1
# [3,] 1 1 1
# [[3]]
# [,1] [,2] [,3]
# [1,] 1 1 1
# [2,] 1 0 1
# [3,] 1 1 1
Since you need to just count the occurrences, add length(...) around that, and you have your answer.
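As a minimal sketch of that count, here are the steps so far collected into one self-contained snippet. The names inner, windows and n_enclosed are illustrative and not part of the original code.

```r
# Rebuild the example matrix from the question.
m <- matrix(c(1,1,1,1,1,1,1,1,0,1,1,1,1,1,1,1,1,1,
              1,1,1,1,1,1,1,2,0,1,1,1,1,1,1,1,1,1), nrow = 12, ncol = 12)

# Locate all zeros, then keep only the interior ones.
w <- which(m == 0, arr.ind = TRUE)
inner <- w[, 1] > 1 & w[, 1] < nrow(m) & w[, 2] > 1 & w[, 2] < ncol(m)
w <- w[inner, , drop = FALSE]

# Extract each 3x3 window and keep those whose only non-1 entry is the center.
windows <- Map(function(rn, cn) m[rn + (-1:1), cn + (-1:1)], w[, 1], w[, 2])
n_enclosed <- length(Filter(function(m3) sum(m3 != 1) == 1, windows))
n_enclosed
# [1] 3
```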
(If you're curious, the reason I went with sum(m3!=1)==1 is because I wasn't certain if you wanted the border submatrices as well. If you wanted those, then the number of 1s would be reduced, not "8" as a typical 3x3 would be. But we know that there should always be exactly one non-1 in the submatrix: the center 0.)
To get just the indices that match,
w[mapply(function(rn,cn) sum(m[rn+(-1:1),cn+(-1:1)] != 1) == 1,
w[,1], w[,2]),]
# row col
# [1,] 9 4
# [2,] 9 7
# [3,] 9 10
I already asked this question for Matlab and got the answer I needed. Here is the link to that question: how to create the matrix with the shifted version of the signal samples in Matlab?
As I am working on my data in both Matlab and R, I am trying to develop the same code for R. In that attempt I asked the following question: How to form the matrix of logical '1' and '0' using two vectors and logical operators in r? Using the answer given to that second question (not yet accepted), I developed the following R code.
library("pracma")
x=c(1:10)
N=numel(x)
step=2
index=seq(N,1,by=-step)
M=numel(index)
r1 <- c(rbind((index), rev(index)))
val<-matrix(rep(rep(c(1, 0), M), r1), ncol = M)
val1=val*matrix(repmat(c(x,matrix(0,1,step)),1,M),12,M)
out=matrix(val1[1:(N*M)],N,M)
which results in:
> out
[,1] [,2] [,3] [,4] [,5]
[1,] 1 0 0 0 0
[2,] 2 0 0 0 0
[3,] 3 1 0 0 0
[4,] 4 2 0 0 0
[5,] 5 3 1 0 0
[6,] 6 4 2 0 0
[7,] 7 5 3 1 0
[8,] 8 6 4 2 0
[9,] 9 7 5 3 1
[10,] 10 8 6 4 2
But when I change the step size it no longer works. How can I generalize this code so that it works for any step size?
As @NelsonGon mentioned in the comments, wrapping the code in a function with a slight modification would help:
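f1 <- function(v, step) {
  N <- pracma::numel(v)
  index <- seq(N, 1, by = -step)
  M <- pracma::numel(index)
  r1 <- c(rbind(index, rev(index)))
  val <- matrix(rep(rep(c(1, 0), M), r1), ncol = M)
  # number the non-zero entries of each column 1, 2, ...
  val[!!val] <- sequence(colSums(val))
  # keep the first N rows, then move the zeros of each column to the top
  apply(val[seq_len(N), , drop = FALSE], 2, function(x) x[order(x != 0)])
}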
Testing:
f1(1:10, 2)
# [,1] [,2] [,3] [,4] [,5]
# [1,] 1 0 0 0 0
# [2,] 2 0 0 0 0
# [3,] 3 1 0 0 0
# [4,] 4 2 0 0 0
# [5,] 5 3 1 0 0
# [6,] 6 4 2 0 0
# [7,] 7 5 3 1 0
# [8,] 8 6 4 2 0
# [9,] 9 7 5 3 1
#[10,] 10 8 6 4 2
f1(1:10, 3)
# [,1] [,2] [,3] [,4]
# [1,] 1 0 0 0
# [2,] 2 0 0 0
# [3,] 3 0 0 0
# [4,] 4 1 0 0
# [5,] 5 2 0 0
# [6,] 6 3 0 0
# [7,] 7 4 1 0
# [8,] 8 5 2 0
# [9,] 9 6 3 0
#[10,] 10 7 4 1
Note: If the intention is to keep columns with only 0's as well, it can be done in the function
f1(1:10, 4)
# [,1] [,2] [,3]
# [1,] 1 0 0
# [2,] 2 0 0
# [3,] 3 0 0
# [4,] 4 0 0
# [5,] 5 1 0
# [6,] 6 2 0
# [7,] 7 3 0
# [8,] 8 4 0
# [9,] 9 5 1
#[10,] 10 6 2
f1(1:10, 5)
# [,1] [,2]
# [1,] 1 0
# [2,] 2 0
# [3,] 3 0
# [4,] 4 0
# [5,] 5 0
# [6,] 6 1
# [7,] 7 2
# [8,] 8 3
# [9,] 9 4
#[10,] 10 5
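Note that f1 fills the columns with the positions 1, 2, ..., which equals the signal only when v is 1:N. A base-R sketch that shifts the actual values of v might look like the following. shift_matrix is a hypothetical helper, not part of the answer above, and it assumes step is a positive integer.

```r
# Hypothetical helper: column k holds v delayed by (k - 1) * step samples,
# padded with leading zeros and truncated to length(v).
shift_matrix <- function(v, step) {
  n <- length(v)
  shifts <- seq(0, n - 1, by = step)
  sapply(shifts, function(s) c(rep(0, s), v[seq_len(n - s)]))
}

out <- shift_matrix(c(5, 7, 9, 11), 2)
out
#      [,1] [,2]
# [1,]    5    0
# [2,]    7    0
# [3,]    9    5
# [4,]   11    7
```

For v = 1:10 this gives the same matrices as the f1 calls shown above.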
I just was wondering if there was an easy way to compute the maximal number of identical elements between any two columns of a matrix in R.
For example, I have a matrix
test <- replicate(10, sample((0:3), 10, replace = TRUE))
test
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 3 0 1 0 2 2 1 0 2 0
[2,] 1 1 3 2 0 2 3 0 2 2
[3,] 2 3 0 0 1 2 0 3 0 2
[4,] 2 2 1 1 2 0 0 1 1 0
[5,] 2 0 1 2 0 1 1 1 0 0
[6,] 1 0 1 3 2 3 3 1 3 2
[7,] 0 1 3 2 1 0 1 2 1 1
[8,] 0 3 1 3 0 2 3 1 1 1
[9,] 2 3 1 3 0 1 0 1 3 2
[10,] 3 2 1 0 2 1 3 2 3 1
To compare column 1 and 2 I use
table(test[,1] == test[,2])
FALSE TRUE
8 2
So there are two identical elements between these two columns.
I could now repeat this for all pairs of columns using two nested for loops and then find the maximum number of TRUE calls but this does not look nice. Can anyone think of a better way?
Cheers,
Maik
It is always interesting to see a reasonable answer being voted down. Though I don't like this minus score, I would keep my answer. Voter, what do you think?
Let's first get some reproducible toy data:
set.seed(0); x <- replicate(10, sample((0:3), 10, replace = TRUE))
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
# [1,] 3 0 3 1 1 2 1 3 3 0
# [2,] 1 0 3 1 3 1 3 1 1 0
# [3,] 1 0 0 2 2 3 1 3 2 0
# [4,] 2 2 2 1 3 1 1 1 1 2
# [5,] 3 1 0 0 2 0 1 1 1 3
# [6,] 0 3 1 3 2 0 2 1 3 3
# [7,] 3 1 1 2 3 0 1 3 0 3
# [8,] 3 2 0 3 0 1 1 3 2 1
# [9,] 2 3 1 0 1 2 3 1 0 1
#[10,] 2 1 3 2 2 2 0 3 0 3
For any input matrix x, you can use:
y <- unlist(lapply(seq_len(ncol(x)-1L),
function(i) colSums(x[, (i+1):ncol(x), drop = FALSE] == x[, i])))
# [1] 1 2 3 2 4 1 4 2 3 3 1 0 0 3 1 3 5 1 3 1 2 4 1 4 3 4 2 3 5 1 1 3 2 1 2 2 3 3
#[39] 1 2 3 1 4 3 1
max(y)
# [1] 5
The comment by @David does essentially the same thing, but much more slowly:
y <- combn(ncol(x), 2, FUN = function(u) sum(x[, u[1]] == x[, u[2]]))
# [1] 1 2 3 2 4 1 4 2 3 3 1 0 0 3 1 3 5 1 3 1 2 4 1 4 3 4 2 3 5 1 1 3 2 1 2 2 3 3
#[39] 1 2 3 1 4 3 1
max(y)
# [1] 5
Benchmarking
We generate a 10 x 1000 matrix for the experiment:
set.seed(0); x <- replicate(1e+3, sample((0:3), 10, replace = TRUE))
system.time(unlist(lapply(seq_len(ncol(x)-1L), function(i) colSums(x[, (i+1):ncol(x), drop = FALSE] == x[, i]))))
# user system elapsed
# 0.176 0.032 0.207
system.time(combn(ncol(x), 2, FUN = function(u) sum(x[, u[1]] == x[, u[2]])))
# user system elapsed
# 4.692 0.008 4.708
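Another fully vectorised formulation, offered only as a sketch and not benchmarked here, counts matches for all pairs at once with crossprod. It builds a full ncol(x) by ncol(x) matrix, so it assumes that matrix fits in memory. The names vals, same and max_same are illustrative.

```r
set.seed(0)
x <- replicate(50, sample(0:3, 10, replace = TRUE))

# For each distinct value v, crossprod(x == v) counts, for every pair of
# columns, the rows where both columns equal v. Summing over v gives the
# number of identical elements for every pair of columns.
vals <- unique(as.vector(x))
same <- Reduce(`+`, lapply(vals, function(v) crossprod((x == v) + 0)))

# Ignore the diagonal (each column compared with itself).
max_same <- max(same[upper.tri(same)])
```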
Something like a distance matrix?
With this idea, you could also generate a "distance" matrix for number of non-equal elements between all columns (just replace the == with !=):
y <- unlist(lapply(seq_len(ncol(x)-1L),
function(i) colSums(x[, (i+1):ncol(x), drop = FALSE] != x[, i])))
z <- matrix(0L, ncol(x), ncol(x))
z[lower.tri(z)] <- y
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
# [1,] 0 0 0 0 0 0 0 0 0 0
# [2,] 9 0 0 0 0 0 0 0 0 0
# [3,] 8 7 0 0 0 0 0 0 0 0
# [4,] 7 9 9 0 0 0 0 0 0 0
# [5,] 8 10 7 7 0 0 0 0 0 0
# [6,] 6 10 9 6 9 0 0 0 0 0
# [7,] 9 7 8 8 7 8 0 0 0 0
# [8,] 6 9 6 7 8 7 8 0 0 0
# [9,] 8 7 9 5 9 7 7 6 0 0
#[10,] 7 5 6 9 8 9 9 7 9 0
Note that only the lower triangle is computed, because the matrix is symmetric. The diagonal entries are all zero (of course).
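If the full symmetric matrix or a "dist" object is needed, a possible sketch is to mirror the lower triangle. z_full and d are illustrative names; as.dist() is from the stats package that is attached by default.

```r
set.seed(0)
x <- replicate(10, sample(0:3, 10, replace = TRUE))

# Number of non-equal elements for each pair of columns (lower triangle only).
y <- unlist(lapply(seq_len(ncol(x) - 1L),
                   function(i) colSums(x[, (i + 1):ncol(x), drop = FALSE] != x[, i])))
z <- matrix(0L, ncol(x), ncol(x))
z[lower.tri(z)] <- y

# Mirror the lower triangle; the diagonal stays zero.
z_full <- z + t(z)

# as.dist() reads the lower triangle and returns a "dist" object,
# which functions such as hclust() accept.
d <- as.dist(z)
```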
Try:
max(combn(split(test, col(test)), 2, function(x) sum(x[[1]] == x[[2]])))
If you want to know which pair has the greatest number of equal elements it's a little more complicated.
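One way to recover the pair (or pairs, if there are ties) is to keep the column indices next to the counts. This is only a sketch; pairs, matches and best are illustrative names, and the data are regenerated here so the snippet runs on its own.

```r
set.seed(0)
test <- replicate(10, sample(0:3, 10, replace = TRUE))

# One column of `pairs` per pair of column indices of `test`.
pairs <- combn(ncol(test), 2)
matches <- apply(pairs, 2, function(p) sum(test[, p[1]] == test[, p[2]]))

# Keep every pair that attains the maximum (there may be ties).
best <- pairs[, matches == max(matches), drop = FALSE]
```

Each column of best holds the two column indices of one winning pair.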
So, my goal is to take an input vector and build an output matrix of running counters. Every time a value appears in my input, I want to find its counter and increment it by 1. I understand that I'm not good at explaining this, so I illustrated a simple version below. However, I want to make 2 changes, which I will list after the example so that they make sense.
nums = c(1,2,3,4,5,1,2,4,3,5)
unis = unique(nums)
counter = matrix(NA, nrow = length(nums), ncol = length(unis))
colnames(counter) = unis
for (i in 1:length(nums)){
temp = nums[i]
if (i == 1){
counter[1,] = 0
counter[1,temp] = 1
} else {
counter[i,] = counter[i-1,]
counter[i,temp] = counter[i-1,temp]+1
}
}
counter
which outputs
> counter
1 2 3 4 5
[1,] 1 0 0 0 0
[2,] 1 1 0 0 0
[3,] 1 1 1 0 0
[4,] 1 1 1 1 0
[5,] 1 1 1 1 1
[6,] 2 1 1 1 1
[7,] 2 2 1 1 1
[8,] 2 2 1 2 1
[9,] 2 2 2 2 1
[10,] 2 2 2 2 2
The 2 modifications: 1) Since the real data is much larger, I would like to do this using apply, or however people who know R better than I do say it should be done. 2) Here each element of the input vector is a single value; how could this be generalized if an element of the vector were a tuple? For example, if the last element of nums were the tuple (4, 5), both counters would be incremented in that step, and the last line of the output would then be 2,2,2,3,2.
Thanks and if you don't understand please ask questions and I'll try to clarify
Using the Matrix package (which ships with a standard installation of R)
nums <- c(1,2,3,4,5,1,2,4,3,5)
apply(Matrix::sparseMatrix(i=seq_along(nums), j=nums), 2, cumsum)
# [,1] [,2] [,3] [,4] [,5]
# [1,] 1 0 0 0 0
# [2,] 1 1 0 0 0
# [3,] 1 1 1 0 0
# [4,] 1 1 1 1 0
# [5,] 1 1 1 1 1
# [6,] 2 1 1 1 1
# [7,] 2 2 1 1 1
# [8,] 2 2 1 2 1
# [9,] 2 2 2 2 1
# [10,] 2 2 2 2 2
Note that this behaves a bit differently in a couple of ways from thelatemail's suggested solution. Which behavior you prefer will depend on what you are using this for.
Here's a small example that illustrates the differences:
nums <- c(5,2,1,1)
# My suggestion
apply(Matrix::sparseMatrix(i=seq_along(nums), j=nums), 2, cumsum)
# [,1] [,2] [,3] [,4] [,5]
# [1,] 0 0 0 0 1
# [2,] 0 1 0 0 1
# [3,] 1 1 0 0 1
# [4,] 2 1 0 0 1
# @thelatemail's suggestion
sapply(unique(nums), function(x) cumsum(nums==x) )
# [,1] [,2] [,3]
# [1,] 1 0 0
# [2,] 1 1 0
# [3,] 1 1 1
# [4,] 1 1 2
For your second question, you could do something like this:
nums <- list(1,2,3,4,5,1,2,4,3,c(4,5))
ii <- rep(seq_along(nums), times=lengths(nums)) ## lengths() is in R>=3.2.0
jj <- unlist(nums)
apply(Matrix::sparseMatrix(i=ii, j=jj), 2, cumsum)
# [,1] [,2] [,3] [,4] [,5]
# [1,] 1 0 0 0 0
# [2,] 1 1 0 0 0
# [3,] 1 1 1 0 0
# [4,] 1 1 1 1 0
# [5,] 1 1 1 1 1
# [6,] 2 1 1 1 1
# [7,] 2 2 1 1 1
# [8,] 2 2 1 2 1
# [9,] 2 2 2 2 1
# [10,] 2 2 2 3 2
For your first query, you can get there with something like:
sapply(unique(nums), function(x) cumsum(nums==x) )
# [,1] [,2] [,3] [,4] [,5]
# [1,] 1 0 0 0 0
# [2,] 1 1 0 0 0
# [3,] 1 1 1 0 0
# [4,] 1 1 1 1 0
# [5,] 1 1 1 1 1
# [6,] 2 1 1 1 1
# [7,] 2 2 1 1 1
# [8,] 2 2 1 2 1
# [9,] 2 2 2 2 1
#[10,] 2 2 2 2 2
Another idea:
do.call(rbind, Reduce("+", lapply(nums, tabulate, max(unlist(nums))), accumulate = TRUE))
# [,1] [,2] [,3] [,4] [,5]
# [1,] 1 0 0 0 0
# [2,] 1 1 0 0 0
# [3,] 1 1 1 0 0
# [4,] 1 1 1 1 0
# [5,] 1 1 1 1 1
# [6,] 2 1 1 1 1
# [7,] 2 2 1 1 1
# [8,] 2 2 1 2 1
# [9,] 2 2 2 2 1
#[10,] 2 2 2 2 2
And generally:
x = list(1, 3, 6, c(6, 3), 2, c(4, 6, 1), c(1, 2), 3)
do.call(rbind, Reduce("+", lapply(x, tabulate, max(unlist(x))), accumulate = TRUE))
# [,1] [,2] [,3] [,4] [,5] [,6]
#[1,] 1 0 0 0 0 0
#[2,] 1 0 1 0 0 0
#[3,] 1 0 1 0 0 1
#[4,] 1 0 2 0 0 2
#[5,] 1 1 2 0 0 2
#[6,] 2 1 2 1 0 3
#[7,] 3 2 2 1 0 3
#[8,] 3 2 3 1 0 3
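tabulate() only counts positive integers. If the labels are strings or arbitrary numbers, one possible sketch is to map them to positions with match() first. The data and the names lv, counts and res are made up for illustration, and the snippet assumes at least two distinct labels so that sapply() returns a matrix.

```r
x <- list("a", "c", c("c", "b"), "a")

# Sorted distinct labels define the columns.
lv <- sort(unique(unlist(x)))

# One row per step: how often each label occurs in that step.
counts <- t(sapply(x, function(el) tabulate(match(el, lv), nbins = length(lv))))

# Running totals down each column.
res <- apply(counts, 2, cumsum)
colnames(res) <- lv
res
#      a b c
# [1,] 1 0 0
# [2,] 1 0 1
# [3,] 1 1 2
# [4,] 2 1 2
```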
I have a table (t1) as follows:
t1 <- array(1:20, dim=c(10,10))
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 1 11 1 11 1 11 1 11 1 11
[2,] 2 12 2 12 2 12 2 12 2 12
[3,] 3 13 3 13 3 13 3 13 3 13
[4,] 4 14 4 14 4 14 4 14 4 14
[5,] 5 15 5 15 5 15 5 15 5 15
[6,] 6 16 6 16 6 16 6 16 6 16
[7,] 7 17 7 17 7 17 7 17 7 17
[8,] 8 18 8 18 8 18 8 18 8 18
[9,] 9 19 9 19 9 19 9 19 9 19
[10,] 10 20 10 20 10 20 10 20 10 20
I want to transform this table into 1s and 0s. If a cell's value is > 5, it becomes 1; if it is <= 5, it becomes 0. After the transformation, table t1 will look like this:
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 0 1 0 1 0 1 0 1 0 1
[2,] 0 1 0 1 0 1 0 1 0 1
[3,] 0 1 0 1 0 1 0 1 0 1
[4,] 0 1 0 1 0 1 0 1 0 1
[5,] 0 1 0 1 0 1 0 1 0 1
[6,] 1 1 1 1 1 1 1 1 1 1
[7,] 1 1 1 1 1 1 1 1 1 1
[8,] 1 1 1 1 1 1 1 1 1 1
[9,] 1 1 1 1 1 1 1 1 1 1
[10,] 1 1 1 1 1 1 1 1 1 1
Which commands should I use?
You can just use an ifelse() statement:
ifelse(t1 > 5, 1, 0)
Since a statement like t1 > 5 will result in a matrix of TRUE and FALSE, and since R treats TRUE as "1" and FALSE as "0", for this particular transformation, you could also do:
(t1 > 5)+0
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 0 1 0 1 0 1 0 1 0 1
[2,] 0 1 0 1 0 1 0 1 0 1
[3,] 0 1 0 1 0 1 0 1 0 1
[4,] 0 1 0 1 0 1 0 1 0 1
[5,] 0 1 0 1 0 1 0 1 0 1
[6,] 1 1 1 1 1 1 1 1 1 1
[7,] 1 1 1 1 1 1 1 1 1 1
[8,] 1 1 1 1 1 1 1 1 1 1
[9,] 1 1 1 1 1 1 1 1 1 1
[10,] 1 1 1 1 1 1 1 1 1 1
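A small sketch comparing these variants, plus two others that are not in the answer above: unary plus, and in-place assignment with t2[]. The variable names are illustrative.

```r
t1 <- array(1:20, dim = c(10, 10))

a <- ifelse(t1 > 5, 1, 0)   # numeric matrix
b <- (t1 > 5) + 0           # numeric matrix
d <- +(t1 > 5)              # integer matrix (unary plus coerces logical)

# Overwrite the values in place while keeping the dimensions.
t2 <- t1
t2[] <- as.integer(t1 > 5)

# All four hold the same 0/1 pattern.
stopifnot(all(a == b), all(b == d), all(d == t2))
```

The empty brackets in t2[] matter: as.integer() drops the dim attribute, and assigning into t2[] keeps the 10 x 10 shape.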
I haven't timed it, but an alternative to ifelse is just to use a conditional statement.
foo <- bar > 5
Oh, dang, Julius went and beat me to it by 10 seconds.