I was wondering if anyone could help me understand the output of this function. I know it's supposed to return the positions in which there is a run of length 2 but I am not exactly sure how to interpret the output.
fun1 = function(M,k) {
  n = nrow(M)
  m = ncol(M)
  runs = vector('list',length=m)
  for(i in 1:m) {
    for(j in 1:(n-k+1)) {
      if(all(M[j:(j+k-1),i]==1)) runs[[i]] = c(runs[[i]],j)
    }
  }
  return(runs)
}
set.seed(123)
M = matrix(sample(0:1,size=15,replace=TRUE),ncol=3,nrow=5)
fun1(M,2)
Output:
[[1]]
[1] 4
[[2]]
[1] 2 3
[[3]]
[1] 3
Each element in the list is the output for one column, starting at the left-most column. The vector of numbers (or NULL if there are none) gives the row numbers in that column at which a run of two consecutive 1's starts.
To interpret the sample output you have:
- In the first (left-most) column, there are two 1's starting at row 4 (M[4,1] and M[5,1] are 1)
- In the second column, there are two 1's starting at row 2 (meaning row 2 and row 3 are 1's) and also at row 3 (meaning row 3 and row 4 are 1's)
- In the third column, there are two 1's starting at row 3 (meaning row 3 and row 4 are 1's)
You can check that this is true if you print out the matrix M, which given your seed looks like this:
[,1] [,2] [,3]
[1,] 0 0 1
[2,] 1 1 0
[3,] 0 1 1
[4,] 1 1 1
[5,] 1 0 0
I hope that makes it clear.
By the way, in the future, try to format your code better with proper indentations and line breaks. I had to manually add line breaks to make the sample code work, but good job giving a seed :)
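As a cross-check on the interpretation above, here is a minimal sketch that recomputes the run starts one column at a time with base R's embed(). The helper name run_starts is made up for illustration, and the matrix is typed in by hand from the printout above rather than regenerated from the seed, because sample() can return different values for the same seed in R 3.6.0 and later.

```r
# Hypothetical helper: start positions of runs of k consecutive 1's in a vector
run_starts <- function(x, k) {
  if (length(x) < k) return(integer(0))
  # Row j of embed(x, k) holds x[j + k - 1], ..., x[j], i.e. one window of length k
  windows <- embed(x, k)
  which(rowSums(windows == 1) == k)
}

# The matrix M as printed above, entered column by column
M <- matrix(c(0, 1, 0, 1, 1,
              0, 1, 1, 1, 0,
              1, 0, 1, 1, 0), nrow = 5, ncol = 3)

# One list element per column, like fun1(M, 2)
runs <- lapply(seq_len(ncol(M)), function(i) run_starts(M[, i], 2))
runs
```

One difference from fun1: a column without any run gives integer(0) here instead of NULL.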
I have a numeric matrix, and I need to extract the set of elements with the largest possible sum, subject to the constraint that no 2 elements can come from the same row or the same column. Is there any efficient algorithm for this, and is there an implementation of that algorithm for R?
For example, if the matrix is (using R's matrix notation):
[,1] [,2] [,3]
[1,] 7 1 9
[2,] 8 4 2
[3,] 3 6 5
then the unique solution is [1,3], [2,1], [3,2], which extracts the numbers 9, 8, and 6 for a total of 23. However, if the matrix is:
[,1] [,2] [,3]
[1,] 6 2 1
[2,] 4 9 5
[3,] 8 7 3
then there are 3 equally good solutions: 1,8,9; 3,6,9; and 5,6,7. These all add up to 18.
Additional notes:
If there are multiple equally good solutions, I need to find all of them. (Being able to find additional solutions that are almost as good would be useful as well, but not essential.)
The matrix elements are all non-negative, and many of them will be zero. Each row and column will contain at least 1 element that is nonzero.
The matrix can contain repeated elements.
The matrix need not be square. It might have more rows than columns or vice versa, but the constraint is always the same: no row or column may be used twice.
This problem could also be reformulated as finding a maximal-scoring set of edges between the 2 halves of a bipartite graph without re-using any node.
If it helps, you may assume that there is some small fixed k such that no row or column contains more than k non-zero values.
If anyone is curious, the rows of the matrix represent items to be labeled, the columns represent the labels, and each matrix element represents the "consistency score" for assigning a label to an item. I want to assign each label to exactly one item in the way that maximizes the total consistency.
My suggestion would be to (1) find all the combinations of elements such that no two elements in a combination come from the same row or the same column, (2) calculate the sum of the elements in each combination, and (3) find the maximum sum and the corresponding combination(s).
Here I only show the square matrix case; the non-square case would follow a similar idea.
(1) Suppose the matrix is n*n. Keeping the row order as 1 to n, all I need to do is find all the permutations of the column indices (1:n). Combining the row indices with one permutation of the column indices gives the positions of the elements in one combination that follows the rule, and doing this for every permutation gives the positions for all the combinations.
library(combinat)
## provides permn()
matrix_data <- matrix(c(6,2,1,4,9,5,8,7,3), byrow=TRUE, nrow = 3)
## example matrix
n_length <- dim(matrix_data)[1]
## row length
all_permutation <- permn(c(1:n_length))
## list of all the permutations of columns index
(2) Find sum of elements in each combination
index_func <- function(x){ ## x will be a permutation from the list all_permutation
  matrix_indexs <- matrix(data = c(c(1:n_length),x),
                          byrow = FALSE, nrow = n_length)
  ## combine row index and column index to construct the positions of the elements in the matrix
  matrix_elements <- matrix_data[matrix_indexs]
  ## extract the elements based on their position
  matrix_combine <- cbind(matrix_indexs,matrix_elements)
  ## combine the above two matrices
  return(matrix_combine)
}
results <- sapply(all_permutation, function(x) sum(index_func(x)[,"matrix_elements"]))
## find the sums of all the combinations (sapply needs a function, so the call is wrapped in one)
(3) Find the maximum sum and corresponding combination
max(results) ## 18 maximum sum is 18
max_index <- which(results==max(results)) ## 1 2 4 there are three combinations
## if you want the complete position index
lapply(all_permutation[max_index], index_func)
## output, first column is row index, second column is column index, last column is the corresponding matrix elements
[[1]]
matrix_elements
[1,] 1 1 6
[2,] 2 2 9
[3,] 3 3 3
[[2]]
matrix_elements
[1,] 1 1 6
[2,] 2 3 5
[3,] 3 2 7
[[3]]
matrix_elements
[1,] 1 3 1
[2,] 2 2 9
[3,] 3 1 8
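The same brute-force idea can also be sketched in base R alone, without permn. This is only an illustration under the same assumption as above (a square matrix); the names perms and best_assignments are made up here. The number of permutations grows as n!, so it is only practical for small matrices.

```r
# Hypothetical helper: all permutations of a vector, as a list
perms <- function(v) {
  if (length(v) <= 1) return(list(v))
  out <- list()
  for (i in seq_along(v)) {
    for (p in perms(v[-i])) {
      out[[length(out) + 1]] <- c(v[i], p)
    }
  }
  out
}

# Hypothetical helper: best total and every column permutation that reaches it
best_assignments <- function(mat) {
  stopifnot(nrow(mat) == ncol(mat))   # square case only
  n <- nrow(mat)
  all_perms <- perms(seq_len(n))
  # Row i is paired with column p[i]; a two-column index matrix picks those cells
  totals <- vapply(all_perms,
                   function(p) sum(mat[cbind(seq_len(n), p)]),
                   numeric(1))
  list(total = max(totals), columns = all_perms[totals == max(totals)])
}

best_assignments(matrix(c(6, 2, 1, 4, 9, 5, 8, 7, 3), byrow = TRUE, nrow = 3))
```

Each element of columns gives, for rows 1 to n in order, the column chosen in that row.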
Here are 2 options:
1) Approach this as an optimization problem: the objective is to maximize the sum of the chosen elements, subject to the constraint that each row and each column is selected at most once.
sample data:
set.seed(0L)
m <- matrix(sample(12), nrow=4)
#m <- matrix(sample(16), nrow=4)
m
[,1] [,2] [,3]
[1,] 9 2 6
[2,] 4 5 11
[3,] 7 3 12
[4,] 1 8 10
code:
library(lpSolve)
nr <- nrow(m)
nc <- ncol(m)
#create the indicator matrix for column indexes
colmat <- data.table::shift(c(rep(1, nr), rep(0, (nc-1)*nr)), seq(0, by=nr, length.out=nc), fill=0)
#create indicator matrix for row indexes
rowmat <- data.table::shift(rep(c(1, rep(0, nr-1)), nc), 0:(nr-1), fill=0)
A <- do.call(rbind, c(colmat, rowmat))
#call lp solver
res <- lp("max",
as.vector(m),
A,
rep("<=", nrow(A)),
rep(1, nrow(A)),
all.bin=TRUE,
num.bin.solns=3)
sample output:
which(matrix(res$solution[1:ncol(A)], nrow=nr)==1L, arr.ind=TRUE)
row col
[1,] 1 1
[2,] 4 2
[3,] 3 3
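To make the constraint matrix less opaque, here is a sketch that builds what should be the same A in base R with kronecker() and checks what each of its rows measures. The variable names col_constraints, row_constraints, x and usage are made up for this illustration, and the selection tested is the one reported in the sample output above.

```r
# Same shape as the sample data: 4 rows, 3 columns
nr <- 4
nc <- 3

# One constraint row per matrix column, then one per matrix row.
# The columns of A follow as.vector(m), i.e. column-major order.
col_constraints <- kronecker(diag(nc), t(rep(1, nr)))
row_constraints <- kronecker(t(rep(1, nc)), diag(nr))
A <- rbind(col_constraints, row_constraints)

# 0/1 selection matrix for the cells (1,1), (4,2), (3,3)
x <- matrix(0, nr, nc)
x[cbind(c(1, 4, 3), c(1, 2, 3))] <- 1

# First nc entries: picks per column; remaining nr entries: picks per row
usage <- as.vector(A %*% as.vector(x))
usage
```

A selection is feasible exactly when every entry of usage is at most 1, which is what the "<=" constraints with right-hand side 1 express.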
2) The above suggests a greedy heuristic: pick the largest element, eliminate its row and column, and then repeat on the smaller matrix:
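v <- integer(min(nc, nr))
allix <- matrix(0, nrow=length(v), ncol=2)
for (k in seq_along(v)) {
  # position of the largest remaining element (the first one if there are ties)
  ix <- which(m == max(m), arr.ind=TRUE)[1, , drop=FALSE]
  # note: these indices refer to the current, reduced matrix, not the original
  allix[k,] <- ix
  v[k] <- m[ix]
  # drop the chosen row and column
  m <- m[-ix[1], -ix[2], drop=FALSE]
}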
v
#[1] 12 9 8
But this does not produce multiple solutions, so I have not developed it further to extract the indices.
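If the indices are wanted anyway, one way to sketch it is to track which original rows and columns are still available instead of shrinking the matrix. The function name greedy_pick is made up for this illustration, and ties are broken by taking the first maximum. The second call shows why this is only a heuristic: on that small matrix the greedy total is 10, while picking the two 9's would give 18.

```r
# Hypothetical helper: greedy picks with row/col indices into the original matrix
greedy_pick <- function(m) {
  rows <- seq_len(nrow(m))
  cols <- seq_len(ncol(m))
  n_picks <- min(nrow(m), ncol(m))
  picks <- matrix(NA_real_, nrow = n_picks, ncol = 3,
                  dimnames = list(NULL, c("row", "col", "value")))
  for (k in seq_len(n_picks)) {
    sub <- m[rows, cols, drop = FALSE]
    # first position of the largest remaining value, relative to sub
    ix <- which(sub == max(sub), arr.ind = TRUE)[1, ]
    # translate back to indices in the original matrix
    picks[k, ] <- c(rows[ix[1]], cols[ix[2]], max(sub))
    rows <- rows[-ix[1]]
    cols <- cols[-ix[2]]
  }
  picks
}

# The sample matrix from above, typed in column by column
greedy_pick(matrix(c(9, 4, 7, 1, 2, 5, 3, 8, 6, 11, 12, 10), nrow = 4))

# A case where greedy is not optimal
greedy_pick(matrix(c(10, 9, 9, 0), nrow = 2))
```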
I have many matrices of size 300*300, saved in a list named L. These are binary matrices; they contain only 0 and 1 values.
I plot images from those matrices (for example with c.img).
What is the best way to create a stack of those matrices? I want to create a new matrix, and for the pixel at the (i,j) position, I want to look in all the matrices saved in L: if one or more of them has a 1 at this position, then the (i,j) pixel in my new matrix will have the value 1, else 0.
Here is some pseudocode to help you understand my goal:
L <- list(rep(matrix(0 or 1,300,300),n)
new_matrix<-matrix(0,300,300)
new_matrix[i,j]<- max(L[i,j])
But this code doesn't work because L is a list. I'm pretty sure I can achieve this using 3 loops (i,j,n), but because I have many matrices that will take too long, and I'm looking for a faster solution.
You can use:
matrix(as.numeric(Reduce(`+`, L) > 0), 300, 300)
This sums all the matrices in the list L element-wise, then converts values greater than 0 to 1 with as.numeric.
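For comparison, here is a small sketch of three element-wise reductions that should all give the same 0/1 result on binary matrices. The test list is made up (four random 5x5 matrices), and the names stacked_sum, stacked_or and stacked_max are only for illustration.

```r
set.seed(1)
# Made-up stand-in for the real list: four random binary 5x5 matrices
L <- replicate(4, matrix(rbinom(25, 1, 0.2), 5, 5), simplify = FALSE)

# Sum, then threshold; multiplying by 1 turns TRUE/FALSE into 1/0 and keeps the dims
stacked_sum <- (Reduce(`+`, L) > 0) * 1

# Element-wise logical OR across the list
stacked_or <- Reduce(`|`, L) * 1

# Element-wise maximum across the list
stacked_max <- Reduce(pmax, L)

all(stacked_sum == stacked_or)
all(stacked_sum == stacked_max)
```

Because the comparison and the multiplication both keep the matrix dimensions, no explicit matrix(..., 300, 300) call is needed with these variants.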
I would use a multi-dimensional array if all images are the same dimensions.
# Two matrices
m1 <- matrix(1,5,5)
m2 <- matrix(0,5,5)
# Placing them in an array (two matrices of dimension 5x5)
my_array <- array(c(m1,m2),dim = c(5,5,2))
# Investigating content of position 1,1 for matrices 1 to 2
my_array[1,1,1:2]
[1] 1 0
# You can even look at larger regions across matrices
my_array[1:3,1:3,1:2]
, , 1
[,1] [,2] [,3]
[1,] 1 1 1
[2,] 1 1 1
[3,] 1 1 1
, , 2
[,1] [,2] [,3]
[1,] 0 0 0
[2,] 0 0 0
[3,] 0 0 0
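To connect the array idea back to a list of matrices, here is a sketch that packs a list into a 3-d array and collapses the third dimension with apply(). The list L below is made-up test data (three random 4x5 binary matrices); arr and new_matrix are illustrative names.

```r
set.seed(42)
# Made-up stand-in for the real list of binary matrices
L <- replicate(3, matrix(rbinom(20, 1, 0.3), 4, 5), simplify = FALSE)

# unlist() concatenates the matrices column by column, one after another,
# so slice k of the array is L[[k]]
arr <- array(unlist(L), dim = c(4, 5, length(L)))

# For every (i, j), take the maximum over the third dimension
new_matrix <- apply(arr, c(1, 2), max)
new_matrix
```

apply() calls max once per pixel, so for large stacks an element-wise reduction over the list is likely to be faster; the array form is mainly convenient for slicing and inspection.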
In R, I want to count the number of different values occurring in a column of a matrix, but only if a certain value occurs in another column. To clarify, consider this matrix:
MAT <- matrix(nrow=5,ncol=2, c(1,0,1,1,2,1,1,1,2,0))
The matrix looks like this:
> MAT
[,1] [,2]
[1,] 1 1
[2,] 0 1
[3,] 1 1
[4,] 1 2
[5,] 2 0
I would like to find the number of '1's occurring in column 2, but only if '0' occurs in column 1 in the same row. The only function I know which does something similar is table, but I don't think it can check another column; it can only exclude values in the data being checked. (Please correct me on this if I am wrong.) I have tried searching on the internet, but I only get hits to unrelated problems.
Can anyone help me find a function for this problem?
You can do something like this:
sum(MAT[,2]==1 & MAT[,1]==0)
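You can always subset the matrix with a condition like this:
MAT[ MAT[,1] == 0, ]
and then tabulate column 2 of the matching rows, so that the zeros from column 1 do not end up in the counts:
table( MAT[ MAT[,1] == 0, 2 ] )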
This will give you the rows:
which(MAT[,1]==0 & MAT[,2]==1)
And the length of that is how many times that pattern occurs.
You can use table:
table(MAT[,2]==1 & MAT[,1]==0)
FALSE TRUE
4 1
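On the question of whether table can check another column: given two vectors it builds a two-way contingency table, so every combination of values is counted at once. A minimal sketch on the same MAT (the dimension names col1 and col2 are just labels chosen here):

```r
MAT <- matrix(nrow = 5, ncol = 2, c(1, 0, 1, 1, 2, 1, 1, 1, 2, 0))

# Rows of the table: values in column 1; columns of the table: values in column 2
tab <- table(col1 = MAT[, 1], col2 = MAT[, 2])
tab

# Number of rows with 0 in column 1 and 1 in column 2
tab["0", "1"]
```

The table is indexed by the values as character strings, which is why "0" and "1" are quoted.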
I would like to delete rows from a large matrix using the following criteria:
Any row that contains 100 in its second column should be removed.
How can this be done? I know how to select those rows but I'm not sure how to remove them using a rule.
R > mat = matrix(c(1,2,3,100,200,300), 3,2)
R > mat
[,1] [,2]
[1,] 1 100
[2,] 2 200
[3,] 3 300
R > (index = mat[,2] == 100)
[1] TRUE FALSE FALSE
R > mat[index, ]
[1] 1 100
R > mat[!index, ]
[,1] [,2]
[1,] 2 200
[2,] 3 300
Previously I was confused about indexing with the other method, which(). Here is the solution using which():
R > (index2 = which(mat[,2] == 100))
[1] 1
R > mat[-index2, ]
[,1] [,2]
[1,] 2 200
[2,] 3 300
Watch out for the different operators the two kinds of index need: ! negates a logical index, while - drops the positions given by an integer index.
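One more difference between the two kinds of index shows up when nothing matches. A short sketch, using 999 as a made-up value that does not occur in the matrix:

```r
mat <- matrix(c(1, 2, 3, 100, 200, 300), 3, 2)

# No row has 999 in column 2, so which() returns integer(0)
index2 <- which(mat[, 2] == 999)
by_which <- mat[-index2, , drop = FALSE]    # -integer(0) selects nothing at all

# The logical index keeps every row, as intended
by_logical <- mat[mat[, 2] != 999, , drop = FALSE]

nrow(by_which)    # 0
nrow(by_logical)  # 3
```

So the logical form is the safer default for deleting rows by rule. The drop = FALSE argument keeps the result a matrix even when only one row is left.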
Here's how I would do it in Matlab with a matrix A.
Option 1
for (i=size(A,1):-1:1)
    % stop at 1, not 0: Matlab indices start at 1
    if (A(i,2)==100)
        A(i,:)=[];
    end
end
This loops over rows (starting at the bottom), and sets any row with 100 in the 2nd element to an empty set, which effectively deletes it.
Maybe you can convert this to R, or maybe it will help somebody else who is having this problem.
Option 2
logicalIndex=(A(:,2)==100);
A(logicalIndex,:)=[];
This first finds rows with 100 in the 2nd column, then deletes them all.
Is it possible to select a subset of a three dimensional array with a two-dimensional binary array? I would like to be able to do this so that I can push values into the selection
For example I have an array dim(a) = (lat, long, time), and I want to select with dim(b) = (lat, long) which is an array full of TRUE/FALSE values. I want to be able to do something like:
> a <- array(c(1,2,3,4,5,6,7,8),c(2,2,2))
> b <- matrix(c(0,1,0,0), c(2,2))==TRUE
> a[[b]] <- 0
> a
, , 1
[,1] [,2]
[1,] 1 3
[2,] 0 4
, , 2
[,1] [,2]
[1,] 5 7
[2,] 0 8
Edit : ok, so this looks like a stupid question, as I just realised that it works exactly as stated above, if you use a[b] <- 0 (single brackets). But that only works if the dimension(s) you want to span are the ones at the end. So, to make it more interesting:
How can you do this if the dimension you want to span is the first or second dimension - eg. if dim(b)==(lat, years)?
R supports matrix subsetting of arrays with the [ operator (i.e. single bracket, not double - the double bracket will always only return a single element):
a[b] <- 0
a
, , 1
[,1] [,2]
[1,] 1 3
[2,] 0 4
, , 2
[,1] [,2]
[1,] 5 7
[2,] 0 8
Notice that this is somewhat different from the result you specify in your question. In your question, the second element (i.e. bottom left element of the matrix) is 1, thus you would expect the second element of each array slice to be modified. (In other words not the first, as you have in your example.)
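For the follow-up in the question, where the mask spans the first and third dimensions instead of the first two, one possible sketch is to expand the mask to the full shape of a with array() and aperm(). The (lat, long, time) reading of the dimensions and the name mask are assumptions made for this illustration.

```r
a <- array(1:8, c(2, 2, 2))                      # dims: (lat, long, time)
b <- matrix(c(FALSE, TRUE, FALSE, FALSE), 2, 2)  # dims: (lat, time)

# Recycle b along a new last axis, giving dims (lat, time, long) ...
mask <- array(b, dim = c(dim(a)[1], dim(a)[3], dim(a)[2]))
# ... then swap the last two axes so the mask lines up with a: (lat, long, time)
mask <- aperm(mask, c(1, 3, 2))

# mask[i, j, k] equals b[i, k] for every j
a[mask] <- 0
a
```

Here b is TRUE only at lat 2, time 1, so every long value at that lat and time is set to 0 and the second time slice is left untouched.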