Write a value for maximum/minimum between two values - r

I have a two-column matrix and I want to produce a new matrix/data.frame in which each column holds 1 where its value is the row maximum and 0 otherwise (the two values are never equal). This is my attempt:
testM <- matrix(c(1,2,3, 1,1,5), ncol = 2, byrow = T)
>testM
V1 V2
1 1 2
2 3 1
3 1 5
apply(data.frame(testM), 1, function(row) ifelse(max(row[1],row[2]),1,0))
I expect to have:
0 1
1 0
0 1
because of the 1,0 arguments passed to ifelse(), but I just get
[1] 1 1 1
Any ideas?

Or using pmax
testM <- matrix(c(1,2,3, 1,1,5), ncol = 2, byrow = T)
--(testM==pmax(testM[,1],testM[,2]))
V1 V2
[1,] 0 1
[2,] 1 0
[3,] 0 1

You can perform arithmetic on Booleans in R! Just check whether each element in a row is equal to the row's max value and multiply by 1.
t(apply(testM, 1, function(row) 1*(row == max(row))))
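As a rough sketch of why the original attempt returned all 1s: ifelse() only tests whether its first argument is TRUE, and any non-zero number returned by max() counts as TRUE. Comparing each element to the row maximum gives one logical per element instead. The name res is only illustrative.

```r
testM <- matrix(c(1, 2, 3, 1, 1, 5), ncol = 2, byrow = TRUE)

# max(1, 2) is 2, a non-zero number, so ifelse() always picks the "yes" value
always_one <- ifelse(max(1, 2), 1, 0)

# row == max(row) is a logical vector; multiplying by 1 turns it into 0/1
res <- t(apply(testM, 1, function(row) 1 * (row == max(row))))
res
```

apply() returns one column per row of the input here, which is why the result is transposed with t().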

You can use max.col and col to produce a logical matrix:
res <- col(testM) == max.col(testM)
res
[,1] [,2]
[1,] FALSE TRUE
[2,] TRUE FALSE
[3,] FALSE TRUE
If you want it as 0/1, you can do:
res <- as.integer(col(testM) == max.col(testM)) # this removes the dimension
dim(res) <- dim(testM) # puts the dimension back
res
[,1] [,2]
[1,] 0 1
[2,] 1 0
[3,] 0 1
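A possible one-step variant, as a sketch: multiplying the logical matrix by 1L converts it to integer without dropping the dimensions. The ties.method = "first" argument is an addition here, since max.col breaks ties at random by default; with no equal values in a row it makes no difference.

```r
testM <- matrix(c(1, 2, 3, 1, 1, 5), ncol = 2, byrow = TRUE)

# logical matrix * 1L gives an integer matrix and keeps dim()
res <- (col(testM) == max.col(testM, ties.method = "first")) * 1L
res
```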

Related

R: How do I translate solve_LSAP into a dataframe?

I'm trying to do a large assignment problem, and am using LSAP. It works, but I am trying to get the output into a dataframe so that I can do more with it. However, the documentation on the function says "An object of class "solve_LSAP" with the optimal assignment of rows to columns", with no further information on the data. I cannot seem to crack open the class to break the data out into a more usable form.
I've provided their example code.
library(clue)
x <- matrix(c(5, 1, 4, 3, 5, 2, 2, 4, 4), nrow = 3)
y <- solve_LSAP(x, maximum = FALSE)
y
Output:
Optimal assignment:
1 => 3, 2 => 1, 3 => 2
I have 200+ assignments, and the baseline output is simply not usable for me. How can I translate it into a dataframe, or at least a matrix that looks something like the below?
Row Column
1 3
2 1
3 2
solve_LSAP returns, for each row, the index of its assigned column, so the object already contains everything you need for the reconstruction:
len <- length(y)
parsedMat <- cbind(
1:len,
as.integer(y)
)
parsedMat
[,1] [,2]
[1,] 1 3
[2,] 2 1
[3,] 3 2
This can be turned into a solved matrix by:
solvedMat <- matrix(0, nrow = len, ncol = len)
solvedMat[parsedMat] <- 1
solvedMat
[,1] [,2] [,3]
[1,] 0 0 1
[2,] 1 0 0
[3,] 0 1 0
You can also turn this into a function that will return both outputs in form of a list, e.g.:
parseClueOutput <- function(x) {
len <- length(x)
parsedMat <- cbind(
1:len,
as.integer(x)
)
solvedMat <- matrix(0, nrow = len, ncol = len)
solvedMat[parsedMat] <- 1
return(
list(
parsedMat = parsedMat,
solvedMat = solvedMat
)
)
}
And use it as:
parseClueOutput(y)
$parsedMat
[,1] [,2]
[1,] 1 3
[2,] 2 1
[3,] 3 2
$solvedMat
[,1] [,2] [,3]
[1,] 0 0 1
[2,] 1 0 0
[3,] 0 1 0
As for the structure, the solve_LSAP object is not really complicated. It is essentially a numeric vector, which you can see with:
is.numeric(y)
[1] TRUE
Or:
str(y)
'solve_LSAP' num [1:3] 3 1 2
You can also turn the solvedMat or parsedMat into a dataframe easily - for example, the parsedMat:
setNames(as.data.frame(parsedMat), c('Row', 'Column'))
Row Column
1 1 3
2 2 1
3 3 2
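If only the data frame is needed, it can probably be built in one step. The sketch below uses a stand-in object with the same structure that str(y) shows above, so that it runs without the clue package installed; with the real result of solve_LSAP the last two lines should work unchanged. The name assignment is only illustrative.

```r
# Stand-in for the object returned by solve_LSAP(x) in the example above
# (an assumption made so the sketch runs without clue)
y <- structure(c(3, 1, 2), class = "solve_LSAP")

# Row i is assigned to column y[i]; as.integer() drops the class attribute
assignment <- data.frame(Row = seq_along(y), Column = as.integer(y))
assignment
```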

Create matrix with for-loop

I am trying to create the following matrix A for n rows and n+1 columns. n will likely be around 20 or 30, but for the purpose of the question I put it at 4 and 5.
Here is what I have so far:
N <- 5 # n+1
n <- 4 # n
columns <- list()
# first column:
columns[1] <- c(-1, 1, rep(0, N-2))
# all other columns:
for(i in N:2) {
columns[i] <- c((rep(0, N-i), 1, -2, 1, rep(0, i-3)))
}
# combine into matrix:
A <- cbind(columns)
I keep getting the following error msg:
In columns[1] <- c(-1, 1, rep(0, N - 2)) :
number of items to replace is not a multiple of replacement length
And later
"for(i in N:2) {
columns[i] <- c((rep(0, N-i),"
}
Error: unexpected '}' in "}"
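For what it's worth, the first message appears to come from assigning a whole vector to a single list slot with single brackets, and the second from the extra pair of parentheses inside c((...)), which makes the parser give up before it reaches the closing brace. A minimal sketch of the list part, using short made-up vectors rather than the real columns:

```r
columns <- list()

# Double brackets store the whole vector as a single list element;
# columns[1] <- c(-1, 1, 0) would keep only the first value and warn.
columns[[1]] <- c(-1, 1, 0)
columns[[2]] <- c(1, -2, 1)

# cbind(columns) would bind the list object itself;
# do.call() passes each list element to cbind() as a separate column
A_demo <- do.call(cbind, columns)
A_demo
```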
I guess you can try the for loop below to create your matrix A:
N <- 5
n <- 4
A <- matrix(0,n,N)
for (i in 1:nrow(A)) {
if (i == 1) {
A[i,1:2] <- c(-1,1)
} else {
A[i,i+(-1:1)] <- c(1,-2,1)
}
}
such that
> A
[,1] [,2] [,3] [,4] [,5]
[1,] -1 1 0 0 0
[2,] 1 -2 1 0 0
[3,] 0 1 -2 1 0
[4,] 0 0 1 -2 1
Another solution is to use outer, which is more compact than the for loop approach and may be faster:
A <- `diag<-`(replace(z<-abs(outer(1:n,1:N,"-")),!z %in% c(0,1),0),
c(-1,rep(-2,length(diag(z))-1)))
I thought the vectorized approach below would be fast compared to the loop, but when I tested on a 5000x5001 example, the loop in ThomasIsCoding's answer was about 5x faster. Go with that one!
N = 5
n = N - 1
A = matrix(0, nrow = n, ncol = N)
delta = row(A) - col(A)
diag(A) = -2
A[delta %in% c(1, -1)] = 1
A[1, 1] = -1
A
# [,1] [,2] [,3] [,4] [,5]
# [1,] -1 1 0 0 0
# [2,] 1 -2 1 0 0
# [3,] 0 1 -2 1 0
# [4,] 0 0 1 -2 1
You could use data.table::shift to shift the vector c(1, -2, 1, 0) by all increments from -1 (backwards shift / lead by 1) to n - 1 (forward shift / lagged by n - 1) and then cbind all the shifted outputs together. The first-row first-column element doesn't follow this pattern so that's fixed at the end.
library(data.table)
out <- do.call(cbind, shift(c(1, -2, 1, 0), seq(-1, n - 1), fill = 0))
out[1, 1] <- -1
out
# [,1] [,2] [,3] [,4] [,5]
# [1,] -1 1 0 0 0
# [2,] 1 -2 1 0 0
# [3,] 0 1 -2 1 0
# [4,] 0 0 1 -2 1
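Since n will be around 20 or 30 in practice, the same pattern can be wrapped in a function of n. This is only a sketch using matrix indexing; make_A is a made-up name and it assumes n is at least 2.

```r
make_A <- function(n) {
  stopifnot(n >= 2)
  A <- matrix(0, nrow = n, ncol = n + 1)
  A[cbind(1:n, 1:n)] <- -2        # main diagonal
  A[cbind(1:n, 2:(n + 1))] <- 1   # entries just right of the diagonal
  A[cbind(2:n, 1:(n - 1))] <- 1   # entries just left of the diagonal
  A[1, 1] <- -1                   # the first entry breaks the pattern
  A
}

make_A(4)
```

Indexing with a two-column matrix of (row, column) pairs sets exactly those cells, so no loop over rows is needed.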

Calculate nXn in matrix - R?

I converted an image into a 100x100 matrix of 0's and 1's.
An ntile is an nXn selection. I am trying to count how many 2-tiles there are in the matrix in which the two leftmost entries are 1 and the two rightmost entries are 0.
Eg
[1 0]
[1 0]
Any idea on how to start this? I am quite new to R. Thanks very much in advance.
You can extract every chunk of the larger matrix that has the same dimensions as ntile and then check whether all the elements of the chunk match the corresponding elements of ntile.
#Data
set.seed(1)
m = matrix(sample(1:0, 16, TRUE), 4)
m[3, 4] = 0
ntile = rbind(1:0, 1:0)
n = dim(ntile)
ans = t(sapply(n[1]:nrow(m), function(i){
sapply(n[2]:ncol(m), function(j){
temp = m[(i- nrow(ntile) + 1):i, (j - ncol(ntile) + 1):j]
all(temp == ntile)
})
}))
ans
# [,1] [,2] [,3]
#[1,] FALSE FALSE FALSE
#[2,] FALSE FALSE FALSE
#[3,] FALSE FALSE TRUE
sum(ans)
#[1] 1
Here is a simple solution if I understood your question correctly:
set.seed(123)
size <- 4
m <- matrix(sample(0:1, 12, replace = TRUE), size-1, size)
m <- rbind(m, c(0,0,1,0))
sum(m[1:(size-1),1:(size-1)] == 1 & m[2:size,1:(size-1)] == 1 &
m[1:(size-1),2:size] == 0 & m[2:size,2:size] == 0)
Input
[,1] [,2] [,3] [,4]
[1,] 0 1 1 0
[2,] 1 1 1 1
[3,] 0 0 1 0
[4,] 0 0 1 0
Output
# 1
You can check by eye that exactly one such 2-tile exists in the input (rows 3-4, columns 3-4).
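The same comparison can be written for a matrix of any size, such as the 100x100 one in the question, by dropping the last or first row and column. This is a sketch; count_left1_right0 is a made-up name and the test matrix is the input shown above.

```r
count_left1_right0 <- function(m) {
  r <- nrow(m)
  k <- ncol(m)
  # Four shifted views of m, one for each corner of every 2x2 window
  top_left     <- m[-r, -k, drop = FALSE]
  bottom_left  <- m[-1, -k, drop = FALSE]
  top_right    <- m[-r, -1, drop = FALSE]
  bottom_right <- m[-1, -1, drop = FALSE]
  sum(top_left == 1 & bottom_left == 1 & top_right == 0 & bottom_right == 0)
}

m <- rbind(c(0, 1, 1, 0), c(1, 1, 1, 1), c(0, 0, 1, 0), c(0, 0, 1, 0))
count_left1_right0(m)
```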

How to use which() on a matrix to get unique indices

Suppose I have a symmetric matrix:
> mat <- matrix(c(1,0,1,0,0,0,1,0,1,1,0,0,0,0,0,0), ncol=4, nrow=4)
> mat
[,1] [,2] [,3] [,4]
[1,] 1 0 1 0
[2,] 0 0 1 0
[3,] 1 1 0 0
[4,] 0 0 0 0
which I would like to analyse:
> which(mat==1, arr.ind=T)
row col
[1,] 1 1
[2,] 3 1
[3,] 3 2
[4,] 1 3
[5,] 2 3
Now the question is: how do I avoid counting duplicated cells? As the resulting index matrix shows, rows 2 and 4 point to (3,1) and (1,3) respectively, which are the same cell in a symmetric matrix.
How do I avoid such a situation? I only need one reference for each cell, even though the matrix is symmetric. Is there an easy way to deal with such situations?
EDIT:
I was thinking about using upper.tri or lower.tri, but in that case I get a vector version of the matrix and cannot get back to the (row, col) notation.
> which(mat[upper.tri(mat)]==1, arr.ind=T)
[1] 2 3
EDIT II
The expected output would be something like the unique pairs, treating (row, col) and (col, row) as the same cell:
row col
[1,] 1 1
[2,] 3 1
[3,] 3 2
Since you have a symmetric matrix, you could do
which(mat == 1 & upper.tri(mat, diag = TRUE), arr.ind = TRUE)
# row col
#[1,] 1 1
#[2,] 1 3
#[3,] 2 3
OR
which(mat == 1 & lower.tri(mat, diag = TRUE), arr.ind = TRUE)
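As a quick check, the lower.tri variant should reproduce the pairs listed under EDIT II in the question. A sketch using the matrix from the question; idx is just an illustrative name:

```r
mat <- matrix(c(1,0,1,0,0,0,1,0,1,1,0,0,0,0,0,0), ncol = 4, nrow = 4)

# Keep only the cells on or below the diagonal, so each symmetric pair appears once
idx <- which(mat == 1 & lower.tri(mat, diag = TRUE), arr.ind = TRUE)
idx
```

Because the mask is applied to the full matrix before which() is called, the (row, col) notation is kept, unlike subsetting with mat[upper.tri(mat)].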

Match list to rows of matrix in R

"a" is a list and "b" is a matrix.
a<-list(matrix(c(0,2,0,1,0,2,0,0,1,0,0,0,0,0,2,2),4),
matrix(c(0,1,0,0,0,1,1,0,0,0,0,0),3),
matrix(c(0,0,0,0,2,0,1,0,0,0,0,0,2,0,2,1,0,1,1,0),5))
b<-matrix(c(2,2,1,1,1,2,1,2,1,1,2,1,1,1,1,1,1,2,2,2,1,2,1,1),6)
> a
[[1]]
[,1] [,2] [,3] [,4]
[1,] 0 0 1 0
[2,] 2 2 0 0
[3,] 0 0 0 2
[4,] 1 0 0 2
[[2]]
[,1] [,2] [,3] [,4]
[1,] 0 0 1 0
[2,] 1 0 0 0
[3,] 0 1 0 0
[[3]]
[,1] [,2] [,3] [,4]
[1,] 0 0 0 1
[2,] 0 1 0 0
[3,] 0 0 2 1
[4,] 0 0 0 1
[5,] 2 0 2 0
> b
[,1] [,2] [,3] [,4]
[1,] 2 1 1 2
[2,] 2 2 1 2
[3,] 1 1 1 1
[4,] 1 1 1 2
[5,] 1 2 1 1
[6,] 2 1 2 1
There are 3 objects in list "a". I want to test whether all the non-zero elements in each object in the list "a" match with the corresponding position of the same row in matrix "b". If matched, output the matched row number of b.
For example, the second object is
[[2]]
[,1] [,2] [,3] [,4]
[1,] 0 0 1 0
[2,] 1 0 0 0
[3,] 0 1 0 0
Consider its non-zero entries row by row. In the 1st row the only non-zero number is 1, in the third position, which matches rows 1-5 of matrix "b". In the 2nd row it is 1 in the first position, which matches rows 3-5 of "b". In the 3rd row it is 1 in the second position, which matches rows 1, 3, 4 and 6 of "b". Only rows 3 and 4 of "b" satisfy all three conditions, so the output is "3 4".
My attempting code is as follows:
temp<-Map(function(y) t(y), Map(function(a)
apply(a,1,function(x){
apply(b,1, function(y) identical(x[x!=0],y[x!=0]))}),a))
lapply(temp, function(a) which(apply(a,2,prod)==1))
The result is as follows:
[[1]]
integer(0)
[[2]]
[1] 3 4
[[3]]
[1] 6
It is right, but I wonder whether there is quicker code to handle this question?
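The row-by-row reasoning for the second object can be spelled out as an intersection of candidate sets. This is only an illustration of the logic, not a faster method; a2 and candidates are made-up names.

```r
b <- matrix(c(2,2,1,1,1,2,1,2,1,1,2,1,1,1,1,1,1,2,2,2,1,2,1,1), 6)
a2 <- matrix(c(0,1,0,0,0,1,1,0,0,0,0,0), 3)   # the second object of list "a"

# For each row of a2, find the rows of b that agree at its non-zero positions
candidates <- lapply(seq_len(nrow(a2)), function(r) {
  nz <- a2[r, ] != 0
  which(apply(b[, nz, drop = FALSE], 1, function(y) all(y == a2[r, nz])))
})

# Rows of b that satisfy every row of a2
matched <- Reduce(intersect, candidates)
matched
```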
With only a few columns, you can loop over the columns instead and use two shortcuts to reduce computation: skip columns that have no non-zero values, and stop as soon as a column holds more than one distinct non-zero value (no row of b can match then):
ff = function(a, b)
{
    i = seq_len(nrow(b)) # start with every row of b as a candidate match
    for(j in seq_len(ncol(a))) {
        aj = a[, j]
        nzaj = aj[aj != 0L]
        if(!length(nzaj)) next # column j is all zeros: it constrains nothing
        if(sum(tabulate(nzaj) > 0L) > 1L) return(integer()) # more than one distinct non-zero value: no row of b can match
        i = i[b[i, j] == nzaj[[1L]]] # keep only the candidates that agree in column j
    }
    return(i)
}
lapply(a, function(x) ff(x, b))
#[[1]]
#integer(0)
#
#[[2]]
#[1] 3 4
#
#[[3]]
#[1] 6
With data of your actual size:
set.seed(911)
a2 = replicate(300L, matrix(sample(0:3, 20 * 5, TRUE, c(0.97, 0.01, 0.01, 0.01)), 20, 5), simplify = FALSE)
b2 = matrix(sample(1:3, 15 * 5, TRUE), 15, 5)
identical(OP(a2, b2), lapply(a2, function(x) ff(x, b2)))
#[1] TRUE
microbenchmark::microbenchmark(OP(a2, b2), lapply(a2, function(x) ff(x, b2)), times = 50)
#Unit: milliseconds
# expr min lq mean median uq max neval cld
# OP(a2, b2) 686.961815 730.840732 760.029859 753.790094 785.310056 863.04577 50 b
# lapply(a2, function(x) ff(x, b2)) 8.110542 8.450888 9.381802 8.949924 9.872826 15.51568 50 a
OP is:
OP = function (a, b)
{
temp = Map(function(y) t(y), Map(function(a) apply(a, 1,
function(x) {
apply(b, 1, function(y) identical(x[x != 0], y[x !=
0]))
}), a))
lapply(temp, function(x) which(apply(x, 2, prod) == 1))
}
Your explanation of what you want and of what your possible matrices look like is really not clear. From what I can deduce, you want the row numbers of b whose entries match the unique non-zero number in each column of a matrix in a. If so, here's a simpler option:
lapply(a, function(x){ # loop across the matrices in a
x[x == 0] <- NA # replace 0s with NA
which(apply(b, 1, function(y){ # loop across the rows of b, trying to match
all(y == colMeans(x, na.rm = TRUE)) # the rows of b with the colmeans of x
}))
})
# [[1]]
# [1] 2
#
# [[2]]
# [1] 5
#
# [[3]]
# [1] 6
