List of combination between rows of diagonal block matrix - r

I have the following R matrix, which is a block-diagonal combination of a 2x3 and a 3x3 submatrix. In general there can be more than 2 submatrices with different dimensions (e.g. m1xp, m2xp and m3xp, where each of m1, m2, m3 <= p).
A2 <- list(rbind(c(1,1,1),c(-1,1,-1)),
rbind(c(-1,1,1),c(1,-1,2),c(2,-1,2)))
library(Matrix)
A2 <- as.matrix(Matrix::bdiag(A2))
Rhs <- matrix(c(0,5,0.5,4),nrow = 4)
beta <- c(rep(1.2,3),c(0.5,0.2,0.1))
> A2
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 1 1 1 0 0 0
[2,] -1 1 -1 0 0 0
[3,] 0 0 0 -1 1 1
[4,] 0 0 0 1 -1 2
[5,] 0 0 0 2 -1 2
I would like to get all combinations of row indices between the first sub-matrix and the 2nd sub-matrix in order to solve a linear optimization problem. Each combination has to draw rows from both submatrices. For each one I solve for a new beta and check whether the condition Aq %*% beta == Rhs is satisfied; if it is, stop, and if not, take another combination. I think the following are all the row combinations between the sub-matrices:
A combination as one from the first sub-matrix and one from the second sub-matrix
Aq <- A2[c(1,3),]
Aq <- A2[c(1,4),]
Aq <- A2[c(1,5),]
Aq <- A2[c(2,3),]
Aq <- A2[c(2,4),]
Aq <- A2[c(2,5),]
Then, a combination as one from the first and 2 from the second matrix
Aq <- A2[c(1,3,4),]
Aq <- A2[c(1,3,5),]
Aq <- A2[c(1,4,5),]
Aq <- A2[c(2,3,4),]
Aq <- A2[c(2,3,5),]
Aq <- A2[c(2,4,5),]
Then, a combination as one from the first and 3 from the second matrix
Aq <- A2[c(1,3,4,5),]
Aq <- A2[c(2,3,4,5),]
Then, a combination as 2 from the first and one from the second matrix
Aq <- A2[c(1,2,3),]
Aq <- A2[c(1,2,4),]
Aq <- A2[c(1,2,5),]
Then, a combination as 2 from the first and 2 from the second matrix
Aq <- A2[c(1,2,3,4),]
Aq <- A2[c(1,2,3,5),]
Aq <- A2[c(1,2,4,5),]
Then, a combination as 2 from the first and 3 from the second matrix
Aq <- A2[c(1,2,3,4,5),]
Is there a better way to get all the combinations?
Then I would like to create a loop that picks one of the above combinations at a time and checks the condition:
if (all(Aq %*% beta == Rhs)) {
break
} else {
# take another combination Aq
}
Please note that I could have more than 2 submatrices making up the block matrix, in which case I have to create all row combinations across the first, 2nd and 3rd matrix. I am hoping there is an easy way to do this in R. I have tried the expand.grid function but it is not giving me the desired output.

A possible base R approach:
indices1 <- 1:2
indices2 <- 3:5
apply(expand.grid(seq_along(indices1), seq_along(indices2)), 1,
function(x) t(apply(
expand.grid(combn(indices1, x[1], simplify=FALSE),
combn(indices2, x[2], simplify=FALSE)),
1, unlist)))
output:
[[1]]
Var1 Var2
[1,] 1 3
[2,] 2 3
[3,] 1 4
[4,] 2 4
[5,] 1 5
[6,] 2 5
[[2]]
Var11 Var12 Var2
[1,] 1 2 3
[2,] 1 2 4
[3,] 1 2 5
[[3]]
Var1 Var21 Var22
[1,] 1 3 4
[2,] 2 3 4
[3,] 1 3 5
[4,] 2 3 5
[5,] 1 4 5
[6,] 2 4 5
[[4]]
Var11 Var12 Var21 Var22
[1,] 1 2 3 4
[2,] 1 2 3 5
[3,] 1 2 4 5
[[5]]
Var1 Var21 Var22 Var23
[1,] 1 3 4 5
[2,] 2 3 4 5
[[6]]
Var11 Var12 Var21 Var22 Var23
[1,] 1 2 3 4 5
edit: adding a more general version:
#identifying the indices
indices <- split(seq_len(nrow(A2)), max.col(abs(A2) > 0, "first"))
#generating the combinations
apply(expand.grid(lapply(indices, seq_along)), 1L,
function(idx) {
t(apply(
expand.grid(
lapply(seq_along(idx),
function(k) {
combn(indices[[k]], idx[k], simplify=FALSE)
})),
1L, unlist))
})
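The checking loop the question asks for could be sketched as follows. This is only a sketch under assumptions that are not in the original posts: nonempty_subsets() and find_rows() are hypothetical helper names, rhs_full is a made-up right-hand side with one entry per row of A2 (the question's 4-element Rhs does not line up with a variable number of selected rows), and the comparison uses a tolerance rather than == because the products are floating point.

```r
# Block-diagonal matrix from the question, built in base R.
A2 <- rbind(cbind(rbind(c(1, 1, 1), c(-1, 1, -1)), matrix(0, 2, 3)),
            cbind(matrix(0, 3, 3), rbind(c(-1, 1, 1), c(1, -1, 2), c(2, -1, 2))))
beta <- c(rep(1.2, 3), 0.5, 0.2, 0.1)
indices <- list(1:2, 3:5)  # row indices of each block

# Hypothetical right-hand side, one value per row of A2 (assumption).
rhs_full <- c(3.6, 99, 99, 0.5, 99)

# All non-empty subsets of one block's rows. Indexing through positions
# avoids combn() treating a single number n as 1:n.
nonempty_subsets <- function(v) {
  unlist(lapply(seq_along(v), function(k)
    combn(length(v), k, FUN = function(i) v[i], simplify = FALSE)),
    recursive = FALSE)
}

# Cartesian product of the per-block subsets: every combination takes at
# least one row from each block.
subsets <- lapply(indices, nonempty_subsets)
grid <- expand.grid(lapply(subsets, seq_along))
combos <- lapply(seq_len(nrow(grid)), function(r) {
  pick <- unlist(grid[r, ], use.names = FALSE)
  unlist(Map(function(s, i) s[[i]], subsets, pick))
})

# Return the first row set whose equations hold within tol, or NULL.
find_rows <- function(combos, A, beta, rhs, tol = 1e-8) {
  for (rows in combos) {
    Aq <- A[rows, , drop = FALSE]
    if (all(abs(Aq %*% beta - rhs[rows]) < tol)) return(rows)
  }
  NULL
}

found <- find_rows(combos, A2, beta, rhs_full)
```

With these made-up values, combos has 21 entries (3 subsets of the first block times 7 of the second, matching the question's list) and found is the row set c(1, 4). In the real problem the loop body would also re-solve for beta before the check.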

extract every two elements in matrix row in r in sequence to calculate euclidean distance

How to extract every two elements in sequence in a matrix and return the result as a matrix so that I could feed the answer in a formula for calculation:
For example, I have a one row matrix with 6 columns:
[,1][,2][,3][,4][,5][,6]
[1,] 2 1 5 5 10 1
I want to extract column 1 and two in first iteration, 3 and 4 in second iteration and so on. The result has to be in the form of matrix.
[1,] 2 1
[2,] 5 5
[3,] 10 1
My original codes:
data <- matrix(c(1,1,1,2,2,1,2,2,5,5,5,6,10,1,10,2,11,1,11,2), ncol = 2)
Center Matrix:
[,1][,2][,3][,4][,5][,6]
[1,] 2 1 5 5 10 1
[2,] 1 1 2 1 10 1
[3,] 5 5 5 6 11 2
[4,] 2 2 5 5 10 1
[5,] 2 1 5 6 5 5
[6,] 2 2 5 5 11 1
[7,] 2 1 5 5 10 1
[8,] 1 1 5 6 11 1
[9,] 2 1 5 5 10 1
[10,] 5 6 11 1 10 2
objCentroidDist <- function(data, centers) {
resultMatrix <- matrix(NA, nrow=dim(data)[1], ncol=dim(centers)[1])
for(i in 1:nrow(centers)) {
resultMatrix [,i] <- sqrt(rowSums(t(t(data)-centers[i, ])^2))
}
resultMatrix
}
objCentroidDist(data,centers)
I want the Result matrix to be as per below:
[,1][,2][,3]
[1,]
[2,]
[3,]
[4,]
[5,]
[6,]
[7,]
[8,]
[9,]
[10,]
My concern is how to calculate the distance between the data and the centers when the data matrix has two columns and the centers matrix has six, that is, how to compute the distance from the data matrix to every consecutive pair of columns in the centers matrix. Each row of the centers matrix holds three centers.
Something like this maybe?
m <- matrix(c(2,1,5,5,10,1), ncol = 6)
list.seq.pairs <- lapply(seq(1, ncol(m), 2), function(x) {
m[,c(x, x+1)]
})
> list.seq.pairs
[[1]]
[1] 2 1
[[2]]
[1] 5 5
[[3]]
[1] 10 1
And, in case you're wanting to iterate over multiple rows in a matrix,
you can expand on the above like this:
mm <- matrix(1:18, ncol = 6, byrow = TRUE)
apply(mm, 1, function(x) {
lapply(seq(1, length(x), 2), function(y) {
x[c(y, y+1)]
})
})
EDIT:
I'm really not sure what you're after exactly. I think, if you want each row transformed into a 2 x 3 matrix:
mm <- matrix(1:18, ncol = 6, byrow = TRUE)
list.mats <- lapply(1:nrow(mm), function(x){
a = matrix(mm[x,], ncol = 2, byrow = TRUE)
})
> list.mats
[[1]]
[,1] [,2]
[1,] 1 2
[2,] 3 4
[3,] 5 6
[[2]]
[,1] [,2]
[1,] 7 8
[2,] 9 10
[3,] 11 12
[[3]]
[,1] [,2]
[1,] 13 14
[2,] 15 16
[3,] 17 18
If, however, you want to get to your results matrix- I think it's probably easiest to do whatever calculations you need to do while you're dealing with each row:
results <- t(apply(mm, 1, function(x) {
sapply(seq(1, length(x), 2), function(y) {
val1 = x[y] # Get item one
val2 = x[y+1] # Get item two
val1 / val2 # Do your calculation here
})
}))
> results
[,1] [,2] [,3]
[1,] 0.5000000 0.7500 0.8333333
[2,] 0.8750000 0.9000 0.9166667
[3,] 0.9285714 0.9375 0.9444444
That said, I don't understand what you're trying to do so this may miss the mark. You may have more luck if you ask a new question where you show example input and the actual expected output that you're after, with the actual values you expect.
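If the goal is the distance from each data point to each of the three 2-D centers packed into one row of the centers matrix, one possible reading could be sketched as below. The small pts matrix is made-up illustration data and dist_to_packed_centers() is a hypothetical helper name; this is a guess at the intent, not a confirmed answer.

```r
# Distances from every row of `pts` (n x 2) to each 2-D center packed
# pairwise into `center_row` (length 2 * k). Returns an n x k matrix
# (for n > 1; a single point would come back as a plain vector).
dist_to_packed_centers <- function(pts, center_row) {
  centers <- matrix(center_row, ncol = 2, byrow = TRUE)  # one center per row
  sapply(seq_len(nrow(centers)), function(k) {
    # subtract center k from every point, then take the Euclidean norm
    sqrt(rowSums(sweep(pts, 2, centers[k, ])^2))
  })
}

pts <- rbind(c(1, 1), c(2, 2), c(5, 5), c(10, 1))  # made-up data points
center_row <- c(2, 1, 5, 5, 10, 1)                  # first row of the centers matrix
d <- dist_to_packed_centers(pts, center_row)
```

Here d is a 4 x 3 matrix: row i holds the distances from point i to the centers (2, 1), (5, 5) and (10, 1). Applying the helper to each row of the centers matrix would give one such matrix per row.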

Subsetting non-NA

I have a matrix in which every row has at least one NA cell, and every column has at least one NA cell as well. What I need is to find the largest subset of this matrix that contains no NAs.
For example, for this matrix A
A <-
structure(c(NA, NA, NA, NA, 2L, NA,
1L, 1L, 1L, 0L, NA, NA,
1L, 8L, NA, 1L, 1L, NA,
NA, 1L, 1L, 6L, 1L, 3L,
NA, 1L, 5L, 1L, 1L, NA),
.Dim = c(6L, 5L),
.Dimnames =
list(paste0("R", 1:6),
paste0("C", 1:5)))
A
C1 C2 C3 C4 C5
R1 NA 1 1 NA NA
R2 NA 1 8 1 1
R3 NA 1 NA 1 5
R4 NA 0 1 6 1
R5 2 NA 1 1 1
R6 NA NA NA 3 NA
There are two solutions (8 cells): A[c(2, 4), 2:5] and A[2:5, 4:5], though finding just one valid solution is enough for my purposes. The dimensions of my actual matrix are 77x132.
Being a noob, I see no obvious way to do this. Could anyone help me with some ideas?
1) optim In this approach we relax the problem to a continuous optimization problem which we solve with optim.
The objective function is f and the input to it is a 0-1 vector whose first nrow(A) entries correspond to rows and whose remaining entries correspond to columns. f uses a matrix Ainf which is derived from A by replacing the NAs with a large negative number and the non-NAs with 1. In terms of Ainf the negative of the number of elements in the rectangle of rows and columns corresponding to x is -x[seq(6)] %*% Ainf %*% x[-seq(6)] which we minimize as a function of x subject to each component of x lying between 0 and 1.
Although this is a relaxation of the original problem to continuous optimization it seems that we get an integer solution, as desired, anyways.
Actually most of the code below is just to get the starting value. To do that we first apply seriation. This permutes the rows and columns giving a more blocky structure and then in the permuted matrix we find the largest square submatrix.
In the case of the specific A in the question the largest rectangular submatrix happens to be square and the starting values are already sufficiently good that they produce the optimum but we will perform the optimization anyways so it works in general. You can play around with different starting values if you like. For example, change k from 1 to some higher number in largestSquare in which case largestSquare will return k columns giving k starting values which can be used in k runs of optim taking the best.
If the starting values are sufficiently good then this should produce the optimum.
library(seriation) # only used for starting values
A.na <- is.na(A) + 0
Ainf <- ifelse(A.na, -prod(dim(A)), 1) # used by f
nr <- nrow(A) # used by f
f <- function(x) - c(x[seq(nr)] %*% Ainf %*% x[-seq(nr)])
# starting values
# Input is a square matrix of zeros and ones.
# Output is a matrix with k columns such that first column defines the
# largest square submatrix of ones, second defines next largest and so on.
# Based on algorithm given here:
# http://www.geeksforgeeks.org/maximum-size-sub-matrix-with-all-1s-in-a-binary-matrix/
largestSquare <- function(M, k = 1) {
nr <- nrow(M); nc <- ncol(M)
S <- 0*M; S[1, ] <- M[1, ]; S[, 1] <- M[, 1]
for(i in 2:nr)
for(j in 2:nc)
if (M[i, j] == 1) S[i, j] = min(S[i, j-1], S[i-1, j], S[i-1, j-1]) + 1
o <- head(order(-S), k)
d <- data.frame(row = row(M)[o], col = col(M)[o], mx = S[o])
apply(d, 1, function(x) {
dn <- dimnames(M[x[1] - 1:x[3] + 1, x[2] - 1:x[3] + 1])
out <- c(rownames(M) %in% dn[[1]], colnames(M) %in% dn[[2]]) + 0
setNames(out, unlist(dimnames(M)))
})
}
s <- seriate(A.na)
p <- permute(A.na, s)
# calculate largest square submatrix of zeros in p, rearranged to be in A's order
st <- largestSquare(1-p)[unlist(dimnames(A)), 1]
res <- optim(st, f, lower = 0*st, upper = st^0, method = "L-BFGS-B")
giving:
> res
$par
R1 R2 R3 R4 R5 R6 C1 C2 C3 C4 C5
0 1 1 1 0 0 0 1 0 1 1
$value
[1] -9
$counts
function gradient
1 1
$convergence
[1] 0
$message
[1] "CONVERGENCE: NORM OF PROJECTED GRADIENT <= PGTOL"
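To see how the objective f in (1) scores a selection, it can be evaluated by hand on two 0-1 vectors. This is a small sketch using the matrix A from the question; x_good and x_bad are names introduced here for illustration only.

```r
A <- structure(c(NA, NA, NA, NA, 2L, NA,
                 1L, 1L, 1L, 0L, NA, NA,
                 1L, 8L, NA, 1L, 1L, NA,
                 NA, 1L, 1L, 6L, 1L, 3L,
                 NA, 1L, 5L, 1L, 1L, NA),
               .Dim = c(6L, 5L),
               .Dimnames = list(paste0("R", 1:6), paste0("C", 1:5)))

Ainf <- ifelse(is.na(A), -prod(dim(A)), 1)  # NA -> -30, non-NA -> 1
nr <- nrow(A)
f <- function(x) -c(x[seq(nr)] %*% Ainf %*% x[-seq(nr)])

# Rows R2, R3, R4 and columns C2, C4, C5: nine cells, none of them NA.
x_good <- c(0, 1, 1, 1, 0, 0,   0, 1, 0, 1, 1)
# Same rows, but also column C3, which brings in the NA at (R3, C3).
x_bad  <- c(0, 1, 1, 1, 0, 0,   0, 1, 1, 1, 1)

f(x_good)  # -9: minus the number of selected cells
f(x_bad)   # 19: 11 non-NA cells minus the penalty of 30, sign flipped
```

A single NA inside the selected rectangle outweighs every possible gain from extra cells, which is why the penalty is set to the total number of cells in A.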
2) GenSA Another possibility is to repeat (1) but instead of using optim use GenSA from the GenSA package. It does not require starting values (although you can provide a starting value using the par argument and this might improve the solution in some cases) so the code is considerably shorter but since it uses simulated annealing it can be expected to take substantially longer to run. Using f (and nr and Ainf which f uses) from (1). Below we try it without a starting value.
library(GenSA)
resSA <- GenSA(lower = rep(0, sum(dim(A))), upper = rep(1, sum(dim(A))), fn = f)
giving:
> setNames(resSA$par, unlist(dimnames(A)))
R1 R2 R3 R4 R5 R6 C1 C2 C3 C4 C5
0 1 1 1 0 0 0 1 0 1 1
> resSA$value
[1] -9
I have a solution, but it doesn't scale very well:
findBiggestSubmatrixNonContiguous <- function(A) {
A <- !is.na(A); ## only NA vs. non-NA matters, not the actual values
howmany <- expand.grid(nr=seq_len(nrow(A)),nc=seq_len(ncol(A)));
howmany <- howmany[order(apply(howmany,1L,prod),decreasing=T),];
for (ri in seq_len(nrow(howmany))) {
nr <- howmany$nr[ri];
nc <- howmany$nc[ri];
rcom <- combn(nrow(A),nr);
ccom <- combn(ncol(A),nc);
comcom <- expand.grid(ri=seq_len(ncol(rcom)),ci=seq_len(ncol(ccom)));
for (comi in seq_len(nrow(comcom)))
if (all(A[rcom[,comcom$ri[comi]],ccom[,comcom$ci[comi]]]))
return(list(ri=rcom[,comcom$ri[comi]],ci=ccom[,comcom$ci[comi]]));
}; ## end for
NULL;
}; ## end findBiggestSubmatrixNonContiguous()
It's based on the idea that if the matrix has a small enough density of NAs, then by searching for the largest submatrices first, you'll be likely to find a solution fairly quickly.
The algorithm works by computing a cartesian product of all counts of rows and counts of columns that could be indexed out of the original matrix to produce the submatrix. The set of pairs of counts is then decreasingly ordered by the size of the submatrix that would be produced by each pair of counts; in other words, ordered by the product of the two counts. It then iterates over these pairs. For each pair, it computes all combinations of row indexes and column indexes that could be taken for that pair of counts, and tries each combination in turn until it finds a submatrix that contains zero NAs. Upon finding such a submatrix, it returns that set of row and column indexes as a list.
The result is guaranteed to be correct because it tries submatrix sizes in decreasing order, so the first one it finds must be the biggest (or tied for the biggest) possible submatrix that satisfies the condition.
## OP's example matrix
A <- data.frame(C1=c(NA,NA,NA,NA,2L,NA),C2=c(1L,1L,1L,0L,NA,NA),C3=c(1L,8L,NA,1L,1L,NA),C4=c(NA,1L,1L,6L,1L,3L),C5=c(NA,1L,5L,1L,1L,NA),row.names=c('R1','R2','R3','R4','R5','R6'));
A;
## C1 C2 C3 C4 C5
## R1 NA 1 1 NA NA
## R2 NA 1 8 1 1
## R3 NA 1 NA 1 5
## R4 NA 0 1 6 1
## R5 2 NA 1 1 1
## R6 NA NA NA 3 NA
system.time({ res <- findBiggestSubmatrixNonContiguous(A); });
## user system elapsed
## 0.094 0.000 0.100
res;
## $ri
## [1] 2 3 4
##
## $ci
## [1] 2 4 5
##
A[res$ri,res$ci];
## C2 C4 C5
## R2 1 1 1
## R3 1 1 5
## R4 0 6 1
We see that the function works very quickly on the OP's example matrix, and returns a correct result.
randTest <- function(NR,NC,probNA,seed=1L) {
set.seed(seed);
A <- replicate(NC,sample(c(NA,0:9),NR,prob=c(probNA,rep((1-probNA)/10,10L)),replace=T));
print(A);
print(system.time({ res <- findBiggestSubmatrixNonContiguous(A); }));
print(res);
print(A[res$ri,res$ci,drop=F]);
invisible(res);
}; ## end randTest()
I wrote the above function to make testing easier. We can call it to test a random input matrix of size NR by NC, with a probability of choosing NA in any given cell of probNA.
Here are a few trivial tests:
randTest(8L,1L,1/3);
## [,1]
## [1,] NA
## [2,] 1
## [3,] 4
## [4,] 9
## [5,] NA
## [6,] 9
## [7,] 0
## [8,] 5
## user system elapsed
## 0.016 0.000 0.003
## $ri
## [1] 2 3 4 6 7 8
##
## $ci
## [1] 1
##
## [,1]
## [1,] 1
## [2,] 4
## [3,] 9
## [4,] 9
## [5,] 0
## [6,] 5
randTest(11L,3L,4/5);
## [,1] [,2] [,3]
## [1,] NA NA NA
## [2,] NA NA NA
## [3,] NA NA NA
## [4,] 2 NA NA
## [5,] NA NA NA
## [6,] 5 NA NA
## [7,] 8 0 4
## [8,] NA NA NA
## [9,] NA NA NA
## [10,] NA 7 NA
## [11,] NA NA NA
## user system elapsed
## 0.297 0.000 0.300
## $ri
## [1] 4 6 7
##
## $ci
## [1] 1
##
## [,1]
## [1,] 2
## [2,] 5
## [3,] 8
randTest(10L,10L,1/3);
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
## [1,] NA NA 0 3 8 3 9 1 6 NA
## [2,] 1 NA NA 4 5 8 NA 8 2 NA
## [3,] 4 2 5 3 7 6 6 1 1 5
## [4,] 9 1 NA NA 4 NA NA 1 NA 9
## [5,] NA 7 NA 8 3 NA 5 3 7 7
## [6,] 9 3 1 2 7 NA NA 9 NA 7
## [7,] 0 2 NA 7 NA NA 3 8 2 6
## [8,] 5 0 1 NA 3 3 7 1 NA 6
## [9,] 5 1 9 2 2 5 NA 7 NA 8
## [10,] NA 7 1 6 2 6 9 0 NA 5
## user system elapsed
## 8.985 0.000 8.979
## $ri
## [1] 3 4 5 6 8 9 10
##
## $ci
## [1] 2 5 8 10
##
## [,1] [,2] [,3] [,4]
## [1,] 2 7 1 5
## [2,] 1 4 1 9
## [3,] 7 3 3 7
## [4,] 3 7 9 7
## [5,] 0 3 1 6
## [6,] 1 2 7 8
## [7,] 7 2 0 5
I don't know an easy way of verifying if the above result is correct, but it looks good to me. But it took almost 9 seconds to generate this result. Running the function on moderately larger matrices, especially a 77x132 matrix, is probably a lost cause.
Waiting to see if someone can come up with a brilliant efficient solution...
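A partial sanity check on any returned selection is cheap, even though it cannot prove optimality. The sketch below uses two hypothetical helpers, is_na_free() and can_extend(), which test that the submatrix has no NAs and that no single extra row or column could be added without introducing one.

```r
# TRUE if the selected submatrix contains no NA.
is_na_free <- function(A, ri, ci) !anyNA(A[ri, ci, drop = FALSE])

# TRUE if some unused row or column could be added while staying NA-free.
can_extend <- function(A, ri, ci) {
  ok <- !is.na(A)
  extra_rows <- setdiff(seq_len(nrow(A)), ri)
  extra_cols <- setdiff(seq_len(ncol(A)), ci)
  any(vapply(extra_rows, function(r) all(ok[r, ci]), logical(1))) ||
    any(vapply(extra_cols, function(k) all(ok[ri, k]), logical(1)))
}

# Example matrix from the question.
A <- matrix(c(NA, NA, NA, NA, 2, NA,
              1, 1, 1, 0, NA, NA,
              1, 8, NA, 1, 1, NA,
              NA, 1, 1, 6, 1, 3,
              NA, 1, 5, 1, 1, NA), nrow = 6)

is_na_free(A, 2:4, c(2, 4, 5))   # 9-cell result: valid
can_extend(A, 2:4, c(2, 4, 5))   # and not extendable
is_na_free(A, c(2, 4), 2:5)      # 8-cell selection: also valid
can_extend(A, c(2, 4), 2:5)      # also not extendable, yet smaller
```

Both selections pass both checks although one has 9 cells and the other 8, so passing them is a necessary condition for the optimum, not a sufficient one.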

How to fill in a matrix given diagonal and off-diagonal elements in r?

I have the elements for a matrix as follows:
diag= rep(1,5)
offdiag = c(rep(1:4), rep(1:3), rep(1:2), 1)
The final matrix I want should be a symmetric matrix that looks like this:
1 1 2 3 4
1 1 1 2 3
2 1 1 1 2
3 2 1 1 1
4 3 2 1 1
where the diagonal is filled by diag and the lower-triangular area is filled by offdiag column-wise.
In practice, all numbers are random, so I need a generic way to fill in the matrix with elements.
Thanks in advance!
Try this:
m <- matrix(NA, ncol = length(diag), nrow = length(diag))
m[lower.tri(m)] <- offdiag
m[upper.tri(m)] <- t(m)[upper.tri(t(m))]
diag(m) <- diag
m
# [,1] [,2] [,3] [,4] [,5]
# [1,] 1 1 2 3 4
# [2,] 1 1 1 2 3
# [3,] 2 1 1 1 2
# [4,] 3 2 1 1 1
# [5,] 4 3 2 1 1
Another alternative: Manually create a distance matrix and work from there.
class(offdiag) <- "dist"
attr(offdiag, "Size") <- length(diag)
out <- as.matrix(offdiag)
diag(out) <- diag
out
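Either approach can be wrapped in a small function with a length check. The following is a minimal sketch; fill_symmetric() and its argument names are hypothetical, and it fills the lower triangle column-wise exactly as the question describes.

```r
# Build a symmetric matrix from its diagonal and the column-wise
# lower-triangular values.
fill_symmetric <- function(diag_vals, offdiag_vals) {
  n <- length(diag_vals)
  stopifnot(length(offdiag_vals) == n * (n - 1) / 2)
  m <- matrix(0, n, n)
  m[lower.tri(m)] <- offdiag_vals  # fills column by column
  m <- m + t(m)                    # mirror into the upper triangle
  diag(m) <- diag_vals
  m
}

m <- fill_symmetric(rep(1, 5), c(1:4, 1:3, 1:2, 1))
```

The stopifnot() call catches the common mistake of passing the wrong number of off-diagonal values, which would otherwise be silently recycled or raise a less obvious error.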

How can I create vectors of a standard size so I can add them to a data frame?

I have a single vector (call it t1) with a series of observations. I want to create a set of new vectors by popping the first observation from t1 (and so on for subsequent near-copies). But I want to keep the vectors the same length so I can add them to a data frame later.
I was able to make it work as follows:
t1 <- c(1, 2, 3)
t2 <- t1[-1]
t3 <- t2[-1]
t2[length(t2)+1] <- 0
t3[length(t3)+1] <- 0
t3[length(t3)+1] <- 0
t.all <- cbind(as.data.frame(t1), as.data.frame(t2), as.data.frame(t3))
t.all
t1 t2 t3
1 1 2 3
2 2 3 0
3 3 0 0
But this is clumsy and it's going to be tedious if I want to create a large number of columns. How can I keep the vectors the same length (or solve this problem another way)?
Here is a loop version of what you are trying to do, using do.call and lapply:
cbind(t1,do.call(cbind,lapply(seq_along(t1)-1,
function(x)c(tail(t1,-x),rep(0,x)))))
t1
[1,] 1 2 3
[2,] 2 3 0
[3,] 3 0 0
> t.all <- sapply(0:2, function(x) c( t1[(x+1):3], rep(0,x) ) )
> t.all
[,1] [,2] [,3]
[1,] 1 2 3
[2,] 2 3 0
[3,] 3 0 0
If you need it to be a data.frame it would be a lot more efficient to build as a matrix first and then wrap as.data.frame around the final result.
Here's another way using vector indexing:
t1 <- c(2,5,3)
mm <- do.call(rbind, lapply(seq_along(t1), function(x) t1[x:length(t1)][1:length(t1)]))
# [,1] [,2] [,3]
# [1,] 2 5 3
# [2,] 5 3 NA
# [3,] 3 NA NA
mm[is.na(mm)] <- 0
# [,1] [,2] [,3]
# [1,] 2 5 3
# [2,] 5 3 0
# [3,] 3 0 0
Another way without using apply family:
t1 <- c(2,5,4,6)
len <- length(t1)
matrix(t1[outer(1:len, 0:(len-1), '+')], ncol=len)
# [,1] [,2] [,3] [,4]
# [1,] 2 5 4 6
# [2,] 5 4 6 NA
# [3,] 4 6 NA NA
# [4,] 6 NA NA NA
How about creating a matrix row-by-row, by recycling t1 as desired:
tmat <-cbind(t1,t1,t1,t1,....) # as many as needed
Then just use a matrix triangle function
newmat<- tmat * upper.tri(tmat,diag=TRUE)
That's offset from your sample, but contains the same info per row.
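The elided cbind(t1, t1, ...) call can be written for any length with matrix(), which recycles t1 down each column. A minimal sketch, assuming t1 <- c(1, 2, 3) from the question:

```r
t1 <- c(1, 2, 3)
n <- length(t1)

# Every column is a copy of t1, same as cbind(t1, t1, t1).
tmat <- matrix(t1, nrow = n, ncol = n)

# Keep the diagonal and everything above it, zero out the rest.
newmat <- tmat * upper.tri(tmat, diag = TRUE)
newmat
#      [,1] [,2] [,3]
# [1,]    1    1    1
# [2,]    0    2    2
# [3,]    0    0    3
```

As the answer notes, this layout is offset from the one in the question: row i repeats t1[i] from the diagonal onwards instead of shifting the values up.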
Most of the other answers focus on creating the final data.frame. If that is your ultimate goal, then they provide good approaches. This answer instead focuses narrowly on your question of how to take the first element off and preserve the length. In order to keep things tidy, it is best to do the whole thing in one function.
shift <- function(tx) {append(tx[-1],0)}
Then you can have
t1 <- c(1, 2, 3)
t2 <- shift(t1)
t3 <- shift(t2)
t.all <- data.frame(t1, t2, t3)
which gives you the same result you had.
> t.all
t1 t2 t3
1 1 2 3
2 2 3 0
3 3 0 0
If you want to combine this function with a looping construct to create the data.frame, it is easiest to go through a matrix first.
t.all <- matrix(t1, nrow=length(t1), ncol=length(t1))
lapply(seq(length=length(t1))[-1], function(i) {
t.all[,i] <<- shift(t.all[,(i-1)])
})
t.all <- as.data.frame(t.all)
which gives the same data.frame, but with slightly different column names
> t.all
V1 V2 V3
1 1 2 3
2 2 3 0
3 3 0 0

in R, how to retrieve a complete matrix using combn?

My problem, stripped of its specific purpose, looks like this:
How can I transform a combination like the following?
First, use combn(letters[1:4], 2) to calculate the combinations:
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] "a" "a" "a" "b" "b" "c"
[2,] "b" "c" "d" "c" "d" "d"
Then use each column to obtain another data frame:
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 1 2 3 4 5 6
Each element is obtained from the corresponding column of the combination matrix above; for example, the first element comes from the first column.
Then how can I transform the above data frame into a matrix, for example a result like this:
a b c d
a 0 1 2 3
b 1 0 4 5
c 2 4 0 6
d 3 5 6 0
Elements with the same row and column name have the value zero; the others take the corresponding value from above.
Here is one way that works:
inputs <- letters[1:4]
combs <- combn(inputs, 2)
N <- seq_len(ncol(combs))
nams <- unique(as.vector(combs))
out <- matrix(ncol = length(nams), nrow = length(nams))
out[lower.tri(out)] <- N
out <- t(out)
out[lower.tri(out)] <- N
out <- t(out)
diag(out) <- 0
rownames(out) <- colnames(out) <- inputs
Which gives:
> out
a b c d
a 0 1 2 3
b 1 0 4 5
c 2 4 0 6
d 3 5 6 0
If I had to do this a lot, I'd wrap those function calls into a function.
Another option is to use as.matrix.dist() to do the conversion for us by setting up a "dist" object by hand. Using some of the objects from earlier:
## Far easier
out2 <- N
class(out2) <- "dist"
attr(out2, "Labels") <- as.character(inputs)
attr(out2, "Size") <- length(inputs)
attr(out2, "Diag") <- attr(out2, "Upper") <- FALSE
out2 <- as.matrix(out2)
Which gives:
> out2
a b c d
a 0 1 2 3
b 1 0 4 5
c 2 4 0 6
d 3 5 6 0
Again, I'd wrap this in a function if I had to do it more than once.
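Such a wrapper might look like the sketch below. pairs_to_matrix() is a hypothetical name; it assumes the values arrive in the same order as the columns of combn(labels, 2), which is also the column-wise lower-triangle order that a "dist" object uses.

```r
# Turn one value per pair (in combn(labels, 2) order) into a symmetric
# matrix with zeros on the diagonal, via a hand-built "dist" object.
pairs_to_matrix <- function(labels, values) {
  n <- length(labels)
  stopifnot(length(values) == choose(n, 2))
  d <- structure(values,
                 class = "dist",
                 Size = n,
                 Labels = as.character(labels),
                 Diag = FALSE,
                 Upper = FALSE)
  as.matrix(d)
}

res <- pairs_to_matrix(letters[1:4], 1:6)
res["a", "d"]  # value of the pair (a, d), the third column of combn()
```

Because the result carries row and column names, the value for any pair can be looked up by label in either order.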
Does it have to be a mirror matrix with zeros over the diagonal?
combo <- combn(letters[1:4], 2)
in.combo <- matrix(1:6, nrow = 1)
combo <- rbind(combo, in.combo)
out.combo <- matrix(rep(NA, 16), ncol = 4)
colnames(out.combo) <- letters[1:4]
rownames(out.combo) <- letters[1:4]
for(cols in 1:ncol(combo)) {
vec1 <- combo[, cols]
out.combo[vec1[1], vec1[2]] <- as.numeric(vec1[3])
}
> out.combo
a b c d
a NA 1 2 3
b NA NA 4 5
c NA NA NA 6
d NA NA NA NA
