Changing the values in a binary matrix - r

Consider the 8 by 6 binary matrix, M:
M <- matrix(c(0,0,1,1,0,0,1,1,
0,1,1,0,0,1,1,0,
0,0,0,0,1,1,1,1,
0,1,0,1,1,0,1,0,
0,0,1,1,1,1,0,0,
0,1,1,0,1,0,0,1),nrow = 8,ncol = 6)
Here is the M
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 0 0 0 0 0 0
[2,] 0 1 0 1 0 1
[3,] 1 1 0 0 1 1
[4,] 1 0 0 1 1 0
[5,] 0 0 1 1 1 1
[6,] 0 1 1 0 1 0
[7,] 1 1 1 1 0 0
[8,] 1 0 1 0 0 1
The following matrix contains, for each column of M, the row indices of the 1's in that column
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 3 2 5 2 3 2
[2,] 4 3 6 4 4 3
[3,] 7 6 7 5 5 5
[4,] 8 7 8 7 6 8
Let's call this matrix ind:
ind <- matrix(c(3,4,7,8,
2,3,6,7,
5,6,7,8,
2,4,5,7,
3,4,5,6,
2,3,5,8),nrow = 4, ncol=6)
I'm trying to change a single 1 into 0 in each column of M.
For example, one possible choice of positions to change would be (4,2,5,4,3,2), i.e. the 4th position of column 1, the 2nd position of column 2, the 5th position of column 3 and so on. Let N be the resulting matrix. This choice produces the following matrix N
N <- matrix(c(0,0,1,0,0,0,1,1,
0,0,1,0,0,1,1,0,
0,0,0,0,0,1,1,1,
0,1,0,0,1,0,1,0,
0,0,0,1,1,1,0,0,
0,0,1,0,1,0,0,1),nrow = 8,ncol = 6)
Here is that N
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 0 0 0 0 0 0
[2,] 0 0 0 1 0 0
[3,] 1 1 0 0 0 1
[4,] 0 0 0 0 1 0
[5,] 0 0 0 1 1 1
[6,] 0 1 1 0 1 0
[7,] 1 1 1 1 0 0
[8,] 1 0 1 0 0 1
For EACH resulting matrix N, I do the following calculation.
X <- cbind(c(rep(1,nrow(N))),N)
ans <- sum(diag(solve(t(X)%*%X)[-1,-1]))
Then, I want to obtain the matrix N that produces the smallest value of ans. How do I do this efficiently?

Let me know if this works.
We first build the conversion function that I'll need, along with its inverse in case you need it at some point:
ind_to_M <- function(ind){
M <- matrix(rep(0,6*8),ncol=6)
for(i in 1:ncol(ind)){M[ind[,i],i] <- 1}
return(M)
}
M_to_ind <- function(M){apply(M==1,2,which)}
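As a quick sanity check, the two helpers should invert each other on the example data. The sketch below repeats M from the question and the two functions above; nothing else is assumed.

```r
# M from the question (filled column by column)
M <- matrix(c(0,0,1,1,0,0,1,1,
              0,1,1,0,0,1,1,0,
              0,0,0,0,1,1,1,1,
              0,1,0,1,1,0,1,0,
              0,0,1,1,1,1,0,0,
              0,1,1,0,1,0,0,1), nrow = 8, ncol = 6)

# Helpers copied from above (dimensions are hard-coded to 8 x 6)
ind_to_M <- function(ind){
  M <- matrix(rep(0, 6*8), ncol = 6)
  for(i in 1:ncol(ind)){M[ind[,i], i] <- 1}
  return(M)
}
M_to_ind <- function(M){apply(M == 1, 2, which)}

ind <- M_to_ind(M)                       # 4 x 6 matrix of row indices
roundtrip_ok <- all(ind_to_M(ind) == M)  # TRUE if the conversion is lossless
```

Note that M_to_ind only returns a matrix when every column holds the same number of 1s; otherwise apply() returns a list.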
Then we will build a matrix of possible ways to ditch a value
all_possible_ways_to_ditch_value <- 1:4
for (i in 2:ncol(M)){
all_possible_ways_to_ditch_value <- merge(all_possible_ways_to_ditch_value,1:4,by=NULL)
}
# there's probably a more elegant way to do that
head(all_possible_ways_to_ditch_value)
# x y.x y.y y.x y.y y
# 1 1 1 1 1 1 1 # will be used to ditch the 1st value of ind for every column
# 2 2 1 1 1 1 1
# 3 3 1 1 1 1 1
# 4 4 1 1 1 1 1
# 5 1 2 1 1 1 1
# 6 2 2 1 1 1 1
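One possibly tidier alternative to the repeated merge is base R's expand.grid, which builds the same kind of Cartesian product in one call. This is only a sketch; the name ways is made up for illustration, and the column names differ from those produced by merge.

```r
# One column per column of M (6), one row per combination of positions to drop.
# Each column of M holds four 1s, so each position ranges over 1:4.
ways <- expand.grid(rep(list(1:4), 6))

dim(ways)   # 4^6 rows, 6 columns
head(ways)  # the first column varies fastest, as in the rows shown above
```

Rows can then be accessed with ways[j, i] exactly as in the loop that follows.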
Then we iterate through those, each time storing ans and N (as data is quite small overall).
ans_list <- list()
N_list <- list()
for(j in 1:nrow(all_possible_ways_to_ditch_value)){
#print(j)
ind_N <- matrix(rep(0,6*3),ncol=6) # initiate ind_N as an empty matrix
for(i in 1:ncol(M)){
ind_N[,i] <- ind[-all_possible_ways_to_ditch_value[j,i],i] # fill with ind except for the value we ditch
}
N <- ind_to_M(ind_N)
X <- cbind(c(rep(1,nrow(N))),N)
ans_list[[j]] <- try(sum(diag(solve(t(X)%*%X)[-1,-1])),silent=TRUE) # some systems are not well defined, we'll just ignore the errors
N_list[[j]] <- N
}
We finally retrieve the minimal ans and the relevant N
ans <- ans_list[[which.min(ans_list)]]
# [1] -3.60288e+15
N <- N_list[[which.min(ans_list)]]
# [,1] [,2] [,3] [,4] [,5] [,6]
# [1,] 0 0 0 0 0 0
# [2,] 0 1 0 1 0 1
# [3,] 1 1 0 0 1 1
# [4,] 1 0 0 1 1 0
# [5,] 0 0 1 1 1 1
# [6,] 0 1 1 0 0 0
# [7,] 1 0 1 0 0 0
# [8,] 0 0 0 0 0 0
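The huge negative value is a warning sign: the diagonal of the inverse of a positive definite matrix is always positive, so a negative ans presumably comes from a numerically singular t(X) %*% X that solve() did not reject. A minimal sketch of a guard, assuming a rank check with qr() is acceptable here; crit is a hypothetical helper name, not part of the answer above.

```r
# Criterion from the question, returning Inf for rank-deficient designs
crit <- function(N) {
  X <- cbind(1, N)                           # add the intercept column
  if (qr(X)$rank < ncol(X)) return(Inf)      # singular design: skip it
  sum(diag(solve(crossprod(X)))[-1])         # same as diag(solve(t(X) %*% X)[-1, -1])
}

M <- matrix(c(0,0,1,1,0,0,1,1,
              0,1,1,0,0,1,1,0,
              0,0,0,0,1,1,1,1,
              0,1,0,1,1,0,1,0,
              0,0,1,1,1,1,0,0,
              0,1,1,0,1,0,0,1), nrow = 8, ncol = 6)

crit_full <- crit(M)          # finite and positive for the original M

N_bad <- M
N_bad[, 2] <- N_bad[, 1]      # duplicate a column to force a singular design
crit_bad <- crit(N_bad)       # Inf
```

With such a guard, which.min can be applied directly to the numeric results without filtering out errors afterwards.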
EDIT:
To get the minimal positive ans (failed solves are set to Inf, then the value closest to zero is picked)
ans_list[which(!sapply(ans_list,is.numeric))] <- Inf
ans <- ans_list[[which.min(abs(unlist(ans_list)))]]
# [1] 3.3
N <- N_list[[which.min(abs(unlist(ans_list)))]]
# [,1] [,2] [,3] [,4] [,5] [,6]
# [1,] 0 0 0 0 0 0
# [2,] 0 1 0 1 0 0
# [3,] 1 1 0 0 0 1
# [4,] 1 0 0 0 1 0
# [5,] 0 0 0 1 1 1
# [6,] 0 1 1 0 1 0
# [7,] 1 0 1 1 0 0
# [8,] 0 0 1 0 0 1
EDIT 2 : to generalize the number of rows of ind to ditch
It seems to give the same result for ans for n_ditch = 1, and results make sense for n_ditch = 2
n_ditch <- 2
ditch_possibilities <- combn(1:4,n_ditch) # these are all the possible sets of indices to ditch for one given column
all_possible_ways_to_ditch_value <- 1:ncol(ditch_possibilities) # this will be all the possible sets of indices of ditch_possibilities to test
for (i in 2:ncol(M)){
all_possible_ways_to_ditch_value <- merge(all_possible_ways_to_ditch_value,1:ncol(ditch_possibilities),by=NULL)
}
ans_list <- list()
N_list <- list()
for(j in 1:nrow(all_possible_ways_to_ditch_value)){
#print(j)
ind_N <- matrix(rep(0,6*(4-n_ditch)),ncol=6) # initiate ind_N as an empty matrix
for(i in 1:ncol(M)){
ind_N[,i] <- ind[-ditch_possibilities[,all_possible_ways_to_ditch_value[j,i]],i] # fill with ind except for the value we ditch
}
N <- ind_to_M(ind_N)
X <- cbind(c(rep(1,nrow(N))),N)
ans_list[[j]] <- try(sum(diag(solve(t(X)%*%X)[-1,-1])),silent=TRUE) # some systems are not well defined, we'll just ignore the errors
N_list[[j]] <- N
}
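The two loops above can be folded into one function. The following is a sketch under stated assumptions, not the answer's own code: search_best and crit are hypothetical names, rank-deficient designs are scored as Inf via qr(), and every column of M is assumed to contain at least two 1s and more than n_ditch of them.

```r
crit <- function(N) {
  X <- cbind(1, N)
  if (qr(X)$rank < ncol(X)) return(Inf)   # skip singular designs
  sum(diag(solve(crossprod(X)))[-1])
}

search_best <- function(M, n_ditch = 1) {
  # For each column, list every set of n_ditch rows (among its 1s) to zero out
  drops <- lapply(seq_len(ncol(M)), function(j)
    combn(which(M[, j] == 1), n_ditch, simplify = FALSE))
  # Every combination of one choice per column
  grid <- as.matrix(expand.grid(lapply(drops, seq_along)))
  best <- list(ans = Inf, N = NULL)
  for (k in seq_len(nrow(grid))) {
    N <- M
    for (j in seq_len(ncol(M))) N[drops[[j]][[grid[k, j]]], j] <- 0
    a <- crit(N)
    if (a < best$ans) best <- list(ans = a, N = N)
  }
  best
}

M <- matrix(c(0,0,1,1,0,0,1,1,
              0,1,1,0,0,1,1,0,
              0,0,0,0,1,1,1,1,
              0,1,0,1,1,0,1,0,
              0,0,1,1,1,1,0,0,
              0,1,1,0,1,0,0,1), nrow = 8, ncol = 6)

res <- search_best(M, n_ditch = 1)
res$ans
res$N
```

Keeping only the running best avoids storing every candidate matrix, which matters once n_ditch or the matrix size grows.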

Related

Copying the diagonal of a dataframe/matrix in the first row, by group

Say I have the following matrix mat3, where column 1 is a variable defining 2 groups:
mat1 <- diag(1, 5, 5)
mat1[,1] <- 1
mat2 <- diag(3, 5, 5)
mat2[,1] <- 3
mat3 <- rbind(mat1, mat2)
mat3
In mat3, how do I copy the diagonals of mat1 and mat2 in their respective first rows (i.e. rows 1 and 6)? The pseudocode would be: diag(mat3) by mat3[,1]
I tried the following but it did not work:
fnc <- function(x) {
res <- x
res[1,] <- diag(x)
res <<- res
}
by(mat3, as.factor(mat3[,1]), fnc)
res
In practice, I need to apply this operation to a dataframe.
Thanks a lot!
do.call(rbind, lapply(split.data.frame(mat3, mat3[,1]), \(x) {
x[1, ] <- diag(x); x
}))
[,1] [,2] [,3] [,4] [,5]
[1,] 1 1 1 1 1
[2,] 1 1 0 0 0
[3,] 1 0 1 0 0
[4,] 1 0 0 1 0
[5,] 1 0 0 0 1
[6,] 3 3 3 3 3
[7,] 3 3 0 0 0
[8,] 3 0 3 0 0
[9,] 3 0 0 3 0
[10,] 3 0 0 0 3
If mat3 is a data.frame, you can revise the anonymous function as
\(x) {
x[1, ] <- diag(as.matrix(x)); x
}
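Put together for the data.frame case, the call could look like the sketch below. It assumes the grouping variable is the first column and that each group is a square block, as in the example; df3 is a made-up name, and function(x) is used instead of \(x) only so that the code also runs on R versions before 4.1.

```r
mat1 <- diag(1, 5, 5); mat1[, 1] <- 1
mat2 <- diag(3, 5, 5); mat2[, 1] <- 3
df3 <- as.data.frame(rbind(mat1, mat2))

# Split by the first column, overwrite row 1 of each block with its diagonal
res <- do.call(rbind, lapply(split(df3, df3[[1]]), function(x) {
  x[1, ] <- diag(as.matrix(x)); x
}))
rownames(res) <- NULL   # drop the "group.row" names added by rbind
res
```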
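Here's an approach that just finds the start of each matrix, assuming they are square. Note that it fills the first row with the group's value from column 1 rather than reading the diagonal, which gives the same result here because every diagonal entry equals that value: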
idx <- seq(from=1, to=nrow(mat3), by=ncol(mat3))
for(i in idx) mat3[i, ] <- rep(mat3[i, 1], ncol(mat3))
mat3
# [,1] [,2] [,3] [,4] [,5]
# [1,] 1 1 1 1 1
# [2,] 1 1 0 0 0
# [3,] 1 0 1 0 0
# [4,] 1 0 0 1 0
# [5,] 1 0 0 0 1
# [6,] 3 3 3 3 3
# [7,] 3 3 0 0 0
# [8,] 3 0 3 0 0
# [9,] 3 0 0 3 0
# [10,] 3 0 0 0 3
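If the diagonal entries differ from the value in column 1, the same loop can read the diagonal of each block instead. A minimal sketch, assuming stacked square blocks of width ncol(m); blockA, blockB and m are illustrative names only.

```r
# Two stacked 3 x 3 blocks with distinct diagonals
blockA <- matrix(0, 3, 3); diag(blockA) <- c(1, 2, 3)
blockB <- matrix(0, 3, 3); diag(blockB) <- c(4, 5, 6)
m <- rbind(blockA, blockB)

k <- ncol(m)                                   # block size (blocks are square)
for (i in seq(1, nrow(m), by = k)) {
  block <- m[i:(i + k - 1), , drop = FALSE]    # rows of the current block
  m[i, ] <- diag(block)                        # copy its diagonal into its first row
}
m
```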

Building adjacency matrix from delaunay triangle values

I have a large matrix containing 3D coordinates of points and I want to build their adjacency matrix from their Delaunay triangulation. To get the Delaunay triangulation I used the 'geometry' package. As an example of what I have, I prepared the code below:
values<-rnorm(12,mean = 10, 5)
mat<-matrix(values,ncol = 3)
dimnames(mat)<-list(c("asd","qwe","rty","poi"),c("x","y","z"))
require("geometry")
delaunaynMat<-delaunayn(mat)
However, I could not find any proper function to build the adjacency matrix from delaunay results. Any idea about it?
Perhaps with the DatabionicSwarm package
> library(DatabionicSwarm)
> Delaunay4Points(mat, IsToroid=FALSE)
1 2 3 4
1 0 1 1 1
2 1 0 1 1
3 1 1 0 0
4 1 1 0 0
Or with deldir (only for 2D):
> require(deldir)
Loading required package: deldir
deldir 0.1-15
> set.seed(42)
> x <- runif(6)
> y <- runif(6)
> dxy <- deldir(x,y)
> ind <- dxy$dirsgs[,5:6]
> adj <- matrix(0, length(x), length(y))
> for (i in 1:nrow(ind)){
+ adj[ind[i,1], ind[i,2]] <- 1
+ adj[ind[i,2], ind[i,1]] <- 1
+ }
> adj
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 0 1 0 1 0 0
[2,] 1 0 0 1 1 0
[3,] 0 0 0 0 1 1
[4,] 1 1 0 0 1 1
[5,] 0 1 1 1 0 1
[6,] 0 0 1 1 1 0
EDIT
There's something weird. I need to reverse the row and column order of the Delaunay4Points result to get results consistent with deldir:
set.seed(42)
x <- runif(6)
y <- runif(6)
dxy <- deldir(x,y)
ind <- dxy$delsgs[,5:6]
adj <- matrix(0, length(x), length(y))
for (i in 1:nrow(ind)){
adj[ind[i,1], ind[i,2]] <- 1
adj[ind[i,2], ind[i,1]] <- 1
}
adj
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 0 1 0 1 0 1
[2,] 1 0 1 1 1 0
[3,] 0 1 0 0 1 1
[4,] 1 1 0 0 1 1
[5,] 0 1 1 1 0 1
[6,] 1 0 1 1 1 0
> Delaunay4Points(cbind(x,y), IsToroid=FALSE)[6:1,6:1] # <- look
6 5 4 3 2 1
6 0 1 0 1 0 1
5 1 0 1 1 1 0
4 0 1 0 0 1 1
3 1 1 0 0 1 1
2 0 1 1 1 0 1
1 1 0 1 1 1 0
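For the delaunayn output in the question, the adjacency matrix can also be built by hand, since delaunayn returns a matrix in which each row holds the point indices of one simplex. The sketch below assumes that layout; simplices_to_adj is a hypothetical helper, and a small hand-written index matrix stands in for the delaunayn result so the code runs without the geometry package.

```r
# Mark every pair of points sharing a simplex as adjacent
simplices_to_adj <- function(tri, n = max(tri)) {
  adj <- matrix(0L, n, n)
  for (s in seq_len(nrow(tri))) {
    v <- tri[s, ]
    adj[v, v] <- 1L        # all vertices of one simplex are mutual neighbours
  }
  diag(adj) <- 0L          # a point is not its own neighbour
  adj
}

# Stand-in for delaunayn(mat): two triangles sharing the edge 2-3
tri <- rbind(c(1, 2, 3),
             c(2, 3, 4))
adj <- simplices_to_adj(tri, n = 4)
adj
```

With the real data this would be called as simplices_to_adj(delaunaynMat, nrow(mat)).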

Obtain all possible matrices by swapping only two positions in any given column

Let's start with the following matrix.
M <- matrix(c(0,0,1,1,0,0,1,1,
0,1,1,0,0,1,1,0,
0,0,0,0,1,1,1,1,
0,1,0,1,1,0,1,0,
0,0,1,1,1,1,0,0,
0,1,1,0,1,0,0,1),nrow = 8,ncol = 6)
Here is the M
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 0 0 0 0 0 0
[2,] 0 1 0 1 0 1
[3,] 1 1 0 0 1 1
[4,] 1 0 0 1 1 0
[5,] 0 0 1 1 1 1
[6,] 0 1 1 0 1 0
[7,] 1 1 1 1 0 0
[8,] 1 0 1 0 0 1
If I pick a column at random, say column 4, I want to swap two positions in that column. One such possibility, swapping the 5th and 6th positions, gives
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 0 0 0 0 0 0
[2,] 0 1 0 1 0 1
[3,] 1 1 0 0 1 1
[4,] 1 0 0 1 1 0
[5,] 0 0 1 0 1 1
[6,] 0 1 1 1 1 0
[7,] 1 1 1 1 0 0
[8,] 1 0 1 0 0 1
I want to do this for every possible swap in each column and then for all columns to obtain all the possible matrices.
Here's another solution:
# Return all unique permutations for c(0,0,0,0,1,1,1,1)
library(gtools)
perms = unique(permutations(8, 8, M[,1], set = FALSE))
# Create nested list
Mat_list = lapply(vector("list", ncol(M)), function(x) vector("list", nrow(perms)))
# Loop through every column and every permutations replacing each column
# with each unique permutation one at a time
for(ii in 1:ncol(M)){
for(jj in 1:nrow(perms)){
New_Mat = M
New_Mat[,ii] = perms[jj,]
Mat_list[[ii]][[jj]] = New_Mat
}
}
Result:
> Mat_list[[1]][[2]]
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 0 0 0 0 0 0
[2,] 0 1 0 1 0 1
[3,] 1 1 0 0 1 1
[4,] 1 0 0 1 1 0
[5,] 0 0 1 1 1 1
[6,] 1 1 1 0 1 0
[7,] 0 1 1 1 0 0
[8,] 1 0 1 0 0 1
Note:
Instead of creating a super long list, I've created a nested list of matrices with 6 elements (one per column of M) and n sub-elements per element (where n is the number of unique permutations). You can unlist the result if you prefer the long list form.
This code gives every permutation of 0s and 1s by column. I used a smaller toy example here, because the number of possibilities can get very large -- prod(choose(nrow(M), colSums(M))), which is 70^6 (about 1.2e11) for the 8 by 6 matrix in the question. As a note, this will likely not run on a standard computer for that matrix, because of memory requirements.
library(gtools)
set.seed(1234)
M <- matrix(sample(0:1, 16, replace = TRUE), ncol = 4)
M
# [,1] [,2] [,3] [,4]
# [1,] 0 1 1 0
# [2,] 1 1 1 1
# [3,] 1 0 1 0
# [4,] 1 0 1 1
perm1s <- function(n, N) {
unique(permutations(N, N, c(rep(0, N - n), rep(1, n)), FALSE, FALSE))
}
createMat <- function(vec, lst) {
tmp <- lapply(seq_along(vec), function(x) lst[[x]][vec[x], ])
do.call(cbind, tmp)
}
makeMats <- function(M) {
sums <- colSums(M)
rows <- nrow(M)
rowPerm <- lapply(sums, perm1s, N = rows)
comb <- expand.grid(lapply(sapply(rowPerm, nrow), seq))
comb <- lapply(split(comb, seq(nrow(comb))), unlist)
mats <- lapply(comb, createMat, lst = rowPerm)
mats
}
res <- makeMats(M)
res[[1]]
# [,1] [,2] [,3] [,4]
# [1,] 0 0 1 0
# [2,] 1 0 1 0
# [3,] 1 1 1 1
# [4,] 1 1 1 1
To hold other columns constant when varying 1 column -- sum(choose(nrow(M), colSums(M))) possibilities:
makeMats2 <- function(M) {
sums <- colSums(M)
rows <- nrow(M)
rowPerm <- lapply(sums, perm1s, N = rows)
ind <- rep(seq_along(rowPerm), sapply(rowPerm, nrow))
rowPerm <- lapply(rowPerm, function(x) split(x, seq(nrow(x))))
rowPerm <- unlist(rowPerm, recursive = FALSE)
mats <- rep(list(M), length(rowPerm))
mats <- mapply(function(x, y, z) {x[ , y] <- z; x},
x = mats, y = ind, z = rowPerm, SIMPLIFY = FALSE)
mats
}
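If the goal is read literally as a single swap within one column, the candidates are far fewer: only swaps between a 1 and a 0 change the matrix. A minimal sketch under that interpretation; single_swaps is a hypothetical helper and does not rely on gtools.

```r
single_swaps <- function(M) {
  out <- list()
  for (j in seq_len(ncol(M))) {
    ones  <- which(M[, j] == 1)
    zeros <- which(M[, j] == 0)
    for (a in ones) for (b in zeros) {
      N <- M
      N[c(a, b), j] <- N[c(b, a), j]   # swap positions a and b in column j
      out[[length(out) + 1]] <- N
    }
  }
  out
}

M <- matrix(c(0,0,1,1,0,0,1,1,
              0,1,1,0,0,1,1,0,
              0,0,0,0,1,1,1,1,
              0,1,0,1,1,0,1,0,
              0,0,1,1,1,1,0,0,
              0,1,1,0,1,0,0,1), nrow = 8, ncol = 6)

swaps <- single_swaps(M)
length(swaps)   # 4 ones x 4 zeros per column, over 6 columns
```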

Is there a way to generate a matrix in R of 0's and 1's to satisfy specific row and column totals?

I want to generate a 7 column by 10 row matrix with a total of exactly 20 randomly generated 1's, but with at least two 1's per row and two 1's per column. How could I do that?
Also, how would the code be different if I wanted to set a range of acceptable row and column totals instead of minimums?
Thanks!
I feel like there should be a more elegant solution, but here's a ball of duct tape:
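matbuilder <- function(n,nrow,ncol) {
  finished <- FALSE
  while(!finished) {
    # shuffle n ones among the nrow*ncol cells
    trial <- matrix(sample(c(rep(1,n),rep(0,nrow*ncol-n))),nrow=nrow,ncol=ncol)
    # accept only if every row and every column has at least two 1s
    if(all(rowSums(trial)>=2) && all(colSums(trial)>=2)) finished <- TRUE
  }
  return(trial)
}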
x <- matbuilder(20, 10, 7)
x
## [,1] [,2] [,3] [,4] [,5] [,6] [,7]
## [1,] 1 1 0 0 0 0 0
## [2,] 0 0 0 0 1 0 1
## [3,] 0 1 0 0 0 1 0
## [4,] 1 0 0 0 1 0 0
## [5,] 0 1 0 0 0 1 0
## [6,] 0 0 1 1 0 0 0
## [7,] 0 0 1 1 0 0 0
## [8,] 0 0 0 0 1 0 1
## [9,] 0 0 0 0 0 1 1
## [10,] 0 1 1 0 0 0 0
sum(x)
## [1] 20
rowSums(x)
## [1] 2 2 2 2 2 2 2 2 2 2
colSums(x)
## [1] 2 4 3 2 3 3 3
Or, to give a range of acceptable row/column totals...
matbuilder <- function(n,nrow,ncol,rowmin,rowmax,colmin,colmax,ntimeout=100000) {
  finished <- FALSE
  i <- 1
  trial <- NA
  while(!finished) {
    trial <- matrix(sample(c(rep(1,n),rep(0,nrow*ncol-n))),nrow=nrow,ncol=ncol)
    rs <- rowSums(trial)
    cs <- colSums(trial)
    # accept only if every row and column total lies within its range
    if(all(rs>=rowmin) && all(rs<=rowmax) && all(cs>=colmin) && all(cs<=colmax)) finished <- TRUE
    i <- i+1
    if(!finished && i>ntimeout) {
      finished <- TRUE
      trial <- NA # no valid matrix found: return NA rather than an invalid trial
      cat("sorry boss, timeout.")
    }
  }
  return(trial)
}
x <- matbuilder(25,10,7,rowmin=2,rowmax=3,colmin=2,colmax=4)
x
## [,1] [,2] [,3] [,4] [,5] [,6] [,7]
## [1,] 1 0 0 0 0 1 1
## [2,] 0 1 1 0 0 0 1
## [3,] 1 0 0 0 1 0 0
## [4,] 1 0 1 1 0 0 0
## [5,] 1 0 0 0 0 1 1
## [6,] 0 1 1 1 0 0 0
## [7,] 0 0 0 1 0 0 1
## [8,] 0 0 1 0 1 0 0
## [9,] 0 0 0 1 1 0 0
## [10,] 0 0 0 0 1 1 0
sum(x)
## [1] 25
rowSums(x)
## [1] 3 3 2 3 3 3 2 2 2 2
colSums(x)
## [1] 4 2 4 4 4 3 4
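For the original request (20 ones, 10 rows, at least two per row), every row must hold exactly two 1s, so the rejection step can be limited to the column totals. The following is a sketch for that special case only; row_sampler and its arguments are hypothetical names, not part of the answer above.

```r
row_sampler <- function(nrow, ncol, per_row = 2, colmin = 2, max_tries = 10000) {
  for (t in seq_len(max_tries)) {
    m <- matrix(0L, nrow, ncol)
    # place exactly per_row ones in each row, in randomly chosen columns
    for (i in seq_len(nrow)) m[i, sample(ncol, per_row)] <- 1L
    # keep the draw only if every column total reaches the minimum
    if (all(colSums(m) >= colmin)) return(m)
  }
  stop("no valid matrix found within max_tries")
}

set.seed(1)
x <- row_sampler(10, 7)
sum(x)
rowSums(x)
colSums(x)
```

Because the row constraint is satisfied by construction, far fewer draws should be rejected than when all 20 ones are shuffled over the 70 cells at once.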
This one involves sampling a matrix of indices such that each row is repeated twice and columns are repeated at least 2 times.
set.seed(42)
m = matrix(rep(0, 70), nrow = 10)
#Sample rows 1-10 twice
rows = sample(c(1:10, 1:10))
#Sample columns 1-7 twice and additional 6 to make 20
columns = sample(c(sample(1:7, 6, replace = TRUE), 1:7, 1:7))
#Create a matrix of indices that should be 1
inds = cbind(rows, columns)
#Remove duplicates in inds if any (Refer: https://stackoverflow.com/q/44555420/7128934)
inds[,2] = replace(x = inds[,2],
list = duplicated(inds),
values = sample(x = columns[!(columns %in% inds[,2][duplicated(inds)])],
size = 1))
m[inds] = 1
#Check
rowSums(m)
#[1] 2 2 2 2 2 2 2 2 2 2
colSums(m)
#[1] 4 2 2 3 2 2 5
sum(m)
#[1] 20

How to generate a matrix to store all non-empty subsets of a set

Suppose I have a set N={1,2,3}, then we can list all its 7 non-empty subsets.
n=3 # number of elements in a set
a=2^n-1 # number of non-empty subsets for that set
subsets=lapply(1:n, function(x) combn(n, x)) # list all the non-empty subsets
subsets
Now I want to put these subsets into a matrix. For n=3 it would be organized like this, either with the elements themselves (left) or as an index matrix (right):
1 0 0     1 0 0
0 2 0     0 1 0
0 0 3     0 0 1
1 2 0     1 1 0
1 0 3     1 0 1
0 2 3     0 1 1
1 2 3     1 1 1
Does anyone know how to write code for this that could easily be extended to any n (=4, 5, 6...)? I tried this:
subindex=matrix(c(0), nrow=a, ncol=n)
i=1
while(i<=a){
j=n
b=2^(n-1)
N=i
while(N>0){
if(b<=N) {subindex[i,j]=1}&{N=N-b}
b=trunc(b/2)
j=j-1
}
i=i+1
}
subindex
But the index matrix I get is wrong in rows 3 and 4. If n=4, there are more errors... Can anybody correct or simplify this code, or just write completely new code? I'd really appreciate it.
n <- 4
lapply(seq_len(n), function(i)t(combn(n, i, FUN = tabulate, nbins = n)))
# [[1]]
# [,1] [,2] [,3] [,4]
# [1,] 1 0 0 0
# [2,] 0 1 0 0
# [3,] 0 0 1 0
# [4,] 0 0 0 1
#
# [[2]]
# [,1] [,2] [,3] [,4]
# [1,] 1 1 0 0
# [2,] 1 0 1 0
# [3,] 1 0 0 1
# [4,] 0 1 1 0
# [5,] 0 1 0 1
# [6,] 0 0 1 1
#
# [[3]]
# [,1] [,2] [,3] [,4]
# [1,] 1 1 1 0
# [2,] 1 1 0 1
# [3,] 1 0 1 1
# [4,] 0 1 1 1
#
# [[4]]
# [,1] [,2] [,3] [,4]
# [1,] 1 1 1 1
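To get the single matrix asked for in the question, the list elements can be stacked with rbind. A minimal sketch building on the call above; subset_index is a made-up wrapper name, and the element-valued version is obtained by multiplying each column by its index.

```r
# Index matrix of all non-empty subsets of 1..n, ordered by subset size
subset_index <- function(n) {
  do.call(rbind, lapply(seq_len(n), function(i)
    t(combn(n, i, FUN = tabulate, nbins = n))))
}

n <- 3
idx <- subset_index(n)                    # (2^n - 1) x n matrix of 0/1
elems <- sweep(idx, 2, seq_len(n), `*`)   # replace each 1 by its element number
idx
elems
```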

Resources