Generating random 0/1 matrix with fixed features in R

I want to generate a random M x N matrix with zeros and ones with the following special properties:
1) The ones are only in m of the M rows.
2) The ones are only in n of the N columns.
Suppose I am ONLY given that M=10, N=10, m=6 and n=4. One possible random matrix is given by
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 0 0 1 0 0 0 1 0 1 0
[2,] 0 0 1 0 0 0 1 0 1 0
[3,] 0 0 0 0 0 0 0 0 0 0
[4,] 0 0 0 0 0 0 0 0 0 0
[5,] 0 0 1 0 0 0 1 0 0 0
[6,] 0 1 0 0 0 0 0 0 0 0
[7,] 0 0 0 0 0 0 0 0 0 0
[8,] 0 0 1 0 0 0 0 0 0 0
[9,] 0 0 1 0 0 0 0 0 1 0
[10,] 0 0 0 0 0 0 0 0 0 0
For reproducibility, I've artificially generated the above "random" matrix using
ex <- matrix(0,10,10)
ex[1,3] <- ex[1,7] <- ex[1,9] <- ex[2,3] <- ex[2,7] <-
ex[2,9] <- ex[5,3] <- ex[5,7] <- ex[6,2] <- ex[8,3] <- ex[9,3] <- ex[9,9] <- 1
Note that
sum(rowSums(ex)>0)
[1] 6
sum(colSums(ex)>0)
[1] 4
which match m and n above exactly. The number of ones can be random. At one extreme I could have 6 ones spread out over 6 rows and 4 columns (for example, 2 columns with 2 ones each and the other 2 columns with 1 each); at the other extreme, I could have 24 ones (each of the 6 rows has a 1 in the same 4 columns).
Question
I can generate this in a brute force manner, sampling over the rows and columns, but I need to do this over thousands of such matrices (because the m and n will be different each time) and these matrices are large (M=5000 and N=8000, typically). Is there a way to do this efficiently in R?

M=10 #total number rows
N=10 #total number columns
m=6 #number valid rows
n=4 #number valid columns
#number of cells to simulate
k=12
ex <- matrix(0,M,N)
#sample m valid rows and n valid columns from uniform
mi <- sample(1:M, m)
ni <- sample(1:N, n)
#get all valid cells (valid rows and columns)
mn_i <- expand.grid(mi, ni)
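#sample k cells from valid cells
#(note: a random draw can leave a valid row or column without any 1,
# as in the output below, where only 5 rows are non-empty)
x <- mn_i[sample(1:nrow(mn_i), k), ]
#update sampled cells using matrix subsetting on ex
ex[as.matrix(x)] <- 1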
# > ex
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
# [1,] 1 0 0 0 0 0 0 0 0 0
# [2,] 0 0 0 0 0 0 0 0 0 0
# [3,] 0 0 0 0 0 0 1 1 0 1
# [4,] 0 0 0 0 0 0 0 0 0 0
# [5,] 1 0 0 0 0 0 1 1 0 1
# [6,] 1 0 0 0 0 0 1 0 0 0
# [7,] 0 0 0 0 0 0 0 0 0 0
# [8,] 0 0 0 0 0 0 0 1 0 1
# [9,] 0 0 0 0 0 0 0 0 0 0
# [10,] 0 0 0 0 0 0 0 0 0 0
You might want to wrap this in a function so that you can call it with something like
ex <- constrained_matrix_sample(M, N, m, n, k)
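One way such a wrapper might look is sketched below. Only the name and argument list come from the call above; the body is an assumption. It adds a "cover" step so that every chosen row and column receives at least one 1, which a plain draw of k cells does not guarantee. It requires max(m, n) <= k <= m*n, and it is not claimed to sample uniformly over all admissible matrices.

```r
# Hypothetical body for constrained_matrix_sample(); only the name and
# the argument list are taken from the answer above.
constrained_matrix_sample <- function(M, N, m, n, k) {
  stopifnot(m >= 1, n >= 1, m <= M, n <= N, k >= max(m, n), k <= m * n)
  mi <- sample.int(M, m)  # valid rows
  ni <- sample.int(N, n)  # valid columns
  L <- max(m, n)
  # Cover step: L distinct cells that touch every valid row and column.
  # The longer of the two index sets is all-distinct, so the pairs are distinct.
  r <- c(seq_len(m), sample.int(m, L - m, replace = TRUE))
  cc <- c(seq_len(n), sample.int(n, L - n, replace = TRUE))
  r <- r[sample.int(L)]            # shuffle the row/column pairing
  cover <- r + (cc - 1L) * m       # position inside the m x n block
  # Fill step: the remaining k - L cells come from the unused block cells.
  rest <- setdiff(seq_len(m * n), cover)
  extra <- if (k > L) rest[sample.int(length(rest), k - L)] else integer(0)
  cells <- c(cover, extra)
  ex <- matrix(0L, M, N)
  ex[cbind(mi[(cells - 1L) %% m + 1L], ni[(cells - 1L) %/% m + 1L])] <- 1L
  ex
}

ex <- constrained_matrix_sample(10, 10, 6, 4, 12)
sum(rowSums(ex) > 0)  # number of non-empty rows
sum(colSums(ex) > 0)  # number of non-empty columns
```

For matrices as large as those mentioned in the question, a dense result may be memory-hungry; the same row/column index pairs could presumably be handed to a sparse matrix representation instead.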

A dplyr variant
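library(dplyr) # provides %>% and mutate()
M = 10 # rows
N = 10 # columns
m = 6 # number of valid rows
n = 4 # number of valid columns
ni = sample(1:N, n)
mi = sample(1:M, m)
# every cell in a valid row AND a valid column is set to 1 (m*n ones in total)
expand.grid(N = 1:N, M = 1:M) %>%
  mutate(value = ifelse(N %in% ni & M %in% mi, 1, 0)) %>%
  .$value %>%
  matrix(., nrow = M, byrow = TRUE)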

Related

How to multiply a set of columns with another set of columns in R?

Consider the following matrix:
set.seed(3)
nn <- 9 # This is always fixed
mm <- 6 # This is always multiple of 3. Other possible values are 9,12,15 etc.
testMat <- matrix(rbinom(nn*mm,1,.5), nrow = nn, ncol = mm)
I am trying to take the product of all possible combinations of the first 3 columns and the next 3 columns. With the help of solutions found online, I can do it in the following way:
testMat1 <- testMat[,1:3]
testMat2 <- testMat[,4:6]
t(sapply(1:nn, function(i) tcrossprod(testMat1[i, ], testMat2[i, ])))
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,] 0 1 1 0 1 1 0 0 0
[2,] 1 1 0 0 0 0 0 0 0
[3,] 0 1 0 0 1 0 0 0 0
[4,] 0 0 0 0 0 0 0 0 0
[5,] 0 0 0 0 0 0 1 1 0
[6,] 0 0 0 1 1 0 0 0 0
[7,] 0 0 0 0 0 0 0 0 0
[8,] 0 0 0 0 0 1 0 0 1
[9,] 0 0 0 0 0 0 1 1 1
However, it requires me to separate the testMat manually. I am trying to automate this process where mm is greater than 6, for example, 9. Can you suggest an efficient way to do this?
I propose this solution to the question I asked. However, I would appreciate it if someone can propose a more elegant solution.
test.WinCombo <- matrix(NA, nrow = nn, ncol = 3^(mm/3))
for (i in 1:nrow(testMat)){
temp.split <- split(testMat[i,],ceiling(seq_along(testMat[i,])/3))
test.WinCombo[i,] <- apply(expand.grid(temp.split), 1, prod)
}
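A vectorised alternative to the row-by-row loop can be sketched as follows. The helper name row_products and its width argument are hypothetical. It builds the column combinations once with expand.grid and multiplies whole column selections, so the column order should match the loop above (the first block of columns varies fastest).

```r
# Hypothetical helper: for each row, the products of one entry taken from
# each consecutive block of `width` columns.
row_products <- function(mat, width = 3) {
  stopifnot(ncol(mat) %% width == 0)
  cols <- seq_len(ncol(mat))
  groups <- split(cols, ceiling(cols / width))
  # One row per combination of columns; the first group varies fastest.
  combos <- as.matrix(expand.grid(groups))
  picked <- lapply(seq_len(ncol(combos)),
                   function(g) mat[, combos[, g], drop = FALSE])
  Reduce(`*`, picked)  # elementwise product across the groups
}

set.seed(3)
nn <- 9
mm <- 9
testMat <- matrix(rbinom(nn * mm, 1, .5), nrow = nn, ncol = mm)
res <- row_products(testMat)
dim(res)  # nn rows, 3^(mm/3) columns
```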

R: create diagonal from 1 to 10 in matrix -> returns only last value

I need to modify a matrix filled with random numbers so that the resulting matrix has a diagonal with the numbers 1 to 10 and everywhere else there are supposed to be zeros.
I am almost done but the diagonal only shows the last value 10 instead of the numbers 1 to 10. I know I need to cache the result somehow but I don't know how to do that.
rand_mat = matrix(sample(1:50, 100, replace = TRUE), nrow=10, ncol=10)
#10x10 matrix filled with random numbers
for (i in 1:nrow(rand_mat)) {
#for each row
for (j in 1:ncol(rand_mat)) {
#for each column
for (o in 1:10){
#go through numbers 1 to 10 for the diagonal
if(i == j){
#if row matches column -> diagonal
rand_mat[i,j] = o
#assign numbers 1 to 10 in the diagonal
}else{
rand_mat[i,j] = 0
#if location is not in the diagonal assign 0
}
}
}
}
This is the current result:
> print(rand_mat)
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 10 0 0 0 0 0 0 0 0 0
[2,] 0 10 0 0 0 0 0 0 0 0
[3,] 0 0 10 0 0 0 0 0 0 0
[4,] 0 0 0 10 0 0 0 0 0 0
[5,] 0 0 0 0 10 0 0 0 0 0
[6,] 0 0 0 0 0 10 0 0 0 0
[7,] 0 0 0 0 0 0 10 0 0 0
[8,] 0 0 0 0 0 0 0 10 0 0
[9,] 0 0 0 0 0 0 0 0 10 0
[10,] 0 0 0 0 0 0 0 0 0 10
>
Your innermost loop is unnecessary. When you are looping through j, you are looping through each cell in row i. So you need only assign the value i to the cell if i == j and 0 otherwise. You don't need the variable o at all.
So a fixed-up version of your code might be:
rand_mat = matrix(sample(1:50, 100, replace = TRUE), nrow=10, ncol=10)
for (i in 1:nrow(rand_mat)) {
#for each row
for (j in 1:ncol(rand_mat)) {
#for each column
if(i == j) {
#if row matches column -> diagonal
rand_mat[i,j] <- i
#assign row number in the diagonal
} else {
rand_mat[i,j] = 0
#if location is not in the diagonal assign 0
}
}
}
rand_mat
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
#> [1,] 1 0 0 0 0 0 0 0 0 0
#> [2,] 0 2 0 0 0 0 0 0 0 0
#> [3,] 0 0 3 0 0 0 0 0 0 0
#> [4,] 0 0 0 4 0 0 0 0 0 0
#> [5,] 0 0 0 0 5 0 0 0 0 0
#> [6,] 0 0 0 0 0 6 0 0 0 0
#> [7,] 0 0 0 0 0 0 7 0 0 0
#> [8,] 0 0 0 0 0 0 0 8 0 0
#> [9,] 0 0 0 0 0 0 0 0 9 0
#> [10,] 0 0 0 0 0 0 0 0 0 10
Or, more succinctly:
for (i in 1:nrow(rand_mat))
for (j in 1:ncol(rand_mat))
rand_mat[i, j] <- if(i == j) i else 0
Created on 2022-05-14 by the reprex package (v2.0.1)
Using only functions you used in your question:
rand_mat = matrix(sample(1:50, 100, replace = TRUE), nrow=10, ncol=10)
rand_mat <- matrix(0, 10, 10) # Make all the elements 0.
for(i in 1:10) rand_mat[i, i] <- i # Replace the diagonal elements
Although this is not an efficient way to do this (as noted in the comments), you can fix your code by removing the innermost loop and replacing it with an assignment:
for (i in 1:nrow(rand_mat)) {
for (j in 1:ncol(rand_mat)) {
rand_mat[i,j] = if(i == j) i else 0
}
}
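For comparison, a loop-free sketch using base R's diag() is shown here. The variable names n and diag_mat are only illustrative. It shows both building the matrix directly and overwriting an existing one.

```r
n <- 10
rand_mat <- matrix(sample(1:50, n * n, replace = TRUE), nrow = n, ncol = n)

# Option 1: build the diagonal matrix directly.
diag_mat <- diag(seq_len(n))

# Option 2: overwrite the existing matrix in place.
rand_mat[] <- 0                # keep the dimensions, zero every cell
diag(rand_mat) <- seq_len(n)   # then set the diagonal to 1..n
```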
I got it thanks to @dcarlson
Their code was definitely a lot shorter, but I really wanted to use my own code to see what I was doing wrong. It turns out I had to write rand_mat[o,o] = o instead of rand_mat[i,j] = o in the if-else block. So the code now looks like this:
rand_mat = matrix(sample(1:50, 100, replace = TRUE), nrow=10, ncol=10)
#10x10 matrix filled with random numbers
for (i in 1:nrow(rand_mat)) {
#for each row
for (j in 1:ncol(rand_mat)) {
#for each column
for (o in 1:10){
#go through numbers 1 to 10 for the diagonal
if(i == j){
#if row matches column -> diagonal
rand_mat[o,o] = o
#assign numbers 1 to 10 in the diagonal
}else{
rand_mat[i,j] = 0
#if location is not in the diagonal assign 0
}
}
}
}
and the result looks the same as when I tried it with the code from @dcarlson:
> print(rand_mat)
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 1 0 0 0 0 0 0 0 0 0
[2,] 0 2 0 0 0 0 0 0 0 0
[3,] 0 0 3 0 0 0 0 0 0 0
[4,] 0 0 0 4 0 0 0 0 0 0
[5,] 0 0 0 0 5 0 0 0 0 0
[6,] 0 0 0 0 0 6 0 0 0 0
[7,] 0 0 0 0 0 0 7 0 0 0
[8,] 0 0 0 0 0 0 0 8 0 0
[9,] 0 0 0 0 0 0 0 0 9 0
[10,] 0 0 0 0 0 0 0 0 0 10
>
But as I learnt from @Allan Cameron, you don't need the variable o, so you don't even need the third for-loop:
rand_mat = matrix(sample(1:50, 100, replace = TRUE), nrow=10, ncol=10)
for (i in 1:nrow(rand_mat)) {
#for each row
for (j in 1:ncol(rand_mat)) {
#for each column
if(i == j) {
#if row matches column -> diagonal
rand_mat[i,j] <- i
#assign row number in the diagonal
} else {
rand_mat[i,j] = 0
#if location is not in the diagonal assign 0
}
}
}
The result is the same. Thank you for your help!

How can I make a loop in R that produces these matrices?

I am trying to solve large scale assignment problems with Gurobi in R. I need a loop that will produce the constraint matrices for any n I specify since I will not be able to manually enter them for very large problems. I pasted sample matrices for n=2 and n=3 and also the code I have come up with for n=2. I need the n-i part to continue as 1, 2, 3, 4, etc. but each new row needs to be cumulative. I know I have a long way to go and I am very new to R. Any help would be appreciated, thanks.
n=2
1 1 0 0
0 0 1 1
1 0 1 0
0 1 0 1
n=3
1 1 1 0 0 0 0 0 0
0 0 0 1 1 1 0 0 0
0 0 0 0 0 0 1 1 1
1 0 0 1 0 0 1 0 0
0 1 0 0 1 0 0 1 0
0 0 1 0 0 1 0 0 1
library("gurobi")
model <- list()
n=2
i=0
while (i <= n-2) {
print(i)
i = i+1
}
i
a=rep(1,n)
b=rep(0,(n-i)*n)
c=rep(0,n)
d=rep(1,n)
e=rep(0,(n-i)*n)
f=rep(1:0, times=n)
g=rep(0:1, times=n)
model$A <- matrix(c(a,b,c,d,e,f,g), nrow=4, ncol=4, byrow=T)
model$obj <- c(1,2,3,4)
model$modelsense <- "min"
model$rhs <- c(1,1,1,1)
model$sense <- c('=', '=','=','=')
model$vtype <- 'B'
params <- list(OutputFlag=0)
result <- gurobi(model, params)
print('Solution:')
print(result$objval)
print(result$x)
Use Kronecker products (the %x% operator) as shown:
make_mat <- function(k) {
d <- diag(k)
ones <- t(rep(1, k))
rbind( d %x% ones, ones %x% d )
}
lapply(2:3, make_mat)
giving:
[[1]]
[,1] [,2] [,3] [,4]
[1,] 1 1 0 0
[2,] 0 0 1 1
[3,] 1 0 1 0
[4,] 0 1 0 1
[[2]]
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,] 1 1 1 0 0 0 0 0 0
[2,] 0 0 0 1 1 1 0 0 0
[3,] 0 0 0 0 0 0 1 1 1
[4,] 1 0 0 1 0 0 1 0 0
[5,] 0 1 0 0 1 0 0 1 0
[6,] 0 0 1 0 0 1 0 0 1
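To connect this back to the question, the sketch below assembles the model list for an arbitrary n from make_mat(). The field names are the ones used in the question's code. The cost vector seq_len(n^2) is only a placeholder assumption that extends the question's c(1,2,3,4), and the gurobi() call itself is left out so that the snippet runs without the solver installed.

```r
make_mat <- function(k) {
  d <- diag(k)
  ones <- t(rep(1, k))
  rbind(d %x% ones, ones %x% d)
}

n <- 4
model <- list()
model$A <- make_mat(n)               # 2n constraints on n^2 variables
model$obj <- seq_len(n^2)            # placeholder costs (assumption)
model$modelsense <- "min"
model$rhs <- rep(1, 2 * n)           # each constraint sums to exactly 1
model$sense <- rep("=", 2 * n)
model$vtype <- "B"
dim(model$A)
```

Each of the first n rows sums the variables of one block of n consecutive columns, and each of the last n rows sums the variables at the same position within every block, which is why every column of A contains exactly two ones.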

Creating matrix with probabilities depending on index column

How do I create a 10x10 matrix containing only 1s (heads) and 0s (tails), where the probability of heads is 1 divided by the column index?
I tried several things, but none of them worked, which is really frustrating. I tried to do it with a vector and a for loop.
mat <- matrix(sample(c(0,1), 100, replace=TRUE, prob=c(1/h, 1-(1/h)), 10))
But now the only question is how to define h.
Here is an option using sapply
n_col <- 10
n_row <- 10
set.seed(1)
# column x is drawn separately with P(1) = 1/x; sapply binds the columns
mat <- sapply(1:n_col, function(x) {
  sample(x = c(1, 0),
         size = n_row,
         replace = TRUE,
         prob = c(1/x, 1 - 1/x))
})
mat
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
# [1,] 1 0 1 0 1 0 1 0 0 0
# [2,] 1 0 0 0 0 1 0 0 0 0
# [3,] 1 1 0 0 0 0 0 0 0 0
# [4,] 1 0 0 0 0 0 0 0 0 0
# [5,] 1 1 0 1 0 0 0 0 0 0
# [6,] 1 0 0 0 0 0 0 1 0 0
# [7,] 1 1 0 1 0 0 0 0 0 0
# [8,] 1 1 0 0 0 0 0 0 0 0
# [9,] 1 0 1 0 0 0 0 0 0 0
#[10,] 1 1 0 0 0 0 1 1 0 0
Hope it helps.
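A fully vectorised variant is also possible, sketched here with rbinom(), whose prob argument is recycled across draws. Because matrix() fills column by column, repeating each column's probability n_row times lines the probabilities up with the columns. The random stream differs from the sapply version, so the same seed will not reproduce the matrix above.

```r
n_col <- 10
n_row <- 10
set.seed(1)
# P(heads) for every cell, in column-major order: 1/1 ten times, 1/2 ten times, ...
p <- rep(1 / seq_len(n_col), each = n_row)
mat <- matrix(rbinom(n_row * n_col, size = 1, prob = p),
              nrow = n_row, ncol = n_col)
colMeans(mat)  # observed share of heads per column
```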

Create a binary adjacency matrix from a vector of indices

Suppose I have a vector that looks like this:
x <- sample(5, 500, replace = TRUE)
so that each element corresponds to some index from 1 through 5.
What's an efficient way to create a binary adjacency matrix from this vector? To elaborate, the matrix A should be such that A[i,j] = 1 if x[i] = x[j] and 0 otherwise.
In one line, you could do
outer(x, x, function(x, y) as.integer(x==y))
which returns
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 1 0 0 0 0 0 1 0 0 0
[2,] 0 1 1 1 0 1 0 0 1 0
[3,] 0 1 1 1 0 1 0 0 1 0
[4,] 0 1 1 1 0 1 0 0 1 0
[5,] 0 0 0 0 1 0 0 0 0 0
[6,] 0 1 1 1 0 1 0 0 1 0
[7,] 1 0 0 0 0 0 1 0 0 0
[8,] 0 0 0 0 0 0 0 1 0 0
[9,] 0 1 1 1 0 1 0 0 1 0
[10,] 0 0 0 0 0 0 0 0 0 1
or, in two lines
myMat <- outer(x, x, "==")
myMat[] <- as.integer(myMat)
Check that they're the same.
identical(myMat, outer(x, x, function(x, y) as.integer(x==y)))
[1] TRUE
data
set.seed(1234)
x <- sample(5, 10, replace = TRUE)
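Another route, sketched here as an assumption rather than a benchmarked recommendation, goes through an indicator (one-hot) matrix: if Z[i, g] is 1 when x[i] equals g, then Z %*% t(Z) has a 1 exactly where two elements share a value. Whether this beats outer() in practice depends on the vector length and the number of distinct values.

```r
set.seed(1234)
x <- sample(5, 10, replace = TRUE)

# n x 5 indicator matrix: row i has a single 1 in column x[i]
Z <- outer(x, seq_len(5), "==") * 1L
# A[i, j] = sum over g of Z[i, g] * Z[j, g], which is 1 iff x[i] == x[j]
A <- tcrossprod(Z)
all(A == outer(x, x, "=="))
```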
