I would like to transform a vector of integers such as:
vector = c(0,6,1,8,5,4,2)
length(vector) = 7
max(vector) = 8
into a matrix m of nrow = length(vector) and ncol = max(vector) :
m =
0 0 0 0 0 0 0 0
1 1 1 1 1 1 0 0
1 0 0 0 0 0 0 0
1 1 1 1 1 1 1 1
1 1 1 1 1 0 0 0
1 1 1 1 0 0 0 0
1 1 0 0 0 0 0 0
This is just an example of what I am trying to do. I want the function to work with any vector of non-negative integers.
I tried to use mapply(rep, 1, vector), but I obtained a list and did not manage to convert it into a matrix.
It would be very useful to me if someone could help.
Best Regards,
Maxime
If you apply c(rep(1, x), rep(0, max(vector) - x)) to each element x of your variable vector, you get the desired binary rows. Looping over the elements with sapply returns a matrix directly, with one column per element, so you only need to transpose it afterwards to get your result.
vector = c(0,6,1,8,5,4,2)
result <- t(sapply(vector, function(x) c(rep(1, x), rep(0, max(vector)-x))))
is.matrix(result)
#> [1] TRUE
result
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
#> [1,] 0 0 0 0 0 0 0 0
#> [2,] 1 1 1 1 1 1 0 0
#> [3,] 1 0 0 0 0 0 0 0
#> [4,] 1 1 1 1 1 1 1 1
#> [5,] 1 1 1 1 1 0 0 0
#> [6,] 1 1 1 1 0 0 0 0
#> [7,] 1 1 0 0 0 0 0 0
Putting that into a function is easy:
binaryMatrix <- function(v) {
t(sapply(v, function(x) c(rep(1, x), rep(0, max(v)-x))))
}
binaryMatrix(vector)
# same result as before
Created on 2021-02-14 by the reprex package (v1.0.0)
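The same rows can also be produced without an explicit loop. The following is a minimal sketch, assuming the input holds non-negative integers; the name binary_matrix_outer is made up for illustration and does not come from the answer above. It compares every element with every column number using outer().

```r
# Hypothetical helper name; not part of the original answer.
binary_matrix_outer <- function(v) {
  # Entry [i, j] is TRUE when column j lies within the first v[i] columns.
  # The unary + converts the logical matrix to 0/1 integers.
  +(outer(v, seq_len(max(v)), ">="))
}

v <- c(0, 6, 1, 8, 5, 4, 2)
m <- binary_matrix_outer(v)
dim(m)       # number of rows and columns
rowSums(m)   # each row sum should equal the corresponding element of v
```

Because each row contains exactly v[i] ones, the row sums give a quick sanity check against the input.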
Another straightforward approach would be to exploit matrix sub-assignment using row/column indices in a matrix form (see, also, ?Extract).
Define a matrix of 0s:
x = c(0, 6, 1, 8, 5, 4, 2)
m = matrix(0L, nrow = length(x), ncol = max(x))
And fill with 1s:
i = rep(seq_along(x), x) ## row indices of 1s
j = sequence(x) ## column indices of 1s
ij = cbind(i, j)
m[ij] = 1L
m
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
#[1,] 0 0 0 0 0 0 0 0
#[2,] 1 1 1 1 1 1 0 0
#[3,] 1 0 0 0 0 0 0 0
#[4,] 1 1 1 1 1 1 1 1
#[5,] 1 1 1 1 1 0 0 0
#[6,] 1 1 1 1 0 0 0 0
#[7,] 1 1 0 0 0 0 0 0
Assuming that all values in the vector are non-negative integers, you can define the following function
transformVectorToMatrix <- function(v) {
nrOfCols <- max(v)
zeroRow <- integer(nrOfCols)
do.call("rbind",lapply(v,function(nrOfOnes) {
if(nrOfOnes==0) return(zeroRow)
if(nrOfOnes==nrOfCols) return(zeroRow+1)
c(integer(nrOfOnes)+1,integer(nrOfCols-nrOfOnes))
}))
}
and finally do
m = transformVectorToMatrix(vector)
to get your desired binary matrix.
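As a quick check, the three approaches above can be compared on the example vector. This is only a sketch: the variable names are made up, and the third variant is a condensed form of the rbind idea rather than the exact function shown above.

```r
v <- c(0, 6, 1, 8, 5, 4, 2)

# Approach 1: sapply over the elements, then transpose.
by_sapply <- t(sapply(v, function(x) c(rep(1, x), rep(0, max(v) - x))))

# Approach 2: matrix sub-assignment with a two-column index matrix.
by_index <- matrix(0L, nrow = length(v), ncol = max(v))
by_index[cbind(rep(seq_along(v), v), sequence(v))] <- 1L

# Approach 3 (condensed): build each row and bind the rows together.
by_rbind <- do.call(rbind, lapply(v, function(k) {
  c(integer(k) + 1L, integer(max(v) - k))
}))

# Compare values rather than storage mode (double vs. integer).
all(by_sapply == by_index)
all(by_index == by_rbind)
```

The comparison uses == instead of identical() because the sapply version stores doubles while the other two store integers.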
I have a column vector in a dataframe and would like to turn it into a binary matrix so I can do matrix multiplication with it later on.
y_labels
1
4
4
3
desired output
1 0 0 0
0 0 0 1
0 0 0 1
0 0 1 0
In Octave I would do something like y_matrix = (y_labels == [1 2 3 4]). However, I can't figure out how to get this in R. Anybody know how?
We can use model.matrix to change it to binary
model.matrix(~ -1 + factor(y_labels, levels = 1:4), df1)
or with table
with(df1, table(1:nrow(df1), factor(y_labels, levels = 1:4)))
# 1 2 3 4
# 1 1 0 0 0
# 2 0 0 0 1
# 3 0 0 0 1
# 4 0 0 1 0
Or more compactly
+(sapply(1:4, `==`, df1$y_labels))
# [,1] [,2] [,3] [,4]
#[1,] 1 0 0 0
#[2,] 0 0 0 1
#[3,] 0 0 0 1
#[4,] 0 0 1 0
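The Octave expression from the question, y_labels == [1 2 3 4], relies on broadcasting. A close base R counterpart is outer(), sketched here on a plain vector rather than the data frame column; the variable name y_matrix is taken from the question.

```r
y_labels <- c(1, 4, 4, 3)

# outer() applies "==" to every (label, class) pair, much like the
# Octave broadcast; the unary + turns TRUE/FALSE into 1/0.
y_matrix <- +(outer(y_labels, 1:4, "=="))
y_matrix
```

Passing the class values explicitly (1:4) keeps a column for class 2 even though it never occurs in the data.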
Here's another option:
Start by creating a matrix of zeros:
m <- matrix(0, nrow = nrow(df), ncol = max(df$y_labels))
Then insert 1s at the correct positions:
m[col(m) == df$y_labels] <- 1
The result is:
[,1] [,2] [,3] [,4]
[1,] 1 0 0 0
[2,] 0 0 0 1
[3,] 0 0 0 1
[4,] 0 0 1 0
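The comparison col(m) == df$y_labels works because of vector recycling. The sketch below spells out the intermediate steps, assuming the same four labels as in the question; the name hit is made up for illustration.

```r
df <- data.frame(y_labels = c(1, 4, 4, 3))
m <- matrix(0, nrow = nrow(df), ncol = max(df$y_labels))

# col(m) holds the column number of every cell.
col(m)

# The label vector has one entry per row, so it is recycled down each
# column: the cell in row i, column j is compared with y_labels[i].
hit <- col(m) == df$y_labels
m[hit] <- 1
m
```

This relies on the label vector having exactly nrow(m) elements; with a different length the recycling would no longer line up with the rows.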
How about (where vec is your numeric vector):
m <- matrix(0, length(vec), max(vec))
m[cbind(seq_along(vec), vec)] <- 1
# [,1] [,2] [,3] [,4]
#[1,] 1 0 0 0
#[2,] 0 0 0 1
#[3,] 0 0 0 1
#[4,] 0 0 1 0
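Since the question mentions matrix multiplication as the next step, here is a small sketch of what the one-hot matrix does in a product. The matrix W is invented purely for illustration.

```r
vec <- c(1, 4, 4, 3)
m <- matrix(0, length(vec), max(vec))
m[cbind(seq_along(vec), vec)] <- 1

# W is a made-up 4 x 2 matrix used only for illustration.
W <- matrix(1:8, nrow = 4)

# Each row of m has a single 1, so m %*% W picks out row vec[i] of W.
picked <- m %*% W
all(picked == W[vec, ])
```

If only the row selection is needed, indexing with W[vec, ] avoids building the binary matrix at all.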
In base R:
df1 <- data.frame(y_labels = c(1,4,4,3))
t(sapply(df1$y_labels,function(x) c(rep(0,x-1),1,rep(0,max(df1$y_labels)-x))))
or
t(sapply(df1$y_labels,function(x) `[<-`(numeric(max(df1$y_labels)),x,1)))
output:
# [,1] [,2] [,3] [,4]
# [1,] 1 0 0 0
# [2,] 0 0 0 1
# [3,] 0 0 0 1
# [4,] 0 0 1 0
Let's start with the following matrix.
M <- matrix(c(0,0,0,1,0,0,1,1,
0,0,1,0,0,1,1,0,
0,0,0,0,0,1,1,1,
0,0,0,1,1,0,1,0,
0,0,0,1,1,1,0,0,
0,0,1,0,1,0,0,1),nrow = 8,ncol = 6)
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 0 0 0 0 0 0
[2,] 0 0 0 0 0 0
[3,] 0 1 0 0 0 1
[4,] 1 0 0 1 1 0
[5,] 0 0 0 1 1 1
[6,] 0 1 1 0 1 0
[7,] 1 1 1 1 0 0
[8,] 1 0 1 0 0 1
I want to obtain a set of matrices by switching ones and zeros. For each column, starting from column 1, I want to switch the 1 in (4,1) with each of the 0s in (1,1), (2,1), (3,1), (5,1), (6,1), and then do the same for the 1s in (7,1) and (8,1). Then I continue with the other columns. Altogether there are 90 matrices after switching (15 for each column, 15*6). This is just an example; I have bigger matrices. How do I generalize this to other cases?
Here's a solution. You could wrap the whole thing up into a function. It produces a list of lists of matrices, results, where results[[i]] is a list of matrices with the ith column switched.
column_switcher = function(x) {
ones = which(x == 1)
zeros = which(x == 0)
results = matrix(rep(x, length(ones) * length(zeros)), nrow = length(x))
counter = 1
for (one in ones) {
for (zero in zeros) {
results[one, counter] = 0
results[zero, counter] = 1
counter = counter + 1
}
}
return(results)
}
switched = lapply(1:ncol(M), function(col) column_switcher(M[, col]))
results = lapply(seq_along(switched), function(m_col) {
lapply(1:ncol(switched[[m_col]]), function(i) {
M[, m_col] = switched[[m_col]][, i]
return(M)
})
})
results[[1]]
# [[1]]
# [,1] [,2] [,3] [,4] [,5] [,6]
# [1,] 1 0 0 0 0 0
# [2,] 0 0 0 0 0 0
# [3,] 0 1 0 0 0 1
# [4,] 0 0 0 1 1 0
# [5,] 0 0 0 1 1 1
# [6,] 0 1 1 0 1 0
# [7,] 1 1 1 1 0 0
# [8,] 1 0 1 0 0 1
#
# [[2]]
# [,1] [,2] [,3] [,4] [,5] [,6]
# [1,] 0 0 0 0 0 0
# [2,] 1 0 0 0 0 0
# [3,] 0 1 0 0 0 1
# [4,] 0 0 0 1 1 0
# [5,] 0 0 0 1 1 1
# [6,] 0 1 1 0 1 0
# [7,] 1 1 1 1 0 0
# [8,] 1 0 1 0 0 1
#
# ...
Checking the length of the list and the lengths of the sublists, they're all there.
length(results)
# [1] 6
lengths(results)
# [1] 15 15 15 15 15 15
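The expected counts can also be derived directly from the matrix, which may help when generalizing to larger inputs. This is a small sketch using the M from the question; the variable names are made up.

```r
M <- matrix(c(0,0,0,1,0,0,1,1,
              0,0,1,0,0,1,1,0,
              0,0,0,0,0,1,1,1,
              0,0,0,1,1,0,1,0,
              0,0,0,1,1,1,0,0,
              0,0,1,0,1,0,0,1), nrow = 8, ncol = 6)

# Each swap pairs one 1 with one 0 in the same column, so a column
# with a ones and b zeros yields a * b switched versions.
ones  <- colSums(M == 1)
zeros <- colSums(M == 0)
per_column <- ones * zeros
per_column
sum(per_column)
```

For this M every column has 3 ones and 5 zeros, which gives the 15 matrices per column and 90 in total mentioned in the question.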
Consider the 8 by 6 binary matrix, M:
M <- matrix(c(0,0,1,1,0,0,1,1,
0,1,1,0,0,1,1,0,
0,0,0,0,1,1,1,1,
0,1,0,1,1,0,1,0,
0,0,1,1,1,1,0,0,
0,1,1,0,1,0,0,1),nrow = 8,ncol = 6)
Here is the M
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 0 0 0 0 0 0
[2,] 0 1 0 1 0 1
[3,] 1 1 0 0 1 1
[4,] 1 0 0 1 1 0
[5,] 0 0 1 1 1 1
[6,] 0 1 1 0 1 0
[7,] 1 1 1 1 0 0
[8,] 1 0 1 0 0 1
The following matrix contains the row indices of the 1s in each column of matrix M
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 3 2 5 2 3 2
[2,] 4 3 6 4 4 3
[3,] 7 6 7 5 5 5
[4,] 8 7 8 7 6 8
Let's denote that
ind <- matrix(c(3,4,7,8,
2,3,6,7,
5,6,7,8,
2,4,5,7,
3,4,5,6,
2,3,5,8),nrow = 4, ncol=6)
I'm trying to change a single 1 into a 0 in each column of M.
For example, one possible choice of the position of the 1 to change in each column would be (4,2,5,4,3,2), i.e. the 4th position of Column 1, the 2nd position of Column 2, the 5th position of Column 3 and so on. This produces the following matrix N:
N <- matrix(c(0,0,1,0,0,0,1,1,
0,0,1,0,0,1,1,0,
0,0,0,0,0,1,1,1,
0,1,0,0,1,0,1,0,
0,0,0,1,1,1,0,0,
0,0,1,0,1,0,0,1),nrow = 8,ncol = 6)
Here is that N
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 0 0 0 0 0 0
[2,] 0 0 0 1 0 0
[3,] 1 1 0 0 0 1
[4,] 0 0 0 0 1 0
[5,] 0 0 0 1 1 1
[6,] 0 1 1 0 1 0
[7,] 1 1 1 1 0 0
[8,] 1 0 1 0 0 1
For each of the resulting matrices N, I do the following calculations.
X <- cbind(c(rep(1,nrow(N))),N)
ans <- sum(diag(solve(t(X)%*%X)[-1,-1]))
Then, I want to obtain the matrix N that produces the smallest value of ans. How do I do this efficiently?
Let me know if this works.
We first build a conversion function that I'll need, and we build also the reverse function as you may need it at some point:
ind_to_M <- function(ind){
M <- matrix(rep(0,6*8),ncol=6)
for(i in 1:ncol(ind)){M[ind[,i],i] <- 1}
return(M)
}
M_to_ind <- function(M){apply(M==1,2,which)}
Then we will build a matrix of possible ways to ditch a value
all_possible_ways_to_ditch_value <- 1:4
for (i in 2:ncol(M)){
all_possible_ways_to_ditch_value <- merge(all_possible_ways_to_ditch_value,1:4,by=NULL)
}
# there's probably a more elegant way to do that
head(all_possible_ways_to_ditch_value)
# x y.x y.y y.x y.y y
# 1 1 1 1 1 1 1 # will be used to ditch the 1st value of ind for every column
# 2 2 1 1 1 1 1
# 3 3 1 1 1 1 1
# 4 4 1 1 1 1 1
# 5 1 2 1 1 1 1
# 6 2 2 1 1 1 1
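Regarding the more elegant way mentioned in the comment: one option is expand.grid(), which accepts a list of vectors. This is a sketch, assuming 6 columns with 4 candidate positions each as in the example; the names n_cols and ways are made up, and the column names differ from the merge() result (Var1 to Var6).

```r
# One vector 1:4 per column of M (6 columns in the example).
n_cols <- 6
ways <- expand.grid(rep(list(1:4), n_cols))

dim(ways)      # 4^6 combinations, one column per column of M
head(ways, 3)  # the first column varies fastest, as with merge()
```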
Then we iterate through those, each time storing ans and N (as data is quite small overall).
ans_list <- list()
N_list <- list()
for(j in 1:nrow(all_possible_ways_to_ditch_value)){
#print(j)
ind_N <- matrix(rep(0,6*3),ncol=6) # initiate ind_N as an empty matrix
for(i in 1:ncol(M)){
ind_N[,i] <- ind[-all_possible_ways_to_ditch_value[j,i],i] # fill with ind except for the value we ditch
}
N <- ind_to_M(ind_N)
X <- cbind(c(rep(1,nrow(N))),N)
ans_list[[j]] <- try(sum(diag(solve(t(X)%*%X)[-1,-1])),silent=TRUE) # some systems are not well defined, we'll just ignore the errors
N_list[[j]] <- N
}
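The try() call above stores an error object in ans_list whenever solve() fails. An alternative, sketched here with a hypothetical helper named score_matrix, is to return NA for those cases so that the list stays numeric. The two small test matrices are invented for illustration.

```r
# Hypothetical helper: returns NA_real_ when t(X) %*% X cannot be inverted.
score_matrix <- function(N) {
  X <- cbind(1, N)                      # prepend an intercept column
  tryCatch(
    sum(diag(solve(crossprod(X))[-1, -1, drop = FALSE])),
    error = function(e) NA_real_
  )
}

# A made-up matrix with full column rank after adding the intercept.
good <- rbind(c(1, 0), c(0, 1), c(0, 0), c(1, 1))
# A made-up matrix of zeros, which makes t(X) %*% X exactly singular.
bad <- matrix(0, nrow = 4, ncol = 2)

score_matrix(good)
score_matrix(bad)
```

Note that this only catches systems for which solve() raises an error. Nearly singular systems can still return very large or negative values, as the -3.60288e+15 result further down suggests.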
We finally retrieve the minimal ans and the relevant N
ans <- ans_list[[which.min(ans_list)]]
# [1] -3.60288e+15
N <- N_list[[which.min(ans_list)]]
# [,1] [,2] [,3] [,4] [,5] [,6]
# [1,] 0 0 0 0 0 0
# [2,] 0 1 0 1 0 1
# [3,] 1 1 0 0 1 1
# [4,] 1 0 0 1 1 0
# [5,] 0 0 1 1 1 1
# [6,] 0 1 1 0 0 0
# [7,] 1 0 1 0 0 0
# [8,] 0 0 0 0 0 0
EDIT:
To get the ans closest to zero in absolute value (which is the minimal positive ans here), with failed solves treated as Inf
ans_list[which(!sapply(ans_list,is.numeric))] <- Inf
ans <- ans_list[[which.min(abs(unlist(ans_list)))]]
# [1] 3.3
N <- N_list[[which.min(abs(unlist(ans_list)))]]
# [,1] [,2] [,3] [,4] [,5] [,6]
# [1,] 0 0 0 0 0 0
# [2,] 0 1 0 1 0 0
# [3,] 1 1 0 0 0 1
# [4,] 1 0 0 0 1 0
# [5,] 0 0 0 1 1 1
# [6,] 0 1 1 0 1 0
# [7,] 1 0 1 1 0 0
# [8,] 0 0 1 0 0 1
EDIT 2 : to generalize the number of rows of ind to ditch
It seems to give the same result for ans for n_ditch = 1, and results make sense for n_ditch = 2
n_ditch <- 2
ditch_possibilities <- combn(1:4,n_ditch) # these are all the possible sets of indices to ditch for one given column
all_possible_ways_to_ditch_value <- 1:ncol(ditch_possibilities) # this will be all the possible sets of indices of ditch_possibilities to test
for (i in 2:ncol(M)){
all_possible_ways_to_ditch_value <- merge(all_possible_ways_to_ditch_value,1:ncol(ditch_possibilities),by=NULL)
}
ans_list <- list()
N_list <- list()
for(j in 1:nrow(all_possible_ways_to_ditch_value)){
#print(j)
ind_N <- matrix(rep(0,6*(4-n_ditch)),ncol=6) # initiate ind_N as an empty matrix
for(i in 1:ncol(M)){
ind_N[,i] <- ind[-ditch_possibilities[,all_possible_ways_to_ditch_value[j,i]],i] # fill with ind except for the value we ditch
}
N <- ind_to_M(ind_N)
X <- cbind(c(rep(1,nrow(N))),N)
ans_list[[j]] <- try(sum(diag(solve(t(X)%*%X)[-1,-1])),silent=TRUE) # some systems are not well defined, we'll just ignore the errors
N_list[[j]] <- N
}
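Before running the loop for a larger n_ditch, it may be worth estimating how many candidate matrices it will visit. The sketch below assumes 4 ones per column and 6 columns, as in the example; the variable names are made up.

```r
n_rows_ind <- 4   # ones per column in the example
n_cols     <- 6   # columns of M in the example

# choose(4, n_ditch) index sets per column, combined across 6 columns.
candidates <- sapply(1:3, function(n_ditch) choose(n_rows_ind, n_ditch)^n_cols)
names(candidates) <- paste0("n_ditch=", 1:3)
candidates
```

The count peaks at n_ditch = 2 here, so the exhaustive search grows quickly for bigger matrices.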
I'm trying to create a function where at every time step in a matrix, the cells adjacent and diagonal to a 1 become 1 as well.
For example, something like this:
Input
0 0 0 0 0
0 1 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
Output after first time step
1 1 1 0 0
1 1 1 0 0
1 1 1 0 0
0 0 0 0 0
0 0 0 0 0
So far, I have this:
A = matrix(0,nrow=5,ncol=5)
A[2,2]=1
for (i in 1:5){
for (j in 1:5){
if ((A[i,j]==1)) {
A[,(j+1)]=1
A[,(j-1)]=1
A[(i+1),]=1
A[(i-1),]=1
A[(i+1),(j+1)]=1
A[(i+1),(j-1)]=1
A[(i-1),(j+1)]=1
A[(i-1),(j-1)]=1
}
}
}
I'm not too sure how to integrate a function in there, so I can have the resulting matrix for whatever time step I want.
You could determine if a bit is set either in the matrix or the matrix when it is shifted in any of the 8 legitimate directions (right, left, up, down, up-right, down-right, down-left, up-left):
spread <- function(X) unname(X |
rbind(FALSE, head(X, -1)) |
rbind(tail(X, -1), FALSE) |
cbind(FALSE, X[,-ncol(X)]) |
cbind(X[,-1], FALSE) |
cbind(FALSE, rbind(FALSE, head(X, -1))[,-ncol(X)]) |
cbind(rbind(FALSE, head(X, -1))[,-1], FALSE) |
cbind(FALSE, rbind(tail(X, -1), FALSE)[,-ncol(X)]) |
cbind(rbind(tail(X, -1), FALSE)[,-1], FALSE)) * 1
X <- matrix(rep(c(0, 1, 0), c(6, 1, 18)), nrow=5)
spread(X)
# [,1] [,2] [,3] [,4] [,5]
# [1,] 1 1 1 0 0
# [2,] 1 1 1 0 0
# [3,] 1 1 1 0 0
# [4,] 0 0 0 0 0
# [5,] 0 0 0 0 0
You can apply the function repeatedly to further spread the data:
spread(spread(X))
# [,1] [,2] [,3] [,4] [,5]
# [1,] 1 1 1 1 0
# [2,] 1 1 1 1 0
# [3,] 1 1 1 1 0
# [4,] 1 1 1 1 0
# [5,] 0 0 0 0 0
spread(spread(spread(X)))
# [,1] [,2] [,3] [,4] [,5]
# [1,] 1 1 1 1 1
# [2,] 1 1 1 1 1
# [3,] 1 1 1 1 1
# [4,] 1 1 1 1 1
# [5,] 1 1 1 1 1
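To get the matrix for an arbitrary time step without nesting calls by hand, the function can be applied in a loop. This is a minimal sketch; spread_n is a made-up wrapper name, and spread is repeated here only so that the block runs on its own.

```r
# Same shifting idea as above, repeated so this block is self-contained.
spread <- function(X) unname(X |
  rbind(FALSE, head(X, -1)) |
  rbind(tail(X, -1), FALSE) |
  cbind(FALSE, X[, -ncol(X)]) |
  cbind(X[, -1], FALSE) |
  cbind(FALSE, rbind(FALSE, head(X, -1))[, -ncol(X)]) |
  cbind(rbind(FALSE, head(X, -1))[, -1], FALSE) |
  cbind(FALSE, rbind(tail(X, -1), FALSE)[, -ncol(X)]) |
  cbind(rbind(tail(X, -1), FALSE)[, -1], FALSE)) * 1

# Hypothetical wrapper: apply spread() n times (n = 0 returns X unchanged).
spread_n <- function(X, n) {
  for (step in seq_len(n)) X <- spread(X)
  X
}

X <- matrix(rep(c(0, 1, 0), c(6, 1, 18)), nrow = 5)
spread_n(X, 2)
```

Using seq_len(n) rather than 1:n keeps the loop from running when n is 0.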
This works for multiple 1's in the initial matrix that also can be in the first/last column/row.
A <- matrix(0, nrow = 5, ncol = 5)
A[2, 2] <- 1
A[5, 5] <- 1
A
# [,1] [,2] [,3] [,4] [,5]
# [1,] 0 0 0 0 0
# [2,] 0 1 0 0 0
# [3,] 0 0 0 0 0
# [4,] 0 0 0 0 0
# [5,] 0 0 0 0 1
spread <- function(x) {
idx <- do.call(rbind, apply(which(x == 1, arr.ind = TRUE), 1,
function(y) expand.grid(y[1] + 1:-1, y[2] + 1:-1)))
idx <- idx[!(idx[, 1] %in% c(0, nrow(x) + 1) | idx[, 2] %in% c(0, ncol(x) + 1)), ]
x[as.matrix(idx)] <- 1
x
}
spread(A)
# [,1] [,2] [,3] [,4] [,5]
# [1,] 1 1 1 0 0
# [2,] 1 1 1 0 0
# [3,] 1 1 1 0 0
# [4,] 0 0 0 1 1
# [5,] 0 0 0 1 1
spread(spread(A))
# [,1] [,2] [,3] [,4] [,5]
# [1,] 1 1 1 1 0
# [2,] 1 1 1 1 0
# [3,] 1 1 1 1 1
# [4,] 1 1 1 1 1
# [5,] 0 0 1 1 1
Edit:
Here is a function with a parameter k (taking values 1, 2, ...) that denotes the step of spreading 1's:
spread <- function(x, k) {
idx <- do.call(rbind, apply(which(x == 1, arr.ind = TRUE), 1,
function(y) expand.grid(y[1] + k:-k, y[2] + k:-k)))
idx <- idx[idx[, 1] %in% 1:nrow(x) & idx[, 2] %in% 1:ncol(x), ]
x[as.matrix(idx)] <- 1
x
}
spread(A, 2)
# [,1] [,2] [,3] [,4] [,5]
# [1,] 1 1 1 1 0
# [2,] 1 1 1 1 0
# [3,] 1 1 1 1 1
# [4,] 1 1 1 1 1
# [5,] 0 0 1 1 1
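For the example matrix, a single call with k = 2 gives the same result as two calls with k = 1, which matches the outputs shown above. The sketch below checks this; the function is repeated so the block runs on its own, and the comparison is only claimed for this example.

```r
spread <- function(x, k) {
  # Offsets -k..k around every cell equal to 1, in both dimensions.
  idx <- do.call(rbind, apply(which(x == 1, arr.ind = TRUE), 1,
    function(y) expand.grid(y[1] + k:-k, y[2] + k:-k)))
  # Keep only index pairs that fall inside the matrix.
  idx <- idx[idx[, 1] %in% 1:nrow(x) & idx[, 2] %in% 1:ncol(x), ]
  x[as.matrix(idx)] <- 1
  x
}

A <- matrix(0, nrow = 5, ncol = 5)
A[2, 2] <- 1
A[5, 5] <- 1

one_big_step   <- spread(A, 2)
two_small_steps <- spread(spread(A, 1), 1)
all(one_big_step == two_small_steps)
```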
This works but might need some retooling for more general cases, i.e. you're going to run into problems with multiple 1s in the initial matrix. If such a generalization is required, please let me know and I'll gladly attempt to produce one. Or just use either josilber's or Julius's answer.
M <- as.matrix(read.table(textConnection("0 0 0 0 0
0 1 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0")))
my_spread <- function(m){
e <- which(m == 1, arr.ind = TRUE)
r <- c(e[, 1] - 1, e[, 1], e[, 1] + 1)
l <- c(e[, 2] - 1, e[, 2], e[, 2] + 1)
#dealing with border cases
r <- r[nrow(m) >= r]
l <- l[ncol(m) >= l]
m[as.matrix(expand.grid(r,l))] <- 1
m
}
my_spread(M)
[,1] [,2] [,3] [,4] [,5]
[1,] 1 1 1 0 0
[2,] 1 1 1 0 0
[3,] 1 1 1 0 0
[4,] 0 0 0 0 0
[5,] 0 0 0 0 0
my_spread(my_spread(M))
[,1] [,2] [,3] [,4] [,5]
[1,] 1 1 1 1 0
[2,] 1 1 1 1 0
[3,] 1 1 1 1 0
[4,] 1 1 1 1 0
[5,] 0 0 0 0 0
my_spread(my_spread(my_spread(M)))
[,1] [,2] [,3] [,4] [,5]
[1,] 1 1 1 1 1
[2,] 1 1 1 1 1
[3,] 1 1 1 1 1
[4,] 1 1 1 1 1
[5,] 1 1 1 1 1
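The limitation with multiple 1s mentioned at the start of this answer can be seen on a small example. The sketch below repeats my_spread so it runs on its own and uses a made-up matrix A with two seeds. Because expand.grid() combines all candidate rows with all candidate columns, cells that are not adjacent to any 1 get filled as well.

```r
my_spread <- function(m) {
  e <- which(m == 1, arr.ind = TRUE)
  r <- c(e[, 1] - 1, e[, 1], e[, 1] + 1)
  l <- c(e[, 2] - 1, e[, 2], e[, 2] + 1)
  # dealing with border cases
  r <- r[nrow(m) >= r]
  l <- l[ncol(m) >= l]
  m[as.matrix(expand.grid(r, l))] <- 1
  m
}

# Two seeds in opposite corners of a 5 x 5 matrix.
A <- matrix(0, nrow = 5, ncol = 5)
A[2, 2] <- 1
A[5, 5] <- 1

# Rows 1..5 and columns 1..5 all appear among the candidates, so the
# cross product covers every cell instead of two separate neighbourhoods.
sum(my_spread(A))
```

With these two seeds the whole matrix is set to 1 after one step, whereas the neighbourhoods of (2,2) and (5,5) together cover only 13 cells.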