I have the following logical matrix:-
k <- matrix(c(T,T,F,F,T,F,T,F,T,T,F,F,T,F,T,F,T,T,T,T,F,F,F,F,F), 5)
However, when I do the following:-
z <- as.integer(k)
I get an integer vector rather than an integer matrix:-
[1] 1 1 0 0 1 0 1 0 1 1 0 0 1 0 1 0 1 1 1 1 0 0 0 0 0
I want to get a matrix like the following:-
k <- matrix(c(1,1,0,0,1,0,1,0,1,1,0,0,1,0,1,0,1,1,1,1,0,0,0,0,0), 5)
Thanks in advance.
Use unary +:
+k
## [,1] [,2] [,3] [,4] [,5]
## [1,] 1 0 0 0 0
## [2,] 1 1 0 1 0
## [3,] 0 0 1 1 0
## [4,] 0 1 0 1 0
## [5,] 1 1 1 1 0
We can assign into z[] to keep the dim attribute intact:
z <- k
z[] <- as.integer(k)
Another option is to assign the dim afterwards:
z <- as.integer(k)
dim(z) <- dim(k)
Or, without touching dim at all, just multiply by 1 to coerce to numeric (this gives a double matrix rather than an integer one):
z <- k * 1
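The options above differ in the storage type they return, which a quick check makes visible. The sketch below uses a small made-up matrix and also shows `storage.mode<-`, a base R alternative not mentioned above, which changes the type in place and keeps the dim attribute.

```r
k <- matrix(c(TRUE, TRUE, FALSE, FALSE, TRUE, FALSE), nrow = 2)

a <- +k                       # unary plus: logical -> integer, dim kept
b <- k * 1                    # multiplying by a double: logical -> double
z <- k
storage.mode(z) <- "integer"  # change the storage type in place, dim kept

typeof(a)  # "integer"
typeof(b)  # "double"
typeof(z)  # "integer"
dim(z)     # 2 3
```

If the result must be of integer type (for example for `identical()` comparisons), `+k` or `storage.mode<-` would be the safer picks; `k * 1` is fine when any numeric type will do.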
k <- matrix(c(T,T,F,F,T,F,T,F,T,T,F,F,T,F,T,F,T,T,T,T,F,F,F,F,F), 5)
res <- +k
res
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 1 0 0 0 0
#> [2,] 1 1 0 1 0
#> [3,] 0 0 1 1 0
#> [4,] 0 1 0 1 0
#> [5,] 1 1 1 1 0
Created on 2021-10-04 by the reprex package (v2.0.1)
Consider the 8 by 6 binary matrix, M:
M <- matrix(c(0,0,1,1,0,0,1,1,
0,1,1,0,0,1,1,0,
0,0,0,0,1,1,1,1,
0,1,0,1,1,0,1,0,
0,0,1,1,1,1,0,0,
0,1,1,0,1,0,0,1),nrow = 8,ncol = 6)
Here is M:
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 0 0 0 0 0 0
[2,] 0 1 0 1 0 1
[3,] 1 1 0 0 1 1
[4,] 1 0 0 1 1 0
[5,] 0 0 1 1 1 1
[6,] 0 1 1 0 1 0
[7,] 1 1 1 1 0 0
[8,] 1 0 1 0 0 1
The following matrix contains, for each column of M, the row indices of the 1's in that column:
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 3 2 5 2 3 2
[2,] 4 3 6 4 4 3
[3,] 7 6 7 5 5 5
[4,] 8 7 8 7 6 8
Let's call that matrix ind:
ind <- matrix(c(3,4,7,8,
2,3,6,7,
5,6,7,8,
2,4,5,7,
3,4,5,6,
2,3,5,8),nrow = 4, ncol=6)
I'm trying to change a single 1 into a 0 in each column of M.
For example, one possible choice of row indices of the 1's to change, one per column, would be (4,2,5,4,3,2), i.e. row 4 of column 1, row 2 of column 2, row 5 of column 3 and so on. Let N be the resulting matrix. This choice produces the following matrix N:
N <- matrix(c(0,0,1,0,0,0,1,1,
0,0,1,0,0,1,1,0,
0,0,0,0,0,1,1,1,
0,1,0,0,1,0,1,0,
0,0,0,1,1,1,0,0,
0,0,1,0,1,0,0,1),nrow = 8,ncol = 6)
Here is N:
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 0 0 0 0 0 0
[2,] 0 0 0 1 0 0
[3,] 1 1 0 0 0 1
[4,] 0 0 0 0 1 0
[5,] 0 0 0 1 1 1
[6,] 0 1 1 0 1 0
[7,] 1 1 1 1 0 0
[8,] 1 0 1 0 0 1
For EACH of the possible resulting matrices N, I do the following calculation:
X <- cbind(c(rep(1,nrow(N))),N)
ans <- sum(diag(solve(t(X)%*%X)[-1,-1]))
Then, I want to obtain the matrix N that produces the smallest value of ans. How do I do this efficiently?
Let me know if this works.
We first build a conversion function that I'll need, and we build also the reverse function as you may need it at some point:
ind_to_M <- function(ind){
M <- matrix(rep(0,6*8),ncol=6)
for(i in 1:ncol(ind)){M[ind[,i],i] <- 1}
return(M)
}
M_to_ind <- function(M){apply(M==1,2,which)}
Then we will build a matrix of possible ways to ditch a value
all_possible_ways_to_ditch_value <- 1:4
for (i in 2:ncol(M)){
all_possible_ways_to_ditch_value <- merge(all_possible_ways_to_ditch_value,1:4,by=NULL)
}
# there's probably a more elegant way to do that
head(all_possible_ways_to_ditch_value)
# x y.x y.y y.x y.y y
# 1 1 1 1 1 1 1 # will be used to ditch the 1st value of ind for every column
# 2 2 1 1 1 1 1
# 3 3 1 1 1 1 1
# 4 4 1 1 1 1 1
# 5 1 2 1 1 1 1
# 6 2 2 1 1 1 1
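One candidate for the "more elegant way" is base R's `expand.grid()`, which builds the same kind of Cartesian product in one call. This is only a sketch; the names `n_ones`, `n_cols` and `all_ways` are made up for illustration.

```r
n_ones <- 4   # number of 1's per column of M (rows of ind)
n_cols <- 6   # number of columns of M

# one column of choices 1:4 per column of M, all combinations as rows
all_ways <- expand.grid(rep(list(seq_len(n_ones)), n_cols))

dim(all_ways)      # 4096 rows (4^6) and 6 columns
head(all_ways, 3)  # the first column varies fastest
```

The result is a data frame that can be indexed as `all_ways[j, i]`, so it should be usable in place of `all_possible_ways_to_ditch_value` in the loop that follows.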
Then we iterate through those, each time storing ans and N (as data is quite small overall).
ans_list <- list()
N_list <- list()
for(j in 1:nrow(all_possible_ways_to_ditch_value)){
#print(j)
ind_N <- matrix(rep(0,6*3),ncol=6) # initiate ind_N as an empty matrix
for(i in 1:ncol(M)){
ind_N[,i] <- ind[-all_possible_ways_to_ditch_value[j,i],i] # fill with ind except for the value we ditch
}
N <- ind_to_M(ind_N)
X <- cbind(c(rep(1,nrow(N))),N)
ans_list[[j]] <- try(sum(diag(solve(t(X)%*%X)[-1,-1])),silent=TRUE) # some systems are not well defined, we'll just ignore the errors
N_list[[j]] <- N
}
We finally retrieve the minimal ans and the relevant N
ans <- ans_list[[which.min(ans_list)]]
# [1] -3.60288e+15
N <- N_list[[which.min(ans_list)]]
# [,1] [,2] [,3] [,4] [,5] [,6]
# [1,] 0 0 0 0 0 0
# [2,] 0 1 0 1 0 1
# [3,] 1 1 0 0 1 1
# [4,] 1 0 0 1 1 0
# [5,] 0 0 1 1 1 1
# [6,] 0 1 1 0 0 0
# [7,] 1 0 1 0 0 0
# [8,] 0 0 0 0 0 0
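A huge negative value like the one above cannot be a genuine minimum, since the diagonal of the inverse of t(X) %*% X is positive whenever that matrix is invertible; it points to a numerically singular system that `solve()` did not reject. A possible sketch of a guard, assuming a hypothetical helper `crit()` that returns `Inf` for rank-deficient candidates (the rank test uses the default tolerance of `qr()`, which is an assumption about what should count as singular):

```r
M <- matrix(c(0,0,1,1,0,0,1,1,
              0,1,1,0,0,1,1,0,
              0,0,0,0,1,1,1,1,
              0,1,0,1,1,0,1,0,
              0,0,1,1,1,1,0,0,
              0,1,1,0,1,0,0,1), nrow = 8, ncol = 6)

# Hypothetical helper: criterion value for a candidate matrix N,
# or Inf when t(X) %*% X is rank deficient
crit <- function(N) {
  X <- cbind(1, N)
  XtX <- crossprod(X)                 # same as t(X) %*% X
  if (qr(XtX)$rank < ncol(XtX)) return(Inf)
  sum(diag(solve(XtX))[-1])           # drop the intercept entry
}

crit(M)          # criterion for the unmodified M
N_bad <- M
N_bad[, 1] <- 0  # an all-zero column makes X rank deficient
crit(N_bad)      # Inf
```

With such a helper, singular candidates never compete for the minimum, so neither `try()` nor a later `abs()` would be needed.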
EDIT:
To get minimal positive ans
ans_list[which(!sapply(ans_list,is.numeric))] <- Inf
ans <- ans_list[[which.min(abs(unlist(ans_list)))]]
# [1] 3.3
N <- N_list[[which.min(abs(unlist(ans_list)))]]
# [,1] [,2] [,3] [,4] [,5] [,6]
# [1,] 0 0 0 0 0 0
# [2,] 0 1 0 1 0 0
# [3,] 1 1 0 0 0 1
# [4,] 1 0 0 0 1 0
# [5,] 0 0 0 1 1 1
# [6,] 0 1 1 0 1 0
# [7,] 1 0 1 1 0 0
# [8,] 0 0 1 0 0 1
EDIT 2 : to generalize the number of rows of ind to ditch
It seems to give the same result for ans for n_ditch = 1, and results make sense for n_ditch = 2
n_ditch <- 2
ditch_possibilities <- combn(1:4,n_ditch) # these are all the possible sets of indices to ditch for a given column
all_possible_ways_to_ditch_value <- 1:ncol(ditch_possibilities) # this will be all the possible sets of indices of ditch_possibilities to test
for (i in 2:ncol(M)){
all_possible_ways_to_ditch_value <- merge(all_possible_ways_to_ditch_value,1:ncol(ditch_possibilities),by=NULL)
}
ans_list <- list()
N_list <- list()
for(j in 1:nrow(all_possible_ways_to_ditch_value)){
#print(j)
ind_N <- matrix(rep(0,6*(4-n_ditch)),ncol=6) # initiate ind_N as an empty matrix
for(i in 1:ncol(M)){
ind_N[,i] <- ind[-ditch_possibilities[,all_possible_ways_to_ditch_value[j,i]],i] # fill with ind except for the value we ditch
}
N <- ind_to_M(ind_N)
X <- cbind(c(rep(1,nrow(N))),N)
ans_list[[j]] <- try(sum(diag(solve(t(X)%*%X)[-1,-1])),silent=TRUE) # some systems are not well defined, we'll just ignore the errors
N_list[[j]] <- N
}
I want to generate a 7 column by 10 row matrix with a total of exactly 20 randomly generated 1's, but with at least two 1's per row and two 1's per column. How could I do that?
Also, how would the code be different if I wanted to set a range of acceptable row and column totals instead of minimums?
Thanks!
I feel like there should be a more elegant solution, but here's a ball of duct tape:
matbuilder <- function(n,nrow,ncol) {
finished <- F
while(!finished) {
trial <- matrix(sample(c(rep(1,n),rep(0,nrow*ncol-n))),nrow=nrow,ncol=ncol)
if(all(rowSums(trial)>=2) && all(colSums(trial)>=2)) finished <- T
}
return(trial)
}
x <- matbuilder(20, 10, 7)
x
## [,1] [,2] [,3] [,4] [,5] [,6] [,7]
## [1,] 1 1 0 0 0 0 0
## [2,] 0 0 0 0 1 0 1
## [3,] 0 1 0 0 0 1 0
## [4,] 1 0 0 0 1 0 0
## [5,] 0 1 0 0 0 1 0
## [6,] 0 0 1 1 0 0 0
## [7,] 0 0 1 1 0 0 0
## [8,] 0 0 0 0 1 0 1
## [9,] 0 0 0 0 0 1 1
## [10,] 0 1 1 0 0 0 0
sum(x)
## [1] 20
rowSums(x)
## [1] 2 2 2 2 2 2 2 2 2 2
colSums(x)
## [1] 2 4 3 2 3 3 3
Or, to give a range of acceptable row/column totals...
matbuilder <- function(n,nrow,ncol,rowmin,rowmax,colmin,colmax,ntimeout=100000) {
finished <- F
i <- 1
trial <- NA
while(!finished) {
trial <- matrix(sample(c(rep(1,n),rep(0,nrow*ncol-n))),nrow=nrow,ncol=ncol)
if(all(rowSums(trial)>=rowmin) & all(rowSums(trial)<=rowmax) & all(colSums(trial)>=colmin) & all(colSums(trial)<=colmax)) finished <- T
i <- i+1
if(i>ntimeout) {
finished <- T
cat("sorry boss, timeout.")
}
}
return(trial)
}
x <- matbuilder(25,10,7,rowmin=2,rowmax=3,colmin=2,colmax=4)
x
## [,1] [,2] [,3] [,4] [,5] [,6] [,7]
## [1,] 1 0 0 0 0 1 1
## [2,] 0 1 1 0 0 0 1
## [3,] 1 0 0 0 1 0 0
## [4,] 1 0 1 1 0 0 0
## [5,] 1 0 0 0 0 1 1
## [6,] 0 1 1 1 0 0 0
## [7,] 0 0 0 1 0 0 1
## [8,] 0 0 1 0 1 0 0
## [9,] 0 0 0 1 1 0 0
## [10,] 0 0 0 0 1 1 0
sum(x)
## [1] 25
rowSums(x)
## [1] 3 3 2 3 3 3 2 2 2 2
colSums(x)
## [1] 4 2 4 4 4 3 4
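One thing to be aware of in the version above: on timeout it still returns the last trial, which did not meet the constraints. A possible variant, sketched here under the hypothetical name `matbuilder_or_null()`, returns `NULL` in that case so the caller can tell the difference:

```r
# Hypothetical variant of matbuilder(): NULL signals that no valid matrix was found
matbuilder_or_null <- function(n, nrow, ncol, rowmin, rowmax, colmin, colmax,
                               max_tries = 100000) {
  cells <- c(rep(1, n), rep(0, nrow * ncol - n))
  for (i in seq_len(max_tries)) {
    trial <- matrix(sample(cells), nrow = nrow, ncol = ncol)
    rs <- rowSums(trial)
    cs <- colSums(trial)
    if (all(rs >= rowmin, rs <= rowmax, cs >= colmin, cs <= colmax)) {
      return(trial)
    }
  }
  NULL
}

set.seed(1)
# small case: two 1's in a 2 x 2 matrix, every row and column total exactly 1
x <- matbuilder_or_null(2, 2, 2, rowmin = 1, rowmax = 1, colmin = 1, colmax = 1)
x

# impossible case: a single 1 cannot give every row a total of 1
y <- matbuilder_or_null(1, 2, 2, rowmin = 1, rowmax = 1, colmin = 1, colmax = 1,
                        max_tries = 10)
is.null(y)  # TRUE
```

This is still plain rejection sampling, so tight constraints can need many tries; the only change is how a failure is reported.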
This one involves sampling a matrix of indices such that each row is repeated twice and columns are repeated at least 2 times.
set.seed(42)
m = matrix(rep(0, 70), nrow = 10)
#Sample rows 1-10 twice
rows = sample(c(1:10, 1:10))
#Sample columns 1-7 twice and additional 6 to make 20
columns = sample(c(sample(1:7, 6, replace = TRUE), 1:7, 1:7))
#Create a matrix of indices that should be 1
inds = cbind(rows, columns)
#Remove duplicates in inds if any (Refer: https://stackoverflow.com/q/44555420/7128934)
inds[,2] = replace(x = inds[,2],
list = duplicated(inds),
values = sample(x = columns[!(columns %in% inds[,2][duplicated(inds)])],
size = 1))
m[inds] = 1
#Check
rowSums(m)
#[1] 2 2 2 2 2 2 2 2 2 2
colSums(m)
#[1] 4 2 2 3 2 2 5
sum(m)
#[1] 20
I'd like to make some calculations on FIRST robotics teams and need to build, for lack of a better word, a binary interaction matrix, that is, a matrix recording when two teams were on the same alliance. Each alliance has three teams, so each alliance adds 9 values to the matrix when considering (i,j), (j,i), and (i,i).
The full data I'm using is here: http://frc-events.firstinspires.org/2016/MOKC/qualifications
But for simplicity, here is an example of 9 teams playing 1 match each.
> data.frame(Team.1=1:3,Team.2=4:6,Team.3=7:9)
Team.1 Team.2 Team.3
1 1 4 7
2 2 5 8
3 3 6 9
The matrix should count each binary interaction, (1,4),(4,7),(3,6),(6,3),(9,9), etc., and will be an N x N matrix, where in the above example N=9. Here's the matrix that represents the above lists:
> matrix(data=c(1,0,0,1,0,0,1,0,0,
+ 0,1,0,0,1,0,0,1,0,
+ 0,0,1,0,0,1,0,0,1,
+ 1,0,0,1,0,0,1,0,0,
+ 0,1,0,0,1,0,0,1,0,
+ 0,0,1,0,0,1,0,0,1,
+ 1,0,0,1,0,0,1,0,0,
+ 0,1,0,0,1,0,0,1,0,
+ 0,0,1,0,0,1,0,0,1),9,9)
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,] 1 0 0 1 0 0 1 0 0
[2,] 0 1 0 0 1 0 0 1 0
[3,] 0 0 1 0 0 1 0 0 1
[4,] 1 0 0 1 0 0 1 0 0
[5,] 0 1 0 0 1 0 0 1 0
[6,] 0 0 1 0 0 1 0 0 1
[7,] 1 0 0 1 0 0 1 0 0
[8,] 0 1 0 0 1 0 0 1 0
[9,] 0 0 1 0 0 1 0 0 1
In the real data, the team numbers are not sequential and look more like 5732, 1345, 3451, etc., and there are more matches per team, meaning the matrix values would be between 0 and the maximum number of matches any of the teams played. This can be seen in the real data.
Thanks to anyone that can help.
There is probably a more elegant approach, but here is one using data.table.
library(data.table)
dat <- data.table(Team.1=1:3,Team.2=4:6,Team.3=7:9)
#add match ID
dat[,match:=1:.N]
#turn to long
mdat <- melt(dat,id="match",value.name="team")[,variable:=NULL]
#merge with itself
dat2 <- merge(mdat, mdat, by=c("match"),all=T, allow.cartesian = T)
# reshape
dcast(dat2, team.x~team.y, fun.agg=length)
team.x 1 2 3 4 5 6 7 8 9
1: 1 1 0 0 1 0 0 1 0 0
2: 2 0 1 0 0 1 0 0 1 0
3: 3 0 0 1 0 0 1 0 0 1
4: 4 1 0 0 1 0 0 1 0 0
5: 5 0 1 0 0 1 0 0 1 0
6: 6 0 0 1 0 0 1 0 0 1
7: 7 1 0 0 1 0 0 1 0 0
8: 8 0 1 0 0 1 0 0 1 0
9: 9 0 0 1 0 0 1 0 0 1
And, because I can, one in base-R. A case where I think the use of a for-loop is justified (as you keep modifying the same object).
#make matrix to put results in
nteams = length(unique(unlist(dat)))
res <- matrix(0,nrow=nteams, ncol=nteams)
#split data by row, generate combinations for each row and add to matrix
for(i in 1:nrow(dat)){
x=unlist(dat[i,])
coords=as.matrix(expand.grid(x,x))
res[coords] <- res[coords]+1
}
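Since the real team numbers are not sequential (5732, 1345, ...), indexing `res` directly by team number would not work there. One possible sketch, assuming made-up team numbers and a data frame named `matches` (both hypothetical), maps each team to a column of an alliance-by-team incidence matrix and lets `crossprod()` do the counting:

```r
# Hypothetical data: arbitrary team numbers, some teams appear more than once
matches <- data.frame(Team.1 = c(5732, 1345, 5732),
                      Team.2 = c(3451, 1764, 1345),
                      Team.3 = c(1108, 2410, 2410))

teams <- sort(unique(unlist(matches)))

# incidence matrix: one row per alliance, one column per team
inc <- matrix(0, nrow = nrow(matches), ncol = length(teams),
              dimnames = list(NULL, as.character(teams)))
# unlist() walks the data frame column by column, so row ids repeat per column
inc[cbind(rep(seq_len(nrow(matches)), times = ncol(matches)),
          match(unlist(matches), teams))] <- 1

res <- crossprod(inc)   # t(inc) %*% inc: teams x teams counts

res["5732", "5732"]  # 2: team 5732 appears in two alliances
res["1345", "2410"]  # 2: these two teams shared an alliance twice
res["5732", "1764"]  # 0: never on the same alliance
```

The diagonal then holds the number of alliances each team appeared in, and the row and column names carry the original team numbers.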
Here is my suggestion with base functions. I tried to create a matrix. My approach was to look for the position indexes for 1.
library(magrittr)
mydf <- data.frame(Team.1 = 1:3, Team.2 = 4:6,Team.3 = 7:9)
### Create a matrix with position indexes
lapply(1:nrow(mydf), function(x){
a <- t(combn(mydf[x, ], 2)) # Get some combination
b <- a[, 2:1] # Get other combination by reversing columns
foo <- rbind(a, b)
foo
}) %>%
do.call(rbind, .) -> ana
ana <- matrix(unlist(ana), nrow = nrow(ana))
### Another set: Get indexes for self (e.g., (1,1), (2,2), (3,3))
foo <- rep(1:max(mydf), times = 2)
matrix(foo, nrow = length(foo) / 2) -> bob
### A matrix with all position indexes
cammy <- rbind(ana, bob)
### Create a plain matrix
mat <- matrix(0, nrow = max(mydf), ncol = max(mydf))
### Fill in the matrix with 1
mat[cammy] <- 1
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
# [1,] 1 0 0 1 0 0 1 0 0
# [2,] 0 1 0 0 1 0 0 1 0
# [3,] 0 0 1 0 0 1 0 0 1
# [4,] 1 0 0 1 0 0 1 0 0
# [5,] 0 1 0 0 1 0 0 1 0
# [6,] 0 0 1 0 0 1 0 0 1
# [7,] 1 0 0 1 0 0 1 0 0
# [8,] 0 1 0 0 1 0 0 1 0
# [9,] 0 0 1 0 0 1 0 0 1
EDIT
Here is a revised version based on the previous idea. It is not as concise as Heroka's idea with base functions. In my modified data, teams 1 and 4 had two matches together. The idea here is to count how many times each pair appears in the data set; the dplyr part does that. In the for loop, I fill in the matrix mat by going through each row of cammy.
mydf <- data.frame(Team.1=c(1:3,1),Team.2=c(4:6,4),Team.3=c(7:9,5))
# Team.1 Team.2 Team.3
#1 1 4 7
#2 2 5 8
#3 3 6 9
#4 1 4 5
library(dplyr)
lapply(1:nrow(mydf), function(x){
a <- t(combn(mydf[x, ], 2)) # Get some combination
b <- a[, 2:1] # Get other combination by reversing columns
foo <- rbind(a, b)
foo
}) %>%
do.call(rbind, .) -> ana
ana <- data.frame(matrix(unlist(ana), nrow = nrow(ana)))
### Another set: Get indexes for self (e.g., (1,1), (2,2), (3,3))
foo <- rep(1:max(mydf), times = 2)
data.frame(matrix(foo, nrow = length(foo) / 2)) -> bob
cammy <- bind_rows(ana, bob) %>%
group_by(X1, X2) %>%
mutate(total = n()) %>%
as.matrix
### Create a plain matrix
mat <- matrix(0, nrow = max(mydf), ncol = max(mydf))
for(i in 1:nrow(cammy)){
mat[cammy[i, 1], cammy[i, 2]] <- cammy[i, 3]
}
print(mat)
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
# [1,] 1 0 0 2 1 0 1 0 0
# [2,] 0 1 0 0 1 0 0 1 0
# [3,] 0 0 1 0 0 1 0 0 1
# [4,] 2 0 0 1 1 0 1 0 0
# [5,] 1 1 0 1 1 0 0 1 0
# [6,] 0 0 1 0 0 1 0 0 1
# [7,] 1 0 0 1 0 0 1 0 0
# [8,] 0 1 0 0 1 0 0 1 0
# [9,] 0 0 1 0 0 1 0 0 1
I have an i-by-j (i x j) dummy matrix for rating events of companies, with i dates and j different companies. On a day where a rating occurs, rating[i,j]=1, and 0 otherwise.
I want to create a list which contains 4 sublists (1 for each of the 4 companies). Each sublist states the row numbers of the rating events of the specific company.
This is my code:
r<-list(); i=1;j=2;
for(j in 1:4){
x<-list()
for(i in 100){
if(rating[i,j]!=0){
x<-c(x,i)
i=i+1
}
else{i=i+1}
}
r[[j]]<-x
j=j+1
}
It is somehow not working, and I really can not figure out where the bug is. The x sublists are always empty. Could somebody help?
Thanks a lot!
Here is an example rating matrix:
rating<-matrix(data = 0, nrow = (100), ncol = 4, dimnames=list(c(1:100), c(1:4)));
rating[3,1]=1;rating[7,1]=1;rating[20,1]=1;rating[75,1]=1;
rating[8,2]=1;rating[40,2]=1;rating[50,2]=1;rating[78,2]=1;
rating[1,3]=1;rating[4,3]=1;rating[17,3]=1;rating[99,3]=1;
rating[10,4]=1;rating[20,4]=1;rating[30,4]=1;rating[90,4]=1;
You may try this:
set.seed(123)
m <- matrix(data = sample(c(0, 1), 16, replace = TRUE), ncol = 4,
dimnames = list(date = 1:4, company = letters[1:4]))
m
# company
# date a b c d
# 1 0 1 1 1
# 2 1 0 0 1
# 3 0 1 1 0
# 4 1 1 0 1
lapply(as.data.frame(m), function(x){
which(x == 1)
})
# $a
# [1] 2 4
#
# $b
# [1] 1 3 4
#
# $c
# [1] 1 3
#
# $d
# [1] 1 2 4
Update
Or more compact (thanks to @flodel!):
lapply(as.data.frame(m == 1), which)
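Applied to the `rating` matrix from the question, the same idea can be sketched as below. As a side note on why the original loop returned empty sublists: `for(i in 100)` visits only the single value `i = 100`, so iterating over all rows would need `1:100` or `seq_len(nrow(rating))`. The `unname()` call is only there to drop the row names that `which()` carries over.

```r
# The example matrix from the question, filled in a more compact way
rating <- matrix(0, nrow = 100, ncol = 4, dimnames = list(1:100, 1:4))
rating[c(3, 7, 20, 75), 1]   <- 1
rating[c(8, 40, 50, 78), 2]  <- 1
rating[c(1, 4, 17, 99), 3]   <- 1
rating[c(10, 20, 30, 90), 4] <- 1

# one list element per company, holding the row numbers of its rating events
r <- lapply(seq_len(ncol(rating)), function(j) unname(which(rating[, j] != 0)))

r[[1]]  # 3 7 20 75
r[[4]]  # 10 20 30 90
```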
(Leave for-loops behind.) If rating really is a matrix, or even if it's a data frame, then why not use rowSums:
r <- rowSums(rating) # accomplishes the stated task more effectively.
# simple example:
> rating <- matrix( rbinom(100, 1,prob=.5), 10)
> rating
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 1 0 1 0 0 1 1 0 0 1
[2,] 1 0 0 0 0 0 0 1 1 1
[3,] 0 0 1 1 1 1 0 0 1 0
[4,] 1 0 0 0 1 1 0 0 0 1
[5,] 1 1 0 1 1 1 1 0 0 0
[6,] 1 1 1 0 1 1 1 0 1 0
[7,] 0 1 0 1 0 1 1 0 1 0
[8,] 0 1 0 0 1 1 0 1 1 0
[9,] 1 1 1 0 1 1 1 1 0 0
[10,] 0 1 0 0 1 0 0 1 0 1
> rowSums(rating)
[1] 5 4 5 4 6 7 5 5 7 4
> rowSums(as.data.frame(rating))
[1] 5 4 5 4 6 7 5 5 7 4
If it needs to be a list then just wrap as.list() around it.