Matrix not filling up with 1 where necessary in R

I am trying to fill a matrix with ones (1) where the index corresponds to the max of another matrix of the same dimension.
Below is a reproducible example:
set.seed(1)
matric <- matrix(rnorm(3000), nrow = 500, ncol = 6)
best <- matrix(0, nrow = 500, ncol = 6)
n_end <- 500
for (i in n_end) {
best[which(matric[i,] == max(matric[i,]))] <- 1
}
## the best model has the higher sum
best1 <- colSums(best)
Please Help!! :)

for (i in n_end) runs only once, with i = 500, so loop over seq_len(n_end) or 1:n_end instead. Inside the loop, index the same row of 'best' that is being compared in 'matric':
for(i in 1:n_end) {
best[i,][matric[i,] == max(matric[i,])] <- 1
}
Or, if there is only a single max value per row, use max.col, which is vectorized and faster:
best[cbind(seq_len(nrow(best)), max.col(matric, 'first'))] <- 1
Or with rowMaxs from matrixStats
library(matrixStats)
+(matric == rowMaxs(matric))
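As a quick check of the two base R approaches above, here is a minimal sketch on a small matrix. The names small, best_loop and best_vec are made up for illustration, and it assumes there are no ties within a row (which holds for continuous random draws).

```r
set.seed(1)
small <- matrix(rnorm(12), nrow = 4, ncol = 3)  # hypothetical 4 x 3 example

# `for (i in 4)` would run once with i = 4; seq_len(4) visits every row
best_loop <- matrix(0, nrow = 4, ncol = 3)
for (i in seq_len(nrow(small))) {
  best_loop[i, small[i, ] == max(small[i, ])] <- 1
}

# Vectorized: build one (row, column) index pair per row
best_vec <- matrix(0, nrow = 4, ncol = 3)
best_vec[cbind(seq_len(nrow(small)), max.col(small, "first"))] <- 1

identical(best_loop, best_vec)  # both approaches mark the same cells
colSums(best_vec)               # how often each column holds the row maximum
```

If rows can contain ties, the loop marks every tied maximum while max.col(..., "first") marks only the first one, so the two results would differ.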

Related

How to quantify the frequency of all possible row combinations of a binary matrix in R in a more efficient way?

Let's assume I have a binary matrix with 24 columns and 5000 rows.
The columns are Parameters (P1 - P24) of 5000 subjects. The parameters are binary (0 or 1).
(Note: my real data can contain as much as 40,000 subjects)
m <- matrix(, nrow = 5000, ncol = 24)
m <- apply(m, c(1,2), function(x) sample(c(0,1),1))
colnames(m) <- paste("P", c(1:24), sep = "")
Now I would like to determine what are all possible combinations of the 24 measured parameters:
comb <- expand.grid(rep(list(0:1), 24))
colnames(comb) <- paste("P", c(1:24), sep = "")
The final question is: How often does each of the possible row combinations from comb appear in matrix m?
I managed to write code for this that adds a count column to comb. But my code is really slow and would take about 328 days to finish. Therefore the code below only considers the first 20 combinations:
comb$count <- 0
for (k in 1:20){ # considers only the first 20 combinations of comb
for (i in 1:nrow(m)){
if (all(m[i,] == comb[k,1:24])){
comb$count[k] <- comb$count[k] + 1
}
}
}
Is there computationally a more efficient way to compute this above so I can count all combinations in a short time?
Thank you very much for your help in advance.
data.table is fast at this type of operation:
m <- matrix(, nrow = 5000, ncol = 24)
m <- apply(m, c(1,2), function(x) sample(c(0,1),1))
colnames(m) <- paste("P", c(1:24), sep = "")
comb <- expand.grid(rep(list(0:1), 24))
colnames(comb) <- paste("P", c(1:24), sep = "")
library(data.table)
data_t = data.table(m)
ans = data_t[, .N, by = P1:P24]
dim(ans)
head(ans)
The core of the call is by = P1:P24, which groups by all the columns, and .N, which gives the number of records in each group.
I used this as inspiration: How does one aggregate and summarize data quickly?
and the data.table introduction vignette: https://cran.r-project.org/web/packages/data.table/vignettes/datatable-intro.html
If all you need is the combinations that occur in the data and how many times, this will do it:
m2 <- apply(m, 1, paste0, collapse="")
m2.tbl <- xtabs(~m2)
head(m2.tbl)
m2
# 000000000001000101010010 000000000010001000100100 000000000010001110001100 000000000100001000010111 000000000100010110101010 000000000100101000101100
# 1 1 1 1 1 1
You can use apply to paste the values of each row into a single string and use table to count how often each string occurs.
table(apply(m, 1, paste0, collapse = '-'))
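The answers above count only the combinations that actually occur. If the counts are also needed next to every row of comb (including zeros), one possible sketch in base R is to tabulate the row keys against the full set of combination keys. It is shown here on a small 3-parameter example; m_small, comb_small, key_m and key_comb are hypothetical names, and this has not been benchmarked on the full 2^24 grid, where a factor with that many levels may be memory-hungry.

```r
set.seed(42)
m_small <- matrix(sample(0:1, 20 * 3, replace = TRUE), nrow = 20, ncol = 3)
colnames(m_small) <- paste0("P", 1:3)

comb_small <- expand.grid(rep(list(0:1), 3))
colnames(comb_small) <- paste0("P", 1:3)

# One string key per row, e.g. "010"
key_m <- apply(m_small, 1, paste0, collapse = "")
key_comb <- apply(comb_small, 1, paste0, collapse = "")

# Using the combination keys as factor levels makes table() report
# every combination, in the order of comb_small, with 0 for unseen ones
comb_small$count <- as.integer(table(factor(key_m, levels = key_comb)))

comb_small
```

Every row of m_small matches exactly one combination, so the counts sum to the number of rows.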

nested double loop in R

mat <- matrix(0,ncol=6, nrow=100)
d=c(1,2,4,8,16,32)
for(i in 1:6)
{
for(j in d)
{
mat[,i]=rep(j,100)
}
}
mat
I should get a 100 x 6 matrix with columns of 1, 2, 4, 8, 16, 32. However, I simply get 32 in every column. Does anyone have any idea how I can fix this? I do want to use loops; even a single loop is fine.
The answer from @neilfws is more elegant. If you are committed to using a loop for some reason, you can do this:
mat <- matrix(0,ncol=6, nrow=100)
d=c(1,2,4,8,16,32)
for(i in 1:6)
{
j <- d[i]
mat[,i]=rep(j,100)
}
mat
The issue is that you were looping through all of d for each column, so every column was overwritten six times and ended up with the last value, 32.
Based on your description: 100 rows x 6 columns, column 1 = value 1...column 6 = value 32, this should generate what you want.
matrix(data = rep(c(1,2,4,8,16,32), each = 100),
nrow = 100,
ncol = 6)
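A minimal sketch comparing a single-loop version with a loop-free one. The names mat_loop and mat_vec are made up for illustration, and byrow = TRUE is just one alternative to the rep(..., each = 100) form above.

```r
d <- c(1, 2, 4, 8, 16, 32)

# Single loop: column i receives the i-th element of d
mat_loop <- matrix(0, nrow = 100, ncol = length(d))
for (i in seq_along(d)) {
  mat_loop[, i] <- d[i]  # the scalar is recycled down the column
}

# Loop-free: fill row by row so that d repeats across the columns
mat_vec <- matrix(d, nrow = 100, ncol = length(d), byrow = TRUE)

identical(mat_loop, mat_vec)
head(mat_vec, 3)
```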

Call a function and provide output in a matrix

I have a function in R which I call
RS1 = t(cbind(Data[,18], Data[,20]))
RS2 = t(cbind(Data[,19], Data[,21]))
p = t(Data[23:24])
rand_x <- function (p, x) {
n.goods <- dim (p)[1]
n.obs <- dim (p)[2]
xRC = NaN*matrix(1, n.goods, n.obs)
for(i in 1:n.obs) {
xRC[1,i] <- RS1[1,i] + RS1[2,i]
xRC[2,i] <- RS2[1,i] + RS2[2,i]
}
result <- xRC
return(result)
}
Given these two inputs, this function generates a 2x50 matrix with some random numbers. I want to call rand_x 1000 times, obtain 1000 matrices, and then bind the results into a final matrix. I have tried to write a loop for this but I am still struggling. Any help will be much appreciated.
If you intend to add columns 18 and 20 element by element (that is what your code does), try using rowSums().
Try:
xRC <- rbind(
rowSums(Data[, c(18, 20)]),
rowSums(Data[, c(19, 21)])
)
The output will be a matrix.
I do not see where randomness appears in your function, though. If you just want a 2x50 matrix of random numbers, you may want to use:
xRC <- matrix (rnorm(50*2), 2) # for standard-normal generated numbers
xRC <- matrix(sample(1:100, replace = TRUE, size = 100), 2) # for integers between 1 and 100, uniformly distributed
To do this 1000 times, try:
xRC <- NULL # start empty so the first rbind() creates the matrix
for (i in 1:1000) {
xRC <- rbind(xRC,
rowSums(Data[, c(18, 20)]),
rowSums(Data[, c(19, 21)])
)
}
# or, if you just want to generate random numbers, it is much faster to use:
xRC <- matrix(rnorm(1000 * 2 * 50), ncol = 50)
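Growing a matrix with rbind() inside a loop copies it on every iteration. A common alternative is to collect the results in a list and bind once at the end. The sketch below uses a hypothetical stand-in, rand_x_demo(), because the original rand_x() depends on a Data object that is not shown; swap in the real function call as needed.

```r
set.seed(123)

# Hypothetical stand-in for rand_x(): returns a 2 x 50 matrix of random draws
rand_x_demo <- function(n_goods = 2, n_obs = 50) {
  matrix(rnorm(n_goods * n_obs), nrow = n_goods, ncol = n_obs)
}

# Collect the 1000 results in a list, then bind them once at the end
results <- lapply(seq_len(1000), function(i) rand_x_demo())
final <- do.call(rbind, results)

dim(final)  # 1000 stacked 2 x 50 matrices
```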

Generating binary matrix with number of 1's fixed within a range [duplicate]

I want to generate an n x m matrix. Suppose it's 100x3. I want each row to sum to 1 (so two "0"s and one "1").
sample(c(0,0,1),3)
will give me 1 row but is there a very fast way to generate the whole matrix without an rbind?
Thank you!
No loops, no transposition. Just create a matrix of zeros and replace one entry per row with 1 by sampling a column index for each row.
m <- matrix(0, 100, 3)
nr <- nrow(m)
m[cbind(1:nr, sample(ncol(m), nr, TRUE))] <- 1
all(rowSums(m) == 1)
# [1] TRUE
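# Alternative: draw random numbers and mark each row's maximum with 1
mat <- matrix(runif(300),ncol=3)
mat[] <- as.numeric(t(apply(mat, 1, function(r) r == max(r))))
# Alternative: shuffle c(0,0,1) independently for each of the 100 rows
t(apply(t(matrix(rep(c(0,0,1),100),nrow = 3)), 1, function(x) sample(x)))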
Since you want a single 1 per row, the problem can be restated as randomly picking, for each row, the column that holds the 1.
So you can do something like:
m <- 3; n <- 100
rand_v <- floor(runif(n) * m) + 1 # random column index in 1..m for each row
mat <- matrix(0,n,m)
idx <- cbind(1:n,rand_v)
mat[idx] <- 1
Hope this helps.

Avoid a for loop

I am trying to optimize an algorithm and I really want to avoid all my loops. Hence I am wondering if there is a way to avoid the following simple loop:
library(FNN)
data <- cbind(1:10, 1:10)
NN.index <- get.knn(data, 5)$nn.index
bc <- matrix(0, nrow(NN.index), max(NN.index))
for(i in 1:nrow(bc)){
bc[i,NN.index[i,]] <- 1
}
where bc is a matrix of zeros.
In R, if a matrix M is indexed with a k-by-2 matrix I inside the brackets, each row of I is interpreted as a (row, column) index pair of M. For example:
M = matrix(1:12, nrow = 4, ncol = 3)
print(M)
I = rbind(c(1,2), c(4,2), c(3,3))
print(M[I])
In this case, M[1,2], M[4,2] and M[3,3] are extracted.
In your case, we can create row_index and col_index from NN.index as below, and then assign 1 to the corresponding entries.
bc <- matrix(0, nrow(NN.index), max(NN.index))
row_index <- rep(1:nrow(NN.index), times = ncol(NN.index))
col_index <- as.vector(NN.index)
bc[cbind(row_index, col_index)] <- 1
print(bc)
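To check that the matrix-index version matches the original loop without needing FNN, here is a small sketch. The index matrix nn is a made-up stand-in for get.knn(...)$nn.index, and bc_loop and bc_vec are illustrative names.

```r
# Hypothetical neighbour-index matrix standing in for get.knn(...)$nn.index:
# row i lists the columns that should be set to 1 in row i
nn <- rbind(c(2, 3), c(1, 3), c(1, 2), c(3, 5))

# Loop version from the question
bc_loop <- matrix(0, nrow(nn), max(nn))
for (i in seq_len(nrow(nn))) {
  bc_loop[i, nn[i, ]] <- 1
}

# Matrix-index version: as.vector() unrolls nn column by column,
# so the row indices 1..nrow(nn) must repeat once per column of nn
bc_vec <- matrix(0, nrow(nn), max(nn))
row_index <- rep(seq_len(nrow(nn)), times = ncol(nn))
col_index <- as.vector(nn)
bc_vec[cbind(row_index, col_index)] <- 1

identical(bc_loop, bc_vec)
```

The pairing works because rep(..., times = ncol(nn)) and as.vector(nn) both follow R's column-major order.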
