Find frequency of vector elements in a matrix - r

I have a matrix in R, here is a small example:
set.seed(1)
n.columns<-6
mat <- matrix(, nrow = 5, ncol = n.columns)
for(column in 1:n.columns){
mat[, column] <- sample(1:10,5)
}
mat
The matrix looks like this:
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 3 9 3 5 10 4
[2,] 4 10 2 7 2 1
[3,] 5 6 6 8 6 10
[4,] 7 5 10 3 1 7
[5,] 2 1 5 10 9 3
I also have a vector v of integers, v<-c(1,3,6), whose elements could theoretically appear in the matrix mat above.
What I am looking for is an overview of the number of times that each element in v appears in mat per column. For the current example this overview is
1: 0 1 0 0 1 1
3: 1 0 1 1 0 1
6: 0 1 1 0 1 0
It is fairly straightforward to do this using for-loops and if-statements, but this solution is not very pretty.
Is there a professional way to do this?

One option using sapply:
t(sapply(v, function(a) colSums(mat==a)))
# [,1] [,2] [,3] [,4] [,5] [,6]
#[1,] 0 1 0 0 1 1
#[2,] 1 0 1 1 0 1
#[3,] 0 1 1 0 1 0

Using table:
table(mat[mat %in% v], col(mat)[mat %in% v])
# 1 2 3 4 5 6
# 1 0 1 0 0 1 1
# 3 1 0 1 1 0 1
# 6 0 1 1 0 1 0
A drawback is that a column containing none of the values in v is dropped from the table, and likewise a value of v that never occurs in mat gets no row.
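One possible way around this, sketched here on a small made-up matrix (the data and the names keep and counts are illustrative, not from the question), is to convert both arguments of table to factors with explicit levels, so that empty rows and columns are kept as zeros:

```r
# Toy data: column 2 contains none of the values in v, column 3 contains 3 twice
mat <- matrix(c(1, 3, 6,
                2, 4, 5,
                3, 3, 7),
              nrow = 3)
v <- c(1, 3, 6)

keep <- mat %in% v   # logical vector, one entry per cell of mat

# Explicit factor levels force one row per element of v and
# one column per column of mat, even when the count is zero
counts <- table(factor(mat[keep], levels = v),
                factor(col(mat)[keep], levels = seq_len(ncol(mat))))
counts
```

The result should agree with the colSums approach above while keeping the value labels as row names.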

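Using sapply on a data.frame, which iterates over its columns. Note that the result is transposed: one row per column of mat and one column per element of v.
setNames(object = as.data.frame(sapply(v, function(a)
sapply(as.data.frame(mat), function(b)
sum(b == a)))), nm = v) # sum(b == a) counts occurrences; sum(a %in% b) would only flag presence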
# 1 3 6
#V1 0 1 0
#V2 1 0 1
#V3 0 1 1
#V4 0 1 0
#V5 1 0 1
#V6 1 1 0

Related

How to find rows that sum up to certain values of colSums and rowSums?

My task is to randomly generate 8 rows of 12 columns each, filled with 0s and 1s, such that each row sum equals 6 and each column sum equals 4.
So I create all possible combinations of 0 and 1 within 12 variables:
df <- expand.grid(0:1, 0:1, 0:1, 0:1, 0:1, 0:1,
0:1, 0:1, 0:1, 0:1, 0:1, 0:1)
Restrict the combinations to those whose row sum equals 6:
df <- df[rowSums(df)==6,]
Then I shuffle it:
shuffled <- df[sample(nrow(df)),]
and finally I'd like to pick 8 rows from shuffled data. All these 8 rows must have column sums that equal 4 and row sums equal 6:
colSums(picked_shuffled)
[1] 4 4 4 4 4 4 4 4 4 4 4 4
rowSums(picked_shuffled)
[1] 6 6 6 6 6 6 6 6
How to do it?
Doing it by trial and error will take you a very long time! An alternative is to construct a matrix that works and then shuffle it...
rows <- rep(1:8, 6) #48 row positions for the 1s - 6 of each
columns <- rep(1:12, each = 4) #48 column positions for the 1s - 4 of each
mat <- matrix(0, nrow = 8, ncol = 12) #blank matrix of 0s
mat[cbind(rows, columns)] <- 1 #set selected values to 1
mat <- mat[sample(1:8), sample(1:12)] #shuffle rows and columns
mat
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
[1,] 1 0 0 0 1 1 0 0 1 1 0 1
[2,] 0 1 1 1 0 0 1 1 0 0 1 0
[3,] 0 1 1 1 0 0 1 1 0 0 1 0
[4,] 1 0 0 0 1 1 0 0 1 1 0 1
[5,] 1 0 0 0 1 1 0 0 1 1 0 1
[6,] 0 1 1 1 0 0 1 1 0 0 1 0
[7,] 1 0 0 0 1 1 0 0 1 1 0 1
[8,] 0 1 1 1 0 0 1 1 0 0 1 0
I don't know if it is possible to produce a more "random" distribution than this - there are still only two types of column and two types of row however you shuffle it!
By the way these operations are usually much faster on matrices than dataframes - you can always convert it at the end.
A more random solution...
After a bit of thought, it is possible to get a more "random" solution with the method above by shuffling the column positions until there are no duplicated row-column pairs (which seems to be quite fast). So a modified version...
rows <- rep(1:8, 6)
columns <- sample(rep(1:12, 4))
while(any(duplicated(cbind(rows, columns)))){
columns <- sample(columns)
}
mat <- matrix(0, nrow = 8, ncol = 12)
mat[cbind(rows, columns)] <- 1
mat
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
[1,] 0 0 1 0 1 1 0 1 0 1 1 0
[2,] 1 1 0 1 0 1 1 0 0 0 1 0
[3,] 1 1 1 0 0 0 1 0 0 1 0 1
[4,] 0 1 0 1 0 1 0 1 1 1 0 0
[5,] 0 1 0 0 1 0 1 0 1 0 1 1
[6,] 0 0 1 0 1 0 1 1 1 0 0 1
[7,] 1 0 1 1 1 1 0 0 0 1 0 0
[8,] 1 0 0 1 0 0 0 1 1 0 1 1
rowSums(mat)
[1] 6 6 6 6 6 6 6 6
colSums(mat)
[1] 4 4 4 4 4 4 4 4 4 4 4 4
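The same reshuffle-until-valid idea can be wrapped in a function for other dimensions. This is only a sketch: the function name random_binary_matrix and its arguments are made up here, and it assumes that n_rows * row_sum is divisible by n_cols so that a common column sum exists.

```r
# Hypothetical helper: random 0/1 matrix with fixed row sums and equal column sums
random_binary_matrix <- function(n_rows = 8, n_cols = 12, row_sum = 6) {
  total <- n_rows * row_sum            # total number of 1s to place
  stopifnot(total %% n_cols == 0)      # assumption: columns can share them equally
  col_sum <- total %/% n_cols

  rows <- rep(seq_len(n_rows), row_sum)              # each row index row_sum times
  columns <- sample(rep(seq_len(n_cols), col_sum))   # each column index col_sum times

  # Reshuffle until no (row, column) pair occurs twice
  while (any(duplicated(cbind(rows, columns)))) {
    columns <- sample(columns)
  }

  mat <- matrix(0, nrow = n_rows, ncol = n_cols)
  mat[cbind(rows, columns)] <- 1
  mat
}

m <- random_binary_matrix(4, 6, 3)
rowSums(m)   # every entry should be 3
colSums(m)   # every entry should be 2
```

Because this is rejection sampling, the number of reshuffles grows as the matrix gets denser or larger, so it may be slow for big inputs.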
I have got a less clean but more random solution to the problem than Andrew's. It randomly places 1s in the initially empty grid until the conditions are satisfied. When the only open cell is already occupied, it removes about 20% of the previous hits to avoid getting stuck. When it runs for too many iterations, it resets the grid.
I simulated it and it usually takes about 40-80 iterations to fill the grid according to your specifications. In rare cases, it takes up to 160.
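grid <- matrix(0, nrow = 8, ncol = 12)
finished <- FALSE
count <- 0
while (!finished) {
  # rows and columns that can still take another 1
  openrows <- c(1:8)[rowSums(grid) < 6]
  opencols <- c(1:12)[colSums(grid) < 4]
  if (length(openrows) > 0 && length(opencols) > 0) {
    # stuck: the only open cell is already 1, so clear about 20% of the 1s
    if (length(openrows) == 1 && length(opencols) == 1 && grid[openrows[1], opencols[1]] == 1) {
      grid[grid == 1 & runif(length(grid), 0, 1) > 0.8] <- 0
    }
    # pick a random open row and open column and set that cell to 1
    i <- as.integer(runif(1, 0, length(openrows))) + 1
    j <- as.integer(runif(1, 0, length(opencols))) + 1
    grid[openrows[i], opencols[j]] <- 1
  } else {
    finished <- TRUE
  }
  count <- count + 1
  # too many iterations: start over with an empty grid
  if (count > 500) {
    grid <- matrix(0, nrow = 8, ncol = 12)
    count <- 0
  }
}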
It's not very efficient (for large tables) but it works and gives you random data.
That was quite the brain teaser, tbh.

Changing the values in a binary matrix

Consider the 8 by 6 binary matrix, M:
M <- matrix(c(0,0,1,1,0,0,1,1,
0,1,1,0,0,1,1,0,
0,0,0,0,1,1,1,1,
0,1,0,1,1,0,1,0,
0,0,1,1,1,1,0,0,
0,1,1,0,1,0,0,1),nrow = 8,ncol = 6)
Here is M:
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 0 0 0 0 0 0
[2,] 0 1 0 1 0 1
[3,] 1 1 0 0 1 1
[4,] 1 0 0 1 1 0
[5,] 0 0 1 1 1 1
[6,] 0 1 1 0 1 0
[7,] 1 1 1 1 0 0
[8,] 1 0 1 0 0 1
The following matrix contains, for each column of M, the row indices of the 1s in that column:
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 3 2 5 2 3 2
[2,] 4 3 6 4 4 3
[3,] 7 6 7 5 5 5
[4,] 8 7 8 7 6 8
Let's call it ind:
ind <- matrix(c(3,4,7,8,
2,3,6,7,
5,6,7,8,
2,4,5,7,
3,4,5,6,
2,3,5,8),nrow = 4, ncol=6)
I'm trying to change a single 1 into 0 in each column of M.
For example, one possible choice of the 1s to change, given as one row index per column, would be (4,2,5,4,3,2), i.e. row 4 of column 1, row 2 of column 2, row 5 of column 3 and so on. Let N be the resulting matrix. This choice produces the following matrix N:
N <- matrix(c(0,0,1,0,0,0,1,1,
0,0,1,0,0,1,1,0,
0,0,0,0,0,1,1,1,
0,1,0,0,1,0,1,0,
0,0,0,1,1,1,0,0,
0,0,1,0,1,0,0,1),nrow = 8,ncol = 6)
Here is N:
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 0 0 0 0 0 0
[2,] 0 0 0 1 0 0
[3,] 1 1 0 0 0 1
[4,] 0 0 0 0 1 0
[5,] 0 0 0 1 1 1
[6,] 0 1 1 0 1 0
[7,] 1 1 1 1 0 0
[8,] 1 0 1 0 0 1
For each of the possible resulting matrices N, I do the following calculation:
X <- cbind(c(rep(1,nrow(N))),N)
ans <- sum(diag(solve(t(X)%*%X)[-1,-1]))
Then I want to obtain the matrix N which produces the smallest value of ans. How do I do this efficiently?
Let me know if this works.
We first build a conversion function that I'll need, and we build also the reverse function as you may need it at some point:
ind_to_M <- function(ind){
M <- matrix(rep(0,6*8),ncol=6)
for(i in 1:ncol(ind)){M[ind[,i],i] <- 1}
return(M)
}
M_to_ind <- function(M){apply(M==1,2,which)}
Then we will build a matrix of possible ways to ditch a value
all_possible_ways_to_ditch_value <- 1:4
for (i in 2:ncol(M)){
all_possible_ways_to_ditch_value <- merge(all_possible_ways_to_ditch_value,1:4,by=NULL)
}
# there's probably a more elegant way to do that
head(all_possible_ways_to_ditch_value)
# x y.x y.y y.x y.y y
# 1 1 1 1 1 1 1 # will be used to ditch the 1st value of ind for every column
# 2 2 1 1 1 1 1
# 3 3 1 1 1 1 1
# 4 4 1 1 1 1 1
# 5 1 2 1 1 1 1
# 6 2 2 1 1 1 1
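As for a more elegant way: one option, sketched here, is to build the same grid with expand.grid from a list of six copies of 1:4. The name ways is made up for this example, and the column names differ from the merge version (Var1 to Var6), which does not matter when indexing by position.

```r
# All 4^6 ways to pick one position (1 to 4) in each of the 6 columns
ways <- expand.grid(rep(list(1:4), 6))

dim(ways)    # 4096 rows, 6 columns
head(ways)   # the first column varies fastest, as in the merge-based version
```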
Then we iterate through those, each time storing ans and N (as data is quite small overall).
ans_list <- list()
N_list <- list()
for(j in 1:nrow(all_possible_ways_to_ditch_value)){
#print(j)
ind_N <- matrix(rep(0,6*3),ncol=6) # initiate ind_N as an empty matrix
for(i in 1:ncol(M)){
ind_N[,i] <- ind[-all_possible_ways_to_ditch_value[j,i],i] # fill with ind except for the value we ditch
}
N <- ind_to_M(ind_N)
X <- cbind(c(rep(1,nrow(N))),N)
ans_list[[j]] <- try(sum(diag(solve(t(X)%*%X)[-1,-1])),silent=TRUE) # some systems are not well defined, we'll just ignore the errors
N_list[[j]] <- N
}
We finally retrieve the minimal ans and the relevant N
ans <- ans_list[[which.min(ans_list)]]
# [1] -3.60288e+15
N <- N_list[[which.min(ans_list)]]
# [,1] [,2] [,3] [,4] [,5] [,6]
# [1,] 0 0 0 0 0 0
# [2,] 0 1 0 1 0 1
# [3,] 1 1 0 0 1 1
# [4,] 1 0 0 1 1 0
# [5,] 0 0 1 1 1 1
# [6,] 0 1 1 0 0 0
# [7,] 1 0 1 0 0 0
# [8,] 0 0 0 0 0 0
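A value such as -3.6e+15 looks like a numerical artifact: t(X) %*% X is positive semidefinite, so the diagonal of a genuine inverse cannot be negative, and a result like this suggests that solve() was applied to a matrix that is singular up to rounding. One way to guard against that, sketched below, is to check the rank of X first and treat rank-deficient candidates as Inf. The function name criterion and the rank check via qr() are additions for illustration, not part of the answer above.

```r
# Hypothetical helper: the quantity "ans" for a candidate N, or Inf if X is rank deficient
criterion <- function(N) {
  X <- cbind(1, N)                    # prepend the column of 1s
  if (qr(X)$rank < ncol(X)) {
    return(Inf)                       # t(X) %*% X is not invertible
  }
  sum(diag(solve(crossprod(X)))[-1])  # same as sum(diag(solve(t(X) %*% X)[-1, -1]))
}

# A full-rank toy example and a degenerate one
N_ok  <- rbind(diag(6), matrix(0, 2, 6))
N_bad <- matrix(0, 8, 6)
criterion(N_ok)    # finite and positive
criterion(N_bad)   # Inf
```

In the loop above, ans_list[[j]] <- criterion(N) could then replace the try() call, so that which.min only ever sees positive numbers or Inf.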
EDIT:
To get the minimal positive ans (failed solves are set to Inf, and the value closest to zero in absolute terms is picked):
ans_list[which(!sapply(ans_list,is.numeric))] <- Inf
ans <- ans_list[[which.min(abs(unlist(ans_list)))]]
# [1] 3.3
N <- N_list[[which.min(abs(unlist(ans_list)))]]
# [,1] [,2] [,3] [,4] [,5] [,6]
# [1,] 0 0 0 0 0 0
# [2,] 0 1 0 1 0 0
# [3,] 1 1 0 0 0 1
# [4,] 1 0 0 0 1 0
# [5,] 0 0 0 1 1 1
# [6,] 0 1 1 0 1 0
# [7,] 1 0 1 1 0 0
# [8,] 0 0 1 0 0 1
EDIT 2 : to generalize the number of rows of ind to ditch
It seems to give the same result for ans for n_ditch = 1, and results make sense for n_ditch = 2
n_ditch <- 2
ditch_possibilities <- combn(1:4,n_ditch) # these are all the possible sets of indices to ditch for one given column
all_possible_ways_to_ditch_value <- 1:ncol(ditch_possibilities) # this will be all the possible sets of indices of ditch_possibilities to test
for (i in 2:ncol(M)){
all_possible_ways_to_ditch_value <- merge(all_possible_ways_to_ditch_value,1:ncol(ditch_possibilities),by=NULL)
}
ans_list <- list()
N_list <- list()
for(j in 1:nrow(all_possible_ways_to_ditch_value)){
#print(j)
ind_N <- matrix(rep(0,6*(4-n_ditch)),ncol=6) # initiate ind_N as an empty matrix
for(i in 1:ncol(M)){
ind_N[,i] <- ind[-ditch_possibilities[,all_possible_ways_to_ditch_value[j,i]],i] # fill with ind except for the value we ditch
}
N <- ind_to_M(ind_N)
X <- cbind(c(rep(1,nrow(N))),N)
ans_list[[j]] <- try(sum(diag(solve(t(X)%*%X)[-1,-1])),silent=TRUE) # some systems are not well defined, we'll just ignore the errors
N_list[[j]] <- N
}

maximal number of identical elements between any two columns of a matrix in R

I just was wondering if there was an easy way to compute the maximal number of identical elements between any two columns of a matrix in R.
For example, I have a matrix
test <- replicate(10, sample((0:3), 10, replace = TRUE))
test
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 3 0 1 0 2 2 1 0 2 0
[2,] 1 1 3 2 0 2 3 0 2 2
[3,] 2 3 0 0 1 2 0 3 0 2
[4,] 2 2 1 1 2 0 0 1 1 0
[5,] 2 0 1 2 0 1 1 1 0 0
[6,] 1 0 1 3 2 3 3 1 3 2
[7,] 0 1 3 2 1 0 1 2 1 1
[8,] 0 3 1 3 0 2 3 1 1 1
[9,] 2 3 1 3 0 1 0 1 3 2
[10,] 3 2 1 0 2 1 3 2 3 1
To compare column 1 and 2 I use
table(test[,1] == test[,2])
FALSE TRUE
8 2
So there are two identical elements between these two columns.
I could now repeat this for all pairs of columns using two nested for loops and then find the maximum number of TRUE calls but this does not look nice. Can anyone think of a better way?
Cheers,
Maik
It is always interesting to see a reasonable answer being voted down. Though I don't like this minus score, I would keep my answer. Voter, what do you think?
Let's first get some reproducible toy data:
set.seed(0); x <- replicate(10, sample((0:3), 10, replace = TRUE))
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
# [1,] 3 0 3 1 1 2 1 3 3 0
# [2,] 1 0 3 1 3 1 3 1 1 0
# [3,] 1 0 0 2 2 3 1 3 2 0
# [4,] 2 2 2 1 3 1 1 1 1 2
# [5,] 3 1 0 0 2 0 1 1 1 3
# [6,] 0 3 1 3 2 0 2 1 3 3
# [7,] 3 1 1 2 3 0 1 3 0 3
# [8,] 3 2 0 3 0 1 1 3 2 1
# [9,] 2 3 1 0 1 2 3 1 0 1
#[10,] 2 1 3 2 2 2 0 3 0 3
For any input matrix x, you can use:
y <- unlist(lapply(seq_len(ncol(x)-1L),
function(i) colSums(x[, (i+1):ncol(x), drop = FALSE] == x[, i])))
# [1] 1 2 3 2 4 1 4 2 3 3 1 0 0 3 1 3 5 1 3 1 2 4 1 4 3 4 2 3 5 1 1 3 2 1 2 2 3 3
#[39] 1 2 3 1 4 3 1
max(y)
# [1] 5
The comment by @David is doing essentially the same thing but way slower:
y <- combn(ncol(x), 2, FUN = function(u) sum(x[, u[1]] == x[, u[2]]))
# [1] 1 2 3 2 4 1 4 2 3 3 1 0 0 3 1 3 5 1 3 1 2 4 1 4 3 4 2 3 5 1 1 3 2 1 2 2 3 3
#[39] 1 2 3 1 4 3 1
max(y)
# [1] 5
Benchmarking
We generate a 10 * 1000 matrix for experiment:
set.seed(0); x <- replicate(1e+3, sample((0:3), 10, replace = TRUE))
system.time(unlist(lapply(seq_len(ncol(x)-1L), function(i) colSums(x[, (i+1):ncol(x), drop = FALSE] == x[, i]))))
# user system elapsed
# 0.176 0.032 0.207
system.time(combn(ncol(x), 2, FUN = function(u) sum(x[, u[1]] == x[, u[2]])))
# user system elapsed
# 4.692 0.008 4.708
Something like a distance matrix?
With this idea, you could also generate a "distance" matrix for number of non-equal elements between all columns (just replace the == with !=):
y <- unlist(lapply(seq_len(ncol(x)-1L),
function(i) colSums(x[, (i+1):ncol(x), drop = FALSE] != x[, i])))
z <- matrix(0L, ncol(x), ncol(x))
z[lower.tri(z)] <- y
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
# [1,] 0 0 0 0 0 0 0 0 0 0
# [2,] 9 0 0 0 0 0 0 0 0 0
# [3,] 8 7 0 0 0 0 0 0 0 0
# [4,] 7 9 9 0 0 0 0 0 0 0
# [5,] 8 10 7 7 0 0 0 0 0 0
# [6,] 6 10 9 6 9 0 0 0 0 0
# [7,] 9 7 8 8 7 8 0 0 0 0
# [8,] 6 9 6 7 8 7 8 0 0 0
# [9,] 8 7 9 5 9 7 7 6 0 0
#[10,] 7 5 6 9 8 9 9 7 9 0
Note that only the lower triangle is computed, because the matrix is symmetric. The diagonal entries are all zero (of course).
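If the full symmetric matrix is needed, the lower triangle can be mirrored by adding the transpose. A small sketch on a made-up 3-column matrix (the data and the name d are illustrative only):

```r
# Toy data: columns 1 and 2 differ in one row, as do columns 1 and 3;
# columns 2 and 3 differ in two rows
x <- cbind(c(1, 2, 3), c(1, 2, 0), c(9, 2, 3))

# Number of non-equal elements for every pair of columns (lower triangle only)
y <- unlist(lapply(seq_len(ncol(x) - 1L),
  function(i) colSums(x[, (i + 1):ncol(x), drop = FALSE] != x[, i])))
z <- matrix(0L, ncol(x), ncol(x))
z[lower.tri(z)] <- y

# Mirror the lower triangle into the upper one; the diagonal stays zero
d <- z + t(z)
d
```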
Try:
max(combn(split(test, col(test)), 2, function(x) sum(x[[1]] == x[[2]])))
If you want to know which pair has the greatest number of equal elements it's a little more complicated.
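One way to recover the pair as well, sketched under the assumption that keeping the index pairs from combn next to the counts is acceptable (the function name best_pairs is made up here):

```r
# Hypothetical helper: column pairs with the largest number of equal elements
best_pairs <- function(x) {
  pairs <- combn(ncol(x), 2)   # 2 x choose(ncol, 2) matrix of column indices
  counts <- apply(pairs, 2, function(u) sum(x[, u[1]] == x[, u[2]]))
  list(count = max(counts),
       pairs = pairs[, counts == max(counts), drop = FALSE])
}

# Toy example: columns 1 and 2 agree in two rows, no other pair agrees at all
x <- cbind(c(1, 2, 3), c(1, 2, 0), c(9, 9, 9))
best_pairs(x)
```

Each column of the returned pairs matrix is one pair of column indices; ties are all reported.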

Creating a matrix of multiple counters in R

So, my goal is to take an input vector and build an output matrix of running counters. Every time a value appears in the input, I want to find its counter and increment it by 1. I'm not good at explaining this, so I illustrate a simple version below. I also want to make 2 changes, which I list after the example so that they make sense.
nums = c(1,2,3,4,5,1,2,4,3,5)
unis = unique(nums)
counter = matrix(NA, nrow = length(nums), ncol = length(unis))
colnames(counter) = unis
for (i in 1:length(nums)){
temp = nums[i]
if (i == 1){
counter[1,] = 0
counter[1,temp] = 1
} else {
counter[i,] = counter[i-1,]
counter[i,temp] = counter[i-1,temp]+1
}
}
counter
which outputs
> counter
1 2 3 4 5
[1,] 1 0 0 0 0
[2,] 1 1 0 0 0
[3,] 1 1 1 0 0
[4,] 1 1 1 1 0
[5,] 1 1 1 1 1
[6,] 2 1 1 1 1
[7,] 2 2 1 1 1
[8,] 2 2 1 2 1
[9,] 2 2 2 2 1
[10,] 2 2 2 2 2
The 2 modifications: 1) Since the real data is much larger, I would like to do this using apply, or however people who know R better than I do say it should be done. 2) Here each element of the input vector is a single value; how could this be generalized if an element were a tuple? For example, if the last element of nums were the tuple (4, 5), both counters would be incremented in that step and the last line of the output would be 2,2,2,3,2.
Thanks and if you don't understand please ask questions and I'll try to clarify
Using the Matrix package (which ships with a standard installation of R)
nums <- c(1,2,3,4,5,1,2,4,3,5)
apply(Matrix::sparseMatrix(i=seq_along(nums), j=nums), 2, cumsum)
# [,1] [,2] [,3] [,4] [,5]
# [1,] 1 0 0 0 0
# [2,] 1 1 0 0 0
# [3,] 1 1 1 0 0
# [4,] 1 1 1 1 0
# [5,] 1 1 1 1 1
# [6,] 2 1 1 1 1
# [7,] 2 2 1 1 1
# [8,] 2 2 1 2 1
# [9,] 2 2 2 2 1
# [10,] 2 2 2 2 2
Note that this behaves a bit differently in a couple of ways from thelatemail's suggested solution. Which behavior you prefer will depend on what you are using this for.
Here's a small example that illustrates the differences:
nums <- c(5,2,1,1)
# My suggestion
apply(Matrix::sparseMatrix(i=seq_along(nums), j=nums), 2, cumsum)
# [,1] [,2] [,3] [,4] [,5]
# [1,] 0 0 0 0 1
# [2,] 0 1 0 0 1
# [3,] 1 1 0 0 1
# [4,] 2 1 0 0 1
# thelatemail's suggestion
sapply(unique(nums), function(x) cumsum(nums==x) )
# [,1] [,2] [,3]
# [1,] 1 0 0
# [2,] 1 1 0
# [3,] 1 1 1
# [4,] 1 1 2
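The two differences visible here are the column order (sorted by value versus order of first appearance) and whether values that never occur, such as 3 and 4, get a column of zeros. If the first layout is wanted without the Matrix package, one possible sketch is to loop over seq_len(max(nums)) instead of unique(nums). The variable names are made up, and this assumes nums contains positive integers only.

```r
nums <- c(5, 2, 1, 1)

# One column per distinct value, in order of first appearance (5, 2, 1)
by_appearance <- sapply(unique(nums), function(x) cumsum(nums == x))

# One column per integer from 1 to max(nums), including values that never occur
all_values <- sapply(seq_len(max(nums)), function(x) cumsum(nums == x))

dim(by_appearance)   # 4 x 3
dim(all_values)      # 4 x 5
```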
For your second question, you could do something like this:
nums <- list(1,2,3,4,5,1,2,4,3,c(4,5))
ii <- rep(seq_along(nums), times=lengths(nums)) ## lengths() is in R>=3.2.0
jj <- unlist(nums)
apply(Matrix::sparseMatrix(i=ii, j=jj), 2, cumsum)
# [,1] [,2] [,3] [,4] [,5]
# [1,] 1 0 0 0 0
# [2,] 1 1 0 0 0
# [3,] 1 1 1 0 0
# [4,] 1 1 1 1 0
# [5,] 1 1 1 1 1
# [6,] 2 1 1 1 1
# [7,] 2 2 1 1 1
# [8,] 2 2 1 2 1
# [9,] 2 2 2 2 1
# [10,] 2 2 2 3 2
For your first query, you can get there with something like:
sapply(unique(nums), function(x) cumsum(nums==x) )
# [,1] [,2] [,3] [,4] [,5]
# [1,] 1 0 0 0 0
# [2,] 1 1 0 0 0
# [3,] 1 1 1 0 0
# [4,] 1 1 1 1 0
# [5,] 1 1 1 1 1
# [6,] 2 1 1 1 1
# [7,] 2 2 1 1 1
# [8,] 2 2 1 2 1
# [9,] 2 2 2 2 1
#[10,] 2 2 2 2 2
Another idea:
do.call(rbind, Reduce("+", lapply(nums, tabulate, max(unlist(nums))), accumulate = TRUE))
# [,1] [,2] [,3] [,4] [,5]
# [1,] 1 0 0 0 0
# [2,] 1 1 0 0 0
# [3,] 1 1 1 0 0
# [4,] 1 1 1 1 0
# [5,] 1 1 1 1 1
# [6,] 2 1 1 1 1
# [7,] 2 2 1 1 1
# [8,] 2 2 1 2 1
# [9,] 2 2 2 2 1
#[10,] 2 2 2 2 2
And generally:
x = list(1, 3, 6, c(6, 3), 2, c(4, 6, 1), c(1, 2), 3)
do.call(rbind, Reduce("+", lapply(x, tabulate, max(unlist(x))), accumulate = TRUE))
# [,1] [,2] [,3] [,4] [,5] [,6]
#[1,] 1 0 0 0 0 0
#[2,] 1 0 1 0 0 0
#[3,] 1 0 1 0 0 1
#[4,] 1 0 2 0 0 2
#[5,] 1 1 2 0 0 2
#[6,] 2 1 2 1 0 3
#[7,] 3 2 2 1 0 3
#[8,] 3 2 3 1 0 3

Create list with looping

I have an i times j (i x j) dummy matrix of rating events for companies, with i dates and j different companies. On a day where a rating occurs, rating[i,j]=1, and 0 otherwise.
I want to create a list which contains 4 sublists (1 for each of the 4 companies). Each sublist holds the row numbers of the rating events of that company.
This is my code:
r<-list(); i=1;j=2;
for(j in 1:4){
x<-list()
for(i in 100){
if(rating[i,j]!=0){
x<-c(x,i)
i=i+1
}
else{i=i+1}
}
r[[j]]<-x
j=j+1
}
It is somehow not working, and I really can not figure out where the bug is. The x sublists are always empty. Could somebody help?
Thanks a lot!
Here is an example rating matrix:
rating<-matrix(data = 0, nrow = (100), ncol = 4, dimnames=list(c(1:100), c(1:4)));
rating[3,1]=1;rating[7,1]=1;rating[20,1]=1;rating[75,1]=1;
rating[8,2]=1;rating[40,2]=1;rating[50,2]=1;rating[78,2]=1;
rating[1,3]=1;rating[4,3]=1;rating[17,3]=1;rating[99,3]=1;
rating[10,4]=1;rating[20,4]=1;rating[30,4]=1;rating[90,4]=1;
You may try this:
set.seed(123)
m <- matrix(data = sample(c(0, 1), 16, replace = TRUE), ncol = 4,
dimnames = list(date = 1:4, company = letters[1:4]))
m
# company
# date a b c d
# 1 0 1 1 1
# 2 1 0 0 1
# 3 0 1 1 0
# 4 1 1 0 1
lapply(as.data.frame(m), function(x){
which(x == 1)
})
# $a
# [1] 2 4
#
# $b
# [1] 1 3 4
#
# $c
# [1] 1 3
#
# $d
# [1] 1 2 4
Update
Or more compact (thanks to @flodel!):
lapply(as.data.frame(m == 1), which)
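As for why the loop in the question returns empty sublists: for(i in 100) iterates over the single value 100 rather than over 1:100, and row 100 of the example matrix contains only zeros. The manual i=i+1 and j=j+1 increments are also unnecessary, since for already advances the index. A corrected version of the loop might look like the sketch below, which rebuilds the example rating matrix without dimnames and stores each company's rows as an integer vector rather than a nested list (both are choices made for this example).

```r
# Example rating matrix from the question
rating <- matrix(0, nrow = 100, ncol = 4)
rating[c(3, 7, 20, 75), 1] <- 1
rating[c(8, 40, 50, 78), 2] <- 1
rating[c(1, 4, 17, 99), 3] <- 1
rating[c(10, 20, 30, 90), 4] <- 1

r <- list()
for (j in seq_len(ncol(rating))) {      # one pass per company
  x <- integer(0)
  for (i in seq_len(nrow(rating))) {    # 1:100, not just 100
    if (rating[i, j] != 0) {
      x <- c(x, i)                      # remember the row of this rating event
    }
  }
  r[[j]] <- x
}
r[[1]]   # rows 3, 7, 20 and 75
```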
(Leave for-loops behind.) If rating really is a matrix, or even if it's a data frame, then why not use rowSums:
r <- rowSums(rating) # accomplished the stated task more effectively.
# simple example:
> rating <- matrix( rbinom(100, 1,prob=.5), 10)
> rating
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 1 0 1 0 0 1 1 0 0 1
[2,] 1 0 0 0 0 0 0 1 1 1
[3,] 0 0 1 1 1 1 0 0 1 0
[4,] 1 0 0 0 1 1 0 0 0 1
[5,] 1 1 0 1 1 1 1 0 0 0
[6,] 1 1 1 0 1 1 1 0 1 0
[7,] 0 1 0 1 0 1 1 0 1 0
[8,] 0 1 0 0 1 1 0 1 1 0
[9,] 1 1 1 0 1 1 1 1 0 0
[10,] 0 1 0 0 1 0 0 1 0 1
> rowSums(rating)
[1] 5 4 5 4 6 7 5 5 7 4
> rowSums(as.data.frame(rating))
[1] 5 4 5 4 6 7 5 5 7 4
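If it needs to be a list then just wrap as.list() around it. Note that rowSums gives the number of rating events per date, not the row numbers of each company's events; for those, apply which() to each column as in the answer above.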
