Number of unique values in every row of a matrix in R - r

I have this toy matrix:
m
[,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,] 1 3 1 6 8 8 8
[2,] 2 2 5 7 9 7 4
[3,] 1 2 3 4 5 6 7
[4,] 1 2 3 4 5 6 7
[5,] 1 2 3 4 5 6 7
and I want to keep only the rows with 4 unique elements.
My initial strategy was to use...
apply(m, 1, function(x) unique(x))
[[1]]
[1] 1 3 6 8
[[2]]
[1] 2 5 7 9 4
[[3]]
[1] 1 2 3 4 5 6 7
[[4]]
[1] 1 2 3 4 5 6 7
[[5]]
[1] 1 2 3 4 5 6 7
... which gives the unique elements in each row.
Then it should be as simple as m[length(apply(m, 1, function(x) unique(x)))==4, ].
But it's not so easy: the output of apply(), at least in this case, is a list, so I can't use this trick.
Can you help?
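One way around the list problem, sketched here with the toy matrix rebuilt by hand: take length(unique(x)) inside the function passed to apply(), so that every row yields a single number and apply() returns a plain vector instead of a list. The name n_unique is made up for this illustration.

```r
# Rebuild the toy matrix from the question, row by row
m <- matrix(c(1, 3, 1, 6, 8, 8, 8,
              2, 2, 5, 7, 9, 7, 4,
              1, 2, 3, 4, 5, 6, 7,
              1, 2, 3, 4, 5, 6, 7,
              1, 2, 3, 4, 5, 6, 7),
            nrow = 5, byrow = TRUE)

# One count per row, so apply() returns a vector rather than a list
n_unique <- apply(m, 1, function(x) length(unique(x)))

# Keep rows with exactly 4 unique values;
# drop = FALSE keeps the result a matrix even if only one row matches
m[n_unique == 4, , drop = FALSE]
```

With this data only the first row (1 3 1 6 8 8 8) has exactly 4 distinct values, so the result is a 1 x 7 matrix.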

Related

I would like to fill a matrix with values of a list

I have a list of 6 with 10 values in each. I would like to fill a 10x6 (10 rows, 6 columns) matrix with these values. I've tried some things but it's not working. I'm sure there must be an easy way to do it, but I haven't found it yet. Could anyone please help?
Here is some example data:
l = lapply(1:6, rep, 10)
then use ?do.call and cbind to paste the list elements as columns:
do.call(cbind, l)
and you get a matrix:
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 1 2 3 4 5 6
[2,] 1 2 3 4 5 6
[3,] 1 2 3 4 5 6
[4,] 1 2 3 4 5 6
[5,] 1 2 3 4 5 6
[6,] 1 2 3 4 5 6
[7,] 1 2 3 4 5 6
[8,] 1 2 3 4 5 6
[9,] 1 2 3 4 5 6
[10,] 1 2 3 4 5 6
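An alternative sketch, assuming every list element has the same length: flatten the list with unlist() and let matrix() fill the values column by column. The name mat is only for illustration.

```r
# Same example data as above: 6 list elements with 10 values each
l <- lapply(1:6, rep, 10)

# unlist() concatenates the elements in order;
# matrix() then fills column by column, so each element becomes one column
mat <- matrix(unlist(l), nrow = length(l[[1]]), ncol = length(l))

dim(mat)  # 10 6
```

If the elements had different lengths, matrix() would recycle or misalign values, so this is only a reasonable shortcut when the lengths are known to match.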

Find the most frequently occurring element in a matrix in R

Is there any function in R to find the most frequently occurring element in a matrix? I have a matrix containing image pixels, and I want to find which pixel value occurs most frequently in the image. I don't want to use for loops, since iterating over all the pixels of an image would be very time-consuming.
Set up some test data.
> (image = matrix(sample(1:10, 100, replace = TRUE), nrow = 10))
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 4 4 2 7 2 2 3 8 2 5
[2,] 7 3 2 6 6 5 7 8 1 3
[3,] 7 5 7 9 4 9 4 8 2 7
[4,] 5 3 4 2 1 5 9 10 9 5
[5,] 9 10 7 2 7 4 9 1 1 9
[6,] 2 3 5 1 2 8 1 5 9 4
[7,] 5 4 10 5 9 10 1 6 1 10
[8,] 6 3 9 7 1 1 9 2 1 7
[9,] 5 9 4 8 9 9 5 10 5 4
[10,] 10 1 4 7 3 2 3 5 4 5
Do it manually.
> table(image)
image
1 2 3 4 5 6 7 8 9 10
12 12 8 12 15 4 11 5 14 7
Here we can see that the value 5 appeared most often (15 times). To get the same results programmatically:
> which.max(table(image))
5
5
Get the mode (or majority value) in one line of code,
using set.seed to generate the same random matrix each time:
> set.seed(123)
> image = matrix(sample(1:10, 100, replace = TRUE), nrow = 10)
> image
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 3 10 9 10 2 1 7 8 3 2
[2,] 8 5 7 10 5 5 1 7 7 7
[3,] 5 7 7 7 5 8 4 8 5 4
[4,] 9 6 10 8 4 2 3 1 8 7
[5,] 10 2 7 1 2 6 9 5 2 4
[6,] 1 9 8 5 2 3 5 3 5 2
[7,] 6 3 6 8 3 2 9 4 10 8
[8,] 9 1 6 3 5 8 9 7 9 1
[9,] 6 4 3 4 3 9 8 4 9 5
[10,] 5 10 2 3 9 4 5 2 2 6
Mode value of the matrix (if there is a tie, it gives the smallest tied value):
> names(which.max(table(image)))
[1] "5"
I do not know any function to do that directly but you can use these functions:
sort(table(as.vector(Matrix)))
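Since which.max() reports only the first maximum, ties are silently dropped. A minimal sketch of keeping every tied value, using a small made-up matrix (the names counts and modes are hypothetical):

```r
# Small made-up matrix in which 2 and 3 both occur twice
image <- matrix(c(1, 2, 2, 3, 3, 4), nrow = 2)

counts <- table(image)

# which.max() stops at the first maximum, so only "2" is reported
names(which.max(counts))

# Comparing every count against the maximum keeps all tied values
modes <- as.numeric(names(counts)[as.vector(counts) == max(counts)])
modes  # 2 3
```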

remove a value and its corresponding value in a table

In a file (1000 columns, 2000 rows), each column is paired with the column next to it, something like:
[,1] [,2] [,3] [,4] [,5] [,6]
3 3 4 4 4 6
6 5 2 2 5 1
9 1 3 5 4 1
2 5 6 4 8 5
6 1 5 2 3 1
I want to remove the values whose corresponding value in the paired column is 1.
The result:
[,1] [,3] [,5]
   3    4    4
   6    2    8
   2    3
        6
        5
To echo what @shellter said, it's both helpful and polite to include what you've tried in the question.
Here's a compact way to accomplish this using split and mapply.
d <- read.table(text='3 3 4 4 4 6
6 5 2 2 5 1
9 1 3 5 4 1
2 5 6 4 8 5
6 1 5 2 3 1', header=FALSE)
cols <- split(as.list(d), rep(1:2, length.out=length(d)))
mapply(function(col1, col2) col1[col2 != 1],
cols[[1]], cols[[2]], SIMPLIFY=FALSE)
# $V1
# [1] 3 6 2
#
# $V3
# [1] 4 2 3 6 5
#
# $V5
# [1] 4 8
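If a rectangular table like the one in the question is wanted rather than a list, the shorter columns can be padded with NA. This is only a sketch: the names odd, kept, n and padded are made up here, and it assumes the value columns sit in the odd positions with their paired columns immediately to the right.

```r
d <- read.table(text = '3 3 4 4 4 6
6 5 2 2 5 1
9 1 3 5 4 1
2 5 6 4 8 5
6 1 5 2 3 1', header = FALSE)

# Positions of the value columns (1, 3, 5); each is paired with the next column
odd <- seq(1, ncol(d), by = 2)

# Keep the values whose partner in the paired column is not 1
kept <- lapply(odd, function(j) d[[j]][d[[j + 1]] != 1])
names(kept) <- names(d)[odd]

# Pad every column with NA up to the length of the longest one
n <- max(lengths(kept))
padded <- sapply(kept, function(v) c(v, rep(NA, n - length(v))))
padded
```

The NA entries mark the removed positions at the bottom of each column; the surviving values keep their original order.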

Random lists with replacement

I would like to create 1000 random lists of 1652 genes from a universe of 44,400 genes.
I decided to sample with replacement, and used the following instruction to create the random lists:
randomMatrix<-replicate(1000, sample(gene_list, 1652, replace = T))
The problem is that within each list a gene can appear more than once. For my study, genes may be repeated between lists but not within a list. How can I prevent genes from repeating within a single list?
Thanks in advance
It should work with replace = FALSE:
randomMatrix<-replicate(1000, sample(gene_list, 1652, replace = FALSE))
This, of course, requires at least 1652 unique values in gene_list.
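A quick way to confirm this, sketched with a made-up gene_list (the gene names below are placeholders, not real data): each column of the result is one list, and anyDuplicated() returns 0 for a column without repeats.

```r
set.seed(1)

# Hypothetical universe of 44400 gene identifiers
gene_list <- paste0("gene", seq_len(44400))

# Each column is one random list of 1652 genes, sampled without replacement
randomMatrix <- replicate(1000, sample(gene_list, 1652, replace = FALSE))
dim(randomMatrix)  # 1652 1000

# anyDuplicated() returns 0 when a column contains no repeated gene
has_repeats <- any(apply(randomMatrix, 2, anyDuplicated) > 0)
has_repeats  # FALSE
```

Genes can still recur across columns, because every call to sample() draws from the full gene_list again.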
A reproducible example would help illustrate your problem. Since you didn't give one, I'll just assume a list and make some replications:
List <- list(c(2,1,3,4,5,6), c(1,4,5,7,0,6), c(2,4,7,9,3,1))
set.seed(001)
replicate(3, lapply(List, sample, 7, replace=TRUE), simplify = FALSE)
which produces
[[1]]
[[1]][[1]]
[1] 1 3 4 6 1 6 6
[[1]][[2]]
[1] 7 7 1 4 4 0 5
[[1]][[3]]
[1] 3 7 3 1 7 3 1
[[2]]
[[2]][[1]]
[1] 1 4 2 1 3 2 3
[[2]][[2]]
[1] 6 5 5 7 5 4 0
[[2]][[3]]
[1] 3 3 2 3 7 3 9
[[3]]
[[3]][[1]]
[1] 5 4 4 5 2 3 5
[[3]][[2]]
[1] 0 5 6 5 4 1 1
[[3]][[3]]
[1] 4 9 9 7 1 4 7
Note that this approach gives you resampled data (with replacement) for each element of your original list; that's why each replication is a list of three elements.
If you write sapply instead of lapply inside replicate(...), the resulting output is nicer:
set.seed(001)
replicate(3, sapply(List, sample, 7, replace=TRUE), simplify = FALSE)
[[1]]
[,1] [,2] [,3]
[1,] 1 7 3
[2,] 3 7 7
[3,] 4 1 3
[4,] 6 4 1
[5,] 1 4 7
[6,] 6 0 3
[7,] 6 5 1
[[2]]
[,1] [,2] [,3]
[1,] 1 6 3
[2,] 4 5 3
[3,] 2 5 2
[4,] 1 7 3
[5,] 3 5 7
[6,] 2 4 3
[7,] 3 0 9
[[3]]
[,1] [,2] [,3]
[1,] 5 0 4
[2,] 4 5 9
[3,] 4 6 9
[4,] 5 5 7
[5,] 2 4 1
[6,] 3 1 4
[7,] 5 1 7

matrix flow: to the right instead of downwards?

I'm not sure what this is called, but the default 'flow' of matrices is downwards (as seen below):
matrix(1,7,5)*(1:7)
1 1 1 1 1
2 2 2 2 2
3 3 3 3 3
4 4 4 4 4
5 5 5 5 5
6 6 6 6 6
7 7 7 7 7
What if you want to multiply by the vector to the right instead of downwards? Is there a better way to write the command below? Is there a toggle for column instead of row? The same goes for replicate(7,1:7), which also assumes a downward flow (it pastes row vectors downwards instead of column vectors to the right). Is transpose the solution?
t(t(matrix(1,7,5))*(1:5))
1 2 3 4 5
1 2 3 4 5
1 2 3 4 5
1 2 3 4 5
1 2 3 4 5
1 2 3 4 5
1 2 3 4 5
If you really want to do this a lot after defining the matrix you can always make an operator yourself:
'%mat%'<- function(x,y)t(t(x)*y)
matrix(1,7,5)%mat%1:5
[,1] [,2] [,3] [,4] [,5]
[1,] 1 2 3 4 5
[2,] 1 2 3 4 5
[3,] 1 2 3 4 5
[4,] 1 2 3 4 5
[5,] 1 2 3 4 5
[6,] 1 2 3 4 5
[7,] 1 2 3 4 5
But I think it is easier to just transpose twice, as you said in the question:
t(t(matrix(1,7,5))*1:5)
Or, of course, transpose the matrix once at the beginning, do everything you need to do with it, and then transpose it back.
As far as I know there is no way to change the default behaviour of *, nor would you probably want to.
A matrix is simply a vector with a dim attribute. The elements of the matrix are stored in the vector in column-major order and there is no way to change this. * is an element-by-element operator that recycles its arguments as necessary. You can see the recycling rule at work via:
> x <- matrix(1,7,5)
> x*1:5
[,1] [,2] [,3] [,4] [,5]
[1,] 1 3 5 2 4
[2,] 2 4 1 3 5
[3,] 3 5 2 4 1
[4,] 4 1 3 5 2
[5,] 5 2 4 1 3
[6,] 1 3 5 2 4
[7,] 2 4 1 3 5
You can see the multiplication is taking place by column and the vector (1:5) is being recycled to be the same length as the matrix. Rather than transposing, you could use the matrix function to refill your matrix by row (this works here because every element of x is 1, so only the recycled 1:5 pattern gets rearranged).
> matrix(x*1:5,nrow(x),ncol(x),byrow=TRUE)
[,1] [,2] [,3] [,4] [,5]
[1,] 1 2 3 4 5
[2,] 1 2 3 4 5
[3,] 1 2 3 4 5
[4,] 1 2 3 4 5
[5,] 1 2 3 4 5
[6,] 1 2 3 4 5
[7,] 1 2 3 4 5
I'm not sure that's the most efficient solution, but it's the best I can think of at the moment and it's slightly faster than using t twice.
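For a matrix whose entries are not all the same, sweep() is another option. A minimal sketch with a made-up 7 x 5 matrix (the names by_sweep, by_transpose, by_rep and refilled are only for illustration):

```r
# Made-up matrix with distinct entries, so any reordering would show up
x <- matrix(1:35, 7, 5)

# Multiply column j by the j-th element of 1:5
by_sweep <- sweep(x, 2, 1:5, "*")

# Same result via the double transpose from the question
by_transpose <- t(t(x) * 1:5)

# Same result by expanding the vector so it lines up with column-major storage
by_rep <- x * rep(1:5, each = nrow(x))

all(by_sweep == by_transpose)  # TRUE
all(by_sweep == by_rep)        # TRUE

# Refilling by row also reorders the entries of x, so it differs here
refilled <- matrix(x * 1:5, nrow(x), ncol(x), byrow = TRUE)
identical(refilled, by_sweep)  # FALSE
```

No timing claims are made here; the point is only that sweep() and the rep(..., each = nrow(x)) form stay correct when x holds arbitrary values.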
Do you mean this?
> matrix(rep(1:7,5), nrow=7, ncol=5)
[,1] [,2] [,3] [,4] [,5]
[1,] 1 1 1 1 1
[2,] 2 2 2 2 2
[3,] 3 3 3 3 3
[4,] 4 4 4 4 4
[5,] 5 5 5 5 5
[6,] 6 6 6 6 6
[7,] 7 7 7 7 7
> matrix(rep(1:7,5), nrow=5, ncol=7, byrow=TRUE)
[,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,] 1 2 3 4 5 6 7
[2,] 1 2 3 4 5 6 7
[3,] 1 2 3 4 5 6 7
[4,] 1 2 3 4 5 6 7
[5,] 1 2 3 4 5 6 7
