How can I separate a matrix into smaller ones in R?

I have the following matrix
2 4 1
6 32 1
4 2 1
5 3 2
4 2 2
I want to make the following two matrices based on the 3rd column
first
2 4
6 32
4 2
second
5 3
4 2
This is the best I can come up with, but I get an error:
x <- cbind(mat[,1], mat[,2]) if mat[,3]=1
y <- cbind(mat[,1], mat[,2]) if mat[,3]=2

If mat is your matrix:
mat <- matrix(1:15,ncol=3)
mat[,3] <- c(1,1,1,2,2)
> mat
[,1] [,2] [,3]
[1,] 1 6 1
[2,] 2 7 1
[3,] 3 8 1
[4,] 4 9 2
[5,] 5 10 2
Then you can use split:
> lapply( split( mat[,1:2], mat[,3] ), matrix, ncol=2)
$`1`
[,1] [,2]
[1,] 1 6
[2,] 2 7
[3,] 3 8
$`2`
[,1] [,2]
[1,] 4 9
[2,] 5 10
The lapply of matrix is necessary because split drops the attributes that make a vector a matrix, so you need to add them back in.
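To see what that means in practice, here is a small sketch using the same toy matrix. The names pieces and restored are just illustrative; the point is that split() hands back plain vectors, and matrix(..., ncol = 2) rebuilds the shape.

```r
# Same toy matrix as in the answer above
mat <- matrix(1:15, ncol = 3)
mat[, 3] <- c(1, 1, 1, 2, 2)

# split() treats the 5x2 submatrix as a plain vector of length 10;
# the grouping vector (length 5) is recycled across both columns
pieces <- split(mat[, 1:2], mat[, 3])
is.matrix(pieces[[1]])   # FALSE: the dim attribute is gone
pieces[[1]]              # 1 2 3 6 7 8

# Rebuilding each piece with ncol = 2 restores the matrix shape
restored <- lapply(pieces, matrix, ncol = 2)
dim(restored[[1]])       # 3 2
```

This only works because the recycled grouping vector lines up with the column-major layout, so each group's values come out one column after another.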

Yet another example:
#test data
mat <- matrix(1:15,ncol=3)
mat[,3] <- c(1,1,1,2,2)
#make a list storing a matrix for each id as components
result <- lapply(by(mat,mat[,3],identity),as.matrix)
Final product:
> result
$`1`
V1 V2 V3
1 1 6 1
2 2 7 1
3 3 8 1
$`2`
V1 V2 V3
4 4 9 2
5 5 10 2

If you have a matrix A, this will get the first two columns when the third column is 1:
A[A[,3] == 1,c(1,2)]
You can use this to obtain matrices for any value in the third column.
Explanation: A[,3] == 1 returns a vector of booleans, where the i-th position is TRUE if A[i,3] is 1. This vector of booleans can be used to index into a matrix to extract the rows we want.
Disclaimer: I have very little experience with R, this is the MATLAB-ish way to do it.
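A minimal sketch of the same idea applied to every value in the third column, assuming the toy matrix used in the other answers. The names keep, first and groups are illustrative. Adding drop = FALSE is a precaution: without it, a group with a single row would come back as a plain vector rather than a one-row matrix.

```r
A <- matrix(1:15, ncol = 3)
A[, 3] <- c(1, 1, 1, 2, 2)

# Logical vector marking the rows whose third column equals 1
keep <- A[, 3] == 1                    # TRUE TRUE TRUE FALSE FALSE
first <- A[keep, 1:2, drop = FALSE]    # 3x2 matrix

# One matrix per distinct value in the third column
groups <- lapply(sort(unique(A[, 3])),
                 function(v) A[A[, 3] == v, 1:2, drop = FALSE])
```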

split.data.frame can also be used to split a matrix.
mat <- matrix(1:15,ncol=3)
mat[,3] <- c(1,1,1,2,2)
x <- split.data.frame(mat[,-3], mat[,3])
x
#$`1`
# [,1] [,2]
#[1,] 1 6
#[2,] 2 7
#[3,] 3 8
#
#$`2`
# [,1] [,2]
#[1,] 4 9
#[2,] 5 10
str(x)
#List of 2
# $ 1: num [1:3, 1:2] 1 2 3 6 7 8
# $ 2: num [1:2, 1:2] 4 5 9 10
Or split the row index and use it in lapply to subset.
lapply(split(seq_along(mat[,3]), mat[,3]), \(i) mat[i, -3, drop=FALSE])
#$`1`
# [,1] [,2]
#[1,] 1 6
#[2,] 2 7
#[3,] 3 8
#
#$`2`
# [,1] [,2]
#[1,] 4 9
#[2,] 5 10

This is a functional version of pedrosorio's idea:
getthird <- function(mat, idx) mat[mat[,3]==idx, 1:2, drop=FALSE] #drop=FALSE keeps one-row groups as matrices
lapply(unique(mat[,3]), getthird, mat=mat) #idx gets sent the unique values; lapply always returns a list
#-----------
[[1]]
[,1] [,2]
[1,] 1 6
[2,] 2 7
[3,] 3 8
[[2]]
[,1] [,2]
[1,] 4 9
[2,] 5 10

We can use by or tapply
> by(seq_along(mat[, 3]), mat[, 3], function(k) mat[k, -3])
mat[, 3]: 1
[,1] [,2]
[1,] 1 6
[2,] 2 7
[3,] 3 8
------------------------------------------------------------
mat[, 3]: 2
[,1] [,2]
[1,] 4 9
[2,] 5 10
> tapply(seq_along(mat[, 3]), mat[, 3], function(k) mat[k, -3])
$`1`
[,1] [,2]
[1,] 1 6
[2,] 2 7
[3,] 3 8
$`2`
[,1] [,2]
[1,] 4 9
[2,] 5 10

Related

Minimum of cells in two matrices within a moving kernel

I have two matrices m1 and m2.
m1 <- matrix(1:16, ncol = 4)
m2 <- matrix(16:1, ncol = 4)
# > m1
# [,1] [,2] [,3] [,4]
# [1,] 1 5 9 13
# [2,] 2 6 10 14
# [3,] 3 7 11 15
# [4,] 4 8 12 16
# > m2
# [,1] [,2] [,3] [,4]
# [1,] 16 12 8 4
# [2,] 15 11 7 3
# [3,] 14 10 6 2
# [4,] 13 9 5 1
I want to find the minimum between the two matrices for each cell within a moving kernel of 3x3. The outer margins should be ignored, i.e. they can be filled with NAs and the min function should then have na.rm = TRUE. The result should look like this:
# > m3
# [,1] [,2] [,3] [,4]
# [1,] 1 1 3 3
# [2,] 1 1 2 2
# [3,] 2 2 1 1
# [4,] 3 3 1 1
I have already tried a combination of pmin{base} and runmin{caTools} like this:
pmin(runmin(m1, 3, endrule = "keep"),
runmin(m2, 3, endrule = "keep"))
However, this did not work. Probably due to the fact that
"If x is a matrix than each column will be processed separately."
(from ?runmin)
Is there a package that performs such operations, or can this be done with apply?
Here is a base R approach:
m = pmin(m1, m2)
grid = expand.grid(seq(nrow(m)), seq(ncol(m)))
x = apply(grid, 1, function(u) {
min(m[max(1,u[1]-1):min(nrow(m), u[1]+1), max(1,u[2]-1):min(ncol(m), u[2]+1)])
})
dim(x) = dim(m)
#> x
# [,1] [,2] [,3] [,4]
#[1,] 1 1 3 3
#[2,] 1 1 2 2
#[3,] 2 2 1 1
#[4,] 3 3 1 1
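For comparison, here is a sketch of the NA-padding variant described in the question: pad the element-wise minimum with a border of NA and take min(..., na.rm = TRUE) over each 3x3 window. It uses plain for loops for readability, and the names padded and m3 are only illustrative.

```r
m1 <- matrix(1:16, ncol = 4)
m2 <- matrix(16:1, ncol = 4)
m <- pmin(m1, m2)   # element-wise minimum of the two matrices

# Surround m with a one-cell border of NA
padded <- matrix(NA_integer_, nrow(m) + 2, ncol(m) + 2)
padded[seq_len(nrow(m)) + 1, seq_len(ncol(m)) + 1] <- m

# Cell (i, j) of m sits at (i + 1, j + 1) in padded, so its 3x3
# neighbourhood is rows i..i+2 and columns j..j+2 of padded
m3 <- m
for (i in seq_len(nrow(m))) {
  for (j in seq_len(ncol(m))) {
    m3[i, j] <- min(padded[i:(i + 2), j:(j + 2)], na.rm = TRUE)
  }
}
m3
```

On this input it gives the same matrix as the expand.grid version, which clamps the window to the matrix edges instead of padding.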

Repeatedly Complementary subsets

I want to repeatedly divide a set into two complementary subsets of known size and keep them as the columns of two matrices. For example, assume the main set is {1, 2, ..., 10}, the size of the first sample is 8, and I want to repeat the sampling 3 times. I want to have:
[,1] [,2] [,3]
[1,] 10 9 1
[2,] 8 1 10
[3,] 3 7 5
[4,] 4 2 3
[5,] 1 8 8
[6,] 6 4 2
[7,] 9 5 7
[8,] 5 10 6
and
[,1] [,2] [,3]
[1,] 2 3 4
[2,] 7 6 9
Any idea how to implement it in R avoiding for loops?
I would use replicate + sample, like this:
set.seed(1) # Just so you can replicate my results
A <- replicate(3, sample(10, 8, FALSE)) # Change 3 to the number of replications
A
# [,1] [,2] [,3]
# [1,] 3 7 8
# [2,] 4 1 9
# [3,] 5 2 4
# [4,] 7 8 6
# [5,] 2 5 7
# [6,] 8 10 2
# [7,] 9 4 3
# [8,] 6 6 1
For the other set, I would use apply + setdiff, like this:
B <- apply(A, 2, function(x) setdiff(1:10, x))
B
# [,1] [,2] [,3]
# [1,] 1 3 5
# [2,] 10 9 10
Another option, as suggested by @thelatemail (which would be more efficient), is to use replicate to create a matrix of full permutations, and then use basic subsetting to create your separate matrices.
A <- replicate(3, sample(10))
B <- A[-(seq_len(8)), ]
A <- A[seq_len(8), ]
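A quick sanity check for the subsetting approach, as a sketch: each column of A and the matching column of B should together contain 1..10 exactly once. The names full and complementary are illustrative, and the seed is there only for reproducibility.

```r
set.seed(1)
full <- replicate(3, sample(10))   # each column is a permutation of 1:10
A <- full[seq_len(8), ]            # first 8 rows: the samples of size 8
B <- full[-seq_len(8), ]           # remaining 2 rows: the complements

# For every column, A and B must be disjoint and jointly cover 1:10
complementary <- vapply(
  seq_len(ncol(full)),
  function(j) setequal(c(A[, j], B[, j]), 1:10) &&
    length(intersect(A[, j], B[, j])) == 0,
  logical(1)
)
all(complementary)   # TRUE
```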

Flattening 3-Dimensional data to elongated 2-D in R

I have data with dim 10,5,2 (t,x,y) and I want to convert it to dimensions 10*5,3, i.e. to stack the (x,y) frame of every t on top of each other and append the t value as a third column.
eg:
data[1,,]=
x y
1 2
1 3
data[2,,]=
x y
5 2
1 6
I would like to convert this data to flatten array like this
x y t
1 2 1
1 3 1
5 2 2
1 6 2
Is there already an R function to do this? Otherwise I'd do it by looping over every t slice and adding the rebuilt array at the bottom of the main array.
a <- array(1:8, c(2,2,2))
a[1,,]
# [,1] [,2]
#[1,] 1 5
#[2,] 3 7
a[2,,]
# [,1] [,2]
#[1,] 2 6
#[2,] 4 8
# one row per (t, row) pair: dim(a)[1] * dim(a)[2] rows, ordered by t
m <- matrix(aperm(a, c(2, 1, 3)), nrow = prod(dim(a)[1:2]))
cbind(m, rep(seq_len(dim(a)[1]), each = dim(a)[2]))
# [,1] [,2] [,3]
#[1,] 1 5 1
#[2,] 3 7 1
#[3,] 2 6 2
#[4,] 4 8 2
Here's a different approach:
a <- array(c(1,5,1,1,2,2,3,6), dim = c(2,2,2) )
do.call('rbind', lapply(seq_len(dim(a)[1]), function(x) cbind(a[x,,], t = x)))
t
[1,] 1 2 1
[2,] 1 3 1
[3,] 5 2 2
[4,] 1 6 2
Also, if a is the array:
ft <- ftable(a)
cbind(ft[,1:2], as.numeric(factor(gsub("\\_.*","",row.names(as.matrix(ft))))))
[,1] [,2] [,3]
[1,] 1 2 1
[2,] 1 3 1
[3,] 5 2 2
[4,] 1 6 2
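The 2x2x2 examples above cannot show which dimension is which, so here is a sketch with unequal dimensions, assuming the first dimension is t as in the question. The array contents and the names nt and flat are made up for illustration.

```r
# Hypothetical data: 3 time steps, 2 rows per step, 2 value columns
a <- array(seq_len(3 * 2 * 2), dim = c(3, 2, 2))
nt <- dim(a)[1]   # number of t slices

# Stack the slices a[1,,], a[2,,], ... and tag each with its t value.
# Note: a[k, , ] would collapse to a vector if a slice had a single row.
flat <- do.call(rbind,
                lapply(seq_len(nt), function(k) cbind(a[k, , ], t = k)))
dim(flat)   # 6 3
```

The result has dim(a)[1] * dim(a)[2] rows, with the last column running 1, 1, 2, 2, 3, 3.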

R Create Matrix From an Operation on a "Row" Vector and a "Column" Vector

First create a "row" vector and a "column" vector in R:
> row.vector <- seq(from = 1, length = 4, by = 1)
> col.vector <- {t(seq(from = 1, length = 3, by = 2))}
From that I'd like to create a matrix by, e.g., multiplying each value in the row vector with each value in the column vector, thus creating from just those two vectors:
[,1] [,2] [,3]
[1,] 1 3 5
[2,] 2 6 10
[3,] 3 9 15
[4,] 4 12 20
Can this be done with somehow using apply()? sweep()? ...a for loop?
Thank you for any help!
Simple matrix multiplication will work just fine
row.vector %*% col.vector
# [,1] [,2] [,3]
# [1,] 1 3 5
# [2,] 2 6 10
# [3,] 3 9 15
# [4,] 4 12 20
You'd be better off working with two actual vectors, instead of a vector and a matrix:
outer(row.vector,as.vector(col.vector))
# [,1] [,2] [,3]
#[1,] 1 3 5
#[2,] 2 6 10
#[3,] 3 9 15
#[4,] 4 12 20
Here's a way to get there with apply. Is there a reason why you're not using matrix?
> apply(col.vector, 2, function(x) row.vector * x)
## [,1] [,2] [,3]
## [1,] 1 3 5
## [2,] 2 6 10
## [3,] 3 9 15
## [4,] 4 12 20
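Since the question asks about "an operation" in general, it may help to note that outer takes the function as its third argument, so the same call works for things other than multiplication. A small sketch with two plain vectors (products and sums are illustrative names):

```r
row.vector <- seq(from = 1, length.out = 4, by = 1)   # 1 2 3 4
col.vector <- seq(from = 1, length.out = 3, by = 2)   # 1 3 5

# Default FUN is "*": entry [i, j] is row.vector[i] * col.vector[j]
products <- outer(row.vector, col.vector)

# Any vectorised binary function works, e.g. addition
sums <- outer(row.vector, col.vector, FUN = "+")
sums[4, 3]   # 4 + 5 = 9
```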

In R, using `unique()` with extra conditions to extract submatrices: easy solution without plyr

In R, let M be the matrix
[,1] [,2] [,3]
[1,] 1 2 3
[2,] 1 3 3
[3,] 2 4 5
[4,] 6 7 8
I would like to select the submatrix m
[,1] [,2] [,3]
[1,] 1 3 3
[2,] 2 4 5
[3,] 6 7 8
using unique on M[,1], specifying to keep the row with the maximal value in the second column of M.
In the end, the algorithm should keep row [2,] from the set {[1,], [2,]}. Unfortunately, unique() returns a vector of the actual values, not row numbers, after eliminating duplicates.
Is there a way to get the answer without the package plyr?
Thanks a lot,
Avitus
Here's how:
is.first.max <- function(x) seq_along(x) == which.max(x)
M[as.logical(ave(M[, 2], M[, 1], FUN = is.first.max)), ]
# [,1] [,2] [,3]
# [1,] 1 3 3
# [2,] 2 4 5
# [3,] 6 7 8
You're looking for duplicated.
m <- as.matrix(read.table(text="1 2 3
1 3 3
2 4 5
6 7 8"))
m <- m[order(m[,2], decreasing=TRUE), ]
m[!duplicated(m[,1]),]
# V1 V2 V3
# [1,] 6 7 8
# [2,] 2 4 5
# [3,] 1 3 3
Not the most efficient:
M <- matrix(c(1,1,2,6,2,3,4,7,3,3,5,8),4)
t(sapply(unique(M[,1]),function(i) {temp <- M[M[,1]==i,,drop=FALSE]
temp[which.max(temp[,2]),]
}))
# [,1] [,2] [,3]
#[1,] 1 3 3
#[2,] 2 4 5
#[3,] 6 7 8
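A variation on the duplicated idea, as a sketch: sorting by the first column ascending and the second column descending puts the row to keep first within each group, and the result stays ordered by the first column. The names ord, sorted and m are illustrative.

```r
M <- matrix(c(1, 1, 2, 6,  2, 3, 4, 7,  3, 3, 5, 8), 4)

# Order by column 1 ascending, then column 2 descending within ties
ord <- order(M[, 1], -M[, 2])
sorted <- M[ord, , drop = FALSE]

# The first occurrence of each column-1 value now has the largest column-2 value
m <- sorted[!duplicated(sorted[, 1]), , drop = FALSE]
m
```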
