Flattening 3-dimensional data to an elongated 2-D matrix in R

I have data with dim 10,5,2 (t,x,y) and I want to convert it to dimensions 10*5,3, i.e. stack the (x,y) frame of every t on top of each other and append the t value as a third column.
E.g.:
data[1,,]=
x y
1 2
1 3
data[2,,]=
x y
5 2
1 6
I would like to flatten this data into an array like this:
x y t
1 2 1
1 3 1
5 2 2
1 6 2
Is there already an R function that does this? Otherwise I would loop over every t slice and append each rebuilt array to the bottom of the main array.

a <- array(1:8, c(2,2,2))
a[1,,]
# [,1] [,2]
#[1,] 1 5
#[2,] 3 7
a[2,,]
# [,1] [,2]
#[1,] 2 6
#[2,] 4 8
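# Swap the first two dimensions so that x varies fastest within each t, then reshape
m <- matrix(aperm(a, c(2, 1, 3)), nrow = prod(dim(a)[1:2]))
# Append t: each t index is repeated once per row of its frame
cbind(m, rep(seq_len(dim(a)[1]), each = dim(a)[2]))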
# [,1] [,2] [,3]
#[1,] 1 5 1
#[2,] 3 7 1
#[3,] 2 6 2
#[4,] 4 8 2
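For a non-cubic array, the reshape needs the product of the first two dimensions as its row count. Here is a minimal sketch, assuming a hypothetical 3 x 4 x 2 array laid out as (t, x, y) as in the question. The names nt, nx, flat and ref are made up for illustration, and the loop-based ref is only there as a cross-check:

```r
# Hypothetical (t, x, y) data with unequal dimensions
a <- array(seq_len(3 * 4 * 2), dim = c(3, 4, 2))
nt <- dim(a)[1]
nx <- dim(a)[2]

# aperm puts x first so it varies fastest; each t then fills nx consecutive rows
flat <- cbind(matrix(aperm(a, c(2, 1, 3)), nrow = nt * nx),
              rep(seq_len(nt), each = nx))
colnames(flat) <- c("x", "y", "t")

# Cross-check: stack the slices one by one, as the question proposed
ref <- do.call(rbind, lapply(seq_len(nt), function(i) cbind(a[i, , ], i)))

dim(flat)
all(unname(flat) == unname(ref))
```

If the two results agree, the vectorised version can replace the loop for arrays of any shape.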

Here's a different approach:
a <- array(c(1,5,1,1,2,2,3,6), dim = c(2,2,2) )
# Loop over the first dimension (t), not the third, so non-cubic arrays also work
do.call('rbind', lapply(seq_len(dim(a)[1]), function(x) cbind(a[x,,], t = x)))
t
[1,] 1 2 1
[2,] 1 3 1
[3,] 5 2 2
[4,] 1 6 2

Also:
If a is the array:
ft <- ftable(a)
cbind(ft[,1:2], as.numeric(factor(gsub("\\_.*","",row.names(as.matrix(ft))))))
[,1] [,2] [,3]
[1,] 1 2 1
[2,] 1 3 1
[3,] 5 2 2
[4,] 1 6 2

Related

How to order a matrix by the numeric or alphabetic values of the column vectors in R?

The title with the following example should be self-explanatory:
m = unique(replicate(5, sample(1:5, 5, rep=F)), MARGIN = 2)
m
[,1] [,2] [,3] [,4] [,5]
[1,] 1 5 1 4 3
[2,] 5 1 5 1 2
[3,] 4 3 3 3 1
[4,] 3 4 4 5 5
[5,] 2 2 2 2 4
But what I want is instead:
[,1] [,2] [,3] [,4] [,5]
[1,] 1 1 3 4 5
[2,] 5 5 2 1 1
[3,] 3 4 1 3 3
[4,] 4 3 5 5 4
[5,] 2 2 4 2 2
Ideally, I would like to find a method that allows the same process to be carried out when the column vectors are words (alphabetic order).
I tried things like m[ , sort(m)] but nothing did the trick...
m[, order(m[1, ])] will order the columns by the first row. m[, order(m[1, ], m[2, ])] will order by the first row, using the second row as a tie-breaker. Getting fancy, m[, do.call(order, split(m, row(m)))] will order the columns by the first row, using all subsequent rows as tie-breakers. This works with character data just as well as with numeric data.
set.seed(47)
m = replicate(5, sample(1:5, 5, replace = FALSE))
m
# [,1] [,2] [,3] [,4] [,5]
# [1,] 5 4 1 5 1
# [2,] 2 2 3 2 3
# [3,] 3 5 5 1 2
# [4,] 4 3 2 3 5
# [5,] 1 1 4 4 4
m[, do.call(order, split(m, row(m)))]
# [,1] [,2] [,3] [,4] [,5]
# [1,] 1 1 4 5 5
# [2,] 3 3 2 2 2
# [3,] 2 5 5 1 3
# [4,] 5 2 3 3 4
# [5,] 4 4 1 4 1
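The same call can be tried on character data. This is a small sketch with a made-up matrix w whose columns spell three-letter words from top to bottom; the data and the names w, sorted and words are assumptions for illustration only:

```r
# Hypothetical character matrix: columns read "cat", "bat", "cab"
w <- matrix(c("c", "a", "t",
              "b", "a", "t",
              "c", "a", "b"), nrow = 3)

# Order columns by row 1, breaking ties with row 2, then row 3
sorted <- w[, do.call(order, split(w, row(w)))]

# Collapse each column back into a word to inspect the result
words <- apply(sorted, 2, paste, collapse = "")
words
```

The columns should come out in dictionary order (bat, cab, cat), since order() compares character keys alphabetically.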

Extracting unique rows in a 3+ column matrix

Using R, I am trying to extract unique rows from a matrix, where two rows count as duplicates if they contain the same values, regardless of the order of those values.
For example if I had this data set:
x = matrix(c(1,1,1,2,2,5,1,2,2,1,2,1,5,3,5,2,1,1),6,3)
Rows 1 & 6, and rows 4 & 5 are duplicated since (1,1,5) = (5,1,1) and (2,1,2) = (2,2,1).
Ultimately, I'm trying to end up with something in the form of:
y = matrix(c(1,1,1,2,1,2,2,1,5,3,5,2),4,3)
or
z = matrix(c(1,1,2,5,2,2,2,1,3,5,1,1),4,3)
The order doesn't matter as long as only one of the unique rows remains. I've searched online, but functions such as unique() and duplicated() have only worked for exact matching rows.
Thanks in advance for any help you provide.
Another answer: use the sets package. Here is a slightly modified matrix:
library(sets)
x <- matrix(c(1,1,1,2,2,5,5, 1,2,2,1,2,1,5, 5,3,5,2,1,1,1),7,3)
x
[,1] [,2] [,3]
[1,] 1 1 5
[2,] 1 2 3
[3,] 1 2 5
[4,] 2 1 2
[5,] 2 2 1
[6,] 5 1 1
[7,] 5 5 1
If (5,1,1) = (5,5,1) you can use just ordinary sets:
a <- sapply(1:nrow(x), function(i) as.set(x[i,]))
x[!duplicated(a),]
[,1] [,2] [,3]
[1,] 1 1 5
[2,] 1 2 3
[3,] 1 2 5
[4,] 2 1 2
Note: rows 6 and 7 are both gone.
If (5,1,1) != (5,5,1), use generalized sets:
b <- sapply(1:nrow(x), function(i) as.gset(x[i,]))
x[!duplicated(b),]
[,1] [,2] [,3]
[1,] 1 1 5
[2,] 1 2 3
[3,] 1 2 5
[4,] 2 1 2
[5,] 5 5 1
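A base R alternative, offered as a sketch rather than as part of the answer above: sort the values within each row and call duplicated() on the sorted copy. This treats rows as multisets, so it should correspond to the generalized-set case where (5,1,1) != (5,5,1). The names key and res are made up for illustration:

```r
x <- matrix(c(1,1,1,2,2,5,5, 1,2,2,1,2,1,5, 5,3,5,2,1,1,1), 7, 3)

# Sort within each row; apply() returns the sorted rows as columns, hence t()
key <- t(apply(x, 1, sort))

# Keep the first row of each group that shares the same sorted values
res <- x[!duplicated(key), , drop = FALSE]
res
```

Because duplicated() keeps the first occurrence, the rows that survive are the earliest representatives of each group, in their original order.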

In R, using `unique()` with extra conditions to extract submatrices: easy solution without plyr

In R, let M be the matrix
[,1] [,2] [,3]
[1,] 1 2 3
[2,] 1 3 3
[3,] 2 4 5
[4,] 6 7 8
I would like to select the submatrix m
[,1] [,2] [,3]
[1,] 1 3 3
[2,] 2 4 5
[3,] 6 7 8
using unique on M[,1], specifying that the row with the maximal value in the second column of M should be kept.
In the end, the algorithm should keep row [2,] from the set {[1,], [2,]}. Unfortunately, after eliminating duplicates, unique() returns a vector of the actual values rather than row numbers.
Is there a way to get the answer without the package plyr?
Thanks a lot,
Avitus
Here's how:
is.first.max <- function(x) seq_along(x) == which.max(x)
M[as.logical(ave(M[, 2], M[, 1], FUN = is.first.max)), ]
# [,1] [,2] [,3]
# [1,] 1 3 3
# [2,] 2 4 5
# [3,] 6 7 8
You're looking for duplicated.
m <- as.matrix(read.table(text="1 2 3
1 3 3
2 4 5
6 7 8"))
m <- m[order(m[,2], decreasing=TRUE), ]
m[!duplicated(m[,1]),]
# V1 V2 V3
# [1,] 6 7 8
# [2,] 2 4 5
# [3,] 1 3 3
Not the most efficient:
M <- matrix(c(1,1,2,6,2,3,4,7,3,3,5,8),4)
t(sapply(unique(M[,1]),function(i) {temp <- M[M[,1]==i,,drop=FALSE]
temp[which.max(temp[,2]),]
}))
# [,1] [,2] [,3]
#[1,] 1 3 3
#[2,] 2 4 5
#[3,] 6 7 8
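The order()/duplicated() idea can also be written so that the groups stay in ascending order of the first column. This is a sketch under the assumption that the second column is numeric, so that negating it gives a descending sort; o and res are illustrative names:

```r
M <- matrix(c(1,1,2,6, 2,3,4,7, 3,3,5,8), 4)

# Sort by column 1 ascending, then by column 2 descending within each group
o <- M[order(M[, 1], -M[, 2]), , drop = FALSE]

# The first row of each group now holds the maximal second-column value
res <- o[!duplicated(o[, 1]), , drop = FALSE]
res
```

Using drop = FALSE keeps the result a matrix even if only one row survives.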

How can I separate a matrix into smaller ones in R?

I have the following matrix
2 4 1
6 32 1
4 2 1
5 3 2
4 2 2
I want to make the following two matrices based on 3rd column
first
2 4
6 32
4 2
second
5 3
4 2
This is the best I could come up with, but I get an error:
x <- cbind(mat[,1], mat[,2]) if mat[,3]=1
y <- cbind(mat[,1], mat[,2]) if mat[,3]=2
If mat is your matrix:
mat <- matrix(1:15,ncol=3)
mat[,3] <- c(1,1,1,2,2)
> mat
[,1] [,2] [,3]
[1,] 1 6 1
[2,] 2 7 1
[3,] 3 8 1
[4,] 4 9 2
[5,] 5 10 2
Then you can use split:
> lapply( split( mat[,1:2], mat[,3] ), matrix, ncol=2)
$`1`
[,1] [,2]
[1,] 1 6
[2,] 2 7
[3,] 3 8
$`2`
[,1] [,2]
[1,] 4 9
[2,] 5 10
The lapply of matrix is necessary because split drops the attributes that make a vector a matrix, so you need to add them back in.
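To see the dropped attributes directly, one can inspect the pieces before and after rebuilding them. A small sketch using the same test matrix; pieces and rebuilt are names introduced here for illustration:

```r
mat <- matrix(1:15, ncol = 3)
mat[, 3] <- c(1, 1, 1, 2, 2)

# split() treats the matrix as a plain vector and recycles the grouping values,
# so each piece is a vector holding column 1 followed by column 2 of its group
pieces <- split(mat[, 1:2], mat[, 3])
str(pieces)

# Restoring the dim attribute turns each piece back into a two-column matrix
rebuilt <- lapply(pieces, matrix, ncol = 2)
lapply(rebuilt, dim)
```

This works because matrix() fills column by column, which matches the order in which split() collected the values.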
Yet another example:
#test data
mat <- matrix(1:15,ncol=3)
mat[,3] <- c(1,1,1,2,2)
#make a list storing a matrix for each id as components
result <- lapply(by(mat,mat[,3],identity),as.matrix)
Final product:
> result
$`1`
V1 V2 V3
1 1 6 1
2 2 7 1
3 3 8 1
$`2`
V1 V2 V3
4 4 9 2
5 5 10 2
If you have a matrix A, this will get the first two columns when the third column is 1:
A[A[,3] == 1,c(1,2)]
You can use this to obtain matrices for any value in the third column.
Explanation: A[,3] == 1 returns a vector of booleans, where the i-th position is TRUE if A[i,3] is 1. This vector of booleans can be used to index into a matrix to extract the rows we want.
Disclaimer: I have very little experience with R, this is the MATLAB-ish way to do it.
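Applied to the matrix from the question, the logical-index approach might look like the sketch below. The names keep, first and second are made up here, and !keep only works as a shortcut because the third column has exactly two distinct values:

```r
# The matrix from the question, entered column by column
A <- matrix(c(2, 6, 4, 5, 4,
              4, 32, 2, 3, 2,
              1, 1, 1, 2, 2), ncol = 3)

# One TRUE/FALSE per row: is the third column equal to 1?
keep <- A[, 3] == 1
keep

# drop = FALSE keeps the result a matrix even when only one row matches
first  <- A[keep, 1:2, drop = FALSE]
second <- A[!keep, 1:2, drop = FALSE]
```

With more than two groups, comparing against each value in turn (A[, 3] == 2, and so on) would be the safer route.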
split.data.frame could be used also to split a matrix.
mat <- matrix(1:15,ncol=3)
mat[,3] <- c(1,1,1,2,2)
x <- split.data.frame(mat[,-3], mat[,3])
x
#$`1`
# [,1] [,2]
#[1,] 1 6
#[2,] 2 7
#[3,] 3 8
#
#$`2`
# [,1] [,2]
#[1,] 4 9
#[2,] 5 10
str(x)
#List of 2
# $ 1: num [1:3, 1:2] 1 2 3 6 7 8
# $ 2: num [1:2, 1:2] 4 5 9 10
Or split the index and use it in lapply to subset.
lapply(split(seq_along(mat[,3]), mat[,3]), \(i) mat[i, -3, drop=FALSE])
#$`1`
# [,1] [,2]
#[1,] 1 6
#[2,] 2 7
#[3,] 3 8
#
#$`2`
# [,1] [,2]
#[1,] 4 9
#[2,] 5 10
This is a functional version of pedrosorio's idea:
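# drop=FALSE keeps single-row groups as matrices
getthird <- function(mat, idx) mat[mat[,3]==idx, 1:2, drop=FALSE]
# lapply always returns a list; sapply could simplify equally sized groups into one array
lapply(unique(mat[,3]), getthird, mat=mat) # idx receives each unique value in turn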
#-----------
[[1]]
[,1] [,2]
[1,] 1 6
[2,] 2 7
[3,] 3 8
[[2]]
[,1] [,2]
[1,] 4 9
[2,] 5 10
We can use by or tapply
> by(seq_along(mat[, 3]), mat[, 3], function(k) mat[k, -3])
mat[, 3]: 1
[,1] [,2]
[1,] 1 6
[2,] 2 7
[3,] 3 8
------------------------------------------------------------
mat[, 3]: 2
[,1] [,2]
[1,] 4 9
[2,] 5 10
> tapply(seq_along(mat[, 3]), mat[, 3], function(k) mat[k, -3])
$`1`
[,1] [,2]
[1,] 1 6
[2,] 2 7
[3,] 3 8
$`2`
[,1] [,2]
[1,] 4 9
[2,] 5 10

Generating all combinations of rows of matrices (brute force) in R

I have a list of matrices (with the same number of columns), say lst_Mat and I'd like to have all row-wise combinations of matrices in this list. For example, lst_Mat could be like this:
> lst_Mat
[[1]]
[,1] [,2] [,3] [,4]
[1,] 1 2 3 4
[2,] 1 3 2 4
[3,] 1 3 4 2
[4,] 2 1 3 4
[5,] 2 3 1 4
[6,] 2 3 4 1
[[2]]
[,1] [,2] [,3] [,4]
[1,] 1 3 2 4
[2,] 3 1 2 4
[3,] 3 2 1 4
[4,] 3 2 4 1
[[3]]
[,1] [,2] [,3] [,4]
[1,] 1 2 3 4
[2,] 1 2 4 3
[3,] 1 3 2 4
[4,] 1 3 4 2
[5,] 1 4 2 3
[6,] 1 4 3 2
[7,] 2 1 3 4
[8,] 2 1 4 3
[9,] 2 3 1 4
[10,] 3 1 2 4
[[4]]
[,1] [,2] [,3] [,4]
[1,] 2 1 4 3
[2,] 2 3 1 4
[3,] 3 1 2 4
[4,] 3 1 4 2
[5,] 3 2 1 4
As such, the total number of combinations would be 6*4*10*5=1200. This problem is analogous to the problem of generating all possible strings of English letters (i.e. a, b, c,..., x, y, z) with a specific length. For instance: aaa, aab, aac,..., aaz, aba, abb,..., abz, aca,... and so on.
I have come up with the following solution:
lst_Mat_len=list()
C=ncol(lst_Mat[[1]])
for (i in 1:length(lst_Mat))
  lst_Mat_len[[length(lst_Mat_len)+1]]=(1:nrow(lst_Mat[[i]]))
combs=do.call(expand.grid, lst_Mat_len)
for (i in 1:nrow(combs)){
  M=matrix(0, 0, C)
  for (j in 1:ncol(combs))
    M=rbind(M, lst_Mat[[j]][combs[i,j],])
  # print(M)
}
Sample output of M:
> M
[,1] [,2] [,3] [,4]
[1,] 1 2 3 4
[2,] 1 3 2 4
[3,] 1 2 3 4
[4,] 2 1 4 3
> M
[,1] [,2] [,3] [,4]
[1,] 1 2 3 4
[2,] 1 3 2 4
[3,] 1 2 3 4
[4,] 2 3 1 4
That is, one row per matrix, each time.
I'd appreciate any other algorithms for doing so.
Here is another solution. I changed the example slightly to make it more reproducible:
ones <- t(rep(1, 4))
lst_Mat <- list(1:6 %*% ones, 7:11 %*% ones, 12:21 %*% ones, 22:26 %*% ones)
# lapply (not sapply) so the result stays a list even when all matrices have the same number of rows
combs <- expand.grid(lapply(lst_Mat, function(x) seq_len(nrow(x))))
nbcombs <- nrow(combs)
# Preallocate one slot per combination
res <- vector("list", nbcombs)
for (i in seq_len(nbcombs))
  res[[i]] <- t(mapply(function(mat, line) mat[line, ], lst_Mat, combs[i, ]))
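The same expand.grid idea can be checked by hand on a much smaller input. A sketch with two made-up matrices (2 and 3 rows, so 2*3 = 6 combinations); idx and res are illustrative names, and Map is used here in place of mapply so that each selected row stays a separate list element before rbind:

```r
# Hypothetical small input: a 2-row and a 3-row matrix with two columns each
lst_Mat <- list(matrix(1:4, 2, 2, byrow = TRUE),
                matrix(5:10, 3, 2, byrow = TRUE))

# One column of row indices per matrix; every row of idx is one combination
idx <- expand.grid(lapply(lst_Mat, function(m) seq_len(nrow(m))))

# For each combination, pick the chosen row from every matrix and stack them
res <- lapply(seq_len(nrow(idx)), function(i) {
  do.call(rbind, Map(function(m, r) m[r, ], lst_Mat, idx[i, ]))
})

length(res)
res[[1]]
```

expand.grid varies its first column fastest, so res[[1]] pairs the first rows of both matrices and the last element pairs their last rows.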
