I have a matrix with 10 rows, and I would like to form combinations of these rows using R. For example:
M= matrix(c(
1,2,3,4,
5,6,7,3,
5,5,4,8,
5,2,7,8,
4,8,7,8,
2,6,7,9,
5,6,7,4,
5,6,7,2,
5,6,7,3,
5,6,7,0),nrow=10, byrow=TRUE)
First step
Choose 3 rows out of the 10 rows.
This gives choose(10, 3) = 120 matrices, each 3 x 4, built from the rows of M.
Second step
Choose 6 rows out of the 10 rows.
This gives choose(10, 6) = 210 matrices, each 6 x 4, also built from the rows of M.
You can split the matrix into a list of rows with apply, then use the combn function as below:
M <- structure(c(1, 5, 5, 5, 4, 2, 5, 5, 5, 5, 2, 6, 5, 2, 8, 6, 6,
6, 6, 6, 3, 7, 4, 7, 7, 7, 7, 7, 7, 7, 4, 3, 8, 8, 8, 9, 4, 2,
3, 0), .Dim = c(10L, 4L))
x <- apply(M, 1, list)
# combinations for three rows
cmbs3 <- combn(x, 3)
ncol(cmbs3)
# 120
cmbs3[, 2]
# second combination
# [[1]]
# [[1]][[1]]
# [1] 1 2 3 4
#
#
# [[2]]
# [[2]][[1]]
# [1] 5 6 7 3
#
#
# [[3]]
# [[3]][[1]]
# [1] 5 2 7 8
# combinations for six rows
cmbs6 <- combn(x, 6)
ncol(cmbs6)
# 210
EDIT:
Or use the elegant solution provided by nicola: subset M by the row indices generated by combn (I like it much more :):
lapply(combn(10, 3, simplify = FALSE), function(x) M[x, ])
Output:
[[1]]
[,1] [,2] [,3] [,4]
[1,] 1 2 3 4
[2,] 5 6 7 3
[3,] 5 5 4 8
[[2]]
[,1] [,2] [,3] [,4]
[1,] 1 2 3 4
[2,] 5 6 7 3
[3,] 5 2 7 8
...
[[119]]
[,1] [,2] [,3] [,4]
[1,] 5 6 7 4
[2,] 5 6 7 3
[3,] 5 6 7 0
[[120]]
[,1] [,2] [,3] [,4]
[1,] 5 6 7 2
[2,] 5 6 7 3
[3,] 5 6 7 0
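The index-based approach covers both steps from the question. A minimal sketch, assuming the matrix M from the question; row_subsets is a hypothetical helper name introduced here for illustration, not part of either answer:

```r
# Matrix from the question: 10 rows, 4 columns
M <- matrix(c(
  1, 2, 3, 4,
  5, 6, 7, 3,
  5, 5, 4, 8,
  5, 2, 7, 8,
  4, 8, 7, 8,
  2, 6, 7, 9,
  5, 6, 7, 4,
  5, 6, 7, 2,
  5, 6, 7, 3,
  5, 6, 7, 0), nrow = 10, byrow = TRUE)

# Hypothetical helper: all k-row submatrices of m, one list element per combination
row_subsets <- function(m, k) {
  lapply(combn(nrow(m), k, simplify = FALSE),
         function(idx) m[idx, , drop = FALSE])
}

sub3 <- row_subsets(M, 3)  # first step
sub6 <- row_subsets(M, 6)  # second step

length(sub3)    # 120, i.e. choose(10, 3)
length(sub6)    # 210, i.e. choose(10, 6)
dim(sub6[[1]])  # 6 4
```

drop = FALSE is not strictly needed for k > 1, but it keeps the result a matrix if the helper is ever called with k = 1.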
Related
I am trying to calculate the sum of the differences of all possible pairs of an array (sum of difference first/second + difference first/third, difference second/first + difference second/third, difference third/first + difference third/second). The original data frame consists of multiple columns like the one below, but the colSums function doesn't work because of:
"Error in colSums(FC_TS1test1, dims = 1) : 'x' must be an array of at least two dimensions".
I appreciate your help!
array <- c(5, 3, 1)
The calculation behind it should be:
|(5-3)|+|(5-1)|=2+4=6
|(3-5)|+|(3-1)|=2+2=4
|(1-5)|+|(1-3)|=4+2=6
6+4+6=16
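We can use sum + dist. Note that dist counts each pair once, so for array <- c(5, 3, 1) it gives 8; doubling that gives the 16 from the hand calculation. The output below is for the 31-element vector a listed under Data:
> sum(dist(a))
[1] 840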
We need combinations without repetition.
sum(combn(a, 2, diff))
# [1] -94
Or, with some packages:
colSums(matrixStats::rowDiffs(RcppAlgos::comboGeneral(a, 2, repetition=FALSE)))
# [1] -94
or
sum(unlist(RcppAlgos::comboGeneral(a, 2, repetition=FALSE, FUN=diff)))
# [1] -94
Two smaller examples demonstrate what the core of the code does.
combn(b, 2)
# [,1] [,2] [,3] [,4] [,5] [,6]
# [1,] 3 3 3 4 4 2
# [2,] 4 2 5 2 5 5
RcppAlgos::comboGeneral(b, 2, repetition=FALSE)
# [,1] [,2]
# [1,] 3 4
# [2,] 3 2
# [3,] 3 5
# [4,] 4 2
# [5,] 4 5
# [6,] 2 5
Edit
For the absolute differences according to your recent edit we may define an anonymous function:
sum(combn(a, 2, function(x) abs(diff(x))))
# [1] 840
Data:
a <- c(3, 4, 2, 5, 4, 4, 1, 5, 5, 4, 1, 4, 7, 2, 1, 3, 5, 2, 4, 2,
7, 4, 4, 1, 4, 3, 4, 4, 2, 4, 1)
b <- c(3, 4, 2, 5)
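The hand calculation in the question counts every pair in both directions, while combn and dist count each pair once. A small sketch on the question's three-element example shows the factor of two (the names x, once and twice are chosen here for illustration):

```r
x <- c(5, 3, 1)  # example vector from the question

# Each unordered pair once: |5-3| + |5-1| + |3-1|
once <- sum(combn(x, 2, function(p) abs(diff(p))))

# Every ordered pair, as in the question's hand calculation
twice <- sum(abs(outer(x, x, "-")))

once   # 8
twice  # 16
```

Which of the two is wanted depends on whether (first, second) and (second, first) should count as separate pairs.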
I am working in R, and I have a dataset that is a list of lists of matrices. Each sublist in the mainlist has two matrices of equal dimension (10 rows x 2 cols). I would like to rbind() each list of matrices into a single matrix (20 rows x 2 cols). But I do not want to combine every sublist into a single big matrix. The real data is pretty complex, so the sample code below is a simplified version with 5 rows x 2 cols per matrix.
> matrix_1 <- matrix(c(1, 5, 2, 6, 1, 3, 7, 4, 8, 3), nrow = 5, ncol = 2, byrow = TRUE)
> matrix_2 <- matrix(c(10, 9, 8, 7, 6, 5, 4, 3, 2, 1), nrow = 5, ncol = 2, byrow = TRUE)
> matrix_3 <- matrix(c(101, 91, 81, 71, 61, 51, 41, 31, 21, 11), nrow = 5, ncol = 2, byrow = TRUE)
> matrix_4 <- matrix(c(22, 20, 19, 18, 17, 16, 15, 14, 13, 12), nrow = 5, ncol = 2, byrow = TRUE)
> sublist_1 <- list(matrix_1, matrix_2)
[[1]]
[,1] [,2]
[1,] 1 5
[2,] 2 6
[3,] 1 3
[4,] 7 4
[5,] 8 3
[[2]]
[,1] [,2]
[1,] 10 9
[2,] 8 7
[3,] 6 5
[4,] 4 3
[5,] 2 1
> sublist_2 <- list(matrix_3, matrix_4)
[[1]]
[,1] [,2]
[1,] 101 91
[2,] 81 71
[3,] 61 51
[4,] 41 31
[5,] 21 11
[[2]]
[,1] [,2]
[1,] 22 20
[2,] 19 18
[3,] 17 16
[4,] 15 14
[5,] 13 12
> mainlist <- list(sublist_1, sublist_2)
What I really want is to make this:
> rbind(sublist_1[[1]], sublist_1[[2]])
[,1] [,2]
[1,] 1 5
[2,] 2 6
[3,] 1 3
[4,] 7 4
[5,] 8 3
[6,] 10 9
[7,] 8 7
[8,] 6 5
[9,] 4 3
[10,] 2 1
apply to all of the sublists in the mainlist.
I've tried to use various combinations of lapply, mapply, map, do.call, etc. to make it work, but either I don't know the right combinations or I need something else.
I've also noticed that rbind(sublist_1) does not work, which is making it difficult to use lapply. It has to be written as rbind(sublist_1[[1]], sublist_1[[2]]).
Thank you very much for your help.
Loop over the outer list, convert the inner list elements to data.frame, and use do.call with rbind. Each element of out is then a data.frame rather than a matrix:
out <- lapply(mainlist, function(x) do.call(rbind, lapply(x, as.data.frame)))
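If the results should stay matrices, the as.data.frame step can probably be skipped, since do.call passes the elements of each sublist to rbind as separate arguments. A minimal sketch with illustrative data (the values below are made up for the example and are not the asker's data):

```r
# Illustrative data: two sublists, each holding two 5 x 2 matrices
mainlist <- list(
  list(matrix(1:10, nrow = 5), matrix(11:20, nrow = 5)),
  list(matrix(21:30, nrow = 5), matrix(31:40, nrow = 5))
)

# do.call(rbind, sub) is equivalent to rbind(sub[[1]], sub[[2]])
out <- lapply(mainlist, function(sub) do.call(rbind, sub))

is.matrix(out[[1]])  # TRUE
dim(out[[1]])        # 10 2
```

This also explains why rbind(sublist_1) does not work: it receives one argument (a list) instead of two matrices.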
The data set contains 19972 rows and 3006 columns. How can I remove all the rows containing any negative value or NA in the data matrix?
Generate a data.frame containing NA and negative values:
set.seed(1)
df <- data.frame(a=runif(10, -10,10), b=runif(10, -5, 5))
df$a[7] <- NA
df
df looks like this:
a b
1 -4.689827 -2.94025425
2 -2.557522 -3.23443247
3 1.457067 1.87022847
4 8.164156 -1.15896282
5 -5.966361 2.69841420
6 7.967794 -0.02300758
7 NA 2.17618508
8 3.215956 4.91906095
9 2.582281 -1.19964821
10 -8.764275 2.77445221
Then:
negative_row <- apply(df, 1, function(x) any(x < 0 | is.na(x)))
df[!negative_row,]
giving:
a b
3 1.457067 1.870228
8 3.215956 4.919061
You can try the code below
> subset(m, rowSums(sign(m) < 0) == 0)
[,1] [,2] [,3] [,4]
[1,] 1 1 2 4
[2,] 4 16 8 16
[3,] 5 25 0 20
where m is the given matrix as
> m
[,1] [,2] [,3] [,4]
[1,] 1 1 2 4
[2,] 2 NA 4 8
[3,] -3 0 6 12
[4,] 4 16 8 16
[5,] 5 25 0 20
Dummy Data
> dput(m)
structure(c(1, 2, -3, 4, 5, 1, NA, 0, 16, 25, 2, 4, 6, 8, 0,
4, 8, 12, 16, 20), .Dim = 5:4, .Dimnames = list(NULL, NULL))
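For a matrix of the size mentioned in the question, a fully vectorised filter may be worth trying. The sketch below flags NA and negative cells in one logical matrix and keeps rows with no flagged cell; the names bad and clean are illustrative:

```r
m <- structure(c(1, 2, -3, 4, 5, 1, NA, 0, 16, 25, 2, 4, 6, 8, 0,
                 4, 8, 12, 16, 20), .Dim = 5:4)

# TRUE wherever a cell is NA or negative; TRUE | NA is TRUE, so bad has no NA
bad <- is.na(m) | m < 0

# Keep rows with no flagged cell; drop = FALSE keeps a matrix even for one row
clean <- m[rowSums(bad) == 0, , drop = FALSE]
clean
```

On the dummy matrix this keeps rows 1, 4 and 5, the same rows as the subset() call above.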
I have matrix M and N given by
> M
[,1] [,2] [,3] [,4] [,5]
[1,] 5 1 1 7 7
[2,] 4 7 4 2 7
[3,] 11 19 20 50 30
> N
[,1] [,2]
[1,] 7 1
[2,] 7 7
I want to find the columns of M whose first two rows match the columns of N, and append the corresponding third-row values of M to N to get
[,1] [,2]
7 1
7 7
30 19
I tried the code below. Is there a more efficient way of doing it, especially one that avoids the for loops?
E=numeric()
for (i in 1:2){
for (j in 1:5) {
if (N[1,i]==M[1,j] & N[2,i]==M[2,j]){
E[i]= M[3,j]
}
}
}
E
rbind(N,E)
Well, here is your loop re-written (note that i indexes the columns of N, so it should run over seq_len(ncol(N))):
E <- vapply(seq_len(ncol(N)), function(i) M[3,M[1,] == N[1,i] & M[2,] == N[2,i]], numeric(1))
# with
> rbind(N,E)
[,1] [,2]
7 1
7 7
E 30 19
There is only one loop (vapply, a wrapper for a loop), which runs through the columns of N.
Here's a way using multiple calls to apply. We iterate over the columns of M and N to find which column in M matches the first column in N and then which matches the second column in N.
logicals <- apply(M[-3,], # exclude third row
2, # iterate over columns
FUN = function(x)
apply(N, 2, #then iterate over columns of N
FUN = function(y) all(x == y)))
# [,1] [,2] [,3] [,4] [,5]
# [1,] FALSE FALSE FALSE FALSE TRUE
# [2,] FALSE TRUE FALSE FALSE FALSE
M[,apply(logicals, 1, which)]
[,1] [,2]
[1,] 7 1
[2,] 7 7
[3,] 30 19
data
M <- structure(c(5, 4, 11, 1, 7, 19,
1, 4, 20, 7, 2, 50,
7, 7, 30),
.Dim = c(3L, 5L))
N <- structure(c(7, 7, 1, 7), .Dim = c(2L, 2L))
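Another loop-free option is to build one key per column and look it up with match. This is a sketch that assumes the first two rows hold values that print unambiguously (such as the whole numbers here) and that each column of N occurs in M; key_M, key_N and idx are illustrative names:

```r
M <- structure(c(5, 4, 11, 1, 7, 19,
                 1, 4, 20, 7, 2, 50,
                 7, 7, 30),
               .Dim = c(3L, 5L))
N <- structure(c(7, 7, 1, 7), .Dim = c(2L, 2L))

# One text key per column, built from the first two rows
key_M <- paste(M[1, ], M[2, ])
key_N <- paste(N[1, ], N[2, ])

# For each column of N, the first matching column of M (NA if there is none)
idx <- match(key_N, key_M)
E <- M[3, idx]
rbind(N, E)
```

If M contained duplicate (row 1, row 2) pairs, match would silently return only the first one, which may or may not be what is wanted.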
I have a vector, say vec1, and another vector named vec2 as follows:
vec1 = c(4,1)
# [1] 4 1
vec2 = c(5,3,2)
# [1] 5 3 2
What I'm looking for is all possible combinations of vec1 and vec2 while the order of the vectors' elements is kept. That is, the resultant matrix should be like this:
> res
[,1] [,2] [,3] [,4] [,5]
[1,] 4 1 5 3 2
[2,] 4 5 1 3 2
[3,] 4 5 3 1 2
[4,] 4 5 3 2 1
[5,] 5 4 1 3 2
[6,] 5 4 3 1 2
[7,] 5 4 3 2 1
[8,] 5 3 4 1 2
[9,] 5 3 4 2 1
[10,] 5 3 2 4 1
# res=structure(c(4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 1, 5, 5, 5, 4, 4, 4,
# 3, 3, 3, 5, 1, 3, 3, 1, 3, 3, 4, 4, 2, 3, 3, 1, 2, 3, 1, 2, 1,
# 2, 4, 2, 2, 2, 1, 2, 2, 1, 2, 1, 1), .Dim = c(10L, 5L))
There is no repetition allowed for two vectors. That is, all rows of the resultant matrix have unique elements.
I'm actually looking for the most efficient way. One way to tackle this problem is to generate all possible permutations of length n which grows factorially (n=5 here) and then apply filtering. But it's time-consuming as n grows.
Is there an efficient way to do that?
Try this one:
nv1 <- length(vec1)
nv2 <- length(vec2)
n <- nv1 + nv2
result <- combn(n,nv1,function(v) {z=integer(n);z[v]=vec1;z[-v]=vec2;z})
The idea is to produce all combinations of indices at which to put the elements of vec1.
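Note that combn returns one combination per column, so the result is 5 x 10 and needs transposing to get the layout asked for in the question. A self-contained sketch of the same idea, with interleave as a hypothetical name for the anonymous function:

```r
vec1 <- c(4, 1)
vec2 <- c(5, 3, 2)
n <- length(vec1) + length(vec2)

# v holds the positions taken by vec1; the remaining positions get vec2
interleave <- function(v) {
  z <- numeric(n)
  z[v] <- vec1
  z[-v] <- vec2
  z
}

# combn puts each combination in a column, so transpose to one row per result
res <- t(combn(n, length(vec1), interleave))
dim(res)  # 10 5
res[1, ]  # 4 1 5 3 2
```

This generates choose(n, length(vec1)) rows directly, so no filtering of the n! permutations is needed.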
Not as elegant as Marat Talipov's solution, but you can do:
# get the ordering per vector
cc <- c(order(vec1, decreasing = TRUE), order(vec2, decreasing = TRUE) + length(vec1))
cc
[1] 1 2 3 4 5
# permutation to get all "order-combinations"
library(combinat)
m <- do.call(rbind, permn(cc))
# keep only the permutations in which the indices of each vector stay in their original order:
gr <- apply(m, 1, function(x){
!is.unsorted(x[x < (length(vec1)+1)]) & !is.unsorted(x[x > (length(vec1))])
})
# result, exchange the order index with the vector elements:
t(apply(m[gr, ], 1, function(x, y) y[x], c(vec1, vec2)))
[,1] [,2] [,3] [,4] [,5]
[1,] 4 1 5 3 2
[2,] 4 5 3 1 2
[3,] 4 5 3 2 1
[4,] 4 5 1 3 2
[5,] 5 4 1 3 2
[6,] 5 4 3 2 1
[7,] 5 4 3 1 2
[8,] 5 3 4 1 2
[9,] 5 3 4 2 1
[10,] 5 3 2 4 1