I would like to generate all unique combinations of n numbers without replacement. For example, if n = 4, there are gamma(4+1) = 24 combinations (strictly speaking, permutations), shown here:
combos4 <- read.table(text='
x1 x2 x3 x4
1 2 3 4
1 2 4 3
1 3 2 4
1 3 4 2
1 4 2 3
1 4 3 2
2 1 3 4
2 1 4 3
2 3 1 4
2 3 4 1
2 4 1 3
2 4 3 1
3 1 2 4
3 1 4 2
3 2 1 4
3 2 4 1
3 4 1 2
3 4 2 1
4 1 2 3
4 1 3 2
4 2 1 3
4 2 3 1
4 3 1 2
4 3 2 1
', header = TRUE)
I can obtain these 24 combinations using the following code:
combos <- expand.grid(x1=1:4, x2=1:4, x3=1:4, x4=1:4)
my.combos <- combos[apply(combos,1,function(x) length(unique(x)) == ncol(combos)),]
my.combos
However, I always assumed there was an easier way in base R. Specifically, I always assumed the function combn() could return these combinations. If combn() can do this, I now realize I do not know how.
The expand.grid approach seems a little inefficient, especially when n is large.
In searching the internet and StackOverflow I have found several similar questions that use the algorithm tag and the code posted in answers, when provided, is not simple. Sorry if this is a duplicate or if I am missing something obvious.
The combinat package has a permn function:
library(combinat)
permn(1:4)
> permn(1:4)
[[1]]
[1] 1 2 3 4
[[2]]
[1] 1 2 4 3
[[3]]
[1] 1 4 2 3
[[4]]
[1] 4 1 2 3
[[5]]
[1] 4 1 3 2
[[6]]
[1] 1 4 3 2
[[7]]
[1] 1 3 4 2
[[8]]
[1] 1 3 2 4
[[9]]
[1] 3 1 2 4
[[10]]
[1] 3 1 4 2
[[11]]
[1] 3 4 1 2
[[12]]
[1] 4 3 1 2
[[13]]
[1] 4 3 2 1
[[14]]
[1] 3 4 2 1
[[15]]
[1] 3 2 4 1
[[16]]
[1] 3 2 1 4
[[17]]
[1] 2 3 1 4
[[18]]
[1] 2 3 4 1
[[19]]
[1] 2 4 3 1
[[20]]
[1] 4 2 3 1
[[21]]
[1] 4 2 1 3
[[22]]
[1] 2 4 1 3
[[23]]
[1] 2 1 4 3
[[24]]
[1] 2 1 3 4
gtools has a permutations function that might help:
library(gtools)
permutations(4,4,1:4)
## [,1] [,2] [,3] [,4]
## [1,] 1 2 3 4
## [2,] 1 2 4 3
## [3,] 1 3 2 4
## [4,] 1 3 4 2
## [5,] 1 4 2 3
## [6,] 1 4 3 2
## [7,] 2 1 3 4
## [8,] 2 1 4 3
## [9,] 2 3 1 4
## [10,] 2 3 4 1
## [11,] 2 4 1 3
## [12,] 2 4 3 1
## [13,] 3 1 2 4
## [14,] 3 1 4 2
## [15,] 3 2 1 4
## [16,] 3 2 4 1
## [17,] 3 4 1 2
## [18,] 3 4 2 1
## [19,] 4 1 2 3
## [20,] 4 1 3 2
## [21,] 4 2 1 3
## [22,] 4 2 3 1
## [23,] 4 3 1 2
## [24,] 4 3 2 1
and runs pretty fast:
user system elapsed
0.001 0.000 0.000
There is also the iterpc package:
> library(iterpc)
> I = iterpc(4, ordered=TRUE)
> getall(I)
[,1] [,2] [,3] [,4]
[1,] 1 2 3 4
[2,] 1 2 4 3
[3,] 1 3 2 4
[4,] 1 3 4 2
[5,] 1 4 2 3
[6,] 1 4 3 2
[7,] 2 1 3 4
[8,] 2 1 4 3
[9,] 2 3 1 4
[10,] 2 3 4 1
...
You can also do it iteratively
> getnext(I)
[1] 1 2 3 4
> getnext(I)
[1] 1 2 4 3
> getnext(I)
[1] 1 3 2 4
> getnext(I)
[1] 1 3 4 2
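If you would rather stay in base R, a minimal recursive sketch is shown here. The function name perms is hypothetical (it is not part of base R or of any package mentioned above), and growing the result with rbind is only reasonable for small n, since the number of rows is n!.

```r
# Hypothetical helper: all permutations of the vector v, one per row.
perms <- function(v) {
  n <- length(v)
  if (n <= 1) return(matrix(v, nrow = 1))
  out <- NULL
  for (i in seq_len(n)) {
    # Put v[i] first, then append every permutation of the remaining values.
    out <- rbind(out, cbind(v[i], perms(v[-i])))
  }
  out
}

p <- perms(1:4)
dim(p)   # 24 rows (4!) by 4 columns
```

Each row of p is one ordering of 1:4, so this gives the same set of rows as the expand.grid filter in the question without building all 4^4 candidates first.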
Related
I have data like this:
x = c(1,2,3)
prob = c(0.13,0.13,0.74)
# Total sample size
n = 70
result = rep(x, round(n * prob))
Final<-replicate(1, sample(result))
I want to make a matrix[7,10] in which the values (1,2,3) occur with probabilities of about (0.14,0.14,0.72). In every group of seven values, 1 and 2 should each appear once and 3 should appear five times, like this:
3 3 3 1 2 3 3
3 3 3 3 2 1 3
3 2 1 3 3 3 3
3 3 3 1 3 3 2
2 1 3 3 3 3 3
2 1 3 3 3 3 3
3 3 3 2 1 3 3
3 3 3 3 2 3 1
3 2 1 3 3 3 3
So, I will get just one 1 and one 2 in each row. Could you please help me write the code?
One way is to populate a matrix with 3's, then assign 1 and 2 randomly to a column position for each row.
set.seed(1)
m <- matrix(rep(3, 7*10), ncol = 7)
pos <- replicate(10, sample(1:7, 2))
for (i in 1:nrow(m)) m[i, pos[,i]] <- 1:2
m
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7]
#> [1,] 1 3 3 2 3 3 3
#> [2,] 2 3 3 3 3 3 1
#> [3,] 3 1 3 3 2 3 3
#> [4,] 3 3 2 3 3 3 1
#> [5,] 3 2 3 3 3 1 3
#> [6,] 3 3 1 3 3 3 2
#> [7,] 1 3 3 3 2 3 3
#> [8,] 3 2 3 3 1 3 3
#> [9,] 3 3 3 3 3 1 2
#> [10,] 2 1 3 3 3 3 3
Created on 2022-04-15 by the reprex package (v2.0.1)
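Another way to get the same layout, sketched here under the assumption that each row should be an independent random shuffle of one 1, one 2 and five 3's (the names row_values and m2 are made up for this example):

```r
set.seed(1)
row_values <- c(1, 2, 3, 3, 3, 3, 3)   # one 1, one 2, five 3's

# replicate() returns one shuffled vector per column, so transpose
# to get 10 rows of 7 values each.
m2 <- t(replicate(10, sample(row_values)))

dim(m2)               # 10 rows, 7 columns
rowSums(m2 == 1)      # exactly one 1 in every row
rowSums(m2 == 2)      # exactly one 2 in every row
```

Because every row holds exactly 1/7, 1/7 and 5/7 of the three values, the overall proportions are about 0.14, 0.14 and 0.71.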
Suppose I have a column vector of [1 1 1 2 2 2 3 3 3] and I want to generate all the different column vectors only by switching two positions. For an example, one such vector would be
[1 1 3 2 2 2 1 3 3].
Try this (it gives you a matrix, each row of which is a unique vector with 2 elements swapped from the original vector; there are 28 such unique vectors, including the original one):
v <- c(1,1,1,2,2,2,3,3,3)
unique(t(apply(t(combn(1:length(v), 2)), 1, function(x) {v[x] <- v[rev(x)]; v})))
with output:
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,] 1 1 1 2 2 2 3 3 3 # original one
[2,] 2 1 1 1 2 2 3 3 3 # swap 1st & 4th elements
[3,] 2 1 1 2 1 2 3 3 3 # swap 1st & 5th
[4,] 2 1 1 2 2 1 3 3 3 # ...
[5,] 3 1 1 2 2 2 1 3 3
[6,] 3 1 1 2 2 2 3 1 3
[7,] 3 1 1 2 2 2 3 3 1
[8,] 1 2 1 1 2 2 3 3 3
[9,] 1 2 1 2 1 2 3 3 3
[10,] 1 2 1 2 2 1 3 3 3
[11,] 1 3 1 2 2 2 1 3 3
[12,] 1 3 1 2 2 2 3 1 3
[13,] 1 3 1 2 2 2 3 3 1
[14,] 1 1 2 1 2 2 3 3 3
[15,] 1 1 2 2 1 2 3 3 3
[16,] 1 1 2 2 2 1 3 3 3
[17,] 1 1 3 2 2 2 1 3 3
[18,] 1 1 3 2 2 2 3 1 3
[19,] 1 1 3 2 2 2 3 3 1
[20,] 1 1 1 3 2 2 2 3 3
[21,] 1 1 1 3 2 2 3 2 3
[22,] 1 1 1 3 2 2 3 3 2
[23,] 1 1 1 2 3 2 2 3 3
[24,] 1 1 1 2 3 2 3 2 3
[25,] 1 1 1 2 3 2 3 3 2
[26,] 1 1 1 2 2 3 2 3 3
[27,] 1 1 1 2 2 3 3 2 3
[28,] 1 1 1 2 2 3 3 3 2 # swap 6th & 9th
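The count of 28 can be checked directly. This is just a sanity-check sketch (pairs and n_diff are names chosen for the example): of the choose(9, 2) = 36 position pairs, the 9 pairs that hold equal values reproduce the original vector, and the remaining 27 each give a distinct new vector.

```r
v <- c(1, 1, 1, 2, 2, 2, 3, 3, 3)
pairs <- combn(length(v), 2)                    # 2 x 36, one pair per column

# pairs whose two positions hold different values
n_diff <- sum(v[pairs[1, ]] != v[pairs[2, ]])

n_diff        # 27 swaps that change the vector
n_diff + 1    # plus the original vector
```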
First, we can use the utils function combn to generate all of the possible pairs of positions to swap. Here, I am assuming you don't want to swap two equal values (e.g. 1 and 1), so I check each pair to make sure the values differ:
# the vector from the question
startVec <- c(1, 1, 1, 2, 2, 2, 3, 3, 3)
# every pair of positions, one pair per column
allCombo <- combn(length(startVec), 2)
# keep only the pairs whose two values differ
toKeep <- apply(allCombo, 2, function(x) {
  startVec[x[1]] != startVec[x[2]]
})
Then, apply along those that you are keeping, and swap the positions.
outVecs <- apply(allCombo[ , toKeep], 2, function(x){
temp <- startVec
temp[x] <- startVec[rev(x)]
return(temp)
})
This returns a matrix with one swapped vector per column, but you can convert it to a list, which may be easier to manage, like so:
outVecsInList <-
as.list(as.data.frame(outVecs))
head(outVecsInList) shows:
$V1
[1] 2 1 1 1 2 2 3 3 3
$V2
[1] 2 1 1 2 1 2 3 3 3
$V3
[1] 2 1 1 2 2 1 3 3 3
$V4
[1] 3 1 1 2 2 2 1 3 3
$V5
[1] 3 1 1 2 2 2 3 1 3
$V6
[1] 3 1 1 2 2 2 3 3 1
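For reference, here are the same steps in one self-contained piece, sketched with a vectorised comparison in place of the first apply and with lapply for the list conversion. startVec is assumed to be the vector from the question; this is one possible variant, not the only way to write it.

```r
startVec <- c(1, 1, 1, 2, 2, 2, 3, 3, 3)   # vector from the question
allCombo <- combn(length(startVec), 2)

# vectorised: compare the values at the two positions of every pair
toKeep <- startVec[allCombo[1, ]] != startVec[allCombo[2, ]]

outVecs <- apply(allCombo[, toKeep], 2, function(x) {
  temp <- startVec
  temp[x] <- startVec[rev(x)]   # swap the two positions
  temp
})

# one (unnamed) list element per column, i.e. per swapped vector
outVecsInList <- lapply(seq_len(ncol(outVecs)), function(j) outVecs[, j])
length(outVecsInList)
```

Unlike as.list(as.data.frame(...)), this leaves the list elements unnamed.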
How can I create a matrix from multiple column vectors?
I know that I can easily create a data frame with column vectors:
> colA <- 1:5
> colB <- 21:25
> colC <- 31:35
> data.frame(colA, colB, colC)
colA colB colC
1 1 21 31
2 2 22 32
3 3 23 33
4 4 24 34
5 5 25 35
However, when I try matrix(), it gives me unexpected results, as shown below. How can I create my desired matrix? I know I can do as.matrix(df), which nicely preserves the column names, but I'm looking for a more direct approach.
> matrix(colA, colB, colC)
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13]
[1,] 1 2 3 4 5 1 2 3 4 5 1 2 3
[2,] 2 3 4 5 1 2 3 4 5 1 2 3 4
[3,] 3 4 5 1 2 3 4 5 1 2 3 4 5
[4,] 4 5 1 2 3 4 5 1 2 3 4 5 1
[5,] 5 1 2 3 4 5 1 2 3 4 5 1 2
[6,] 1 2 3 4 5 1 2 3 4 5 1 2 3
[7,] 2 3 4 5 1 2 3 4 5 1 2 3 4
[8,] 3 4 5 1 2 3 4 5 1 2 3 4 5
[9,] 4 5 1 2 3 4 5 1 2 3 4 5 1
[10,] 5 1 2 3 4 5 1 2 3 4 5 1 2
[11,] 1 2 3 4 5 1 2 3 4 5 1 2 3
[12,] 2 3 4 5 1 2 3 4 5 1 2 3 4
[13,] 3 4 5 1 2 3 4 5 1 2 3 4 5
[14,] 4 5 1 2 3 4 5 1 2 3 4 5 1
[15,] 5 1 2 3 4 5 1 2 3 4 5 1 2
[16,] 1 2 3 4 5 1 2 3 4 5 1 2 3
[17,] 2 3 4 5 1 2 3 4 5 1 2 3 4
[18,] 3 4 5 1 2 3 4 5 1 2 3 4 5
[19,] 4 5 1 2 3 4 5 1 2 3 4 5 1
[20,] 5 1 2 3 4 5 1 2 3 4 5 1 2
[21,] 1 2 3 4 5 1 2 3 4 5 1 2 3
[,14] [,15] [,16] [,17] [,18] [,19] [,20] [,21] [,22] [,23] [,24] [,25]
[1,] 4 5 1 2 3 4 5 1 2 3 4 5
[2,] 5 1 2 3 4 5 1 2 3 4 5 1
[3,] 1 2 3 4 5 1 2 3 4 5 1 2
[4,] 2 3 4 5 1 2 3 4 5 1 2 3
[5,] 3 4 5 1 2 3 4 5 1 2 3 4
[6,] 4 5 1 2 3 4 5 1 2 3 4 5
[7,] 5 1 2 3 4 5 1 2 3 4 5 1
[8,] 1 2 3 4 5 1 2 3 4 5 1 2
[9,] 2 3 4 5 1 2 3 4 5 1 2 3
[10,] 3 4 5 1 2 3 4 5 1 2 3 4
[11,] 4 5 1 2 3 4 5 1 2 3 4 5
[12,] 5 1 2 3 4 5 1 2 3 4 5 1
[13,] 1 2 3 4 5 1 2 3 4 5 1 2
[14,] 2 3 4 5 1 2 3 4 5 1 2 3
[15,] 3 4 5 1 2 3 4 5 1 2 3 4
[16,] 4 5 1 2 3 4 5 1 2 3 4 5
[17,] 5 1 2 3 4 5 1 2 3 4 5 1
[18,] 1 2 3 4 5 1 2 3 4 5 1 2
[19,] 2 3 4 5 1 2 3 4 5 1 2 3
[20,] 3 4 5 1 2 3 4 5 1 2 3 4
[21,] 4 5 1 2 3 4 5 1 2 3 4 5
[,26] [,27] [,28] [,29] [,30] [,31]
[1,] 1 2 3 4 5 1
[2,] 2 3 4 5 1 2
[3,] 3 4 5 1 2 3
[4,] 4 5 1 2 3 4
[5,] 5 1 2 3 4 5
[6,] 1 2 3 4 5 1
[7,] 2 3 4 5 1 2
[8,] 3 4 5 1 2 3
[9,] 4 5 1 2 3 4
[10,] 5 1 2 3 4 5
[11,] 1 2 3 4 5 1
[12,] 2 3 4 5 1 2
[13,] 3 4 5 1 2 3
[14,] 4 5 1 2 3 4
[15,] 5 1 2 3 4 5
[16,] 1 2 3 4 5 1
[17,] 2 3 4 5 1 2
[18,] 3 4 5 1 2 3
[19,] 4 5 1 2 3 4
[20,] 5 1 2 3 4 5
[21,] 1 2 3 4 5 1
Warning message:
In matrix(colA, colB, colC) :
data length [5] is not a sub-multiple or multiple of the number of rows [21]
You can use cbind to produce the desired matrix:
mat <- cbind(colA, colB, colC)
mat
# colA colB colC
# [1,] 1 21 31
# [2,] 2 22 32
# [3,] 3 23 33
# [4,] 4 24 34
# [5,] 5 25 35
class(mat)
# [1] "matrix"
# (in R >= 4.0.0 this prints "matrix" "array")
You don't get the matrix you're expecting with the call of matrix(colA, colB, colC), because your arguments are getting interpreted as the first, second, and third arguments to the matrix function (aka data, nrow, and ncol). If you wanted to use the matrix function, you would need to provide your data as a single argument, with something like mat <- matrix(c(colA, colB, colC), ncol=3). If you used this syntax, you would not get the column names from the variables like we did with cbind.
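If you do want to go through matrix() and still keep the column names, one option is to pass them via the dimnames argument. A small sketch (mat2 is just a name chosen for the example):

```r
colA <- 1:5
colB <- 21:25
colC <- 31:35

# data is filled column by column, so concatenating the vectors in order
# puts each one in its own column; dimnames supplies the column names.
mat2 <- matrix(c(colA, colB, colC), ncol = 3,
               dimnames = list(NULL, c("colA", "colB", "colC")))
mat2
```

This should give the same values and column names as cbind(colA, colB, colC), at the cost of typing the names twice.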
I have a table which I want to transform:
t LabelA LabelB start stop
[1,] 1 4 9 1 2
[2,] 1 3 5 1 2
[3,] 1 1 6 1 2
[4,] 2 4 9 1 2
[5,] 2 3 5 1 2
[6,] 2 1 6 1 2
[7,] 2 7 2 2 2
[8,] 3 3 5 3 4
[9,] 3 1 6 3 4
[10,] 3 7 2 3 5
[11,] 3 4 9 3 5
I want to filter the data so that rows which differ only in the number in the first column are deduplicated (only the duplicates are removed, not every copy). So for rows 1 and 4 only row 1 should remain in the table. Or for row 3 and 9 only row 9 should remain. It is important that the information in the first column is retained and that the earliest occurrence of each row remains in the table, not the later ones.
You can use duplicated:
mat[!duplicated(as.data.frame(mat[, -1])), ]
t LabelA LabelB start stop
[1,] 1 4 9 1 2
[2,] 1 3 5 1 2
[3,] 1 1 6 1 2
[7,] 2 7 2 2 2
[8,] 3 3 5 3 4
[9,] 3 1 6 3 4
[10,] 3 7 2 3 5
[11,] 3 4 9 3 5
where mat is the name of your matrix.
Try using the duplicated function:
mymx <- matrix(c(1,4,9,1,2 ,1,3,5,1,2 ,1,1,6,1,2 ,2,4,9,1,2 ,2,3,5,1,2 ,2,1,6,1,2 ,2,7,2,2,2 ,3,3,5,3,4 ,3,1,6,3,4 ,3,7,2,3,5 ,3,4,9,3,5), ncol=5, byrow=TRUE)
mymx[!duplicated(mymx[,-1]),]
> mymx[!duplicated(mymx[,-1]),]
[,1] [,2] [,3] [,4] [,5]
[1,] 1 4 9 1 2
[2,] 1 3 5 1 2
[3,] 1 1 6 1 2
[4,] 2 7 2 2 2
[5,] 3 3 5 3 4
[6,] 3 1 6 3 4
[7,] 3 7 2 3 5
[8,] 3 4 9 3 5
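To see why the earliest occurrence is the one that survives: duplicated() on a matrix compares whole rows and flags a row only if an identical row appeared earlier. A short sketch using the same data (keep is a name chosen for the example):

```r
mymx <- matrix(c(1,4,9,1,2, 1,3,5,1,2, 1,1,6,1,2, 2,4,9,1,2,
                 2,3,5,1,2, 2,1,6,1,2, 2,7,2,2,2, 3,3,5,3,4,
                 3,1,6,3,4, 3,7,2,3,5, 3,4,9,3,5),
               ncol = 5, byrow = TRUE)

# drop the first column, then flag rows already seen further up
keep <- !duplicated(mymx[, -1])
which(keep)   # → 1 2 3 7 8 9 10 11
```

Rows 4, 5 and 6 repeat rows 1, 2 and 3 once the first column is ignored, so they are the ones dropped.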
Goal: from a list of vectors of equal length, create a matrix where each vector becomes a row.
Example:
> a <- list()
> for (i in 1:10) a[[i]] <- c(i,1:5)
> a
[[1]]
[1] 1 1 2 3 4 5
[[2]]
[1] 2 1 2 3 4 5
[[3]]
[1] 3 1 2 3 4 5
[[4]]
[1] 4 1 2 3 4 5
[[5]]
[1] 5 1 2 3 4 5
[[6]]
[1] 6 1 2 3 4 5
[[7]]
[1] 7 1 2 3 4 5
[[8]]
[1] 8 1 2 3 4 5
[[9]]
[1] 9 1 2 3 4 5
[[10]]
[1] 10 1 2 3 4 5
I want:
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 1 1 2 3 4 5
[2,] 2 1 2 3 4 5
[3,] 3 1 2 3 4 5
[4,] 4 1 2 3 4 5
[5,] 5 1 2 3 4 5
[6,] 6 1 2 3 4 5
[7,] 7 1 2 3 4 5
[8,] 8 1 2 3 4 5
[9,] 9 1 2 3 4 5
[10,] 10 1 2 3 4 5
One option is to use do.call():
> do.call(rbind, a)
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 1 1 2 3 4 5
[2,] 2 1 2 3 4 5
[3,] 3 1 2 3 4 5
[4,] 4 1 2 3 4 5
[5,] 5 1 2 3 4 5
[6,] 6 1 2 3 4 5
[7,] 7 1 2 3 4 5
[8,] 8 1 2 3 4 5
[9,] 9 1 2 3 4 5
[10,] 10 1 2 3 4 5
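What do.call() does here is pass each list element to rbind as a separate argument. A small sketch with a three-element list (a3 is a made-up name) shows the equivalence:

```r
a3 <- list(c(1, 1:5), c(2, 1:5), c(3, 1:5))

via_do_call <- do.call(rbind, a3)
by_hand     <- rbind(a3[[1]], a3[[2]], a3[[3]])

all(via_do_call == by_hand)   # same matrix either way
```

The do.call form is the one that scales, since it does not need the number of list elements to be written out.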
simplify2array is a base function that is fairly intuitive. However, since R's default is to fill in data by columns first, you will need to transpose the output. (sapply uses simplify2array, as documented in help(sapply).)
> t(simplify2array(a))
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 1 1 2 3 4 5
[2,] 2 1 2 3 4 5
[3,] 3 1 2 3 4 5
[4,] 4 1 2 3 4 5
[5,] 5 1 2 3 4 5
[6,] 6 1 2 3 4 5
[7,] 7 1 2 3 4 5
[8,] 8 1 2 3 4 5
[9,] 9 1 2 3 4 5
[10,] 10 1 2 3 4 5
The built-in matrix function has the nice option to enter data byrow. Combining that with unlist on your source list will give you a matrix. We also need to specify the number of rows so it can break up the unlisted data. That is:
> matrix(unlist(a), byrow=TRUE, nrow=length(a) )
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 1 1 2 3 4 5
[2,] 2 1 2 3 4 5
[3,] 3 1 2 3 4 5
[4,] 4 1 2 3 4 5
[5,] 5 1 2 3 4 5
[6,] 6 1 2 3 4 5
[7,] 7 1 2 3 4 5
[8,] 8 1 2 3 4 5
[9,] 9 1 2 3 4 5
[10,] 10 1 2 3 4 5
Not straightforward, but it works:
> t(sapply(a, unlist))
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 1 1 2 3 4 5
[2,] 2 1 2 3 4 5
[3,] 3 1 2 3 4 5
[4,] 4 1 2 3 4 5
[5,] 5 1 2 3 4 5
[6,] 6 1 2 3 4 5
[7,] 7 1 2 3 4 5
[8,] 8 1 2 3 4 5
[9,] 9 1 2 3 4 5
[10,] 10 1 2 3 4 5
t(sapply(a, '[', 1:max(sapply(a, length))))
where 'a' is a list.
This also works when the vectors have unequal lengths; shorter rows are padded with NA.
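A quick sketch of the unequal-length case, using a made-up list b. Indexing a vector past its end returns NA, which is what fills the short rows:

```r
b <- list(1:3, 1:5, 1:2)          # vectors of different lengths
max_len <- max(lengths(b))        # longest vector decides the column count

# `[` is applied to each element with the index 1:max_len
res <- t(sapply(b, `[`, seq_len(max_len)))
res
```

Here res has 3 rows and 5 columns, with NA in the positions where the first and third vectors ran out of values.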
> library(plyr)
> as.matrix(ldply(a))
V1 V2 V3 V4 V5 V6
[1,] 1 1 2 3 4 5
[2,] 2 1 2 3 4 5
[3,] 3 1 2 3 4 5
[4,] 4 1 2 3 4 5
[5,] 5 1 2 3 4 5
[6,] 6 1 2 3 4 5
[7,] 7 1 2 3 4 5
[8,] 8 1 2 3 4 5
[9,] 9 1 2 3 4 5
[10,] 10 1 2 3 4 5