I am trying to generate all possible combinations of 0 and 1's in a vector of length 14. Is there an easy way of getting that output as a list of vectors, or even better, a dataframe?
To demonstrate better what I am looking for, let's suppose that I only want a vector of length 3. I would like to be able to generate the following:
(1,1,1), (0,0,0), (1,1,0), (1,0,0), (1,0,1), (0,1,0), (0,1,1), (0,0,1)
You're looking for expand.grid.
expand.grid(0:1, 0:1, 0:1)
Or, for the long case:
n <- 14
l <- rep(list(0:1), n)
expand.grid(l)
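Since the question also asks for a list of vectors, here is a minimal sketch of turning the expand.grid() result into one, using a small n so the output is easy to check. The names grid and rows are just illustrative.

```r
n <- 3                                  # small n so the result is easy to inspect
grid <- expand.grid(rep(list(0:1), n))  # data frame with 2^n rows and n columns

# One integer vector per row; unname() drops the Var1..Varn labels
rows <- lapply(seq_len(nrow(grid)), function(i) unname(unlist(grid[i, ])))

nrow(grid)    # 8, i.e. 2^n
rows[[2]]     # 1 0 0 (expand.grid varies the first column fastest)
```

For n = 14 the same code should give 16384 rows; the row-by-row lapply() is chosen for clarity rather than speed.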
tidyr has a couple of options similar to expand.grid().
tidyr::crossing() returns a tibble and does not convert strings to factors (though you could get the same with expand.grid(..., stringsAsFactors = FALSE)).
library(tidyr)
crossing(var1 = 0:1, var2 = 0:1, var3 = 0:1)
# A tibble: 8 x 3
var1 var2 var3
<int> <int> <int>
1 0 0 0
2 0 0 1
3 0 1 0
4 0 1 1
5 1 0 0
6 1 0 1
7 1 1 0
8 1 1 1
tidyr::expand() can return either only the combinations of values that actually appear in the data, like this:
expand(mtcars, nesting(vs, cyl))
# A tibble: 5 x 2
vs cyl
<dbl> <dbl>
1 0 4
2 0 6
3 0 8
4 1 4
5 1 6
or all possible combinations of two variables, even if no observation in the data has those specific values, like this:
expand(mtcars, vs, cyl)
# A tibble: 6 x 2
vs cyl
<dbl> <dbl>
1 0 4
2 0 6
3 0 8
4 1 4
5 1 6
6 1 8
(You can see that there were no observations in the original data where vs == 1 & cyl == 8.)
tidyr::complete() can also be used similarly to expand.grid(). This is an example from the docs:
df <- dplyr::tibble(
group = c(1:2, 1),
item_id = c(1:2, 2),
item_name = c("a", "b", "b"),
value1 = 1:3,
value2 = 4:6
)
df %>% complete(group, nesting(item_id, item_name))
# A tibble: 4 x 5
group item_id item_name value1 value2
<dbl> <dbl> <chr> <int> <int>
1 1 1 a 1 4
2 1 2 b 3 6
3 2 1 a NA NA
4 2 2 b 2 5
This gives all possible combinations of item_id and item_name for each group: it creates a row for group = 2, item_id = 1 and item_name = a.
As an alternative to @Justin's approach, you can also use CJ from the "data.table" package. Here, I've also made use of replicate to create my list of 14 zeroes and ones.
library(data.table)
do.call(CJ, replicate(14, 0:1, FALSE))
# V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14
# 1: 0 0 0 0 0 0 0 0 0 0 0 0 0 0
# 2: 0 0 0 0 0 0 0 0 0 0 0 0 0 1
# 3: 0 0 0 0 0 0 0 0 0 0 0 0 1 0
# 4: 0 0 0 0 0 0 0 0 0 0 0 0 1 1
# 5: 0 0 0 0 0 0 0 0 0 0 0 1 0 0
# ---
# 16380: 1 1 1 1 1 1 1 1 1 1 1 0 1 1
# 16381: 1 1 1 1 1 1 1 1 1 1 1 1 0 0
# 16382: 1 1 1 1 1 1 1 1 1 1 1 1 0 1
# 16383: 1 1 1 1 1 1 1 1 1 1 1 1 1 0
# 16384: 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Here I discuss a generic approach for solving questions of this type. First, let's see how the solutions evolve as the length N increases, to find the general pattern.
First, the solution for length 1 is
0
1
Now for length 2, the solution becomes (2nd column separated by |):
0 | 0 0, 0 1
1 | 1 0, 1 1
Comparing it with the previous solution for length 1, it is clear that to obtain this new solution we simply append 0 and 1 to each of the previous solutions (1st column, 0 and 1).
Now for length 3, the solution is (3rd column):
0 | 0 0 | 0 0 0, 0 0 1
1 | 1 0 | 1 0 0, 1 0 1
| 0 1 | 0 1 0, 0 1 1
| 1 1 | 1 1 0, 1 1 1
Again, this new solution is obtained by appending 0 and 1 to each of the previous solutions (2nd column, for length 2).
This observation naturally leads to a recursive solution. Assume we have already obtained the solution for length N-1, solution(c(0,1), N-1). To obtain the solution for length N, we simply append 0 and 1 to each item of that solution: append_each_to_list(solution(c(0,1), N-1), c(0,1)). Notice how a more complex problem (solving N) is naturally decomposed into a simpler problem (solving N-1).
Then we just need to translate this plain English into R code almost literally:
# assume you have got solution for a shorter length len-1 -> solution(v, len-1)
# the solution of length len will be the solution of shorter length appended with each element in v
solution <- function(v, len) {
if (len<=1) {
as.list(v)
} else {
append_each_to_list(solution(v, len-1), v)
}
}
# function to append each element in vector v to list L and return a list
append_each_to_list <- function(L, v) {
purrr::flatten(lapply(v,
function(n) lapply(L, function(l) c(l, n))
))
}
To call the function:
> solution(c(1,0), 3)
[[1]]
[1] 1 1 1
[[2]]
[1] 0 1 1
[[3]]
[1] 1 0 1
[[4]]
[1] 0 0 1
[[5]]
[1] 1 1 0
[[6]]
[1] 0 1 0
[[7]]
[1] 1 0 0
[[8]]
[1] 0 0 0
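If you would rather not depend on purrr, the same recursion can be sketched in base R by flattening one level with unlist(recursive = FALSE). The function name all_vectors is hypothetical and not part of the answer above.

```r
# Hypothetical base-R variant of the recursive idea; the name is illustrative
all_vectors <- function(v, len) {
  if (len <= 1) return(as.list(v))
  shorter <- all_vectors(v, len - 1)
  # For every value in v, append it to every shorter vector, then
  # flatten the nested list by one level.
  unlist(lapply(v, function(x) lapply(shorter, function(s) c(s, x))),
         recursive = FALSE)
}

res <- all_vectors(c(0, 1), 3)
length(res)   # 8
res[[1]]      # 0 0 0
res[[8]]      # 1 1 1
```

As with the purrr version, the recursion depth equals the requested length, so length 14 should not be a problem.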
There are 16384 possible permutations. You can use the iterpc package to fetch the result iteratively.
library(iterpc)
I <- iterpc(2, 14, labels = c(0, 1), ordered = TRUE, replace = TRUE)
getnext(I)
# [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0
getnext(I)
# [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 1
getnext(I)
# [1] 0 0 0 0 0 0 0 0 0 0 0 0 1 0
If you want all results, you can still use getall(I).
Since you are dealing with 0's and 1's, it seems natural to think of integers in terms of bits. Using a function that has been slightly altered from this post (MyIntToBit below), along with your choice of apply functions, we can get the desired result.
MyIntToBit <- function(x, dig) {
i <- 0L
string <- numeric(dig)
while (x > 0) {
string[dig - i] <- x %% 2L
x <- x %/% 2L
i <- i + 1L
}
string
}
If you want a list, use lapply like so:
lapply(0:(2^14 - 1), function(x) MyIntToBit(x,14))
If you prefer a matrix, sapply will do the trick:
sapply(0:(2^14 - 1), function(x) MyIntToBit(x,14))
Below are example outputs:
> lapply(0:(2^3 - 1), function(x) MyIntToBit(x,3))
[[1]]
[1] 0 0 0
[[2]]
[1] 0 0 1
[[3]]
[1] 0 1 0
[[4]]
[1] 0 1 1
[[5]]
[1] 1 0 0
[[6]]
[1] 1 0 1
[[7]]
[1] 1 1 0
[[8]]
[1] 1 1 1
> sapply(0:(2^3 - 1), function(x) MyIntToBit(x,3))
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,] 0 0 0 0 1 1 1 1
[2,] 0 0 1 1 0 0 1 1
[3,] 0 1 0 1 0 1 0 1
This is a different approach to the previous answers. If you need all possible combinations of 14 values of 1 and 0, it's like generating all possible numbers from 0 to (2^14)-1 and keeping the binary representation of them.
n <- 14
lapply(0:(2^n-1), FUN=function(x) head(as.integer(intToBits(x)),n))
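Note that intToBits() returns the least significant bit first, so each vector above reads right to left compared with the usual binary notation. As a rough sketch of the same idea without a per-number function call, the bits can be computed with integer division and outer(), giving a matrix with the most significant bit first. The names x and bits are illustrative.

```r
n <- 3
x <- 0:(2^n - 1)
# Bit j of x is (x %/% 2^j) %% 2; columns run over j = n-1, ..., 0
bits <- outer(x, (n - 1):0, function(value, j) (value %/% 2^j) %% 2)
bits[6, ]   # 1 0 1, the binary representation of 5
```

Setting n <- 14 should give a 16384 x 14 matrix, which as.data.frame() can turn into a data frame.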
Preface
Many nice answers here. I want to add one for those of us who can't seem to wrap our heads around the provided implementations. The solutions here are essentially generalizations of loops, which is why recursive solutions look so elegant. No one outright wrote it as a loop, and I think there is merit in giving the most straightforward solution.
This is not guaranteed to have great performance, and most of the other answers are more practical. The purpose is to let you trace out what's actually happening.
The Math
Strictly speaking, a combination is a selection in which the order of the elements does not matter. What this question asks for is ordered selections with repetition, so [0, 1] is different from [1, 0]. Your list has n elements and you are filling k positions, each of which can take any of the n elements, for a total of n^k results.
Ex.
You have three letters, ['a', 'b', 'c'], and want to find all unique ways to arrange two of these letters, allowing letters to be pulled repeatedly (so ['a', 'a'] is allowed). Here n = 3 and k = 2: we have three things and want to find all of the different ways to pick two of them in order. There are 9 ways to make this selection (n^k = 3^2).
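The count is easy to check numerically. A minimal sketch using expand.grid() from the earlier answers, where picks is just an illustrative name:

```r
set <- c("a", "b", "c")
k <- 2
# Every ordered pick of k letters, with repetition allowed
picks <- expand.grid(rep(list(set), k), stringsAsFactors = FALSE)
nrow(picks) == length(set)^k   # TRUE: 9 ordered selections
```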
The Code
As mentioned, the simplest solution requires a whole lotta loops.
Keep adding loops and values to select from as your value of k increases.
set <- c("a", "b", "c")
n <- length(set)
# k = 1
# There are only three ways to pick one thing from a selection of three items!
sprintf("Number of combinations:%4d", n^1)
for(i in seq_along(set)){
print(paste(set[i]))
}
# k = 2
sprintf("Number of combinations:%4d", n^2)
for(i in seq_along(set)){
for(j in seq_along(set)){
print(paste(set[i], set[j]))
}
}
# k = 3
sprintf("Number of combinations:%4d", n^3)
for(i in seq_along(set)){
for(j in seq_along(set)){
for(k in seq_along(set)){
print(paste(set[i], set[j], set[k]))
}
}
}
# See the pattern? The value of k corresponds
# to the number of loops and to the number of
# indexes on `set`
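To avoid typing one loop per value of k, the nested loops can be simulated with a vector of counters that is advanced like an odometer. This is only a sketch of that idea; idx, out and pos are illustrative names, and the output order should match the nested loops above.

```r
set <- c("a", "b", "c")
k <- 2
n <- length(set)

idx <- rep(1L, k)                 # one counter per "loop", all starting at 1
out <- character(n^k)
for (row in seq_len(n^k)) {
  out[row] <- paste(set[idx], collapse = " ")
  # Advance the counters like an odometer: bump the last one and
  # carry to the left whenever a counter runs past n.
  pos <- k
  while (pos >= 1) {
    idx[pos] <- idx[pos] + 1L
    if (idx[pos] <= n) break
    idx[pos] <- 1L
    pos <- pos - 1
  }
}
out
```

Changing k is now a one-line edit instead of an extra level of nesting.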
A purrr solution with cross() and its variant cross_df() (both are deprecated as of purrr 1.0.0 in favour of tidyr::expand_grid()):
library(purrr)
cross(list(0:1, 0:1, 0:1)) %>% simplify_all()
# [[1]]
# [1] 0 0 0
#
# [[2]]
# [1] 1 0 0
#
# [[3]]
# [1] 0 1 0
#
# ...
#
# [[8]]
# [1] 1 1 1
cross_df(list(var1 = 0:1, var2 = 0:1, var3 = 0:1))
# # A tibble: 8 × 3
# var1 var2 var3
# <int> <int> <int>
# 1 0 0 0
# 2 1 0 0
# 3 0 1 0
# 4 1 1 0
# 5 0 0 1
# 6 1 0 1
# 7 0 1 1
# 8 1 1 1
With dplyr, you could use full_join(x, y, by = character()) to perform a cross-join, generating all combinations of x and y. (From dplyr 1.1.0 onwards, cross_join(x, y) is the recommended way to do this.)
library(dplyr)
Reduce(\(x, y) full_join(x, y, by = character()),
       list(tibble(var1 = 0:1), tibble(var2 = 0:1), tibble(var3 = 0:1)))
# # A tibble: 8 × 3
# var1 var2 var3
# <int> <int> <int>
# 1 0 0 0
# 2 0 0 1
# 3 0 1 0
# 4 0 1 1
# 5 1 0 0
# 6 1 0 1
# 7 1 1 0
# 8 1 1 1
A beautiful minimal reproducible example here:
x <- c("red", "blue", "black")
do.call(c, lapply(seq_along(x), combn, x = x, simplify = FALSE))
# [[1]]
# [1] "red"
#
# [[2]]
# [1] "blue"
#
# [[3]]
# [1] "black"
#
# [[4]]
# [1] "red" "blue"
#
# [[5]]
# [1] "red" "black"
#
# [[6]]
# [1] "blue" "black"
#
# [[7]]
# [1] "red" "blue" "black"
All credit goes to @RichScriven
Say I have a list (a, b, c). I want to find all the possible combinations of them and store them in a matrix like:
a b c
[1,] 1 0 0
[2,] 0 1 0
[3,] 0 0 1
[4,] 1 1 0
[5,] 1 0 1
[6,] 0 1 1
[7,] 1 1 1
I don't know how to do it. Thanks for the help!
To do exactly what you want, use permutations() from the gtools package. This works as follows:
library(gtools)
m <- permutations(2, 3, v = c(0, 1), repeats.allowed = TRUE)
colnames(m) <- c('a', 'b', 'c')
# delete the all-zero row [0,0,0]
m <- m[-1, ]
Yields:
a b c
[1,] 0 0 1
[2,] 0 1 0
[3,] 0 1 1
[4,] 1 0 0
[5,] 1 0 1
[6,] 1 1 0
[7,] 1 1 1
Idea was taken from the comment section under this question:
Generate all combinations of length 2 using 3 letters
My adaptation is not very elegant... but it seems to do the job.
output <- expand.grid(rep(list(c('a', 'b', 'c')), 3))
colnames(output) <- c('a', 'b', 'c')
for (col in colnames(output)) {
output[, col] <- as.character(output[,col])
output[, col] <- ifelse(output[, col]==col, 1, 0)
}
output <- output[!duplicated(output), ]
rownames(output) <- NULL
print(output)
# a b c
# 1 1 0 0
# 2 0 0 0
# 3 1 1 0
# 4 0 1 0
# 5 1 0 1
# 6 0 0 1
# 7 1 1 1
# 8 0 1 1
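A shorter route to the same matrix, sketched here with base R only, is to build the 0/1 grid directly, drop the all-zero row, and optionally sort by the number of ones so that the rows come out in the order shown in the question.

```r
m <- as.matrix(expand.grid(a = 0:1, b = 0:1, c = 0:1))
m <- m[rowSums(m) > 0, ]      # drop the all-zero row
m <- m[order(rowSums(m)), ]   # optional: singles first, then pairs, then the triple
m
```

The ordering step relies on order() keeping tied rows in their original sequence, which is why the single-element rows appear as a, b, c.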
I'm looking to do a special type of dummy coding procedure in R. Basically, I want a function that looks at each level of an ordinal variable and codes them sequentially. Here's what I need:
Variable_1 --> Variable_1a Variable_1b Variable_1c
1 1 0 0
1 1 0 0
2 1 1 0
2 1 1 0
3 1 1 1
3 1 1 1
I can do this manually using ifelse statements but there has to be an easier way.
Here is one way:
x <- c(1,1,2,2,2,3,3)
cbind(x,sapply(unique(x),function(y)ifelse(x>=y,1,0)))
# [1,] 1 1 0 0
# [2,] 1 1 0 0
# [3,] 2 1 1 0
# [4,] 2 1 1 0
# [5,] 2 1 1 0
# [6,] 3 1 1 1
# [7,] 3 1 1 1
Another way:
#simulate your variable
set.seed(1)
var1 <- sample(1:3, 10, replace = TRUE)
#initialise the matrix and set the colnames
res <- matrix(0,nrow=length(var1),ncol=max(var1))
colnames(res) <- paste0("Variable_1",letters[1:max(var1)])
#set the 1 elements
res[cbind(rep(1:length(var1),var1),unlist(lapply(var1,seq_len)))] <- 1
x <- c(1,1,2,2,2,3,3)
sapply(1:max(x),`<=`,x)*1
# [,1] [,2] [,3]
# [1,] 1 0 0
# [2,] 1 0 0
# [3,] 1 1 0
# [4,] 1 1 0
# [5,] 1 1 0
# [6,] 1 1 1
# [7,] 1 1 1
Here's another answer using model.matrix (assume your variable is x):
x <- c(1,1,2,2,3,3)
m <- model.matrix(~ 0 + factor(x))
for(i in 1:nrow(m)) m[i, 1:max.col(m, ties.method = "last")[i]] <- 1
m
# factor(x)1 factor(x)2 factor(x)3
# 1 1 0 0
# 2 1 0 0
# 3 1 1 0
# 4 1 1 0
# 5 1 1 1
# 6 1 1 1
# attr(,"assign")
# [1] 0 1 1
# attr(,"contrasts")
# attr(,"contrasts")$`factor(x)`
# [1] "contr.treatment"
To get the reverse is maybe a bit more complicated:
m <- model.matrix(~ 0 + factor(x))
m <- m[,ncol(m):1]
for(i in 1:nrow(m)) m[i, 1:max.col(m, ties.method = "last")[i]] <- 1
m[,ncol(m):1]
# factor(x)1 factor(x)2 factor(x)3
# 1 1 1 1
# 2 1 1 1
# 3 0 1 1
# 4 0 1 1
# 5 0 0 1
# 6 0 0 1
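The comparisons used in the answers above can be wrapped in a small reusable function. This is only a sketch: the name cumulative_dummies and the letter-suffix naming scheme are assumptions made for illustration, and the letters suffix limits it to 26 levels.

```r
# Hypothetical helper; the name and the column-naming scheme are illustrative
cumulative_dummies <- function(x, prefix = "Variable_1") {
  lv <- sort(unique(x))
  m <- outer(x, lv, ">=") * 1   # 1 where the observation reaches that level
  colnames(m) <- paste0(prefix, letters[seq_along(lv)])
  m
}

res <- cumulative_dummies(c(1, 1, 2, 2, 3, 3))
res[3, ]   # 1 1 0 for an observation at level 2
```

Replacing ">=" with "<=" should give the reversed coding shown in the last answer.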