R: all combinations of nested list of variable length [duplicate]

I'm not sure if permutations is the correct word for this. Given a set of n vectors (e.g. [1,2], [3,4] and [2,3]), I want to permute them all and get an output of
[1,3,2],[1,3,3],[1,4,2],[1,4,3],[2,3,2] etc.
Is there an operation in R that will do this?

expand.grid() is the standard function you want. This is also a useful case for storing the vectors in a list and using do.call() to arrange the appropriate function call for you, so that you don't have to type out or name the individual vectors:
> l <- list(a = 1:2, b = 3:4, c = 2:3)
> do.call(expand.grid, l)
  a b c
1 1 3 2
2 2 3 2
3 1 4 2
4 2 4 2
5 1 3 3
6 2 3 3
7 1 4 3
8 2 4 3
However, for all my cleverness, it turns out that expand.grid() accepts a list:
> expand.grid(l)
  a b c
1 1 3 2
2 2 3 2
3 1 4 2
4 2 4 2
5 1 3 3
6 2 3 3
7 1 4 3
8 2 4 3
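Because expand.grid() takes a list, the vectors do not need to have the same length. A minimal sketch, assuming a made-up list `l2` (not from the question) whose elements have lengths 3, 2 and 1:

```r
# Hypothetical input: list elements of different lengths and types.
l2 <- list(a = 1:3, b = c("x", "y"), c = 10)

# stringsAsFactors = FALSE keeps the character vector as character.
g2 <- expand.grid(l2, stringsAsFactors = FALSE)

# One row per combination: the row count is the product of the lengths.
nrow(g2) == prod(lengths(l2))
```

The same call works for any number of list elements, which is why the list form is convenient when the vectors are built programmatically.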

This is what expand.grid does.
Quoting from the help page: Create a data frame from all combinations of the supplied vectors or factors. The result is a data.frame with a row for each combination.
expand.grid(
c(1, 2),
c(3, 4),
c(2, 3)
)
  Var1 Var2 Var3
1    1    3    2
2    2    3    2
3    1    4    2
4    2    4    2
5    1    3    3
6    2    3    3
7    1    4    3
8    2    4    3
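Note that expand.grid() varies the first vector fastest, whereas the output listed in the question varies the last one fastest. One way to get that order, sketched here on the assumption that the list elements are named, is to expand the reversed list and then put the columns back in their original order:

```r
l <- list(a = 1:2, b = 3:4, c = 2:3)

# Expand the reversed list so that `c` varies fastest, then restore column order.
g <- expand.grid(rev(l))[, names(l)]

# Turn each row into a plain vector, e.g. c(1, 3, 2).
combos <- lapply(seq_len(nrow(g)), function(i) unlist(g[i, ], use.names = FALSE))
combos[1:3]
```

Here `combos` is a list of eight vectors, starting with 1 3 2, then 1 3 3, then 1 4 2, as in the question.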

As an alternative to expand.grid() you could use rep() to produce the desired combination. Consider the following simplified example using the original data from this question:
a <- c(1,2)
b <- c(3,4)
c <- c(2,3)
To get the expand.grid()-like effect, use rep() on the first vector with a times= argument equal to the product of the lengths of the other vectors (here 4). The middle vector uses a nested rep() with the products of the vector lengths on either side (here 2 and 2). The last vector is handled like the first, but with the each= argument so that the pattern lines up. This is trivial to calculate when each vector has length 2. Example:
#tibble of all combinations of a, b and c
tibble::tibble(
var1 = rep(a, times = 4),
var2 = rep(rep(b, each= 2), times = 2), #nested rep()
var3 = rep(c, each= 4)
)
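As a quick sanity check, the rep() pattern can be compared cell by cell with expand.grid(). This is a sketch that uses a base data frame rather than a tibble so it needs no packages; `manual` and `eg` are illustrative names:

```r
a <- c(1, 2)
b <- c(3, 4)
c <- c(2, 3)

# Same rep() pattern as above, in a base data frame instead of a tibble.
manual <- data.frame(
  var1 = rep(a, times = 4),
  var2 = rep(rep(b, each = 2), times = 2),
  var3 = rep(c, each = 4)
)

eg <- expand.grid(a, b, c)

# TRUE if every cell agrees (the column names differ, the values do not).
all(as.matrix(manual) == as.matrix(eg))
```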
For an unknown number of input vectors (or unknown vector lengths), we can get all combinations with rep() in a function like this:
#Produces a tibble of all combinations of input vectors
expand_tibble <- function(...){
  x <- list(...) #all input vectors stored here
  l <- lapply(x,length)|> unlist() #vector showing length of each input vector
  t <- length(l) #total input vector count
  r <- list() #empty list
  for(i in seq_len(t)){
    #product of the lengths of the later vectors (prod() of nothing is 1)
    after <- l[-seq_len(i)] |> prod()
    #product of the lengths of the earlier vectors (prod() of nothing is 1)
    before <- l[seq_len(i-1)] |> prod()
    #repeat each element for every combination of the later vectors,
    #then repeat the whole pattern once per combination of the earlier vectors
    r[[i]] <- rep(rep(x[[i]], each = after), times = before)
    names(r)[i] <- paste0("var",i)
  }
  tibble::as_tibble(r)
}
Output (unlike expand.grid(), the first vector varies slowest here):
expand_tibble(a,b,c)
   var1  var2  var3
  <dbl> <dbl> <dbl>
1     1     3     2
2     1     3     3
3     1     4     2
4     1     4     3
5     2     3     2
6     2     3     3
7     2     4     2
8     2     4     3
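The each/times rule also has to hold when the vectors differ in length. A minimal base-R sketch, with `rep_grid` as a hypothetical helper name, that reproduces the expand.grid() row order (first vector varying fastest): each vector is repeated element-wise by the product of the lengths before it, and the whole pattern is then repeated by the product of the lengths after it.

```r
# Hypothetical helper: all combinations of a list of vectors, built with rep().
rep_grid <- function(x) {
  n <- lengths(x)
  cols <- lapply(seq_along(x), function(i) {
    each  <- prod(n[seq_len(i - 1)])  # lengths of earlier vectors (1 if none)
    times <- prod(n[-seq_len(i)])     # lengths of later vectors (1 if none)
    rep(rep(x[[i]], each = each), times = times)
  })
  names(cols) <- paste0("var", seq_along(x))
  as.data.frame(cols)
}

# Unequal lengths: 2 * 2 * 3 = 12 combinations.
x <- list(1:2, 3:4, 5:7)
g <- rep_grid(x)
all(as.matrix(g) == as.matrix(expand.grid(x)))
```

Testing with unequal lengths matters here, because swapping the two products goes unnoticed when every vector has the same length.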

Related

How to repeat the indices of a vector based on the values of that same vector?

Given a random integer vector below:
z <- c(3, 2, 4, 2, 1)
I'd like to create a new vector that contains each index of z repeated the number of times given by the corresponding element of z. To illustrate, the desired result in this case should be:
[1] 1 1 1 2 2 3 3 3 3 4 4 5
There must be a simple way to do this.
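You can use rep and seq_along to repeat the indices of a vector based on the values of that same vector: seq_along gets the indices and rep repeats them. (seq_along(z) is safer than seq(z), which treats a z of length one as an upper bound rather than as a vector.)
rep(seq_along(z), z)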
# [1] 1 1 1 2 2 3 3 3 3 4 4 5
Start with all the indices of the vector z, which are given by:
1:length(z)
Then these elements should be repeated. The number of times these numbers should be repeated is specified by the values of z. This can be done using a combination of the lapply or sapply function and the rep function:
unlist(lapply(X = 1:length(z), FUN = function(x) rep(x = x, times = z[x])))
[1] 1 1 1 2 2 3 3 3 3 4 4 5
unlist(sapply(X = 1:length(z), FUN = function(x) rep(x = x, times = z[x])))
[1] 1 1 1 2 2 3 3 3 3 4 4 5
Both alternatives give the same result.
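Two edge cases are worth checking for the rep() approach: a vector of length one and counts of zero. A short sketch (the vectors `z1` and `z0` are made up for illustration):

```r
z <- c(3, 2, 4, 2, 1)
idx <- rep(seq_along(z), times = z)

# With a single element, seq() counts up to the value instead of indexing it.
z1 <- 5
seq(z1)        # 1 2 3 4 5
seq_along(z1)  # 1

# A count of zero simply drops that index.
z0 <- c(2, 0, 1)
idx0 <- rep(seq_along(z0), times = z0)  # 1 1 3
```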

R How to permute all rows of a data frame such that all possible combinations of rows are returned in a list?

I'm trying to produce all possible row permutations of a data frame (or matrix if that's easier) and have the result returned as a list or array of data frames/matrices. I've constructed a mock data frame that has the same dimensions as the one I'm working with.
test.df <- as.data.frame(matrix(1:80,nrow=16,ncol=5))
Edit: changed combinations to permutations
v.df <- data.frame(symbol = c("a", "b", "c"), number = c(1,2,3))
v.df
##   symbol number
## 1      a      1
## 2      b      2
## 3      c      3
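library(gtools) # provides permutations()
permutate.rows <- function(df) {
  k <- dim(df)[1] # number of rows
  # each column of index.df is one permutation of the row indices
  index.df <- as.data.frame(t(permutations(n = k, r = k, v = 1:k)))
  # reorder the rows of df once per permutation; returned visibly so the call prints
  lapply(index.df, function(idx) df[idx, , drop = FALSE])
}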
permutate.rows(v.df)
This gives the list of all permuted data frames:
$V1
  symbol number
1      a      1
2      b      2
3      c      3

$V2
  symbol number
1      a      1
3      c      3
2      b      2

$V3
  symbol number
2      b      2
1      a      1
3      c      3

$V4
  symbol number
2      b      2
3      c      3
1      a      1

$V5
  symbol number
3      c      3
1      a      1
2      b      2

$V6
  symbol number
3      c      3
2      b      2
1      a      1
To apply it to your example, pass your own 16-row data frame instead of this 3-row one.
I shortened the df because 16! = 20922789888000.
library(purrr)
library(combinat)
test.df <- as.data.frame(matrix(1:25,nrow=5,ncol=5))
map(permn(1:nrow(test.df)), function(x) test.df[x,])
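If installing a package is not an option, the index permutations can also be generated in base R. The following is only a sketch: `perms` is a hypothetical recursive helper, and like the approaches above it is practical only for a small number of rows, since the number of results grows as n!.

```r
# Hypothetical helper: all orderings of the vector v, as a list of vectors.
perms <- function(v) {
  if (length(v) <= 1) return(list(v))
  out <- list()
  for (i in seq_along(v)) {
    # put v[i] first, followed by every ordering of the remaining elements
    for (p in perms(v[-i])) {
      out[[length(out) + 1]] <- c(v[i], p)
    }
  }
  out
}

v.df <- data.frame(symbol = c("a", "b", "c"), number = c(1, 2, 3))

# One reordered copy of the data frame per permutation of the row indices.
row_perms <- lapply(perms(seq_len(nrow(v.df))),
                    function(idx) v.df[idx, , drop = FALSE])
length(row_perms) == factorial(nrow(v.df))
```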

Generating random number by length of blocks of data in R data frame

I am trying to simulate the measuring order n times and see how the measuring order affects my study subject. To do this, I am trying to generate random integers in a new column of a data frame. I have a big data frame, and I would like to add a column that contains random numbers chosen according to the number of observations in each block.
Example of data(each row is an observation):
df <- data.frame(A=c(1,1,1,2,2,3,3,3,3),
B=c("x","b","c","g","h","g","g","u","l"),
C=c(1,2,4,1,5,7,1,2,5))
  A B C
1 1 x 1
2 1 b 2
3 1 c 4
4 2 g 1
5 2 h 5
6 3 g 7
7 3 g 1
8 3 u 2
9 3 l 5
What I'd like to do is add a D column and generate random integer numbers according to the length of each block. Blocks are defined in column A.
Result should look something like this:
df <- data.frame(A=c(1,1,1,2,2,3,3,3,3),
B=c("x","b","c","g","h","g","g","u","l"),
C=c(1,2,4,1,5,7,1,2,5),
D=c(2,1,3,2,1,4,3,1,2))
> df
  A B C D
1 1 x 1 2
2 1 b 2 1
3 1 c 4 3
4 2 g 1 2
5 2 h 5 1
6 3 g 7 4
7 3 g 1 3
8 3 u 2 1
9 3 l 5 2
I have tried to use R's sample() function to generate the random numbers, but my problem is splitting the data according to block length and adding the new column. Any help is greatly appreciated.
It can be done easily with ave
df$D <- ave( df$A, df$A, FUN = function(x) sample(length(x)) )
(You could replace length() with max() or similar, but length() works even when the values in A do not match the sizes of their blocks.)
This is really easy with ddply from plyr.
library(plyr)
ddply(df, .(A), transform, D = sample(length(A)))
The longer manual version is:
Use split to split the data frame by the first column.
split_df <- split(df, df$A)
Then call sample on each member of the list.
split_df <- lapply(split_df, function(df)
{
df$D <- sample(nrow(df))
df
})
Then recombine with
df <- do.call(rbind, split_df)
One simple way:
df$D = 0
counts = table(df$A)
for (i in 1:length(counts)){
df$D[df$A == names(counts)[i]] = sample(counts[i])
}
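Whichever variant is used, the result can be checked by confirming that D is a permutation of 1 to the block size within every block. A sketch using the ave() approach; the seed and the name `n_sim` are arbitrary choices for illustration:

```r
set.seed(1)  # arbitrary seed, only to make the sketch reproducible

df <- data.frame(A = c(1, 1, 1, 2, 2, 3, 3, 3, 3),
                 C = c(1, 2, 4, 1, 5, 7, 1, 2, 5))
df$D <- ave(df$A, df$A, FUN = function(x) sample(length(x)))

# Within each block, D should be a permutation of 1..(block size).
ok <- tapply(df$D, df$A, function(d) all(sort(d) == seq_along(d)))
all(ok)

# To simulate the order several times, repeat the draw (n_sim is hypothetical).
n_sim <- 3
orders <- replicate(n_sim, ave(df$A, df$A, FUN = function(x) sample(length(x))))
dim(orders)  # one row per observation, one column per simulated order
```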

