Generating Permutations of Values Within Multiple Lists [duplicate] - r

This question already has an answer here:
All possible combinations of elements from different bins (one element from every bin) [duplicate]
(1 answer)
Closed 6 years ago.
I'm trying to generate permutations by taking 1 value from 3 different lists
l <- list(A=c(1:13), B=c(1:5), C=c(1:3))
Desired result => Matrix of all the permutations where the first value can be 1-13, second value can be 1-5, third value can be 1-3
I tried using permn from the combinat package, but it seems to just rearrange the 3 lists.
> permn(l)
[[1]]
[[1]]$A
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13
[[1]]$B
[1] 1 2 3 4 5
[[1]]$C
[1] 1 2 3
[[2]]
[[2]]$A
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13
[[2]]$C
[1] 1 2 3
[[2]]$B
[1] 1 2 3 4 5
....
Expected output
[,1] [,2] [,3]
[1,] 1 1 1
[2,] 1 2 1
[3,] 1 1 2
[4,] 1 1 3
and so on...

We can use expand.grid, which can be applied directly to the list:
expand.grid(l)

You can create a data frame using do.call and expand.grid. If you really need a matrix, use as.matrix on the result:
> l <- list(A=c(1:13), B=c(1:5), C=c(1:3))
> out <- do.call(expand.grid, l)
> head(out)
A B C
1 1 1 1
2 2 1 1
3 3 1 1
4 4 1 1
5 5 1 1
6 6 1 1
> tail(out)
A B C
190 8 5 3
191 9 5 3
192 10 5 3
193 11 5 3
194 12 5 3
195 13 5 3
> tail(as.matrix(out))
A B C
[190,] 8 5 3
[191,] 9 5 3
[192,] 10 5 3
[193,] 11 5 3
[194,] 12 5 3
[195,] 13 5 3
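As a small follow-up sketch (the names `grid` and `m` are only illustrative and do not come from the answers above): the result should contain one row per combination, so its row count equals the product of the list lengths. Dropping the dimnames gives a plain matrix like the one asked for in the question.

```r
# One value is taken from each of the three lists
l <- list(A = 1:13, B = 1:5, C = 1:3)

# Data frame with one row per combination; the first column varies fastest
grid <- expand.grid(l)

# Plain integer matrix without row or column names
m <- unname(as.matrix(grid))

# The number of rows equals the product of the list lengths (13 * 5 * 3)
nrow(m) == prod(lengths(l))
head(m, 3)
```

If the last column should vary fastest instead, the rows can be reordered afterwards, for example with grid[do.call(order, grid), ].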

Related

Generating a vector with rep and seq but without the c() function [duplicate]

This question already has answers here:
R repeating sequence add 1 each repeat
(2 answers)
Closed 5 months ago.
Suppose that I am not allowed to use the c() function.
My target is to generate the vector
"1 2 3 4 5 2 3 4 5 6 3 4 5 6 7 4 5 6 7 8 5 6 7 8 9"
Here is my attempt:
rep(seq(1, 5, 1), 5)
# [1] 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5
rep(0:4,rep(5,5))
# [1] 0 0 0 0 0 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 4 4 4 4 4
So basically I am summing them up. But I wonder if there is a better way that uses the rep and seq functions ONLY.
Like so:
1:5 + rep(0:4, each = 5)
# [1] 1 2 3 4 5 2 3 4 5 6 3 4 5 6 7 4 5 6 7 8 5 6 7 8 9
I like the sequence option as well:
sequence(rep(5, 5), 1:5)
You could do
rep(1:5, each=5) + rep.int(0:4, 5)
# [1] 1 2 3 4 5 2 3 4 5 6 3 4 5 6 7 4 5 6 7 8 5 6 7 8 9
Just to be precise and use seq as well:
rep(seq.int(1, 5), each=5) + rep.int(0:4, 5)
(PS: You can remove the .ints, but it's slower.)
One possible way:
as.vector(sapply(1:5, `+`, 0:4))
[1] 1 2 3 4 5 2 3 4 5 6 3 4 5 6 7 4 5 6 7 8 5 6 7 8 9
I would also propose the outer() function:
library(dplyr)
outer(1:5, 0:4, "+") %>%
array()
Or, in R 4.1.0 and later, with the native pipe instead of magrittr's %>%:
outer(1:5, 0:4, "+") |>
array()
Explanation.
The first function, outer(), builds a 5-by-5 matrix from the sequences 1:5 and 0:4 and fills each cell with the sum of the corresponding pair of values:
[,1] [,2] [,3] [,4] [,5]
[1,] 1 2 3 4 5
[2,] 2 3 4 5 6
[3,] 3 4 5 6 7
[4,] 4 5 6 7 8
[5,] 5 6 7 8 9
The second, array(), flattens the matrix column by column and returns the required vector:
[1] 1 2 3 4 5 2 3 4 5 6 3 4 5 6 7 4 5 6 7 8 5 6 7 8 9
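A quick way to convince yourself that the different answers agree is to compare each one against the target with identical(). This is only a sketch; the variable names and the helper `shifted_runs` are hypothetical and not part of any answer above.

```r
# Target vector from the question, parsed from its printed form
target <- as.integer(strsplit(
  "1 2 3 4 5 2 3 4 5 6 3 4 5 6 7 4 5 6 7 8 5 6 7 8 9", " ")[[1]])

# 1:5 is recycled five times and shifted by 0, 1, 2, 3, 4
via_recycling <- 1:5 + rep(0:4, each = 5)

# Same sum with the roles of the two terms swapped
via_two_reps <- rep(1:5, each = 5) + rep.int(0:4, 5)

# Addition table read column by column
via_outer <- as.vector(outer(1:5, 0:4, "+"))

# Hypothetical helper generalising the idea to n runs of length n
shifted_runs <- function(n) seq_len(n) + rep(seq_len(n) - 1L, each = n)

identical(via_recycling, target)
identical(via_two_reps, target)
identical(via_outer, target)
identical(shifted_runs(5), target)
```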

How to create a list with variable number of lists in R

I am trying to create a list "pool" like this:
> n=10
> pool=list(0:n,0:n,0:n)
> pool
[[1]]
[1] 0 1 2 3 4 5 6 7 8 9 10
[[2]]
[1] 0 1 2 3 4 5 6 7 8 9 10
[[3]]
[1] 0 1 2 3 4 5 6 7 8 9 10
Instead of typing the 0:n three times, I need to use a line of code to represent it because the number of repeats changes over time. I tried:
> K=3
> pool=NULL
> for (i in 1:K){
+ pool=list(pool,0:n)
+ }
> pool
[[1]]
[[1]][[1]]
[[1]][[1]][[1]]
NULL
[[1]][[1]][[2]]
[1] 0 1 2 3 4 5 6 7 8 9 10
[[1]][[2]]
[1] 0 1 2 3 4 5 6 7 8 9 10
[[2]]
[1] 0 1 2 3 4 5 6 7 8 9 10
but the result is nested rather than a flat list of three vectors. How can I build the flat list? Thanks!
Use rep:
k <- 3
rep(list(0:n), k)
Or you can use purrr::map:
purrr::map(1:k, ~0:n)
Output
[[1]]
[1] 0 1 2 3 4 5 6 7 8 9 10
[[2]]
[1] 0 1 2 3 4 5 6 7 8 9 10
[[3]]
[1] 0 1 2 3 4 5 6 7 8 9 10
You can use replicate:
n = 10
K = 3
pool <- replicate(K, 0:n, simplify = FALSE)
pool
#[[1]]
# [1] 0 1 2 3 4 5 6 7 8 9 10
#[[2]]
# [1] 0 1 2 3 4 5 6 7 8 9 10
#[[3]]
# [1] 0 1 2 3 4 5 6 7 8 9 10
replicate is similar to purrr::rerun (which is deprecated as of purrr 1.0.0):
pool <- purrr::rerun(K, 0:n)
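For completeness: the loop in the question nests because list(pool, 0:n) wraps the whole previous list as a single element on every pass. A minimal sketch of a loop that does produce the flat list, assuming it is acceptable to preallocate with vector("list", K) and fill by position:

```r
n <- 10
K <- 3

# Preallocate a list with K empty slots
pool <- vector("list", K)

# Fill each slot by position instead of wrapping the old list in a new one
for (i in seq_len(K)) {
  pool[[i]] <- 0:n
}

# Same result as the rep() and replicate() answers
identical(pool, rep(list(0:n), K))
```

Appending with pool[[length(pool) + 1]] <- 0:n, starting from pool <- list(), would also work when K is not known in advance.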

Sort matrix elements based on diagonal position in R [duplicate]

This question already has answers here:
Get all diagonal vectors from matrix
(3 answers)
Closed 5 years ago.
Before I attempt to write a custom function, is there an elegant/native method to achieve this?
m<-matrix(1:9,ncol = 3)
m
[,1] [,2] [,3]
[1,] 1 4 7
[2,] 2 5 8
[3,] 3 6 9
By column:
as.vector(m)
[1] 1 2 3 4 5 6 7 8 9
By row:
as.vector(t(m))
[1] 1 4 7 2 5 8 3 6 9
By diagonal (I would like a function output):
some.function(m)
[1] 1 2 4 3 5 7 6 8 9
And the perpendicular diagonal:
some.other.function(m)
[1] 7 8 4 9 5 1 6 2 3
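One option is to build all (row, column) index pairs, sort them by the sum row + column (which is constant along each anti-diagonal), and index the matrix with the sorted pairs:
ind = expand.grid(1:3, 1:3)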
ind[,3] = rowSums(ind)
ind = ind[order(ind[,3], ind[,2], ind[,1]),]
m[as.matrix(ind[,1:2])]
#[1] 1 2 4 3 5 7 6 8 9
m[,3:1][as.matrix(ind[,1:2])]
#[1] 7 8 4 9 5 1 6 2 3
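The same ordering can be wrapped in small functions that also work for non-square matrices. This is a sketch using base R's row() and col(); the function names `diag_order` and `anti_diag_order` are hypothetical and stand in for some.function and some.other.function from the question.

```r
# Elements ordered by anti-diagonal (constant row + column),
# and within each diagonal by increasing column index
diag_order <- function(m) {
  m[order(row(m) + col(m), col(m))]
}

# Same traversal applied to the matrix with its columns reversed
anti_diag_order <- function(m) {
  diag_order(m[, ncol(m):1, drop = FALSE])
}

m <- matrix(1:9, ncol = 3)
diag_order(m)
# [1] 1 2 4 3 5 7 6 8 9
anti_diag_order(m)
# [1] 7 8 4 9 5 1 6 2 3
```

Because row(m) and col(m) have the same shape as m, no explicit index table is needed and the dimensions are not hard-coded.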

Reflecting changes in a dataframe by Modifying a list containing the dataframe

I have a list containing a couple of data frames. I'm calling lapply on the list and assigning the output back to the same list. I expected that this would change the data frames themselves, but it doesn't. Can someone help with this? I guess it should be quite straightforward, but I can't find anything that helps.
Thanks.
Sample data: (Source: Change multiple dataframes in a loop)
data_frame1 <- data.frame(a=c(1,5,3,3,2), b=c(3,6,1,5,5), c=c(4,4,1,9,2))
data_frame2 <- data.frame(a=c(6,0,9,1,2), b=c(2,7,2,2,1), c=c(8,4,1,9,2))
data_frame3 <- data.frame(a=c(0,0,1,5,1), b=c(4,1,9,2,3), c=c(2,9,7,1,1))
ll <- list(data_frame1,data_frame2,data_frame3)
ll <- lapply(ll, function(df) {
  df$log_a <- log(df$a)             ## new column with the log of a
  df$tans_col <- df$a + df$b + df$c ## new column with the sum of some columns (or any other computation)
  df
})
Results:
ll
[[1]]
a b c log_a tans_col
1 1 3 4 0.0000000 8
2 5 6 4 1.6094379 15
3 3 1 1 1.0986123 5
4 3 5 9 1.0986123 17
5 2 5 2 0.6931472 9
[[2]]
a b c log_a tans_col
1 6 2 8 1.7917595 16
2 0 7 4 -Inf 11
3 9 2 1 2.1972246 12
4 1 2 9 0.0000000 12
5 2 1 2 0.6931472 5
[[3]]
a b c log_a tans_col
1 0 4 2 -Inf 6
2 0 1 9 -Inf 10
3 1 9 7 0.000000 17
4 5 2 1 1.609438 8
5 1 3 1 0.000000 5
data_frame1
a b c
1 1 3 4
2 5 6 4
3 3 1 1
4 3 5 9
5 2 5 2
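The output above is what R's copy semantics would lead you to expect: list(data_frame1, ...) stores copies, and lapply returns a new list, so the original variables are never touched. One common way to write the results back, sketched here under the assumption that the list elements are named after the original variables, is list2env. The example reuses the question's data with two data frames and one new column.

```r
data_frame1 <- data.frame(a = c(1, 5, 3, 3, 2), b = c(3, 6, 1, 5, 5), c = c(4, 4, 1, 9, 2))
data_frame2 <- data.frame(a = c(6, 0, 9, 1, 2), b = c(2, 7, 2, 2, 1), c = c(8, 4, 1, 9, 2))

# Name each element after the variable it came from
ll <- list(data_frame1 = data_frame1, data_frame2 = data_frame2)

# lapply returns a new list; the original variables are unchanged at this point
ll <- lapply(ll, function(df) {
  df$tans_col <- df$a + df$b + df$c
  df
})

# Copy every named element back over the variable of the same name
invisible(list2env(ll, envir = environment()))

names(data_frame1)
```

Keeping the data frames in the list and working with ll[[1]] or ll$data_frame1 avoids the write-back step altogether.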

How to change the way split returns values in R?

I'm working on a project and I want to take a matrix, split it by the values w and x, and then for each of those splits find the maximum value of y.
Here's an example matrix
>rah = cbind(w = 1:6, x = 1:3, y = 12:1, z = 1:12)
>rah
w x y z
[1,] 1 1 12 1
[2,] 2 2 11 2
[3,] 3 3 10 3
[4,] 4 1 9 4
[5,] 5 2 8 5
[6,] 6 3 7 6
[7,] 1 1 6 7
[8,] 2 2 5 8
[9,] 3 3 4 9
[10,] 4 1 3 10
[11,] 5 2 2 11
[12,] 6 3 1 12
So I run split
> doh = split(rah, list(rah[,1], rah[,2]))
> doh
$`1.1`
[1] 1 1 1 1 12 6 1 7
$`2.1`
integer(0)
$`3.1`
integer(0)
$`4.1`
[1] 4 4 1 1 9 3 4 10
$`5.1`
integer(0)
$`6.1`
integer(0)
$`1.2`
integer(0)
$`2.2`
[1] 2 2 2 2 11 5 2 8
$`3.2`
integer(0)
$`4.2`
integer(0)
$`5.2`
[1] 5 5 2 2 8 2 5 11
...
So I'm a bit confused as to how to take the output of split, use it to find the rows with a matching combination of w and x values (such as row 1 and row 7), and then compare them to find the one with the highest y value.
EDIT: Informative answers so far, but I just realized that I forgot to mention one very important part: I want to keep the whole row (w, x, y, z).
Use aggregate instead
> aggregate(y ~ w + x, max, data=rah)
w x y
1 1 1 12
2 4 1 9
3 2 2 11
4 5 2 8
5 3 3 10
6 6 3 7
If you want to use split, try
> split_rah <- split(rah[,"y"], list(rah[, "w"], rah[, "x"]))
> ind <- sapply(split_rah, function(x) length(x)>0)
> sapply(split_rah[ind], max)
1.1 4.1 2.2 5.2 3.3 6.3
12 9 11 8 10 7
Just for the record, summaryBy from the doBy package works in the same fashion as aggregate
> library(doBy)
> summaryBy(y ~ w + x, FUN=max, data=as.data.frame(rah))
w x y.max
1 1 1 12
2 2 2 11
3 3 3 10
4 4 1 9
5 5 2 8
6 6 3 7
data.table solution:
> library(data.table)
> dt <- data.table(rah)
> dt[, max(y), by=list(w, x)]
w x V1
1: 1 1 12
2: 2 2 11
3: 3 3 10
4: 4 1 9
5: 5 2 8
6: 6 3 7
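tapply is another base R option; it returns a w-by-x table with NA for combinations that do not occur:
> tapply(rah[,"y"], list( rah[,"w"], rah[,"x"]), max)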
1 2 3
1 12 NA NA
2 NA 11 NA
3 NA NA 10
4 9 NA NA
5 NA 8 NA
6 NA NA 7
Another option using plyr package:
ddply(as.data.frame(rah),.(w,x),summarize,z=max(y))
w x z
1 1 1 12
2 2 2 11
3 3 3 10
4 4 1 9
5 5 2 8
6 6 3 7
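Regarding the EDIT in the question (keeping the whole row): the answers above return only the maximum of y per group. One possible base R sketch, assuming that ties in y may be broken by keeping whichever row comes first, sorts the rows by decreasing y and keeps the first row seen for each (w, x) pair. The names `sorted` and `best` are illustrative.

```r
rah <- cbind(w = 1:6, x = 1:3, y = 12:1, z = 1:12)

# Sort rows so that the largest y comes first
sorted <- rah[order(-rah[, "y"]), , drop = FALSE]

# duplicated() on a matrix compares whole rows, here the (w, x) pairs;
# keeping the first occurrence keeps the row with the largest y
best <- sorted[!duplicated(sorted[, c("w", "x")]), , drop = FALSE]

best
```

The result is still a matrix with all four columns, so z is available for each selected row.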
