Convert or transform the list - r

Input list is:
$A
[1] 25
$B
[1] 22
$C
[1] 25
$D
[1] 26
----
Need to convert this to
$25
[1] "A" "C"
$22
[1] "B"
$26
[1] "D"
How do I change the grouping? Please help me.

If your list is called "L" (example below), try:
L <- list(A = 25, B = 22, C = 25, D = 26)
split(names(L), unlist(L))
# $`22`
# [1] "B"
#
# $`25`
# [1] "A" "C"
#
# $`26`
# [1] "D"
You could also try with(stack(L), split(as.character(ind), values)).
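One hedged aside: split() returns the groups sorted by value, whereas the desired output in the question lists them in order of first appearance (25, 22, 26). A minimal sketch, assuming that order matters to you, is to pass an explicit factor whose levels follow unique(). The names vals, sorted and by_first are only illustrative.

```r
L <- list(A = 25, B = 22, C = 25, D = 26)
vals <- unlist(L)

# Default behaviour: groups come out sorted by value (22, 25, 26)
sorted <- split(names(L), vals)

# Levels taken in order of first appearance keep the 25, 22, 26 ordering
by_first <- split(names(L), factor(vals, levels = unique(vals)))
names(by_first)
```

Both calls produce the same groups; only the order of the list elements differs.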

Related

Creating result groups in R, using each element once (combination without repetition)

I have a dataset of 6 individuals: A,B,C,D,E,F
I want to group these into two groups of three individuals and have done so with the combn function in R:
m <- combn(n, 3)
This gives me all 20 possible groups, where individuals occur in multiple groups. From this set of groups I then want to find all possible combinations of results, where each individual can only be used once.
I would like to do this using combinations without repetition:
C(n,r) = n! / (r!(n-r)!) and, since each split into two groups is counted twice among the 20 groups, would therefore get 10 results that would look like this:
abc + def
abd + cef
abe + cdf
abf + cde
acd + bef
ace + bdf
acf + bde
ade + bcf
adf + bce
aef + bcd
I am not sure how to code this in R, from the list of groups that I have generated.
Edit: to generate the dataset I am using I have used the following code:
individuals <- c("a","b","c","d","e","f")
n <- length(individuals)
x <- 3
comb = function(n, x) {
factorial(n) / factorial(n-x) / factorial(x)
}
comb(n,x)
(m <- combn(n, 3))
numbers <- m
letters <- individuals
for (i in 1:length(numbers)) {
m[i] <- letters[numbers[i]]
}
In base R:
Create combinations of 3 letters and store them in a list (simplify = FALSE)
Create new combinations of 2 groups (of 3 letters each)
Filter the list to keep only the combinations whose two parts have no element in common
individuals <- c("a","b","c","d","e","f")
combn(individuals, 3, simplify = FALSE) |>
combn(m = 2, simplify = FALSE) |>
Filter(f = \(x) !any(x[[1]] %in% x[[2]]))
output
[[1]]
[[1]][[1]]
[1] "a" "b" "c"
[[1]][[2]]
[1] "d" "e" "f"
[[2]]
[[2]][[1]]
[1] "a" "b" "d"
[[2]][[2]]
[1] "c" "e" "f"
[[3]]
[[3]][[1]]
[1] "a" "b" "e"
[[3]][[2]]
[1] "c" "d" "f"
[[4]]
[[4]][[1]]
[1] "a" "b" "f"
[[4]][[2]]
[1] "c" "d" "e"
[[5]]
[[5]][[1]]
[1] "a" "c" "d"
[[5]][[2]]
[1] "b" "e" "f"
[[6]]
[[6]][[1]]
[1] "a" "c" "e"
[[6]][[2]]
[1] "b" "d" "f"
[[7]]
[[7]][[1]]
[1] "a" "c" "f"
[[7]][[2]]
[1] "b" "d" "e"
[[8]]
[[8]][[1]]
[1] "a" "d" "e"
[[8]][[2]]
[1] "b" "c" "f"
[[9]]
[[9]][[1]]
[1] "a" "d" "f"
[[9]][[2]]
[1] "b" "c" "e"
[[10]]
[[10]][[1]]
[1] "a" "e" "f"
[[10]][[2]]
[1] "b" "c" "d"
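As an alternative sketch (not the approach used above), the filter step can be avoided. Every split into two groups of three is counted twice among the choose(6, 3) = 20 triples, so there are 10 distinct splits. Pinning the first individual to the first group enumerates each split exactly once. The names first, others and splits are illustrative, and the sketch assumes the individuals are distinct.

```r
individuals <- c("a", "b", "c", "d", "e", "f")

# Each split appears twice among the 20 triples, so 20 / 2 = 10 splits
n_splits <- choose(length(individuals), 3) / 2

# Pin the first individual to group 1 and choose its two partners
first  <- individuals[1]
others <- individuals[-1]
splits <- combn(others, 2, simplify = FALSE, FUN = function(p) {
  g1 <- c(first, p)
  list(g1, setdiff(individuals, g1))  # group 2 is whoever is left
})

length(splits)
```

This builds 10 candidate pairs directly instead of generating all 190 pairs of triples and discarding the overlapping ones.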

Split a vector into non-overlapping sub-list with increasing length

Let's say I have this vector:
letters[1:7]
[1] "a" "b" "c" "d" "e" "f" "g"
I would like to split it into a list of non-overlapping sub-vectors whose lengths increase by 1, keeping whatever is left over at the end (e.g. sub-list 4 should have 4 elements, but only one is left, and I'd like to keep that one), like the following:
[[1]]
[1] "a"
[[2]]
[1] "b" "c"
[[3]]
[1] "d" "e" "f"
[[4]]
[1] "g"
Please do let me know any direction to solve this, thank you!
Example vector:
x <- letters[1:7]
Solution:
n <- ceiling(0.5 * sqrt(1 + 8 * length(x)) - 0.5)
split(x, rep(1:n, 1:n)[1:length(x)])
#$`1`
#[1] "a"
#
#$`2`
#[1] "b" "c"
#
#$`3`
#[1] "d" "e" "f"
#
#$`4`
#[1] "g"
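To make the formula for n less opaque: n is the smallest integer whose triangular number n * (n + 1) / 2 is at least length(x), which is what solving the quadratic n^2 + n - 2 * length(x) >= 0 gives. The sketch below checks the closed form against a brute-force loop. n_groups and n_brute are hypothetical helper names, and the check only covers lengths 1 to 50.

```r
# Closed form: smallest n with n * (n + 1) / 2 >= len
n_groups <- function(len) ceiling(0.5 * sqrt(1 + 8 * len) - 0.5)

# Brute force: keep increasing n until the triangular number reaches len
n_brute <- function(len) {
  n <- 1
  while (n * (n + 1) / 2 < len) n <- n + 1
  n
}

lens <- 1:50
all(sapply(lens, n_groups) == sapply(lens, n_brute))
```

For length 7 both give 4, so rep(1:4, 1:4) supplies ten group labels, of which the first seven are used.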
Something quick'n dirty
splitter = function(x) {
n = length(x)
i = 1
while ( i * (i + 1L) / 2L < (n-i) ) i = i + 1
out = rep(i+1, n)
out[1:(i * (i + 1L) / 2L)] = rep(1:i, 1:i)
unname(split(x, out))
}
splitter(x)
[[1]]
[1] "a"
[[2]]
[1] "b" "c"
[[3]]
[1] "d" "e" "f"
[[4]]
[1] "g"
x <- letters[1:7]
splt <- rep(seq(length(x)), seq(length(x)))[seq(length(x))]
split(x, splt)
#> $`1`
#> [1] "a"
#>
#> $`2`
#> [1] "b" "c"
#>
#> $`3`
#> [1] "d" "e" "f"
#>
#> $`4`
#> [1] "g"
Created on 2022-08-04 by the reprex package (v2.0.1)

appending to multiple elements of a list in R

Suppose you have a list foo containing some elements.
foo <- list()
foo[1:3] <- "a"
foo
# [[1]]
# [1] "a"
# [[2]]
# [1] "a"
# [[3]]
# [1] "a"
I would like to efficiently grow the list by both appending to existing elements and adding additional elements. For example adding "b" to elements 2:5, as simply as possible, preferably using foo[2:5]<-.
Desired output
# [[1]]
# [1] "a"
# [[2]]
# [1] "a" "b"
# [[3]]
# [1] "a" "b"
# [[4]]
# [1] "b"
# [[5]]
# [1] "b"
Oh this indeed works:
foo[2:5] <- lapply(foo[2:5], c, "b")
The c is the concatenation function.
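A short sketch of why this works, under the assumption that foo is built as in the question: indexing a list past its end yields NULL entries, and c(NULL, "b") is just "b", so the same call both appends to existing elements and creates the new ones.

```r
foo <- list()
foo[1:3] <- "a"

# Positions 4 and 5 do not exist yet, so they come back as NULL
before <- foo[2:5]

# c("a", "b") for existing elements, c(NULL, "b") for the new ones
foo[2:5] <- lapply(foo[2:5], c, "b")
length(foo)
```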

Retrieve list of path name from igraph all_simple_paths

I have a directed cyclical matrix and need to extract all the simple paths between any i and j.
The following is my ex. matrix:
>M2<-matrix(c(1,1,0,0,0,1,1,1,1,0,0,1,1,1,0,0,1,0,1,1,0,0,0,1,1), 5, byrow=T)
>colnames(M2)<-c("A", "B", "C", "D", "E")
>row.names(M2)=colnames(M2)
>M2
A B C D E
A 1 1 0 0 0
B 1 1 1 1 0
C 0 1 1 1 0
D 0 1 0 1 1
E 0 0 0 1 1
I use igraph to convert the matrix to a graph object using the graph_from_adjacency_matrix function.
>graph<-graph_from_adjacency_matrix(M2, mode=c("directed"), weighted=NULL, diag=F, add.colnames=NULL, add.rownames=NA)
>graph
IGRAPH DN-- 5 9 --
+ attr: name (v/c)
+ edges (vertex names):
[1] A->B B->A B->C B->D C->B C->D D->B D->E E->D
And from there I use the all_simple_paths function to get all the simple paths between i and j. And here starts my questions.
1) I can specify j (the to argument, as to=V(graph)) to be all possible end vertices. But I can't make the from argument consider all vertices as possible starting points; I have to specify each of my vertices one at a time. Any solution?
2) The all_simple_paths function works well and gives me all the simple paths between i and j, e.g. for simple paths starting in A and ending in any possible j:
>Simple_path_list<-all_simple_paths(graph, from ="A", to=V(graph), mode = c("out"))
>Simple_path_list
[[1]]
+ 2/5 vertices, named:
[1] A B
[[2]]
+ 3/5 vertices, named:
[1] A B C
[[3]]
+ 4/5 vertices, named:
[1] A B C D
[[4]]
+ 5/5 vertices, named:
[1] A B C D E
[[5]]
+ 3/5 vertices, named:
[1] A B D
[[6]]
+ 4/5 vertices, named:
[1] A B D E
My problem is, I need to collect all those paths and put on a list, e.g.:
Paths
A B
A B C
A B C D
A B C D E
A B D
A B D E
I tried to create a list and extract the path names with something like list<-Simple_path_list[1], but I always retrieve, together with the paths, the information on the number of vertices involved (e.g., + 4/5 vertices, named). Any idea how I can retrieve only the path names and not the other information?
The lapply function on all_simple_paths makes a list of lists (i.e. a list of each vertex's list of paths). Simplify the list of lists to a list using unlist(..., recursive = F) and then use names or igraph's as_ids to extract the vertex ids solo.
library(igraph)
M2<-matrix(c(1,1,0,0,0,1,1,1,1,0,0,1,1,1,0,0,1,0,1,1,0,0,0,1,1), 5, byrow=T)
colnames(M2)<-c("A", "B", "C", "D", "E")
row.names(M2)=colnames(M2)
M2
graph<-graph_from_adjacency_matrix(M2, mode=c("directed"), weighted=NULL, diag=F, add.colnames=NULL, add.rownames=NA)
l <- unlist(lapply(V(graph) , function(x) all_simple_paths(graph, from=x)), recursive = F)
paths <- lapply(1:length(l), function(x) as_ids(l[[x]]))
This produces:
> paths
[[1]]
[1] "A" "B"
[[2]]
[1] "A" "B" "C"
[[3]]
[1] "A" "B" "C" "D"
[[4]]
[1] "A" "B" "C" "D" "E"
[[5]]
[1] "A" "B" "D"
[[6]]
[1] "A" "B" "D" "E"
[[7]]
[1] "B" "A"
[[8]]
[1] "B" "C"
[[9]]
[1] "B" "C" "D"
[[10]]
[1] "B" "C" "D" "E"
[[11]]
[1] "B" "D"
[[12]]
[1] "B" "D" "E"
[[13]]
[1] "C" "B"
[[14]]
[1] "C" "B" "A"
[[15]]
[1] "C" "B" "D"
[[16]]
[1] "C" "B" "D" "E"
[[17]]
[1] "C" "D"
[[18]]
[1] "C" "D" "B"
[[19]]
[1] "C" "D" "B" "A"
[[20]]
[1] "C" "D" "E"
[[21]]
[1] "D" "B"
[[22]]
[1] "D" "B" "A"
[[23]]
[1] "D" "B" "C"
[[24]]
[1] "D" "E"
[[25]]
[1] "E" "D"
[[26]]
[1] "E" "D" "B"
[[27]]
[1] "E" "D" "B" "A"
[[28]]
[1] "E" "D" "B" "C"
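If the goal is the single "Paths" column shown in the question, one possible follow-up is to collapse each character vector into a string. This is only a sketch: it assumes paths is a plain list of character vectors as produced above, and uses a hand-typed subset of it so that the example runs without igraph. path_strings is an illustrative name.

```r
# Hand-typed stand-in for the first few elements of `paths`
paths <- list(c("A", "B"), c("A", "B", "C"), c("A", "B", "C", "D"))

# Collapse each path into one space-separated string
path_strings <- vapply(paths, paste, character(1), collapse = " ")

data.frame(Paths = path_strings)
```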
Addition
For all_shortest_paths you must subset the list of paths for each node to exclude the geodesic information.
l <- lapply(V(graph), function(x) all_shortest_paths(graph, from = x))
l <- lapply(l, function(x) x$res)  # keep only the paths, drop the geodesic counts (nrgeo)
l <- unlist(l, recursive = F)
paths <- lapply(1:length(l), function(x) as_ids(l[[x]]))

Filtering list in R

I have a list
A <- c(1,2,3,4,5,6,7,8,9,10)
B <- c("a" ,"b", "c" ,"d","b" ,"f" ,"g" ,"a" ,"b" ,"a")
C <- c(25, 26, 27, 28, 29, 30, 31, 32, 10, 15)
mylist <- list(A,B,C)
mylist
[[1]]
[1] 1 2 3 4 5 6 7 8 9 10
[[2]]
[1] "a" "b" "c" "d" "b" "f" "g" "a" "b" "a"
[[3]]
[1] 25 26 27 28 29 30 31 32 10 15
I would like to select all components A,B,C of the list where second component B has value "a" or "b" .
Sample output
mylist
[[1]]
[1] 1 2 5 8 9 10
[[2]]
[1] "a" "b" "b" "a" "b" "a"
[[3]]
[1] 25 26 29 32 10 15
How can I do that? Note that each component has the same length.
To stay with a list, why not simply:
lapply(mylist, `[`, is.element(B, letters[1:2]))
#[[1]]
#[1] 1 2 5 8 9 10
#[[2]]
#[1] "a" "b" "b" "a" "b" "a"
#[[3]]
#[1] 25 26 29 32 10 15
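A small variation on the same idea, sketched under the assumption that the vector B may no longer be in the workspace: build the logical mask from the list's own second component, then apply it to every component. keep and filtered are illustrative names.

```r
mylist <- list(
  c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
  c("a", "b", "c", "d", "b", "f", "g", "a", "b", "a"),
  c(25, 26, 27, 28, 29, 30, 31, 32, 10, 15)
)

# TRUE at the positions where the second component is "a" or "b"
keep <- mylist[[2]] %in% c("a", "b")

# Apply the same mask to every component
filtered <- lapply(mylist, `[`, keep)
```

Because the mask is computed once and reused, all components stay aligned and keep the same length.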
I would go with a data.frame or data.table for this use case:
Using your original list (with a 10 added to A to have the same number of entries as B and C):
>df <- data.frame(A=mylist[[1]],B=mylist[[2]],C=mylist[[3]],stringsAsFactors=F)
> df[df$B %in% c("a","b"),]
A B C
1 1 a 25
2 2 b 26
5 5 b 29
8 8 a 32
9 9 b 10
10 10 a 15
This subsets the data.frame to the rows where B is a or b. If you are building the list yourself, you can skip the list step and build the data.frame directly.
If you really want a list at end:
> as.list(df[df$B %in% c("a","b"),])
$A
[1] 1 2 5 8 9 10
$B
[1] "a" "b" "b" "a" "b" "a"
$C
[1] 25 26 29 32 10 15
If you wish to avoid the named entries, use unname: as.list(unname(df[..]))
Here is a simple solution.
First, I create mylist :
mylist <- list(1:10, letters[1:10], 25:16)
Then I create a function that returns TRUE when the second element of a component is "a" or "b", and FALSE otherwise:
> filt <- function(x) {
+ x[2] %in% c("a", "b")
+ }
>
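Then I use sapply to apply the function to mylist and keep only the components for which it returns TRUE (note that this selects whole components; it does not filter positions within each component):
> mylist[sapply(mylist, filt)]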
[[1]]
[1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j"
