I would like to find all the connected components of a graph that have more than one element.
Using clusters() only gives the membership of each vertex in the different clusters, and cliques() does not give connected components.
This is a follow-up to "multiple intersection of lists in R".
My main goal was to find all the groups of lists which have elements in common with each other.
Thanks in advance!
You can use the results from components to subset your nodes according to the component size.
library(igraph)
# example graph
set.seed(1)
g <- erdos.renyi.game(20, 1/20)
V(g)$name <- letters[1:20]
par(mar=rep(0,4))
plot(g)
# get components
cl <- components(g)
cl
# $membership
# [1] 1 2 3 4 5 4 5 5 6 7 8 9 10 3 5 11 5 3 12 5
#
# $csize
# [1] 1 1 3 2 6 1 1 1 1 1 1 1
#
# $no
# [1] 12
# loop through to extract common vertices
lapply(seq_along(cl$csize)[cl$csize > 1], function(x)
V(g)$name[cl$membership %in% x])
# [[1]]
# [1] "c" "n" "r"
#
# [[2]]
# [1] "d" "f"
#
# [[3]]
# [1] "e" "g" "h" "o" "q" "t"
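As a possible alternative (a sketch only, assuming an igraph version that provides groups(), which is also used further down this page), the list returned by groups(components(g)) can be filtered by length. The small graph below is made up for illustration so the result is deterministic; it is not the random graph from the answer above.

```r
library(igraph)

# made-up graph: a-b-c form one component, d-e another, f is isolated
g <- graph_from_data_frame(
  data.frame(from = c("a", "b", "d"), to = c("b", "c", "e")),
  directed = FALSE,
  vertices = data.frame(name = c("a", "b", "c", "d", "e", "f"))
)

# one list element per component, holding the vertex names
grp <- groups(components(g))

# keep only the components with more than one vertex
big <- Filter(function(v) length(v) > 1, grp)
big
```

This avoids indexing into cl$membership by hand, at the cost of building the full list of groups first.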
Suppose I have a data frame in which two columns indicate a direct relationship between the values in the same row.
c2 <- c(2,5,7,8,10)
c1 <- c(1,3,2,7,5)
df <- data.frame(c1, c2)
Such that (row index in brackets):
1 is related to 2 [1],
2 is related to 7 [3],
7 is related to 8 [4]
So I get a vector of the indexes 1,3, and 4
and then 3 is related to 5 [2],
and 5 is related to 10 [5]
so I get a vector of the indexes 2 and 5?
It hurts my brain.
This could be effectively solved using the igraph library:
common_ids <- clusters(graph_from_data_frame(df, directed = FALSE))$membership
split(1:nrow(df), common_ids[match(df$c1, names(common_ids))])
$`1`
[1] 1 3 4
$`2`
[1] 2 5
If the members of the groups are also of interest:
split(names(common_ids), common_ids)
$`1`
[1] "1" "2" "7" "8"
$`2`
[1] "3" "5" "10"
An option with igraph
lapply(
groups(components(graph_from_data_frame(df, directed = FALSE))),
function(x) Filter(Negate(is.na),match(x, as.character(df$c1)))
)
gives
$`1`
[1] 1 3 4
$`2`
[1] 2 5
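For anyone who wants to see what the component search is doing, here is a rough base R sketch without igraph. It is not how igraph works internally; it simply relabels rows that share a value until nothing changes. The names label and linked are my own.

```r
c1 <- c(1, 3, 2, 7, 5)
c2 <- c(2, 5, 7, 8, 10)

# every row starts in its own group
label <- seq_along(c1)

repeat {
  old <- label
  for (i in seq_along(c1)) {
    # rows that share at least one value with row i (row i itself included)
    vals <- c(c1[i], c2[i])
    linked <- c1 %in% vals | c2 %in% vals
    # give all of them the smallest label seen among them
    label[linked] <- min(label[linked])
  }
  # stop once a full pass changes nothing
  if (identical(old, label)) break
}

row_groups <- split(seq_along(c1), label)
row_groups
```

On the example data this gives the row indexes 1, 3, 4 in one group and 2, 5 in the other. For large data frames the igraph versions above should scale much better, since this loop rescans every row on each pass.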
I am trying to create a list of all possible ways to split n individuals into groups of variable size. For example, let's say I have 3 individuals; there are 5 possible ways to split them up:
a single group with 3 people [(1+2+3)]
2 groups (with 3 possible combinations): [(1+2,3),(1+3,2),(2+3,1)]
Each individual into its own group [(1),(2),(3)]
I have written code to determine how the groups could be split:
inds<-1:3
out<-list()
out[[1]]<-matrix(rep(1,length(inds)),nrow=length(inds))
for(i in 1:length(inds)) {
eg <- expand.grid(rep(list(inds), i))
out[[i]]<-unique(t(apply(unique(as.matrix(eg[which(rowSums(eg)==length(inds)),])),1,sort)))
}
out
which results in a list of group numbers and sizes:
[[1]]
     [,1]
[1,]    3

[[2]]
  [,1] [,2]
2    1    2

[[3]]
  Var1 Var2 Var3
1    1    1    1
But I'm not sure how to generate each combination within those potential splits.
Ideally, I would like an output which shows all possible ways n individuals could be split up:
Option  Group  Individuals
1       1      1,2,3
2       1      1,2
2       2      3
3       1      1,3
3       2      2
4       1      2,3
4       2      1
5       1      1
5       2      2
5       3      3
Any help would be greatly appreciated!
Using combn takes you most of the way:
combinations <- function(group_size, N) {
apply(combn(N, m = group_size), 2, paste0, collapse = ",")
}
all_combinations <- function(N) {
lapply(seq_len(N), combinations, N = N)
}
all_combinations(3)
# [[1]]
# [1] "1" "2" "3"
#
# [[2]]
# [1] "1,2" "1,3" "2,3"
#
# [[3]]
# [1] "1,2,3"
all_combinations(4)
# [[1]]
# [1] "1" "2" "3" "4"
#
# [[2]]
# [1] "1,2" "1,3" "1,4" "2,3" "2,4" "3,4"
#
# [[3]]
# [1] "1,2,3" "1,2,4" "1,3,4" "2,3,4"
#
# [[4]]
# [1] "1,2,3,4"
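combn lists the possible groups but not the ways of combining them into a complete split. Below is a minimal sketch of one way to enumerate every split (every set partition) in base R. set_partitions is a hypothetical helper name of my own, not a function from any package, and the approach builds all partitions in memory, so it is only practical for small n.

```r
# Enumerate all set partitions of `inds`.
# Idea: start with the first element alone, then add each later element
# either to one of the existing groups or as a new group of its own.
set_partitions <- function(inds) {
  if (length(inds) == 0) return(list(list()))
  parts <- list(list(inds[1]))
  for (x in inds[-1]) {
    nxt <- list()
    for (p in parts) {
      # put x into each existing group in turn
      for (k in seq_along(p)) {
        q <- p
        q[[k]] <- c(q[[k]], x)
        nxt[[length(nxt) + 1]] <- q
      }
      # or open a new group containing only x
      nxt[[length(nxt) + 1]] <- c(p, list(x))
    }
    parts <- nxt
  }
  parts
}

parts <- set_partitions(1:3)
length(parts)  # 5

# long format with one row per group, as in the question
tab <- do.call(rbind, lapply(seq_along(parts), function(i) {
  data.frame(Option = i,
             Group = seq_along(parts[[i]]),
             Individuals = sapply(parts[[i]], paste, collapse = ","))
}))
tab
```

The options come out in a different order from the table in the question, but the same five splits are covered.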
Say I am given the following strings:
1:{a,b,c,t}
2:{b,c,d}
3:{a,c,d}
4:{a,t}
I want to write a program that gives me all the different combinations of these strings, where each combination must include every given letter.
For example, for the strings above the valid combinations include {1&2, 1&3, 2&3&4, 1&2&3&4, 2&4}.
I was thinking of doing this with for loops: the program would look at the first string, find which letters are missing, then work down through the list to find strings that contain those letters. However, I think this idea will only find combinations of two strings, and it also requires listing all the letters for the program, which seems very uneconomical.
I think something like this should work.
sets <- list(c('a', 'b', 'c', 't'),
c('b', 'c', 'd'),
c('a', 'c', 'd'),
c('a', 't'))
combinations <- lapply(2:length(sets),
function(x) combn(1:length(sets), x, simplify=FALSE))
combinations <- unlist(combinations, FALSE)
combinations
# [[1]]
# [1] 1 2
#
# [[2]]
# [1] 1 3
#
# [[3]]
# [1] 1 4
#
# [[4]]
# [1] 2 3
#
# [[5]]
# [1] 2 4
#
# [[6]]
# [1] 3 4
#
# [[7]]
# [1] 1 2 3
#
# [[8]]
# [1] 1 2 4
#
# [[9]]
# [1] 1 3 4
#
# [[10]]
# [1] 2 3 4
#
# [[11]]
# [1] 1 2 3 4
u <- unique(unlist(sets))
u
# [1] "a" "b" "c" "t" "d"
Filter(function(x) length(setdiff(u, unlist(sets[x]))) == 0, combinations)
# [[1]]
# [1] 1 2
#
# [[2]]
# [1] 1 3
#
# [[3]]
# [1] 2 4
#
# [[4]]
# [1] 1 2 3
#
# [[5]]
# [1] 1 2 4
#
# [[6]]
# [1] 1 3 4
#
# [[7]]
# [1] 2 3 4
#
# [[8]]
# [1] 1 2 3 4
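If the result should read like the notation in the question (1&2, 2&4, and so on), the same idea can be written compactly as below. This is only a sketch that restates the approach above; idx, covers and labels are names I made up, and it also tries single strings, none of which covers every letter here.

```r
sets <- list(c("a", "b", "c", "t"),
             c("b", "c", "d"),
             c("a", "c", "d"),
             c("a", "t"))

# every letter that occurs anywhere
u <- Reduce(union, sets)

# all index combinations of size 1 up to length(sets)
idx <- unlist(lapply(seq_along(sets),
                     function(k) combn(length(sets), k, simplify = FALSE)),
              recursive = FALSE)

# keep the combinations whose strings together contain every letter
covers <- Filter(function(i) all(u %in% unlist(sets[i])), idx)

# label each combination in the question's notation
labels <- sapply(covers, paste, collapse = "&")
labels
```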
As a start...
I'll edit this answer when I have time. The following result is dependent on the order of choice. I haven't figured out how to flatten the list yet. If I could flatten it, I would sort each result then remove duplicates.
v = list(c("a","b","c","t"),c("b","c","d"),c("a","c","d"),c("a","t"))
allChars <- Reduce(union, v) # [1] "a" "b" "c" "t" "d"
charInList <- function(ch, li) which(sapply(li, function(vect) ch %in% vect))
locations <- sapply(allChars, function(ch) charInList(ch, v) )
# > locations
# $a
# [1] 1 3 4
#
# $b
# [1] 1 2
#
# $c
# [1] 1 2 3
#
# $t
# [1] 1 4
#
# $d
# [1] 2 3
findStillNeeded <- function(chosen) {
  haveChars <- Reduce(union, v[chosen])
  stillNeed <- allChars[!allChars %in% haveChars]
  if (length(stillNeed) == 0) return(chosen) # terminate if you don't need any more characters
  return(lapply(1:length(stillNeed), function(i) { # for each of the characters you still need
    loc <- locations[[stillNeed[i]]] # find where the character is located
    lapply(loc, function(j) {
      # add this location to the choices and recurse; the recursion stops once no characters are missing
      findStillNeeded(c(chosen, j))
    })
  }))
}
result<-lapply(1:length(v), function(i){
findStillNeeded(i)
})
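On the flattening step mentioned above: one possible way, sketched here in base R, is to walk the nested list recursively and collect every non-list leaf, then sort each leaf and drop duplicates. flatten_leaves is a hypothetical helper name of my own, and the nested list below is a made-up stand-in for result, not its real contents.

```r
# collect every non-list element found at any depth of a nested list
flatten_leaves <- function(x) {
  if (!is.list(x)) return(list(x))
  do.call(c, lapply(x, flatten_leaves))
}

# made-up nested structure with the same shape problem as `result`
nested <- list(list(c(1, 2), list(c(2, 1), c(1, 3))), c(3, 1))

leaves <- flatten_leaves(nested)

# sort each choice so order no longer matters, then remove duplicates
distinct <- unique(lapply(leaves, sort))
distinct
```

Applied to the answer's own object, this would be something like unique(lapply(flatten_leaves(result), sort)), which I have not run against the full example.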
The following vector x contains the two sequences 1:4 and 6:7, among other non-sequential digits.
x <- c(7, 1:4, 6:7, 9)
I'd like to split x by its sequences, so that the result is a list like the following.
# [[1]]
# [1] 7
#
# [[2]]
# [1] 1 2 3 4
#
# [[3]]
# [1] 6 7
#
# [[4]]
# [1] 9
Is there a quick and simple way to do this?
I've tried
split(x, c(0, diff(x)))
which gets close, but I don't feel like appending 0 to the differenced vector is the right way to go. Using findInterval didn't work either.
split(x, cumsum(c(TRUE, diff(x)!=1)))
#$`1`
#[1] 7
#
#$`2`
#[1] 1 2 3 4
#
#$`3`
#[1] 6 7
#
#$`4`
#[1] 9
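To see why this one-liner works, it may help to look at the intermediate values. The breakdown below is just the same expression split into steps; d, starts and grp are names I introduced.

```r
x <- c(7, 1:4, 6:7, 9)

# difference between each element and the one before it
d <- diff(x)               # -6  1  1  1  2  1  2

# a new run starts at the first element and wherever the step is not exactly 1
starts <- c(TRUE, d != 1)  # TRUE TRUE FALSE FALSE FALSE TRUE FALSE TRUE

# counting the starts so far gives a group id for every element
grp <- cumsum(starts)      # 1 2 2 2 2 3 3 4

runs <- split(x, grp)
runs
```

The leading TRUE is what makes the grouping vector the same length as x, which is the job the appended 0 was doing in the attempt from the question.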
Just for fun, you can make use of Carl Witthoft's seqle function from his "cgwtools" package. (It's not going to be anywhere near as efficient as Roland's answer.)
library(cgwtools)
## Here's what seqle does...
## It's like rle, but for sequences
seqle(x)
# Run Length Encoding
# lengths: int [1:4] 1 4 2 1
# values : num [1:4] 7 1 6 9
y <- seqle(x)
split(x, rep(seq_along(y$lengths), y$lengths))
# $`1`
# [1] 7
#
# $`2`
# [1] 1 2 3 4
#
# $`3`
# [1] 6 7
#
# $`4`
# [1] 9
Suppose I have the following list:
test<-list(c("a","b","c"),c("a"),c("c"))
>test
[[1]]
[1] "a" "b" "c"
[[2]]
[1] "a"
[[3]]
[1] "c"
What should I do (or which functions should I use) to get the frequency of the unique items in a list like this?
a 2
b 1
c 2
I tried using table(test), but I get the following error
> table(test)
Error in table(test) : all arguments must have the same length
test <- list(c("a", "b", "c"), c("a"), c("c"))
# If you want counts across all elements
table(unlist(test))
##
## a b c
## 2 1 2
# If you want separate counts in each item of the list
lapply(test, table)
## [[1]]
##
## a b c
## 1 1 1
##
## [[2]]
##
## a
## 1
##
## [[3]]
##
## c
## 1
##
Use unlist first
table(unlist(test))
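If the counts are wanted in the vertical layout shown in the question, one option (a small sketch; the column names item and n are my own choice) is to convert the table to a data frame:

```r
test <- list(c("a", "b", "c"), c("a"), c("c"))

# one row per distinct item, with its count in column n
freq <- as.data.frame(table(item = unlist(test)),
                      responseName = "n",
                      stringsAsFactors = FALSE)
freq
```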