Combinations of two arrays with ordering in Julia

Suppose I have
a=[1,3,5,7,9]
b=[2,4,6,8,10]
and I want to create every combination of length 5 from the two lists that preserves ordering, i.e. position i of each result holds either a[i] or b[i].
So far I can get every possible combination through:
ab=hcat(a,b)
collect(combinations(ab,5))
but I want to receive only the 32 (in this case) ordered combinations.
A function similar to what I am looking for would be Tuples[Transpose@{a,b}] in Mathematica.
EDIT:
Mathematica output would be as follows
a = {1, 3, 5, 7, 9};
b = {2, 4, 6, 8, 10};
combin = Tuples[Transpose@{a, b}]
Length[combin]
Out[1]:= {{1, 3, 5, 7, 9}, {1, 3, 5, 7, 10}, {1, 3, 5, 8, 9}, {1, 3, 5, 8,
10}, {1, 3, 6, 7, 9}, {1, 3, 6, 7, 10}, {1, 3, 6, 8, 9}, {1, 3, 6,
8, 10}, {1, 4, 5, 7, 9}, {1, 4, 5, 7, 10}, {1, 4, 5, 8, 9}, {1, 4,
5, 8, 10}, {1, 4, 6, 7, 9}, {1, 4, 6, 7, 10}, {1, 4, 6, 8, 9}, {1,
4, 6, 8, 10}, {2, 3, 5, 7, 9}, {2, 3, 5, 7, 10}, {2, 3, 5, 8,
9}, {2, 3, 5, 8, 10}, {2, 3, 6, 7, 9}, {2, 3, 6, 7, 10}, {2, 3, 6,
8, 9}, {2, 3, 6, 8, 10}, {2, 4, 5, 7, 9}, {2, 4, 5, 7, 10}, {2, 4,
5, 8, 9}, {2, 4, 5, 8, 10}, {2, 4, 6, 7, 9}, {2, 4, 6, 7, 10}, {2,
4, 6, 8, 9}, {2, 4, 6, 8, 10}}
Out[2]:= 32

Here's a v0.5 solution using Base.product.
With
a = [1,3,5,7,9]
b = [2,4,6,8,10]
To create an array of tuples
julia> vec(collect(Base.product(zip(a, b)...)))
32-element Array{Tuple{Int64,Int64,Int64,Int64,Int64},1}:
(1,3,5,7,9)
(2,3,5,7,9)
(1,4,5,7,9)
(2,4,5,7,9)
(1,3,6,7,9)
(2,3,6,7,9)
(1,4,6,7,9)
(2,4,6,7,9)
(1,3,5,8,9)
(2,3,5,8,9)
⋮
(2,4,6,7,10)
(1,3,5,8,10)
(2,3,5,8,10)
(1,4,5,8,10)
(2,4,5,8,10)
(1,3,6,8,10)
(2,3,6,8,10)
(1,4,6,8,10)
(2,4,6,8,10)
and to collect that result into a matrix
julia> hcat((collect(row) for row in ans)...)
5×32 Array{Int64,2}:
1 2 1 2 1 2 1 2 1 2 1 2 1 … 2 1 2 1 2 1 2 1 2
3 3 4 4 3 3 4 4 3 3 4 4 3 4 3 3 4 4 3 3 4 4
5 5 5 5 6 6 6 6 5 5 5 5 6 6 5 5 5 5 6 6 6 6
7 7 7 7 7 7 7 7 8 8 8 8 8 7 8 8 8 8 8 8 8 8
9 9 9 9 9 9 9 9 9 9 9 9 9 10 10 10 10 10 10 10 10 10
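For readers more familiar with Python, the same idea (take the Cartesian product of the position-wise pairs) can be sketched with the standard library's itertools.product. This is only an illustrative comparison, not part of the Julia answer. Note that itertools.product varies the last position fastest, so the tuples come out in the order of the Mathematica output in the question rather than in the order of the Julia output above.

```python
from itertools import product

a = [1, 3, 5, 7, 9]
b = [2, 4, 6, 8, 10]

# zip(a, b) yields the pairs (1, 2), (3, 4), ...; unpacking them hands
# product() one two-element choice per position.
combos = list(product(*zip(a, b)))

print(len(combos))  # 2**5 = 32
print(combos[:3])
```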

There is a package, Iterators.jl. After installing it with Pkg.add("Iterators"), you can do the following:
using Iterators
for p in product([1,2],[3,4],[5,6],[7,8],[9,10])
    @show p
end
Output:
p = (1,3,5,7,9)
p = (2,3,5,7,9)
p = (1,4,5,7,9)
p = (2,4,5,7,9)
p = (1,3,6,7,9)
p = (2,3,6,7,9)
p = (1,4,6,7,9)
p = (2,4,6,7,9)
p = (1,3,5,8,9)
p = (2,3,5,8,9)
p = (1,4,5,8,9)
p = (2,4,5,8,9)
p = (1,3,6,8,9)
p = (2,3,6,8,9)
p = (1,4,6,8,9)
p = (2,4,6,8,9)
p = (1,3,5,7,10)
p = (2,3,5,7,10)
p = (1,4,5,7,10)
p = (2,4,5,7,10)
p = (1,3,6,7,10)
p = (2,3,6,7,10)
p = (1,4,6,7,10)
p = (2,4,6,7,10)
p = (1,3,5,8,10)
p = (2,3,5,8,10)
p = (1,4,5,8,10)
p = (2,4,5,8,10)
p = (1,3,6,8,10)
p = (2,3,6,8,10)
p = (1,4,6,8,10)
p = (2,4,6,8,10)
EDIT
To get the results as an array of arrays or as a matrix, you can do:
arr = Any[]
for p in product([1,2],[3,4],[5,6],[7,8],[9,10])
push!(arr,[y for y in p])
end
# arr is now an array of arrays. If you want a matrix:
hcat(arr...)

Probably the simplest solution is to filter out the unsorted elements; filter(issorted, …) should do the trick. This yields 26 elements, though, so perhaps I'm misunderstanding your intention:
julia> collect(filter(issorted, combinations(ab,5)))
26-element Array{Array{Int64,1},1}:
[1,3,5,7,9]
[1,3,5,7,8]
⋮

Related

Paired T-Test over multiple paired columns (wide data format)

I've converted a data frame into wide format and now want to compute paired t-tests to obtain p-values. I have managed to do this for each pair of columns individually, but it's a lot more code than I feel is necessary. I'm still very new to R, data and coding generally, and couldn't easily see a solution here on Stack Overflow.
My wide data frame is:
> head(df_wide)
# A tibble: 6 x 21
Assessor Appearance1 Appearance2 Aroma_1 Aroma_2 Flavour_1 Flavour_2
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 10 10 10 10 10 10
2 6 7 7 5 8 4
# ... with 14 more variables
I want to perform a paired T-Test over the attributes, i.e. Appearance1 and Appearance2, Aroma1 and Aroma2, etc. The 14 other variables are all <dbl> and are also attributes to be included as paired columns for the T-Test.
Ideally, the output would be a vector of just the p-values, rather than having all the information. I've managed to do that coding for individual pairs, but I wanted to know if this would be possible to do as part of performing the T-Test over multiple pairs of columns.
Here is the code I have for the first two attributes:
p_values <- c(t.test(df_wide$`Appearance1`, df_wide$`Appearance2`, paired = T)[["p.value"]],
t.test(df_wide$`Aroma1`, df_wide$`Aroma2`, paired = T)[["p.value"]])
This creates the vector I want, but is cumbersome and error-prone. Ideally, I'd be able to perform it over all the pairs at once without needing to use column names.
I do have the original data frame in long format, if it would be easier to do it using that (EDIT: used dput() for the first 20 rows instead of head()):
> dput(df_test[1:20,])
structure(list(Assessor = c(1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9, 10, 10),
Product = c("MC", "MV", "MC", "MV", "MV", "MC", "MC", "MV", "MV", "MC", "MC", "MV", "MC", "MV", "MC", "MV", "MV", "MC", "MV", "MC"),
Appearance = c(10, 10, 6, 7, 9, 6, 7, 8, 9, 8, 10, 8, 6, 6, 9, 8, 8, 8, 9, 9),
Aroma = c(10, 10, 7, 5, 9, 8, 6, 7, 5, 7, 9, 8, 6, 6, 5, 3, 6, 7, 9, 6),
Flavour = c(10, 10, 8, 4, 10, 7, 7, 6, 8, 8, 9, 10, 8, 8, 6, 8, 7, 9, 9, 8),
Texture = c(10, 10, 8, 8, 9, 6, 7, 8, 8, 8, 9, 10, 8, 8, 9, 8, 8, 9, 9, 8),
`JAR Colour` = c(3, 2, 2, 3, 3, 3, 3, 3, 3, 2, 3, 2, 3, 2, 3, 3, 3, 3, 3, 3),
`JAR Strength Chocolate` = c(2, 2, 3, 2, 3, 3, 2, 3, 3, 3, 3, 3, 3, 3, 2, 2, 2, 3, 3, 2),
`JAR Strength Vanilla` = c(3, 3, 3, 2, 3, 2, 3, 3, 2, 3, 2, 3, 3, 3, 2, 2, 3, 3, 2, 3),
`JAR Sweetness` = c(2, 3, 3, 1, 3, 2, 2, 2, 3, 3, 2, 3, 3, 3, 3, 2, 3, 3, 3, 3),
`JAR Creaminess` = c(3, 3, 3, 3, 3, 1, 3, 2, 3, 3, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3),
`Overall Acceptance` = c(9, 10, 8, 4, 10, 5, 7, 7, 8, 8, 9, 10, 8, 8, 8, 8, 8, 9, 8, 8)),
row.names = c(NA, -20L), class = c("tbl_df", "tbl", "data.frame"))
The Product variable is the one which was used to make the paired columns in the wide format data frame. Thanks in advance.
If I understand correctly, this can be done directly on the long-format data:
df <- structure(list(Assessor = c(1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9, 10, 10),
Product = c("MC", "MV", "MC", "MV", "MV", "MC", "MC", "MV", "MV", "MC", "MC", "MV", "MC", "MV", "MC", "MV", "MV", "MC", "MV", "MC"),
Appearance = c(10, 10, 6, 7, 9, 6, 7, 8, 9, 8, 10, 8, 6, 6, 9, 8, 8, 8, 9, 9),
Aroma = c(10, 10, 7, 5, 9, 8, 6, 7, 5, 7, 9, 8, 6, 6, 5, 3, 6, 7, 9, 6),
Flavour = c(10, 10, 8, 4, 10, 7, 7, 6, 8, 8, 9, 10, 8, 8, 6, 8, 7, 9, 9, 8),
Texture = c(10, 10, 8, 8, 9, 6, 7, 8, 8, 8, 9, 10, 8, 8, 9, 8, 8, 9, 9, 8),
`JAR Colour` = c(3, 2, 2, 3, 3, 3, 3, 3, 3, 2, 3, 2, 3, 2, 3, 3, 3, 3, 3, 3),
`JAR Strength Chocolate` = c(2, 2, 3, 2, 3, 3, 2, 3, 3, 3, 3, 3, 3, 3, 2, 2, 2, 3, 3, 2),
`JAR Strength Vanilla` = c(3, 3, 3, 2, 3, 2, 3, 3, 2, 3, 2, 3, 3, 3, 2, 2, 3, 3, 2, 3),
`JAR Sweetness` = c(2, 3, 3, 1, 3, 2, 2, 2, 3, 3, 2, 3, 3, 3, 3, 2, 3, 3, 3, 3),
`JAR Creaminess` = c(3, 3, 3, 3, 3, 1, 3, 2, 3, 3, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3),
`Overall Acceptance` = c(9, 10, 8, 4, 10, 5, 7, 7, 8, 8, 9, 10, 8, 8, 8, 8, 8, 9, 8, 8)),
row.names = c(NA, -20L), class = c("tbl_df", "tbl", "data.frame"))
head(df)
#> # A tibble: 6 x 12
#> Assessor Product Appearance Aroma Flavour Texture `JAR Colour`
#> <dbl> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 1 MC 10 10 10 10 3
#> 2 1 MV 10 10 10 10 2
#> 3 2 MC 6 7 8 8 2
#> 4 2 MV 7 5 4 8 3
#> 5 3 MV 9 9 10 9 3
#> 6 3 MC 6 8 7 6 3
#> # ... with 5 more variables: JAR Strength Chocolate <dbl>,
#> # JAR Strength Vanilla <dbl>, JAR Sweetness <dbl>, JAR Creaminess <dbl>,
#> # Overall Acceptance <dbl>
library(tidyverse)
map_df(df[-c(1:2)], ~t.test(.x ~ df$Product, paired = TRUE)$p.value)
#> # A tibble: 1 x 10
#> Appearance Aroma Flavour Texture `JAR Colour` `JAR Strength Chocolate`
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 0.496 0.576 1 0.309 0.678 1
#> # ... with 4 more variables: JAR Strength Vanilla <dbl>, JAR Sweetness <dbl>,
#> # JAR Creaminess <dbl>, Overall Acceptance <dbl>
sapply(df[-c(1:2)], function(x) t.test(x ~ df$Product, paired = TRUE)$p.value)
#> Appearance Aroma Flavour
#> 0.4961016 0.5763122 1.0000000
#> Texture JAR Colour JAR Strength Chocolate
#> 0.3092332 0.6783097 1.0000000
#> JAR Strength Vanilla JAR Sweetness JAR Creaminess
#> 0.6783097 1.0000000 0.4433319
#> Overall Acceptance
#> 0.7803523
Created on 2021-06-22 by the reprex package (v2.0.0)

R function to find rank of a value in a sorted vector

I'd like to find the rank of a value in a sorted vector, i.e., given a sorted (increasing) vector and a value, find the index of the value in the vector if it is present (or the mean of indices if more than once), or the index of the greatest element less than the value, if it is not present, but within the range of the vector, or something reasonable if the value is outside the range of the vector altogether.
Let's say xx is the vector and x is the value. mean(which(xx == x)) covers the value-present case, and max(which(xx < x)) covers the value-not-present-and-in-range case. 1 and length(xx) are probably reasonable outputs for the not-in-range case.
So I could do that, but I'd like to avoid creating a Boolean vector the size of xx, and also there are just enough wrinkles that I'd prefer to call a built-in or library function instead of rolling my own. Perhaps there is something simple which I've overlooked.
Here's an example. The first value, 7, is present in the vector. The second, 7.3, is not present. I'd like to get the outputs 82.5 and 86, respectively.
> sort (floor (runif (100) * 10)) -> xx
> xx
[1] 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2
[38] 2 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 4 5 5 5 5 5 5 5 5 5 5 5 5 6 6 6 6
[75] 6 6 6 6 7 7 7 7 7 7 7 7 8 8 8 8 8 8 8 8 8 8 8 9 9 9
> mean (which (xx == 7))
[1] 82.5
> max (which (xx <= 7.3))
[1] 86
EDIT: With hints from akrun, I've come up with the following. Note that when there are duplicates, it makes use of the fact that match returns the least index and findInterval returns the greatest.
# assume xx is sorted already
mean.rank.in <- function (xx, x) {
findInterval (x, xx) -> i
if (i == 0) 0
else
if (xx[[i]] == x)
# account for duplicates here:
# findInterval returned greatest index, call match to find least
(match(x, xx) + i)/2
else i
}
Here are some checks:
xx <- c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3,
3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5,
5, 5, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 6, 6, 7, 7, 7, 7, 7,
7, 7, 7, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 9, 9, 9)
mean.rank.in (xx, 7) == 82.5 # expect TRUE
mean.rank.in (xx, 7.3) == 86 # expect TRUE
sapply (xx, function (x) mean.rank.in (xx, x)) # looks right
sum (sapply (xx, function (x) mean.rank.in (xx, x))) == 5050 # expect TRUE
yy <- sort (runif (100))
all (sapply (yy, function (y) mean.rank.in (yy, y)) == 1:100) # expect TRUE
dyy <- min (yy[2:100] - yy[1:99])
yy1 <- yy + dyy/2
all (sapply (yy1, function (y) mean.rank.in (yy1, y)) == 1:100) # expect TRUE
mean.rank.in (yy, yy[[1]] - 1) == 0 # expect TRUE
mean.rank.in (yy, yy[[100]] + 1) == 100 # expect TRUE
Here is one option with rank
rank(xx)[match(7, xx)]
#[1] 82.5
and with findInterval
findInterval(7.3, xx)
#[1] 86
data
xx <- c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3,
3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5,
5, 5, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 6, 6, 7, 7, 7, 7, 7,
7, 7, 7, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 9, 9, 9)
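For comparison only, the same lookup can be sketched in Python with the standard bisect module, where bisect_left and bisect_right play the roles of match (least index) and findInterval (greatest index). The function name mean_rank_in and the choice to return 0 below the range mirror the R function in the question; they are assumptions of this sketch, not a library API.

```python
from bisect import bisect_left, bisect_right

def mean_rank_in(xx, x):
    """1-based average rank of x in the sorted list xx.

    If x is present, return the mean of its (1-based) positions;
    otherwise return the number of elements <= x (0 if x is below
    the range, len(xx) if it is above).
    """
    hi = bisect_right(xx, x)   # count of elements <= x
    lo = bisect_left(xx, x)    # count of elements <  x
    if lo < hi:                # x occurs at 1-based positions lo+1 .. hi
        return (lo + 1 + hi) / 2
    return hi

# Same data as the R example: 12 zeros, 14 ones, 12 twos, ...
counts = [12, 14, 12, 9, 11, 12, 8, 8, 11, 3]
xx = [value for value, n in enumerate(counts) for _ in range(n)]

print(mean_rank_in(xx, 7))    # 82.5
print(mean_rank_in(xx, 7.3))  # 86
```

Both bisect calls are binary searches, so no Boolean vector the size of xx is created.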

R Tidyverse expand a dataframe with all combinations of two variables (edgelist)

This code generates a data frame just so:
library(tidyverse)
A = c(7, 4, 3, 12, 6)
B = c(1, 10, 9, 8, 5)
C = c(5, 3, 1, 7, 6)
df <- data_frame(A, B, C) %>% gather(letter1, rank)
nested <- df %>% group_by(letter1) %>% nest(ranks = c(rank))
nested
A grouped_df: 3 × 2
letter1 ranks
<chr> <list>
A 7, 4, 3, 12, 6
B 1, 10, 9, 8, 5
C 5, 3, 1, 7, 6
This is the desired data frame:
A tibble: 9 × 4
letter1 letter2 data1 data2
<chr> <chr> <list> <list>
A A 7, 4, 3, 12, 6 7, 4, 3, 12, 6
B A 1, 10, 9, 8, 5 7, 4, 3, 12, 6
C A 5, 3, 1, 7, 6 7, 4, 3, 12, 6
A B 7, 4, 3, 12, 6 1, 10, 9, 8, 5
B B 1, 10, 9, 8, 5 1, 10, 9, 8, 5
C B 5, 3, 1, 7, 6 1, 10, 9, 8, 5
A C 7, 4, 3, 12, 6 5, 3, 1, 7, 6
B C 1, 10, 9, 8, 5 5, 3, 1, 7, 6
C C 5, 3, 1, 7, 6 5, 3, 1, 7, 6
Once this step is solved, I'll run a mutate using data1 and data2 to get value, and then selecting letter1, letter2 and value will give an edgelist. I'm working with about 700 letters and the ranks lists will all be the same size and contain about 20 elements.
I'd expected to be able to use expand or expand.grid, but to no avail. Any tidyverse suggestions will be greatly appreciated.
crossing can be used
library(tidyr)
library(purrr)
library(dplyr)
crossing(ind1 = seq_len(nrow(nested)),
ind2 = seq_len(nrow(nested))) %>%
pmap_dfr(~ bind_cols(nested[..1,], nested[..2,]) )
We can use crossing after renaming the second dataframe.
tidyr::crossing(nested, setNames(nested, c('letter2', 'rank2')))
# letter1 ranks letter2 rank2
#1 A 7, 4, 3, 12, 6 A 7, 4, 3, 12, 6
#2 A 7, 4, 3, 12, 6 B 1, 10, 9, 8, 5
#3 A 7, 4, 3, 12, 6 C 5, 3, 1, 7, 6
#4 B 1, 10, 9, 8, 5 A 7, 4, 3, 12, 6
#5 B 1, 10, 9, 8, 5 B 1, 10, 9, 8, 5
#6 B 1, 10, 9, 8, 5 C 5, 3, 1, 7, 6
#7 C 5, 3, 1, 7, 6 A 7, 4, 3, 12, 6
#8 C 5, 3, 1, 7, 6 B 1, 10, 9, 8, 5
#9 C 5, 3, 1, 7, 6 C 5, 3, 1, 7, 6
The same is also valid for expand_grid.
tidyr::expand_grid(nested, setNames(nested, c('letter2', 'rank2')))
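As a language-neutral illustration of what crossing does here, the following Python sketch pairs every row of the nested table with every row of the same table using itertools.product. The dictionary layout and the output keys (letter1, letter2, data1, data2) are assumptions chosen to mirror the desired data frame; this is not a tidyverse equivalent.

```python
from itertools import product

# Stand-in for the nested data frame: one list of ranks per letter.
nested = {
    "A": [7, 4, 3, 12, 6],
    "B": [1, 10, 9, 8, 5],
    "C": [5, 3, 1, 7, 6],
}

# Every ordered pair of rows, including each row paired with itself.
pairs = [
    {"letter1": k1, "letter2": k2, "data1": v1, "data2": v2}
    for (k1, v1), (k2, v2) in product(nested.items(), repeat=2)
]

for row in pairs:
    print(row["letter1"], row["letter2"], row["data1"], row["data2"])
```

With n letters this produces n * n rows, so about 700 letters would give roughly 490,000 pairs.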

Check row by row and highlight mismatches in row/column when it occurred

I have a data frame with 3 months of data containing individual information. An individual's information should stay fixed over the whole period; in my real data set, however, that is not the case. I would like to check row by row and highlight the dates on which something went wrong during data entry.
Here is a sample of my dataset (the real dataset has more variables):
input <- data.frame(stringsAsFactors=FALSE,
date = c(20190218, 20190219, 20190220, 20190221, 20190222,
20190223, 20190101, 20190103, 20190105, 20190110,
20190112, 20190218, 20190219, 20190220, 20190221, 20190222,
20190223),
id = c("18105265-ab", "18105265-ab", "18105265-ab",
"18105265-ab", "18105265-ab", "18105265-ab",
"18161665-aa", "18161665-aa", "18161665-aa", "18161665-aa",
"18161665-aa", "18502020-aa", "18502020-aa", "18502020-aa",
"18502020-aa", "18502020-aa", "18502020-aa"),
size = c(3, 3, 3, 3, 2, 2, 2, 2, 2, 1, 1, 2, 2, 2, 2, 1, 1),
type = c(4, 4, 4, 4, 4, 4, 4, 4, 4, 2, 2, 4, 4, 4, 4, 2, 2),
county = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 5, 5, 5, 5, 5, 5),
member_p10 = c(3, 3, 3, 3, 2, 2, 2, 2, 2, 1, 1, 2, 2, 2, 2, 1, 1),
youngest_age = c(5, 5, 5, 5, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7),
sex = c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1),
position = c(5, 5, 5, 5, 5, 5, 4, 4, 4, 0, 0, 3, 3, 3, 3, 0, 0))
Is there a way to do this kind of check? I would like to end up with this output:
date id size type county member_p10 youngest_age sex position
1 20190221 18105265-ab 3 4 1 3 5 1 5
2 20190222 18105265-ab 2 4 1 2 7 1 5
3 20190105 18161665-aa 2 4 1 2 7 2 4
4 20190110 18161665-aa 1 2 1 1 7 2 0
5 20190221 18502020-aa 2 4 5 2 7 1 3
6 20190222 18502020-aa 1 2 5 1 7 1 0
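One possible reading of the desired output is: for each id, keep the last row before a change and the first row after it. The Python sketch below illustrates that logic under stated assumptions: rows are already sorted by id and then date, only a trimmed subset of the columns (date, id, size, type, position) is used, and flag_changes is a hypothetical helper name.

```python
from itertools import groupby

fields = ("date", "id", "size", "type", "position")
raw = [
    (20190218, "18105265-ab", 3, 4, 5), (20190219, "18105265-ab", 3, 4, 5),
    (20190220, "18105265-ab", 3, 4, 5), (20190221, "18105265-ab", 3, 4, 5),
    (20190222, "18105265-ab", 2, 4, 5), (20190223, "18105265-ab", 2, 4, 5),
    (20190101, "18161665-aa", 2, 4, 4), (20190103, "18161665-aa", 2, 4, 4),
    (20190105, "18161665-aa", 2, 4, 4), (20190110, "18161665-aa", 1, 2, 0),
    (20190112, "18161665-aa", 1, 2, 0), (20190218, "18502020-aa", 2, 4, 3),
    (20190219, "18502020-aa", 2, 4, 3), (20190220, "18502020-aa", 2, 4, 3),
    (20190221, "18502020-aa", 2, 4, 3), (20190222, "18502020-aa", 1, 2, 0),
    (20190223, "18502020-aa", 1, 2, 0),
]
rows = [dict(zip(fields, r)) for r in raw]

def flag_changes(rows, key="id", ignore=("date",)):
    """Return the rows on either side of every change within one id."""
    flagged = []
    for _, group in groupby(rows, key=lambda r: r[key]):
        prev = None
        for row in group:
            if prev is not None and any(
                row[k] != prev[k] for k in row if k not in ignore
            ):
                # Keep the last consistent row and the first changed row,
                # without duplicating a row flagged by the previous change.
                for r in (prev, row):
                    if not flagged or flagged[-1] is not r:
                        flagged.append(r)
            prev = row
    return flagged

flagged = flag_changes(rows)
for r in flagged:
    print(r["date"], r["id"], r["size"], r["type"], r["position"])
```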

How to identify fully connected node clusters with igraph?

I'm trying to calculate the clusters of a network using igraph in R, where all nodes are connected. The plot seems to work OK, but then I'm not able to return the correct groupings from my clusters.
In this example, the plot shows 4 main clusters, but in the largest cluster, not all nodes are connected:
I would like to be able to return the following list of clusters from this graph object:
[[1]]
[1] 8 9
[[2]]
[1] 7 10
[[3]]
[1] 4 6 11
[[4]]
[1] 2 3 5
[[5]]
[1] 1 3 5 12
Example code:
library(igraph)
topology <- structure(list(N1 = c(1, 3, 5, 12, 2, 3, 5, 1, 2, 3, 5, 12, 4,
6, 11, 1, 2, 3, 5, 12, 4, 6, 11, 7, 10, 8, 9, 8, 9, 7, 10, 4,
6, 11, 1, 3, 5, 12), N2 = c(1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3,
3, 4, 4, 4, 5, 5, 5, 5, 5, 6, 6, 6, 7, 7, 8, 8, 9, 9, 10, 10,
11, 11, 11, 12, 12, 12, 12)), .Names = c("N1", "N2"), row.names = c(NA,
-38L), class = "data.frame")
g2 <- graph.data.frame(topology, directed=FALSE)
g3 <- simplify(g2)
plot(g3)
The cliques function gets me part of the way there:
tmp <- cliques(g3)
tmp
but this list also gives groupings where not all nodes connect. For example, this clique includes the nodes 1, 2, 3, 5, but 1 only connects to 3, 2 only connects to 3 and 5, and 5 only connects to 2:
topology[tmp[[31]],]
# N1 N2
#6 3 2
#7 5 2
#8 1 3
Thanks in advance for any help.
You could use maximal.cliques in the igraph package. See below.
# Load package
library(igraph)
# Load data
topology <- structure(list(N1 = c(1, 3, 5, 12, 2, 3, 5, 1, 2, 3, 5, 12, 4,
6, 11, 1, 2, 3, 5, 12, 4, 6, 11, 7, 10, 8, 9, 8, 9, 7, 10, 4,
6, 11, 1, 3, 5, 12), N2 = c(1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3,
3, 4, 4, 4, 5, 5, 5, 5, 5, 6, 6, 6, 7, 7, 8, 8, 9, 9, 10, 10,
11, 11, 11, 12, 12, 12, 12)), .Names = c("N1", "N2"), row.names = c(NA,
-38L), class = "data.frame")
# Get rid of loops and ensure right naming of vertices
g3 <- simplify(graph.data.frame(topology[order(topology[[1]]),],directed = FALSE))
# Plot graph
plot(g3)
# Calculate the maximal cliques
maximal.cliques(g3)
# > maximal.cliques(g3)
# [[1]]
# [1] 9 8
#
# [[2]]
# [1] 10 7
#
# [[3]]
# [1] 2 3 5
#
# [[4]]
# [1] 6 4 11
#
# [[5]]
# [1] 12 1 5 3

Resources