Find n-cliques in igraph - r

I would like to know if I can find so-called n-cliques in an igraph object. Those are defined as "a maximal subgraph in which the largest geodesic distance between any two nodes is no greater than n" according to Wasserman & Faust. I'm aware that cliques of n=1 can be found via cliques() and that the sizes of cliques can be defined beforehand, but is there any way to find cliques of n larger than 1?

In theory, you could try RBGL::kCliques:
library(igraph)
library(RBGL)
set.seed(1)
g <- random.graph.game(100, p.or.m = 300, type = "gnm")
coords <- layout.auto(g)
cl <- kCliques(igraph.to.graphNEL(g))
k <- 2
clSel <- cl[[paste0(k, '-cliques')]][[1]] # select first of all k-cliques (e.g.)
plot(
g,
layout = coords,
vertex.shape = "none",
vertex.label.color = ifelse(V(g) %in% clSel, "red", "darkgrey"),
edge.color = ifelse(tail_of(g, E(g)) %in% clSel & head_of(g, E(g)) %in% clSel, "orange", "#F0F0F099"),
vertex.size = .5,
edge.curved = 1
)
However, in practice...
all(print(distances(induced_subgraph(g, clSel))) <=k ) # should be TRUE
# [1] FALSE
there seems to be something wrong if we use the definition:
In Social Network Analysis, a k-clique in a graph is a subgraph where
the distance between any two nodes is no greater than k.
Or maybe I misunderstood something...

Thanks to lukeA for pointing out RBGL::kCliques as a solution within R for this problem.
n-cliques are allowed to have links that run through nodes outside the clique. So
A -- B -- C -- D, with B -- E and C -- E as well, can be a 2-clique if A and D are linked through another node, F, even though F is not in the 2-clique (since it is 3 away from E). See http://faculty.ucr.edu/~hanneman/nettext/C11_Cliques.html#nclique
n-clans are not allowed to have this behavior, however; all paths must pass through members of the subgraph to count. lukeA's test therefore demonstrates that the n-cliques are not all n-clans.
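The difference between the two tests can be seen on a small graph. The sketch below is only an illustration on a hypothetical 6-cycle (not the random graph used above): the vertex set {1, 3, 5} passes the n-clique test when distances are measured in the full graph, but fails the n-clan test when distances are measured inside the induced subgraph.

```r
library(igraph)

# Hypothetical toy graph: the 6-cycle 1-2-3-4-5-6-1
g <- make_ring(6)
S <- c(1, 3, 5)

# n-clique test: geodesics are measured in the full graph,
# so paths may pass through vertices outside S
d_full <- distances(g, v = S, to = S)
is_2_clique_set <- all(d_full <= 2)

# n-clan test: geodesics are measured inside the induced subgraph only
d_sub <- distances(induced_subgraph(g, S))
is_2_clan_set <- all(d_sub <= 2)
```

Here the induced subgraph on {1, 3, 5} has no edges at all, so its internal distances are infinite. This matches what the check with distances(induced_subgraph(...)) above reports as FALSE.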
You could construct a function that outputs n-clans by throwing out all subgraphs in which the paths aren't fully within the subgraph, e.g.,
nclan <- function(g,n){
  g <- as.undirected(g)
  E(g)$weight <- 1 #just in case g has weights - does not modify original graph
  ncliques <- kCliques(ugraph(igraph.to.graphNEL(g))) #get cliques
  n.cand <- ncliques[[n]] #n-clique candidates to be an n-clan
  n.clan <- list() #initializes a list to store the n-clans
  n.clan.i <- 1 #initializes a list pointer
  for (n.cand.i in seq_along(n.cand)){ #loop over all of the candidates (safe if there are none)
    g.n.cand <- induced_subgraph(g,n.cand[[n.cand.i]]) #get the subgraph
    if (diameter(g.n.cand)<=n){ #check diameter of the subgraph
      n.clan[[n.clan.i]] <- n.cand[[n.cand.i]] #add n-clan to the list
      n.clan.i <- n.clan.i+1 #increment list pointer
    }
  }
  return(n.clan) #return the entire list
}
Resetting all edge weights to 1 works around an odd bug in RBGL's kCliques implementation. Similarly, you can write a k-plex function:
kplex <- function(g,k,m){
  g.sym <- as.undirected(g) #to make sure that degree functions properly
  g.sym.degmk <- induced_subgraph(g.sym,igraph::degree(g.sym)>=(m-k)) #makes algorithm faster
  k.cand <- combn(V(g.sym.degmk)$name,m) #all candidate combinations with m members (requires a vertex name attribute)
  k.plex <- list() #initializes a list to store the k-plexes
  k.plex.i <- 1 #initializes a list pointer
  for (k.cand.i in seq_len(ncol(k.cand))){ #loop over all of the columns
    g.k.cand <- induced_subgraph(g.sym.degmk,k.cand[,k.cand.i]) #get the subgraph
    if (min(igraph::degree(g.k.cand))>=(m-k)){ #if minimum degree of subgraph is >= m-k, k-plex!
      k.plex[[k.plex.i]] <- k.cand[,k.cand.i] #add k-plex to list
      k.plex.i <- k.plex.i+1 #increment list pointer
    }
  }
  return(k.plex) #return the entire list
}

You can use the connect.neighborhood() igraph function to connect each vertex to all others no more than distance k away. Then you can find cliques in the resulting graph. This will give you the "k-cliques", as you defined them.
Those are defined as "a maximal subgraph ...
I'm aware that cliques of n=1 can be found via cliques()
Careful here. cliques() finds all cliques, both maximal and non-maximal. max_cliques() finds only maximal cliques. Choose the one that is appropriate for your application.
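A minimal sketch of this approach, on a hypothetical 5-vertex path rather than the graph from the question. It uses connect(), the name that current igraph versions give to connect.neighborhood(), followed by max_cliques() because the definition asks for maximal subgraphs.

```r
library(igraph)

# Hypothetical toy graph: the path 1-2-3-4-5
g <- make_ring(5, circular = FALSE)
k <- 2

# Add an edge between every pair of vertices at distance <= k
gk <- connect(g, k)

# Maximal cliques of gk correspond to the k-cliques of g
kcl <- lapply(max_cliques(gk), function(vs) sort(as_ids(vs)))

# Sanity check against the definition: distances are taken in the ORIGINAL graph
ok <- vapply(kcl,
             function(vs) all(distances(g, v = vs, to = vs) <= k),
             logical(1))
```

On this path the 2-cliques are the three windows {1,2,3}, {2,3,4} and {3,4,5}. Note that the check uses distances in g, not in the induced subgraph, which is what separates n-cliques from n-clans.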

Related

R igraph make “subgraph” from igraph object from list of vertices, infer edges between selected vertices if there are connected nodes in original

I am using igraph from R. I know we can make a subgraph with selected vertices but if those nodes aren’t directly connected, there won’t be an edge in the new subgraph. Is there a way to make a subgraph which creates an edge between two nodes if there are other nodes (that are not a part of the vertex list) indirectly connecting those two nodes?
For example, if I have a graph which has the following edges:
E-F
F-G
And my vertex list contains E and G, how can I create a new subgraph that creates that edge E-G?
Thank you!!!
One way to find neighbors that are two steps away is to multiply the adjacency matrix with itself (see comments here for example).
First create the graph described in the question:
library(igraph)
g <- graph_from_literal(E--F, F--G)
Then take the adjacency matrix (m) and multiply it with itself.
m <- get.adjacency(g, sparse = F)
m2 <- m %*% m
Build a new graph from the resulting adjacency matrix and remove all vertices that have a degree of 0 (no second-degree neighbor):
g2 <- graph_from_adjacency_matrix(m2, diag = F, mode = "undirected")
induced_subgraph(g2, degree(g2) > 0)
#> IGRAPH 089bf67 UN-- 2 1 --
#> + attr: name (v/c)
#> + edge from 089bf67 (vertex names):
#> [1] E--G
Created on 2022-08-26 with reprex v2.0.2
Building upon the suggestions in the comments, I arrive at:
require(igraph)
set.seed(1)
g <- erdos.renyi.game(2^6, 1/32)
V(g)$name <- seq(vcount(g))
filter <- c(7,22, 1, 4, 6)
amg <- g[] # adjacency matrix g
clg <- clusters(g)$membership # connected-component membership (g is undirected)
amtc <- clg[row(amg)] == clg[col(amg)] # adjacency matrix of transitive closure
dim(amtc) <- dim(amg)
gtc <- simplify(graph.adjacency(amtc, mode="undirected")) # transitive closure of g
V(gtc)$name <- V(g)$name
isg <- induced_subgraph(gtc, filter)
plot(isg)
However this solution is not feasible if g is large and the subgraph significantly smaller.
If subgraph << original graph then:
require(igraph)
set.seed(1)
g <- erdos.renyi.game(2^6, 1/32)
V(g)$name <- seq(vcount(g))
filter <- c(1, 4, 6, 7, 22, 25)
stopifnot(!is.directed(g)) # assume undirected graph
mscc <- components(g)$membership[filter] # membership strongly connected components
amfi <- outer(X=mscc, mscc, FUN = "==")*1 # cross product = 1, when equal
fitc <- simplify(graph.adjacency(amfi, mode="undirected")) # transitive closure of filter in g
plot(fitc)
Building on Szabolcs, note that connect(g, vcount(g)) computes the transitive closure of g. However, this is not suitable for larger graphs (vcount > 8192).
require(igraph)
g <- make_graph(~ E-G, G-F)
fi <- c("E", "F")
system.time(tcg <- connect(g, vcount(g)) )
sg <- induced_subgraph(tcg, V(tcg)[fi])
sg
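When only a handful of vertices are of interest, the closure of the whole graph is not needed. The following is a sketch under the assumption that the graph is undirected and has vertex names; the toy graph and the vector keep are made up for illustration. It computes distances only between the selected vertices and links every pair that is reachable at all.

```r
library(igraph)

# Hypothetical toy graph: E-F-G plus a separate component H-I
g <- graph_from_literal(E-F, F-G, H-I)
keep <- c("E", "G", "H")

# Distances between the selected vertices only (Inf if unreachable)
d <- distances(g, v = keep, to = keep)

# Link two selected vertices if some path connects them in g
adj <- is.finite(d) & d > 0
dimnames(adj) <- list(keep, keep)

sg <- graph_from_adjacency_matrix(adj * 1, mode = "undirected")
```

E and G end up joined because F connects them in the original graph, while H stays isolated because it lies in a different component.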

Does igraph have a function that generates sub-graphs limited by weights? dfs, random_walk

I have a weighted graph in the igraph R environment.
I need to obtain sub-graphs recursively, starting from any random node. The sum of weights in each sub-graph has to be less than a given number.
The depth-first search algorithm seems to deal with this problem, as does the random walk function.
Does anybody know which igraph function could tackle this?
This iterative function finds the sub-graph grown from the vertex given in vertex, for any undirected graph, that contains the biggest possible weight-sum below the value specified in limit.
A challenge in finding such a graph is the computational load of evaluating the weight-sum of every possible sub-graph. Consider an example in which one iteration has found a sub-graph A-B with a weight-sum of 1.
The shortest path to any new vertex is A-C (with a weight of 3). A sub-graph of A-B-D has a weight-sum of 6, while A-B-C would have a weight-sum of 12 because of the inclusion of the edge B-C in the sub-graph.
The function below looks ahead and evaluates iterative steps by choosing to gradually enlarge the sub-graph by including the next vertex that would result in the lowest sub-graph weight-sum, rather than the vertex that has the shortest direct path.
In terms of optimisation, this leaves something to be desired, but I think it does what you requested in your first question.
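The A, B, C, D example above can be reproduced in a few lines. The edge weights below are an assumption, chosen only so that they agree with the sums quoted in the text (A-B = 1, A-C = 3, B-D = 5, B-C = 8); ws() is a hypothetical helper, not part of igraph.

```r
library(igraph)

# Assumed weights, consistent with the sums quoted in the text
el <- data.frame(from   = c("A", "A", "B", "B"),
                 to     = c("B", "C", "C", "D"),
                 weight = c(1, 3, 8, 5))
g <- graph_from_data_frame(el, directed = FALSE)

# Weight-sum of the sub-graph induced by a set of vertices
ws <- function(vs) sum(E(induced_subgraph(g, vs))$weight)

ws_ab  <- ws(c("A", "B"))        # only the edge A-B
ws_abd <- ws(c("A", "B", "D"))   # A-B and B-D
ws_abc <- ws(c("A", "B", "C"))   # A-B, A-C and B-C
```

Although A-C is the cheapest single edge leaving the sub-graph, adding C pulls in the heavy edge B-C as well, so D is the better vertex to add next. This is why the function ranks candidates by the induced sub-graph's weight-sum.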
find_maxweight_subgraph_from <- function(graph, vertex, limit=0, sub_graph=c(vertex), current_ws=0){
# Keep a shortlist of possible edges to go next
shortlist = data.frame(k=integer(0),ws=numeric(0))
limit <- min(limit, sum(E(graph)$weight))
while(current_ws < limit){
# To find the next possible vertices to include, a listing of
# potential candidates is computed to be able to choose the most
# efficient one.
# Each iteration chooses amongst vertices that are connected to the sub-graph:
adjacents <- as.vector(adjacent_vertices(graph, vertex, mode="all")[[1]])
# A shortlist of possible enlargements of the sub-graph is kept to be able
# to compare each potential enlargement of the sub-graph and always choose
# the one which results in the smallest increase of sub-graph weight-sum.
#
# The shortlist is enlarged by vertices that are:
# 1) adjacent to the latest added vertex
# 2) not already IN the sub-graph
new_k <- adjacents[!adjacents %in% sub_graph]
shortlist <- rbind(shortlist[!is.na(shortlist$k),],
data.frame(k = new_k,
ws = rep(Inf, length(new_k)) )
)
# The addition to the weight-sum is NOT calculated from the weight of individual
# edges leading to vertices on the shortlist BUT from the ACTUAL weight-sum of
# the sub-graph that would result from adding a vertex `k` to the sub-graph.
shortlist$ws <- sapply(shortlist$k, function(x) sum( E(induced_subgraph(graph, c(sub_graph,x)))$weight ) )
# We choose the vertex with the lowest impact on weight-sum:
shortlist <- shortlist[order(shortlist$ws),]
vertex <- shortlist$k[1]
current_ws <- shortlist$ws[1]
shortlist <- shortlist[2:nrow(shortlist),]
# Each iteration adds a new vertex to the sub-graph
if(current_ws <= limit){
sub_graph <- c(sub_graph, vertex)
}
}
(induced_subgraph(graph, sub_graph))
}
# Test function using a random graph
g <- erdos.renyi.game(16, 30, type="gnm", directed=F)
E(g)$weight <- sample(1:1000/100, length(E(g)))
sum(E(g)$weight)
plot(g, edge.width = E(g)$weight, vertex.size=2)
sg <- find_maxweight_subgraph_from(g, vertex=12, limit=60)
sum(E(sg)$weight)
plot(sg, edge.width = E(sg)$weight, vertex.size=2)
# Test function using your example code:
g <- make_tree(10, children = 2, mode = c("undirected"))
s <- seq(1:10)
g <- set_edge_attr(g, "weight", value= s)
plot(g, edge.width = E(g)$weight)
sg <- find_maxweight_subgraph_from(g, 2, 47)
sum(E(sg)$weight)
plot(sg, edge.width = E(g)$weight)
An attempt is shown below; however, it does not seem to be effective.
#######Example code
g <- make_tree(10, children = 2, mode = c("undirected"))
s <- seq(1:19)
g <- set_edge_attr(g, "weight", value= s)
plot(g)
is_weighted(g)
E(g)$weight
threshold <- 5
eval <- function(r){
#r <- 10
Vertice_dfs <- dfs(g, root = r)
Sequencia <- as.numeric(Vertice_dfs$order)
for (i in 1:length(Sequencia)) {
#i <- 2
# function callback by vertice to dfs
f.in <- function(graph, data, extra) {
data[1] == Sequencia[i]-1
}
# DFS algorithm to the function
dfs <- dfs(g, root = r,in.callback=f.in)
# Vertices resulted from DFS
dfs_eges <- na.omit(as.numeric(dfs$order))
# Resulting subgraph
g2 <- induced_subgraph(g, dfs_eges)
# Total weight subgraph g2
T_W <- sum(E(g2)$weight)
if (T_W > threshold) {
print(T_W)
return(T_W)
break
}
}
}
#search by vertice
result <- lapply(1:length(V(g)),eval)

Sort list into hash table according to specific comparison criteria in R

I am looking for a way, in R, to convert a list into a hash table, grouping elements that are similar according to a specific criterion.
The details are specific to "graph theory", as explained below, but I suppose the answer is a general procedure to hash based on some specific criterion.
The list is comprised of "graph" objects (from igraph package).
library(igraph)
#Creating the list of graphs
edgeList <- data.frame(
idA=c(008, 001, 001, 010, 047, 002, 005, 005),
idB=c(100, 010, 020, 030, 030, 001, 011, 111)
)
edgeList$idB= edgeList$idB+0.1
g <- graph_from_data_frame(edgeList, directed = TRUE)
g_list <- decompose(g, mode = "weak")
#from the 8 edges we obtain 5 graphs (connected components of the original graph)
The similarity criterion is that graphs must be isomorphic:
isomorphic(g_list[[1]],g_list[[4]])
How can I hash the indexes for the elements in g_list into a hash table?
For this toy example the expected result should be:
g_inded_hash
[[1]]
[1] 1 4
[[2]]
[1] 2 5
[[3]]
[1] 3
(not necessarily a list, but some data structure that groups graphs (1 and 4) and (2 and 5) which are similar)
In reality, I have 40 million (small) graphs that I need to group according to their isomorphisms.
From searching I found the answer must be related to the hash package or environment, but could not adapt that into a solution.
EDIT: changed directed = TRUE in graph_from_data_frame(), above.
Since isomorphism is transitive, we can look at all the pairs of components (i,j), such that i < j, then build a graph where the nodes are the components and the edges are defined by the isomorphic property. The hash table can be extracted from the connected components of this new graph.
# all pairs (i,j) such that i < j
combinations <- unlist(sapply(seq_along(g_list),
function(j) lapply(seq_len(j-1),
function(i) c(i,j))),
recursive = FALSE)
# filter the isomorphic pairs
iso <- Filter(function(pair) isomorphic(g_list[[pair[1]]],g_list[[pair[2]]]),
combinations)
# convert to data frame
df <- data.frame(matrix(unlist(iso), ncol = 2, byrow = TRUE))
# build graph where the vertices are the components
# and the edges indicate the isomorphic property
g_iso <- graph_from_data_frame(df, directed = FALSE)
# identify groups that share the same property
groups <- clusters(g_iso)$membership
# the names are the indices of g_list
g_hash <- lapply(unique(groups),
function(i) as.integer(names(which(groups == i))))
Result:
> g_hash
[[1]]
[1] 2 3 5
[[2]]
[1] 1 4
This does not match the expected result in the question but isomorphic(g_list[[2]],g_list[[3]]) and isomorphic(g_list[[3]],g_list[[5]]) are true.
It's probably not the most straightforward way to do this but that's what came to mind.
I managed to write a solution for my problem. It is probably not very "Rish", not very efficient, with all the loops, but I think it works. Please let me know of a better way to do this.
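# gl is the list of graphs (presumably g_list from the question)
gl_hash <- list()
gl_hash[1] <- 1
j <- 1
for(i in 2:length(gl)) {
  m <- 0
  for(k in 1:j){
    # compare graph i with the first member of group k
    if(isomorphic( gl[[ gl_hash[[k]][1] ]], gl[[i]])) {
      gl_hash[[k]] <- c(gl_hash[[k]],i) # append to group k (not group 1)
      m <- 1
      break
    }
  }
  if(m==0) {
    # no isomorphic group found: start a new one
    j <- j+ 1
    gl_hash[j] <- i
  }
}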
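With millions of small graphs, most pairwise isomorphic() calls can probably be avoided by first bucketing on cheap invariants. The sketch below is one possible way to do that, not a benchmarked solution: the key (vertex count, edge count, sorted degree sequence) and the toy list graphs are assumptions made for illustration, and isomorphic() is only called within a bucket.

```r
library(igraph)

# Hypothetical toy input: triangles, stars and a path
graphs <- list(make_ring(3),
               make_star(4, mode = "undirected"),
               make_ring(3),
               make_ring(4, circular = FALSE),
               make_star(4, mode = "undirected"))

# Cheap invariant: isomorphic graphs always share this key
key <- vapply(graphs, function(x) {
  paste(vcount(x), ecount(x), paste(sort(degree(x)), collapse = "."), sep = "|")
}, character(1))

groups <- list()
for (idx in split(seq_along(graphs), key)) {
  reps <- list()                      # isomorphism classes within this bucket
  for (i in idx) {
    placed <- FALSE
    for (k in seq_along(reps)) {
      # compare with the first member of class k only
      if (isomorphic(graphs[[reps[[k]][1]]], graphs[[i]])) {
        reps[[k]] <- c(reps[[k]], i)
        placed <- TRUE
        break
      }
    }
    if (!placed) reps[[length(reps) + 1]] <- i
  }
  groups <- c(groups, reps)
}
```

Graphs with different keys can never be isomorphic, so the expensive test is restricted to graphs that already agree on size and degree sequence. For directed graphs the key would need in- and out-degrees instead.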

Deleting a single node in R

I'm trying to visualize a preferential network of products using R. I already have a graph of the product network using igraph, but I want to see what would happen if I were to remove one product. I found that I can delete a node using
g2 <- g - V(g)[15]
but it would also delete all the edges connected to that specific node.
Is there any way to delete just the node and to see how the other nodes reconnect to each other after the deletion of that one node? Any help in this matter is appreciated.
P.S.
Hopefully this will make it clearer:
For example, if we generate the random graph:
set.seed(10)
Data <- data.frame(
X = sample(1:10),
Y = sample(3, 10, replace=T)
)
d <- graph.data.frame(Data)
plot(d)
d2 <- d-V(d)[2] #deleting '3' from the network
plot(d2)
If you notice, when you delete the node '3' from the network, node '9' remains unconnected. Is there a way to see the new edge of node '9' after node '3' is deleted? Still following the same plot, we would expect that it would connect to node '2'. Is there a function that does this in igraph? Or should I write the code for it myself?
Maybe not the most efficient way, but it should work :
library(igraph)
set.seed(10) # for plot images reproducibility
# create a graph
df <- data.frame(
X = c('A','A','B','B','D','E'),
Y = c('B','C','C','F','B','B')
)
d <- graph.data.frame(df)
# plot the original graph
plot(d)
# function to remove the vertex
removeVertexAndKeepConnections <- function(g,v){
  # multiple vertices are not supported
  stopifnot(length(v) == 1)
  vert2rem <- V(g)[v]
  if(is.directed(g) == FALSE){
    # get the neighbors of the vertex to remove
    vx <- as.integer(V(g)[nei(vert2rem)])
    # create edges to add before vertex removal
    newEdges <- as.matrix(unique(expand.grid(vx,vx)))
    # remove the self-loops (drop=FALSE keeps a matrix even for a single row)
    newEdges <- newEdges[newEdges[,1] != newEdges[,2],,drop=FALSE]
    # sort each row index to remove all the duplicates
    newEdges <- t(apply(newEdges,1,sort))
    newEdges <- unique(newEdges)
  }else{
    # get the ingoing/outgoing neighbors of the vertex to remove
    vx <- as.integer(V(g)[nei(vert2rem,mode='in')])
    vy <- as.integer(V(g)[nei(vert2rem,mode='out')])
    # create edges to add before vertex removal
    newEdges <- as.matrix(unique(expand.grid(vx,vy)))
  }
  # remove already existing edges
  newEdges <- newEdges[!apply(newEdges,MARGIN=1,FUN=function(x)are.connected(g,x[1],x[2])),,drop=FALSE]
  # add the edges
  g <- g + edges(as.integer(t(newEdges)))
  # remove the vertex
  g <- g - vertex(vert2rem)
  return(g)
}
# let's remove B (you can also use the index)
v <- 'B'
plot(removeVertexAndKeepConnections(d,v))
(Plots of the original and the modified graph are not reproduced here.)
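For the undirected case, the same idea can be sketched more compactly: join all neighbours of the vertex pairwise, drop duplicate edges, then delete the vertex. This is only an illustration on a hypothetical named toy graph, and it does not handle directed graphs.

```r
library(igraph)

# Hypothetical toy graph: B is the hub connecting A, C and D
g <- graph_from_literal(A-B, B-C, B-D)
v <- "B"

nb <- as_ids(neighbors(g, v))   # names of the neighbours of v

g2 <- g
if (length(nb) >= 2) {
  # combn() gives a 2 x n matrix; read column-wise it is a list of edge endpoints
  g2 <- add_edges(g2, as.vector(combn(nb, 2)))
}
g2 <- simplify(g2)              # drop edges that already existed
g2 <- delete_vertices(g2, v)    # finally remove the vertex itself
```

After removal, A, C and D are all connected to each other, because each pair was previously linked through B.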

R and Igraph edges

I ultimately wish to get a subset of my graph by removing connected components with 2 vertices (i.e. the two vertices have an edge between them and no other neighbours).
You could rephrase this question as:
given an edge e = (s, d) if degree(s) == degree(d) == 1 then delete edge e
I am using R and Igraph, how would I do this? I know I can subset my graph to remove all nodes with zero degree by doing the following:
g = some_graph()
ldegs <- V(g)[degree(g) < 1]
g = delete.vertices(g, ldegs)
Thanks in advance!
I don't think this is too hard, you just find the list of nodes with degree == 1, find their neighbours, and if any of the neighbours are in the list, they're the ones to delete:
library(igraph)
g = erdos.renyi.game(100, 0.02)
ones = V(g)[degree(g) == 1]
one_ns = sapply(ones, neighbors, graph=g)
# If any of the neighbours are in ones, we
# can delete these
to_delete = one_ns[one_ns %in% ones]
# Visualize:
plot(g, mark.groups=to_delete)
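An alternative sketch that actually performs the deletion: since an edge whose two endpoints both have degree 1 is exactly a connected component of size 2, components() can be used to find and drop those vertices. The toy graph is hypothetical.

```r
library(igraph)

# Hypothetical toy graph: a 2-vertex component A-B, a path C-D-E, an isolated F
g <- graph_from_literal(A-B, C-D-E, F)

comp  <- components(g)
small <- which(comp$csize == 2)   # ids of the components with exactly 2 vertices

# Drop every vertex that belongs to one of those components
g2 <- delete_vertices(g, V(g)[comp$membership %in% small])
```

To keep the two vertices and remove only the edge between them, the same membership test could be used to select the edge instead of the vertices.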
