Randomly removing n edges without disconnecting graph - r

I'm using the igraph package in R. I have a connected graph G=(V,E), how can I randomly remove some edges (say, n < |E|) but without disconnecting the given graph. In other words, I mustn't remove any bridges. Any hints on how that could be done?

A simple approach would be to keep randomly selecting and removing sets of n nodes until you found a set that doesn't increase the number of components of the graph:
remove.edges <- function(g, n) {
num.tries <- 0
while (TRUE) {
num.tries <- num.tries + 1
g2 <- delete.edges(g, E(g)[sample(seq_along(E(g)), n)])
if (no.clusters(g2) == no.clusters(g)) {
print(paste("Total number of tries:", num.tries))
return(g2)
}
}
}
Let's try it out with a sample graph:
library(igraph)
set.seed(144)
g <- erdos.renyi.game(10, 0.4)
g2 <- remove.edges(g, 5)
# [1] "Total number of tries: 3"
This could be terribly inefficient for a large, sparse graph coupled with a large n value. In that case, you'll probably want to run something like Tarjan's Bridge-Finding Algorithm at each iteration and limit your random selections to not be bridges. Unfortunately, I can't find an implementation of that algorithm in R, so you'd probably need to do some implementation to get that to work.

A simple technique is to find a cycle in the graph and remove an edge from this cycle. To find a cycle I would do a depth first search until you find a node you have previously seen on the search.
For example, if you are at node x while performing the DFS and you discover a node y in x's neighborhood, then if x also already exists in the DFS tree, you have found a cycle. At this point you can safely remove any edge on this cycle without risk of it being a bridge. This should run pretty quickly if the graph isn't very sparse.
Note that in practice this DFS technique will often just resemble a random walk around the graph until encountering a previously seen node.

Related

Add iteratively one edge in a graph generated by a k regular game maintaining past connections

I have generated an undirected regular graph with an even number of nodes with the same degree, e.g. k, by using the function k.regular.game of the R package igraph.
Now I need to iteratively add one edge to each node, so that in each iteration the degree remains constant for every node and it is equal to k + i, where i is the number of iterations performed.
In addition, I want connections to be preserved in each iteration, that is: the set of neighbors of agent j for iteration i should be the same of the set of neighbors of agent j for iteration i + 1 except for one connection: e.g., if j is connected to w and y when k = 2, j must be connected to w, y and z when k = 3.
My final goal is to obtain (n-1) graphs, where n is equal to the number of nodes in the regular graph. As a result, I will obtain that the first generated graph has k = 1 and the last generated graph has k = (n-1).
Any suggestion on how to do this?
This is a nice network problem solved with two partial solutions below.
Let's imagine there is a function which would bring a graph g from all degrees being 1 to all degrees being 2. It would have to be a graph with an even number of nodes.
increment.k <- function(g){}
It follows that increment.k will increase the degree of each node by one by adding |V|/2 edges to it - one edge for each two nodes in the graph. From what I understand from your problem specification, any of those edges must not connect agin two nodes that are already connected. This makes increment.k() a puzzle in which a random edge between two nodes might close the possibility for all nodes to reach the new k-value of degrees. What if a graph has k=1 and we start adding edges at random only to arrive at the last edge only to find that the only two nodes still with degree 1 are already connected?!
I cannot intuitively grasp if this allows for the possibility of graphs that cannot be incremented since no combination of random edges allows for the creation of |V|/2 edges between previously unconnected nodes. But I can imagine that such graphs exist.
I've done this example on a graph with 20 nodes (which consequently can have a k between 1 and 19):
g <- k.regular.game(no.of.nodes=20, k=1, directed=F)
What if you were to generate random k.regular.games with a higher k until you found a graph where the edges of your graph is a subset of the edges of the higher-k random graph? It should be spectacularly slow.
The problem, of course, is that you don't want to allow for duplicated arches. If not, the solution would be quite simple:
increase.k.allowing.duplicates <- function(graph){
if(length(V(graph))%%2!=0){
stop("k can only be incremented for graphs with an even number of nodes.")
}
# Add random edges to the graph and allow dual edges just to increase k
graph %>% add_edges(as.numeric(sample(1:length(V(graph)), length(V(graph)))))
}
The above code would solve the problem if double arches were allowed. This would return graphs of ever higher k, and would let k go towards infinity since the number of nodes of the graph don't set any maximum average degree of the graph.
I have come up with this Montecarlo approach below. To increase k by one, a given number of edges is added one by one between nodes, but if the loop runs out of alternatives when placing arches between nodes that are 1) not connected and 2) not already incremented to the higher k/degree, the process of creating a new graph with a higher k starts over. The function has a maximum number of tries start over in maximum.tries.
increase.k <- function(graph, maximum.tries=200){
if(length(V(graph))%%2!=0){
stop("k can only be incremented for graphs with an even number of nodes.")
}
k <- mean(degree(graph))
if(k != round(k) ){
stop("Nodes in graph do not have the same degree")
}
if(k >= length(V(graph))-1 ) {
stop("This graph is complete")
}
# each node has the following available arches before starting the iteration:
#posisble.arches <- lapply(neighbors(graph,1), function(x) setdiff(V(graph), x[2:length(x)]))
# Here we must lay the puzzle. If we run into a one-way street with the edges we add, we'll have to start afresh
original.graph <- graph
for(it in 1:maximum.tries){
# We might need many tries to get the puzzle right by brute-forcing
# For each try we increment in a loop to avoid duplicate links
for(e_ij in 1:(length(V(graph))/2)){
# Note that while(mean(degree(graph)) < k + 1){} is a logical posibility, but less safe
# Add a new edge between two nodes of degree k. i is any such node and j is any such node not already connected to i
i <- sample(as.numeric(V(graph)[degree(graph)==k]), 1)
js <- as.numeric(V(graph)[degree(graph) == k * !V(graph) %in% c(as.numeric(neighbors(graph,i)), i)])
# Abandon this try if no node unconnected to i and with degree == k exists
if(length(js)==0){break}
j <- sample(c(js), 1); if(length(js)==1){j<-js}
graph <- graph %>% add_edges(c(i,j))
}
# Did we lay the puzzle to completion successfully crating a random graph with a higher k?
if(mean(degree(graph)) == k+1){
# Success
print(paste("Succeded at iteration ", it))
break
} else {
# Failure, let's try again
graph <- original.graph
print("Failed")
}
}
(graph)
}
# Compare the two approaches
g1 <- increase.k.allowing.duplicates(g)
g2 <- increase.k(g)
degree(g1) == degree(g2)
l <- layout_with_gem(g2)
par(mfrow=c(1,2))
plot(g1, layout=l, vertex.label="")
plot(g2,layout=l, vertex.label="")
dev.off()
# Note that increase.k() can be run incrementally up untill a complete graph:
is.complete <- function(graph){mean(degree(graph)) >= (length(V(graph))-1)}
while(!is.complete(g)){
print(mean(degree(g)))
g <- increase.k(g)
}
# and that increase.k() cannot increase k in already complete graphs.
g <- increase.k(g)
The above code has solved the problem for some graphs. More iterations are needed to lay the puzzle the larger the graph is. In this example with only 20 nodes, each k-level can be generated from 1-19 relatively quickly. I did manage to get 19 separate networks from k=1 to k=19. But I have managed to get stuck in the loop also, which I take as evidence for the existing network structures of which k cannot be successfully incremented. Particularly since the same starting specification can get stuck sometimes, but manage to arrive at a complete graph on other occasions.
To test the function, I set the maximum.tries to 25 and tried to go from k=1 to 19 100 times. It never worked. The higher the k, the more difficult it is to lay the puzzle and find arches that fit, even though the next-to-last iteration is faster before a collapse. The risk of hitting the cap of 25 increased between the 15th and 18th iteration, and most graphs only made it to k=17.
It is possible to imagine this method being performed backwards starting at a complete graph, removing edges within a Montecarlo process which tries to remove edges to achieve a graph with all degrees at k-1. It should run into similar problems, though.
The code above is really an attempt to brute-force this problem without going into the underlying mathematics of graphs of this type. I am not a mathematician and lack the skills, but maybe the creation of a fail-safe k.increment()-function is a real and unsolved mathematical problem. If any graph-theoreticians come by this post, please enlighten us.

Solving Chinese Postman algorithm with eulerization

I'm would like to solve Chinese Postman problem in a graph where an eulerian cycle does not exist. So basically I'm looking for a path in a graph which visits every edge exactly once, and starts and ends at the same node. A graph will have an euler cycle if and only if every node has same number of edges entering into and going out of it. Obviously my graph doesn't .
I found out that Eulerization (making a graph Eulerian) could solve my question LINK. Can anyone suggest a script to add duplicate edges to a graph so that the resulting graph has no vertices of odd degree (and thus does have an Euler Circuit)?
Here is my example:
require(igraph)
require(graph)
require(eulerian)
require(GA)
g1 <- graph(c(1,2, 1,3, 2,4, 2,5, 1,5, 3,5, 4,7, 5,7, 5,8, 3,6, 6,8, 6,9, 9,11, 8,11, 8,10, 8,12, 7,10, 10,12, 11,12), directed = FALSE)
mat <- get.adjacency(g1)
mat <- as.matrix(mat)
rownames(mat) <- LETTERS[1:12]
colnames(mat) <- LETTERS[1:12]
g2 <- as(graphAM(adjMat=mat), "graphNEL")
hasEulerianCycle(g2)
Fun problem.
The graph you sugest in the code above, can be made to have duplicates that enable a eulerian cycle to be created. The function I provide below tries to add the minimum amount of duplicate edges, but also readily breaks the graph structure by adding new links if it has to.
You can run:
eulerian.g1 <- make.eulerian(g1)$graph
Check what the function did to your graph with:
make.eulerian(g1)$info
Bare in mind that:
This is not the only graph structure where duplicates added to the original g1 graph can form an eulerian cycle. Imagine for example my function looping the vertices of the graph backwards instead.
Your graph already has an uneven number of vertices with uneven degree, and all of the vertices that are, have neighbours with uneven degrees to pair them with. This function therefore works well four your particular example data.
The function could fail to produce a graph using only duplicates even in graphs where eulerian cycles are possible with correctly added duplicates. This is since it always goes for connecting a node with the first of its neighbours with uneven degree. If this is something that you'd absolutely like to get around, an MCMC-approach would be the way to go.
See also this excellent answer on probability calculation:
Here's my function in a full script that you can source out-of-the-box:
library(igraph)
# You asked about this graph
g1 <- graph(c(1,2, 1,3, 2,4, 2,5, 1,5, 3,5, 4,7, 5,7, 5,8, 3,6, 6,8, 6,9, 9,11, 8,11, 8,10, 8,12, 7,10, 10,12, 11,12), directed = FALSE)
# Make a CONNECTED random graph with at least n nodes
connected.erdos.renyi.game <- function(n,m){
graph <- erdos.renyi.game(n,m,"gnm",directed=FALSE)
graph <- delete_vertices(graph, (degree(graph) == 0))
}
# This is a random graph
g2 <- connected.erdos.renyi.game(n=12, m=16)
make.eulerian <- function(graph){
# Carl Hierholzer (1873) had explained how eulirian cycles exist for graphs that are
# 1) connected, and 2) contain only vertecies with even degrees. Based on this proof
# the posibility of an eulerian cycle existing in a graph can be tested by testing
# on these two conditions.
#
# This function assumes a connected graph.
# It adds edges to a graph to ensure that all nodes eventuall has an even numbered. It
# tries to maintain the structure of the graph by primarily adding duplicates of already
# existing edges, but can also add "structurally new" edges if the structure of the
# graph does not allow.
# save output
info <- c("broken" = FALSE, "Added" = 0, "Successfull" = TRUE)
# Is a number even
is.even <- function(x){ x %% 2 == 0 }
# Graphs with an even number of verticies with uneven degree will more easily converge
# as eulerian.
# Should we even out the number of unevenly degreed verticies?
search.for.even.neighbor <- !is.even(sum(!is.even(degree(graph))))
# Loop to add edges but never to change nodes that have been set to have even degree
for(i in V(graph)){
set.j <- NULL
#neighbors of i with uneven number of edges are good candidates for new edges
uneven.neighbors <- !is.even(degree(graph, neighbors(graph,i)))
if(!is.even(degree(graph,i))){
# This node needs a new connection. That edge e(i,j) needs an appropriate j:
if(sum(uneven.neighbors) == 0){
# There is no neighbor of i that has uneven degree. We will
# have to break the graph structure and connect nodes that
# were not connected before:
if(sum(!is.even(degree(graph))) > 0){
# Only break the structure if it's absolutely nessecary
# to force the graph into a structure where an euclidian
# cycle exists:
info["Broken"] <- TRUE
# Find candidates for j amongst any unevenly degreed nodes
uneven.candidates <- !is.even(degree(graph, V(graph)))
# Sugest a new edge between i and any node with uneven degree
if(sum(uneven.candidates) != 0){
set.j <- V(graph)[uneven.candidates][[1]]
}else{
# No candidate with uneven degree exists!
# If all edges except the last have even degrees, thith
# function will fail to make the graph eulerian:
info["Successfull"] <- FALSE
}
}
}else{
# A "structurally duplicated" edge may be formed between i one of
# the nodes of uneven degree that is already connected to it.
# Sugest a new edge between i and its first neighbor with uneven degree
set.j <- neighbors(graph, i)[uneven.neighbors][[1]]
}
}else if(search.for.even.neighbor == TRUE & is.null(set.j)){
# This only happens once (probably) in the beginning of the loop of
# treating graphs that have an uneven number of verticies with uneven
# degree. It creates a duplicate between a node and one of its evenly
# degreed neighbors (if possible)
info["Added"] <- info["Added"] + 1
set.j <- neighbors(graph, i)[ !uneven.neighbors ][[1]]
# Never do this again if a j is correctly set
if(!is.null(set.j)){search.for.even.neighbor <- FALSE}
}
# Add that a new edge to alter degrees in the desired direction
# OBS: as.numeric() since set.j might be NULL
if(!is.null(set.j)){
# i may not link to j
if(i != set.j){
graph <- add_edges(graph, edges=c(i, set.j))
info["Added"] <- info["Added"] + 1
}
}
}
# return the graph
(list("graph" = graph, "info" = info))
}
# Look at what we did
eulerian <- make.eulerian(g1)
eulerian$info
g <- eulerian$graph
par(mfrow=c(1,2))
plot(g1)
plot(g)

After clustering in R (iGraph, etc), can you maintain nodes+edges from a cluster to do individual cluster analysis?

Basically I have tried a few different ways of clustering. I can usually get to a point in iGraph where each node is labeled with a cluster. I can then identify all the nodes within a single cluster. However, this loses their edges.
I'd have to re-iterate back over the original dataset for all the nodes in cluster 1 to get only those where both nodes+the edge are within the cluster. I'd have to do this for every cluster.
This seems like a painfully long process and there is probably a shortcut my google-fu is missing.
So, is there an easy way to, after clustering or performing community detection processes, to maintain an individual cluster/community as its own smaller graph -- that is, retaining all nodes AND edges between them?
You can use delete.vertices() to create a subgraph. Example:
library(igraph)
set.seed(123)
# create random graph
g <- barabasi.game(100, directed = F)
plot(g, layout=layout.fruchterman.reingold)
# do community detection
wc <- multilevel.community(g)
V(g)$community <- membership(wc)
# make community 1 subgraph
g_sub <- delete.vertices(g, V(g)[community != 1])
plot(g_sub, layout=layout.fruchterman.reingold)
An alternative:
#Create random network
d <- sample_gnm(n=50,m=40)
#Identify the communities
dc <- cluster_walktrap(d)
#Induce a subgraph out of the first community
dc_1 <- induced.subgraph(d,dc[[1]])
#plot that specific community
plot(dc_1)

How to study the interaction between a set of nodes when the network is quite dense, using igraph?

I have 4 undirected graph with 1000 vertices and 176672, 150994, 193477, 236060 edges. I am trying to see interaction between a specific set of nodes (16 in number) for each graph. This visualization in tkplot is not feasible as 1000 vertices is already way too much for it. I was thinking of if there is some way to extract the interaction of these 16 nodes from the parent graph and view separately, which will be then more easy to handle and work with in tkplot. I don't want the loss of information as in what is the node(s) in he path of interaction if it comes from other than 16 pre-specified nodes. Is there a way to achieve it?
In such a dense graph, if you only take the shortest paths connecting each pair of these 16 vertices, you will still get a graph too large for tkplot, or even to see any meaningful on a cairo pdf plot.
However, if you aim to do it, this is one possible way:
require(igraph)
g <- erdos.renyi.game(n = 1000, p = 0.1)
set <- sample(1:vcount(g), 16)
in.shortest.paths <- NULL
for(v in set){
in.shortest.paths <- c(in.shortest.paths,
unlist(get.all.shortest.paths(g, from = v, to = set)$res))
}
subgraph <- induced.subgraph(g, unique(in.shortest.paths))
In this example, subgraph will include approx. half of all the vertices.
After this, I think you should consider to find some other way than visualization to investigate the relationships between your vertices of interest. It can be some topological metric, but it really depends on the aims of your analysis.

Removal of N Random nodes from the graph in R

I am new to R/igraph. I would like to remove N nodes randomly from a graph. However, I could not find the right way to do that. I have generated the Erdos-Renyi graph with the help of the igraph package with 400 vertices.
igraph provides the deletion of the vertices, but not in the random way.
For example: delete.vertices(graph, v).
I referred to this documentation.
I also searched the web and previous questions on Stack Overflow, but could not get the right answer.
Can anyone please tell or refer me to documentation on how to remove the N (lets say N = 100) random nodes?
Basically you just need to generate a vector of random numbers ranging from 1 to 400:
random.deletes <- runif(n=100, min=1, max=400)
And then apply it:
my.new.graph <- delete.vertices(graph, random.deletes)
Of course, both can be done at once but you'd lose track of the deleted nodes:
my.new.graph <- delete.vertices(graph, runif(n=100, min=1, max=400))

Resources