Can breadth first search traverse a disconnected graph?

I have an exam question:
Consider the undirected graph with vertices the numbers 3..16, and edges given by the following rule: two vertices a and b are linked by an edge whenever a is a factor of b, or b is a factor of a. List the vertices in BFS order starting from 14. (Many different orders are possible with a BFS, you may list any one of these).
I'm considering two answers:
1. Because the graph is not connected, from 14 BFS can only traverse to 7, so the result is 14, 7.
2. List out all the first-level disconnected vertices, then traverse to their child nodes, so: 14, 16, 15, 13, 12, 11, 10, 9, 7, 8, 6, 5, 4, 3
Which one is correct?
Can BFS traverse to disconnected vertices?

Answer 2 doesn't make sense in my opinion, because it requires knowing in advance which nodes are disconnected, and that is exactly what you need the algorithm to find out. So I'd say answer 1 is correct.
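To make this concrete, here is a minimal Python sketch that builds the graph from the exam question and runs a plain BFS from 14. The helper names (build_factor_graph, bfs) are made up for illustration; neighbours are visited in ascending order, which is just one of the valid BFS orders.

```python
from collections import deque

def build_factor_graph(lo, hi):
    """Undirected graph on lo..hi with an edge a-b whenever one divides the other."""
    vertices = range(lo, hi + 1)
    return {
        a: [b for b in vertices if a != b and (a % b == 0 or b % a == 0)]
        for a in vertices
    }

def bfs(graph, start):
    """Return the vertices reachable from start, in BFS visiting order."""
    order = [start]
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                order.append(neighbour)
                queue.append(neighbour)
    return order

graph = build_factor_graph(3, 16)
print(bfs(graph, 14))  # [14, 7]
```

A single BFS only ever reports the component containing the start vertex. Covering a disconnected graph would need an outer loop that restarts BFS from every vertex not yet seen, which is a different procedure from "BFS starting from 14".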

Related

Creating subgraphs with overlapping vertices

I've been looking for packages that can create subgraphs with overlapping vertices.
From what I understand, NetworkX and METIS can partition a graph into two or more parts, but I couldn't find how to partition it into subgraphs with overlapping nodes.
Suggestions for libraries that support partitioning with overlapping vertices would be really helpful.
EDIT: I tried the angel algorithm in CDLIB to partition the original graph into subgraphs with 4 overlapping nodes.
import networkx as nx
from cdlib import algorithms
if __name__ == '__main__':
    g = nx.karate_club_graph()
    coms = algorithms.angel(g, threshold=4, min_community_size=10)
    print(coms.method_name)
    print(coms.method_parameters)  # clustering parameters
    print(coms.communities)
    print(coms.overlap)
    print(coms.node_coverage)
Output:
ANGEL
{'threshold': 4, 'min_community_size': 10}
[[14, 15, 18, 20, 22, 23, 27, 29, 30, 31, 32, 8], [1, 12, 13, 17, 19, 2, 21, 3, 7, 8], [14, 15, 18, 2, 20, 22, 30, 31, 33, 8]]
True
0.6470588235294118
From the communities returned, I understand that communities 1 and 3 overlap in at least 4 nodes, but communities 1 and 2, or 2 and 3, do not.
It is not clear to me how the overlap threshold (4 overlaps) has to be specified here: algorithms.angel(g, threshold=4, min_community_size=10). I tried setting threshold=4 to define an overlap size of 4 nodes. However, the documentation available for angel says:
:param threshold: merging threshold in [0,1].
I am not sure how to translate the 4 overlaps to the value that has to be set between the bounds [0, 1]. Suggestions will be really helpful.
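For reference, the pairwise overlaps in the output above can be checked with plain Python sets. This is only a sketch over the three community lists copied from that output; pairwise_overlaps is a made-up helper name, not part of CDLIB.

```python
from itertools import combinations

# Community lists copied from the ANGEL output shown above.
communities = [
    [14, 15, 18, 20, 22, 23, 27, 29, 30, 31, 32, 8],
    [1, 12, 13, 17, 19, 2, 21, 3, 7, 8],
    [14, 15, 18, 2, 20, 22, 30, 31, 33, 8],
]

def pairwise_overlaps(comms):
    """Map each pair of 1-based community indices to their sorted shared nodes."""
    result = {}
    for (i, a), (j, b) in combinations(enumerate(comms, start=1), 2):
        result[(i, j)] = sorted(set(a) & set(b))
    return result

overlaps = pairwise_overlaps(communities)
for pair, shared in overlaps.items():
    print(pair, len(shared), shared)
```

On these lists the shared-node counts come out as 1 for communities 1 and 2, 8 for communities 1 and 3, and 2 for communities 2 and 3, so none of the pairs overlaps in exactly 4 nodes.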
You can check out CDLIB:
They have a large number of community detection algorithms applicable to NetworkX, including some algorithms for overlapping communities.
On a side note:
The return type of these functions is called NodeClustering, which might be a little confusing at first, so here are the methods applicable to it. Usually you simply want to convert it to a Python dictionary.
Specifically about the angel algorithm in CDLIB:
According to ANGEL: efficient, and effective, node-centric community discovery in static and dynamic networks, the threshold is not the overlapping threshold, but used as follows:
If the ratio is greater than (or equal to) a given threshold, the merge is applied and the node label updated.
Basically, this value determines whether to further merge the nodes into bigger communities, and is not equivalent to the number of overlapping nodes.
Also, don't confuse these "labels" with node labels (as in nx.relabel_nodes(G, labels)). The "labels" referred to here are related to the Label Propagation Algorithm, which is used by ANGEL.
As for the effects of varying this threshold:
[...] Increasing the threshold, we obtain a higher number of communities since lower quality merges cannot take place.
[based on the comment by @J. M. Arnold]
From ANGEL's GitHub repository you can see that when threshold >= 1, only the min_comsize value is used:
self.threshold = threshold
if self.threshold < 1:
    self.min_community_size = max([3, min_comsize, int(1. / (1 - self.threshold))])
else:
    self.min_community_size = min_comsize
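To see what that branch does for concrete values, the quoted logic can be pulled out into a standalone function. This is only a sketch: effective_min_community_size is a hypothetical helper that mirrors the snippet above and is not part of ANGEL or CDLIB.

```python
def effective_min_community_size(threshold, min_comsize):
    """Mirror the branch quoted above (hypothetical standalone helper)."""
    if threshold < 1:
        # Below 1, the minimum size is raised to at least 3 and at least 1 / (1 - threshold).
        return max([3, min_comsize, int(1.0 / (1 - threshold))])
    # At or above 1, min_comsize is used unchanged.
    return min_comsize

# The call from the question: threshold=4 falls into the else branch.
print(effective_min_community_size(4, 10))
# A threshold inside [0, 1): 1 / (1 - 0.75) = 4 exceeds both 3 and min_comsize=3.
print(effective_min_community_size(0.75, 3))
```

Under this reading, threshold=4 never acts as "4 overlapping nodes"; it only disables the size adjustment, and the minimum community size stays at the min_comsize passed in.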

How to plan the most efficient route for patio lights Part 2

This is a continuation of some questions I posed earlier, How to plan the most efficient route for patio lights and Christmas Light Route Efficiency (CS), about my attempt to cover a screened-in structure with patio lights as efficiently as possible.
Here are the rules:
Minimize light overlapping
Each string of lights is 234" long (this is important because I can't start a new branch of lights unless it's at the end of another branch).
Think of these as Christmas lights, you have a male and a female side:
start (male)                  end (female)
=[}~~~o~~~o~~~o~~~o~~~o~~~o~~~o~~~{=]
<- to outlet         to other lights ->
So multiple strands can daisy chain as long as there's a female for the male to plug into, like this:
A female plug must supply power to the next strand of lights via a male plug, a male plug can't give power to another male plug.
Here is a diagram of my structure:
Pink Circle = Place to hang lights (No, there is not a place to hang lights at the intersection of 10, 11 & 12 - that is not a mistake).
"Start" = The only available electrical outlet.
Yellow Dots = Parts of the structure I want to run the lights along.
Based on my previous questions, I began looking into "Route Efficiency Problem" algorithms. I used this post, Solving Chinese Postman algorithm with eulerization, to get started, which led me to this code (with thanks to @DamianoFantini for his help in my previous post to set the graph up correctly):
library(igraph)
# Edge list: row k connects the k-th entry of the first vector to the k-th entry of the second
gg <- graph_from_edgelist(cbind(c(1:4, 6, 8, 10, 12, 14, 16:19, 1, 6, 8, 21, 12, 14, 5, 7, 9, 11, 13, 15),
                                c(2:5, 7, 9, 11, 13, 15, 17:20, 6, 8, 10, 12, 14, 16, 7, 9, 11, 13, 15, 20)))
# Layout: one (x, y) coordinate pair per vertex, in vertex order
ll=matrix(
  c( 0,0, 75.25,0, 150.5,0, 225.8125,0, 302.8125,0,
     0,-87, 302.8125,-87,
     0,-173.8125, 302.8125,-173.8125,
     0,-260.9375, 302.8125,-260.9375,
     16,-384.3125, 302.8125,-384.3125,
     16,-435.9575, 302.8125,-435.9375,
     16,-525.1875, 75.25,-525.1875, 150.5,-525.1875, 225.8125,-525.1875, 302.8175,-525.1875, 16, -260.9375),
  ncol=2,byrow=TRUE)
# SOURCE: https://stackoverflow.com/q/40576910/1152809
make.eulerian <- function(graph){
  # Carl Hierholzer (1873) explained that Eulerian cycles exist for graphs that are
  # 1) connected, and 2) contain only vertices with even degrees. Based on this proof,
  # the possibility of an Eulerian cycle existing in a graph can be tested by checking
  # these two conditions.
  #
  # This function assumes a connected graph.
  # It adds edges to a graph to ensure that all nodes eventually have an even degree. It
  # tries to maintain the structure of the graph by primarily adding duplicates of already
  # existing edges, but can also add "structurally new" edges if the structure of the
  # graph does not allow.
  # save output (the key is spelled "Broken" so it matches the assignment further down)
  info <- c("Broken" = FALSE, "Added" = 0, "Successfull" = TRUE)
  # Is a number even
  is.even <- function(x){ x %% 2 == 0 }
  # Graphs with an even number of vertices with uneven degree will more easily converge
  # as Eulerian.
  # Should we even out the number of unevenly degreed vertices?
  search.for.even.neighbor <- !is.even(sum(!is.even(degree(graph))))
  # Loop to add edges but never to change nodes that have been set to have even degree
  for(i in V(graph)){
    set.j <- NULL
    # neighbors of i with uneven number of edges are good candidates for new edges
    uneven.neighbors <- !is.even(degree(graph, neighbors(graph,i)))
    if(!is.even(degree(graph,i))){
      # This node needs a new connection. That edge e(i,j) needs an appropriate j:
      if(sum(uneven.neighbors) == 0){
        # There is no neighbor of i that has uneven degree. We will
        # have to break the graph structure and connect nodes that
        # were not connected before:
        if(sum(!is.even(degree(graph))) > 0){
          # Only break the structure if it's absolutely necessary
          # to force the graph into a structure where an Eulerian
          # cycle exists:
          info["Broken"] <- TRUE
          # Find candidates for j amongst any unevenly degreed nodes
          uneven.candidates <- !is.even(degree(graph, V(graph)))
          # Suggest a new edge between i and any node with uneven degree
          if(sum(uneven.candidates) != 0){
            set.j <- V(graph)[uneven.candidates][[1]]
          }else{
            # No candidate with uneven degree exists!
            # If all edges except the last have even degrees, this
            # function will fail to make the graph Eulerian:
            info["Successfull"] <- FALSE
          }
        }
      }else{
        # A "structurally duplicated" edge may be formed between i and one of
        # the nodes of uneven degree that is already connected to it.
        # Suggest a new edge between i and its first neighbor with uneven degree
        set.j <- neighbors(graph, i)[uneven.neighbors][[1]]
      }
    }else if(search.for.even.neighbor == TRUE & is.null(set.j)){
      # This only happens once (probably) in the beginning of the loop of
      # treating graphs that have an uneven number of vertices with uneven
      # degree. It creates a duplicate between a node and one of its evenly
      # degreed neighbors (if possible)
      info["Added"] <- info["Added"] + 1
      set.j <- neighbors(graph, i)[ !uneven.neighbors ][[1]]
      # Never do this again if a j is correctly set
      if(!is.null(set.j)){search.for.even.neighbor <- FALSE}
    }
    # Add the new edge to alter degrees in the desired direction
    # OBS: as.numeric() since set.j might be NULL
    if(!is.null(set.j)){
      # i may not link to j
      if(i != set.j){
        graph <- add_edges(graph, edges=c(i, set.j))
        info["Added"] <- info["Added"] + 1
      }
    }
  }
  # return the graph
  (list("graph" = graph, "info" = info))
}
# Look at what we did
eulerian <- make.eulerian(gg)
g <- eulerian$graph
par(mfrow=c(1,2))
plot(gg)
plot(g)
Here's the result of the code:
Which, I think translates to this (but I am a graph/algorithm noob, so correct me if I'm wrong):
Obviously, there are some issues here:
I have no idea where the end/beginning of each strand of lights should be (and neither does the algorithm I think)
Node 1 is supplying power independently. This will not work in reality. All power must come from the "Start" position.
The distances and structure do not seem to be accounted for.
Is there a way to add these constraints into the algorithm? Is there another algorithm I could use that would make this easier?
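As a sanity check on what eulerization has to fix, the even-degree condition mentioned in the comments of make.eulerian can be tested directly on the edge list used for gg. The sketch below is plain Python rather than R and uses a made-up helper name (odd_degree_vertices); it only counts degrees and does not model strand length or the plug direction.

```python
from collections import Counter

# Edge list copied from the graph_from_edgelist call above (first and second column).
tails = [1, 2, 3, 4, 6, 8, 10, 12, 14, 16, 17, 18, 19, 1, 6, 8, 21, 12, 14, 5, 7, 9, 11, 13, 15]
heads = [2, 3, 4, 5, 7, 9, 11, 13, 15, 17, 18, 19, 20, 6, 8, 10, 12, 14, 16, 7, 9, 11, 13, 15, 20]
edges = list(zip(tails, heads))

def odd_degree_vertices(edge_list):
    """Return the sorted vertices that touch an odd number of edges."""
    degree = Counter()
    for a, b in edge_list:
        degree[a] += 1
        degree[b] += 1
    return sorted(v for v, d in degree.items() if d % 2 == 1)

odd = odd_degree_vertices(edges)
print(odd)            # vertices that block an Eulerian circuit
print(len(odd) // 2)  # lower bound on the number of edges that must be duplicated
```

Each added edge changes the parity of exactly two vertices, so with the ten odd-degree vertices in this graph at least five stretches would have to be covered twice for a closed route, whatever algorithm picks them.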
https://en.wikipedia.org/wiki/Dijkstra%27s_algorithm
You can implement Dijkstra's algorithm using different edge data as the metric for your path, such as light overlap or, for example, the total illuminance of the lights along each edge. I assume you might need a higher density of light in deep corners...
So the goal can be the widest area of low light, the perceived visibility of obstacles, or a path that creates homogeneous ambient light. Regardless of how it is tuned, though, I believe Dijkstra's algorithm is a pretty standard go-to for finding these things.
Update:
In the case of creating the widest covered area of light you would want a spanning tree rather than an optimal path algorithm. This might be more of what you have in mind:
https://en.wikipedia.org/wiki/Prim%27s_algorithm
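A minimal sketch of Prim's algorithm in Python, to show what the spanning-tree suggestion would compute. The four-node graph and its edge lengths are hypothetical and are not taken from the patio diagram; prim_mst is a made-up name.

```python
import heapq

def prim_mst(adj, start):
    """Return the minimum spanning tree edges as (from, to, weight), grown from start."""
    visited = {start}
    heap = [(w, start, v) for v, w in adj[start].items()]
    heapq.heapify(heap)
    tree = []
    while heap and len(visited) < len(adj):
        w, u, v = heapq.heappop(heap)
        if v in visited:
            continue  # this edge would close a cycle
        visited.add(v)
        tree.append((u, v, w))
        for nxt, w2 in adj[v].items():
            if nxt not in visited:
                heapq.heappush(heap, (w2, v, nxt))
    return tree

# Hypothetical hanging points with made-up distances in inches.
adj = {
    "Start": {"A": 87, "B": 75},
    "A": {"Start": 87, "B": 50, "C": 90},
    "B": {"Start": 75, "A": 50, "C": 120},
    "C": {"A": 90, "B": 120},
}
tree = prim_mst(adj, "Start")
print(tree)
print(sum(w for _, _, w in tree))
```

Note that a spanning tree reaches every hanging point with the least total length, but it does not cover every edge, so it only fits if lighting every yellow-dotted segment is not a hard requirement.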

How to plan the most efficient route for patio lights

I'm trying to string up some patio lights. Based on another question I asked, I realize I need an algorithm to solve a Route Inspection Problem to figure out the most efficient route the lights should take so there's minimal duplicate edges covered with lights. After some searching I realized that perhaps something like this would be my best bet: Solving Chinese Postman algorithm with eulerization.
However, I'm having trouble creating the graph.
Here's what it needs to look like:
pink circles represent places in the structure I can hang lights from
"Start" is the only available electrical outlet
The yellow dots represent all the places lights should cover
And here's what my graph looks like after referencing this post: Visualizing distance between nodes according to weights - with R:
As you can see, all the nodes are in the correct place, but the edges are connecting where they shouldn't connect. Here's my code:
library(igraph)
gg<-graph.ring(20)
ll=matrix(
c( 0,0, 75.25,0, 150.5,0, 225.8125,0, 302.8125,0,
0,-87, 302.8125,-87,
0,-173.8125, 302.8125,-173.8125,
0,-260.9375, 302.8125,-260.9375,
16,-384.3125, 302.8125,-384.3125,
16,-435.9575, 302.8125,-435.9375,
16,-525.1875, 75.25,-525.1875, 150.5,-525.1875, 225.8125,-525.1875, 302.8175,-525.1875),
ncol=2,byrow=TRUE)
plot(gg,layout=ll)
I think this has something to do with the nature of graph.ring, but I am unable to figure out another way to define the graphs' edges' lengths without error.
I think you can use graph_from_edgelist for a precise specification of which nodes to connect. It is sufficient to specify which nodes to connect in which order. Nice question btw!
gg <- graph_from_edgelist(cbind(c(1:4, 6, 8, 10, 12, 14, 16:19, 1, 6, 8, 21, 12, 14, 5, 7, 9, 11, 13, 15),
c(2:5, 7, 9, 11, 13, 15, 17:20, 6, 8, 10, 12, 14, 16, 7, 9, 11, 13, 15, 20)))
ll=matrix(
c( 0,0, 75.25,0, 150.5,0, 225.8125,0, 302.8125,0,
0,-87, 302.8125,-87,
0,-173.8125, 302.8125,-173.8125,
0,-260.9375, 302.8125,-260.9375,
16,-384.3125, 302.8125,-384.3125,
16,-435.9575, 302.8125,-435.9375,
16,-525.1875, 75.25,-525.1875, 150.5,-525.1875, 225.8125,-525.1875, 302.8175,-525.1875, 16, -260.9375),
ncol=2,byrow=TRUE)
plot(gg,layout=ll, edge.arrow.size = 0, vertex.size = c(rep(18, 20), 0),
edge.color="orange")
I added a node (n 21) to allow a branching that is similar to your scheme. Does this look more or less as it should?
I had a look at the previous post on Stack Overflow (the one you suggested) to try making this an Euler cycle. Actually, the custom function does work out of the box, but you may want to double check if you can use the resulting solution or not. Maybe, you could try defining a better connection design before "eulerizing" the circuit. This is what I got.
# load custom f(x) as in
# https://stackoverflow.com/questions/40576910/solving-chinese-postman-algorithm-with-eulerization/40596816#40596816
eulerian <- make.eulerian(gg)
eulerian$info
g <- eulerian$graph
# set the layout as before to keep the circuit formatted according to your specs
par(mfrow=c(1,2))
plot(gg,layout=ll, edge.arrow.size = 0, vertex.size = c(rep(18, 20), 0),
edge.color="orange", main = "Proposed")
plot(g,layout=ll, edge.arrow.size = 0, vertex.size = c(rep(18, 20), 0),
edge.color="orange", main = "Eulerized")

Julia LightGraphs weakly_connected_components

Isn't it true that weakly_connected_components in Julia's LightGraphs should return components such that, if the DiGraph is turned into an undirected graph, each component is connected?
I have tried this and I do not get such components. As an example, I have tried this on the political blogs data as an undirected network:
data=readdlm(path,',',Int64) #contains edges in each row
N_ = length(unique(vcat(data[:,1],data[:,2]))) ##to get number of vertices
network = LightGraphs.DiGraph(N_)
#construct the network
for i in 1:size(data,1)
    add_edge!(network, Edge(data[i,1], data[i,2]))
end
#largest weakly connected component
net = weakly_connected_components(network)[1]
temp_net,vmap = induced_subgraph(network, net)
and after getting the largest weakly connected component, I see the following:
isempty([i for i in vertices(temp_net) if isempty(edges(temp_net).adj[i])])
julia>false
which signifies that some nodes have no incoming or outgoing edges.
What can be the problem? I am using the latest release 6, but the LightGraphs package tests seem to be working.
In addition to what @dan-getz said, I must implore you not to access any internal data fields of structures - we have accessors for everything that's "public". Specifically, edges(temp_net).adj is not guaranteed to be available. It's currently the same as fadj(g), the forward adjacency list of g, for both directed and undirected graphs, but it's not intended to be used except to help keep edge iteration state.
If you use .adj, your code will break on you without warning at some point.
The TL;DR answer is that edges(temp_net).adj[i] contains only the vertices i connects to, and not those connecting to i. Some vertices have no outgoing edges, only incoming ones, so their list is empty.
The longer version is the following, which shows that temp_net, built from a randomly generated network and assigned as in the question, is indeed weakly connected. First, build a random network with:
julia> using LightGraphs
julia> N_ = 1000 ;
julia> network = LightGraphs.DiGraph(N_)
{1000, 0} directed simple Int64 graph
julia> using Distributions
julia> for i in 1:N_
           add_edge!(network, sample(1:N_,2)...)
       end
julia> net = weakly_connected_components(network)[1]
julia> temp_net,vmap = induced_subgraph(network, net)
({814, 978} directed simple Int64 graph, [1, 3, 4, 5, 6, 9, 10, 11, 12, 13 … 989, 990, 991, 993, 995, 996, 997, 998, 999, 1000])
And, now, we have:
julia> length(vertices(temp_net))
814
julia> invertices = union((edges(temp_net).adj[i] for i in vertices(temp_net))...);
julia> outvertices = [i for i in vertices(temp_net) if !isempty(edges(temp_net).adj[i])] ;
julia> length(union(invertices,outvertices))
814
Thus all 814 vertices either have edges from, or to them (or both) and are part of the weakly connected component.
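The same effect can be reproduced on a tiny example without LightGraphs. The Python sketch below uses a hypothetical three-vertex digraph and made-up helper names; it is not the LightGraphs implementation, only an illustration of why an empty forward adjacency list does not contradict weak connectivity.

```python
from collections import deque

def forward_adjacency(n, edges):
    """Forward adjacency list: for each vertex, the vertices it points to."""
    fadj = {v: [] for v in range(1, n + 1)}
    for src, dst in edges:
        fadj[src].append(dst)
    return fadj

def weakly_connected_components(n, edges):
    """Components of the graph obtained by ignoring edge direction."""
    undirected = {v: set() for v in range(1, n + 1)}
    for src, dst in edges:
        undirected[src].add(dst)
        undirected[dst].add(src)
    seen, components = set(), []
    for start in range(1, n + 1):
        if start in seen:
            continue
        seen.add(start)
        component, queue = [], deque([start])
        while queue:
            v = queue.popleft()
            component.append(v)
            for w in sorted(undirected[v]):
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        components.append(sorted(component))
    return components

edges = [(1, 2), (3, 2)]  # hypothetical toy digraph: 1 -> 2 <- 3
fadj = forward_adjacency(3, edges)
no_outgoing = [v for v in fadj if not fadj[v]]
components = weakly_connected_components(3, edges)
print(no_outgoing)   # [2]
print(components)    # [[1, 2, 3]]
```

Vertex 2 has an empty forward list, yet all three vertices sit in one weakly connected component, which is the situation the check in the question runs into.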

Node labels are not in sequence in trees using fancyRpartPlot

I applied the decision tree algorithm to the well-known breast cancer data set from UCI after removing 16 records with missing values. The tree I obtained is given below. As can be seen, the small numbers appearing on top of the nodes are not in order. The numbers are 1, 2, 3, 4, 5, 6, 7, 12, 13, 14, 28 and 20. Are the missing nodes deleted due to pruning? Can I enable an option to see them? Can I renumber the nodes in the order in which they appear, i.e. 1 to 13?
