Which is the correct option for this question? - graph

A regional airline serves 10 cities. It operates to-and-fro return flights between 9 pairs of cities in such a way that every city is reachable from every other city through a sequence of connecting flights.
We know the fuel consumption for the direct flights between each pair of cities. We want to compute the minimum fuel consumption from a given city to every other city in the airline network. Which of the following is true for this specific situation?
We can use BFS, DFS, or Dijkstra's algorithm to compute this.
We can use BFS or Dijkstra's algorithm to compute this, but not DFS.
We can use DFS or Dijkstra's algorithm to compute this, but not BFS.
We can only use Dijkstra's algorithm to compute this, not BFS or DFS.
I am unable to get any idea of how to solve this question.
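Worth noting for the reasoning: 9 to-and-fro routes connecting 10 cities, with every city reachable from every other, means the network is a tree, so the route between any two cities is unique. A plain BFS (or DFS) that accumulates fuel along the traversal therefore already yields the minimum costs. A minimal sketch, using a hypothetical 4-city tree:

```python
from collections import deque

def fuel_from(source, adj):
    """Total fuel from `source` to every other city in a tree, via BFS.
    `adj` maps city -> list of (neighbor, fuel) pairs."""
    cost = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v, fuel in adj[u]:
            if v not in cost:   # in a tree, each city is reached exactly once
                cost[v] = cost[u] + fuel
                queue.append(v)
    return cost

# Hypothetical 4-city tree: A-B costs 5, A-C costs 3, C-D costs 2.
adj = {
    "A": [("B", 5), ("C", 3)],
    "B": [("A", 5)],
    "C": [("A", 3), ("D", 2)],
    "D": [("C", 2)],
}
print(fuel_from("A", adj))  # {'A': 0, 'B': 5, 'C': 3, 'D': 5}
```

Since the path to each city is unique, no relaxation ever improves a cost, which is why Dijkstra's algorithm degenerates to a plain traversal here.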

Related

How to find the strongly connected components of a graph by applying the second DFS on the reverse graph in increasing order of post values

We have seen that the algorithm for finding the strongly connected components of a directed graph G = (V, E) works as follows. In the first step, compute DFS on the reverse graph G^R and compute post numbers; then run the undirected connected component algorithm on G, and during DFS, process the vertices in decreasing order of their post numbers from step 1. Now professor Smart Joe claims that the algorithm for strongly connected components would be simpler if it ran the undirected connected component algorithm on G^R (instead of G), but during DFS processed the vertices in increasing order of their post numbers from step 1.
(a) Explain when the algorithm proposed by prof. Smart Joe might produce an incorrect answer.
(b) With an example (along with post numbers), show that professor Smart Joe is wrong.
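For part (b) it helps to have the correct two-pass algorithm at hand for comparison. A minimal sketch of the version described above (DFS on G^R to assign post numbers, then exploring G in decreasing post order), using a small hypothetical graph:

```python
from collections import defaultdict

def kosaraju_scc(vertices, edges):
    """SCCs via the two-pass algorithm: pass 1 runs DFS on the reverse
    graph G^R and records post order; pass 2 explores G, seeding new
    searches in decreasing order of those post numbers."""
    graph, rev = defaultdict(list), defaultdict(list)
    for u, v in edges:
        graph[u].append(v)
        rev[v].append(u)

    post, seen = [], set()
    for s in vertices:                 # pass 1: iterative DFS on G^R
        if s in seen:
            continue
        seen.add(s)
        stack = [(s, iter(rev[s]))]
        while stack:
            node, it = stack[-1]
            for w in it:
                if w not in seen:
                    seen.add(w)
                    stack.append((w, iter(rev[w])))
                    break
            else:                      # neighbors exhausted: assign post number
                post.append(node)
                stack.pop()

    sccs, assigned = [], set()
    for s in reversed(post):           # pass 2: decreasing post number, on G
        if s in assigned:
            continue
        comp, stack = [], [s]
        assigned.add(s)
        while stack:
            node = stack.pop()
            comp.append(node)
            for w in graph[node]:
                if w not in assigned:
                    assigned.add(w)
                    stack.append(w)
        sccs.append(sorted(comp))
    return sccs

# Two SCCs, {1, 2} and {3, 4}, joined by a one-way bridge 2 -> 3.
print(kosaraju_scc([1, 2, 3, 4], [(1, 2), (2, 1), (2, 3), (3, 4), (4, 3)]))
```

The decreasing post order guarantees each seed vertex lies in a sink SCC of the not-yet-assigned part of G, so the search cannot leak into another component; this is exactly the property that breaks under prof. Smart Joe's increasing-order variant.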

MST on Complete graph to cluster them (for cosine similarity)

I need to cluster words (stored in an ArrayList) into k clusters (k given as a parameter) according to their cosine similarity. I have stored all my words as vertices in a complete, weighted, undirected graph (backed by an adjacency list), with their cosine similarity values on the edges. As I understand it, I need to use an MST (Kruskal's algorithm) for the clustering process.
However, since my graph is a complete graph and MST is used for connected graphs, I am confused about how to use it on a complete graph. Or am I doing it wrong by using a complete graph?
This is my wordList:
[directors, producers, film, movie, black, white, man, woman, person, man, young, woman, science, fiction, thrilling, realistic, lovely, stunning, criminals, zombies, father, son, girlfriend, boyfriend, nurse, soldier, professor, college]
And I need to cluster them by MST so that if k (number of clusters) is 2 it will be like this (2 clusters according to their similarities):
boyfriend,college,father,girlfriend,man,nurse,person,professor,son,woman,young
criminals,directors,fiction,film,lovely,movie,producers,science,stunning,thrilling,zombies
It's standard to use minimum spanning trees on complete graphs.
You will often find the runtime complexity given separately for this case. On a complete graph, an array-based Prim's runs in O(V^2), which beats Kruskal's O(E log E) since E = V(V-1)/2, so Prim's is usually the better choice here.
Clustering with the minimum spanning tree is also known as single-link clustering, and the fast SLINK algorithm is closely related to Prim's MST algorithm, but its output format is more suitable for clustering.
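Concretely, single-link clustering into k groups can be done by running Kruskal's algorithm and stopping once only k components remain, which is equivalent to building the full MST and cutting its k-1 heaviest edges. A minimal sketch (the 4-point distances below are made up for illustration; in your case the edge weight would be 1 - cosine similarity, so that similar words are close):

```python
def single_link_clusters(n, edges, k):
    """k clusters via Kruskal stopped early: adding MST edges in
    ascending weight order until only k components remain equals
    building the full MST and cutting its k-1 heaviest edges.
    `edges` is a list of (weight, u, v) with vertices 0..n-1."""
    parent = list(range(n))

    def find(x):                      # union-find with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    components = n
    for w, u, v in sorted(edges):     # ascending weight, as in Kruskal
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            components -= 1
            if components == k:
                break

    clusters = {}
    for v in range(n):
        clusters.setdefault(find(v), []).append(v)
    return sorted(clusters.values())

# Hypothetical distances (1 - cosine similarity) between 4 words:
# 0 and 1 are close, 2 and 3 are close, cross-pairs are far apart.
edges = [(0.1, 0, 1), (0.2, 2, 3), (0.8, 0, 3),
         (0.85, 1, 3), (0.9, 1, 2), (0.95, 0, 2)]
print(single_link_clusters(4, edges, 2))  # [[0, 1], [2, 3]]
```

Completeness is not a problem: MST algorithms only require connectivity, and a complete graph is trivially connected.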

Find the highest scoring pair of nodes in a graph

I'm trying to solve an optimization problem where I want to find the combination of two nodes with the highest impact/importance in a graph. Let's say I want to base this on betweenness centrality (BC). I guess the more sensible approach is to select one node (maybe one with a high BC), then calculate the BC for the resulting network and remove the node with the highest BC value. My goal is to generate a list of the highest-scoring combinations of nodes when removed from the original graph. I've implemented a simplified method that picks out random nodes; if the score is higher than the previous one, one of the two nodes is reused in the next combination. I'm not sure if this approach is good enough or if the code will get stuck at local-optimum combinations.
Any pointers to steer me in the right direction would be appreciated.
Unless there are properties of the graph and/or function that you can exploit, you have to check all pairs to be sure that the maximum is found.
Several approximate betweenness centrality calculation algorithms have been proposed.
Your general method is good, and it is somewhat similar to the one used in "Fast approximation of betweenness centrality through sampling" (Riondato & Kornaropoulos, 2015).
Quoting:
"Since exact computation in large networks is prohibitively expensive, we present two efficient randomized algorithms for betweenness estimation. The algorithms are based on random sampling of shortest paths and offer probabilistic guarantees on the quality of the approximation. [...] The first algorithm estimates the betweenness of all vertices (or edges): all approximate values are within an additive factor ε ∈ (0, 1) from the real values, with probability at least 1 − δ. The second algorithm focuses on the top-K vertices (or edges) with highest betweenness and estimates their betweenness values to within a multiplicative factor ε, with probability at least 1 − δ. This is the first algorithm that can compute such approximation for the top-K vertices (or edges)."
The time complexity for both algorithms is O(r*(|E|+|V| log |V|)), where r is the sample size (which determines the accuracy).
Their second algorithm is quite relevant for your use case (K=2):
"This is the first algorithm that can compute such approximation for the top-K vertices (or edges)."
First, calculate the betweenness centrality value for all nodes and sort them in descending order. Select the node with the highest BC value and remove it from the network. Recompute BC for the remaining network and repeat the process. This will enable you to pick out the nodes with the highest BC in the network.
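As noted above, without exploitable structure you have to evaluate all pairs to be sure of the optimum. A minimal sketch of that exhaustive search, using a made-up impact function (the number of fragments a removal creates in a small path graph) as a cheap stand-in for a betweenness-based score:

```python
from itertools import combinations

def best_pair(nodes, impact):
    """Exhaustive search over all node pairs: the only way to be certain
    of the optimum unless the impact function has exploitable structure.
    `impact(u, v)` scores removing the pair (u, v); higher is better."""
    return max(combinations(nodes, 2), key=lambda p: impact(*p))

# Toy stand-in for a real impact measure (e.g. a betweenness-based one):
# count the fragments left after removing two nodes from the path 0-1-2-3-4.
adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}

def components_after_removal(u, v):
    remaining = set(adj) - {u, v}
    seen, comps = set(), 0
    for s in remaining:                # count connected components via DFS
        if s not in seen:
            comps += 1
            stack = [s]
            seen.add(s)
            while stack:
                x = stack.pop()
                for y in adj[x]:
                    if y in remaining and y not in seen:
                        seen.add(y)
                        stack.append(y)
    return comps                       # more fragments = bigger impact

print(best_pair(list(adj), components_after_removal))  # (1, 3)
```

For n nodes this is O(n^2) impact evaluations, which is why the sampling-based approximations quoted above matter when the impact function itself is as expensive as betweenness centrality.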

Graph Shortest Paths w/Dynamic Weights (Repeated Dijkstra? Distance Vector Routing Algorithm?) in R / Python / Matlab

I have a graph of a road network with avg. traffic speed measures that change throughout the day. Nodes are locations on a road, and edges connect different locations on the same road or intersections between 2 roads. I need an algorithm that solves the shortest travel time path between any two nodes given a start time.
Clearly, the graph has dynamic weights, as the travel time for an edge i is a function of the speed of traffic at this edge, which depends on how long your path takes to reach edge i.
I have implemented Dijkstra's algorithm with
edge weights = (edge_distance / edge_speed_at_start_time)
but this ignores that edge speed changes over time.
My questions are:
Is there a heuristic way to use repeated calls to Dijkstra's algorithm to approximate the true solution?
I believe the 'Distance Vector Routing Algorithm' is the proper way to solve such a problem. Is there a way to use the Igraph library or another library in R, Python, or Matlab to implement this algorithm?
EDIT
I am currently using Igraph in R. The graph is an igraph object. The igraph object was created using the igraph command graph.data.frame(Edges), where Edges looks like this (but with many more rows):
I also have a matrix of the speed (in MPH) of every edge for each time, which looks like this (except with many more rows and columns):
Since I want to find shortest travel time paths, then the weights for a given edge are edge_distance / edge_speed. But edge_speed changes depending on time (i.e. how long you've already driven on this path).
The graph has 7048 nodes and 7572 edges (so it's pretty sparse).
There exists an exact algorithm that solves this problem! It is called time-dependent Dijkstra (TDD) and runs about as fast as Dijkstra itself.
Unfortunately, as far as I know, neither igraph nor NetworkX have implemented this algorithm so you will have to do some coding yourself.
Luckily, you can implement it yourself! You only need to adapt Dijkstra in a single place.
In normal Dijkstra you compute the tentative distance as follows, with dist your current distance array, u the node you are considering, and v its neighbor:
alt = dist[u] + travel_time(u, v)
In time-dependent Dijkstra this becomes:
current_time = start_time + dist[u]
cost = weight(u, v, current_time)
alt = dist[u] + cost
TDD Dijkstra was described in: Stuart E. Dreyfus, "An appraisal of some shortest-path algorithms", Operations Research, 17(3):395–412, 1969.
Currently, much faster heuristics are already in use. They can be found with the search term: 'Time dependent routing'.
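The single-place adaptation described above can be sketched as follows, assuming FIFO edges (departing later never means arriving earlier) and a hypothetical time-dependent weight function:

```python
import heapq

def time_dependent_dijkstra(adj, weight, source, start_time):
    """Dijkstra where the edge cost depends on the arrival time at u.
    `adj` maps node -> iterable of neighbors; `weight(u, v, t)` gives
    the travel time of edge (u, v) when entered at time t."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        for v in adj[u]:
            current_time = start_time + d          # when we reach u
            alt = d + weight(u, v, current_time)   # cost evaluated at that time
            if alt < dist.get(v, float("inf")):
                dist[v] = alt
                heapq.heappush(heap, (alt, v))
    return dist

# Hypothetical network: A->B direct, or A->C->B; the C->B edge becomes
# slow during "rush hour" (time >= 2), so the detour stops paying off.
adj = {"A": ["B", "C"], "B": [], "C": ["B"]}
def weight(u, v, t):
    if (u, v) == ("C", "B") and t >= 2:
        return 10.0        # congested
    return {("A", "B"): 5.0, ("A", "C"): 2.0, ("C", "B"): 1.0}[(u, v)]

print(time_dependent_dijkstra(adj, weight, "A", 0))
# {'A': 0.0, 'B': 5.0, 'C': 2.0} -- the direct A->B wins at rush hour
```

With static weights the detour A->C->B would cost 3; because we reach C at time 2 the congested weight applies and the direct flight is correctly preferred.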
What about the igraph package in R? You can try the get.shortest.paths or get.all.shortest.paths functions.
library(igraph)
?get.all.shortest.paths
get.shortest.paths()
get.all.shortest.paths()# if weights are NULL then it will use Dijkstra.

JUNG graphs: vertex similarity?

I have a JUNG graph containing about 10K vertices and 100K edges, and I'd like to get a measure of similarity between any pair of vertices.
The vertices represent concepts (e.g. dog, house, etc), and the links represent relations between concepts (e.g. related, is_a, is_part_of, etc).
The vertices are densely inter-linked, so a shortest-path approach doesn't give good results (the shortest paths are always very short).
What approaches would you recommend to rank the connectivity between vertices?
JUNG has some algorithms to score the importance of vertices, but I can't tell whether it offers measures of similarity between two vertices.
SimPack seems also promising.
Any hints?
The centrality scores don't measure similarity of pairs of vertices, but some kind of (depending on the method) centrality of single nodes of the network in general. Therefore this approach is possibly not what you want.
SimPack indeed has a nice goal set out, but for graphs it implements isomorphism-based comparisons, which compare multiple graphs for similarity rather than pairs of nodes within one given graph. Therefore this is out of scope for now.
What you are seeking are so-called graph clustering methods (also called network module determination or network community determination methods), which divide the graph (network) into multiple partitions so that the nodes in each partition are more strongly interconnected with each other than with nodes of other partitions.
The most classic method is maybe the betweenness centrality clustering of Newman & Girvan, where you can exploit the dendrogram for similarity calculation, and it is in JUNG. Of course there are throngs of methods nowadays. You may want to try (shameless plug) our ModuLand method, or read the table of module detection algorithms at the end of its Electronic Supplementary Material. ModuLand is an overlapping graph clustering method family: its result for each node is a vector containing the strengths of that node's membership in each cluster of the network. Pairwise node similarity is then easy to derive from pairs of these node-to-cluster vectors.
Graph clustering is non-trivial, and you may well need to adapt a method to get precise domain-specific results, but that's up to the reader ;) Good luck!
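To make the last step concrete: given each node's vector of cluster-membership strengths (as produced by an overlapping clustering method), pairwise node similarity can be taken as, for example, the cosine of those vectors. A minimal sketch with made-up membership vectors:

```python
import math

def cosine(a, b):
    """Cosine similarity of two membership-strength vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Hypothetical node-to-cluster strength vectors over 3 clusters:
membership = {
    "dog":   [0.9, 0.1, 0.0],
    "cat":   [0.8, 0.2, 0.0],
    "house": [0.0, 0.1, 0.9],
}
print(cosine(membership["dog"], membership["cat"]))    # high: same cluster
print(cosine(membership["dog"], membership["house"]))  # low: different clusters
```

Any vector similarity (cosine, Pearson correlation, negative Euclidean distance) works here; cosine is a natural default since membership strengths are non-negative.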
