Single-source shortest path using BFS for an undirected weighted graph

I was trying to come up with a BFS-based algorithm for single-source shortest paths in an undirected weighted graph.
My idea is to replace every edge of weight x with a path of x unit-weight edges between its endpoints (introducing x - 1 intermediate vertices), and then run BFS. I would get a BFS tree, and since it is a tree, there exists only one path from the root to every other vertex.
The problem I am having is coming up with the analysis of this algorithm. Every edge needs to be visited once and split into the corresponding number of unit edges according to its weight. Then we run BFS on the new graph.
The cost of visiting every edge is O(m), where m is the number of edges, since every edge is visited once to split it. Suppose the new graph has km edges (call it m').
The time complexity of BFS is O(n + m') = O(n + km) = O(n + m), i.e. the time complexity remains unchanged.
Is the given proof correct?
I'm aware that I could use Dijkstra's algorithm here, but I'm specifically interested in analyzing this BFS-based algorithm.

The analysis you have included here is close but not correct. If you assume that every edge's cost is at most k, then your new graph will have O(n + km) nodes (each edge of weight up to k contributes up to k - 1 extra nodes) and O(km) edges, so the runtime would be O(n + km). However, you can't assume that k is a constant here. After all, if I increase the weight on the edges, I will indeed increase the amount of time your code takes to run. So overall, you could give a runtime of O(n + km).
Note that k here is a separate parameter to the runtime, the same way that m and n are. And that makes sense - larger weights give you larger runtimes.
(As a note, this is not considered a polynomial-time algorithm. Rather, it's a pseudopolynomial-time algorithm because the number of bits required to write out the weight k is O(log k).)
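To make the construction concrete, here is a minimal sketch of the splitting-plus-BFS idea in Python. It assumes the graph is given as a list of (u, v, w) triples with positive integer weights; all names are illustrative, not from the original post.

```python
from collections import deque

def bfs_shortest_paths(n, weighted_edges, source):
    """Split each edge of weight w into w unit edges, then run plain BFS.
    Runs in O(n + k*m) time and space, where k is the maximum edge weight."""
    adj = {v: [] for v in range(n)}
    next_id = n                      # intermediate vertices get fresh ids
    for u, v, w in weighted_edges:
        prev = u
        for _ in range(w - 1):       # w - 1 intermediate vertices per edge
            adj[next_id] = []
            adj[prev].append(next_id)
            adj[next_id].append(prev)
            prev = next_id
            next_id += 1
        adj[prev].append(v)
        adj[v].append(prev)

    dist = {source: 0}               # unit edges: BFS level = distance
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for nb in adj[u]:
            if nb not in dist:
                dist[nb] = dist[u] + 1
                queue.append(nb)
    return {v: dist.get(v) for v in range(n)}   # original vertices only

# Triangle where the heavy direct edge loses to the two-hop path:
print(bfs_shortest_paths(3, [(0, 1, 1), (1, 2, 1), (0, 2, 3)], 0))
# {0: 0, 1: 1, 2: 2}
```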

Related

Explanation of network indices normalization

Could someone explain in pretty simple words why the normalization of many network analysis indices (graph theory) is n(n - 1), where n is the graph size? Why do we need to take (n - 1) into account?
Most network measures that focus on counting edges (e.g. the clustering coefficient) are normalized by the total number of possible edges. Since every edge connects a pair of vertices, we need to know how many possible pairs of vertices we can make. There are n possible vertices we could choose as the source of our edge, and then n - 1 possible vertices that could be the target of our edge (assuming no self-loops; if the graph is undirected, divide by 2, because source and target are interchangeable). Hence, you frequently encounter $n(n-1)$ or $\binom{n}{2}$.
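As a quick sanity check on that counting argument, the following snippet (illustrative, not from the original answer) confirms both formulas for a small n:

```python
from itertools import combinations, permutations

n = 5
vertices = range(n)
# Ordered pairs = possible directed edges (no self-loops).
assert len(list(permutations(vertices, 2))) == n * (n - 1)
# Unordered pairs = possible undirected edges.
assert len(list(combinations(vertices, 2))) == n * (n - 1) // 2
```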

finding maximum weight subgraph

My graph is as follows:
I need to find a maximum weight subgraph.
The problem is as follows:
There are n vertex clusters, and every vertex cluster contains some vertices. Between any two vertices in different clusters there is a weighted edge; within the same cluster there are no edges between vertices. Now I want to find a maximum weight subgraph by selecting exactly one vertex from each cluster. The total weight is computed by adding up the weights of all edges between the selected vertices. I have added a picture to explain the problem. I know how to model this problem with an ILP, but I do not know how to solve it with an approximation algorithm, or how to derive its approximation ratio.
Could you give some solutions and suggestions?
Thank you very much. If any points in this description are unclear, please feel free to ask.
I do not think you can find an alpha-approximation for this problem, for any alpha. If such an approximation existed, it would also prove that the Unique Games Conjecture (UGC) is false. And disproving (or proving) the UGC is a rather big feat :-)
(and I'm actually among the UGC believers, so I'd say it's impossible :p)
The reduction is quite straightforward, since any UGC instance can be described as an instance of your problem with weights of 0 or 1 on the edges.
What I can see as a polynomial approximation is a 1/k-approximation (k being the number of clusters), using a maximum-weight perfect matching (PM) algorithm (we assume the number of clusters is even; if it is odd, just add a 'useless' cluster with one vertex and 0 weights everywhere).
First, you need to build a new graph with one vertex per cluster. The edge between u and v gets weight max w(e) over all edges e running from cluster u to cluster v. Run a max-weight PM on this graph.
You then can select one vertex per cluster, the one that corresponds to the edge selected in the PM.
The total weight of the solution extracted from the PM is at least as big as the weight of the PM (since it contains the edges of the PM + other edges).
And then you can conclude that this is a 1/k-approximation: if there existed a solution to the problem that was more than k times bigger than the PM weight, then the PM was not maximum.
The explanation is quite short (terse, I'd say); tell me if there is a part you don't follow or disagree with.
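Here is a rough sketch of that matching-based approximation in Python, using networkx's max_weight_matching. It assumes clusters are given as lists of vertices and that weight(u, v) returns the inter-cluster edge weight; the function names are mine, not the answerer's.

```python
import networkx as nx

def one_over_k_approx(clusters, weight):
    """1/k-approximation via maximum-weight perfect matching.
    clusters: list of vertex lists (assume an even number of clusters).
    weight(u, v): weight of the inter-cluster edge u-v."""
    k = len(clusters)
    H = nx.Graph()                 # one node per cluster
    best = {}                      # heaviest edge between each cluster pair
    for i in range(k):
        for j in range(i + 1, k):
            u, v = max(((a, b) for a in clusters[i] for b in clusters[j]),
                       key=lambda p: weight(*p))
            best[i, j] = (u, v)
            H.add_edge(i, j, weight=weight(u, v))
    # maxcardinality=True forces a perfect matching (one exists: H is complete).
    matching = nx.max_weight_matching(H, maxcardinality=True)
    chosen = {}
    for i, j in matching:
        if i > j:
            i, j = j, i
        chosen[i], chosen[j] = best[i, j]
    return chosen                  # cluster index -> selected vertex
```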
Edit: Equivalence with UGC: unique label cover explained.
Think of a UGC instance. Every node in the UGC instance is represented by a cluster, with as many vertices in the cluster as there are colors in the UGC instance. Then create an edge with weight 0 if it does not correspond to an edge in the UGC instance, or if it corresponds to a 'bad color match'. If it corresponds to a good color match, give it weight 1.
Then, if you find the optimal solution to an instance of your problem, it corresponds to an optimal solution of the corresponding UGC instance.
So, if UGC holds, it means it is NP-hard to approximate your problem.
Introduce a new graph G'=(V',E') as follows and then solve (or approximate) the maximum stable set problem on G'.
Corresponding to each edge a-b in E(G), introduce a vertex v_ab in V'(G') whose weight equals the weight of the edge a-b.
Connect all vertices of V'(G') to each other, except for the following pairs.
The vertex v_ab is not connected to the vertex v_ac when b and c are in different clusters of G. This way, we can select both of these vertices in a stable set of G' (hence, we can select both of the corresponding edges in G).
The vertex v_ab is not connected to the vertex v_cd when a, b, c and d are in four different clusters of G. This way, we can select both of these vertices in a stable set of G' (hence, we can select both of the corresponding edges in G).
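A small sketch of this construction (my own illustrative code, assuming the edges of G are given as (u, v, w) triples and that cluster maps each vertex to its cluster id):

```python
def build_conflict_graph(edges, cluster):
    """Vertex i of G' stands for edges[i]; returns the adjacency sets of G'.
    Two edge-vertices are adjacent iff their edges cannot coexist, i.e.
    they pick two *different* vertices from the *same* cluster."""
    adj = {i: set() for i in range(len(edges))}
    for i, (a, b, _) in enumerate(edges):
        for j, (c, d, _) in enumerate(edges[i + 1:], start=i + 1):
            conflict = any(x != y and cluster[x] == cluster[y]
                           for x in (a, b) for y in (c, d))
            if conflict:
                adj[i].add(j)
                adj[j].add(i)
    return adj  # a stable set in G' = a compatible edge selection in G
```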
Finally, I think you can find an alpha-approximation for this problem. In other words, in my opinion the Unique Games Conjecture is wrong, because of the 1.999999-approximation algorithm that I proposed for the vertex cover problem.

algorithm for 'generalized' matching in complete graphs

My problem is a generalization of a task solved by the Blossom algorithm of Edmonds. The original task is the following: given a complete graph with weighted undirected edges, find a set of edges such that
1) every vertex of the graph is incident to exactly one edge from this set (i.e. vertices are grouped into pairs)
2) sum over weights of edges in this set is minimal.
Now, I would like to modify the first goal into
1') vertices are grouped into sets of 3 vertices (or, in general, d vertices), leaving condition 2) unchanged.
My questions:
Do you know if this 'generalised' problem has a name?
Do you know of an algorithm solving it in a number of steps polynomial in the number of vertices (like the Blossom algorithm for the original problem)? I don't see a straightforward generalisation of the Blossom algorithm, as it is based on looking for augmenting paths in a graph contracted to a bipartite graph (and uses the Hungarian algorithm there). But augmenting paths do not seem to lead to groups of vertices other than pairs.
Best regards,
Paweł

Pathfinding - A* with least turns

Is it possible to modify A* to return the shortest path with the least number of turns?
One complication: Nodes can no longer be distinguished solely by their location, because their parent node is relevant in determining future turns, so they have to have a direction associated with them as well.
But the main problem I'm having is how to work the number of turns into the partial path cost (g). If I multiply g by the number of turns taken (t), weird things happen, like: a longer path with N turns near the end is favored over a shorter path with N turns near the beginning.
Another, less optimal, solution I'm considering is: after calculating the shortest path, I could run a second A* iteration (with a different path-cost formula), this time bounded within the x/y range of the shortest path, and return the path with the fewest turns. Any other ideas?
The current "state" of the search is actually represented by two things: The node you're in, and the direction you're facing. What you want is to separate each of those states into different nodes.
So, for each node in the initial graph, split it into E separate nodes, where E is the number of incoming edges. Each of these new nodes represents the old node, but facing in different directions. The outgoing edges of these new nodes will all be the same as the old outgoing edges, but with a different weight. If the old weight was w, then...
If the edge doesn't represent a turn, make the new weight w as well
If the edge does represent a turn, make the new weight w + ε, where ε is some number significantly smaller than the smallest weight.
Then just do a normal A* search. Since none of the weights have decreased, your heuristic will still be admissible, so you can still use the same heuristic.
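Here is a minimal sketch of this state-splitting idea in Python. To keep it short it runs Dijkstra rather than A* (adding an admissible heuristic would not change the state handling), and it represents each split node lazily as a (node, facing) pair instead of materializing new nodes; the input format and names are my own assumptions.

```python
import heapq
from itertools import count

def least_turn_shortest_path(adj, start, goal, eps=1e-6):
    """adj: dict node -> list of (neighbor, weight, direction) triples,
    where direction is any hashable label (e.g. a unit step vector).
    eps must be significantly smaller than the smallest edge weight."""
    tie = count()                              # heap tiebreaker
    pq = [(0.0, next(tie), start, None)]       # (cost, _, node, facing)
    best = {}                                  # (node, facing) -> best cost
    while pq:
        cost, _, node, facing = heapq.heappop(pq)
        if node == goal:
            return cost
        if best.get((node, facing), float("inf")) <= cost:
            continue
        best[node, facing] = cost
        for nb, w, direction in adj[node]:
            turn = facing is not None and direction != facing
            heapq.heappush(pq, (cost + w + (eps if turn else 0.0),
                                next(tie), nb, direction))
    return None
```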
If you really want to minimize the number of turns (as opposed to finding a nice tradeoff between turns and path length), why not transform your problem space by adding an edge for every pair of nodes connected by an unobstructed straight line; these are the pairs you can travel between without a turn. There are O(n) such edges per node, so the new graph is O(n³) in terms of edges. That makes A* solutions as much as O(n³) in terms of time.
Manhattan distance might be a good heuristic for A*.
Is it possible to modify A* to return the shortest path with the least number of turns?
It is most likely not possible, because this is an example of the weight-constrained shortest path problem. It is therefore NP-complete and cannot be solved efficiently.
You can find papers that discuss solving this problem e.g. http://web.stanford.edu/~shushman/math15_report.pdf

Algorithm to modify the weights of the edges of a graph, given a shortest path

Given a graph with positive edge weights, a pair of nodes, and a path between them, what's the best algorithm that will tell me how to modify the edge weights of the graph to the minimum extent possible so that the specified path becomes the shortest path between the nodes (as computed by A*)? (Of course, had I specified the shortest path as input, the output would be "make no changes".)
Note: Minimum extent refers to the total changes made to edge weights. For example, the other extreme (the most disruptive change) would be to change the weights of all edges not along the specified path to infinity and those along the path to zero.
You could use the Floyd-Warshall algorithm to compute the distances between all pairs of nodes, and then modify the desired path so that it becomes the shortest path. For example, imagine a graph of 3 nodes a, b, c (illustrated with a figure in the original post).
Let the path be a -> b -> c. The Floyd-Warshall algorithm computes the matrix of all pairwise distances. In the example, the distances of a -> b and b -> c are 2 and 4, while the shortest distance between a and c is 3. Since 2 + 4 = 6 ≠ 3, you know that the path must be adjusted by 3 to become the minimum path.
The reason I suggest this approach as opposed to just calculating the distance of the shortest path and adjusting the desired path accordingly is that this method allows you to see the distances between any two nodes so that you can adjust the weights of the edges as you desire.
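For concreteness, here is a small Floyd-Warshall run reproducing those numbers (the direct edge weights are assumed from the distances quoted above):

```python
# Vertices a, b, c = 0, 1, 2; direct edge weights from the example.
dist = [[0, 2, 3],
        [2, 0, 4],
        [3, 4, 0]]

n = len(dist)
for k in range(n):
    for i in range(n):
        for j in range(n):
            if dist[i][k] + dist[k][j] < dist[i][j]:
                dist[i][j] = dist[i][k] + dist[k][j]

print(dist[0][2])  # 3: the direct edge beats a -> b -> c (2 + 4 = 6)
```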
This reminds me vaguely of a back-propagation strategy as is often found in neural network training. I'll sketch two strategies, the first of which is going to be flawed:
Compute the cost of your candidate path P, which we will call c(P).
Compute the cost of the shortest path S, which we will call c(S).
Reduce every edge weight w(p) ∈ P by (c(P) - c(S) + epsilon) / |P|, where epsilon is some vanishingly small constant by which you would like your path's cost to end up below c(S), and |P| is the number of edges in P.
Of course, the problem with this is that you might well reduce the cost of path S (or some other path) by more than you reduce the cost of P! This suggests that an iterative approach is needed: work forwards along P and reduce each weight relative to the shortest-path cost, which you recompute at each step. This is hugely more expensive, but thankfully shortest-path algorithms tend to have nice dynamic programming solutions!
So the modified algorithm looks something like this (assume i = 0 to start):
Compute the cost of the first i steps of your candidate path P, which we will call c(p_0...p_i).
Compute the cost of the shortest path S, which we will call c(S), and the cost of its first i components, which we will denote by c(s_0...s_i).
Reduce edge weight w(p_i) by c(p_0...p_i) - c(s_0...s_i) + epsilon, where epsilon is some vanishingly small constant by which you would like your path to end up cheaper than c(S).
Repeat from step 1, increasing i by 1.
If P = S to begin with and epsilon is 0, you should leave the original path untouched. Otherwise you will reduce by no more than epsilon * |P| beyond the ideal update.
Optimizing this algorithm will require that you figure out how to compute c(s_0...s_i+1) from c(s_0...s_i) in an efficient manner, but that is left as an exercise to the reader ;-)
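As a sketch of the first (one-shot) strategy, here is illustrative Python assuming a directed adjacency-list graph; as noted above, when P and S share edges this can overshoot, which is exactly what the iterative variant repairs.

```python
import heapq

def dijkstra(adj, s):
    """adj: dict node -> list of (neighbor, weight). Returns distance map."""
    dist = {s: 0.0}
    pq = [(0.0, s)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue
        for v, w in adj[u]:
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(pq, (dist[v], v))
    return dist

def one_shot_reweight(adj, path, eps=1e-9):
    """Spread the surplus c(P) - c(S) + eps evenly over the |P| edges of
    the candidate path. path: list of nodes from source to target."""
    w = {(u, v): wt for u in adj for v, wt in adj[u]}
    cP = sum(w[u, v] for u, v in zip(path, path[1:]))
    cS = dijkstra(adj, path[0])[path[-1]]
    delta = (cP - cS + eps) / (len(path) - 1)
    return {(u, v): w[u, v] - delta for u, v in zip(path, path[1:])}
```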
