Pathfinding through graph with special vertices - path-finding

Heyo!
So I've got this directed and/or undirected graph with a bunch of vertices and edges. In this graph there is a start vertex and an end vertex. There's also a subset of vertices which are coloured red (this subset can include the start and end vertices). Also, no pair of vertices can have more than one edge between them.
What I have to do is to find:
A) The shortest path that passes no red vertices
B) If there is a path that passes at least one red vertex
C) The path with the greatest amount of red vertices
D) The path with the fewest amount of red vertices
For A I use a breadth first search ignoring red branches. For B I simply brute force it with a depth first search of the graph. And for C and D I use dynamic programming, memoizing the number of red vertices I find in all paths, using the same DFS as in B.
I am moderately happy with all the solutions and I would very much appreciate any suggestions! Thanks!!

For A I use a breadth first search ignoring red branches
A) is a typical pathfinding problem in the sub-graph that contains no red vertices, so your solution is good. (If you can come up with a heuristic, you could improve it by using A*.)
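As a rough illustration of A), here is a minimal Python sketch of a BFS that never steps onto a red vertex. The function name and the adjacency-dict representation are assumptions made for the example, not something from the question.

```python
from collections import deque

def shortest_path_avoiding_red(adj, red, start, end):
    """BFS restricted to non-red vertices; returns a vertex list or None.

    adj is assumed to map each vertex to a list of its successors.
    """
    if start in red or end in red:
        return None  # a red endpoint makes every path touch a red vertex
    parent = {start: None}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        if v == end:
            # Walk the parent links back to the start to rebuild the path.
            path = []
            while v is not None:
                path.append(v)
                v = parent[v]
            return path[::-1]
        for w in adj.get(v, ()):
            if w not in red and w not in parent:
                parent[w] = v
                queue.append(w)
    return None
```

Because BFS explores vertices in order of distance, the first time the end vertex is dequeued the reconstructed path has the fewest edges among the red-free paths.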
For B I simply brute force it with a depth first search of the graph
Well here's the thing. Every optimal path A->C can be split at an arbitrary intermediate point B. A nice property of optimal paths is that every sub-path is optimal, so A->B and B->C are optimal too.
This means if you know you must travel from some start to some end through an intermediary red vertex, you can do the following:
Perform a BFS from the start vertex
Perform a BFS from the end vertex backwards (if your edges are directed, as I think they are, you'll have to follow them in reverse here)
Alternate expanding both BFSs so that both their frontiers (or 'open' lists, as they are called) are at the same distance from their respective starts.
Stop when:
One BFS hits a red vertex already encountered by (i.e. in the 'closed' list of) the other one. In this case, each BFS can construct the optimal path to that common vertex. Stitch both half-paths together, and you have your optimal path with at least one red vertex.
One BFS is stuck ('open' list is empty). In this case, there is no solution.
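For the pure yes/no version of B), a simplified variant of the two-sided search can be sketched in Python. Note the simplification: both searches run to completion instead of alternating, so this only answers whether a red meeting point exists and does not build the path. The stitched route may also revisit a vertex, since nothing here forces the two halves to be disjoint. All names are hypothetical.

```python
from collections import deque

def reachable(adj, source):
    """Plain BFS; returns the set of vertices reachable from source."""
    seen = {source}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj.get(v, ()):
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return seen

def reverse(adj):
    """Return the graph with every directed edge flipped."""
    rev = {}
    for v, successors in adj.items():
        for w in successors:
            rev.setdefault(w, []).append(v)
    return rev

def red_meeting_points(adj, red, start, end):
    """Red vertices reachable from start that can also reach end."""
    forward = reachable(adj, start)
    backward = reachable(reverse(adj), end)
    return forward & backward & set(red)
```

A non-empty result means some route from start to end goes through a red vertex; an empty result means no such route exists.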
C) The path with the greatest amount of red vertices
This is a combinatorial problem. The first thing I would do is build a reachability matrix over [start node + red nodes + end node], where:
reachability[i, j] = 1 iff there is a path from node i to node j
To compute this matrix, simply perform one BFS starting at the start node and one at every red node. If a BFS reaches a red node or the end node, put a 1 in the corresponding row/column.
This will abstract away the underlying complexity of the graph, and should speed up the combinatorial search considerably.
The problem is now a longest path problem through that connectivity matrix. Dynamic programming would indeed be the way to go.
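To make the matrix idea concrete, here is a small Python sketch that fills reachability[i][j] with one BFS per special vertex. The function name, the ordered `special` list, and the adjacency-dict format are assumptions for illustration. For simplicity it also runs a BFS from the end vertex, which is harmless.

```python
from collections import deque

def reachability_matrix(adj, special):
    """special is an ordered list such as [start, red_1, ..., red_k, end].

    Returns an n x n list of 0/1 where entry [i][j] is 1 iff special[j]
    can be reached from special[i] by following at least one edge.
    """
    index = {v: i for i, v in enumerate(special)}
    n = len(special)
    reach = [[0] * n for _ in range(n)]
    for src in special:
        seen = {src}
        queue = deque([src])
        while queue:
            v = queue.popleft()
            for w in adj.get(v, ()):
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
                    if w in index:
                        reach[index[src]][index[w]] = 1
    return reach
```

This only covers the preprocessing step; the longest-path search over the resulting matrix is a separate problem.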
D) The path with the fewest amount of red vertices
Simply perform a Dijkstra search, but use the following metric when sorting the nodes in the 'open' list:
dist(start, a) < dist(start, b) if:
numRedNodesInPath(start -> a) < numRedNodesInPath(start -> b)
OR (
numRedNodesInPath(start -> a) == numRedNodesInPath(start -> b)
AND
numNodesInPath(start -> a) < numNodesInPath(start -> b)
)
For this, when discovering new vertices, you'll have to store information about the path leading up to them (well, just the number of nodes in the path and the number of red nodes, separately) in a dedicated map so it can be fetched later. I mention this because usually the length of the path is stored implicitly as the position of the vertex in the array. You'll have to track it explicitly in your case.
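The ordering above maps naturally onto tuple comparison. Here is a minimal Python sketch, assuming an adjacency dict and vertices that can be compared with each other (e.g. strings), since the heap falls back to comparing vertices when two costs tie. The function name is made up for the example.

```python
import heapq

def fewest_red_path(adj, red, start, end):
    """Dijkstra with cost tuples (red vertices on path, vertices on path).

    Tuples compare lexicographically, which matches the metric above.
    Returns (red_count, path) or None if end is unreachable.
    """
    start_cost = (1 if start in red else 0, 1)
    best = {start: start_cost}   # the dedicated map of per-vertex costs
    parent = {start: None}
    heap = [(start_cost, start)]
    while heap:
        cost, v = heapq.heappop(heap)
        if cost > best[v]:
            continue  # stale heap entry, a better cost was found later
        if v == end:
            path = []
            while v is not None:
                path.append(v)
                v = parent[v]
            return cost[0], path[::-1]
        for w in adj.get(v, ()):
            new_cost = (cost[0] + (1 if w in red else 0), cost[1] + 1)
            if w not in best or new_cost < best[w]:
                best[w] = new_cost
                parent[w] = v
                heapq.heappush(heap, (new_cost, w))
    return None
```

Both tuple components only ever grow along a path, so the usual Dijkstra argument (the first time a vertex is popped, its cost is final) still applies.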
Note on length optimality:
Even though you stated you did not care about length optimality outside of problem A), the algorithm I provided will produce shortest-length solutions. In many cases (like in D) it helps Dijkstra converge better I believe.

Related

Recursive node traversal go through whole graph?

I'm trying to solve this work problem, but I'm having some difficulty, since I suck at recursion.
My question is: is there a way where I can pick a node in a graph and traverse the graph all the way through back to the same node where I started, but the nodes can only be visited once? And of course to save the resulting edges traversed.
The graph is unweighted, but its nodes are coordinates in a 2D x and y coordinate system, so each node has an x and y value, meaning the edges can be weighted by calculating the distance between the coordinates. If that helps...
I'm not completely sure I understand, but here is a suggestion: pick a node n0, then an edge e=(n0,n1). Then remove that edge from the graph, and use a breadth-first search to find the shortest path from n1 back to n0 if it exists.
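A minimal Python sketch of this first suggestion, assuming an undirected graph stored as a dict mapping each vertex to a collection of its neighbours (the function name is hypothetical). Instead of physically deleting the edge, the search just refuses to use it.

```python
from collections import deque

def cycle_through(adj, n0):
    """Return a cycle [n0, n1, ..., n0] visiting each vertex once, or None."""
    for n1 in adj[n0]:
        parent = {n1: None}
        queue = deque([n1])
        while queue:
            v = queue.popleft()
            if v == n0:
                break
            for w in adj[v]:
                if (v, w) == (n1, n0):
                    continue  # this is the "removed" edge e = (n0, n1)
                if w not in parent:
                    parent[w] = v
                    queue.append(w)
        if n0 in parent:
            # Rebuild n1 -> ... -> n0, then prepend n0 to close the cycle.
            path = []
            v = n0
            while v is not None:
                path.append(v)
                v = parent[v]
            return [n0] + path[::-1]
    return None
```

The BFS path from n1 back to n0 is a shortest path, so it repeats no vertex, and the consecutive pairs of the returned list are the traversed edges.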
Another suggestion, which might help you to control the length of the resulting path better: pick a starting node n0 and find a spanning tree T emanating from n0. Remove n0, and T will (hopefully) break into components. Find an edge e=(n1,n2) from one component to another. Then that edge, plus the edges in T connecting n1 to n0, plus the edges in T connecting n2 to n0, form a cycle with the properties you desire.

Cypher query to find paths through directed weighted graph to populate ordered list

I'm new to Neo4j and am trying to wrap my mind around the following problem in Cypher.
I am looking for a list of nodes, sorted by ascending visitation order, after a run of n path iterations, each of which adds nodes to the list. The visitation sort depends on depth and edge cost. Because the final list represents a sequence of nodes you could also look at it as a path of paths.
Description
My graph has an initial starting node (START), is directional, of unknown size, and has weighted edges.
A node can only be added to the list once, when it is first visited (e.g. when visiting a node, we compare to the final list and add if the node isn't on the list already).
Every edge can only be traveled once.
We can only visit the next adjacent, lowest-cost node.
There are two underlying hierarchies: depth (the closer to START the better) and edge costs (the lower the cost incurred to reach the next adjacent node the better). Depth follows the alphabetical order in the example below. Cost properties are integers but are presented in the example as strings (e.g. "costs1" means edge cost = 1).
Each path starts with the starting node of least depth that is "available" (= possessing untraveled outbound edges). In the example below all edges emanating from START will have been exhausted at some point. For the next run we'll continue with A as starting node.
A path run is done when it cannot continue anymore (i.e. no available outbound edges to travel on)
We're done when the list contains y nodes, which may or may not represent a traversal.
Any ideas on how to tackle this using Cypher queries?
Example data: http://console.neo4j.org/r/o92sjh
This is what happens:
We start at START and travel along the lowest-cost edge available to arrive at A. --> A gets the #1 spot on the list and the costs1 edge in START-[:costs1]->a gets eliminated because we've just used it.
We’re on A. The lowest cost edge (costs1) circles back to START, which is a no-go, so we take this edge off the table as well and choose the next available lowest-cost edge (costs2), leading us to B. --> We output B to the list and eliminate the edge in a-[:costs2]->b.
We're now on B. The lowest cost edge (costs1) circles back to START, which is a no-go, so we eliminate that edge as well. The next lowest-cost edge (costs2) leads us to C. --> We output C to the list and eliminate the just traveled edge between B and C.
We're on C and continue from C over its lowest-cost relation on to G. --> We output G to the list and eliminate the edge in c-[:costs1]->g.
We're on G and move on to E via g-[:costs1]->e. --> E goes on the list and the just traveled edge is eliminated.
We're on E, which only has one relation with I. We incur the cost of 1 and travel on to I. --> I goes on the list and E's "costs1" edge gets eliminated.
We're on I, which has no outbound edges and thus no adjacent nodes. Our path run ends and we return to START iterating the whole process with the edges that remain.
We're on START. Its lowest available outbound edge is "cost3", leading us to C. --> C is already on the list, so we just eliminate the edge in START-[:costs3]->c and move on to the next available lowest-cost node, which is F. Note that now we've used up all edges emanating from START.
We're on F, which leads us to J (cost =1) --> J goes on the list, the edge gets eliminated.
We're on J, which leads us to L (cost = 1)--> L goes on the list, the edge gets eliminated.
We're on L, which leads us to N (cost = 1)--> N goes on the list, the edge gets eliminated.
We're on N, which is a dead end, meaning our second path run ends. Because we cannot start the next run from START (as it has no edges available anymore), we move on to next available node of least depth, i.e. A.
We're on A, which leads us to B (cost = 2) --> B is already on the list and we dump the edge.
We're on B, which leads us to D (cost = 3) --> D goes on the list, the edge gets eliminated.
Etc.
Output / final list / "path of paths" (hopefully I did this correctly):
A
B
C
G
E
I
F
J
L
N
D
M
O
H
K
P
Q
R
CREATE ( START { n:"Start" }),(a { n:"A" }),(b { n:"B" }),(c { n:"C" }),(d { n:"D" }),(e { n:"E" }),(f { n:"F" }),(g { n:"G" }),(h { n:"H" }),(i { n:"I" }),(j { n:"J" }),(k { n:"K" }),(l { n:"L" }),(m { n:"M" }),(n { n:"N" }),(o { n:"O" }),(p { n:"P" }),(q { n:"Q" }),(r { n:"R" }),
START-[:costs1]->a, START-[:costs2]->b, START-[:costs3]->c,
a-[:costs1]->START, a-[:costs2]->b, a-[:costs3]->c, a-[:costs4]->d, a-[:costs5]->e,
b-[:costs1]->START, b-[:costs2]->c, b-[:costs3]->d, b-[:costs4]->f,
c-[:costs1]->g, c-[:costs2]->f,
d-[:costs1]->g, d-[:costs2]->f, d-[:costs3]->h,
e-[:costs1]->i,
f-[:costs1]->j,
g-[:costs1]->e, g-[:costs2]->j, g-[:costs3]->k,
j-[:costs1]->l, j-[:costs2]->m, j-[:costs3]->n,
l-[:costs1]->n, l-[:costs2]->f,
m-[:costs1]->o, m-[:costs2]->p, m-[:costs3]->q,
q-[:costs1]->n, q-[:costs2]->r;
The algorithm being sought is a modification to the nearest neighbor (greedy) heuristic for TSP. The changes to the algorithm result in an algorithm that looks like this:
1. Stand on the start vertex as the current vertex.
2. Find the shortest unvisited edge, E, connecting the current vertex to some vertex V; terminate if there is no such edge.
3. Set the current vertex to V.
4. Mark E as visited.
5. If the number of visited edges has reached the limit, terminate.
6. Go to step 2.
As with the original algorithm, the output is the visited vertices.
To handle the use case, allow the algorithm to take a set of already visited edges as an additional input, rather than always starting with an empty set. You then just call the function again, passing in the accumulated set of visited edges, until the starting vertex only leads to visited edges.
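Here is a rough Python sketch of that modified greedy walk, including the optional set of already visited edges. The edge format (a dict mapping each vertex to a list of (cost, target) pairs), the function name, and the parameter names are assumptions for illustration only. Note that, like the steps above, it does not skip edges leading back to the start vertex.

```python
def greedy_walk(edges, start, visited_edges=None, limit=None):
    """Follow the cheapest unvisited outgoing edge until stuck.

    Returns (vertices in visiting order, set of visited edges).
    visited_edges is updated in place so it can be reused across runs.
    """
    visited = set() if visited_edges is None else visited_edges
    order = []
    current = start
    while limit is None or len(visited) < limit:
        candidates = [(cost, target)
                      for cost, target in edges.get(current, ())
                      if (current, target) not in visited]
        if not candidates:
            break  # no unvisited edge leaves the current vertex
        cost, target = min(candidates)
        visited.add((current, target))
        current = target
        order.append(current)
    return order, visited
```

Calling it a second time with the returned edge set continues from wherever unvisited edges remain, which is how the repeated path runs in the question could be driven.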
(Sorry, I'm new on the site, can't comment)
I was hired to find a solution to this particular query. I only learned of this question afterward. I am not going to post it in full here, but I am willing to discuss the solution and get feedback from anyone interested.
It turned out not to be possible with Cypher alone (well, I could not find out how myself). So I wrote a Java function with the Neo4j bindings to implement this.
The solution is single threaded, flat (no recursion), and very close to the description by @Nuclearman. It uses two data structures (ordered maps), one to remember visited edges, another to keep a list of "start" nodes (for when the path runs out):
Follow path of smallest costs (memorize visited edges, store nodes by depth/cost)
On end of path, pick a new start node (smallest depth first, then smallest cost)
Report any new node in the order they are accessed
The use of hash sets, coupled with the fact that edges are visited only once, makes it fast and memory efficient.

Directed Graph BFS without all nodes reachable

I'm doing a breadth-first search on a digraph. I'm lost at nodes c and f, and I'm not sure if and how they should be in the BF-tree or if you only go as far as reachable from the source node and don't start at another node in order to get all the vertices.
Here's what I'm getting so far. As you can see, letters mark the nodes. Distance and predecessor are marked by d and pi:
The question "BFS traversal of directed graph from a given node" was helpful, but I'm not familiar enough with graphs to understand how it applies to this situation. From what I'm getting from that question, it seems like in this case I would not include c and f at all.
In fact, it seems like I have the maximal number of nodes included already, just because I started at i. I think the d=4 at the g node (also at k, but that doesn't even connect to any other nodes) is the greatest distance and the maximum depth possible in a BFS of this graph.

How to find a path from source to destination by adding edges with minimum total weight

I have about 100 atoms in a 3-D space. Each atom is a node. An edge is added between two nodes when they are closer than 0.32 nm, with weight equal to their distance. I want to find a path from a source node to a destination node. Since the 100 atoms are not fully connected, sometimes I can't find a path.
What I want to do is to add one or more edges to make source and destination connected. Meanwhile, I also want to minimize the total weights of the new added edges. Again, weight is calculated from the two nodes' distance.
It is kind of the reverse of the minimum cut problem. Is there any algorithm that helps to do this?
Thanks a lot!
It seems like one way would be to make use of a graph search algorithm for finding the shortest path, like Dijkstra's algorithm, and perhaps work from both ends (source and destination).
The only difference is that you can't know in advance whether any edge actually exists, and so you are creating the graph as you go. So if you start at A, and the nodes of the graph are A, B, C, D, E, then you need to check whether A-B, A-C, A-D, and A-E exist. If only A-B exists, then you check B-C, B-D, and B-E.
This will be O(|V|^2), but will really depend on how many edges get explored.
The same idea applies if you are only interested in adding edges that are longer than 0.32 nm. It's just that the path length calculation changes. Any edges that are less than .32 nm are zero-length, or simply a lot shorter (as they become less important). If that last bit doesn't work, then it gets a bit trickier.
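One way to read the zero-length idea is as a dense Dijkstra over all pairs of atoms, where pairs closer than the cutoff cost nothing and every other pair costs its distance (the weight of the edge you would have to add). A minimal Python sketch under that assumption, with hypothetical names and `math.dist` from Python 3.8+:

```python
import math

def min_added_weight(points, source, target, cutoff=0.32):
    """Dense O(|V|^2) Dijkstra over point indices.

    Existing edges (pairs closer than cutoff) cost 0; any other pair
    costs its Euclidean distance, i.e. the weight of a newly added edge.
    Returns the minimum total weight of added edges.
    """
    n = len(points)
    dist = [math.inf] * n
    dist[source] = 0.0
    done = [False] * n
    for _ in range(n):
        # Pick the closest vertex that has not been finalised yet.
        u = min((i for i in range(n) if not done[i]), key=lambda i: dist[i])
        if u == target:
            break
        done[u] = True
        for v in range(n):
            if not done[v]:
                d = math.dist(points[u], points[v])
                step = 0.0 if d < cutoff else d
                if dist[u] + step < dist[v]:
                    dist[v] = dist[u] + step
    return dist[target]
```

With about 100 atoms the quadratic loop is tiny. Recovering which edges were added would need a parent array, which is left out here to keep the sketch short.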

How can I find all 'long' simple acyclic paths in a graph?

Let's say we have a fully connected directed graph G. The vertices are [a,b,c,d]. There are edges in both directions between each pair of vertices.
Given a starting vertex a, I would like to traverse the graph in all directions and save the path only when I hit a vertex which is already in the path.
So, the function full_paths(a,G) should return:
- [{a,b}, {b,c}, {c,d}]
- [{a,b}, {b,d}, {d,c}]
- [{a,c}, {c,b}, {b,d}]
- [{a,c}, {c,d}, {d,b}]
- [{a,d}, {d,c}, {c,b}]
- [{a,d}, {d,b}, {b,c}]
I do not need 'incomplete' results like [{a,b}] or [{a,b}, {b,c}], because it is contained in the first result already.
Is there any other way to do it other than generating a powerset of G and filtering out results of a certain size?
How can I calculate this?
Edit: As Ethan pointed out, this could be solved with depth-first search method, but unfortunately I do not understand how to modify it, making it store a path before it backtracks (I use Ruby Gratr to implement my algorithm)
Have you looked into depth first search or some variation? A depth first search traverses as far as possible and then backtracks. You can record the path each time you need to backtrack.
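A minimal Python sketch of that idea, assuming the graph is a dict mapping each vertex to a list of successors (the question uses Ruby's Gratr, so this is only an illustration of the technique, and the function name is borrowed from the question). A path is stored only at the point where the search cannot extend it and has to backtrack.

```python
def full_paths(graph, start):
    """Return every maximal simple path from start as a list of (u, v) edges."""
    results = []

    def dfs(vertex, visited, edges):
        extended = False
        for nxt in graph[vertex]:
            if nxt not in visited:
                extended = True
                visited.add(nxt)
                edges.append((vertex, nxt))
                dfs(nxt, visited, edges)
                # Undo the step before trying the next neighbour.
                edges.pop()
                visited.remove(nxt)
        if not extended and edges:
            # Dead end: every neighbour is already on the path, so record it.
            results.append(list(edges))

    dfs(start, {start}, [])
    return results
```

Prefixes such as [(a, b)] are never recorded on their own, because a path is only saved when no unvisited neighbour remains.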
If you know your graph G is fully connected, there are N! paths visiting all N vertices, where N is the number of vertices in G. You can compute this easily: you have N choices for the starting point, then for each starting point you can choose any of the N-1 remaining vertices as the second vertex on the path, and so on, until only one unvisited vertex is left to choose. So you have N*(N-1)*...*2*1 = N! possible paths. When you can't choose the starting point, i.e. it is given, the problem is the same as finding the paths in a graph G' with N-1 vertices. All possible paths are permutations of the set of vertices, i.e. in your case of all vertices except the starting point. Once you have a permutation, you can generate the path with:
perm_to_path([A|[B|_]=T]) -> [{A,B}|perm_to_path(T)];
perm_to_path(_) -> [].
The simplest way to generate permutations is:
permutations([]) -> [[]];
permutations(L) ->
[[H|T] || H <- L, T <- permutations(L--[H])].
So in your case:
paths(A, GV) -> [perm_to_path([A|P]) || P <- permutations(GV--[A])].
where GV is list of vertices of graph G.
If you would like a more efficient version, it would need a little bit more trickery.
