For my flowchart, I have a vertical chart detailing the data flow. However on the downward arrows, I want to add side arrows to describe where the missing data is going. How do I do this? I can't see it in any of the documentation and examples because it tends to be about far more complex things, and I know this is a very basic task!
library(DiagrammeR)
grViz("digraph flowchart {
# node definitions with substituted label text
node [fontname = Helvetica, shape = rectangle, fixedsize = false, width = 1]
1 [label = 'data (100%)']
2 [label = 'data (90.4%)']
3 [label = 'data \\ndata (83.3%)']
4 [label = 'data (66%)']
7 [label = 'data (100%)']
8 [label = 'data (74.4%)']
9 [label = 'data (69.6%)']
10 [label = 'data (55.4%)']
1 -> 2 -> 3 -> 4;
7 -> 8 -> 9 -> 10
} ")
This gives me two side by side panels, but I want arrows coming off the downward arrows where I can put the n for missing data.
The standard trick is to create invisible dummy nodes, then break up each edge into two parts: 1) source -> dummy, and 2) dummy -> target:
library(DiagrammeR)
grViz("digraph flowchart {
# node definitions with substituted label text
node [fontname = Helvetica, shape = rectangle, fixedsize = false, width = 1]
1 [label = 'data (100%)']
2 [label = 'data (74.4%)']
3 [label = 'data (69.6%)']
4 [label = 'data (55.4%)']
m1 [label = 'missing (25.6%)']
m2 [label = 'missing (4.8%)']
node [shape=none, width=0, height=0, label='']
p1 -> 2; p2 -> 3 -> 4;
{rank=same; p1 -> m1}
{rank=same; p2 -> m2}
edge [dir=none]
1 -> p1; 2 -> p2;
}")
I shortened your example for demonstration purposes. In the above, p1 and p2 are invisible dummy nodes. There are three sets of edges:
Downward directed edges from dummy nodes to targets (e.g., p1 -> 2)
Horizontal directed edges from dummy nodes to "missing" nodes. Edge direction is imposed through rank=same.
Undirected edges from source to the dummy nodes
I need to design an algorithm to find a path in a public transportation system. In theory only the best (lowest cost) path is required, but in reality it's different. When traveling in a public transportation system, it's difficult to define the cost, it cannot be simplified to traveling time, waiting time, transfer time, bus/subway fees etc, are all need to take into consideration.
Firstly I need to simplify the problem, design a cost function which is a combination of all those "time" and "fee", then use a graph algorithm to find a few paths (3~5 paths). Finally present all these paths to the end users and let them make the decision.
The reason I need to present more than one path is that, for different users/situations these "time" and "fee" are different, thus presenting a few paths is better than just giving out the "best" path.
Algorithms like A* are good for finding the shortest path, but how can I find those "short enough" paths in a graph ? or how can I find the shortest N paths ?
BTW, I don't even need to find the shortest path, because in practice the end users never know the shortest path (unless the shortest path is obvious), they will be happy if the results are close to the shortest path.
A* star's "cost" is more versatile than you are thinking. A* is typically explained with nodes who's cost is simply a distance. We can, however, beef this up a little.
I'm not seeing a language you prefer, maybe Graph? Oh well, here's some c++:
namespace Astar
{
struct CostEvaluation
{
int distance_cost;
int transfer_cost;
// others
int costToTraverseNodes( const Node& first, const Node& second ) const
{
int distance = // apply distance_cost to distance between nodes
int transfer = // apply transfer_cost if there is a transfer between nodes
return distance + transfer;
}
}
}
Now, the actual implementation of your A* will take a CostEvaluation object to determine the cost on the route. If transfers don't matter, set the transfer_cost to zero.
As far as a "good enough" route: I'm sure other people would be able to help you better, but I feel like you might run into a situation where the program says something like "oh, you want to be there in an hour, but the best route only takes twenty minutes? Here, go around in circles for forty minutes, that's good enough".
As I hinted in my comments, it is possible to create a modified A* version which reports multiple routes. I just drove my implementation to the point where it is obviously confirming this statement.
The code below starts with a "classic" A* implementation, which I keep around so one can study the differences between "classic" and "modified".
The basic idea for the modified version is to start the search both forwards and backwards in parallel. This often also yields more robust results, given that the "greediness" of A* is largely driven by its heuristic function (h(x)). It is possible to construct cases, where the greediness opts for fast progress at the start of the route, while that route towards the end "slows down" drastically. Starting from both sides (source, target), this effect can be mitigated to bits. (If one calculates to the end, it should always be optimal routes, if not necessarily the same route. If one were to calculate to the "classic" ending condition in both directions a picture as the one below could result, showing that the both directions yield 2 different paths.
Now, the "explored lists" of both directions can be used to find out when while searching e.g. "forwards", the next node is already explored by the "backwards" search - or vice versa. Obviously, those "connection points" between the two searches yield a route, which is not necessarily optimal but - a valid route.
My implementation traces those intermediate routes and I did not bother to collect them. The traces show the id of the node where both exploration lists "meet" and the resulting 2 parts of the route (source -> meeting point, meeting point -> destination).
Now, using those intermediate lists along with some post processing, e.g. by means of evaluating the routes according to single dimensions of the heuristics function (e.g. comfort, speed, ...), it should be possible to find a good enough selection of routes, associated with different trade-offs in those dimensions.
The full F# script is around 340 lines - a bit too long for this site, so I will omit some nonessential parts (such as my rendering function, creating those bitmaps, etc.
module AStar =
module Internals =
let makeRoute (explo : Map<int,(int * float)>) at tgt =
let rec loop at acc =
let dst,c = explo.[at]
match at,dst with
| (_,b) when b = tgt -> (at,c) :: acc
| (_,b) -> loop b ((at,c) :: acc)
[(tgt,0.0)] # loop at []
let makeRouteBackward (exploBW : Map<int, (int * float)>) at tgt =
let rec loop at acc =
let src,c = exploBW.[at]
match at,src with
| (_,b) when b = tgt -> acc # [(at,c)]
| (_,b) -> loop b (acc # [at,c])
let r = loop at [] # [(tgt,0.0)]
let rev = List.rev r
List.zip r rev |> List.map (fun ((id1,c1),(id2,c2)) -> id1,c2)
let classic neighbors h cost start goal =
let prioSelect (lopen : (int * float) list) =
let sorted = List.sortBy (fun (id,p) -> p) lopen //|> List.rev
(fst (List.head sorted), List.tail sorted)
let rec search (lopen : (int * float) list) (routes : Map<int,int * float>) =
let rec searchNeighbors cur nl o (r : Map<int,(int * float)>) =
match nl with
| [] -> o,r
| next::others ->
let newCost = (snd (r.[cur])) + cost cur next
if (not (Map.containsKey next r)) || (newCost < snd r.[next])
then
let r1 = r |> Map.remove next |> Map.add next (cur,newCost)
let prio = newCost + h next goal
//printfn "current = %d -- next = %d -- newCost = %f -- prio = %f -- h = %f" cur next newCost prio (h next goal)
let o1 = (next,prio) :: o
searchNeighbors cur others o1 r1
else
searchNeighbors cur others o r
match lopen with
| [] -> []
| _::_ ->
let current,rest = prioSelect lopen
if current = goal then Internals.makeRoute routes current start
else
let lopen1,routes1 = searchNeighbors current (neighbors current) rest routes
search lopen1 routes1
search [start,0.] (Map.ofList [start,(start,0.0)])
let twoWay sources targets hforward hbackward costforward costbackward (start : int) (goal : int) (n : int) rr =
let prioSelect (lopen : (int * float) list) =
let sorted = List.sortBy (fun (id,p) -> p) lopen //|> List.rev
(fst (List.head sorted), List.tail sorted)
let searchforward lopen exploredF exploredB nfound acc =
let rec searchNeighbors cur nl o (r : Map<int,(int * float)>) =
match nl with
| [] -> o,r
| next::others ->
//printfn "fwd: current = %d -- next = %d -- nl = %A -- r = %A" cur next nl r
let newCost = (snd (r.[cur])) + costforward cur next
if (not (Map.containsKey next r)) || (newCost < snd r.[next])
then
let r1 = r |> Map.remove next |> Map.add next (cur,newCost)
let prio = newCost + hforward next goal
let o1 = (next,prio) :: o
if Map.containsKey next exploredB then
rr (next, Internals.makeRoute r1 next start, Internals.makeRouteBackward exploredB next goal)
searchNeighbors cur others o1 r1
else
searchNeighbors cur others o r
match lopen with
| [] -> (lopen,exploredF,0,acc)
| _::_ ->
let current,rest = prioSelect lopen
if current = goal then
(rest,exploredF,nfound+1,acc # [Internals.makeRoute exploredF current start] )
else
let lopen1,explored1 = searchNeighbors current (targets current) rest exploredF
(lopen1, explored1, nfound, acc)
let searchbackward lopen exploredB exploredF nfound acc =
let rec searchNeighbors cur nl o (r : Map<int,(int * float)>) =
match nl with
| [] -> o,r
| next::others ->
//printfn "bwd: current = %d -- next = %d -- nl = %A -- r = %A" cur next nl r
let newCost = (snd (r.[cur])) + costbackward cur next
if (not (Map.containsKey next r)) || (newCost < snd r.[next])
then
let r1 = r |> Map.remove next |> Map.add next (cur,newCost)
let prio = newCost + hbackward next start
let o1 = (next,prio) :: o
searchNeighbors cur others o1 r1
else
searchNeighbors cur others o r
match lopen with
| [] -> (lopen,exploredB,0,acc)
| _::_ ->
let current,rest = prioSelect lopen
if current = start then
//(rest,explored,nfound+1,acc # [Internals.makeRoute explored current goal []])
(rest,exploredB,nfound+1,acc # [Internals.makeRouteBackward exploredB current goal] )
else
let lopen1,explored1 = searchNeighbors current (sources current) rest exploredB
(lopen1, explored1, nfound, acc)
let rec driver openF openB exploredF exploredB nfoundF nfoundB accF accB =
let openF1, exploredF1,nfoundF1,accF1 = searchforward openF exploredF exploredB nfoundF accF
let openB1, exploredB1,nfoundB1,accB1 = searchbackward openB exploredB exploredF nfoundB accB
match (nfoundF1+nfoundB1), List.isEmpty openF1, List.isEmpty openB1 with
| (s,false,false) when s < n ->
driver openF1 openB1 exploredF1 exploredB1 nfoundF1 nfoundB1 accF1 accB1
| _ ->
accF1 # accB1
driver [start,0.0] [goal,0.0] (Map.ofList [start,(start,0.0)]) (Map.ofList [goal,(goal,0.0)]) 0 0 [] []
// Location : x,y coordinate or lat/long - whatever.
// Edges: (id,cost) list
type Node = { Id : int; Location : int * int; Edges : (int * float) list; EdgesBackward : (int * float) list}
type Graph = Map<int,Node>
let addNode node graph =
Map.add (node.Id) node graph
let newNode idgen x y =
{ Id = idgen(); Location = (x,y); Edges = []; EdgesBackward = [] }
let addEdge id cost node =
{ node with Node.Edges = node.Edges # [(id,cost)]; }
let addEdgeBackward id cost node =
{ node with Node.EdgesBackward = node.EdgesBackward # [(id,cost)]; }
let idgen startvalue =
let next = ref startvalue
fun () ->
let id = !next
next := !next + 1
id
let appendNode node nodeList = nodeList # [node]
let sq x = x*x
let distance p1 p2 =
let x1,y1 = p1
let x2,y2 = p2
sqrt( float (sq (x2-x1) + sq (y2-y1)) )
let solve (g : Graph) s e =
let ns id =
g.[id].Edges |> List.map (fun (id,c) -> id)
let h at goal =
float (distance (g.[at].Location) (g.[goal].Location))
let c a b =
g.[a].Edges |> List.pick (fun (id,cost) -> if id = b then Some(cost) else None)
[AStar.classic ns h c s e] // give it the same return type as solveTwoWay to make stuff below easier and shorter
let solveTwoWay (g : Graph) s e n =
let edges id =
let nl = g.[id].Edges |> List.map (fun (id,c) -> id)
//printfn "2way edges id = %d list = %A" id nl
nl
let edgesBackward id =
let nl = g.[id].EdgesBackward |> List.map (fun (id,c) -> id)
//printfn "2way backwards edges id = %d list = %A" id nl
nl
let hforward at goal =
float (distance (g.[at].Location) (g.[goal].Location))
let hbackward at start =
float (distance (g.[at].Location) (g.[start].Location))
let costF a b =
g.[a].Edges |> List.pick (fun (id,cost) -> if id = b then Some(cost) else None)
let costB a b =
g.[a].EdgesBackward |> List.pick (fun (id,cost) -> if id = b then Some(cost) else None)
let debugView arg =
let id,r1,r2 = arg
printfn "meeting at %d: r1 = %A r2 = %A" id r1 r2
AStar.twoWay edgesBackward edges hforward hbackward costF costB s e n debugView
let solveProblem problem =
let g, start, goal = problem
g,start,goal,solve g start goal
let solveProblemTwoWay problem n =
let g, start, goal = problem
g,start,goal,solveTwoWay g start goal n
let save name solution =
let graph, start, goal, routes = solution
use writer = System.IO.File.CreateText("""E:\temp\""" + name + """.txt""")
fprintf writer "------------------------------------\n start = %d ----> goal = %d: %d routes found.\n" start goal (List.length routes)
fprintf writer "Graph:\n"
graph |> Map.iter
(fun id node ->
fprintf writer "Node: %A\n" node
)
routes |> List.iteri
(fun index route ->
fprintf writer "Route %d: %A\n" index route
)
// An example problem I used to play with:
// The graph is such, that the nodes are connected to the right and
// downwards and diagonally downwards only.
// The cost is either 1.0 or sqrt(2), for the horizontal or vertical and
// the diagonal connection, respectively.
let problem2 () =
let newNodeAN = newNode (idgen 0)
let cond c x n =
if c then n |> x else n
let accessCost p =
match p with
| (4,4) | (4,5) | (5,4) | (5,5) -> 10.0
| _ -> 1.0
let right (n : Node) : Node =
let t = 1 + fst n.Location, snd n.Location
let c = accessCost t
n
|> cond (fst n.Location < 9) (fun n -> addEdge (n.Id + 1) c n)
|> cond (fst n.Location > 0) (fun n -> addEdgeBackward (n.Id - 1) c n)
let down n =
let t = fst n.Location, 1 + snd n.Location
let c = accessCost t
n
|> cond (snd n.Location < 9) (fun n -> addEdge (n.Id + 10) c n)
|> cond (snd n.Location > 0) (fun n -> addEdgeBackward (n.Id - 10) c n)
let diagdown n =
let t = 1 + fst n.Location, 1 + snd n.Location
let c = (sqrt(2.0)) * accessCost t
n
|> cond (fst n.Location < 9 && snd n.Location < 9) (fun n -> addEdge (n.Id + 11) c n)
|> cond (fst n.Location > 0 && snd n.Location > 0) (fun n -> addEdgeBackward (n.Id - 11) c n)
[
for y = 0 to 9 do
for x = 0 to 9 do
yield newNodeAN x y
]
|> List.map
(fun n ->
n
|> right
|> down
|> diagdown
)
|> List.map (fun n -> (n.Id,n))
|> Map.ofList
, 0, 99
// Last not least, the code can be executed like this:
// And since both implementations yield the same data structures,
// they can be used interchangeably and compared to each other.
solveProblemTwoWay (problem2() 5) |> save "problem2_solution"
The output printed during runtime which shows the "intermediate routes", then looks like this:
...
meeting at 48:
r1 = [(0, 0.0); (11, 1.414213562); (12, 2.414213562); (23, 3.828427125);
(34, 5.242640687); (35, 6.242640687); (46, 7.656854249); (47, 8.656854249);
(48, 9.656854249)]
r2 = [(48, 0.0); (58, 1.414213562); (68, 2.414213562); (78, 3.414213562);
(88, 4.414213562); (99, 5.414213562)]
meeting at 84:
r1 = [(0, 0.0); (11, 1.414213562); (21, 2.414213562); (32, 3.828427125);
(43, 5.242640687); (53, 6.242640687); (64, 7.656854249); (74, 8.656854249);
(84, 9.656854249)]
r2 = [(84, 0.0); (85, 1.414213562); (86, 2.414213562); (87, 3.414213562);
(88, 4.414213562); (99, 5.414213562)]
meeting at 95:
r1 = [(0, 0.0); (11, 1.414213562); (21, 2.414213562); (32, 3.828427125);
(43, 5.242640687); (53, 6.242640687); (64, 7.656854249); (75, 9.071067812);
(85, 10.07106781); (95, 11.07106781)]
r2 = [(95, 0.0); (96, 1.0); (97, 2.0); (98, 3.0); (99, 4.0)]
...