Convert a map with houses into a graph - graph

I am curious how map software (Google/Bing Maps) converts a map into a graph in the backend.
Now if we add houses between intersections 1 and 2, how would the graph change? How does map software keep track of where the houses are?
Do they index the intersection nodes and also have smaller "subnodes" (between 1 and 2 in this case)? Or do they do this by having multiple layers? So when a user enters a house number, it looks up where the home is (i.e. between which vertices the home is located). After that, they simply apply a shortest path algorithm between those two nodes and, at the beginning and the end, connect the home node to one of the main vertices.
Could someone please give me a detailed explanation of how this works? Ultimately I would like to understand how the shortest path is determined given the "addresses" of two "homes" (or "subnodes").

I can only speak for GraphHopper, not for the closed source services you mentioned ;)
GraphHopper has nodes (junctions) and edges (connections between those junctions), nearly exactly how your sketch looks like. This is very fast for the routing algorithms because it avoids the massive overhead of traversing subnodes. For example, an early version used subnodes every time a connection was not straight (e.g. a curved street), and this was 8 times slower. So we dropped those 'pillar' nodes from routing and only use the 'tower' nodes.
Still you have to deal with two problems:
How do you deal with queries that start on an edge, e.g. at house number 1? This is solved by introducing virtual nodes for every query (a query can contain multiple locations); you also need additional virtual edges and have to hide some real edges. In GraphHopper we create a lightweight wrapper graph around the original graph (called QueryGraph) which handles all this. It then behaves exactly like a normal 'Graph' for every 'RoutingAlgorithm' such as Dijkstra or A*. It becomes a bit hairy when you have multiple query locations on one edge, e.g. for a route with multiple via points, but I hope you get the main idea. Another idea would be to do the routing with two sources and two targets, initialized with the actual distance instead of 0 as is normally done for the first nodes. But I guess this makes the routing algorithms more complex.
And as already stated, most of the connections between junctions are not straight, so you have to store this geometry somewhere and use it to draw the route, but also to 'snap a location to the closest road' before finally doing the actual routing. See LocationIndexTree for code.
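To make the virtual-node idea concrete, here is a minimal Python sketch. It is not GraphHopper code: the function names (dijkstra, with_virtual_node), the dict-of-dicts graph layout and the example road lengths are all assumptions made for illustration. It assumes an undirected graph and at most one query location per edge, and it copies the graph instead of wrapping it.

```python
import heapq


def dijkstra(graph, source, target):
    """Shortest distance in a graph given as {node: {neighbour: length}}."""
    dist = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        d, node = heapq.heappop(queue)
        if node == target:
            return d
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry
        for nxt, length in graph[node].items():
            nd = d + length
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                heapq.heappush(queue, (nd, nxt))
    return float("inf")


def with_virtual_node(graph, name, u, v, offset):
    """Return a copy of graph where edge u-v is split `offset` away from u.

    Hypothetical helper: the real edge is hidden and replaced by two
    virtual edges that meet in the virtual node `name`.
    """
    length = graph[u][v]
    query = {n: dict(nbrs) for n, nbrs in graph.items()}
    del query[u][v]
    del query[v][u]
    query[name] = {u: offset, v: length - offset}
    query[u][name] = offset
    query[v][name] = length - offset
    return query


# Tower nodes A..D with made-up road lengths in metres.
roads = {
    "A": {"B": 100, "D": 400},
    "B": {"A": 100, "C": 100},
    "C": {"B": 100, "D": 100},
    "D": {"C": 100, "A": 400},
}
# House 1 sits 30 m from A on edge A-B, house 2 sits 80 m from C on edge C-D.
query_graph = with_virtual_node(roads, "house_1", "A", "B", 30)
query_graph = with_virtual_node(query_graph, "house_2", "C", "D", 80)
print(dijkstra(query_graph, "house_1", "house_2"))
```

The routing algorithm itself does not change; only the graph it sees is different, which is the point of the wrapper approach described above.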
Regarding directed graphs: GraphHopper stores the graph via undirected edges. To handle one-way streets it stores the access properties for every edge and for every vehicle separately. So we avoid storing two directed edges with all of their properties (name, geometry, ...), and we make use cases like "one-way for car and two-way for bike" possible. It additionally allows traversing an edge in the reverse direction, which is important for some algorithms, e.g. bidirectional Dijkstra. This would not be possible if the graph structure itself were used to model the access property.
Regarding 'nearly exactly how your sketch looks like': node 1, 3, 7 and 8 would not exist as they are 'pillar' nodes. Instead they would only 'exist' in the geometry of the edge.

To represent the connectivity of a road network, you want your directed road segments to be the graph nodes and your intersections to be collections of directed edges. There is a directed edge from X to Y if you can drive along X and then turn onto or continue on Y.
Consider the following example.
a====b====c
     |
     |  <--one way street, down
     |
     d
An example connectivity graph for this picture follows.
Nodes
ab
ba
bc
cb
bd
Edges
ab -> bc
ab -> bd
cb -> ba
cb -> bd
Note that this encodes the following information:
No U-turns are allowed at the intersection,
because the edges ab -> ba and cb -> bc are omitted.
When coming from the right a left turn onto the vertical road is allowed,
because the edge cb -> bd is included.
With this representation, each node (directed road segment) has as an attribute all of the addresses along its span, each marked at some distance along the directed road segment.
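As a rough illustration of this representation, the example connectivity graph can be written down directly and searched. This is only a sketch: the dictionary layout and the function name segment_route are my own choices, and a breadth-first search stands in for a real weighted shortest path.

```python
from collections import deque

# Directed road segments are the graph nodes; each entry lists the segments
# that may legally follow it (the edges of the example above).
turns = {
    "ab": ["bc", "bd"],
    "ba": [],
    "bc": [],
    "cb": ["ba", "bd"],
    "bd": [],
}


def segment_route(turns, start, goal):
    """Breadth-first search over directed segments; returns a list or None."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        seg = queue.popleft()
        if seg == goal:
            route = []
            while seg is not None:
                route.append(seg)
                seg = previous[seg]
            return route[::-1]
        for nxt in turns[seg]:
            if nxt not in previous:
                previous[nxt] = seg
                queue.append(nxt)
    return None


print(segment_route(turns, "cb", "bd"))  # left turn onto the one-way street
print(segment_route(turns, "ab", "ba"))  # U-turn, not reachable
```

Because turn restrictions live in the edges, the search never needs special cases for U-turns or one-way streets.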

Related

Creating a network graph with set node positions and concentrated edges with both circleheads and arrowheads in R

I've been trying to find a way to replicate the following network graph format in R using DiagrammeR/GraphViz, but without success (ignore the thick black arrow on N1): https://i.stack.imgur.com/oHpQz.png
The graph is a directed graph and each edge in a certain direction either ends with an arrowhead (-->) if the edge value is positive, or a circle/odot (--o) if the edge value is negative. Between a pair of nodes (ex. N1 -- A1), there can be an edge N1 --> A1 and an edge A1 --o N1, and these need to be concentrated so that the two edges look like one line with an arrowhead on one end and a circlehead on the opposite end (like this: o--->). These cannot be parallel or look like two edges ideally.
Another requirement is that the nodes have to be in very specific positions and remain there throughout model simulations where edges might change. From what I have tried and the documentation I have read, this is not possible to do in DOT format, but is possible in neato format.
This is where I get a problem. In neato, I can align the nodes exactly where I want them by defining their x,y positions. However, when I use concentrate = true to create the o---> edge from two otherwise parallel edges, only one type of arrowhead remains. So an edge that's supposed to look like o---> ends up looking like ---> or o---.
This is not a problem in DOT format as concentrate = true does what I want it to do, but in DOT I cannot assign exact node positions. I have tried getting around this using node ranks but without much luck. It seems to stack nodes I want in different ranks within the same rank. As well, concentrate = true doesn't seem to work for edges between nodes within the same rank, as it leaves them as two separate curved edges ---> and o--- without concentrating them.
The reason why I need this to work is because I'm running model simulations where the edges change, and I need to generate hundreds of such graphs. For easy comparison, the nodes need to stay in the same place for consistency.
This is the closest I could come up with using neato format (nodes are positioned the way I want but it's not showing the proper o---> for all the black edges minus self-edges; red edges are true one-way links): https://i.stack.imgur.com/YJBY7.jpg
If only the edges showed up as the proper o---> format, this would be perfect for my needs. If you know of any way to fix this issue using DiagrammeR/GraphViz, or even another program, I would be so grateful. Thanks!
You probably don't need concentrate. Look at arrowtail and dir (https://www.graphviz.org/doc/info/attrs.html#d:arrowtail and https://www.graphviz.org/doc/info/attrs.html#d:dir) and neato -n
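digraph c {
  graph [label="can neato do the work?"]
  node [shape=circle]
  // fixed positions, kept as-is when rendering with neato -n
  a [pos="100,100"]
  b [pos="200,100"]
  c [pos="300,100"]
  // one edge drawn with an arrowhead at b and an open circle at a
  a -> b [dir=both arrowtail=odot]
  // self-loop with only the circle marker
  c -> c [dir=both arrowtail=odot arrowhead=none]
}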
Giving:
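Since the question mentions generating hundreds of such graphs, here is a small Python sketch that writes this kind of DOT text from edge values. Everything in it is an assumption for illustration: the function name to_dot, the input layout (positions in points, one signed value per direction) and the rule that a missing or zero value means no marker. It only builds a string; rendering would still be done with neato -n as suggested above.

```python
def to_dot(positions, weights):
    """Build DOT text with one drawn line per node pair.

    positions: {node: (x, y)} in points.
    weights:   {(src, dst): value}; positive -> arrowhead, negative -> odot.
    """
    def marker(value):
        if value is None or value == 0:
            return "none"
        return "normal" if value > 0 else "odot"

    lines = ["digraph model {", "  node [shape=circle]"]
    for name, (x, y) in positions.items():
        lines.append(f'  {name} [pos="{x},{y}"]')
    done = set()
    for (src, dst), value in weights.items():
        if (dst, src) in done:
            continue  # the reverse direction was already drawn on this line
        done.add((src, dst))
        head = marker(value)
        # A self-loop has no separate reverse direction, so leave its tail bare.
        tail = "none" if src == dst else marker(weights.get((dst, src)))
        lines.append(
            f"  {src} -> {dst} [dir=both arrowhead={head} arrowtail={tail}]"
        )
    lines.append("}")
    return "\n".join(lines)


dot_text = to_dot(
    {"N1": (100, 100), "A1": (200, 100)},
    {("N1", "A1"): 0.5, ("A1", "N1"): -0.2},
)
print(dot_text)
```

The marker for the reverse edge A1 --o N1 is placed on the tail of N1 -> A1, because that is the end of the line that touches N1.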

Reduce openstreetmap graph size in networkx

I have a graph (transformed from OSMnx) of London's walking paths, containing 667.588 edges with different highway attributes (the street types in OpenStreetMap). Running a shortest_path algorithm is quite slow (4 seconds). To improve the speed, I want to greatly reduce the number of edges in a systematic way without losing the main connections and city structure, but I am not sure how to do it. Any suggestions? Is there a way to group nearby nodes into a more important one, and thus reduce the size?
You can extract edges with desired highway types from your main graph G:
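import networkx as nx

# Written against the NetworkX 1.x API (attr_dict keyword, assignable G.node).
highways_to_keep = ['motorway', 'trunk', 'primary']
H = nx.MultiDiGraph()
for u, v, attr in G.edges(data=True):
    # keep only edges whose 'highway' tag is in the list
    if attr['highway'] in highways_to_keep:
        H.add_edge(u, v, attr_dict=attr)
        # copy the node attributes (coordinates etc.) as well
        H.node[u] = G.node[u]
        H.node[v] = G.node[v]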
Here, we first initialize an empty MultiDiGraph, which is the type of graph used by OSMnx, and then populate it with data from the main graph G wherever the 'highway' attribute is in our list of highways_to_keep. You can find more about highway types on this OpenStreetMap page.
Our graph is a valid NetworkX graph, but you need to do one more thing before you can take advantage of OSMnx functionality as well. If you execute G.graph, you will see the graph attributes, which contain the crs (coordinate reference system) and some other things. You should add this information to your newly created graph:
H.graph = G.graph
Here is the plot of H, produced with osmnx.plot_graph(H):
It depends what type of network you're working with (e.g., walk, bike, drive, drive_service, all, etc.). The drive network type would be the smallest and prioritize major routes, but at the expense of pedestrian paths and passageways.
OSMnx also provides the ability to simplify the graph's topology with a built-in function. This is worth doing if you haven't already as it can reduce graph size by 90% sometimes while correctly retaining all intersection and dead-end nodes, as well as edge geometries, faithfully.
The above solution does not work anymore since the networkx library has changed. Specifically
H.node[u] = G.node[u]
is not supported anymore.
The following solution relies on osmnx.geo_utils.induce_subgraph and passes a node list as an argument to this function.
import osmnx as ox

highways_to_keep = ['motorway', 'trunk', 'primary', 'secondary', 'tertiary']
Hlist = []  # node list
for u, v, attr in G.edges(data=True):
    if "highway" in attr.keys():
        if attr['highway'] in highways_to_keep:
            # collect both endpoints so the edge survives in the subgraph
            Hlist.append(G.nodes[u]['osmid'])
            Hlist.append(G.nodes[v]['osmid'])
H = ox.geo_utils.induce_subgraph(G, Hlist)
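If you would rather not depend on an OSMnx helper, the same filtering idea can be sketched with plain NetworkX using edge_subgraph, which keeps node and graph attributes (including crs) when copied. The function name keep_highways and the tiny test graph are made up for illustration, and treating a list-valued 'highway' tag as "keep if any entry matches" is my assumption, not something stated above.

```python
import networkx as nx


def keep_highways(G, highways_to_keep):
    """Return a copy of G containing only edges whose 'highway' tag is wanted."""
    wanted = set(highways_to_keep)
    keep = []
    for u, v, key, highway in G.edges(keys=True, data="highway"):
        # Assumption: the tag is a single string, a list of strings, or missing.
        tags = highway if isinstance(highway, list) else [highway]
        if wanted.intersection(tags):
            keep.append((u, v, key))
    # edge_subgraph returns a view; copy() makes an independent graph.
    return G.edge_subgraph(keep).copy()


# Tiny stand-in for an OSMnx graph.
G = nx.MultiDiGraph(crs="epsg:4326")
G.add_node(1, x=0.0, y=0.0)
G.add_edge(1, 2, highway="primary")
G.add_edge(2, 3, highway="footway")
G.add_edge(3, 4, highway=["trunk", "residential"])
G.add_edge(4, 1)  # edge without a highway tag

H = keep_highways(G, ["motorway", "trunk", "primary"])
print(sorted(H.edges(keys=True)))
```

Unlike a node-induced subgraph, this keeps exactly the matching edges, so a footway between two kept nodes does not sneak back in.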
The osmnx simplification module worked for me in this case https://osmnx.readthedocs.io/en/stable/osmnx.html#module-osmnx.simplification:
osmnx.simplification module
Simplify, correct, and consolidate network topology.
osmnx.simplification.consolidate_intersections(G, tolerance=10, rebuild_graph=True, dead_ends=False, reconnect_edges=True)
Consolidate intersections comprising clusters of nearby nodes.
osmnx.simplification.simplify_graph(G, strict=True, remove_rings=True)
Simplify a graph’s topology by removing interstitial nodes.

Find the distance between two points on any 3d surface

I am making a game in Unity3d and I need a pathfinding algorithm that can guide enemies towards the player on a 3D surface. The problem is that the surface can take any shape: it can be a sphere, a cube, a torus and many more shapes.
I tried using A*, but for that I need the distance between two points, and since the object is curved I cannot get that so easily. I found that you can use the Haversine formula if it's a sphere, but that won't work on a torus or an arbitrary 3D shape.
I want this kind of result except with every kind of object:
https://www.youtube.com/watch?v=hvunNq7yVcU
Is there a way/algorithm that I can use to get that result? I know there is something called a nav mesh, but I need to program it myself. Also, I cannot find out how a nav mesh approaches this dilemma. I am going to use the triangles of my object as nodes.
So my question boils down to:
Does anyone know an algorithm for pathfinding that works on any 3D surface?
Thanks in advance.
I think your problem is that you are not using a graph. I would suggest looking for a tutorial on how to create a graph, in the language you are using if you can (an example that uses edges to connect its nodes may also help, since edges are needed if you have more than one weight). If you do make a graph you will need a node class. Each node must contain pointers to the nodes it is connected to and an ID of some kind. In your case that is probably all you need, but you can also assign a weight to each move if you add an edge class (connectors between nodes). In that case your nodes hold pointers to edges instead of to other nodes, and each edge has a weight and a pointer to 1 or 2 nodes (depending on whether it is a directed path or not). You can also make a graph class to contain all of your nodes and edges.
Summary:
Make a node class and determine whether you need the edge class (if everything has a weight of 1 you can get away without it). Use the node class to create a graph that represents your map, with each tile being a node with pointers to its connected tiles. Use A* or Dijkstra's algorithm to search your graph for the shortest path.
Note: most examples you will find are for 2D graphs. Yours is no different, except that it has no bounds; you just need to connect the nodes to their adjacent tiles.
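Here is a rough Python sketch of that idea for a triangle mesh (the question is about Unity, so treat this as pseudocode you would port). Triangles are the nodes, two triangles are connected if they share an edge, and the move cost is the distance between their centroids, which only approximates the true distance along the surface. The A* heuristic is the straight-line 3D distance to the goal, which never exceeds the distance along the surface. All names and the octahedron test mesh are my own assumptions.

```python
import heapq
import math
from collections import defaultdict


def centroid(vertices, tri):
    pts = [vertices[i] for i in tri]
    return tuple(sum(p[k] for p in pts) / 3.0 for k in range(3))


def build_triangle_graph(vertices, triangles):
    """Connect triangles that share an edge; weight = centroid distance."""
    centers = [centroid(vertices, t) for t in triangles]
    by_edge = defaultdict(list)
    for idx, (a, b, c) in enumerate(triangles):
        for edge in ((a, b), (b, c), (c, a)):
            by_edge[frozenset(edge)].append(idx)
    graph = defaultdict(dict)
    for tris in by_edge.values():
        for i in tris:
            for j in tris:
                if i != j:
                    graph[i][j] = math.dist(centers[i], centers[j])
    return graph, centers


def a_star(graph, centers, start, goal):
    def h(n):
        # Straight line through space: a lower bound on the surface path.
        return math.dist(centers[n], centers[goal])

    best = {start: 0.0}
    parent = {start: None}
    queue = [(h(start), start)]
    while queue:
        _, node = heapq.heappop(queue)
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt, w in graph[node].items():
            cost = best[node] + w
            if cost < best.get(nxt, math.inf):
                best[nxt] = cost
                parent[nxt] = node
                heapq.heappush(queue, (cost + h(nxt), nxt))
    return None


# Octahedron: vertices +x, -x, +y, -y, +z, -z and its 8 faces.
vertices = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
triangles = [(0, 2, 4), (2, 1, 4), (1, 3, 4), (3, 0, 4),
             (2, 0, 5), (1, 2, 5), (3, 1, 5), (0, 3, 5)]
graph, centers = build_triangle_graph(vertices, triangles)
print(a_star(graph, centers, 0, 6))  # from one face to the opposite face
```

Nothing here depends on the shape being a sphere or a torus; any mesh whose triangles share edges gives a usable graph.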

Clustering in Gephi (Louvain Method)

I have started to work with gephi to help me display a dataset.
The dataset contains:
tags (terms for a certain picture) as nodes
Normalized Google Similarity Distance between those tags as edges with a weight (between 0 and 1)
Every tag is connected to every other tag, as long as they both belong to the same picture. So I have one cluster of nodes and edges for every picture.
I have now imported this dataset to gephi in the following format:
nodes: id, label
edges: target, source, weight (between 0 and 1)
That is about 500 nodes and 6000 edges.
My problem now is that after importing all those nodes and edges the graph looks bunched up, with no real order. The cluster of every picture is mixed into the clusters of other pictures.
Using Modularity as the partition algorithm (which should use the Louvain method), the graph gets colored, and each color represents a picture. Now I can split this mess using the Force Atlas 2 layout.
I now have a colored graph with something like 15 clusters (every cluster represents 1 picture).
Now I want to cluster the tags (nodes) within those clusters again according to their Normalized Google Distance (the weight of the edges), which should then group tags that are roughly equal in meaning.
I hope you guys understand what I want to accomplish.
I can also upload a picture to clarify it.
Thanks a lot
I don't think you can do that with the standard version of Gephi. You would need to develop a plugin to implement the very last step of your process.
Gephi is good for visualizing and browsing graphs, but (for now) there are more complete tools when it comes to processing topological properties. For instance, the igraph library (available in C, R and Python) might be more appropriate for you. Note that you can use a file format compatible with both Gephi and igraph, which allows you to use both tools on the same data.
I was able to solve my problem. I had to import each of these 15 clusters on its own. In this way I could use the Modularity method on just those few nodes.

Find All Cycle Bases In a Graph, With the Vertex Coordinates Given

A similar question is posted here.
I have an undirected graph with vertex set V and edge set E. I am looking for an algorithm to identify all the cycle bases in that graph. An example of such a graph is shown below:
Now, all the vertex coordinates are known ( unlike previous question, and contrary to the explanation in the above diagram), therefore it is possible to find the smallest cycles that encompass the whole graph.
In this graph, it is possible that there are edges that don't form any cycles.
What is the best algorithm to do this?
Here's another example that you can take a look at:
Assuming that e1 is the edge that gets picked first, and the arrow shows the direction of the edge.
I haven't tried this and it is rather greedy but should work:
1. Pick one node.
2. Go to one of its neighbors.
3. Keep on going until you get back to your starting node, but you're not allowed to visit an old node.
4. If you get a cycle, save it unless it already exists or a subset of its nodes already makes up a saved cycle. If the nodes of the new cycle are a subset of the nodes of another saved cycle, remove the larger cycle (or maybe split it in two?).
5. Start over at 2 with a new neighbor.
6. Start over at 1 with a new node.
Comments: At 3 you should of course do the same thing as for step 2, so take all possible paths.
Maybe that's a start? As I said, I haven't tried it so it is not optimized.
EDIT: An undocumented and unoptimized version of one implementation of the algorithm can be found here: https://gist.github.com/750015. But it doesn't solve the problem completely, since it can only recognize "true" subsets.
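Since the vertex coordinates are known, a different technique from the greedy search above is also possible: trace the faces of the drawing by always taking the next edge clockwise at each vertex. For a connected planar drawing the bounded faces give E - V + 1 cycles, which is the size of a cycle basis. The Python sketch below is only that, a sketch: it assumes a planar straight-line drawing with no crossing edges, removes dangling edges first, and does not handle bridges that join two separate cycles. The function names are my own.

```python
import math
from collections import defaultdict


def prune_dangling(adj):
    """Repeatedly remove vertices of degree <= 1; they cannot lie on a cycle."""
    adj = {v: set(ns) for v, ns in adj.items()}
    stack = [v for v, ns in adj.items() if len(ns) <= 1]
    while stack:
        v = stack.pop()
        if v not in adj:
            continue
        for n in adj.pop(v):
            adj[n].discard(v)
            if len(adj[n]) <= 1:
                stack.append(n)
    return adj


def signed_area(cycle, pos):
    """Shoelace formula: positive for counterclockwise vertex order."""
    total = 0.0
    for i, u in enumerate(cycle):
        v = cycle[(i + 1) % len(cycle)]
        total += pos[u][0] * pos[v][1] - pos[v][0] * pos[u][1]
    return total / 2.0


def minimal_cycles(pos, edges):
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    adj = prune_dangling(adj)
    # Neighbours of every vertex sorted counterclockwise by angle.
    ring = {
        v: sorted(ns, key=lambda n: math.atan2(pos[n][1] - pos[v][1],
                                               pos[n][0] - pos[v][0]))
        for v, ns in adj.items()
    }
    visited = set()
    cycles = []
    for u in ring:
        for v in ring[u]:
            face = []
            a, b = u, v
            while (a, b) not in visited:
                visited.add((a, b))
                face.append(a)
                # Leave b along the neighbour that is next clockwise from a.
                nxt = ring[b][(ring[b].index(a) - 1) % len(ring[b])]
                a, b = b, nxt
            # Bounded faces come out counterclockwise; the outer face does not.
            if face and signed_area(face, pos) > 0:
                cycles.append(face)
    return cycles


# Two unit squares sharing the edge b-c, plus a dangling edge e-g.
pos = {"a": (0, 0), "b": (1, 0), "c": (1, 1), "d": (0, 1),
       "e": (2, 0), "f": (2, 1), "g": (3, 0)}
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a"),
         ("b", "e"), ("e", "f"), ("f", "c"), ("e", "g")]
print(minimal_cycles(pos, edges))
```

Each directed half-edge is visited exactly once, so the work is dominated by sorting the neighbours around each vertex.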

Resources