Sum up shortest paths in a weighted network - r

I have a graph with 340 nodes and 700 links. As a performance indicator for the network, I want to compute the sum of all weighted shortest paths in it.
I tried the all_shortest_paths command from the igraph package, but my system doesn't have enough RAM to store the result.
Can someone recommend a package or code that computes the sum of all shortest paths directly, so the big matrix is not needed?
For unweighted networks there is the mean_distance command, which does basically the same thing, doesn't it?

You could try the package dodgr. With
dodgr_dists(graph)
you can generate a square matrix of distances between your nodes (more info).
Note: This will only work if your graph is directed.
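For example, a sketch of how the summing could look, assuming your edges live in a data frame with the from/to/d columns that dodgr expects (the names and values below are made up):
library(dodgr)
# hypothetical edge list in dodgr's minimal format: from, to, d (weight)
edges <- data.frame(
  from = c("A", "A", "B", "C"),
  to   = c("B", "C", "C", "A"),
  d    = c(1.0, 2.5, 1.2, 0.7)
)
dmat <- dodgr_dists(edges)           # square matrix of pairwise shortest distances
total <- sum(dmat[is.finite(dmat)])  # sum them up, skipping unreachable pairs
With 340 nodes the distance matrix is only 340 x 340, so it is storing all the individual paths, not the distances, that exhausts RAM.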

Related

All-pairs shortest paths in a directed graph with non-negative weighted edges

I have a directed graph with non-negative weighted edges, where there can be multiple edges between two vertices.
I need to compute all-pairs shortest paths.
This graph is very big (20 million vertices and 100 million edges).
Is Floyd–Warshall the best algorithm? Is there a good library or tool to complete this task?
Several all-to-all shortest-path algorithms exist for directed graphs without negative cycles, Floyd–Warshall probably being the most famous, but with the figures you gave I think you will run into memory issues in any case (time could be an issue too, though you can find all-to-all algorithms that are easily and massively parallelizable).
Independently of the algorithm you use, you will have to store the result somewhere, and storing 20,000,000² = 400,000,000,000,000 path lengths (if not the full paths themselves) would use hundreds of terabytes at the very least: even at four bytes per length that is already 1.6 petabytes.
Accessing any of these stored results would probably take longer than computing one shortest path on the fly (the memory wall), which can be done in less than a millisecond (depending on the graph structure, you can find techniques that are much, much faster than Dijkstra or any priority-queue-based algorithm).
To be honest, I think you should look for an alternative where computing all-to-all shortest paths is not required, or study the structure of your graph (a DAG, a well-structured graph that is easy to partition/cluster, geometric/geographic information, ...) in order to apply specialized algorithms, because in the general case I do not see any way around it.
For example, with the figures you gave, an average degree of about 5 (100M edges / 20M vertices) makes for a decently sparse graph considering its dimensions, so graph partitioning approaches could be very useful.
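If all you need is an aggregate (a sum, a mean), one workaround is to compute single-source distances one vertex at a time and never materialize the full matrix. A sketch with igraph in R, assuming a graph g with a weight edge attribute that itself fits in memory, which at this scale is already a strong assumption:
library(igraph)
running_sum <- 0
for (s in V(g)) {
  # one row of the distance matrix at a time (20M doubles, ~160 MB here)
  d <- distances(g, v = s, mode = "out", weights = E(g)$weight)
  running_sum <- running_sum + sum(d[is.finite(d)])
}
This only trades the memory bottleneck for a time bottleneck (one Dijkstra run per vertex), which is why avoiding all-to-all entirely, as suggested above, is the better route.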

Graph Shortest Paths w/Dynamic Weights (Repeated Dijkstra? Distance Vector Routing Algorithm?) in R / Python / Matlab

I have a graph of a road network with avg. traffic speed measures that change throughout the day. Nodes are locations on a road, and edges connect different locations on the same road or intersections between 2 roads. I need an algorithm that solves the shortest travel time path between any two nodes given a start time.
Clearly, the graph has dynamic weights, as the travel time for an edge i is a function of the speed of traffic at this edge, which depends on how long your path takes to reach edge i.
I have implemented Dijkstra's algorithm with
edge weights = (edge_distance / edge_speed_at_start_time)
but this ignores that edge speed changes over time.
My questions are:
Is there a heuristic way to use repeated calls to Dijkstra's algorithm to approximate the true solution?
I believe the 'Distance Vector Routing Algorithm' is the proper way to solve such a problem. Is there a way to use the igraph library, or another library in R, Python, or Matlab, to implement this algorithm?
EDIT
I am currently using Igraph in R. The graph is an igraph object. The igraph object was created using the igraph command graph.data.frame(Edges), where Edges looks like this (but with many more rows):
I also have a matrix of the speed (in MPH) of every edge for each time, which looks like this (except with many more rows and columns):
Since I want to find shortest travel time paths, then the weights for a given edge are edge_distance / edge_speed. But edge_speed changes depending on time (i.e. how long you've already driven on this path).
The graph has 7048 nodes and 7572 edges (so it's pretty sparse).
There exists an exact algorithm that solves this problem! It is called time-dependent Dijkstra (TDD) and runs about as fast as Dijkstra itself.
Unfortunately, as far as I know, neither igraph nor NetworkX has implemented this algorithm, so you will have to do some coding yourself.
Luckily, you can implement it yourself! You only need to adapt Dijkstra in a single place.
In normal Dijkstra you compute the tentative distance as follows, with dist your current array of tentative distances, u the node you are considering, and v its neighbor:
alt = dist[u] + travel_time(u, v)
In time-dependent Dijkstra we get the following:
current_time = start_time + dist[u]
cost = weight(u, v, current_time)
alt = dist[u] + cost
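A minimal self-contained sketch of this in R (no priority queue, so O(n^2) per query, which is fine at ~7,000 nodes). The adjacency list adj and the weight(u, v, t) function, which should return the travel time of edge (u, v) when it is entered at clock time t, are assumptions you would build from your Edges data frame and speed matrix:
tdd_dijkstra <- function(adj, source, start_time, weight) {
  n <- length(adj)
  dist <- rep(Inf, n)    # tentative travel times from source
  done <- rep(FALSE, n)  # settled nodes
  dist[source] <- 0
  for (i in seq_len(n)) {
    # pick the unsettled node with the smallest tentative distance
    cand <- replace(dist, done, Inf)
    u <- which.min(cand)
    if (!is.finite(cand[u])) break  # remaining nodes are unreachable
    done[u] <- TRUE
    for (v in adj[[u]]) {
      current_time <- start_time + dist[u]        # clock time when leaving u
      alt <- dist[u] + weight(u, v, current_time)
      if (alt < dist[v]) dist[v] <- alt
    }
  }
  dist
}
Here weight(u, v, t) would look up the speed of edge (u, v) in your time-by-edge speed matrix at time t and return edge_distance / edge_speed.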
Time-dependent Dijkstra was described in: Dreyfus, S. E., "An appraisal of some shortest-path algorithms", Operations Research 17(3):395–412, 1969.
Currently, much faster heuristics are already in use. They can be found with the search term: 'Time dependent routing'.
What about the igraph package in R? You can try the get.shortest.paths or get.all.shortest.paths function.
library(igraph)
?get.all.shortest.paths
get.shortest.paths()
get.all.shortest.paths()  # with weights = NULL, the graph's weight edge attribute is used (Dijkstra)
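For instance, with travel times frozen at the start time (the column names on Edges are assumptions):
library(igraph)
g <- graph.data.frame(Edges)                   # as in the question
wts <- Edges$distance / Edges$speed_at_start   # travel time per edge
get.shortest.paths(g, from = "A", to = "B", weights = wts)
As the question itself notes, this treats speeds as fixed at the start time; for genuinely time-dependent costs see the time-dependent Dijkstra sketch above.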

How to calculate the betweenness using the random walk algorithm?

The igraph package calculates betweenness using shortest paths between nodes.
http://igraph.sourceforge.net/doc/R/betweenness.html
Now I want to calculate betweenness based on random walks, as in:
M. E. J. Newman, "A measure of betweenness centrality based on random walks", Social Networks 27, 39-54 (2005).
I know that NetworkX in Python can compute this measure, but it throws a memory error because of the large network I used.
Does anyone have a suggestion on how to calculate random-walk betweenness?
Thanks!
After running for three days and nights, the computer finally obtained the random-walk betweenness result using NetworkX.
The graph I used consists of about six thousand nodes and 5 million edges, and the computer has 16 GB of RAM.
The solver was left at the default 'full' (which uses the most memory) rather than the recommended 'lu'.
This link also mentions the long run time of calculating random-walk betweenness with NetworkX.

correlation matrix to build networks

I have used the mixOmics package in R for a canonical correlation analysis of two matrices, and I have a resulting correlation matrix. I would like to build a correlation network from it. I thought at first of using the Gene Set Correlation Analysis package, but I do not know how to install it and there are no instructions on the internet for installing it in R (http://www.biostat.wisc.edu/~kendzior/GSCA/).
Could you also suggest what other packages I could use to build networks with a correlation matrix as input? I thought of Rgraphviz but do not know if that is possible.
Copying this answer mostly from my previous answer at https://stackoverflow.com/a/7600901/567015
The qgraph package is mostly intended to visualize correlation matrices as a network. It plots variables as nodes and correlations as edges connecting them. Green edges indicate positive correlations, red edges indicate negative correlations, and the wider and more saturated an edge, the stronger the absolute correlation.
For example (this is the first example from the help page), the following code plots the correlation matrix of a 240-variable dataset.
library("qgraph")
data(big5)
data(big5groups)
qgraph(cor(big5), minimum = 0.25, cut = 0.4, vsize = 2,
       groups = big5groups, legend = TRUE, borders = FALSE)
title("Big 5 correlations", line = -2, cex.main = 2)
You can also cluster strongly correlated nodes together (using the Fruchterman–Reingold layout), which creates quite a clear picture of what the structure of your correlation matrix actually looks like.
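For instance, a small variation of the example above with the 'spring' (Fruchterman–Reingold) layout:
qgraph(cor(big5), layout = "spring", minimum = 0.25, cut = 0.4,
       vsize = 2, groups = big5groups, legend = TRUE, borders = FALSE)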
For an extensive introduction take a look at http://www.jstatsoft.org/v48/i04/paper
You might also want to take a look at the network and sna packages on CRAN. Both include tools for converting a matrix into a network data object.
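For example, a minimal sketch that thresholds a correlation matrix into an undirected network object (the toy data and the 0.3 cutoff are arbitrary illustrations):
library(network)
cors <- cor(matrix(rnorm(100), ncol = 5))  # 5 x 5 correlation matrix from toy data
adj <- abs(cors) > 0.3                     # keep only |r| > 0.3
diag(adj) <- FALSE                         # drop self-correlations
net <- network(adj, directed = FALSE)      # adjacency matrix -> network object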

Is there an equivalent of igraph for MATLAB?

Hi,
I am looking for igraph, or possibly something similar, for MATLAB to compute common network parameters like the clustering coefficient.
Thank you!
You could try the Matlab interface to the Boost Graph Library, which includes clustering coefficient, minimum spanning tree, shortest paths, etc.
http://www.stanford.edu/~dgleich/programs/matlab_bgl/
http://www.mathworks.com/matlabcentral/fileexchange/10922
