Generate an OD list of nodes within n stops - r

I have a graph G(V,E) with 3,500 nodes and 35,000 edges.
Is there any way I can generate an origin-destination list of the nodes within n (say 4) stops of each node?

I think the function neighborhood() does exactly what you want. Set the order argument to 4 and for each vertex you'll get a vector of vertex ids for the vertices that are at most 4 steps away from it.
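As a language-neutral illustration of what such a neighborhood query computes, here is a minimal sketch in plain Python: a breadth-first search that stops expanding once it is n stops from the origin. The names `within_n_stops` and `od_list`, and the dictionary-of-lists graph representation, are assumptions made for this example and are not part of igraph. The sketch leaves the origin itself out of its own destination list, which is a choice made here.

```python
from collections import deque

def within_n_stops(adj, source, n):
    """Return the set of vertices reachable from `source` in at most `n` edges."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == n:          # do not expand beyond n stops
            continue
        for v in adj.get(u, ()):
            if v not in dist:     # first visit gives the fewest stops
                dist[v] = dist[u] + 1
                queue.append(v)
    return set(dist) - {source}

def od_list(adj, n):
    """Origin-destination pairs (o, d) where d is at most n stops from o."""
    return [(o, d) for o in adj for d in sorted(within_n_stops(adj, o, n))]

# Toy undirected path graph 1-2-3-4-5-6 (hypothetical data).
adj = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4, 6], 6: [5]}
pairs = od_list(adj, 4)
```

Each search touches only the vertices and edges within n stops of its origin, so for a sparse graph of a few thousand nodes one search per node should be inexpensive.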

I figured it out:
Use a property of the adjacency matrix A: the entry in row i and column j of A^n gives the number of (directed or undirected) walks of length n from vertex i to vertex j. So for n stops, construct the n matrices A^1, A^2, ..., A^n. The union of their nonzero entries (equivalently, the nonzero entries of A^1 + A^2 + ... + A^n) is then the matrix representing the destinations reachable from each origin within n stops.
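The matrix-power idea can be sketched in plain Python with boolean matrices, since only the zero/nonzero pattern matters for reachability. The function names `bool_matmul` and `reachable_within` are assumptions for this illustration. Note that diagonal entries can become true (a walk may return to its origin), so they would need to be ignored when building an origin-destination list.

```python
def bool_matmul(X, Y):
    """Boolean matrix product: entry (i, j) is True if some k links i to j."""
    size = len(X)
    return [[any(X[i][k] and Y[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def reachable_within(A, n):
    """R[i][j] is True when some walk of length 1..n leads from i to j."""
    size = len(A)
    power = [[bool(x) for x in row] for row in A]   # pattern of A^1
    reach = [row[:] for row in power]
    for _ in range(n - 1):
        power = bool_matmul(power, A)               # pattern of the next power
        reach = [[reach[i][j] or power[i][j] for j in range(size)]
                 for i in range(size)]
    return reach

# Toy undirected path graph 0-1-2-3 (hypothetical data).
A = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
R2 = reachable_within(A, 2)
```

With dense matrices each multiplication costs on the order of |V|^3 operations, so for larger graphs a bounded breadth-first search from each node is likely the cheaper route.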

Related

Alternative for shortest_path algorithm

I have a network consisting of 335 nodes. I computed the weighted shortest.paths between all of the nodes.
Now I would like to see which path sequences were used to travel between the nodes.
I use the shortest_paths command in igraph and iterate through all combinations of nodes in my network: (335² - 335)/2 combinations, since the path from a node to itself is 0 and the graph is undirected. So all in all I have to iterate over 55,945 combinations.
My approach looks like this:
net is my network
sp_data is a df with all combinations of links in the network
results1 <- sapply(sp_data[,1], function(x){shortest_paths(net, from = x, to = V(net), output="epath"})
Unfortunately this needs ages to compute and at the end I don't have enough memory to store the information. (Error: cannot allocate vector of size 72 Kb).
Basically I have two questions:
How can it be that the shortest.paths command needs seconds to compute the distances between all nodes of my network, whereas extracting the path sequences (not just their lengths) needs days and exceeds the memory capacity?
Is there an alternative way to get the desired output (the path sequences of the shortest paths)? I guess that the sapply syntax should already be faster than a for loop?
You could try the cppRouting package.
It provides the get_multi_paths function, which returns a list containing the node sequence for each shortest path.
library(igraph)
library(cppRouting)
#example graph (complete graph on 335 nodes)
g <- make_full_graph(335)
#convert to three columns data.frame
df<-data.frame(igraph::as_data_frame(g),dist=1)
#instantiate cppRouting graph
gr<-cppRouting::makegraph(df)
#extract all nodes
all_nodes<-unique(c(df$from,df$to))
#Get all paths sequence
all_paths<-get_multi_paths(Graph=gr,from=all_nodes,to=all_nodes)
#Get path from node 1 to 3
all_paths[["1"]][["3"]]
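On the question of why distances are cheap while path sequences are expensive: a distance table needs one number per node pair, whereas storing every path explicitly needs one sequence per pair. Note also that the sapply call in the question asks for the paths to all vertices once per row of sp_data, so the same source is likely processed many times. One common way around both costs, sketched below in plain Python under the assumption of non-negative weights, is to run a single Dijkstra search per source, keep only a predecessor table, and rebuild individual paths on demand. The names `dijkstra_tree` and `path_to` and the toy graph are hypothetical.

```python
import heapq

def dijkstra_tree(adj, source):
    """Return (dist, pred) for one source; pred encodes a shortest path to every node."""
    dist = {source: 0}
    pred = {source: None}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue                      # stale heap entry, already improved
        for v, w in adj[u]:
            nd = d + w
            if v not in dist or nd < dist[v]:
                dist[v] = nd
                pred[v] = u               # remember how we reached v
                heapq.heappush(heap, (nd, v))
    return dist, pred

def path_to(pred, target):
    """Rebuild the node sequence from the source to `target`, or None if unreachable."""
    if target not in pred:
        return None
    path = []
    while target is not None:
        path.append(target)
        target = pred[target]
    return path[::-1]

# Toy weighted undirected graph (hypothetical data): node -> [(neighbor, weight)].
adj = {
    "a": [("b", 1), ("c", 4)],
    "b": [("a", 1), ("c", 2), ("d", 6)],
    "c": [("a", 4), ("b", 2), ("d", 3)],
    "d": [("b", 6), ("c", 3)],
}
dist, pred = dijkstra_tree(adj, "a")
```

The predecessor table takes one entry per node per source, so for 335 nodes the full set of tables is about 335 × 335 entries, and a path is only materialized when it is actually needed.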

igraph: get and permute vertex ids

In the R package igraph, how are IDs assigned to each vertex? Is there a function similar to get.edge.ids that returns the ID of a vertex? For example, with g = graph.ring(3); V(g)$name=LETTERS[1:3], a call like get.vertex.ids(V(g)['A']) would return the ID of the vertex 'A', which is 1 in this example. And how can we change the IDs of the vertices? Surely we cannot change the ID of only one node, but can we permute the vertex IDs?
Use permute.vertices() to permute the IDs of the vertices. Note that the vertex IDs are always integers between 1 and |V| in R (where |V| is the number of vertices).
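To make concrete what permuting vertex IDs means, here is a small plain-Python sketch on an edge list. It assumes the convention that `perm[i]` is the new ID of the vertex whose old ID is i, with IDs running from 1 to |V|; the function name `permute_vertices` and the data layout are made up for this illustration and are not igraph's implementation.

```python
def permute_vertices(edges, names, perm):
    """Relabel vertices: perm[i] is the new id of the vertex with old id i."""
    n = len(names)
    if sorted(perm.values()) != list(range(1, n + 1)):
        raise ValueError("perm must be a permutation of 1..|V|")
    new_edges = [(perm[u], perm[v]) for u, v in edges]
    new_names = {perm[old]: name for old, name in names.items()}
    return new_edges, new_names

# Ring on three vertices named A, B, C (ids 1, 2, 3).
edges = [(1, 2), (2, 3), (3, 1)]
names = {1: "A", 2: "B", 3: "C"}

# Looking up an id by name is just an inverted table.
id_of = {name: vid for vid, name in names.items()}

new_edges, new_names = permute_vertices(edges, names, {1: 3, 2: 1, 3: 2})
```

The structure of the graph is unchanged by the relabeling; only the integers attached to each named vertex differ.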

Weakly connected graph traversal from least number of nodes

I've been given the following exercise: There's an unweighted, directed, weakly connected graph with n nodes (n < 1 000 000). We want to traverse the whole graph, starting from the least number of nodes. The question is: from which nodes do I start the traversals? I couldn't find any content on this particular topic. However, I managed to come up with an algorithm, but it's not efficient enough:
I store the graph in an adjacency list (n can be too high for a two-dimensional matrix)
I start a BFS from each node i, and store the nodes it reached in x[i][...] (x = List<List<int>>)
I check whether any x[i].Count == n
I check whether any (x[i] union x[j]).Count == n
I check whether any (x[i] union x[j] union x[k]).Count == n
... So I make all possible unions of 2, 3, 4... subsets of x, and check whether its count is n.
It works all right if n is not too high, but I would need a more efficient algorithm for bigger n.
Any help is appreciated (you would make me be able to fall asleep again)! :)
Find the nodes that do not have any incoming edges. Loop over these nodes, and for each node v, begin traversing the graph. Remember which nodes you visited (by putting them in a hash table or marking them). Stop traversing when you reach a node you have already visited.
You would need an adjacency list representation, where each node has a list of incoming and a list of outgoing edges. Then do something like this:
Set nodesToVisit = emptySet;
for i = 1 to n:
    if incoming[i].size() == 0:
        nodesToVisit.add(i)
Set visited = emptySet;
while nodesToVisit is not empty:
    v = nodesToVisit.removeAny()
    if (v is not in visited):
        visit(v);
        visited.add(v);
        for u in outgoing[v]:
            nodesToVisit.add(u)
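One caveat: starting only from nodes without incoming edges covers the whole graph when it is acyclic, but a directed cycle that nothing else points into has no such node. A more general formulation, sketched below in plain Python, is to pick one node from every strongly connected component that has no edge entering it from another component. This is a hedged sketch using an iterative two-pass (Kosaraju-style) labeling so that it does not hit recursion limits for large n; the function name `minimal_start_nodes` and the 0-based node numbering are assumptions of this example.

```python
def minimal_start_nodes(n, edges):
    """Return one start node per strongly connected component with no incoming edge."""
    out = [[] for _ in range(n)]
    inc = [[] for _ in range(n)]
    for u, v in edges:
        out[u].append(v)
        inc[v].append(u)

    # Pass 1: iterative DFS on the graph, recording nodes in finish order.
    order, seen = [], [False] * n
    for s in range(n):
        if seen[s]:
            continue
        seen[s] = True
        stack = [(s, 0)]
        while stack:
            u, i = stack.pop()
            if i < len(out[u]):
                stack.append((u, i + 1))      # come back for the next neighbor
                v = out[u][i]
                if not seen[v]:
                    seen[v] = True
                    stack.append((v, 0))
            else:
                order.append(u)               # all neighbors done: u is finished

    # Pass 2: sweep the reversed graph in decreasing finish order to label components.
    comp, count = [-1] * n, 0
    for s in reversed(order):
        if comp[s] != -1:
            continue
        comp[s] = count
        stack = [s]
        while stack:
            u = stack.pop()
            for v in inc[u]:
                if comp[v] == -1:
                    comp[v] = count
                    stack.append(v)
        count += 1

    # A component is a source if no edge enters it from a different component.
    has_incoming = [False] * count
    for u, v in edges:
        if comp[u] != comp[v]:
            has_incoming[comp[v]] = True
    starts = {}
    for v in range(n):
        c = comp[v]
        if not has_incoming[c] and c not in starts:
            starts[c] = v                     # lowest-numbered node of the component
    return sorted(starts.values())

# Toy graph: cycle 0 <-> 1 feeding 2, node 3 also feeding 2, and 2 -> 4.
starts = minimal_start_nodes(5, [(0, 1), (1, 0), (1, 2), (3, 2), (2, 4)])
```

Both passes are linear in the number of nodes and edges, which avoids the subset enumeration described in the question.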

How to copy a vertex with its respective edges (all/in/out) from a directed graph g to a new directed graph g1?

Is there a method or a class in igraph to do this procedure quickly and effectively?
Let's assume that your graph is in g and the set of vertices to be used is in sampled (which is a vector consisting of zero-based vertex IDs).
First, we select the set of edges where at least one endpoint is in sampled:
all.vertices <- (1:vcount(g)) - 1
es <- E(g) [ sampled %--% all.vertices ]
es is now an "edge sequence" object that consists of the edges of interest. Next, we take the edge list of the graph (which is an m x 2 matrix) and select the rows corresponding to the edges:
el <- get.edgelist(g)[as.vector(es)+1, , drop=FALSE]
Here, as.vector(es) converts the edge sequence into a vector of the edge IDs it contains, which we then use to select the corresponding rows of the edge list. Note that we had to add 1 to the edge IDs because R vectors are indexed from 1 but igraph edge IDs start from zero.
Next, we construct the result from the edge list (graph() expects the endpoints as a flat vector of consecutive pairs, hence the transpose):
g1 <- graph(t(el), vcount(g), directed=is.directed(g))
Note that g1 will contain exactly as many vertices as g. You can take the subgraph consisting of the sampled vertices as follows:
g1 <- subgraph(g1, sampled)
Note to users of igraph 0.6 and above: igraph 0.6 will switch to 1-based indexing instead of 0-based, so there is no need to subtract 1 from all.vertices and there is no need to add 1 to as.vector(es). Furthermore, igraph 0.6 will contain a function called subgraph.edges, so one could simply use this:
g1 <- subgraph.edges(g, es)
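The selection logic itself does not depend on igraph. As a rough plain-Python sketch on a directed edge list: keep the edges with at least one endpoint in the sampled set, or split a single vertex's edges into incoming and outgoing ones. All function names here are hypothetical and chosen only for this illustration.

```python
def incident_edges(edges, sampled):
    """Edges with at least one endpoint in `sampled` (the 'all' case)."""
    sampled = set(sampled)
    return [(u, v) for u, v in edges if u in sampled or v in sampled]

def in_edges(edges, vertex):
    """Directed edges that end at `vertex`."""
    return [(u, v) for u, v in edges if v == vertex]

def out_edges(edges, vertex):
    """Directed edges that start at `vertex`."""
    return [(u, v) for u, v in edges if u == vertex]

def induced_edges(edges, sampled):
    """Edges with both endpoints in `sampled` (the induced subgraph)."""
    sampled = set(sampled)
    return [(u, v) for u, v in edges if u in sampled and v in sampled]

# Toy directed graph (hypothetical data).
edges = [(1, 2), (2, 3), (3, 1), (3, 4)]
copied = incident_edges(edges, [1])
```

The difference between the first and the last function mirrors the two steps above: first collect every edge touching the sampled vertices, then optionally restrict to the subgraph spanned by the sampled vertices alone.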

Partitioning adjacency matrix of bipartite graph

Let's say I have a graph G with adjacency matrix A. I know that G is bipartite.
How can I split the vertices of G into two sets such that every edge joins a vertex in one set to a vertex in the other?
Thanks!
Declare an array, which, of size equal to the number of vertices, setting each element to 0 initially. Then perform a depth-first search through the graph, recording the "level number" that you are on as you go. This starts at 1, and alternates between 1 and 2 with each edge traversed. For every vertex reached, assign the current level to the corresponding entry of which, and (if it was previously 0) recurse to process its children. Afterwards, all elements of which will be either 1 or 2, and which[i] indicates which set vertex i belongs to.
Intuitively, you can imagine that each traversal from parent to child in the DFS takes you "down" a level, and each traversal back takes you back "up". By the bipartite property, all vertices on even levels can be connected only to vertices on odd levels and vice versa, so labelling nodes "even" or "odd" suffices to partition them into the two sets.
If your graph contains more than one component, you will of course need a separate DFS for each component.
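The labelling described above can be sketched in plain Python directly on the adjacency matrix. This version uses an explicit stack instead of recursion and restarts the search in every unvisited component; the function name `bipartition` is an assumption of this example, and the extra check that raises an error when two adjacent vertices receive the same label is an addition for safety, not something the question requires.

```python
def bipartition(A):
    """Label each vertex 1 or 2 so that every edge joins a 1 to a 2."""
    n = len(A)
    which = [0] * n                       # 0 means "not visited yet"
    for start in range(n):
        if which[start]:
            continue                      # already labelled via an earlier component
        which[start] = 1
        stack = [start]
        while stack:
            u = stack.pop()
            for v in range(n):
                if not A[u][v]:
                    continue
                if which[v] == 0:
                    which[v] = 3 - which[u]   # alternate between 1 and 2
                    stack.append(v)
                elif which[v] == which[u]:
                    raise ValueError("graph is not bipartite")
    return which

# Toy 4-cycle 0-1-2-3-0 (hypothetical data).
A = [[0, 1, 0, 1],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [1, 0, 1, 0]]
which = bipartition(A)
set_one = [i for i, label in enumerate(which) if label == 1]
set_two = [i for i, label in enumerate(which) if label == 2]
```

Scanning a full matrix row per vertex makes this quadratic in the number of vertices, which is inherent to the adjacency-matrix input; with adjacency lists the same idea runs in time linear in vertices plus edges.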

Resources