R, determine shortest path

I have a graph and need the shortest distance between every pair of nodes.
I wrote the following function:
shortestPath <- function(streets, length)
{
streets <- matrix(streets, byrow=TRUE, ncol=2) # from -> to
g <- graph.data.frame(as.data.frame(streets)) # create graph, see plot(g)
return <- shortest.paths(g, weights = length) # return routes lengths
}
Here streets is a vector of node pairs, one pair per edge, and length holds the length of each edge.
My graph has eight edges, each of length two; note that the graph has to be undirected.
You can use the following data to reproduce the problem.
# Data
edges <- c(1,2, 2,3, 3,4, 4,5, 2,6, 3,7, 4,8, 6,8);
length <- rep(2,8);
aantalNodes <- 8;
# Determine shortest path
routes <- matrix(shortestPath(edges,length), byrow=FALSE, ncol=aantalNodes);
We clearly see that the shortest path between node 6 and node 8 has length 2; however, this function returns length 4. What's going wrong? I have been tinkering with it for two days already. Looking forward to your help!

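Have a look at the rownames and colnames of shortestPath(edges,length); they are rather revealing. graph.data.frame() orders the vertices as it encounters them in the edge list, not numerically, and wrapping the result in matrix() throws the names away, so row 6 of routes is not node 6. Reorder the result by name instead:
res <- shortestPath(edges,length)
res[order(as.integer(rownames(res))),
    order(as.integer(colnames(res)))]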
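As a minimal sketch of the same idea, the graph can also be built with an explicit vertex list so that positions and node numbers agree. This assumes a current igraph in which graph_from_data_frame() and distances() are available (the newer names for graph.data.frame() and shortest.paths()); the vertices data frame and the variable names len, g0 and g1 are additions for illustration and are not part of the original code.

```r
library(igraph)

edges <- c(1,2, 2,3, 3,4, 4,5, 2,6, 3,7, 4,8, 6,8)
len   <- rep(2, 8)
el    <- as.data.frame(matrix(edges, byrow = TRUE, ncol = 2))

# Default: vertex order follows the edge list, so index by *name*, not position
g0 <- graph_from_data_frame(el, directed = FALSE)
d0 <- distances(g0, weights = len)
d0["6", "8"]

# Assumption: supplying the vertices explicitly fixes the order to 1..8,
# so positional indexing now matches the node numbers
g1 <- graph_from_data_frame(el, directed = FALSE,
                            vertices = data.frame(name = 1:8))
d1 <- distances(g1, weights = len)
d1[6, 8]
```

Indexing by name works in both cases; the explicit vertex list is only needed if the rest of the code relies on positions.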

Related

How to get all the shortest paths of the length of the diameter for iGraph object?

I want to get all the longest shortest paths for an igraph object. There is this function:
get.diameter (graph, directed = TRUE, unconnected = TRUE)
But it returns only one path: if several shortest paths have the length of the diameter, it returns the first one found.
You can easily extract which node pairs are connected at what distance from the shortest-distance matrix returned by shortest.paths(graph). In R, you can use which() with arr.ind=TRUE like so:
library(igraph)
longest.shortest.paths <- function(graph){
  # Return an edge list of all node pairs whose shortest path is as long as
  # the longest shortest path observed in the graph.
  # Get the lengths of all shortest paths in the graph
  # (named sp so that it does not shadow the shortest.paths() function)
  sp <- shortest.paths(graph)
  # Make sure that there are no Inf values caused by unreachable pairs (e.g. isolates)
  sp[sp == Inf] <- 0
  # Which node pairs in the distance matrix are linked by longest shortest paths?
  # Note: in an undirected graph every pair shows up twice, as (i,j) and (j,i).
  el <- which(sp == max(sp), arr.ind=TRUE)
  colnames(el) <- c("i","j")
  el
}
graph <- erdos.renyi.game(100, 140, "gnm", directed=FALSE)
longest.shortest.paths(graph)
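The function above returns the node pairs, not the paths themselves. A possible way to get the actual paths, sketched here on a toy ring graph, is to call all_shortest_paths() for each pair. This assumes an igraph version in which the result carries its paths in the $res element; the names pairs and paths are made up for this example.

```r
library(igraph)

g <- make_ring(6)                  # toy graph with diameter 3
d <- distances(g)
d[is.infinite(d)] <- 0             # ignore unreachable pairs

# Keep each unordered pair once by looking at the upper triangle only
pairs <- which(d == max(d) & upper.tri(d), arr.ind = TRUE)

# Collect every shortest path between each of those pairs into one flat list
paths <- do.call(c, lapply(seq_len(nrow(pairs)), function(k) {
  all_shortest_paths(g, from = pairs[k, 1], to = pairs[k, 2])$res
}))

length(paths)
paths[[1]]
```

On large graphs this loop can become expensive, since the number of diameter-length paths may grow quickly.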

Get subgraph of shortest path between n nodes

I have an unweighted graph and I want to get a subgraph that has just the nodes and edges that contain the shortest paths between n known nodes. In this case 3 nodes (11, 29, & 13 are the names).
Question
How can I get a subgraph of shortest path between n nodes in R?
MWE
library(ggraph)
library(igraph)
hs <- highschool[highschool$year == '1958',]
set.seed(11)
graph <- graph_from_data_frame(hs[sample.int(nrow(hs), 60),])
# plot using ggraph
ggraph(graph, layout = 'kk') +
geom_edge_fan() +
geom_node_text(aes(label = name))
Desired Output
The desired output would be the following green subgraph (Or close, I'm eyeballing the graph above and visually picking out what would be the subgraph) ignoring/removing the other nodes and edges.
Strictly speaking there is no shortest path between n nodes, since a shortest path is defined only between two nodes.
I think you want the shortest paths from one node to the other n-1 nodes. For that you can use
get_all_shortest_paths(v, to=None, mode=ALL) from the igraph library (this is the Python signature; the R counterpart is all_shortest_paths()).
v - the source for the calculated paths
to - a vertex selector describing the destination for the calculated paths. This can be a single vertex ID, a list of vertex IDs, a single vertex name, or a list of vertex names. None means all the vertices.
mode - the directionality of the paths. IN means to calculate incoming paths, OUT means to calculate outgoing paths, ALL means to calculate both.
Returns: all of the shortest paths from the given node to every other reachable node in the graph, as a list.
So, now you have to create a graph from the list of shortest paths. Either
initialize an empty graph and add every path from the list to it,
OR
make a graph for every shortest path found and take the union of those graphs.
You need a matrix of shortest paths to then create a sub-graph using a union of all edges belonging to those paths.
Let key vertices be those vertices between which your desired sub-graph appears. You say you have three such key vertices.
Consider that the shortest path between any i and j of them is unlist(shortest_paths(g, i, j, mode="all", weights=NULL)$vpath). You'd want to list all i-j combinations (1-2, 1-3, 2-3 in your case) of your key vertices, and then list all vertices that appear on the paths between them. Sometimes, surely, the same vertices appear on the shortest paths of more than one of your i-j pairs (see betweenness centrality). Your desired subgraph should include only these vertices, which you can give to induced_subgraph().
Then another interesting problem arises. Not all edges between your chosen vertices are part of your shortest paths. I'm not sure what you want in your sub-graph, but I assume that you only want vertices and edges that are part of shortest paths. The manual for induced_subgraph() says that eids can be provided to filter sub-graphs on edges too, but I didn't get that to work. Comments on that are welcome if anyone cracks it. To create a subgraph with only edges and vertices actually on your shortest paths, some surplus edges must be deleted.
Below is an example where some key vertices are chosen at random, the surplus-edge problem of subgraphs is visualized, and a proper shortest-paths-only subgraph is generated:
library(igraph)
N <- 40 # Number of vertices in a random network
E <- 70 # Number of edges in a random network
K <- 5 # Number of KEY vertices between which we are to calculate the
# shortest paths and extract a sub-graph.
# Make a random network
g <- erdos.renyi.game(N, E, type="gnm", directed = FALSE, loops = FALSE)
V(g)$label <- NA
V(g)$color <- "white"
V(g)$size <- 8
E(g)$color <- "gray"
# Choose some random vertices and mark them as KEY vertices
key_vertices <- sample(1:N, K)
g <- g %>% set_vertex_attr("color", index=key_vertices, value="red")
g <- g %>% set_vertex_attr("size", index=key_vertices, value=12)
# Find the shortest path between the two vertices in vector x:
get_path <- function(x){
  # Take an atomic vector of two key vertices and return their shortest path as a vector.
  i <- x[1]; j <- x[2]
  # Check the distance to see if either vertex is outside the component. No possible
  # connection will return infinite distance:
  if(distances(g,i,j) == Inf){
    path <- c()
  } else {
    path <- unlist(shortest_paths(g, i, j, mode="all", weights=NULL)$vpath)
  }
  path
}
# List pairs of key vertices between which we need the shortest path
key_el <- expand.grid(key_vertices, key_vertices)
key_el <- key_el[key_el$Var1 != key_el$Var2,]
# Get all shortest paths between each pair of key_vertices
# (lapply always returns a list; apply() may simplify equal-length paths to a matrix):
paths <- lapply(seq_len(nrow(key_el)), function(r) get_path(unlist(key_el[r,])))
# These are the vertices BETWEEN key vertices - ON the shortest paths between them:
path_vertices <- setdiff(unique(unlist(paths)), key_vertices)
g <- g %>% set_vertex_attr("color", index=path_vertices, value="gray")
# Mark all edges of a shortest path
mark_edges <- function(path, edges=c()){
# Get a vector of id:s of connected vertices, find edge-id:s of all edges between them.
for(n in 1:(length(path)-1)){
i <- path[n]
j <- path[1+n]
edge <- get.edge.ids(g, c(i,j), directed = TRUE, error=FALSE, multi=FALSE)
edges <- c(edges, edge)
}
# Return all edges in this path
(edges)
}
# Find all edges that are part of the shortest paths between key vertices
key_edges <- lapply(paths, function(x) if(length(x) > 1){mark_edges(x)})
key_edges <- unique(unlist(key_edges))
g <- g %>% set_edge_attr("color", index=key_edges, value="green")
# This now shows the full graph and the sub-graph which will be created
plot(g)
# Create sub-graph:
sg_vertices <- sort(union(key_vertices, path_vertices))
unclean_sg <- induced_subgraph(g, sg_vertices)
# Note that it is essential to provide both a vertex AND an edge index for the
# subgraph since edges between included vertices do not have to be part of the
# calculated shortest path. I never used it before, but eids=key_edges given
# to induced_subgraph() should work (even though it didn't for me just now).
# See the problem here:
plot(unclean_sg)
# Kill edges of the sub-graph that were not part of shortest paths of the mother
# graph:
sg <- delete.edges(unclean_sg, which(E(unclean_sg)$color=="gray"))
# Plot a comparison:
l <-layout.auto(g)
layout(matrix(c(1,1,2,3), 2, 2, byrow = TRUE))
plot(g, layout=l)
plot(unclean_sg, layout=l[sg_vertices,]) # cut l to keep same layout in subgraph
plot(sg, layout=l[sg_vertices,]) # cut l to keep same layout in subgraph
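A shorter route to the same kind of subgraph, sketched here on a toy ring, is to ask shortest_paths() for edge paths and hand the collected edge ids to subgraph.edges(), which keeps only those edges and their endpoints. This is a sketch under assumptions: it takes one shortest path per pair (ties are not explored), the names key, pairs and eids are made up for the example, and newer igraph releases may offer subgraph.edges() under a different name.

```r
library(igraph)

g   <- make_ring(10)          # toy graph
key <- c(1, 3, 5)             # the key vertices
pairs <- combn(key, 2)        # every unordered pair of key vertices, one per column

# Edge ids on one shortest path for each pair of key vertices
eids <- unique(unlist(lapply(seq_len(ncol(pairs)), function(k) {
  sp <- shortest_paths(g, from = pairs[1, k], to = pairs[2, k], output = "epath")
  as.integer(sp$epath[[1]])
})))

# Keep only those edges and the vertices they touch
sg <- subgraph.edges(g, eids)
vcount(sg)
ecount(sg)
```

Because the subgraph is built from edges rather than vertices, no surplus edges need to be deleted afterwards.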

Solving Chinese Postman algorithm with eulerization

I would like to solve the Chinese Postman problem in a graph where an Eulerian cycle does not exist. So basically I'm looking for a route in a graph which visits every edge at least once, and starts and ends at the same node. A connected undirected graph has an Euler cycle if and only if every node has even degree (for a directed graph: every node has the same number of edges entering it and going out of it). Obviously my graph doesn't.
I found out that Eulerization (making a graph Eulerian) could solve my problem. Can anyone suggest a script to add duplicate edges to a graph so that the resulting graph has no vertices of odd degree (and thus does have an Euler circuit)?
Here is my example:
require(igraph)
require(graph)
require(eulerian)
require(GA)
g1 <- graph(c(1,2, 1,3, 2,4, 2,5, 1,5, 3,5, 4,7, 5,7, 5,8, 3,6, 6,8, 6,9, 9,11, 8,11, 8,10, 8,12, 7,10, 10,12, 11,12), directed = FALSE)
mat <- get.adjacency(g1)
mat <- as.matrix(mat)
rownames(mat) <- LETTERS[1:12]
colnames(mat) <- LETTERS[1:12]
g2 <- as(graphAM(adjMat=mat), "graphNEL")
hasEulerianCycle(g2)
Fun problem.
The graph you suggest in the code above can be given duplicate edges that make an Eulerian cycle possible. The function I provide below tries to add the minimum number of duplicate edges, but also readily breaks the graph structure by adding new links if it has to.
You can run:
eulerian.g1 <- make.eulerian(g1)$graph
Check what the function did to your graph with:
make.eulerian(g1)$info
Bear in mind that:
This is not the only graph structure where duplicates added to the original g1 graph can form an Eulerian cycle. Imagine for example my function looping over the vertices of the graph backwards instead.
Your graph already has an even number of vertices with uneven degree, and each of those vertices has neighbours with uneven degree to pair it with. This function therefore works well for your particular example data.
The function could fail to produce a graph using only duplicates even in graphs where Eulerian cycles are possible with correctly added duplicates. This is because it always connects a node with the first of its neighbours with uneven degree. If this is something that you'd absolutely like to get around, an MCMC approach would be the way to go.
See also this excellent answer on probability calculation:
Here's my function in a full script that you can source out-of-the-box:
library(igraph)
# You asked about this graph
g1 <- graph(c(1,2, 1,3, 2,4, 2,5, 1,5, 3,5, 4,7, 5,7, 5,8, 3,6, 6,8, 6,9, 9,11, 8,11, 8,10, 8,12, 7,10, 10,12, 11,12), directed = FALSE)
# Make a random graph with at most n nodes and no isolated vertices
# (note: dropping isolates does not by itself guarantee a connected graph)
connected.erdos.renyi.game <- function(n,m){
  graph <- erdos.renyi.game(n,m,"gnm",directed=FALSE)
  delete_vertices(graph, which(degree(graph) == 0))
}
# This is a random graph
g2 <- connected.erdos.renyi.game(n=12, m=16)
make.eulerian <- function(graph){
# Carl Hierholzer (1873) explained how Eulerian cycles exist for graphs that are
# 1) connected, and 2) contain only vertices with even degrees. Based on this proof
# the possibility of an Eulerian cycle existing in a graph can be tested by testing
# these two conditions.
#
# This function assumes a connected graph.
# It adds edges to a graph to ensure that all nodes eventually have an even degree. It
# tries to maintain the structure of the graph by primarily adding duplicates of already
# existing edges, but can also add "structurally new" edges if the structure of the
# graph does not allow that.
# save output (names must match the ones assigned to further down)
info <- c("Broken" = FALSE, "Added" = 0, "Successfull" = TRUE)
# Is a number even
is.even <- function(x){ x %% 2 == 0 }
# Graphs with an even number of vertices with uneven degree will more easily converge
# as Eulerian.
# Should we even out the number of unevenly degreed vertices?
# (The degree sum of a graph is always even, so in practice this flag is FALSE.)
search.for.even.neighbor <- !is.even(sum(!is.even(degree(graph))))
# Loop to add edges but never to change nodes that have been set to have even degree
for(i in V(graph)){
set.j <- NULL
#neighbors of i with uneven number of edges are good candidates for new edges
uneven.neighbors <- !is.even(degree(graph, neighbors(graph,i)))
if(!is.even(degree(graph,i))){
# This node needs a new connection. That edge e(i,j) needs an appropriate j:
if(sum(uneven.neighbors) == 0){
# There is no neighbor of i that has uneven degree. We will
# have to break the graph structure and connect nodes that
# were not connected before:
if(sum(!is.even(degree(graph))) > 0){
# Only break the structure if it's absolutely necessary
# to force the graph into a structure where an Eulerian
# cycle exists:
info["Broken"] <- TRUE
# Find candidates for j amongst any unevenly degreed nodes
uneven.candidates <- !is.even(degree(graph, V(graph)))
# Suggest a new edge between i and any node with uneven degree
if(sum(uneven.candidates) != 0){
set.j <- V(graph)[uneven.candidates][[1]]
}else{
# No candidate with uneven degree exists!
# If all vertices except the last have even degrees, this
# function will fail to make the graph Eulerian:
info["Successfull"] <- FALSE
}
}
}else{
# A "structurally duplicated" edge may be formed between i and one of
# the nodes of uneven degree that is already connected to it.
# Suggest a new edge between i and its first neighbor with uneven degree
set.j <- neighbors(graph, i)[uneven.neighbors][[1]]
}
}else if(search.for.even.neighbor == TRUE && is.null(set.j)){
# This only happens once (probably) in the beginning of the loop when
# treating graphs that have an uneven number of vertices with uneven
# degree. It creates a duplicate between a node and one of its evenly
# degreed neighbors (if possible)
info["Added"] <- info["Added"] + 1
set.j <- neighbors(graph, i)[ !uneven.neighbors ][[1]]
# Never do this again if a j is correctly set
if(!is.null(set.j)){search.for.even.neighbor <- FALSE}
}
# Add the new edge to alter degrees in the desired direction
# (skipped when no suitable j was found and set.j is still NULL)
if(!is.null(set.j)){
# i may not link to j
if(i != set.j){
graph <- add_edges(graph, edges=c(i, set.j))
info["Added"] <- info["Added"] + 1
}
}
}
# return the graph
(list("graph" = graph, "info" = info))
}
# Look at what we did
eulerian <- make.eulerian(g1)
eulerian$info
g <- eulerian$graph
par(mfrow=c(1,2))
plot(g1)
plot(g)
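A variant that never adds structurally new edges can be sketched by pairing odd-degree vertices and duplicating the edges along a shortest path between them: the two endpoints gain one edge each and every interior vertex gains two, so only the endpoints change parity. The helper eulerize_greedy below is hypothetical (it is not part of igraph), assumes a connected undirected graph, and pairs vertices greedily, so it does not guarantee the minimum number of duplicates that a proper minimum-weight matching would give.

```r
library(igraph)

g1 <- make_graph(c(1,2, 1,3, 2,4, 2,5, 1,5, 3,5, 4,7, 5,7, 5,8, 3,6, 6,8,
                   6,9, 9,11, 8,11, 8,10, 8,12, 7,10, 10,12, 11,12),
                 directed = FALSE)

# Hypothetical helper: duplicate edges along shortest paths between
# greedily paired odd-degree vertices until every degree is even.
eulerize_greedy <- function(g) {
  stopifnot(is_connected(g))
  repeat {
    odd <- which(degree(g) %% 2 == 1)
    if (length(odd) == 0) break
    # Pair the first odd vertex with its nearest other odd vertex
    d      <- distances(g, v = odd[1], to = odd[-1])
    target <- odd[-1][which.min(d)]
    vp     <- as.integer(shortest_paths(g, odd[1], target)$vpath[[1]])
    # Duplicate every edge on that path: c(v1,v2, v2,v3, ...)
    g <- add_edges(g, as.vector(rbind(head(vp, -1), tail(vp, -1))))
  }
  g
}

sum(degree(g1) %% 2 == 1)   # number of odd-degree vertices before
e1 <- eulerize_greedy(g1)
all(degree(e1) %% 2 == 0)   # every degree is even afterwards
```

Each pass removes two odd-degree vertices, so the loop stops after at most half as many passes as there were odd vertices to begin with.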

choose neighborhood that have specific edge weights in igraph R

I have a weighted, undirected network with weights 1 and 2. I need to choose the neighbors of all the vertices that are up to 5 step away. However the 5 steps should include 2 edges with weight=2. For example, if all the 5 edges have weight 1, these neighbors should be excluded.
Question: How do I choose neighbors that are connected with specific edge weights?
Code:
matrix= matrix(as.integer(runif(100,0,3)), 10, 10)
matrix
ntwrk=graph.adjacency(matrix,weighted=TRUE, mode="undirected")
neighborhood(ntwrk,5)
Now, I need to figure out which of those include edges with weight=2. Then, I need to keep only those neighbors, and measure the neighborhood size with neighborhood.size().
The result of neighborhood() is a list with one vertex sequence per vertex, so the weights have to be looked up on the edge sequence E(ntwrk). You can filter the edges on the weight attribute:
keep <- E(ntwrk)[weight == 2]
Here is a complete example, where I plot the graph before and after the weight filter.
library(igraph)
set.seed(1)
mat <- matrix(as.integer(runif(10*10,0,3)), 10, 10)
ntwrk <- graph.adjacency(mat, weighted=TRUE, mode="undirected")
op <- par(mfrow=c(1,2))
E(ntwrk)$label <- E(ntwrk)$weight
plot(ntwrk)
# Drop every edge whose weight is not 2, then take the 5-step neighbourhoods
ntwrk <- delete.edges(ntwrk, E(ntwrk)[weight != 2])
res <- neighborhood(ntwrk, 5)
plot(ntwrk)
# Restore the previous graphics settings
par(op)
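The requirement in the question can be read in more than one way. Under one possible reading (keep a neighbour within 5 hops only if the hop-count shortest path to it crosses at least two edges of weight 2) a sketch could look like the following. The helper heavy_neighbors and its parameters are hypothetical, and it inspects just one shortest path per neighbour, so a neighbour reachable by another qualifying route of the same length may be missed.

```r
library(igraph)

# Toy path graph 1-2-3-4 with edge weights 2, 1, 2
g <- make_graph(c(1,2, 2,3, 3,4), directed = FALSE)
E(g)$weight <- c(2, 1, 2)

# Hypothetical helper: neighbours of v within `order` hops whose hop-count
# shortest path from v uses at least `min_heavy` edges of weight 2
heavy_neighbors <- function(g, v, order = 5, min_heavy = 2) {
  nb <- setdiff(as.integer(ego(g, order = order, nodes = v)[[1]]), v)
  if (length(nb) == 0) return(integer(0))
  # weights = NA: count hops and ignore the weight attribute
  ep <- shortest_paths(g, from = v, to = nb, weights = NA,
                       output = "epath")$epath
  keep <- vapply(ep, function(e) sum(e$weight == 2) >= min_heavy, logical(1))
  nb[keep]
}

heavy_neighbors(g, 1)
length(heavy_neighbors(g, 2))
```

The length of the returned vector plays the role of neighborhood.size() for the filtered neighbourhood.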

2nd Degree Connections in igraph

I think I have this working correctly, but I am looking to mimic something similar to Facebook's friend suggestions. Simply put, I am looking to find 2nd degree connections (friends of your friends that you do not have a connection with). I do want to keep this as a directed graph and identify the 2nd degree outward connections (the people your friends connect to).
I believe my dummy code is achieving this, but since the reference is on indices and not vertex labels, I was hoping you could help me modify the code to return useable names.
### create some fake data
library(igraph)
from <- sample(LETTERS, 50, replace=T)
to <- sample(LETTERS, 50, replace=T)
rel <- data.frame(from, to)
head(rel)
### lets plot the data
g <- graph.data.frame(rel)
summary(g)
plot(g, vertex.label=LETTERS, edge.arrow.size=.1)
## find the 2nd degree connections
d1 <- unlist(neighborhood(g, 1, nodes="F", mode="out"))
d2 <- unlist(neighborhood(g, 2, nodes="F", mode="out"))
d1;d2;
setdiff(d2,d1)
Returns
> setdiff(d2,d1)
[1] 13
Any help you can provide will be great. Obviously I am looking to stay within R.
You can index back into the graph vertices like:
> V(g)[setdiff(d2,d1)]
Vertex sequence:
[1] "B" "W" "G"
Also check out ?V for ways to get at this type of info through direct indexing.
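As a small, self-contained sketch of the same idea on fixed data, the vertex sequences can be converted to names with as_ids() before taking the set difference. This assumes a current igraph in which ego() is available as the newer name for neighborhood(); the tiny edge list is made up so that the result is reproducible.

```r
library(igraph)

# Made-up data: A follows B and C, B follows C, C follows D
rel <- data.frame(from = c("A", "B", "A", "C"),
                  to   = c("B", "C", "C", "D"))
g <- graph_from_data_frame(rel)

# Names of everything reachable within 1 and within 2 outgoing steps of "A"
d1 <- as_ids(ego(g, order = 1, nodes = "A", mode = "out")[[1]])
d2 <- as_ids(ego(g, order = 2, nodes = "A", mode = "out")[[1]])

second_degree <- setdiff(d2, d1)
second_degree
```

Working with names rather than indices also keeps the result valid if the graph is later rebuilt with a different vertex order.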
You can use the adjacency matrix G of the graph g. One of the properties of the adjacency matrix is that its nth power gives you the number of n-walks (walks of length n, which, unlike paths, may revisit vertices).
G <- get.adjacency(g)
G2 <- G %*% G # G2 contains 2-walks
diag(G2) <- 0 # take out loops
G2[G2!=0] <- 1 # normalize G2, not interested in multiplicity of walks
g2 <- graph.adjacency(G2)
An edge in graph g2 represents a "friend-of-a-friend" bond.
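The matrix approach can be sketched on a small dense matrix, with one extra step that the snippet above leaves out: removing people who are already direct connections. This assumes as_adjacency_matrix() with sparse = FALSE from a current igraph; the edge list and the names A, A2 and fof are made up for illustration.

```r
library(igraph)

# Made-up data: A follows B and C, B follows C, C follows D
rel <- data.frame(from = c("A", "B", "A", "C"),
                  to   = c("B", "C", "C", "D"))
g <- graph_from_data_frame(rel)

A  <- as_adjacency_matrix(g, sparse = FALSE)  # dense 0/1 matrix with vertex names
A2 <- (A %*% A) > 0                           # TRUE where a 2-walk exists
diag(A2) <- FALSE                             # ignore walks back to oneself
fof <- A2 & (A == 0)                          # drop existing direct connections

names(which(fof["A", ]))
```

A dense matrix is fine for a toy graph like this one; for large graphs a sparse representation is likely the better choice.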
