Shortest path connecting multiple nodes in igraph (R)

I have an igraph network in R and would like to find the shortest path connecting multiple nodes in my network (say nodes 1, 3, 4, and 7). Is there a function that can do that? Something like all_simple_paths, but for one global solution?
The solution should look something like the path highlighted in yellow. Note that 1->2->4 is not selected even though it is just as short as 1->3->4.
library(igraph)
tree <- graph.tree(n = 8, children = 2, mode = "out")
tree <- add_edges(tree, c(3,4, 3,5))
plot(tree)

After doing some digging I think I found the answer to my own question. What I was describing is a variation of the minimum spanning tree problem called the Steiner tree problem.
Given a weighted graph G = (V, E), a subset S ⊆ V of the vertices, and a root r ∈ V , we want to find a minimum weight tree which connects all the vertices in S to r. [ref]
It turns out there is an R package called SteinerNet created specifically for these types of problems. I had trouble installing the package directly but was able to copy the relevant source code from its GitHub repo.
out <- steinertree(type = "KB", terminals = c(1,3,4,7), graph = tree)
The package does exactly what I wanted to do, and it even produced a pretty graph!
>out[[2]]
IGRAPH fbb52e5 UN-- 4 3 -- Tree
+ attr: name (g/c), children (g/n), mode (g/c), name (v/c), color (v/c)
+ edges from fbb52e5 (vertex names):
[1] 1--3 3--4 3--7
>plot(out[[1]])
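If SteinerNet will not install, a rough approximation can be sketched with igraph alone. The helper below, steiner_sketch, is a hypothetical name of my own and is not part of igraph or SteinerNet. It is a simple shortest-path heuristic (repeatedly attach the closest remaining terminal to the tree built so far), not the "KB" method used above, and it is not guaranteed to find the optimal tree on every graph. It assumes an unweighted, undirected version of the example graph.

```r
library(igraph)

# Undirected version of the example graph (assumption: edge direction is
# ignored, as in the SteinerNet output above)
g <- make_tree(n = 8, children = 2, mode = "undirected")
g <- add_edges(g, c(3,4, 3,5))
V(g)$name <- as.character(seq_len(vcount(g)))

# Hypothetical helper: greedy shortest-path heuristic for a Steiner tree
steiner_sketch <- function(g, terminals) {
  in_tree   <- terminals[1]   # vertex ids already connected
  edge_ids  <- integer(0)     # edge ids selected so far
  remaining <- terminals[-1]  # terminals still to connect
  while (length(remaining) > 0) {
    # distances from every tree vertex to every remaining terminal
    d <- distances(g, v = in_tree, to = remaining)
    if (is.infinite(min(d))) stop("terminals are not all connected")
    idx <- which(d == min(d), arr.ind = TRUE)[1, ]
    sp <- shortest_paths(g, from = in_tree[idx[1]], to = remaining[idx[2]],
                         output = "both")
    in_tree   <- union(in_tree, as.integer(sp$vpath[[1]]))
    edge_ids  <- union(edge_ids, as.integer(sp$epath[[1]]))
    remaining <- setdiff(remaining, in_tree)
  }
  # keep only the selected edges, then drop vertices left without edges
  h <- delete_edges(g, setdiff(seq_len(ecount(g)), edge_ids))
  delete_vertices(h, which(degree(h) == 0))
}

st <- steiner_sketch(g, terminals = c(1, 3, 4, 7))
print(st)
```

On this small graph the heuristic picks the edges 1--3, 3--4 and 3--7, the same tree as the SteinerNet result. On larger or weighted graphs the two may differ.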

Related

How to select a graph vertex with igraph in R?

In Python you can simply use graph.select() (at least according to the documentation: https://igraph.org/python/doc/api/igraph.VertexSeq.html) to select a vertex based on a value. I have a huge graph connecting movies that share an actor. However, I would simply like to select one vertex from that graph and then plot it with its direct neighbors. I'm running into issues when I want to select just that single vertex.
I had hoped something like this would work:
graph.movies.select('Snatch (2000)')
But no luck.
Another approach I took was grabbing the single Snatch vertex by filtering all others out and then adding the edges.
snatch.graph <- induced.subgraph(g.movies, vids=V(g.movies)$name == 'Snatch (2000)')
snatch.edges <- edges(g.movies, "Snatch (2000)")
add_edges(snatch.graph, snatch.edges$edges)
However this returns an empty graph with only the snatch vertex.
My goal is to grab the Snatch vertex and plot this vertex, its DIRECT neighbors, and the edges among them. Any suggestions? Thanks a lot :D I've been stuck on this for quite a while -.-
You can use ?ego to grab the neighbours or ?make_ego_graph to form the graph. (Depending on whether your graph is directed or not, you may need to use the mode argument).
An example:
library(igraph)
# create some data
set.seed(71085002)
n = 10
g = random.graph.game(n, 0.25)
V(g)$name = seq_len(n)
# grab neighbours of node "1"
# set mindist = 0 to include node itself
# set order for the step size of the neighbourhood
nb_g = make_ego_graph(g, order=1, nodes="1", mindist=0)[[1]]
# plot
# you could use the `layout` argument to
# keep nodes in the same position
op = par(mfrow=c(1,2), oma=rep(0,4))
plot(g, vertex.size=0, vertex.label.cex=3, main="Full graph")
plot(nb_g, vertex.size=0, vertex.label.cex=3, main="Sub graph")
par(op) # reset
If you just need a list of the neighbouring nodes:
ego(g, order=1, nodes="1", mindist=0)
# [[1]]
# + 4/10 vertices, named, from 00cfa70:
# [1] 1 4 6 9
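The comment about the layout argument can be sketched as follows. This is only one way to do it: it assumes character vertex names (set here with as.character, which differs slightly from the example above), uses sample_gnp as a stand-in generator, and matches layout rows by name so that it does not depend on the vertex order inside the subgraph. The names lay and sub_lay are my own.

```r
library(igraph)

set.seed(71085002)
n <- 10
g <- sample_gnp(n, 0.25)
V(g)$name <- as.character(seq_len(n))

# ego graph of node "1", including the node itself
nb_g <- make_ego_graph(g, order = 1, nodes = "1", mindist = 0)[[1]]

# compute the coordinates once on the full graph ...
lay <- layout_with_fr(g)
# ... and reuse the rows that belong to the vertices kept in the subgraph
sub_lay <- lay[match(V(nb_g)$name, V(g)$name), , drop = FALSE]

op <- par(mfrow = c(1, 2))
plot(g, layout = lay, main = "Full graph")
plot(nb_g, layout = sub_lay, main = "Sub graph")
par(op)  # reset
```

Because both plots rescale the coordinates to fill the panel by default, the subgraph keeps its relative arrangement rather than its exact position. Passing rescale = FALSE together with matching xlim and ylim values should pin the positions, if that matters.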
I think the method using ego (by @user20650) is comprehensive and efficient.
Here is another option if you would like to find the sub-graph built on direct neighbors, which applies distances + induced_subgraph
> induced_subgraph(g, which(distances(g, "1") <= 1))
IGRAPH 99f872b UN-- 4 3 -- Erdos renyi (gnp) graph
+ attr: name (g/c), type (g/c), loops (g/l), p (g/n), name (v/n)
+ edges from 99f872b (vertex names):
[1] 1--4 1--6 1--9

Maximum number of nodes which can be reached from each node in a graph using igraph

I want to find the maximum number of nodes which can be reached from each node in a graph using igraph in R.
For example, I have the following graph:
IGRAPH fb9255f DN-- 4 3 --
+ attr: name (v/c), X (e/l)
+ edges from fb9255f (vertex names):
[1] 1->2 2->3 2->4
Resulting graph
For node 1, for example, I would like to obtain the list of all the possible nodes reachable (not only using one hop) from it.
In this case, for node 1 it will be: [2,3,4]
I have read the igraph documentation, but I do not see any function that can help.
Any help will be appreciated.
Thanks
Carlos
You can compute this using the subcomponent function.
Since you do not provide any data, I will illustrate with an arbitrary example.
## Example graph
library(igraph)
set.seed(123)
g = erdos.renyi.game(15, 0.15, directed = TRUE)
plot(g)
subcomponent gives you all of the nodes that are reachable, including the starting node itself. Here, I am assuming that you are using directed graphs and that you mean reachable by going forward along the directed edges. You can change this by altering the mode argument to subcomponent.
sort(subcomponent(g, 2, mode="out"))
+ 7/15 vertices:
[1] 2 5 10 12 13 14 15
If you just want the number of nodes that can be reached, just take the length:
length(subcomponent(g, 2, mode="out"))
[1] 7
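Applied to the small graph from the question (rebuilt here by hand from the printed edge list, without the vertex names or the edge attribute X), the same idea can be sketched for every node at once. Note that subcomponent includes the starting node, so it is removed below to match the expected answer [2,3,4] for node 1. The names reach and counts are my own.

```r
library(igraph)

# Graph from the question: 1->2, 2->3, 2->4
g <- make_graph(c(1,2, 2,3, 2,4), directed = TRUE)

# For each vertex, the vertices reachable by following edges forward,
# excluding the vertex itself
reach <- lapply(seq_len(vcount(g)), function(v) {
  sort(setdiff(as_ids(subcomponent(g, v, mode = "out")), v))
})

reach[[1]]               # nodes reachable from node 1: 2 3 4
counts <- lengths(reach) # number of reachable nodes per vertex: 3 2 0 0
counts
```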

Weird number appears when printing object

When creating an igraph object I get these strange numbers appearing in between IGRAPH and the codes that describe the network properties (e367cdc, in this object). I've also used the example graphs and they have similar strange codes. The graph appears to operate OK so not a problem - I'm just curious, that's all.
library(igraph)
g1 <- graph( edges=c(1,2, 2,3, 3, 1), n=3, directed=F )
g1
IGRAPH e367cdc U--- 3 3 --
+ edges from e367cdc:
[1] 1--2 2--3 1--3
I'm using igraph version 1.2.1.
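The header reads as follows:
IGRAPH 69d704e = the first seven characters of the graph's unique ID, which igraph assigns automatically (see ?graph_id).
UN-- = Undirected, Named
15554 = Number of nodes
109746 = Number of edges
attr: name (v/c), disease (v/n), hub (v/n), ptype (v/c), comp (v/n) = the attributes stored on the graph; v marks a vertex attribute, and c or n gives its type (character or numeric).
+ edges from 69d704e (vertex names) = the start of the edge list, printed using vertex names.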
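To see where the code comes from, you can ask igraph for the ID directly. This is a small sketch using the graph from the question (built with make_graph here); the variable names full_id and short_id are my own, and the ID you get will differ from e367cdc because it is generated per graph.

```r
library(igraph)

g1 <- make_graph(edges = c(1,2, 2,3, 3,1), n = 3, directed = FALSE)

# Full unique ID of the graph object
full_id <- graph_id(g1)

# The printed header shows only its first seven characters
short_id <- substr(full_id, 1, 7)
print(short_id)
print(g1)
```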

igraph: identifying nodes linking two subgraphs in igraph (R)

I have a list of networks of family names. Each network is composed of two different subgraphs (for instance, net_277_278 is composed of subgraphs sg_277 and sg_278). (I induced the subgraphs using a complicated mix of the gsub and induced_subgraph functions).
All the edges in the same subgraph have the form Johnson_278--Smith_278, indicating a link between two surnames belonging to the same subgraph. Links between the two subgraphs are given by the same surname (but have a different subscript). A link between the two subgraphs looks like this: Johnson_277--Johnson_278.
For each subgraph, I want to compute some network measures (e.g., centrality), but only for the nodes that are directly connected to the other subgraph. (Note that I do not want to induce new subgraphs, since this would alter the measures.) For instance, for sg_277 I would like to compute some measures only for Johnson_277, if that's the only node with a link to sg_278.
As a small example, this is one of the networks:
net[10] #277_278
$11_277_278
IGRAPH UN-- 9 6 --
+ attr: name (v/c)
+ edges (vertex names):
[1] SANCHEZ_110277--SANCHEZ_110278 SANCHEZ_110277--PANTOJA_110277 SANCHEZ_110278--GALVAN_110278
[4] PEREZ_110278 --VEGA_110278 PEREZ_110278 --OLVERA_110278 PATIÑO_110278 --SERRANO_110278
Graphically, it looks like this
And this is one of the subgraphs:
sgraph1[[10]] #277
IGRAPH UN-- 2 1 --
+ attr: name (v/c)
+ edge (vertex names):
[1] SANCHEZ_110277--PANTOJA_110277
In this case, I would like to run the function for eigenvector centrality (evcent(sgraph1[[10]])$vector) only for the nodes linking the two subgraphs (in this case, just SANCHEZ_110277, which, as can be seen in the full network, has a link to SANCHEZ_110278).
Is there a way to do this? I have been trying stuff using regular expressions, neighbors and ego() functions but none of these seem to help.

Network Modularity Calculations in R

The equation for network modularity is given on its Wikipedia page (and in reputable books). I want to see it working in some code. I have found this is possible using the modularity function of the igraph library for R.
I want to see the example below (or a similar one) used in the code to calculate the modularity. The library gives one example, but it isn't really what I want.
Let us have a set of vertices V = {1, 2, 3, 4, 5} and edges E = {(1,5), (2,3), (2,4), (2,5), (3,5)} that form an undirected graph.
Divide these vertices into two communities: c1 = {2,3} and c2 = {1,4,5}. It is the modularity of these two communities that is to be computed.
library(igraph)
g <- graph(c(1,5,2,3,2,4,2,5,3,5), directed = FALSE)
membership <- c(1,2,2,1,1)
modularity(g, membership)
Some explanation here:
The vector I use when creating the graph is the edge list of the graph. (In igraph versions older than 0.6, we had to subtract 1 from the numbers because igraph used zero-based vertex indices at that time, but not any more.)
The i-th element of the membership vector membership gives the index of the community to which vertex i belongs.
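Since the question asks to see the equation itself at work, here is a minimal sketch that evaluates Q = 1/(2m) * sum_ij (A_ij - k_i*k_j/(2m)) * delta(c_i, c_j) by hand and compares it with igraph's result. It assumes the graph is undirected and unweighted. The community labels below follow the question (c1 = {2,3}, c2 = {1,4,5}); labels are arbitrary, so this is the same partition as the membership vector above. The names A, k, m, same, Q_manual and Q_igraph are my own.

```r
library(igraph)

# Undirected example graph from the question
g <- make_graph(c(1,5, 2,3, 2,4, 2,5, 3,5), directed = FALSE)

# Community of each vertex: c1 = {2,3}, c2 = {1,4,5}
membership <- c(2, 1, 1, 2, 2)

A <- as_adjacency_matrix(g, sparse = FALSE)  # adjacency matrix A_ij
k <- degree(g)                               # vertex degrees k_i
m <- ecount(g)                               # number of edges

# delta(c_i, c_j): TRUE where two vertices share a community
same <- outer(membership, membership, "==")

# Modularity from the formula
Q_manual <- sum((A - outer(k, k) / (2 * m)) * same) / (2 * m)

# Modularity as computed by igraph
Q_igraph <- modularity(g, membership)

c(manual = Q_manual, igraph = Q_igraph)  # both are -0.1 for this partition
```

The negative value says this particular split has fewer within-community edges than would be expected by chance given the degrees.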
