Using igraph subgraph_isomorphisms to find given network motifs - r

I'm looking for motifs of size 5 in graphs with fewer than 5000 nodes and fewer than 10000 edges (everything uncolored).
To do this I use the subgraph_isomorphisms function from the igraph library for R with method vf2 (see the example below). I build the subgraph from an adjacency matrix and the graph itself from an edge list.
Many of the isomorphic subgraphs that I find have extra edges. Is there any way to find only subgraphs with exactly the given structure? I'm looking for answers using igraph or any other R library.
See the reproducible example below (it is much easier to follow if you draw the graph given by this adjacency matrix on a piece of paper).
library(igraph)
subgraph <- matrix(
data = c(0, 1,
1, 0), ncol = 2)
graph <- matrix(
data = c(0, 1, 0, 0,
1, 1, 0, 1,
1, 0, 0, 1,
0, 0, 1, 0), ncol = 4)
subgraph <- graph_from_adjacency_matrix(subgraph, mode = "directed", weighted = TRUE, diag = TRUE)
graph <- graph_from_adjacency_matrix(graph, mode = "directed", weighted = TRUE, diag = TRUE)
subgraph_isomorphisms(subgraph, graph, method = "vf2")
The output gives two matches, (1,2) and (3,4), when in fact the adjacency matrix of (1,2) looks like
(0 1)
(1 1)
which is different from the one we were looking for.

The answer to this question lies in the definitions of what I was looking for and what I was finding.
I was looking for network motifs of size 5. From the graph theory perspective, that means looking for induced subgraphs with a given adjacency matrix.
The function instead finds subgraphs of a graph that are isomorphic to a given graph. The difference is that I wanted induced subgraphs, whereas the function returns plain subgraphs, so extra edges are allowed.
That is exactly the problem I was experiencing. To deal with it, I ended up comparing the adjacency matrix of each subgraph I found with that of the motif. I hope this is helpful to someone.
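The adjacency-matrix comparison described above can be sketched in base R as follows. This is only an illustration, not the code the answer actually used: is_induced_match is a hypothetical helper name, and the candidate vertex pairs are typed in by hand from the output reported in the question rather than taken from subgraph_isomorphisms.

```r
# Adjacency matrices from the question (matrix() fills column by column).
motif <- matrix(c(0, 1,
                  1, 0), ncol = 2)
target <- matrix(c(0, 1, 0, 0,
                   1, 1, 0, 1,
                   1, 0, 0, 1,
                   0, 0, 1, 0), ncol = 4)

# Hypothetical helper: TRUE only if the target vertices, taken in the given
# order, have exactly the motif's edges among them (no extra edges, loops included).
is_induced_match <- function(target_adj, motif_adj, vertices) {
  sub_adj <- target_adj[vertices, vertices, drop = FALSE]
  all((sub_adj != 0) == (motif_adj != 0))
}

# Candidate matches as reported in the question, entered by hand.
candidates <- list(c(1, 2), c(3, 4))
keep <- vapply(candidates,
               function(v) is_induced_match(target, motif, v),
               logical(1))
keep
```

Here the pair (1,2) is rejected because of the extra loop on vertex 2, while (3,4) is kept. The vertex order within each candidate matters, since it is assumed to follow the order of the motif's vertices.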

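Adding to the previous comment, I also noticed that the function returns "True" when I try to find an isomorphic triad of type 210 (2 mutual edges and 1 asymmetric) within a complete graph of four vertices. The solution is to add induced = TRUE. Note that in igraph's R interface, induced is documented as an argument of the "lad" method rather than "vf2", so the method has to change as well:
subgraph_isomorphisms(subgraph, graph, method = "lad", induced = TRUE)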

Related

Calculating the percentage of overlap between polytopes (n-dimensions)

I have to figure out the percentage of overlap between polytopes in n-dimensional spaces, where my only available source of reference is a set of randomly sampled points within those polytopes.
Assume that the following two R objects are two sets of randomly sampled points from two different polytopes in 5 dimensions:
one <- matrix(runif(5000, min = 0, max = 5), ncol = 5)
two <- matrix(runif(5000, min = 0, max = 4), ncol = 5)
In this example, I selected a smaller range for the second object, so we know that there should be less than 10% overlap. Let me know if I am wrong.
EDIT:
Just to make it really clear, the question is what is the percentage of overlap between those two objects?
I need a method that generalizes to n-dimensional spaces.
This stackoverflow question is somewhat similar to what I am trying to do, but I didn't manage to get it to work.
So, the most straightforward way is to use the hypervolume package.
library(hypervolume)
one <- hypervolume(matrix(runif(5000, min = 0, max = 5), ncol = 5))
two <- hypervolume(matrix(runif(5000, min = 0, max = 4), ncol = 5))
three = hypervolume_set(one, two, check.memory=FALSE)
get_volume(three)
This will get you the volume.
hypervolume_overlap_statistics(three)
This function will output four different metrics, one of which is the Jaccard Similarity Index.
The Jaccard Similarity is the proportion of overlap between the two sample sets (the intersection divided by the union).
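For the two hypercubes in the question, the Jaccard index can also be worked out by hand, which gives a reference value to compare the package output against. The sketch below assumes the true shapes are [0,5]^5 and [0,4]^5 (that is what the runif calls sample from) and adds a plain Monte Carlo check. It does not use hypervolume, and the variable names are illustrative.

```r
d <- 5

# Exact volumes: [0,4]^5 lies entirely inside [0,5]^5.
vol_one <- 5^d
vol_two <- 4^d
vol_intersection <- 4^d
vol_union <- vol_one + vol_two - vol_intersection
jaccard_exact <- vol_intersection / vol_union

# Monte Carlo check: sample uniformly over the union ([0,5]^5) and count
# the share of points that also fall inside the smaller cube.
set.seed(1)
pts <- matrix(runif(1e5 * d, min = 0, max = 5), ncol = d)
in_two <- rowSums(pts <= 4) == d
jaccard_mc <- mean(in_two)

c(exact = jaccard_exact, monte_carlo = jaccard_mc)
```

Under these assumptions the exact value is (4/5)^5 = 0.32768, so by the Jaccard definition the overlap is roughly a third. An estimate built from 1000 sampled points per shape can deviate noticeably from this.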
Alternatives
Chris suggested volesti as an alternative. Another alternative would be the geometry package.
They do not calculate the proportion directly. You need to find the intersection (e.g. intersectn in geometry, VpolytopeIntersection in volesti), calculate the volumes of the two polytopes and of their intersection, and then divide the volume of the intersection by the volume of the union (the sum of the two volumes minus the volume of the intersection).
Here, they are also using a different method to calculate the volume and it might be more appropriate for you if you are trying to construct convex hulls in an n-dimensional space. For me, hypervolume is a better solution, because I am doing something more akin to Hutchinson’s n-dimensional hypervolume concept from ecology and evolutionary biology.

R igraph: Barabasi-Model start.graph too many vertices

I have a large network that I want to use as the "start.graph" for my Barabasi-Albert model, but unfortunately I get this error:
sample_pa(100, power = 1, m = 2, start.graph = large_network)
Error in sample_pa(100, power = 1, m = 2, start.graph = igraph_worm_traffic_colored[[1]]) :
At games.c:519 : Starting graph has too many vertices, Invalid value
Is there any way to change the maximal number of vertices?
The error occurs because the output graph must have more vertices than the starting graph for the BA model to work. If you want to produce a 100-vertex graph, you can use a subgraph of your large network as the start graph:
g2 <- induced_subgraph(large_network, sample(V(large_network), 20))
Or you can increase the number of vertices in your output graph.
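The second option could look roughly like the sketch below. The ring graph is only a stand-in for the large network (an assumption so that the example runs on its own), and n_new is a made-up name for the number of vertices to add. directed = FALSE is passed so that the output matches the undirected stand-in graph.

```r
library(igraph)

# Hypothetical stand-in for the large network from the question.
large_network <- make_ring(50)

# Number of vertices to add on top of the start graph (assumed value).
n_new <- 100

# The first argument is the size of the final graph, so it has to exceed
# the number of vertices already present in the start graph.
g <- sample_pa(vcount(large_network) + n_new, power = 1, m = 2,
               directed = FALSE, start.graph = large_network)
vcount(g)
```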

Generating small world network with fixed degree

I would like to generate a small world network with a fixed degree of 10.
I have tried watts.strogatz.game(1, 100, 5, 0) which results in a degree of 10 but only 5 neighbours for each node.
I'm guessing this is because the network is undirected. Is there any way to make it undirected?
The igraph package contains many functions to create and manipulate graphs.
In particular, the get.edgelist function returns the list of edges, in the format you want.
The erdos.renyi.game function, when you set the probability to 1, generates complete graphs.
library(igraph)
g1 <- erdos.renyi.game(5, 1)
plot(g1)
get.edgelist(g1)
The degree.sequence.game function generates random graphs with a prescribed degree distribution.
g2 <- degree.sequence.game( c(3,3,3,2,1,1,1), method="vl" )
plot(g2)
The watts.strogatz.game function generates small-world networks.
g <- watts.strogatz.game(1, 100, 5, 0.05)
plot(g)
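To check the degrees directly, and as one possible way to randomise the lattice while keeping every degree at 10, a sketch along these lines may help. It uses sample_smallworld (the newer name for watts.strogatz.game) and degree-preserving rewiring with keeping_degseq. Rewiring this way is a swapped-in technique, not the Watts-Strogatz procedure itself, the niter value is an arbitrary assumption, and whether the result still has small-world properties is not verified here.

```r
library(igraph)

# Ring lattice with 5 neighbours on each side and no rewiring (p = 0).
g <- sample_smallworld(dim = 1, size = 100, nei = 5, p = 0)
table(degree(g))

# Degree-preserving rewiring: edges are swapped in pairs, so every vertex
# keeps its degree. niter = 1000 is an arbitrary choice.
g_rewired <- rewire(g, with = keeping_degseq(loops = FALSE, niter = 1000))
all(degree(g_rewired) == 10)
```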

Draw Network in R (control edge thickness plus non-overlapping edges)

I need to draw a network with 5 nodes and 20 directed edges (an edge connecting each 2 nodes) using R, but I need two features to exist:
To be able to control the thickness of each edge.
The edges not to be overlapping (i.e., the edge from A to B is not drawn over the edge from B to A)
I've spent hours looking for a solution, and tried many packages, but there's always a problem.
Can anybody suggest a solution please and provide a complete example as possible?
Many Thanks in advance.
If it is ok for the lines to be curved then I know two ways. First I create an edgelist:
Edges <- data.frame(
from = rep(1:5,each=5),
to = rep(1:5,times=5),
thickness = abs(rnorm(25)))
Edges <- subset(Edges,from!=to)
This contains the node of origin in the first column, the node of destination in the second and the weight in the third. You can use my package qgraph to plot a weighted graph from this. By default the edges are curved if there are multiple edges between two nodes:
library("qgraph")
qgraph(Edges,esize=5,gray=TRUE)
However this package is not really intended for this purpose and you can't change the edge colors (yet, working on it:) ). You can only make all edges black with a small trick:
qgraph(Edges,esize=5,gray=TRUE,minimum=0,cut=.Machine$double.xmin)
For more control you can use the igraph package. First we make the graph:
library("igraph")
g <- graph.edgelist(as.matrix(Edges[,-3]))
Note the conversion to matrix. (In igraph versions before 0.6 you also had to subtract one from the vertex ids, because the first node was 0; that is no longer needed.) Next we define the layout:
l <- layout.fruchterman.reingold(g)
Now we can change some of the edge parameters with the E() function:
# Define edge widths:
E(g)$width <- Edges$thickness * 5
# Define arrow widths:
E(g)$arrow.width <- Edges$thickness * 5
# Make edges curved:
E(g)$curved <- 0.2
And finally plot the graph:
plot(g,layout=l)
While not an R answer specifically, I would recommend using Cytoscape to generate the network.
You can automate it using RCytoscape.
http://bioconductor.org/packages/release/bioc/html/RCytoscape.html
The package informatively named 'network' can draw directed networks fairly well, and handle your issues.
ex.net <- rbind(c(0, 1, 1, 1), c(1, 0, 0, 1), c(0, 0, 0, 1), c(1, 0, 1, 0))
plot(network(ex.net), usecurve = T, edge.curve = 0.00001,
edge.lwd = c(4, rep(1, 7)))
The edge.curve argument, if set very low and combined with usecurve=T, separates the edges, although there might be a more direct way of doing this, and edge.lwd can take a vector as its argument for different sizes.
It's not always the prettiest result, I admit. But it's fairly easy to get decent looking network plots that can be customized in a number of different ways (see ?network.plot).
The 'non overlapping' constraint on edges is the big problem here. First, your network has to be 'planar', otherwise it's impossible in two dimensions (you can't connect three houses to the gas, electric and phone company buildings without crossovers).
I think an algorithm for planar graph layout essentially solves the 4-colour problem. Have fun with that. Heuristics exist, search for planar graph layout, and force-directed, and read Planar Graph Layouts

Network Modularity Calculations in R

The equation for Network Modularity is given on its wikipedia page (and in reputable books). I want to see it working in some code. I have found this is possible using the modularity library for igraph used with R (The R Foundation for Statistical Computing).
I want to see the example below (or a similar one) used in the code to calculate the modularity. The library gives one example, but it isn't really what I want.
Let us have a set of vertices V = {1, 2, 3, 4, 5} and edges E = {(1,5), (2,3), (2,4), (2,5), (3,5)} that form an undirected graph.
Divide these vertices into two communities: c1 = {2,3} and c2 = {1,4,5}. It is the modularity of these two communities that is to be computed.
library(igraph)
g <- graph(c(1,5,2,3,2,4,2,5,3,5), directed = FALSE)  # the question asks for an undirected graph
membership <- c(1,2,2,1,1)
modularity(g, membership)
Some explanation here:
The vector I use when creating the graph is the edge list of the graph. (In igraph versions older than 0.6, we had to subtract 1 from the numbers because igraph used zero-based vertex indices at that time, but not any more.)
The i-th element of the membership vector membership gives the index of the community to which vertex i belongs.
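To see the formula itself at work, the same example can be computed by hand in base R. The sketch below uses the standard form Q = sum over communities of (L_c / m - (d_c / (2m))^2) for an undirected graph, where m is the number of edges, L_c the number of edges inside community c and d_c the total degree of its vertices. It is an illustration of the formula, not igraph's implementation.

```r
# Edge list and community labels from the example above.
edges <- matrix(c(1, 5,
                  2, 3,
                  2, 4,
                  2, 5,
                  3, 5), ncol = 2, byrow = TRUE)
membership <- c(1, 2, 2, 1, 1)

m <- nrow(edges)                                         # number of edges
deg <- tabulate(c(edges), nbins = length(membership))    # vertex degrees

Q <- 0
for (community in unique(membership)) {
  members <- which(membership == community)
  # Edges with both endpoints inside the community.
  internal <- sum(edges[, 1] %in% members & edges[, 2] %in% members)
  # Total degree of the community's vertices.
  deg_sum <- sum(deg[members])
  Q <- Q + internal / m - (deg_sum / (2 * m))^2
}
Q
```

Each community contains one internal edge and has total degree 5, so each contributes 1/5 - (5/10)^2 = -0.05 and Q = -0.1. The negative value says this split has fewer within-community edges than expected at random; modularity() on the undirected graph should agree with it.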
