Generating small world network with fixed degree - r

I would like to generate a small world network with a fixed degree of 10.
I have tried watts.strogatz.game(1, 100, 5, 0) which results in a degree of 10 but only 5 neighbours for each node.
I'm guessing this is because the network is undirected. is there any way to make it undirected?

The igraph package contains many functions to create and manipulate graphs.
In particular, the get.edgelist function returns the list of edges, in the format you want.
The erdos.renyi.game function, when you set the probability to 1, generates complete graphs.
library(igraph)
g1 <- erdos.renyi.game(5, 1)
plot(g1)
get.edgelist(g1)
The degree.sequence.game function generates random graphs with a prescribed degree distribution.
g2 <- degree.sequence.game( c(3,3,3,2,1,1,1), method="vl" )
plot(g2)
The watts.strogatz.game function generates small-world networks.
g <- watts.strogatz.game(1, 100, 5, 0.05)
plot(g)

Related

Color nodes of a graph created with igraph proportionally to eigenvector centrality of that node

I am using the library igraph in R. I have created an MST graph by using the function mst based on some distance function stored in a dataframe called tree:
gf <- graph_from_data_frame(tree, directed = FALSE)
mstgf <- mst(gf, weights = tree$distance)
I have calculated the eigenvector centrality of each node in the MST as:
ec <- eigen_centrality(mstgf, directed=T, weights=NA)$vector
I have then joined the vector of eigenvector centralities to the data.frame tree:
x <- cbind(names(ec), as.numeric(ec)) %>% as_tibble() %>% mutate(V2 = as.numeric(V2)) %>%
rename(from = V1)
tree <- tree %>% inner_join(.,x, by = "from")
What I want to do is to plot the MST by coloring the nodes in a way to resemble their eigenvector centrality. I am using the following for plotting, yet I don't know how to change the argument vertex.color to obtain something like the image below?
plot.igraph(mstgf,
vertex.color = round(tree$V2,0),
edge.color = "blue",
edge.curved = TRUE,
edge.witdh = 1,
)
Once you've calculated the centralities of your choice, you would like to 1) scale the values to a meaningful categorical range (like 1,2,3,4,5) and 2) associate your centrality categories with colors from a gradient. You don't necessarily have to keep joining and calculating outside igraph.
This is a random network
# Random network
g <- erdos.renyi.game(100,250,'gnm', directed=F)
1) make categories
This forces every eigenvector centrality to assume an integer value between 1 and 10
# Calculate eigen centrality and check the distribution We're attaching the
# result of eigen_centrality() straight onto the vertices as verticy-attributes
V(g)$ec <- eigen_centrality(g, directed=T, weights=NA)$vector
hist(V(g)$ec)
# You could use the scales package, or define this normalisation function:
normalize <- function(x){(x-min(x))/(max(x)-min(x))}
(V(g)$ec_index <- round(normalize(V(g)$ec) * 9) + 1)
#ec_index should now be a category between 1 and 10 for your centralities
You can use any resolution you like.
2) Attach colours from the indexation
There are several packages and ways to load colour-ranges in R (colorspace, colorRamps, RColorBrewer etc).
# Build a color-mapping with 10 categories and set the color of each
# node to the corresponding color of each centrality-measure category
V(g)$color <- colorRampPalette(c("turquoise", "yellow","red"))(10)[V(g)$ec_index]
# Look at what we did
table(V(g)$color)
plot(g, vertex.label=NA, vertex.size=5)
This example should produce something along the lines of this graph here:

cluster walktrap returns three communities, but when plotting they are all on top of each other, with no visible clustering

I've been following documentation tutorials and even lecture tutorials step by step. But for some reason the output of my plot is like this:
The output doesn't make any sense to me. There clearly is no structure, or communities in this current plot, as you can see that the bigger circles are all overlapping. Shouldn't this, in this case, return only a single community? Additionally the modularity of my network is ~0.02 which would again, suggest there is no community structure. But why does it return 3 communities?
this is my code: (exactly same as in documentation, with different dataset)
m <- data.matrix(df)
g <- graph_from_adjacency_matrix(m, mode = "undirected")
#el <- get.edgelist(g)
wc <- cluster_walktrap(g)
modularity(wc)
membership(wc)
plot(wc,g)
my data set looks is a 500x500 adjacency matrix in the form of a csv, with a 1-500 column and index names corresponding to a person.
I tried understanding the community class and using different types of variables for the plot, e.g. membership(wc)[2] etc. My thought is that the coloring is simply wrong, but nothing Ive tried so far seems to fix the issue.
You can have inter-community connections. You're working with a graph of 500 nodes and they can have multiple connections. There will be a large number of connections between nodes of different communities, but if you conduct a random walk you're most likely to traverse connections between nodes of the same community.
If you separate the communities in the plot (using #G5W's code (igraph) Grouped layout based on attribute) you can see the different groups.
set.seed(4321)
g <- sample_gnp(500, .25)
plot(g, vertex.label = '', vertex.size = 5)
wc <- cluster_walktrap(g)
V(g)$community <- membership(wc)
E(g)$weight = 1
g_grouped = g
for(i in unique(V(g)$community)){
groupV = which(V(g)$community == i)
g_grouped = add_edges(g_grouped, combn(groupV, 2), attr=list(weight = 2))
}
l <- layout_nicely(g_grouped)
plot( wc,g, layout = l, vertex.label = '', vertex.size = 5, edge.width = .1)
Red edges are intercommunity connections and black edges are intracommunity edges

revealing clusters of interaction in igraph

I have an interaction network and I used the following code to make an adjacency matrix and subsequently calculate the dissimilarity between the nodes of the network and then cluster them to form modules:
ADJ1=abs(adjacent-mat)^6
dissADJ1<-1-ADJ1
hierADJ<-hclust(as.dist(dissADJ1), method = "average")
Now I would like those modules to appear when I plot the igraph.
g<-simplify(graph_from_adjacency_matrix(adjacent-mat, weighted=T))
plot.igraph(g)
However the only thing that I have found thus far to translate hclust output to graph is as per the following tutorial: http://gastonsanchez.com/resources/2014/07/05/Pretty-tree-graph/
phylo_tree = as.phylo(hierADJ)
graph_edges = phylo_tree$edge
graph_net = graph.edgelist(graph_edges)
plot(graph_net)
which is useful for hierarchical lineage but rather I just want the nodes that closely interact to cluster as follows:
Can anyone recommend how to use a command such as components from igraph to get these clusters to show?
igraph provides a bunch of different layout algorithms which are used to place nodes in the plot.
A good one to start with for a weighted network like this is the force-directed layout (implemented by layout.fruchterman.reingold in igraph).
Below is a example of using the force-directed layout using some simple simulated data.
First, we create some mock data and clusters, along with some "noise" to make it more realistic:
library('dplyr')
library('igraph')
library('RColorBrewer')
set.seed(1)
# generate a couple clusters
nodes_per_cluster <- 30
n <- 10
nvals <- nodes_per_cluster * n
# cluster 1 (increasing)
cluster1 <- matrix(rep((1:n)/4, nodes_per_cluster) +
rnorm(nvals, sd=1),
nrow=nodes_per_cluster, byrow=TRUE)
# cluster 2 (decreasing)
cluster2 <- matrix(rep((n:1)/4, nodes_per_cluster) +
rnorm(nvals, sd=1),
nrow=nodes_per_cluster, byrow=TRUE)
# noise cluster
noise <- matrix(sample(1:2, nvals, replace=TRUE) +
rnorm(nvals, sd=1.5),
nrow=nodes_per_cluster, byrow=TRUE)
dat <- rbind(cluster1, cluster2, noise)
colnames(dat) <- paste0('n', 1:n)
rownames(dat) <- c(paste0('cluster1_', 1:nodes_per_cluster),
paste0('cluster2_', 1:nodes_per_cluster),
paste0('noise_', 1:nodes_per_cluster))
Next, we can use Pearson correlation to construct our adjacency matrix:
# create correlation matrix
cor_mat <- cor(t(dat))
# shift to [0,1] to separate positive and negative correlations
adj_mat <- (cor_mat + 1) / 2
# get rid of low correlations and self-loops
adj_mat <- adj_mat^3
adj_mat[adj_mat < 0.5] <- 0
diag(adj_mat) <- 0
Cluster the data using hclust and cutree:
# convert to dissimilarity matrix and cluster using hclust
dissim_mat <- 1 - adj_mat
dend <- dissim_mat %>%
as.dist %>%
hclust
clusters = cutree(dend, h=0.65)
# color the nodes
pal = colorRampPalette(brewer.pal(11,"Spectral"))(length(unique(clusters)))
node_colors <- pal[clusters]
Finally, create an igraph graph from the adjacency matrix and plot it using the fruchterman.reingold layout:
# create graph
g <- graph.adjacency(adj_mat, mode='undirected', weighted=TRUE)
# set node color and plot using a force-directed layout (fruchterman-reingold)
V(g)$color <- node_colors
coords_fr = layout.fruchterman.reingold(g, weights=E(g)$weight)
# igraph plot options
igraph.options(vertex.size=8, edge.width=0.75)
# plot network
plot(g, layout=coords_fr, vertex.color=V(g)$color)
In the above code, I generated two "clusters" of correlated rows, and a third group of "noise".
Hierarchical clustering (hclust + cuttree) is used to assign the data points to clusters, and they are colored based on cluster membership.
The result looks like this:
For some more examples of clustering and plotting graphs with igraph, checkout: http://michael.hahsler.net/SMU/LearnROnYourOwn/code/igraph.html
You haven't shared some toy data for us to play with and suggest improvements to code, but your question states that you are only interested in plotting your clusters distinctly - that is, graphical presentation.
Although igraph comes with some nice force directed layout algorithms, such as layout.fruchterman.reingold, layout_with_kk, etc., they can, in presence of a large number of nodes, quickly become difficult to interpret and make sense of at all.
Like this:
With these traditional methods of visualising networks,
the layout algorithms, rather than the data, determine the visualisation
similar networks may end up being visualised very differently
large number of nodes will make the visualisation difficult to interpret
Instead, I find Hive Plots to be better at displaying important network properties, which, in your instance, are the cluster and the edges.
In your case, you can:
Plot each cluster on a different straight line
order the placement of nodes intelligently, so that nodes with certain properties are placed at the very end or start of each straight line
Colour the edges to identify direction of edge
To achieve this you will need to:
use the ggnetwork package to turn your igraph object into a dataframe
map your clusters to the nodes present in this dataframe
generate coordinates for the straight lines and map these to each cluster
use ggplot to visualise
There is also a hiveR package in R, should you wish to use a packaged solution. You might also find another visualisation technique for graphs very useful: BioFabric

Generate Geometric random graphs in r

I have a graph containing 16809 nodes an 173,393 edges. Now i want to generate a geometric random graph with same number of nodes and edges in r. How can I achieve that.
My example R script is as follows.
library(igraph)
g2 <- graph.formula(A:B - A:C, B:D - C:D , D:F - D:F )
g2
Then how can I model g2 in geometric random graph. Is there any function to create this random graphs in R.
There is a function to generate GRG by known radius and number of nodes, see sample_grg. But the concept of GRGs means that nodes are placed randomly, so the number of edges will be random. The following code gives pretty close results to what you need:
g <- sample_grg(16809, 0.02)
But I don't think there is a way to somehow determine the number of edges.

How to spread out community graph made by using igraph package in R

Trying to find communities in tweet data. The cosine similarity between different words forms the adjacency matrix. Then, I created graph out of that adjacency matrix. Visualization of the graph is the task here:
# Document Term Matrix
dtm = DocumentTermMatrix(tweets)
### adjust threshold here
dtms = removeSparseTerms(dtm, 0.998)
dim(dtms)
# cosine similarity matrix
t = as.matrix(dtms)
# comparing two word feature vectors
#cosine(t[,"yesterday"], t[,"yet"])
numWords = dim(t)[2]
# cosine measure between all column vectors of a matrix.
adjMat = cosine(t)
r = 3
for(i in 1:numWords)
{
highElement = sort(adjMat[i,], partial=numWords-r)[numWords-r]
adjMat[i,][adjMat[i,] < highElement] = 0
}
# build graph from the adjacency matrix
g = graph.adjacency(adjMat, weighted=TRUE, mode="undirected", diag=FALSE)
V(g)$name
# remove loop and multiple edges
g = simplify(g)
wt = walktrap.community(g, steps=5) # default steps=2
table(membership(wt))
# set vertex color & size
nodecolor = rainbow(length(table(membership(wt))))[as.vector(membership(wt))]
nodesize = as.matrix(round((log2(10*membership(wt)))))
nodelayout = layout.fruchterman.reingold(g,niter=1000,area=vcount(g)^1.1,repulserad=vcount(g)^10.0, weights=NULL)
par(mai=c(0,0,1,0))
plot(g,
layout=nodelayout,
vertex.size = nodesize,
vertex.label=NA,
vertex.color = nodecolor,
edge.arrow.size=0.2,
edge.color="grey",
edge.width=1)
I just want to have some more gap between separate clusters/communities.
To the best of my knowledge, you can't layout vertices of the same community close to each other, using igraph only. I have implemented this function in my package NetPathMiner. It seems it is a bit hard to install the package just for the visualization function. I will write the a simple version of it here and explain what it does.
layout.by.attr <- function(graph, wc, cluster.strength=1,layout=layout.auto) {
g <- graph.edgelist(get.edgelist(graph)) # create a lightweight copy of graph w/o the attributes.
E(g)$weight <- 1
attr <- cbind(id=1:vcount(g), val=wc)
g <- g + vertices(unique(attr[,2])) + igraph::edges(unlist(t(attr)), weight=cluster.strength)
l <- layout(g, weights=E(g)$weight)[1:vcount(graph),]
return(l)
}
Basically, the function adds an extra vertex that is connected to all vertices belonging to the same community. The layout is calculated based on the new graph. Since each community is now connected by a common vertex, they tend to cluster together.
As Gabor said in the comment, increasing edge weights will also have similar effect. The function leverages this information, by increasing a cluster.strength, edges between created vertices and their communities are given higher weights.
If this is still not enough, you extend this principle (calculating the layout on a more connected graph) by adding edges between all vertices of the same communities (forming a clique). From my experience, this is a bit of an overkill.

Resources