I'm using lesmis.gml for a network analysis homework assignment.
I can't adjust the distance between the graph's nodes: there are more than 70 nodes and they are too close together.
The graphs are stored in the variables g and g2.
The graph looks weird, like this: (image)
I tried to use Gephi, but my laptop doesn't run it well. It shuts off.
Here's my R code.
install.packages('igraph')
install.packages('statnet')
library('igraph')
library('statnet')
g<-read.graph("lesmis.gml", format=c("gml"))
g
graph.density(g)
igraph::degree(g,mode="out")
plot(g)
vcount(g)
centralization.degree(g)
V(g)$size<-igraph::degree(g)*5
plot(g)
clo<-igraph::closeness(g)
clo
clo.score<-round((clo-min(clo))*length(clo)/max(clo))+1
clo.colors<-rev(heat.colors(max(clo.score)))
V(g)$color<-clo.colors[clo.score]
plot(g)
btw<-igraph::betweenness(g)
btw
btw.score<-round(btw)+1
btw.score
btw.colors<-rev(heat.colors(max(btw.score)))
V(g)$color<-btw.colors[btw.score]
plot(g)
clusters(g)
clusters(g)$csize
cliques(g)
sapply(cliques(g), length)
largest_cliques(g)
cliques(g)
sapply(cliques(g),length)
a<-largest_cliques(g)
a
clique1<-a[[1]]
g2<-induced.subgraph(graph=g,vids=clique1)
plot(g2)
vcol<-rep("grey80",vcount(g))
vcol[unlist(largest_cliques(g))]<-"gold"
plot(as.undirected(g),vertex.label=V(g)$name, vertex.color=vcol)
windows()
I have two suggestions. Before presenting them, I will set up the basics so that what I do is (mostly) repeatable. This is just a streamlined version of what you had in your code, with a change to the vertex size as you had it.
library(igraph)
g<-read.graph("temp/lesmis.gml", format=c("gml"))
V(g)$size<-igraph::degree(g)/2
btw<-igraph::betweenness(g)
btw.score<-round(btw)+1
btw.colors<-rev(heat.colors(max(btw.score)))
V(g)$color<-btw.colors[btw.score]
I think that this is what @nhl was suggesting. There are quite a few layout functions in igraph. Just try a bunch of them and see what looks good. I kind of liked the large graph layout.
set.seed(1234)
LO_LGL = layout_with_lgl(g)
plot(as.undirected(g), layout=LO_LGL, margin=c(-0.25,-0.25))
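To make "try a bunch of them" concrete, here is a minimal sketch that draws the same graph with four igraph layouts side by side. It uses the built-in Zachary karate club graph as a stand-in for lesmis.gml (an assumption, so the snippet runs without the data file); the choice of these four layouts is arbitrary.

```r
library(igraph)

# Stand-in graph (assumption): replace with the graph read from lesmis.gml
g <- make_graph("Zachary")

# Candidate layout functions to compare
layouts <- list(
  fr     = layout_with_fr,
  kk     = layout_with_kk,
  lgl    = layout_with_lgl,
  circle = layout_in_circle
)

set.seed(1234)
# Compute one coordinate matrix (one row per vertex) for each layout
coords <- lapply(layouts, function(f) f(g))

# Draw the four variants in a 2 x 2 grid
op <- par(mfrow = c(2, 2), mar = c(0, 0, 2, 0))
for (name in names(coords)) {
  plot(g, layout = coords[[name]], vertex.size = 6,
       vertex.label = NA, main = name)
}
par(op)
```

Whichever coordinate matrix looks best can then be passed on as the layout argument, as in the plot call above.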
Once you get something that is pretty close, you might try using tkplot which will allow you to select nodes and move them around to make the graph more readable.
tkplot(as.undirected(g), layout=LO_LGL)
I used the previous layout as a starting place and adjusted the vertices by hand to make the graph clearer. It is not perfect, but you can see some of the communities.
Related
There is a social network graph with more than 6000 vertices and tens of thousands of edges. It is drawn directly with Python networkx, as shown in the figure. How can I make it look better and clearer, and make the interaction information more obvious?
The disorganized graph
What I have tried so far is lowering the opacity of the lines or making them thinner, but the result is still disorganized. Because my screen size is fixed, the lines remain very dense no matter what I do, just like in the picture above.
My rough idea is not to make every line clear but only some of them: increase the opacity of the edges that are more important or that belong to more prominent structures, and decrease the opacity of the rest. For example, calculate the edge betweenness of every edge and use it to set each line's transparency. The code is below:
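import matplotlib.pyplot as plt
import networkx as nx

# Assumed node styling: `options` was not defined in the original snippet
options = {"node_size": 20}

G = nx.read_gml('data\\lesmis.gml')
cent = nx.edge_betweenness_centrality(G)
#node_pos = nx.spring_layout(G)
node_pos = nx.kamada_kawai_layout(G)
#node_pos = nx.circular_layout(G)
nx.draw_networkx_nodes(G, node_pos, **options)  # draw nodes
# Draw each edge on its own, with opacity scaled by its betweenness (capped at 1)
for edge, value in cent.items():
    nx.draw_networkx_edges(G, node_pos, edgelist=[edge], alpha=min(1.0, value * 100), width=2)
plt.show()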
But the result is only so-so. Can anyone suggest some methods, or offer a little advice or some ideas?
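For comparison, the same edge-opacity idea can be sketched in R with igraph, the library used in most of the other threads on this page. This is only an illustration under assumptions: it uses the built-in Zachary graph instead of the 6000-vertex network, and the linear rescaling of betweenness to an opacity between 0.1 and 1 is an arbitrary choice, not something taken from the question.

```r
library(igraph)

# Stand-in graph (assumption): the real network is not available here
g <- make_graph("Zachary")

# Edge betweenness, rescaled linearly to an opacity in [0.1, 1]
eb <- edge_betweenness(g)
alpha <- 0.1 + 0.9 * (eb - min(eb)) / (max(eb) - min(eb))

# rgb() is vectorised over alpha, so every edge gets its own transparency
E(g)$color <- rgb(0, 0, 0, alpha = alpha)

set.seed(1)
plot(g, layout = layout_with_kk(g), vertex.size = 5,
     vertex.label = NA, edge.width = 2)
```

Edges with high betweenness are drawn nearly opaque and the rest fade out, which makes the bridging structure stand out without deleting anything.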
I have no experience using R so apologies if this is a stupid question but I am on a tight deadline and have not been able to understand a clear answer from my own research.
I've just created a network graph and have changed some of the attributes, but others I can't get working:
How do I combine multiple arrows leading into one node into one higher-weighted arrow?
Why are the edges not changing to black, or any other colour that I try?
Is there a way for me to space out the nodes more so that they aren't all on top of each other around the centre, while 80% of the graph space isn't being used?
I've attached my code below, if you could implement the changes into the code exactly as I should write it to work then that would be very much appreciated.
library("igraph")
library("networkD3")
library("tidyverse")
library("ggraph")
my_data <- read.csv("email-Eu-core-temporal-Dept3-CSV2.csv", header=TRUE)
my_network_frame <- data.frame(my_data$source_node, my_data$target_node)
my_network <- graph.data.frame(my_network_frame, directed=TRUE)
V(my_network)
E(my_network)
plot(my_network, edge.arrow.size=.5, vertex.color="red", vertex.size=10,
vertex.frame.color="black", vertex.label.color="black", vertex.label.cex=0.5,
edge.colour="black", edge.width="0.5")
Thank you.
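A possible way to address the three points, sketched on a toy edge list because the CSV file is not available here (the column names are copied from the question, the data values are made up). Two things are worth checking in the plot call above: igraph's plotting parameter is spelled edge.color, and edge.width is expected to be a number rather than a string. Merging parallel arrows with simplify() and summed weights is one option among several.

```r
library(igraph)

# Toy edge list standing in for the e-mail data (made-up values)
edges <- data.frame(
  source_node = c("a", "a", "a", "b", "c"),
  target_node = c("b", "b", "c", "c", "a")
)
g <- graph_from_data_frame(edges, directed = TRUE)

# 1. Merge parallel arrows: give every edge weight 1, then sum the
#    weights of duplicated edges while collapsing them into one
E(g)$weight <- 1
g_merged <- simplify(g, remove.multiple = TRUE, remove.loops = FALSE,
                     edge.attr.comb = list(weight = "sum"))

# 2. Use edge.color (not edge.colour) and a numeric edge.width
# 3. Compute a force-directed layout explicitly; smaller vertices
#    also reduce the crowding in the centre
set.seed(1)
lo <- layout_with_fr(g_merged)
plot(g_merged, layout = lo,
     edge.color = "black", edge.width = E(g_merged)$weight,
     edge.arrow.size = 0.5, vertex.color = "red", vertex.size = 10,
     vertex.frame.color = "black", vertex.label.color = "black",
     vertex.label.cex = 0.5)
```

With the real data, the weights can get large, so scaling them (for example with log1p) before passing them to edge.width may be needed.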
I'm plotting a directed network from an edge list, and have so far created a tree-like plot (see here).
It looks good, however all of the nodes are too close together. I would like to keep the shape of it while spreading out the nodes more. Here's the code that got me the image above:
library(igraph)
ref <- read.csv("my-ref.csv", as.is=T)
el <- graph.data.frame(ref, directed=T)
lay.kk <- layout.kamada.kawai(el, niter=1000, kkconst=50)
plot.igraph(el, lay=lay.kk, vertex.label=NA, vertex.size=2, vertex.color="black")
I've tried messing around with kkconst, but that doesn't seem to change anything. Any tips are greatly appreciated!
The Kamada-Kawai layout does not really work well for disconnected graphs because the disconnected components tend to "drift away" from each other. Since igraph scales the entire plot to fit within the canvas, the farther the components are from each other, the closer the nodes will be within the components. Try the Fruchterman-Reingold layout instead.
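A minimal sketch of two ways to handle a disconnected graph, using newer igraph function names (layout_with_fr, layout_components) and a made-up three-component graph; both are assumptions, since the question's data is not available. The first lays out the whole graph with Fruchterman-Reingold, the second lays out each component separately with Kamada-Kawai and then packs the pieces together.

```r
library(igraph)

# Made-up disconnected graph: a tree, a ring and a star
g <- disjoint_union(
  make_tree(40, children = 3, mode = "undirected"),
  make_ring(12),
  make_star(9, mode = "undirected")
)

set.seed(1)
# Option A: Fruchterman-Reingold on the whole graph
lo_fr <- layout_with_fr(g, niter = 1000)

# Option B: Kamada-Kawai per component, merged into one layout
lo_parts <- layout_components(g, layout = layout_with_kk)

op <- par(mfrow = c(1, 2), mar = c(0, 0, 2, 0))
plot(g, layout = lo_fr, vertex.size = 3, vertex.label = NA,
     main = "layout_with_fr")
plot(g, layout = lo_parts, vertex.size = 3, vertex.label = NA,
     main = "layout_components + kk")
par(op)
```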
As Tamás suggested, you might get better results with layout.fruchterman.reingold(). You can fine tune this function with the following parameters:
require(igraph)
g <- erdos.renyi.game(n = 100, p.or.m = 0.04)
lo <- layout.fruchterman.reingold(g, repulserad = vcount(g)^2.8,
area = vcount(g)^2.3, niter = 1000)
plot(g, layout = lo, vertex.size = 3, vertex.frame.color = NULL,
vertex.label.dist = 0.5, vertex.label.cex = 0.7, edge.width = 0.5)
These values resulted in a clear but compact layout with little overlap for me. Try changing them a bit to see their effect on the layout. The parameters I set for plot() also help to make the visualization clearer.
There is a similar question here to which I have posted the following answer
Option 1: make the vertices smaller
node.size= c(10,10,10)
plot(net, vertex.size=node.size*0.25)
Option 2 (in case the distances between the vertices are not important to you):
# Use the tkplot option to edit your graph in GUI
tkplot (net)
tkplot GUI will allow you to interactively change the layout types.
Note: tkplot outputs the graph as eps. If you want to edit it further or export it to pdf I suggest using inkscape (I use it for all my graph editing - just save the graph as pdf in RStudio and edit it in inkscape).
For the case of eps if you are on a windows machine you will need to tweak inkscape to open this format. A very short and simple process which is detailed here:
Network visualizations have become common in science and in practice. But as networks increase in size, common visualizations become less useful. There are simply too many nodes/vertices and links/edges. Visualization efforts often end up producing "hairballs".
Some new approaches have been proposed to overcome this issue, e.g.:
Edge bundling:
http://vis.stanford.edu/papers/divided-edge-bundling or
https://gephi.org/tag/edge-bundling/
Hierarchical edge bundling:
http://graphics.cs.illinois.edu/sites/graphics.dev.engr.illinois.edu/files/edgebundles.pdf
Group Attributes Layout:
http://wiki.cytoscape.org/Cytoscape_3/UserManual
How to make grouped layout in igraph?
I am sure that there are many more approaches. Thus, my question is:
How to overcome the hairball issue, i.e. how to visualize large networks by using R?
Here is some code that simulates an exemplary network:
# Load packages
lapply(c("devtools", "sna", "intergraph", "igraph", "network"), install.packages)
library(devtools)
devtools::install_github("ggobi/ggally")
lapply(c("sna", "intergraph", "GGally", "igraph", "network"),
require, character.only=T)
# Set up data
set.seed(123)
g <- barabasi.game(1000)
# Plot data
g.plot <- ggnet(g, mode = "fruchtermanreingold")
g.plot
This question is related to
Visualizing Undirected Graph That's Too Large for GraphViz?. However, here I am not looking for general software recommendations but for concrete examples (using the data provided above) of techniques that help to produce a good visualization of a large network in R (comparable to the examples in this thread: R: Scatterplot with too many points).
Another way to visualize very large networks is with BioFabric (www.BioFabric.org), which uses horizontal lines instead of points to represent the nodes. Edges are then shown using vertical line segments. A quick D3 demo of this technique is shown at: http://www.biofabric.org/gallery/pages/SuperQuickBioFabric.html.
BioFabric is a Java application, but a simple R version is available at: https://github.com/wjrl/RBioFabric.
Here is a snippet of R code:
# You need 'devtools':
install.packages("devtools")
library(devtools)
# you need igraph:
install.packages("igraph")
library(igraph)
# install and load 'RBioFabric' from GitHub
install_github('wjrl/RBioFabric')
library(RBioFabric)
#
# This is the example provided in the question:
#
set.seed(123)
bfGraph = barabasi.game(1000)
# This example has 1000 nodes, just like the provided example, but it
# adds 6 edges in each step, making for an interesting shape; play
# around with different values.
# bfGraph = barabasi.game(1000, m=6, directed=FALSE)
# Plot it up! For best results, make the PDF in the same
# aspect ratio as the network, though a little extra height
# covers the top labels. Given the size of the network,
# a PDF width of 100 gives us good resolution.
height <- vcount(bfGraph)
width <- ecount(bfGraph)
aspect <- height / width;
plotWidth <- 100.0
plotHeight <- plotWidth * (aspect * 1.2)
pdf("myBioFabricOutput.pdf", width=plotWidth, height=plotHeight)
bioFabric(bfGraph)
dev.off()
Here is a shot of the BioFabric version of the data provided by the questioner, though networks created with values of m > 1 are more interesting. The inset detail shows a close-up of the upper left corner of the network; node BF4 is the highest-degree node in the network, and the default layout is a breadth-first search of the network (ignoring edge directions) starting from that node, with neighboring nodes traversed in order of decreasing node degree. Note that we can immediately see that, for example, about 60% of node BF4's neighbors are degree 1. We can also see from the strict 45-degree lower edge that this 1000-node network has 999 edges, and is therefore a tree.
Full disclosure: BioFabric is a tool that I wrote.
That's an interesting question; I didn't know most of the tools you listed, so thanks. You can add HivePlot to the list. It's a deterministic method that projects nodes onto a fixed number of axes (usually 2 or 3). Look at the linked page; there are many visual examples.
It works better if you have a categorical nodal attribute in your dataset, so that you can use it to select which axis a node goes to. For instance, when studying the social network of a university: students on one axis, teachers on another and administrative staff on the third. But of course, it can also work with a discretized numerical attribute (eg. young, middle-aged and older people on their respective axes).
Then you need another attribute, and it has to be numerical (or at least ordinal) this time. It is used to determine the position of a node on its axis. You can also use some topological measure, such as degree or transitivity (clustering coefficient).
(source: hiveplot.net)
The fact that the method is deterministic is interesting, because it allows comparing different networks representing distinct (but comparable) systems. For example, you can compare two universities (provided you use the same attributes/measures to determine axes and position). It also allows describing the same network in various ways, by choosing different combinations of attributes/measures to generate the visualization. This is actually the recommended way of visualizing a network, thanks to a so-called hive panel.
Several software packages capable of generating hive plots are listed on the page I mentioned at the beginning of this post, including implementations in Java and R.
I've been dealing with this problem recently. As a result, I've come up with another solution. Collapse the graph by communities/clusters. This approach is similar to the third option outlined by the OP above. As a word of warning, this approach will work best with undirected graphs. For example:
library(igraph)
set.seed(123)
g <- barabasi.game(1000) %>%
as.undirected()
#Choose your favorite algorithm to find communities. The algorithm below is great for large networks but only works with undirected graphs
c_g <- fastgreedy.community(g)
#Collapse the graph by communities. This insight is due to this post http://stackoverflow.com/questions/35000554/collapsing-graph-by-clusters-in-igraph/35000823#35000823
res_g <- simplify(contract(g, membership(c_g)))
The result of this process is the below figure, where the vertices' names represent community membership.
plot(res_g, margin = -.5)
The above is clearly nicer than this hideous mess
plot(g, margin = -.5)
To link communities to original vertices you will need something akin to the following
mem <- data.frame(vertices = 1:vcount(g), member = as.numeric(membership(c_g)))
IMO this is a nice approach for two reasons. First, it can in theory deal with a graph of any size. The process of finding communities can be repeated on the collapsed graphs. Second, adopting an interactive approach would yield very readable results. For example, one can imagine the user being able to click on a vertex in the collapsed graph to expand that community, revealing all of its original vertices.
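As a small extension of the collapsing idea, the sketch below sizes each collapsed vertex by the number of original vertices in its community, so the reduced picture still shows which clusters are large. It uses the newer igraph function names (sample_pa, cluster_fast_greedy) in place of barabasi.game and fastgreedy.community, and the square-root size scaling is an arbitrary choice; both are assumptions rather than part of the answer above.

```r
library(igraph)

set.seed(123)
# Same kind of test graph as above, generated directly as undirected
g <- sample_pa(1000, directed = FALSE)

# Detect communities, then merge every community into a single vertex
comm <- cluster_fast_greedy(g)
collapsed <- contract(g, as.integer(membership(comm)))
# Drop the loops and parallel edges created by the contraction
collapsed <- simplify(collapsed)

# Vertex i of the collapsed graph is community i; scale it by community size
comm_sizes <- as.numeric(sizes(comm))
V(collapsed)$size <- 3 + 2 * sqrt(comm_sizes)
V(collapsed)$label <- comm_sizes

plot(collapsed, vertex.label.cex = 0.6, margin = -0.1)
```

The label shows how many original vertices each collapsed vertex stands for, which is a static substitute for the click-to-expand interaction described above.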
I have looked around and found no good solution. My approach has been to remove nodes and play with edge transparency. It is more of a design solution than a technical one, but I've been able to plot Gephi-like networks of up to 50,000 edges without much trouble on my laptop.
with your example:
plot(simplify(g), vertex.size = 0.01, edge.arrow.size = 0.001,
     vertex.label.cex = 0.75, vertex.label.color = "black",
     vertex.frame.color = adjustcolor("white", alpha.f = 0),
     vertex.color = adjustcolor("white", alpha.f = 0),
     edge.color = adjustcolor(1, alpha.f = 0.15), display.isolates = FALSE,
     vertex.label = ifelse(page_rank(g)$vector > 0.1, "important nodes", NA))
Example of twitter mentions network with 30,000 edges:
Yet another interesting package is networkD3. There are myriad ways of representing graphs within this library. In particular, I find forceNetwork an interesting option. It is interactive and therefore allows you to really explore your network. It is great for EDA, but it may be too "wiggly" for final work.
I tried this package. It's very fast.
if (!require("BiocManager", quietly = TRUE))
install.packages("BiocManager")
BiocManager::install("netbiov")
https://www.bioconductor.org/packages/release/bioc/html/netbiov.html
Examples:
https://www.bioconductor.org/packages/release/bioc/vignettes/netbiov/inst/doc/netbiov-intro.pdf
I am trying to build graphs using tree-like data, where nodes typically split into >2 edges. I have tried various layouts, and I see that layout.reingold.tilford will generate tree-like graphs with non-bifurcating data. However, the outputs are not particularly attractive. I would rather use something like layout.lgl or layout.kamada.kawai, since these produce more radial structures. I cannot see how to change the parameters in R so that these trees have no overlapping edges, though. Is this possible?
I imported a simple data file in Pajek format, with 355 nodes and 354 edges. I'm currently printing it using:
plot.igraph(g,vertex.size=3,vertex.label=NA,layout=layout.lgl)
This gives me an output like this, which is nice, but still has overlapping edges. I have read that you can manually fix this using tkplot, or another program like cytoscape, however I have quite a few of these to build, and the size of them makes manual correction a hassle.
Many thanks.
I just wanted to add a comment, but my rep is too low. The method that @bdemarest posted does not work on igraph versions > 0.7. Newer versions do not support the area parameter, so I could not get the same effect. Getting the old version to build took me a while, so I thought I'd share some insights. You can manually install igraph 0.7 from source if you download it from the igraph nightly builds. On my machine (Mac OS 10.10), I encountered some problems building it, due to gfortran, so I found this link that solved the problem. Hope that helps anyone who wants to create similar graphs in R.
You may want to try layout.fruchterman.reingold(). It seems to do a good job of keeping the edges from crossing. I've tested it with a 355-node version of the barabasi graph suggested by @Tamás.
library(igraph)
g = barabasi.game(355, directed=FALSE)
png("plot1.png", height=6, width=12, units="in", res=200)
par(mfrow=c(1, 2))
plot.igraph(g,vertex.size=3,vertex.label=NA,
layout=layout.fruchterman.reingold(g, niter=10000))
mtext("layout.fruchterman.reingold, area = vcount^2", side=1)
plot.igraph(g,vertex.size=3,vertex.label=NA,
layout=layout.fruchterman.reingold(g, niter=10000, area=30*vcount(g)^2))
mtext("layout.fruchterman.reingold, area = 30 * vcount^2", side=1)
dev.off()
layout.reingold.tilford has a parameter called circular. Setting this to TRUE will convert the final layout into a radial one by treating the X coordinate as the angle (after appropriate rescaling) and the Y coordinate as the radius. Ironically enough, this does not guarantee that there will be no edge crossings in the end, but it works nicely if your subtrees are not exceedingly wide compared to the rest of the graph:
> g <- barabasi.game(100, directed=F)
> layout <- layout.reingold.tilford(g, circular=T)
> plot(g, layout=layout)