Calculating the share of isolates in R with igraph - r

I am currently working on an analysis of networks in R. I have run into something that does not seem right to me. First some context:
I have created a network with the igraph package with 7231 observations of 4 variables, using the following code:
MyNetwork <- data.frame(Katalog_G_2000_2018_VOLLEDIG$Zuwendungsempfänger, Katalog_G_2000_2018_VOLLEDIG$Ausführende.Stelle, Katalog_G_2000_2018_VOLLEDIG$typ, Katalog_G_2000_2018_VOLLEDIG$verbund)
Network <- graph.data.frame(MyNetwork, directed=F)
After this, I visualised the network with the following code:
plot(Network,vertex.size=6, edge.arrow.size=0.4, main= "ICT-Networks in Germany 2000-2018", vertex.label.cex = 0.8,vertex.label=NA,vertex.color = "green")
Now, I would like to find out how many vertices without edges there are in my network (i.e. the share of isolates). For this I tried using this code:
V(Network)[igraph::degree(Network, mode = "out")>0 & igraph::degree(Network, mode = "in") > 0]
length(V(Network)[igraph::degree(Network, mode = 'out')>0 & igraph::degree(Network, mode = 'in') > 0])
This is where the problem arises. Running this code tells me that 4305 of 4305 vertices have edges, while the visualisation of my network clearly shows that there are vertices without edges.
Could anyone please tell me how to fix this code so that it returns the share of isolates in my network? Any other solutions to this problem (using different code, for example) would be greatly appreciated as well. If you need any additional information in order to answer my question, please let me know (unfortunately I can't share my data set at this point).
Thanks in advance.

If your main graph is one connected component and every other component is a single vertex, you can count the vertices without edges as the number of components minus one:
library(igraph)
length(decompose.graph(Network,
                       mode = "weak",
                       max.comps = NA,
                       min.vertices = 0)
) - 1
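The isolates can also be counted directly with degree(). The following is a minimal sketch, not a confirmed diagnosis: it assumes that the vertices which look isolated in the plot are ones whose only edges are self-loops (graph.data.frame builds its vertex set from the first two columns, so every vertex starts with at least one edge). The toy edge list and the names edges, g_simple and share_isolates are made up for illustration; graph_from_data_frame is the current name of graph.data.frame.

```r
library(igraph)

# Hypothetical toy edge list: C and D each have a self-loop, D has nothing else.
edges <- data.frame(from = c("A", "B", "C", "D"),
                    to   = c("B", "C", "C", "D"))
g <- graph_from_data_frame(edges, directed = FALSE)

# Drop self-loops but keep multiple edges, then count vertices with degree 0.
g_simple <- simplify(g, remove.multiple = FALSE, remove.loops = TRUE)
n_isolates <- sum(degree(g_simple) == 0)

# Share of isolates among all vertices.
share_isolates <- n_isolates / vcount(g_simple)
share_isolates
```

If the graph has no self-loops, the simplify() step changes nothing and sum(degree(Network) == 0) / vcount(Network) gives the share directly.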

Related

Is the relative neighborhood graph of the cccd library of R working properly or am I doing something wrong?

I have loaded a 2D dataset and I wanted to try some graph-based clustering algorithms, however the relative neighborhood graph I get seems to be missing a lot of edges. Here is the result I get, and you can tell visually that vertices that should be connected with an edge are not. Is the cccd library working correctly or am I doing something wrong?
Here's the code I used:
g <- rng(x = dataset1)
plot.igraph(g, vertex.size = 2, vertex.label = NA)
I'm also having this issue. Setting the "open" option to FALSE seems to help, but I still get one edge fewer than expected on the UCI iris dataset (195 vs. 196).

Create Weighted edges in igraph instead of multiple single connections

I'm trying to visualize the connections between the institutions in a medical faculty and just can't get the edges to be weighted and displayed thicker or thinner depending on the number of connections.
I've tried to combine the answers I found here playing around with edge.width = E(g)$weight and trying graph.strength(g). But honestly I have no idea what I'm doing. This is the first time I have to use R and I have no experience in programming whatsoever.
library(igraph)
D3 <- read.csv(file.choose(),header=TRUE,row.names = 1)
g <- graph.data.frame(D3, directed=FALSE)
plot(g,
vertex.size=20,
vertex.label.dist=1,
vertex.label.degree=-pi/2,
layout=layout_with_kk)
igraph plots a network in which every single connection is shown. Some institutions have multiple connections between each other, which makes the graph quite unattractive to look at. Only part of the table was used for this picture.
My data looks like this and has about 1500 rows:
"1","NEUROLOGIE","MEDINF"
Any help is much appreciated!
Using edge.width = E(g)$weight is the right idea, but you need to get the right weight. graph.strength(g) is a property of the vertices, but you need a weight for the edges. I don't know of a function that directly calculates how many edges there are between two vertices, but it is not hard to write one.
First, get a version of the graph with just one edge between each pair of connected vertices.
g2 = simplify(g)
Now we need to get the right weight for the edges of g2. If an edge connects two vertices, all shortest paths connecting those two vertices will be single edges, so for each edge of the simplified g2, we need to find the number of shortest paths (edges) between those vertices in the original g. Then we can plot.
E(g2)$weight = sapply(E(g2), function(e) {
    length(all_shortest_paths(g, from=ends(g2, e)[1], to=ends(g2, e)[2])$res) } )
plot(g2,
     vertex.size=15,
     vertex.label.dist=0.5,
     vertex.label.cex=0.8,
     vertex.label.degree=-pi/2,
     edge.width=E(g2)$weight,
     layout=layout_with_kk,
     margin=-0.2)
(I have slightly modified your plot statement to improve readability.)
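An alternative way to get the same edge counts is to let simplify() add them up itself. This is a minimal sketch under assumptions: the toy edge list is invented, and the idea is simply to give every edge a weight of 1 and ask simplify() to sum the weights of the edges it merges through its edge.attr.comb argument.

```r
library(igraph)

# Hypothetical toy data: A-B appears three times (in either order), A-C once.
edges <- data.frame(from = c("A", "A", "A", "B"),
                    to   = c("B", "B", "C", "A"))
g <- graph_from_data_frame(edges, directed = FALSE)

# Give each edge weight 1, then sum the weights of parallel edges when merging.
E(g)$weight <- 1
g2 <- simplify(g,
               remove.multiple = TRUE,
               remove.loops = TRUE,
               edge.attr.comb = list(weight = "sum", "ignore"))

# Each remaining edge now carries the number of original connections.
E(g2)$weight
```

The resulting E(g2)$weight can be passed to edge.width in plot() exactly as in the code above.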
Thank you so much for your help! I was nowhere close to that. To make it more readable, I reduced the thickness of the edges and replaced the names with numbers. This is the code:
library(igraph)
D3 <- read.csv(file.choose(),header=TRUE,row.names = 1)
g <- graph.data.frame(D3, directed=FALSE)
g2 = simplify(g)
E(g2)$weight = sapply(E(g2), function(e) {
length(all_shortest_paths(g, from=ends(g2, e)[1], to=ends(g2, e)[2])$res) } )
tkplot(g2,
vertex.color= "gold",
vertex.label.color="red",
vertex.size=10,
vertex.label.cex=1,
edge.width=E(g2)$weight*0.15,
edge.color="grey",
layout=layout.reingold.tilford,
asp = .5,
margin=-0.95)
Creating: [figure: Reingold-Tilford layout]
I find this visualization quite fine because the graph is interactive. Are there other ways to make it even more readable?
Thanks again for the help!
All the best,
Jay

R/ImageJ: Measuring shortest distance between points and curves

I have some experience with R as a statistics platform, but am inexperienced in image based maths. I have a series of photographs (tiff format, px/µm is known) with holes and irregular curves. I'd like to measure the shortest distance between a hole and the closest curve for that particular hole. I'd like to do this for each hole in a photograph. The holes are not regular either, so maybe I'd need to tell the program what are holes and what are curves (ImageJ has a point and segmented line functions).
Any ideas how to do this? Which package should I use in R? Would you recommend another program for this kind of task?
EDIT: Doing this is now possible using the sclero package. The package is currently available on GitHub, and the procedure is described in detail in the tutorial. Just to illustrate, I use an example from the tutorial:
library(devtools)
install_github("MikkoVihtakari/sclero", dependencies = TRUE)
library(sclero)
path <- file.path(system.file("extdata", package = "sclero"), "shellspots.zip")
dat <- read.ijdata(path, scale = 0.7812, unit = "um")
shell <- convert.ijdata(dat)
aligned <- spot.dist(shell)
plot(aligned)
It is also possible to add sample spot sizes using the functions provided by the sclero package. Please see Section 2.5 in the tutorial.
There's an edge detection tool written for ImageJ that might help you first find the holes and the lines, and make them clearer. You can find it at
http://imagejdocu.tudor.lu/doku.php?id=plugin:filter:edge_detection:start
Playing around with the thresholding and hysteresis settings can help to get the lines and holes detected. It's difficult to tell whether this has much chance of working without seeing your actual photographs, but a colleague of mine had good results using this tool on FRAP images. I programmed an ImageJ tool that can calculate recoveries in FRAP analysis based on those images. You might get some ideas for yourself when looking at the code (see: http://imagejdocu.tudor.lu/doku.php?id=plugin:analysis:frap_normalization:start )
The only way I know of to work with images in R is the EBImage package, which is part of Bioconductor. The package Rimage is orphaned, so it is no longer maintained.
To find the shortest distance: once you have the coordinates of the lines and holes, you can go for the shotgun approach: calculate the distances between all points of the hole and all points of the line, and then take the minimum. An illustration of that in R:
x <- -100:100
x2 <- seq(-70,-50,length.out=length(x)/4)
a.line <- list(x = x,
y = 4*x + 5)
a.hole <- list(
x = c(x2,rev(x2)),
y = c(200 + sqrt(100-(x2+60)^2),
rev(200 - sqrt(100-(x2+60)^2)))
)
plot(a.line,type='l')
lines(a.hole,col='red')
calc.distance <- function(line,hole){
    mline <- matrix(unlist(line),ncol=2)
    mhole <- matrix(unlist(hole),ncol=2)
    id1 <- rep(1:nrow(mline),nrow(mhole))
    id2 <- rep(1:nrow(mhole), each=nrow(mline))
    min(
        sqrt(
            (mline[id1,1]-mhole[id2,1])^2 +
            (mline[id1,2]-mhole[id2,2])^2
        )
    )
}
Then :
> calc.distance(a.line,a.hole)
[1] 95.51649
Which you can check mathematically by deriving the equations from the circle and the line. This goes fast enough if you don't have millions of points describing thousands of lines and holes.
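The mathematical check mentioned above can be sketched as follows. The circle's centre (-60, 200) and radius 10 are read off the definition of a.hole, and the line is y = 4*x + 5; the variable names are chosen here for illustration. The distance from the circle to the line is the distance from the centre to the line minus the radius.

```r
# Circle (the hole): centre (-60, 200), radius 10, taken from the a.hole definition.
cx <- -60
cy <- 200
r  <- 10

# Line y = 4*x + 5 rewritten as a*x + b*y + c0 = 0.
a  <- 4
b  <- -1
c0 <- 5

# Point-to-line distance for the centre, then subtract the radius.
centre_to_line <- abs(a * cx + b * cy + c0) / sqrt(a^2 + b^2)
analytic <- centre_to_line - r
analytic
```

This gives about 95.50, slightly below the sampled result, which is expected because the shotgun approach only looks at a finite set of points on the line and on the hole.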

Draw Network in R (control edge thickness plus non-overlapping edges)

I need to draw a network with 5 nodes and 20 directed edges (an edge connecting each 2 nodes) using R, but I need two features to exist:
To be able to control the thickness of each edge.
The edges must not overlap (i.e., the edge from A to B is not drawn over the edge from B to A).
I've spent hours looking for a solution, and tried many packages, but there's always a problem.
Can anybody suggest a solution please and provide a complete example as possible?
Many Thanks in advance.
If it is ok for the lines to be curved then I know two ways. First I create an edgelist:
Edges <- data.frame(
from = rep(1:5,each=5),
to = rep(1:5,times=5),
thickness = abs(rnorm(25)))
Edges <- subset(Edges,from!=to)
This contains the node of origin in the first column, the node of destination in the second and the weight in the third. You can use my package qgraph to plot a weighted graph from this. By default the edges are curved if there are multiple edges between two nodes:
library("qgraph")
qgraph(Edges,esize=5,gray=TRUE)
However this package is not really intended for this purpose and you can't change the edge colors (yet, working on it:) ). You can only make all edges black with a small trick:
qgraph(Edges,esize=5,gray=TRUE,minimum=0,cut=.Machine$double.xmin)
For more control you can use the igraph package. First we make the graph:
library("igraph")
g <- graph.edgelist(as.matrix(Edges[,-3]))
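Note the conversion to a matrix. (In igraph versions before 0.6 the first node was 0, so you also had to subtract one from the node numbers; that is not needed with one-based vertex IDs.) Next we define the layout: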
l <- layout.fruchterman.reingold(g)
Now we can change some of the edge parameters with the E() function:
# Define edge widths:
E(g)$width <- Edges$thickness * 5
# Define arrow widths:
E(g)$arrow.width <- Edges$thickness * 5
# Make edges curved:
E(g)$curved <- 0.2
And finally plot the graph:
plot(g,layout=l)
While not an R answer specifically, I would recommend using Cytoscape to generate the network.
You can automate it using RCytoscape.
http://bioconductor.org/packages/release/bioc/html/RCytoscape.html
The package informatively named 'network' can draw directed networks fairly well, and handle your issues.
ex.net <- rbind(c(0, 1, 1, 1), c(1, 0, 0, 1), c(0, 0, 0, 1), c(1, 0, 1, 0))
plot(network(ex.net), usecurve = T, edge.curve = 0.00001,
edge.lwd = c(4, rep(1, 7)))
The edge.curve argument, if set very low and combined with usecurve=T, separates the edges, although there might be a more direct way of doing this, and edge.lwd can take a vector as its argument for different sizes.
It's not always the prettiest result, I admit. But it's fairly easy to get decent-looking network plots that can be customized in a number of different ways (see ?plot.network).
The 'non-overlapping' constraint on edges is the big problem here. First, your network has to be 'planar', otherwise it's impossible in two dimensions (you can't connect three houses to gas, electric and phone company buildings without crossovers).
I think an algorithm for planar graph layout essentially solves the 4-colour problem. Have fun with that. Heuristics exist, search for planar graph layout, and force-directed, and read Planar Graph Layouts

R: Selectively display nodes when plotting an igraph

How can I plot a selection of igraph nodes?
I have an existing graph, but it is too complex. I want to be able to 'zoom in' on a subset of the nodes.
I am able to delete a subset of edges, but I can't figure out how to 'turn off' the isolated nodes.
When using the network package, the displayisolates=FALSE parameter does this; it does not display these isolated nodes.
The layout algorithm should also ignore the 'turned off' edges.
For example:
g1 <- graph( c( 0,1, 1,2, 2,2, 2,3 ) )
g2 <- delete.edges(g1, E(g1, c(0,1)))
plot(g2)
When plotting g2, I want to not display node 0.
Thanks
I understand that users should not submit new answers to comment on other answers, but my edit was rejected and I don't have a high enough reputation to leave comments.
I just wanted to point out that in Wine's answer, the "- 1" index correction in the delete.isolates function is not necessary from igraph 0.6 onwards. See also Tamas' comment here:
Plot only Edges with a specific weight - igraph
Hey, it looks like you figured it out, but in exploring the question (I usually use the network package myself, but have tried to use igraph as well for some things) I came up with a function that should do that automatically, mirroring the displayisolates = F functionality.
delete.isolates <- function(graph, mode = 'all') {
    # "- 1" converts to the zero-based vertex ids used by igraph before 0.6
    isolates <- which(degree(graph, mode = mode) == 0) - 1
    delete.vertices(graph, isolates)
}
In your case, running this with g1 would remove the first vertex if you used the argument mode = 'in' and the last vertex if you used the argument mode = 'out'.
So in your case, if you entered:
g2 <- delete.isolates(g1, mode = 'in')
plot(g2)
You should get what you want. I don't use igraph much, so it's very possible that this function would run into some issues for other graphs.
P.S. This also gives the kind of weird result that in the new g2, the first vertex is now an isolate based on indegree. This function probably isn't useful in most situations, but might be helpful for making a cleaner plot.
iso <- V(g1)[degree(g1)==0]
g2 <- delete.vertices(g1, iso)
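For igraph 0.6 and later, where vertex IDs start at 1, the same idea can be sketched as below. This is an illustration under assumptions rather than any answerer's code: the example graph from the question is shifted to one-based IDs, the names g_edited and g_plot are invented, and the underscore names (make_graph, delete_edges, delete_vertices) are the current spellings of the dotted functions used in this thread.

```r
library(igraph)

# The question's example graph with one-based vertex IDs: 1->2, 2->3, 3->3, 3->4.
g1 <- make_graph(c(1, 2,  2, 3,  3, 3,  3, 4), directed = TRUE)

# Remove the first edge (1->2), which leaves vertex 1 without any edges.
g_edited <- delete_edges(g1, 1)

# Drop every vertex whose total degree is zero; no "- 1" correction is needed.
g_plot <- delete_vertices(g_edited, which(degree(g_edited, mode = "all") == 0))

vcount(g_plot)
```

Calling plot(g_plot) then lays out and draws only the remaining three vertices.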
