How to select a graph vertex with igraph in R?

In Python you can simply use graph.vs.select() (at least according to the documentation: https://igraph.org/python/doc/api/igraph.VertexSeq.html) to select a vertex based on a value. I have a huge graph connecting movies that share an actor. However, I would simply like to select one vertex from that graph and then plot it with its direct neighbors. I'm running into issues when I try to select just that single vertex.
I had hoped something like this would work:
graph.movies.select('Snatch (2000)')
But no luck.
Another approach I took was grabbing the single Snatch vertex by filtering all others out and then adding the edges.
snatch.graph <- induced.subgraph(g.movies, vids=V(g.movies)$name == 'Snatch (2000)')
snatch.edges <- edges(g.movies, "Snatch (2000)")
add_edges(snatch.graph, snatch.edges$edges)
However this returns an empty graph with only the snatch vertex.
My goal is to grab the Snatch vertex and plot this vertex, its DIRECT neighbors and the edges among them. Any suggestions? Thanks a lot :D I've been stuck on this for quite a while -.-

You can use ?ego to grab the neighbours or ?make_ego_graph to form the graph. (Depending on whether your graph is directed or not, you may need to use the mode argument).
An example:
library(igraph)
# create some data
set.seed(71085002)
n = 10
g = random.graph.game(n, 0.25)
V(g)$name = seq_len(n)
# grab neighbours of node "1"
# set mindist = 0 to include the node itself
# set order to the step size (radius) of the neighbourhood
nb_g = make_ego_graph(g, order=1, nodes="1", mindist=0)[[1]]
# plot
# you could use the `layout` argument to
# keep nodes in the same position
op = par(mfrow=c(1,2), oma=rep(0,4))
plot(g, vertex.size=0, vertex.label.cex=3, main="Full graph")
plot(nb_g, vertex.size=0, vertex.label.cex=3, main="Sub graph")
par(op) # reset
If you just need a list of the neighbouring nodes:
ego(g, order=1, nodes="1", mindist=0)
# [[1]]
# + 4/10 vertices, named, from 00cfa70:
# [1] 1 4 6 9

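Applied to the graph in the question, the same idea might look like the sketch below. The edge list is invented purely for illustration (only 'Snatch (2000)' comes from the question), and the graph is assumed to be undirected with the movie titles stored in the name vertex attribute.

```r
library(igraph)

# Hypothetical toy data: movies that share an actor. Every title except
# 'Snatch (2000)' is a placeholder, not taken from the question.
movie.edges <- data.frame(
  from = c("Snatch (2000)", "Snatch (2000)", "Movie A", "Movie B"),
  to   = c("Movie A",       "Movie B",       "Movie B", "Movie C")
)
g.movies <- graph_from_data_frame(movie.edges, directed = FALSE)

# Select the single vertex by its name attribute
snatch.vertex <- V(g.movies)[name == "Snatch (2000)"]

# The vertex, its direct neighbours, and all edges among them
snatch.graph <- make_ego_graph(g.movies, order = 1, nodes = snatch.vertex)[[1]]

V(snatch.graph)$name
```

Calling plot(snatch.graph) then draws only that neighbourhood. The attempt in the question appears to return a single vertex because induced.subgraph was given only that one vertex, and the result of add_edges was never assigned back to a variable.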
I think the method using ego (by @user20650) is comprehensive and efficient.
Here is another option if you would like to find the sub-graph built on the direct neighbors, which combines distances and induced_subgraph:
> induced_subgraph(g, which(distances(g, "1") <= 1))
IGRAPH 99f872b UN-- 4 3 -- Erdos renyi (gnp) graph
+ attr: name (g/c), type (g/c), loops (g/l), p (g/n), name (v/n)
+ edges from 99f872b (vertex names):
[1] 1--4 1--6 1--9

Related

"sna" or "igraph" : Why do I get different degree values for undirected graph?

I am doing some basic network analysis using networks from the R package "networkdata". To this end, I use the package "igraph" as well as "sna". However, I realised that the results of descriptive network statistics vary depending on the package I use. Most variation is not too grave but the average degree of my undirected graph halved as soon as I switched from "sna" to "igraph".
library(networkdata)
n_1 <- covert_28
library(igraph)
library(sna)
n_1_adjmat <- as_adjacency_matrix(n_1)
n_1_adjmat2 <- as.matrix(n_1_adjmat)
mean(sna::degree(n_1_adjmat2, cmode = "freeman")) # [1] 23.33333
mean(igraph::degree(n_1, mode = "all")) # [1] 11.66667
This doesn't happen in case of my directed graph. Here, I get the same results regardless of using "sna" or "igraph".
Is there any explanation for this phenomenon? And if so, is there anything I can do in order to prevent this from happening?
Thank you in advance!
This is explained in the documentation for sna::degree.
indegree of a vertex, v, corresponds to the cardinality
of the vertex set N^+(v) = {i in V(G) : (i,v) in E(G)};
outdegree corresponds to the cardinality of the vertex
set N^-(v) = {i in V(G) : (v,i) in E(G)}; and total
(or “Freeman”) degree corresponds to |N^+(v)| + |N^-(v)|.
(Note that, for simple graphs,
indegree=outdegree=total degree/2.)
A simpler example than yours makes it clear.
library(igraph)
library(sna)
g = make_ring(3)
plot(g)
AM = as.matrix(as_adjacency_matrix(g))
sna::degree(AM)
[1] 4 4 4
igraph::degree(g)
[1] 2 2 2
Vertex 1 has links to both vertices 2 and 3. These count in the
in-degree and also count in the out-degree, so
Freeman = in + out = 2 + 2 = 4
The "Note" in the documentation states this.
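The arithmetic can be checked without either package. The following is a minimal base R sketch, assuming the adjacency matrix of the same triangle; it reproduces the counting rule quoted above rather than calling sna itself. The sna::degree documentation also lists a gmode argument, and passing gmode = "graph" is the documented way to have the input treated as undirected; that call is not run here.

```r
# Adjacency matrix of an undirected triangle (the same graph as make_ring(3))
AM <- matrix(c(0, 1, 1,
               1, 0, 1,
               1, 1, 0), nrow = 3, byrow = TRUE)

indegree  <- colSums(AM)           # edges arriving at each vertex
outdegree <- rowSums(AM)           # edges leaving each vertex
freeman   <- indegree + outdegree  # every undirected edge is counted twice

freeman      # 4 4 4, the value sna reports by default
freeman / 2  # 2 2 2, the value igraph reports
```

So for an undirected graph stored as a symmetric matrix, halving the Freeman degree recovers the igraph value.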

Maximum number of nodes which can be reached from each node in a graph using igraph

I want to find the maximum number of nodes which can be reached from each node in a graph using igraph in R.
For example, I have the following graph:
IGRAPH fb9255f DN-- 4 3 --
+ attr: name (v/c), X (e/l)
+ edges from fb9255f (vertex names):
[1] 1->2 2->3 2->4
Resulting graph
For node 1, for example, I would like to obtain the list of all the possible
nodes reachable (not only using one hop) from it.
In this case, for node 1 it will be: [2,3,4]
I have read the igraph documentation, but I do not see any function that can help.
Any help will be appreciated.
Thanks
Carlos
You can compute this using the subcomponent function.
Since you do not provide any data, I will illustrate with an arbitrary example.
## Example graph
library(igraph)
set.seed(123)
g = erdos.renyi.game(15, 0.15, directed = TRUE)
plot(g)
subcomponent gives you all of the nodes that are reachable from a given node, including the node itself. Here, I am assuming that you are using directed graphs and that you mean reachable by going forward along the directed edges. You can change this by altering the mode argument to subcomponent.
sort(subcomponent(g, 2, mode="out"))
+ 7/15 vertices:
[1] 2 5 10 12 13 14 15
If you just want the number of nodes that can be reached, take the length:
length(subcomponent(g, 2, mode="out"))
[1] 7
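For the four-node graph in the question, the same call can be applied to every node in turn. This is a sketch under one assumption: the graph is rebuilt here with plain integer vertex ids instead of the character names shown in the question. Because subcomponent includes the starting node, it is dropped before counting.

```r
library(igraph)

# Rebuild the example graph 1->2, 2->3, 2->4 (integer ids, an assumption)
g <- make_graph(c(1, 2,  2, 3,  2, 4), directed = TRUE)

# For each node, follow outgoing edges and drop the node itself
reachable <- lapply(seq_len(vcount(g)), function(v) {
  setdiff(as_ids(subcomponent(g, v, mode = "out")), v)
})

# Number of nodes reachable from each node
n_reachable <- lengths(reachable)

reachable[[1]]  # nodes reachable from node 1
n_reachable
```

Node 1 reaches nodes 2, 3 and 4, node 2 reaches 3 and 4, and the two leaf nodes reach nothing.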

How can I reduce the nodes in a ggraph arc graph?

I'm trying to create an arc graph showing relationships between nonprofits focusing on a subgraph centered on one of the nonprofits. There are so many nonprofits in this subgraph, I need to reduce the number of nodes in the arc graph to only focus on the strongest connections.
I've successfully filtered out edges below a weight of 50. But when I create the graph, the nodes are still remaining even though the edges have disappeared. How do I filter the unwanted nodes from the arc graph?
Here's my code, starting from the creation of the igraph object.
# Create an igraph object
NGO_igraph <- graph_from_data_frame(d = edges, vertices = nodes, directed = TRUE)
# Create a subgraph centered on a node
# Start by entering the node ID
nodes_of_interest <- c(48)
# Build the graph
selegoV <- ego(NGO_igraph, order=1, nodes = nodes_of_interest, mode = "all", mindist = 0)
selegoG <- induced_subgraph(NGO_igraph,unlist(selegoV))
# Reducing the graph based on edge weight
smaller <- delete.edges(selegoG, which(E(selegoG)$weight < 50))
# Plotting an arc graph
ggraph(smaller, layout = "linear") +
geom_edge_arc(aes(width = weight), alpha = 0.8) +
scale_edge_width(range = c(0.2, 2)) +
geom_node_text(aes(label = label)) +
labs(edge_width = "Interactions") +
theme_graph()
And here's the result I'm getting:
If you are only interested in omitting zero degree vertices or isolates (meaning vertices which have no incoming or outgoing edge) you could simply use the following line:
g <- induced.subgraph(g, degree(g) > 0)
However, this will delete all isolates. So if you are for some reason set on specifically deleting only those vertices connected by edges with a weight below 50 (and exempting other 'special' isolates), then you will need to clearly identify which those are:
special_vertex <- 1
v <- ends(g, which(E(g)$weight < 50))
g <- delete.vertices(g, v[v != special_vertex])
You could also skip the delete.edges part by considering the strength of a vertex:
g <- induced.subgraph(g, strength(g) > 50)
Without any sample data I created this basic sample:
#define graph
g <- make_ring(10) %>%
set_vertex_attr("name", value = LETTERS[1:10])
g
V(g)
#delete edges going to and from vertex C
g<-delete.edges(g, E(g)[2:3])
#find the head and tails of each edge in graph
heads<-head_of(g, E(g))
tails<-tail_of(g, E(g))
#list of all used vertices
combine<-unique(c(heads, tails))
#collect all vertices
v<-V(g)
#find vertices not in found set
toremove<-setdiff(v, combine)
#remove unwanted vertices
delete_vertices(g, toremove)
The basic process is to identify the start and end of all of the edges of interest, then compare this unique list with all of the vertices and remove the ones not in the unique list.
From your code above the graph "smaller" would be used to find the vertices.
Hope this helps.
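Put together with the edge filtering from the question, the whole pruning step could look like the sketch below. The ring graph and its weights are made up for illustration; only the threshold of 50 and the name smaller come from the question.

```r
library(igraph)

# Hypothetical stand-in for the asker's subgraph: a 5-node ring
g <- make_ring(5)
E(g)$weight <- c(10, 60, 70, 20, 5)  # invented weights, one per edge

# Step 1 (as in the question): drop the weak edges
smaller <- delete_edges(g, which(E(g)$weight < 50))

# Step 2: drop the vertices that no longer have any edge
pruned <- delete_vertices(smaller, which(degree(smaller) == 0))

vcount(pruned)  # vertices left after pruning
ecount(pruned)  # edges left after pruning
```

Passing pruned instead of smaller to ggraph should then leave only the nodes that still take part in an edge. Note that this also removes any vertex that was already isolated before filtering.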

Why igraph::cluster_walktrap gives a different result for non directed isomorphic graphs?

I'm trying to use igraph::cluster_walktrap in R to look for communities inside of a graph, however I noticed a weird behaviour (or at least, a behaviour I am not able to explain).
Suppose you are given an undirected graph by defining a list of its edges. Say
a,b
c,d
e,f
...
Then, if I define another graph by swapping randomly selected vertices in the edge list definition:
a,b
d,c
e,f
...
I expect the two graphs to be isomorphic and the difference between the two graphs to be empty. This is exactly what happens in R in my toy example. Following this line of reasoning, calling cluster_walktrap on the two graphs (using set.seed appropriately) should yield the same result, since the two graphs are the same. This is not happening, and the only explanation I can give is that the starting point of each random walk is not the same for the two graphs. Why is this?
You can follow my reasoning in the toy example below. I don't understand why the last two objects are not identical.
require(igraph)
# Number of vertices
verteces <- 50
# Swap randomly some elements in the edges definition
set.seed(20)
row_swapped <- sample(1:verteces,25,replace=F)
m_values <- sample(letters, verteces*2, replace=T) #1:100
# Build edge lists
m1 <- matrix(m_values, verteces, 2)
m1
a <- m1
colS <- seq(round(ncol(m1)*0.3))
m1[row_swapped, 2:1] <- m1[row_swapped, 1:2]
m1
b <- m1
# Define the two graphs
ag <- igraph::graph_from_edgelist(a, directed = F)
bg <- igraph::graph_from_edgelist(b, directed = F)
# Another way of building an isomorphic graph for testing
#bg <- permute(ag, sample(vcount(ag)))
# Should be empty: ok
difference(ag, bg)
# Should be TRUE: ok
isomorphic(ag,bg)
# I expect it to be TRUE but it isn't...
identical(ag, bg)
# Vertices
V(ag)
ag
V(bg)
bg
# Calculate community
set.seed(100)
ac1 <- cluster_walktrap(ag)
set.seed(100)
bc1 <- cluster_walktrap(bg)
# I expect all to be TRUE, however
# merges is different
# membership is different
# names are different
identical(ac1$merges, bc1$merges)
identical(ac1$modularity, bc1$modularity)
identical(ac1$membership, bc1$membership)
identical(ac1$names, bc1$names)
identical(ac1$vcount, bc1$vcount)
identical(ac1$algorithm, bc1$algorithm)
The results are not different. Two things are going on that make your graphs isomorphic but not identical. I emphasize identical because it has a very strict definition.
1) identical(ag, bg) is FALSE because the vertices and edges are not in the same order in the two graphs. Exactly the same nodes and edges exist, but they are not in the same place or orientation. For example, if I shuffle the rows of a and make a new graph...
a1 <- a[sample(1:nrow(a)), ]
a1g <- igraph::graph_from_edgelist(a1, directed = F)
identical(ag, a1g)
#[1] FALSE
2) This goes for edges as well. An edge is stored as node1, node2 and a flag for whether the edge is directed. So when you swap the two endpoints in a row, the representation at the "byte level" (I use this term loosely) is different even though the relationship is the same. Edge 44 represents the same relationship in both graphs but is stored according to how it was constructed.
E(ag)[44]
# + 1/50 edge from 6318240 (vertex names):
# [1] q--d
E(bg)[44]
# + 1/50 edge from 38042e0 (vertex names):
# [1] d--q
Now on to your cluster_walktrap call. First, the function reports membership by vertex index, not by vertex name, which can be misleading. The objects are therefore not identical because ag and bg store their nodes in a different order.
If I reorder the membership by node name the two become identical.
identical(membership(bc1)[order(names(membership(bc1)))], membership(ac1)[order(names(membership(ac1)))])
#[1] TRUE

Issue with plotting neighbors on igraph

I identify the neighbors of a selected node but haven’t been able to plot the result. Take the following example, copied from another question:
edgelist <- read.table(text = "
A B
B C
C D
D E
C F
F G")
library(igraph)
graph <- graph.data.frame(edgelist)
str(graph)
#IGRAPH DN-- 7 6 --
# + attr: name (v/c)
# + edges (vertex names):
# [1] A->B B->C C->D D->E C->F F->G
I identify the neighbors of "D" with:
neighborsD <- neighbors(graph, "D")
But when I instruct R to plot "neighborsD"...
plot(neighborsD)
... I get a chart instead of a sociogram, and when I try to tkplot it I get the error "not a graph object". So two questions:
1) How do I plot the network around, say, "D"?
2) How do I plot “D”, its neighbors, and the neighbors of the neighbors (two steps from "D")?
Use the ego() function to find the nodes that are within a certain distance of a node, and then use induced_subgraph to subset your main graph. For example, the nodes that are within 1 step are
plot(induced_subgraph(graph, ego(graph, 1, "D")[[1]]))
and those that are two steps away are
plot(induced_subgraph(graph, ego(graph, 2, "D")[[1]]))
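To see which vertices end up in each subgraph, the example from the question can be rebuilt as in the sketch below. The edge list is typed in as a data frame instead of being read with read.table, and graph_from_data_frame is used in place of the older graph.data.frame name. neighbors() returns a vertex sequence rather than a graph, which is why plot() could not draw it as a network; by default it also follows only outgoing edges, whereas ego() uses mode = "all" by default.

```r
library(igraph)

# Same edges as in the question: A->B, B->C, C->D, D->E, C->F, F->G
edgelist <- data.frame(
  from = c("A", "B", "C", "D", "C", "F"),
  to   = c("B", "C", "D", "E", "F", "G")
)
graph <- graph_from_data_frame(edgelist)

# Subgraphs around "D", ignoring edge direction (the default mode = "all")
one_step <- induced_subgraph(graph, ego(graph, 1, "D")[[1]])
two_step <- induced_subgraph(graph, ego(graph, 2, "D")[[1]])

sort(V(one_step)$name)  # "C" "D" "E"
sort(V(two_step)$name)  # "B" "C" "D" "E" "F"
```

Either subgraph can be passed straight to plot() or tkplot(), since both are ordinary igraph graph objects.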

Resources