I have an edgelist that has the following columns, from and to. This represents the edges between nodes.
from = c("10009", "10009", "10009", "10009", "10011", "10011", ...)
to = c("23908", "230908", "230908", "230908", "230514", "230514", ...)
edgelist = data.frame(from, to)
nodes = c("10009", "10011", "230908", "230514" ...)
I then created a network object and converted it to an igraph graph object to calculate its centrality measures:
library(network)
library(qgraph)
library(igraph)
library(intergraph) # provides asIgraph()
network_el = network(edgelist, vertex.attr = nodes, directed=T) #network object
g = asIgraph(network_el) #convert to graph object
centrality = centrality_auto(g) #calculate Centrality
df = data.frame(centrality$edge.betweenness.centrality) #extract edge betweenness centrality into a dataframe
This gives me a dataframe with the columns c("from", "to", "centrality"). However, "from" and "to" no longer contain the original node names listed in edgelist. They have been replaced by integer IDs that run from 1 upward.
#my current results
from = c("1","2","3","4"...)
to = c("6", "100", "204", ...)
edge.betweenness.centrality = c(4653193, 20188105, ...)
How do I merge back the original node names? I need to identify the actual "from" and "to" (i.e., the node data), such as:
#my desired results
from = c("10009", "10009", "10009", "10009", "10011"...) #rather than 1,2,3..
to = c("23908", "230908", "230908", "230908", "235014",...)
edge.betweenness.centrality = c(4653193, 20188105, ...)
I think this works - the node IDs are now back in place!
#Assign as dictionary
dict <- data.frame(name = V(g)$vertex.names)
dict <- data.frame (row.names(dict), dict)
#Replace with dictionary values
df$from <- with(dict, name[match(df$from, row.names.dict.)])
df$to <- with(dict, name[match(df$to, row.names.dict.)])
Here's an example using igraph:
library(igraph)
We create an example graph with explicit vertex names:
g <- sample_pa(length(letters), m=2, directed=F)
V(g)$name <- letters
Compute edge betweenness and save it to an edge attribute:
E(g)$eb <- edge_betweenness(g)
Now the dataframe you asked for, with the original vertex names:
> as_data_frame(g)
from to eb
1 a b 12.831818
2 a c 16.524242
3 b c 25.700000
4 c d 30.812879
5 b d 12.464394
...
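A related sketch, assuming the edge list can be read straight into igraph with graph_from_data_frame(), which stores the values of the first two columns as vertex names. ends() then returns the endpoint names of every edge, so they can be bound to the betweenness values directly. The three-row edge list is made up for illustration and only reuses IDs from the question.

```r
library(igraph)

# Hypothetical three-row edge list; the IDs are borrowed from the question
edgelist <- data.frame(
  from = c("10009", "10009", "10011"),
  to   = c("230908", "230514", "230514"),
  stringsAsFactors = FALSE
)

# The first two columns become the vertex names
g <- graph_from_data_frame(edgelist, directed = TRUE)

# ends() returns a character matrix holding the endpoint names of each edge
endpoints <- ends(g, E(g))
res <- data.frame(
  from = endpoints[, 1],
  to   = endpoints[, 2],
  edge.betweenness.centrality = edge_betweenness(g),
  stringsAsFactors = FALSE
)
res
```

Because the names never leave the graph object, no separate lookup table is needed in this variant.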
I'm out of my depth when it comes to network graphs, but I have a table of ~6300 From/To links similar to the data frame df given below. Each vertex has a binary property called status.
What I would like to do is determine all of the vertices that are upstream of a vertex where status = 1. How would I do this in igraph? I've looked at data.tree, but my data are not necessarily a single-root tree and "loops" are possible.
In the example below, this would mean that vertices Z, R, S, M, and K should have status = 1 (i.e. be orange in the plot), as they are upstream of Q, L, I, respectively.
library(igraph)
df <- data.frame(from = c("D","B","A","Q","Z","L","M","R","S","T","U","H","I","K"),
to = c("O","D","B","B","Q","O","L","Q","R","O","T","J","J","I"),
stringsAsFactors = FALSE
)
vertices <- data.frame(vertex = unique(c(df[,1], c(df[,2]))),
status = c(0,0,0,1,0,1,0,0,0,0,0,1,
1,0,0,0))
g <- graph_from_data_frame(df, vertices = vertices, directed = T)
plot(g, vertex.color = vertex_attr(g, "status"))
You can use subcomponent with mode='in'.
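A minimal sketch of that suggestion, reusing the data from the question. It assumes "upstream" means every vertex from which a status = 1 vertex can be reached along directed edges; the names seeds and upstream are introduced here for illustration only.

```r
library(igraph)

df <- data.frame(
  from = c("D","B","A","Q","Z","L","M","R","S","T","U","H","I","K"),
  to   = c("O","D","B","B","Q","O","L","Q","R","O","T","J","J","I"),
  stringsAsFactors = FALSE
)
vertices <- data.frame(
  vertex = unique(c(df$from, df$to)),
  status = c(0,0,0,1,0,1,0,0,0,0,0,1,1,0,0,0)
)
g <- graph_from_data_frame(df, vertices = vertices, directed = TRUE)

# Names of the vertices that already have status = 1
seeds <- V(g)$name[V(g)$status == 1]

# For each seed, collect every vertex that can reach it (the seed itself included)
upstream <- unique(unlist(lapply(seeds, function(v) {
  subcomponent(g, v, mode = "in")$name
})))

# Mark the seeds and everything upstream of them
V(g)$status <- as.numeric(V(g)$name %in% upstream)
V(g)$name[V(g)$status == 1]
```

Since subcomponent() is a reachability search, cycles in the data do not cause problems here.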
I am using igraph from R. I know we can make a subgraph with selected vertices but if those nodes aren’t directly connected, there won’t be an edge in the new subgraph. Is there a way to make a subgraph which creates an edge between two nodes if there are other nodes (that are not a part of the vertex list) indirectly connecting those two nodes?
For example, if I have a graph which has the following edges:
E-F
F-G
And my vertex list contains E and G, how can I create a new subgraph that creates that edge E-G?
Thank you!!!
One way to find neighbors that are two steps away is to multiply the adjacency matrix with itself (see comments here for example).
First create the graph described in the question:
library(igraph)
g <- graph_from_literal(E--F, F--G)
Then take the adjacency matrix (m) and multiply it with itself.
m <- get.adjacency(g, sparse = F)
m2 <- m %*% m
Build a new graph from the resulting adjacency matrix and remove all vertices that have a degree of 0 (those with no second-degree neighbor):
g2 <- graph_from_adjacency_matrix(m2, diag = F, mode = "undirected")
induced_subgraph(g2, degree(g2) > 0)
#> IGRAPH 089bf67 UN-- 2 1 --
#> + attr: name (v/c)
#> + edge from 089bf67 (vertex names):
#> [1] E--G
Created on 2022-08-26 with reprex v2.0.2
Building upon the suggestions in the comments, I arrive at:
require(igraph)
set.seed(1)
g <- erdos.renyi.game(2^6, 1/32)
V(g)$name <- seq(vcount(g))
filter <- c(7,22, 1, 4, 6)
amg <- g[] # adjacency matrix g
clg <- clusters(g)$membership # component membership (g is undirected, so these are plain connected components)
amtc <- clg[row(amg)] == clg[col(amg)] # adjacency matrix of transitive closure
dim(amtc) <- dim(amg)
gtc <- simplify(graph.adjacency(amtc, mode="undirected")) # transitive closure of g
V(gtc)$name <- V(g)$name
isg <- induced_subgraph(gtc, filter)
plot(isg)
However, this solution is not feasible if g is large and the subgraph is significantly smaller.
If the subgraph is much smaller than the original graph, compute the closure only for the filtered vertices:
require(igraph)
set.seed(1)
g <- erdos.renyi.game(2^6, 1/32)
V(g)$name <- seq(vcount(g))
filter <- c(1, 4, 6, 7, 22, 25)
stopifnot(!is.directed(g)) # assume undirected graph
mscc <- components(g)$membership[filter] # component membership of the filtered vertices
amfi <- outer(X=mscc, mscc, FUN = "==")*1 # cross product = 1, when equal
fitc <- simplify(graph.adjacency(amfi, mode="undirected")) # transitive closure of filter in g
plot(fitc)
Building on Szabolcs's suggestion, note that connect(g, vcount(g)) computes the transitive closure of g. However, this is not suitable for larger graphs (vcount > 8192).
require(igraph)
g <- make_graph(~ E-G, G-F)
fi <- c("E", "F")
system.time(tcg <- connect(g, vcount(g)) )
sg <- subgraph(tcg, V(tcg)[fi])
sg
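Another possible sketch for the undirected case, assuming that two selected vertices should be joined whenever any path connects them in the original graph: distances() restricted to the selected vertices is finite exactly for the connected pairs. The extra edge H--I and the name keep are made up here to show a pair that stays unconnected.

```r
library(igraph)

# Graph from the question plus a hypothetical second component H--I
g <- graph_from_literal(E--F, F--G, H--I)
keep <- c("E", "G", "H")

# Pairwise shortest-path lengths between the selected vertices only
d <- distances(g, v = keep, to = keep)

# Connect two selected vertices when a path of positive, finite length exists
adj <- (is.finite(d) & d > 0) * 1
sg <- graph_from_adjacency_matrix(adj, mode = "undirected", diag = FALSE)
sg
```

Only a length(keep) by length(keep) matrix is built, so this avoids the full closure of g when the selection is small.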
I have a weighted graph in the igraph R environment.
I need to obtain sub-graphs recursively, starting from any random node. The sum of weights in each sub-graph has to be less than a given number.
The depth-first search algorithm seems suited to this problem, as does the random walk function.
Does anybody know which igraph function could tackle this?
This iterative function finds the sub-graph, grown from the starting vertex (the vertex argument) of any undirected graph, that has the largest possible weight-sum below the value specified in limit.
A challenge in finding such a graph is the computational load of evaluating the weight sum of every possible sub-graph. Consider this example, where one iteration has found a sub-graph A-B with a weight sum of 1.
The shortest path to any new vertex is A-C (with a weight of 3); a sub-graph of A-B-D has a weight-sum of 6, while A-B-C would have a weight-sum of 12 because the edge B-C is also included in the sub-graph.
The function below looks ahead: at each step it enlarges the sub-graph with the vertex that results in the lowest sub-graph weight-sum, rather than the vertex with the shortest direct path.
In terms of optimisation, this leaves something to be desired, but I think it does what you requested in your first question.
find_maxweight_subgraph_from <- function(graph, vertex, limit=0, sub_graph=c(vertex), current_ws=0){
# Keep a shortlist of possible edges to go next
shortlist = data.frame(k=integer(0),ws=numeric(0))
limit <- min(limit, sum(E(graph)$weight))
while(current_ws < limit){
# To find the next possible vertexes to include, a listing of
# potential candidates is computed to be able to choose the most
# efficient one.
# Each iteration chooses amongst vertices that are connected to the sub-graph:
adjacents <- as.vector(adjacent_vertices(graph, vertex, mode="all")[[1]])
# A shortlist of possible enlargements of the sub-graph is kept to be able
# to compare each potential enlargement of the sub-graph and always choose
# the one which results in the smallest increase of sub-graph weight-sum.
#
# The shortlist is enlarged by vertices that are:
# 1) adjacent to the latest added vertex
# 2) not already IN the sub-graph
new_k <- adjacents[!adjacents %in% sub_graph]
shortlist <- rbind(shortlist[!is.na(shortlist$k),],
data.frame(k = new_k,
ws = rep(Inf, length(new_k)) )
)
# The addition to the weight-sum is NOT calculated by the weight on individual
# edges leading to vertices on the shortlist BUT on the ACTUAL weight-sum of
# a sub-graph that would be the result of adding a vertex `k` to the sub-graph.
shortlist$ws <- sapply(shortlist$k, function(x) sum( E(induced_subgraph(graph, c(sub_graph,x)))$weight ) )
# We choose the vertex with the lowest impact on weight-sum:
shortlist <- shortlist[order(shortlist$ws),]
vertex <- shortlist$k[1]
current_ws <- shortlist$ws[1]
shortlist <- shortlist[-1, , drop = FALSE] # drop the chosen row; 2:nrow() would count backwards when only one row is left
# Each iteration adds a new vertex to the sub-graph
if(current_ws <= limit){
sub_graph <- c(sub_graph, vertex)
}
}
(induced_subgraph(graph, sub_graph))
}
# Test function using a random graph
g <- erdos.renyi.game(16, 30, type="gnm", directed=F)
E(g)$weight <- sample(1:1000/100, length(E(g)))
sum(E(g)$weight)
plot(g, edge.width = E(g)$weight, vertex.size=2)
sg <- find_maxweight_subgraph_from(g, vertex=12, limit=60)
sum(E(sg)$weight)
plot(sg, edge.width = E(sg)$weight, vertex.size=2)
# Test function using your example code:
g <- make_tree(10, children = 2, mode = c("undirected"))
s <- seq_len(ecount(g)) # one weight per edge (this tree has 9 edges)
g <- set_edge_attr(g, "weight", value= s)
plot(g, edge.width = E(g)$weight)
sg <- find_maxweight_subgraph_from(g, 2, 47)
sum(E(sg)$weight)
plot(sg, edge.width = E(sg)$weight)
My attempt is below; however, it does not seem to be effective.
#######Example code
g <- make_tree(10, children = 2, mode = c("undirected"))
s <- seq_len(ecount(g)) # one weight per edge (this tree has 9 edges)
g <- set_edge_attr(g, "weight", value= s)
plot(g)
is_weighted(g)
E(g)$weight
threshold <- 5
eval <- function(r){
#r <- 10
Vertice_dfs <- dfs(g, root = r)
Sequencia <- as.numeric(Vertice_dfs$order)
for (i in 1:length(Sequencia)) {
#i <- 2
# function callback by vertice to dfs
f.in <- function(graph, data, extra) {
data[1] == Sequencia[i]-1
}
# DFS algorithm to the function
dfs <- dfs(g, root = r,in.callback=f.in)
# Vertices returned by the DFS
dfs_eges <- na.omit(as.numeric(dfs$order))
# Resulting subgraph
g2 <- induced_subgraph(g, dfs_eges)
# Total weight of subgraph g2
T_W <- sum(E(g2)$weight)
if (T_W > threshold) {
print(T_W)
return(T_W)
break
}
}
}
#search by vertice
result <- lapply(1:length(V(g)),eval)
I am trying to make a network visualization for calling activity from a manager to store locations. The only problem is I keep getting the error "Duplicate Vertex IDs". I need to have multiple of the same vertex IDs as one manager has called more than one store. How do I get around this?
My edges data is organized as follows:
from to weight
12341 1 5
12341 2 4
23435 1 3
My node data includes only the from column:
from
12341
12341
23435
This was the code I tried to run:
MANAGER_LOC <- graph_from_data_frame(d = edges, vertices = nodes,
directed = TRUE)
You are getting the duplicate vertex ID error because vertices = must reference unique node data. You could use unique(nodes), but this will give you another error, because nodes 1 and 2, which you reference in your edge list, are not included in your nodes data.
Your node data cannot include only the unique values from edges$from; it must include all unique values from both edges$from and edges$to, because you are passing edge list data to the graph_from_data_frame() function.
So in edges$to you also need to reference vertices by their names, as in edges$from, e.g. 12341 or 23435.
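As a quick check of that point, the sketch below rebuilds the three example rows from the question, lists the endpoints that are missing from the node data, and then builds a vertex table that covers every endpoint exactly once. The names missing_ids and all_nodes are made up for illustration.

```r
library(igraph)

# The example data from the question
edges <- data.frame(
  from   = c(12341, 12341, 23435),
  to     = c(1, 2, 1),
  weight = c(5, 4, 3)
)
nodes <- data.frame(from = c(12341, 12341, 23435))

# Endpoints used in the edges but absent from the de-duplicated node data
missing_ids <- setdiff(unique(c(edges$from, edges$to)), unique(nodes$from))
missing_ids

# Vertex table listing every endpoint exactly once
all_nodes <- data.frame(name = unique(c(edges$from, edges$to)))
MANAGER_LOC <- graph_from_data_frame(d = edges, vertices = all_nodes, directed = TRUE)
MANAGER_LOC
```

If the node table carries extra columns (for example a manager or store label), they would need one row per unique ID as well.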
Here is some R code that may cover what you are trying to achieve.
#graph from your data frame
MANAGER_LOC <- graph_from_data_frame(
d = edges
,vertices = unique(c(edges$from, edges$to))
,directed = TRUE);
#plot also includes vertices 1 and 2
plot(
x = MANAGER_LOC
,main = "Plot from your edges data");
#plot from your data assuming you are referencing an id in edges$to
MANAGER_LOC <- graph_from_data_frame(
d = merge(
x = edges
,y = data.frame(
to_vertice_id = 1:length(unique(edges$from))
,to_vertice = unique(edges$from))
,by.x = "to"
,by.y = "to_vertice_id"
,all.x = T)[,c("from","to_vertice","weight")]
,vertices = unique(edges$from)
,directed = TRUE);
#plot does not include vertices 1 and 2
plot(
x = MANAGER_LOC
,main = "Plot assuming vertice ID
reference in edges$to");
#plot from your data assuming you are referencing the xth value of edges$from in edges$to
MANAGER_LOC <- graph_from_data_frame(
d = merge(
x = edges
,y = data.frame(
to_vertice_ref = 1:nrow(edges)
,to_vertice = edges$from)
,by.x = "to"
,by.y = "to_vertice_ref"
,all.x = T)[,c("from","to_vertice","weight")]
,vertices = unique(edges$from)
,directed = TRUE);
#plot does not include vertices 1 and 2
plot(
x = MANAGER_LOC
,main = "Plot assuming edges$from
reference in edges$to");
Problem
I have two separate networks with no overlapping nodes or edges, they both have the same attributes. I want to combine these two networks into a single network which would then be made up of two distinct components.
However when I try to merge them using the union command the attributes are renamed from "attribute" to "attribute_1" and "attribute_2". That this will happen is stated in the command help file, but I cannot find an obvious way to merge these two networks.
The situation is shown in the below code block
library(igraph)
#create a 4 node network of two components
adjmat <- rep(0, 16)
adjmat[c(2,5,12,15)] <- 1
g <- graph.adjacency(matrix(adjmat, nrow = 4) , mode = "undirected")
#give attributes naming the nodes and the edges
g <- set_vertex_attr(g, "name", value = paste0("Node_", 1:4))
g <- set_edge_attr(g, "name", value = paste0("Edge_",1:2))
#I am interested in the type attribute
g <- set_edge_attr(g, "type", value = c("foo", "bar"))
plot(g)
#Decompose into separate networks
gList <- decompose(g)
g2 <- union(gList[[1]], gList[[2]])
#vertices are fine but edges have been renamed as stated in the help file for union.
get.edge.attribute(g2)
get.vertex.attribute(g2)
Work around
Currently the two separate networks originate from the same original network, so I have been able to make a hack. However, this isn't always the case, and I would like a more igraph way of merging the two.
The hack is below
#To solve this problem I do the following
#Create two dataframes from the edge characteristics of the network and combine into a single dataframe
P <- rbind(as_data_frame(gList[[1]]),
as_data_frame(gList[[2]]))
g3 <- set.edge.attribute(g, "type", value = P$type[match(P$name, get.edge.attribute(g, "name"))])
#Edges are now correct
get.edge.attribute(g3)
get.vertex.attribute(g3)
Is there a function in igraph that would merge the two separate networks into a single network whilst maintaining the attributes as is?
I have made the below version of union, which accepts two graphs with an arbitrary number of overlapping attributes and merges them into a single graph where the attributes do not have the "_x" suffix. The graphs can be entirely independent or have overlapping nodes.
In the case of overlapping nodes, the attributes of graph 1 take precedence.
library(dplyr)
library(igraph)
union2<-function(g1, g2){
#Internal function that cleans the names of a given attribute
CleanNames <- function(g, target){
#get target names
gNames <- parse(text = (paste0(target,"_attr_names(g)"))) %>% eval
#find names that have a "_1" or "_2" at the end
AttrNeedsCleaning <- grepl("(_\\d)$", gNames )
#remove the _x ending
StemName <- gsub("(_\\d)$", "", gNames)
NewnNames <- unique(StemName[AttrNeedsCleaning])
#replace attribute name for all attributes
for( i in NewnNames){
attr1 <- parse(text = (paste0(target,"_attr(g,'", paste0(i, "_1"),"')"))) %>% eval
attr2 <- parse(text = (paste0(target,"_attr(g,'", paste0(i, "_2"),"')"))) %>% eval
g <- parse(text = (paste0("set_",target,"_attr(g, i, value = ifelse(is.na(attr1), attr2, attr1))"))) %>%
eval
g <- parse(text = (paste0("delete_",target,"_attr(g,'", paste0(i, "_1"),"')"))) %>% eval
g <- parse(text = (paste0("delete_",target,"_attr(g,'", paste0(i, "_2"),"')"))) %>% eval
}
return(g)
}
g <- igraph::union(g1, g2)
#loop through each attribute type in the graph and clean
for(i in c("graph", "edge", "vertex")){
g <- CleanNames(g, i)
}
return(g)
}
Using the previous example
g4 <-union2(gList[[1]], gList[[2]])
#As we would like
get.edge.attribute(g4)
get.vertex.attribute(g4)
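For the case in the question, where the graphs share no vertices, a simpler sketch is to go through data frames: stack the edge and vertex tables of each graph and rebuild one graph from them. This is an alternative to union2, not a replacement for it, and it assumes every graph carries the same attribute columns and that vertex names are unique across graphs. The names edge_df, vertex_df and merged are made up here.

```r
library(igraph)

# Rebuild the two-component example from the question
adjmat <- rep(0, 16)
adjmat[c(2, 5, 12, 15)] <- 1
g <- graph_from_adjacency_matrix(matrix(adjmat, nrow = 4), mode = "undirected")
V(g)$name <- paste0("Node_", 1:4)
E(g)$name <- paste0("Edge_", 1:2)
E(g)$type <- c("foo", "bar")
gList <- decompose(g)

# Stack the edge and vertex tables of all graphs (same columns assumed in each)
edge_df   <- do.call(rbind, lapply(gList, igraph::as_data_frame, what = "edges"))
vertex_df <- do.call(rbind, lapply(gList, igraph::as_data_frame, what = "vertices"))

# Rebuild a single graph; attribute columns keep their original names
merged <- graph_from_data_frame(edge_df, directed = FALSE, vertices = vertex_df)
edge_attr_names(merged)
```

With overlapping vertices the stacked vertex table would contain duplicates, so that situation still needs something like union2.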