Reduce number of nodes/edges of a Graph in networkx - graph

I have a Graph with many nodes of degree 2 (derived from a LineString). To simplify the Graph, I would like to reduce it to just the nodes with a degree not equal to 2 while still containing the same overall connections. You can find an example of what I mean in the picture below.
So if there are multiple nodes with degree 2 between two nodes of degree 3, all the nodes and edges in the middle should be deleted, and a single connection between the two degree-3 nodes should be established whose weight is the sum of the weights of the omitted edges.
Example Picture of reduced Graph

You can identify chains by 1) inducing a subgraph only containing nodes with degree 2, and 2) then identifying the individual components in the induced subgraph. Then it is a simple matter of summing the weights in each chain and creating a new edge with that weight between the nodes connecting to the end points of the chain.
import numpy as np
import matplotlib.pyplot as plt
import networkx as nx
def contract(g):
    """
    Contract chains of neighbouring vertices with degree 2 into a single edge.

    Arguments:
    ----------
    g -- networkx.Graph or networkx.DiGraph instance

    Returns:
    --------
    h -- networkx.Graph or networkx.DiGraph instance
        the contracted graph
    """
    # create subgraph of all nodes with degree 2
    is_chain = [node for node, degree in g.degree() if degree == 2]
    chains = g.subgraph(is_chain)

    # contract connected components (which should be chains of variable length) into single node
    # (nx.connected_component_subgraphs was removed in networkx 2.4)
    components = [chains.subgraph(c) for c in nx.connected_components(chains)]

    hyper_edges = []
    for component in components:
        end_points = [node for node, degree in component.degree() if degree < 2]
        candidates = set([neighbor for node in end_points for neighbor in g.neighbors(node)])
        connectors = candidates - set(list(component.nodes()))
        hyper_edge = list(connectors)
        # note: only edges inside the chain are summed here; the two edges
        # joining the chain to its connectors are not included
        weights = [component.get_edge_data(*edge)['weight'] for edge in component.edges()]
        hyper_edges.append((hyper_edge, np.sum(weights)))

    # initialise new graph with all other nodes
    not_chain = [node for node in g.nodes() if node not in is_chain]
    h = g.subgraph(not_chain).copy()
    for hyper_edge, weight in hyper_edges:
        h.add_edge(*hyper_edge, weight=weight)

    return h
# create weighted graph
edges = np.random.randint(0, 20, size=(int(400*0.2), 2))
weights = np.random.rand(len(edges))
g = nx.Graph()
for edge, weight in zip(edges, weights):
    g.add_edge(*edge, weight=weight)
h = nx.algorithms.minimum_spanning_tree(g)
# contract
i = contract(h)
# plot
pos = nx.spring_layout(h)
fig, (ax1, ax2) = plt.subplots(1, 2, sharex=True, sharey=True)
nx.draw(h, pos=pos, ax=ax1)
nx.draw(i, pos=pos, ax=ax2)
plt.show()

Note: nx.connected_component_subgraphs was deprecated in networkx 2.1 and removed in 2.4. On current versions, components can be defined as:
components = [chains.subgraph(c) for c in nx.components.connected_components(chains)]
https://networkx.org/documentation/networkx-2.1/reference/algorithms/generated/networkx.algorithms.components.connected_component_subgraphs.html
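The question asks for the new edge to carry the sum of all omitted edges, which includes the two edges that attach a chain to the rest of the graph. A minimal sketch of that variant follows, assuming an undirected networkx.Graph in which every edge has a weight attribute. The name contract_chains is made up for illustration, and chains that do not end in exactly two distinct outside nodes (for example an isolated cycle) are simply skipped. In a plain Graph, a new edge between two nodes that are already directly connected would overwrite the existing edge.

```python
import networkx as nx


def contract_chains(g, weight="weight"):
    """Replace each chain of degree-2 nodes by one edge with the summed weight.

    Assumes an undirected networkx.Graph whose edges all carry `weight`.
    """
    chain_nodes = {n for n, d in g.degree() if d == 2}
    h = g.subgraph(set(g) - chain_nodes).copy()

    for comp in nx.connected_components(g.subgraph(chain_nodes)):
        # Every edge touching the chain, including the two edges that
        # attach it to the rest of the graph.
        touching = list(g.edges(comp, data=weight))
        total = sum(w for _, _, w in touching)
        # Nodes outside the chain that the chain attaches to.
        ends = {n for u, v, _ in touching for n in (u, v)} - comp
        if len(ends) != 2:
            # Isolated cycle, or a chain looping back to a single node:
            # a single simple edge cannot represent it, so skip it here.
            continue
        a, b = ends
        h.add_edge(a, b, **{weight: total})
    return h


# Chain 0 - 1 - 2 - 3 with weights 1, 2, 3; nodes 0 and 3 get two extra
# neighbours each so that they have degree 3.
g = nx.Graph()
g.add_weighted_edges_from([(0, 1, 1.0), (1, 2, 2.0), (2, 3, 3.0)])
g.add_weighted_edges_from([(0, 10, 1.0), (0, 11, 1.0), (3, 12, 1.0), (3, 13, 1.0)])
h = contract_chains(g)
```

In this small example, nodes 1 and 2 disappear and h gains one edge between 0 and 3 whose weight is the sum of the three chain edges.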

Related

How do I see the two nodes of the most weighted edges in R igraph?

I have an igraph network object constructed in R and have generated weight information for each edge. I want to see the nodes of the most heavily weighted edges (in descending order). What code should I use to do that? Thank you!
# create an igraph object of the user interaction network and check descriptives.
library(igraph)
#edge list
EL = read.csv("(file path omitted)user_interaction_structure.csv")
head(EL)
#node list: I do not have a node list
#construct an igraph object
g <- graph_from_data_frame(EL, directed = TRUE, vertices = NULL)
#check the edge and node number of the network
gsize(g)
vcount(g)
#check nodes based on degree (descending)
deg <- igraph::degree(g)
dSorted <-sort.int(deg,decreasing=TRUE,index.return=FALSE)
dSorted
#check edges based on weight
E(g)
#the network will contain loop edges and multiple edges
#simplify multiple edges
g_simple <- graph.adjacency(get.adjacency(g),weighted=TRUE)
#check edge weight
E(g_simple)$weight
#igraph can generate a matrix
g_simple[]
Then I wanted to see who was interacting heavily with whom (the nodes of the edges with the largest weight), so I tried
e_top_weights <- order(order(E(g_simple))$weight, decreasing=TRUE)
but it did not work.
I think what you want is the igraph function strength(), which gives the sum of the weights of the edges incident to each node. Here's an example:
library(igraph)
# A small graph we can visualize
g <- make_ring(5)
# Assign each edge an increasing weight, to make things
# easy
edgeweights<- 1:ecount(g)
E(g)$weight <- edgeweights
# The strength() function sums the weights of edges incident
# to each node
strengths <- strength(g)
# We can collect the top two strengths by sorting the
# strengths vector, then asking for which elements of the
# strengths vector are equal to or greater than the second
# largest element.
toptwo <- which(strengths >= sort(strengths, decreasing = TRUE)[2])
## [1] 4 5
# Assign nodes a color blue that is more saturated when nodes
# have greater strength.
cr <- colorRamp(c(rgb(0,0,1,.1), rgb(0,0,1,1)), alpha = TRUE)
colors <- cr(strengths/max(strengths))
V(g)$color <- apply(colors, 1, function(row) rgb(row[1], row[2], row[3], row[4], maxColorValue = 255))
# Plot to confirm
plot(g, edge.width = edgeweights)
Edit
Here are two different ways to find the two nodes (the "from" node and the "to" node) which are the ends of the edge with the maximum weight:
## 1
edge_df <- as_data_frame(g, "edges")
edge_df[which(edge_df$weight == max(edge_df$weight)), c("from", "to")]
## 2
max_weight_edge <- E(g)[which(E(g)$weight == max(E(g)$weight))]
ends(g, es = max_weight_edge)

How to get the length of the lines representing edges in the plot of a graph after laying it out using networkx

For a graph in networkx, I have made a layout to draw a network graph using code below:
data = pd.read_csv('data\\email-dept3.csv')
edges = [edge for edge in zip(data['source'],data['target'])]
print(len(edges))
G = nx.Graph()
G.add_edges_from(edges)
node_pos = nx.kamada_kawai_layout(G)
# I want to get the edge length as an attribute, but I don't know how to write this function
edge_length = calculate_edge_length()
nx.draw_networkx_nodes(G,node_pos,**options)#draw nodes
[nx.draw_networkx_edges(G,node_pos,edgelist=[key],alpha=np.amin([1,value*100]),width=2) for key,value in cent.items()]
plt.show()
And the result is:
What I want is the length of every edge in this graph. After layout, every node has a position on the screen, so each edge has a length determined by the positions of its two nodes. But I can't find a method in networkx's API that returns an edge's length, and I don't know how to calculate this value myself.
If you need more information, please contact me.
I am trying all kinds of methods to adjust the transparency of edges. The length of the line is one of the things I am considering.
Interesting idea! Seems like a worthwhile experiment; I'll let you decide if it works well or not. :-)
But in networkx's API, I can't find the method to get the edge's length
I think you have to compute them yourself. Fortunately, that's not too hard. Here's an example.
import numpy as np
import pandas as pd
import networkx as nx
import matplotlib.pyplot as plt
plt.rcParams["figure.figsize"] = (10,10)
def example_graph():
    """
    Return the classic Karate Club network, but give text labels to the nodes.
    """
    labels = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
    kg = nx.karate_club_graph()
    edges = [(labels[i], labels[j]) for i,j in kg.edges()]
    G = nx.Graph()
    G.add_edges_from(edges)
    return G
# Test network
G = example_graph()
# Determine layout node positions
node_pos = nx.kamada_kawai_layout(G)
# Determine edge distances (from the node positions)
node_pos_df = pd.DataFrame(node_pos.values(), columns=['x', 'y'], index=node_pos.keys())
node_pos_df = node_pos_df.rename_axis('label').sort_index()
edges = np.array(G.edges())
u_pos = node_pos_df.loc[edges[:, 0]].values
v_pos = node_pos_df.loc[edges[:, 1]].values
distances = np.linalg.norm(u_pos - v_pos, axis=1)
## Optional: Add the distances as edge attributes
#edge_distances = {(u,v): d for (u,v), d in zip(G.edges(), distances)}
#nx.set_edge_attributes(G, edge_distances, "layout_distance")
# Compute alpha: Set 0.15 as minimum alpha, 1.0 as maximum alpha
d_min, d_max = distances.min(), distances.max()
alphas = 1.0 - 0.85 * (distances - d_min) / (d_max - d_min)
# Draw graph
nx.draw_networkx_nodes(G, node_pos)
nx.draw_networkx_edges(G, node_pos, edgelist=G.edges(), alpha=alphas, width=2)
plt.show()
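If only the lengths are needed, they can also be computed without pandas. A minimal sketch, assuming a 2D layout given as a mapping from node to (x, y), which is the shape of what nx.kamada_kawai_layout returns. The helper name edge_lengths is made up for illustration, and the positions below are hand-made stand-ins for a real layout.

```python
import math


def edge_lengths(edges, pos):
    """Map each edge (u, v) to the Euclidean distance between pos[u] and pos[v].

    Assumes 2D positions.
    """
    lengths = {}
    for u, v in edges:
        (x1, y1), (x2, y2) = pos[u], pos[v]
        lengths[(u, v)] = math.hypot(x2 - x1, y2 - y1)
    return lengths


# Hand-made positions standing in for a layout result.
pos = {"a": (0.0, 0.0), "b": (3.0, 4.0), "c": (3.0, 0.0)}
lengths = edge_lengths([("a", "b"), ("b", "c")], pos)
```

With a networkx graph this would be called as edge_lengths(G.edges(), node_pos), and the result could be stored on the graph with nx.set_edge_attributes(G, lengths, "layout_distance").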

Trying to pad adjacency matrix of networkx graph with 0's

I have a list of networkX graphs gSet that are of varying sizes. I want to add isolated nodes to all of them such that they all have the same number of nodes thereby padding their adjacency matrices on the right and bottom with 0's. This is what I have tried thus far, maxNodes is the number of nodes in the largest graph from the list:
for i in range(0, len( gSet )):
    numOfNodes = nx.to_numpy_array( gSet[i] ).shape[0]
    for j in range(maxNodes - numOfNodes, maxNodes ):
        gSet[i].add_node(j)
This doesn't seem to change all the graphs to be the same size however.
import numpy as np
import networkx as nx

# collection of dummy graphs:
gSet = []
for _ in range(10):
    size = np.random.randint(1,8)
    G = nx.from_numpy_array(np.random.rand(size,size)>0.8)
    gSet.append(G)

# check number of nodes in each graph:
print('before:')
for g in gSet:
    print(len(g))

# find number of nodes in graph with most nodes:
max_nodes = max([len(g) for g in gSet])

# graphs built with nx.from_numpy_array have nodes numbered from 0 to len(g) - 1,
# so len(g) is the smallest non-negative integer not yet used as a node name.
for g in gSet:
    while len(g) < max_nodes:
        g.add_node(len(g))

# check number of nodes in each graph:
print('after:')
for g in gSet:
    print(len(g))
gSet.add_node(j)
This looks to be incorrect. You want to add the extra node to ONE of the graphs in gSet.
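One likely reason the loop in the question falls short: range(maxNodes - numOfNodes, maxNodes) always yields numOfNodes labels, not maxNodes - numOfNodes of them, and some of those labels may already exist. For a graph with 2 nodes and maxNodes = 5 it adds only labels 3 and 4, leaving 4 nodes. A minimal sketch of the padding, assuming integer node labels 0..len(g)-1 as produced by nx.from_numpy_array; the helper name pad_with_isolated_nodes is made up for illustration.

```python
import networkx as nx


def pad_with_isolated_nodes(g, target):
    """Add isolated nodes to `g` in place until it has `target` nodes.

    Assumes the existing node labels are exactly 0..len(g)-1.
    """
    g.add_nodes_from(range(len(g), target))
    return g


graphs = [nx.path_graph(n) for n in (2, 5, 3)]
max_nodes = max(len(g) for g in graphs)
for g in graphs:
    pad_with_isolated_nodes(g, max_nodes)

# Every adjacency matrix now has the same shape; the added rows and
# columns contain only zeros because the new nodes have no edges.
shapes = [nx.to_numpy_array(g).shape for g in graphs]
```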

Visualizing agents using Mesa & Networkx

I'm currently doing multi-agent path finding using Mesa and Networkx. The nodes represent locations where only 1 agent can reside at any one time. The edges represent the distance between the nodes. How do I visualize the movement of agents along the edges at each time step? For example, at time step = 4, Agent A is in the middle of the edge connecting nodes 1 and 2.
I guess you want to plot some agent traversing between the nodes using networkx and matplotlib.
import matplotlib.pyplot as plt
import networkx as nx
import matplotlib.animation as animation
import matplotlib
import numpy as np
matplotlib.use('TkAgg')
plt.ion()
H = nx.octahedral_graph() # generate a small example graph
pos = nx.spring_layout(H, iterations=200) # find good positions for nodes
In order to do that, first we need to know the position of the agent in each step or frame while traversing. If we assume that there are 50 steps between each node (or on each edge), we can write a generator to update the agent's position in each frame:
def traverse(graph, start, end, steps_between_nodes=50):
    """Generate the new position of the agent.

    Uses the module-level `pos` computed above for the node coordinates.

    :param graph: the graph you want to put your agent to traverse on.
    :param start: the node to start from.
    :param end: the node to end at.
    :param steps_between_nodes: number of steps on each edge.
    """
    steps = np.linspace(0, 1, steps_between_nodes)
    # find the best path from start to end
    path = nx.shortest_path(graph, source=start, target=end)
    stops = np.empty((0, 2))

    for i, j in zip(path[1:], path):
        # get the position of the agent at each step (moving from j to i)
        new_stops = steps[..., None] * pos[i] + (1 - steps[..., None]) * pos[j]
        stops = np.vstack((stops, new_stops))

    for s in stops:
        yield s
Then we can animate it as follows:
agent_pos = traverse(H, 1, 4) # make an agent traversing from 1 to 4
def update_position(n):
    plt.cla()
    nx.draw(H, pos, node_size=700, with_labels=True, node_color='green')
    c = plt.Circle(next(agent_pos), 0.05, color='purple', zorder=2, alpha=0.7)
    plt.gca().add_patch(c)
ani = animation.FuncAnimation(plt.gcf(), update_position, interval=30, repeat=False)
plt.ioff()
plt.show()
Finally we will have something like this:
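For the specific case in the question, an agent that is partway along an edge at a given time step, the position is a linear interpolation between the two node positions. A minimal sketch, assuming 2D positions stored in a dict keyed by node; the helper name position_on_edge and the coordinates are made up for illustration.

```python
def position_on_edge(pos, u, v, fraction):
    """Point `fraction` of the way from node u to node v (0 = at u, 1 = at v)."""
    (x0, y0), (x1, y1) = pos[u], pos[v]
    return (x0 + fraction * (x1 - x0), y0 + fraction * (y1 - y0))


# Hand-made coordinates standing in for a layout such as nx.spring_layout.
pos = {1: (0.0, 0.0), 2: (2.0, 1.0)}

# Agent A halfway along the edge from node 1 to node 2.
midpoint = position_on_edge(pos, 1, 2, 0.5)
```

If an agent leaves node u at step t0 and needs k steps to cross the edge, the fraction at step t would be (t - t0) / k under this scheme.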

How can I reduce the nodes in a ggraph arc graph?

I'm trying to create an arc graph showing relationships between nonprofits focusing on a subgraph centered on one of the nonprofits. There are so many nonprofits in this subgraph, I need to reduce the number of nodes in the arc graph to only focus on the strongest connections.
I've successfully filtered out edges below a weight of 50. But when I create the graph, the nodes are still remaining even though the edges have disappeared. How do I filter the unwanted nodes from the arc graph?
Here's my code, starting from the creation of the igraph object.
# Create an igraph object
NGO_igraph <- graph_from_data_frame(d = edges, vertices = nodes, directed = TRUE)
# Create a subgraph centered on a node
# Start by entering the node ID
nodes_of_interest <- c(48)
# Build the graph
selegoV <- ego(NGO_igraph, order=1, nodes = nodes_of_interest, mode = "all", mindist = 0)
selegoG <- induced_subgraph(NGO_igraph,unlist(selegoV))
# Reducing the graph based on edge weight
smaller <- delete.edges(selegoG, which(E(selegoG)$weight < 50))
# Plotting an arc graph
ggraph(smaller, layout = "linear") +
  geom_edge_arc(aes(width = weight), alpha = 0.8) +
  scale_edge_width(range = c(0.2, 2)) +
  geom_node_text(aes(label = label)) +
  labs(edge_width = "Interactions") +
  theme_graph()
And here's the result I'm getting:
If you are only interested in omitting zero degree vertices or isolates (meaning vertices which have no incoming or outgoing edge) you could simply use the following line:
g <- induced.subgraph(g, degree(g) > 0)
However, this will delete all isolates. So if you are for some reason set on specifically deleting those vertices connected by edges with a weight below 50 (while exempting other 'special' isolates), then you will need to clearly identify which those are:
special_vertex <- 1
v <- ends(g, which(E(g)$weight < 50))
g <- delete.vertices(g, v[v != special_vertex])
You could also skip the delete.edges part by considering the strength of a vertex:
g <- induced.subgraph(g, strength(g) > 50)
Without any sample data I created this basic sample:
#define graph
g <- make_ring(10) %>%
set_vertex_attr("name", value = LETTERS[1:10])
g
V(g)
#delete the edges going to and from vertex C
g <- delete.edges(g, E(g)[2:3])
#find the heads and tails of each edge in the graph
heads <- head_of(g, E(g))
tails <- tail_of(g, E(g))
#list of all used vertices
combine <- unique(c(heads, tails))
#collect all vertices
v <- V(g)
#find vertices not in the used set
toremove <- setdiff(v, combine)
#remove unwanted vertices
delete_vertices(g, toremove)
The basic process is to identify the start and end vertices of all of the edges of interest, then compare this unique list with all of the vertices in the graph and remove the vertices that are not in the unique list.
From your code above the graph "smaller" would be used to find the vertices.
Hope this helps.

Resources