Add a second edge attribute to an existing graph


I'm trying to add a second edge attribute to an existing graph.
I created a graph G and saved it as a pkl file.
import networkx as nx
import pandas as pd

edges1 = pd.DataFrame({'source': [0, 1, 2, 3, 4],
                       'target': [10, 11, 12, 13, 14],
                       'weight': [50, 50, 50, 50, 50]})
G = nx.from_pandas_edgelist(edges1, 'source', 'target', 'weight')
I loaded G and then tried to add a second edge attribute (cost) and a node attribute.
But it keeps overwriting the first edge attribute (weight).
edges2 = pd.DataFrame({'source': [0, 1, 2, 6, 7, 8],
                       'target': [10, 11, 12, 16, 17, 18],
                       'cost': [100, 100, 100, 100, 100, 100]})
nodes = pd.DataFrame({'node': [0, 1, 2, 3, 10, 18],
                      'name': ['A', 'B', 'C', 'D', 'E', 'F']})
nx.from_pandas_edgelist(edges2, 'source', 'target', 'cost')
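# map node id -> name; pd.Series(nodes.name, index=nodes.node) would reindex and give NaN for nodes 10 and 18
nx.set_node_attributes(G, nodes.set_index('node')['name'].to_dict(), 'name')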
I must load the existing graph G, so combining the edges1 and edges2 DataFrames and building a new graph isn't what I need.
How can I get this:
[(0, 10, {'weight': 50, 'cost': 100}), (1, 11, {'weight': 50, 'cost': 100}), ...]
instead of this?
[(0, 10, {'cost': 100}), (1, 11, {'cost': 100}), ...]

It isn't clear to me whether you also want to add the new edges from edges2. If you are okay with adding new edges, you can use nx.compose:
H = nx.from_pandas_edgelist(edges2, 'source', 'target', 'cost')
G_updated = nx.compose(G, H)
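As a quick check of what this does to attributes, here is a minimal sketch with two hand-built graphs (the edges are a reduced, illustrative subset of the question's data). It assumes both graphs are plain nx.Graph objects with no conflicting attribute names:

```python
import networkx as nx

# G stands in for the loaded graph, H for the graph built from edges2.
G = nx.Graph()
G.add_edge(0, 10, weight=50)
G.add_edge(3, 13, weight=50)

H = nx.Graph()
H.add_edge(0, 10, cost=100)   # edge shared with G
H.add_edge(6, 16, cost=100)   # edge that only exists in H

G_updated = nx.compose(G, H)

shared = dict(G_updated.edges[0, 10])   # attributes of both graphs are merged
only_g = dict(G_updated.edges[3, 13])   # untouched edge keeps only 'weight'
only_h = dict(G_updated.edges[6, 16])   # new edge arrives with only 'cost'
```

The shared edge ends up with both weight and cost, while the edge that exists only in H is added as a new edge.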
If you don't want to add new edges, then you can check if the edge exists and then set the edge attribute directly:
H = nx.from_pandas_edgelist(edges2, 'source', 'target', 'cost')
for edge in H.edges():
    if edge in G.edges():
        G.edges[edge]['cost'] = H.edges[edge]['cost']
If performance is an issue, you could also consider setting the edge attributes of G directly by using your edges2 data without building a second graph or even a second dataframe.
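A minimal sketch of that last suggestion, assuming the DataFrames from the question and that only edges already present in G should receive the cost attribute. G is rebuilt here only so the example runs on its own; in practice it would be the loaded graph:

```python
import networkx as nx
import pandas as pd

edges1 = pd.DataFrame({'source': [0, 1, 2, 3, 4],
                       'target': [10, 11, 12, 13, 14],
                       'weight': [50, 50, 50, 50, 50]})
G = nx.from_pandas_edgelist(edges1, 'source', 'target', 'weight')

edges2 = pd.DataFrame({'source': [0, 1, 2, 6, 7, 8],
                       'target': [10, 11, 12, 16, 17, 18],
                       'cost': [100, 100, 100, 100, 100, 100]})

# Walk the rows of edges2 and update only edges that G already has,
# so no second graph is built and no new edges are created.
for u, v, cost in edges2[['source', 'target', 'cost']].itertuples(index=False):
    if G.has_edge(u, v):
        G.edges[u, v]['cost'] = cost
```

After the loop, edges (0, 10), (1, 11) and (2, 12) carry both weight and cost, while (3, 13) and (4, 14) keep only weight.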

Related

How can I animate simultaneously all the graph points inside Manim?

I made this class: 50 points of a spiral move onto a circle.
But the animation is sequential, and I would like all points to start moving at the same time.
class SpiralToCircle(Scene):
    def construct(self):
        vertices1 = range(50)
        vertices2 = range(50)
        edges = [(48, 49), (3, 4)]
        g1 = Graph(vertices1, edges, layout="spiral")
        g2 = Graph(vertices2, edges, layout="circular")
        # self.add(graph)
        self.play(Create(g1))
        self.wait(5)
        for i in vertices1:
            self.play(g1[i].animate.move_to(g2[i]))
        self.wait()
I thought about this trick, but it returns an error:
self.play((g1[i].animate.move_to(g2[i])) for i in vertices1)
TypeError: Unexpected argument <generator object GraphCircular.construct.. at 0x00000229667509E0> passed to Scene.play().

This should work:
self.play([g1[i].animate.move_to(g2[i]) for i in vertices1])
The play function can take a list of animations.

Try unpacking the animation list and passing its items as positional arguments to the play method:
self.play(*[g1[i].animate.move_to(g2[i]) for i in vertices1])
So, the code would be:
from manim import *


class SpiralToCircle(Scene):
    def construct(self):
        vertices1 = range(50)
        vertices2 = range(50)
        edges = [(48, 49), (3, 4)]
        g1 = Graph(vertices1, edges, layout="spiral")
        g2 = Graph(vertices2, edges, layout="circular")
        # self.add(graph)
        self.play(Create(g1))
        self.wait(5)
        # one play() call with all animations unpacked, so they run together
        self.play(*[g1[i].animate.move_to(g2[i]) for i in vertices1])
        self.wait()

Getting the vertex numbers from an Edge

I am using the shortest path algorithm from LightGraphs.jl. In the end I want to collect some information about the nodes along the path. In order to do that I need to be able to extract the vertices from the edges that the function gives back.
using LightGraphs
g = cycle_graph(4)
path = a_star(g, 1, 3)
edge1 = path[1]
Using this I get: Edge 1 => 2
How would I get the vertices 1 and 2 programmatically, without having to inspect the Edge manually? I was thinking of something like edge1[1] or edge1.From, but neither works.
Thanks in advance!

The accessors for AbstractEdge types are src and dst, used like this:
using LightGraphs
g = cycle_graph(4)
path = a_star(g, 1, 3)
edge1 = path[1]
s = src(edge1)
d = dst(edge1)
println("source: $s") # prints "source: 1"
println("destination: $d") # prints "destination: 2"

All path lengths from source to target in Directed Acyclic Graph

I have a graph given as an adjacency matrix (adj_mat.shape = (4000, 4000)). My current problem is finding the list of lengths of all paths that run from the source (row = 0) to the target (col = adj_mat.shape[0] - 1).
I am not interested in the path sequences themselves; I only want to propagate the path lengths. This is therefore different from finding all simple paths, which would be too slow (i.e. find all paths from source to target, then score each path). Is there a performant way to do this?
DFS is suggested as one possible strategy (noted here). My current implementation (below) is simply not optimal:
import networkx as nx
import numpy as np

# create graph (adj_mat is the square adjacency matrix described above;
# nx.from_numpy_matrix was removed in NetworkX 3.0, from_numpy_array replaces it)
G = nx.from_numpy_array(adj_mat, create_using=nx.DiGraph())
# initialize nodes
for node in G.nodes:
    G.nodes[node]['cprob'] = []
# set starting node value
G.nodes[0]['cprob'] = [0]

def propagate_prob(G, node):
    # find incoming edges to node
    predecessors = list(G.predecessors(node))
    curr_node_arr = []
    for prev_node in predecessors:
        # get incoming edge weight
        edge_weight = G.get_edge_data(prev_node, node)['weight']
        # get predecessor node value
        if len(G.nodes[prev_node]['cprob']) == 0:
            G.nodes[prev_node]['cprob'] = propagate_prob(G, prev_node)
        prev_node_arr = G.nodes[prev_node]['cprob']
        # add incoming edge weight to prev_node arr
        curr_node_arr = np.concatenate([curr_node_arr, np.array(edge_weight) + np.array(prev_node_arr)])
    # update current node array
    G.nodes[node]['cprob'] = curr_node_arr
    return G.nodes[node]['cprob']

# calculate all path lengths from source to sink (the last node index, not 4000)
part_func = propagate_prob(G, adj_mat.shape[0] - 1)

I don't have a large example at hand (e.g. >300 nodes), but I found a non-recursive solution:
import networkx as nx

g = nx.DiGraph()
nx.add_path(g, range(7))
g.add_edge(0, 3)
g.add_edge(0, 5)
g.add_edge(1, 4)
g.add_edge(3, 6)

# first step retrieve topological sorting
sorted_nodes = nx.algorithms.topological_sort(g)
start = 0
target = 6
path_lengths = {start: [0]}

for node in sorted_nodes:
    if node == target:
        print(path_lengths[node])
        break
    if node not in path_lengths or g.out_degree(node) == 0:
        continue
    new_path_length = path_lengths[node]
    new_path_length = [i + 1 for i in new_path_length]
    for successor in g.successors(node):
        if successor in path_lengths:
            path_lengths[successor].extend(new_path_length)
        else:
            path_lengths[successor] = new_path_length.copy()
    if node != target:
        del path_lengths[node]
Output: [2, 4, 2, 4, 4, 6]
If you are only interested in how many paths there are of each length, e.g. {2: 2, 4: 3, 6: 1} for the example above, you could even reduce the lists to dicts.
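A minimal sketch of that dict-based variant, assuming unit edge lengths as in the example above. The function name path_length_counts is hypothetical, and collections.Counter replaces the per-node lists:

```python
from collections import Counter

import networkx as nx


def path_length_counts(g, start, target):
    """Return {path length: number of paths} from start to target in a DAG."""
    counts = {start: Counter({0: 1})}
    for node in nx.topological_sort(g):
        if node == target:
            break
        node_counts = counts.pop(node, None)
        if node_counts is None:
            continue  # node is not reachable from start
        # every path through this node gets one edge longer
        shifted = Counter({length + 1: n for length, n in node_counts.items()})
        for successor in g.successors(node):
            counts.setdefault(successor, Counter()).update(shifted)
    return dict(counts.get(target, {}))


g = nx.DiGraph()
nx.add_path(g, range(7))
g.add_edges_from([(0, 3), (0, 5), (1, 4), (3, 6)])

result = path_length_counts(g, 0, 6)
```

Each node then stores at most one entry per distinct length rather than one entry per path, which keeps memory bounded by the number of distinct lengths.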
Background
Some explanation of what I'm doing (which I hope also works for larger examples). The first step is to retrieve a topological sorting. Why? It tells me in which "direction" the edges flow, so I can process the nodes in that order without missing any edge and without the backtracking a recursive variant needs. Next, I initialise the start node with a list containing the current path length ([0]). This list is copied to all successors, with every element incremented by 1. In each iteration, the path lengths from the start node to all nodes processed so far are computed and stored in the dict path_lengths. The loop stops when it reaches the target node.

With igraph I can handle up to 300 nodes in ~1 second. I also found that accessing the adjacency matrix directly (rather than calling igraph functions to retrieve edges/vertices) saves time. The two key bottlenecks are 1) appending to a long list efficiently (while also limiting memory use) and 2) finding a way to parallelize. The run time grows exponentially past ~300 nodes; I would love to see a faster solution that also fits into memory.
import igraph
import numpy as np

# create graph from adjacency matrix
G = igraph.Graph.Adjacency((trans_mat_pad > 0).tolist())
# add edge weights
G.es['weight'] = trans_mat_pad[trans_mat_pad.nonzero()]
# initialize nodes
for node in range(trans_mat_pad.shape[0]):
    G.vs[node]['cprob'] = []
# set starting node value
G.vs[0]['cprob'] = [0]

def propagate_prob(G, node, trans_mat_pad):
    # find incoming edges to node
    predecessors = trans_mat_pad[:, node].nonzero()[0]  # G.get_adjlist(mode='IN')[node]
    curr_node_arr = []
    for prev_node in predecessors:
        # get incoming edge weight
        edge_weight = trans_mat_pad[prev_node, node]  # G.es[prev_node]['weight']
        # get predecessor node value
        if len(G.vs[prev_node]['cprob']) == 0:
            curr_node_arr = np.concatenate([curr_node_arr, np.array(edge_weight) + propagate_prob(G, prev_node, trans_mat_pad)])
        else:
            curr_node_arr = np.concatenate([curr_node_arr, np.array(edge_weight) + np.array(G.vs[prev_node]['cprob'])])
    ## NB: If memory constraint, uncomment below
    # set max size
    # if len(curr_node_arr) > 100:
    #     curr_node_arr = np.sort(curr_node_arr)[:100]
    # update current node array
    G.vs[node]['cprob'] = curr_node_arr
    return G.vs[node]['cprob']

# calculate path lengths
path_len = propagate_prob(G, trans_mat_pad.shape[0]-1, trans_mat_pad)

Best way to count downstream with edge data

I have a NetworkX problem. I create a digraph from a pandas DataFrame, with data set on the edges. I now need to count the number of unique source nodes among a node's descendants, grouped by an edge attribute.
This is my code. It works for one node, but I need to pass many nodes to it and get the unique counts for each.
graph = nx.from_pandas_edgelist(df, source="source", target="target",
                                edge_attr=["domain", "category"], create_using=nx.DiGraph)
downstream_nodes = list(nx.descendants(graph, node))
downstream_nodes.append(node)
subgraph = graph.subgraph(downstream_nodes).copy()
domain_sources = {}
for s, t, v in subgraph.edges(data=True):
    if v["domain"] in domain_sources:
        domain_sources[v["domain"]].append(s)
    else:
        domain_sources[v["domain"]] = [s]
down_count = {}
for k, v in domain_sources.items():
    down_count[k] = len(list(set(v)))
It works, and for a single node the run time is not a big deal, but I'm feeding this routine at least 40 to 50 nodes. Is this the best way? Is there something else I can do to group by an edge attribute and count the unique nodes?

Two possible enhancements:
Remove the copy from the line that creates the subgraph. You are not changing anything, so the copy is redundant.
Use a defaultdict whose values are sets, so duplicates are dropped as you go. Read more here.
from collections import defaultdict
import networkx as nx

# missing part of df creation
graph = nx.from_pandas_edgelist(df, source="source", target="target",
                                edge_attr=["domain", "category"], create_using=nx.DiGraph)
downstream_nodes = list(nx.descendants(graph, node))
downstream_nodes.append(node)
subgraph = graph.subgraph(downstream_nodes)  # read-only view, no copy needed
domain_sources = defaultdict(set)
for s, t, v in subgraph.edges(data=True):
    domain_sources[v["domain"]].add(s)
down_count = {}
for k, v in domain_sources.items():
    down_count[k] = len(v)  # v is already a set
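To run this for the 40 to 50 nodes mentioned in the question, the routine can be wrapped in a function and called per node. The sketch below is one way to do that; the function name downstream_source_counts and the small DataFrame are made up for illustration, since the original df is not shown:

```python
from collections import defaultdict

import networkx as nx
import pandas as pd


def downstream_source_counts(graph, node, attr="domain"):
    """Count unique source nodes per attribute value among edges downstream of node."""
    reachable = nx.descendants(graph, node) | {node}
    sources = defaultdict(set)
    # data=attr yields (source, target, attribute value) triples
    for s, t, value in graph.subgraph(reachable).edges(data=attr):
        sources[value].add(s)
    return {key: len(nodes) for key, nodes in sources.items()}


# Hypothetical edge list standing in for the question's DataFrame.
df = pd.DataFrame({
    "source": ["a", "a", "b", "c", "x"],
    "target": ["b", "c", "d", "d", "a"],
    "domain": ["web", "db", "web", "web", "db"],
    "category": ["k1", "k2", "k1", "k1", "k2"],
})
graph = nx.from_pandas_edgelist(df, source="source", target="target",
                                edge_attr=["domain", "category"],
                                create_using=nx.DiGraph)

counts = {node: downstream_source_counts(graph, node) for node in ["a", "b", "x"]}
```

The graph is built once and only the subgraph view changes per node, so the per-node cost is dominated by nx.descendants and one pass over the downstream edges.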

How can I add a second edge between two nodes without updating the old one?

What I have: a multigraph H in NetworkX, with two nodes 0 and 1 and an existing edge e1=(0,1).
What I want: add a second edge e2 between nodes 0 and 1.
PROBLEM: When I add the new edge e2 between 0 and 1, e1 is updated with the attributes of e2, and e2 is not added. There is always a single edge between 0 and 1.
My example code:
H=nx.MultiGraph()
H=nx.read_gml('my_graph.gml')
If I print all the edges of H, I correctly get this:
for i in H.edges(data=True):
    print(i)
>>>>>(0, 1, {}) # this is ok
Now I add a new edge e2=(0,1) using the key argument:
H.add_edge(0,1,key=1,value='blue')
But if I print all the edges of H:
for i in H.edges(data=True):
    print(i)
>>>>>(0, 1, {'key': 1, 'value': 'blue'}) # this is wrong: e1 was updated instead of e2 being added
As you can see, the second edge has updated the first one, even though e2 was added with an explicit key different from that of e1 (the default is 0).
How can I avoid this problem?
I want this result after adding edge e2:
for i in H.edges(data=True):
    print(i)
>>>>>(0: 0, 1, {}, 1: 0, 1, {'value': 'blue'}) # this is correct

You don't have a multigraph: the second assignment replaces the empty MultiGraph with whatever read_gml returns, so you are replacing edges instead of adding new ones.
Use
H = nx.MultiGraph(nx.read_gml('my_graph.gml'))
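The difference can be seen without a GML file. In this minimal sketch, the hand-built graph base is an assumption standing in for the result of read_gml:

```python
import networkx as nx

# Stand-in for the graph returned by nx.read_gml: a plain Graph with one edge.
base = nx.Graph()
base.add_edge(0, 1)

# On a plain Graph, key=1 is just another edge attribute and the edge is updated.
simple = base.copy()
simple.add_edge(0, 1, key=1, value="blue")
simple_edges = list(simple.edges(data=True))

# Converting to a MultiGraph first keeps the old edge (key 0) and adds a new one.
multi = nx.MultiGraph(base)
multi.add_edge(0, 1, key=1, value="blue")
multi_edges = list(multi.edges(keys=True, data=True))
```

Passing keys=True to edges() shows the edge key as a third element, which makes the two parallel edges easy to tell apart.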
