Getting the Vertices numbers from an Edge - julia

I am using the shortest path algorithm from LightGraphs.jl. In the end I want to collect some information about the nodes along the path. In order to do that I need to be able to extract the vertices from the edges that the function gives back.
using LightGraphs
g = cycle_graph(4)
path = a_star(g, 1, 3)
edge1 = path[1]
Using this I get: Edge 1 => 2
How would I automatically get the vertices 1, 2 without having to look at the Edge manually? I was thinking of something like edge1[1] or edge1.From, but neither works.
Thanks in advance!

The accessors for AbstractEdge types are src and dst, used like this:
using LightGraphs
g = cycle_graph(4)
path = a_star(g, 1, 3)
edge1 = path[1]
s = src(edge1)
d = dst(edge1)
println("source: $s") # prints "source: 1"
println("destination: $d") # prints "destination: 2"

Related

All path *lengths* from source to target in Directed Acyclic Graph

I have a graph with an adjacency matrix adj_mat (adj_mat.shape = (4000, 4000)). My current problem involves finding the list of lengths of all paths (the sequence of nodes is not important) that go from the source (row = 0) to the target (col = adj_mat.shape[0] - 1).
I am not interested in finding the path sequences; I am only interested in propagating the path length. As a result, this is different from finding all simple paths - which would be too slow (ie. find all paths from source to target; then score each path). Is there a performant way to do this quickly?
DFS has been suggested as one possible strategy. My current implementation (below) is far from optimal:
import networkx as nx
import numpy as np

# create graph
# (from_numpy_matrix was removed in NetworkX 3.0; use nx.from_numpy_array there)
G = nx.from_numpy_matrix(adj_mat, create_using=nx.DiGraph())
# initialize nodes
for node in G.nodes:
    G.nodes[node]['cprob'] = []
# set starting node value
G.nodes[0]['cprob'] = [0]

def propagate_prob(G, node):
    # find incoming edges to node
    predecessors = list(G.predecessors(node))
    curr_node_arr = []
    for prev_node in predecessors:
        # get incoming edge weight
        edge_weight = G.get_edge_data(prev_node, node)['weight']
        # get predecessor node value
        if len(G.nodes[prev_node]['cprob']) == 0:
            G.nodes[prev_node]['cprob'] = propagate_prob(G, prev_node)
        prev_node_arr = G.nodes[prev_node]['cprob']
        # add incoming edge weight to prev_node arr
        curr_node_arr = np.concatenate([curr_node_arr, np.array(edge_weight) + np.array(prev_node_arr)])
    # update current node array
    G.nodes[node]['cprob'] = curr_node_arr
    return G.nodes[node]['cprob']

# calculate all path lengths from source to sink (the last node, index shape[0] - 1)
part_func = propagate_prob(G, adj_mat.shape[0] - 1)
I don't have a large example at hand (e.g. >300 nodes), but I found a non-recursive solution:
import networkx as nx
g = nx.DiGraph()
nx.add_path(g, range(7))
g.add_edge(0, 3)
g.add_edge(0, 5)
g.add_edge(1, 4)
g.add_edge(3, 6)
# first step retrieve topological sorting
sorted_nodes = nx.algorithms.topological_sort(g)
start = 0
target = 6
path_lengths = {start: [0]}
for node in sorted_nodes:
    if node == target:
        print(path_lengths[node])
        break
    if node not in path_lengths or g.out_degree(node) == 0:
        continue
    new_path_length = path_lengths[node]
    new_path_length = [i + 1 for i in new_path_length]
    for successor in g.successors(node):
        if successor in path_lengths:
            path_lengths[successor].extend(new_path_length)
        else:
            path_lengths[successor] = new_path_length.copy()
    if node != target:
        del path_lengths[node]
Output: [2, 4, 2, 4, 4, 6]
If you are only interested in how many paths there are of each length, e.g. {2: 2, 4: 3, 6: 1} for the example above, you could even reduce the lists to dicts.
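The reduction from lists to dicts could look roughly like the sketch below. It is not the answerer's code: it assumes the DAG is given as a plain dict of successor lists (a hypothetical input format) and uses the standard-library graphlib module instead of NetworkX, so the names path_length_counts and dag are illustrative only.

```python
from collections import Counter
from graphlib import TopologicalSorter  # standard library, Python 3.9+

def path_length_counts(successors, start, target):
    """Return {path length: number of start->target paths} for a DAG."""
    # TopologicalSorter expects node -> predecessors, so invert the mapping.
    predecessors = {node: set() for node in successors}
    for node, succs in successors.items():
        for succ in succs:
            predecessors.setdefault(succ, set()).add(node)

    counts = {start: Counter({0: 1})}  # one path of length 0 to the start
    for node in TopologicalSorter(predecessors).static_order():
        if node == target:
            break
        here = counts.pop(node, None)  # drop the entry once it is processed
        if here is None:
            continue  # node is not reachable from start
        for succ in successors.get(node, ()):
            bucket = counts.setdefault(succ, Counter())
            for length, n_paths in here.items():
                bucket[length + 1] += n_paths
    return dict(counts.get(target, {}))

# Same graph as the example above: the path 0..6 plus four extra edges.
dag = {0: [1, 3, 5], 1: [2, 4], 2: [3], 3: [4, 6], 4: [5], 5: [6], 6: []}
print(path_length_counts(dag, 0, 6))  # {2: 2, 4: 3, 6: 1}
```

Each node then stores at most one counter entry per distinct length rather than one list element per path, which keeps memory bounded by the number of distinct lengths.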
Background
Some explanation of what I'm doing (which I hope works for larger examples as well). The first step is to retrieve the topological sorting. Why? Because then I know in which "direction" the edges flow, and I can simply process the nodes in that order without "missing any edge" and without any "backtracking" as in a recursive variant. Afterwards, I initialise the start node with a list containing the current path length ([0]). This list is copied to all successors, with the path length updated (all elements +1). The goal is that in each iteration the path lengths from the starting node to all processed nodes are calculated and stored in the dict path_lengths. The loop stops after reaching the target node.
With igraph I can handle up to 300 nodes in ~1 second. I also found that accessing the adjacency matrix directly (rather than calling igraph functions to retrieve edges/vertices) saves time. The two key bottlenecks are 1) extending a long list efficiently (while also limiting memory use) and 2) finding a way to parallelize. The runtime grows exponentially past ~300 nodes, so I would love to see if someone has a faster solution (that also fits into memory).
import igraph
import numpy as np

# create graph from adjacency matrix
G = igraph.Graph.Adjacency((trans_mat_pad > 0).tolist())
# add edge weights
G.es['weight'] = trans_mat_pad[trans_mat_pad.nonzero()]
# initialize nodes
for node in range(trans_mat_pad.shape[0]):
    G.vs[node]['cprob'] = []
# set starting node value
G.vs[0]['cprob'] = [0]

def propagate_prob(G, node, trans_mat_pad):
    # find incoming edges to node
    predecessors = trans_mat_pad[:, node].nonzero()[0]  # G.get_adjlist(mode='IN')[node]
    curr_node_arr = []
    for prev_node in predecessors:
        # get incoming edge weight
        edge_weight = trans_mat_pad[prev_node, node]  # G.es[prev_node]['weight']
        # get predecessor node value
        if len(G.vs[prev_node]['cprob']) == 0:
            curr_node_arr = np.concatenate([curr_node_arr, np.array(edge_weight) + propagate_prob(G, prev_node, trans_mat_pad)])
        else:
            curr_node_arr = np.concatenate([curr_node_arr, np.array(edge_weight) + np.array(G.vs[prev_node]['cprob'])])
        ## NB: If memory constraint, uncomment below
        # set max size
        # if len(curr_node_arr) > 100:
        #     curr_node_arr = np.sort(curr_node_arr)[:100]
    # update current node array
    G.vs[node]['cprob'] = curr_node_arr
    return G.vs[node]['cprob']

# calculate path lengths
path_len = propagate_prob(G, trans_mat_pad.shape[0]-1, trans_mat_pad)
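For comparison, here is a minimal non-recursive sketch of the same weighted propagation, processing nodes in topological order. It assumes a hypothetical input format (a dict of dicts, weights[u][v] being the weight of edge u -> v) and uses only the standard library. The function name path_lengths and the max_keep parameter are illustrative; max_keep plays the role of the commented-out size cap above by keeping only the smallest lengths per node.

```python
import heapq
from graphlib import TopologicalSorter  # standard library, Python 3.9+

def path_lengths(weights, source, target, max_keep=None):
    """Sorted lengths of all source->target paths in a weighted DAG.

    weights[u][v] is the weight of edge u -> v (assumed input format).
    If max_keep is set, only the max_keep smallest lengths are kept per node.
    """
    # TopologicalSorter expects node -> predecessors.
    preds = {u: set() for u in weights}
    for u, out in weights.items():
        for v in out:
            preds.setdefault(v, set()).add(u)

    lengths = {source: [0]}
    for u in TopologicalSorter(preds).static_order():
        if u == target:
            break
        here = lengths.pop(u, None)  # free memory once a node is processed
        if here is None:
            continue  # not reachable from source
        for v, w in weights.get(u, {}).items():
            merged = lengths.setdefault(v, [])
            merged.extend(length + w for length in here)
            if max_keep is not None and len(merged) > max_keep:
                lengths[v] = heapq.nsmallest(max_keep, merged)
    return sorted(lengths.get(target, []))

weights = {0: {1: 1, 2: 4}, 1: {2: 2, 3: 6}, 2: {3: 3}, 3: {}}
print(path_lengths(weights, 0, 3))              # [6, 7, 7]
print(path_lengths(weights, 0, 3, max_keep=1))  # [6]
```

Because adding an edge weight preserves the order of the predecessor's lengths, truncating to the k smallest at every node still yields the k smallest lengths at the target. Without a cap the number of lengths can still grow exponentially, since it equals the number of paths.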

Robot in a Grid - how to get all possible paths

I'm trying to solve this problem:
There is a grid with r rows and c columns. A robot sitting in the top left cell can only move in 2 directions, right and down. But certain cells have to be avoided and the robot cannot step on them. Find a path for the robot from the top left to the bottom right.
The problem specifically asks for a single path, and that seems straightforward:
Having the grid as boolean[][], the pseudocode I have is
List<String> path = new ArrayList<String>()
boolean found = false
void getPath(r, c){
    if (!found) {
        if ( (r or c is outofbounds) || (!grid[r][c]) )
            return
        if (r==0 AND c==0) // we reached
            found = true
        getPath(r-1, c)
        getPath(r, c-1)
        String cell = "(" + r + ", " + c + ")"
        path.add(cell)
    }
}
Though I was wondering how I can get all the possible paths (NOT just the count, but the path values as well). Note that it has r rows and c columns, so it's not an n x n grid. I'm trying to think of a DP/recursive solution but I'm stuck and unable to come up with one. It's hard to think when the recursion goes in two ways.
Any pointers? And also any general help on how to "think" about such problems would be appreciated :).
Approach to the problem:
Mentally construct the graph G of the problem. In this case the vertices are the cells of the grid, and a directed edge exists wherever a valid robot move exists.
Search for properties of G. In this case G is a DAG (Directed Acyclic Graph).
Use such properties to come up with a solution. In this case (G is a DAG) it's common to use a topological sort and dynamic programming to find the number of valid paths.
Actually, you don't need to construct the graph, since the set of edges is clear from the grid, nor do you need an explicit topological sort: the usual iteration over the matrix (increasing row index, then increasing column index) already visits the cells in a topological order of this implicit graph.
The dynamic programming part can be solved by storing in each cell [x][y] the number of valid paths from [0][0] to [x][y] and checking where to move next.
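Recurrence:
dp[0][0] = 1 if the start cell is free
dp[x][y] = 0 if cell [x][y] is blocked
dp[x][y] = dp[x - 1][y] + dp[x][y - 1] otherwise, where a term that falls outside the grid counts as 0
After the computation the answer is stored in dp[n - 1][m - 1], where n is the number of rows and m is the number of columns. Overall runtime is O(nm).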
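A minimal sketch of this dynamic programming step, assuming the grid is given as a list of lists of booleans in which True marks a blocked cell (the function name count_paths is illustrative):

```python
def count_paths(bad):
    """dp[x][y] = number of right/down paths from (0, 0) to (x, y)."""
    n, m = len(bad), len(bad[0])
    dp = [[0] * m for _ in range(n)]
    for x in range(n):          # row-major order is a topological order
        for y in range(m):
            if bad[x][y]:
                continue        # blocked cells keep 0 paths
            if x == 0 and y == 0:
                dp[x][y] = 1    # the start cell itself
                continue
            from_above = dp[x - 1][y] if x > 0 else 0
            from_left = dp[x][y - 1] if y > 0 else 0
            dp[x][y] = from_above + from_left
    return dp

bad = [[False, False, False, False],
       [True,  True,  False, False],
       [False, False, False, False]]
dp = count_paths(bad)
print(dp[-1][-1])  # 3 valid paths reach the bottom-right cell
```

The full matrix, not just the last cell, is what makes pruning possible later: any cell with dp value 0 cannot lie on a valid path from the start.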
How to find all valid paths:
The usual backtracking works, and we can speed it up with early pruning. If we calculate the dp matrix first and then backtrack from cell [n - 1][m - 1], we can abandon an invalid path as soon as the robot enters a cell whose dp value is zero.
Python code with the dp matrix calculated beforehand:
n, m = 3, 4
bad = [[False, False, False, False],
       [ True,  True, False, False],
       [False, False, False, False]]
dp = [[1, 1, 1, 1],
      [0, 0, 1, 2],
      [0, 0, 1, 3]]
paths = []
curpath = []
def getPath(r, c):
    # check the bounds first so that negative indices never reach dp
    if r < 0 or c < 0 or dp[r][c] == 0:
        return
    curpath.append((r, c))
    if r == 0 and c == 0:
        paths.append(list(reversed(curpath)))
    getPath(r - 1, c)
    getPath(r, c - 1)
    curpath.pop()
getPath(n - 1, m - 1)
print(paths)
# valid paths are [[(0, 0), (0, 1), (0, 2), (0, 3), (1, 3), (2, 3)],
#                  [(0, 0), (0, 1), (0, 2), (1, 2), (1, 3), (2, 3)],
#                  [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 3)]]
Notice that this is very similar to your code; the differences are that all valid paths are stored together and that each appended list is a copy of curpath, which avoids ending up with a list of empty lists.
Runtime: O((n + m) * (number of valid paths)), since every simulated robot move either belongs to a valid path or is the first step into an invalid path, which is detected immediately using foresight (dp). Warning: this method is exponential in the worst case, as the number of valid paths can be as large as C(n + m - 2, n - 1) in a grid without blocked cells.

DFS to get all possible solutions?

I have these Circles:
I want to get the list of all possible solutions of maximum non-intersecting circles. This is an illustration of the solution I want from node A.
Therefore the possible solutions from node A:
1 = [A,B,C], 2 = [A,B,E], 3 = [A,C,B], 4 = [A,E,B] ..etc
I want to store all of the possibilities in a list, which will then be used for weighting and selecting the best result. However, I'm still trying to create the list of all possibilities.
I've tried to code the structure here, but I'm still confused about backtracking and recursion. Could anyone help?
# List of circle
list_of_circle = ['A','B','C','D','E']
# List of all possible solutions
result = []
# List of possible nodes
ways = []
for k in list_of_circle:
    if len(list_of_circle)==0:
        result.append(ways)
    else:
        ways.append(k)
        list_of_circle.remove(k)
        for j in list_of_circle:
            if k.intersects(j):
                list_of_circle.remove(j)
return result
Here is a possible solution (a Python sketch; intersects is a predicate you need to supply).
def get_max_non_intersect(selected_circles, current_circle_idx, all_circles):
    # returns a list of selections; each selection is a list of pairwise
    # non-intersecting circles (non-maximal selections are included too)
    if current_circle_idx == len(all_circles):  # final case
        return [selected_circles]  # a list holding one complete selection
    # recursively collect the selections in which the current circle is not selected
    list_without_current_circle = get_max_non_intersect(selected_circles, current_circle_idx + 1, all_circles)
    # now check whether the current circle can be added to the ones selected
    current_intersects_selected = False
    current_circle = all_circles[current_circle_idx]
    for selected_circle in selected_circles:
        if intersects(current_circle, selected_circle):
            current_intersects_selected = True
            break
    if current_intersects_selected:  # we cannot add the current circle
        return list_without_current_circle
    else:  # we can add the current circle
        list_with_current_circle = get_max_non_intersect(selected_circles + [current_circle], current_circle_idx + 1, all_circles)
        return list_with_current_circle + list_without_current_circle
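To try the include/exclude recursion on concrete data, here is a self-contained sketch. Everything in it is an assumption made for illustration: circles are (x, y, radius) tuples with made-up coordinates, touching circles count as intersecting, and selections are treated as unordered sets (so [A, B, C] and [A, C, B] are the same solution). The extra maximal_sets step keeps only selections that cannot be extended.

```python
from math import hypot

def intersects(a, b):
    """Circles are (x, y, radius) tuples; touching counts as intersecting."""
    return hypot(a[0] - b[0], a[1] - b[1]) <= a[2] + b[2]

def independent_sets(names, circles, idx=0, selected=()):
    """Yield every selection of pairwise non-intersecting circles."""
    if idx == len(names):
        yield selected
        return
    # Branch 1: leave circle idx out.
    yield from independent_sets(names, circles, idx + 1, selected)
    # Branch 2: take it, but only if it is compatible with the selection.
    current = names[idx]
    if not any(intersects(circles[current], circles[s]) for s in selected):
        yield from independent_sets(names, circles, idx + 1, selected + (current,))

def maximal_sets(names, circles):
    """Keep only the selections to which no further circle can be added."""
    result = []
    for sel in independent_sets(names, circles):
        can_extend = any(
            n not in sel
            and not any(intersects(circles[n], circles[s]) for s in sel)
            for n in names
        )
        if not can_extend:
            result.append(sel)
    return result

circles = {            # hypothetical coordinates
    "A": (0, 0, 1),
    "B": (3, 0, 1),
    "C": (0, 3, 1),
    "D": (1, 0, 1),    # overlaps A and touches B
}
print(sorted(maximal_sets(list(circles), circles)))
# [('A', 'B', 'C'), ('C', 'D')]
```

The enumeration visits up to 2^n selections, so this only suits small inputs; the resulting list can then be scored to pick the best selection.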

Best way to count downstream with edge data

I have a NetworkX problem. I create a digraph from a pandas DataFrame, and there is data that I set on the edges. I now need to count the number of unique sources among a node's descendants, which requires access to an edge attribute.
This is my code. It works for one node, but I need to pass many nodes to it and get unique counts for each.
graph = nx.from_pandas_edgelist(df, source="source", target="target",
                                edge_attr=["domain", "category"], create_using=nx.DiGraph)
downstream_nodes = list(nx.descendants(graph, node))
downstream_nodes.append(node)
subgraph = graph.subgraph(downstream_nodes).copy()
domain_sources = {}
for s, t, v in subgraph.edges(data=True):
    if v["domain"] in domain_sources:
        domain_sources[v["domain"]].append(s)
    else:
        domain_sources[v["domain"]] = [s]
down_count = {}
for k, v in domain_sources.items():
    down_count[k] = len(list(set(v)))
It works, and for one node the time is not a big deal, but I'm feeding this routine at least 40 to 50 nodes. Is this the best way? Is there something else I can do that groups by an edge attribute and counts the unique nodes?
Two possible enhancements:
Remove the copy() call from the line creating the subgraph. You are not changing anything, so the copy is redundant.
Use a defaultdict whose values are sets, so that duplicate sources are dropped as they are added.
from collections import defaultdict
import networkx as nx
# missing part of df creation
graph = nx.from_pandas_edgelist(df, source="source", target="target",
                                edge_attr=["domain", "category"], create_using=nx.DiGraph)
downstream_nodes = list(nx.descendants(graph, node))
downstream_nodes.append(node)
subgraph = graph.subgraph(downstream_nodes)
domain_sources = defaultdict(set)
for s, t, v in subgraph.edges(data=True):
    domain_sources[v["domain"]].add(s)
down_count = {}
for k, v in domain_sources.items():
    down_count[k] = len(v)  # v is already a set
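Since the question is about running this for 40 to 50 start nodes, the grouping logic can be wrapped in a function and applied per node. The sketch below is dependency-free rather than NetworkX-based: it assumes the edges are available as (source, target, domain) tuples (a hypothetical format; with a DataFrame they could come from its rows), and downstream_counts is an illustrative name. It relies on the observation that every edge leaving a reachable node lands on a reachable node, so a single breadth-first traversal sees the same edges as the induced subgraph.

```python
from collections import defaultdict, deque

def downstream_counts(edges, start_nodes):
    """For each start node, count unique edge sources per domain downstream.

    edges: iterable of (source, target, domain) tuples (assumed format).
    """
    # Build the adjacency list once and reuse it for every start node.
    out_edges = defaultdict(list)
    for s, t, domain in edges:
        out_edges[s].append((t, domain))

    result = {}
    for start in start_nodes:
        seen = {start}
        queue = deque([start])
        sources = defaultdict(set)
        while queue:  # breadth-first traversal of everything downstream
            s = queue.popleft()
            for t, domain in out_edges.get(s, ()):
                sources[domain].add(s)
                if t not in seen:
                    seen.add(t)
                    queue.append(t)
        result[start] = {d: len(srcs) for d, srcs in sources.items()}
    return result

edges = [("a", "b", "x"), ("a", "c", "y"), ("b", "d", "x"), ("c", "d", "x")]
print(downstream_counts(edges, ["a", "b"]))
# {'a': {'x': 3, 'y': 1}, 'b': {'x': 1}}
```

Building the adjacency structure once and skipping the explicit subgraph avoids repeated setup work when many start nodes are queried.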

Using method source(edge) of Package Graphs.jl in Julia on Juliabox.org

Take a look at the following simple code example:
Pkg.add("Graphs")
using Graphs
gd = simple_graph(20, is_directed=true) # directed graph with 20 nodes
nodeTo = 2
for nodeFrom in vertices(gd) # add some edges...
    if(nodeTo != 20)
        add_edge!(gd, nodeFrom, nodeTo)
        nodeTo +=1
    end
end
for i in edges(gd) # Print source and target for every edge in gd
    println("Target: ",target(i))
    println("Source: ", source(i))
end
It works sometimes and prints the sources and targets of the edges, but most of the time when I run the cell (after editing this or other cells, or after doing nothing at all) I get the following error:
type: anonymous: in apply, expected Function, got Int64
while loading In[11], in expression starting on line 14
in anonymous at no file:16
I have not changed any code in the method or the cell, but it doesn't work anymore. The method target(edge) works fine, but the method source(edge) causes problems most of the time.
http://graphsjl-docs.readthedocs.org/en/latest/graphs.html#graph
What should I do? I would be glad to get some help.
After some thought, I found out that the mistake has to be in the code between the rows of hash signs:
Pkg.add("JuMP")
Pkg.add("Cbc")
# Pkg.update()
using Graphs
using JuMP
using Cbc
function createModel(graph, costs, realConnections)
    m = Model(solver = CbcSolver())
    #defVar(m, 0 <= x[i=1:realConnections] <= 1, Int)
    #setObjective(m, Min, dot(x,costs[i=1:realConnections]))
    println(m)
    for vertex in vertices(graph)
        edgesIn = Int64[] # Array of edges going into the vertex
        edgesOut = Int64[] # Array of edges going out of the vertex
        for edge in edges(graph)
            if (target(edge) == vertex) # works fine
                push!(edgesIn, edge_index(edge))
            end
            if (source(edge) == vertex) # does not work
                push!(edgesOut, edge_index(edge))
                print(source(edge), " ")
            end
        end
        # #addConstraint(m, sum{x[edgesIn[i]], i=1:length(edgesIn)} - sum{x[edgesOut[j]], j=1:length(edgesOut)} == 0)
    end
    return m
end
file = open("csp50.txt")
g = createGraph(file) # g[1] = simpleGraph object with 50 nodes and 173 arcs, g[2] = number of nodes, g[3]/g[4] = start/end time, g[5] = costs of each arc
# After running this piece of code, the source(edge) method does not work anymore
########################################################################################
# add a source and a sink node, and add edges between every node of the original graph and the source and sink nodes
realConnections = length(g[5]) # store the number of original edges
source = (num_vertices(g[1])+1)
sink = (num_vertices(g[1])+2)
add_vertex!(g[1], source)
add_vertex!(g[1], sink)
push!(g[3], 0)
push!(g[3], 0)
push!(g[4], 0)
push!(g[4], 0)
for i in vertices(g[1])
    if (i != source)
        add_edge!(g[1], source, i) # edge from source to i
        push!(g[5], 0)
    end
    if (i != sink)
        add_edge!(g[1], i, sink) # edge from i to sink
        push!(g[5], 0) # no cost/time for this edge
    end
end
######################################################################################
numEdges = num_edges(g[1]);
createModel(g[1], g[5], realConnections)
From Julia's Manual:
Julia will even let you redefine built-in constants and functions if
needed:
julia> pi
π = 3.1415926535897...
julia> pi = 3
Warning: imported binding for pi overwritten in module Main
3
julia> pi
3
julia> sqrt(100)
10.0
julia> sqrt = 4
Warning: imported binding for sqrt overwritten in module Main
4
However, this is obviously not recommended to avoid potential
confusion.
So reusing source as a variable name rebinds it away from its function definition, which is why source(edge) then fails with "expected Function, got Int64". Using a different variable name should preserve the Graphs.jl definition of source.
