Can't traverse to vertices with unique properties in Gremlin - graph

I'm looking for paths to get from one vertex to the other, avoiding vertices with properties already matched on that path.
Consider this example:
Graph graph = TinkerGraph.open();
GraphTraversalSource t = graph.traversal();
Vertex A = t.addV().property("age", 19).next();
Vertex B = t.addV().property("age", 21).next();
Vertex C = t.addV().property("age", 20).next();
Vertex D = t.addV().property("age", 21).next();
Vertex E = t.addV().property("age", 22).next();
t.V(A).addE("knows").to(B).iterate();
t.V(B).addE("knows").to(C).iterate();
t.V(C).addE("knows").to(D).iterate();
t.V(D).addE("knows").to(E).iterate();
t.V(C).addE("knows").to(E).iterate();
List<Path> paths = t.V(A)
.repeat(
out()
).times(5).emit()
.has("age", 22).path().toList();
Assert.assertEquals(1, paths.size());
I'm looking for ways to go from A to E. There are two paths:
A->B->C->D->E
A->B->C->E
What I'm looking for is just the second one, because in the first path, B and D has the same age.
I tried to filter using as and by, but I failed to scale that to whole path. For instance, I can check the vertices not to match the first vertex' property, by doing the following:
List<Path> paths = t.V(A).as("first")
.repeat(
out()
.where(P.neq("first")).by("age")
).times(5).emit()
.has("age", 22).path().toList();
Assert.assertEquals(1, paths.size());
But as you can imagine, it doesn't filter the collisions in the middle of the path. I feel like there should be an easier way to do this that I'm missing. Is there a statement like as(), but instead of replacing the previous assignment, it keeps them in an array or something? How can I achieve this?
Thanks

You need to compare the current age against all previously seen ages. If there's any match, let the traverser die:
t.V(A).as("a").
repeat(filter(loops().is(lt(5))).out().
not(values("age").as("current"). /* current age */
select(all, "a").unfold(). /* previous ages */
values("age").
where(eq("current"))).as("a")). /* not() negates the match */
until(has("age", 22)).
path()

Related

How find shortest path in transport connection by using Cypher query (neo4j) over GTFS data?

I am new in neo4j, I created a graph following this steps, based on a data model from GTFS. I would like to find all the shortest indirect routes in the graph (with transfers).
Data model of graph database contains 4 entities: Route, Trip, Stop, Stoptime. Here is a screenshot of db.scheme().
Based on query which wrote Bruggen, I modified it for my use:
MATCH
(from:Stop {code:'VBR'})--(st_from:Stoptime),
(to:Stop {code:'VIR'})--(st_to:Stoptime),
p1=((st_from)-[:PRECEDES*]->(st_midway_arr:Stoptime)),
(st_midway_arr)--(midway:Stop),
(midway)--(st_midway_dep:Stoptime),
p2=((st_midway_dep)-[:PRECEDES*]->(st_to))
WHERE
st_from.departure_time > '00:00'
AND st_from.departure_time < '23:00'
AND st_midway_arr.arrival_time > st_from.departure_time
AND st_midway_dep.departure_time > st_midway_arr.arrival_time
AND st_to.arrival_time > st_midway_dep.departure_time
RETURN
from,st_from,to,st_to,p1,p2,midway
order by (st_to.arrival_time_int-st_from.departure_time_int) ASC
limit 1;
This query is not using the shortest path, and it takes in average 30s to find a path, but the output of the query is good.
So I tried to write another query, with method allshortestpaths, it really fast (0,3s). But it returns me also trips which run in a different direction (VIR -> VBR)... another problem is the timing od that connection.
Could you help me, how to access to the transfer node (Station) when I am using allshortestpath method? I want to write a condition for timing and stop_sequence to be sure that's the right direction.
match (from:Stop {code:'VBR'}),(to:Stop {code:'VIR'})
with from,to
match p = allshortestpaths((from)-[*]-(to))
where NONE (x in relationships(p) where type(x)="OPERATES")
return p
limit 10;
match (from:Stop {code:'VBR'}),(to:Stop {code:'VIR'})
with from,to
match p = allshortestpaths((from)-[*]->(to)) // here you needed you give the direction to make sure paths are from 'VBR' to 'VIR'
where NONE (x in relationships(p) where type(x)="OPERATES")
return p
limit 10;
Next , if you want to see the nodes in the path , then you can use the nodes(p)
match (from:Stop {code:'VBR'}),(to:Stop {code:'VIR'})
with from,to
match p = allshortestpaths((from)-[*]->(to))
where NONE (x in relationships(p) where type(x)="OPERATES")
AND ALL(node in nodes WHERE node = from OR node = to OR YOUR CONDTION ON TRANSFER NODE)
limit 10

Sort paths based on Edge properties

Sorting traversal paths based on Edge property and Dedup
Hello,
I'm having a in memory graph and I want to sort paths based on Edge property and also dedup where paths leading to same destination.
E.g.
String NAME = "name";
String id = "id";
g.addV().property(id, 1).property(NAME, "u1").as("u1")
.addV().property(id, 2).property(NAME, "u2").as("u2")
.addV().property(id, 3).property(NAME, "u3").as("u3")
.addV().property(id, 4).property(NAME, "u4").as("u4")
.addE(rel).from("u2").to("u1").property("order", 2)
.addE(rel).from("u3").to("u1").property("order", 1)
.addE(rel).from("u4").to("u2").property("order", 3)
.addE(rel).from("u4").to("u3").property("order", 4)
.iterate();
What I'm trying to achieve is a traversal which gives me only one path i.e.
vertices = [path[u1, u3, u4]].
I tried using below gremlin.
List<Path> maps = g.V()
.has("id", 1)
.repeat(in()
.simplePath())
.until(inE().count().is(0))
.order().by(outE("rel").values("order"),Order.asc)
.path().by("name")
.toList();
However sorting doesn't happen. It gives me two paths :
vertices = [path[u1, u2, u4], path[u1, u3, u4]]
But I'm looking for output as vertices = [path[u1, u3, u4]]
I'm new to gremlin and ran out of options to try.
can someone help ?
g.V()
.has("id", 1)
.repeat(in("rel") .order() .by(outE().values("order"), Order.asc) .simplePath() )
.until(inE().count().is(0))
.dedup()
.path()
.by("name")
.toList() ;
Using toList will give you all the passible traversals. In your case you did order the answers but didn't take only the first one.
You should add limit step:
...
.limit(1).toList()
Or you can use next() instead of toList()

Filter vertices on several properties - Julia

I am working on julia with the Metagraphs.jl library.
In order to conduct an optimization problem, I would like to get the set/list of edges in the graph that point to a special set of vertices having 2 particular properties in common.
My first guess was to first get the set/list of vertices. But I am facing a first issue which is that the filter_vertices function doesn't seem to accept to apply a filter on more than one property.
Here is below an example of what I would like to do:
g = DiGraph(5)
mg = MetaDiGraph(g, 1.0)
add_vertex!(mg)
add_edge!(mg,1,2)
add_edge!(mg,1,3)
add_edge!(mg,1,4)
add_edge!(mg,2,5)
add_edge!(mg,3,5)
add_edge!(mg,5,6)
add_edge!(mg,4,6)
set_props!(mg,3,Dict(:prop1=>1,:prop2=>2))
set_props!(mg,1,Dict(:prop1=>1,:prop2=>0))
set_props!(mg,2,Dict(:prop1=>1,:prop2=>0))
set_props!(mg,4,Dict(:prop1=>0,:prop2=>2))
set_props!(mg,5,Dict(:prop1=>0,:prop2=>2))
set_props!(mg,6,Dict(:prop1=>0,:prop2=>0))
col=collect(filter_vertices(mg,:prop1,1,:prop2,2))
And I want col to find vertex 3 and no others.
But the filter_vertices would only admit one property at a time and then it makes it more costly to do a loop with 2 filters and then try to compare in order to sort a list with the vertices that have both properties.
Considering the size of my graph I would like to avoid defining this set with multiple and costly loops. Would any one of you have an idea of how to solve this issue in an easy and soft way?
I ended up making this to answer my own question:
fil3=Array{Int64,1}()
fil1=filter_vertices(mg,:prop1,1)
for f in fil1
if get_prop(mg,f,:prop2)==2
push!(fil3,f)
end
end
println(fil3)
But tell me if you get anything more interesting
Thanks for your help!
Please provide a minimal working example in a way we can simply copy and paste, and start right away. Please also indicate where the problem occurs in the code. Below is an example for your scenario:
Pkg.add("MetaGraphs")
using LightGraphs, MetaGraphs
g = DiGraph(5)
mg = MetaDiGraph(g, 1.0)
add_vertex!(mg)
add_edge!(mg,1,2)
add_edge!(mg,1,3)
add_edge!(mg,1,4)
add_edge!(mg,2,5)
add_edge!(mg,3,5)
add_edge!(mg,5,6)
add_edge!(mg,4,6)
set_props!(mg,3,Dict(:prop1=>1,:prop2=>2))
set_props!(mg,1,Dict(:prop1=>1,:prop2=>0))
set_props!(mg,2,Dict(:prop1=>1,:prop2=>0))
set_props!(mg,4,Dict(:prop1=>0,:prop2=>2))
set_props!(mg,5,Dict(:prop1=>0,:prop2=>2))
set_props!(mg,6,Dict(:prop1=>0,:prop2=>0))
function my_vertex_filter(g::AbstractMetaGraph, v::Integer, prop1, prop2)
return has_prop(g, v, :prop1) && get_prop(g, v, :prop1) == prop1 &&
has_prop(g, v, :prop2) && get_prop(g, v, :prop2) == prop2
end
prop1 = 1
prop2 = 2
col = collect(filter_vertices(mg, (g,v)->my_vertex_filter(g,v,prop1,prop2)))
# returns Int[3]
Please check ?filter_vertices --- it gives you a hint on what/how to write to define your custom filter.
EDIT. For filtering the edges, you can have a look at ?filter_edges to see what you need to achieve the edge filtering. Append the below code excerpt to the solution above to get your results:
function my_edge_filter(g, e, prop1, prop2)
v = dst(e) # get the edge's destination vertex
return my_vertex_filter(g, v, prop1, prop2)
end
myedges = collect(filter_edges(mg, (g,e)->my_edge_filter(g,e,prop1,prop2)))
# returns [Edge 1 => 3]
I found this solution:
function filter_function1(g,prop1,prop2)
fil1=filter_vertices(g,:prop1,prop1)
fil2=filter_vertices(g,:prop2,prop2)
filter=intersect(fil1,fil2)
return filter
end
This seems to work and is quite easy to implement.
Just I don't know if the filter_vertices function is taking a lot of computational power.
Otherwise a simple loop like this seems to also work:
function filter_function2(g,prop1,prop2)
filter=Set{Int64}()
fil1=filter_vertices(g,:prop1,prop1)
for f in fil1
if get_prop(g,f,:prop2)==prop2
push!(filter,f)
end
end
return filter
end
I am open to any other answers if you have some more elegant ones.

Users who mentioned each other in Gremlin

We have a smaller example twitter database:
user -[TWEETED]-> tweet -[MENTIONED]-> user2
and I would like to find out how to write a query in Gremlin, that shows who were the users who mentioned each other. I have already read the docs but I don't know how to do it.
Given this sample data that assume marko and stephen mention each other and marko and daniel mention each other:
g = new TinkerGraph()
vMarko = g.addVertex("marko", [type:"user"])
vStephen = g.addVertex("stephen", [type:"user"])
vDaniel = g.addVertex("daniel", [type:"user"])
vTweetm1s = g.addVertex("m1s", [type:"tweet"])
vTweetm2d = g.addVertex("m2d", [type:"tweet"])
vTweets1m = g.addVertex("s1m", [type:"tweet"])
vTweetd1m = g.addVertex("d1m", [type:"tweet"])
vMarko.addEdge("tweeted",vTweetm1s)
vMarko.addEdge("tweeted",vTweetm2d)
vStephen.addEdge("tweeted",vTweets1m)
vDaniel.addEdge("tweeted",vTweetd1m)
vTweetm1s.addEdge("mentioned", vStephen)
vTweetm2d.addEdge("mentioned", vDaniel)
vTweets1m.addEdge("mentioned", vMarko)
vTweetd1m.addEdge("mentioned", vMarko)
you could handle it with the following:
gremlin> g.V.has("type","user").as('s')
.out("tweeted").out("mentioned").as('m').out("tweeted")
.out("mentioned").as('e').select.filter{it[0]==it[2]}
==>[s:v[daniel], m:v[marko], e:v[daniel]]
==>[s:v[stephen], m:v[marko], e:v[stephen]]
==>[s:v[marko], m:v[stephen], e:v[marko]]
==>[s:v[marko], m:v[daniel], e:v[marko]]
This approach uses select to extract the data from the labelled steps then a final filter to find those where "s" (vertex in the first position) is equal to the "e" (vertex in the final position). This of course means that there is cycle pattern detected where the one user mentioned another and the other mentioned that person back at some point.
If you follow that much then we can clean up the result a little bit so as to get the unique set of pairs:
gremlin> g.V.has("type","user").as('s')
.out("tweeted").out("mentioned").as('m')
.out("tweeted").out("mentioned").as('e')
.select.filter{it[0]==it[2]}
.transform{[it[0].id,it[1].id] as Set}.toList() as Set
==>[daniel, marko]
==>[stephen, marko]
By adding a transform to the previous code, we can convert the result to "id" (the user's name in this case) and flip everything to Set so as to get unique pairs of results.

Graph: node dependency

I have a graph like this:
for example I want to know the dependency of node 8 which are : 1,2,3,5 . can some body give me some code or may be pseudo code to solve this problem?
Thanks
Dependency structure is a partially ordered set. I had similar case which I covered with 2 methods (in python):
nodes_to_update( to_proc ) parameter is a set of nodes to start with (e.g. set([8])). Returns two set of nodes: set of all nodes on which given nodes depend and set of leaf nodes. Idea is to recursively visit all nodes on which depend visited nodes.
sequence_to_update( to_proc ) parameter is like above. Return (ordered) list of nodes so that nodes in list depend only on nodes before in list. It is done by adding leaf node to the ordered list, and updating set of nodes to processed (all and leaf nodes).
Dependent nodes are get with method down_nodes(o_id) and nodes that depends on are get with up_nodes(o_id).
def nodes_to_update(self, to_proc):
all_to_update, first_to_update = set(), set()
while to_proc:
n = to_proc.pop()
all_to_update.add(n)
ds = self.down_nodes(n) # nodes on which n depends
if not ds:
first_to_update.add(n)
else:
to_proc.update(ds)
return all_to_update, first_to_update
def sequence_to_update(self, to_proc):
all_to_update, first_to_update = self.nodes_to_update(to_proc)
update_in_order = []
while first_to_update:
n_id = first_to_update.pop()
all_to_update.discard(n_id)
update_in_order.append(n_id)
# nodes which depend on n_id (intersection of upper nodes and all nodes to update)
for up_id in (self.up_nodes(n_id) & all_to_update):
if all(d not in all_to_update for d in self.down_nodes(up_id)):
first_to_update.add(up_id)
return update_in_order

Resources