ArangoDB copy Vertex and Edges to neighbors - graph

I'm trying to copy a vertex node and retain its relationships in ArangoDB. I'm getting an "access after data-modification" error (1579). It doesn't like it when I iterate over the source node's edges and insert an edge copy within the loop. This makes sense, but I'm struggling to figure out how to do what I want within a single transaction.
var query = arangojs.aqlQuery`
  let tmpNode = (FOR v IN vertices FILTER v._id == ${nodeId} RETURN v)[0]
  let nodeCopy = UNSET(tmpNode, '_id', '_key', '_rev')
  let nodeCopyId = (INSERT nodeCopy IN 'vertices' RETURN NEW._id)[0]
  FOR e IN GRAPH_EDGES('g', ${nodeId}, {'includeData': true, 'maxDepth': 1})
    let tmpEdge = UNSET(e, '_id', '_key', '_rev')
    let edgeCopy = MERGE(tmpEdge, {'_from': nodeCopyId})
    INSERT edgeCopy IN 'edges'
`;

This question is somewhat similar to 'In AQL how to re-parent a vertex', so let me explain this in a similar way.
One should use the ArangoDB 2.8 pattern matching traversals to solve this.
We will copy Alice to become Sally with similar relations:
let alice=DOCUMENT("persons/alice")
let newSally=UNSET(MERGE(alice, {_key: "sally", name: "Sally"}), '_id')
let r=(for v,e in 1..1 ANY alice GRAPH "knows_graph"
         LET me = UNSET(e, "_id", "_key", "_rev")
         LET newEdge = (me._to == "persons/alice") ?
           MERGE(me, {_to: "persons/sally"}) :
           MERGE(me, {_from: "persons/sally"})
         INSERT newEdge IN knows RETURN newEdge)
INSERT newSally IN persons RETURN newSally
We first load Alice and UNSET the properties ArangoDB should set on its own. We change the properties that have to be unique so that they no longer clash with Alice's, which gives us a Sally afterwards.
Then we open a subquery to traverse ANY first-level relations of Alice. In this subquery we want to copy the edges, e. Once more we need to UNSET the document attributes that have to be autogenerated by ArangoDB. We need to find out which side, _from or _to, pointed to Alice and relocate it to Sally.
The final insert of Sally has to be outside of the subquery; otherwise this statement would attempt to insert one Sally per edge we traverse. We can't insert Sally at the start of the query, as you already found out: no subsequent fetches are allowed after the insert.
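The edge-rewriting step may be easier to follow outside of AQL. The following is a minimal plain-Python sketch of the same logic on in-memory dicts, not ArangoDB code; the helper name copy_edges and the sample edge list are made up for illustration.

```python
def copy_edges(edges, old_id, new_id):
    """Return copies of all edges touching old_id, re-pointed to new_id."""
    system_attrs = {"_id", "_key", "_rev"}
    copies = []
    for edge in edges:
        if old_id not in (edge["_from"], edge["_to"]):
            continue  # not a first-level relation of the copied vertex
        # Drop the attributes the database generates on insert (UNSET in AQL).
        new_edge = {k: v for k, v in edge.items() if k not in system_attrs}
        # Relocate whichever side pointed at the old vertex (MERGE in AQL).
        if new_edge["_to"] == old_id:
            new_edge["_to"] = new_id
        else:
            new_edge["_from"] = new_id
        copies.append(new_edge)
    return copies


# Hypothetical sample data: only edges 1 and 2 touch Alice.
edges = [
    {"_key": "1", "_from": "persons/alice", "_to": "persons/bob"},
    {"_key": "2", "_from": "persons/charlie", "_to": "persons/alice"},
    {"_key": "3", "_from": "persons/bob", "_to": "persons/charlie"},
]
new_edges = copy_edges(edges, "persons/alice", "persons/sally")
print(new_edges)
```

The original edges are left untouched; only the copies point at the new vertex, which mirrors what the subquery inserts into the edge collection.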

Related

How to export edgelist from Grakn without using client APIs

I'm trying to export edges from grakn. I can do that with Python client like so:
edge_query = "match $c2c($c1, $c2) isa c2c; $c1 has id $id1; $c2 has id $id2;get $id1,$id2;"
with open(f"grakn.edgelist","w") as outfile:
    with GraknClient(uri="localhost:48555") as client:
        with client.session(keyspace=KEYSPACE) as session:
            with session.transaction().read() as read_transaction:
                answer_iterator = read_transaction.query(edge_query)
                for answer in tqdm(answer_iterator):
                    id1 = answer.get("id1")
                    id2 = answer.get("id2")
                    outfile.write(f"{id1.value()} {id2.value()} \n")
Edit: For each Relation, I want to export entities pairwise. The output can be a pair of Grakn IDs.
I can ignore the attributes of relation or entities.
Exporting to edges seems like a common task. Is there a better way (more elegant, faster, more efficient) to do it in Grakn?
This works as long as the relation type c2c always has two roleplayers. However, this will produce two edges for every $c1, $c2, which is probably not what you want.
Let's take a pair of Things, with ids V123 and V456. If they satisfy $c2c($c1, $c2) isa c2c; with $c1 = V123 and $c2 = V456 then they will also satisfy the same pattern as $c1 = V456 and $c2 = V123. Grakn will return all combinations of $c1, $c2 that satisfy your query, so you'll get two answers back for this one c2c relation.
Assuming this isn't what you want, if $c1 and $c2 play different roles in the relation c2c (likely implying there is direction to the edge) then try changing the query, adding the roles, to:
edge_query = "match $c2c(role1: $c1, role2: $c2) isa c2c; $c1 has id $id1; $c2 has id $id2; get $id1,$id2;"
If they both play the same role (implying undirected edges), then we need to do something different in our logic. Either store edges as a set of sets of ids to remove duplicates without much effort, or perhaps consider using the Python ConceptAPI, something like this:
relation_query = "match $rc2c isa c2c;get;"
with open(f"grakn.edgelist","w") as outfile:
    with GraknClient(uri="localhost:48555") as client:
        with client.session(keyspace=KEYSPACE) as session:
            with session.transaction().read() as read_transaction:
                answer_iterator = read_transaction.query(relation_query)
                for answer in answer_iterator:
                    relation_concept = answer.get("rc2c")
                    role_players_map = relation_concept.role_players_map()
                    role_player_ids = set()
                    for role, thing in role_players_map.items():
                        # Here you can do any logic regarding what things play which roles
                        for t in thing:
                            role_player_ids.add(t.id) # Note that you can retrieve a concept id from the concept object, you don't need to ask for it in the query
                    outfile.write(", ".join(role_player_ids) + "\n")
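For the first alternative, removing duplicates by storing the edges in a set, a minimal plain-Python sketch could look like this. It assumes the query answers have already been reduced to (id1, id2) string pairs; the function name and sample ids are illustrative and not part of the Grakn client. Sorted tuples are used as the set elements, which serves the same purpose as a set of sets and also copes with a thing that is related to itself.

```python
def undirected_edges(pairs):
    """Collapse (a, b) and (b, a) into a single undirected edge."""
    # Sorting each pair gives both orientations the same canonical form.
    unique = {tuple(sorted(pair)) for pair in pairs}
    # Sort the edges as well so the output file is deterministic.
    return sorted(unique)


# Hypothetical answers: the first two describe the same relation.
pairs = [("V123", "V456"), ("V456", "V123"), ("V123", "V789")]
edges = undirected_edges(pairs)
for id1, id2 in edges:
    print(id1, id2)
```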
Of course, I have no idea what you're doing with the resulting edgelist, but for completeness, the more Grakn-esque way would be to treat the Relation as a first-class citizen, since it represents a hyperedge in the Grakn knowledge model. In this case we would treat the Roles of the relation as edges, which means we aren't stuck when we have ternary or N-ary relations. We can do this by changing the query:
match $c2c($c) isa c2c; get;
Then in the result we get the id of the $c2c and of the $c.

Sort paths based on Edge properties

Sorting traversal paths based on Edge property and Dedup
Hello,
I have an in-memory graph and I want to sort paths based on an edge property, and also dedup paths that lead to the same destination.
E.g.
String NAME = "name";
String id = "id";
g.addV().property(id, 1).property(NAME, "u1").as("u1")
    .addV().property(id, 2).property(NAME, "u2").as("u2")
    .addV().property(id, 3).property(NAME, "u3").as("u3")
    .addV().property(id, 4).property(NAME, "u4").as("u4")
    .addE(rel).from("u2").to("u1").property("order", 2)
    .addE(rel).from("u3").to("u1").property("order", 1)
    .addE(rel).from("u4").to("u2").property("order", 3)
    .addE(rel).from("u4").to("u3").property("order", 4)
    .iterate();
What I'm trying to achieve is a traversal which gives me only one path i.e.
vertices = [path[u1, u3, u4]].
I tried the Gremlin below.
List<Path> maps = g.V()
    .has("id", 1)
    .repeat(in()
        .simplePath())
    .until(inE().count().is(0))
    .order().by(outE("rel").values("order"), Order.asc)
    .path().by("name")
    .toList();
However, sorting doesn't happen. It gives me two paths:
vertices = [path[u1, u2, u4], path[u1, u3, u4]]
But I'm looking for output as vertices = [path[u1, u3, u4]]
I'm new to Gremlin and have run out of options to try.
Can someone help?
g.V()
    .has("id", 1)
    .repeat(in("rel")
        .order().by(outE().values("order"), Order.asc)
        .simplePath())
    .until(inE().count().is(0))
    .dedup()
    .path()
    .by("name")
    .toList();
Using toList() will give you all the possible paths. In your case you ordered the answers but didn't take only the first one.
You should add a limit step:
...
.limit(1).toList()
Or you can use next() instead of toList().
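To make the intended result concrete, here is a plain-Python model of the sample graph rather than Gremlin. Comparing paths by their sequence of edge "order" values is an assumption about what sorting by the edge property should mean, and the names incoming, paths_to_roots and best_path_per_destination are made up for this sketch.

```python
# incoming[v] lists (source, order) for every edge source -> v in the sample graph.
incoming = {
    "u1": [("u2", 2), ("u3", 1)],
    "u2": [("u4", 3)],
    "u3": [("u4", 4)],
    "u4": [],
}


def paths_to_roots(start):
    """Yield (edge_orders, path) for each simple path ending at a vertex with no incoming edges."""
    stack = [((), [start])]
    while stack:
        orders, path = stack.pop()
        sources = incoming[path[-1]]
        if not sources:
            yield orders, path
            continue
        for source, order in sources:
            if source not in path:  # like simplePath(): never revisit a vertex
                stack.append((orders + (order,), path + [source]))


def best_path_per_destination(start):
    """Sort the paths by their edge orders and keep the first one per end vertex."""
    best = {}
    for orders, path in sorted(paths_to_roots(start)):
        best.setdefault(path[-1], path)  # later, higher-ordered paths are ignored
    return list(best.values())


result = best_path_per_destination("u1")
print(result)
```

Both paths end at u4, so only the one whose first edge has the lower order survives, which matches the single path asked for in the question.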

Longest cycle of key-value pairs in a dictionary

I've got a question regarding Python 3 which I can't wrap my head around.
Let's say I got the following dictionary.
{'Noah': 'Liam', 'Ethan': 'Peter', 'Liam': 'Olivia', 'Emma': 'Ethan', 'Peter': 'Emma', 'Olivia': 'Noah'}
I need to find the longest cycle of key-value pairs in this dictionary.
In pseudo-code this would be
for key in dictionary:
    find value in dictionary, make this key
    continue process until start key has been detected
In this example the longest cycle would be:
Noah --> Liam --> Olivia --> Noah (length of 3)
I have no clue how to go about this, even though I know exactly what I want to do. Would appreciate some help.
The keyword for your problem is recursion, or google "linked lists". Draw what you want to do on a sheet of paper and think of each step individually. With that small amount of data it is possible.
It's for sure not the prettiest solution but it works. And you get the point.
import copy

people = {'Noah': 'Liam', 'Ethan': 'Peter', 'Liam': 'Olivia', 'Emma': 'Ethan', 'Peter': 'Emma', 'Olivia': 'Noah'}


def walk_the_links(person, subtree, counter):
    print(person)
    if person in subtree:
        counter += 1
        next_person = subtree[person]
        subtree.pop(person)
        counter = walk_the_links(next_person, subtree, counter)
    return counter


for person in people:
    subtree = copy.deepcopy(people)
    counter = 0
    length_tree = walk_the_links(person, subtree, counter)
    print(length_tree)
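An iterative variant is also possible. The sketch below is one way to do it, not the only one: it follows each chain once, remembers where every name first appeared, and returns the members of the longest cycle instead of only its length. The function name longest_cycle is made up, and the sketch assumes that chains may also run into a dead end or enter a cycle partway through.

```python
def longest_cycle(mapping):
    """Return the keys of the longest cycle in a key -> value mapping."""
    finished = set()  # names whose chain has already been explored
    best = []
    for start in mapping:
        position = {}  # name -> index in the current walk
        path = []
        node = start
        while node in mapping and node not in finished and node not in position:
            position[node] = len(path)
            path.append(node)
            node = mapping[node]
        if node in position:
            # The walk came back to a name it had already seen: a cycle.
            cycle = path[position[node]:]
            if len(cycle) > len(best):
                best = cycle
        finished.update(path)
    return best


people = {'Noah': 'Liam', 'Ethan': 'Peter', 'Liam': 'Olivia',
          'Emma': 'Ethan', 'Peter': 'Emma', 'Olivia': 'Noah'}
cycle = longest_cycle(people)
print(" --> ".join(cycle + cycle[:1]), f"(length of {len(cycle)})")
```

Because every name is walked at most once, the work grows linearly with the size of the dictionary, and no deep copy or recursion is needed. When several cycles share the maximum length, the first one encountered is returned.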

Select paths from traversal and filter on the destination vertex (OrientDB)

I am new to graph databases and OrientDB, so I appreciate your patience.
I have the following SQL query to produce an expanded set of results for the shortest path between two vertices (I am using the GratefulDeadConcerts database):
select expand(sp) from (select shortestPath(#9:2,#9:15,'BOTH') as sp)
For whatever reason, using expand without aliasing produces no results, but that isn't really an issue.
What I want is not the shortest path, but a collection of potential paths and branches.
I have tried playing with traversedVertex:
SELECT traversedVertex(-1) FROM ( TRAVERSE out() FROM #9:2 WHILE $depth <= 10 )
But I don't know how to set the destination, or (honestly) how to interpret the results I get.
EDIT
If there are multiple ways to get from A to B, I want each of those paths returned as a set, something like:
{
paths: [
[#9:2, #4:16, #8:7, #9:15],
[#9:2, #4:2, #16:5, #11:3, #9:15],
[#9:2, #4:4, #11:6, #9:15]
]
}
Thank you for your help.
First, $path is the string representation of the current path.
Second, you can filter on the destination in the WHERE clause of the outer query. Try this:
SELECT
  $path
FROM
  ( TRAVERSE
      out()
    FROM
      #9:2
    WHILE
      $depth <= 10 )
WHERE
  @rid = #9:15
I get the following output: [output not preserved]
Is this what you are looking for?
If I don't add the WHERE clause, I get this output: [output not preserved]
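Outside the database, collecting every path between two vertices amounts to a depth-limited depth-first search. The following plain-Python sketch illustrates the idea on a small hypothetical adjacency dict; the vertex names, the function all_paths and the max_depth parameter are assumptions for illustration and do not use the OrientDB API.

```python
def all_paths(graph, source, target, max_depth=10):
    """Return every simple path from source to target using at most max_depth edges."""
    found = []

    def visit(path):
        node = path[-1]
        if node == target:
            found.append(list(path))  # store a copy, path keeps changing
            return
        if len(path) - 1 >= max_depth:
            return  # depth limit reached, like WHILE $depth <= 10
        for neighbour in graph.get(node, []):
            if neighbour not in path:  # avoid walking in circles
                path.append(neighbour)
                visit(path)
                path.pop()

    visit([source])
    return found


# Hypothetical graph: three different routes lead from A to D.
graph = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["B", "D"],
    "D": [],
}
paths = all_paths(graph, "A", "D")
print({"paths": paths})
```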

Users who mentioned each other in Gremlin

We have a smaller example twitter database:
user -[TWEETED]-> tweet -[MENTIONED]-> user2
and I would like to find out how to write a query in Gremlin, that shows who were the users who mentioned each other. I have already read the docs but I don't know how to do it.
Given this sample data, which assumes that marko and stephen mention each other and that marko and daniel mention each other:
g = new TinkerGraph()
vMarko = g.addVertex("marko", [type:"user"])
vStephen = g.addVertex("stephen", [type:"user"])
vDaniel = g.addVertex("daniel", [type:"user"])
vTweetm1s = g.addVertex("m1s", [type:"tweet"])
vTweetm2d = g.addVertex("m2d", [type:"tweet"])
vTweets1m = g.addVertex("s1m", [type:"tweet"])
vTweetd1m = g.addVertex("d1m", [type:"tweet"])
vMarko.addEdge("tweeted",vTweetm1s)
vMarko.addEdge("tweeted",vTweetm2d)
vStephen.addEdge("tweeted",vTweets1m)
vDaniel.addEdge("tweeted",vTweetd1m)
vTweetm1s.addEdge("mentioned", vStephen)
vTweetm2d.addEdge("mentioned", vDaniel)
vTweets1m.addEdge("mentioned", vMarko)
vTweetd1m.addEdge("mentioned", vMarko)
you could handle it with the following:
gremlin> g.V.has("type","user").as('s')
.out("tweeted").out("mentioned").as('m').out("tweeted")
.out("mentioned").as('e').select.filter{it[0]==it[2]}
==>[s:v[daniel], m:v[marko], e:v[daniel]]
==>[s:v[stephen], m:v[marko], e:v[stephen]]
==>[s:v[marko], m:v[stephen], e:v[marko]]
==>[s:v[marko], m:v[daniel], e:v[marko]]
This approach uses select to extract the data from the labelled steps, then a final filter to find those where "s" (the vertex in the first position) is equal to "e" (the vertex in the final position). This of course means that a cycle pattern is detected where one user mentioned another and the other mentioned that person back at some point.
If you follow that much then we can clean up the result a little bit so as to get the unique set of pairs:
gremlin> g.V.has("type","user").as('s')
.out("tweeted").out("mentioned").as('m')
.out("tweeted").out("mentioned").as('e')
.select.filter{it[0]==it[2]}
.transform{[it[0].id,it[1].id] as Set}.toList() as Set
==>[daniel, marko]
==>[stephen, marko]
By adding a transform to the previous code, we can convert the result to "id" (the user's name in this case) and flip everything to Set so as to get unique pairs of results.
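The same pattern can be modelled in plain Python, which may help when checking the expected result by hand. This is only a sketch over dicts that mirror the sample data, not Gremlin; the helper names mentions_of and mutual_mentions are made up, and unlike the traversal it skips users who mention themselves.

```python
# The sample graph as two lookup tables: user -> tweets, tweet -> mentioned users.
tweeted = {"marko": ["m1s", "m2d"], "stephen": ["s1m"], "daniel": ["d1m"]}
mentioned = {"m1s": ["stephen"], "m2d": ["daniel"], "s1m": ["marko"], "d1m": ["marko"]}


def mentions_of(user):
    """Users reached via user -[tweeted]-> tweet -[mentioned]-> other."""
    return {other for tweet in tweeted.get(user, []) for other in mentioned.get(tweet, [])}


def mutual_mentions():
    """Unordered pairs of users who mentioned each other."""
    pairs = set()
    for user in tweeted:
        for other in mentions_of(user):
            # Assumption: self-mentions are not interesting, so skip them.
            if other != user and user in mentions_of(other):
                pairs.add(frozenset((user, other)))  # frozenset ignores direction
    return pairs


pairs = mutual_mentions()
print(sorted(sorted(pair) for pair in pairs))
```

Using frozenset for each pair plays the same role as the "as Set" conversion in the traversal: the two directions of a mutual mention collapse into one entry.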
