Cosmos DB Gremlin "project" not producing expected results - azure-cosmosdb

I have the following traversal that shows that the selected Vertex has 14 Edges labeled "follows".
g.V().has('user','email','me#email.com').project('name','email','follow-edges').by('name').by('email').by(outE().hasLabel('follows').count())
Which produces the following results:
[{
"name": "David",
"email": "me#email.com",
"follow-edges": 14}]
But when I want to project the "follows" Edge's id and inV ids, I'm only getting one result item back.
g.V().has('user','email','david#me.com').project('name','email','follow-edges').by('name').by('email').by(outE().hasLabel('follows').project('edge-id', 'inV-id').by('id').by('inV'))
Results:
[{
"name": "David",
"email": "me#email.com",
"follow-edges": {
"edge-id": "ccc06183-f4ca-410d-9c3c-9d2dfd93f5f0",
"inV-id": "f4703a07-f42d-46f9-86be-7f5440f07f12"
}}]
I was expecting to get a list of all the "follows" edges for the selected vertex, similar to the answer given by Stephen Mallette at this link.
Does anyone know why this is not working?

You need to reduce the stream of objects in your anonymous traversal in by() - note my addition of fold():
g.V().has('user','email','david#me.com').
project('name','email','followedges').
by('name').
by('email').
by(outE().hasLabel('follows').
project('edge-id', 'inV-id').
by('id').
by('inV').fold())
I assume that "inV" is an actual property and you're not trying to get the "in vertex" of the edge. If you are trying to get the "in vertex" then you need by(inV().id()).
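To see why fold() matters, here is a plain-Python model of the behaviour. It is not Gremlin or Cosmos DB code. It assumes that a by() modulator given a traversal uses only the first object that traversal emits, so a stream of edge maps yields one map while a folded stream yields the whole list. The helper names by_first and by_folded and the three sample edges are made up for illustration.

```python
from typing import Any, Dict, Iterator, List

# Hypothetical sample data: three "follows" edges leaving one user vertex.
FOLLOWS_EDGES: List[Dict[str, str]] = [
    {"edge-id": "e1", "inV-id": "v10"},
    {"edge-id": "e2", "inV-id": "v11"},
    {"edge-id": "e3", "inV-id": "v12"},
]


def out_edges() -> Iterator[Dict[str, str]]:
    """Stand-in for outE().hasLabel('follows').project(...): a stream of maps."""
    return iter(FOLLOWS_EDGES)


def by_first(stream: Iterator[Any]) -> Any:
    """Model of by(<traversal>): only the first emitted object is used."""
    return next(stream)


def by_folded(stream: Iterator[Any]) -> Any:
    """Model of by(<traversal>.fold()): fold() emits a single list,
    so the first emitted object is the complete list."""
    folded_stream = iter([list(stream)])
    return by_first(folded_stream)


without_fold = {"name": "David", "follow-edges": by_first(out_edges())}
with_fold = {"name": "David", "follow-edges": by_folded(out_edges())}

print(without_fold["follow-edges"])    # one map only
print(len(with_fold["follow-edges"]))  # every edge
```

The same reasoning applies to any by() that can produce more than one result: reduce it with fold(), count(), or a similar step before it is consumed.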

Related

Getting values of a path with two different edges

I'm looking for a pattern.
I'm working on this query:
g.V().has('objid','7615388501660').as('location')
.in('enhabits').as('population')
.out('isInFaction').as('faction')
.in('isInFaction').out('isOfSpecies').as('species')
.path().by('name')
and I get this back:
"labels": [
["location"],
["population"],
["faction"],
[],
["species"]
],
"objects": [
"Plara",
"Se Bemon",
"Se",
"Se Bemon",
"Wan"
]
but there is an extra step [] that makes me think this is the wrong approach. It also traverses through all of the populations in that faction, not just the one I want. What I want is each record of the location, population, faction, species in a list. Or, put another way, for each population in that location, I want that population, its faction, and its species.
You can often flatten these backtracking-style use cases by introducing a union step into the query. Something along the lines of:
g.V().has('objid','7615388501660').as('location').
in('enhabits').as('population').
local(
union(
out('isInFaction').as('faction'),
out('isOfSpecies').as('species')).
fold()).
path().
by(unfold().values('name').fold())
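The shape of the result being asked for can be modelled in plain Python. This is only a sketch of the intended output, not Gremlin: the edge list reuses the names from the question's output, and the second population ("Other Pop" and its neighbours) is hypothetical, added to show that branching from each population avoids wandering to other members of the same faction.

```python
from typing import Dict, List, Tuple

# (source vertex, edge label, target vertex)
EDGES: List[Tuple[str, str, str]] = [
    ("Se Bemon", "enhabits", "Plara"),
    ("Se Bemon", "isInFaction", "Se"),
    ("Se Bemon", "isOfSpecies", "Wan"),
    # Hypothetical: same faction, but living at a different location.
    ("Other Pop", "enhabits", "Elsewhere"),
    ("Other Pop", "isInFaction", "Se"),
    ("Other Pop", "isOfSpecies", "Other Species"),
]


def out(vertex: str, label: str) -> List[str]:
    """Targets of outgoing edges with the given label."""
    return [t for s, l, t in EDGES if s == vertex and l == label]


def in_(vertex: str, label: str) -> List[str]:
    """Sources of incoming edges with the given label."""
    return [s for s, l, t in EDGES if t == vertex and l == label]


def records(location: str) -> List[Dict[str, object]]:
    """One record per population at the location. Faction and species are
    both looked up from the population itself, like the two union branches."""
    return [
        {
            "location": location,
            "population": population,
            "faction": out(population, "isInFaction"),
            "species": out(population, "isOfSpecies"),
        }
        for population in in_(location, "enhabits")
    ]


result = records("Plara")
print(result)
```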

Gremlin traversal.Output all Edge details and also in/out Vertex id's

I'm having trouble constructing the Gremlin query to give me all of the Edge details (label, properties) and also the IDs of the adjoining inV and outV vertices (I don't need any more info from the linked vertices, just the IDs).
All I have is the Edge ID as a starting point.
So my Edge is as follows:
Label: "CONTAINS"
id: c6b4f3cb-f96e-cc97-dedb-e405771cb4f2
keys:
key="ekey1", value="e1"
key="ekey2", value="e2"
inV has id 50b4f3cb-f907-c31c-6284-1a3463fd72b9
outV has id 7cb4f3cb-d9a2-1398-61d7-9339be34833b
What I want is a single query that will return me something like -
"CONTAINS", "c6b4f3cb-f96e-cc97-dedb-e405771cb4f2", {ekey1=e1, ekey2=e2, ...}, "50b4f3cb-f907-c31c-6284-1a3463fd72b9", "7cb4f3cb-d9a2-1398-61d7-9339be34833b"
I can get the info in separate queries i.e.
g.E("c6b4f3cb-f96e-cc97-dedb-e405771cb4f2").bothV()
==>v[50b4f3cb-f907-c31c-6284-1a3463fd72b9]
==>v[7cb4f3cb-d9a2-1398-61d7-9339be34833b]
g.E("c6b4f3cb-f96e-cc97-dedb-e405771cb4f2").valueMap()
==>{ekey1=e1, ekey2=e2}
g.E("c6b4f3cb-f96e-cc97-dedb-e405771cb4f2").label()
==>CONTAINS
But I can't for the life of me work out how to combine these into one.
You could use project() to get what you're looking for:
g.E("c6b4f3cb-f96e-cc97-dedb-e405771cb4f2").
project('ekey1', 'inV', 'outV', 'label').
by('ekey1').
by(inV().id()).
by(outV().id()).
by(label)
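The traversal above projects only ekey1. As a rough plain-Python model of what a fuller projection would return (label, id, all properties, and both vertex ids), consider the sketch below. The dictionary layout and the function name project_edge are assumptions made for illustration; in Gremlin the corresponding modulators would presumably be by(label), by(id), by(valueMap()), by(inV().id()) and by(outV().id()), which I have not verified against any particular graph provider.

```python
from typing import Any, Dict

# The edge from the question, written as a plain dictionary.
edge: Dict[str, Any] = {
    "id": "c6b4f3cb-f96e-cc97-dedb-e405771cb4f2",
    "label": "CONTAINS",
    "properties": {"ekey1": "e1", "ekey2": "e2"},
    "inV": "50b4f3cb-f907-c31c-6284-1a3463fd72b9",
    "outV": "7cb4f3cb-d9a2-1398-61d7-9339be34833b",
}


def project_edge(e: Dict[str, Any]) -> Dict[str, Any]:
    """Model of project(...) with one by() per key: each key is filled
    from a different aspect of the same edge, giving a single result map."""
    return {
        "label": e["label"],
        "id": e["id"],
        "properties": dict(e["properties"]),
        "inV": e["inV"],
        "outV": e["outV"],
    }


row = project_edge(edge)
print(row)
```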

Arangodb traversal to include head vertex

I'm using GRAPH_TRAVERSAL to get the path from a list of nodes to the head of the tree. This works perfectly except when the example happens to be the head of the tree. In this case, the edgeCollection doesn't have an inbound entry for this object so it doesn't appear in the results.
FOR v IN GRAPH_TRAVERSAL('gdp2',
[{_id:'pmsite/14419285155'}],
'inbound',{edgeCollection:'child'})
RETURN v
The result is an empty list: []
Is there a way I can guarantee that the starting node is on the list? It would be a pain to go through the list of examples to segregate which ones are at the head of a tree.
The problem is within the query itself. It contains a subtle error which is hard to spot:
[{_id:pmsite/14419285155}],
This is missing the quotes around pmsite/14419285155.
What this query really does is divide (probably the count of) the collection pmsite by the id 14419285155 and put the result in as {_id: divcount}.
If you add the missing quotes, the query should do exactly what you want there. (edit: quotes were present in the original query, fixed the post.)
Hint: db._explain() gives information about that.
Trying to reproduce, using the knows sample graph:
arangosh> var examples = require("org/arangodb/graph-examples/example-graph.js");
arangosh> var g = examples.loadGraph("knows_graph");
arangosh> db._query("FOR e IN GRAPH_TRAVERSAL('knows_graph', [{_id: 'persons/eve'}], 'inbound', {edgeCollection: 'knows'}) return e").toArray()
[
[
{
"vertex" : {
"_id" : "persons/eve",
"_rev" : "1405497100114",
"_key" : "eve",
"name" : "Eve"
}
}
]
]
However what creates a somewhat similar behaviour is to use a collection not part of the graph definition:
arangosh> db._create("othercol")
arangosh> db.othercol.save({_key: "1" })
arangosh> db._query("FOR e IN GRAPH_TRAVERSAL('knows_graph', [{_id: 'othercol/1'}], 'inbound', {edgeCollection: 'knows'}) return e").toArray()
[ ]
As pointed out in the comments, edge relations have a direction. If you want to have edges pointing in both directions, you need to create a second relation in the other direction. Edges not fulfilling the edge definitions may be ignored.
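The two cases above (a head vertex that is returned on its own, and a vertex from an unrelated collection that yields nothing) can be modelled with a small plain-Python traversal. This is a sketch of the observed behaviour, not ArangoDB code: the edge list is my recollection of the knows_graph example data and should be treated as an assumption, and the function returns visited vertices rather than the path objects that GRAPH_TRAVERSAL produces.

```python
from collections import deque
from typing import List, Tuple

# Vertex collections that are part of the (modelled) graph definition.
VERTEX_COLLECTIONS = {"persons"}

# Assumed "knows" edges as (from, to) pairs.
KNOWS: List[Tuple[str, str]] = [
    ("persons/alice", "persons/bob"),
    ("persons/bob", "persons/charlie"),
    ("persons/bob", "persons/dave"),
    ("persons/eve", "persons/alice"),
    ("persons/eve", "persons/bob"),
]


def inbound_traversal(start: str) -> List[str]:
    """Breadth-first walk against edge direction, starting vertex included.
    A start vertex outside the graph's collections yields an empty result."""
    collection = start.split("/")[0]
    if collection not in VERTEX_COLLECTIONS:
        return []
    visited = [start]
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for source, target in KNOWS:
            if target == current and source not in visited:
                visited.append(source)
                queue.append(source)
    return visited


print(inbound_traversal("persons/eve"))  # head of the tree: only itself
print(inbound_traversal("othercol/1"))   # not part of the graph: empty
```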

converting Freebase MQL to TinkerPop3 gremlin?

Even though Freebase was deprecated in Jun. 2015, I was impressed by Freebase's MQL. It is intuitive, concise, declarative, and easy to understand/write.
These days I'm learning about TinkerPop3 and gremlin. I think gremlin has many good features. I wonder whether I could convert Freebase MQL to TinkerPop3 gremlin.
Let's say, I have TinkerPop3 sample data "The Crew" and following MQL:
[{
"type": "person",
"name": null,
"develops": {
"type": "software",
"release_date": null, // release_date is not in the crew data.
// Just added for test
"name": null,
"sort": "-name", // descending sort
"limit": 2 // print only two software
},
"uses": {
"type": "software",
"release_date": null,
"name": null,
"sort": "name", // ascending sort
"limit": 2
}
}]
The MQL above means "find each person", and for each person "print two of the software projects they develop and two they use". Please keep in mind that this MQL is just an example for testing.
I've tried to convert the MQL to a single Gremlin traversal, but I couldn't. So I'm asking: is it possible? If so, how? If not, why not? (If it is possible, it would be better if the generated Gremlin were efficient in terms of optimization and execution.)
If a single Gremlin traversal is impossible, can two or more traversals be combined to produce the same output as the MQL without performance loss?
Thanks in advance.
The fastest query to solve this problem should be the following:
gremlin> g.V().hasLabel("person").as("person").
gremlin> map(out("develops").order().by("name", decr).limit(2).fold()).as("develops").select("person").
gremlin> map(out("uses").order().by("name", incr).limit(2).fold()).as("uses").
gremlin> select("person","develops","uses")
==>[person:v[1], develops:[v[11], v[10]], uses:[v[10], v[11]]]
==>[person:v[7], develops:[v[11], v[10]], uses:[v[10], v[11]]]
==>[person:v[8], develops:[v[10]], uses:[v[10], v[11]]]
==>[person:v[9], develops:[], uses:[v[10], v[11]]]
However, using the match() step, you can probably enhance the readability (although it contains the same steps):
g.V().hasLabel("person").match(
__.as("person").out("develops").order().by("name", decr).limit(2).fold().as("develops"),
__.as("person").out("uses").order().by("name", incr).limit(2).fold().as("uses")).
select("person","develops","uses")
UPDATE
Since you don't want to see me (v[9]) in the result set, you can add a simple filter condition:
g.V().hasLabel("person").as("person").
filter(outE("develops").and().outE("uses")).
map(out("develops").order().by("name", decr).limit(2).fold()).as("develops").select("person").
map(out("uses").order().by("name", incr).limit(2).fold()).as("uses").
select("person","develops","uses")
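For readers who want to check the logic without a graph, here is a plain-Python model of what the traversals compute: per person, up to two developed projects sorted by name descending and up to two used projects sorted ascending, with an optional filter for people who have both kinds of edge. The person and software names are my assumption about "The Crew" sample data (chosen to agree with the vertex ids in the output above), and the function names are invented for this sketch.

```python
from typing import Dict, List

PEOPLE: List[str] = ["marko", "stephen", "matthias", "daniel"]

# Assumed "develops" and "uses" edges, keyed by person name.
DEVELOPS: Dict[str, List[str]] = {
    "marko": ["gremlin", "tinkergraph"],
    "stephen": ["gremlin", "tinkergraph"],
    "matthias": ["gremlin"],
    "daniel": [],
}
USES: Dict[str, List[str]] = {p: ["gremlin", "tinkergraph"] for p in PEOPLE}


def top(names: List[str], descending: bool, limit: int = 2) -> List[str]:
    """Model of order().by('name', ...).limit(2).fold()."""
    return sorted(names, reverse=descending)[:limit]


def query(require_both: bool = False) -> List[Dict[str, object]]:
    """One row per person; require_both models the added filter() step."""
    rows = []
    for person in PEOPLE:
        if require_both and not (DEVELOPS[person] and USES[person]):
            continue
        rows.append({
            "person": person,
            "develops": top(DEVELOPS[person], descending=True),
            "uses": top(USES[person], descending=False),
        })
    return rows


for row in query(require_both=True):
    print(row)
```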

Freebase query result buried under nonsense

A beginner's question about freebase:
I am looking for the imdb id of a movie called "O". If I use the searchbox on the freebase.com website and constrain the search by type to all:/film/film, then I get a high quality result with the best match on top:
http://www.freebase.com/search?query=o&lang=en&all=%2Ffilm%2Ffilm&scoring=entity&prefixed=true
But this does not include the imdb id. When I try to recreate and refine this search using the query editor, I can't figure out how to do a "general query". The best I could come up with was doing a fuzzy name search like this:
[{
"type": "/film/film",
"name": null,
"name~=": "o",
"imdb_id": [],
"rottentomatoes_id": []
}]
The result contains exactly the information I want, but the movie "O" is only the 12th result in the list, buried under lots of nonsense:
http://www.freebase.com/query?lang=%2Flang%2Fen&q=[{%22type%22%3A%22%2Ffilm%2Ffilm%22%2C%22name%22%3Anull%2C%22name~%3D%22%3A%22o%22%2C%22imdb_id%22%3A[]%2C%22rottentomatoes_id%22%3A[]}]
How can I improve the quality of my result? What special magic does the "?query=o" use that "name~=":"o" does not have?
When you use query=o, Freebase does some smart sorting of the results, displaying exact matches first, followed by less exact matches.
With your query name ~= o you are not searching for movies with name "O", but for movies that contain "O" in their names (the ~= operator). If you want to search for a specific movie title, then specify the exact name:
[{
"type": "/film/film",
"name": "o",
"imdb_id": [],
"rottentomatoes_id": []
}]
This will result in output:
{
"result": [{
"imdb_id": [
"tt0184791"
],
"name": "O",
"type": "/film/film",
"rottentomatoes_id": [
"o"
]
}]
}
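The difference between the two constraints can be illustrated with a plain-Python model. This is not Freebase code: it assumes that "name~=" matches any name containing the given word and that "name" requires the whole name to match, both case-insensitively. The titles are arbitrary sample strings, not query results.

```python
import re
from typing import List

# Arbitrary sample titles for illustration only.
TITLES: List[str] = ["O", "O Brother, Where Art Thou?", "Story of O", "Othello"]


def words(title: str) -> List[str]:
    """Lower-cased words of a title, punctuation removed."""
    return re.findall(r"\w+", title.lower())


def fuzzy_match(word: str) -> List[str]:
    """Model of "name~=": keep every title that contains the word."""
    return [t for t in TITLES if word.lower() in words(t)]


def exact_match(name: str) -> List[str]:
    """Model of "name": keep only titles equal to the given name."""
    return [t for t in TITLES if t.lower() == name.lower()]


print(fuzzy_match("o"))  # several titles, in no useful order
print(exact_match("o"))  # only the title that is exactly "O"
```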
If Search gives you the topic that you want, why not just use its output parameter to add the IMDB ID (or whatever else you want) to the results it returns?

Resources