Simple outer-join like gremlin query not returning any results - gremlin

I wrote the simple query below to traversal between Person to Country but it’s not returning any results.
g.V().hasLabel("Person").as("p").out("from").hasLabel("Country").as("c").select("p", "c")
In the actual data, only Person vertices exists and no Country vertices or from edges exist. I expected to at least return p - basically I want to do a left outer join. However, if I have Country and from data as well, the query returns results.
I tried another query using match as well but still no results unless there are actual data:
g.V().hasLabel("Person").has("name","bob").match(__.as("p").out("from").hasLabel("Country").as("c")).select("p", "c")
I'm running these queries against Datastax Enterpise Graph.
Any idea why it’s returning no results?

The result you are getting is expected. If there are no "from" edges then the traverser essentially dies and does not proceed any further. Perhaps you could consider using project():
g.V().hasLabel("Person").
project('name','country').
by('name')
by(out('from').hasLabel('Country').values('name').fold())
With the "modern" toy graph in TinkerPop, the output looks like this:
gremlin> g.V().hasLabel('person').project('name','knows').by().by(out('knows').values('name').fold())
==>[name:v[1],knows:[vadas,josh]]
==>[name:v[2],knows:[]]
==>[name:v[4],knows:[]]
==>[name:v[6],knows:[]]
In the future, when you submit questions about Gremlin, please include a Gremlin script that can be pasted into a Gremlin Console which makes it easier to try to more directly answer your specific question.

Related

Gremlin `elementMap() step` returns less elements than actually present

I have an application with more than 3000 vertices having the same label , let's say ABC. It is required for my application to get the list of all the vertices and their properties for the user to choose the entity and interact with it. For that I am writing a GetAllVertices query for label ABC.
The id's of the vertices are numbers
Ex: 1,2,3,..
The following query returns the correct amount of vertices ~ 3000
g.V().hasLabel('ABC').dedup().count()
The following query however only returns around 1600 entries
g.V().hasLabel('ABC').elementMap()
I am trying to understand what is happening and how can I get the elementMap for all the vertices that I am interested in. I think it might be because of the hash function elementMap() might be using that is causing the collision of the keys and thus resulting in overwriting some of the keys with different entries.
Using TinkerGraph I am not able to reproduce this behavior.
gremlin> g.inject(0).repeat(addV('ABC').property(id,loops())).times(3000)
==>v[2999]
gremlin> g.V().hasLabel('ABC').count()
==>3000
gremlin> g.V().hasLabel('ABC').elementMap().count()
==>3000
If you can say more about the data in your graph I can do some additional tests and try to reproduce what you are seeing.
UPDATED 2022-08-03
I ran the same test on Amazon Neptune version 1.1.1.0.R4 from a Neptune notebook, and it worked there as well.
%%gremlin
g.inject(0).repeat(addV('ABC').property('p1',loops())).times(3000)
v[a6c131cc-42e8-3713-c82d-faa193b118a0]
%%gremlin
g.V().hasLabel('ABC').count()
3000
%%gremlin
g.V().hasLabel('ABC').elementMap().count()
3000

How to get properties hasid/key with vertexes info in one gremlin query or Gremlin.Net

I try to get properties which has key or id in following query by Gremlin.Net, but vertex info(id and label) in VertexProperty is null in result.
g.V().Properties<VertexProperty>().HasKey(somekey).Promise(p => p.ToList())
So i try another way, but it's return class is Path, and i had to write an ugly code for type conversion.
g.V().Properties<VertexProperty>().HasKey(somekey).Path().By(__.ValueMap<object, object>(true))
Is there a better way to achieve this requirement
I think basically the only thing missing to get what you want is the Project() step.
In order to find all vertices that have a certain property key and then get their id, label, and then all information about that property, you can use this traversal:
g.V().
Has(someKey).
Project<object>("vertexId", "vertexLabel", "property").
By(T.Id).
By(T.Label).
By(__.Properties<object>(someKey).ElementMap<object>()).
Promise(t => t.ToList());
This returns a Dictionary where the keys are the arguments given to the Project step.
If you instead want to filter by a certain property id instead of a property key, then you can do it in a very similar way:
g.V().
Where(__.Properties<object>().HasId(propertyId)).
Project<object>("vertexId", "vertexLabel", "property").
By(T.Id).
By(T.Label).
By(__.Properties<object>(someKey).ElementMap<object>()).
Promise(t => t.ToList());
This filters in both cases first the vertices to only have vertices that have the properties we are looking for. That way, we can use the Project() step afterwards to get the desired data back.
ElementMap should give all information back about the properties that you want.
Note however that these traversals will most likely require a full graph scan in JanusGraph, meaning that it has to iterate over all vertices in your graph. The reason is that these traversals cannot use an index which would make them much more efficient. So, for larger graphs, the traversals will probably not be feasible.
If you had the vertex ids available instead of the property ids in the second traversal, then you could make the traversal a lot more efficient by replacing g.V().Where([...]) simply with g.V(id).

how to execute a query after getting response of another query?

I need the result of the first query to pass it as an input parameter to my second query. and also want to know to write multi queries.
In my use case, the second query can be traversed only using the result of the first query and that too using loop(which is similar to for loop)
const query1 =g.V().hasLabel('Province').has('code',market').inE('partOf').outV().has('type',state).values('code').as('state')
After executing query1,the result is
res=[{id1},{id2},........]
query2 = select('state').repeat(has('code',res[0]).inE('partOf').outV().has('type',city).value('name')).times(${res.length-1}).as('city')
I made the assumptions that your first query tries to finds "states by market" where the market is a variable you intend to pass to your query. If that is correct then your first query simplifies to:
g.V().hasLabel('Province').has('code',market).
in('partOf').
has('type','state').values('code')
so, prefer in() to inE().outV() when no filtering on edge properties is needed.
Your second query doesn't look like valid Gremlin, but maybe you were just trying to provide an example of what you wanted to do. You wrote:
select('state').
repeat(has('code',res[0]).
inE('partOf').outV().
has('type',city).value('name')).
times(${res.length-1}).as('city')
and I assume that means you want to use the states found in the first query to find their cities. If that's what you're after you can simplify this to a single query of:
g.V().hasLabel('Province').has('code',market).
in('partOf').has('type','state').
in('partOf').has('type','city').
values('name')
If you need data about the state and the city as part of the result then consider project():
g.V().hasLabel('Province').has('code',market).
in('partOf').has('type','state').
project('state','cities').
by('code').
by(__.in('partOf').has('type','city').
values('name').
fold())

Cannot access specific vertex by ID, using TinkerGraph in Gremlin Console

I cannot select a specific vertex, by executing g.V(3640).valueMap(true).unfold(). Any command which contains an ID between the parentheses in the g.V() command does not seem to work.
This is what I did:
I'm new to Graph databases and experimenting with the Gremlin console. I started by creating an instance:
graph = TinkerGraph.open()
g=graph.traversal()
and loading sample data by importing a .graphml database file:
g.io(graphml()).readGraph('/full/path/to/air-routes-latest.graphml')
which seemed to work fine because a count gives a nice result back
gremlin> g.V().count()
==>3642
Unfortunately the following does not work:
gremlin> g.V(3640).valueMap(true).unfold()
Which I think is odd, because by executing the following
gremlin> g.V()
==>v[3640]
==>v[2306]
...
the ID does seem to exist. Any ideas why I cannot access a specific ID? I tried different commands but g.V() seems to work fine, and g.V(3640) does not. Is it because I use TinkerGraph instead of a Gremlin database, of what might be the problem?
EDIT:
It seems that my id's were saved as strings, because g.V("2").valueMap(true).unfold() does give me results.
I think you likely have an issue with the "type" of the identifier. I suspect that if you do:
g.V(3640L)
that you will get the vertex you want. By default, TinkerGraph handles id equality with equals() so if you try to find an integer when the id is a long it will act like it's not there. You can modify that default if you like with an IdManager configuration discussed here. Note that this is also discussed in more detail in Practical Gremlin.

When is it appropriate to use g.query()...vertex() vs g.V.has()

I'm a little confused about this one. Several similar examples can be found throughout the documentation. Such as :
g.V.has('name','hercules').next()
g.query().has("name",EQUAL,"hercules").vertices()
Could someone clarify what the difference in the process is between the two above?
Thanks
The first is gremlin-groovy syntax:
g.V.has('name','hercules').next()
and either iterates all vertices looking for vertices that have a "name" property with a value of "hercules". In the event that "name" is indexed then titan will utilize the index to avoid the linear scan to find such vertices.
The second is basically Java and the Titan API. The above gremlin-groovy code basically compiles down to your second statement:
g.query().has("name",EQUAL,"hercules").vertices()
however, in the case of the second statement it returns an iterator of all vertices that match the filter and doesn't just pop off the first one as shown in the gremlin statement (given the use of next()).

Resources