Get vertices with a simpler format - gremlin

Is there a way to get a list of vertices with a simpler format?
Currently, the following query:
g.V().has(label, 'Quantity').has('text', '627 km');
returns an object like this:
{
"id": 42545168,
"label": "Quantity",
"type": "vertex",
"properties": {
"sentence": [
{
"id": "pkbgi-pbw28-745",
"value": "null"
}
],
"updated_text": [
{
"id": "pk9vm-pbw28-5j9",
"value": "627 km"
}
],[...]
And when I get a list of edges it is formatted in a simpler format:
g.E().has(label, 'locatedAt').has('out_entity_id','41573-41579');
returns:
{
"id": "ozfnt-ip8o-2mtx-g8vs",
"label": "locatedAt",
"type": "edge",
"inVLabel": "Location",
"outVLabel": "Location",
"inV": 758008,
"outV": 872520,
"properties": {
"sentence": "Bolloré is a corporation (société anonyme) with a Board of Directors whose registered office is located at Odet, 29500 Ergué-Gabéric in France.",
"in_entity_id": "41544-41548",
"score": "0.795793",
"out_entity_id": "41573-41579"
}
}
How so?
Is there a way to get vertices formatted this way?

My advice is to return only the specific properties you are interested in rather than the whole vertex: for example the vertex ID, a few selected properties, or a valueMap. Otherwise you get back essentially everything. This is the same principle as avoiding "select *" in SQL and selecting only the columns you really care about.
Edited to add an example that returns the IDs of matching vertices.
g.V().has(label, 'Quantity').has('text', '627 km').id().fold()
This yields a result that looks like this:
{"requestId":"73f40519-87c8-4037-a9fc-41be82b3b227","status":{"message":"","code":200,"attributes":{}},"result":{"data":[[20608,28920,32912,106744,123080,135200,139296,143464,143488,143560,151584,155688,155752,159784,188520,254016,282688,286968,311360,323832,348408,4344,835648,8336,1343616,12352]],"meta":{}}}
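If the full vertex JSON has already been fetched, the nested property lists can also be collapsed on the client. The following is a minimal Python sketch, assuming the response has the shape shown in the question; flatten_vertex is a hypothetical helper written for this example, not part of any Gremlin driver.

```python
def flatten_vertex(vertex):
    """Collapse {"key": [{"id": ..., "value": ...}]} into {"key": value}."""
    flat = {
        "id": vertex["id"],
        "label": vertex["label"],
        "type": vertex.get("type"),
    }
    props = {}
    for key, entries in vertex.get("properties", {}).items():
        values = [entry["value"] for entry in entries]
        # A vertex property can hold several values; unwrap only single ones.
        props[key] = values[0] if len(values) == 1 else values
    flat["properties"] = props
    return flat


# Sample input copied from the (truncated) vertex in the question.
vertex = {
    "id": 42545168,
    "label": "Quantity",
    "type": "vertex",
    "properties": {
        "sentence": [{"id": "pkbgi-pbw28-745", "value": "null"}],
        "updated_text": [{"id": "pk9vm-pbw28-5j9", "value": "627 km"}],
    },
}

simple = flatten_vertex(vertex)
print(simple)
```

This drops the per-property IDs, so it only fits cases where those IDs are not needed later.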

Related

Can't get the desired properties via JsonPath evaluate method

I have a json schema that marks special properties in need of processing and I want to query those via JsonPath.Evaluate.
Here's a part of the schema to illustrate the issue
{
"type": "object",
"properties": {
"period": {
"description": "The period in which the rule applies",
"type": "object",
"properties": {
"start": {
"type": "string",
"format": "date-time"
},
"end": {
"type": "string",
"format": "date-time"
}
},
"required": [
"start"
],
"x-updateIndicatorProperties": [
"start"
]
},
"productType": {
"type": "string"
},
"x-updateIndicatorProperties": [
"productType"
]
}
}
I want to get the JsonPath of the "x-updateIndicatorProperties" properties, so that I can then query the actual properties to process.
For this example, the expected result would be
[
"$['properties']['x-updateIndicatorProperties']",
"$['properties']['period']['x-updateIndicatorProperties']"
]
I've been trying for a while to get a JsonPath expression that would query these properties.
Currently I'm just iterating over all properties and filtering them manually:
"$..*"
I've also tried using :
$..['x-updateIndicatorProperties']
This works, but it returns a lot of duplicates: for the example above, I get 5 results instead of the expected 2. This can be demonstrated here: https://json-everything.net/json-path
Assuming I can't influence the schema itself, only the code that traverses it,
can anybody help with an expression to get the expected results or any other way to achieve the same outcome?
The stack is JsonPath 0.2.0, .NET 6 and System.Text.Json.
This was a bug in the library when parsing paths that use a recursive descent (..) into a quoted-property-name selector (['foo']). So it would happen for any path in the form $..['foo'].
I've fixed the issue and released version 0.2.1.
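For anyone who cannot upgrade, the "any other way" part of the question can be covered by a plain recursive walk that records the path to each matching key. The sketch below is in Python rather than .NET and does not use the JsonPath library at all; find_key_paths is a hypothetical helper, and the path strings simply imitate the bracket notation used in the question.

```python
def find_key_paths(node, key, path="$"):
    """Return bracket-notation paths to every property named `key`."""
    paths = []
    if isinstance(node, dict):
        for name, value in node.items():
            child = f"{path}['{name}']"
            if name == key:
                paths.append(child)
            else:
                paths.extend(find_key_paths(value, key, child))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            paths.extend(find_key_paths(value, key, f"{path}[{index}]"))
    return paths


# The schema fragment from the question.
schema = {
    "type": "object",
    "properties": {
        "period": {
            "description": "The period in which the rule applies",
            "type": "object",
            "properties": {
                "start": {"type": "string", "format": "date-time"},
                "end": {"type": "string", "format": "date-time"},
            },
            "required": ["start"],
            "x-updateIndicatorProperties": ["start"],
        },
        "productType": {"type": "string"},
        "x-updateIndicatorProperties": ["productType"],
    },
}

paths = find_key_paths(schema, "x-updateIndicatorProperties")
print(paths)
```

Each matching key is visited exactly once, so the walk cannot produce duplicates.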

Create a "join" query with data from edge and connected vertex

I have a Cosmos DB database using the Gremlin API. In it, vertices labeled User are connected to vertices labeled Company. I want to show all the companies connected to a user. I run the query g.V('id-of-User').outE() and get the edges to all connected companies. The result might look something like this:
[
{
"id": "08f97a1d-9e81-4ccc-a498-90eb502b1879",
"label": "AuthorizedSignatory",
"type": "edge",
"inVLabel": "Company",
"outVLabel": "User",
"inV": "abd51134-1524-44fe-8a49-60d2d449a1f3",
"outV": "103bf1b9-464f-4f68-a4ca-7dfdbe94ae84"
},
{
"id": "c36b640b-9574-403b-8ab6-fcce695caa90",
"label": "AuthorizedSignatory",
"type": "edge",
"inVLabel": "Company",
"outVLabel": "User",
"inV": "2c14d279-00a4-41ad-a8c0-f3b882864568",
"outV": "103bf1b9-464f-4f68-a4ca-7dfdbe94ae84"
}
]
This is absolutely as expected. Now I want to take this a bit further and instead of just showing the GUID in the inV parameter I also want to include the Company Name in the result object, but I do not understand how to do the equivalent to a SQL join here.
Can someone please help me!!
What I want is something similar to the example below:
[
{
"id": "08f97a1d-9e81-4ccc-a498-90eb502b1879",
"label": "AuthorizedSignatory",
"type": "edge",
"inVLabel": "Company",
"outVLabel": "User",
"inV": "abd51134-1524-44fe-8a49-60d2d449a1f3",
"outV": "103bf1b9-464f-4f68-a4ca-7dfdbe94ae84",
"CompanyName": "ACME CORP"
},
{
"id": "c36b640b-9574-403b-8ab6-fcce695caa90",
"label": "AuthorizedSignatory",
"type": "edge",
"inVLabel": "Company",
"outVLabel": "User",
"inV": "2c14d279-00a4-41ad-a8c0-f3b882864568",
"outV": "103bf1b9-464f-4f68-a4ca-7dfdbe94ae84",
"CompanyName": "Giganticorp"
}
]
Here CompanyName is one of the properties of the Company vertex whose GUID is in the inV property.
There is no "join". The data is already connected by way of the edge, so you simply need to traverse further along your graph to get the "CompanyName".
g.V('id-of-User').out().values("CompanyName")
That shows you all of the names of the companies related to that user. If you're saying that you still want to show the data from the edge in addition to company name as you had in your examples, then no problem, project() the edge being specific about what you want:
g.V('id-of-User').outE().
project('eid','label','companyName').
by(T.id).
by(T.label).
by(inV().values("CompanyName"))
Again, note that there is no "join" for the "CompanyName". As the data is implicitly joined by way of the edge you just need to traverse over inV() to reach the data there.

Gremlin querying Edge inVLabel, outVLabel

I have the following example edge labeled "posts". A "posts" edge can have multiple types of parent vertex (outVLabel), such as "channel", "publisher", "user", etc. How do you query for all edges that have an outVLabel of "channel" without interrogating the label on the out() vertex? I want an array of "posts" edges returned.
Query:
g.E().hasLabel('posts').has(???, 'channel')
Edge object:
[{
"id": "83c972b0-315d-49fe-a735-882c4dcbdaa2",
"label": "posts",
"type": "edge",
"inVLabel": "article",
"outVLabel": "channel",
"inV": "7410b6c8-ed70-4a00-800c-489d596907da",
"outV": "c8c5f45d-0195-49c5-b7ae-9eda1d441bc9",
"properties": {
"service": "rss"
}
}]
You would have to do:
g.E().hasLabel('posts').where(outV().hasLabel('channel'))
or if necessary, denormalize and place the outgoing vertex label on the edge as a property, in which case you could then do:
g.E().has('posts', 'outVLabel', 'channel')

correct GeoJSON format. map visualization

First things first: is this data in proper GeoJSON format?
According to the definition of GeoJSON data, as you can see by the MultiPoint & coordinates, I think it is.
It looks like this:
{
"lang": {
"code": "en",
"conf": 1.0
},
"group": "JobServe",
"description": "Work with the data science team to build new products and integrate analytics\ninto existing workflows. Leverage big data solutions, advanced statistical\nmethods, and web apps. Coordinate with domain experts, IT operations, and\ndevelopers. Present to clients.\n\n * Coordinate the workflow of the data science team\n * Join a team of experts in big data, advanced analytics, and visualizat...",
"title": "Data Science Team Lead",
"url": "http://www.jobserve.com/us/en/search-jobs-in-Columbia,-Maryland,-USA/DATA-SCIENCE-TEAM-LEAD-99739A4618F8894B/",
"geo": {
"type": "MultiPoint",
"coordinates": [
[
-76.8582049,
39.2156213
]
]
},
"tags": [
"Job Board"
],
"spider": "jobserveNa",
"employmentType": [
"Unspecified"
],
"lastSeen": "2015-05-13T01:21:07.240000",
"jobLocation": [
"Columbia, Maryland, United States of America"
],
"identifier": "99739A4618F8894B",
"hiringOrganization": [
"Customer Relation Market Research Company"
],
"firstSeen": "2015-05-13T01:21:07+00:00"
},
I want to visualize this as a "zoomable" (viz. interactive) map, as in the examples on the d3js website.
I'm trying to use a tool called mapshaper.org to see an initial visualization of the data in map form, but when I load it up, nothing happens.
To me this doesn't make sense because, according to their website, one can simply
Drag and drop or select a file to import.
Shapefile, GeoJSON and TopoJSON files and Zip archives are supported.
However, in my case it is not working.
Does anyone have any intuition as to what might be going wrong, or a suggestion for a comparable tool to create a zoomable map out of GeoJSON data?
According to the definition of GeoJSON data, I have what I think constitutes data in that format
Well, you don't have a proper GeoJSON object. Just compare what you've got against the example you've linked. It doesn't even come close. That's why mapshaper doesn't know what to do with the JSON you load into it.
A GeoJSON object with the type "FeatureCollection" is a feature collection object. An object of type "FeatureCollection" must have a member with the name "features". The value corresponding to "features" is an array. Each element in the array is a feature object as defined above.
A feature collection looks like this:
{
"type": "FeatureCollection",
"features": [
// Array of features
]
}
http://geojson.org/geojson-spec.html#feature-collection-objects
A GeoJSON object with the type "Feature" is a feature object. A feature object must have a member with the name "geometry". The value of the geometry member is a geometry object as defined above or a JSON null value. A feature object must have a member with the name "properties". The value of the properties member is an object (any JSON object or a JSON null value). If a feature has a commonly used identifier, that identifier should be included as a member of the feature object with the name "id".
A feature looks like this:
{
"id": "Foo",
"type": "Feature",
"geometry": {
"type": "Point",
"coordinates": [0, 0]
},
"properties": {
"label": "My Foo"
}
}
http://geojson.org/geojson-spec.html#feature-objects
Here are examples of the different geometry objects a feature can support: http://geojson.org/geojson-spec.html#appendix-a-geometry-examples
Put those two together, it would look like this:
{
"type": "FeatureCollection",
"features": [{
"id": "Foo",
"type": "Feature",
"geometry": {
"type": "Point",
"coordinates": [0, 0]
},
"properties": {
"label": "My Foo"
}
},{
"id": "Bar",
"type": "Feature",
"geometry": {
"type": "LineString",
"coordinates": [
[100.0, 0.0],
[101.0, 1.0]
]
},
"properties": {
"label": "My Bar"
}
}]
}
That really doesn't look like the JSON you've posted. You'll need to convert that to proper GeoJSON somehow, via a custom script or manually. It's a format I've never seen before, sorry to say.
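Such a custom script could be as small as the following Python sketch. It assumes every record carries a valid geometry object under a "geo" key and a unique "identifier", as the posted sample does; to_feature and to_feature_collection are hypothetical helper names, and the sample record is abbreviated from the question.

```python
import json


def to_feature(record, geometry_key="geo", id_key="identifier"):
    """Wrap one job record in a GeoJSON Feature."""
    # Everything except the geometry becomes a feature property.
    properties = {k: v for k, v in record.items() if k != geometry_key}
    feature = {
        "type": "Feature",
        "geometry": record.get(geometry_key),  # None becomes JSON null
        "properties": properties,
    }
    if id_key in record:
        feature["id"] = record[id_key]
    return feature


def to_feature_collection(records):
    """Wrap a list of job records in a GeoJSON FeatureCollection."""
    return {
        "type": "FeatureCollection",
        "features": [to_feature(record) for record in records],
    }


# Abbreviated version of the record posted in the question.
records = [
    {
        "title": "Data Science Team Lead",
        "identifier": "99739A4618F8894B",
        "geo": {
            "type": "MultiPoint",
            "coordinates": [[-76.8582049, 39.2156213]],
        },
    }
]

collection = to_feature_collection(records)
print(json.dumps(collection, indent=2))
```

The printed output is a FeatureCollection, the structure described above.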

freebase city regions above neighborhood

Using Freebase, how can I find, say, all the boroughs/subcities of NY (Queens, Brooklyn, etc.)?
And will it be similar to other cities? Say if I want to know the subdivisions of Prague (Zizkov, Old Town, etc.) or Berlin, etc?
I've tried various combos but haven't hit one yet.
{
"id": "/en/new_york",
"guid": null,
"name": null,
"/location/location/containedby": [
],
"/location/location/contains" : [],
"/location/place_with_neighborhoods/neighborhoods": [
]
}
The property /location/location/contains is the one that you want, but you're going to have two problems:
1. It's only sparsely populated
2. It has multiple levels of containment as a hack to work around API limitations
There's not much you can do about #1 unless you want to work on improving the data yourself. For #2, you can subtract the set of locations which are contained in another location in the "contains" set.
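The subtraction can be sketched in Python as follows. It assumes the "contains" lists have already been fetched into a dictionary; direct_children is a hypothetical helper, and the sample data is hand-written for illustration, not an actual Freebase result.

```python
def direct_children(city, contains):
    """Return members of contains[city] not contained in another member."""
    members = set(contains.get(city, ()))
    nested = set()
    for member in members:
        # Anything a member itself contains is a deeper level of containment.
        nested |= set(contains.get(member, ())) & members
    return members - nested


# Hand-written sample data (an assumption, not a Freebase response).
contains = {
    "New York City": ["Brooklyn", "Queens", "Williamsburg", "Astoria"],
    "Brooklyn": ["Williamsburg"],
    "Queens": ["Astoria"],
}

top_level = direct_children("New York City", contains)
print(sorted(top_level))
```

Only locations that no other member of the set contains are left, which is the top level of subdivisions.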
Someone might be able to give a better answer, but this will get the major districts of a city like NY. It probably won't work for smaller cities, whose subdivisions are more like regions.
{
"id": "/en/new_york",
"guid": null,
"name": null,
"/location/location/containedby": [
],
"/location/location/contains" : [{
"name": null,
"type": "/location/citytown"
}]
}
Or, to match any of several types that might apply:
{
"id": "/en/new_york",
"guid": null,
"name": null,
"/location/location/containedby": [
],
"/location/location/contains" : [{
"name": null,
"type|=" : [
"/location/citytown",
"/location/neighborhood",
"/location/administrative_division",
"/location/de_borough",
"/location/place_with_neighborhoods/neighborhoods"
]
}]
}
