problems on elasticsearch with parent child documents

We work with two types of documents on Elasticsearch (ES): items and slots, where items are the parents of slot documents.
We define the index with the following command:
curl -XPOST 'localhost:9200/items' -d @itemsdef.json
where itemsdef.json has the following definition:
{
"mappings" : {
"item" : {
"properties" : {
"id" : {"type" : "long" },
"name" : {
"type" : "string",
"_analyzer" : "textIndexAnalyzer"
},
"location" : {"type" : "geo_point" }
}
}
},
"settings" : {
"analysis" : {
"analyzer" : {
"activityIndexAnalyzer" : {
"alias" : ["activityQueryAnalyzer"],
"type" : "custom",
"tokenizer" : "whitespace",
"filter" : ["trim", "lowercase", "asciifolding", "spanish_stop", "spanish_synonym"]
},
"textIndexAnalyzer" : {
"type" : "custom",
"tokenizer" : "whitespace",
"filter" : ["word_delimiter_impl", "trim", "lowercase", "asciifolding", "spanish_stop", "spanish_synonym"]
},
"textQueryAnalyzer" : {
"type" : "custom",
"tokenizer" : "whitespace",
"filter" : ["trim", "lowercase", "asciifolding", "spanish_stop"]
}
},
"filter" : {
"spanish_stop" : {
"type" : "stop",
"ignore_case" : true,
"enable_position_increments" : true,
"stopwords_path" : "analysis/spanish-stopwords.txt"
},
"spanish_synonym" : {
"type" : "synonym",
"synonyms_path" : "analysis/spanish-synonyms.txt"
},
"word_delimiter_impl" : {
"type" : "word_delimiter",
"generate_word_parts" : true,
"generate_number_parts" : true,
"catenate_words" : true,
"catenate_numbers" : true,
"split_on_case_change" : false
}
}
}
}
}
Then we add the child document definition using the following command:
curl -XPOST 'localhost:9200/items/slot/_mapping' -d @slotsdef.json
Where slotsdef.json has the following definition:
{
"slot" : {
"_parent" : {"type" : "item"},
"_routing" : {
"required" : true,
"path" : "parent_id"
},
"properties": {
"id" : { "type" : "long" },
"parent_id" : { "type" : "long" },
"activity" : {
"type" : "string",
"_analyzer" : "activityIndexAnalyzer"
},
"day" : { "type" : "integer" },
"start" : { "type" : "integer" },
"end" : { "type" : "integer" }
}
}
}
Finally we perform a bulk index with the following command:
curl -XPOST 'localhost:9200/items/_bulk' --data-binary @testbulk.json
Where testbulk.json holds the following data:
{"index":{"_type": "item", "_id":35}}
{"location":[40.4,-3.6],"id":35,"name":"A Name"}
{"index":{"_type":"slot","_id":126,"_parent":35}}
{"id":126,"start":1330,"day":1,"end":1730,"activity":"An Activity","parent_id":35}
We see through ES Head plugin that definitions seem to be ok. We test the analyzers to check that they have been loaded and they work. Both documents appear listed in ES Head browser view. But if we try to retrieve the child item using the API, ES responds that it does not exist:
$ curl -XGET 'localhost:9200/items/slot/126'
{"_index":"items","_type":"slot","_id":"126","exists":false}
When we import 50 documents, all parent documents can be retrieved through API, but only SOME of the requests for child elements get a successful response.
My guess is that it has something to do with how documents are distributed across shards and with routing, which I don't fully understand.
Any clue on how to retrieve individual child documents? ES Head shows they have been stored, but HTTP GETs to localhost:9200/items/slot/XXX respond seemingly at random with "exists":false.

The child documents use the parent's id for routing. So, to retrieve a child document, you need to pass the parent id in the routing parameter of your request:
curl "localhost:9200/items/slot/126?routing=35"
If parent id is not available, you will have to search for the child documents:
curl "localhost:9200/items/slot/_search?q=id:126"
or switch to an index with a single shard.
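
To see why the plain GET fails for only some children, here is a toy Python sketch of routing-based shard selection. It is an illustration under assumptions, not Elasticsearch code: the shard count of 5 is assumed, and CRC32 stands in for Elasticsearch's own hash function, so the shard numbers it prints will not match a real cluster. The mechanism it mimics is that a document lands on shard hash(routing) mod number_of_shards, where routing defaults to the document's own id and is the parent's id for child documents.

```python
import zlib

NUM_SHARDS = 5  # assumption: shard count of the index in this sketch


def shard_for(routing: str, num_shards: int = NUM_SHARDS) -> int:
    """Pick a shard from a routing value (CRC32 is a stand-in hash)."""
    return zlib.crc32(routing.encode("utf-8")) % num_shards


def index_child(doc_id: str, parent_id: str) -> int:
    # Children are routed by the parent's id so they share its shard.
    return shard_for(parent_id)


def get_without_routing(doc_id: str) -> int:
    # A plain GET falls back to routing by the document's own id.
    return shard_for(doc_id)


def get_with_routing(doc_id: str, routing: str) -> int:
    # An explicit routing value overrides the document id.
    return shard_for(routing)


stored_on = index_child("126", parent_id="35")
print("slot 126 stored on shard", stored_on)
print("GET /items/slot/126 looks on shard", get_without_routing("126"))
print("GET /items/slot/126?routing=35 looks on shard", get_with_routing("126", "35"))
```

With several shards, the lookup without routing succeeds only when the two hashes happen to land on the same shard, which matches the "only some children are found" symptom. With a single shard every routing value maps to shard 0, which is why the single-shard workaround also helps.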

Related

Wiremock matchesJSONPath if null or empty

I'm trying to add a Wiremock stub that matches if the JSON in a request body is either non-existent OR an empty string.
The stub I have at the moment is:
{
"id" : "e331007e-3e6d-4660-b575-b04e774e88c6",
"request" : {
"urlPathPattern" : "/premises/([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})/bookings/([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})/non-arrivals",
"method" : "POST",
"bodyPatterns" : [ {
"matchesJsonPath" : "$.[?(@.reason === '' || @.reason == null)]"
} ]
},
"response" : {
"status" : 400,
"jsonBody" : {
"type" : "https://example.net/validation-error",
"title" : "Invalid request parameters",
"code" : 400,
"invalid-params" : [ {
"propertyName" : "reason",
"errorType" : "blank"
} ]
},
"headers" : {
"Content-Type" : "application/problem+json;charset=UTF-8"
}
},
"uuid" : "e331007e-3e6d-4660-b575-b04e774e88c6"
}
It matches if the reason is '', but not if reason is not present. Any ideas?

Can/Should I mix application/x-ndjson with application/hal+json?

Currently trying to get the best of both worlds in the same http response, but an http response can have only one content type.
Should I use one content type for both?
Is this not encouraged/recommended?
Thanks for your help!
EDIT 2021-08-23 14:27Z
This is the type of response I want to return, for which I don't know which Content-Type to use:
{ "firstname" : "Alan", "id" : 21, "_links" : { "self" : { "href" : "http://hos/people/21" }, "all" : { "href" : "http://hos/people" }, "dogs" : { "href" : "http://hos/people/21/dogs" } } }
{ "firstname" : "Dave", "id" : 42, "_links" : { "self" : { "href" : "http://hos/people/42" }, "all" : { "href" : "http://hos/people" }, "dogs" : { "href" : "http://hos/people/42/dogs" } } }
{ "firstname" : "John", "id" : 99, "_links" : { "self" : { "href" : "http://hos/people/99" }, "all" : { "href" : "http://hos/people" }, "dogs" : { "href" : "http://hos/people/99/dogs" } } }
"best of both worlds":
I don't know much about either content type, but as I understand it, application/x-ndjson means each item is delimited with \n, so HTTP clients can consume the response one item at a time, or in bulk, without waiting for the whole response to be built and sent.
And application/hal+json, as I understand it, means the returned JSON carries links referencing the actions available for the returned entity.
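
As a rough illustration of the "one item at a time" idea, here is a minimal Python sketch of a client that reads newline-delimited JSON and follows each item's HAL-style _links as soon as its line is available. The helper name iter_ndjson and the in-memory stream are assumptions made for the example; a real client would read from the HTTP response body instead. It does not settle which Content-Type to send.

```python
import io
import json
from typing import Iterator, TextIO


def iter_ndjson(stream: TextIO) -> Iterator[dict]:
    """Yield one parsed JSON object per non-empty line."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


# Stand-in for a streamed HTTP response body (sample data for illustration).
body = io.StringIO(
    '{"firstname": "Alan", "id": 21, "_links": {"self": {"href": "http://hos/people/21"}}}\n'
    '{"firstname": "Dave", "id": 42, "_links": {"self": {"href": "http://hos/people/42"}}}\n'
)

for person in iter_ndjson(body):
    # Each item is usable as soon as its line has been read.
    print(person["firstname"], person["_links"]["self"]["href"])
```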

Example Dgraph recurse sum query

New Dgraph user wondering if anyone can provide me with an example recursive count and sum query to help get me going.
The data looks like this (there are more predicates, but left out for simplicity):
{
"uid" : <0x1>,
"url" : "example.com",
"link" : [
{
"uid" : <0x2>,
"url" : "example2.com",
"link" : [
{
"uid" : <0x4>,
"url" : "example4.com",
"link" : [
{
"uid" : <0x6>,
"url" : "example6.com",
"link" : [
{
etc...
}
]
}
]
},
{
"uid" : <0x5>,
"url" : "example5.com",
}
]
},
{
"uid" : <0x2>,
"url" : "example2.com",
"link" : [
{
etc ....
}
]
}
]
}
Just a home page with n-links which each have n-links and the depth, obviously, can vary. Just hoping for a good example of how to count all the links for each url and sum them up. I will add different filters to the query at some point, but just wanting to see a basic query to help get me going. Thanks.

Loading mapbox with Firebase database

I'm trying to learn Firebase and Mapbox and wanted to integrate the two. Firebase stores some of my data in the following format:
{
"messages" : {
"-KUE2EwfvbI48Azw01Hv" : {
"geometry" : {
"coordinates" : [ 28.6618976, 77.22739580000007 ],
"type" : "Point"
},
"properties" : {
"description" : "xyz",
"hashtag" : "#xyz",
"imageUrl" : "xyz.jpg",
"name" : "Xyz Xyz",
"photoUrl" : "xyz.jpg",
"title" : "XYZ"
},
"type" : "Issue"
},
"-KUD2EwfvbI48Azw01Hv" : {
"geometry" : {
"coordinates" : [ 12.9715987, 77.59456269999998 ],
"type" : "Point"
},
"properties" : {
"description" : "xyz",
"hashtag" : "#xyz",
"imageUrl" : "xyz.jpg",
"name" : "Xyz Xyz",
"photoUrl" : "xyz.jpg",
"title" : "XYZ"
},
"type" : "Issue"
}
}
}
Is there a way to load the data and plot it into Mapbox? The examples require a GeoJSON file hosted somewhere that can be used to plot them. How can we use the Firebase database to plot on the Mapbox in realtime?
Sorry if my question is ambiguous. I'm willing to provide more information if needed :D
Thanks!

You can load the data, but you first have to convert it to a valid GeoJSON object.
Here is a JSFiddle using the data you provided:
https://jsfiddle.net/mkrv9uuy/
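// Collect every Firebase message as a GeoJSON Feature.
var firebaseGeojsonFeatures = [];
for (var key in firebaseData.messages) {
  var f = firebaseData.messages[key];
  // GeoJSON expects type "Feature"; the stored value is "Issue".
  f.type = "Feature";
  firebaseGeojsonFeatures.push(f);
}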
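
For comparison, the same conversion can be sketched in Python, for example in a server-side export job. This is only an illustration: the function name to_feature_collection, the swap_lat_lng flag and the added id field are assumptions, not part of the fiddle. One thing worth checking is coordinate order, since GeoJSON stores positions as [longitude, latitude]. The sample values look like [latitude, longitude], although the question does not say so, so the sketch makes the swap optional.

```python
import copy


def to_feature_collection(messages: dict, swap_lat_lng: bool = False) -> dict:
    """Build a GeoJSON FeatureCollection from a Firebase-style dict of messages."""
    features = []
    for key, message in messages.items():
        feature = copy.deepcopy(message)  # leave the input untouched
        feature["type"] = "Feature"       # replaces the stored "Issue"
        feature["id"] = key               # keep the Firebase key (illustrative choice)
        if swap_lat_lng:
            lat, lng = feature["geometry"]["coordinates"]
            feature["geometry"]["coordinates"] = [lng, lat]
        features.append(feature)
    return {"type": "FeatureCollection", "features": features}


messages = {
    "-KUE2EwfvbI48Azw01Hv": {
        "geometry": {"coordinates": [28.6618976, 77.22739580000007], "type": "Point"},
        "properties": {"title": "XYZ"},
        "type": "Issue",
    }
}
collection = to_feature_collection(messages, swap_lat_lng=True)
print(collection["features"][0]["geometry"]["coordinates"])
```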

Using Usergrid how do I get related entities nested in a single json and not only the link to them

When I query /mycollections?ql=Select * where name='dfsdfsdfsdfsdfsdf' I get
{
"action" : "get",
"application" : "859e6180-de8a-11e4-9360-f1aabbc15f58",
"params" : {
"ql" : [ "Select * where name='dfsdfsdfsdfsdfsdf'" ]
},
"path" : "/mycollections",
"uri" : "http://localhost:8080/myorg/myapp/mycollections",
"entities" : [ {
"uuid" : "2ff8961a-dea8-11e4-996b-63ce373ace35",
"type" : "mycollection",
"name" : "dfsdfsdfsdfsdfsdf",
"created" : 1428577466865,
"modified" : 1428577466865,
"metadata" : {
"path" : "/mycollections/2ff8961a-dea8-11e4-996b-63ce373ace35",
"connections" : {
"relations" : "/mycollections/2ff8961a-dea8-11e4-996b-63ce373ace35/relations"
}
}
} ],
"timestamp" : 1428589309204,
"duration" : 53,
"organization" : "myorg",
"applicationName" : "myapp",
"count" : 1
}
Now if I query /mycollections/2ff8961a-dea8-11e4-996b-63ce373ace35/relations I get the second entity
{
"action" : "get",
"application" : "859e6180-de8a-11e4-9360-f1aabbc15f58",
"params" : { },
"path" : "/mycollections/2ff8961a-dea8-11e4-996b-63ce373ace35/relations",
"uri" : "http://localhost:8080/myorg/myapp/mycollections/2ff8961a-dea8-11e4-996b-63ce373ace35/relations",
"entities" : [ {
"uuid" : "56a1185a-dec1-11e4-9ac0-e9343f86b604",
"type" : "secondcollection",
"name" : "coucou",
"created" : 1428588269141,
"modified" : 1428588269141,
"metadata" : {
"connecting" : {
"relations" : "/mycollections/2ff8961a-dea8-11e4-996b-63ce373ace35/relations/56a1185a-dec1-11e4-9ac0-e9343f86b604/connecting/relations"
},
"path" : "/mycollections/2ff8961a-dea8-11e4-996b-63ce373ace35/relations/56a1185a-dec1-11e4-9ac0-e9343f86b604"
}
} ],
"timestamp" : 1428589668542,
"duration" : 51,
"organization" : "myorg",
"applicationName" : "myapp"
}
What I want is for Usergrid to nest the related entity directly in the first JSON response, instead of returning only its path, so that I need a single HTTP request instead of two.

You cannot. Usergrid is not designed in that way. You need to write an extra wrapper rest endpoint to simulate one response.
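
A minimal sketch of what such a wrapper could do, in Python. It takes the first response, follows each path under metadata.connections, and nests the returned entities. The function name nest_connections, the "connected" key and the in-memory fake backend are all assumptions for illustration; a real wrapper would replace fetch with an HTTP GET against Usergrid.

```python
from typing import Callable

RELATIONS_PATH = "/mycollections/2ff8961a-dea8-11e4-996b-63ce373ace35/relations"


def nest_connections(response: dict, fetch: Callable[[str], dict]) -> dict:
    """Return a copy of `response` with connected entities nested in each entity."""
    merged = dict(response)
    merged["entities"] = []
    for entity in response.get("entities", []):
        entity = dict(entity)  # shallow copy so the input is not modified
        connections = entity.get("metadata", {}).get("connections", {})
        # One extra request per connection name, done inside the wrapper.
        entity["connected"] = {
            name: fetch(path).get("entities", [])
            for name, path in connections.items()
        }
        merged["entities"].append(entity)
    return merged


# Hypothetical in-memory stand-in for HTTP GETs against Usergrid.
FAKE_BACKEND = {
    RELATIONS_PATH: {"entities": [{"type": "secondcollection", "name": "coucou"}]}
}

first = {
    "entities": [{
        "name": "dfsdfsdfsdfsdfsdf",
        "metadata": {"connections": {"relations": RELATIONS_PATH}},
    }]
}

result = nest_connections(first, FAKE_BACKEND.__getitem__)
print(result["entities"][0]["connected"]["relations"][0]["name"])
```

The client then makes one request to the wrapper, while the wrapper still makes the two requests to Usergrid behind the scenes.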

I'm not sure which DB you are using. If it is a document DB such as Mongo, you can write a Node.js script to do this manipulation. Apigee has volvo.js; check whether it allows scripting.
