JSONPath exclude key in paths - jsonpath

Given a JSON document like this:
{
  "post": {
    "operationId": "post-collection",
    "responses": {
      "201": {
        "description": "Post something",
        "content": {
          "application/ld+json": {
            "schema": {
              "$ref": "./models/Mapping.jsonld.json"
            }
          },
          "application/hal+json": {
            "schema": {
              "$ref": "./models/Mapping.jsonHal.json"
            }
          },
          "application/vnd.api+json": {
            "schema": {
              "$ref": "./models/Mapping.vndApi.json"
            }
          },
          ...
        }
      }
    }
  }
}
I want the content for all media types except application/ld+json.
I found a thread on how to do this based on the value (https://stackoverflow.com/a/29710922/1092632), but how would I exclude a key?
Basically, what I am asking for is the negation of this: $..content[application/ld+json]
Is this possible without any functions / programming? I used this for testing: https://jsonpath.com/

https://jsonpath.com/ uses JSONPath Plus.
It provides the @property shorthand selector within filter expressions:
$..content[?(@property !== 'application/ld+json')]
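If a programmatic fallback is acceptable after all, the same key exclusion is a one-liner in plain Python. A minimal sketch (the inline document below is a trimmed stand-in for your OpenAPI fragment):

import json

# Trimmed stand-in for the OpenAPI fragment from the question.
openapi_json = """
{"post": {"responses": {"201": {"content": {
  "application/ld+json":  {"schema": {"$ref": "./models/Mapping.jsonld.json"}},
  "application/hal+json": {"schema": {"$ref": "./models/Mapping.jsonHal.json"}}
}}}}}
"""

content = json.loads(openapi_json)["post"]["responses"]["201"]["content"]

# Keep every media type except application/ld+json.
filtered = {k: v for k, v in content.items() if k != "application/ld+json"}
print(list(filtered))  # ['application/hal+json']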

Related

Cloud Firestore REST API - Add server timestamp

I'm using an Arduino with an ESP8266-01 module to upload a value to a Cloud Firestore database using the createDocument API with the following payload:
{
  "fields": {
    "distance": {
      "integerValue": "555"
    }
  }
}
I do a POST request to a route like this:
https://firestore.googleapis.com/v1beta1/projects/<MY_PROJECT>/databases/(default)/documents/<SOME_COLLECTION>?key=MY_VERY_SECRET_KEY
That all works, but I would like to add the server timestamp as well. I've found a few answers here on stackoverflow, but I have not been able to make any of them work.
How can I add the server timestamp to the created document? What I want is for the following to be created:
{
  "fields": {
    "distance": {
      "integerValue": "555"
    },
    "timestamp": {
      "DATETIME": SERVER_TIMESTAMP
    }
  }
}
Any help appreciated.
What I ended up doing was the following:
A POST request to a route like this:
https://firestore.googleapis.com/v1beta1/projects/<MY_PROJECT>/databases/(default)/documents:commit?key=<MY_VERY_SECRET_KEY>
With the following payload:
{
  "writes": [
    {
      "update": {
        "name": "projects/<MY_PROJECT>/databases/(default)/documents/<COLLECTION_ID>/<DOCUMENT_ID>",
        "fields": {
          "distance": {
            "integerValue": "555"
          }
        }
      }
    },
    {
      "transform": {
        "document": "projects/<MY_PROJECT>/databases/(default)/documents/<COLLECTION_ID>/<DOCUMENT_ID>",
        "fieldTransforms": [
          {
            "fieldPath": "servertime",
            "setToServerValue": "REQUEST_TIME"
          }
        ]
      }
    }
  ]
}
Where I generate a new DOCUMENT_ID (e.g. a GUID) instead of having Cloud Firestore generate one for me.
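For repeatability, here is the same two-write commit as a Python sketch (assuming the requests library; the all-caps placeholders are the same ones you must substitute):

import json
import uuid
import requests

# Placeholders -- substitute your own project, collection and API key.
PROJECT = "MY_PROJECT"
COLLECTION = "SOME_COLLECTION"
API_KEY = "MY_VERY_SECRET_KEY"

# Generate the document ID client-side, as described above.
doc_id = uuid.uuid4().hex
name = f"projects/{PROJECT}/databases/(default)/documents/{COLLECTION}/{doc_id}"

payload = {
    "writes": [
        {"update": {"name": name,
                    "fields": {"distance": {"integerValue": "555"}}}},
        {"transform": {"document": name,
                       "fieldTransforms": [{"fieldPath": "servertime",
                                            "setToServerValue": "REQUEST_TIME"}]}},
    ]
}

url = (f"https://firestore.googleapis.com/v1beta1/projects/{PROJECT}"
       f"/databases/(default)/documents:commit?key={API_KEY}")
response = requests.post(url, data=json.dumps(payload))
print(response.status_code, response.json())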

Request problem with Google Cloud Datastore and Filter

I'm currently doing some tests on google datastore, but I'm having a problem with my queries.
According to the documentation (https://cloud.google.com/datastore/docs/concepts/queries), we can filter on several properties with the EQUAL operator.
But when testing, I get an error from the API.
While searching Datastore's GitHub, I found this issue: https://github.com/GoogleCloudPlatform/google-cloud-dotnet/issues/304. It corresponds to my problem, except that in my case the query looks correct.
Here is the request sent:
{
  "kind": [{
    "name": "talk.message"
  }],
  "filter": {
    "compositeFilter": {
      "op": "AND",
      "filters": [{
        "propertyFilter": {
          "property": {
            "name": "Conversation"
          },
          "op": "EQUAL",
          "value": {
            "stringValue": "2f16c14f6939464ea687d316438ad4cb"
          }
        }
      }, {
        "propertyFilter": {
          "property": {
            "name": "CreatedOn"
          },
          "op": "LESS_THAN_OR_EQUAL",
          "value": {
            "timestampValue": "2019-03-15T10:43:31.474166300Z"
          }
        }
      }, {
        "propertyFilter": {
          "property": {
            "name": "CreatedOn"
          },
          "op": "GREATER_THAN_OR_EQUAL",
          "value": {
            "timestampValue": "2019-03-14T10:43:31.474175100Z"
          }
        }
      }]
    }
  }
}
And here is the response from the API:
{Grpc.Core.RpcException: Status(
StatusCode=FailedPrecondition,
Detail="no matching index found. recommended index is:
- kind: talk.message
properties:
- name: Conversation
- name: CreatedOn"
)
According to the documentation, this should be fine... but it's not!
What am I missing?
Your query combines an EQUAL filter (on Conversation) with inequality filters (on CreatedOn), so Datastore needs a composite index to fulfil it. The query itself is valid; it just cannot run until that composite index exists.
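If you manage indexes with the gcloud CLI, the recommendation in the error message maps directly onto an index.yaml entry. A sketch (deploy it with gcloud datastore indexes create index.yaml):

indexes:
- kind: talk.message
  properties:
  - name: Conversation
  - name: CreatedOn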

API gateway body mapper nested objects

I have an endpoint set up on API Gateway that talks directly to DynamoDB.
As a POST request comes in, I use the body-mapping template to map my URL request parameters to DynamoDB parameters.
My URL params
{
  "name": "sdaf",
  "location": "asdf",
  "gender": "male"
}
Body Mapper Script
{
  "TableName": "sample-table",
  "Item": {
    "firstName": {
      "S": "$input.path('$.name')"
    },
    "location": {
      "S": "$input.path('$.location')"
    }
  }
}
All of this works fine until I have to write a whole object to dynamo.
New URL Params
{
  "name": "sdaf",
  "location": "asdf",
  "gender": "male",
  "hobbies": {
    "hobby1": {
      "startedAt": "<some time>"
    },
    "hobby2": {
      "startedAt": "<some time>"
    }
  }
}
I am not sure what the body-mapping template should look like in this situation.
I have tried this:
Body Mapper
{
  "TableName": "sample-table",
  "Item": {
    "firstName": {
      "S": "$input.path('$.name')"
    },
    "location": {
      "S": "$input.path('$.location')"
    },
    "hobbies": {
      "M": "$input.path('$.hobbies')"
    }
  }
}
But it doesn't work. I wonder if there is a way to dump an object into a DynamoDB column directly from API Gateway. I know this is possible by adding a Lambda in between, but I want to avoid that.
I don't think this can be made to work while passing hobbies as a URL parameter.
If you instead pass hobbies in the body, you can do something like this:
"M": {
#foreach( $elem in $input.path('$.hobbies'))
$elem
#if($foreach.hasNext),#end
#end
}
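For context on why dumping raw JSON into "M" fails: DynamoDB's low-level API expects a type descriptor on every nested value, so whatever the template emits for hobbies ultimately has to expand into something shaped like this (a hand-written sketch; the timestamps are hypothetical stand-ins for "<some time>"):

"hobbies": {
  "M": {
    "hobby1": { "M": { "startedAt": { "S": "2019-01-01T00:00:00Z" } } },
    "hobby2": { "M": { "startedAt": { "S": "2019-02-01T00:00:00Z" } } }
  }
}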

What is the best way to query the document closest to a date-time on elasticsearch?

I need to retrieve the document whose geo location and date-time are closest to those in the request, so I'm not looking for an exact date-time match but for the closest one. I solved it using a custom script, but I'm guessing there might be a better way to do it, similar to the way I'm filtering the geo location based on a location and a distance.
Here's my code (in Python):
query = {
    "query": {
        "function_score": {
            "boost_mode": "replace",
            "query": {
                "filtered": {
                    "query": {
                        "match_all": {}
                    },
                    "filter": {
                        "geo_distance": {
                            "distance": "10km",
                            "location": json.loads(self.request.body)["location"]
                        }
                    }
                }
            },
            "script_score": {
                "lang": "groovy",
                "script_file": "calculate-score",
                "params": {
                    "stamp": json.loads(self.request.body)["stamp"]
                }
            }
        }
    },
    "sort": [
        {"_score": "asc"}
    ],
    "size": 1
}
response = requests.get('http://localhost:9200/meteo/meteo/_search', data=json.dumps(query))
The custom calculate-score.groovy script contains the following:
abs(new java.text.SimpleDateFormat("yyyy-MM-dd\'T\'HH:mm").parse(stamp).getTime() - doc["stamp"].date.getMillis()) / 60000
The script returns the score as the absolute difference in minutes between the document date-time and the requested date-time.
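To make the units explicit, here is the same score computed in Python (a sketch; doc_millis stands in for doc["stamp"].date.getMillis() on the Elasticsearch side):

from datetime import datetime

def score(stamp: str, doc_millis: int) -> float:
    # Absolute difference in minutes between the requested and stored stamps.
    requested = datetime.strptime(stamp, "%Y-%m-%dT%H:%M")
    return abs(requested.timestamp() * 1000 - doc_millis) / 60000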
Is there any other way to achieve this?
You should be able to use function_score to do this.
You could use the decay functions mentioned in the documentation to give a larger score to documents closer to the origin timestamp. Below is an example
where scale = 28800 minutes, i.e. 20 days.
Example:
PUT test

PUT test/test/_mapping
{
  "properties": {
    "stamp": {
      "type": "date",
      "format": "dateOptionalTime"
    }
  }
}

PUT test/test/1
{
  "stamp": "2015-10-15T00:00"
}

PUT test/test/2
{
  "stamp": "2015-10-15T12:00"
}

POST test/_search
{
  "query": {
    "function_score": {
      "functions": [
        {
          "linear": {
            "stamp": {
              "origin": "now",
              "scale": "28800m"
            }
          }
        }
      ],
      "score_mode": "multiply",
      "boost_mode": "multiply",
      "query": {
        "match_all": {}
      }
    }
  }
}
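For completeness, the same decay-function search can be issued from Python in the style of the question's code. A sketch that assumes the question's meteo index, and that sets origin to the requested stamp rather than now, since the goal is closeness to the requested date-time:

import json
import requests

stamp = "2015-10-15T00:00"  # hypothetical requested date-time

query = {
    "size": 1,
    "query": {
        "function_score": {
            "functions": [
                # Score decays linearly as documents move away from `stamp`.
                {"linear": {"stamp": {"origin": stamp, "scale": "28800m"}}}
            ],
            "score_mode": "multiply",
            "boost_mode": "multiply",
            "query": {"match_all": {}},
        }
    },
}

response = requests.get("http://localhost:9200/meteo/meteo/_search",
                        data=json.dumps(query))
print(response.json()["hits"]["hits"])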

Elasticsearch PHP longest prefix match

I am currently using the FOSElasticaBundle in Symfony2 and I am having a hard time trying to build a search to match the longest prefix.
I am aware of the hundred examples on the Internet for performing autocomplete-like searches with it. However, my problem is a little different.
In an autocomplete-type search, the database holds the longest alphanumeric string (in number of characters) and the user provides only a shorter portion; let's say the user types "jho" and Elasticsearch can easily suggest "Jhon, Jhonny, Jhonas".
My problem is backwards, I would like to provide the longest alphanumeric string and I want Elasticsearch to provide me the biggest match in the database.
For example: I could provide "123456789" and my database can have [12,123,14,156,16,7,1234,1,67,8,9,123456,0], in this case the longest prefix match in the database for the number that the user provided is "123456".
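As a reference for the expected semantics, here is the same "longest prefix of the input" lookup in plain Python; not a solution, just a specification sketch of what a correct Elasticsearch answer has to return:

def longest_prefix(needle, haystack):
    # Return the longest stored string that is a prefix of the input.
    matches = [s for s in haystack if needle.startswith(s)]
    return max(matches, key=len) if matches else None

stored = ["12", "123", "14", "156", "16", "7", "1234", "1", "67", "8", "9", "123456", "0"]
print(longest_prefix("123456789", stored))  # -> 123456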
I am just starting with Elasticsearch so I don't really have a close to working settings or anything.
If there is any information not clear or missing let me know and I will provide more details.
Update 1 (Using Val's 2nd Update)
Index: Download 1800+ indexes
Settings:
curl -XPUT localhost:9200/tests -d '{
  "settings": {
    "analysis": {
      "analyzer": {
        "edge_ngram_analyzer": {
          "tokenizer": "edge_ngram_tokenizer",
          "filter": [ "lowercase" ]
        }
      },
      "tokenizer": {
        "edge_ngram_tokenizer": {
          "type": "edgeNGram",
          "min_gram": "2",
          "max_gram": "25"
        }
      }
    }
  },
  "mappings": {
    "test": {
      "properties": {
        "my_string": {
          "type": "string",
          "fields": {
            "prefix": {
              "type": "string",
              "analyzer": "edge_ngram_analyzer"
            }
          }
        }
      }
    }
  }
}'
Query:
curl -XPOST localhost:9200/tests/test/_search?pretty=true -d '{
  "size": 1,
  "sort": {
    "_script": {
      "script": "doc.my_string.value.length()",
      "type": "number",
      "order": "desc"
    },
    "_score": "desc"
  },
  "query": {
    "filtered": {
      "query": {
        "match": {
          "my_string.prefix": "8092232423"
        }
      },
      "filter": {
        "script": {
          "script": "doc.my_string.value.length() <= maxlength",
          "params": {
            "maxlength": 10
          }
        }
      }
    }
  }
}'
With this configuration the query returns the following results:
{
  "took": 61,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1754,
    "max_score": null,
    "hits": [ {
      "_index": "tests",
      "_type": "test",
      "_id": "AU8LqQo4FbTZPxBtq3-Q",
      "_score": 0.13441172,
      "_source": {"my_string": "80928870"},
      "sort": [ 8.0, 0.13441172 ]
    } ]
  }
}
Bonus question
I would like to provide an array of numbers to that search and get the matching prefix for each one efficiently, without having to run a separate query for each number.
Here is my take on it.
Basically, what we need to do is to slice and dice the field (called my_string below) at indexing time with an edgeNGram tokenizer (called edge_ngram_tokenizer below). That way a string like 123456789 will be tokenized to 12, 123, 1234, 12345, 123456, 1234567, 12345678, 123456789 and all tokens will be indexed and searchable.
So let's create a tests index with a custom analyzer called edge_ngram_analyzer and a test mapping containing a single string field called my_string. You'll note that my_string is a multi-field declaring a prefixes sub-field which will contain all the tokenized prefixes.
curl -XPUT localhost:9200/tests -d '{
  "settings": {
    "analysis": {
      "analyzer": {
        "edge_ngram_analyzer": {
          "tokenizer": "edge_ngram_tokenizer",
          "filter": [ "lowercase" ]
        }
      },
      "tokenizer": {
        "edge_ngram_tokenizer": {
          "type": "edgeNGram",
          "min_gram": "2",
          "max_gram": "25"
        }
      }
    }
  },
  "mappings": {
    "test": {
      "properties": {
        "my_string": {
          "type": "string",
          "fields": {
            "prefixes": {
              "type": "string",
              "index_analyzer": "edge_ngram_analyzer"
            }
          }
        }
      }
    }
  }
}'
Then let's index a few test documents using the _bulk API:
curl -XPOST localhost:9200/tests/test/_bulk -d '
{"index":{}}
{"my_string":"12"}
{"index":{}}
{"my_string":"1234"}
{"index":{}}
{"my_string":"1234567890"}
{"index":{}}
{"my_string":"abcd"}
{"index":{}}
{"my_string":"abcdefgh"}
{"index":{}}
{"my_string":"123456789abcd"}
{"index":{}}
{"my_string":"abcd123456789"}
'
The thing I found particularly tricky was that the matching result could be either longer or shorter than the input string. To handle that, we have to combine two queries, one looking for shorter matches and another for longer matches. The match query finds documents with shorter "prefixes" matching the input, and the query_string query (with the edge_ngram_analyzer applied to the input string!) searches for "prefixes" longer than the input string. Both, enclosed in a bool/should and sorted by decreasing string length (i.e. longest first), will do the trick.
Let's do some queries and see what unfolds:
This query will return the one document with the longest match for "123456789", i.e. "123456789abcd". In this case, the result is longer than the input.
curl -XPOST localhost:9200/tests/test/_search -d '{
  "size": 1,
  "sort": {
    "_script": {
      "script": "doc.my_string.value.length()",
      "type": "number",
      "order": "desc"
    }
  },
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "my_string.prefixes": "123456789"
          }
        },
        {
          "query_string": {
            "query": "123456789",
            "default_field": "my_string.prefixes",
            "analyzer": "edge_ngram_analyzer"
          }
        }
      ]
    }
  }
}'
The second query will return the one document with the longest match for "123456789abcdef", i.e. "123456789abcd". In this case, the result is shorter than the input.
curl -XPOST localhost:9200/tests/test/_search -d '{
  "size": 1,
  "sort": {
    "_script": {
      "script": "doc.my_string.value.length()",
      "type": "number",
      "order": "desc"
    }
  },
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "my_string.prefixes": "123456789abcdef"
          }
        },
        {
          "query_string": {
            "query": "123456789abcdef",
            "default_field": "my_string.prefixes",
            "analyzer": "edge_ngram_analyzer"
          }
        }
      ]
    }
  }
}'
I hope that covers it. Let me know if not.
As for your bonus question, I'd simply suggest using the _msearch API and sending all queries at once.
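A minimal sketch of what that _msearch call could look like, with one header/body pair per number (the bodies are abbreviated to the match clause; in practice each would carry the same sort and bool query as above):

curl -XPOST localhost:9200/tests/test/_msearch -d '
{}
{"size": 1, "query": {"match": {"my_string.prefixes": "123456789"}}}
{}
{"size": 1, "query": {"match": {"my_string.prefixes": "80922324"}}}
'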
UPDATE: Finally, make sure that scripting is enabled in your elasticsearch.yml file using the following:
# if you have ES <1.6
script.disable_dynamic: false
# if you have ES >=1.6
script.inline: on
UPDATE 2: I'm leaving the above in place, as the use case might fit someone else's needs. Now, since you only need "shorter" prefixes (makes sense!), we need to change the mapping and the query a little bit.
The mapping would be like this:
{
  "settings": {
    "analysis": {
      "analyzer": {
        "edge_ngram_analyzer": {
          "tokenizer": "edge_ngram_tokenizer",
          "filter": [
            "lowercase"
          ]
        }
      },
      "tokenizer": {
        "edge_ngram_tokenizer": {
          "type": "edgeNGram",
          "min_gram": "2",
          "max_gram": "25"
        }
      }
    }
  },
  "mappings": {
    "test": {
      "properties": {
        "my_string": {
          "type": "string",
          "fields": {
            "prefixes": {
              "type": "string",
              "analyzer": "edge_ngram_analyzer" <--- only change
            }
          }
        }
      }
    }
  }
}
And the query is now a bit different: it will always return only the longest prefix that is shorter than, or equal in length to, the input string. Please try it out. I advise re-indexing your data to make sure everything is set up properly.
{
  "size": 1,
  "sort": {
    "_script": {
      "script": "doc.my_string.value.length()",
      "type": "number",
      "order": "desc"
    },
    "_score": "desc" <----- also add this line
  },
  "query": {
    "filtered": {
      "query": {
        "match": {
          "my_string.prefixes": "123" <--- input string
        }
      },
      "filter": {
        "script": {
          "script": "doc.my_string.value.length() <= maxlength",
          "params": {
            "maxlength": 3 <---- this needs to be set to the length of the input string
          }
        }
      }
    }
  }
}
