What is the best way to query the document closest to a date-time in Elasticsearch?

I need to retrieve the document whose geo location and date-time are closest to the request, so I'm not looking for an exact match on the date-time but for the closest one. I solved it using a custom script, but I'm guessing there might be a better way to do it, similar to the way I'm filtering the geo location based on a location and a distance.
Here's my code (in Python):
query = {
    "query": {
        "function_score": {
            "boost_mode": "replace",
            "query": {
                "filtered": {
                    "query": {
                        "match_all": {}
                    },
                    "filter": {
                        "geo_distance": {
                            "distance": "10km",
                            "location": json.loads(self.request.body)["location"]
                        }
                    }
                }
            },
            "script_score": {
                "lang": "groovy",
                "script_file": "calculate-score",
                "params": {
                    "stamp": json.loads(self.request.body)["stamp"]
                }
            }
        }
    },
    "sort": [
        {"_score": "asc"}
    ],
    "size": 1
}
response = requests.get('http://localhost:9200/meteo/meteo/_search', data=json.dumps(query))
The custom calculate-score.groovy script contains the following:
Math.abs(new java.text.SimpleDateFormat("yyyy-MM-dd'T'HH:mm").parse(stamp).getTime() - doc["stamp"].date.getMillis()) / 60000
The script returns the score as the absolute difference in minutes between the document date-time and the requested date-time.
Is there any other way to achieve this?

You should be able to use function_score to do this.
You could use the decay functions mentioned in the documentation to give a larger score to documents closer to the origin timestamp. Below is an example where scale = 28800 minutes, i.e. 20 days.
Example:
PUT test
PUT test/test/_mapping
{
  "properties": {
    "stamp": {
      "type": "date",
      "format": "dateOptionalTime"
    }
  }
}
PUT test/test/1
{
  "stamp": "2015-10-15T00:00"
}
PUT test/test/2
{
  "stamp": "2015-10-15T12:00"
}
POST test/_search
{
  "query": {
    "function_score": {
      "functions": [
        {
          "linear": {
            "stamp": {
              "origin": "now",
              "scale": "28800m"
            }
          }
        }
      ],
      "score_mode": "multiply",
      "boost_mode": "multiply",
      "query": {
        "match_all": {}
      }
    }
  }
}
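For the original meteo query, here is a minimal sketch (in Python, as in the question) of how the decay approach could replace the Groovy script: the geo_distance filter stays, and a linear decay on stamp uses the requested timestamp as the origin. The bool/filter wrapping, the 60m scale and the Content-Type header are assumptions; adjust them to your Elasticsearch version and data.
import json
import requests

body = json.loads(self.request.body)  # same handler context as in the question

query = {
    "query": {
        "function_score": {
            # keep only documents within 10 km, as in the original query
            "query": {
                "bool": {
                    "filter": {
                        "geo_distance": {
                            "distance": "10km",
                            "location": body["location"]
                        }
                    }
                }
            },
            # score documents by how close their stamp is to the requested one
            "functions": [
                {
                    "linear": {
                        "stamp": {
                            "origin": body["stamp"],  # requested date-time instead of "now"
                            "scale": "60m"            # assumption: tune how quickly the score decays
                        }
                    }
                }
            ],
            "boost_mode": "replace"
        }
    },
    "sort": [{"_score": "desc"}],  # higher score = closer in time
    "size": 1
}

response = requests.get(
    "http://localhost:9200/meteo/meteo/_search",
    data=json.dumps(query),
    headers={"Content-Type": "application/json"}
)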

Related

Kibana two different DSL Queries behaving as OR

I'm trying to create two queries that appear as blue filter buttons in the visualization, and I want to apply both as if they were an OR, so that I can filter my logs by INFO, by ERROR, or by both at the same time. If I enable one or the other alone they work as expected, but if I enable both, the final query behaves like INFO AND ERROR, when what I want is INFO OR ERROR. Both queries are similar, one with ERROR and the other with INFO:
{
  "query": {
    "bool": {
      "filter": [
        {
          "match": {
            "message": "INFO"
          }
        }
      ]
    }
  }
}
I tried both filter and should.
I did look at the Inspect output but I can't make sense of it.
Any idea if this is possible at all?
Thanks.
EDITED for clarification after 1st reply:
What I need is two different, separate queries (one with "status": "info" and the other with "status": "error"), because I want to attach them to the blue buttons that appear when you click "Add a filter". So I end up with two blue buttons, ERROR and INFO, and when both are enabled it should show both kinds of lines. At the moment they work individually, but when I enable both it seems to behave like ERROR AND INFO, and no line matches both; what I want is some kind of ERROR OR INFO so it displays both. Any idea?
EDIT 2:
From my last comment below, looking at the Inspect output with the two queries each in its own button, it shows:
"query": {
"bool": {
"must": [ <--- my scripts below get wrapped in this MUST
{
"bool": {
"should": [
{
"match_phrase": {
"message": "INFO"
}
}
],
"minimum_should_match": 1
}
},
{
"bool": {
"should": [
{
"match_phrase": {
"message": "ERROR"
}
}
],
"minimum_should_match": 1
}
},
...
and the queries I have in the two buttons:
INFO
{
  "query": {
    "bool": {
      "should": [
        {
          "match_phrase": {
            "message": "INFO"
          }
        }
      ],
      "minimum_should_match": 1
    }
  }
}
ERROR
{
  "query": {
    "bool": {
      "should": [
        {
          "match_phrase": {
            "message": "ERROR"
          }
        }
      ],
      "minimum_should_match": 1
    }
  }
}
So if there is no way to change how Kibana wraps the queries, I guess I'm stuck...
EDIT
If either clause is allowed to match, it behaves as an OR query.
You can use the should query for this:
POST _bulk
{ "index" : { "_index" : "test_should", "_id" : "1" } }
{ "status" : "info" }
{ "index" : { "_index" : "test_should", "_id" : "2" } }
{ "status" : "error" }
GET test_should/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "status": "info"
          }
        },
        {
          "match": {
            "status": "error"
          }
        }
      ]
    }
  }
}
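If you want to check the OR behaviour outside of the Kibana console, here is a minimal Python sketch that runs the same bool/should query with requests (the test_should index comes from the example above; the localhost URL is an assumption):
import json
import requests

query = {
    "query": {
        "bool": {
            "should": [
                {"match": {"status": "info"}},
                {"match": {"status": "error"}}
            ],
            "minimum_should_match": 1
        }
    }
}

resp = requests.get(
    "http://localhost:9200/test_should/_search",
    data=json.dumps(query),
    headers={"Content-Type": "application/json"}
)
# both the "info" and the "error" document should come back
print(resp.json()["hits"]["hits"])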

JSONPath exclude key in paths

Given a JSON document like this:
{
  "post": {
    "operationId": "post-collection",
    "responses": {
      "201": {
        "description": "Post something",
        "content": {
          "application/ld+json": {
            "schema": {
              "$ref": "./models/Mapping.jsonld.json"
            }
          },
          "application/hal+json": {
            "schema": {
              "$ref": "./models/Mapping.jsonHal.json"
            }
          },
          "application/vnd.api+json": {
            "schema": {
              "$ref": "./models/Mapping.vndApi.json"
            }
          },
          ...
        }
      }
    }
  }
}
I want the content entries for all media types except application/ld+json.
I found a thread on how to do this based on the value: https://stackoverflow.com/a/29710922/1092632 but how would I exclude a key?
Basically what I am asking is the negative of this: $..content[application/ld+json]
Is this possible without any functions / programming? I used this for testing: https://jsonpath.com/
https://jsonpath.com/ uses JSONPath Plus.
It provides the @property shorthand selector within filters:
$..content[?(@property != 'application/ld+json')]

Two filter DSL queries look the same, and how to combine them

In Kibana I created two filters:
raw.browserJs.isWebDriver is true and raw.browserJs.isWebDriver is not true. Why is the "Edit Query DSL" the same for both?
{
  "query": {
    "match": {
      "raw.browserJs.isWebDriver": {
        "query": true,
        "type": "phrase"
      }
    }
  }
}
Also, how can I add a condition so that I end up with one large DSL query that also includes:
{
  "query": {
    "match": {
      "appName": {
        "query": "temp",
        "type": "phrase"
      }
    }
  }
}
The query DSL shown in Kibana is not the actual query that is sent to Elasticsearch. A range filter for the selected period is added, and any filters you negate are inverted. You can see the actual query in the underlying request sent by your browser.
Your filter where raw.browserJs.isWebDriver is not true will end up as something like:
{
  "query": {
    "bool": {
      "must_not": [
        {
          "match_phrase": {
            "raw.browserJs.isWebDriver": true
          }
        }
      ]
    }
  }
}
You can combine multiple conditions in one DSL query with the bool query (https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-bool-query.html).
The following query should work in your example:
{
  "query": {
    "bool": {
      "must": [
        {
          "match_phrase": {
            "raw.browserJs.isWebDriver": true
          }
        },
        {
          "match_phrase": {
            "appName": "temp"
          }
        }
      ]
    }
  }
}
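If you also need the negated filter ("is not true") together with the appName condition, must and must_not clauses can sit side by side in the same bool query. Here is a hedged sketch in Python (the index name and localhost URL are placeholders, not from the question):
import json
import requests

query = {
    "query": {
        "bool": {
            # keep documents for the "temp" application ...
            "must": [
                {"match_phrase": {"appName": "temp"}}
            ],
            # ... and drop those where isWebDriver is true
            "must_not": [
                {"match_phrase": {"raw.browserJs.isWebDriver": True}}
            ]
        }
    }
}

resp = requests.get(
    "http://localhost:9200/your-index/_search",  # placeholder index name
    data=json.dumps(query),
    headers={"Content-Type": "application/json"}
)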

How to change the include section of an AQL query in a file spec

I want to change the output of an AQL query formatted as a file spec for Artifactory.
The query looks like this:
{
  "files": [
    {
      "aql": {
        "items.find": {
          "repo": "gradle-dev-local",
          "$or": [
            {
              "$and": [
                { "stat.downloads": { "$eq": null } },
                { "updated": { "$before": "7d" } }
              ]
            },
            {
              "$and": [
                { "stat.downloads": { "$gt": 0 } },
                { "stat.downloaded": { "$before": "30d" } }
              ]
            }
          ]
        }
      }
    }
  ]
}
In a pure AQL REST API call, I would include the following:
"include":["repo", "name", "path", "updated", "sha256", "stat.downloads", "stat.downloaded"]
But when I add it to the file spec, it does not get passed to the right part of the query, resulting in the following error message:
Failed to parse query: items.find({
"repo":"mfm-gradle-dev-local",
"$or":[
{
"$and": [
{ "stat.downloads": { "$eq":null } },
{ "updated": { "$before": "7d" } }
]
},
{
"$and": [
{ "stat.downloads": { "$gt": 0 } },
{ "stat.downloaded": { "$before": "30d" } }
]
}
]
},
"include":["repo", "name", "path", "updated", "sha256", "stat.downloads", "stat.downloaded"]
).include("name","repo","path","actual_md5","actual_sha1","size","type","property"), it looks like there is syntax error near the following sub-query: "include":["repo", "name", "path", "updated", "sha256", "stat.downloads", "stat.downloaded"]
How do I format the AQL so that the include statement gets passed as well?
If you're using the JFrog CLI, there is an open issue (github.com/jfrog/jfrog-cli-go/issues/320) for being able to add includes in the search queries (both using the -s parameter and file specs). Please feel free to add additional information to that issue, if we've missed anything so far.
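As a workaround, the include clause from the question does work when the AQL is sent to the plain AQL REST endpoint rather than through a file spec. A hedged Python sketch (the Artifactory URL and credentials are placeholders; the query body mirrors the one in the question):
import requests

# AQL with the include clause appended after items.find(...), as in a plain REST call
aql = """
items.find({
  "repo": "gradle-dev-local",
  "$or": [
    { "$and": [
      { "stat.downloads": { "$eq": null } },
      { "updated": { "$before": "7d" } }
    ]},
    { "$and": [
      { "stat.downloads": { "$gt": 0 } },
      { "stat.downloaded": { "$before": "30d" } }
    ]}
  ]
}).include("repo", "name", "path", "updated", "sha256", "stat.downloads", "stat.downloaded")
"""

resp = requests.post(
    "https://artifactory.example.com/artifactory/api/search/aql",  # placeholder URL
    data=aql,
    headers={"Content-Type": "text/plain"},
    auth=("user", "password")  # placeholder credentials
)
print(resp.json())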

Strange query output in Elasticsearch

I just started using Elasticsearch. I've got everything set up correctly. I'm using Firebase + Flashlight + Elasticsearch.
In my front-end I'm building queries based on different search params. I insert them into a node in Firebase at /search/requests/. Flashlight picks this up and puts the response into /search/response; this works like a charm!
However, I'm not sure how to write my queries properly. I'm getting strange results when I try to combine two must match clauses. I'm using the Query DSL.
My documents in Elasticsearch under deliverables/doc have the following schema.
...
{
  "createdBy" : "admin@xx.org",
  "createdOn" : 1501200000000,
  "deadLine" : 1508716800000,
  "description" : {
    "value" : "dummy description"
  },
  "key" : "<FBKEY>",
  "programmes" : [ {
    "code" : "95000",
    "name" : "Test programme",
    "programYear" : 2017
  } ],
  "projects" : [ {
    "projectCode" : "113200",
    "projectName" : "Test project",
    "projectYear" : 2017
  } ],
  "reportingYear" : 2017,
  "status" : "Open",
  "type" : "writing",
  "updatedBy" : "admin@xx.org",
  "updatedOn" : 1501200000000
},
...
My query has the following structure.
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "createdBy": "xx@company.org"
          },
          "match": {
            "programmes.code": "95000"
          }
        }
      ]
    }
  }
}
In my output I'm also getting documents that don't match both of those fields. They have a very low score as well. Is this normal?
My mapping, automatically created using Flashlight
Update 1
I just tried this query, but it still gives me strange results and doesn't filter on both fields:
{
  "query": {
    "bool": {
      "filter": {
        "bool": {
          "must": [
            {
              "match": {
                "programmes.code": "890000"
              }
            },
            {
              "match": {
                "createdBy": "admin@xx.org"
              }
            }
          ]
        }
      }
    }
  }
}
The must clause of a bool query is executed in query context (all documents are returned, in decreasing order of score) and contributes to the score; see the bool query documentation.
If you want it to be executed as a filter, then use the following query:
{
  "query": {
    "bool": {
      "filter": {
        "bool": {
          "must": [
            {
              "match": {
                "createdBy": "xx@company.org"
              }
            },
            {
              "match": {
                "programmes.code": "95000"
              }
            }
          ]
        }
      }
    }
  }
}
NOTE:
By default, string fields are analyzed; update the mapping of these string fields to not_analyzed if you want them to behave as exact matches in the filter query. Refer to: mapping-intro
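For completeness, here is a minimal Python sketch that runs the filter-context query above with requests, against the deliverables index mentioned in the question (the localhost URL is an assumption):
import json
import requests

query = {
    "query": {
        "bool": {
            "filter": {
                "bool": {
                    "must": [
                        {"match": {"createdBy": "xx@company.org"}},
                        {"match": {"programmes.code": "95000"}}
                    ]
                }
            }
        }
    }
}

resp = requests.get(
    "http://localhost:9200/deliverables/_search",
    data=json.dumps(query),
    headers={"Content-Type": "application/json"}
)
# only documents matching both fields should come back (no score is computed in filter context)
print(resp.json()["hits"]["hits"])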
