Kibana: two different DSL queries behaving as OR

I'm trying to create two queries that appear as blue filter buttons in the visualizer, and I want to apply both as if they were combined with OR, so that I can filter my logs by INFO, ERROR, or both at the same time. If I enable one or the other alone they work as expected, but if I enable both, the final query behaves like INFO AND ERROR, when what I want is INFO OR ERROR. The two queries are identical except that one matches ERROR and the other INFO:
{
  "query": {
    "bool": {
      "filter": [
        {
          "match": {
            "message": "INFO"
          }
        }
      ]
    }
  }
}
I have tried both filter and should.
I did look at the Inspect panel, but I can't make sense of it.
Any idea if this is possible at all?
Thanks.
EDITED for clarification after the 1st reply:
What I need is two different, separate queries (one with "status": "info" and the other with "status": "error"), because I want to attach them to the blue buttons that appear when you click "Add a filter". I end up with two blue buttons, ERROR and INFO, and when both are enabled it should show both kinds of lines. At the moment they work individually, but when I enable both, the result behaves like ERROR AND INFO, and no line matches both; what I want is some kind of ERROR OR INFO so it displays both. Any idea?
EDIT 2:
From my last comment below: looking at the Inspect panel with the two scripts, each one in its own button, it shows
Inspect
"query": {
"bool": {
"must": [ <--- my scripts below get wrapped in this MUST
{
"bool": {
"should": [
{
"match_phrase": {
"message": "INFO"
}
}
],
"minimum_should_match": 1
}
},
{
"bool": {
"should": [
{
"match_phrase": {
"message": "ERROR"
}
}
],
"minimum_should_match": 1
}
},
...
and these are the scripts I have in the two buttons:
INFO
{
  "query": {
    "bool": {
      "should": [
        {
          "match_phrase": {
            "message": "INFO"
          }
        }
      ],
      "minimum_should_match": 1
    }
  }
}
ERROR
{
  "query": {
    "bool": {
      "should": [
        {
          "match_phrase": {
            "message": "ERROR"
          }
        }
      ],
      "minimum_should_match": 1
    }
  }
}
So if there is no way to change the way Kibana wraps the scripts, I guess I'm stuck...

EDIT
You can use the "is one of" filter operator; it behaves as an OR query.
You can also use a should query:
POST _bulk
{ "index" : { "_index" : "test_should", "_id" : "1" } }
{ "status" : "info" }
{ "index" : { "_index" : "test_should", "_id" : "2" } }
{ "status" : "error" }
GET test_should/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "status": "info"
          }
        },
        {
          "match": {
            "status": "error"
          }
        }
      ]
    }
  }
}
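If the goal is OR behavior from the Kibana filter buttons, one workaround (a minimal sketch using the message field from the question) is to put both clauses into a single filter button, so there is nothing for Kibana to wrap in a must:
{
  "query": {
    "bool": {
      "should": [
        {
          "match_phrase": {
            "message": "INFO"
          }
        },
        {
          "match_phrase": {
            "message": "ERROR"
          }
        }
      ],
      "minimum_should_match": 1
    }
  }
}
You lose the ability to toggle each level independently, but the combined button gives exactly INFO OR ERROR.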

Related

Two filter DSL queries look the same, and how to combine them

In Kibana I created 2 filters:
raw.browserJs.isWebDriver is true and raw.browserJs.isWebDriver is not true. Why is the Edit Query DSL the same for both?
{
  "query": {
    "match": {
      "raw.browserJs.isWebDriver": {
        "query": true,
        "type": "phrase"
      }
    }
  }
}
Also, how can I add a condition in order to have one large DSL query combined with:
{
  "query": {
    "match": {
      "appName": {
        "query": "temp",
        "type": "phrase"
      }
    }
  }
}
The query DSL shown in Kibana is not the actual query that is sent to Elasticsearch. A range filter for the selected period is added, and negated filters are inverted. You can see the actual query in the underlying request sent by your browser.
Your filter where raw.browserJs.isWebDriver is not true will end up as something like:
{
  "query": {
    "bool": {
      "must_not": [
        {
          "match_phrase": {
            "raw.browserJs.isWebDriver": true
          }
        }
      ]
    }
  }
}
You can combine multiple conditions in one DSL query with the bool query (https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-bool-query.html).
The following query will work in your example:
{
  "query": {
    "bool": {
      "must": [
        {
          "match_phrase": {
            "raw.browserJs.isWebDriver": true
          }
        },
        {
          "match_phrase": {
            "appName": "temp"
          }
        }
      ]
    }
  }
}
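The bool query also covers the negated case; a sketch combining the appName condition with the isWebDriver is not true filter from above:
{
  "query": {
    "bool": {
      "must": [
        {
          "match_phrase": {
            "appName": "temp"
          }
        }
      ],
      "must_not": [
        {
          "match_phrase": {
            "raw.browserJs.isWebDriver": true
          }
        }
      ]
    }
  }
}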

How to change the include section of an AQL query in a file spec

I want to change the output of an AQL query formatted as a file spec for Artifactory.
The query looks like this:
{
  "files": [
    {
      "aql": {
        "items.find": {
          "repo": "gradle-dev-local",
          "$or": [
            {
              "$and": [
                { "stat.downloads": { "$eq": null } },
                { "updated": { "$before": "7d" } }
              ]
            },
            {
              "$and": [
                { "stat.downloads": { "$gt": 0 } },
                { "stat.downloaded": { "$before": "30d" } }
              ]
            }
          ]
        }
      }
    }
  ]
}
In a pure AQL REST API call, I would include the following:
"include":["repo", "name", "path", "updated", "sha256", "stat.downloads", "stat.downloaded"]
But when used, it does not get passed into the right part of the query, resulting in the following error message:
Failed to parse query: items.find({
"repo":"mfm-gradle-dev-local",
"$or":[
{
"$and": [
{ "stat.downloads": { "$eq":null } },
{ "updated": { "$before": "7d" } }
]
},
{
"$and": [
{ "stat.downloads": { "$gt": 0 } },
{ "stat.downloaded": { "$before": "30d" } }
]
}
]
},
"include":["repo", "name", "path", "updated", "sha256", "stat.downloads", "stat.downloaded"]
).include("name","repo","path","actual_md5","actual_sha1","size","type","property"), it looks like there is syntax error near the following sub-query: "include":["repo", "name", "path", "updated", "sha256", "stat.downloads", "stat.downloaded"]
How do I format the AQL so that the include statement gets passed as well?
If you're using the JFrog CLI, there is an open issue (github.com/jfrog/jfrog-cli-go/issues/320) for being able to add includes in the search queries (both using the -s parameter and file specs). Please feel free to add additional information to that issue, if we've missed anything so far.
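Until that issue is resolved, a workaround is to call the AQL REST API directly, where the include clause is appended after items.find(...) rather than nested inside it; a sketch of the raw AQL for the query above:
items.find({
  "repo": "gradle-dev-local",
  "$or": [
    {
      "$and": [
        { "stat.downloads": { "$eq": null } },
        { "updated": { "$before": "7d" } }
      ]
    },
    {
      "$and": [
        { "stat.downloads": { "$gt": 0 } },
        { "stat.downloaded": { "$before": "30d" } }
      ]
    }
  ]
}).include("repo", "name", "path", "updated", "sha256", "stat.downloads", "stat.downloaded")
This matches the shape the parser expects, as shown by the .include(...) suffix in the error output above.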

Strange query output from Elasticsearch

I just started using Elasticsearch. I've got everything set up correctly. I'm using Firebase + Flashlight + Elasticsearch.
In my front-end I build queries based on different search params and insert them into a node in Firebase at /search/requests/. Flashlight picks this up and puts the response into /search/response; this works like a charm!
However, I'm not sure how to write my queries properly. I'm getting strange results when I try to combine two must match queries. I'm using the Query DSL.
My documents in Elasticsearch under deliverables/doc have the following schema:
...
{
  "createdBy" : "admin@xx.org",
  "createdOn" : 1501200000000,
  "deadLine" : 1508716800000,
  "description" : {
    "value" : "dummy description"
  },
  "key" : "<FBKEY>",
  "programmes" : [ {
    "code" : "95000",
    "name" : "Test programme",
    "programYear" : 2017
  } ],
  "projects" : [ {
    "projectCode" : "113200",
    "projectName" : "Test project",
    "projectYear" : 2017
  } ],
  "reportingYear" : 2017,
  "status" : "Open",
  "type" : "writing",
  "updatedBy" : "admin@xx.org",
  "updatedOn" : 1501200000000
},
...
My query has the following structure.
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "createdBy": "xx@company.org"
          },
          "match": {
            "programmes.code": "95000"
          }
        }
      ]
    }
  }
}
In my output I'm also getting documents that don't match both of those fields. They have a very low score as well. Is this normal?
My mapping was automatically created using Flashlight.
Update 1
I just tried this query; however, it still gives me strange results and does not filter on both fields:
{
  "query": {
    "bool": {
      "filter": {
        "bool": {
          "must": [
            {
              "match": {
                "programmes.code": "890000"
              }
            },
            {
              "match": {
                "createdBy": "admin@xx.org"
              }
            }
          ]
        }
      }
    }
  }
}
The must clause used in a bool query is executed in query context (all documents are returned, in decreasing order of score) and contributes to the score; see the Elasticsearch documentation on query and filter context.
If you want it to be executed as a filter, then use the following query:
{
  "query": {
    "bool": {
      "filter": {
        "bool": {
          "must": [
            {
              "match": {
                "createdBy": "xx@company.org"
              }
            },
            {
              "match": {
                "programmes.code": "95000"
              }
            }
          ]
        }
      }
    }
  }
}
NOTE:
By default, string fields are analyzed; update the mapping of the string fields to not_analyzed in order to use them in a filter query. Refer to the mapping introduction in the Elasticsearch docs.
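A minimal sketch of such a mapping for the fields used above, assuming a pre-5.x cluster where the string type with not_analyzed is still valid (on 5.x+ you would use the keyword type instead):
{
  "mappings": {
    "doc": {
      "properties": {
        "createdBy": {
          "type": "string",
          "index": "not_analyzed"
        },
        "programmes": {
          "properties": {
            "code": {
              "type": "string",
              "index": "not_analyzed"
            }
          }
        }
      }
    }
  }
}
Note that changing an existing field from analyzed to not_analyzed requires re-indexing the data.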

What is the best way to query the document closest to a date-time on elasticsearch?

I need to retrieve the document whose geo location and date-time are closest to those in the request, so I'm not looking for an exact match on the date-time but for the closest one. I solved it using a custom script, but I'm guessing there might be a better way to do it, similar to the way I'm filtering the geo location based on a location and a distance.
Here's my code (in Python):
# imports needed for this snippet; it runs inside a request handler,
# which is where self.request.body comes from
import json
import requests

query = {
    "query": {
        "function_score": {
            "boost_mode": "replace",
            "query": {
                "filtered": {
                    "query": {"match_all": {}},
                    "filter": {
                        "geo_distance": {
                            "distance": "10km",
                            "location": json.loads(self.request.body)["location"]
                        }
                    }
                }
            },
            "script_score": {
                "lang": "groovy",
                "script_file": "calculate-score",
                "params": {
                    "stamp": json.loads(self.request.body)["stamp"]
                }
            }
        }
    },
    "sort": [
        {"_score": "asc"}
    ],
    "size": 1
}
response = requests.get('http://localhost:9200/meteo/meteo/_search', data=json.dumps(query))
The custom calculate-score.groovy script contains the following:
abs(new java.text.SimpleDateFormat("yyyy-MM-dd\'T\'HH:mm").parse(stamp).getTime() - doc["stamp"].date.getMillis()) / 60000
The script returns the score as the absolute difference in minutes between the document date-time and the requested date-time.
Is there any other way to achieve this?
You should be able to use function_score to do this.
You could use the decay functions mentioned in the documentation to give a larger score to documents closer to the origin timestamp. Below is an example
where scale = 28800 minutes, i.e. 20 days.
Example:
PUT test
PUT test/test/_mapping
{
  "properties": {
    "stamp": {
      "type": "date",
      "format": "dateOptionalTime"
    }
  }
}
PUT test/test/1
{
  "stamp": "2015-10-15T00:00"
}
PUT test/test/2
{
  "stamp": "2015-10-15T12:00"
}
POST test/_search
{
  "query": {
    "function_score": {
      "functions": [
        {
          "linear": {
            "stamp": {
              "origin": "now",
              "scale": "28800m"
            }
          }
        }
      ],
      "score_mode": "multiply",
      "boost_mode": "multiply",
      "query": {
        "match_all": {}
      }
    }
  }
}
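In the actual request you would presumably pass the user-supplied stamp as the origin instead of now, and size: 1 with the default score sort then returns the single closest document; a sketch against the meteo index from the question (the timestamp value is illustrative):
POST meteo/meteo/_search
{
  "size": 1,
  "query": {
    "function_score": {
      "functions": [
        {
          "linear": {
            "stamp": {
              "origin": "2015-10-15T06:00",
              "scale": "28800m"
            }
          }
        }
      ],
      "query": {
        "match_all": {}
      }
    }
  }
}
The geo_distance filter from the original query could be wrapped around the inner query as before.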

Elasticsearch PHP longest prefix match

I am currently using the FOSElasticaBundle in Symfony2 and I am having a hard time trying to build a search to match the longest prefix.
I am aware of the hundred examples on the Internet for performing autocomplete-like searches with it. However, my problem is a little different.
In an autocomplete type of search, the database holds the longer alphanumeric string (in number of characters) and the user provides the shorter portion; say the user types "jho", and Elasticsearch can easily suggest "Jhon, Jhonny, Jhonas".
My problem is the reverse: I provide the longer alphanumeric string, and I want Elasticsearch to give me the longest match in the database.
For example: I could provide "123456789" while my database has [12, 123, 14, 156, 16, 7, 1234, 1, 67, 8, 9, 123456, 0]; in this case the longest prefix match in the database for the number the user provided is "123456".
I am just starting with Elasticsearch, so I don't really have close-to-working settings or anything.
If any information is unclear or missing, let me know and I will provide more details.
Update 1 (Using Val's 2nd Update)
Index: Download 1800+ indexes
Settings:
curl -XPUT localhost:9200/tests -d '{
  "settings": {
    "analysis": {
      "analyzer": {
        "edge_ngram_analyzer": {
          "tokenizer": "edge_ngram_tokenizer",
          "filter": [ "lowercase" ]
        }
      },
      "tokenizer": {
        "edge_ngram_tokenizer": {
          "type": "edgeNGram",
          "min_gram": "2",
          "max_gram": "25"
        }
      }
    }
  },
  "mappings": {
    "test": {
      "properties": {
        "my_string": {
          "type": "string",
          "fields": {
            "prefix": {
              "type": "string",
              "analyzer": "edge_ngram_analyzer"
            }
          }
        }
      }
    }
  }
}'
Query:
curl -XPOST localhost:9200/tests/test/_search?pretty=true -d '{
  "size": 1,
  "sort": {
    "_script": {
      "script": "doc.my_string.value.length()",
      "type": "number",
      "order": "desc"
    },
    "_score": "desc"
  },
  "query": {
    "filtered": {
      "query": {
        "match": {
          "my_string.prefix": "8092232423"
        }
      },
      "filter": {
        "script": {
          "script": "doc.my_string.value.length() <= maxlength",
          "params": {
            "maxlength": 10
          }
        }
      }
    }
  }
}'
With this configuration the query returns the following results:
{
  "took" : 61,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 1754,
    "max_score" : null,
    "hits" : [ {
      "_index" : "tests",
      "_type" : "test",
      "_id" : "AU8LqQo4FbTZPxBtq3-Q",
      "_score" : 0.13441172,
      "_source" : {"my_string":"80928870"},
      "sort" : [ 8.0, 0.13441172 ]
    } ]
  }
}
Bonus question
I would like to provide an array of numbers to that search and get the matching prefix for each one in an efficient way, without having to run the query once per number.
Here is my take on it.
Basically, what we need to do is slice and dice the field (called my_string below) at indexing time with an edgeNGram tokenizer (called edge_ngram_tokenizer below). That way a string like 123456789 will be tokenized into 12, 123, 1234, 12345, 123456, 1234567, 12345678, 123456789, and all of those tokens will be indexed and searchable.
So let's create a tests index, a custom analyzer called edge_ngram_analyzer, and a test mapping containing a single string field called my_string. You'll note that the my_string field is a multi-field declaring a prefixes sub-field which will contain all the tokenized prefixes.
curl -XPUT localhost:9200/tests -d '{
  "settings": {
    "analysis": {
      "analyzer": {
        "edge_ngram_analyzer": {
          "tokenizer": "edge_ngram_tokenizer",
          "filter": [ "lowercase" ]
        }
      },
      "tokenizer": {
        "edge_ngram_tokenizer": {
          "type": "edgeNGram",
          "min_gram": "2",
          "max_gram": "25"
        }
      }
    }
  },
  "mappings": {
    "test": {
      "properties": {
        "my_string": {
          "type": "string",
          "fields": {
            "prefixes": {
              "type": "string",
              "index_analyzer": "edge_ngram_analyzer"
            }
          }
        }
      }
    }
  }
}'
Then let's index a few test documents using the _bulk API:
curl -XPOST localhost:9200/tests/test/_bulk -d '
{"index":{}}
{"my_string":"12"}
{"index":{}}
{"my_string":"1234"}
{"index":{}}
{"my_string":"1234567890"}
{"index":{}}
{"my_string":"abcd"}
{"index":{}}
{"my_string":"abcdefgh"}
{"index":{}}
{"my_string":"123456789abcd"}
{"index":{}}
{"my_string":"abcd123456789"}
'
The thing that I found particularly tricky was that the matching result could be either longer or shorter than the input string. To achieve that, we have to combine two queries, one looking for shorter matches and another for longer matches. So the match query will find documents with shorter "prefixes" matching the input, and the query_string query (with the edge_ngram_analyzer applied to the input string!) will search for "prefixes" longer than the input string. Both enclosed in a bool/should and sorted by decreasing string length (i.e. longest first) will do the trick.
Let's do some queries and see what unfolds:
This query will return the one document with the longest match for "123456789", i.e. "123456789abcd". In this case, the result is longer than the input.
curl -XPOST localhost:9200/tests/test/_search -d '{
  "size": 1,
  "sort": {
    "_script": {
      "script": "doc.my_string.value.length()",
      "type": "number",
      "order": "desc"
    }
  },
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "my_string.prefixes": "123456789"
          }
        },
        {
          "query_string": {
            "query": "123456789",
            "default_field": "my_string.prefixes",
            "analyzer": "edge_ngram_analyzer"
          }
        }
      ]
    }
  }
}'
The second query will return the one document with the longest match for "123456789abcdef", i.e. "123456789abcd". In this case, the result is shorter than the input.
curl -XPOST localhost:9200/tests/test/_search -d '{
  "size": 1,
  "sort": {
    "_script": {
      "script": "doc.my_string.value.length()",
      "type": "number",
      "order": "desc"
    }
  },
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "my_string.prefixes": "123456789abcdef"
          }
        },
        {
          "query_string": {
            "query": "123456789abcdef",
            "default_field": "my_string.prefixes",
            "analyzer": "edge_ngram_analyzer"
          }
        }
      ]
    }
  }
}'
I hope that covers it. Let me know if not.
As for your bonus question, I'd simply suggest using the _msearch API and sending all queries at once.
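A minimal sketch of such an _msearch call (one header line and one body line per input number, in NDJSON form; for brevity each body only shows the match clause, but the full bool/should and sort from the queries above would go there):
curl -XPOST localhost:9200/tests/test/_msearch -d '
{}
{"size": 1, "sort": {"_script": {"script": "doc.my_string.value.length()", "type": "number", "order": "desc"}}, "query": {"match": {"my_string.prefixes": "123456789"}}}
{}
{"size": 1, "sort": {"_script": {"script": "doc.my_string.value.length()", "type": "number", "order": "desc"}}, "query": {"match": {"my_string.prefixes": "8092232423"}}}
'
The results come back in a responses array, in the same order as the requests.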
UPDATE: Finally, make sure that scripting is enabled in your elasticsearch.yml file using the following:
# if you have ES <1.6
script.disable_dynamic: false
# if you have ES >=1.6
script.inline: on
UPDATE 2
I'm leaving the above as that use case might fit someone else's needs. Now, since you only need "shorter" prefixes (makes sense!), we need to change the mapping and the query a little bit.
The mapping would look like this:
{
  "settings": {
    "analysis": {
      "analyzer": {
        "edge_ngram_analyzer": {
          "tokenizer": "edge_ngram_tokenizer",
          "filter": [ "lowercase" ]
        }
      },
      "tokenizer": {
        "edge_ngram_tokenizer": {
          "type": "edgeNGram",
          "min_gram": "2",
          "max_gram": "25"
        }
      }
    }
  },
  "mappings": {
    "test": {
      "properties": {
        "my_string": {
          "type": "string",
          "fields": {
            "prefixes": {
              "type": "string",
              "analyzer": "edge_ngram_analyzer"   <--- only change
            }
          }
        }
      }
    }
  }
}
And the query would now be a bit different, but it will always return only the longest prefix that is shorter than or equal in length to the input string. Please try it out. I advise re-indexing your data to make sure everything is set up properly.
{
  "size": 1,
  "sort": {
    "_script": {
      "script": "doc.my_string.value.length()",
      "type": "number",
      "order": "desc"
    },
    "_score": "desc"                          <----- also add this line
  },
  "query": {
    "filtered": {
      "query": {
        "match": {
          "my_string.prefixes": "123"         <--- input string
        }
      },
      "filter": {
        "script": {
          "script": "doc.my_string.value.length() <= maxlength",
          "params": {
            "maxlength": 3                    <---- set to the length of the input string
          }
        }
      }
    }
  }
}
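For instance, run against the sample documents indexed earlier (a usage sketch; with input "123" and maxlength 3, the only candidate no longer than the input is "12", so that should be the hit):
curl -XPOST localhost:9200/tests/test/_search -d '{
  "size": 1,
  "sort": {
    "_script": {
      "script": "doc.my_string.value.length()",
      "type": "number",
      "order": "desc"
    },
    "_score": "desc"
  },
  "query": {
    "filtered": {
      "query": {
        "match": {
          "my_string.prefixes": "123"
        }
      },
      "filter": {
        "script": {
          "script": "doc.my_string.value.length() <= maxlength",
          "params": {
            "maxlength": 3
          }
        }
      }
    }
  }
}'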
