Strange query output from Elasticsearch - Firebase

I just started using Elasticsearch. I've got everything set up correctly, using Firebase + Flashlight + Elasticsearch.
In my front-end I build queries based on different search params and insert them into a node in Firebase at /search/requests/. Flashlight picks this up and puts the response into /search/response; this works like a charm!
However, I'm not sure how to write my queries properly. I'm getting strange results when I try to combine two must match clauses. I'm using the Query DSL.
My documents in Elasticsearch under deliverables/doc have the following schema:
...
{
  "createdBy" : "admin#xx.org",
  "createdOn" : 1501200000000,
  "deadLine" : 1508716800000,
  "description" : {
    "value" : "dummy description"
  },
  "key" : "<FBKEY>",
  "programmes" : [ {
    "code" : "95000",
    "name" : "Test programme",
    "programYear" : 2017
  } ],
  "projects" : [ {
    "projectCode" : "113200",
    "projectName" : "Test project",
    "projectYear" : 2017
  } ],
  "reportingYear" : 2017,
  "status" : "Open",
  "type" : "writing",
  "updatedBy" : "admin#xx.org",
  "updatedOn" : 1501200000000
},
...
My query has the following structure.
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "createdBy": "xx#company.org"
          },
          "match": {
            "programmes.code": "95000"
          }
        }
      ]
    }
  }
}
My output also includes documents that don't match both of those fields exactly, although they have a very low score. Is this normal?
My mapping was created automatically by Flashlight.
Update 1
I just tried this query, but it still gives me strange results and does not filter on both fields:
{
  "query": {
    "bool": {
      "filter": {
        "bool": {
          "must": [
            {
              "match": {
                "programmes.code": "890000"
              }
            },
            {
              "match": {
                "createdBy": "admin#xx.org"
              }
            }
          ]
        }
      }
    }
  }
}

The must clause used in a bool query is executed in query context (all matching documents are returned, in decreasing order of score) and contributes to the score; see link.
If you want it to be executed as a filter, then use the following query:
{
  "query": {
    "bool": {
      "filter": {
        "bool": {
          "must": [
            {
              "match": {
                "createdBy": "xx#company.org"
              }
            },
            {
              "match": {
                "programmes.code": "95000"
              }
            }
          ]
        }
      }
    }
  }
}
NOTE:
By default, string fields are analyzed. To filter on exact values, update the mapping of those string fields to not_analyzed. Refer: mapping-intro
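For illustration, here is a minimal sketch of such a mapping, assuming a pre-5.x cluster where the string type and not_analyzed are valid (on 5.x and later you would use the keyword type instead); the index and type names are taken from the question:

PUT deliverables
{
  "mappings": {
    "doc": {
      "properties": {
        "createdBy": { "type": "string", "index": "not_analyzed" },
        "programmes": {
          "properties": {
            "code": { "type": "string", "index": "not_analyzed" }
          }
        }
      }
    }
  }
}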

Related

Kibana two different DSL Queries behaving as OR

I'm trying to create two queries that appear as blue buttons on the Visualizer, and I want to apply both as an OR, so that I can filter my logs by INFO, by ERROR, or by BOTH AT THE SAME TIME. If I enable one or the other alone they work as expected, but if I enable both, the final query behaves like INFO AND ERROR, when what I want is INFO OR ERROR. Both queries are similar, one with ERROR and the other with INFO:
{
  "query": {
    "bool": {
      "filter": [
        {
          "match": {
            "message": "INFO"
          }
        }
      ]
    }
  }
}
I have tried both filter and should.
I did look at the Inspect output, but I can't make sense of it.
Any idea if this is possible at all?
Thanks.
EDITED for clarification after 1st reply:
What I need is two different, separate queries (one with "status": "info" and the other with "status": "error"), because I want to attach them to the blue buttons that appear when you click "Add a filter". I end up with two blue buttons, ERROR and INFO, and when both are enabled I want lines of both kinds to show. At the moment they work individually, but when I enable both it behaves like ERROR AND INFO, and no line matches both; what I want is ERROR OR INFO so that both are displayed. Any idea?
EDIT 2:
From my last comment below: with the two scripts each in its own button, the Inspect output shows
Inspect
"query": {
"bool": {
"must": [ <--- my scripts below get wrapped in this MUST
{
"bool": {
"should": [
{
"match_phrase": {
"message": "INFO"
}
}
],
"minimum_should_match": 1
}
},
{
"bool": {
"should": [
{
"match_phrase": {
"message": "ERROR"
}
}
],
"minimum_should_match": 1
}
},
...
and these are the scripts I have in the two buttons:
INFO
{
  "query": {
    "bool": {
      "should": [
        {
          "match_phrase": {
            "message": "INFO"
          }
        }
      ],
      "minimum_should_match": 1
    }
  }
}
ERROR
{
  "query": {
    "bool": {
      "should": [
        {
          "match_phrase": {
            "message": "ERROR"
          }
        }
      ],
      "minimum_should_match": 1
    }
  }
}
So if there is no way to change the way Kibana wraps the scripts I guess I'm screwed...
EDIT
You can use "is one of"; it will produce an OR query.
You can also use a should query:
POST _bulk
{ "index" : { "_index" : "test_should", "_id" : "1" } }
{ "status" : "info" }
{ "index" : { "_index" : "test_should", "_id" : "2" } }
{ "status" : "error" }

GET test_should/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "status": "info"
          }
        },
        {
          "match": {
            "status": "error"
          }
        }
      ]
    }
  }
}
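If a single control is acceptable, another option along the same lines is to put both alternatives into one should clause in a single filter; even after Kibana wraps it in a must, it still behaves as OR. A sketch, reusing the message field from the question:

{
  "query": {
    "bool": {
      "should": [
        {
          "match_phrase": {
            "message": "INFO"
          }
        },
        {
          "match_phrase": {
            "message": "ERROR"
          }
        }
      ],
      "minimum_should_match": 1
    }
  }
}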

Two filter DSL queries look the same; how can I combine them?

In Kibana I created 2 filters:
raw.browserJs.isWebDriver is true and raw.browserJs.isWebDriver is not true. Why is the edit query DSL the same for both?
{
  "query": {
    "match": {
      "raw.browserJs.isWebDriver": {
        "query": true,
        "type": "phrase"
      }
    }
  }
}
Also, how can I add a condition in order to have one large DSL query that also includes:
{
  "query": {
    "match": {
      "appName": {
        "query": "temp",
        "type": "phrase"
      }
    }
  }
}
The query DSL shown in Kibana is not the actual query that is sent to Elasticsearch. A range filter for the selected period is added, and filters are inverted. You can see the actual query in the underlying request sent by your browser.
Your filter where raw.browserJs.isWebDriver is not true will end up as something like:
{
  "query": {
    "bool": {
      "must_not": [
        {
          "match_phrase": {
            "raw.browserJs.isWebDriver": true
          }
        }
      ]
    }
  }
}
You can combine multiple conditions in one DSL query with the bool query (https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-bool-query.html).
The following query will work in your example:
{
  "query": {
    "bool": {
      "must": [
        {
          "match_phrase": {
            "raw.browserJs.isWebDriver": true
          }
        },
        {
          "match_phrase": {
            "appName": "temp"
          }
        }
      ]
    }
  }
}
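And if you want the inverted variant (isWebDriver is not true) combined with the appName condition, a sketch of the same bool query with the first clause moved to must_not:

{
  "query": {
    "bool": {
      "must": [
        {
          "match_phrase": {
            "appName": "temp"
          }
        }
      ],
      "must_not": [
        {
          "match_phrase": {
            "raw.browserJs.isWebDriver": true
          }
        }
      ]
    }
  }
}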

How can I create a mapping for two types in ElasticSearch?

I would like to index user IDs and tag IDs.
I send a PUT request to https://ip//elasticsearch/myIndex
{
  "mappings" : {
    "users" : {
      "properties" : {
        "id" : { "type": "keyword" }
      }
    },
    "tags" : {
      "properties" : {
        "id" : { "type": "keyword" }
      }
    }
  }
}
However, I receive this response:
{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "Rejecting mapping update to [myIndex] as the final mapping would have more than 1 type: [users, tags]"
      }
    ],
    "type": "illegal_argument_exception",
    "reason": "Rejecting mapping update to [myIndex] as the final mapping would have more than 1 type: [users, tags]"
  },
  "status": 400
}
How can I solve this error?
From Elasticsearch 6.x onwards you cannot have more than one mapping type per index. Use a single mapping type, and use a custom type field instead. See this link and also this link.
As stated by @ben5556, you can only have one type per index in Elasticsearch 6+. However, you can mimic multiple types per index by including your own "type" field.
{
  "mappings" : {
    "ids" : {
      "properties" : {
        "id" : { "type": "keyword" },
        "type" : { "type": "keyword" }
      }
    }
  }
}
Then when you index a document, you include the "type":
{
  "type": "user",
  "id": "12345"
}
This will allow you to filter by type when querying the index (using a terms query), which is essentially all Elasticsearch was doing behind the scenes for you anyway back when it supported multiple types.
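As a sketch of what that query-time filtering could look like (the id values here are hypothetical):

GET myIndex/_search
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "type": "user" } },
        { "terms": { "id": [ "12345", "67890" ] } }
      ]
    }
  }
}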

Matching users with similar interest tags using Firebase, Elasticsearch and Flashlight

What is the best way to match documents by tags using Elasticsearch with the following setup (or a modified setup)?
I've got my users in a Firebase database; they have associated tags that define their interests:
"users" : {
"bruce" : {
"martial art" : "Jeet Kune Do",
"name" : "Bruce Lee",
"nick" : "Little Phoenix",
"tags" : {
"android" : true,
"ios" : true
}
},
"chan" : {
"account_type" : "contractor",
"martial art" : "Kung Fu",
"name" : "Jackie Chan",
"nick" : "Cannonball",
"tags" : {
"ios" : true
}
},
"chuck" : {
"martial art" : "Chun Kuk Do",
"name" : "Carlos Ray Norris",
"nick" : "Chuck"
}}
Using Flashlight + the Firebase Admin SDK I keep an index up to date on Bonsai/Heroku, which should help me match users with similar interests or related products.
"firebase": {
"aliases": {},
"mappings": {
"user": {
"properties": {
"name": {
"type": "string"
},
"tags": {
"properties": {
"android": {
"type": "boolean"
},
"ios": {
"type": "boolean"
}
}
}
}
}
}...
For now I can query users with a certain combination of tags:
{
  "query": {
    "bool": {
      "must" : {
        "type" : {
          "value" : "user"
        }
      },
      "should": [
        {
          "term": {
            "tags.ios": true
          }
        },
        {
          "term": {
            "tags.android": true
          }
        }
      ],
      "minimum_should_match" : 1
    }
  }
}
This is great, but what I'm looking for is a way to:
Given a user id, find other users with similar tags, ordered by _score.
There will be other _types apart from "user" that also use tags, for example products, so it would also be great to match products to users when they share some tags.
I get the feeling that, because I'm absolutely new to Elasticsearch, I'm approaching this the wrong way. Maybe it's the way the data is modeled?
The problem is that Firebase restricts this quite a lot; for instance, I cannot have arrays, which makes the tag modeling a bit weird and results in even weirder indexed data. Maybe one approach could be to manipulate the data before inserting it into the index?
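For what it's worth, one possible direction, sketched under two assumptions: you first read the given user's own tags back from Firebase, and Flashlight stores each user under its Firebase key as the Elasticsearch _id. Turn each of the user's tags into a should clause and exclude the user itself, so every shared tag contributes to _score and users sharing more tags rank higher. For bruce that could look like:

{
  "query": {
    "bool": {
      "must" : {
        "type" : {
          "value" : "user"
        }
      },
      "must_not": {
        "ids": {
          "values": [ "bruce" ]
        }
      },
      "should": [
        {
          "term": {
            "tags.ios": true
          }
        },
        {
          "term": {
            "tags.android": true
          }
        }
      ],
      "minimum_should_match": 1
    }
  }
}

The same shape would work for matching products by changing the value in the type clause.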

What is the best way to query the document closest to a date-time on elasticsearch?

I need to retrieve the document with the geolocation and date-time closest to those in the request, so I'm not looking for an exact match on the date-time but for the closest one. I solved it using a custom script, but I'm guessing there might be a better way to do it, similar to the way I'm filtering the geolocation based on a location and a distance.
Here's my code (in Python):
import json

import requests

# Inside a request handler: self.request.body holds the incoming request JSON.
query = {
    "query": {
        "function_score": {
            "boost_mode": "replace",
            "query": {
                "filtered": {
                    "query": {
                        "match_all": {}
                    },
                    "filter": {
                        "geo_distance": {
                            "distance": "10km",
                            "location": json.loads(self.request.body)["location"]
                        }
                    }
                }
            },
            "script_score": {
                "lang": "groovy",
                "script_file": "calculate-score",
                "params": {
                    "stamp": json.loads(self.request.body)["stamp"]
                }
            }
        }
    },
    "sort": [
        {"_score": "asc"}
    ],
    "size": 1
}

response = requests.get('http://localhost:9200/meteo/meteo/_search', data=json.dumps(query))
The custom calculate-score.groovy script contains the following:
abs(new java.text.SimpleDateFormat("yyyy-MM-dd\'T\'HH:mm").parse(stamp).getTime() - doc["stamp"].date.getMillis()) / 60000
The script returns the score as the absolute difference in minutes between the document date-time and the requested date-time.
Is there any other way to achieve this?
You should be able to use function_score to do this.
You could use the decay functions mentioned in the documentation to give a larger score to documents closer to the origin timestamp. Below is an example where scale = 28800 minutes, i.e. 20 days.
Example:
PUT test

PUT test/test/_mapping
{
  "properties": {
    "stamp": {
      "type": "date",
      "format": "dateOptionalTime"
    }
  }
}

PUT test/test/1
{
  "stamp": "2015-10-15T00:00"
}

PUT test/test/2
{
  "stamp": "2015-10-15T12:00"
}

POST test/_search
{
  "query": {
    "function_score": {
      "functions": [
        {
          "linear": {
            "stamp" : {
              "origin": "now",
              "scale": "28800m"
            }
          }
        }
      ],
      "score_mode" : "multiply",
      "boost_mode": "multiply",
      "query": {
        "match_all": {}
      }
    }
  }
}
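Note that with the decay-function approach the closest document gets the highest score, so the default descending sort on _score already returns it first; combined with size: 1 and the geo_distance filter from the question, this replaces the custom Groovy script.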
