Elasticsearch does not find anything on a field with multiple words - Symfony

I'm trying Elasticsearch and it looks great!
However, I noticed a very inconvenient problem: in a field that contains hello world, searching for hello wo returns no results.
Why does this happen?
Here is my configuration (FOSElasticaBundle):
fos_elastica:
    clients:
        default: { host: localhost, port: 9200 }
    serializer:
        callback_class: FOS\ElasticaBundle\Serializer\Callback
        serializer: serializer
    indexes:
        website:
            client: default
            settings:
                index:
                    analysis:
                        analyzer:
                            custom_search_analyzer:
                                type: custom
                                tokenizer: standard
                                filter: [standard, worddelimiter, stopwords, snowball, lowercase, asciifolding]
                            custom_index_analyzer:
                                type: custom
                                tokenizer: nGram
                                filter: [standard, worddelimiter, stopwords, snowball, lowercase, asciifolding]
                        filter:
                            stopwords:
                                type: stop
                                stopwords: [_italian_]
                                ignore_case: true
                            worddelimiter:
                                type: word_delimiter
                        tokenizer:
                            nGram:
                                type: nGram
                                min_gram: 1
                                max_gram: 20
            types:
                structure:
                    mappings:
                        name: { boost: 9, search_analyzer: custom_search_analyzer, index_analyzer: custom_index_analyzer, type: string }
Any idea how to solve this?
EDIT
Here is my query:
{
    query: {
        bool: {
            must: [ ]
            must_not: [ ]
            should: [
                {
                    term: {
                        structure.name: hello wo
                    }
                }
            ]
        }
    }
    from: 0
    size: 10
    sort: [ ]
    facets: { }
}
EDIT 2
Ok, I don't understand this behavior ...
Now I run this query:
{
    query: {
        bool: {
            must: [
                {
                    term: {
                        structure.name: hello
                    }
                }
                {
                    term: {
                        structure.name: wo
                    }
                }
            ]
            must_not: [ ]
            should: [ ]
        }
    }
    from: 0
    size: 10
    sort: [ ]
    facets: { }
}
This query returns the result I wanted, but I do not understand the difference between one term clause containing two words and two must clauses containing one word each.
Could someone explain this behavior?

Let me explain how this works.
When you index text, Elasticsearch splits it into terms if the field is analyzed (as it is in your mapping), so in your case "hello world" is split into separate terms such as "hello" and "world". A term query is not analyzed, so when you search for a two-word term such as hello wo as one string, it does not match any single indexed term.
To avoid splitting into terms, you can declare in the mapping that the name field is not analyzed; it will then not be split into two words and will be handled as a single token.
Another solution is to use a multi-term (terms) query:
{
    "query": {
        "terms": {
            "structure.name": [
                "world",
                "hello"
            ]
        }
    }
}
Also, a query_string query returns results because it works differently: it analyzes the query text before matching.
So, depending on your needs, you should use different queries. To search by name, use query_string; term should be used when you want to filter on exact values such as categoryId, tags and the like.
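The difference between an unanalyzed term lookup and an analyzed query can be illustrated with a toy model. This is a minimal sketch in plain Python, not Elasticsearch's actual analysis chain: the helper names, the whitespace tokenization and the simplified per-word n-grams are all assumptions made for illustration.

```python
def analyze(text, min_gram=1, max_gram=20):
    """Toy index-time analyzer: lowercase, split on whitespace,
    then emit every n-gram of each word (a hypothetical simplification)."""
    tokens = set()
    for word in text.lower().split():
        for start in range(len(word)):
            for size in range(min_gram, max_gram + 1):
                gram = word[start:start + size]
                if len(gram) == size:
                    tokens.add(gram)
    return tokens


def term_matches(indexed_tokens, term):
    """A term lookup compares the raw string to the indexed tokens as-is."""
    return term in indexed_tokens


def analyzed_matches(indexed_tokens, query_text):
    """An analyzed lookup splits the query text first, then requires every word."""
    return all(word in indexed_tokens for word in query_text.lower().split())


indexed = analyze("hello world")
# One raw string containing a space: no single indexed token equals it.
print(term_matches(indexed, "hello wo"))
# Two separate terms: each one is an indexed token.
print(term_matches(indexed, "hello"), term_matches(indexed, "wo"))
# Analyzing the query text first gives the behavior the asker expected.
print(analyzed_matches(indexed, "hello wo"))
```

In this toy model the single two-word term finds nothing, while the two one-word terms and the analyzed lookup both succeed, which mirrors the behavior seen in EDIT 2.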

Kibana Vega-Lite use aggregates as data

I am new to Vega-Lite in Kibana. I am trying to produce a bar chart in Kibana using Vega.
I use Vega because I have to use nested fields, and it seems there are no other options.
I don't want to plot a time series; I want to plot aggregates directly.
This is my script:
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "title": "Event counts from all indexes",
  data: {
    name: "aggregations"
    url: {
      %context%: true
      %timefield%: timestamp
      index: search-sonarqube-telemetry-2021-merged
      body: {
        "aggs": {
          "languages": {
            "terms": { "field": "plugins.name.keyword"}
          }
        }
        size: 0
      }
    }
    format: {property: "aggregations.languages.buckets" }
  }
  mark: bar
  encoding: {
    y: {
      field: "buckets.key"
      type: nominal
      axis: { title: null }
    }
    x: {
      field: "buckets.doc_count"
      type: quantitative
      axis: { title: "Document count" }
    }
  }
  transform: [
    {"filter":
      {"field": "doc_count", "range": [0,100000]}
    }
  ]
}
Everything is empty.
If I try to debug, I see the source_0, with the correct data I would like to plot, but not the data_0.
I also get the warnings:
Infinite extent for field "buckets.doc_count_start": [Infinity, -Infinity]
Infinite extent for field "buckets.doc_count_end": [Infinity, -Infinity]
What is wrong in my script?
Thanks
I found the problem.
I set the format property to aggregations.languages.buckets, but then referred to the fields as buckets.key and buckets.doc_count in the encoding.
It was enough to replace buckets.key and buckets.doc_count with key and doc_count, and it worked.
The fields were not being found because the rows inside aggregations.languages.buckets do not contain a further buckets level.
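The effect of the format property can be sketched in plain Python. This is only an illustration with a made-up response (the bucket keys `java` and `python` and their counts are hypothetical), assuming that the dotted path is followed and that each element of the resulting array becomes one data row.

```python
from functools import reduce

# Hypothetical Elasticsearch response shape; the values are invented for illustration.
response = {
    "aggregations": {
        "languages": {
            "buckets": [
                {"key": "java", "doc_count": 120},
                {"key": "python", "doc_count": 45},
            ]
        }
    }
}


def extract_rows(data, property_path):
    """Follow a dotted path and return whatever is stored at its end."""
    return reduce(lambda node, name: node[name], property_path.split("."), data)


rows = extract_rows(response, "aggregations.languages.buckets")
for row in rows:
    # Each row exposes `key` and `doc_count` directly; no `buckets` level is left.
    print(row["key"], row["doc_count"], "buckets" in row)
```

Because the rows no longer carry a `buckets` level, the encoding has to use `key` and `doc_count` as field names.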

ElasticSearch - difference between two date fields

I have an index in ElasticSearch with two fields of date type (metricsTime & arrivalTime). A sample document is quoted below. In Kibana, I created a scripted field delay for the difference between those two fields. My painless script is:
doc['arrivalTime'].value - doc['metricsTime'].value
However, I got the following error message when navigating to Kibana's Discover tab: class_cast_exception: Cannot apply [-] operation to types [org.joda.time.MutableDateTime] and [org.joda.time.MutableDateTime].
This looks the same as the error mentioned in https://discuss.elastic.co/t/problem-in-difference-between-two-dates/121655, but the answer on that page suggests that my script is correct. Could you please help?
Thanks!
{
  "_index": "events",
  "_type": "_doc",
  "_id": "HLV274_1537682400000",
  "_version": 1,
  "_score": null,
  "_source": {
    "metricsTime": 1537682400000,
    "box": "HLV274",
    "arrivalTime": 1539930920347
  },
  "fields": {
    "metricsTime": [
      "2018-09-23T06:00:00.000Z"
    ],
    "arrivalTime": [
      "2018-10-19T06:35:20.347Z"
    ]
  },
  "sort": [
    1539930920347
  ]
}
Check the list of Lucene expressions to see which expressions are available for date fields and how you can use them.
For the sake of simplicity, see the query below. I have created two fields, metricsTime and arrivalTime, in a sample index.
Sample Document
POST mydateindex/mydocs/1
{
  "metricsTime": "2018-09-23T06:00:00.000Z",
  "arrivalTime": "2018-10-19T06:35:20.347Z"
}
Query using painless script
POST mydateindex/_search
{
  "query": {
    "bool": {
      "must": {
        "match_all": {}
      },
      "filter": {
        "bool": {
          "must": {
            "script": {
              "script": {
                "inline": "doc['arrivalTime'].date.dayOfYear - doc['metricsTime'].date.dayOfYear > params.difference",
                "lang": "painless",
                "params": {
                  "difference": 2
                }
              }
            }
          }
        }
      }
    }
  }
}
Note the following line in the query:
"inline" : "doc['arrivalTime'].date.dayOfYear - doc['metricsTime'].date.dayOfYear > params.difference"
If you change the value of difference from 2 to 26 (the day-of-year difference between the two dates), the query no longer returns the document, because the comparison is a strict greater-than.
I have included this query only as an example of how scripting lets you compare two date fields, so please do refer to the link I've shared.
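To sanity-check the numbers in this example outside Elasticsearch, the same arithmetic can be reproduced in plain Python. This is a sketch that uses the epoch-millisecond values from the sample document in the question; it does not show Painless syntax, and the variable and function names are my own.

```python
from datetime import datetime, timezone

# Epoch milliseconds taken from the sample document in the question.
metrics_time_ms = 1537682400000
arrival_time_ms = 1539930920347


def to_utc(epoch_ms):
    """Convert epoch milliseconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(epoch_ms / 1000, tz=timezone.utc)


# Difference in milliseconds: the value a "delay" field would hold.
delay_ms = arrival_time_ms - metrics_time_ms

# Difference in day-of-year: the quantity the script filter above compares.
day_diff = (to_utc(arrival_time_ms).timetuple().tm_yday
            - to_utc(metrics_time_ms).timetuple().tm_yday)

print(delay_ms)
print(day_diff)
```

Keep in mind that a day-of-year difference is only meaningful when both dates fall in the same year; a millisecond difference does not have that limitation.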

How to Remove from List Of Maps in DynamoDB (Must be Atomic)

I have this Schema:
{
    product: S, // Primary Key, my Hash
    media: L    // List of Maps
}
Each media item will be like this:
[
    {
        id: S,   // for example: uuid()
        type: S, // for example: "image"
        url: S   // for example: "http://domain.com/image.jpg"
    }
]
Sample Data:
{
    product: "IPhone 6+",
    media: [
        {
            id: "1",
            type: "image",
            url: "http://www.apple.com/iphone-6-plus/a.jpg"
        },
        {
            id: "2",
            type: "image",
            url: "http://www.apple.com/iphone-6-plus/b.jpg"
        },
        {
            id: "3",
            type: "video",
            url: "http://www.apple.com/iphone-6-plus/overview.mp4"
        }
    ]
}
I want to be able to remove from media list by id.
Something like: "From product: 'IPhone 6+', remove the media with id: '2'"
After the query, the Data should be like this:
{
    product: "IPhone 6+",
    media: [
        {
            id: "1",
            type: "image",
            url: "http://www.apple.com/iphone-6-plus/a.jpg"
        },
        {
            id: "3",
            type: "video",
            url: "http://www.apple.com/iphone-6-plus/overview.mp4"
        }
    ]
}
How should I express a query like this? I saw a post on UpdateItem, but I can't find a good example for this type of query.
Thanks!
Unfortunately, the API doesn't have this feature. The closest you can get is to delete an entry from a List attribute when you know its index.
I understand that most of the time the index may not be available. However, you can consider this alternative if you don't have any other solution.
You also need to understand that even though DynamoDB supports document data types such as List, Map and Set, some operations cannot be performed on them. Some features are yet to be added to the API, and I believe this scenario is one of them.
I have used REMOVE to delete the item from the list.
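// Assumes the AWS SDK for JavaScript (v2) with region and credentials already configured.
var AWS = require("aws-sdk");
var docClient = new AWS.DynamoDB.DocumentClient();

var params = {
    TableName : "Product",
    Key : {
        "product" : "IPhone 6+"
    },
    // Removes the list element at index 0 of the media attribute.
    UpdateExpression : "REMOVE media[0]",
    ReturnValues : "UPDATED_NEW"
};

console.log("Updating the item...");
docClient.update(params, function(err, data) {
    if (err) {
        console.error("Unable to update item. Error JSON:", JSON.stringify(err, null, 2));
    } else {
        console.log("UpdateItem succeeded:", JSON.stringify(data));
    }
});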
For your reference:
The DELETE action can be used only on set data types.
DELETE - Deletes an element from a set.
If a set of values is specified, then those values are subtracted from
the old set. For example, if the attribute value was the set [a,b,c]
and the DELETE action specifies [a,c], then the final attribute value
is [b]. Specifying an empty set is an error.
The DELETE action only supports set data types. In addition, DELETE
can only be used on top-level attributes, not nested attributes.
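One way to make the index-based REMOVE safer is to read the item, find the index of the entry, and make the update conditional on that position still holding the same id. The Python sketch below only builds the request parameters. The function name `build_remove_params` is hypothetical, and the conditional check is my own addition rather than part of the answer above.

```python
def build_remove_params(product, media, media_id):
    """Build UpdateItem parameters that remove the media entry with `media_id`.

    `media` is the list as it was just read from the table. The condition
    makes the write fail if the entry at that index changed in the meantime.
    """
    for index, entry in enumerate(media):
        if entry["id"] == media_id:
            break
    else:
        raise KeyError(f"no media entry with id {media_id!r}")

    return {
        "Key": {"product": product},
        "UpdateExpression": f"REMOVE #media[{index}]",
        "ConditionExpression": f"#media[{index}].#id = :id",
        # Name placeholders avoid any clash with DynamoDB reserved words.
        "ExpressionAttributeNames": {"#media": "media", "#id": "id"},
        "ExpressionAttributeValues": {":id": media_id},
        "ReturnValues": "UPDATED_NEW",
    }


media = [
    {"id": "1", "type": "image", "url": "http://www.apple.com/iphone-6-plus/a.jpg"},
    {"id": "2", "type": "image", "url": "http://www.apple.com/iphone-6-plus/b.jpg"},
    {"id": "3", "type": "video", "url": "http://www.apple.com/iphone-6-plus/overview.mp4"},
]
params = build_remove_params("IPhone 6+", media, "2")
print(params["UpdateExpression"])
print(params["ConditionExpression"])
```

Assuming boto3 is used, the dictionary could be passed as `table.update_item(**params)` on a Table resource. If the condition fails because another writer changed the list, the caller would re-read the item and try again.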

How to use nested queries with FOQElasticaBundle for Symfony2

I have a problem building a query with FOQElasticaBundle.
I have 3 entities:
User
Hotel
Ambiance
Users can have 1 or more Hotels, and each Hotel has only 1 Ambiance.
In my config file, I have:
foq_elastica:
    clients:
        default: { host: %elasticsearch.host%, port: %elasticsearch.port% }
    indexes:
        MyBundle:
            client: default
            finder:
            types:
                user:
                    mappings:
                        id:
                            boost: 10
                            analyzer: fr_case_analyzer
                        name:
                            boost: 5
                            analyzer: fr_case_analyzer
                        hotels:
                            type: "nested"
                            properties:
                                name:
                                    boost: 10
                                    analyzer: fr_case_analyzer
                                ambiance:
                                    boost: 1
I want to be able to search for a User by typing his name or the name of one of his hotels, and optionally add a filter on the Ambiance type.
So the query should look something like this:
$mainQuery = new \Elastica_Query_Bool();
$nameQuery = new \Elastica_Query_Bool();
$filtersQuery = new \Elastica_Query_Bool();
//searching in Users' names
// a separate variable, so the bool query $nameQuery above is not overwritten
$userNameQuery = new \Elastica_Query_Text();
$userNameQuery->setFieldQuery('name', $searchName);
$userNameQuery->setFieldParam('name', 'boost', 5);
$userNameQuery->setFieldParam('name', 'type', 'phrase_prefix');
//searching in Hotels' names
$hotelNameQuery = new \Elastica_Query_Text();
$hotelNameQuery->setFieldQuery('name', $searchName);
$hotelNameQuery->setFieldParam('name', 'boost', 3);
$hotelNameQuery->setFieldParam('name', 'type', 'phrase_prefix');
$nestedHotelNameQuery = new \Elastica_Query_Nested();
$nestedHotelNameQuery->setPath('hotels');
$nestedHotelNameQuery->setQuery($hotelNameQuery);
$nameQuery->addShould($userNameQuery);
$nameQuery->addShould($nestedHotelNameQuery);
//if filter on ambiance
$ambianceQuery = new \Elastica_Query_Term();
$ambianceQuery->setTerm('ambiance', $arrFilters['ambiance']);
$nestedAmbianceQuery = new \Elastica_Query_Nested();
$nestedAmbianceQuery->setPath('hotels');
$nestedAmbianceQuery->setQuery($ambianceQuery);
$filtersQuery->addMust($nestedAmbianceQuery);
//adding the parameters to the main query
$mainQuery->addMust($nameQuery);
$mainQuery->addMust($filtersQuery);
Unfortunately, this doesn't work: it returns no results when the Ambiance filter is active, but works perfectly if I only search by name.
What am I doing wrong?
I found why it wouldn't work.
The bundle actually uses __toString() on the object.
So, instead of querying on the "id" of the ambiance, I modified my html inputs so the value is the ambiance's name.
Here's my own version of the solution:
According to the Elasticsearch documentation, we should build a structure similar to the JSON below:
{
  "query": {
    "bool": {
      "must": [
        { "match": { "title": "eggs" }},
        {
          "nested": {
            "path": "comments",
            "query": {
              "bool": {
                "must": [
                  { "match": { "comments.name": "john" }},
                  { "match": { "comments.age": 28 }}
                ]
        }}}}
      ]
}}}
To do this with Symfony 2 and the FOSElasticaBundle, we write the following code:
//if filter on ambiance
$ambianceQuery = new \Elastica_Query_Term();
$ambianceQuery->setTerm('ambiance', $arrFilters['ambiance']);
// Add the term to the bool query.
$filtersQuery->addMust($ambianceQuery);
$nestedAmbianceQuery = new \Elastica_Query_Nested();
$nestedAmbianceQuery->setPath('hotels');
// Pass the bool query as the argument of the setQuery method.
$nestedAmbianceQuery->setQuery($filtersQuery);
// Finally, add the nested query to the main bool query through addMust.
$mainQuery->addMust($nestedAmbianceQuery);
Hope that will help others.
A+
I just had the same problem. This works for me:
..
$ambianceQuery->setTerm('hotels.ambiance', $arrFilters['ambiance']);
..
I couldn't find any examples of this for FOQElasticaBundle (now FOSElasticaBundle) for Symfony2, but it should end up as an Elasticsearch query like this one:
http://www.elasticsearch.org/guide/reference/query-dsl/nested-query/
You can also test raw Elasticsearch queries like this:
$queryString = '{..JSON..}';
$boolQuery = new \Elastica_Query_Builder($queryString);
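For comparison, the raw request body that this fix should lead to can be sketched as a plain Python dictionary. The helper name `nested_ambiance_filter` and the sample value `cosy` are hypothetical; the point is only that a field inside a nested query is addressed by its full path, `hotels.ambiance`.

```python
import json


def nested_ambiance_filter(ambiance):
    """Build a nested term clause on the `hotels` path (illustrative helper)."""
    return {
        "nested": {
            "path": "hotels",
            # The field name includes the nested path, not just `ambiance`.
            "query": {"term": {"hotels.ambiance": ambiance}},
        }
    }


body = {"query": {"bool": {"must": [nested_ambiance_filter("cosy")]}}}
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted into a raw query string for testing, as described above.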

Sencha Touch complex models

I'm trying to read entities from a Drupal Services endpoint into my Sencha Touch 2 application. The JSON output looks like this (simplified):
{
  nid: 1
  title: 'Test'
  body: {
    'en': [
      'This is a test.'
    ]
  }
}
That's the model's CoffeeScript code:
Ext.define 'Node',
  extend: 'Ext.data.Model'
  config:
    idProperty: 'nid'
    fields: [
      { name: 'nid', type: 'integer' }
      { name: 'title', type: 'string' }
      { name: 'language', type: 'string' }
      { name: 'body', type: 'auto', convert: convertField }
    ]
    proxy:
      type: 'jsonp'
      url: 'http://www.mydomain.com/rest/node'

convertField = (value, record) ->
  console.log value # always "undefined"
  return 'test'
Loading the model with a jsonp proxy works, but only the atomic fields (like "nid" and "title") are populated. I tried to add a "convert" function to the model's body field, but the value parameter is always undefined.
Is there a way to load complex JSON data into a model's field? Or do I have to use the model relations system (which would be a lot of mess ...)? I also thought about overriding an Ext.data.Reader, but I don't really know where to start.
