How to use nested queries with FOQElasticaBundle for Symfony2

I have a problem building a query with FOQElasticaBundle.
I have 3 entities:
User
Hotel
Ambiance
A User can have one or more Hotels, and each Hotel has exactly one Ambiance.
In my config file, I have:
foq_elastica:
    clients:
        default: { host: %elasticsearch.host%, port: %elasticsearch.port% }
    indexes:
        MyBundle:
            client: default
            finder:
            types:
                user:
                    mappings:
                        id:
                            boost: 10
                            analyzer: fr_case_analyzer
                        name:
                            boost: 5
                            analyzer: fr_case_analyzer
                        hotels:
                            type: "nested"
                            properties:
                                name:
                                    boost: 10
                                    analyzer: fr_case_analyzer
                                ambiance:
                                    boost: 1
I want to be able to search for a User by typing his name or the name of one of his hotels, and optionally add a filter on the Ambiance type.
So the query should look something like this:
$mainQuery = new \Elastica_Query_Bool();
$nameQuery = new \Elastica_Query_Bool();
$filtersQuery = new \Elastica_Query_Bool();
//searching in Users' names
$userNameQuery = new \Elastica_Query_Text();
$userNameQuery->setFieldQuery('name', $searchName);
$userNameQuery->setFieldParam('name', 'boost', 5);
$userNameQuery->setFieldParam('name', 'type', 'phrase_prefix');
//searching in Hotels' names
$hotelNameQuery = new \Elastica_Query_Text();
$hotelNameQuery->setFieldQuery('name', $searchName);
$hotelNameQuery->setFieldParam('name', 'boost', 3);
$hotelNameQuery->setFieldParam('name', 'type', 'phrase_prefix');
$nestedHotelNameQuery = new \Elastica_Query_Nested();
$nestedHotelNameQuery->setPath('hotels');
$nestedHotelNameQuery->setQuery($hotelNameQuery);
$nameQuery->addShould($userNameQuery);
$nameQuery->addShould($nestedHotelNameQuery);
//if filter on ambiance
$ambianceQuery = new \Elastica_Query_Term();
$ambianceQuery->setTerm('ambiance', $arrFilters['ambiance']);
$nestedAmbianceQuery = new \Elastica_Query_Nested();
$nestedAmbianceQuery->setPath('hotels');
$nestedAmbianceQuery->setQuery($ambianceQuery);
$filtersQuery->addMust($nestedAmbianceQuery);
//adding the parameters to the main query
$mainQuery->addMust($nameQuery);
$mainQuery->addMust($filtersQuery);
Unfortunately, this doesn't work: it returns no results when the Ambiance filter is active, but works perfectly if I search by name only.
What am I doing wrong?

I found why it wouldn't work.
The bundle actually uses __toString() on the object.
So, instead of querying on the "id" of the ambiance, I modified my html inputs so the value is the ambiance's name.

Here's my own version of the solution:
According to the Elasticsearch documentation, we should build a structure similar to the JSON below:
{
    "query": {
        "bool": {
            "must": [
                { "match": { "title": "eggs" }},
                {
                    "nested": {
                        "path": "comments",
                        "query": {
                            "bool": {
                                "must": [
                                    { "match": { "comments.name": "john" }},
                                    { "match": { "comments.age": 28 }}
                                ]
                            }
                        }
                    }
                }
            ]
        }
    }
}
So, to do this with Symfony 2 and FOSElasticaBundle, we write the following lines of code:
//if filter on ambiance
$ambianceQuery = new \Elastica_Query_Term();
$ambianceQuery->setTerm('ambiance', $arrFilters['ambiance']);
// Add the term query to the bool query
$filtersQuery->addMust($ambianceQuery);
$nestedAmbianceQuery = new \Elastica_Query_Nested();
$nestedAmbianceQuery->setPath('hotels');
// Pass the bool query as the argument to setQuery()
$nestedAmbianceQuery->setQuery($filtersQuery);
// Finally, add the nested query to the main bool query through addMust()
$mainQuery->addMust($nestedAmbianceQuery);
Hope that will help others.
A+

I just had the same problem. This works for me:
..
$ambianceQuery->setTerm('hotels.ambiance', $arrFilters['ambiance']);
..
I couldn't find any examples of this for FOQElasticaBundle (now FOSElasticaBundle) for Symfony2, but it should end up as an Elasticsearch query like the one described here:
http://www.elasticsearch.org/guide/reference/query-dsl/nested-query/
You can run and test raw Elasticsearch queries like this:
$queryString = '{..JSON..}';
$boolQuery = new \Elastica_Query_Builder($queryString);
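To make the target structure concrete, here is a minimal sketch (in JavaScript, only to assemble the JSON) of a raw query body that could be pasted into $queryString. It assumes the mapping from the question and that fields inside the nested object are addressed by their full path (hotels.name, hotels.ambiance), as suggested above for hotels.ambiance. The function name buildUserSearch and the use of match_phrase_prefix are illustrative assumptions, not something taken from the bundle.

```javascript
// Sketch: build the Elasticsearch query body for "user name OR hotel name,
// optionally filtered by ambiance". buildUserSearch is a hypothetical helper.
function buildUserSearch(searchName, ambiance) {
  // Match either the user's own name or the name of one of the nested hotels.
  const nameQuery = {
    bool: {
      should: [
        { match_phrase_prefix: { name: { query: searchName, boost: 5 } } },
        {
          nested: {
            path: "hotels",
            // Assumption: nested fields are referenced by their full path.
            query: {
              match_phrase_prefix: {
                "hotels.name": { query: searchName, boost: 3 },
              },
            },
          },
        },
      ],
    },
  };

  const must = [nameQuery];

  // Only add the ambiance clause when a filter value was supplied.
  if (ambiance !== undefined && ambiance !== null) {
    must.push({
      nested: {
        path: "hotels",
        query: { term: { "hotels.ambiance": ambiance } },
      },
    });
  }

  return { query: { bool: { must: must } } };
}

console.log(JSON.stringify(buildUserSearch("john", "cosy"), null, 2));
```

The printed JSON can be used as the raw query string when testing, which makes it easier to see whether a missing result comes from the query structure or from the indexed values.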

Related

api_platform produces Error "no handler found for uri [/index/_doc/_search] and method [POST]"

When trying to implement elasticsearch (v7.9.3) via the fos_elastica-bundle (v6.0.0) into my Symfony (v5.3.10) - App with api_platform (v2.6.6), I keep on getting this error:
"{"error":"no handler found for uri [//posts/_doc/_search] and method [POST]"}",
My api_platform.yaml reads:
api_platform:
    [...]
    elasticsearch:
        hosts: [ '%env(ELASTICSEARCH_URL)%' ]
        mapping:
            App\Document\Post:
                index: posts
and my fos_elastica.yaml:
fos_elastica:
    clients:
        default: { url: '%env(ELASTICSEARCH_URL)%' }
    indexes:
        posts:
            properties:
                id:
                    "type": "keyword"
                source: ~
                title: ~
                description: ~
                body: ~
                children: ~
                tags: ~
                originalContent: ~
            persistence:
                driver: mongodb
                model: App\Document\Post
By debugging the fos-elastica Bundle, I found out that the Elastica-Connector correctly triggers a [POST]-Request to "/posts/_doc/_search" with this request body:
{"sort":[{"id":{"order":"asc"}}],"query":{"match_all":{}},"size":30,"from":0}
If I use the Kibana Dev Tools Console and trigger an identical request
POST /posts/_doc/_search
{"sort":[{"id":{"order":"asc"}}],"query":{"match_all":{}},"size":30,"from":60}
I do get results from elasticsearch as expected:
#! Deprecation: [types removal] Specifying types in search requests is deprecated.
{
  "took" : 12,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3082,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [
      {
        "_index" : "posts",
        "_type" : "_doc",
        [...]
Apart from the deprecation notice, everything seems fine.
Does anyone have an idea why the api_platform integration of the fos_elastica-bundle does not work as expected and keeps on returning the "no handler found"-error message?
As a workaround, I have created a custom ApiResource filter:
#[ApiFilter(FulltextFilter::class, arguments: ['index' => 'post'], properties: ['body','description','tag'])]
My custom filter implements ApiPlatform\Core\Bridge\Doctrine\MongoDbOdm\Filter\FilterInterface, directly communicates with the ElasticSearch server, sends a query to search the specified index (posts) and adds another match()-directive to the aggregationBuilder with a set of IDs matching the original search:
<?php
declare(strict_types=1);

namespace App\Filter;

use ApiPlatform\Core\Bridge\Doctrine\MongoDbOdm\Filter\FilterInterface;
use Doctrine\ODM\MongoDB\Aggregation\Builder;
use Elastica\Result;
use Elastica\Client;
use Elastica\Query;
use Symfony\Component\PropertyInfo\Type;

/**
 * Filter the collection by given properties.
 */
final class FulltextFilter implements FilterInterface
{
    protected $index = '';
    protected $properties = [];
    protected $client;
    protected $searchParameterName;
    protected $maxResultsParameterName;

    const DEFAULT_MAX_RESULTS = 200;

    public function __construct(Client $client, string $index = '', string $maxResultsParameterName = 'amount', string $searchParameterName = 'query', array $properties = [])
    {
        $this->index = $index;
        $this->properties = $properties;
        $this->client = $client;
        $this->searchParameterName = $searchParameterName;
        $this->maxResultsParameterName = $maxResultsParameterName;
    }

    public function getFilteredIds($searchterm, $index = null, $properties = null, $maxResults = null)
    {
        $matches = [];
        if (is_null($properties)) {
            $properties = array_keys($this->properties);
        }
        // One "match" clause per configured property, combined with OR (should)
        foreach ($properties as $propertyName) {
            array_push($matches, ['match' => [$propertyName => $searchterm]]);
        }
        $queryObject = ['query' => ['bool' => ['should' => $matches]]];
        $queryObject['size'] = (int) $maxResults > 0 ? (int) $maxResults : self::DEFAULT_MAX_RESULTS;
        $query = new Query();
        $response = $this->client->getIndex($index ?? $this->index)
            ->search($query->setRawQuery($queryObject))
            ->getResults();

        // Return only the ids of the matching documents
        return array_map(function (Result $result) {
            return $result->getHit()['_source']['id'];
        }, $response);
    }

    public function apply(Builder $aggregationBuilder, string $resourceClass, string $operationName = null, array &$context = [])
    {
        $maxResults = $context['filters'][$this->maxResultsParameterName] ?? null;
        $searchterm = $context['filters'][$this->searchParameterName] ?? false;
        if ($searchterm !== false) {
            // Restrict the MongoDB aggregation to the ids found by Elasticsearch
            $aggregationBuilder->match()->field('id')->in($this->getFilteredIds($searchterm, null, null, $maxResults));
        }
    }

    public function getDescription(string $resourceClass): array
    {
        return [];
    }
}
This solution might not be as elegant as using the ElasticSearch-Connector natively provided by api_platform, but it is fairly performant and it works.
However, if someone comes up with a solution to fix the depicted ES-Connector issue with api_platform, please feel free to share it.
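For reference, the request body that getFilteredIds() assembles can be sketched outside PHP as well. The snippet below is only an illustration in JavaScript of the same idea (one match clause per property inside a bool/should, plus a size that falls back to 200); the function name buildFulltextQuery is hypothetical.

```javascript
// Sketch: mirror the query body built in getFilteredIds().
// buildFulltextQuery is a hypothetical name used for illustration only.
const DEFAULT_MAX_RESULTS = 200;

function buildFulltextQuery(searchTerm, properties, maxResults) {
  // Fall back to the default when maxResults is missing or not a positive number.
  const parsed = Number.parseInt(maxResults, 10);
  const size = parsed > 0 ? parsed : DEFAULT_MAX_RESULTS;

  // One "match" clause per property; "should" combines them with OR.
  const matches = properties.map(function (propertyName) {
    return { match: { [propertyName]: searchTerm } };
  });

  return { query: { bool: { should: matches } }, size: size };
}

console.log(JSON.stringify(buildFulltextQuery("symfony", ["body", "description"], null)));
```

Sending such a body to the _search endpoint of the index in Kibana Dev Tools is a quick way to check which ids the filter will pass on to the aggregation builder.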
The problem is that FOS Elastica requires an Elasticsearch URL with a trailing slash, whereas API Platform requires a URL without one.
We usually define the URL in the .env file and then reference it in the config files.
To solve the problem, define the URL in .env without a trailing slash and append the slash in the FOS Elastica config.
# .env
###> friendsofsymfony/elastica-bundle ###
ELASTICSEARCH_URL=http://localhost:9200
###< friendsofsymfony/elastica-bundle ###

# config/packages/api_platform.yaml
api_platform:
    elasticsearch:
        enabled: true
        hosts: [ '%env(ELASTICSEARCH_URL)%' ]

# config/packages/fos_elastica.yaml
fos_elastica:
    clients:
        default: { url: '%env(ELASTICSEARCH_URL)%/' }
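The effect of the trailing slash can be illustrated with a small sketch. It assumes, as a guess at what happens internally, that one client simply concatenates the host and a path starting with "/", which would explain the doubled slash in the "//posts/_doc/_search" error. The helper names are hypothetical.

```javascript
// Sketch: normalise a base URL for two consumers with opposite expectations.
// Both helper names are hypothetical.
function withoutTrailingSlash(url) {
  return url.replace(/\/+$/, "");
}

function withTrailingSlash(url) {
  return withoutTrailingSlash(url) + "/";
}

// Assumption: a naive client joins host and path by plain concatenation.
function naiveJoin(host, path) {
  return host + path;
}

const base = "http://localhost:9200";
console.log(naiveJoin(withTrailingSlash(base), "/posts/_doc/_search"));
console.log(naiveJoin(withoutTrailingSlash(base), "/posts/_doc/_search"));
```

With the slash-terminated host the first line contains "//posts", while the second produces the single-slash path that Elasticsearch has a handler for.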

Creating a custom tokenizer in ElasticSearch NEST

I have a custom class in ES 2.5 with the following fields:
Title
DataSources
Content
Running a search works fine, except on the middle field: it is built/indexed using '|' as a delimiter.
ex: "|4|7|8|9|10|12|14|19|20|21|22|23|29|30"
I need to build a query that matches the search words in all fields AND matches at least one number in the DataSources field.
So to summarize what I currently have:
QueryBase query = new SimpleQueryStringQuery
{
    //DefaultOperator = !operatorOR ? Operator.And : Operator.Or,
    Fields = LearnAboutFields.FULLTEXT,
    Analyzer = "standard",
    Query = searchWords.ToLower()
};
_boolQuery.Must = new QueryContainer[] {query};
That's the search words query.
foreach (var datasource in dataSources)
{
    // Add DataSources with an OR
    queryContainer |= new WildcardQuery { Field = LearnAboutFields.DATASOURCE, Value = string.Format("*{0}*", datasource) };
}
// Add this Boolean Clause to our outer clause with an AND
_boolQuery.Filter = new QueryContainer[] {queryContainer};
}
That's for the datasources query. There can be multiple datasources.
It doesn't work: it returns no results once the filter query is added. I think I need some work on the tokenizer/analyzer, but I don't know enough about ES to figure that out.
EDIT: Per Val's comments below I have attempted to recode the indexer like this:
_elasticClientWrapper.CreateIndex(_DataSource, i => i
    .Mappings(ms => ms
        .Map<LearnAboutContent>(m => m
            .Properties(p => p
                .String(s => s.Name(lac => lac.DataSources)
                    .Analyzer("classic_tokenizer")
                    .SearchAnalyzer("standard")))))
    .Settings(s => s
        .Analysis(an => an.Analyzers(a => a.Custom("classic_tokenizer", ca => ca.Tokenizer("classic"))))));
var indexResponse = _elasticClientWrapper.IndexMany(contentList);
It builds successfully, with data. However the query still isn't working right.
New query for DataSources:
foreach (var datasource in dataSources)
{
    // Add DataSources with an OR
    queryContainer |= new TermQuery {Field = LearnAboutFields.DATASOURCE, Value = datasource};
}
// Add this Boolean Clause to our outer clause with an AND
_boolQuery.Must = new QueryContainer[] {queryContainer};
And the JSON:
{"learnabout_index":{"aliases":{},"mappings":{"learnaboutcontent":{"properties":{"articleID":{"type":"string"},"content":{"type":"string"},"dataSources":{"type":"string","analyzer":"classic_tokenizer","search_analyzer":"standard"},"description":{"type":"string"},"fileName":{"type":"string"},"keywords":{"type":"string"},"linkURL":{"type":"string"},"title":{"type":"string"}}}},"settings":{"index":{"creation_date":"1483992041623","analysis":{"analyzer":{"classic_tokenizer":{"type":"custom","tokenizer":"classic"}}},"number_of_shards":"5","number_of_replicas":"1","uuid":"iZakEjBlRiGfNvaFn-yG-w","version":{"created":"2040099"}}},"warmers":{}}}
The Query JSON request:
{
    "size": 10000,
    "query": {
        "bool": {
            "must": [
                {
                    "simple_query_string": {
                        "fields": [
                            "_all"
                        ],
                        "query": "\"housing\"",
                        "analyzer": "standard"
                    }
                }
            ],
            "filter": [
                {
                    "terms": {
                        "DataSources": [
                            "1"
                        ]
                    }
                }
            ]
        }
    }
}
One way to achieve this is to create a custom analyzer with a classic tokenizer which will break your DataSources field into the numbers composing it, i.e. it will tokenize the field on each | character.
So when you create your index, you need to add this custom analyzer and then use it in your DataSources field:
PUT my_index
{
    "settings": {
        "analysis": {
            "analyzer": {
                "number_analyzer": {
                    "type": "custom",
                    "tokenizer": "number_tokenizer"
                }
            },
            "tokenizer": {
                "number_tokenizer": {
                    "type": "classic"
                }
            }
        }
    },
    "mappings": {
        "my_type": {
            "properties": {
                "DataSources": {
                    "type": "string",
                    "analyzer": "number_analyzer",
                    "search_analyzer": "standard"
                }
            }
        }
    }
}
As a result, if you index the string "|4|7|8|9|10|12|14|19|20|21|22|23|29|30", your DataSources field will effectively contain the following array of tokens: [4, 7, 8, 9, 10, 12, 14, 19, 20, 21, 22, 23, 29, 30]
Then you can get rid of your WildcardQuery and simply use a TermsQuery instead:
var terms = new TermsQuery { Field = LearnAboutFields.DATASOURCE, Terms = dataSources };
// Add this Boolean Clause to our outer clause with an AND
_boolQuery.Filter = new QueryContainer[] { terms };
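The difference between the wildcard approach and token-based matching can be illustrated with a small simulation. This is not the classic tokenizer itself, only a rough stand-in that splits on the '|' character, and the function names are hypothetical.

```javascript
// Sketch: simulate how a '|'-delimited DataSources value behaves once it is
// split into tokens, compared with a substring (wildcard-like) match.
function tokenizeDataSources(value) {
  // Rough stand-in for the analyzer: split on '|' and drop empty pieces.
  return value.split("|").filter(function (token) {
    return token.length > 0;
  });
}

// A terms query matches when at least one requested term is an indexed token.
function matchesAnyTerm(tokens, terms) {
  const indexed = new Set(tokens);
  return terms.some(function (term) {
    return indexed.has(term);
  });
}

const raw = "|4|7|8|9|10|12|14|19|20|21|22|23|29|30";
const tokens = tokenizeDataSources(raw);

console.log(matchesAnyTerm(tokens, ["1"]));       // data source 1 is not in the list
console.log(matchesAnyTerm(tokens, ["1", "19"])); // 19 is in the list
console.log(raw.includes("1"));                   // substring match also hits 10, 12, 14, ...
```

The last line shows why a "*1*" style wildcard is unreliable here: the character 1 appears inside other numbers, whereas whole-token matching only succeeds for data sources that are actually present.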
At an initial glance at your code I think one problem you might have is that any queries placed within a filter clause will not be analysed. So basically the value will not be broken down into tokens and will be compared in its entirety.
It's easy to forget this so any values that require analysis need to be placed in the must or should clauses.

How to Remove from List Of Maps in DynamoDB (Must be Atomic)

I have this Schema:
{
product: S // Primary Key, // my Hash
media: L // List of Maps
}
Each media item will look like this:
[
    {
        id: S, // for example: id: uuid()
        type: S, // for example: type: "image"
        url: S // for example: url: "http://domain.com/image.jpg"
    }
]
Sample Data:
{
    product: "IPhone 6+",
    media: [
        {
            id: "1",
            type: "image",
            url: "http://www.apple.com/iphone-6-plus/a.jpg"
        },
        {
            id: "2",
            type: "image",
            url: "http://www.apple.com/iphone-6-plus/b.jpg"
        },
        {
            id: "3",
            type: "video",
            url: "http://www.apple.com/iphone-6-plus/overview.mp4"
        }
    ]
}
I want to be able to remove from media list by id.
Something like: "From product: 'IPhone 6+', remove the media with id: '2'"
After the query, the Data should be like this:
{
    product: "IPhone 6+",
    media: [
        {
            id: "1",
            type: "image",
            url: "http://www.apple.com/iphone-6-plus/a.jpg"
        },
        {
            id: "3",
            type: "video",
            url: "http://www.apple.com/iphone-6-plus/overview.mp4"
        }
    ]
}
How should I express a query like this? I saw a post on UpdateItem, but I can't find a good example for this type of query.
Thanks!
Unfortunately, the API doesn't have this feature. The closest you can get is to delete an entry from a List attribute if you know its index.
I understand that most of the time the index may not be available. However, you can consider this alternative if you don't have any other solution.
You also need to understand that even though DynamoDB has started supporting document data types such as List, Map and Set, some operations are still not possible. Some features are yet to be added to the API, and I believe this scenario is one of them.
I have used REMOVE to delete the item from the list:
var params = {
    TableName : "Product",
    Key : {
        "product" : "IPhone 6+"
    },
    UpdateExpression : "REMOVE media[0]",
    ReturnValues : "UPDATED_NEW"
};
console.log("Updating the item...");
docClient.update(params, function(err, data) {
    if (err) {
        console.error("Unable to update item. Error JSON:", JSON.stringify(err, null, 2));
    } else {
        console.log("UpdateItem succeeded:", JSON.stringify(data));
    }
});
This is just for your reference:
The DELETE action can be used only on sets.
DELETE - Deletes an element from a set.
If a set of values is specified, then those values are subtracted from
the old set. For example, if the attribute value was the set [a,b,c]
and the DELETE action specifies [a,c], then the final attribute value
is [b]. Specifying an empty set is an error.
The DELETE action only supports set data types. In addition, DELETE
can only be used on top-level attributes, not nested attributes.
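Building on the index-based REMOVE above, one possible way to approach "remove by id" is to read the item first, look up the index of the entry on the client, and then guard the REMOVE with a condition so that the update fails if the list has changed in the meantime. The sketch below only builds the parameters; the function name buildRemoveMediaParams is hypothetical, and the read-then-conditional-update pattern is a suggestion, not something confirmed by the answer above.

```javascript
// Sketch: build UpdateItem parameters that remove the media entry with the
// given id, guarded by a condition on that same list position.
// buildRemoveMediaParams is a hypothetical helper name.
function buildRemoveMediaParams(item, mediaId) {
  // Find the position of the entry in the copy of the item we just read.
  const index = item.media.findIndex(function (entry) {
    return entry.id === mediaId;
  });
  if (index === -1) {
    return null; // nothing to remove
  }

  return {
    TableName: "Product",
    Key: { product: item.product },
    UpdateExpression: "REMOVE media[" + index + "]",
    // Only apply the update if that position still holds the expected id.
    ConditionExpression: "media[" + index + "].#id = :mediaId",
    ExpressionAttributeNames: { "#id": "id" },
    ExpressionAttributeValues: { ":mediaId": mediaId },
    ReturnValues: "UPDATED_NEW",
  };
}

const sampleItem = {
  product: "IPhone 6+",
  media: [{ id: "1" }, { id: "2" }, { id: "3" }],
};
console.log(buildRemoveMediaParams(sampleItem, "2"));
```

The resulting object can be passed to docClient.update() as in the example above. If another writer has modified the list in between, the condition fails and the caller can re-read the item and retry with the new index.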

Elasticsearch does not find anything in a field containing several words

I'm trying out Elasticsearch and it looks great!
However, I noticed an annoying problem: if a field contains hello world and I search for hello wo, no result is returned!
Why does this happen?
Here is my configuration (FOSElasticaBundle):
fos_elastica:
    clients:
        default: { host: localhost, port: 9200 }
    serializer:
        callback_class: FOS\ElasticaBundle\Serializer\Callback
        serializer: serializer
    indexes:
        website:
            client: default
            settings:
                index:
                    analysis:
                        analyzer:
                            custom_search_analyzer:
                                type: custom
                                tokenizer: standard
                                filter : [standard, worddelimiter, stopwords, snowball, lowercase, asciifolding]
                            custom_index_analyzer:
                                type: custom
                                tokenizer: nGram
                                filter : [standard, worddelimiter, stopwords, snowball, lowercase, asciifolding]
                        filter:
                            stopwords:
                                type: stop
                                stopwords: [_italian_]
                                ignore_case : true
                            worddelimiter :
                                type: word_delimiter
                        tokenizer:
                            nGram:
                                type: nGram
                                min_gram: 1
                                max_gram: 20
            types:
                structure:
                    mappings:
                        name: { boost: 9, search_analyzer: custom_search_analyzer, index_analyzer: custom_index_analyzer, type: string }
Any idea how to solve this?
EDIT
Here is my query:
{
    query: {
        bool: {
            must: [ ]
            must_not: [ ]
            should: [
                {
                    term: {
                        structure.name: hello wo
                    }
                }
            ]
        }
    }
    from: 0
    size: 10
    sort: [ ]
    facets: { }
}
EDIT 2
Ok, I don't understand this behavior ...
Now I run this query:
{
    query: {
        bool: {
            must: [
                {
                    term: {
                        structure.name: hello
                    }
                }
                {
                    term: {
                        structure.name: wo
                    }
                }
            ]
            must_not: [ ]
            should: [ ]
        }
    }
    from: 0
    size: 10
    sort: [ ]
    facets: { }
}
This query returns the result I wanted, but I don't understand the difference between a single term clause containing both words and two term clauses with one word each!
Can someone explain this behavior?
Let me explain how this works.
When you index text, Elasticsearch splits it into terms if the field is analyzed (as it is in your mapping). So in your case "hello world" is split into two terms, "hello" and "world". When you run a term search for hello world, the search text is not analyzed, so it is compared as a whole and matches neither of those two terms.
To avoid splitting into terms, you can declare the name field as not analyzed in the mapping; it will then not be split into two words and will be handled as a single token.
Another solution is to use a terms query:
{
    "query": {
        "terms": {
            "structure.name": [
                "world",
                "hello"
            ]
        }
    }
}
Also, when you use query_string it returns results, because it works differently: the query text is analyzed before matching.
So, depending on your needs, you should use different queries: to search by name you should use query_string, while term should be used when you want to filter on things like categoryId or tags.
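The behaviour described above can be reproduced with a deliberately simplified model. It assumes that the index analyzer lowercases the text, splits it on whitespace and then emits n-grams of 1 to 20 characters per word. The real analysis chain in the question has more filters, so this is only an approximation, and the function names are hypothetical.

```javascript
// Sketch: simplified model of an n-gram index analyzer and of a term query,
// which compares the search text with the stored tokens without analyzing it.
function ngrams(word, minGram, maxGram) {
  const result = [];
  for (let start = 0; start < word.length; start++) {
    for (let len = minGram; len <= maxGram && start + len <= word.length; len++) {
      result.push(word.slice(start, start + len));
    }
  }
  return result;
}

// Assumption: n-grams are built per word and never span a space.
function indexTokens(text) {
  const words = text.toLowerCase().split(/\s+/).filter(Boolean);
  return new Set(words.flatMap(function (word) {
    return ngrams(word, 1, 20);
  }));
}

// A term query looks for one exact token.
function termQuery(tokens, term) {
  return tokens.has(term);
}

const nameTokens = indexTokens("hello world");
console.log(termQuery(nameTokens, "hello wo")); // false: no single token contains a space
console.log(termQuery(nameTokens, "hello") && termQuery(nameTokens, "wo")); // true
```

Under these assumptions, a single term clause for hello wo finds nothing, while two term clauses (hello and wo) both match, which is the difference observed between the two queries in the question.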

Grid content JSON conversion

I have a grid where users can add as many new rows as they want. After adding all the rows, they click the "Save" button. When the Save button is clicked, I want to send all the data entered by the user in JSON format to the server-side code (a servlet in my case).
Below is the model and store definition:
Ext.define('Plant', {
    extend: 'Ext.data.Model',
    fields: [
        // the 'name' below matches the tag name to read, except 'availDate'
        // which is mapped to the tag 'availability'
        {name: 'common', type: 'string'},
        {name: 'botanical', type: 'string'},
        {name: 'light'},
        {name: 'price', type: 'float'},
        // dates can be automatically converted by specifying dateFormat
        {name: 'availDate', mapping: 'availability', type: 'date', dateFormat: 'm/d/Y'},
        {name: 'indoor', type: 'bool'}
    ]
});
// create the Data Store
var store = Ext.create('Ext.data.Store', {
    // destroy the store if the grid is destroyed
    autoDestroy: true,
    model: 'Plant'
});
On click of the save button, I am able to get the store like this:
{
    text: 'Save',
    handler: function() {
        // Getting the store
        var records = grid.getStore();
        console.log(records.getCount());
        Ext.Ajax.request({
            url: '/CellEditing/CellEditingGridServlet',
            method: 'POST',
            jsonData: {
                // How to assign the store here such that
                // it is sent in JSON format to the server?
            },
            callback: function (options, success, response) {
            }
        });
    }
}
But I don't know how to convert the store content into JSON and send it in the jsonData of the Ajax request.
I would like the JSON data to look something like this on the server side:
{"plantDetails":
    [
        {
            "common": "Plant1",
            "light": "shady",
            "price": 25.00,
            "availDate": "05/05/2013",
            "indoor": "Yes"
        },
        {
            "common": "Plant2",
            "light": "most shady",
            "price": 15.00,
            "availDate": "12/09/2012",
            "indoor": "No"
        }
    ]
}
Please let me know how to achieve this.
Regards,
Agreed with Neil, the right way to do this is through an editable store outfitted with a proxy and a writer. See the example here: http://docs.sencha.com/ext-js/4-1/#!/example/grid/cell-editing.html
In the store:
writer: {
    type: 'json',
    allowSingle: true
}
Experiment with allowSingle as per your use case.
In your controller:
// if you want to set extra params
yourStore.getProxy().setExtraParam("anyParam", anyValue);
// sync the store
yourStore.sync({
    success: function() {
        yourGrid.setLoading(false);
        ..
    },
    scope: this // within the scope of the controller
});
You should create the model with a new id (you can ignore it on the server side and use your own key generation, but it lets Ext JS 4 know, for its internal purposes, that a new record has been created).
Creating a model instance:
var r = Ext.create('yourModel', { id: idGen++, propA : valA , ... });
Inserting it into the grid:
store.insert(0, r);
var editPlugin = grid.getPlugin(editPluginId);
editPlugin.startEdit(0, 1);
Once you receive a response, the ids can be updated to their true values.
In the store:
reader: {
    type: 'json',
    root: 'yourdata',
    successProperty: 'success',
    idProperty: 'id'
}
If you use the same grid for handling and editing, you can use the write event (or another appropriate event) for more advanced handling in the store:
listeners: {
    write: function(store, operation, eOpts) {
        var insertedData = Ext.decode(operation.response.responseText);
        .. do something
    }
}
I would recommend using the MVC architecture of Ext JS 4.
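If you would rather keep the manual Ext.Ajax.request from the question, the store contents can also be collected by hand. The sketch below assumes only that the store exposes getRange() returning records with a data property, as Ext JS 4 stores do; the helper name storeToJson and the stand-in store used to exercise it are hypothetical.

```javascript
// Sketch: turn the records of a store into a plain object suitable for jsonData.
// storeToJson is a hypothetical helper name.
function storeToJson(store, rootName) {
  // getRange() without arguments returns every record in the store.
  var rows = store.getRange().map(function (record) {
    return record.data;
  });
  var payload = {};
  payload[rootName] = rows;
  return payload;
}

// Stand-in for a real Ext.data.Store, only to show the resulting shape.
var fakeStore = {
  getRange: function () {
    return [
      { data: { common: "Plant1", light: "shady", price: 25 } },
      { data: { common: "Plant2", light: "most shady", price: 15 } }
    ];
  }
};

console.log(JSON.stringify(storeToJson(fakeStore, "plantDetails")));
```

In the Save handler this would be used as jsonData: storeToJson(grid.getStore(), 'plantDetails'). Date fields are held as Date objects in the record data, so they may need formatting before being sent.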
This is what I tried and it seems to work:
var store = Ext.create('Ext.data.Store', {
    // destroy the store if the grid is destroyed
    autoDestroy: true,
    model: 'Plant',
    proxy: {
        type: 'ajax',
        url: '/CellEditing/CellEditingGridServlet',
        writer: {
            type: 'json',
            root: 'plantDetails'
        }
    }
});
// in the Save button config
handler: function() {
    grid.getStore().sync();
}
But I am getting an additional parameter in the JSON on the server side:
"id": null
I haven't defined this id in my model, so where is it coming from? Is there a way to give it a value rather than the default null?
