Elasticsearch language configuration - Symfony

I am working on a project with Symfony and Sonata Admin.
The project will be available in two languages, fr and de. I use Elasticsearch and have installed "friendsofsymfony/elastica-bundle": "3.2.1".
How can I add the language configuration in the Elastica configuration file?
This is my FOSElasticaBundle configuration file:
fos_elastica:
    clients:
        default: { host: %elastic_host%, port: %elastic_port% }
    indexes:
        myproject:
            settings:
                index:
                    analysis:
                        analyzer:
                            custom_analyzer:
                                type: custom
                                tokenizer: nGram
                                filter: [stopwords, asciifolding, lowercase, elision, worddelimiter]
                            custom_search_analyzer:
                                type: custom
                                tokenizer: standard
                                filter: [stopwords, asciifolding, lowercase, elision, worddelimiter]
                        tokenizer:
                            nGram:
                                type: nGram
                                min_gram: 3
                                max_gram: 20
                        filter:
                            elision:
                                type: elision
                                articles: [l, m, t, qu, n, s, j, d]
                            stopwords:
                                type: stop
                                stopwords: [_french_]
                                ignore_case: true
                            worddelimiter:
                                type: word_delimiter
            types:
                produit:
                    mappings:
                        titre: ~
                        active:
                            type: boolean
                        descriptifTexteOriginal:
                            index_analyzer: custom_analyzer
                            search_analyzer: custom_search_analyzer
                            type: string
                            norms: { enabled: false }
                            explain: true
                            index_options: freqs
                    persistence:
                        driver: orm
                        model: myproject\BOBundle\Entity\Produit
                        finder: ~
                        provider: ~
                        listener: ~

Basically, there are two situations. In the first, a field can store text in two or
more different languages, but only one at a time. For example, a title field holds
either English or Turkish text, never both. In the second, a single document carries
values in several languages at once.
In my opinion, there are also two ways to solve this. The first is to separate your
fields and the second is to separate your indexes. Let me explain both:
Suppose you have the following data for the products of an e-commerce website:
id => identifier of the document
name => name of the product
price => price of the product
The first approach has two options. Let's look at the first option and create a
mapping:
PUT 44369578
{
  "mappings": {
    "product": {
      "properties": {
        "id": { "type": "integer" },
        "name": {
          "type": "text",
          "fields": {
            "en": {
              "type": "text",
              "analyzer": "englishAnalyzer"
            },
            "tr": {
              "type": "text",
              "analyzer": "turkishAnalyzer"
            }
          }
        },
        "price": { "type": "double" }
      }
    }
  }
}
Here the name field takes either an English or a Turkish value, and you query the
sub-field (name.en or name.tr) that matches the language.
A document can hold only one of the two values, not both.
When you add support for another language, you have to change your mapping, too.
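Applied to the fr/de case from the question, the multi-field mapping above could be generated rather than written by hand. This is only a sketch: it assumes Elasticsearch's built-in "french" and "german" analyzers in place of the custom englishAnalyzer/turkishAnalyzer, and the helper name build_multifield_mapping is made up for illustration.

```python
import json

# Assumption: the built-in "french" and "german" analyzers are used here
# instead of the custom analyzers named in the answer.
LANGUAGE_ANALYZERS = {"fr": "french", "de": "german"}


def build_multifield_mapping(field, analyzers):
    """Return a properties fragment with one sub-field per language."""
    return {
        field: {
            "type": "text",
            "fields": {
                lang: {"type": "text", "analyzer": analyzer}
                for lang, analyzer in analyzers.items()
            },
        }
    }


# Same document shape as in the answer: id, name, price.
properties = {"id": {"type": "integer"}, "price": {"type": "double"}}
properties.update(build_multifield_mapping("name", LANGUAGE_ANALYZERS))
body = {"mappings": {"product": {"properties": properties}}}

if __name__ == "__main__":
    # The printed JSON is the body of the PUT request shown above.
    print(json.dumps(body, indent=2))
```

Adding a language then means adding one entry to LANGUAGE_ANALYZERS, although the index mapping itself still has to be updated.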
Now let's look at the second option of the first approach.
PUT 44369578
{
  "mappings": {
    "product": {
      "properties": {
        "id": { "type": "integer" },
        "name_en": { "type": "text", "analyzer": "englishAnalyzer" },
        "name_tr": { "type": "text", "analyzer": "turkishAnalyzer" },
        "price": { "type": "double" }
      }
    }
  }
}
In this example there are two fields, one for each language, so a document can
store the English and the Turkish value at the same time.
When you add support for another language, you have to change your mapping, too.
These are the two options of the first approach.
Now let's look at the second approach. Here we keep the same data structure for
product documents, but we create one index per language.
PUT 44369578_en
{
  "mappings": {
    "product": {
      "properties": {
        "id": { "type": "integer" },
        "name": { "type": "text", "analyzer": "englishAnalyzer" },
        "price": { "type": "double" }
      }
    }
  }
}
PUT 44369578_tr
{
  "mappings": {
    "product": {
      "properties": {
        "id": { "type": "integer" },
        "name": { "type": "text", "analyzer": "turkishAnalyzer" },
        "price": { "type": "double" }
      }
    }
  }
}
With this approach, each product is stored as two documents, one per index, and the
application has to merge them at search time. That is the awkward part of this
approach. On the other hand, supporting a new language only requires creating a new
index.
In conclusion, these are the three methods I know of. You can pick one of them, but
keep in mind that each choice gains you some features and costs you others.
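The application-side merge that the index-per-language approach requires might look roughly like this. It is a sketch in plain Python with no Elasticsearch client involved: index_for, merge_hits and the "products" base name are hypothetical, and each hit is assumed to be an already-extracted document with id, name and price.

```python
def index_for(base, lang, supported=("fr", "de")):
    """Return the per-language index name, e.g. 'products_fr'."""
    if lang not in supported:
        raise ValueError(f"unsupported language: {lang!r}")
    return f"{base}_{lang}"


def merge_hits(hits_by_lang):
    """Merge per-language hits into one record per product id."""
    merged = {}
    for lang, hits in hits_by_lang.items():
        for hit in hits:
            # Language-independent fields are taken from the first hit seen.
            record = merged.setdefault(
                hit["id"], {"id": hit["id"], "price": hit["price"], "name": {}}
            )
            record["name"][lang] = hit["name"]
    return list(merged.values())


if __name__ == "__main__":
    hits = {
        "fr": [{"id": 1, "name": "chaise", "price": 10.0}],
        "de": [{"id": 1, "name": "Stuhl", "price": 10.0}],
    }
    print(index_for("products", "fr"))
    print(merge_hits(hits))
```

Supporting a new language only means extending the supported tuple and creating the matching index, which is the trade-off described above.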

Related

Omitting Weaviate Objects Containing Specific Reference

Suppose I have a database of movies with some genres tagged to it. My Weaviate schema looks like this:
"classes": [{
  "class": "Movie",
  "properties": [{
    "name": "name",
    "dataType": ["string"]
  }, {
    "name": "inGenres",
    "dataType": ["Genre"]
  }]
}, {
  "class": "Genre",
  "properties": [{
    "name": "name",
    "dataType": ["string"]
  }]
}]
I would like to exclude movies tagged with a specific genre from the search results. Specifically, for a database containing the following Movie objects:
{"name":"foo", "inGenres":[{"name":"drama"}]}
{"name":"bar", "inGenres":[{"name":"horror"},{"name":"thriller"}]}
{"name":"baz", "inGenres":[{"name":"horror"},{"name":"sci-fi"}]}
If I exclude the horror genre, the search results should only return the movie foo. Is there any way to perform such a query with GraphQL or the Python client?
You can use the where filter to achieve this.
In your specific case:
{
  Get {
    Movie(
      where: {
        path: ["inGenres", "Genre", "name"],
        operator: NotEqual,
        valueString: "horror"
      }
    ) {
      name
      inGenres {
        ... on Genre {
          name
        }
      }
    }
  }
}
In Python:
import weaviate
client = weaviate.Client("http://localhost:8080")
where_filter = {
    "path": ["inGenres", "Genre", "name"],
    "operator": "NotEqual",
    "valueString": "horror"
}
query_result = client.query.get("Movie", ["name"]).with_where(where_filter).do()
print(query_result)
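Since a movie can reference several genres, it may be worth checking that the filter really drops bar and baz and not just movies whose only genre is horror. The following plain-Python sketch (no Weaviate call; exclude_genre is a hypothetical helper) spells out the result the question asks for, so it can be compared with what the query returns:

```python
# The three Movie objects from the question.
movies = [
    {"name": "foo", "inGenres": [{"name": "drama"}]},
    {"name": "bar", "inGenres": [{"name": "horror"}, {"name": "thriller"}]},
    {"name": "baz", "inGenres": [{"name": "horror"}, {"name": "sci-fi"}]},
]


def exclude_genre(items, genre):
    """Keep only the movies for which none of the genres equals `genre`."""
    return [
        movie
        for movie in items
        if all(g["name"] != genre for g in movie["inGenres"])
    ]


names = [movie["name"] for movie in exclude_genre(movies, "horror")]

if __name__ == "__main__":
    print(names)
```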

How to describe request body properly using OpenAPI and API Platform?

I am struggling to get the right definition for the request body used within Symfony API Platform:
From the image above, what my endpoint expects is a JSON body containing the required values. I am defining the required values as being in the path, but that is NOT true, and they don't belong to query, header or cookie either.
I have tried two definitions (and I've removed some lines that aren't relevant for the resolution):
# the definition result is on the first image on this post
resources:
    App\Entity\Cases:
        collectionOperations:
            case_assign:
                swagger_context:
                    parameters:
                        - name: assign_type
                          in: path
                          description: 'The assignee type: worker or reviewer'
                          required: true
                          type: string
                        # ....
                        - in: body
                          name: body
                          description: 'Object needed to perform a case assignment'
                          required: true
                          example: {
                              "assign_type": "worker",
                              "assigned_by": 1,
                              "assigned_to": 1,
                              "cases": {
                                  "0": 109855,
                                  "1": 109849,
                                  "2": 109845
                              }
                            }
And the second definition looks like:
resources:
    App\Entity\Cases:
        collectionOperations:
            case_assign:
                swagger_context:
                    summary: 'Assign a worker or a reviewer to a case'
                    parameters:
                        - name: body
                          in: body
                          required: true
                          schema:
                              type: array
                              items:
                                  assign_type:
                                      name: assign_type
                                      description: 'The assignee type: worker or reviewer'
                                      required: true
                                      type: string
And here is the result:
Neither of them seems to do what I need or expect. What am I doing wrong? Can I get some help?
I have also read several pages and posts without finding a good example or the right way to do it (see the following list):
https://github.com/api-platform/api-platform/issues/1019
https://api-platform.com/docs/core/swagger/
https://idratherbewriting.com/learnapidoc/pubapis_swagger_intro.html
https://swagger.io/docs/specification/describing-parameters/
https://swagger.io/docs/specification/describing-request-body/
How to describe this POST JSON request body in OpenAPI (Swagger)?
You can use a SwaggerDecorator as described in the docs to override the generated swagger.json and change pretty much anything you like. You will need to edit the definitions for v2:
"paths": {
  "/todos": {
    "post": {
      "parameters": [
        {
          "name": "todo",
          "in": "body",
          "description": "The new Todo resource",
          "schema": {
            "$ref": "#/definitions/Todo"
          }
        }
      ]
    }
  }
}
"definitions": {
  "Todo": {
    "type": "object",
    "description": "",
    "properties": {
      "text": {
        "required": true,
        "description": "This text will be added to the schema as a description",
        "type": "string"
      },...
or the components.schemas in OpenAPI v3:
"components": {
  "schemas": {
    "Todo": {
      "type": "object",
      "description": "",
      "properties": {
        "text": {
          "required": true,
          "minLength": 4,
          "example": "some example text",
          "description": "This text will be added to the schema as a description",
          "type": "string"
        },...
Your other option is to use the "swagger_context" ("openapi_context" for OpenAPI v3) of the @ApiResource or @ApiProperty annotations. Valid options should be in the Swagger docs.
/**
 * This text will be added to the schema as a description
 * @ApiProperty(
 *     swaggerContext={"required"=true},
 *     openapiContext={"required"=true, "minLength"=4, "example"="some example text"})
 * @ORM\Column(type="string", length=255)
 */
private $text;
Would result in:
example doc
Edit:
I just noticed that there is an error in your YAML configuration. You are trying to set up the documentation for an array. I assume you actually want to send an object:
parameters:
    - name: body
      in: body
      required: true
      schema:
          type: object
          required: # only for swagger v2
              - assign_type
          properties:
              assign_type:
                  description: 'The assignee type: worker or reviewer'
                  required: true # only for openapi v3
                  type: string
              assigned_by:
                  ...
You can check the correct syntax for Data Types here.
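To double-check the nesting before pasting it into the YAML, the body parameter can be assembled as a plain data structure first. This is a rough sketch, not API Platform code: the property types are assumptions inferred from the example payload in the question, body_parameter is a made-up helper, and it uses the object-level list form of required.

```python
import json

# Assumption: types are inferred from the example payload in the question.
PROPERTIES = {
    "assign_type": {
        "type": "string",
        "description": "The assignee type: worker or reviewer",
    },
    "assigned_by": {"type": "integer"},
    "assigned_to": {"type": "integer"},
    # "cases" is an object keyed by position ("0", "1", ...) with integer ids.
    "cases": {"type": "object", "additionalProperties": {"type": "integer"}},
}


def body_parameter(properties, required):
    """Build an 'in: body' parameter whose schema is an object."""
    missing = [name for name in required if name not in properties]
    if missing:
        raise KeyError(f"required properties not defined: {missing}")
    return {
        "name": "body",
        "in": "body",
        "required": True,
        "schema": {
            "type": "object",
            "required": list(required),
            "properties": properties,
        },
    }


if __name__ == "__main__":
    parameter = body_parameter(PROPERTIES, ["assign_type", "cases"])
    print(json.dumps(parameter, indent=2))
```

The printed structure maps one-to-one onto an entry of the parameters list under swagger_context.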

How to handle large inputs into the Text Translator Api?

I am working on a cognitive skillset in Azure Search, and I need to add Text Translator functionality to the project. Currently all of the text is translated correctly unless the character count is above the maximum value, in which case the API returns null.
I am currently using document/content as the input for the text translator, but I've also tried using the output of the predefined Split Skill. When I pass the files through the Split Skill (by page) before the translator skill, the indexer breaks and all of the files fail to index (even the ones that don't need to be translated).
This is the code from my skillset.json that translates all of the files below the character limit:
{
  "@odata.type": "#Microsoft.Skills.Custom.WebApiSkill",
  "description": "Our new translator custom skill",
  "uri": "[my-uri]",
  "batchSize": 1,
  "context": "/document",
  "inputs": [
    {
      "name": "text",
      "source": "/document/content"
    },
    {
      "name": "language",
      "source": "/document/languageCode"
    }
  ],
  "outputs": [
    {
      "name": "text",
      "targetName": "translatedText"
    }
  ]
}
This is my attempt to split the text by page before passing the text through the translator API. This results in a 501 error.
{
  "@odata.type": "#Microsoft.Skills.Text.SplitSkill",
  "textSplitMode": "pages",
  "maximumPageLength": 4000,
  "inputs": [
    {
      "name": "text",
      "source": "/document/content"
    },
    {
      "name": "languageCode",
      "source": "/document/languageCode"
    }
  ],
  "outputs": [
    {
      "name": "textItems",
      "targetName": "pages"
    }
  ]
},
{
  "@odata.type": "#Microsoft.Skills.Custom.WebApiSkill",
  "description": "Our new translator custom skill",
  "uri": "[my-uri]",
  "batchSize": 1,
  "context": "/document/pages/*",
  "inputs": [
    {
      "name": "text",
      "source": "/document/pages/*"
    },
    {
      "name": "language",
      "source": "/document/languageCode"
    }
  ],
  "outputs": [
    {
      "name": "text",
      "targetName": "translatedText"
    }
  ]
}
I use the exact same implementation for the Named Entity Recognition Skill (using document/pages/* as input), and it works fine. I'm not sure what the difference would be with the Text Translator skill.
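One alternative that is independent of the Split Skill would be to chunk the text inside the custom Web API skill itself, before it calls the translator. The sketch below is only an illustration of that idea under assumptions: split_text and translate_long are hypothetical helpers, translate stands in for whatever function calls the translation endpoint, and max_chars is whatever limit that endpoint enforces.

```python
def split_text(text, max_chars):
    """Split text into chunks of at most max_chars characters.

    Cuts at the last space inside each window when there is one, so words
    are not broken unless a single word is longer than max_chars.
    """
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + max_chars, len(text))
        if end < len(text):
            cut = text.rfind(" ", start, end)
            if cut > start:
                end = cut + 1  # keep the space with the left-hand chunk
        chunks.append(text[start:end])
        start = end
    return chunks


def translate_long(text, translate, max_chars):
    """Translate text chunk by chunk and join the results.

    `translate` is a placeholder for the function that calls the translator.
    """
    return "".join(translate(chunk) for chunk in split_text(text, max_chars))


if __name__ == "__main__":
    # str.upper stands in for the real translation call.
    print(translate_long("hello world foo", str.upper, 8))
```

Because the chunks concatenate back to the original text, the skill can still return a single translatedText value for /document.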

How to associate nested relationships with attributes for a POST in JSON API

According to the spec, resource identifier objects do not hold attributes.
I want to do a POST to create a new resource which includes other nested resources.
These are the basic resources: club (with name) and many positions (type). Think of a football club with positions like goalkeeper, goalkeeper, striker, striker, etc.
When I create this association, I want to set some attributes, such as whether the position is required for this particular team. For example, I only need 1 goalkeeper but I want to have a team with many reserve goalkeepers. When I model these entities in the DB I'll set the required attribute in a linkage table.
This is not compliant with JSON API:
{
  "data": {
    "type": "club",
    "attributes": {
      "name": "Backyard Football Club"
    },
    "relationships": {
      "positions": {
        "data": [{
          "id": "1",
          "type": "position",
          "attributes": {
            "required": "true"
          }
        }, {
          "id": "1",
          "type": "position",
          "attributes": {
            "required": "false"
          }
        }]
      }
    }
  }
}
This is also not valid:
{
  "data": {
    "type": "club",
    "attributes": {
      "name": "Backyard Football Club",
      "positions": [{
        "position_id": "1",
        "required": "true"
      }, {
        "position_id": "1",
        "required": "false"
      }]
    }
  }
}
So what is the best way to approach this association?
The best approach here is to create a separate resource for club_position.
Creating a club will return a URL for creating club_positions. You then POST each club_position to that URL with relationship identifiers pointing to the position and club resources.
An added benefit is that club_position creation can be parallelized.
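A request body for one such club_position might be built like this. It is a sketch only: the resource type club_position comes from the answer, while the relationship names club and position and the helper club_position_payload are assumptions made for illustration.

```python
import json


def club_position_payload(club_id, position_id, required):
    """Build a JSON API document for creating one club_position.

    The `required` flag lives in the attributes of the linking resource,
    while club and position are referenced by resource identifier objects.
    """
    return {
        "data": {
            "type": "club_position",
            "attributes": {"required": required},
            "relationships": {
                # JSON API ids are strings, so numeric ids are converted.
                "club": {"data": {"type": "club", "id": str(club_id)}},
                "position": {"data": {"type": "position", "id": str(position_id)}},
            },
        }
    }


if __name__ == "__main__":
    # One required goalkeeper and one reserve, both pointing at position 1.
    for flag in (True, False):
        print(json.dumps(club_position_payload(7, 1, flag), indent=2))
```

Each payload is independent of the others, which is what makes the parallel creation mentioned above possible.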

JSON API format in API-Platform

First of all, I want to implement JSON API.
I followed the tutorial on API Platform and created the entities just like in the example. The response looks like this:
{
  "links": {
    "self": "/api/books"
  },
  "meta": {
    "totalItems": 1,
    "itemsPerPage": 30,
    "currentPage": 1
  },
  "data": [
    {
      "id": "/api/books/1",
      "type": "Book",
      "attributes": {
        "isbn": "9781782164104",
        "title": "Persistence in PHP with the Doctrine ORM",
        "description": "This book is designed for PHP developers and architects who want to modernize their skills through better understanding of Persistence and ORM.",
        "author": "Kévin Dunglas",
        "publicationDate": "2013-12-01T00:00:00+01:00",
        "_id": 1
      },
      "relationships": {
        "reviews": {
          "data": [
            {
              "type": "Review",
              "id": "/api/reviews/1"
            }
          ]
        }
      }
    }
  ]
}
My api_platform.yaml config:
api_platform:
    mapping:
        paths: ['%kernel.project_dir%/src/Entity']
    formats:
        jsonapi:
            mime_types: ['application/vnd.api+json']
So I have a problem with the id field in data. I get ids in the format /api/entityName/id, but I just want the number (as a string), just like in JSON API. Is there some configuration that I am missing, or is there any other way to achieve that?
It was discussed here.
You need to use a Normalizer or create a custom getter.
All you need is to send
Accept: application/json
in the request header.
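If the server output cannot be changed, a consumer can also shorten the ids after receiving the response. This is a client-side workaround sketched in plain Python, not an API Platform feature, and both helper names are made up for illustration:

```python
import copy


def iri_to_id(iri):
    """Return the last path segment of an IRI such as '/api/books/1'."""
    return iri.rstrip("/").rsplit("/", 1)[-1]


def plain_ids(resource):
    """Return a copy of a JSON API resource object with short ids."""
    result = copy.deepcopy(resource)
    result["id"] = iri_to_id(result["id"])
    for relationship in result.get("relationships", {}).values():
        data = relationship.get("data")
        # "data" may be a single identifier, a list of them, or empty.
        items = data if isinstance(data, list) else [data] if data else []
        for item in items:
            item["id"] = iri_to_id(item["id"])
    return result


if __name__ == "__main__":
    book = {
        "id": "/api/books/1",
        "type": "Book",
        "relationships": {
            "reviews": {"data": [{"type": "Review", "id": "/api/reviews/1"}]}
        },
    }
    print(plain_ids(book))
```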
