Hasura - query for tags - null array should return all results, but only returns items with tags

I have a query:
query SearchProductData($tagId: [Int!]) {
  product(where: { productTags: { _or: { tagId: { _in: $tagId } } } }) {
    id
    name
    productTags {
      tag {
        id
        name
      }
    }
  }
}
and I pass in a variable of
{"tagId": null}
I would like to get back all products, regardless of whether they have tags applied. What happens instead is that it retrieves only the items with tags applied, excluding the items with no tags.
the DB schema is
|- product
|
|-- productTags (one to many linking table of productIDs and tagIDs)
|
|--- tags
Any ideas how to write a query for this use case?

This is expected because of how the where clause translates to a join (you can see the generated SQL and the execution plan by clicking Analyze in the GraphiQL console).
I don't think it's possible without injecting a bool expression. See below:
E.g.
query Test($expression: product_bool_exp) {
  product(where: $expression) {
    id
    name
    product_tags {
      tag {
        id
        name
      }
    }
  }
}
with argument
{
  "expression": { "product_tags": { "tag_id": { "_in": [2] } } }
}
returns back:
{
  "data": {
    "product": [
      {
        "id": 1,
        "name": "product_A",
        "product_tags": [
          { "tag": { "id": 1, "name": "tag_1" } },
          { "tag": { "id": 2, "name": "tag_2" } }
        ]
      }
    ]
  }
}
And for the other case, using the same query, where no tags are passed in (and therefore all products should be returned), we can use this variable value:
{
  "expression": null
}
And we get back...
{
  "data": {
    "product": [
      {
        "id": 1,
        "name": "product_A",
        "product_tags": [
          { "tag": { "id": 1, "name": "tag_1" } },
          { "tag": { "id": 2, "name": "tag_2" } }
        ]
      },
      {
        "id": 4,
        "name": "product_D",
        "product_tags": []
      }
    ]
  }
}
So you can dynamically construct the where expression on the client and pass it as an argument to the query.
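For example, a minimal client-side sketch of that in Python with requests (the endpoint URL and helper function are assumptions for illustration, not part of the original answer):

import requests

HASURA_URL = "http://localhost:8080/v1/graphql"  # assumed Hasura GraphQL endpoint

QUERY = """
query Test($expression: product_bool_exp) {
  product(where: $expression) {
    id
    name
    product_tags { tag { id name } }
  }
}
"""

def search_products(tag_ids=None):
    # Build the bool expression only when tag ids are given;
    # a null $expression returns all products, as shown above.
    expression = {"product_tags": {"tag_id": {"_in": tag_ids}}} if tag_ids else None
    resp = requests.post(HASURA_URL, json={"query": QUERY, "variables": {"expression": expression}})
    resp.raise_for_status()
    return resp.json()["data"]["product"]

print(search_products())     # all products, with or without tags
print(search_products([2]))  # only products tagged with tag id 2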

Related

How to retrieve firebase documents missing a field using runQuery and the IN operator?

This is my HTTP POST request body...
{
  "structuredQuery": {
    "select": {
      "fields": [
        { "fieldPath": "name" },
        { "fieldPath": "taxId" },
        { "fieldPath": "mailingAddress" }
      ]
    },
    "from": [
      { "collectionId": "orgs" }
    ],
    "where": {
      "fieldFilter": {
        "field": { "fieldPath": "orgId" },
        "op": "IN",
        "value": {
          "arrayValue": {
            "values": [
              { "stringValue": "" },
              { "nullValue": null }
            ]
          }
        }
      }
    }
  }
}
It fails to return orgs where the orgId field is completely missing from the document. It correctly includes orgs where the orgId field is present and equal to an empty string. This is accessing a Cloud Firestore database.
Due to the way Firestore indexes data, it is not possible to query for documents for which a certain field is completely missing from the document: the field needs to exist in order for the Firestore index to take it into account (the official video on how Firestore indexing works covers this in more detail).
A workaround is to store an empty value in this field for every document, as you mention in your question, so that the field is indexed and can be queried.
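As a hedged illustration of that workaround (not part of the original answer; it assumes the orgs collection and orgId field from the question and the google-cloud-firestore Python client), a one-off backfill could look like this:

from google.cloud import firestore

db = firestore.Client()

# Give every org document an orgId field so it is indexed and can be queried.
for snapshot in db.collection("orgs").stream():
    data = snapshot.to_dict() or {}
    if "orgId" not in data:
        # An empty string is enough for the field to appear in the index.
        snapshot.reference.update({"orgId": ""})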

Updating item in DynamoDB fails for the UpdateExpression syntax

My table data looks like the one below
{
  "id": { "S": "alpha-rocket" },
  "images": {
    "SS": [
      "apple/value:50",
      "Mango/aa:284_454_51.0.0",
      "Mango/bb:291",
      "Mango/cc:4"
    ]
  },
  "product": { "S": "fruit" }
}
Below is my code to update the table. The variables I am passing to the function are product_id = alpha-rocket, image_val = 284_454_53.0.0 and image = Mango/aa:284_454_53.0.0.
I am trying to update the value of Mango/aa from 284_454_51.0.0 to 284_454_53.0.0, but I am getting the error "The document path provided in the update expression is invalid for update".
import boto3

def update_player_score(product_id, image_val, image):
    dynamo = boto3.resource('dynamodb')
    tbl = dynamo.Table('orcus')
    result = tbl.update_item(
        Key={
            "product": "fruit",
            "id": product_id,
        },
        UpdateExpression="SET images.#image_val = :image_val",
        ExpressionAttributeNames={
            "#image_name": "image_name"
        },
        ExpressionAttributeValues={
            ":image_val": image_val,
        },
        ReturnValues="ALL_NEW",
    )
Is there a way to update the value of Mango/aa, or to replace the full string "Mango/aa:284_454_51.0.0" with "Mango/aa:284_454_53.0.0"?
You cannot update a string in a list by matching the string. If you know the index of it you can replace the value of the string by index:
SET images[1] = :image_val
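As a rough boto3 sketch of that index-based update (this assumes images is stored as a list (L) rather than the string set (SS) shown in the question, since set elements cannot be addressed by index):

import boto3

dynamo = boto3.resource('dynamodb')
tbl = dynamo.Table('orcus')  # table name taken from the question

# Replace the element at index 1 of the images list in place.
tbl.update_item(
    Key={"product": "fruit", "id": "alpha-rocket"},
    UpdateExpression="SET images[1] = :image_val",
    ExpressionAttributeValues={":image_val": "Mango/aa:284_454_53.0.0"},
    ReturnValues="ALL_NEW",
)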
It seems like maybe what you want is not a list of strings, but another map. So instead of your data looking like it does you'd make it look like this, which would allow you to do the update you're looking for:
{
  "id": { "S": "alpha-rocket" },
  "images": {
    "M": {
      "apple": {
        "M": {
          "value": { "S": "50" }
        }
      },
      "Mango": {
        "M": {
          "aa": { "S": "284_454_51.0.0" },
          "bb": { "S": "291" },
          "cc": { "S": "4" }
        }
      }
    }
  },
  "product": { "S": "fruit" }
}
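With the data remodeled as a map, the nested value can be addressed directly in the update expression. A minimal boto3 sketch, with the table and key values assumed from the question:

import boto3

dynamo = boto3.resource('dynamodb')
tbl = dynamo.Table('orcus')  # table name taken from the question

# Set images.Mango.aa to the new version string.
tbl.update_item(
    Key={"product": "fruit", "id": "alpha-rocket"},
    UpdateExpression="SET images.#fruit.#variant = :image_val",
    ExpressionAttributeNames={"#fruit": "Mango", "#variant": "aa"},
    ExpressionAttributeValues={":image_val": "284_454_53.0.0"},
    ReturnValues="ALL_NEW",
)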
I would also consider putting the different values in different "rows" in the table and using queries to build the objects.

Structure Firebase data to return array of objects

What is the best way to structure a Firebase database so that querying just one endpoint, i.e. admin.database().ref('categories').once(), returns me JSON of the form
{
  "categories": [
    {
      "-ky1": {
        "name": "Profile",
        "id": "-ky1",
        "sections": [{
          "-ky1a": {
            "name": "section1",
            "id": "-ky1a",
            "-ky1a1": {
              "name": "field1",
              "id": "-ky1a1"
            }
          },
          "-ky1b": {
            "name": "section2",
            "id": "-ky1b",
            "-ky1b1": {
              "name": "fieldA",
              "id": "-ky1b1"
            }
          }
        }]
      }
    },
    {
      "-ky2": {
        ...
      }
    }
  ]
}
where ideally, I get back an array of nested objects.

What is the best way to query the document closest to a date-time on elasticsearch?

I need to retrieve the document that has the closest geo location and date-time to the request, so I'm not looking for a match of the date-time, but the closest one. I solved it using a custom script, however I'm guessing there might be a better way to do it, similar to the way I'm filtering the geo location based on a location and a distance.
Here's my code (in python):
query = {
    "query": {
        "function_score": {
            "boost_mode": "replace",
            "query": {
                "filtered": {
                    "query": {
                        "match_all": {}
                    },
                    "filter": {
                        "geo_distance": {
                            "distance": "10km",
                            "location": json.loads(self.request.body)["location"]
                        }
                    }
                }
            },
            "script_score": {
                "lang": "groovy",
                "script_file": "calculate-score",
                "params": {
                    "stamp": json.loads(self.request.body)["stamp"]
                }
            }
        }
    },
    "sort": [
        {"_score": "asc"}
    ],
    "size": 1
}
response = requests.get('http://localhost:9200/meteo/meteo/_search', data=json.dumps(query))
The custom calculate-score.groovy script contains the following:
abs(new java.text.SimpleDateFormat("yyyy-MM-dd\'T\'HH:mm").parse(stamp).getTime() - doc["stamp"].date.getMillis()) / 60000
The script returns the score as the absolute difference in minutes between the document date-time and the requested date-time.
Is there any other way to achieve this?
You should be able to use function_score to do this.
You could use the decay functions mentioned in the documentation to give a larger score to documents closer to the origin timestamp. Below is an example where scale = 28800 minutes, i.e. 20 days.
Example:
put test

put test/test/_mapping
{
  "properties": {
    "stamp": {
      "type": "date",
      "format": "dateOptionalTime"
    }
  }
}

put test/test/1
{
  "stamp": "2015-10-15T00:00"
}

put test/test/2
{
  "stamp": "2015-10-15T12:00"
}

post test/_search
{
  "query": {
    "function_score": {
      "functions": [
        {
          "linear": {
            "stamp": {
              "origin": "now",
              "scale": "28800m"
            }
          }
        }
      ],
      "score_mode": "multiply",
      "boost_mode": "multiply",
      "query": {
        "match_all": {}
      }
    }
  }
}
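For reference, a hedged sketch of how this could be plugged into the Python code from the question, combining the linear decay on stamp with the original geo_distance filter (the payload values are made up for illustration; note that with a decay function a higher score means closer, so the sort direction flips to descending):

import json
import requests

# Example payload, standing in for what the question reads from self.request.body.
body = {"location": {"lat": 40.4, "lon": -3.7}, "stamp": "2015-10-15T12:00"}

query = {
    "size": 1,
    "query": {
        "function_score": {
            "boost_mode": "replace",
            "query": {
                "filtered": {
                    "query": {"match_all": {}},
                    "filter": {
                        "geo_distance": {
                            "distance": "10km",
                            "location": body["location"]
                        }
                    }
                }
            },
            "functions": [
                {"linear": {"stamp": {"origin": body["stamp"], "scale": "28800m"}}}
            ]
        }
    },
    "sort": [{"_score": "desc"}]
}

response = requests.get('http://localhost:9200/meteo/meteo/_search', data=json.dumps(query))
print(response.json())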

Elasticsearch PHP longest prefix match

I am currently using the FOSElasticaBundle in Symfony2 and I am having a hard time trying to build a search to match the longest prefix.
I am aware of the 100 examples that are on the Internet to perform autocomplete-like searches using this. However, my problem is a little different.
In an autocomplete type of search the database holds the longest alphanumeric string (in length of characters) and the user just provides the shortest portion, let's say the user types "jho" and Elasticsearch can easily provide "Jhon, Jhonny, Jhonas".
My problem is backwards, I would like to provide the longest alphanumeric string and I want Elasticsearch to provide me the biggest match in the database.
For example: I could provide "123456789" and my database can have [12,123,14,156,16,7,1234,1,67,8,9,123456,0], in this case the longest prefix match in the database for the number that the user provided is "123456".
I am just starting with Elasticsearch so I don't really have a close to working settings or anything.
If there is any information not clear or missing let me know and I will provide more details.
Update 1 (Using Val's 2nd Update)
Index: Download 1800+ indexes
Settings:
curl -XPUT localhost:9200/tests -d '{
  "settings": {
    "analysis": {
      "analyzer": {
        "edge_ngram_analyzer": {
          "tokenizer": "edge_ngram_tokenizer",
          "filter": [ "lowercase" ]
        }
      },
      "tokenizer": {
        "edge_ngram_tokenizer": {
          "type": "edgeNGram",
          "min_gram": "2",
          "max_gram": "25"
        }
      }
    }
  },
  "mappings": {
    "test": {
      "properties": {
        "my_string": {
          "type": "string",
          "fields": {
            "prefix": {
              "type": "string",
              "analyzer": "edge_ngram_analyzer"
            }
          }
        }
      }
    }
  }
}'
Query:
curl -XPOST localhost:9200/tests/test/_search?pretty=true -d '{
  "size": 1,
  "sort": {
    "_script": {
      "script": "doc.my_string.value.length()",
      "type": "number",
      "order": "desc"
    },
    "_score": "desc"
  },
  "query": {
    "filtered": {
      "query": {
        "match": {
          "my_string.prefix": "8092232423"
        }
      },
      "filter": {
        "script": {
          "script": "doc.my_string.value.length() <= maxlength",
          "params": {
            "maxlength": 10
          }
        }
      }
    }
  }
}'
With this configuration the query returns the following results:
{
  "took" : 61,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 1754,
    "max_score" : null,
    "hits" : [ {
      "_index" : "tests",
      "_type" : "test",
      "_id" : "AU8LqQo4FbTZPxBtq3-Q",
      "_score" : 0.13441172,
      "_source" : {"my_string":"80928870"},
      "sort" : [ 8.0, 0.13441172 ]
    } ]
  }
}
Bonus question
I would like to provide an array of numbers for that search and get the matching prefix for each one in an efficient way without having to perform the query each time
Here is my take at it.
Basically, what we need to do is to slice and dice the field (called my_string below) at indexing time with an edgeNGram tokenizer (called edge_ngram_tokenizer below). That way a string like 123456789 will be tokenized to 12, 123, 1234, 12345, 123456, 1234567, 12345678, 123456789 and all tokens will be indexed and searchable.
So let's create a tests index, a custom analyzer called edge_ngram_analyzer analyzer and a test mapping containing a single string field called my_string. You'll note that the my_string field is a multi-field declaring a prefixes sub-field which will contain all the tokenized prefixes.
curl -XPUT localhost:9200/tests -d '{
  "settings": {
    "analysis": {
      "analyzer": {
        "edge_ngram_analyzer": {
          "tokenizer": "edge_ngram_tokenizer",
          "filter": [ "lowercase" ]
        }
      },
      "tokenizer": {
        "edge_ngram_tokenizer": {
          "type": "edgeNGram",
          "min_gram": "2",
          "max_gram": "25"
        }
      }
    }
  },
  "mappings": {
    "test": {
      "properties": {
        "my_string": {
          "type": "string",
          "fields": {
            "prefixes": {
              "type": "string",
              "index_analyzer": "edge_ngram_analyzer"
            }
          }
        }
      }
    }
  }
}'
Then let's index a few test documents using the _bulk API:
curl -XPOST localhost:9200/tests/test/_bulk -d '
{"index":{}}
{"my_string":"12"}
{"index":{}}
{"my_string":"1234"}
{"index":{}}
{"my_string":"1234567890"}
{"index":{}}
{"my_string":"abcd"}
{"index":{}}
{"my_string":"abcdefgh"}
{"index":{}}
{"my_string":"123456789abcd"}
{"index":{}}
{"my_string":"abcd123456789"}
'
The thing that I found particularly tricky was that the matching result could be either longer or shorter than the input string. To achieve that we have to combine two queries, one looking for shorter matches and another for longer matches. So the match query will find documents with shorter "prefixes" matching the input and the query_string query (with the edge_ngram_analyzer applied on the input string!) will search for "prefixes" longer than the input string. Both enclosed in a bool/should and sorted by a decreasing string length (i.e. longest first) will do the trick.
Let's do some queries and see what unfolds:
This query will return the one document with the longest match for "123456789", i.e. "123456789abcd". In this case, the result is longer than the input.
curl -XPOST localhost:9200/tests/test/_search -d '{
  "size": 1,
  "sort": {
    "_script": {
      "script": "doc.my_string.value.length()",
      "type": "number",
      "order": "desc"
    }
  },
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "my_string.prefixes": "123456789"
          }
        },
        {
          "query_string": {
            "query": "123456789",
            "default_field": "my_string.prefixes",
            "analyzer": "edge_ngram_analyzer"
          }
        }
      ]
    }
  }
}'
The second query will return the one document with the longest match for "123456789abcdef", i.e. "123456789abcd". In this case, the result is shorter than the input.
curl -XPOST localhost:9200/tests/test/_search -d '{
  "size": 1,
  "sort": {
    "_script": {
      "script": "doc.my_string.value.length()",
      "type": "number",
      "order": "desc"
    }
  },
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "my_string.prefixes": "123456789abcdef"
          }
        },
        {
          "query_string": {
            "query": "123456789abcdef",
            "default_field": "my_string.prefixes",
            "analyzer": "edge_ngram_analyzer"
          }
        }
      ]
    }
  }
}'
I hope that covers it. Let me know if not.
As for your bonus question, I'd simply suggest using the _msearch API and sending all queries at once.
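A hedged sketch of that with _msearch, using Python and requests purely for illustration (host, index and field names follow the examples above; the per-number query is simplified to the match part):

import json
import requests

numbers = ["123456789", "8092232423", "42"]

# _msearch takes newline-delimited pairs of header and query objects.
lines = []
for n in numbers:
    lines.append(json.dumps({"index": "tests", "type": "test"}))
    lines.append(json.dumps({
        "size": 1,
        "sort": {"_script": {"script": "doc.my_string.value.length()",
                             "type": "number", "order": "desc"}},
        "query": {"match": {"my_string.prefixes": n}}
    }))
body = "\n".join(lines) + "\n"

resp = requests.get("http://localhost:9200/_msearch", data=body)
# One response per input number, in the same order they were sent.
for item in resp.json()["responses"]:
    hits = item["hits"]["hits"]
    print(hits[0]["_source"]["my_string"] if hits else None)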
UPDATE: Finally, make sure that scripting is enabled in your elasticsearch.yml file using the following:
# if you have ES <1.6
script.disable_dynamic: false
# if you have ES >=1.6
script.inline: on
UPDATE 2 I'm leaving the above as the use case might fit someone else's needs. Now, since you only need "shorter" prefixes (makes sense !!), we need to change the mapping a little bit and the query.
The mapping would be like this:
{
"settings": {
"analysis": {
"analyzer": {
"edge_ngram_analyzer": {
"tokenizer": "edge_ngram_tokenizer",
"filter": [
"lowercase"
]
}
},
"tokenizer": {
"edge_ngram_tokenizer": {
"type": "edgeNGram",
"min_gram": "2",
"max_gram": "25"
}
}
}
},
"mappings": {
"test": {
"properties": {
"my_string": {
"type": "string",
"fields": {
"prefixes": {
"type": "string",
"analyzer": "edge_ngram_analyzer" <--- only change
}
}
}
}
}
}
}
And the query would now be a bit different; it will always return only the longest prefix that is shorter than or equal in length to the input string. Please try it out. I advise re-indexing your data to make sure everything is set up properly.
{
  "size": 1,
  "sort": {
    "_script": {
      "script": "doc.my_string.value.length()",
      "type": "number",
      "order": "desc"
    },
    "_score": "desc"                   <----- also add this line
  },
  "query": {
    "filtered": {
      "query": {
        "match": {
          "my_string.prefixes": "123"  <--- input string
        }
      },
      "filter": {
        "script": {
          "script": "doc.my_string.value.length() <= maxlength",
          "params": {
            "maxlength": 3             <---- this needs to be set to the length of the input string
          }
        }
      }
    }
  }
}
