Cosmos DB - can't use a property (array) with IN Keyword - azure-cosmosdb

I'm attempting to query for several documents at once, using some properties that were found on the first document, similar to a TSQL left join on property value.
My attempt in CosmosDB:
select c from assets
join ver on c.versions
where c.id = '123' OR c.id IN ver.otherIds
--NOTE: ver.otherIds is an array
The query above results in a syntax error stating it doesn't understand ver.otherIds. The docs give the syntax as where c.id in ("123", "456", ...).
Things I've tried to work around this:
Attempted a custom UDF that takes in the array and generates the wanted syntax, e.g. ["123", "456"] --> ("123", "456")
Attempted using array_contains(ver.otherIds, c.id)
Attempted a subquery approach, which produced a "The cardinality of a scalar subquery result set cannot be greater than one" error:
select value c from c
where array_contains((select ... that produces array), c.id)
None of the above worked.
I can, of course, pull the first asset and then issue a second query to pull the rest, but I'd rather not do that. I could also de-normalize all the data, but without going into the specifics of my scenario, that would end up being a very bad idea.
Any ideas?
Thanks in advance!

You could use your second approach: ARRAY_CONTAINS.
My sample documents:
[
    {
        "id": "1",
        "versions": [
            { "otherIds": ["1", "2", "3"] }
        ]
    },
    {
        "id": "2",
        "versions": [
            { "otherIds": ["1", "2", "3"] },
            { "otherIds": ["123", "2", "3"] }
        ]
    },
    {
        "id": "123",
        "versions": [
            { "otherIds": ["1", "2", "3"] },
            { "otherIds": ["123", "2", "3"] }
        ]
    }
]
SQL:
SELECT DISTINCT c.id, c.versions FROM c
JOIN ver IN c.versions
WHERE c.id = "123" OR ARRAY_CONTAINS(ver.otherIds, c.id, false)
The third parameter of ARRAY_CONTAINS specifies whether the match against the array elements must be full (false, the default) or may be partial (true).
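If you are running this from application code, a minimal sketch with the azure-cosmos Python SDK might look like the following; the endpoint, key, database, and container names are placeholders, not values from the question.
from azure.cosmos import CosmosClient

# Placeholder endpoint/key/database/container - substitute your own.
client = CosmosClient("https://<account>.documents.azure.com:443/", credential="<key>")
container = client.get_database_client("<database>").get_container_client("<container>")

query = (
    "SELECT DISTINCT c.id, c.versions FROM c "
    "JOIN ver IN c.versions "
    "WHERE c.id = @id OR ARRAY_CONTAINS(ver.otherIds, c.id, false)"
)
items = container.query_items(
    query=query,
    parameters=[{"name": "@id", "value": "123"}],
    enable_cross_partition_query=True,
)
for item in items:
    print(item["id"])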

Related

Terraform: Add item to a DynamoDB Table

What is the correct way to add tuple and key-value pair items to a DynamoDB table via Terraform?
I am trying it like this:
resource "aws_dynamodb_table_item" "item" {
table_name = aws_dynamodb_table.dynamodb-table.name
hash_key = aws_dynamodb_table.dynamodb-table.hash_key
for_each = {
"0" = {
location = "Madrid"
coordinates = [["lat", "40.49"], ["lng", "-3.56"]]
visible = false
destinations = [0, 4]
}
}
item = <<ITEM
{
"id": { "N": "${each.key}"},
"location": {"S" : "${each.value.location}"},
"visible": {"B" : "${each.value.visible}"},
"destinations": {"L" : [{"N": "${each.value.destinations}"}]
}
ITEM
}
And I am getting the message:
each.value.destinations is tuple with 2 elements
│
│ Cannot include the given value in a string template: string required.
I also have no clue on how to add the coordinates variable.
Thanks!
The list should look something like this:
"destinations": {"L": [{ "N": "1" }, { "N": "2" }]}
You are trying to pass
"destinations": {"L": [{ "N": [0,4] }]}
Also, you are missing the closing } on the destinations key.
TL;DR: I think the problem here is that you are trying to put L(N), i.e. a list of numeric values, while your current Terraform code tries to put all the destinations into one N/number.
Instead of:
[{"N": "${each.value.destinations}"}]
you need to iterate over destinations and build a {"N": ...} entry for each of them.
"destinations": {"NS": ${jsonencode(each.value.destinations)}}
Did the trick!
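For comparison, here is a sketch of the same item written with the low-level boto3 client in Python, which makes the attribute shapes explicit; the table name and the map used for the coordinates are assumptions, not part of the original question.
import boto3

dynamodb = boto3.client("dynamodb")  # assumes credentials and region are already configured
dynamodb.put_item(
    TableName="my-destinations-table",  # hypothetical table name
    Item={
        "id": {"N": "0"},
        "location": {"S": "Madrid"},
        "visible": {"BOOL": False},  # DynamoDB booleans use BOOL; B is the binary type
        # A list of numbers: each element gets its own {"N": "..."} wrapper.
        "destinations": {"L": [{"N": "0"}, {"N": "4"}]},
        # The accepted workaround stores a number set instead, which is flat:
        # "destinations": {"NS": ["0", "4"]},
        # One way to store the coordinates is a map attribute.
        "coordinates": {"M": {"lat": {"N": "40.49"}, "lng": {"N": "-3.56"}}},
    },
)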

Cosmos DB query strings in array retain grouping and trim the value?

Say we have two sets of data in my collection:
{
    "id": "111",
    "linkedId": ["ABC:123", "ABC:456"]
}
{
    "id": "222",
    "linkedId": ["DEF:321", "DEF:654"]
}
What query can I run to get a result that will look like this?
{
    ["123", "456"]
},
{
    ["321", "654"]
}
I have tried
SELECT c.linkedId FROM c
But this keeps "linkedId" as the property name in the result set. And I tried LEFT, but that doesn't trim the first 4 characters of the string.
Then I tried
SELECT value cc FROM cc In c.linkedId
But this loses the grouping.
Any idea?
Since the elements are just strings, not JSON objects, I suggest using a UDF in your Cosmos DB SQL query.
UDF:
function userDefinedFunction(arr) {
    var returnArr = [];
    for (var i = 0; i < arr.length; i++) {
        // Keep the 3 characters after the 4-character prefix (e.g. "ABC:123" -> "123").
        // substring(4) would keep everything after the prefix if the ids vary in length.
        returnArr.push(arr[i].substring(4, 7));
    }
    return returnArr;
}
SQL:
SELECT value udf.test(c.linkedId) FROM c
OUTPUT: ["123", "456"] for the first document and ["321", "654"] for the second.
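To wire this up from code, a sketch with the azure-cosmos Python SDK could register the UDF under the id test and then run the query; the endpoint, key, database, and container names are placeholders.
from azure.cosmos import CosmosClient

client = CosmosClient("https://<account>.documents.azure.com:443/", credential="<key>")
container = client.get_database_client("<database>").get_container_client("<container>")

# Register the UDF under the id "test" so the query can call udf.test(...).
container.scripts.create_user_defined_function({
    "id": "test",
    "body": """
function userDefinedFunction(arr) {
    var returnArr = [];
    for (var i = 0; i < arr.length; i++) {
        returnArr.push(arr[i].substring(4, 7));
    }
    return returnArr;
}
""",
})

results = list(container.query_items(
    query="SELECT VALUE udf.test(c.linkedId) FROM c",
    enable_cross_partition_query=True,
))
print(results)  # expected: [['123', '456'], ['321', '654']]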

DocumentDB - WHERE clause within collection

Given a sample document like this one from the Microsoft examples:
{
    "id": "AndersenFamily",
    "lastName": "Andersen",
    "parents": [
        { "firstName": "Thomas" },
        { "firstName": "Mary Kay" }
    ],
    "children": [
        {
            "firstName": "Henriette Thaulow",
            "gender": "female",
            "grade": 5,
            "pets": [{ "givenName": "Fluffy" }]
        }
    ],
    "address": { "state": "WA", "county": "King", "city": "seattle" },
    "creationDate": 1431620472,
    "isRegistered": true
}
We can see that there is a sub-collection children containing an array of documents.
Let's say I wanted to write a query using the SELECT ... FROM ... WHERE ... style syntax. How would I go about writing a query to find families with any daughters (i.e. any children with gender "female")?
So something like
SELECT c.id
FROM c
WHERE c.children.contains( // I'm stuck!
I'm wondering if I'm missing a JOIN or something, but honestly I'm not sure where to go from here, and I'm struggling to find anything helpful on Google, partly because I'm not sure how to phrase my search!
You need the JOIN keyword to unwind the children array and then apply the filter on gender, as in the query below:
SELECT f.id
FROM family f
JOIN child IN f.children
WHERE child.gender = "female"

simple jq filter has null results

I'm using the filter
[.bar_1.baz_a, .bar_1.baz_b, .bar_2.qux_1,.bar_2.qux_2]
on the following JSON, and it's returning four nulls instead of two lines, each having four elements of nonsense data. This is my first attempt at a filter; what concept am I not comprehending?
{
    "version": "0.1",
    "foos": [
        {
            "bar_1": {
                "baz_a": 673396201,
                "baz_b": "dfgsfg"
            },
            "bar_2": {
                "qux_1": "ghjhj",
                "qux_2": "Q"
            }
        },
        {
            "bar_1": {
                "baz_a": 674567484,
                "baz_b": "tyutyj"
            },
            "bar_2": {
                "qux_1": "bnmn",
                "qux_2": "Z"
            }
        }
    ]
}
The root object doesn't have keys bar_1 and bar_2; those occur in the objects in the array stored under the key foos. Compare your filter to:
jq '.foos[] | [.bar_1.baz_a, .bar_1.baz_b, .bar_2.qux_1,.bar_2.qux_2]' tmp.json

Query to get exact matches of Elastic Field with multiple values in Array

I want to write a query in Elastic that applies a filter based on values I have in an array (in my R program). Essentially, the query:
Matches a time range (the time field in Elastic)
Matches the "trackId" field in Elastic to any value in the array oth_usr
Returns 2 fields: "trackId", "propertyId"
I have the following primitive version of the query but do not know how to use the oth_usr array in a query (part 2 above).
query <- sprintf('{"query":{"range":{"time":{"gte":"%s","lte":"%s"}}}}',start_date,end_date)
view_list <- elastic::Search(index = "organised_recent",type = "PROPERTY_VIEW",size = 10000000,
body=query, fields = c("trackId", "propertyId"))$hits$hits
You need to add a terms query and embed it as well as the range one into a bool/must query. Try updating your query like this:
terms <- paste(sprintf("\"%s\"", oth_usr), collapse=", ")
query <- sprintf('{"query":{"bool":{"must":[{"terms": {"trackId": [%s]}},{"range": {"time": {"gte": "%s","lte": "%s"}}}]}}}',terms,start_date,end_date)
I'm not fluent in R syntax, but this is a raw JSON query that works.
It checks whether your time field matches the given range (start_time and end_time) and whether one of your terms exactly matches trackId.
It returns only the trackId and propertyId fields, as requested:
POST /indice/_search
{
    "_source": {
        "include": ["trackId", "propertyId"]
    },
    "query": {
        "bool": {
            "must": [
                {
                    "range": {
                        "time": {
                            "gte": "start_time",
                            "lte": "end_time"
                        }
                    }
                },
                {
                    "terms": {
                        "trackId": ["terms"]
                    }
                }
            ]
        }
    }
}
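If you ever need to issue the same request outside of R, here is a sketch using the official Python client (7.x-style API); the host and the example oth_usr values are placeholders, not taken from the question.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # placeholder host
oth_usr = ["track-1", "track-2"]             # hypothetical trackId values

resp = es.search(
    index="organised_recent",
    body={
        "_source": {"include": ["trackId", "propertyId"]},
        "query": {
            "bool": {
                "must": [
                    {"range": {"time": {"gte": "start_time", "lte": "end_time"}}},
                    {"terms": {"trackId": oth_usr}},
                ]
            }
        },
    },
)
for hit in resp["hits"]["hits"]:
    print(hit["_source"])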
