DocumentDB - WHERE clause within collection - azure-cosmosdb

Given a sample document like this one from the Microsoft examples:
{
"id": "AndersenFamily",
"lastName": "Andersen",
"parents": [
{ "firstName": "Thomas" },
{ "firstName": "Mary Kay"}
],
"children": [
{
"firstName": "Henriette Thaulow",
"gender": "female",
"grade": 5,
"pets": [{ "givenName": "Fluffy" }]
}
],
"address": { "state": "WA", "county": "King", "city": "seattle" },
"creationDate": 1431620472,
"isRegistered": true
}
We can see that there is a sub-collection children containing an array of documents.
Let's say I wanted to write a query using the SELECT ... FROM ... WHERE ... syntax. How would I write a query to find families with any daughters (any children with gender "female")?
So something like
SELECT c.id
FROM c
WHERE c.children.contains( // I'm stuck!
I'm wondering if I'm missing a JOIN or something, but honestly I'm not sure where to go from here, and I'm struggling to find anything helpful on Google, partly because I'm not sure how to phrase my search!

You need the JOIN keyword to unwind the children array, then apply the filter on gender, as in the query below:
SELECT f.id
FROM family f
JOIN child IN f.children
WHERE child.gender = "female"
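To make the effect of JOIN ... IN concrete, here is a minimal JavaScript sketch that mimics it with nested loops on in-memory data. It is only an illustration of the semantics, not how Cosmos DB executes the query; the function name joinDaughters and the second sample family are hypothetical.

```javascript
// Sample data: the first family comes from the question, the second is made up.
const sampleFamilies = [
  {
    id: "AndersenFamily",
    children: [{ firstName: "Henriette Thaulow", gender: "female", grade: 5 }],
  },
  {
    id: "HypotheticalFamily",
    children: [
      { firstName: "A", gender: "female" },
      { firstName: "B", gender: "female" },
      { firstName: "C", gender: "male" },
    ],
  },
];

// Mimics: SELECT f.id FROM family f JOIN child IN f.children WHERE child.gender = "female"
function joinDaughters(families) {
  const rows = [];
  for (const f of families) {
    // JOIN child IN f.children: one candidate row per (family, child) pair.
    for (const child of f.children || []) {
      if (child.gender === "female") {
        rows.push({ id: f.id });
      }
    }
  }
  return rows;
}

console.log(joinDaughters(sampleFamilies));
```

Because the join produces one row per matching child, a family with two daughters appears twice in the result. If each family should appear only once, SELECT DISTINCT f.id in the query is one way to collapse the duplicates.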

Related

Create index on nested array value with dynamodb

I have the following data stored in a DynamoDB table called elo-history.
{
"gameId": "chess",
"guildId": "abc123",
"id": "c3c640e2d8b76b034605d8835a03bef8",
"recordedAt": 1621095861673,
"results": [
{
"oldEloRating": null,
"newEloRating": 2010,
"place": 1,
"playerIds": [
"abc1"
]
},
{
"oldEloRating": null,
"newEloRating": 1990,
"place": 2,
"playerIds": [
"abc2"
]
}
],
"versus": "1v1"
}
I have 2 indexes, guildId-recordedAt-index and gameId-recordedAt-index. These allow me to query on those fields.
I am trying to add another index for results[].playerIds[]. I want to be able to query for records with playerId=abc1 and have those sorted just like guildId and gameId. Does DynamoDB support something like this? Do I need to restructure the data or save it in two different formats to support this type of query?
Something like this.
New table called player-elo-history in addition to the elo-history table. This would store the list of games by playerId
{
"id": "abc1",
"gameId": "chess",
"guildId": "abc123",
"recordedAt": 1621095861673,
"results": [
[
{
"oldEloRating": null,
"newEloRating": 2010,
"place": 1,
"playerIds": [
"abc1"
]
},
{
"oldEloRating": null,
"newEloRating": 1990,
"place": 2,
"playerIds": [
"abc2"
]
}
]
]
}
{
"id": "abc2",
"gameId": "chess",
"guildId": "abc123",
"recordedAt": 1621095861673,
"results": [
[
{
"oldEloRating": null,
"newEloRating": 2010,
"place": 1,
"playerIds": [
"abc1"
]
},
{
"oldEloRating": null,
"newEloRating": 1990,
"place": 2,
"playerIds": [
"abc2"
]
}
]
]
}
It looks like you're modeling the one-to-many relationship between Games and Results using a complex attribute (e.g. a list of objects) on the Game item. This is a completely valid approach to modeling one-to-many relationships and works best when 1) the results data doesn't change (or changes rarely) and 2) you don't have any access patterns around Results.
Since it sounds like you do have access patterns around Results, you'd be better off storing your Results in their own items.
For example, you might consider modeling results in the user partition with PK=USER#user_id and SK=RESULT#game_id. This would allow you to fetch results by user ID (Query where PK=USER#user_id and SK begins_with RESULT). Alternatively, you could model results with PK=RESULT#game_id and SK=USER#user_id and create a GSI that swaps the PK and SK, which would allow you to group results by user.
I don't know the specifics around your access patterns, but can say that you'll need to move results into their own items if you want to support access patterns around game results.
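As a rough sketch of the first option, the function below splits one game record into one item per player, keyed as PK=USER#user_id and SK=RESULT#game_id. It is plain JavaScript with no SDK calls. The function name toResultItems is hypothetical, and using the record's id field as the game identifier in the sort key is an assumption, not something the question confirms.

```javascript
// Hypothetical helper: turn one elo-history record into per-player result items.
function toResultItems(game) {
  const items = [];
  for (const result of game.results) {
    for (const playerId of result.playerIds) {
      items.push({
        PK: `USER#${playerId}`,
        // Assumption: the record's unique id serves as the game identifier.
        SK: `RESULT#${game.id}`,
        gameId: game.gameId,
        guildId: game.guildId,
        recordedAt: game.recordedAt,
        place: result.place,
        oldEloRating: result.oldEloRating,
        newEloRating: result.newEloRating,
      });
    }
  }
  return items;
}

// The sample record from the question.
const sampleGame = {
  gameId: "chess",
  guildId: "abc123",
  id: "c3c640e2d8b76b034605d8835a03bef8",
  recordedAt: 1621095861673,
  results: [
    { oldEloRating: null, newEloRating: 2010, place: 1, playerIds: ["abc1"] },
    { oldEloRating: null, newEloRating: 1990, place: 2, playerIds: ["abc2"] },
  ],
  versus: "1v1",
};

console.log(toResultItems(sampleGame));
```

If results need to come back ordered by time, as with the existing indexes, the sort key (or the sort key of a GSI) would have to carry recordedAt instead of, or in front of, the game identifier. That is a design choice this sketch leaves open.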

How can I query nested arrays in Data Explorer?

I am trying to write a query in Data Explorer over a Cosmos DB to give me a list of results where the order has a discount applied. That requires that I examine every element of the Totals array for a Discounts element that is not empty.
I've tried to use ARRAY_LENGTH within ARRAY_CONTAINS as shown below, but that didn't return a result set. I know ARRAY_CONTAINS is used to look for a field value within an array, but I was hoping that it would accept an ARRAY_LENGTH expression.
SELECT * FROM c where ARRAY_CONTAINS(c.OrderHeader.Totals,{ARRAY_LENGTH(Discounts):1},true))
I've also tried to check for a value in CampaignId field of the Discounts array using the following query. It didn't return a result set.
SELECT * FROM c where ARRAY_CONTAINS(c.OrderHeader.Totals.Discounts,{CampaignId:null},false)
I would assume there's a way to do this, so any input would be greatly appreciated!
{
"OrderHeader": {
"Totals": [
{
"Currency": "CAD",
"Price": 10.00,
"Discounts": []
},
{
"Currency": "CAD",
"Price": 20.00,
"Discounts": []
},
{
"Currency": "CAD",
"Price": 30.00,
"Discounts": [
{
"CampaignId": "Campaign2",
"CouponDefinition": null
}
]
}
]
}
}
Please try this sql:
SELECT t.Currency, t.Price, t.Discounts
FROM c
JOIN t IN c.OrderHeader.Totals
WHERE ARRAY_LENGTH(t.Discounts) > 0
Here is the result:
[
{
"Currency": "CAD",
"Price": 30,
"Discounts": [
{
"CampaignId": "Campaign2",
"CouponDefinition": null
}
]
}
]
Hope it can help you.
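Note that the query above returns one row per matching Totals entry rather than one row per order. If whole orders are wanted, the per-order condition is "at least one Totals entry has a non-empty Discounts array". A minimal client-side JavaScript sketch of that check follows, useful for reasoning about the expected results rather than as a replacement for the query. The function name hasDiscount and the second sample order are hypothetical.

```javascript
// True when at least one entry in OrderHeader.Totals has a non-empty Discounts array.
function hasDiscount(order) {
  const totals = (order.OrderHeader && order.OrderHeader.Totals) || [];
  return totals.some(
    (t) => Array.isArray(t.Discounts) && t.Discounts.length > 0
  );
}

// First order mirrors the sample in the question; the second is made up.
const discountedOrder = {
  OrderHeader: {
    Totals: [
      { Currency: "CAD", Price: 10.0, Discounts: [] },
      { Currency: "CAD", Price: 20.0, Discounts: [] },
      {
        Currency: "CAD",
        Price: 30.0,
        Discounts: [{ CampaignId: "Campaign2", CouponDefinition: null }],
      },
    ],
  },
};
const plainOrder = {
  OrderHeader: { Totals: [{ Currency: "CAD", Price: 5.0, Discounts: [] }] },
};

console.log([discountedOrder, plainOrder].filter(hasDiscount).length);
```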

Cosmos DB query strings in array retain grouping and trim the value?

Say we have two sets of data in my collection:
{
"id": "111",
"linkedId": [
"ABC:123",
"ABC:456"
]
}
{
"id": "222",
"linkedId": [
"DEF:321",
"DEF:654"
]
}
What query can I run to get a result that will look like this?
{
[
"123",
"456"
]
},
{
[
"321",
"654"
]
}
I have tried
SELECT c.linkedId FROM c
But this has the "linkedId" as the property name in the result set. And I tried LEFT but it doesn't trim first 4 characters of the string.
Then I tried
SELECT value cc FROM cc In c.linkedId
But this loses the grouping.
Any idea?
Since the elements are plain strings rather than JSON objects, I suggest using a UDF in the Cosmos DB SQL query.
UDF:
function userDefinedFunction(arr){
var returnArr = [];
for(var i=0;i<arr.length;i++){
returnArr.push(arr[i].substring(4,7));
}
return returnArr;
}
SQL:
SELECT value udf.test(c.linkedId) FROM c
OUTPUT (expected for the two sample documents above):
[
["123", "456"],
["321", "654"]
]
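The UDF above relies on fixed offsets (substring(4,7)), which fits the sample data but would truncate values whose prefix or suffix has a different length. A hedged variant that splits on the first ":" instead is sketched below. The name stripPrefixes is hypothetical, and the body is ordinary JavaScript that can be tried locally before being registered as a UDF.

```javascript
// Returns a new array with everything up to and including the first ":" removed
// from each string; strings without a ":" are returned unchanged.
function stripPrefixes(arr) {
  var result = [];
  for (var i = 0; i < arr.length; i++) {
    var value = arr[i];
    var idx = value.indexOf(":");
    result.push(idx === -1 ? value : value.substring(idx + 1));
  }
  return result;
}

console.log(stripPrefixes(["ABC:123", "ABC:456"]));
console.log(stripPrefixes(["DEF:321", "DEF:654"]));
```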

Cosmos DB - can't use a property (array) with IN Keyword

I'm attempting to query for several documents at once, using some properties that were found on the first document, similar to a TSQL left join on property value.
My attempt in CosmosDB:
select c from assets
join ver on c.versions
where c.id = '123' OR c.id IN ver.otherIds
--NOTE: ver.otherIds is an array
The query above results in a syntax error, stating it doesn't understand ver.otherIds. The docs state the syntax to be where c.id in ("123","456"...)
Things I've tried to work around this:
Attempted a custom UDF that takes in the array and generates the wanted syntax, e.g. ["123","456"] --> ("123", "456")
Attempted using array_contains(ver.otherIds, c.id)
Attempted sub query approach, which produced a "The cardinality of a scalar subquery result set cannot be greater than one" error:
select value c from c
where array_contains((select ... that produces array), c.id)
None of the above worked.
I can, of course, pull the first asset, then generate a second query to pull the rest, but I'd rather not do that. I can also just de-normalize all the data, but without giving specifics to my scenario, it would end up being a very bad idea.
Any ideas?
Thanks in advance!
You could use your second approach, ARRAY_CONTAINS.
My sample document:
[
{
"id": "1",
"versions": [
{
"otherIds": [
"1",
"2",
"3"
]
}
]
},
{
"id": "2",
"versions": [
{
"otherIds": [
"1",
"2",
"3"
]
},
{
"otherIds": [
"123",
"2",
"3"
]
}
]
},
{
"id": "123",
"versions": [
{
"otherIds": [
"1",
"2",
"3"
]
},
{
"otherIds": [
"123",
"2",
"3"
]
}
]
}
]
SQL:
SELECT distinct c.id,c.versions FROM c
join ver in c.versions
where c.id="123" or array_contains(ver.otherIds,c.id,false)
The optional third argument of ARRAY_CONTAINS specifies whether the match is partial (true) or full (false).
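To illustrate the difference between full and partial matching, here is a rough JavaScript emulation. It is a simplification that only handles scalars and flat objects, and it is not the actual Cosmos DB implementation; the function name arrayContains is hypothetical.

```javascript
// Rough emulation of ARRAY_CONTAINS(arr, needle, partial) for scalars and flat objects.
function arrayContains(arr, needle, partial) {
  // Simplification: treat a non-array input as "no match".
  if (!Array.isArray(arr)) return false;
  const isObject = (v) =>
    v !== null && typeof v === "object" && !Array.isArray(v);
  return arr.some((item) => {
    if (!isObject(needle) || !isObject(item)) {
      // Scalars: strict equality. Nested arrays are not handled in this sketch.
      return item === needle;
    }
    const needleKeys = Object.keys(needle);
    const allMatch = needleKeys.every((k) => item[k] === needle[k]);
    // Partial: the element only has to contain the needle's properties.
    if (partial) return allMatch;
    // Full: the element must have exactly the needle's properties.
    return allMatch && Object.keys(item).length === needleKeys.length;
  });
}

// Strings, as in the otherIds arrays above: the partial flag makes no difference.
console.log(arrayContains(["123", "2", "3"], "123", false));
// Objects: the flag decides whether a subset of properties is enough.
console.log(arrayContains([{ name: "apples", fresh: true }], { name: "apples" }, true));
console.log(arrayContains([{ name: "apples", fresh: true }], { name: "apples" }, false));
```

For arrays of plain strings such as otherIds, partial matching has no effect, which is why passing false in the query above is sufficient.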

Combine orderByPriority with equalTo

I have a dataset like this:
[
{
"projectId": "fdsFDSFaSdA",
"teamId": "ASDasdASDsada"
...
},
{
"projectId": "DSF432afdsf",
"teamId": "fdsASfsdasdd"
...
},
...
]
I need to select objects from this list sometimes by projectId, sometimes by teamId.
I know that this is possible like so:
ref.orderByChild('key').equalTo(value)
But the problem is that I need to order by priority.
I see that equalTo() takes two parameters:
equalTo(value, [key])
I tried like so but it doesn't work:
ref.orderByPriority().equalTo(value, 'key')
How can I make it work?
