Need to flatten JSON array in Azure Cosmos DB - azure-cosmosdb

I have a simple JSON document:
{"ProductUID":1100,"Color":"White","tags":["toy","children","games"]}
I want the output to be:
[
{
"ProductUID": 1100,
"Color":"White",
"tags":"toy"
},
{
"ProductUID": 1100,
"Color": "White",
"tags": "children"
},
{
"ProductUID": 1100,
"Color": "White",
"tags": "games"
}
]
I have tried this query in Cosmos DB, but it doesn't split the tags array into separate rows:
SELECT P.ProductUID, P.tags
FROM Products P
join C in P.tags

Welcome to StackOverflow!
This query should project what you need:
SELECT P.ProductUID, P.Color, C AS tags
FROM Products P
join C in P.tags
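To see why the JOIN produces one row per tag, here is a minimal plain-JavaScript sketch that emulates the query in memory. It is only an illustration, not Cosmos DB code: flattenTags is a hypothetical helper name, and the sample document is the one from the question.

```javascript
// Emulates: SELECT P.ProductUID, P.Color, C AS tags FROM Products P JOIN C IN P.tags
// flattenTags is a hypothetical helper, not part of any Cosmos DB SDK.
function flattenTags(products) {
  // Pair each document with every element of its own tags array,
  // which mirrors the per-document cross product that JOIN ... IN performs.
  return products.flatMap((p) =>
    (p.tags || []).map((c) => ({
      ProductUID: p.ProductUID,
      Color: p.Color,
      tags: c,
    }))
  );
}

const products = [
  { ProductUID: 1100, Color: "White", tags: ["toy", "children", "games"] },
];

// Prints three objects, one per tag, each repeating ProductUID and Color.
console.log(JSON.stringify(flattenTags(products), null, 2));
```

As with the JOIN itself, a document whose tags array is empty contributes no rows in this sketch.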

Related

Create index on nested array value with dynamodb

I have the following data stored in a DynamoDB table called elo-history.
{
"gameId": "chess",
"guildId": "abc123",
"id": "c3c640e2d8b76b034605d8835a03bef8",
"recordedAt": 1621095861673,
"results": [
{
"oldEloRating": null,
"newEloRating": 2010,
"place": 1,
"playerIds": [
"abc1"
]
},
{
"oldEloRating": null,
"newEloRating": 1990,
"place": 2,
"playerIds": [
"abc2"
]
}
],
"versus": "1v1"
}
I have 2 indexes, guildId-recordedAt-index and gameId-recordedAt-index. These allow me to query on those fields.
I am trying to add another index for results[].playerIds[]. I want to be able to query for records with playerId=abc1 and have them sorted just like the guildId and gameId queries. Does DynamoDB support something like this? Or do I need to restructure the data, or save it in two different formats, to support this type of query?
Something like this:
A new table called player-elo-history, in addition to the elo-history table. It would store the list of games by playerId.
{
"id": "abc1",
"gameId": "chess",
"guildId": "abc123",
"recordedAt": 1621095861673,
"results": [
[
{
"oldEloRating": null,
"newEloRating": 2010,
"place": 1,
"playerIds": [
"abc1"
]
},
{
"oldEloRating": null,
"newEloRating": 1990,
"place": 2,
"playerIds": [
"abc2"
]
}
]
]
}
{
"id": "abc2",
"gameId": "chess",
"guildId": "abc123",
"recordedAt": 1621095861673,
"results": [
[
{
"oldEloRating": null,
"newEloRating": 2010,
"place": 1,
"playerIds": [
"abc1"
]
},
{
"oldEloRating": null,
"newEloRating": 1990,
"place": 2,
"playerIds": [
"abc2"
]
}
]
]
}
It looks like you're modeling the one-to-many relationship between Games and Results using a complex attribute (e.g. a list of objects) on the Game item. This is a completely valid approach to modeling one-to-many relationships and is best used when 1) the results data doesn't change (or doesn't change often) and 2) you don't have any access patterns around Results.
Since it sounds like you do have access patterns around Results, you'd be better off storing your Results in their own items.
For example, you might consider modeling results in the user partition with PK=USER#user_id and SK=RESULT#game_id. This would allow you to fetch results by User ID (QUERY where PK=USER#user_id and SK begins_with RESULT). Alternatively, you could model results with PK=RESULT#game_id and SK=USER#user_id, and create a GSI that swaps the PK and SK, which would let you group results by User.
I don't know the specifics around your access patterns, but can say that you'll need to move results into their own items if you want to support access patterns around game results.
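As a rough sketch of the first option, the function below splits one game record into one item per player result, keyed as PK=USER#user_id and SK=RESULT#game_id. The helper name toResultItems is hypothetical, and it assumes that game_id in the key refers to the record's id attribute; writing the items to DynamoDB is left out.

```javascript
// Hypothetical helper: turns one elo-history record into one item per player result.
// Assumption: "game_id" in the sort key means the record's "id" attribute.
function toResultItems(game) {
  return game.results.flatMap((result) =>
    result.playerIds.map((playerId) => ({
      PK: `USER#${playerId}`,
      SK: `RESULT#${game.id}`,
      gameId: game.gameId,
      guildId: game.guildId,
      recordedAt: game.recordedAt,
      versus: game.versus,
      oldEloRating: result.oldEloRating,
      newEloRating: result.newEloRating,
      place: result.place,
    }))
  );
}

// Sample record from the question.
const game = {
  gameId: "chess",
  guildId: "abc123",
  id: "c3c640e2d8b76b034605d8835a03bef8",
  recordedAt: 1621095861673,
  results: [
    { oldEloRating: null, newEloRating: 2010, place: 1, playerIds: ["abc1"] },
    { oldEloRating: null, newEloRating: 1990, place: 2, playerIds: ["abc2"] },
  ],
  versus: "1v1",
};

// Prints two items, one under USER#abc1 and one under USER#abc2.
console.log(JSON.stringify(toResultItems(game), null, 2));
```

With this layout, a Query on PK=USER#abc1 with SK begins_with RESULT would return that player's results. If they also need to come back in time order, one possibility (an assumption, not something the answer specifies) is to include recordedAt in the sort key.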

How can I query nested arrays in Data Explorer?

I am trying to write a query in Data Explorer over a Cosmos DB to give me a list of results where the order has a discount applied. That requires that I examine every element of the Totals array for a Discounts element that is not empty.
I've tried to use ARRAY_LENGTH within ARRAY_CONTAINS as shown below, and that didn't return a result set. I know that ARRAY_CONTAINS is used to look for a field value within an array, but I was hoping it would accept an ARRAY_LENGTH expression.
SELECT * FROM c where ARRAY_CONTAINS(c.OrderHeader.Totals,{ARRAY_LENGTH(Discounts):1},true))
I've also tried to check for a value in the CampaignId field of the Discounts array using the following query. It didn't return a result set either.
SELECT * FROM c where ARRAY_CONTAINS(c.OrderHeader.Totals.Discounts,{CampaignId:null},false)
I would assume there's a way to do this, so any input would be greatly appreciated!
{
"OrderHeader": {
"Totals": [
{
"Currency": "CAD",
"Price": 10.00,
"Discounts": []
},
{
"Currency": "CAD",
"Price": 20.00,
"Discounts": []
},
{
"Currency": "CAD",
"Price": 30.00,
"Discounts": [
{
"CampaignId": "Campaign2",
"CouponDefinition": null
}
]
}
]
}
}
Please try this SQL:
SELECT t.Currency, t.Price, t.Discounts
FROM c
JOIN t IN c.OrderHeader.Totals
WHERE ARRAY_LENGTH(t.Discounts) > 0
Here is the result:
[
{
"Currency": "CAD",
"Price": 30,
"Discounts": [
{
"CampaignId": "Campaign2",
"CouponDefinition": null
}
]
}
]
Hope it can help you.

Cosmos DB query strings in array retain grouping and trim the value?

Say I have two documents in my collection:
{
"id": "111",
"linkedId": [
"ABC:123",
"ABC:456"
]
}
{
"id": "222",
"linkedId": [
"DEF:321",
"DEF:654"
]
}
What query can I run to get a result that will look like this?
{
[
"123",
"456"
]
},
{
[
"321",
"654"
]
}
I have tried
SELECT c.linkedId FROM c
But this has "linkedId" as the property name in the result set. I also tried LEFT, but it doesn't trim the first 4 characters of each string.
Then I tried
SELECT VALUE cc FROM cc IN c.linkedId
But this loses the grouping.
Any ideas?
Since the elements are plain strings rather than JSON objects, I suggest using a UDF (user-defined function) in the Cosmos DB SQL query.
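UDF (saved with the id test, so that the query can call it as udf.test):
function userDefinedFunction(arr) {
    // Collect the part of each string that follows the 4-character prefix (e.g. "ABC:").
    var returnArr = [];
    for (var i = 0; i < arr.length; i++) {
        // substring(4, 7) keeps characters 4 to 6, i.e. the three characters after the colon.
        returnArr.push(arr[i].substring(4, 7));
    }
    return returnArr;
}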
SQL:
SELECT value udf.test(c.linkedId) FROM c
OUTPUT (for the two sample documents above):
[
["123","456"],
["321","654"]
]
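The substring(4,7) call only works when every prefix is exactly three letters plus a colon and every value is exactly three characters. If that cannot be guaranteed, a variant that keeps everything after the first colon may be safer. This is a sketch under that assumption; stripPrefix is a hypothetical name, and the sample documents are the ones from the question.

```javascript
// Variant of the UDF that keeps everything after the first ":" in each string.
// stripPrefix is a hypothetical name used only for this local check.
function stripPrefix(arr) {
  return arr.map(function (s) {
    var i = s.indexOf(":");
    // Strings without a ":" are returned unchanged.
    return i === -1 ? s : s.substring(i + 1);
  });
}

// Local check with the sample documents from the question.
var docs = [
  { id: "111", linkedId: ["ABC:123", "ABC:456"] },
  { id: "222", linkedId: ["DEF:321", "DEF:654"] },
];

var trimmed = docs.map(function (d) {
  return stripPrefix(d.linkedId);
});

console.log(JSON.stringify(trimmed)); // prints [["123","456"],["321","654"]]
```

Because the function is applied once per document, each document's values stay grouped in their own array.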

Gremlin group by vertex property and get sum other properties in the same vertex

We have vertices that store various jobs, with their types and counts as properties. I have to group by status and sum their counts. I tried the following query, which works for one property (receiveCount):
g.V().hasLabel("Jobs").has("Type",within("A","B","C")).group().by("Type").by(fold().match(__.as("p").unfold().values("receiveCount").sum().as("totalRec")).select("totalRec")).next()
I want to do the same for about 10 more properties, such as successCount and FailedCount. Is there a better way to express that?
You could use the cap() step, like this:
g.V().has("name","marko").out("knows").groupCount("a").by("name").group("b").by("name").by(values("age").sum()).cap("a","b")
And the result would be:
"data": [
{
"a": {
"vadas": 1,
"josh": 1
},
"b": {
"vadas": [
27.0
],
"josh": [
32.0
]
}
}
]

DocumentDB - WHERE clause within collection

Given a sample document like this one from the Microsoft examples:
{
"id": "AndersenFamily",
"lastName": "Andersen",
"parents": [
{ "firstName": "Thomas" },
{ "firstName": "Mary Kay"}
],
"children": [
{
"firstName": "Henriette Thaulow",
"gender": "female",
"grade": 5,
"pets": [{ "givenName": "Fluffy" }]
}
],
"address": { "state": "WA", "county": "King", "city": "seattle" },
"creationDate": 1431620472,
"isRegistered": true
}
We can see that there is a sub-collection children containing an array of documents.
Let's say I wanted to write a query using the SELECT ... FROM ... WHERE ... syntax. How would I go about finding families with any daughters (any children with gender "female")?
Something like:
SELECT c.id
FROM c
WHERE c.children.contains( // I'm stuck!
I'm wondering if I'm missing a JOIN or something but honestly I'm not sure where I go from here, and I'm struggling to find anything helpful in Google partially because I'm not sure how to phrase my search!
You need the JOIN keyword to unwind the children, then apply the filter on gender like the query below:
SELECT f.id
FROM family f
JOIN child IN f.children
WHERE child.gender = "female"
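One thing to keep in mind is that the JOIN yields one row per matching child, so a family with two daughters would appear twice in the result. The plain-JavaScript sketch below emulates the query in memory to show this. It is only an illustration: familiesWithDaughters is a hypothetical helper, and the second family is made-up sample data added for the demonstration.

```javascript
// Emulates: SELECT f.id FROM family f JOIN child IN f.children WHERE child.gender = "female"
// familiesWithDaughters is a hypothetical helper, not a Cosmos DB API.
function familiesWithDaughters(families) {
  return families.flatMap((f) =>
    (f.children || [])
      .filter((child) => child.gender === "female")
      .map(() => ({ id: f.id })) // one output row per matching child
  );
}

const families = [
  {
    id: "AndersenFamily",
    children: [{ firstName: "Henriette Thaulow", gender: "female", grade: 5 }],
  },
  // Made-up family with two daughters and one son, to show the repeated row.
  {
    id: "ExampleFamily",
    children: [
      { firstName: "A", gender: "female" },
      { firstName: "B", gender: "female" },
      { firstName: "C", gender: "male" },
    ],
  },
];

const joinedIds = familiesWithDaughters(families).map((r) => r.id);
console.log(joinedIds.join(", ")); // AndersenFamily, ExampleFamily, ExampleFamily

// Collapsing to one id per family, regardless of how many children match.
const uniqueIds = [...new Set(joinedIds)];
console.log(uniqueIds.join(", ")); // AndersenFamily, ExampleFamily
```

If each family should be listed only once, the result needs to be de-duplicated in a similar way, either in the query or on the client.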
