I am building on this question that I asked earlier
The example document is the same
[
{
"id": "1",
"locales": [
{
"categories": [
"Women",
"clothing",
"tops"
]
}
]
},
{
"id": "2",
"locales": [
{
"categories": [
"Men",
"test",
"tops"
]
}
]
}
]
I am using the following query to get the UNIQUE categories from all documents, and this is working fine. However, I need to restrict the search further so that I only get the first 2 levels in the categories array, and I cannot figure out how to adjust the query to do that.
From the example document, I would only like to get Women, clothing, Men and test, since those are all in the first 2 levels of the array.
How do I adjust my query to achieve this?
You can't achieve your goal directly with a single SQL query. I still suggest using a combination of SQL and a stored procedure.
sql:
SELECT l.categories[0] as one, l.categories[1] as two FROM c
join l in c.locales
Output:
Then run the above SQL in a stored procedure and loop over the results to filter out the duplicate items.
Update Answer:
Duplicate will compare all the elements, so you can't achieve the goal with it directly.
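The loop that filters out duplicates could look roughly like the sketch below. It is written in Python rather than as a stored procedure, the function name unique_first_two is made up for illustration, and the rows are hand-written in the shape the query above would return for the two example documents.

```python
def unique_first_two(rows):
    """Collect distinct values of 'one' and 'two', keeping first-seen order."""
    seen = set()
    unique = []
    for row in rows:
        for key in ("one", "two"):
            # A key is assumed to be missing when the categories array is too short.
            value = row.get(key)
            if value is not None and value not in seen:
                seen.add(value)
                unique.append(value)
    return unique


# Hand-written rows (assumption): one row per entry in locales.
rows = [
    {"one": "Women", "two": "clothing"},
    {"one": "Men", "two": "test"},
]
print(unique_first_two(rows))
```

The third level ("tops") never reaches the loop because the query only projects indexes 0 and 1.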
Related
I'm trying to understand the documentation about JOIN in Cosmos DB SQL.
In the sample JSON, each family object has a children property, like this:
First family object:
"id": "AndersenFamily",
"lastName": "Andersen",
"children": [
{
"firstName": "Henriette Thaulow",
"grade": 5
}
],
Second family object:
"id": "WakefieldFamily",
"children": [
{
"givenName": "Jesse"
}
],
Then, this JOIN operation is shown:
SELECT f.id
FROM Families f
JOIN f.children
The obvious result is the identifier of each family object.
[
{
"id": "AndersenFamily"
},
{
"id": "WakefieldFamily"
}
]
If the JOIN is removed, the result is exactly the same. Even if I wanted to project the children, I could just use SELECT f.id, f.children, and there would be no reason to join f.children.
The only difference I observe is when a family object doesn't have a children property. Then joining on f.children excludes that family object from the results.
So what is the point of a JOIN in Cosmos DB SQL without combining it with IN? Are there any real use cases for it?
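To make the observed behavior concrete, here is a small Python model of the two forms. It is only a sketch of the behavior described in this question, not of how Cosmos DB evaluates the query. The function names are made up, and TwoChildFamily and NoChildrenFamily are hypothetical documents added to show the differences.

```python
def join_without_in(families):
    """Model of: SELECT f.id FROM Families f JOIN f.children"""
    results = []
    for f in families:
        if "children" not in f:
            continue  # no children property: the family drops out of the results
        # The whole array counts as a single joined value, so one row per family.
        results.append({"id": f["id"]})
    return results


def join_with_in(families):
    """Model of: SELECT f.id FROM Families f JOIN c IN f.children"""
    # IN iterates over the array, so there is one row per child.
    return [{"id": f["id"]} for f in families for _ in f.get("children", [])]


families = [
    {"id": "AndersenFamily", "children": [{"firstName": "Henriette Thaulow", "grade": 5}]},
    {"id": "WakefieldFamily", "children": [{"givenName": "Jesse"}]},
    {"id": "TwoChildFamily", "children": [{"firstName": "A"}, {"firstName": "B"}]},  # hypothetical
    {"id": "NoChildrenFamily"},  # hypothetical
]
print(join_without_in(families))
print(join_with_in(families))
```

In this model the form without IN only acts as a filter on whether children exists, which matches the one difference noted above.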
I have a large collection of json documents whose structure is in the form:
{
"id": "00000000-0000-0000-0000-000000001122",
"typeId": 0,
"projectId": "p001",
"properties": [
{
"id": "a6fdd321-562c-4a40-97c7-4a34c097033d",
"name": "projectName",
"value": "contoso",
},
{
"id": "d3b5d3b6-66de-47b5-894b-cdecfc8afc40",
"name": "status",
"value": "open",
},
.....{etc}
]
}
There may be a lot of properties in the collection, all identified by the value of name. The fields in properties are pretty consistent -- there may be some variability, but they will all have the fields that I care about. There's an Id, some labels, etc
I'm wanting to combine these with some other data in PowerBI using the projectId to create some very valuable reports.
I think what I want to do is 'normalize' this data into a table, like:
ProjectId | projectName | status | openDate | closeDate | manager
p001      | contoso     | open   | 20200101 |           | me
etc
Where I'm at...
I can go:
SELECT c["value"] AS ProjectName
FROM c in t.Properties
WHERE c["name"] = "projectName"
... this will give me each projectName
I can do that a heap of times to get the 'values' (status, openDate, manager, etc)
If I want to combine them, I would need to join all those sub-queries on 'id'. But 'id' is not in the scope of the SELECT, so how do I get it? If I were to do this, it sounds like something that would be very expensive (in RUs) to execute.
I think I'm overcomplicating this, but I can't quite get my head around the Cosmos syntax.
Help??
You can achieve it with JOINs and WHERE expressions, although the schema is not ideal for querying and you should consider changing it.
SELECT
c['projectId'], --c.projectId also works, but value is a reserved keyword
n['value'] AS projectName,
s['value'] AS status
FROM c
JOIN n IN c.properties
JOIN s IN c.properties
WHERE n['name'] = 'projectName' AND s['name'] = 'status'
--note all filtered properties must appear exactly once for it to work properly
Edit: a new query that removes the requirement that each filtered property appear exactly once.
SELECT
c['projectId'],
ARRAY(
SELECT VALUE n['value']
FROM n IN c.properties
WHERE n['name'] = 'projectName'
)[0] AS projectName,
ARRAY(
SELECT VALUE n['value']
FROM n IN c.properties
WHERE n['name'] = 'status'
)[0] AS status
FROM c
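For comparison, the same pivot can be sketched client-side in Python. This is an illustration only: first_value and flatten are made-up helper names, the document is the one from the question with the elided properties left out, and a missing property becomes None here, whereas the query is assumed to simply omit it.

```python
def first_value(properties, name):
    """Return the value of the first property called `name`, or None if absent."""
    return next((p["value"] for p in properties if p["name"] == name), None)


def flatten(doc, names):
    """Build one flat row per document, one column per requested property name."""
    row = {"projectId": doc["projectId"]}
    for name in names:
        row[name] = first_value(doc["properties"], name)
    return row


doc = {
    "id": "00000000-0000-0000-0000-000000001122",
    "typeId": 0,
    "projectId": "p001",
    "properties": [
        {"id": "a6fdd321-562c-4a40-97c7-4a34c097033d", "name": "projectName", "value": "contoso"},
        {"id": "d3b5d3b6-66de-47b5-894b-cdecfc8afc40", "name": "status", "value": "open"},
    ],
}
print(flatten(doc, ["projectName", "status", "closeDate"]))
```

Like the ARRAY(...)[0] form, this keeps the row even when a property is missing or appears more than once.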
Assume documents of the following schema
{
"id": 1,
"children":[
{"id": "a", "name":"al"},
{"id": "b", "name":"bob"}
]
}
I want to return an array of arrays of all children, but filtered on the id property at the root level. Below are the alternatives I know of and their limitations:
SELECT * FROM c.children
The above SQL provides the array of arrays in the right shape, but it doesn't allow me to filter on the id at the root level of the document.
SELECT c.children FROM c WHERE c.id >= 1
The above allows the filtering but returns an array of objects all with the "children" property containing the array.
SELECT child.id, child.name FROM c JOIN child in c.children WHERE c.id >= 1
The above allows the filtering but returns an array of objects. Unlike the previous example, the objects are flattened to the child level, i.e. the wrapping "children" property is not present.
Again, the ordering and grouping of children in the returned arrays are important on the client side, hence the desire to return all children of a parent grouped into an array. The first query accomplishes that but doesn't allow filtering.
Please try this SQL:
SELECT value c.children FROM c WHERE c.id >= 1
Result:
[
[
{
"id": "a",
"name": "al"
},
{
"id": "b",
"name": "bob"
}
]
]
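The difference in result shape between the plain projection and the VALUE form can be modeled in a few lines of Python. This is only a sketch with made-up function names, and the second document (id 0) is hypothetical, added to show the filter at work.

```python
def select_children(docs, min_id):
    """Model of: SELECT c.children FROM c WHERE c.id >= min_id"""
    # Each result is an object that wraps the array in a "children" property.
    return [{"children": d["children"]} for d in docs if d["id"] >= min_id]


def select_value_children(docs, min_id):
    """Model of: SELECT VALUE c.children FROM c WHERE c.id >= min_id"""
    # VALUE drops the wrapper, so each result is the array itself.
    return [d["children"] for d in docs if d["id"] >= min_id]


docs = [
    {"id": 1, "children": [{"id": "a", "name": "al"}, {"id": "b", "name": "bob"}]},
    {"id": 0, "children": [{"id": "z", "name": "zed"}]},  # hypothetical, filtered out
]
print(select_children(docs, 1))
print(select_value_children(docs, 1))
```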
I'm kinda stuck on this issue. I have several hundred documents of a certain model stored in Cosmos DB and I can't seem to get the top 5 of each category.
This is the model:
"id": "06224840-6b88-4394-9324-4d1628383702",
"name": "Reservation",
"description": null,
"client": null,
"reference": null,
"isMonitoring": false,
"monitoringSince": null,
"hasRiskProfile": false,
"riskProfile": -1,
"monitorFrequency": 0,
"mainBindable": null,
"organizationId": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
"userId": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
"createDate": "2020-08-18T11:00:02.5266403Z",
"updateDate": "2020-08-18T11:00:02.5266419Z",
"lastMonitorDate": "2020-08-18T11:00:02.5266427Z"
So what I'm trying to do is use C# to get the top 5 from each risk profile where the organizationId matches. GroupBy through LINQ throws an error, and a ROW_NUMBER() query combined with PARTITION BY doesn't work either.
Is there any way I can get this to work in a single query compatible with Cosmos DB?
EDIT:
What I am trying to achieve in Cosmos DB is roughly this:
WITH TopEntries AS (
SELECT *
,ROW_NUMBER() OVER (
PARTITION BY [riskProfile]
ORDER BY [updateDate] DESC
) AS [ROW NUMBER]
FROM [reservations]
WHERE [organizationId] = 'xyz'
)
SELECT * FROM TopEntries
WHERE TopEntries.[ROW NUMBER] <= 5
It sounds like combining TOP and ORDER BY would do the job. For example:
SELECT TOP 5 *
FROM c
WHERE c.organizationId = "xyz"
ORDER BY c.riskProfile
You can build such queries with parameters in the .NET SDK as in this sample.
The functionality you are trying to achieve is not directly possible through a single query in Cosmos DB. There are 2 steps to do this (you can adapt them to your document set).
First, you will have to group by, like below:
SELECT c.city FROM c where c.org = 'xyz' group by c.city
Then loop through the results of the first query one by one, running a query like the one below for each:
SELECT TOP 5 * FROM C WHERE C.city = 'delhi' order by C.date desc
You can refer to a similar issue here:
https://learn.microsoft.com/en-us/answers/questions/38454/index.html
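The two steps can be sketched in Python against in-memory documents. This is an illustration only, not SDK code: top_n_per_group is a made-up name, the grouping uses riskProfile and updateDate from the question instead of city and date, and the sample documents are invented. It also assumes the timestamps share one ISO 8601 format, so that string order matches time order.

```python
def top_n_per_group(docs, org_id, n=5):
    """Return {riskProfile: newest n documents} for one organization."""
    matching = [d for d in docs if d["organizationId"] == org_id]
    # Step 1: the distinct group keys (what the GROUP BY query returns).
    profiles = sorted({d["riskProfile"] for d in matching})
    # Step 2: one "TOP n ... ORDER BY ... DESC" lookup per key.
    result = {}
    for profile in profiles:
        group = [d for d in matching if d["riskProfile"] == profile]
        group.sort(key=lambda d: d["updateDate"], reverse=True)
        result[profile] = group[:n]
    return result


# Invented sample data: 7 documents for each of two risk profiles, plus one other organization.
docs = [
    {
        "id": f"{profile}-{day}",
        "organizationId": "xyz",
        "riskProfile": profile,
        "updateDate": f"2020-08-{day:02d}T11:00:02Z",
    }
    for profile in (-1, 2)
    for day in range(1, 8)
]
docs.append({"id": "other", "organizationId": "abc", "riskProfile": 2, "updateDate": "2020-08-31T11:00:02Z"})

top = top_n_per_group(docs, "xyz")
for profile, entries in top.items():
    print(profile, [e["id"] for e in entries])
```

Against a real container, step 2 costs one query per group, so the number of round trips grows with the number of distinct risk profiles.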
My document that I save in Cosmos DB looks like this:
{
"id": "abc123",
"myProperty": [
"1905844b-6ca9-4967-ba40-a736b685ca62",
"b03cc85c-ef0b-4f48-9c31-800de089190a"
]
}
As you can see, in the myProperty property, I have an array of GUID values and I want to read them as an array/list of GUID values but I'm having trouble formulating the correct SELECT statement.
The output I'm looking for is:
[
"1905844b-6ca9-4967-ba40-a736b685ca62",
"b03cc85c-ef0b-4f48-9c31-800de089190a"
]
The closest I could get is this SELECT statement:
SELECT VALUE c.myProperty FROM c WHERE c.id = "abc123"
But this doesn't give me exactly what I want either. This gives me an array within an array i.e.
[
[
"1905844b-6ca9-4967-ba40-a736b685ca62",
"b03cc85c-ef0b-4f48-9c31-800de089190a"
]
]
What should my SELECT statement look like to get what I want?
I don't think you can ever get anything else, because Cosmos DB will always return an array in response to a query, since there can potentially be anywhere from zero to an unbounded number of results. So you will always get a top-level array that wraps all your results (even if you have only one).
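Given that, one way to end up with the inner list is to unwrap the outer array on the client. The Python sketch below assumes the query result has already been loaded into a list, and unwrap_single is a made-up helper name.

```python
def unwrap_single(results):
    """Return the first (and expected only) item of a query result, or [] if empty."""
    return results[0] if results else []


# Result of the SELECT VALUE query from the question, as shown above.
results = [
    [
        "1905844b-6ca9-4967-ba40-a736b685ca62",
        "b03cc85c-ef0b-4f48-9c31-800de089190a",
    ]
]
print(unwrap_single(results))
```

This relies on the filter on id matching at most one document. With several matches, only the first document's array would be returned.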