Cosmos DB - Select root document based on child data - azure-cosmosdb

Sorry if this is a newbie question, but I am new to Cosmos DB.
I am trying to select all the root documents from my collection where a child element matches multiple specified criteria.
Let's assume you have an ORDER document with ORDERITEMS as a sub-document array. I need to query all the orders where a particular product has been ordered, and return the whole order document.
[
{
"order": {
"id": "1",
"orderitems": [
{
"partcode": "A",
"qty": "4"
},
{
"partcode": "B",
"qty": "4"
},
{
"partcode": "C",
"qty": "4"
}
]
}
},
{
"order": {
"id": "2",
"orderitems": [
{
"partcode": "A",
"qty": "4"
},
{
"partcode": "B",
"qty": "4"
},
{
"partcode": "A",
"qty": "4"
}
]
}
},
{
"order": {
"id": "3",
"orderitems": [
{
"partcode": "A",
"qty": "1"
}
]
}
}
]
My query is:
SELECT order FROM order
JOIN item IN order.orderitems
WHERE item.partcode = '<mypartcode>'
AND item.qty > 1
Now, this sort of works and returns the orders, but it returns
id: 1
id: 2
id: 2 << repeated
because id: 2 contains the same item twice. id: 3 is excluded because its qty is only 1.
In SQL Server I would simply write:
SELECT *
from Orders o
where exists (select 1
from OrderItems oi
where oi.ordID = o.ID
and oi.partcode = 'A'
and oi.qty > 1)
How can I stop the duplication, please?
Please note that the above is a hand-crafted representation to simplify the problem, as the document model I am actually working on is extremely large.

Cosmos DB now supports the DISTINCT keyword and it will actually work on document use cases such as yours.

With the current version of the Azure Cosmos DB SQL API you can use either of these:
SELECT DISTINCT VALUE order
FROM order
JOIN item in order.orderitems
WHERE item.partcode = '<Partcode>'
AND item.qty > 1
Or:
SELECT order
FROM order
WHERE EXISTS (
SELECT NULL
FROM item IN order.orderitems
WHERE item.partcode = '<Partcode>'
AND item.qty > 1
)
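To see why the JOIN form repeats order 2 while the DISTINCT and EXISTS forms do not, here is a small Python sketch that mimics the three query shapes over the sample data. It only illustrates the semantics and is not Cosmos DB code: the function names are made up for this example, the documents are flattened (no outer "order" wrapper), and qty is stored as a number so that the > 1 comparison is meaningful.

```python
# Sample orders, simplified from the question (qty as numbers: an assumption).
orders = [
    {"id": "1", "orderitems": [{"partcode": "A", "qty": 4},
                               {"partcode": "B", "qty": 4},
                               {"partcode": "C", "qty": 4}]},
    {"id": "2", "orderitems": [{"partcode": "A", "qty": 4},
                               {"partcode": "B", "qty": 4},
                               {"partcode": "A", "qty": 4}]},
    {"id": "3", "orderitems": [{"partcode": "A", "qty": 1}]},
]


def matches(item, partcode):
    """The WHERE predicate applied to a single order item."""
    return item["partcode"] == partcode and item["qty"] > 1


def join_style(orders, partcode):
    """JOIN item IN orderitems: one row per matching (order, item) pair."""
    return [o["id"] for o in orders for i in o["orderitems"] if matches(i, partcode)]


def distinct_join_style(orders, partcode):
    """The same join, with duplicate rows collapsed as DISTINCT would."""
    seen = []
    for order_id in join_style(orders, partcode):
        if order_id not in seen:
            seen.append(order_id)
    return seen


def exists_style(orders, partcode):
    """WHERE EXISTS(...): each order is tested once, so it appears at most once."""
    return [o["id"] for o in orders
            if any(matches(i, partcode) for i in o["orderitems"])]


print(join_style(orders, "A"))           # order 2 shows up twice
print(distinct_join_style(orders, "A"))
print(exists_style(orders, "A"))
```

The EXISTS shape never produces the duplicate rows in the first place, whereas DISTINCT produces them and then removes them.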

Related

How to return an array from `capture` with `global` filter in `jq`

Given the following input:
{
"text": "a1\nb2"
}
How do I get the following output:
[
{
"letter": "a",
"number": 1
},
{
"letter": "b",
"number": 2
}
]
I've tried using capture with the "g" flag, but this yields two documents instead of a single document with an array of captured inputs:
$ echo '{
"text": "a1\\nb2"
}' | jq '.text | capture("(?<letter>[a-z])(?<number>[0-9])";"g")'
{
"letter": "a",
"number": "1"
}
{
"letter": "b",
"number": "2"
}
Why not just wrap the capture in a new array:
.text | [ capture("(?<letter>[a-z])(?<number>[0-9])";"g") ]

Best way to retrieve document with nested JSON and limit

Suppose we have a structure:
{
"nested_items": [
{
"nested_sample0": "1",
"nested_sample1": "test",
"nested_sample2": "test",
"nested_sample3": {
"type": "type"
},
"nested_sample": null
},
{
"nested_sample0": "1",
"nested_sample1": "test",
"nested_sample2": "test",
"nested_sample3": {
"type": "type"
},
"nested_sample": null
},
...
],
"sample1": 1233,
"id": "ed68ca34-6b59-4687-a557-bdefc9ec2f4b",
"sample2": "",
"sample3": "test",
"sample4": "test",
"_ts": 1656503348
}
I want to retrieve a document by id but limit the number of elements returned in the "nested_items" field. As far as I know, LIMIT and OFFSET are not supported in subqueries. Is there any way to do this other than splitting it into two queries? Maybe a UDF or something else?
You can use the function ARRAY_SLICE assuming the array is ordered.
Example data:
{
"name": "John",
"details": [
{
"id": 1
},
{
"id": 2
},
{
"id": 3
}
]
}
Example queries
-- First 2 items from nested array
SELECT c.name, ARRAY_SLICE(c.details, 0, 2) as details
FROM c
-- Last 2 items from nested array
SELECT c.name, ARRAY_SLICE(c.details, ARRAY_LENGTH(c.details) - 2, 2) as details
FROM c
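As a rough mental model, ARRAY_SLICE(array, start, length) behaves like a Python list slice. The sketch below is only an analogy for the two queries above, assuming a non-negative start; array_slice is a hypothetical helper, not part of any SDK.

```python
def array_slice(items, start, length):
    """Rough analogue of ARRAY_SLICE(items, start, length) for start >= 0."""
    return items[start:start + length]


doc = {"name": "John", "details": [{"id": 1}, {"id": 2}, {"id": 3}]}

# First 2 items from the nested array
first_two = array_slice(doc["details"], 0, 2)

# Last 2 items from the nested array
last_two = array_slice(doc["details"], len(doc["details"]) - 2, 2)

print(first_two)
print(last_two)
```

Treating start as an offset and length as a limit gives simple paging over the nested array within a single query.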

Query to access selected array item fields in Cosmos DB

I need to fetch only selected fields of the items in the following array in a Cosmos DB collection.
"details": [
{
"name": "a",
"roll_no": 100,
"sub":"maths",
"class":"3"
},
{
"name": "b",
"roll_no":"512",
"sub":"eng",
"class":"5"
},
{
"name": "c",
"roll_no":"512",
"sub":"eng",
"class":"7"
}
and so on
Desired output is:
"details": [
{
"name": "a",
"roll_no": 100
},
{
"name": "b",
"roll_no":"512"
},
{
"name": "c",
"roll_no":"512"
}
and so on
How can I write a query for this in Cosmos DB?
Using a subquery and the ARRAY function will work:
SELECT c.id, ARRAY(SELECT d.name, d.roll_no FROM d in c.details) AS details
FROM c
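The ARRAY(SELECT ...) subquery acts like a per-document projection: for each element of c.details it keeps only the listed fields and collects the results back into an array. A minimal Python sketch of the same idea follows; the document id and the helper name are hypothetical, added only for illustration.

```python
# Hypothetical document shaped like the one in the question.
doc = {
    "id": "doc-1",
    "details": [
        {"name": "a", "roll_no": 100, "sub": "maths", "class": "3"},
        {"name": "b", "roll_no": "512", "sub": "eng", "class": "5"},
        {"name": "c", "roll_no": "512", "sub": "eng", "class": "7"},
    ],
}


def project_details(document, fields=("name", "roll_no")):
    """Keep only the selected fields of every element in 'details'."""
    return {
        "id": document["id"],
        "details": [
            {key: item[key] for key in fields if key in item}
            for item in document["details"]
        ],
    }


result = project_details(doc)
print(result)
```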

Filtering on an aggregate function

I am using Azure Cosmos DB and trying to write a query to filter documents by name and version. I am new to Cosmos and it seems the way I'm doing it applies the filter per record rather than to the result set. Can anyone tell me the proper way to accomplish this:
select C.*
from c
JOIN (select MAX(c.version) from c where c.name = "test") maxVersion
where maxVersion = c.version
Sample data:
[{"name":"test","version":1},{"name":"test","version":2},{"name":"test","version":3}]
Results:
I get a record back for each version instead of only the max version. I.e. I should only get one record back, and its version number should be 3.
When you run this SQL:
select c,maxVersion
from c
JOIN (select MAX(c.version) from c where c.name = "test") maxVersion
you will get these documents:
{
"c": {
"id": "1",
"name": "test",
"version": 1
},
"maxVersion": {
"$1": 1
}
},
{
"c": {
"id": "2",
"name": "test",
"version": 2
},
"maxVersion": {
"$1": 2
}
},
{
"c": {
"id": "3",
"name": "test",
"version": 3
},
"maxVersion": {
"$1": 3
}
}
The subquery in the JOIN is evaluated separately for each document, so maxVersion always equals that document's own c.version. That is why you get every document back rather than just one.
For your requirement, you can try a query like this:
SELECT TOP 1 *
FROM c
WHERE c.name = "test"
ORDER BY c.version DESC
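The difference between the two approaches can be sketched in Python. This only illustrates the logic and is not how Cosmos DB executes the query; the function names are made up for the example.

```python
docs = [
    {"id": "1", "name": "test", "version": 1},
    {"id": "2", "name": "test", "version": 2},
    {"id": "3", "name": "test", "version": 3},
]


def per_document_max(documents, name):
    """Mimics the JOIN subquery: MAX is computed inside each single document,
    so it always equals that document's own version and filters nothing."""
    rows = []
    for d in documents:
        if d["name"] != name:
            continue
        max_version = max([d["version"]])  # aggregate over one document only
        if max_version == d["version"]:
            rows.append(d["id"])
    return rows


def latest(documents, name):
    """Mimics TOP 1 ... ORDER BY version DESC: sort the matches, keep the first."""
    candidates = [d for d in documents if d["name"] == name]
    candidates.sort(key=lambda d: d["version"], reverse=True)
    return candidates[0] if candidates else None


print(per_document_max(docs, "test"))  # every id comes back
print(latest(docs, "test"))            # only the highest version
```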

Cosmos DB SQL API query for children of nested objects

I would like to find a better way to search for documents in a collection that have a property with more than 0 elements in an array, i.e. anything that isn't empty.
For example: select * from c where c.property = 'x' and array_length(c.child) > 0 and array_length(c.child.grandchild) > 0
The first ARRAY_LENGTH works. Adding the second with plain dot notation doesn't work, as I read elsewhere. How can I accomplish this? Each child can have anywhere from zero to many grandchild elements, and I want the documents where a grandchild array has a length greater than 0.
Please let me know if more clarification is needed.
Please use the SQL below:
SELECT distinct c.id,c.name,c.child FROM c
join child in c.child
where array_length(c.child) > 0
and array_length(child.grandchild) > 0
My sample documents:
[
{
"id": "1",
"name": "Jay",
"child": [
{
"name": "A",
"grandchild": [
{
"name": "A1"
},
{
"name": "A2"
}
]
},
{
"name": "B",
"grandchild": [
{
"name": "B1"
},
{
"name": "B2"
}
]
}
]
},
{
"id": "2",
"name": "Tom",
"child": [
{
"name": "A",
"grandchild": []
},
{
"name": "B",
"grandchild": []
}
]
}
]
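The dot notation fails because c.child is an array, so c.child.grandchild has no single value; the JOIN visits each child in turn instead. Here is a hedged Python sketch of that logic over the sample documents, showing why DISTINCT is needed; the function names are hypothetical.

```python
documents = [
    {"id": "1", "name": "Jay", "child": [
        {"name": "A", "grandchild": [{"name": "A1"}, {"name": "A2"}]},
        {"name": "B", "grandchild": [{"name": "B1"}, {"name": "B2"}]},
    ]},
    {"id": "2", "name": "Tom", "child": [
        {"name": "A", "grandchild": []},
        {"name": "B", "grandchild": []},
    ]},
]


def join_rows(docs):
    """JOIN child IN c.child: one row per child with a non-empty grandchild array."""
    return [d["id"] for d in docs for ch in d.get("child", [])
            if len(ch.get("grandchild", [])) > 0]


def docs_with_grandchildren(docs):
    """Each document at most once, which is what DISTINCT achieves in the query."""
    return [d["id"] for d in docs
            if any(len(ch.get("grandchild", [])) > 0 for ch in d.get("child", []))]


print(join_rows(documents))                # document 1 appears once per child
print(docs_with_grandchildren(documents))
```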
Hope it helps you.
