Is there a way to use c.* in CosmosDB? - azure-cosmosdb

I'm trying to run a query in CosmosDB but I can't seem to get the result the way I want it.
This is the query:
SELECT c FROM c
JOIN p in c.Data.packages
WHERE p.packageId ="ID_9DACF11F-31F1-45C0-9A99-7ED846F9226E"
Is there a way to get the following result without the "c":
[
{
"c": {
"id": "ID-6A23-432D-B862-4342D6B8C6F0",
"prop1": "value",
"prop2": "value",
"Data": {
"date": "2020-01-30T18:21:57",
"packages": [
{
"packageId": "123"
}
]
}
}
}
]
As in just:
[
{
"id": "ID-6A23-432D-B862-4342D6B8C6F0",
"prop1": "value",
"prop2": "value",
"Data": {
"date": "2020-01-30T18:21:57",
"packages": [
{
"packageId": "123"
}
]
}
}
]
I know I can use c.id, c.prop1, etc. But I have tons of more properties, and it will get really hard to maintain in the future. So is there a way to achieve this?
Basically looking for something like this:
SELECT c.* FROM c
JOIN p in c.Data.packages
WHERE p.packageId ="ID_9DACF11F-31F1-45C0-9A99-7ED846F9226E"
But this doesn't work obviously. Any help appreciated!

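SELECT * only works when the query declares a single alias. For queries with multiple aliases (here c and p), use VALUE c instead; it returns each matching document itself rather than wrapping it in a "c" property.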
SELECT VALUE c FROM c
JOIN p in c.Data.packages
WHERE p.packageId ="ID_9DACF11F-31F1-45C0-9A99-7ED846F9226E"
Docs: https://learn.microsoft.com/en-us/azure/cosmos-db/sql-query-select

Related

How to query Cosmos DB to have an array from multiple items in the result set

I have the following content in a container, where device_id is the partition key.
[
{
"id": "hub-01",
"device_id": "device-01",
"created": "2020-12-08T17:47:35",
"cohort": "test"
},
{
"id": "hub-02",
"device_id": "device-01",
"created": "2020-12-08T17:47:36",
"cohort": "test"
},
{
"id": "hub-01",
"device_id": "device-02",
"created": "2020-11-17T20:25:20",
"cohort": "test"
},
{
"id": "hub-01",
"device_id": "device-03",
"created": "2020-11-17T16:05:18",
"cohort": "test"
}
]
How do I query all unique devices, with all their metadata collected into a sub-list, so I get the following result set:
[
{
"device_id": "device-01",
"hubs": [
{
"id": "hub-01",
"created": "2020-12-08T17:47:35",
"cohort": "test"
},
{
"id": "hub-02",
"created": "2020-12-08T17:47:36",
"cohort": "test"
}
]
},
{
"device_id": "device-02",
"hubs": [
{
"id": "hub-01",
"created": "2020-11-17T20:25:20",
"cohort": "test"
}
]
},
{
"device_id": "device-03",
"hubs": [
{
"id": "hub-01",
"created": "2020-11-17T16:05:18",
"cohort": "test"
}
]
}
]
I was experimenting along the lines of the following sub-query, but it does not behave as I would expect:
SELECT
DISTINCT c.device_id,
ARRAY(
SELECT
c2.id,
c2.created,
c2.cohort
FROM c AS c2
WHERE c2.device_id = c.device_id
) as hubs
FROM c
You can create a UDF to handle this.
Here is a similar question I answered in another post:
group data by same timestamp using cosmos db sql
I agree with Mo B. You need to handle this on the client side. I don't think a UDF can do it, because a UDF can't combine multiple items into one. I think the closest SQL would be something like this:
SELECT
c2.device_id,ARRAY_CONCAT([],c2.hubs)
FROM
(SELECT c.device_id,ARRAY(
SELECT
c.id,
c.created,
c.cohort
FROM c
) as hubs FROM c) as c2
GROUP BY c2.device_id
But ARRAY_CONCAT isn't an aggregate function, and there is no aggregate function that can concatenate arrays.
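The client-side grouping suggested above could look roughly like the following. This is a minimal sketch, assuming the items have already been fetched (for example with a query that projects only id, device_id, created and cohort) and are available as plain Python dicts; the function name group_hubs_by_device is hypothetical and not part of any SDK.

```python
from collections import defaultdict


def group_hubs_by_device(items):
    """Group flat hub items into one entry per device_id (hypothetical helper)."""
    grouped = defaultdict(list)
    for item in items:
        # Copy every property except the grouping key into the hub entry.
        hub = {k: v for k, v in item.items() if k != "device_id"}
        grouped[item["device_id"]].append(hub)
    # Dicts keep insertion order, so devices appear in first-seen order.
    return [{"device_id": d, "hubs": hubs} for d, hubs in grouped.items()]


# Sample items copied from the question.
items = [
    {"id": "hub-01", "device_id": "device-01", "created": "2020-12-08T17:47:35", "cohort": "test"},
    {"id": "hub-02", "device_id": "device-01", "created": "2020-12-08T17:47:36", "cohort": "test"},
    {"id": "hub-01", "device_id": "device-02", "created": "2020-11-17T20:25:20", "cohort": "test"},
    {"id": "hub-01", "device_id": "device-03", "created": "2020-11-17T16:05:18", "cohort": "test"},
]

result = group_hubs_by_device(items)
```

Because device_id is the partition key, the same grouping could also be done one partition at a time if the full result set is too large to hold in memory.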

Gremlin Query (JSON Output) using tree

Query:
g.withSack(0).V().hasLabel('A').has('label_A','A').union(__.emit().repeat(sack(sum).by(constant(1)).in()),emit().repeat(sack(sum).by(constant(-1)).out())).project('level','properties').by(sack()).by(tree().by(valueMap().by(unfold())).unfold())
Output:
{
"level": 1,
"properties": {
"key": {
"label_A": "A"
},
"value": {
"{label_A=A}": {}
}
}
},
{
"level": 2,
"properties": {
"key": {
"label_A": "A"
},
"value": {
"{label_A=A}": {}
}
}
}
}
The keys come back in JSON format, but the values do not. Please suggest changes to the query to achieve the values in JSON format.
The tree() step returns a tree structure, essentially a Map of Map instances, so the output is about what you can expect. In this case, I wonder if you need tree() at all; you could instead get by with path(), which seems to accomplish the same result without the added structure:
g.withSack(0).
V().hasLabel('A').has('label_A','A').
union(__.emit().repeat(sack(sum).by(constant(1)).in()),
emit().repeat(sack(sum).by(constant(-1)).out())).
project('level','properties').
by(sack()).
by(path().by(elementMap()))

Cosmos DB query syntax WHERE clause with array in array

The following json represents two documents in a Cosmos DB container.
How can I write a query that gets any document that has an item with an id of item_1 and a value of bar?
I've looked into ARRAY_CONTAINS, but can't get it to work with arrays in arrays.
I've also tried some things with any. Although I can't find any documentation on how to use it, any seems to be a valid function, as I do get syntax highlighting in the Cosmos DB explorer in the Azure Portal.
For the any function I tried things like SELECT * FROM c WHERE c.pages.any(p, p.items.any(i, i.id = "item_1" AND i.value = "bar")).
The id fields are unique so if it's easier to find any document that contains any object with the right id and value, that would be fine too.
[
{
"type": "form",
"id": "form_a",
"pages": [
{
"name": "Page 1",
"id": "page_1",
"items": [
{
"id": "item_1",
"value": "foo"
}
]
}
]
},
{
"type": "form",
"id": "form_b",
"pages": [
{
"name": "Page 1",
"id": "page_1",
"items": [
{
"id": "item_1",
"value": "bar"
}
]
}
]
}
]
I think a JOIN can handle a WHERE clause on an array within an array. Please test the SQL below:
SELECT c.id FROM c
join pages in c.pages
where array_contains(pages.items,{"id": "item_1","value": "bar"},true)
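To make the logic of that query easier to follow, here is a rough Python emulation of it against the two sample documents. It is only a sketch of the semantics under simplifying assumptions: partial_match compares top-level properties only, and both function names are hypothetical, not Cosmos DB APIs.

```python
def partial_match(element, pattern):
    """True if every key/value pair in pattern also appears in element."""
    return all(k in element and element[k] == v for k, v in pattern.items())


def forms_with_item(docs, pattern):
    """Emulate: JOIN pages IN c.pages WHERE ARRAY_CONTAINS(pages.items, pattern, true)."""
    rows = []
    for doc in docs:
        # The JOIN yields one candidate row per (document, page) pair.
        for page in doc.get("pages", []):
            if any(partial_match(item, pattern) for item in page.get("items", [])):
                rows.append({"id": doc["id"]})
    return rows


# Sample documents copied from the question.
docs = [
    {"type": "form", "id": "form_a", "pages": [
        {"name": "Page 1", "id": "page_1", "items": [{"id": "item_1", "value": "foo"}]}]},
    {"type": "form", "id": "form_b", "pages": [
        {"name": "Page 1", "id": "page_1", "items": [{"id": "item_1", "value": "bar"}]}]},
]

matches = forms_with_item(docs, {"id": "item_1", "value": "bar"})
```

Since the JOIN produces one row per page, a document with several matching pages would appear once per page in the result.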

CosmosDB, help flatten and filter by nested array

I'm trying to flatten and filter my json data that is in a CosmosDB.
The data looks like below and I would like to flatten everything in the array Variables and then filter by specific _id and Timestamp inside of the array:
{
"_id": 21032,
"FirstConnected": {
"$date": 1522835868346
},
"LastUpdated": {
"$date": 1523360279908
},
"Variables": [
{
"_id": 99999,
"Values": [
{
"Timestamp": {
"$date": 1522835868347
},
"Value": 1
}
]
},
{
"_id": 99998,
"Values": [
{
"Timestamp": {
"$date": 1523270312001
},
"Value": 8888
}
]
}
]
}
If you want to flatten data from the Variables array with properties from the root object you can query your collection like this:
SELECT root._id, root.FirstConnected, root.LastUpdated, var.Values
FROM root
JOIN var IN root.Variables
WHERE var._id = 99998
This will result in:
[
{
"_id": 21032,
"FirstConnected": {
"$date": 1522835868346
},
"LastUpdated": {
"$date": 1523360279908
},
"Values": [
{
"Timestamp": {
"$date": 1523270312001
},
"Value": 8888
}
]
}
]
If you also want to flatten the Values array, you will need to write something like this:
SELECT root._id, root.FirstConnected, root.LastUpdated,
var.Values[0].Timestamp, var.Values[0]["Value"]
FROM root
JOIN var IN root.Variables
WHERE var._id = 99998
Note that CosmosDB treats "Value" as a reserved keyword, so you need to use the escape syntax. The result of this query is:
[
{
"_id": 21032,
"FirstConnected": {
"$date": 1522835868346
},
"LastUpdated": {
"$date": 1523360279908
},
"Timestamp": "1970-01-01T00:00:00Z",
"Value": 8888
}
]
For more details, see https://learn.microsoft.com/en-us/azure/cosmos-db/sql-api-sql-query#Advanced
If you're only looking to filter by the nested '_id' property, then you could use ARRAY_CONTAINS with the partial_match argument set to true. The query would look something like this:
SELECT VALUE c
FROM c
WHERE ARRAY_CONTAINS(c.Variables, {_id: 99998}, true)
If you also want to flatten the array, then you could iterate over it, as a JOIN does, using the FROM ... IN form:
SELECT VALUE v
FROM v IN c.Variables
WHERE v._id = 99998
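The queries above filter on the variable's _id, and the Values[0] form only reads the first entry of each Values array. As an illustration of flattening every entry and also filtering on the timestamp, here is a minimal client-side sketch. It assumes the document has already been fetched as a Python dict; the function name flatten_variables, the output key VariableId and the min_timestamp parameter are all hypothetical.

```python
def flatten_variables(doc, variable_id, min_timestamp):
    """Emit one flat row per Values entry of the chosen variable (hypothetical helper)."""
    rows = []
    for var in doc.get("Variables", []):
        if var["_id"] != variable_id:
            continue
        for val in var.get("Values", []):
            ts = val["Timestamp"]["$date"]  # epoch milliseconds in the sample data
            if ts >= min_timestamp:
                rows.append({
                    "_id": doc["_id"],
                    "VariableId": var["_id"],
                    "Timestamp": ts,
                    "Value": val["Value"],
                })
    return rows


# Sample document copied from the question.
doc = {
    "_id": 21032,
    "FirstConnected": {"$date": 1522835868346},
    "LastUpdated": {"$date": 1523360279908},
    "Variables": [
        {"_id": 99999, "Values": [{"Timestamp": {"$date": 1522835868347}, "Value": 1}]},
        {"_id": 99998, "Values": [{"Timestamp": {"$date": 1523270312001}, "Value": 8888}]},
    ],
}

rows = flatten_variables(doc, 99998, 1523000000000)
```

The nested loops play the role of two chained JOINs, one over Variables and one over Values.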

How can I write a document db query which returns results of sub-documents satisfying any condition of an input list?

Let's assume that I have the following documents:
[
{
"name" : "obj1",
"field": [ "Foo1", "Foo3" ]
},
{
"name": "obj2",
"field": [ "Foo2" ]
},
{
"name": "obj3",
"field": [ "Foo3" ]
},
{
"name": "obj4",
"field": [ "Foo1" ]
}
]
I want to write a query which returns obj1, obj3, and obj4 when field = "Foo1" or "Foo3" is searched for. Obviously I can write something like:
SELECT * FROM c WHERE ARRAY_CONTAINS(c.field, "Foo1") OR ARRAY_CONTAINS(c.field, "Foo3")
However, I want to avoid constructing a long query by concatenating an ARRAY_CONTAINS clause onto the query string for each value in the search list.
How can this query be expressed succinctly?
You could rewrite the query using JOIN as follows:
SELECT c
FROM c
JOIN tag IN c.field
WHERE ARRAY_CONTAINS(["Foo1", "Foo3"], tag)
Note that if you have an object with both tags, then it would occur multiple times in the result, and you have to perform distinct/de-duping on the client side.
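The client-side de-duping mentioned above might be sketched as follows. This assumes the query results are already available as a list of Python dicts, each wrapped under "c" because the query uses SELECT c, and that name identifies a document uniquely in this sample (a real container would more likely use id). The function name dedupe_by_key is hypothetical.

```python
def dedupe_by_key(rows, key="name"):
    """Keep the first occurrence of each document, identified by `key`."""
    seen = set()
    unique = []
    for row in rows:
        doc = row["c"]  # SELECT c wraps each document under the alias "c"
        if doc[key] not in seen:
            seen.add(doc[key])
            unique.append(doc)
    return unique


# Rows as the JOIN query would plausibly return them for the sample data:
# obj1 appears twice because both "Foo1" and "Foo3" match.
rows = [
    {"c": {"name": "obj1", "field": ["Foo1", "Foo3"]}},
    {"c": {"name": "obj1", "field": ["Foo1", "Foo3"]}},
    {"c": {"name": "obj3", "field": ["Foo3"]}},
    {"c": {"name": "obj4", "field": ["Foo1"]}},
]

unique_docs = dedupe_by_key(rows)
```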
