How to run aggregates across partitions

When I run a query like this in the portal, it runs just fine. But when I run it from the Python SDK, I get "Cross partition query only supports 'VALUE <AggregateFunc>' for aggregates." I want to run it across all partitions. Any suggestions on how to get it working from the SDK?
Thanks,
Peter
SELECT
c.data.video.id,
COUNT(1) as nTraces,
MAX(c.data.time) as lastReport
FROM c
WHERE c.data.time > "2020-06-02T17:40:25.593141+00:00"
GROUP BY c.data.video.id

According to my test, the Python SDK doesn't support GROUP BY.
Below is my test code:
query = "SELECT c.id FROM c group by c.id"
items = list(container.query_items(
    query=query,
    enable_cross_partition_query=True
))
Here is the error:
azure.cosmos.exceptions.CosmosHttpResponseError: (BadRequest) Gateway Failed to Retrieve Query Plan: Query contains 1 or more unsupported features. Upgrade your SDK to a version that does support the requested features:
Query contained GroupBy, which the calling client does not support.
According to the documentation, the .NET SDK and the JavaScript SDK support GROUP BY, so you can run this query with either of them.
By the way, I searched the Python SDK release history and it doesn't mention GROUP BY support.
Hope this helps.
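If the installed Python SDK rejects GROUP BY, one workaround is to fetch plain rows with a cross-partition query and group them on the client. The following is a minimal sketch, not part of the SDK: the function name group_traces, the aliases videoId and time, and the sample rows are assumptions made for illustration.

```python
from collections import defaultdict


def group_traces(items):
    """Client-side stand-in for COUNT(1) and MAX(c.data.time) per video id."""
    counts = defaultdict(int)
    latest = {}
    for item in items:
        video_id = item["videoId"]
        report_time = item["time"]
        counts[video_id] += 1
        # Plain string comparison, the same comparison the query's WHERE clause
        # applies to these ISO 8601 timestamps (assumes a common UTC offset).
        if video_id not in latest or report_time > latest[video_id]:
            latest[video_id] = report_time
    return [
        {"id": vid, "nTraces": counts[vid], "lastReport": latest[vid]}
        for vid in counts
    ]


# Hypothetical rows, shaped like the result of a non-aggregate query such as
#   SELECT c.data.video.id AS videoId, c.data.time AS time FROM c WHERE ...
# fetched with container.query_items(..., enable_cross_partition_query=True).
sample_rows = [
    {"videoId": "v1", "time": "2020-06-02T17:45:00+00:00"},
    {"videoId": "v2", "time": "2020-06-02T18:00:00+00:00"},
    {"videoId": "v1", "time": "2020-06-03T08:00:00+00:00"},
]

summary = group_traces(sample_rows)
```

This transfers every matching row to the client, so the cost grows with the number of rows rather than the number of groups.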

Related

Get a bulk pipeline runs providing run ids as a list from .NET

I have an endpoint which tells me the status of a given ADF pipeline from .NET. For that purpose I use the .NET SDK for ADF: I run PipelineRun pipelineRun = client.PipelineRuns.Get(resourceGroup, dataFactoryName, runId); and then read the status from pipelineRun.Status. The only thing I get from the user is the runId. However, I have a scenario where I need to send a list of runIds. Reading the official documentation, I noticed that most of the APIs take a runId of type str, which suggests they work on one runId at a time. Has anyone run into a scenario like this, and how did you get the status of multiple runIds? Did you use a built-in function from the SDK, or did you just loop over PipelineRun pipelineRun = client.PipelineRuns.Get(resourceGroup, dataFactoryName, runId); once per runId?
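The loop described in the question can be sketched as follows. This is only an illustration in Python, not the ADF SDK: collect_statuses and get_status are hypothetical names, and get_status stands in for whatever single-run lookup is available (in the question, client.PipelineRuns.Get followed by reading Status).

```python
def collect_statuses(run_ids, get_status):
    """Look up each run id with a single-run call and gather the results."""
    statuses = {}
    for run_id in run_ids:
        # One lookup per run id; get_status is a hypothetical callable.
        statuses[run_id] = get_status(run_id)
    return statuses


# Stand-in for the real service: a dictionary of made-up run ids and statuses.
fake_runs = {"run-1": "Succeeded", "run-2": "InProgress"}
result = collect_statuses(["run-1", "run-2"], fake_runs.get)
```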

CosmosDB Zone Redundancy using Azure Libraries for Net

I currently create a CosmosDB with the following properties:
cosmosDb = await azure.CosmosDBAccounts
    .Define(cosmosDbResource.Name)
    .WithRegion(cosmosDbResource.Region)
    .WithExistingResourceGroup(cosmosDbResource.ResourceGroup.Name)
    .WithKind(DatabaseAccountKind.GlobalDocumentDB)
    .WithStrongConsistency()
    .WithTags(cosmosDbResource.ResourceGroup.Tags)
    .CreateAsync();
The only place I have seen to be able to set Zone Redundancy on is the ReadReplication database, like so:
cosmosDb = await azure.CosmosDBAccounts
    .Define(cosmosDbResource.Name)
    .WithRegion(cosmosDbResource.Region)
    .WithExistingResourceGroup(cosmosDbResource.ResourceGroup.Name)
    .WithKind(DatabaseAccountKind.GlobalDocumentDB)
    .WithStrongConsistency()
    .WithReadReplication(Region.USEast, true)
    .WithTags(cosmosDbResource.ResourceGroup.Tags)
    .CreateAsync();
The problem is that I don't care about a Read Replication database. I want to set Zone Redundancy on the initial database I create. I noticed that in the Azure Portal when I create a CosmosDB manually, it gives me the option to set Zone Redundancy. Is this not possible via the Azure Libraries for NET SDK?

To specify a write region with Zone Redundancy, add this call:
.WithWriteReplication(Region.USWest2, true)
PS: If at all possible, I would recommend using the Auto-rest generated version of this SDK. The fluent API is generally not as up to date as the Auto-rest generated APIs. The generated SDK is built directly off the Cosmos DB swagger spec, and everything downstream is built upon it, including ARM, PowerShell and the CLI.
There is also a repository with a fairly complete set of examples that you can use to help build your own management libraries: Cosmos DB Samples. It includes fluent samples too, but those are also out of date.
This is the repo for the Auto-rest generated SDK: Cosmos DB Management SDK for .NET

Updating from Document DB SDK (2.7.0) to Cosmos SDK (3.12.0)

We are in the process of updating our service code from DocumentDB SDK 2.7.0 to Cosmos SDK 3.12.0. Since the change will likely be huge, we would like to do it incrementally, which will result in our service using both SDKs to access the same databases (one executable loading the assemblies of both SDKs). Please let us know if that is supported or if you see any issues in doing so. Also, I couldn't figure out how to do some things the same way with the Cosmos SDK (e.g. specifying "enable cross partition query" when querying items: the query method in 2.7.0 takes FeedOptions as a parameter, whereas the new one in 3.12.0 doesn't). I found this wiki and some sample code, but if you have more info or guidelines for converting from the DocumentDB SDK to the Cosmos SDK, please let me know.

Yes, you can use both the DocumentDB SDK and the Cosmos SDK to access the same databases.
In Cosmos SDK 3.12.0 there is no need to set EnableCrossPartitionQuery to true. Something like this is enough:
QueryDefinition queryDefinition = new QueryDefinition("select * from c");
FeedIterator<JObject> feedIterator = container.GetItemQueryIterator<JObject>(queryDefinition);
while (feedIterator.HasMoreResults)
{
    // Each ReadNextAsync call returns one page of results.
    foreach (var item in await feedIterator.ReadNextAsync())
    {
        Console.WriteLine(item.ToString());
    }
}

How to access a query "Query Execution Metrics" in Cosmos db .NET Core SDK V3

I am running a query against an Azure Cosmos DB and I need to know the total number of retrieved documents regardless of the pagination. Running a COUNT query for the actual query without the pagination could be very heavy if the number of retrieved documents is huge.
The link below describes how to access a query's "Query Execution Metrics" in the Cosmos DB .NET SDK V2. I would appreciate it if someone could show me how to do it using the SDK V3.
https://learn.microsoft.com/en-us/azure/cosmos-db/sql-api-query-metrics

Version 3.2.0 of the SDK that was released yesterday addresses this issue. Instead of asking for the metrics, they are included in every query. You can access them through ResponseMessage.Diagnostics.
The usage is probably easiest to see by looking at the SDK's tests:
((QueryOperationStatistics)responseMessage.Diagnostics)
    .queryMetrics
    .Values
    .First()
    .RetrievedDocumentCount
You can see the full list of properties in the QueryMetrics definition: https://github.com/Azure/azure-cosmos-dotnet-v3/blob/2cdcde1b747db59721ede152fc9b5aa87fc62dd4/Microsoft.Azure.Cosmos/src/Query/Core/QueryMetrics/QueryMetrics.cs

Alternative to group by for cosmos db

Given that Cosmos DB does not support GROUP BY, what is a good alternative to achieve functionality similar to this:
Select sum(*) , groupterm from tble group by groupterm
Can I efficiently achieve this in a cosmos stored procedure?

As Cosmos_DB states as follows:
Aggregation capability in SQL limited to COUNT, SUM, MIN, MAX, AVG functions. No support for GROUP BY or other aggregation functionality found in database systems. However, stored procedures can be used to implement in-the-database aggregation capability.
Can I efficiently achieve this in a cosmos stored procedure?
For .NET and Node.js
Larry Maccherone has provided a great package documentdb-lumenize which supports Aggregations (Group-by, Pivot-table, and N-dimensional Cube) and Time Series Transformations as Stored Procedures in DocumentDB.
Additionally, for Python and Scala, you could refer to azure-cosmosdb-spark.
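Where neither a stored procedure nor a GROUP BY-capable SDK is an option, the grouped sum from the question can also be accumulated on the client while paging through results. A minimal sketch under stated assumptions: sum_by_group, the field names groupterm and amount, and the sample pages are hypothetical, and each page stands for one batch of documents returned by a query.

```python
def sum_by_group(pages, group_field, value_field):
    """Accumulate a per-group sum across pages of documents."""
    totals = {}
    for page in pages:
        for doc in page:
            key = doc[group_field]
            # Running total, so no page needs to be kept after it is processed.
            totals[key] = totals.get(key, 0) + doc[value_field]
    return totals


# Two made-up pages of documents, as a paged query might return them.
sample_pages = [
    [{"groupterm": "a", "amount": 2}, {"groupterm": "b", "amount": 5}],
    [{"groupterm": "a", "amount": 3}],
]

totals = sum_by_group(sample_pages, "groupterm", "amount")
```

Keeping only running totals means memory use depends on the number of distinct groups, not on the number of documents read.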

GROUP BY is now supported in the Cosmos DB SQL API. You will need .NET SDK version 3.3 or higher:
Azure Cosmos DB currently supports GROUP BY in .NET SDK 3.3 or later. Support for other language SDK's and the Azure Portal is not currently available but is planned.
https://learn.microsoft.com/en-gb/azure/cosmos-db/sql-query-group-by

Finally, Azure Cosmos DB currently supports GROUP BY in .NET SDK 3.3 or later. Support for other language SDK's and the Azure Portal is not currently available but is planned.
<group_by_clause> ::= GROUP BY <scalar_expression_list>
<scalar_expression_list> ::=
    <scalar_expression>
    | <scalar_expression_list>, <scalar_expression>
