How to get metrics for a request on CosmosDB graph collection? - azure-cosmosdb

I want to find out details about a Gremlin query, so I set the PopulateQueryMetrics property of the FeedOptions argument to true.
But the FeedResponse object I get back doesn't have the QueryMetrics property populated.
var queryString = $"g.addV('{d.type}').property('id', '{d.Id}')";
var query = client.CreateGremlinQuery<dynamic>(graphCollection, queryString,
    new FeedOptions
    {
        PopulateQueryMetrics = true
    });
while (query.HasMoreResults)
{
    FeedResponse<dynamic> response = await query.ExecuteNextAsync();
    // response.QueryMetrics is null
}
Am I missing something?

Following your description, I created an Azure Cosmos DB account with the Gremlin (graph) API and could reproduce the issue you mention. I found the tutorial "Monitoring and debugging with metrics in Azure Cosmos DB"; its section "Debugging why queries are running slow" says:
In the SQL API SDKs, Azure Cosmos DB provides query execution statistics.
IDocumentQuery<dynamic> query = client.CreateDocumentQuery(
    UriFactory.CreateDocumentCollectionUri(DatabaseName, CollectionName),
    "SELECT * FROM c WHERE c.city = 'Seattle'",
    new FeedOptions
    {
        PopulateQueryMetrics = true,
        MaxItemCount = -1,
        MaxDegreeOfParallelism = -1,
        EnableCrossPartitionQuery = true
    }).AsDocumentQuery();
FeedResponse<dynamic> result = await query.ExecuteNextAsync();
// Returns metrics by partition key range Id
IReadOnlyDictionary<string, QueryMetrics> metrics = result.QueryMetrics;
Then I queried my Cosmos DB Gremlin (graph) account through the SQL API as shown above, and QueryMetrics was populated.
Note: I also checked that you can specify a SQL expression such as SELECT * FROM c where c.id='thomas' and c.label='person'. For adding a new vertex, I do not know how to construct the SQL expression. Moreover, the CreateDocumentAsync method does not accept a FeedOptions parameter.
As far as I understand, the PopulateQueryMetrics setting may only work when using the SQL API. You could add your feedback here.

Related

FeedOptions.PopulateQueryMetrics in Cosmos SDK 3.0

There is a property in FeedOptions class named PopulateQueryMetrics in Document DB SDK 2.#. What is the equivalent in Cosmos SDK 3.#?
IDocumentQuery<dynamic> documentQuery = documentClient.CreateDocumentQuery<dynamic>(
    collectionUri,
    sqlQuery,
    new FeedOptions
    {
        EnableCrossPartitionQuery = true,
        PopulateQueryMetrics = true
    }).AsDocumentQuery();
You don't need to set this flag anymore. Query metrics (plus other important network information and latency metrics) are captured in FeedResponse.Diagnostics. Reference: https://learn.microsoft.com/dotnet/api/microsoft.azure.cosmos.response-1.diagnostics?view=azure-dotnet#Microsoft_Azure_Cosmos_Response_1_Diagnostics
Just capture this property's ToString() value, for example:
FeedResponse<T> response = await iterator.ReadNextAsync();
Console.WriteLine(response.Diagnostics.ToString());

Invalid index exception when using BulkExecutor in CosmosDb

I have an error when I'm trying to use BulkExecutor to update one of the properties in CosmosDb. The error message is "Index was out of range. Must be non-negative and less than the size of the collection.
Parameter name: index"
Important point: I don't have a partition key defined on my collection.
Here is my code:
SetUpdateOperation<string> player1NameUpdateOperation = new SetUpdateOperation<string>("Player1Name", name);
var updateOperations = new List<UpdateOperation>();
updateOperations.Add(player1NameUpdateOperation);
var updateItems = new List<UpdateItem>();
foreach (var match in list)
{
    string id = match.id;
    updateItems.Add(new UpdateItem(id, null, updateOperations));
}
var executor = new Microsoft.Azure.CosmosDB.BulkExecutor.BulkExecutor(_client, _collection);
await executor.InitializeAsync();
var executeResult = await executor.BulkUpdateAsync(updateItems);
var count = executeResult.NumberOfDocumentsUpdated;
What am I missing?
If I run the bulk executor on a collection without a partition key, I get the same error. If I run it on a collection that does have one and I specify it, the bulk executor works fine.
I'm pretty sure the bulk executor API just doesn't support this right now. As a workaround, use the normal Cosmos DB API to update the documents.
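A minimal sketch of that workaround, assuming the same v2 DocumentClient the question uses and a non-partitioned collection. The class name, method name, and the databaseId/collectionId/ids parameters are hypothetical placeholders, not part of the original code. It reads each document, sets the property, and replaces the document one at a time:

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;

public static class PlayerNameUpdater
{
    // Hypothetical helper: updates "Player1Name" on each document id without BulkExecutor.
    public static async Task<int> UpdatePlayer1NameAsync(
        DocumentClient client,
        string databaseId,
        string collectionId,
        IEnumerable<string> ids,
        string name)
    {
        int updated = 0;
        foreach (string id in ids)
        {
            Uri documentUri = UriFactory.CreateDocumentUri(databaseId, collectionId, id);

            // No PartitionKey is passed in RequestOptions because the collection
            // in the question has no partition key defined.
            ResourceResponse<Document> read = await client.ReadDocumentAsync(documentUri);
            Document document = read.Resource;

            // Change the single property, then write the whole document back.
            document.SetPropertyValue("Player1Name", name);
            await client.ReplaceDocumentAsync(document);
            updated++;
        }
        return updated;
    }
}
```

This costs one read and one replace per document, so it is likely slower and more expensive in RUs than a bulk update would be; it is only meant as a stopgap.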

Multiple partitions in COSMOS DB collection

1) I have a Cosmos DB collection with about 500k documents, partitioned by the property "SITEID". In the request options only one partition key value can be passed, but I have queries that need to run for several values, e.g. SITEID IN (1,2,3,4), where SITEID is the partition key.
For example, my stored procedure (SP) is as follows:
SELECT * FROM c WHERE c.SITEID IN
("SiteId1","SiteId2","SiteId3","SiteId4","SiteId5")
AND c.STATUS IN ("Status1","Status2","Status3","Status4")
I am calling the above SP using the SQL API code below.
await client.ExecuteStoredProcedureAsync<string>(UriFactory.CreateStoredProcedureUri("DBName", "CollectionName", "Sample"),new RequestOptions { PartitionKey = new PartitionKey("SiteId1") })
In the code above, the PartitionKey property accepts only a single value, but I need to pass several partition key values. Is there any way to do this?
2) The EnableCrossPartitionQuery property is only available in FeedOptions, not in RequestOptions, and client.ExecuteStoredProcedureAsync only accepts a RequestOptions parameter, not FeedOptions. I need to execute a stored procedure once, across all partitions. Is there any way to pass EnableCrossPartitionQuery to the ExecuteStoredProcedureAsync method?
For example:
client.CreateDocumentQuery<Doc>(UriFactory.CreateDocumentCollectionUri("DBName", "CollectionName"), "select * from c", new FeedOptions { EnableCrossPartitionQuery = true }).ToList()
await client.ExecuteStoredProcedureAsync<string>(UriFactory.CreateStoredProcedureUri("DBName", "CollectionName", "Sample"),new RequestOptions { PartitionKey = new PartitionKey("WGC") })
Stored procedures can only be executed against a single partition. There is nothing you can do about that.
They are not considered a query that returns a feed, but a request that could return a response of any type. That's why they don't use FeedOptions but rather RequestOptions.
You can still execute your query as a normal document query and set the EnableCrossPartitionQuery to true. Cosmos should recognise the partition key in the query and should limit the requests to the specific partition key values.
I say should because this answer suggests that this is the case but there are some comments that say otherwise. I would suggest you check your metrics regarding the amount of collection hits.
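Here is a rough sketch of running the same filter as a normal document query and adding up the request charge, so you can see what the cross-partition query actually costs. It assumes the v2 DocumentClient and the database and collection names from the question; the class and method names are made up for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Microsoft.Azure.Documents.Client;
using Microsoft.Azure.Documents.Linq;

public static class SiteQuery
{
    // Hypothetical helper: runs the question's filter as a cross-partition document query.
    public static async Task<List<dynamic>> QuerySitesAsync(DocumentClient client)
    {
        Uri collectionUri = UriFactory.CreateDocumentCollectionUri("DBName", "CollectionName");
        string sql =
            "SELECT * FROM c WHERE c.SITEID IN " +
            "(\"SiteId1\",\"SiteId2\",\"SiteId3\",\"SiteId4\",\"SiteId5\") " +
            "AND c.STATUS IN (\"Status1\",\"Status2\",\"Status3\",\"Status4\")";

        IDocumentQuery<dynamic> query = client.CreateDocumentQuery<dynamic>(
            collectionUri,
            sql,
            new FeedOptions { EnableCrossPartitionQuery = true }).AsDocumentQuery();

        var results = new List<dynamic>();
        double totalCharge = 0;
        while (query.HasMoreResults)
        {
            // Each page reports the RUs it consumed.
            FeedResponse<dynamic> page = await query.ExecuteNextAsync<dynamic>();
            totalCharge += page.RequestCharge;
            results.AddRange(page);
        }

        Console.WriteLine($"Documents: {results.Count}, request charge: {totalCharge} RUs");
        return results;
    }
}
```

Comparing the total charge against the same query with a single SITEID value gives a hint as to whether the requests are really limited to the listed partition key values.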

Use Traversal to query Azure Cosmos DB graph

I am trying to use a Traversal to query an Azure Cosmos DB graph as follows:
val cluster = Cluster.build(File("remote.yaml")).create()
val client = cluster.connect()
val graph = EmptyGraph.instance()
val g = graph.traversal().withRemote(DriverRemoteConnection.using(cluster))
val traversal = g.V().count()
val aliased = client.alias("g")
val result = aliased.submit(traversal)
val resultList = result.all().get()
resultList.forEach { println(it) }
The problem is that execution hangs after result.all().get() and I never get a response. I only have this problem when submitting a traversal. When submitting a Gremlin query string directly, it works properly.
I'm in a similar boat, but according to the recent question "Does Cosmos DB support Gremlin.net c# GLV?", traversals are not possible just yet. However, for those using (or thinking about using) Gremlin.NET to connect to Cosmos, I'll share some of what I've been able to do.
Firstly, I have no trouble connecting to Cosmos from the Gremlin console; the problem only occurs when using Gremlin.NET, as follows:
var gremlinServer = new GremlinServer(hostname, port, enableSsl: true,
    username: "/dbs/" + database + "/colls/" + collection,
    password: authKey);
var driver = new DriverRemoteConnection(new GremlinClient(gremlinServer));
//var driver = new DriverRemoteConnection(new GremlinClient(new GremlinServer("localhost", 8182)));
var graph = new Gremlin.Net.Structure.Graph();
var g = graph.Traversal().WithRemote(driver);
g.V().Drop().Next(); // nullreferenceexception
When using Gremlin.NET to work with a locally hosted gremlin server (see commented out line), all works fine.
The only way I can work with Cosmos using Gremlin.NET is to submit queries as string literals, e.g.:
var task = gremlinClient.SubmitAsync<dynamic>("g.V().Drop()");
This works, but I want to be able to use fluent traversals.
I can work with Cosmos quite easily using the Azure Graph API (DocumentClient etc.), but still only with string literals. Also, this isn't very portable, and it is apparently slower too.

Document count query ignores PartitionKey

I am looking to get the count of all documents in a chosen partition. The following code however will return the count of all documents in the collection and costs 0 RU.
var collectionLink = UriFactory.CreateDocumentCollectionUri(databaseId, collectionId);
string command = "SELECT VALUE COUNT(1) FROM Collection c";
FeedOptions feedOptions = new FeedOptions()
{
    PartitionKey = new PartitionKey(BuildPartitionKey(contextName, domainName)),
    EnableCrossPartitionQuery = false
};
var count = client.CreateDocumentQuery<int>(collectionLink, command, feedOptions)
    .ToList()
    .First();
Adding a WHERE c.partition = 'blah' clause to the query works, but costs 3.71 RUs with 11 documents in the collection.
Why would the above code snippet return the count of the whole collection, and is there a better way to get the count of all documents in a chosen partition?
If the query includes a filter against the partition key, like SELECT * FROM c WHERE c.city = "Seattle", it is routed to a single partition. If the query does not have a filter on partition key, then it is executed in all partitions, and results are merged client side.
You could check the logical steps the SDK performs when a query is issued to Azure Cosmos DB in this official doc.
If the query is an aggregation like COUNT, the counts from individual partitions are summed to produce the overall count.
So when you just use SELECT VALUE COUNT(1) FROM Collection c, it is executed in all partitions and the results are merged client side.
If you want the count of all documents in a chosen partition, just add the WHERE c.partition = 'XX' filter.
Hope it helps you.
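A minimal sketch of the filtered count, assuming the v2 DocumentClient from the question and that the partition key property is called partition, as in the question's WHERE clause. The class and method names are hypothetical. It passes the partition key value as a query parameter instead of concatenating it into the SQL string:

```csharp
using System;
using System.Linq;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;

public static class PartitionCounter
{
    // Hypothetical helper: counts the documents whose partition key equals partitionKeyValue.
    public static int CountInPartition(
        DocumentClient client, Uri collectionLink, string partitionKeyValue)
    {
        // Filter on the partition key property so the query is routed to one partition.
        var querySpec = new SqlQuerySpec(
            "SELECT VALUE COUNT(1) FROM c WHERE c.partition = @pk",
            new SqlParameterCollection { new SqlParameter("@pk", partitionKeyValue) });

        var feedOptions = new FeedOptions
        {
            PartitionKey = new PartitionKey(partitionKeyValue)
        };

        return client.CreateDocumentQuery<int>(collectionLink, querySpec, feedOptions)
            .AsEnumerable()
            .First();
    }
}
```

Setting PartitionKey in FeedOptions as well mirrors the question's code; the WHERE filter is the part this answer relies on.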
I believe this is actually a bug since I am having the same problem with the partition key set in both the query and the FeedOptions.
A similar issue has been reported here:
https://github.com/Azure/azure-cosmos-dotnet-v2/issues/543
And Microsoft's response makes it sound like it is an SDK issue that is x64-specific.
