Partition key path /id is invalid for Gremlin API - azure-cosmosdb

I have an Azure function that runs with a CosmosDBTrigger. It points correctly to my target database and collection. The CreateLeaseCollectionIfNotExists is set to true and the LeaseCollectionName is set to leases. When the function is started I receive this error:
Error indexing method '***'
Microsoft.Azure.WebJobs.Host.Indexers.FunctionIndexingException :
Error indexing method '***' ---> System.InvalidOperationException :
Cannot create Collection Information for *** in database
*** with lease leases in database *** : Partition
key path /id is invalid for Gremlin API. The path cannot be '/id',
'/label' or a nested path such as '/key/path'
It seems like Azure is creating the leases graph with '/id' as the partition key. Where did I go wrong?

The Azure Functions Cosmos DB Trigger documentation says it only works on SQL API accounts: https://learn.microsoft.com/azure/azure-functions/functions-bindings-cosmosdb-v2#supported-apis
In particular, the Trigger uses the Change Feed Processor library, which was designed to work with SQL API accounts. It relies on a lease collection that must be partitioned by /id, which Gremlin API accounts do not allow.

Related

How to utilize automatic indexing in CosmosDB/Cassandra API?

The Cosmos DB FAQ says in its Cassandra API section that Azure Cosmos DB provides automatic indexing of all attributes without any schema definition. https://learn.microsoft.com/en-us/azure/cosmos-db/faq#does-this-mean-i-dont-have-to-create-more-than-one-index-to-satisfy-the-queries-1
But when I try to add a WHERE column1 = 'x' filter to my CQL query, I get an exception from the DataStax Cassandra driver saying that data filtering is not supported. I tried to bypass the client driver by supplying ALLOW FILTERING, but this time I got an error from the Cosmos server saying this feature is not implemented.
So, if automatic indexing is implemented for Cosmos/Cassandra API, how can it be used?

How to delete all data in a partition?

I have a CosmosDB collection with a number of different partitions. I want to delete all of the data in one of the partitions so I tried to run the command:
db.myCollection.deleteAll({PartitionKey: 'pop-9q'})
Where PartitionKey is the field that I partition/shard based on. But when I execute this it returns the not very helpful message:
ERROR: An Error has occurred
Why would I be getting this message and how can I either get more details on the cause or find a resolution?
At this time, you cannot perform a bulk delete. Please upvote and comment on this feature request: Add the ability to delete ALL data in a partition
Additionally, which API are you consuming? For Gremlin API you could execute something like the following: g.V().drop()
The Microsoft.Azure.Cosmos SDK has added this ability - currently only available as a preview feature (which requires you to opt-in via the portal)
See here for more details:
https://learn.microsoft.com/en-us/azure/cosmos-db/nosql/how-to-delete-by-partition-key?tabs=dotnet-example
Sample code included there:
// Get reference to the container
var container = cosmosClient.GetContainer("DatabaseName", "ContainerName");

// Delete by logical partition key
ResponseMessage deleteResponse = await container.DeleteAllItemsByPartitionKeyStreamAsync(new PartitionKey("Contoso"));

if (deleteResponse.IsSuccessStatusCode) {
    Console.WriteLine($"Delete all documents with partition key operation has successfully started");
}
As @Mike said, a "delete all data" feature is not supported yet in the Cosmos DB SQL API or Mongo API. I notice that you have already added comments at the link above. As a workaround for the Cosmos DB SQL API, you can use a bulk delete stored procedure.
(sample code: https://gist.github.com/deepumi/2a23c5380202bddf0b85e83baf5833be)
For the Mongo API, unfortunately, stored procedures are not supported either. You could create an Azure HTTP Trigger Function that runs the bulk delete code whenever you need it, or merge that code into your own program.

Determine if Cosmos DB NotFound due to missing collection vs. document

Is there a way to programmatically determine from a DocumentClientException where StatusCode == HttpStatusCode.NotFound whether it was the document, the collection, or the database that was not found?
I'm trying to figure out whether I can implement on-demand collection provisioning and only call DocumentClient.CreateDocumentCollectionIfNotExistsAsync when I need to. I'm trying to avoid calling it before making every request (presumably this adds an extra network roundtrip to every request). Likewise, I'm trying to avoid calling it on error recovery when I know it won't help.
From experimentation with the local emulator, the only field I see varying in these three cases is DocumentClientException.Error.Message, and only when the database cannot be found. I generally try to avoid exception dispatching based on human-readable messages.
Wrong database name:
StatusCode: HttpStatusCode.NotFound
Error.Message: {\"Errors\":[\"Owner resource does not exist\"]}...
Correct database name, wrong collection name:
StatusCode: HttpStatusCode.NotFound
Error.Message: {\"Errors\":[\"Resource Not Found\"]}...
Correct database name, correct collection name, incorrect document ID:
StatusCode: HttpStatusCode.NotFound
Error.Message: {\"Errors\":[\"Resource Not Found\"]}...
I'm planning to use a database with its own offer. Since collections inside a database with its own offer are cheap, I'm trying to see whether I can segregate each tenant in my multi-tenant application into its own collection. Each tenant ends up having a different indexing and default TTL policy. The set of collections is not fixed and changes dynamically during runtime as new tenants sign up. I cannot predict when I will need to add a new collection. There's no new tenant notification: I just get a request that I need to handle by creating a document in a possibly non-existent collection. There's a process to garbage collect unused collections.
I'm using the NuGet package Microsoft.Azure.DocumentDB.Core Version 1.9.1 in a .NET Core 2.1 app targeting a SQL API Cosmos DB instance.
If you look at the Message property in detail, you should see one of the following strings, which indicates whether the 404 Not Found response was generated for a Document or a Collection.
ResourceType: Document
ResourceType: Collection
It's not ideal, but you can try to extract this information from the error message with a regex.
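A minimal sketch of that regex approach, assuming the message really does contain a fragment of the form "ResourceType: Document" or "ResourceType: Collection" as described above. The class and method names (NotFoundClassifier, GetMissingResourceType) are hypothetical and not part of any SDK, and the exact message layout is not guaranteed to stay stable across SDK or service versions.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of Microsoft.Azure.DocumentDB.
public static class NotFoundClassifier
{
    // Assumption: the exception message carries "ResourceType: <name>",
    // as reported in the answer above.
    private static readonly Regex ResourceTypePattern =
        new Regex(@"ResourceType:\s*(\w+)", RegexOptions.Compiled);

    // Returns the resource type named in the message (for example
    // "Document" or "Collection"), or null when no such fragment exists.
    public static string GetMissingResourceType(string message)
    {
        if (string.IsNullOrEmpty(message))
        {
            return null;
        }

        Match match = ResourceTypePattern.Match(message);
        return match.Success ? match.Groups[1].Value : null;
    }
}
```

In a catch block for a DocumentClientException with StatusCode == HttpStatusCode.NotFound, you could pass ex.Message to GetMissingResourceType and call CreateDocumentCollectionIfNotExistsAsync only when the result is "Collection". Treat a null result as "unknown" rather than assuming the document was missing.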

How do you connect to a Cosmos Db (primarily updated via SQL API) using Gremlin.Net ? (can you?)

I'm working on a Cosmos DB app that stores both standard documents and graph documents. We save both types via the DocumentDB API, and I am able to run graph queries that return GraphSON using the DocumentClient.CreateGremlinQuery method. This GraphSON is read by a web app, which displays the graph to the user.
My issue is that I cannot choose the version of the GraphSON format returned when using the Microsoft.Azure.Graphs method. So I looked into Gremlin.Net, which, according to its documentation, offers a lot more options in this regard.
However I am finding connecting to the Cosmos Document Db using gremlin.net difficult. The server variable which you define like this :
var server = new GremlinServer("https://localhost/",8081 , enableSsl: true, username: $"/dbs/TheDatabase/colls/TheCOllection", password: "C2y6yDjf5/R+ob0N8A7Cgv30VRDJIWEHLM+4QDU5DE2nQ9nDuVTqobD4b8mGGyPMbIZnqyMsEcaGQy67XIw/Jw==");
then results in a uri that has "/gremlin" and it cannot locate the database end point.
Has anyone used Gremlin.Net to connect to a Cosmos database that was set up as a document database rather than as a graph database? The documents in it are graph/Gremlin compatible in their format, with _isEdge / label / _sink etc.
Cheers,
Mark (Document db/Gremlin/graph newbie)

Cosmos Mongodb query fails but azure storage explorer works fine?

I am trying to query a Cosmos MongoDB collection. I can connect to it fine with Robo3T, 3T Studio, and the dotnet core Mongo client (in a test harness). I can do a count of entities (db.[collection_name].count({})) on all of these platforms, but every query (db.[collection_name].find({})) fails with the following error:
Error: error: {
"_t" : "OKMongoResponse",
"ok" : 0,
"code" : 1,
"errmsg" : "Unknown server error occurred when processing this request.",
"$err" : "Unknown server error occurred when processing this request."}
Here is my sample query from Robo3T, followed by the sample .NET harness. It doesn't matter which I use; I get the same error every time.
db.wihistory.find({})
and the dotnet core code :
string connectionString = @"my connections string here";
MongoClientSettings settings = MongoClientSettings.FromUrl(
    new MongoUrl(connectionString)
);
// Force TLS 1.2 for the connection.
settings.SslSettings =
    new SslSettings() { EnabledSslProtocols = SslProtocols.Tls12 };
var mongoClient = new MongoClient(settings);
var database = mongoClient.GetDatabase("vstsagileanalytics");
var collection = database.GetCollection<dynamic>("wihistory");
// Fetch every document in the collection (this is the call that fails).
var data = collection.Find(new BsonDocument()).ToList();
System.Console.WriteLine(data.ToString());
The issue comes from mixing API usage in the account. As stated in the comments, you are using Azure Function's Cosmos DB Output binding, which uses the SQL API (.NET SDK for SQL API) to connect to the account and store data. There is a note in that documentation that says:
Don't use Azure Cosmos DB input or output bindings if you're using
MongoDB API on a Cosmos DB account. Data corruption is possible.
The documents stored through this method do not enforce certain MongoDB requirements (like the existence of a "_id" identifier) that a MongoDB client would (a MongoDB client would automatically create the "_id" if not present).
Robo3T and other Mongo clients (including the Azure Portal) are failing to correctly parse and read the stored documents as valid MongoDB documents (due to the lack of requirements like "_id") and that is the cause of the error.
You can either switch to use a Cosmos DB SQL API account if you want to maintain the Azure Functions pipeline or change the output binding and replace it with a manual implementation of a MongoDB client.
