LINQ CosmosDB MongoDB API upsert E11000 duplicate key error collection - azure-cosmosdb

I have just started trying out Cosmos DB with the MongoDB API, and my application is quite simple: it listens on a message queue and stores the incoming data in the database. The data might already be stored and need updating, so I do an upsert.
The problem is that the update fails with a duplicate key error. I have tried to read about this but haven't found any documentation. What I did find out is that you shouldn't set the id when doing the update, which I find hard to do.
This is the code I have:
await Ctx.ReplaceOneAsync(d => d.Id == importedData.Id, importedData, new UpdateOptions { IsUpsert = true });
And this is the error I get:
A write operation resulted in an error.
E11000 duplicate key error collection: test Failed _id or unique key constraint
A bulk write operation resulted in one or more errors.
How do I do an update based on the id when using linq?

Related

Logic app Delete document cosmos db gives 404 error

I am trying to delete a document from Cosmos DB using a Logic App. The id exists in the database, and I have provided the partition key as well, but I still get a 404 error.
Can someone assist in adding a delete document function in a Logic App?
You need to pass the actual partition key value rather than "id", which is the partition key path.
Change it to "fd39d4c4........." and it should work.

How to delete all data in a partition?

I have a CosmosDB collection with a number of different partitions. I want to delete all of the data in one of the partitions so I tried to run the command:
db.myCollection.deleteAll({PartitionKey: 'pop-9q'})
Where PartitionKey is the field that I partition/shard based on. But when I execute this it returns the not very helpful message:
ERROR: An Error has occurred
Why would I be getting this message and how can I either get more details on the cause or find a resolution?
At this time, you cannot perform a bulk delete. Please upvote and comment on this feature request: Add the ability to delete ALL data in a partition
Additionally, which API are you consuming? For the Gremlin API you could execute something like the following: g.V().drop()
The Microsoft.Azure.Cosmos SDK has added this ability. It is currently available only as a preview feature, which requires you to opt in via the portal.
See here for more details:
https://learn.microsoft.com/en-us/azure/cosmos-db/nosql/how-to-delete-by-partition-key?tabs=dotnet-example
Sample code included there:
// Get reference to the container
var container = cosmosClient.GetContainer("DatabaseName", "ContainerName");
// Delete by logical partition key
ResponseMessage deleteResponse = await container.DeleteAllItemsByPartitionKeyStreamAsync(new PartitionKey("Contoso"));
if (deleteResponse.IsSuccessStatusCode) {
    Console.WriteLine($"Delete all documents with partition key operation has successfully started");
}
As @Mike said, a "delete all data" feature is not yet supported in the Cosmos DB SQL API or Mongo API. I noticed that you have already commented on the link above. As a workaround for the SQL API, you can use a bulk delete stored procedure.
(sample code: https://gist.github.com/deepumi/2a23c5380202bddf0b85e83baf5833be)
For the Mongo API, unfortunately, stored procedures are not supported either. You could create an Azure HTTP trigger function that runs the bulk delete code whenever you need it, or merge that code into your program.
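For the Mongo API route, the bulk delete can be written as an ordinary filtered delete issued from application code. The following is a minimal sketch, assuming a collection object from the official MongoDB Node.js driver (whose Collection exposes deleteMany) and assuming the shard key field is named PartitionKey as in the question. The deletePartition helper name is hypothetical, and on a large partition the call may need to be retried or split into batches if requests are throttled.

```javascript
// Hypothetical helper: delete every document whose shard key field matches one value.
// `collection` is assumed to be a MongoDB Node.js driver Collection object.
async function deletePartition(collection, partitionKeyValue) {
  // deleteMany removes all documents that match the filter.
  const result = await collection.deleteMany({ PartitionKey: partitionKeyValue });
  // deletedCount reports how many documents were removed.
  return result.deletedCount;
}
```

Inside an HTTP-triggered function this could be called as deletePartition(db.collection("myCollection"), "pop-9q"), where db is an already connected database handle (an assumption here).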

Determine if Cosmos DB NotFound due to missing collection vs. document

Is there a way to programmatically determine from a DocumentClientException where StatusCode == HttpStatusCode.NotFound whether it was the document, the collection, or the database that was not found?
I'm trying to figure out whether I can implement on-demand collection provisioning and only call DocumentClient.CreateDocumentCollectionIfNotExistsAsync when I need to. I'm trying to avoid calling it before making every request (presumably this adds an extra network roundtrip to every request). Likewise, I'm trying to avoid calling it on error recovery when I know it won't help.
From experimentation with the local emulator, the only field I see varying in these three cases is DocumentClientException.Error.Message, and only when the database cannot be found. I generally try to avoid exception dispatching based on human-readable messages.
Wrong database name:
StatusCode: HttpStatusCode.NotFound
Error.Message: {\"Errors\":[\"Owner resource does not exist\"]}...
Correct database name, wrong collection name:
StatusCode: HttpStatusCode.NotFound
Error.Message: {\"Errors\":[\"Resource Not Found\"]}...
Correct database name, correct collection name, incorrect document ID:
StatusCode: HttpStatusCode.NotFound
Error.Message: {\"Errors\":[\"Resource Not Found\"]}...
I'm planning to use a database with its own offer. Since collections inside a database with its own offer are cheap, I'm trying to see whether I can segregate each tenant in my multi-tenant application into its own collection. Each tenant ends up having a different indexing and default TTL policy. The set of collections is not fixed and changes dynamically during runtime as new tenants sign up. I cannot predict when I will need to add a new collection. There's no new tenant notification: I just get a request that I need to handle by creating a document in a possibly non-existent collection. There's a process to garbage collect unused collections.
I'm using the NuGet package Microsoft.Azure.DocumentDB.Core Version 1.9.1 in a .NET Core 2.1 app targeting a SQL API Cosmos DB instance.
If you look at the Message property in detail, you should see one of the following strings, which indicates whether the 404 Not Found response was caused by a missing document or a missing collection:
ResourceType: Document
ResourceType: Collection
It's not ideal, but you can try to extract this information from the error message with a regex.
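The regex approach suggested above could look roughly like this. It is only a sketch: it assumes the message contains a "ResourceType: <name>" token exactly as quoted in this answer, and the notFoundResourceType helper name is hypothetical. Since the token is part of a human-readable message, its format may change between SDK versions.

```javascript
// Hypothetical helper: extract the resource type named in a NotFound error message.
// Returns e.g. "Document" or "Collection", or null when no such token is present.
function notFoundResourceType(message) {
  // Match the literal "ResourceType:" followed by a single word.
  const match = /ResourceType:\s*(\w+)/.exec(message || "");
  return match ? match[1] : null;
}
```

A caller could then provision the collection only when the result is "Collection", and treat a null result as "unknown" rather than guessing.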

CosmosDB : How to apply concurrency while inserting a document (in parallel requests)

Background:
We have an Event Hub where thousands of events are logged every day. An Azure Function is configured to trigger on this Event Hub when new messages arrive. The function performs the following two tasks:
Write the raw message into Document DB (collection 1).
Upsert a summary (aggregated) message into collection 2 of Document DB. Before writing, it checks whether a summary message already exists for the partition key and unique id (not id). If a document exists, it updates that document with the new aggregated value; if not, it inserts a new document. The unique id is generated by business logic.
Problem Statement:
More than one summary document is getting created for a PartitionKey and unique Id
Scenario Details
Say that for PartitionKey PartitionKey1 no summary document has yet been created in the collection for the computed unique key.
Multiple messages (say 2) arrive at the Event Hub and trigger the Azure Function.
Both requests run concurrently. Since the query finds no existing document, each request builds a new summary message. The upsert is then invoked at almost the same time by both requests, which results in multiple summary documents for one PartitionKey and unique id.
I've searched and read about optimistic concurrency, which I will definitely implement for the update scenario, but I could not find any way to handle the insert scenario.
Based on your description, I suggest using a stored procedure to achieve this.
Cosmos DB guarantees ACID for all operations that are part of a single stored procedure.
As the official documentation says: If the collection the stored procedure is registered against is a single-partition collection, then the transaction is scoped to all the documents within the collection. If the collection is partitioned, then stored procedures are executed in the transaction scope of a single partition key. Each stored procedure execution must then include a partition key value corresponding to the scope the transaction must run under.
For more information about Cosmos DB stored procedures and how to create them, refer to:
Azure Cosmos DB server-side programming: Stored procedures, database triggers, and UDFs
Create and use stored procedures using C#
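As a rough illustration of the suggestion above, the check-then-write logic can be moved into one stored procedure, so that the query and the write run in the same transaction scope. This is a sketch only: Cosmos DB stored procedures are written in JavaScript and use the server-side getContext() API, but the field names uniqueId and total, the merge rule (adding totals), and the upsertSummary name are assumptions made for illustration.

```javascript
// Stored procedure sketch: create the summary document if none exists for the
// given uniqueId, otherwise merge the incoming value into the existing one.
// Field names `uniqueId` and `total` are hypothetical.
function upsertSummary(incoming) {
  var collection = getContext().getCollection();
  var response = getContext().getResponse();
  var link = collection.getSelfLink();

  var query = {
    query: "SELECT * FROM c WHERE c.uniqueId = @uniqueId",
    parameters: [{ name: "@uniqueId", value: incoming.uniqueId }]
  };

  // Called after either write completes; returns the stored document to the caller.
  function done(err, doc) {
    if (err) throw err;
    response.setBody(doc);
  }

  var accepted = collection.queryDocuments(link, query, {}, function (err, docs) {
    if (err) throw err;
    var writeAccepted;
    if (docs.length === 0) {
      // No summary yet for this uniqueId: insert the incoming document.
      writeAccepted = collection.createDocument(link, incoming, {}, done);
    } else {
      // A summary exists: merge the new value and replace the stored document.
      var existing = docs[0];
      existing.total += incoming.total;
      writeAccepted = collection.replaceDocument(existing._self, existing, {}, done);
    }
    if (!writeAccepted) throw new Error("Write was not accepted by the server");
  });

  if (!accepted) throw new Error("Query was not accepted by the server");
}
```

The procedure has to be executed with the partition key value of the summary document, since, as quoted above, the transaction is scoped to a single partition key.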

Cosmos DB: Getting document id from exception upon conflict

Cosmos DB returns a DocumentClientException with a status of 409 Conflict when trying to insert a document with unique keys that match an existing document (cf. unique keys, exceptions).
Is it possible to get the id of the document from the exception without querying the db again?
