Determine if Cosmos DB NotFound due to missing collection vs. document

Is there a way to programmatically determine from a DocumentClientException where StatusCode == HttpStatusCode.NotFound whether it was the document, the collection, or the database that was not found?
I'd like to implement on-demand collection provisioning and only call DocumentClient.CreateDocumentCollectionIfNotExistsAsync when I actually need to. I want to avoid calling it before every request (presumably that adds an extra network round trip to every request), and likewise I want to avoid calling it during error recovery when I know it won't help.
From experimentation with the local emulator, the only field I see varying in these three cases is DocumentClientException.Error.Message, and only when the database cannot be found. I generally try to avoid exception dispatching based on human-readable messages.
Wrong database name:
StatusCode: HttpStatusCode.NotFound
Error.Message: {"Errors":["Owner resource does not exist"]}...
Correct database name, wrong collection name:
StatusCode: HttpStatusCode.NotFound
Error.Message: {"Errors":["Resource Not Found"]}...
Correct database name, correct collection name, incorrect document ID:
StatusCode: HttpStatusCode.NotFound
Error.Message: {"Errors":["Resource Not Found"]}...
I'm planning to use a database with its own offer. Since collections inside a database with its own offer are cheap, I'm trying to see whether I can segregate each tenant in my multi-tenant application into its own collection. Each tenant ends up having a different indexing and default TTL policy. The set of collections is not fixed and changes dynamically during runtime as new tenants sign up. I cannot predict when I will need to add a new collection. There's no new tenant notification: I just get a request that I need to handle by creating a document in a possibly non-existent collection. There's a process to garbage collect unused collections.
I'm using the NuGet package Microsoft.Azure.DocumentDB.Core Version 1.9.1 in a .NET Core 2.1 app targeting a SQL API Cosmos DB instance.
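For concreteness, the kind of per-tenant collection provisioning I have in mind looks roughly like this (all names and policy values are purely illustrative):

using System;
using System.Threading.Tasks;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;

public static class TenantCollections
{
    // Illustrative only: provisions a per-tenant collection with a tenant-specific
    // indexing policy and default TTL (the values here are placeholders).
    public static Task ProvisionTenantCollectionAsync(DocumentClient client, string databaseId, string tenantId)
    {
        var tenantCollection = new DocumentCollection
        {
            Id = tenantId,                          // one collection per tenant
            DefaultTimeToLive = 60 * 60 * 24 * 30,  // e.g. 30 days; varies per tenant
            IndexingPolicy = new IndexingPolicy
            {
                Automatic = true,
                IndexingMode = IndexingMode.Consistent
                // tenant-specific included/excluded paths would go here
            }
        };

        return client.CreateDocumentCollectionIfNotExistsAsync(
            UriFactory.CreateDatabaseUri(databaseId),
            tenantCollection);
    }
}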

If you look at the Message property in detail, you should see the following strings, which indicate whether the 404 Not Found response was generated for a Document or a Collection.
ResourceType: Document
ResourceType: Collection
It's not ideal, but you can try to extract this information from the error message with a regex.
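Building on that, here is a rough sketch of the on-demand provisioning flow described in the question, using the same Microsoft.Azure.DocumentDB.Core SDK. The helper name and retry policy are illustrative, and the substring check is exactly the fragile message matching discussed above, assuming the 404 from the document write carries the same ResourceType hint:

using System;
using System.Net;
using System.Threading.Tasks;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;

public static class OnDemandCollections
{
    // Tries the document write first; only provisions the collection when the 404
    // looks collection-related, then retries the write once.
    public static async Task CreateDocumentWithProvisioningAsync(
        DocumentClient client, string databaseId, string collectionId, object document)
    {
        Uri collectionUri = UriFactory.CreateDocumentCollectionUri(databaseId, collectionId);
        try
        {
            await client.CreateDocumentAsync(collectionUri, document);
        }
        catch (DocumentClientException ex) when (
            ex.StatusCode == HttpStatusCode.NotFound &&
            ex.Message.Contains("ResourceType: Collection"))   // fragile message matching, as noted above
        {
            // Tenant-specific IndexingPolicy / DefaultTimeToLive would be set here,
            // as sketched in the question.
            await client.CreateDocumentCollectionIfNotExistsAsync(
                UriFactory.CreateDatabaseUri(databaseId),
                new DocumentCollection { Id = collectionId });
            await client.CreateDocumentAsync(collectionUri, document);
        }
    }
}

The retry is attempted only once, so a second NotFound (for example a missing database) still surfaces to the caller.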


CosmosDB Container without PartitionKey

I'm using Azure Cosmos DB .NET SDK Version 3.0 and I want to create a container programmatically without a partition key. Is it possible? I always get an error saying Value cannot be null.
Parameter name: partitionKey
I'm using the method CosmosContainers.CreateContainerIfNotExistsAsync.
I can always reproduce your issue on my side.
The exception is thrown inside CreateContainerIfNotExistsAsync: if you decompile the SDK assembly, you can find the detailed logic and the null check on partitionKey.
It seems we can't get around this check, because the Cosmos DB team is planning to deprecate the ability to create non-partitioned containers, as they do not allow you to scale elastically (mentioned in my previous answer: Is it still a good idea to create a Cosmos DB collection without a partition key?).
But you can still create non-partitioned containers with the older DocumentDB .NET package or the REST API.
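For example, a minimal sketch with the older Microsoft.Azure.DocumentDB SDK might look like the following; the endpoint, key, and names are placeholders, and because no PartitionKey definition is supplied on the DocumentCollection, a fixed (non-partitioned) collection is created:

using System;
using System.Threading.Tasks;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;

public static class FixedContainers
{
    // No PartitionKey path is defined on the DocumentCollection,
    // so a fixed (non-partitioned) collection is created.
    public static async Task CreateFixedCollectionAsync(string endpoint, string key)
    {
        using (var client = new DocumentClient(new Uri(endpoint), key))
        {
            await client.CreateDocumentCollectionIfNotExistsAsync(
                UriFactory.CreateDatabaseUri("MyDatabase"),
                new DocumentCollection { Id = "MyFixedContainer" },
                new RequestOptions { OfferThroughput = 400 });
        }
    }
}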

How to delete all data in a partition?

I have a CosmosDB collection with a number of different partitions. I want to delete all of the data in one of the partitions so I tried to run the command:
db.myCollection.deleteAll({PartitionKey: 'pop-9q'})
Where PartitionKey is the field that I partition/shard based on. But when I execute this it returns the not very helpful message:
ERROR: An Error has occurred
Why would I be getting this message and how can I either get more details on the cause or find a resolution?
At this time you are unable to perform a bulk delete. Please upvote and comment on this feature request: Add the ability to delete ALL data in a partition
Additionally, which API are you consuming? For the Gremlin API you could execute something like the following: g.V().drop()
The Microsoft.Azure.Cosmos SDK has added this ability. It is currently only available as a preview feature, which requires you to opt in via the portal.
See here for more details:
https://learn.microsoft.com/en-us/azure/cosmos-db/nosql/how-to-delete-by-partition-key?tabs=dotnet-example
Sample code included there:
// Get reference to the container
var container = cosmosClient.GetContainer("DatabaseName", "ContainerName");

// Delete by logical partition key
ResponseMessage deleteResponse = await container.DeleteAllItemsByPartitionKeyStreamAsync(new PartitionKey("Contoso"));

if (deleteResponse.IsSuccessStatusCode)
{
    Console.WriteLine("Delete all documents with partition key operation has successfully started");
}
As @Mike said, a "delete all data" feature is not yet supported in the Cosmos DB SQL API or Mongo API. I notice you have already added comments on the feedback link above. As a workaround for the SQL API, you can use a bulk-delete stored procedure.
(sample code: https://gist.github.com/deepumi/2a23c5380202bddf0b85e83baf5833be)
For the Mongo API, unfortunately, even stored procedures are not supported. You could create an Azure HTTP-triggered Function that executes bulk-delete code whenever you want, or merge that code into your own application.
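If you take the stored-procedure route with the v3 SDK, the call might look roughly like the sketch below. The stored procedure id ("bulkDelete") and its query parameter are assumptions modeled on typical bulk-delete procedures such as the gist above, and in practice the procedure usually has to be invoked repeatedly until nothing is left to delete:

using System;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;

public static class PartitionCleanup
{
    // Assumes a bulk-delete stored procedure has already been registered
    // on the container under the id "bulkDelete".
    public static async Task DeletePartitionAsync(Container container, string partitionKeyValue)
    {
        // The stored procedure runs inside the logical partition identified by this key
        // and deletes the documents matched by the query, one page at a time.
        var response = await container.Scripts.ExecuteStoredProcedureAsync<dynamic>(
            "bulkDelete",
            new PartitionKey(partitionKeyValue),
            new dynamic[] { "SELECT c._self FROM c" });

        Console.WriteLine($"Stored procedure finished with status {response.StatusCode}");
    }
}

In the question's example the partition key value would be "pop-9q".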

How Are Objects Synced in a Shared Realm in Swift

After scouring the documentation, I recently learned that a shared realm (globally available to all users of my app) can only be queried with Realm.asyncOpen. For example, I have a /shared realm that has read-only access to any user. I tried querying it in the usual way, but it returned zero objects. But if I query it like this, it works:
Realm.asyncOpen(configuration: sharedConfig) { realm, error in
    if let realm = realm {
        // Realm successfully opened
        self.announcements = realm.objects(Announcement.self)
        print(self.announcements)
        self.tableView.reloadData()
    } else if let error = error {
        print(error)
    }
}
This method is visibly slower than a usual realm query since it appears to be fetching the data from the server instead of a local, already-synced realm.
Does this mean that the objects pulled down are never stored in the local copy of the realm, but are queried from the ROS each time I access them?
In other words, are shared realms pulled and not synced?
a shared realm (globally available to all users of my app) can only be queried with Realm.asyncOpen
This is incorrect. If a user only has read-only access to a Realm, it must be obtained with Realm.asyncOpen. That's explicitly what the documentation you linked to states.
This method is visibly slower than a usual realm query since it appears to be fetching the data from the server instead of a local, already-synced realm.
Almost correct. Yes data is fetched from the server, but not the whole Realm from scratch. Only the new data since the last time the Realm was synced with your local copy.
Does this mean that the objects pulled down are never stored in the local copy of the realm, but are queried from the ROS each time I access them?
This synced Realm is persisted locally and will be preserved across application launches.
In other words, are shared realms pulled and not synced?
No.
Taking a step back, let's explain what's happening here.
The reason why you get a "permission denied" error if you attempt to open a read-only synced Realm synchronously is that upon initialization, a local Realm file will be created, performing write operations to write the Realm's schema (i.e. create db tables, columns & metadata) immediately. However, since the user does not have write access to the Realm, the Realm Object Server (ROS) rejects the changes and triggers your global error handler notifying you that an illegal attempt to modify the file was made by your user.
The reason why this doesn't happen with asyncOpen is that it's an asynchronous operation and therefore doesn't need to give you a valid Realm immediately, so it doesn't need to "bootstrap" it by writing the schema to it. Instead, it requests the latest state of the Realm from ROS and vends it back to you once it's fully available in its latest state at the point in time at which the call was started.
That being said, if the local copy of the Realm already has its schema initialized (i.e. after a successful asyncOpen call), and the in-memory schema defined by either the default schema or the custom object types specified in Realm.Configuration hasn't changed, then no schema will be attempted to be written to the file.
This means that any time after a successful asyncOpen call, the Realm could be accessed synchronously without going through asyncOpen as long as you're ok with potentially not having the most up to date data from ROS.
So in your case, it appears as though you only want to use asyncOpen for the very first access to the Realm, so you could persist that state (using another Realm, or NSUserDefaults) and check it to determine whether to open the Realm asynchronously or synchronously.

Symfony - Log runnable native queries when the database is down

I'm working on a Symfony app that provides a REST web service (simple HTTP requests with JSON).
The service checks some rules and inserts a few rows into two MySQL tables (write only).
For optimization reasons, even though the Doctrine bundle is available, I use native MySQL queries (with bound parameters) to insert these rows.
My need is: if for any reason the database is not available, write the runnable queries into a log file.
The goal is that when the database comes back, I want to be able to execute the file's content directly against the database.
Note that there are no unique constraints (the PK is a generated UUID) and no locks or transactions to handle (simple insert statements).
I wrote a custom SQLLogger, but when $connection->insert(...) is called, the connection fails before the logger is called.
So my question is: is there a way to get the final query (with bound parameters) without a database connection?
Or should I rewrite the mechanism that binds parameters into the query and log it myself when the database is not available?
Best regards,
Julien
Since the final query with its parameters is built by the database, there is simply no way to build the query in PHP and be guaranteed that it is the same query the database would execute.
The only way is to build the query without bound parameters, but this is clearly not a good practice.
So I finally decided to store the whole JSON (API request body) in a file when the database is not available.
When the database comes back, instead of replaying SQL queries, I can replay the original HTTP requests.
Hope this late self-answer helps someone.
Best regards.

SQL Server load balancing: optimizing hits or optimizing the query

When we developers write data access code, what should we really worry about if the application is to scale well and handle the load/hits?
Given this simple problem, how would you solve it in a scalable manner?
1. ProjectResource is a class (encapsulating the resources assigned to a Project)
2. Each resource assigned to a Project is an instance of the User class
3. Each User in the Project also has a ReportingHead and a ProjectManager, who are also instances of User
4. Finally, there is a Project class containing the project details
Legend of classes used: User, Project, ProjectResource

Table diagram (ProjectResource table):
ResourceId
ProjectId
UserId
ReportingHead
ProjectManager

Class diagram (ProjectResource class):
ResourceId : String / Guid
Project : Project
User : User
ReportingHead : User
ProjectManager : User

Note:
All the user information is stored in the User table
All the project information is stored in the Project table
Here's the Problem
When the application requests the Resources in a Project, the operations below are followed:
First, get the records for the Project
Get the UserId and make a request (using the Users DAL) to get the User instance
Get the ProjectId and make a request (using the Projects DAL) to get the Project information
Finally, assign the Users and Project to the ProjectResource instance
Clearly, you can see 3 DB calls are made here to populate a single ProjectResource, but the concerns and who manages the objects are clearly defined. This is the way I have planned to do it, since connection pooling is also available in SQL Server & ADO.NET.
There is also another way, where all the details are retrieved in a single hit using table inner joins and then populated.
Which way should I really be taking, and why?
Extras:
.NET 2.0, ASP.NET 2.0, C#, SQL Server 2005, DB on the same machine hosting the application.
For best performance and scalability, you should minimize the number of round-trips to the DB. To prove that to yourself, just run some benchmarks; it becomes clear very quickly.
One approach to a single round-trip is to use joins. Another is to return multiple result sets. The latter can be helpful in eliminating possible duplicate data.
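As a rough illustration of the multiple-result-set approach with plain ADO.NET (the table and column names come from the diagrams above; the SQL, the Guid key type, and the class shape are otherwise assumptions for this sketch):

using System;
using System.Data;
using System.Data.SqlClient;

public class ProjectResourceLoader
{
    // One round trip returns three result sets: the ProjectResource row,
    // the related Project row, and the related User rows.
    public void Load(string connectionString, Guid resourceId)
    {
        const string sql = @"
SELECT ResourceId, ProjectId, UserId, ReportingHead, ProjectManager
FROM ProjectResource WHERE ResourceId = @ResourceId;

SELECT p.* FROM Project p
JOIN ProjectResource pr ON pr.ProjectId = p.ProjectId
WHERE pr.ResourceId = @ResourceId;

SELECT u.* FROM [User] u
JOIN ProjectResource pr ON u.UserId IN (pr.UserId, pr.ReportingHead, pr.ProjectManager)
WHERE pr.ResourceId = @ResourceId;";

        using (SqlConnection connection = new SqlConnection(connectionString))
        using (SqlCommand command = new SqlCommand(sql, connection))
        {
            command.Parameters.Add("@ResourceId", SqlDbType.UniqueIdentifier).Value = resourceId;
            connection.Open();

            using (SqlDataReader reader = command.ExecuteReader())
            {
                while (reader.Read()) { /* populate the ProjectResource fields */ }

                reader.NextResult();  // second result set: the Project
                while (reader.Read()) { /* populate the Project instance */ }

                reader.NextResult();  // third result set: the Users (resource, reporting head, project manager)
                while (reader.Read()) { /* populate the User instances, de-duplicating by UserId */ }
            }
        }
    }
}

Compared to the three separate DAL calls, the trade-off is a single round trip versus keeping each DAL responsible for its own objects, which is exactly the tension described in the question.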
