What's the easiest way to delete an entire partition in Cosmos DB, assuming I'm using Spring Boot with the SQL API?
I have a class marked with @Repository that extends CosmosRepository, and I want to delete every item from a particular partition.
I know that with CosmosClientBuilder I could do something like:
cosmosDbClient.getDatabase(databaseName)
    .getContainer(containerName)
    .deleteAllItemsByPartitionKey(
        new PartitionKey("partitionKey0001"),
        new CosmosItemRequestOptions());
Is it possible to access the container from the repository?
I don't want to use stored procedures for something that should be easy to do.
Thank you
Microsoft has been rolling out this feature for about a year now. It is still in preview, and you will need to opt in before using it.
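To answer the "access the container from the repository" part: you don't go through the repository itself; you can autowire the CosmosClient bean that spring-data-cosmos creates and call the same method. A minimal sketch, assuming your configuration exposes a CosmosClient bean, the account has the preview delete-by-partition-key capability enabled, and with placeholder database/container names:

import com.azure.cosmos.CosmosClient;
import com.azure.cosmos.models.CosmosItemRequestOptions;
import com.azure.cosmos.models.PartitionKey;
import org.springframework.stereotype.Service;

@Service
public class PartitionCleanupService {

    private final CosmosClient cosmosClient; // bean created by your Cosmos configuration class

    public PartitionCleanupService(CosmosClient cosmosClient) {
        this.cosmosClient = cosmosClient;
    }

    // Deletes every item that shares the given partition key value.
    public void deletePartition(String partitionKeyValue) {
        cosmosClient.getDatabase("myDatabase")   // placeholder database name
            .getContainer("myContainer")         // placeholder container name
            .deleteAllItemsByPartitionKey(
                new PartitionKey(partitionKeyValue),
                new CosmosItemRequestOptions());
    }
}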
I am very new to Azure and currently looking for assistance to get my project started. I am trying to automate the process of adding alerts for our Azure Cosmos DB.
After some research, it looks like I can use Add-AzMetricAlertRuleV2 to add alerts through PowerShell. According to the MS online documentation I will need to pass $condition to add the rule to the existing resource group. Where can I get all the existing conditions that I can add to the rule, so I can pick out the one that is needed? Some people use New-AzMetricAlertRuleV2Criteria to build the value that needs to be passed as $condition, and I can see an example on the MS website: https://learn.microsoft.com/en-us/powershell/module/az.monitor/new-azmetricalertrulev2criteria?view=azps-5.0.0
However, the above document does not make clear where I can find the whole list of namespaces along with the available criteria. For example, if I want to add an alert when a new Cosmos DB is created in one of the existing resource groups using PowerShell, where can I check which namespace I can use to add the new criteria? In other words, where can I find all the available conditions along with their namespaces? Thanks.
For Cosmos DB, the metric namespace is Microsoft.DocumentDB/databaseAccounts.
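To discover the metrics available for a given resource, you can list its metric definitions and then build a criteria object from one of them. A hedged sketch (the resource ID, metric choice, and thresholds are placeholders, and output property names may vary slightly by Az.Monitor version):

# List the metric definitions available for an existing Cosmos DB account.
$resourceId = "/subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.DocumentDB/databaseAccounts/<account>"
Get-AzMetricDefinition -ResourceId $resourceId | Select-Object Name, Namespace

# Build a criteria object from one of the listed metrics and attach it to a new alert rule.
$condition = New-AzMetricAlertRuleV2Criteria `
    -MetricName "TotalRequests" `
    -MetricNamespace "Microsoft.DocumentDB/databaseAccounts" `
    -TimeAggregation Count `
    -Operator GreaterThan `
    -Threshold 100

Add-AzMetricAlertRuleV2 -Name "cosmos-requests-alert" `
    -ResourceGroupName "<rg>" `
    -TargetResourceId $resourceId `
    -Condition $condition `
    -WindowSize 00:05:00 `
    -Frequency 00:05:00 `
    -Severity 3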
I need to query a collection and return all documents that are new or updated since the last query. The collection is partitioned by userId. I am looking for a value that I can use (or create and use) that would help facilitate this query. I considered using _ts:
SELECT * FROM collection WHERE userId=[some-user-id] AND _ts > [some-value]
The problem with _ts is that it is not granular enough and the query could miss updates made in the same second by another client.
In SQL Server I could accomplish this using an IDENTITY column in another table; let's call that table version. In a transaction I would create a new row in the version table, then do the updates to the other table (including setting its version column to the new value). To query for new and updated rows I would use a query like this:
SELECT * FROM table WHERE userId=[some-user-id] and version > [some-value]
How could I do something like this in Cosmos DB? The Change Feed seems like the right option, but without the ability to query the Change Feed, I'm not sure how I would go about this.
In case it matters, the (web/mobile) clients connect to data in Cosmos DB via a web api. I have control of the entire stack - from client to back-end.
As stated in this link:
Today, you see all operations in the change feed. The functionality where you can control the change feed for specific operations, such as updates only and not inserts, is not yet available. You can add a "soft marker" on the item for updates and filter based on that when processing items in the change feed. Currently the change feed doesn't log deletes. Similar to the previous example, you can add a soft marker on the items that are being deleted; for example, you can add an attribute in the item called "deleted", set it to "true", and set a TTL on the item so that it can be automatically deleted. You can read the change feed for historic items, for example, items that were added five years ago. If the item is not deleted you can read the change feed as far back as the origin of your container.
So the change feed alone does not satisfy your requirements.
My idea:
Use an Azure Function with a Cosmos DB trigger to collect all the operations in your specific Cosmos collection. Follow this document to configure the function's Cosmos DB input, then follow this document to configure the output as Azure Queue storage.
Get the ids of changed items and send them into queue storage as messages. When you want to query the changed items, just read the messages from the queue, consume them at a specific time, and then clear the entire queue. No items will be missed.
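A minimal sketch of such a function, assuming the Azure Functions Java library binding names and placeholder database/collection/queue/connection settings:

import com.microsoft.azure.functions.ExecutionContext;
import com.microsoft.azure.functions.OutputBinding;
import com.microsoft.azure.functions.annotation.CosmosDBTrigger;
import com.microsoft.azure.functions.annotation.FunctionName;
import com.microsoft.azure.functions.annotation.QueueOutput;

public class ChangeCollector {

    // Fires for every batch of inserts/updates in the monitored collection and
    // forwards the changed documents to a storage queue.
    @FunctionName("CollectChangedIds")
    public void run(
            @CosmosDBTrigger(
                name = "documents",
                databaseName = "mydb",                    // placeholder
                collectionName = "mycollection",          // placeholder
                leaseCollectionName = "leases",
                createLeaseCollectionIfNotExists = true,
                connectionStringSetting = "CosmosDBConnection") String[] documents,
            @QueueOutput(
                name = "message",
                queueName = "changed-ids",                // placeholder
                connection = "StorageConnection") OutputBinding<String> message,
            final ExecutionContext context) {

        // Each element is the raw JSON of a changed document; a real implementation
        // would parse out the id field. Here the whole batch goes out as one message.
        message.setValue(String.join(",", documents));
        context.getLogger().info(documents.length + " changed document(s) queued.");
    }
}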
With this approach, you can get the added/updated documents and save a reference value (the _ts and id fields) somewhere, such as a blob:
SELECT * FROM collection WHERE userId=[some-user-id] AND _ts > [some-value] AND id != '[some-guid]' ORDER BY _ts DESC
This is similar to the approach we use to read data from Event Hubs, storing checkpoint information (epoch number, sequence number, and offset value) in a blob; only one function at a time can take a lease on that blob.
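A sketch of that query issued with the Java SDK, assuming you already have a CosmosContainer handle and checkpointed _ts/id values (all parameter values are placeholders):

import com.azure.cosmos.CosmosContainer;
import com.azure.cosmos.models.CosmosQueryRequestOptions;
import com.azure.cosmos.models.SqlParameter;
import com.azure.cosmos.models.SqlQuerySpec;
import com.fasterxml.jackson.databind.JsonNode;

public class ChangedItemsQuery {

    // Fetch documents for a user changed after the checkpointed _ts, excluding
    // the document id already processed at that exact timestamp.
    static void printChangedItems(CosmosContainer container, String userId,
                                  long lastTs, String lastId) {
        SqlQuerySpec spec = new SqlQuerySpec(
            "SELECT * FROM c WHERE c.userId = @userId AND c._ts > @ts"
                + " AND c.id != @lastId ORDER BY c._ts DESC",
            new SqlParameter("@userId", userId),
            new SqlParameter("@ts", lastTs),
            new SqlParameter("@lastId", lastId));

        container.queryItems(spec, new CosmosQueryRequestOptions(), JsonNode.class)
                 .forEach(doc -> System.out.println(doc.get("id")));
    }
}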
If you go with the change feed, you can create a listener (a Function or a job) that receives all adds/updates from the collection and stores those values in another collection, adding an identity/version field to every document as it is saved. This approach may increase your Cosmos DB bill.
This is what the consistency levels are for: https://learn.microsoft.com/en-us/azure/cosmos-db/consistency-levels
Choose strong consistency and your queries will always return the latest write.
Strong: Strong consistency offers a linearizability guarantee. The reads are guaranteed to return the most recent committed version of an item. A client never sees an uncommitted or partial write. Users are always guaranteed to read the latest committed write.
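For reference, a sketch of requesting strong consistency when building the client with the Java SDK; the endpoint and key are placeholders, and the account itself must allow this consistency level:

import com.azure.cosmos.ConsistencyLevel;
import com.azure.cosmos.CosmosClient;
import com.azure.cosmos.CosmosClientBuilder;

public class StrongClientFactory {
    static CosmosClient create() {
        return new CosmosClientBuilder()
            .endpoint("https://myaccount.documents.azure.com:443/") // placeholder endpoint
            .key("<account-key>")                                   // placeholder key
            .consistencyLevel(ConsistencyLevel.STRONG)              // request strong reads
            .buildClient();
    }
}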
This is how I delete thousands of entities in Datastore: first, get the 1st entity. If the 1st entity exists, fetch 500 entities to delete. Then defer deletealltarget again and again until the 1st entity no longer exists.
def deletealltarget(twaccount_db_key):
    # Check whether any matching entity is left.
    target_db = model.Target.query().filter(ndb.GenericProperty('twaccount_key') == twaccount_db_key).get()
    if target_db:
        # Fetch the next batch of keys only (cheaper than full entities) and delete them.
        target_dbs = model.Target.query().filter(ndb.GenericProperty('twaccount_key') == twaccount_db_key).fetch(500, keys_only=True)
        ndb.delete_multi(target_dbs)
        # Re-queue this function until nothing matches any more.
        deferred.defer(deletealltarget, twaccount_db_key)
Is there any better way?
You could use delete_multi_async to delete asynchronously instead of using defer, but besides that you are doing well with this approach; you are already following other recommended practices, such as fetching with keys_only.
Google recommends using this Dataflow template for bulk deletion, but I don't know if it fits your scenario.
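A sketch of the async variant, assuming the same model.Target ndb model; note this collapses the deferred re-queueing into one loop, which only works if your request deadline allows it:

from google.appengine.ext import ndb

def delete_all_target_async(twaccount_db_key):
    # Keep fetching key-only batches until the query returns nothing.
    while True:
        keys = model.Target.query().filter(
            ndb.GenericProperty('twaccount_key') == twaccount_db_key
        ).fetch(500, keys_only=True)
        if not keys:
            break
        # Fire the deletes asynchronously and wait for the whole batch at once.
        futures = ndb.delete_multi_async(keys)
        ndb.Future.wait_all(futures)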
I am trying to get values from the database without using the ServiceHub and the vault, but I couldn't. My logic is: when I pass a country name, it should return the ids (PKs) for that country from one table, and using those ids it should return the related values from another table. This is possible in a flow class, but I am trying to do it in an API class, where the ServiceHub cannot be imported. Please help me out.
Only the node has access to the ServiceHub. The API runs outside of the node in a separate process, so it is limited to interacting with the node via the operations offered by CordaRPCOps.
Either you need to store the data you want to access in a separate database outside of the node, or you need to find some way to programmatically log into the node's database from the API, using JDBC as described here: https://docs.corda.net/node-database.html.
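A minimal sketch of the JDBC route, assuming the node's H2 database is exposed over TCP; the URL, credentials, and the country/detail table and column names are placeholders for your own schema:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class NodeDbClient {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:h2:tcp://localhost:12345/node"; // placeholder H2 server URL
        try (Connection conn = DriverManager.getConnection(url, "sa", "");
             PreparedStatement stmt = conn.prepareStatement(
                 "SELECT d.* FROM country c JOIN detail d ON d.country_id = c.id WHERE c.name = ?")) {
            stmt.setString(1, "France"); // placeholder country name
            try (ResultSet rs = stmt.executeQuery()) {
                while (rs.next()) {
                    System.out.println(rs.getString(1));
                }
            }
        }
    }
}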
I have an application with user and admin sections. If an admin updates data through the SQL data source, the database is updated. However, when we then retrieve the data with a LINQ query, it shows the old value rather than the updated value.
After some time, the LINQ query automatically shows the correct value.
I think it is caching the value, but I find myself helpless. Please help me with this.
When you say
when we retrieve data with linq query
Do you mean you call your select methods again, or are you using the current in-memory objects?
In either case, you can always refresh an entity with :
Context.Refresh(System.Data.Linq.RefreshMode.OverwriteCurrentValues, entity);
Make sure that you're using your DataContext efficiently (ideally one per unit of work).
After each update, make sure you call DataContext.SubmitChanges(); to commit your changes back to the database.
Also be aware that any context you instantiate between your changes being added to another context and calling SubmitChanges() will not reflect those changes.
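Putting those pieces together, a minimal sketch assuming a hypothetical LINQ to SQL context named MyDataContext with a Products table (one context per unit of work):

using System;
using System.Data.Linq;
using System.Linq;

// One short-lived context per unit of work.
using (var db = new MyDataContext())
{
    var product = db.Products.First(p => p.Id == 42);
    product.Price = 9.99m;
    db.SubmitChanges(); // commit the change back to the database

    // If another process may have changed the row, re-read it from the database:
    db.Refresh(RefreshMode.OverwriteCurrentValues, product);
}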