I want to move only changed data from Cosmos DB to a Storage account, i.e. only documents that have been inserted or modified. How can I use the change feed in Cosmos DB to do this?
Please share a link for this.
You can consider having an Azure Function that is triggered when documents are added or updated in your collection, and have the function write the document out to Blob Storage.
This should scale well and would be relatively easy to implement.
There are several ways to read the change feed and all are described here: https://learn.microsoft.com/azure/cosmos-db/read-change-feed
Summary:
Using Azure Functions: https://learn.microsoft.com/azure/cosmos-db/change-feed-functions
Using the Change Feed Processor on the SDK: https://learn.microsoft.com/azure/cosmos-db/change-feed-processor
Using the Change Feed query/pull model on the SDK: https://learn.microsoft.com/azure/cosmos-db/change-feed-pull-model
Using the Spark Connector if you are doing data analysis: https://learn.microsoft.com/azure/cosmos-db/sql-api-sdk-java-spark-v3
Which tool you want to use really depends on your scenario.
What is the query builder shown on the Firebase Firestore console page? What is its use case, and how does it benefit my project?
From the documentation, I get that it's used to query collections and sub-collections in that actual Firestore database.
But is it used just for visualizing queries? Or does it prepare those queries in advance so that requesting them is faster?
And when should I use it, and when shouldn't I?
The new query builder in the Firestore console is just a visual way to build a query that then determines what data the console shows. I find it most helpful for limiting the amount of data the console shows, and for checking whether a query is going to be possible before I translate it into code.
Aside from that, the query builder shows the resulting documents in a tabular view (rather than the panel view that already existed), which makes it possible to compare documents at a glance and fits more data into less space.
I have just published an app that uses Firestore as a backend.
I want to change how the data is structured;
for example, if documents are stored in subcollections like 'PostsCollection/userId/SubcollectionPosts/postIdDocument', I want to move each postIdDocument directly into the top-level collection 'PostsCollection'.
Obviously doing so would prevent users of the previous app version from reading and writing the right collection, and all their data would effectively be lost.
Since I don't know how to approach this issue, I want to ask what approach big companies use when changing the data structure of their projects.
So the approach I have used is document versioning. There is an explanation here.
You basically version your documents, so when your app reads them it knows how to upgrade them to the desired version. In your case, documents would start with no version and need to reach version 1, which means copying the sub-collection documents up into the top collection and removing the sub-collection before working with the document.
Yes, it is more work, but it allows an iterative approach to document changes. And sometimes a script is written to migrate everything to the desired state before new code is deployed 😛, which usually happens when someone wants it done yesterday; with many documents, that can have its own issues.
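A minimal sketch of the versioning idea in JavaScript (the `schemaVersion` field name and the migration steps are illustrative assumptions, not part of any Firebase API):

```javascript
// Upgrade a Firestore document object to the latest schema version.
// The "schemaVersion" field name and the concrete migration are
// assumptions for illustration.
function upgradeDocument(doc) {
    const version = doc.schemaVersion || 0;
    let upgraded = { ...doc };
    if (version < 1) {
        // Version 0 -> 1: posts used to live in a sub-collection and are
        // merged into the parent document by a migration; here we default
        // the field and stamp the new version.
        upgraded = { ...upgraded, posts: upgraded.posts || [], schemaVersion: 1 };
    }
    // Future migrations (1 -> 2, 2 -> 3, ...) chain on below, so even
    // very old documents are upgraded step by step.
    return upgraded;
}
```

The app runs this on every document it reads (and writes the upgraded form back), so old and new clients can coexist during the transition.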
I've found this thread but I would like to extend my question based on the answers.
My goal is to implement something close to the full-text search solution described in the Firebase documentation. While Algolia seems to be a great choice, it does not come cheap and introduces another third-party dependency.
Question: I have the idea of building a search index based on Firestore data and saving it as JSON in Cloud Storage. This JSON would be accessed by a Cloud Function that would use a library like fuse.js to execute a search (see the page for an example) and return the results to the user.
Is this a reasonable thing to do for searching a collection with around 20000 documents / JSON size of around 3MB maximum?
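For what it's worth, the shape of such a Cloud Function can be sketched like this. The snippet uses a naive case-insensitive substring match as a stand-in for fuse.js, and a module-level cache mimics reusing the downloaded index across warm invocations; the function names and the loader parameter are illustrative assumptions:

```javascript
// Naive in-memory search over a prebuilt index (an array of records).
// This stands in for fuse.js purely to illustrate the flow; in a real
// Cloud Function you would load the JSON from Cloud Storage and keep
// it cached in a module-level variable between warm invocations.
let cachedIndex = null;

function loadIndex(fetchJson) {
    // fetchJson is a caller-supplied loader (e.g. a Cloud Storage read).
    if (!cachedIndex) {
        cachedIndex = fetchJson();
    }
    return cachedIndex;
}

function search(index, term) {
    const needle = term.toLowerCase();
    return index.filter(record =>
        Object.values(record).some(value =>
            String(value).toLowerCase().includes(needle)));
}
```

Swapping the `search` function for a `Fuse` instance built over the same cached array would give real fuzzy matching without changing the overall structure.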
Assuming we do not have strong consistency set, when using the Azure Functions change feed trigger, are we guaranteed to get the latest document when querying against the same partition? Also, are all queries issued from within the change feed guaranteed the latest records, since the change feed runs on the write region?
Thanks!
You can read the change feed from any read region. Check the pull request https://github.com/Azure/azure-webjobs-sdk-extensions/pull/508; this code will be part of the Functions trigger soon. But using the Change Feed Processor library, you can do it today.
If you are reading it from the write region, then you are getting the current document. However, if you are reading it from any other read region, then what you see will depend on your consistency level.
I am trying to use Microsoft's Face API for facial recognition of my company's employees. I see that you need to create a database on Microsoft's servers.
Is there a way to use their APIs against our own company database (without creating another DB on their servers)? That way, any changes made to our DB would automatically be taken care of.
If not, then how do you handle the changes you need? (I know that there are delete API calls as well, but won't that be cumbersome?)
I think you may have misunderstood how this API works. Whenever there is a change to your list of employees, the PersonGroup (a friendly name for the image classifier model) must be retrained in order for it to start recognizing the added faces and stop recognizing removed ones. So even if there were a way to store the model locally (there isn't), you would still need to track adds/removes and take the additional step of training.
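To make that workflow concrete, here is a rough sketch of the delete → add → retrain sequence as request descriptors against the v1.0 REST endpoints. The helper function, its parameters, and the placeholder ids are my own; real calls also need the `Ocp-Apim-Subscription-Key` header, and each added person must first be created before faces are attached:

```javascript
// Build the sequence of Face API (v1.0) REST calls needed to sync a
// PersonGroup after employee changes and then retrain it. This only
// constructs request descriptors; actually sending them (with the
// Ocp-Apim-Subscription-Key header) is left to an HTTP client.
// Function name and parameters are illustrative assumptions.
function buildSyncSteps(endpoint, personGroupId, removedPersonIds, addedPersonIds) {
    const base = `${endpoint}/face/v1.0/persongroups/${personGroupId}`;
    const steps = [];
    for (const personId of removedPersonIds) {
        steps.push({ method: 'DELETE', url: `${base}/persons/${personId}` });
    }
    for (const personId of addedPersonIds) {
        // Attach a face image to an (already created) person.
        steps.push({ method: 'POST', url: `${base}/persons/${personId}/persistedFaces` });
    }
    // Retraining must follow every change before recognition results
    // reflect the added/removed faces.
    steps.push({ method: 'POST', url: `${base}/train` });
    return steps;
}
```

The key point the sketch encodes is that the train call always comes last: until it completes, Identify results still reflect the old model.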