Suppose I have a users collection. The users collection has a large number of documents in it. Now in my app, I have a feature request that forces me to add or remove a field in my users collection data model. How can I add a new field or remove an existing field from all my users documents? Is there any best practice that the community recommends here?
How can I add a new field or remove an existing field from all my users documents?
While #AdityaNandardhane solution might work, please note that if you have a lot of documents, then you have a lot of update operations to perform, which also means that you have to play a lot of writes.
So the best approach would be to perform the update, only when the user reads the document. When it comes to users, most likely the details of the users are displayed on a profile screen. This means that when the users want to check the profile, before displaying the data, check for the existence of the new field. If it doesn't exist, then perform the update operation, and right after that display the data, otherwise, just display the data. This means that you'll have to pay for an update operation only when needed. It doesn't make any sense to update all documents, of all users, since there may be users that will never use their accounts anymore. So there is no need to pay for them.
As I understood, You can do the following thing
1. Add New Field
If you are using Firebase Functions- you can create one function and write an update query with a new field and set one default value and Run the function. You can do the same from android also with kotlin/java.
2. Remove existing Field
If you are using Firebase Functions- you can create one function and write a query to delete one field and Run the function. You can do the same from android also with kotlin/java.
Look for a better approach If any, Its suggestion as per my knowledge.
Related
I am trying to build a mobile app which has a NewsBulletin feature using a NoSQL Cloud Firestore. I am trying to get the unique post view by keeping the user's uid into an array called "views" and count it by getting the length of the array. Is this recommendable or are there other better solution for this? Thank you
Currently this is the structure of my database:
News(Collection)
-DummyNews1(Document)
-newsTitle
-posterName
-bodyMessage
-timeCreated
-views(array)
-dummyuid1
-dummyuid2
I like your solution as it is easy to implement. You don't actually have to manually check for duplicate uids, as firestore has a built in feature that does that for you.
Here is an example:
FirebaseFirestore.instance.collection('news').doc('documentId').update({
'views': FieldValue.arrayUnion([viewerUid]),
});
FieldValue.arrayUnion will check if the contents exists in the database, and only when it does not will add the content.
Now, although I am a fan of you solution, and I do use this method for like type of feature in my own published apps, there are some limitations that you should be aware in case your app becomes super popular.
Maximum document size in firestore is 1MiB. Since firebase auth's uid is 28 characters long, that would be about 37,400 views maximum to be stored in a document ignoring other fields.
But if this is a new application, I would not worry too much about this limit. Besides, once you get close to this limit, you should have more than enough resources to pivot to another method that scales.
I have set up all the currently needed Fields in my firestore collection documents, but I fear I may need to add new fields later in future as the usage of the app grows.
My question is, will I need to delete the whole collection just to add another new field in future? or will i simply add a new field thereby changing the structure of all the future documents that will be created in the collection.
Thanks
I fear I may need to add new fields later in future as the usage of the app grows.
It happens all the time. Is normal. As the app grows, new features are needed.
Will I need to delete the whole collection just to add another new field in future?
No and never think about that. You can simply update each document with the new properties that you need. You can easily do that using a POJO class, as explained in my anwer from this post or even simpler using a Map, as explained here.
Firestore, like most NoSQL databases, is schema-less. This means there is no structure to the data you put in it. The only structure is that which you impose by your own code. You could have millions of documents in a collection that all have completely different fields, and they will not conflict with each other in any way. You can add and remove fields at any time. You choose whatever suits your application the best.
If you have decided to denormalize/duplicate your data in Firestore to optimize for reads, what patterns (if any) are generally used to keep track of the duplicated data so that they can be updated correctly to avoid inconsistent data?
As an example, if I have a feature like a Pinterest Board where any user on the platform can pin my post to their own board, how would you go about keeping track of the duplicated data in many locations?
What about creating a relational-like table for each unique location that the data can exist that is used to reconstruct the paths that require updating.
For example, creating a users_posts_boards collection that is firstly a collection of userIDs with a sub-collection of postIDs that finally has another sub-collection of boardIDs with a boardOwnerID. Then you use those to reconstruct the paths of the duplicated data for a post (eg. /users/[boardOwnerID]/boards/[boardID]/posts/[postID])?
Also if posts can additionally be shared to groups and lists would you continue to make users_posts_groups and users_posts_lists collections and sub-collections to track duplicated data in the same way?
Alternatively, would you instead have a posts_denormalization_tracker that is just a collection of unique postIDs that includes a sub-collection of locations that the post has been duplicated to?
{
postID: 'someID',
locations: ( <---- collection
"path/to/post/location1",
"path/to/post/location2",
...
)
}
This would mean that you would basically need to have all writes to Firestore done through Cloud Functions that can keep a track of this data for security reasons....unless Firestore security rules are sufficiently powerful to allow add operations to the /posts_denormalization_tracker/[postID]/locations sub-collection without allowing reads or updates to the sub-collection or the parent postIDs collection.
I'm basically looking for a sane way to track heavily denormalized data.
Edit: oh yeah, another great example would be the post author's profile information being embedded in every post. Imagine the hellscape trying to keep all that up-to-date as it is shared across a platform and then a user updates their profile.
I'm aswering this question because of your request from here.
When you are duplicating data, there is one thing that need to keep in mind. In the same way you are adding data, you need to maintain it. With other words, if you want to update/detele an object, you need to do it in every place that it exists.
What patterns (if any) are generally used to keep track of the duplicated data so that they can be updated correctly to avoid inconsistent data?
To keep track of all operations that we need to do in order to have consistent data, we add all operations to a batch. You can add one or more update operations on different references, as well as delete or add operations. For that please see:
How to do a bulk update in Firestore
What about creating a relational-like table for each unique location that the data can exist that is used to reconstruct the paths that require updating.
In my opinion there is no need to add an extra "relational-like table" but if you feel confortable with it, go ahead and use it.
Then you use those to reconstruct the paths of the duplicated data for a post (eg. /users/[boardOwnerID]/boards/[boardID]/posts/[postID])?
Yes, you need to pass to each document() method, the corresponding document id in order to make the update operation work. Unfortunately, there are no wildcards in Cloud Firestore paths to documents. You have to identify the documents by their ids.
Alternatively, would you instead have a posts_denormalization_tracker that is just a collection of unique postIDs that includes a sub-collection of locations that the post has been duplicated to?
I consider that isn't also necessary since it require extra read operations. Since everything in Firestore is about the number of read and writes, I think you should think again about this approach. Please see Firestore usage and limits.
unless Firestore security rules are sufficiently powerful to allow add operations to the /posts_denormalization_tracker/[postID]/locations sub-collection without allowing reads or updates to the sub-collection or the parent postIDs collection.
Firestore security rules are so powerful to do that. You can also allow to read or write or even apply security rules regarding each CRUD operation you need.
I'm basically looking for a sane way to track heavily denormalized data.
The simplest way I can think of, is to add the operation in a datastructure of type key and value. Let's assume we have a map that looks like this:
Map<Object, DocumentRefence> map = new HashMap<>();
map.put(customObject1, reference1);
map.put(customObject2, reference2);
map.put(customObject3, reference3);
//And so on
Iterate throught the map, and add all those keys and values to batch, commit the batch and that's it.
I'm setting up a Firestore database and am playing around with structuring it. Is there a way to populate and change it quickly without having to add/change fields manually every single time?
Two example things I am looking to do are:
1) Populate collections with documents that have predetermined fields. Currently I have to add the fields manually every single time.
2) Edit the fields en masse for all documents within a collection (e.g. change the name of a field, delete a field entirely, add a new field)
The Firebase console doesn't seem to provide these tools, would my best bet be to write a separate app specifically for this purpose?
Since such bulk uploads and bulk edits are not part of the console, you'll have to build something yourself indeed.
A good place to start would be the Cloud Firestore API, which allows adding and updating documents in the database.
If I have User and Profile objects. What is the best way to structure my collections in firestore given that the follow scenarios can take place?
Users have a single Profile
Users can update their Profile
Users can save other users' profiles
Users can deleted their saved profiles
The same profile can't be saved twice
If Users and Profiles are separate collections, what is the best way to store saved profiles?
One way that came to mind was that each user has a sub collection called SavedProfiles. The id of each document is the id of the profile. Each saved Profile only contains a reference to the user who's profile it belongs to.
The other option was to do the same thing but store the whole profile of each saved profile.
The benefits of the first approach is that when a user updates their own profile there's no need to update any of the their profiles that have already been saved as it's only the reference that is stored. However, attempting to read a user's saved profiles may require two read operations (which will be quite often), one to get all the references then querying for all the profiles with those reference (if that's even possible???). This seems quite expensive.
The second approach seems like the right way to go as it solves the problem of reading all the saved profiles. But updating multiple saved profiles seems like an issue as each user's saved profiles may be unique. I understand that it's possible to do a batch update but will it be necessary to query each user in the db for their saved profiles and check if that updated profile exists, if so update it? I'm not too sure which way to go. I'm not super used to NoSQL data structures and it already seems like I've done something wrong since I've used a sub collection since it's advised to keep everything as denormalized as possible so please let me know if the structure to my whole db is wrong too, which is also quite possible...
Please provide some examples of how to get and update profiles/saved profiles.
Thank you.
Welcome to the conundrum that is designing a NoSQL database. There is no right or wrong answer, here. It's whatever works best for you.
As you have identified, querying will be much easier with your second option. You can easily create a Cloud Function which updates any profiles which have been modified.
Your first option will require multiple gets to the database. It really depends how you plan to scale this and how quick you want your app to run.
Option 1 will be a slow user experience, while all of the data is fetched. Option 2 will be a much faster user experience, but will requre your Cloud Function to update every saved profile. However, this is a background task so wouldn't matter if it takes a few seconds.