I am building a Firebase web application. Here is a simplified Firestore structure:
...
- ...
ITEMS
  - item1
      PHOTOS
        - photo1
            uid
      VIDEOS
  - item2
USERS
  - uid1
      displayName
      ...
  - uid2
  - ...
When someone accesses a photo or video I want to display the appropriate displayName and profilePhoto for that item.
I see two possible solutions:
1. Store only the user UID in items. On every load, get the user data from the USERS collection, then loop over the user UIDs and match them to the item UIDs. This would be done on every call. When users update displayName or profilePhoto, there wouldn't be any problem.
2. Store the user UID, displayName, and profilePhoto path on the item. When users update their data, I would need to run a Cloud Function that updates all of their items with the new data.
How should I approach this and is there any other approach that I should consider? I lean towards the second solution.
What you have are the two most common approaches. The first is commonly referred to as performing a client-side join, while the second is duplicating/denormalizing the necessary data.
Neither is inherently better than the other, but folks who are new to NoSQL databases are typically hesitant to duplicate data, so they tend to gravitate toward client-side joins.
As you gain experience with NoSQL data models, you'll learn the right questions to ask yourself. For example: how often do a display name and profile photo change in my app? If that happens infrequently, duplicating those values in each item will make my reading code a lot simpler, faster, and more scalable.
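The client-side join from the first approach can be sketched in plain TypeScript, with in-memory objects standing in for Firestore documents. The interface and function names below are hypothetical and no Firestore SDK calls are shown; the sketch only illustrates that a join needs one extra profile lookup per distinct uid, which is then merged into the items. With the second approach, the merge step disappears because each item already carries displayName and profilePhoto.

```typescript
// Hypothetical shapes; field names follow the question (uid, displayName, profilePhoto).
interface UserProfile { displayName: string; profilePhoto: string; }
interface Item { id: string; uid: string; }
interface ItemWithAuthor extends Item { displayName: string; profilePhoto: string; }

// Distinct author uids for a page of items: one profile lookup per uid, not per item.
function distinctUids(items: Item[]): string[] {
  const seen: Record<string, boolean> = {};
  const result: string[] = [];
  for (const item of items) {
    if (!Object.prototype.hasOwnProperty.call(seen, item.uid)) {
      seen[item.uid] = true;
      result.push(item.uid);
    }
  }
  return result;
}

// Client-side join: attach the profile to each item once both have been loaded.
function joinItemsWithUsers(
  items: Item[],
  usersByUid: Record<string, UserProfile>
): ItemWithAuthor[] {
  return items.map((item) => {
    const found = Object.prototype.hasOwnProperty.call(usersByUid, item.uid);
    return {
      id: item.id,
      uid: item.uid,
      // Fall back to placeholders when a profile has not been loaded.
      displayName: found ? usersByUid[item.uid].displayName : "(unknown user)",
      profilePhoto: found ? usersByUid[item.uid].profilePhoto : "",
    };
  });
}
```

In a real app, the uids returned by distinctUids would be the documents to load from USERS before calling joinItemsWithUsers.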
For more good reading/watching on this:
How to write denormalized data in Firebase
NoSQL data modeling
Todd's excellent video series: Getting to know Cloud Firestore
Related
I have a user collection in firestore and each user object has an array of references to "tasks" that they have applied to. Tasks is a separate collection as well and each task object has a user ref array as well.
Collection Tasks:
doc: {
  name: "Do something",
  time: "Time",
  users: ["/users/u1", "/users/u2"]
}
Collection Users:
doc: {
  name: Username,
  tasks: ["/tasks/docRef", "/tasks/anotherDoc"]
}
I have a screen in my react-native app that lists all the tasks and when a task is clicked, it goes to a details screen that displays all the users in a list as well.
Is this the best approach to have this kind of data? Or should I have collections instead of arrays with references. I refrained from collections to prevent duplication of data but I'm not sure if they will be more efficient.
(From the comments) I wanted to ask whether this is the right approach to storing the data.
1/ Should I just use uids and then query the collection for the matching ids?
Storing uids or DocumentReferences in the arrays makes no difference to how easily you can query the corresponding documents. It only affects the size of the document containing the data, since DocumentReferences are longer than uids.
2/ Or create a sub-collection in the user document itself with references to the tasks?
In the NoSQL world you should not hesitate to denormalize your data.
So having the tasks list in a user doc AND the users list in a task doc, and synchronizing the docs whenever one of the collections changes, is a valid approach.
HOWEVER, you may encounter a problem if you have "a lot" of tasks for a given user or a lot of users involved in a task since you may hit the maximum size for a document which is 1 MiB (I agree that you need a LOT of tasks or users :-)).
To avoid that I would advise using sub-collections. This is also the preferred approach if you plan a high frequency of changes that could cause database contention or higher latency, see the documentation section about "Designing for scale".
If the user data you want to show in a task is limited (e.g. just their name plus a button to open each user Profile based on the uid) I would keep an array of users in the task document with this limited amount of data (of course after being sure that there is no risk that a doc reaches the limit of 1 MiB). And have a sub-collection of task documents under each user doc (as advised above).
Modeling well in Firestore is a bit difficult; you have to think hard about your use case.
Don't worry about data duplication; it is very common in Firestore. Remember that you mainly pay for the number of reads and writes.
In your users array, you could keep the data you need so that you avoid extra reads on the users collection.
In your example, the user only has a username, so you could store the username and uid in the task. That way, no reads would be needed on the users collection.
What if the user changes their username? Use batched writes to update all docs that contain that user.
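The username update described above can be planned as follows. This is a minimal in-memory sketch: the types and function names are hypothetical, and the maximum number of operations per batch is passed in as a parameter because the actual limit should be checked against the current Firestore documentation (it has historically been 500 writes per batch). Each inner array would be committed as one batched write.

```typescript
// Hypothetical shapes: each task keeps a small copy of its users (uid + name).
interface TaskUser { uid: string; name: string; }
interface TaskDoc { id: string; users: TaskUser[]; }
interface TaskUpdate { taskId: string; users: TaskUser[]; }

// Split a list into consecutive groups of at most `size` elements.
function chunk<T>(items: T[], size: number): T[][] {
  if (size < 1) throw new Error("size must be at least 1");
  const chunks: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Build the updates needed to rename one user, grouped into batches.
// Tasks that do not contain the user are skipped, so they cost no write.
function planRename(
  tasks: TaskDoc[],
  uid: string,
  newName: string,
  maxOpsPerBatch: number
): TaskUpdate[][] {
  const updates: TaskUpdate[] = [];
  for (const task of tasks) {
    if (!task.users.some((u) => u.uid === uid)) continue;
    updates.push({
      taskId: task.id,
      // Replace only the matching entry; other users are left untouched.
      users: task.users.map((u) => (u.uid === uid ? { uid: u.uid, name: newName } : u)),
    });
  }
  return chunk(updates, maxOpsPerBatch);
}
```

Keep in mind that atomicity only holds within a single batch, so a rename spanning several batches can be partially applied if a later batch fails.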
I have been using Firestore for a very long time. I am building an app now where scalability and keeping low costs is important. (I am using flutter)
My app has users, which have user profiles, also they can add friends and talk to them (like instagram or facebook).
I have a problem building this friends system.
My model for this friends system currently looks like this:
Users collection. Each document id = user id from auth, those docs contain data like name, username, profile picture, etc.
Friends collection. Each document id = user id from auth. For each user, those docs contain a field called: friends, which is an array with each of his friends user ids.
The model looks like:
Friends collection:
- uid:
    - friends_list: [friend_uid1, friend_uid2, ...]
This is how my "backend" looks.
Now I want to show my user a list of his friends. How do I do that?
I want a list that looks like Instagram's, with a nice UI showing each friend's profile pic, name, last message, etc.
I cannot find a straightforward way to do this with Firestore and queries.
Let's say I do it like this:
Get all my friends user ids in an array.
Get all their user documents using .get() for each document.
This is not workable in Firestore because it would eliminate all the querying power I have (such as being able to query only for users with name "x"). I would have to fetch all users and do the query on my front end (or in a Cloud Function, which is the same thing and not scalable).
If I do this like:
Get all document using a query for all users in the Friends collection, where friends_list contains my user id.
Save from those documents only the documentID and fetch all the friends user data manually.
This comes with another problem. In Firestore there is no way of fetching a document without fetching all of its fields, so the first query, which I use only to get my friends' ids, would actually give me each id plus that friend's own friend list (because a query returns the document id together with the data), which is not good.
If I do it like:
When you add a friend, instead of just saving its uid, save its uid + data.
Now I can easily show my user his friends list nicely and do some querying on front-end.
The problem here is that if one of my friends updates his profile photo, I need to update it in the documents of all of his friends, which is very write-expensive for a small profile update.
There is also the problem of watching for more data. Maybe I have another collection with Chats, and I want to show the last message of my chat with a friend. Now I have to fetch the chat rooms too, which means more data that is hard to query and that comes with all the problems I mentioned before.
In conclusion: I don't see a good scalable way to do this kind of system within Firestore. It seems a simple system which any basic app should have, but I do not see how I can do it in a way that does not make lots of reads or read more data (or sensitive data) than it should.
What kind of model would you do for a friends system like this?
You're describing a quintessential drawback of NoSQL databases.
A similar example is actually given in the Get to Know Cloud Firestore series.
As others have commented, the answer really depends on your application, and this is an assessment you'll have to make yourself: which of the options is cheaper depends on the use case of the app.
For example, if you go with your third option and store the friend's user data that you need to populate the list, you'll have to implement measures to keep the copied data consistent whenever a user updates their information.
You can then look at the usage of your app and determine how often users change their information vs how often you would need to retrieve full users if you don't copy the data to find the cheapest method for your application.
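That comparison is simple arithmetic, sketched below under stated assumptions. Every number is an estimate you supply from your own usage data, the field and function names are hypothetical, and the read and write prices are parameters rather than real Firestore prices. The read of the friends list itself is ignored because both approaches need it.

```typescript
// All values are estimates supplied by the caller; none come from Firestore itself.
interface UsageEstimate {
  listViewsPerDay: number;      // how often a friends list is rendered
  friendsPerList: number;       // average number of friends shown per render
  profileUpdatesPerDay: number; // profile changes across all users
  copiesPerProfile: number;     // average number of documents holding a copy of a profile
}

// Without copies: every list view reads one user document per friend shown.
function extraReadsWithJoin(u: UsageEstimate): number {
  return u.listViewsPerDay * u.friendsPerList;
}

// With copies: every profile update rewrites each document holding a copy.
function extraWritesWithCopies(u: UsageEstimate): number {
  return u.profileUpdatesPerDay * u.copiesPerProfile;
}

// Compare the daily cost of both approaches for the given unit prices.
function cheaperApproach(
  u: UsageEstimate,
  readPrice: number,
  writePrice: number
): "join" | "copy" {
  const joinCost = extraReadsWithJoin(u) * readPrice;
  const copyCost = extraWritesWithCopies(u) * writePrice;
  return joinCost <= copyCost ? "join" : "copy";
}
```

For instance, with 1,000 list views of 50 friends each and 20 profile updates that each touch 50 copies, the join costs 50,000 extra reads against 1,000 extra writes, so copying wins unless a write is more than 50 times as expensive as a read.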
Is there a better way to get multiple specific documents from a collection in Firestore?
Let's say I have this collection:
--Feeds (collection)
  --feedA (doc)
    --comments (collection)
      --commentA (doc)
        users_in_conversation: [abcdefg, hijklmn, ...] // field containing the list of all users in the conversation
Then I need to retrieve the user data (name and avatar) from the Users collection. Currently I run one query per user, but that will be slow when there are many people in the conversation.
What's the best way to retrieve specific users?
Thanks!
Retrieving the additional names is actually a lot faster than most developers expect, as the requests can often be pipelined over a single HTTP/2 connection. But if you're noticing performance problems, edit your question to show the code you use, the data you have, and the performance you're getting.
A common way to reduce the need to load additional documents is by duplicating data. For example, if you store the name and avatar of the user in each comment document, you won't need to look up the user profile every time you read a comment.
If you come from a background in relational databases, this sort of data duplication may be very unexpected. But it's actually quite common in NoSQL databases.
You will of course then have to consider how to deal with updates to the user profile, for which I recommend reading: How to write denormalized data in Firebase. While this is for Firebase's other database, the same concepts apply to Cloud Firestore. I also in general recommend watching Getting to know Cloud Firestore.
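As a minimal sketch of that duplication, the comment document below carries a copy of the author's name and avatar taken at write time. The interface names, field names, and helper functions are hypothetical, and plain objects stand in for Firestore documents.

```typescript
// Hypothetical shapes for a user profile and a comment that embeds a copy of it.
interface Profile { uid: string; name: string; avatar: string; }
interface CommentDoc {
  text: string;
  authorUid: string;    // kept so the copies can be found and updated later
  authorName: string;   // duplicated from the profile at write time
  authorAvatar: string; // duplicated from the profile at write time
}

// Build the document to store when a user posts a comment.
function buildCommentDoc(author: Profile, text: string): CommentDoc {
  return {
    text: text,
    authorUid: author.uid,
    authorName: author.name,
    authorAvatar: author.avatar,
  };
}

// Rendering needs no lookup in the Users collection.
function renderComment(comment: CommentDoc): string {
  return comment.authorName + ": " + comment.text;
}
```

The copies stay at their write-time values until something updates them, which is the update problem discussed above.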
I have tried several solutions, but I think this one is the best for this case:
When a user posts a comment, add the feed/post id to an array field named discussions in the user document.
When a user loads a feed/post, get all users whose discussions array contains its id (using array-contains).
It's efficient and costs fewer operations.
I'm totally new to Firebase, and I'm trying to get my head round the best db model design for 'relational' data, both 1-1 and 1-many.
We are using the Firestore db (not the realtime db).
Say we have Projects which can contain many Users, and a User can be in multiple Projects
The UI needs to show a list of Users in a Project which shows things like email, firstname, lastname and department.
What is the best way to store the relationship?
An array of User ids in the Project document?
A map of Ids in the Project document?
I've read that the above approaches were recommended, but was that for the Realtime Database? Firestore supports sub-collections, which sound more appropriate...
A sub collection of Users in the Project document?
A separate collection mapping Project id to User id?
A Reference data type? I've read here https://firebase.google.com/docs/firestore/manage-data/data-types about the Reference data type, which sounds like what I want, but I can't find any more on it!
If it's just a map or array of ids, how would you then retrieve the remaining data about the user? Would this have to sit in the application UI?
If it's a sub-collection of User documents, is there any way to maintain data integrity? If a user changed their name, would the UI or a Cloud Function then have to update every entry of that user's name in the sub-collections?
Any help / pointers appreciated...
The approach for modeling many-to-many relationships in Firestore is pretty much the same as it was in Firebase's Realtime Database, which I've answered here: Many to Many relationship in Firebase. The only difference is indeed that you can store the lookup list in a sub-collection of each project/user.
Looking up the linked items is also the same as before: it requires loading them individually from the client. Such a client-side join is not nearly as slow as you may initially expect, so test it before assuming it can't possibly be fast enough.
Ensuring data integrity can be accomplished by performing batched writes or using transactions. These either completely succeed or completely fail.
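For the sub-collection variant, the two lookup entries that must stay in sync can be sketched as below. The collection names (projects, members, users), the data shape, and the function name are assumptions chosen for illustration; the function only plans the writes and does not call Firestore. Both planned writes would go into a single batched write or transaction so that they succeed or fail together.

```typescript
// A write to perform, described by document path and data (hypothetical shape).
interface PlannedWrite { path: string; data: { addedAt: string }; }

// Adding a user to a project touches both sides of the many-to-many relation.
function planAddUserToProject(
  projectId: string,
  userId: string,
  addedAt: string
): PlannedWrite[] {
  return [
    // Lookup entry under the project: which users belong to it.
    { path: "projects/" + projectId + "/members/" + userId, data: { addedAt: addedAt } },
    // Mirror entry under the user: which projects they belong to.
    { path: "users/" + userId + "/projects/" + projectId, data: { addedAt: addedAt } },
  ];
}
```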
I have a collection, itemsCollection, which contains a very large amount of small itemDocs. Each itemDoc has a subcollection, statistics. Each itemDoc also has a field "owner" which indicates which user owns the itemDoc.
itemsCollection
itemDoc1
statistics
itemDoc2
statistics
itemDoc3
statistics
itemDoc4
statistics
...
I also have a collection, usersCollection, which contains basic user info.
usersCollection
user1
user2
user3
...
Since each itemDoc belongs to a specific user, it's necessary to display to each user which itemDocs they own. I have been using the query:
db.collection("itemsCollection").where("owner", "==", "user1")
I am wondering if this will scale effectively, i.e. whenever itemsCollection gets to be millions of records? If not, is the best solution to duplicate each itemDoc and its statistics subcollection as a subcollection in the user document, or should I be doing something else?
As Alex Dufetel, the product manager from Firebase, explained at Firebase Dev Summit 2017, Firestore was inspired in many ways by the feedback they had received on the Firebase Realtime Database over the years. They faced two types of issues:
Data modelling and querying. Firebase Realtime Database cannot query over multiple properties, so doing that usually involves duplicating data or client-side filtering, which we all know gets messy.
Realtime Database does not scale automatically.
With this new product, they say that you can now build an app and grow it to planetary scale without changing a single line of code. Cloud Firestore is also a NoSQL database that was built specifically for mobile and web app development. It's flexible enough to build all kinds of apps and scalable enough to grow to any size.
Because the new database was built with these issues in mind, duplicating data is no longer needed for a query like yours. Firestore query performance depends on the size of the result set rather than the size of the collection, so you will not have to worry about that line of code: even if your data grows to millions of records, it will scale automatically. One thing to remember: if you use multiple conditions, you may need to create an index, which you can do in the Firebase console. Here are two simple examples from the official documentation:
citiesRef.whereEqualTo("state", "CO").whereEqualTo("name", "Denver");
citiesRef.whereEqualTo("state", "CA").whereLessThan("population", 1000000);