Let's say I have the following scenario:
I have multiple events that multiple users can attend, and each user can attend multiple events. What's the best way of storing the required information while maintaining data consistency?
Here's what I came up with and why I don't really fancy them:
collection "events" -> event document -> subcollection "users" -> user document
Problem:
Each user exists on each event, resulting in multiple documents of each user. I can't just update user information as I would need to write to every relevant event document and fetch the relevant user documents.
That is really a disaster when trying to keep the number of reads/writes as low as possible.
E.g.:
this.afs.collection('events').get().then(res => {
    res.forEach(document => {
        // document.ref is a plain DocumentReference, so the filter is chained with where()
        document.ref.collection('users').where('name', '==', 'Will Smith').get()
        //Change name accordingly
    })
})
collection "users" -> user document -> subcollection "events" -> event document
Problem:
Each event exists on each user, resulting in multiple documents of each event. (Same problem as in the first scenario, just the other way around)
collection "users" and collection "events" with each having users and events as documents subordinate to them.
There's an array attending_events which has the relevant event id's in it.
Problem:
Kind of the SQL way of storing things. Each document has to be fetched with a separate query inside a forEach() loop.
E.g.
this.afs.collection('events').doc(eventId).get().then(res => {
    // res is a document snapshot, so its fields are read through data()
    res.data().users.forEach(elem => {
        this.afs.collection('users').doc(elem.name).get()
        //Change name accordingly
    })
})
What am I missing? Are there better approaches to model the desired architecture?
When using collection "events" -> event document -> subcollection "users" -> user document
It's not as bad as you might think. This practice is called denormalization and is a common practice when it comes to Firebase. If you are new to NoSQL databases, I recommend watching the video "Denormalization is normal with the Firebase Database" for a better understanding. It is for the Firebase Realtime Database, but the same rules apply to Cloud Firestore.
I want to be able to change user information or event information without the need of fetching hundreds of documents.
If you think that user details will be changed very often, then you should consider storing under each user object an array of event IDs and not use a subcollection. In the same way, you should also add under each event object an array of UIDs. Your new schema should look like this:
Firestore-root
    |
    --- users (collection)
    |     |
    |     --- uid (document)
    |          |
    |          --- events: ["eventIdOne", "eventIdTwo", "eventIdThree"]
    |          |
    |          --- //Other user properties
    |
    --- events (collection)
          |
          --- eventId (document)
               |
               --- users: ["uidOne", "uidTwo", "uidThree"]
               |
               --- //Other event properties
Since you are holding only references, when the name of a user is changed, there is no need to update it in all the user objects that would otherwise exist in each event's users subcollection. But remember that with this approach, to get for example all the events a user is a part of, you need two queries: one to get the event IDs from the user document and a second to get the event documents based on those event IDs.
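To make the two-step read concrete, here is a minimal sketch that uses plain Python dicts as stand-ins for the users and events collections. It does not call the Firestore SDK; the sample names, titles and the helper functions events_for_user and rename_user are assumptions made up for illustration.

```python
# In-memory stand-ins for the two collections; plain dicts, not Firestore calls.
users = {
    "uidOne": {"name": "Will Smith", "events": ["eventIdOne", "eventIdTwo"]},
    "uidTwo": {"name": "Jane Doe", "events": ["eventIdTwo"]},
}
events = {
    "eventIdOne": {"title": "Concert", "users": ["uidOne"]},
    "eventIdTwo": {"title": "Meetup", "users": ["uidOne", "uidTwo"]},
}


def events_for_user(uid):
    """Step 1: read the user document. Step 2: fetch each referenced event."""
    event_ids = users[uid]["events"]
    return [events[event_id] for event_id in event_ids]


def rename_user(uid, new_name):
    """A rename touches exactly one document because events store only UIDs."""
    users[uid]["name"] = new_name


rename_user("uidOne", "Willard Smith")
titles = [event["title"] for event in events_for_user("uidOne")]
```

The rename writes to one document only, while listing a user's events costs one read for the user plus one read per referenced event. That is the price paid for not duplicating user data.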
Basically it's a trade-off between using denormalization and storing data in arrays.
What's the best way of storing the required information while maintaining data consistency?
Usually, we create the database schema according to the queries we intend to perform. For more information, I also recommend seeing my answer to the following post:
What is the correct way to structure this kind of data in Firestore?
Related
As you can see, there are 3 collections in my Firestore database:
Plans
UPIs
Users
Upon Successful Signup of a user, I want to copy all the values present in Plans Collection to Users Collection under their mobile number as a document.
There is currently no straightforward solution for that. The only option you have is to copy each and every document that exists in the "Plans" collection to the other collection of your choice.
If you want the "Plans" to be a subcollection under the user's mobile number document in the "Users" collection, then you need to have a schema similar to this:
Firestore-root
    |
    --- Users (collection)
         |
         --- $phoneNumber (document) //👈
              |
              --- Plans (Sub-collection)
                   |
                   --- $planId
                        |
                        --- //fields
This operation can be done on the client, or using a trusted environment you control. The latter is obviously more recommended. So you might consider using Cloud Functions for Firebase, to trigger a function on user creation that does exactly that.
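As a rough sketch of what such a copy step does, the snippet below models the database as a dict keyed by document path and copies every Plans document under Users/$phoneNumber/Plans. It is an in-memory illustration only, not Cloud Functions or Firestore SDK code; the store layout, the price field, the phone number and the function name copy_plans_to_user are all assumptions.

```python
import copy

# Hypothetical in-memory store keyed by document path; not the Firestore SDK.
store = {
    "Plans/basic": {"price": 10},
    "Plans/premium": {"price": 25},
}


def copy_plans_to_user(store, phone_number):
    """Copy every top-level Plans document under Users/<phone_number>/Plans."""
    copied = 0
    for path in list(store):  # snapshot of the keys, since the loop adds entries
        collection, _, plan_id = path.partition("/")
        # Only top-level Plans documents have exactly two path segments.
        if collection == "Plans" and "/" not in plan_id:
            target = f"Users/{phone_number}/Plans/{plan_id}"
            store[target] = copy.deepcopy(store[path])
            copied += 1
    return copied


count = copy_plans_to_user(store, "+15550100")
```

The originals stay in place, so this is a copy rather than a move. In a real project the same loop would run inside the trusted environment, once per newly created user.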
I have a Firestore structure with an "organizations" collection and a "users" collection.
When a user creates an account via Auth, I'd like to create a new "Organization" and add him to this organization. That means having a "Create" right.
The problem is that, by doing so, the user can create multiple Organizations and be in them.
The other issue I'm facing is regarding the changes. When that user will change their information (name, email, etc), it will also update their line at the "users" collection, but that also means they will be able to change the "organization" reference and point it to another one, which is bad.
So I wonder what is the proper way to do so, and/or if I'm doing it wrong.
That technique is called denormalization and it's a common practice when it comes to top NoSQL databases.
As I understand from your question, you want to add users to be part of the organization. In that case, there is no need to duplicate the data. I would use a structure that looks like this:
Firestore-root
    |
    ---- users (collection)
    |      |
    |      --- $uid (document)
    |           |
    |           --- organizations: [$orgId, $orgId, $orgId] (array)
    |
    ---- organizations (collection)
           |
           --- $orgId (document)
                |
                --- users: [$uid, $uid, $uid] (array)
Here "organizations" is an array that holds organization IDs, and "users" is an array that holds user IDs.
Since we usually are structuring a Firestore database according to the queries that we want to perform, the above schema will help you query all the organizations a user is a part of or all users that are a part of an organization. This means that if you want to display user data, you have to perform a new Firestore database call.
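A small sketch of how the two arrays stay in sync when a user joins an organization, again with plain Python dicts standing in for the two collections. The IDs and the helpers add_user_to_org and members_of are hypothetical names used only for this illustration.

```python
# Plain dicts standing in for the "users" and "organizations" collections.
users = {"uidA": {"organizations": []}, "uidB": {"organizations": []}}
organizations = {"orgX": {"users": []}}


def add_user_to_org(uid, org_id):
    """Record the membership on both sides so either direction can be queried."""
    if org_id not in users[uid]["organizations"]:
        users[uid]["organizations"].append(org_id)
    if uid not in organizations[org_id]["users"]:
        organizations[org_id]["users"].append(uid)


def members_of(org_id):
    """One read returns the member IDs; user details need further reads."""
    return list(organizations[org_id]["users"])


add_user_to_org("uidA", "orgX")
add_user_to_org("uidB", "orgX")
add_user_to_org("uidA", "orgX")  # repeated call leaves both arrays unchanged
```

Because the membership lives in two documents, the two writes should succeed or fail together. In Firestore that would typically mean grouping them in a batched write or a transaction.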
My current data model is the following:
User
    User ID
Video
    VideoID
        LikedBy (Subcol)
            User ID
            User ID
            User ID
Now if a user visits a video, I want to show whether they have already liked the video or not (similar to YouTube's button color if you have already liked it).
My current approach is querying for a document with the key of the signed-in user ID, and if I find one it means the user liked the video. The problem is that I have the same setup for artists that you can subscribe to, similar to channels on YouTube.
This alone created about 3x the initial Reads I have on Page Load.
I would like to hear if there is any more efficient way to query for such a thing or structure the data.
Be aware that if you suggest me to store all liked Shows in the User or Show Document that this is not scalable due to the 1MB Limit.
1) You can have a subcollection on each user, storing the ids of the posts the user likes.
2) You can create a users_likes collection where each document id is the user id and the document holds an array with the ids of the posts the user likes.
3) Last, just add a property called likes to the user document and store the ids of the posts there.
All options have a trade-off. I would run the user query and the posts_likes query once on load and keep the result in memory (no external user is going to affect this).
Be aware that if you suggest me to store all liked Shows in the User or Show Document that this is not scalable due to the 1MB Limit.
That only matters if you are expecting a user to like more than 1 million posts... otherwise, storing up to 1 MB of only ids is a good idea. I use this same pattern for user event tracking: I have events defined (equivalent to your posts) and the users perform actions that correlate to those events (your likes). I have cases with more than 80K and it works like a charm. I gave you 3 options; I would say start with 3 until it doesn't work, then go to 2, and follow the same process up to 1. Since you will work with an array of ids, support yourself with this
My current approach is querying for a document with the key of the signed-in user ID, and if I find one it means the user liked the video.
Yes, that's a correct approach.
This alone created about 3x the initial Reads I have on Page Load.
I don't know where this is coming from but there is certainly something wrong. Unfortunately, nothing in your question can help me see the problem.
I would like to hear if there is any more efficient way to query for such a thing or structure the data.
I don't understand much from your schema, but I would structure the database this way:
Firestore-root
    |
    --- users (collection)
    |     |
    |     --- uid (document)
    |          |
    |          --- //user properties
    |
    --- video (collection)
          |
          --- videoId (document)
               |
               --- likedBy: ["uid", "uid", "uid"]
As you can see, likedBy property is of type array. So once you get a video document, you can simply check the uid of the logged in user against the likedBy array. If it exists, it means that user has already liked that video, otherwise has not.
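The check itself is a simple membership test once the video document has been read. A minimal sketch, with a dict standing in for the fetched document; the helper names has_liked and toggle_like and the sample values are assumptions for illustration.

```python
# A video document modelled as a dict; "likedBy" holds the UIDs of users who liked it.
video = {"title": "Demo", "likedBy": ["uidOne", "uidTwo"]}


def has_liked(video_doc, uid):
    """True when the signed-in user's UID is present in the likedBy array."""
    return uid in video_doc.get("likedBy", [])


def toggle_like(video_doc, uid):
    """Add the UID if absent, remove it if present; returns the new liked state."""
    liked_by = video_doc.setdefault("likedBy", [])
    if uid in liked_by:
        liked_by.remove(uid)
        return False
    liked_by.append(uid)
    return True
```

With this layout the like state comes for free with the video read, so no extra document lookup per user is needed.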
I'm looking for a proper way to structure Firestore database to handle multiple version histories of documents inside a single collection.
For example: I have a collection named offers which have multiple documents which correspond to multiple offers. For each of these documents, I'd like to have history of changes, something like changes on Google Docs.
Since documents support only adding fields directly or nesting another collection, here's a structure I had in mind:
collections: offers
- documents: offer1, (offer2, offer3, ...)
    - fields populated with latest version of the offer content
    - nested collection named history
        - nested documents for each version (v1, v2, v3), which in turn have fields specifying the state of each field in that version.
This seems a bit overly complicated, since I have the latest state and then a nested collection for history. Can this somehow be a flat structure where the latest item in an array is the latest state, or something similar?
Also, history state is generated on a button click, so I don't need every possible change saved in a history, just snapshots when user saves it.
I'd like to use Firebase as my DB for this, as I need it for some other things, so I'm not looking into different solutions for now.
Thanks!
EDIT: Following Alex's answer, here's another take on this.
Firestore-root
    |
    --- offers (collection)
         |
         --- offerID (document)
         |    (with fields populated with latest changes)
         |     |
         |     --- history (collection) //last edited timestamp
         |          |
         |          --- historyId
         |          --- historyId
         |
         --- offerID (document)
              (with fields populated with latest changes)
               |
               --- history (collection) //last edited timestamp
                    |
                    --- historyId
                    --- historyId
This way I can query the whole offers collection and get an array of offers together with their latest status, since it's on the same level as the collection itself. Then, if I need specific content from a history state, I can query the history collection of a specific offer and get its history states. Does this make sense?
I'm not sure about denormalization as this seems like it solves my problem and avoids complication.
Once more, requirements are:
- being able to fetch all offers with latest state (works)
- being able to load specific history state (works)
Every time I add a new state to the history collection, I also overwrite the fields directly in the offerID document with the same, latest, state.
Am I missing something?
In my opinion, your above schema might work but you'll need to do some extra database calls, since Firestore queries are shallow. This means that Firestore queries can only get items from the collection that the query is run against. Firestore doesn't support queries across different collections. So there is no way in which you can get one document and the corresponding history versions that are hosted beneath a collection of that document in a single query.
A possible database structure that I can think of, would be to use a single collection like this:
Firestore-root
    |
    --- offerId (collection)
         |
         --- offerHistoryId (document)
         |     |
         |     --- //Offer details
         |
         --- offerHistoryId (document)
               |
               --- //Offer details
If you want to display all history versions of an offer, a single query is required. So you just need to attach a listener on the offerId collection and get all offer objects (documents) in a single go.
However, if you only want to get the last version of an offer, then you should add under each offer object a timestamp property and query the database according to it descending. At the end just make a limit(1) call and that's it!
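The "order by timestamp descending, then limit(1)" idea can be sketched without the SDK by sorting a list of dicts. The field name date, the sample values and the function latest_version are assumptions chosen for this illustration.

```python
# History documents of one offer as a list of dicts; "date" is a hypothetical timestamp field.
history = [
    {"id": "h1", "date": 1700000000, "price": 100},
    {"id": "h3", "date": 1700000200, "price": 120},
    {"id": "h2", "date": 1700000100, "price": 110},
]


def latest_version(docs):
    """Equivalent of ordering by date descending and keeping the first document."""
    ordered = sorted(docs, key=lambda doc: doc["date"], reverse=True)
    return ordered[0] if ordered else None


latest = latest_version(history)
```

In Firestore the sorting and limiting happen on the server, so only one document is read instead of the whole history.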
Edit:
According to your comment:
I need to get a list of all offers with their latest data
In this case you need to create a new collection named offers which will hold all the latest versions of your offers. Your new collection should look like this:
Firestore-root
    |
    --- offers (collection)
         |
         --- offerHistoryId (document)
         |     |
         |     --- date: //last edited timestamp
         |     |
         |     --- //Offer details
         |
         --- offerHistoryId (document)
               |
               --- date: //last edited timestamp
               |
               --- //Offer details
This practice is called denormalization and is a common practice when it comes to Firebase. If you are new to NoSQL databases, I recommend watching the video "Denormalization is normal with the Firebase Database" for a better understanding. It is for the Firebase Realtime Database, but the same rules apply to Cloud Firestore.
Also, when you are duplicating data, there is one thing that you need to keep in mind. In the same way you are adding data, you need to maintain it. In other words, if you want to update/delete an item, you need to do it in every place where it exists.
In your particular case, when you want to create an offer you need to add it in two places, once in your offerId collection and once in your offers collection. Once a new history version of an offer is created, there is only one more operation that you need to do. As before, add the offerHistoryId document in your offerId collection, add the same object in your offers collection, but in this case you need to remove the older version of the offer from the offers collection.
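The bookkeeping described above can be sketched as follows, with dicts standing in for the per-offer history collection and the offers collection. The latest_id lookup table, the field names and the function save_version are assumptions added for the sketch; they are not part of the schema above.

```python
offer_history = {}  # offer_id -> {history_id: document}, the per-offer collection
offers = {}         # history_id -> document, latest version of each offer only
latest_id = {}      # offer_id -> history_id currently in offers (hypothetical helper)


def save_version(offer_id, history_id, details, date):
    """Write the new version in both places and drop the previous latest copy."""
    doc = {"offerId": offer_id, "date": date, **details}
    offer_history.setdefault(offer_id, {})[history_id] = doc
    previous = latest_id.get(offer_id)
    if previous is not None:
        del offers[previous]  # remove the older version from offers
    offers[history_id] = dict(doc)
    latest_id[offer_id] = history_id


save_version("offerA", "v1", {"price": 100}, date=1)
save_version("offerA", "v2", {"price": 120}, date=2)
save_version("offerB", "v1b", {"price": 50}, date=3)
```

After these calls, offers holds one document per offer while the full history of offerA is still available in its own collection.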
I can think of it like this. Each offers document will have an offerHistoryID field stored as a number.
You can have a separate root collection for the versioned documents of offers (say offers_transactions).
Now write an update trigger cloud function on offers document which will have both after and before values of the document.
Before doing the doc update, you can write the before values into the offers_transactions along with timestamp and latest historyID.
Increment the offerHistoryID by 1 for that offer and update the doc with new values.
Now you can query the root collection offers_transactions for historic transactions based on your filters. This way you can keep your root collection offers cleaner.
Thoughts?
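A minimal sketch of that archive-then-update flow, done inline with dicts rather than inside a real Cloud Function trigger. The field names, the sample values and the functions update_offer and history_of are assumptions for illustration only.

```python
offers = {"offer1": {"offerHistoryID": 1, "price": 100}}
offers_transactions = []  # stand-in for the root collection of archived versions


def update_offer(offer_id, new_values, timestamp):
    """Archive the current values, then bump the counter and apply the update."""
    before = dict(offers[offer_id])
    offers_transactions.append({"offerId": offer_id, "timestamp": timestamp, **before})
    offers[offer_id] = {
        **before,
        **new_values,
        "offerHistoryID": before["offerHistoryID"] + 1,
    }


def history_of(offer_id):
    """All archived versions of one offer, oldest first."""
    return [entry for entry in offers_transactions if entry["offerId"] == offer_id]


update_offer("offer1", {"price": 120}, timestamp=1700000000)
update_offer("offer1", {"price": 90}, timestamp=1700000500)
```

In a deployed version, the "before" values would come from the update trigger instead of being read inline as they are here.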
Here's a solution my team uses to leverage Google Cloud Functions to add every collection update to a dedicated "history" collection in Firestore (no command line necessary):
1) Identify the path of the document to watch: COLLECTION-NAME/{documentID} (or define a specific document to watch)
2) Create a new Cloud Function (1st gen because 2nd gen doesn't support Firestore triggers yet)
3) Set the trigger as any Firestore "write" event watching the document path from Step 1.
4) In the Cloud Function's inline code editor, select the language of your choice (I'll use Python), and include google-cloud-firestore==2.6.0 in your requirements.txt file (or whatever the latest version is)
5) Finally, define your Cloud Function's code (be sure to import Firestore correctly!)
from google.cloud import firestore

def hello_firestore(event, context):
    resource_string = context.resource
    # print out the resource string that triggered the function
    print(f"Function triggered by change to: {resource_string}.")
    # now print out the entire event object
    print(str(event))
    # now add the event to the 'history' collection
    db = firestore.Client(project="YOUR-PROJECT-ID")
    newHistDoc = db.collection(u'history').add(event)
I have the following collections:
-likes (collection)
    -{uid} (document)
        {otheruserUID: true, anotherUID: true ...}
-likedBy (collection)
    -{uid} (document)
        {otheruserUID: true, anotherUID: true ...}
A user can like other users. Given a user, I want to query for all matches of that user. Should I query the whole likes and likedBy data, run the match over the result and produce the matches? Is there an easier way to do this, or maybe a better way to model the data?
Personally, I would simply have a single collection, called likes. Each like generates a new document with an auto-id and contains 3 fields: user (an object containing the id and name of the user), likedBy (an object containing the id and name of the user who liked them) and timestamp (when they were liked).
You'll be able to carry out the following queries:
// Find all users who liked likedUser, sorted by user
db.collection('likes').where('likedBy.name', '!=', null).where('user.id', '==', likedUser).orderBy('likedBy.name');
// Find all users who were liked by likedByUser, sorted by user
db.collection('likes').where('user.name', '!=', null).where('likedBy.id', '==', likedByUser).orderBy('user.name');
The first time that you run these queries, you will get an error, telling you to create an index. This error will include the URL to create the index for you.
The first where is required to allow the orderBy to work, see the documentation section Range filter and orderBy on the same field
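For the original matching question, the results of those two queries can be intersected on the client. The sketch below assumes that a "match" means two users who liked each other, and uses a list of dicts in place of the likes collection; the function name matches_for and the sample IDs are made up for illustration.

```python
# Each dict mimics one document in the single "likes" collection described above.
likes = [
    {"user": {"id": "bob"}, "likedBy": {"id": "alice"}},    # alice likes bob
    {"user": {"id": "alice"}, "likedBy": {"id": "bob"}},    # bob likes alice
    {"user": {"id": "carol"}, "likedBy": {"id": "alice"}},  # alice likes carol
    {"user": {"id": "alice"}, "likedBy": {"id": "dave"}},   # dave likes alice
]


def matches_for(uid, like_docs):
    """Users that uid liked and who also liked uid (assumed meaning of a match)."""
    liked_by_uid = {d["user"]["id"] for d in like_docs if d["likedBy"]["id"] == uid}
    likers_of_uid = {d["likedBy"]["id"] for d in like_docs if d["user"]["id"] == uid}
    return sorted(liked_by_uid & likers_of_uid)
```

Each of the two sets corresponds to one of the queries shown earlier, so a match lookup costs two queries plus a set intersection in memory.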