Modeling one-to-one chat on Firebase

I'm building a one-to-one messaging feature. The intent behind it is the following:
There is a unique project, and people (two or more) can chat about the project, so we can think of a project as a room. I've been looking at different modeling structures; the most common is something like the following:
Chats
- projectId (room)
- messages
message
userId
name
profilePicture
posted (timestamp)
But I've been thinking in a flat structure something like
Messages
ProjectId
Message
userId
name
profilePicture
posted
The chat feature is going to have a huge impact on the web app I'm building. That being said, it is quite important to make the right decision (I'm sure there isn't always a right or wrong answer, but consider the purpose of the chat).
Just some questions that come to mind:
Are there any performance implications of using a flat structure?
What are the advantages of using a nested structure like the one in example #1?
Which solution is cheaper (reads/writes)?

There are benefits to both of the solutions you proposed. Let's dive into them:
Performance: they are pretty similar from this point of view. If you want to get a chat from Firestore, in the second case you simply query for the messages of a particular chat and parse the required information from the first document you receive (since each message contains the userId, name, profilePicture, etc.). With the first approach this operation is straightforward, since you are already asking for a Chat document.
Structure: the first solution is the one I prefer because it's clear what it does, and since Firestore is schemaless, it enforces a clear design. With the second approach you are basically flattening your DB, but you are also exposing your messages to privacy issues. Setting up rules in the first case is pretty straightforward: simply let users access only the chats they are involved in. In the second case, all users could potentially read each other's messages, which is not something you want.
Cost: this basically depends on what you will do with these documents. The cost of Firestore depends both on the number of documents read/written and on the amount of data you store. Here, the first solution is clearly better, since you are not adding redundancy for fields like profilePicture, name, userId, etc. These fields logically belong to the Chat entity, not to its messages.
I hope this helps since properly setting up a database is vital for any good project.
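To make the two layouts from the question concrete, here is a minimal sketch of the document paths each one implies. The helper names, the messageId parameter and the sample values are hypothetical. It also assumes that the nested layout keeps messages in a subcollection under each project chat, while the flat layout records the project as a field on every message; the question does not spell either of these out.

```javascript
// Hypothetical helpers; the collection names follow the question.

// Nested layout: each message lives under its project's chat document.
function nestedMessagePath(projectId, messageId) {
  return `chats/${projectId}/messages/${messageId}`;
}

// Flat layout (assumed): all messages share one top-level collection and
// the project is recorded as a field on the message itself.
function flatMessage(projectId, messageId, fields) {
  return {
    path: `messages/${messageId}`,
    data: { projectId, ...fields },
  };
}

const nested = nestedMessagePath('project42', 'msg1');
const flat = flatMessage('project42', 'msg1', {
  message: 'Hello',
  userId: 'user7',
  posted: 1700000000000,
});

console.log(nested);              // chats/project42/messages/msg1
console.log(flat.path);           // messages/msg1
console.log(flat.data.projectId); // project42
```

In the nested sketch the project id is part of the path, whereas in the flat sketch it is only a field, so any per-project access check would presumably have to inspect that field.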

Related

Best way to save multiple collections under one user UID

I am writing an app where there is not a lot of interaction with other users. Set and retrieve your own data only.
In Firebase Firestore, how could I model this so that everything fits under a user's UID?
Something that would look like this?
users/{uid}/user/
users/{uid}/settings/
users/{uid}/weather/
If I want to achieve something like this, then I need to create another UID:
users/{uid}/user/{uid}/{userInfo}
This feels a bit off to me.
Is this wrong? Would it be better if I moved every subcollection into its own collection?
Is this faster / more efficient?
Any help is appreciated!
The most common approaches for me:
Store the profile information, settings and weather in the user document (your {uid}) itself. This is most common for the profile information, but it's always worth considering for other types too: do they really need to be in their own documents?
Have a default name for a single subcollection for each user, and then have each information type as a document with a known name in there. So /users/$uid/documents/profile, /users/$uid/documents/settings, and /users/$uid/documents/weather. So now each information type is in a separate document, meaning you can for example secure access to them individually.
If the information for a certain type is repeated, I'd put that in documents in a known/named subcollection. So if there are many weathers, you'd get /users/$uid/weather/$weatherdocs. So with this you can now have an endless set of the specific type of information.
None of these is inherently better or worse, as it all depends on the use-cases of your app.
There will be performance differences between these approaches, as they require a different number of network requests. If this is a concern for your app, I'd recommend testing all approaches above to measure their relative performance against your requirements.
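The three approaches above can be sketched as path builders. Firestore paths alternate between collection and document segments, which is why a path like users/{uid}/settings names a collection and needs one more segment before it can hold data. The helper names below are hypothetical, and the check is plain string handling rather than a call into the Firestore SDK.

```javascript
// Firestore paths alternate collection/document/collection/..., so a path
// with an even number of segments names a document and an odd number
// names a collection.
function pathKind(path) {
  const segments = path.split('/').filter((s) => s.length > 0);
  return segments.length % 2 === 0 ? 'document' : 'collection';
}

// Hypothetical builders for the three approaches described above.
const userDoc = (uid) => `users/${uid}`;                          // approach 1
const namedDoc = (uid, name) => `users/${uid}/documents/${name}`; // approach 2
const weatherDoc = (uid, id) => `users/${uid}/weather/${id}`;     // approach 3

console.log(pathKind(userDoc('abc')));              // document
console.log(pathKind(namedDoc('abc', 'settings'))); // document
console.log(pathKind(weatherDoc('abc', 'w1')));     // document
console.log(pathKind('users/abc/settings'));        // collection
```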

Is it always safe to use eventId as the Firestore document id?

This article here recommends using the eventId as the document id to prevent multiple creations of a document due to background process retries. Is it guaranteed that there will never be a collision?
The article mentioned shows how to avoid duplicate items created by retries of an unsuccessful function. In short, it says that if you use the add method (reference) and the function is retried (having failed after the Firestore write), you may end up with two identical documents in Firestore under different automatically generated IDs.
As a solution, the author proposes using the eventId as the document ID and writing to it using set (reference).
This approach guarantees that retries of the same function invocation will not create duplicate items.
Back to the question... I think you are afraid that two different invocations will have the same event_id and the document could be overwritten. I think this is possible, but in my opinion it's out of scope for the article, as it answers a different question and keeps the use case as simple as possible to help explain the approach.
Let's imagine we have two different functions invoked by the same event, writing different content to the same collection. The result will be unpredictable, I think. However, in such a situation you can use the same mechanism, slightly extended, e.g. <function_name>_<event_id>. Using the example from the article, it would be a small change like:
...
return db.collection('contents').doc('<function_name>_'+eventId).set(content).then
...
So, in my understanding, if you are afraid of collisions, you should add additional elements to the document IDs you create, as in the example above.
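The difference between add and set under retries can be illustrated without Firestore at all. The following is a minimal sketch that uses an in-memory Map as a stand-in for a collection; FakeCollection, docIdFor, the function names and the event id are all made up for illustration and are not part of any Firebase API.

```javascript
// In-memory stand-in for a collection; not the Firestore SDK.
class FakeCollection {
  constructor() {
    this.docs = new Map();
    this.nextAutoId = 0;
  }
  // Like add(): every call creates a document under a fresh id.
  add(data) {
    const id = `auto_${this.nextAutoId++}`;
    this.docs.set(id, data);
    return id;
  }
  // Like doc(id).set(): writing to the same id again overwrites.
  set(id, data) {
    this.docs.set(id, data);
    return id;
  }
}

// Id scheme from the answer: <function_name>_<event_id>.
function docIdFor(functionName, eventId) {
  return `${functionName}_${eventId}`;
}

const withAdd = new FakeCollection();
const withSet = new FakeCollection();
const content = { title: 'hello' };
const eventId = 'event-123'; // made-up event id

// Simulate the same invocation running twice because of a retry.
for (let attempt = 0; attempt < 2; attempt++) {
  withAdd.add(content);
  withSet.set(docIdFor('saveContent', eventId), content);
}
// A second function handling the same event gets its own document.
withSet.set(docIdFor('indexContent', eventId), content);

console.log(withAdd.docs.size); // 2: the retry created a duplicate
console.log(withSet.docs.size); // 2: one document per function, no duplicates
```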
From my point of view, whether you can use an event_id as a Firestore document ID depends on your context and requirements.
For example, from the "business" point of view: is the message/event really a unique business-related thing (so that you really want to avoid duplicate messages)? Or is there some other business entity that has to be unique, while there can be more than one message (with different event_ids) about it?
On top of that, to the best of my knowledge, it may be good practice to generate Firestore document IDs randomly (as a hash, a GUID, etc.). In that case, search/retrieval from Firestore should work "faster". I don't know whether the event_id is "random" enough in your context. Maybe it is OK, maybe not...
In my personal experience, I try to generate a document ID as the hex digest of a hash of a string (possibly a composite string) that is supposed to be unique in the business context. For example, suppose the event/message is a google.storage.object.finalize event. In that case, I would use some metadata about the underlying object/file. Depending on the business context and requirements, that can be (or not be) the bucket name, object name, size, md5 or crc32c, etc., or a combination of those elements. The chosen elements are concatenated into a string, a hash is calculated, and the hex digest of that hash becomes the document ID in the Firestore collection.
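A minimal sketch of that hashing step, using Node's built-in crypto module. The choice of SHA-256, the newline separator, the helper name documentIdFrom and the sample bucket and object values are assumptions made for illustration; the answer does not say which hash or which fields to use.

```javascript
const { createHash } = require('node:crypto');

// Build a deterministic document id from fields that are unique in the
// business context. Which fields to pass in is up to the application.
function documentIdFrom(parts) {
  // Join with a separator that should not occur inside the parts, so that
  // ['ab', 'c'] and ['a', 'bc'] do not collapse into the same string.
  const key = parts.join('\n');
  return createHash('sha256').update(key).digest('hex');
}

// Made-up metadata for a finalized storage object.
const id = documentIdFrom(['my-bucket', 'photos/cat.png', '1024']);
const sameInput = documentIdFrom(['my-bucket', 'photos/cat.png', '1024']);
const otherInput = documentIdFrom(['my-bucket', 'photos/dog.png', '1024']);

console.log(id.length);         // 64 hex characters for SHA-256
console.log(id === sameInput);  // true: same input, same id
console.log(id === otherInput); // false: different object, different id
```

Because the id is derived only from the input fields, a retried invocation that sees the same metadata writes to the same document.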

Firebase Cloud Firestore Social network database design

I have a simple question. I am building an Instagram clone app, and I want to show each user to their friends. They can also see their friends list. I am using Cloud Firestore. However, I'm a little confused about how to store a user's friends data. Should I create a new collection called friendsList,
or should I hold the data in the users collection as a friends array?
In the first approach, I would store the user data again whenever a user adds a new friend. I am new to both Firestore and NoSQL, so I would be thankful if anyone could explain.
I'm not going to "answer" as such, but explain the philosophy of NoSQL a bit. The best approach is to design your queries first (i.e. what do you want to get from the database), then design your database schema to make getting the results of those queries efficient and affordable. There are many ways to organize data; you want to take advantage of NoSQL "schema-less" to make your schema match your needs, not the other way around.
Other things to keep in mind: DRY is less critical to NoSQL. Static data (i.e. never or rarely changes) can be stored in multiple places (i.e. a friend's name might be in their profile and in a friends-list) if that saves reads & writes (which are the biggest factor in costs).
So how to organize your database? I don't know; what do you want your database to do?
You should read this tutorial. It is about MySQL, but that doesn't matter: if you understand the tutorial, you can apply the same ideas to Firebase.
I leave a tip below.

Firestore database model for Notion-like modules [duplicate]

I have seen videos and read the documentation of Cloud firestore, from Google Firebase service, but I can't figure this out coming from realtime database.
I have a web app in mind in which I want to store my providers for different categories of products. I want to perform a search query across all my products to find which providers I have for a given product, and eventually access that provider's info.
I am planning to use this structure for this purpose:
Providers (Collection)
  Provider 1 (Document)
    Name
    City
    Categories
  Provider 2
    Name
    City
Products (Collection)
  Product 1 (Document)
    Name
    Description
    Category
    Provider ID
  Product 2
    Name
    Description
    Category
    Provider ID
So my question is, is this approach the right way to access the provider info once I get the product I want?
I know this is possible in the Realtime Database: using the provider ID, I could search for that provider in the providers section. But with Firestore I am not sure if it's possible or if this is the right approach.
What is the correct way to structure this kind of data in Firestore?
You need to know that there is no "perfect", "the best" or "the correct" solution for structuring a Cloud Firestore database. The best and correct solution is the solution that fits your needs and makes your job easier. Bear also in mind that there is also no single "correct data structure" in the world of NoSQL databases. All data is modeled to allow the use-cases that your app requires. This means that what works for one app, may be insufficient for another app. So there is not a correct solution for everyone. An effective structure for a NoSQL type database is entirely dependent on how you intend to query it.
The way you are structuring your data looks good to me. In general, there are two ways in which you can achieve the same thing. The first is to keep a reference to the provider in the product object (as you already do); the second is to copy the entire provider object into the product document. This last technique is called denormalization and is quite a common practice when it comes to Firebase. We often duplicate data in NoSQL databases to suit queries that may not be possible otherwise. For a better understanding, I recommend you see this video, Denormalization is normal with the Firebase Database. It's for Firebase Realtime Database, but the same principles apply to Cloud Firestore.
Also, when you are duplicating data, there is one thing you need to keep in mind: in the same way you add data, you need to maintain it. In other words, if you want to update/delete a provider object, you need to do it in every place where it exists.
You might wonder now, which technique is best. In a very general sense, the best way in which you can store references or duplicate data in a NoSQL database is completely dependent on your project's requirements.
So you should ask yourself some questions about the data you want to duplicate or simply keep as references:
Is the data static, or will it change over time?
If it does, do you need to update every duplicated instance of the data so they all stay in sync? This is what I have also mentioned earlier.
When it comes to Firestore, are you optimizing for performance or cost?
If your duplicated data needs to change and stay in sync at the same time, then you might have a hard time in the future keeping all those duplicates up to date. It might also mean you spend a lot of money keeping all those documents fresh, as it will require a read and a write for each document for each change. In this case, holding only references will be the winning variant.
In this kind of approach, you write very little duplicated data (pretty much just the Provider ID). So that means that your code for writing this data is going to be quite simple and quite fast. But when reading the data, you will need to load the data from both collections, which means an extra database call. This typically isn't a big performance issue for reasonable numbers of documents, but definitely does require more code and more API calls.
If you need your queries to be very fast, you may prefer to duplicate more data so that the client only has to read one document per item queried, rather than multiple documents. But you may also be able to rely on local client caches to make this cheaper, depending on the data the client has to read.
In this approach, you duplicate all data for a provider in each product document. This means that the code to write this data is more complex, and you're definitely storing more data: one more provider object for each product document. You'll also need to figure out if and how to keep it up to date in each document. But on the other hand, reading a product document now gives you all information about the provider in one read.
This is a common consideration in NoSQL databases: you'll often have to consider write performance and disk storage vs. reading performance and scalability.
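The trade-off between the two approaches can be sketched by counting reads and writes against in-memory Maps. Nothing below touches the Firestore SDK; the collection contents, the read helper and renameProvider are hypothetical and exist only to show where the extra read or the extra writes come from.

```javascript
// In-memory stand-ins for collections; not the Firestore SDK.
const providers = new Map([['prov1', { name: 'Acme', city: 'Lisbon' }]]);

// Variant A: the product stores only a reference (the provider id).
const productsByRef = new Map([
  ['prod1', { name: 'Widget', providerId: 'prov1' }],
]);

// Variant B: the product embeds a copy of the provider data.
const productsEmbedded = new Map([
  ['prod1', { name: 'Widget', provider: { id: 'prov1', name: 'Acme', city: 'Lisbon' } }],
]);

let reads = 0;
function read(collection, id) {
  reads++;
  return collection.get(id);
}

// Variant A: two reads to show a product together with its provider.
reads = 0;
const productA = read(productsByRef, 'prod1');
const providerA = read(providers, productA.providerId);
const readsWithReference = reads;

// Variant B: one read, because the provider travels with the product.
reads = 0;
const productB = read(productsEmbedded, 'prod1');
const readsWithCopy = reads;

// The price of variant B: a rename must touch every embedded copy.
function renameProvider(id, newName) {
  let writes = 0;
  providers.get(id).name = newName;
  writes++;
  for (const product of productsEmbedded.values()) {
    if (product.provider.id === id) {
      product.provider.name = newName;
      writes++;
    }
  }
  return writes;
}
const writesForRename = renameProvider('prov1', 'Acme Ltd');

console.log(readsWithReference, readsWithCopy, writesForRename); // 2 1 2
```

With a single product the rename costs two writes; the count grows by one for every additional product that embeds the same provider.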
For your choice of whether or not to duplicate some data, it is highly dependent on your data and its characteristics. You will have to think that through on a case-by-case basis.
So in the end, remember that both are valid approaches, and neither of them is inherently better than the other. It all depends on what your use-cases are and how comfortable you are with this new technique of duplicating data. Data duplication is the key to faster reads, not just in Cloud Firestore or Firebase Realtime Database but in general. Any time you add the same data to a different location, you're duplicating data in favor of faster read performance. Unfortunately, in return you have more complex updates and higher storage/memory usage. Note also that extra calls are not expensive in Firebase Realtime Database, but in Firestore they are. How much data duplication versus how many extra database calls is optimal for you depends on your needs and your willingness to let go of the "Single Point of Definition" mindset, which is very subjective.
After finishing a few Firebase projects, I find that my reading code gets drastically simpler if I duplicate data. But of course, the writing code gets more complex at the same time. It's a trade-off between these two and your needs that determines the optimal solution for your app. Furthermore, to be even more precise you can also measure what is happening in your app using the existing tools and decide accordingly. I know that is not a concrete recommendation but that's software development. Everything is about measuring things.
Remember also that some database structures are easier to protect with security rules than others. So try to find a schema that can be easily secured using Cloud Firestore Security Rules.
Please also take a look at my answer from this post where I have explained more about collections, maps and arrays in Firestore.

What's the most scalable and performant solution to store chat logs on firebase realtime database?

I'm working on a chat client using the Firebase Realtime Database as the database. The way it currently works is that it saves a chat log between two people in a chat collection, with each entry keyed in the format <uid>-<uid>. This works great: it just looks at your uid and the uid of the person you want to chat with and sorts them, so the format is always consistent. Then it checks whether that entry exists in the chat collection and, if so, adds to that entry. Otherwise it creates a new one.
This works awesome. I'm trying to think ahead, though, in case we want to be able to have multiple people talk together like in Slack. I could just add 3 or even 4 people's uids as the key, but eventually it's going to be insanely long. The limit on a Firebase key is 768 bytes. Apparently that's somewhere between 500 and 700 characters. I doubt the key will get that long, but if we can figure out a solution that is more scalable now and won't require us to fix our data later, I'd rather do that.
I was thinking that each chat entry could have a participants array with the uid's of all the users in that chat. Then if you want to chat with someone, we would need to query all chat entries and check the arrays in each of them for the current user uid and the uid of the person(s) they want to chat with. That doesn't seem very efficient though.
Any thoughts on which implementation is better / more scalable / performant? Or perhaps a suggestion for another implementation?
How about simply using the hash of the resulting concatenation of UIDs?
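A minimal sketch of that idea, using Node's built-in crypto module: sort the UIDs as the question already does, join them, and hash the result so the key has a fixed length no matter how many people are in the room. SHA-256, the "-" separator and the helper name roomKey are assumptions made for illustration.

```javascript
const { createHash } = require('node:crypto');

// Derive a fixed-length room key from any number of participant UIDs.
function roomKey(uids) {
  // Sort a copy so the key does not depend on the order the UIDs arrive in.
  const canonical = [...uids].sort().join('-');
  return createHash('sha256').update(canonical).digest('hex');
}

const keyA = roomKey(['uid2', 'uid1', 'uid3']);
const keyB = roomKey(['uid3', 'uid2', 'uid1']);
const keyC = roomKey(['uid1', 'uid2']);

console.log(keyA === keyB); // true: same participants, same key
console.log(keyA === keyC); // false: different participants
console.log(keyA.length);   // 64 characters, well under the key size limit
```

The hash cannot be reversed into the participant list, so the UIDs would still need to be stored somewhere inside the room entry if they have to be read back.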
Alternatively:
Come up with your own unique room key, e.g. using a push ID.
Create a new top-level node with chatroom-keys and store the concatenated UIDs as the value there:
chatroom-keys
  push-id1: uid1-uid2-uid3
  push-id2: uid1-uid2-uid3-uid4-uid5-uid6
  push-id3: uid3-uid4-uid5-uid6-uid7-uid8-uid9-uid10
In this structure you can look up the room key for a set of participants by:
firebase.database().ref("chatroom-keys").orderByValue().equalTo("uid1-uid2-uid3")
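To show what that lookup is expected to return, here is a minimal in-memory sketch of the same index. A plain object stands in for the chatroom-keys node, and findRoomKey imitates the effect of the orderByValue().equalTo(...) query; the helper names and sample data are hypothetical, and none of this calls the Firebase SDK.

```javascript
// In-memory stand-in for the chatroom-keys node; not the Firebase SDK.
const chatroomKeys = {
  'push-id1': 'uid1-uid2-uid3',
  'push-id2': 'uid1-uid2-uid3-uid4-uid5-uid6',
};

// Build the stored value the same way for writes and lookups:
// sorted UIDs joined with "-".
function participantsValue(uids) {
  return [...uids].sort().join('-');
}

// Return the room key whose value matches the participants, or null.
function findRoomKey(index, uids) {
  const wanted = participantsValue(uids);
  const hit = Object.entries(index).find(([, value]) => value === wanted);
  return hit ? hit[0] : null;
}

console.log(findRoomKey(chatroomKeys, ['uid3', 'uid1', 'uid2'])); // push-id1
console.log(findRoomKey(chatroomKeys, ['uid1', 'uid9']));         // null
```

When the lookup returns null, the caller would create a new room key and add its entry to the index.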

Resources