Firestore database model for Notion-like modules [duplicate] - firebase

I have watched videos and read the documentation for Cloud Firestore, from Google's Firebase service, but I can't figure this out coming from the Realtime Database.
I have a web app in mind in which I want to store my providers for different categories of products. I want to perform a search query across all my products to find which providers I have for a given product, and eventually access that provider's info.
I am planning to use this structure for this purpose:
Providers ( Collection )
    Provider 1 ( Document )
        Name
        City
        Categories
    Provider 2
        Name
        City
Products ( Collection )
    Product 1 ( Document )
        Name
        Description
        Category
        Provider ID
    Product 2
        Name
        Description
        Category
        Provider ID
So my question is: is this approach the right way to access the provider info once I get the product I want?
I know this is possible in the Realtime Database, where I could use the provider ID to look up that provider in the providers section, but with Firestore I am not sure if it's possible or if this is the right approach.

What is the correct way to structure this kind of data in Firestore?
You need to know that there is no "perfect", "the best" or "the correct" solution for structuring a Cloud Firestore database. The best and correct solution is the solution that fits your needs and makes your job easier. Bear also in mind that there is also no single "correct data structure" in the world of NoSQL databases. All data is modeled to allow the use-cases that your app requires. This means that what works for one app, may be insufficient for another app. So there is not a correct solution for everyone. An effective structure for a NoSQL type database is entirely dependent on how you intend to query it.
The way you are structuring your data looks good to me. In general, there are two ways in which you can achieve the same thing. The first is to keep a reference to the provider in the product object (as you already do); the second is to copy the entire provider object into the product document. This second technique is called denormalization and is quite a common practice when it comes to Firebase. We often duplicate data in NoSQL databases to suit queries that may not be possible otherwise. For a better understanding, I recommend you see this video, Denormalization is normal with the Firebase Database. It's for Firebase Realtime Database but the same principles apply to Cloud Firestore.
Also, when you are duplicating data, there is one thing you need to keep in mind. In the same way you are adding data, you need to maintain it. In other words, if you want to update/delete a provider object, you need to do it in every place that it exists.
You might wonder now, which technique is best. In a very general sense, the best way in which you can store references or duplicate data in a NoSQL database is completely dependent on your project's requirements.
So you should ask yourself some questions about the data you want to duplicate or simply keep as references:
Is the data static or will it change over time?
If it does, do you need to update every duplicated instance of the data so they all stay in sync? This is what I have also mentioned earlier.
When it comes to Firestore, are you optimizing for performance or cost?
If your duplicated data needs to change and stay in sync at the same time, then you might have a hard time in the future keeping all those duplicates up to date. It may also mean you spend a lot of money keeping all those documents fresh, as each change will require a read and a write for each document. In this case, holding only references will be the winning variant.
In this kind of approach, you write very little duplicated data (pretty much just the Provider ID). So that means that your code for writing this data is going to be quite simple and quite fast. But when reading the data, you will need to load the data from both collections, which means an extra database call. This typically isn't a big performance issue for reasonable numbers of documents, but definitely does require more code and more API calls.
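A minimal in-memory sketch of this reference-only approach, in Python. Plain dictionaries stand in for the Providers and Products collections; the field names and IDs (`provider_id`, `prov1`, and so on) are made up for illustration, and nothing here calls the real Firestore SDK. It only shows why reading a product together with its provider takes two lookups.

```python
# In-memory stand-ins for two Firestore collections (hypothetical data).
providers = {
    "prov1": {"name": "Acme Supplies", "city": "Austin"},
    "prov2": {"name": "Globex", "city": "Boston"},
}
products = {
    "prod1": {"name": "Drill", "category": "tools", "provider_id": "prov1"},
    "prod2": {"name": "Saw", "category": "tools", "provider_id": "prov2"},
}


def get_product_with_provider(product_id):
    """Read the product, then follow its provider_id: two lookups in total."""
    product = products[product_id]                # lookup 1: product document
    provider = providers[product["provider_id"]]  # lookup 2: provider document
    return {**product, "provider": provider}


result = get_product_with_provider("prod1")
print(result["provider"]["name"])
```

With a real database each of the two lookups would be a separate document read, which is the extra call described above.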
If you need your queries to be very fast, you may prefer to duplicate more data so that the client only has to read one document per item queried, rather than multiple documents. But you may also be able to rely on local client caches to make this cheaper, depending on the data the client has to read.
In this approach, you duplicate all the data for a provider in each product document. This means that the code to write this data is more complex, and you're definitely storing more data: one more provider object for each product document. You'll also need to figure out if and how to keep it up to date in each document. But on the other hand, reading a product document now gives you all the information about its provider in one read.
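The maintenance cost of duplication can be sketched the same way. The example below is an in-memory illustration only, with hypothetical names (`update_provider`, `provider_id`, the sample data); it is not Firestore API code. It embeds a copy of the provider in each product and counts how many documents one provider change has to touch.

```python
import copy

# Hypothetical data: each product embeds a full copy of its provider.
providers = {
    "prov1": {"name": "Acme Supplies", "city": "Austin"},
    "prov2": {"name": "Globex", "city": "Boston"},
}
products = {
    "prod1": {"name": "Drill", "provider_id": "prov1",
              "provider": {"name": "Acme Supplies", "city": "Austin"}},
    "prod2": {"name": "Saw", "provider_id": "prov1",
              "provider": {"name": "Acme Supplies", "city": "Austin"}},
    "prod3": {"name": "Glue", "provider_id": "prov2",
              "provider": {"name": "Globex", "city": "Boston"}},
}


def update_provider(provider_id, changes):
    """Update a provider and every embedded copy; return documents written."""
    providers[provider_id].update(changes)
    writes = 1  # the provider document itself
    for product in products.values():
        if product["provider_id"] == provider_id:
            # Refresh the duplicated copy so it stays in sync.
            product["provider"] = copy.deepcopy(providers[provider_id])
            writes += 1
    return writes


writes = update_provider("prov1", {"city": "Dallas"})
print(writes)
```

Reading `products["prod1"]` now returns the provider in a single lookup, but the single change to `prov1` had to be written to the provider and to every product that embeds it.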
This is a common consideration in NoSQL databases: you'll often have to consider write performance and disk storage vs. reading performance and scalability.
For your choice of whether or not to duplicate some data, it is highly dependent on your data and its characteristics. You will have to think that through on a case-by-case basis.
So in the end, remember that both are valid approaches, and neither of them is inherently better than the other. It all depends on what your use-cases are and how comfortable you are with this new technique of duplicating data. Data duplication is the key to faster reads, not just in Cloud Firestore or Firebase Realtime Database but in general. Any time you add the same data to a different location, you're duplicating data in favor of faster read performance. Unfortunately, in return you get more complex updates and higher storage/memory usage. Note, however, that extra calls are not expensive in the Firebase Realtime Database, whereas in Firestore they are. How much data duplication versus how many extra database calls is optimal for you depends on your needs and your willingness to let go of the "Single Point of Definition" mindset, which is quite subjective.
After finishing a few Firebase projects, I find that my reading code gets drastically simpler if I duplicate data. But of course, the writing code gets more complex at the same time. It's a trade-off between these two and your needs that determines the optimal solution for your app. Furthermore, to be even more precise you can also measure what is happening in your app using the existing tools and decide accordingly. I know that is not a concrete recommendation but that's software development. Everything is about measuring things.
Remember also that some database structures are easier to protect with security rules than others. So try to find a schema that can be easily secured using Cloud Firestore Security Rules.
Please also take a look at my answer from this post where I have explained more about collections, maps and arrays in Firestore.

Related

Firebase Cloud Firestore Social network database design

I have a simple question. I am building an Instagram clone app and I want to show each user to their friends. They can also see their friends list. I am using Cloud Firestore. However, I'm a little confused about how to store a user's friends data. Should I create a new collection such as friendsList, or should I hold the data in the users collection as a friends array?
In the first approach I would create the user data again whenever a user adds a new friend. I am new to both Firestore and NoSQL, so I would be thankful if anyone could explain.
I'm not going to "answer" as such, but explain the philosophy of NoSQL a bit. The best approach is to design your queries first (i.e. what do you want to get from the database), then design your database schema to make getting the results of those queries efficient and affordable. There are many ways to organize data; you want to take advantage of NoSQL "schema-less" to make your schema match your needs, not the other way around.
Other things to keep in mind: DRY is less critical to NoSQL. Static data (i.e. never or rarely changes) can be stored in multiple places (i.e. a friend's name might be in their profile and in a friends-list) if that saves reads & writes (which are the biggest factor in costs).
So how to organize your database? I don't know; what do you want your database to do?
You should read this tutorial. It is about MySQL, but that does not matter: if you understand the tutorial, you can apply the same ideas to Firebase.
I leave a tip below.

Is this pattern valid for communicating among multiple firestore databases?

I'm currently brainstorming and wondering if it's possible to easily communicate among multiple Firestore databases. If so, I could isolate collections, and therefore also keep writes/updates on those collections from competing with other services, reducing the risk that I hit the limit of 10,000 writes per second on a given database.
Conceptually, I figure I can capture the necessary information from one document in DB_A (including the doc_id) in a read and then set that document in DB_B with the matching doc_id.
In a working example, perhaps one page has a lot of content (documents) that I need to generate and I don't want those writes to compete with writes used in other services on my app. When a user visits this page, we show those documents from DB_A and if the user is interested in one of those documents, we can take that document that we've effectively already read, and now write it into DB_B where user-specific content lives. It seems practical enough. Are there any indexing problems / other problems that could come out of this solution that I'm not seeing?
In the example you give the databases themselves are not communicating, but your app is communicating with multiple database instances. That is indeed possible. Since you can only have one Firestore instance per project, you will need to add multiple projects to your app.
What you're describing is known as sharding, as each database becomes a shard of (a subset of) your entire data set.
Note that it is quite uncommon to shard Firestore. If you predict such a high volume of writes, also have a look at Firebase's Realtime Database, as that is typically better suited to use-cases with more, smaller writes. Firestore is more suited to use-cases that have fewer, larger writes and many more readers. While you may still need to shard with Realtime Database (and possibly shard more to reach the same read capacity), it can have multiple database instances per project, making the process easier to manage.
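The copy step described in the question (read a document from DB_A, then set it in DB_B under the same doc_id) can be sketched as follows. This is an in-memory Python illustration under stated assumptions: two dictionaries stand in for the two database instances, and `copy_document` and the collection names are hypothetical, not part of any Firebase SDK.

```python
# Two dictionaries stand in for two separate database instances (hypothetical).
db_a = {"content": {"doc42": {"title": "Generated article", "views": 10}}}
db_b = {"user_content": {}}


def copy_document(source_db, source_collection, doc_id,
                  target_db, target_collection):
    """Read a document from one database and write it to another, same ID."""
    data = source_db[source_collection][doc_id]         # read from DB_A
    target_db[target_collection][doc_id] = dict(data)   # write a copy to DB_B
    return doc_id


copy_document(db_a, "content", "doc42", db_b, "user_content")
print(db_b["user_content"]["doc42"]["title"])
```

As in the answer above, it is the application that talks to both databases; the copy in DB_B is independent of the original, so later changes in DB_A are not reflected unless the app copies them again.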

Modeling one to one chat on firebase

I'm building a one-to-one messaging feature. The intent behind it is the following:
There is a unique project, and people (two or more) can chat about the project, so we can think of a project as a room. I've been looking at different modeling structures; the most common is something like the following:
Chats
- projectId (room)
- messages
message
userId
name
profilePicture
posted (timestamp)
But I've been thinking in a flat structure something like
Messages
ProjectId
Message
userId
name
profilePicture
posted
The chat feature is going to have a huge impact on the web app I'm building, so it is quite important to make the right decision (I'm sure there is not always a right or wrong, but consider the purpose of the chat).
Just some questions that come to my mind:
are there any implications in performance by using a flat structure?
what are the advantages of using a nested structure like the one in example #1?
which solution is cheaper? (reads/writes)
There are benefits to both of the solutions you proposed. Let's dive into them:
performance: they are pretty similar from this point of view. In the second case, if you want to get a chat from Firestore, you simply query for the messages of a particular chat and parse the required information from the first document you receive (since each message carries the userID, name, profilePicture, etc.). With the first approach this operation is straightforward, since you are already asking for a Chat document.
structure: the first solution is the one I prefer because it is clear what it does and, since Firestore is schemaless, it enforces a clear design. With the second approach you are basically flattening your DB, but you are also exposing your messages to privacy issues. Setting up rules in the first case is pretty straightforward: simply let users access only the chats they are involved in. But in the second case, all users can potentially read each other's messages, which is not something you want.
cost: this basically depends on what you will do with these documents. The cost of Firestore depends both on the number of documents read/written and on the amount of data you store. Here, the first solution is clearly better since you are not adding redundancy for fields like profilePicture, name, userID, etc. These fields logically belong to the Chat entity, and not to its messages.
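The access rule mentioned under "structure" (users may only read the chats they are involved in) can be illustrated with a small in-memory Python sketch. Everything here is an assumption made for illustration: the `participants` field, the message fields, and the `read_messages` helper are not in the original question, and real enforcement would live in Cloud Firestore Security Rules rather than in client code.

```python
# Hypothetical nested layout: one chat per project, messages inside it.
chats = {
    "projectA": {
        "participants": {"alice", "bob"},  # assumed field, not in the question
        "messages": [
            {"user_id": "bob", "text": "Hello", "posted": 2},
            {"user_id": "alice", "text": "Hi", "posted": 1},
        ],
    },
}


def read_messages(project_id, user_id):
    """Return a chat's messages in posting order, for participants only."""
    chat = chats[project_id]
    if user_id not in chat["participants"]:
        raise PermissionError(f"{user_id} is not part of {project_id}")
    return sorted(chat["messages"], key=lambda m: m["posted"])


print([m["text"] for m in read_messages("projectA", "alice")])
```

Because the messages hang off a single chat entry, one membership check covers all of them, which is the point the answer makes about rules being simpler in the nested layout.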
I hope this helps since properly setting up a database is vital for any good project.

Firebase Realtime Database vs Cloud Firestore

Edit: After posting the question I thought I could also make this post a quick reference for those of you who need a quick peek at some of the differences between these two technologies, which might help you decide on one of them eventually. I will be editing this question and adding more info as I learn more.
I have decided to use Firebase for the backend of my project. For Firestore it says "the next generation of the realtime database". Now I am trying to decide which way to go: Realtime Database or Cloud Firestore?
Billing:
At first glance, it looks like Firestore charges per number of results returned, number of reads, number of writes/updates, etc. The Realtime Database charges based on the data transmitted; the number of read-write operations is irrelevant. Both also charge for the data stored on Google's servers (I think in this respect Firestore is the cheaper one). Why am I mentioning this price point? Because from my point of view, although it might carry less weight, it is also a point to consider when choosing one over the other.
Scaling:
Firestore seems to scale horizontally seamlessly. I think this is not possible with the real-time database.
Edit:
In the real-time database, you need to shard your data yourself using multiple databases. And you can only do this if you are on the Blaze pricing plan.
ref: https://firebase.google.com/docs/database/usage/sharding
Performance & Indexing:
Another thing is that the data structure is different in the two. The real-time database stores the data as a JSON object in whatever way we structure it. Firestore structures the data as collections and documents. Hence the querying also differs between the two.
I think Firestore does automatic indexing, which increases read performance greatly (but will decrease write performance). I am not sure if this is also the case with the real-time database.
Edit:
The real-time database does not automatically index your data. You need to do it yourself after a solid inspection of your data and your needs.
ref:https://firebase.google.com/docs/database/security/indexing-data
What other differences can you think of?
What would be (or has been) your choice for different types of projects?
Do you still go with the real-time database or have you migrated from that to the firestore? If so why?
And one last thing. How would you compare the SDKs of these two?
Thanks a lot!
What other differences can you think of?
Here is what I think. I have six months of experience with the Realtime Database, and the difference is that Firestore makes sorting data easy. For example, to retrieve user names ordered by timestamp:
Query firstQuery = firestore.collection("Names").orderBy("timestamp", Query.Direction.DESCENDING).limit(10); // load 10 names
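To make the meaning of that query concrete, here is an in-memory Python sketch of what "order by timestamp descending, limit 10" returns. The sample documents and the `latest` helper are hypothetical and only mimic the query's semantics; this is not the Firestore client API.

```python
# Hypothetical documents from a "Names" collection.
names = [{"name": f"user{i}", "timestamp": i} for i in range(25)]


def latest(documents, limit=10):
    """Sort by timestamp, newest first, and keep the first `limit` entries."""
    ordered = sorted(documents, key=lambda d: d["timestamp"], reverse=True)
    return ordered[:limit]


first_page = latest(names)
print([d["name"] for d in first_page])
```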
What would be (or has been) your choice for different types of projects?
For me, the Realtime Database is for data streaming: when I work with an Arduino, for example, I want to store a drone's speed.
Firestore is for a smart office (air conditioners, room lights) and for enterprise data such as inventory quantities.
Do you still go with the real-time database or have you migrated from that to the firestore? If so why?
I still go with the Realtime Database, because I need a tree structure for displaying streaming data rather than querying tables as in Firestore.

How to design a Cloud Firestore database schema

Migrating from realtime database to cloud firestore needs a total redesign of the database. For this I created an example with some main design decisions.
See picture and the database design in the spreadsheet below.
My two questions are:
1 - when I have a one to many relation is it also an option to store information as an array within the document? See line 8 in database design.
2 - Should I include only a reference, or duplicate all information in the one to many relation. See line 38 in the database model.
https://docs.google.com/spreadsheets/d/13KtzSwR67-6TQ3V9X73HGsI2EQDG9FA8WMN9CCHKq48/edit?usp=sharing
In general: keep the data store as shallow as possible, i.e., avoid subcollections and nesting.
Data can be related one-to-one, one-to-many, or many-to-many. Firestore is an automatically indexed realtime datastore. Firestore is often subscribed to rather than just a one time query/response (the realtime nature of the system).
Regarding the Firestore data model, always consider How will I query this data store?. Use subcollections, arrays, and maps sparingly (rarely) and only if you must (and you most likely don't need to). Use auto-id's vs human readable id's, e.g. use 000kztLDGafF4uKb8Cal rather than banana for document ID's.
As app functionality increases, server-side scripting with Cloud Functions for Firebase and/or the Admin SDK becomes an invaluable tool for managing (creating and indexing) many-to-many data relationships. For example, full-text search is not supported in Firestore. This boils down to what seems like a barrier to implementing robust search functionality on your app.
In conclusion, try to avoid subcollections, nesting, arrays, and maps. Follow the keep it simple stupid, KISS, principle. Once your app scales up and/or requires more functionality, server-side scripting can be used to keep your app responsive (fast) while offering robust features.
For Question 1 there's a solution in the firestore docs:
https://cloud.google.com/firestore/docs/solutions/arrays
instead of using an array you use a map of values and set them to 'true' which allows you to query for them, like so:
teachers: {
    "teacherid1": true,
    "teacherid2": true,
    "teacherid3": true
}
And for Question 2, you just need to save the teacher-ids because if you have those you can easily query for the corresponding data.
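The map-of-true pattern above can be sketched in Python with plain dictionaries. The `courses` collection, its documents, and the `courses_for_teacher` helper are hypothetical; the sketch only mirrors the idea of filtering on `teachers.<id> == true` and does not use the Firestore SDK.

```python
# Hypothetical documents, each with a `teachers` map of id -> True.
courses = {
    "course1": {"title": "Algebra",
                "teachers": {"teacherid1": True, "teacherid2": True}},
    "course2": {"title": "Biology", "teachers": {"teacherid3": True}},
    "course3": {"title": "Chemistry", "teachers": {"teacherid1": True}},
}


def courses_for_teacher(teacher_id):
    """Return the IDs of documents whose teachers map has teacher_id set."""
    return sorted(
        course_id
        for course_id, doc in courses.items()
        if doc["teachers"].get(teacher_id) is True
    )


print(courses_for_teacher("teacherid1"))
```

Storing only the teacher IDs, as suggested for Question 2, keeps each document small; the teacher's details are then fetched separately by ID when needed.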
