I'm building a travel-planning app using Flutter. The app will help people plan their trips by offering a few options to choose from, such as the cheapest, fastest or shortest route.
I'm quite new to Firebase and I need some advice on the data structure. I was thinking of including public transportation such as trains. Each train will have its own schedule. What is the best way to structure this schedule in Firestore so that I can create a view that displays the train schedule?
If it were me, I would want to map out how I intend to interact with the application. I usually draw this out by hand as I find it much quicker than trying to model something electronically. I review other similar apps at the same time and try and identify data that are missing. Start simple and build up the various data points that you want to include.
The main thing to note is that Cloud Firestore is a NoSQL, document-oriented database; there are no tables or rows. You store data in documents, each of which contains a set of key-value pairs, and documents are organised into collections.
Once you have a basic structure, you can opt for one of the Firestore data structures and test it out:
Documents
Multiple collections
Subcollections within documents
Nested data
A typical use case is a chat app in which you want to store a user's three most recently visited chat rooms as a nested list in their profile.
Subcollections
You can create collections within documents when you have data that might expand over time; for example, you might create a collection of users or messages within chat room documents.
Root-level collections
You can create collections at the root level to organise distinct data sets, for example one collection for users and another for rooms and messages.
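For the train schedule in the question, the three options above could be combined roughly as follows. This is only a sketch: the collection names (`lines`, `trips`, `stations`), every field name and the sample values are assumptions made for illustration, and no Firestore SDK calls are involved.

```typescript
// Hypothetical shapes for a train schedule; all names are assumptions.
interface Departure {
  stationId: string; // ID of a document in a root-level "stations" collection
  departsAt: string; // "HH:MM", 24-hour local time
}

interface Trip {
  tripId: string;
  fareCents: number;
  durationMinutes: number;
  stops: Departure[]; // nested data: small and bounded, so it stays in the document
}

// Trips keep growing over time, so they live in a subcollection under each line.
function tripPath(lineId: string, tripId: string): string {
  return "lines/" + lineId + "/trips/" + tripId;
}

// Choose between trips that have already been fetched for the view.
function pickTrip(trips: Trip[], by: "cheapest" | "fastest"): Trip | undefined {
  const key = (t: Trip): number =>
    by === "cheapest" ? t.fareCents : t.durationMinutes;
  let best: Trip | undefined = undefined;
  for (const t of trips) {
    if (best === undefined || key(t) < key(best)) {
      best = t;
    }
  }
  return best;
}

const sampleTrips: Trip[] = [
  {
    tripId: "t1",
    fareCents: 500,
    durationMinutes: 40,
    stops: [{ stationId: "central", departsAt: "08:00" }],
  },
  {
    tripId: "t2",
    fareCents: 350,
    durationMinutes: 65,
    stops: [{ stationId: "central", departsAt: "08:15" }],
  },
];
```

Keeping the stops nested means one document read per trip is enough to render a schedule row, while the subcollection lets the number of trips grow without touching the line document.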
Related
I am considering storing multiple tenants in a single Firebase Firestore database. There will only be one collection per tenant and a few shared collections. Some will have more data than others. Some tenants may have a few million records while others may end up with a few billion. I want to confirm that the size of data in one collection will not impact the performance or storage of another collection in the same database.
I couldn't find much in the documentation about how the data is physically stored. Is all the data in Firestore stored in a single blob/file? If so, this could be a problem when there are hundreds of tenants with billions of records each. In an ideal world, each collection would be a physically separate file, and the server orchestration would separate the collections onto multiple servers so that a single server is not sharing the load between a very heavy tenant and a very light tenant. Sharing a server in that way would mean that a heavy tenant slows down a light tenant.
My basic question is: can a single Firestore database infinitely scale up in size assuming that no single collection is bigger than a few billion records?
I know that there are two types of databases: native and datastore. Which of these seems more appropriate, and is the answer to my question different depending on which of these I select?
If the answer is that Firestore cannot scale infinitely in this way, what is the alternative approach? Should I be using Bigtable instead? Cassandra? Or, is there another way to physically divide my Firestore database other than collections?
Some tenants may have a few million records while others may end up with a few billion. I want to confirm that the size of data in one collection will not impact the performance or storage of another collection in the same database.
The performance in Firestore isn't related to the number of documents that exist in a collection. In terms of speed, it doesn't matter if you perform a query on:
A top-level (root-level) collection.
A sub-collection, which basically represents a collection that is nested under a document.
A collection group, which actually means querying collections and sub-collections that exist across the entire database.
The speed will always be the same, as long as the query returns the same number of documents. This is happening because the query performance depends on the number of documents you request and not on the number of documents you search. So it doesn't really matter if you query a collection with 1 MILLION documents or even 1 BILLION documents, the time for getting the same results will be the same.
I couldn't find much in the documentation about how the data is physically stored. Is all the data in Firestore stored in a single blob/file? If so, this could be a problem when there are hundreds of tenants with billions of records each.
In Cloud Firestore, the unit of storage is the document. Documents live in collections, which are simply containers for documents. Please note that Firestore is optimized for storing large collections of small documents. And when I say large, I mean extremely large. So when you perform a query against a collection of 1 MILLION documents, the speed depends on the number of results you return and it does not depend on the number of the documents in which you search, or on the number of documents that exist in other collections in which you aren't performing a search.
Can a single Firestore database infinitely scale up in size assuming that no single collection is bigger than a few billion records?
While with the Firebase Realtime Database you had to scale by using multiple databases, in Firestore this practice is not necessary. However, there are some techniques that are explained really well in the official docs:
Building scalable applications with Firestore
If the answer is that Firestore cannot scale infinitely in this way, what is the alternative approach?
It can definitely scale massively.
See the Firestore best practices and security rules.
You may conceptualize Firestore as one service shared by all of Google's customers. Just as Google attempts to ensure that one customer's impact on the service (the so-called "noisy neighbor") does not affect others, you don't want to be a noisy neighbor to yourself.
You need to consider more than just performance.
Security. For example, see security rules as a mechanism that you may be able to use to help enforce segregation of your tenants' data. You will want to understand fully how to keep different customers' data separated securely. Your customers will also want to understand what measures you're employing to ensure their data is kept separate.
Multitenancy. Google Cloud Platform has no intrinsic (platform-wide) multitenant capabilities and, often, a way to manifest tenancy has been to use different Google Projects for different customers. This is because Projects provide a well-defined security perimeter. You may want to investigate whether some subset of your customers would benefit from a one-customer, one-project arrangement.
Quota. Another important consideration is quota. Every Cloud Platform method is constrained by some quota. You will want to be careful in ensuring that quota is distributed fairly across customers so that some customers don't consume all the quota denying other customers access to the service.
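One way to make the tenant segregation discussed above explicit in application code is to route every path through a tenant-scoped helper. The sketch below assumes a hypothetical layout of `tenants/{tenantId}/records/{recordId}`; the names are illustrative, and a helper like this complements security rules rather than replacing them.

```typescript
// Hypothetical layout: tenants/{tenantId}/records/{recordId}.
const TENANT_ROOT = "tenants";

// Reject empty IDs and anything that could point outside the tenant's subtree.
function isValidId(id: string): boolean {
  return id.length > 0 && id.indexOf("/") === -1 && id !== "." && id !== "..";
}

// Build the path of one record that belongs to one tenant.
function tenantRecordPath(tenantId: string, recordId: string): string {
  if (!isValidId(tenantId) || !isValidId(recordId)) {
    throw new Error("invalid tenant or record ID");
  }
  return TENANT_ROOT + "/" + tenantId + "/records/" + recordId;
}

// Check that a path sits under the given tenant before reading or writing it.
function belongsToTenant(path: string, tenantId: string): boolean {
  const prefix = TENANT_ROOT + "/" + tenantId + "/";
  return path.indexOf(prefix) === 0;
}
```

Because the prefix ends with a slash, a tenant called `acme` does not match paths that belong to `acme2`.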
A newbie here.
I need help with referencing Firebase subcollections in a structured way, so that a user can select an item and pass information down through the subcollections.
=> Tournaments => Cities => Cairo => Year => High Goal => Team A
It goes like this: from the root, I have a list of cities, let's say:
1. Cairo
2. Alexandria
3. Sixth October
I want to keep a record of the tournaments hosted by these cities each year, let's say:
1. 2019
2. 2018
3. 2017
Each year, three different cups are contested, let's say:
1. High goal
2. Medium goal
3. Low goal
Every cup has teams that participate in the tournament:
1. Team A
2. Team B
3. Team C
I have added a visual representation of the app, designed in Adobe XD.
Data modeling for NoSQL databases depends as much on the use-cases of your app as it depends on the data that you store. So there is no "perfect" data model, nor are there nearly as many best practices (or normal forms) for NoSQL databases as there are for relational data models.
Firestore (which you seem to be looking to use), offers a few tools for modeling data:
The discrete unit of storage is called a document. Each document contains fields of various types, including nested fields, and a document can be up to 1MB in size.
Documents are stored in named collections.
You can nest collections under a document, and build hierarchies that way.
Each document has a unique path of the form /collection1/docid1/collection2/doc2 etc.
To write to a document, you must know its exact path.
You can query a collection for a subset of the documents in there.
You can query across all collections with the same name, no matter their path in the database.
The performance of queries depends solely on the number of documents you retrieve, and not on the number of documents in the collection(s).
There are probably quite a few more rules, but these should be enough to get you started.
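To make the path rule concrete for the tournament hierarchy in the question, here is a minimal sketch. The collection names `cities`, `years`, `cups` and `teams` and the sample IDs are assumptions; the only Firestore fact relied on is that paths alternate between collection and document segments.

```typescript
// Hypothetical reference to one team in one cup, year and city.
interface TeamRef {
  city: string;
  year: number;
  cup: string;
  team: string;
}

// Alternate collection names and document IDs to build the team's document path.
function teamDocPath(ref: TeamRef): string {
  return [
    "cities", ref.city,
    "years", String(ref.year),
    "cups", ref.cup,
    "teams", ref.team,
  ].join("/");
}

// A document path has an even number of segments; a collection path has an odd number.
function isDocumentPath(path: string): boolean {
  const segments = path.split("/").filter((s) => s.length > 0);
  return segments.length > 0 && segments.length % 2 === 0;
}

const teamA = teamDocPath({ city: "cairo", year: 2019, cup: "high-goal", team: "team-a" });
// teamA is "cities/cairo/years/2019/cups/high-goal/teams/team-a"
```

Since every subcollection at the same level shares a name, a query across all `teams` collections stays possible whichever city or year they sit under.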
I typically recommend writing a list of your top 3-5 use-cases, and determining what reads/queries you need for that. With those queries, you can then start defining your data model, and implementing your application code.
Then each time you add a use-case, you figure out how to read/write the data for that use-case, and potentially change/expand the data model to allow for the new and existing use-cases. If you get stuck when adding a specific use-case, report back here and we can try to help.
Some good additional material to get started:
NoSQL data modeling
Getting to know Cloud Firestore
Firebase for SQL developers, which is for Firebase's other NoSQL database, but is a great primer on NoSQL modeling too.
The article about Best practices for Cloud Firestore states that we should keep the rate of write operations for an individual collection under 1,000 operations/second.
But at the same time, the Firebase team says in Choose a data structure that root-level collections "offer the most flexibility and scalability".
What if I have a root-level collection (e.g. "messages") which expects to have more than 1,000 write operations/second?
A limit of 1,000 operations/second is quite generous, but if you find yourself in a situation where you need more than that, you should consider changing your database schema so that writes are spread across multiple collections. Having a single collection of messages to which every user can add messages doesn't sound like a good way to go, since you could reach that limit very soon. In this case, you should split that collection into multiple collections. A possible schema is the one I explain in the following video:
https://www.youtube.com/watch?v=u3KwKQddPoo
At the end of that video, there is a collection named messages, which in turn contains a roomId document. This document contains a subcollection named roomMessages, which holds all the messages from a chat room as documents. With this structure, you are unlikely to reach that limit.
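The messages/roomId/roomMessages layout can be sketched as a small path helper, together with a count of how a batch of writes spreads over the per-room subcollections. The `Message` shape and the sample data are assumptions for illustration only.

```typescript
// Hypothetical message shape; only roomId matters for routing.
interface Message {
  roomId: string;
  text: string;
}

// Each room gets its own subcollection: messages/{roomId}/roomMessages.
function roomMessagesPath(roomId: string): string {
  return "messages/" + roomId + "/roomMessages";
}

// Count how many writes in a batch land in each subcollection.
function writesPerCollection(batch: Message[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const m of batch) {
    const path = roomMessagesPath(m.roomId);
    counts[path] = (counts[path] || 0) + 1;
  }
  return counts;
}

const sampleBatch: Message[] = [
  { roomId: "room1", text: "hi" },
  { roomId: "room2", text: "hello" },
  { roomId: "room1", text: "bye" },
];
const spread = writesPerCollection(sampleBatch);
```

With this layout, the write rate that matters is the rate per room rather than the rate across the whole app.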
But at the same time, the Firebase team says in Choose a data structure that root-level collections "offer the most flexibility and scalability".
But also remember that Firestore can look up a collection at level 1 as quickly as one at level 100, so you don't need to worry about that.
The limit of 1,000 ops/sec per collection only applies to realtime updates, so as long as you don't have a snapshot listener, this should be okay.
I asked the question on the Cloud Firestore Google Groups.
The limit is 10,000 writes per second if no other limits apply first:
https://firebase.google.com/docs/firestore/quotas#writes_and_transactions
Also, keep in mind the best practices for scaling Cloud Firestore.
I'm totally new to Firebase, and I'm trying to get my head round the best db model design for 'relational' data, both 1-1 and 1-many.
We are using the Firestore db (not the realtime db).
Say we have Projects which can contain many Users, and a User can be in multiple Projects
The UI needs to show a list of Users in a Project which shows things like email, firstname, lastname and department.
What is the best way to store the relationship?
An array of User ids in the Project document?
A map of Ids in the Project document?
I've read that the above approaches were recommended, but was that for the Realtime Database? Firestore supports subcollections, which sound more appropriate...
A sub collection of Users in the Project document?
A separate collection mapping Project id to User id?
A Reference data type? I've read about the Reference data type here: https://firebase.google.com/docs/firestore/manage-data/data-types. It sounds like what I want, but I can't find any more on it!
If it's just a map or array of IDs, how would you then retrieve the remaining data about the user? Would this have to sit in the application UI?
If it's a subcollection of User documents, is there any way to maintain data integrity? If a user changed their name, would the UI or a Cloud Function then have to update every entry of that user's name in the subcollections?
Any help / pointers appreciated...
The approach for modeling many-to-many relationships in Firestore is pretty much the same as it was in Firebase's Realtime Database, which I've answered here: Many to Many relationship in Firebase. The only difference is indeed that you can store the lookup list in a sub-collection of each project/user.
Looking up the linked item is also the same as before, it indeed requires loading them individually from the client. Such a client-side join is not nearly as slow as you may initially expect, so test it before assuming it can't possibly be fast enough.
Ensuring data integrity can be accomplished by performing batched writes or using transactions. These either completely succeed or completely fail.
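A client-side join of this kind can be sketched with in-memory stand-ins for the two collections. Everything here is an assumption made for illustration: the `memberIds` field, the sample documents, and the use of plain objects in place of Firestore reads, where each lookup would be one document read by ID.

```typescript
interface User {
  email: string;
  firstName: string;
  lastName: string;
  department: string;
}

interface Project {
  name: string;
  memberIds: string[]; // lookup list of user IDs, no copied user data
}

// In-memory stand-ins for the "users" and "projects" collections.
const users: Record<string, User> = {
  u1: { email: "ann@example.com", firstName: "Ann", lastName: "Lee", department: "R&D" },
  u2: { email: "bob@example.com", firstName: "Bob", lastName: "Ray", department: "Sales" },
};

const projects: Record<string, Project> = {
  p1: { name: "Apollo", memberIds: ["u1", "u2"] },
  p2: { name: "Gemini", memberIds: ["u2", "missing"] },
};

// Resolve a project's member IDs to full user records, one lookup per ID.
function listProjectUsers(projectId: string): User[] {
  const project = projects[projectId];
  if (!project) {
    return [];
  }
  const result: User[] = [];
  for (const id of project.memberIds) {
    const user = users[id];
    if (user) {
      result.push(user); // IDs without a matching user are skipped
    }
  }
  return result;
}
```

Because each user's details live in exactly one document, a name change is a single write and every project list picks it up on the next read.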
I have a collection, itemsCollection, which contains a very large amount of small itemDocs. Each itemDoc has a subcollection, statistics. Each itemDoc also has a field "owner" which indicates which user owns the itemDoc.
itemsCollection
    itemDoc1
        statistics
    itemDoc2
        statistics
    itemDoc3
        statistics
    itemDoc4
        statistics
    ...
I also have a collection, usersCollection, which contains basic user info.
usersCollection
    user1
    user2
    user3
    ...
Since each itemDoc belongs to a specific user, it's necessary to display to each user which itemDocs they own. I have been using the query:
db.collection("itemsCollection").where("owner", "==", "user1")
I am wondering if this will scale effectively, i.e. whenever itemsCollection gets to be millions of records? If not, is the best solution to duplicate each itemDoc and its statistics subcollection as a subcollection in the user document, or should I be doing something else?
As Alex Dufter, the product manager from Firebase, explained on one of the days of Firebase Dev Summit 2017, Firestore was inspired in many ways by the feedback they had received on the Firebase Realtime Database over the years. They faced two types of issues:
Data modelling and querying. Firebase Realtime Database cannot query over multiple properties, so it usually involves duplicating data or client-side filtering, which we all know gets messy.
Realtime Database does not scale automatically.
With this new product, they say that you can now build an app and grow it to planetary scale without changing a single line of code. Cloud Firestore is also a NoSQL database that was built specifically for mobile and web app development. It's flexible enough to build all kinds of apps and scalable enough to grow to any size.
So, because the new database was built with these issues in mind, duplicating data is no longer needed. You will not have to worry about using that line of code: even if your data grows to millions of records, it will scale automatically. But one thing to remember: if you use multiple conditions, don't forget to create the required indexes, which you can add in the Firebase console. Here are two simple examples from the official documentation:
citiesRef.whereEqualTo("state", "CO").whereEqualTo("name", "Denver");
citiesRef.whereEqualTo("state", "CA").whereLessThan("population", 1000000);
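The second query combines an equality filter on one field with a range filter on another, which is the kind of query that typically needs a composite index. As an alternative to the console, such an index can be described in a definition file. The sketch below builds one as a TypeScript object; it assumes the `firestore.indexes.json` layout used by the Firebase CLI, so check the field names against the current documentation before relying on them.

```typescript
// Assumed shape of one entry in firestore.indexes.json.
interface IndexField {
  fieldPath: string;
  order: "ASCENDING" | "DESCENDING";
}

interface CompositeIndex {
  collectionGroup: string;
  queryScope: "COLLECTION" | "COLLECTION_GROUP";
  fields: IndexField[];
}

// Index for: state == "CA" combined with population < 1000000.
const cityIndex: CompositeIndex = {
  collectionGroup: "cities",
  queryScope: "COLLECTION",
  fields: [
    { fieldPath: "state", order: "ASCENDING" },
    { fieldPath: "population", order: "ASCENDING" },
  ],
};

// Serialise to the text that would be saved as the index definition file.
const indexesFile: string = JSON.stringify(
  { indexes: [cityIndex], fieldOverrides: [] },
  null,
  2
);
```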