Firestore max. number of collections? - firebase

This is my first time using Firestore, and I am confused about the limit on the number of collections I can create. Is there a limit?
I also need advice on something else. I am building an app that will require different tables in the database, such as Restaurants, Clients and Reservations. Firestore has no tables since it is a NoSQL database, so does a 'Collection' serve as a 'Table'? What about a 'Document'?

The documentation doesn't mention a maximum number of collections. Collections are essentially just containers for documents, so there is no practical limit that you should be concerned about.
A SQL table is roughly analogous to a Cloud Firestore collection. A SQL row is roughly analogous to a document. It's advisable to think of Cloud Firestore not in terms of what you know in SQL, but on its own terms.
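To make the table/collection and row/document analogy concrete, here is a minimal sketch in plain Java, without the Firestore SDK. The collection names, field names and the ReservationModel class are hypothetical, chosen only to mirror the asker's Restaurants, Clients and Reservations example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ReservationModel {
    // A document is addressed as "<collection>/<documentId>",
    // roughly the way a row is addressed by table name plus primary key.
    static String documentPath(String collection, String documentId) {
        return collection + "/" + documentId;
    }

    // A reservation document stores the IDs of the restaurant and client it links,
    // much like foreign-key columns in a SQL row (hypothetical field names).
    static Map<String, Object> reservation(String restaurantId, String clientId, int partySize) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("restaurantId", restaurantId);
        doc.put("clientId", clientId);
        doc.put("partySize", partySize);
        return doc;
    }
}
```

With a real SDK, a map like this is the kind of value you would write to a document in a reservations collection.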

Related

Are Firestore Collections Physically Isolated from Each Other?

I am considering storing multiple tenants in a single Firebase Firestore database. There will only be one collection per tenant and a few shared collections. Some will have more data than others. Some tenants may have a few million records while others may end up with a few billion. I want to confirm that the size of data in one collection will not impact the performance or storage of another collection in the same database.
I couldn't find much in the documentation about how the data is physically stored. Is all the data in Firestore stored in a single blob/file? If so, this could be a problem when there are hundreds of tenants with billions of records each. In an ideal world, each collection would be a physically separate file, and the server orchestration would separate the collections onto multiple servers so that a single server is not sharing the load between a very heavy tenant, and a very light tenant. This scenario would mean that a heavy tenant would slow down a light tenant.
My basic question is: can a single Firestore database infinitely scale up in size assuming that no single collection is bigger than a few billion records?
I know that there are two types of databases: native and datastore. Which of these seems more appropriate, and is the answer to my question different depending on which of these I select?
If the answer is that Firestore cannot scale infinitely in this way, what is the alternative approach? Should I be using Bigtable instead? Cassandra? Or, is there another way to physically divide my Firestore database other than collections?
"Some tenants may have a few million records while others may end up with a few billion. I want to confirm that the size of data in one collection will not impact the performance or storage of another collection in the same database."
The performance in Firestore isn't related to the number of documents that exist in a collection. In terms of speed, it doesn't matter if you perform a query on:
A top-level (root-level) collection.
A sub-collection, which basically represents a collection that is nested under a document.
A collection group, which actually means querying collections and sub-collections that exist across the entire database.
The speed will always be the same, as long as the query returns the same number of documents. This is because query performance depends on the number of documents you request, not on the number of documents you search. So it doesn't matter whether you query a collection with 1 million documents or 1 billion documents; the time to get the same results will be the same.
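As an analogy for why cost follows the result size, here is a minimal sketch in plain Java that serves a query from a sorted index. It assumes, as the Firestore documentation describes, that queries are answered from indexes; the TreeMap and the IndexScanSketch class are illustrative stand-ins, not Firestore's actual implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class IndexScanSketch {
    // Returns up to `limit` document IDs whose indexed value is >= start.
    // The loop visits only the entries it returns, so the work grows with
    // `limit`, not with the total number of entries in the index.
    static List<String> query(TreeMap<Long, String> index, long start, int limit) {
        List<String> results = new ArrayList<>();
        for (Map.Entry<Long, String> entry : index.tailMap(start, true).entrySet()) {
            if (results.size() == limit) {
                break;
            }
            results.add(entry.getValue());
        }
        return results;
    }
}
```

Finding the starting position in a sorted index is cheap compared with scanning every entry, which is why the size of the collection matters far less than the number of results requested.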
"I couldn't find much in the documentation about how the data is physically stored. Is all the data in Firestore stored in a single blob/file? If so, this could be a problem when there are hundreds of tenants with billions of records each."
In Cloud Firestore, the unit of storage is the document. Documents live in collections, which are simply containers for documents. Please note that Firestore is optimized for storing large collections of small documents. And when I say large, I mean extremely large. So when you perform a query against a collection of 1 million documents, the speed depends on the number of results you return. It does not depend on the number of documents you search, or on the number of documents in other collections that you aren't searching.
"Can a single Firestore database infinitely scale up in size assuming that no single collection is bigger than a few billion records?"
With the Firebase Realtime Database you had to scale by using multiple databases, but in Firestore this practice is not necessary. However, there are some techniques that are explained really well in the official docs:
Building scalable applications with Firestore
"If the answer is that Firestore cannot scale infinitely in this way, what is the alternative approach?"
It can definitely scale massively.
See the Firestore best practices and security rules.
You may conceptualize Firestore as one service shared by all of Google's customers. Just as Google attempts to ensure that one customer's impact on the service (the so-called "noisy neighbor") does not affect others, you don't want to be a noisy neighbor to yourself.
You need to consider more than just performance.
Security. For example, see security rules as a mechanism that you may be able to use to help enforce segregation of your tenants' data. You will want to understand fully how to keep different customers' data separated securely. Your customers will also want to understand what measures you employ to ensure their data is kept separate.
Multitenancy. Google Cloud Platform has no intrinsic (platform-wide) multitenant capabilities and, often, a way to manifest tenancy has been to use different Google Projects for different customers. This is because Projects provide a well-defined security perimeter. You may want to investigate whether some subset of your customers would benefit from a one-customer, one-project setup.
Quota. Another important consideration is quota. Every Cloud Platform method is constrained by some quota. You will want to be careful in ensuring that quota is distributed fairly across customers so that some customers don't consume all the quota denying other customers access to the service.

Firestore Collection Write Rate

The article about Best practices for Cloud Firestore states that we should keep the rate of write operations for an individual collection under 1,000 operations/second.
But at the same time, the Firebase team says in Choose a data structure that root-level collections "offer the most flexibility and scalability".
What if I have a root-level collection (e.g. "messages") which expects to have more than 1,000 write operations/second?
A limit of 1,000 operations/second is quite high, but if you find yourself in a situation where you need more than that, you should consider changing your database schema to spread writes across multiple collections. A single messages collection to which every user can add messages doesn't sound like a good way to go, since you can reach that limit very soon. In this case you should split that collection into multiple collections. A possible schema is the one I explain in the following video:
https://www.youtube.com/watch?v=u3KwKQddPoo
At the end of that video, there is a collection named messages which in turn contains a roomId document. This document contains a subcollection named roomMessages, which holds all the messages from a chat room as documents. In this case, there is practically no chance of reaching that limit.
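The schema described above can be sketched as path-building helpers in plain Java, without the Firestore SDK. The ChatPaths class and its method names are hypothetical; only the messages, roomId and roomMessages names come from the answer.

```java
final class ChatPaths {
    // Collection path holding one room's messages: messages/{roomId}/roomMessages.
    // Each room gets its own subcollection, so writes are spread across rooms
    // instead of landing in one shared collection.
    static String roomMessagesCollection(String roomId) {
        requireValidId(roomId);
        return "messages/" + roomId + "/roomMessages";
    }

    // Document path of a single message inside a room's subcollection.
    static String messageDocument(String roomId, String messageId) {
        requireValidId(messageId);
        return roomMessagesCollection(roomId) + "/" + messageId;
    }

    // Document IDs must be non-empty and must not contain a forward slash.
    private static void requireValidId(String id) {
        if (id == null || id.isEmpty() || id.contains("/")) {
            throw new IllegalArgumentException("Invalid document ID: " + id);
        }
    }
}
```

A path such as messages/room42/roomMessages is what you would pass to the SDK's collection lookup when adding a message to room42.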
"But at the same time, the Firebase team says in Choose a data structure that root-level collections 'offer the most flexibility and scalability'."
But also remember that Firestore can look up a collection at level 1 as quickly as one at level 100, so you don't need to worry about that.
The limit of 1,000 ops/sec per collection applies only to real-time updates, so as long as you don't have a snapshot listener this should be okay.
I asked the question on the Cloud Firestore Google Groups
The limit is 10,000 writes per second if no other limits apply first:
https://firebase.google.com/docs/firestore/quotas#writes_and_transactions
Also keep in mind the best practices for scaling Cloud Firestore.

Firestore Realtime Updates 1M Limit

When using Firestore and subscribing to document updates, it states a limit of 1M concurrent mobile/web connections per database.
https://firebase.google.com/docs/firestore/quotas#realtime_updates
Is that a hard limit (enforced/throttled in code)? Or is it a theoretical limit (like you're safe up to 1M, then things get dicey)? Is it possible to get an uplift?
Trying to understand how to support a large user base without needing to shard the database (which is one of the advantages of Firestore). Even at 5M users, it seems you would start having problems because you'd probably hit times when >20% of those users were on your app simultaneously.
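The arithmetic behind that concern can be written out as a small sketch in plain Java. The ConnectionEstimate class is hypothetical; the 1M figure is the limit quoted in the question, and the share of users online at once is an input you would have to estimate yourself.

```java
final class ConnectionEstimate {
    // Concurrent connection limit per database, as quoted in the question.
    static final long CONNECTION_LIMIT = 1_000_000L;

    // Peak concurrent connections if `percentOnline` percent of users are online at once.
    // Integer math avoids floating-point rounding surprises.
    static long peakConnections(long totalUsers, int percentOnline) {
        return totalUsers * percentOnline / 100;
    }

    // True when the estimated peak meets or exceeds the quoted limit.
    static boolean reachesLimit(long totalUsers, int percentOnline) {
        return peakConnections(totalUsers, percentOnline) >= CONNECTION_LIMIT;
    }
}
```

For example, peakConnections(5_000_000L, 20) gives 1,000,000, which is exactly the quoted limit, so anything above 20% online for a 5M user base would exceed it.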
As you already noticed, the maximum size of a single document in Firestore is 1 MiB (1,048,576 bytes). Trying to store a large number of objects (maps) that may exceed this limit is generally considered bad design.
You should reconsider the logic of your app and think about the reason why you need more than 1 MiB in a single document, rather than each object being its own document. So to be able to use Firestore, you should change the way you hold the data, from within a single document to a collection. With collections, there are no limitations. You can add as many documents as you want. According to the official documentation on the Cloud Firestore data model:
Cloud Firestore is optimized for storing large collections of small documents.
IMHO, you should take advantage of this feature.
For details, I recommend you see my answer from this post where I have explained some practices regarding storing data in arrays (documents), maps or collections.
Edit:
Without sharding, I'm afraid it is not an option. So in this case, sharding will work for sure, and in my opinion it is certainly a reasonable option.
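The answer does not say how to shard, so the following is only a minimal sketch in plain Java of one common approach: routing each user to one of several databases by hashing the user ID. The ShardRouter class, the "shard-N" naming and the choice of String.hashCode are all assumptions made for illustration.

```java
final class ShardRouter {
    // Maps a user ID to a shard index in [0, shardCount).
    // Math.floorMod keeps the result non-negative even for negative hash codes.
    static int shardFor(String userId, int shardCount) {
        if (shardCount <= 0) {
            throw new IllegalArgumentException("shardCount must be positive");
        }
        return Math.floorMod(userId.hashCode(), shardCount);
    }

    // Hypothetical naming scheme for the database that holds this user's data.
    static String databaseIdFor(String userId, int shardCount) {
        return "shard-" + shardFor(userId, shardCount);
    }
}
```

The same user always maps to the same shard as long as shardCount stays fixed. Changing the shard count reassigns most users, so a scheme like this needs a migration plan before the number of shards grows.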

Firebase Realtime Database vs Cloud Firestore

Edit: After posting the question, I thought I could also make this post a quick reference for those of you who need a quick peek at some of the differences between these two technologies, which might help you decide on one of them eventually. I will be editing this question and adding more info as I learn more.
I have decided to use Firebase for the backend of my project. Firestore is described as "the next generation of the realtime database". Now I am trying to decide which way to go: Realtime Database or Cloud Firestore?
Billing:
At first glance, it looks like Firestore charges per number of results returned, number of reads, number of writes/updates, etc. The Realtime Database charges based on the data transmitted; the number of read/write operations is irrelevant. Both also charge for the data stored on Google's servers (I think Firestore is the cheaper one in this respect). Why am I mentioning price? Because from my point of view, although it might carry less weight, it is also a point to consider when choosing one over the other.
Scaling:
Cloud Firestore seems to scale horizontally seamlessly. I think this is not possible with the Realtime Database.
Edit:
In the Realtime Database, you need to shard your data yourself using multiple databases. And you can only do this if you are on the Blaze pricing plan.
ref: https://firebase.google.com/docs/database/usage/sharding
Performance & Indexing:
Another thing is that the data structure differs between the two. The Realtime Database stores data as a JSON object in whatever way we structure it. Firestore structures data as collections and documents. Hence the querying also differs between the two.
I think Firestore indexes data automatically, which greatly increases read performance too (at the cost of decreased write performance). I am not sure if this is also the case with the Realtime Database.
Edit:
The Realtime Database does not automatically index your data. You need to do it yourself after a solid inspection of your data and your needs.
ref: https://firebase.google.com/docs/database/security/indexing-data
What other differences can you think of?
What would be (or has been) your choice for different types of projects?
Do you still go with the real-time database or have you migrated from that to the firestore? If so why?
And one last thing. How would you compare the SDKs of these two?
Thanks a lot!
"What other differences can you think of?"
Here is what I think. After using the Realtime Database for six months, the main difference I see is that Firestore makes sorting data easy. For example, to retrieve user names ordered by timestamp:
Query firstQuery = firestore.collection("Names").orderBy("timestamp", Query.Direction.DESCENDING).limit(10); // load 10 names
"What would be (or has been) your choice for different types of projects?"
For me, the Realtime Database is for data streaming: when I work with Arduino, I want to store drone speed.
And Firestore is for smart-office use cases, like air conditioners or room lighting, and for enterprise data like inventory quantities, etc.
"Do you still go with the real-time database or have you migrated from that to the firestore? If so why?"
I still go with the Realtime Database because I need a tree structure for displaying streaming data, instead of querying table-like collections as in Firestore.

Should I use redundancy or a simple query on a large dataset with Firebase Cloud Firestore database?

I have a collection, itemsCollection, which contains a very large amount of small itemDocs. Each itemDoc has a subcollection, statistics. Each itemDoc also has a field "owner" which indicates which user owns the itemDoc.
itemsCollection
  itemDoc1
    statistics
  itemDoc2
    statistics
  itemDoc3
    statistics
  itemDoc4
    statistics
  ...
I also have a collection, usersCollection, which contains basic user info.
usersCollection
  user1
  user2
  user3
  ...
Since each itemDoc belongs to a specific user, it's necessary to display to each user which itemDocs they own. I have been using the query:
db.collection("itemsCollection").where("owner", "==", "user1")
I am wondering if this will scale effectively, i.e. whenever itemsCollection gets to be millions of records? If not, is the best solution to duplicate each itemDoc and its statistics subcollection as a subcollection in the user document, or should I be doing something else?
Alex Dufetel, the product manager from Firebase, explained on one of the days of Firebase Dev Summit 2017 that Firestore was inspired in many ways by the feedback they had received on the Firebase Realtime Database over the years. They faced two types of issues:
Data modelling and querying. The Firebase Realtime Database cannot query over multiple properties, so it usually involves duplicating data or client-side filtering, which we all know is somewhat messy.
Realtime Database does not scale automatically.
With this new product, they say that you can now build an app and grow it to planetary scale without changing a single line of code. Cloud Firestore is also a NoSQL database that was built specifically for mobile and web app development. It's flexible enough to build all kinds of apps and scalable enough to grow to any size.
Because the new database was built with these issues in mind, duplicating data is no longer needed. So you will not have to worry about using that line of code; even if your data grows to millions of records, it will scale automatically. But one thing to remember: if you use multiple conditions, don't forget to create the required indexes by simply adding them in the Firebase console. Here are two simple examples from the official documentation:
citiesRef.whereEqualTo("state", "CO").whereEqualTo("name", "Denver");
citiesRef.whereEqualTo("state", "CA").whereLessThan("population", 1000000);
