The application I'm currently working on needs scalable real-time communication. We have looked into and tried out both the Firebase Realtime Database and Firestore. The Realtime Database seems more mature and battle-tested, while Firestore is still in beta, which is why we are leaning towards the Realtime Database.
We are however worried about its scaling capabilities in our context. Our queries will mainly be geospatial, based on the user's location. According to "Firebase simultaneous realtime connections to my database" and https://firebase.google.com/pricing/#faq-simultaneous, the maximum number of concurrent users is 100,000, which will be too low for our needs.
According to their documentation, database sharding seems to be the way to scale beyond 100,000 concurrent users: https://firebase.google.com/docs/database/usage/sharding. Since our queries are based on the user's location, we could group the data into regions, e.g. US West, US Central, and US East, and have a database instance for each of those three regions.
While this method may work, it seems very cumbersome to set up. We would probably need a service that the user initially connects to in order to be redirected to the database instance for the region the user is in. It would also have to handle the case where a user moves into another region and must therefore be redirected to the database instance containing the data for that region.
Another complex task would be to distribute the data into the correct database instances.
Is there a simpler approach to scale beyond 100,000 users, or is it possible to increase the number of concurrent connections for a single Firebase Realtime Database?
To me it seems like almost a waste to use Firebase if it requires you to do so much "load" balancing yourself.
The limit of 100K concurrent connections is a hard cap on a single Firebase Realtime Database instance.
The approach you describe with a two-step connect is quite idiomatic. The first step is usually quite simple. In fact for many apps it is part of their authentication flow, or based on the outcome of that. For example, many apps base the user's shard on a hash of their UID.
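As a rough illustration of the hash-based approach, here is a minimal Python sketch. The shard URLs and function names are hypothetical placeholders, not part of any Firebase API; the only point is that a stable hash of the UID always maps the same user to the same database instance.

```python
import hashlib

# Hypothetical shard URLs (assumption); real instance names depend on your project.
SHARD_URLS = [
    "https://example-shard-0.firebaseio.com",
    "https://example-shard-1.firebaseio.com",
    "https://example-shard-2.firebaseio.com",
]


def shard_for_uid(uid: str, shard_count: int = len(SHARD_URLS)) -> int:
    """Map a UID to a shard index using a stable hash.

    hashlib is used instead of the built-in hash() because hash() is
    randomized per process for strings, so it would not be stable.
    """
    digest = hashlib.sha256(uid.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def database_url_for_uid(uid: str) -> str:
    """Return the (hypothetical) database URL the client should connect to."""
    return SHARD_URLS[shard_for_uid(uid)]
```

Note that with a plain modulo, changing the number of shards remaps most users, so the shard count is best fixed up front or the assignment stored per user.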
In your case, you could inject the user's region into their token as a custom claim when they register. Then you'd get that claim when they sign in, and can redirect them to their shard. You could also persist the shard info in the client when they first connect, so that you have to determine it only once for each client/device.
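The lookup step could be sketched like this in Python. The claim name "region", the region keys and the URLs are all assumptions made for illustration; the claims are treated as an already-decoded mapping, so no Firebase SDK call is shown.

```python
from typing import Mapping, Optional

# Hypothetical region-to-instance mapping (assumption), following the
# US West / US Central / US East grouping from the question.
REGION_SHARDS = {
    "us-west": "https://example-us-west.firebaseio.com",
    "us-central": "https://example-us-central.firebaseio.com",
    "us-east": "https://example-us-east.firebaseio.com",
}
DEFAULT_REGION = "us-central"


def shard_url_from_claims(
    claims: Mapping[str, object],
    cached_url: Optional[str] = None,
) -> str:
    """Pick the database URL for a signed-in user.

    A URL persisted on the device wins, so the lookup happens only once
    per client. Otherwise the hypothetical "region" custom claim decides,
    falling back to a default for missing or unknown regions.
    """
    if cached_url is not None:
        return cached_url
    region = str(claims.get("region", DEFAULT_REGION))
    return REGION_SHARDS.get(region, REGION_SHARDS[DEFAULT_REGION])
```

If users can move between regions, the cached value would need to be cleared whenever the claim changes; that part is left out here.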
Is there a simpler approach to scale beyond 100,000 users, or is it
possible to increase the number of concurrent connections for a single
Firebase Realtime Database?
Yes, use the Firestore database.
It scales completely automatically. Currently, the scaling limits are:
Around 1 million concurrent connections and 10,000 writes/second (they plan to increase these limits in the future) (source)
A maximum write rate to a single document of 1 per second (source)
It is officially out of beta and has been in General Availability since 31/1/2019 (source)
Despite performing few read/write operations on my Firestore database (170 reads, 4 writes), I appear to have a huge number of API calls: 15,093 (see image below). This high ratio of calls to reads/writes is probably due to my application's use of network streams. My question is: should this be considered a billable metric, or do I not need to worry if it runs into the millions, so long as reads/writes stay within limits (theoretically; I've never seen this happen on my own account)?
I'm inclined to believe that I needn't worry about this metric, as I can't seem to find it under either the Firebase or GCP quotas page.
It may also be relevant that I use the Google Maps and Directions APIs from GCP, although these aren't used nearly as much as Firestore.
Thanks.
According to this doc, you are charged for the following:
The number of documents you read, write, and delete.
The amount of storage that your database uses, including overhead for metadata and indexes.
The amount of network bandwidth that you use.
I have thought about monitoring reads for users in Firebase.
Let's say that a user makes a lot of "reads"; I will then monitor these "reads" by writing to a counter in a separate document. Once the user has reached 104 reads I will tell the user that they have reached their daily quota. In that case I also want to prevent them from making any further reads (for a day). Is this possible in Firebase?
No, that's not currently possible. At least not in any secure way that would prevent the user from manipulating the value.
Your best solution at present would be to use Firestore or RTDB as a non-public database and then create an API behind the API Gateway to handle your rate limiting. Reads then go through your API, which gives you much more control but of course means losing realtime updates.
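The counting logic such an API would need can be sketched as follows. This is only an in-memory Python illustration with hypothetical names (DailyReadQuota, try_read); a real deployment would keep the counters in shared storage and sit behind whatever gateway you use.

```python
from collections import defaultdict
from datetime import date, datetime, timezone
from typing import Dict, Optional, Tuple


class DailyReadQuota:
    """Per-user daily read counter, kept on the server side only."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        # Keyed by (user id, UTC date), so counts reset when the date changes.
        self._counts: Dict[Tuple[str, date], int] = defaultdict(int)

    def try_read(self, user_id: str, today: Optional[date] = None) -> bool:
        """Record one read and return True, or return False if the quota is used up."""
        day = today if today is not None else datetime.now(timezone.utc).date()
        key = (user_id, day)
        if self._counts[key] >= self.limit:
            return False
        self._counts[key] += 1
        return True
```

The API handler would call try_read before querying the database and return an error response once it yields False. Because the counter never leaves the server, the client cannot manipulate it.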
I want to use Firestore in my app because its scaling limit is 1 million concurrent connections. I have found the pricing to be quite high, especially compared with the Realtime Database, but cannot use this as it only scales to around 200k.
I was wondering whether I could use Firestore, accessed directly from the client side, for the data that needs live document listeners, and use the Realtime Database for storing larger chunks of data that will be queried indirectly through Cloud Functions.
My question is:
If the only way to read/write the Realtime Database is through a Cloud Function called by the client, will this count as only 1 concurrent connection, since the client is not directly connected to it?
Thank you
but cannot use [Realtime Database] as it only scales to around 200k.
Keep in mind that this is per database instance. On a paid project, you can create additional database instances to scale much further (even beyond the 1m concurrents that Firestore supports), as long as you are able/willing to define how to distribute your users over the database instances (commonly referred to as a "sharding strategy").
On your actual question: each Cloud Functions instance counts as a single connection to the database. Keep in mind here that Cloud Functions auto-scale, so you will have as many connections from Cloud Functions as you have concurrently running Cloud Functions instances. So while it may well be more than a single connection, it is extremely unlikely you'll reach the limit of 200K connections through this means.
Is there any situation where it makes sense to use both the Realtime Database and Firestore in conjunction? What situations lend themselves more favorably to the Realtime Database vs Firestore, or a combination? I keep reading horror stories of people getting hit with huge costs. Is there any way to test beforehand?
For context, I am looking to work with an auction based market place of over 50,000 products. The idea is to be able to filter those products as needed; create, modify and delete bids for those products; favorite items; and retrieve users' bids. From what I have read, the general suggestion (to keep costs low) for marketplaces using Firebase seems to be to store products in the Realtime Database and the user objects, sales etc. in Firestore. The kinds of queries I will need are: find products with the lowest/highest bids, find the most favorited items, and fetch a user's current bids and purchases.
When would it be optimal to store in realtime vs firestore, from a cost perspective?
My current logic is to store the product objects in the Realtime Database, since they will be referenced more frequently. I am also thinking it makes sense to store the user info, their bids, and purchases in one Firestore document, since that would incur just one read, although for a highly active user it could mean a large amount of data being transferred. Where I am confused is with things like viewing the previous sales of a given product vs getting a user's previous sales: should sales be stored in the Realtime Database (as their own object or embedded in the product object), in Firestore (embedded in the user doc), or both?
Looking at the app you plan to make, let's talk through it briefly.
It's a bidding app: first someone wants to sell their stuff, so they post it in your app. Then every user of your app may see it and start bidding on it. I don't know exactly how your app is going to work, but my assumption is that you will store the bidders and the bids they make in the Firebase Realtime Database.
This will involve lots of read and write operations. Firestore does offer you 20K operations/day for free, and if you cross that limit it only costs $0.18 per 100K writes and $0.06 per 100K reads. The choice depends entirely on the scale of your app. If your app has a large audience, go for the Realtime Database. You can download up to 10 GB of data per month for free, and it costs a dollar per GB beyond that. But there is a catch: if you stick to the Spark plan, you can have only 100 simultaneous connections to the database, so I doubt the performance will hold up with a large number of users. On the Blaze plan it can go up to 200K, and that is per database, so if you create another database you get more. I would personally suggest creating multiple databases per region, or by any other parameter, to spread the traffic. [Again, it's up to how many people use your app.]
In my opinion, you should use the Firebase Realtime Database for your app. [Make sure you use Firebase Storage as well for storing large photos of the things on sale.]
Lastly, use Firestore when you have fewer operations that are larger in size. Use the Realtime Database when you have many small tasks, like updating the highest bid value or the number of users currently bidding on a particular item.
In my opinion, go for the Realtime Database. I use it too, for some game stuff like storing user stats and updating them as the user progresses. This involves lots of read/write/update/delete operations, so I stick with the Realtime Database.
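To make the arithmetic concrete, here is a small Python sketch that plugs in the figures quoted in this answer. The function names are made up for illustration, the prices are this answer's numbers rather than an official price list, and applying the 20K free allowance separately to reads and to writes is an assumption.

```python
def firestore_daily_cost(
    reads: int,
    writes: int,
    free_ops: int = 20_000,              # free operations/day as quoted above (assumed per type)
    read_price_per_100k: float = 0.06,   # USD, figure quoted above
    write_price_per_100k: float = 0.18,  # USD, figure quoted above
) -> float:
    """Estimate one day's Firestore read/write charges in USD."""
    billable_reads = max(0, reads - free_ops)
    billable_writes = max(0, writes - free_ops)
    return (
        billable_reads / 100_000 * read_price_per_100k
        + billable_writes / 100_000 * write_price_per_100k
    )


def rtdb_monthly_download_cost(
    gb_downloaded: float,
    free_gb: float = 10.0,      # free download per month as quoted above
    price_per_gb: float = 1.0,  # USD, figure quoted above
) -> float:
    """Estimate one month's Realtime Database download charges in USD."""
    return max(0.0, gb_downloaded - free_gb) * price_per_gb
```

For example, firestore_daily_cost(reads=120_000, writes=220_000) bills 100K reads and 200K writes, and rtdb_monthly_download_cost(25) bills 15 GB. Running your own expected volumes through both functions gives a first comparison before committing to either database.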
When to use Firestore along with the real-time database?
As you have mentioned user profiles, I suggest using Firestore to store that profile data. Users won't generally update their profile, so this won't cost many writes. Also, bidders will be much more interested in bidding than in looking at other people's profiles, so even if a few users check others' profiles, this won't cost you many reads. And if your app is designed so that a bidder must check the seller's profile once, then Firestore will definitely help reduce usage of the Realtime Database's [GB Downloaded] quota.
Every time someone queries data from your realtime database, you consume some part of the 10 GB of free download limit.
Also, regarding the simultaneous connections to the database that I mentioned: if you host user profile data in Firestore, then Firestore will take care of profile visits, so that bidders get a faster response from your application. Just make sure you use all the free quotas from Firebase Storage, Firestore and the Realtime Database, and that your app is designed so that it spreads traffic evenly between all services. Use Cloud Functions on your back end, and don't make your application [.apk] too heavy on the client side, as the app needs a lot of code.
So, in conclusion: use Firestore to store data that won't be accessed frequently, like the user credentials and whatever they have on sale. Use the Realtime Database to store bidding data. Oh, and if you also want to store stats like how many purchases someone has made, or other information that changes very frequently, put that in the Realtime Database. You can simply create a child node users/${username} and keep the frequently changing data there. This won't cost you much storage, but take care of that download limit. It shouldn't be very expensive, especially considering your app is going to handle 50,000 products XD.
I am looking to work with an auction based market place of over 50,000 products.
If you have comparatively few users, the Realtime Database is sufficient, but who knows when there may be a huge rise in your app's users. So it's better to spread the data across both Firestore and the Realtime Database, as mentioned above.
Just a caution: this is what I faced, then searched on Stack Overflow and found this. Firestore counts reads even if you are just scrolling through the data tab in the Firestore console. So make sure you don't just browse around in there. I made 2 writes and was just looking at how the data is being stored, and I already had 27 reads ...
I know scalability is not an issue in Firebase, and that it supports up to 100k simultaneous connections (in general).
Based on pricing documentation:
You can create multiple database instances to go beyond the 100K
concurrent limit. See Pricing FAQ for more information.
Question 1: What if more than 200k users are using the same database simultaneously? Would the other half of the users be unable to query or connect, or would their requests be placed in a queue?
(As a Firebase plan subscriber, I would like to know how Firebase deals with this problem to ensure the quality of the service provided to our customers is always top-notch)
App globalisation is common nowadays, and many companies run servers across multiple regions to provide better and more stable performance, for example for online games, which require low latency.
For now, a Firebase user is required to set the default location when creating the project, and it cannot be edited afterwards. Issues even arise where users realise they deployed their app to the wrong region and have no idea how to change it.
This represents the country/region of your organisation/company. Your
selection also sets the appropriate currency for your revenue
reporting. The selected country does not determine the location of
your data for Firebase features. Google may process and store Customer
Data anywhere Google or its agents maintain facilities.
Question 2: Does Firebase provide, or will it provide, a solution tailored to this practice: having our database in multiple regions, with a headquarters region and multiple other regions sharing all the databases, functions and auth across regions?
(For now, to have multiple server locations, we have to create different projects, and syncing users and data will be a problem)
Hope the language does not offend, cheers!
It seems like your question (or at least your assumptions) is based on the Firebase Realtime Database, so I'll answer for that below.
Q1) You can create more than 2 databases in a single project, each of which allows 100K connections. So it can scale beyond 200K connections. All of these are hosted in the same region though, so you can't use each database for a separate region.
Q2) For a database solution that handles multiple regions, I'd recommend looking at Cloud Firestore. Also see: Cloud Firestore - selecting region to store data?