Firestore Document "Too much contention": such thing in realtime database? - firebase

I've built an app that let people sell tickets for events. Whenever a ticket is sold, I update the document that represents the ticket of the event in firestore to update the stats.
On peak times, this document is updated quite a lot (10x a second maybe). Sometimes transactions to this item document fail due to the fact that there is "too much contention", which results in inaccurate stats since the stat update is dropped. I guess this is the result of the high load on the document.
To resolve this problem, I am considering to move the stats of the items from the item document in firestore to the realtime database. Before I do, I want to be sure that this will actually resolve the problem I had with the contention on my item document. Can the realtime database handle such load better than a firestore document? Is it considered good practice to move such data to the realtime database?

The issue you're running into is a documented limit of Firestore. There is a limit to the rate of sustained writes to a single document of 1 per second. You might be able to burst writes faster than that for a while, but eventually the writes will fail, as you're seeing.
Realtime Database has different documented limits. It's measured in the total volume of data written to the entire database. That limit is 64MB per minute. If you want to move to Realtime Database, as long as you are under that limit, you should be OK.
If you are effectively implementing a counter or some other data aggregation in Firestore, you should also look into the distributed counter solution that works around the per-document write limit by sharding data across multiple documents. Your client code would then have to use all of these document shards in order to present data.
As for whether or not any one of these is a "good practice", that's a matter of opinion, which is off topic for Stack Overflow. Do whatever works for your use case. I've heard of people successfully using either one.

On peak times, this document is updated quite a lot (10x a second maybe). Sometimes transactions to this item document fail due to the fact that there is "too much contention"
This is happening because Firestore cannot handle such a rate. According to the official documentation regarding quotas for writes and transactions:
Maximum write rate to a document: 1 per second
Sometimes it might work for two or even three writes per second but at some time will definitely fail. 10 writes per second are way too much.
To resolve this problem, I am considering to move the stats of the items from the item document in Firestore to the realtime database.
That's a solution that I even I use it for such cases.
According to the official documentation regarding usage and limits in Firebase Realtime database, there is no such limitation there. But it's up to you to decide if it fits your needs or not.
There one more thing that you need to into consideration, which is distributed counter. It can solve your problem for sure.

Related

Has firestore removed the soft limit of 1 write per second to a single document?

Firestore has always had a soft limit of 1 write per second to a single document. That meant that for doing things like a counter that updates more frequently than once per second, the recommended solution was sharded counters.
Looking at the Firestore Limits documentation, this limit appears to have disappeared. The Firebase summit presentation mentioned that Firestore is now more scalable, but only mentioned hard limits being removed.
Can anyone confirm whether this limit has indeed been removed, and we can remove all our sharded counters in favor of writing to a single count document tens or hundreds of times per second?
firebaser here
This was indeed removed from the documentation. It was always a soft limit that was not enforced in the code, but instead an estimate for the physical limitation of how long it takes to synchronize the changes to indexes and multiple data center.
We've significantly improved the infrastructure used for these write operations, and now provide tools such a the key visualizer to better analyze performance and look for hot spots in the read and write behavior of your app. While there's still some physical limit, we recommend using these tools rather than depending on a single documented value to analyze your app's database performance.
For most use-cases I'd recommend using the new COUNT() operator nowadays. But if you want to continue using write-time aggregation counters, it is still recommended to use a sharded counter for high-volume count operations, we've just stopped giving a hard number for when to use it.

Strength of atomic update in Firestore increment

I'm Firestore user recently diving into a concept of "atomic" update, especially Firestore documents' increment update. There is a classic article on Firestore increment in context of atomic update. And here comes my question.
Q, How strong is this atomic increment(number) update? Does this operation really have no limitation when it comes to operating truly atomically?
Let me explain a bit of details with an example case. We know that Firestore has a write limitation of 10,000 (up to 10 MiB per second) per db instance, and we also know that Firestore's increment method updates documents atomically. So, I hope to know if the below extreme example case would work perfectly atomically.
This Firestore instance only has a single document, and numerous users-maybe 10000 users maximum- update a single document using increment method, which increments a same field value as much as a random double number between 0 and 1 each, WITHIN a single second: 10000 updates in 1 second;
Above case makes use of Firestore write rate limit per second as much as possible, and all operations are updating a single field of same document. If increment method deals with update requests truly atomically, we might say all 10000 details will be calculated correctly into a single field.
But, this is only theoretic and conceptual idea, and it seems really hard for Firestore(or even any other db systems) to make no exception when it performs such an extreme set of increment operations when it has to deal with other upcoming operations linearly. It means that the Firestore instance would keep going on with upcoming API requests. This is a real world problem, actually. Let's say a lovely singer, Ariana Grande's Instagram post is just uploaded. If we deal with the event with Firestore document, we would have to deal with thousands of increment requests for likes per a single second.
So, i hope to know if there is truly no limitations for atomic increment method even there comes a set of high number of extremely concurrent increment requests to very few number of target documents. Hope this question reach to firebase gurus in the community! Comments are really welcomed! Thanks in advance [:
I'm not sure I understand your question completely, but I'll try to help anyway by explaining how Firestore and its increment operation work.
Firestore's main write limits come from the fact that data needs to be synchronized between data centers for each write operation. This is not a quota-type limit, but a physical limit of how fast data can be pushed across the wires.
Since you talk about frequent writes to a single document, you're going to sooner hit the soft limit of 1 sustained write per second per document. This is also caused by the physical nature of how the database works, and needs to synchronize the documents and indexes between servers/data centers.
While using the increment() operation means that no roundtrip is needed between the client and the server, it makes no difference to the data that needs to be read/written on the servers themselves. Therefore it makes no difference to the documented throughput limits.
If you need to perform counts beyond the documented throughput limits, have a look at the documentation on using a distributed counter.

How to avoid Firestore document write limit when implementing an Aggregate Query?

I need to keep track of the number of photos I have in a Photos collection. So I want to implement an Aggregate Query as detailed in the linked article.
My plan is to have a Cloud Function that runs whenever a Photo document is created or deleted, and then increment or decrement the aggregate counter as needed.
This will work, but I worry about running into the 1 write/document/second limit. Say that a user adds 10 images in a single import action. That is 10 executions of the Cloud Function in more-or-less the same time, and thus 10 writes to the Aggregate Query document more-or-less at the same time.
Looking around I have seen several mentions (like here) that the 1 write/doc/sec limit is for sustained periods of constant load, not short bursts. That sounds reassuring, but it isn't really reassuring enough to convince an employer that your choice of DB is a safe and secure option if all you have to go on is that 'some guy said it was OK on Google Groups'. Is there any official sources stating that short write bursts are OK, and if so, what definitions are there for a 'short burst'?
Or are there other ways to maintain an Aggregate Query result document without also subjecting all the aggregated documents to a very restrictive 1 write / second limitation across all the aggregated documents?
If you think that you'll see a sustained write rate of more than once per second, consider dividing the aggregation up in shards. In this scenario you have N aggregation docs, and each client/function picks one at random to write to. Then when a client needs the aggregate, it reads all these subdocuments and adds them up client-side. This approach is quite well explained in the Firebase documentation on distributed counters, and is also the approach used in the distributed counter Firebase Extension.

Firestore Realtime Updates 1M Limit

When using Firestore and subscribing to document updates, it states a limit of 1M concurrent mobile/web connections per database.
https://firebase.google.com/docs/firestore/quotas#realtime_updates
Is that a hard limit (enforced/throttled in code)? Or is it a theoretical limit (like you're safe up to 1M, then things get dicey)? Is it possible to get an uplift?
Trying to understand how to support a large user base without needing to shard the database (which is one of the advantages of Firestore). Even at 5M users, it seems you would start having problems because you'd probably hit times when >20% of those users were on your app simultaneously.
As you already noticed, the maximum size of a single document in Firestore is 1 Megabyte. Trying to store large number of objects (maps) that may exceed this limitation, is generally considered a bad design.
You should reconsider the logic of you app and think at the reson why you need to have more than 1Mib in single a document, rather than each object being their own document. So to be able to use Firestore, you should change the way you are holding the data from within a single documents to a collection. In case of collections, there are no limitations. You can add as many documents as you want. According to the official documentation regarding Cloud Firestore Data model:
Cloud Firestore is optimized for storing large collections of small documents.
IMHO, you should take advantage of this feature.
For details, I recommend you see my answer from this post where I have explained some practices regarding storing data in arrays (documents), maps or collections.
Edit:
Without sharding, I'm affraid it is not an option. So in this case, sharding will work for sure. So in my opinion, that's certainly a reasonable option.

Does firestore charge for reads which does not return data

I checked the documentation on firebase but it does not mention the scenario where for example I have a collection with 100,000 records but the query that I am running does not bring back any result, which means none of the document satisfied the condition. Would I be still charged for checking 100,000 document ?
I currently have a cron job running in a node server which constantly queries the firestore database to look at records which have expired, it the record has expired (this is done by checking the timestamp with the current timestamp) then I am updating a field in the document. I noticed that I am being charged for the reads even though the result set was empty.
According to the Cloud Firestore billing:
“There is a minimum charge of one document read for each query that you perform, even if the query returns no results.”
All of your questions about Firestore billing should be made clear by reading the documentation. There are many different situations, and you'll possibly need to be aware of all of them, depending on your code.
But to briefly answer your question, you are only charged for documents that are actually delivered to the client, in the case of a simple query. The size of the collection is not considered at all for the purpose of counting documents read. Of course, if you have a large collection, you will increase the amount of billing based on its total storage size, including indexes.

Resources