Number of READS in firestore and the basis of its calculation - firebase

I still fail to understand how the number of reads is calculated in Firestore. As an experiment, I sat at the Firestore console without doing anything (no devices connected, no mobile, no emulator, nothing), and the number of reads registered under the Usage tab was about 600 in about 10 minutes. So my guess is that for a real app out there, 50,000 reads will be breached in no time at all! Can someone please explain Firestore reads and their fundamentals?

The number of reads in Firestore is equal to the number of documents that are returned from the server by a query. Let's say you have a collection of 1 million documents, but your query returns only 10 of them. In that case you pay for only 10 document reads.
If your query yields no results, the official documentation on Firestore pricing says:
Minimum charge for queries
There is a minimum charge of one document read for each query that you perform, even if the query returns no results.
Those unexpected reads most likely come from the fact that you are using the Firebase console. All operations that you perform in the console count towards the total quota. So remember not to keep your Firebase console open, because it is just another Firestore client that reads data, and you'll be billed for the reads that come from it as well.

Related

Firebase Firestore Read Costs - Clarification

I am using Firestore DB for an e-commerce app. I have a collection of products, each product has a document that has a "title" field and "search_keywords" field. The search keyword field stores an array. For example, if the title="apple", then the "search_keywords" field would store the following array: ["a","ap","app","appl","apple"]. When the user starts typing "apple" in the search box, I want to show the user, all products where "search_keywords" contains "a", then when they type the "p", I want to show all products where search keywords contain "ap"...and so on. Here is the snippet of code that gets called each time an additional letter is typed:
firebaseFireStore.collection("Produce").whereArrayContains("search_keywords", toSearch).get()
In every case, the documents returned on each successive call, where an additional letter was typed, would be a subset of what was returned in the previous call. It would just be a smaller list of documents, all of which were read by the previous query. My question is: since the documents retrieved by a successive query are a subset of those retrieved by a prior query, would I be charged reads based on how many documents each successive query returns, or would Firestore have them in the cache and read them from there? This question has been on my mind for a while, and every time I search for it, I can't seem to find a clear answer. For example, the following two posts on Stack Overflow involve similar questions, and the relevant quotes from them are below, but they seem to contradict each other: Alex Mamo says "it will always read the online version of the documents...[when online]" and Doug Stevenson says "if the local persistence is enabled on your client (it is by default) and the documents haven't been updated in the server...[it will get them from the cache]". I would appreciate any clarification on this if anyone knows the answer. Thanks.
"If the OP has offline persistence enabled, which is by default in Cloud Firestore, then he will be able to read the cache only while offline. When the OP has internet connectivity, it will always read the online version of the documents." –
Alex Mamo (https://stackoverflow.com/a/69320068/14556386)
"According to this answer by Doug Stevenson, the reads are only charged when performed upon the server, not your local cache. That is if the local persistence is enabled on your client (it is by default) and the documents haven't been updated in the server."
(https://stackoverflow.com/a/61381656/14556386)
EDIT: In addition, suppose that for each product document retrieved by the Firestore search, I download its corresponding image file from Firebase Storage. Would I be charged for downloading that file on successive attempts, or would it recognize that I had previously downloaded that image and fetch it from the cache automatically?
First of all, storing ["a", "ap", "app", "appl", "apple"] in an array and performing a whereArrayContains() query doesn't sound like a feasible idea. Why? Imagine you have a really big online shop with 100k products, of which 5k start with "a". Are you willing to pay for 5k reads every time a user types "a"? That's a very costly feature.
Most likely you should return the corresponding documents only after the user has typed two or even three characters. That will reduce costs enormously. Or you might consider the solution I have explained in the following article:
How to filter Firestore data cheaper?
Let's go forward.
For example, in every case, the documents that would be returned on each successive call where an additional letter was typed would be a subset of what was returned in the previous call, it would just be a smaller list of documents.
Yes, that's correct.
My question is since the documents retrieved on a successive query are a subset of those retrieved in a prior query, would I be charged reads based on how many documents each successive query returns?
Yes. You'll always be charged with a number of reads that is equal to the number of documents that are returned by your query. It doesn't matter if a query was previously performed, or not. Every time you perform a new query, you'll be charged with a number of reads that is equal to the number of documents you get.
For example, let's assume you perform this query:
.whereArrayContains("search_keywords", "a")
And you get 100 documents, and right after that you perform:
.whereArrayContains("search_keywords", "ap")
And you get only 30 documents. You'll then have to pay for 130 reads in total, not only 100. So it doesn't matter that the documents returned by the second query are a subset of the documents returned by the first query.
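The arithmetic above can be sketched in a few lines of Python. This is only a model of the billing rule described in this answer, not a call into any Firebase SDK; the function name `total_billed_reads` and the `min_chars` parameter are made up for illustration, and the document counts are the ones from the example.

```python
def total_billed_reads(results_per_query, min_chars=1):
    """Sum the reads billed for a sequence of one-shot get() queries.

    results_per_query maps each typed prefix to the number of documents
    its query returns. Prefixes shorter than min_chars are never sent.
    """
    total = 0
    for prefix, returned in results_per_query.items():
        if len(prefix) < min_chars:
            continue  # no query is sent, so nothing is billed
        # Each query is billed on its own, with a minimum of one read
        # even when it returns no documents.
        total += max(1, returned)
    return total


# Counts from the example above: "a" -> 100 documents, "ap" -> 30.
keystrokes = {"a": 100, "ap": 30}

print(total_billed_reads(keystrokes))               # 130
print(total_billed_reads(keystrokes, min_chars=2))  # 30
```

The second call illustrates the earlier suggestion of waiting for two characters: the "a" query is never sent, so only the 30 reads of the "ap" query are billed.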
Or would Firestore have them in the cache and read them from there since the successive result set is a subset of a prior result set.
No, it won't. It will read those documents from the cache only if the user loses internet connectivity. Otherwise it will always read the online versions of the documents that exist on the Firebase servers. The cached version of the documents is used only when the user is offline. I have also written an article on this topic called:
How to drastically reduce the number of reads when no documents are changed in Firestore?
In Doug's answer:
Am I charged with read operations everytime the location is changed?
He clearly says:
You are charged for the number of documents read on the server every time you call get().
So if you call get(), you pay for as many reads as there are documents returned.
The following statement:
If local persistence is enabled in your client (it is by default), then the documents may come from the cache if the documents are also not changed on the server.
applies when you are listening for real-time updates. According to the docs:
When you listen to the results of a query, you are charged for a read each time a document in the result set is added or updated. You are also charged for a read when a document is removed from the result set because the document has changed.
And I would add, if nothing has changed, you don't have to pay anything. Again, according to the same docs:
Also, if the listener is disconnected for more than 30 minutes (for example, if the user goes offline), you will be charged for reads as if you had issued a brand-new query.
So while the listener stays active, documents that have not changed are served from the cache and are not billed again. Bear in mind that a get() operation is different from listening for real-time updates.
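To make the listener rules quoted above concrete, here is a rough Python model. It is a sketch under stated assumptions, not Firebase code: the function `listener_reads`, its parameters, and the event names are invented for illustration, and it assumes the first snapshot is billed once per document in the result set (each one counts as "added").

```python
RECONNECT_LIMIT_MINUTES = 30  # threshold quoted from the docs

# Events that cost one read each, per the quoted rules. A document that
# is deleted outright is not billed.
BILLED_EVENTS = {"added", "updated", "removed_changed"}


def listener_reads(initial_docs, events, offline_minutes=0, docs_on_reconnect=0):
    """Estimate the reads billed to a single real-time listener.

    initial_docs: size of the result set when the listener attaches
    events: iterable of event names seen while the listener is active
    offline_minutes: length of one disconnect
    docs_on_reconnect: size of the result set after reconnecting
    """
    reads = initial_docs  # assumption: first snapshot bills every document
    for event in events:
        if event in BILLED_EVENTS:
            reads += 1
    if offline_minutes > RECONNECT_LIMIT_MINUTES:
        # Billed as if a brand-new query had been issued.
        reads += max(1, docs_on_reconnect)
    return reads


print(listener_reads(10, []))                                # 10
print(listener_reads(10, ["updated", "added", "deleted"]))   # 12
print(listener_reads(10, [], offline_minutes=45,
                     docs_on_reconnect=10))                  # 20
```

With no changes and no long disconnect, the count stays at the initial 10 reads, which is the "if nothing has changed, you don't have to pay anything" case.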
if for each product document that was retrieved by the Firestore search, I download its corresponding image file from Firebase Storage. Would it charge me for downloading that file on successive attempts to download it or would it recognize that I had previously downloaded that image and fetch it from cache automatically?
You'll always be charged if you download the image over and over again unless you are using a library that helps you cache the images. For Android, there is a library called Glide:
Glide is a fast and efficient open-source media management and image loading framework for Android that wraps media decoding, memory and disk caching, and resource pooling into a simple and easy-to-use interface.

Firebase query charges additional reads?

How do Firebase queries work?
For example, if I write this query:
var collectionReference = FirebaseFirestore.instance
.collection('collection')
.where(cond)
.where(cond2)
.where(cond3);
Is this going to return only the documents that fit the conditions?
And am I going to be charged only for those document reads?
From the docs (TL;DR):
Charges for reads have some nuances that you should keep in mind. The following sections explain these nuances in detail.
Listening to query results
Cloud Firestore allows you to listen to the results of a query and get realtime updates when the query results change.
When you listen to the results of a query, you are charged for a read each time a document in the result set is added or updated. You are also charged for a read when a document is removed from the result set because the document has changed. (In contrast, when a document is deleted, you are not charged for a read.)
Also, if the listener is disconnected for more than 30 minutes (for example, if the user goes offline), you will be charged for reads as if you had issued a brand-new query.
Managing large result sets
Cloud Firestore has several features to help you manage queries that return a large number of results:
Cursors, which allow you to resume a long-running query.
Page tokens, which help you paginate the query results.
Limits, which specify how many results to retrieve.
Offsets, which allow you to skip a fixed number of documents.
There are no additional costs for using cursors, page tokens, and limits. In fact, these features can help you save money by reading only the documents that you actually need.
However, when you send a query that includes an offset, you are charged a read for each skipped document. For example, if your query uses an offset of 10, and the query returns 1 document, you are charged for 11 reads. Because of this additional cost, you should use cursors instead of offsets whenever possible.
Queries other than document reads
For queries other than document reads, such as a request for a list of collection IDs, you are billed for one document read. If fetching the complete set of results requires more than one request (for example, if you are using pagination), you are billed once per request.
Minimum charge for queries
There is a minimum charge of one document read for each query that you perform, even if the query returns no results.
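The offset and minimum-charge rules quoted above can be checked with a small Python model. This is a sketch of the pricing arithmetic only; `billed_reads` and `pagination_cost` are hypothetical helper names, not part of any Firestore API.

```python
def billed_reads(returned, offset=0):
    """Reads billed for one query under the rules quoted above."""
    # Skipped (offset) documents are billed like returned ones, and
    # every query costs at least one read.
    return max(1, returned + offset)


def pagination_cost(total_docs, page_size, use_offset):
    """Total reads for paging through a result set page by page."""
    cost = 0
    for start in range(0, total_docs, page_size):
        returned = min(page_size, total_docs - start)
        # With a cursor the query resumes where the last page ended,
        # so no documents are skipped and none are billed twice.
        cost += billed_reads(returned, offset=start if use_offset else 0)
    return cost


print(billed_reads(returned=1, offset=10))  # 11, the example from the docs
print(billed_reads(returned=0))             # 1, the minimum charge

print(pagination_cost(100, 10, use_offset=True))   # 550
print(pagination_cost(100, 10, use_offset=False))  # 100
```

Paging through 100 documents ten at a time costs 550 reads with offsets but only 100 with cursors, which is why the docs recommend cursors.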

Firestore Document "Too much contention": such thing in realtime database?

I've built an app that lets people sell tickets for events. Whenever a ticket is sold, I update the document that represents the ticket of the event in Firestore to update the stats.
On peak times, this document is updated quite a lot (10x a second maybe). Sometimes transactions to this item document fail due to the fact that there is "too much contention", which results in inaccurate stats since the stat update is dropped. I guess this is the result of the high load on the document.
To resolve this problem, I am considering moving the stats of the items from the item document in Firestore to the Realtime Database. Before I do, I want to be sure that this will actually resolve the problem I had with contention on my item document. Can the Realtime Database handle such load better than a Firestore document? Is it considered good practice to move such data to the Realtime Database?
The issue you're running into is a documented limit of Firestore. There is a limit to the rate of sustained writes to a single document of 1 per second. You might be able to burst writes faster than that for a while, but eventually the writes will fail, as you're seeing.
Realtime Database has different documented limits. It's measured in the total volume of data written to the entire database. That limit is 64MB per minute. If you want to move to Realtime Database, as long as you are under that limit, you should be OK.
If you are effectively implementing a counter or some other data aggregation in Firestore, you should also look into the distributed counter solution that works around the per-document write limit by sharding data across multiple documents. Your client code would then have to use all of these document shards in order to present data.
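The sharding idea can be illustrated with a plain in-memory Python sketch. It makes no Firebase calls: the `ShardedCounter` class is hypothetical, and each list entry merely stands in for one shard document that would hold a count field.

```python
import random


class ShardedCounter:
    """In-memory stand-in for a counter split across several documents."""

    def __init__(self, num_shards=10):
        # Each entry plays the role of one shard document's count.
        self.shards = [0] * num_shards

    def increment(self, amount=1):
        # Writing to a randomly chosen shard spreads the load, so each
        # shard receives roughly 1/num_shards of the total write rate.
        index = random.randrange(len(self.shards))
        self.shards[index] += amount

    def total(self):
        # Reading the counter means reading every shard and summing.
        return sum(self.shards)


counter = ShardedCounter(num_shards=10)
for _ in range(1000):
    counter.increment()

print(counter.total())  # 1000
```

The trade-off is on the read side: obtaining the total touches every shard, so a counter with 10 shards costs 10 document reads per fetch. The shard count would need to be chosen against the expected peak write rate.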
As for whether or not any one of these is a "good practice", that's a matter of opinion, which is off topic for Stack Overflow. Do whatever works for your use case. I've heard of people successfully using either one.
On peak times, this document is updated quite a lot (10x a second maybe). Sometimes transactions to this item document fail due to the fact that there is "too much contention"
This is happening because Firestore cannot handle such a rate. According to the official documentation regarding quotas for writes and transactions:
Maximum write rate to a document: 1 per second
Sometimes it might work for two or even three writes per second, but at some point it will definitely fail. 10 writes per second is way too much.
To resolve this problem, I am considering moving the stats of the items from the item document in Firestore to the Realtime Database.
That's a solution that I use myself in such cases.
According to the official documentation regarding usage and limits in Firebase Realtime Database, there is no such limitation there. But it's up to you to decide whether it fits your needs or not.
There is one more thing that you need to take into consideration, which is the distributed counter. It can solve your problem for sure.

How to optimize firestore read per app launch

I understand that Firestore charges are based on read/write operations.
But I notice that Firestore reads from the server on every app launch, which will cause a big read count if many users open the app frequently.
Q1 Can I limit users to reading from the server only on first login, and after that read only the updated documents on each app launch?
For example there's a chat app group.
100 users
100 message
100 app launch / user / day
Will it become 1,000,000 reads per day?
That is ridiculously high.
Q2 Reads are counted per document, regardless of whether it is in a root collection or a subcollection, right?
For example, if I read from a root collection that contains 10 subcollections, each of them having 10 documents, that will result in 100 reads, am I right?
Thanks.
That’s correct, Cloud Firestore cares less about the amount of downloaded data and more about the number of performed operations.
As Cloud Firestore’s pricing depends on the number of reads, writes, and deletes that you perform, it means that if you had 100 users communicating within one chat room, each of the users would get an update once someone sends a message in that chat, therefore, increasing the number of read operations.
Since the number of read operations would be very much affected by the number of people in the same chatroom, Cloud Firestore suits best (price-wise) for a person-to-person chat app.
However, you could structure your app to have more chat rooms in order to decrease the volume of reads. Here you can see how to store different chat rooms, while the following link will guide you to the best practices on how to optimize your Cloud Firestore realtime updates.
Please keep in mind that Cloud Firestore itself does not have any rate limiting by default. However, Google Cloud Platform has configurable billing alerts that apply to your entire project.
You can also limit the billing to $25/month by using the Flame plan, and if there is anything unclear in your bill, you can always contact Firebase support for help.
Regarding your second question, a read occurs any time a client gets data from a document. Remember, only the documents that are retrieved are counted - Cloud Firestore does searching through indexes, not the documents themselves.
By using subcollections, you can still retrieve data from a single document, which will count as only 1 read, or you can use a collection group query that retrieves all the documents within the subcollections, which counts as multiple reads depending on the number of documents (in the example you gave, it would be 10x10 = 100).
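Both figures from this thread can be reproduced with a short Python calculation. It is just the arithmetic, under the question's worst-case assumption that every app launch re-reads every message from the server; the function names are made up for illustration.

```python
def reads_per_day(users, messages, launches_per_user):
    """Worst case: every launch fetches every message again."""
    return users * messages * launches_per_user


def collection_group_reads(subcollections, docs_per_subcollection):
    """One read per document returned across all subcollections."""
    return subcollections * docs_per_subcollection


print(reads_per_day(users=100, messages=100, launches_per_user=100))        # 1000000
print(collection_group_reads(subcollections=10, docs_per_subcollection=10)) # 100
```

The first number shrinks quickly if a launch does not have to fetch all 100 messages again, since the launch count and the message count multiply each other.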

Does firestore charge for reads which does not return data

I checked the documentation on Firebase, but it does not mention the scenario where, for example, I have a collection with 100,000 records and the query that I am running does not bring back any results, meaning that none of the documents satisfied the condition. Would I still be charged for checking 100,000 documents?
I currently have a cron job running on a Node server which constantly queries the Firestore database for records that have expired. If a record has expired (this is determined by comparing its timestamp with the current timestamp), I update a field in the document. I noticed that I am being charged for the reads even though the result set was empty.
According to the Cloud Firestore billing:
“There is a minimum charge of one document read for each query that you perform, even if the query returns no results.”
All of your questions about Firestore billing should be made clear by reading the documentation. There are many different situations, and you'll possibly need to be aware of all of them, depending on your code.
But to briefly answer your question, you are only charged for documents that are actually delivered to the client, in the case of a simple query. The size of the collection is not considered at all for the purpose of counting documents read. Of course, if you have a large collection, you will increase the amount of billing based on its total storage size, including indexes.

Resources