What constitutes a write action in Firestore?

I'm currently developing a Flutter web application that uses Firestore for data persistence. The app is not live in production, so I'm the only one accessing this backend. There is only one collection, which holds a single document with many nested fields (6 levels deep). My understanding from the pricing page at https://firebase.google.com/docs/firestore/pricing is that reads are counted per document, so every time I reload my app it should count as one read. Yet in the 4 hours since I started working today, I have already hit 1.7K reads (as reported in the Usage tab). I know I haven't reloaded the app that many times, and there is no hidden loop calling the collection repeatedly.
This is the Flutter code that calls Firestore:
final sourceRef = FirebaseFirestore.instance.collection("source");
var data = await sourceRef.doc("stats").get();
What am I missing please?

According to Firebase pricing, writes are defined as:
You are charged for each document read, write, and delete that you perform with Cloud Firestore.
Charges for writes and deletes are straightforward. For writes, each set or update operation counts as a single write.
Meaning that creating one document counts as one write. If the same document is updated later, Firebase counts that as one more write.
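For instance, here is a minimal Dart sketch in the same Flutter context as the question (the field names are just illustrative):

import 'package:cloud_firestore/cloud_firestore.dart';

Future<void> writeStats() async {
  final statsRef = FirebaseFirestore.instance.collection('source').doc('stats');

  // Creating (or overwriting) the document: billed as one write.
  await statsRef.set({'visits': 0});

  // Updating the same document later: billed as one more write,
  // no matter how many fields the update touches.
  await statsRef.update({
    'visits': 1,
    'lastUpdated': FieldValue.serverTimestamp(),
  });
}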
The pricing documentation also includes a more detailed billing table and a worked example.
It is also worth checking per-product usage in the "Usage" tab of the Firebase console, as that can narrow down which product is causing the elevated usage you are seeing.
I would highly recommend adding read and write logging to your application; that way, you can monitor exactly how many reads and writes you perform.
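A minimal sketch of what such logging could look like around the question's Flutter code (the counters and print statements are only illustrative; a real app might forward them to an analytics or logging tool instead):

import 'package:cloud_firestore/cloud_firestore.dart';

int readCount = 0;
int writeCount = 0;

// Wraps a document get() so every billed read is logged.
Future<DocumentSnapshot<Map<String, dynamic>>> loggedGet(
    DocumentReference<Map<String, dynamic>> ref) async {
  final snapshot = await ref.get();
  readCount++; // one document get() == one billed read
  print('reads so far: $readCount (last: ${ref.path})');
  return snapshot;
}

// Wraps a document set() so every billed write is logged.
Future<void> loggedSet(DocumentReference<Map<String, dynamic>> ref,
    Map<String, dynamic> data) async {
  await ref.set(data);
  writeCount++; // each set()/update() == one billed write
  print('writes so far: $writeCount (last: ${ref.path})');
}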

Related

Is there a way to limit the size of a collection in Firebase Firestore?

I am using a collection in Firebase Firestore to log some activities, but I don't want this log collection to grow forever. Is there a way to set a limit on the number of documents in a collection, or a size limit for the whole collection, or to get a notification if it passes a limit?
OR is there a way to automatically delete old documents in a collection purely through settings, without writing a cron job or scheduled function?
Alternatively, what options are there to create a rotational logging system for client activities in Firebase?
I don't want this log collection to grow forever.
Why not? There are no downsides. In Firestore, performance depends on the number of documents you request, not on the number of documents in the collection you search. So it doesn't really matter whether you fetch 10 documents from a collection of 100 documents or from a collection of 100 million documents; the response time will always be the same. As you can see, the number of documents within a collection is irrelevant.
Is there a way to set a limit to the number of documents in a collection or a size limit for the whole collection or get a notification if it passes a limit?
There is no built-in mechanism for that. However, you can build one yourself in a very simple way: keep a document with a numeric field that you increment or decrement each time a document is added to or deleted from the collection. Once you hit the limit, you can restrict further additions to that collection.
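A minimal Dart sketch of that idea, assuming a 'logs' collection and a 'meta/counter' document (all names are illustrative; checking the counter against your limit, for example in security rules or before calling this function, is left out):

import 'package:cloud_firestore/cloud_firestore.dart';

// Adds a log entry and keeps the counter document in sync.
Future<void> addLogEntry(Map<String, dynamic> entry) async {
  final db = FirebaseFirestore.instance;
  final counterRef = db.collection('meta').doc('counter');
  final newDocRef = db.collection('logs').doc();

  final batch = db.batch();
  batch.set(newDocRef, entry);
  // Atomically bump the counter together with the new document.
  batch.set(counterRef, {'count': FieldValue.increment(1)}, SetOptions(merge: true));
  await batch.commit();
}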
OR is there a way to automatically delete old documents in a collection just by settings and not writing some cron job or scheduled function?
There is also no automatic operation that can do that for you. You can either use the solution above and, once you hit the limit + 1, delete the oldest document, or you can use a Cloud Function for Firebase to achieve the same thing. I cannot see any reason why you would need a cron job. You could use Cloud Scheduler to perform an operation at a specific time, but as I understand it, you want the deletion to happen automatically when you hit the limit.
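A rough client-side sketch of the "delete the oldest once you exceed the limit" idea (the same logic could live in a Cloud Function). It assumes the counter document from the sketch above and a 'createdAt' timestamp field on each log document; all names are illustrative:

import 'package:cloud_firestore/cloud_firestore.dart';

// Trims the 'logs' collection back down to maxDocs by deleting the oldest entries.
Future<void> trimOldest({int maxDocs = 1000}) async {
  final db = FirebaseFirestore.instance;

  final counterSnap = await db.collection('meta').doc('counter').get();
  final count = (counterSnap.data()?['count'] ?? 0) as int;
  if (count <= maxDocs) return;

  final oldest = await db
      .collection('logs')
      .orderBy('createdAt')
      .limit(count - maxDocs)
      .get();

  final batch = db.batch();
  for (final doc in oldest.docs) {
    batch.delete(doc.reference);
  }
  batch.update(db.collection('meta').doc('counter'),
      {'count': FieldValue.increment(-oldest.docs.length)});
  await batch.commit();
}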
Alternatively, what options are there to create a rotational logging system for client activities in Firebase?
If you still don't want large collections, you could export the data into a file and upload that file to Cloud Storage for Firebase.
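A rough sketch of such an export from a Flutter client, assuming the firebase_storage plugin (in practice you would more likely run this server-side; the collection name and storage path are illustrative):

import 'dart:convert';

import 'package:cloud_firestore/cloud_firestore.dart';
import 'package:firebase_storage/firebase_storage.dart';

// Dumps the current 'logs' collection to a JSON file in Cloud Storage.
Future<void> exportLogs() async {
  final snapshot = await FirebaseFirestore.instance.collection('logs').get();
  // Non-JSON types such as Timestamps are stringified for simplicity.
  final json = jsonEncode(
    snapshot.docs.map((d) => d.data()).toList(),
    toEncodable: (value) => value.toString(),
  );

  final fileRef = FirebaseStorage.instance
      .ref('log-exports/${DateTime.now().toIso8601String()}.json');
  await fileRef.putString(json);
}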

Firebase Firestore Read Costs - Clarification

I am using Firestore as the database for an e-commerce app. I have a collection of products; each product document has a "title" field and a "search_keywords" field, which stores an array. For example, if title="apple", the "search_keywords" field stores the array ["a","ap","app","appl","apple"]. When the user starts typing "apple" in the search box, I want to show them all products whose "search_keywords" contains "a"; then, when they type the "p", all products whose "search_keywords" contains "ap", and so on. Here is the snippet of code that gets called each time an additional letter is typed:
firebaseFireStore.collection("Produce").whereArrayContains("search_keywords", toSearch).get()
In every case, the documents returned on each successive call (where an additional letter was typed) would be a subset of what was returned in the previous call: just a smaller list of documents, all of which were already read by the previous query.

My question: since the documents retrieved on a successive query are a subset of those retrieved in a prior query, would I be charged reads based on how many documents each successive query returns, or would Firestore have them in the cache and read them from there, since the successive result set is a subset of a prior result set?

This question has been on my mind for a while and every time I search for it, I can't seem to find a clear answer. Based on my research, the following two Stack Overflow posts involve similar questions, and the quotes below seem to contradict each other: @Alex Mamo says "it will always read the online version of the documents...[when online]" while @Doug Stevenson says "if the local persistence is enabled on your client (it is by default) and the documents haven't been updated in the server...[it will get them from the cache]". I would appreciate any clarification on this if anyone knows the answer. Thanks.
"If the OP has offline persistence enabled, which is by default in Cloud Firestore, then he will be able to read the cache only while offline. When the OP has internet connectivity, it will always read the online version of the documents." –
Alex Mamo (https://stackoverflow.com/a/69320068/14556386)
"According to this answer by Doug Stevenson, the reads are only charged when performed upon the server, not your local cache. That is if the local persistence is enabled on your client (it is by default) and the documents haven't been updated in the server."
(https://stackoverflow.com/a/61381656/14556386)
EDIT: In addition, suppose that for each product document retrieved by the Firestore search, I download its corresponding image file from Firebase Storage. Would I be charged for downloading that file on successive attempts, or would it recognize that I had previously downloaded that image and fetch it from cache automatically?
First of all, storing ["a", "ap", "app", "appl", "apple"] in an array and performing a whereArrayContains() query doesn't sound like a feasible idea. Why? Imagine you have a really big online shop with 100k products, of which 5k start with "a". Are you willing to pay 5k reads every time a user types "a"? That's a very costly feature.
Most likely you should only return the corresponding documents once the user has typed, for example, two or even three characters; that alone reduces costs enormously (a short sketch follows the article link below). Or you might consider the solution I have explained in the following article:
How to filter Firestore data cheaper?
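A minimal sketch of the "wait for two or three characters" idea. The question's snippet uses the Android SDK, but the query shape is the same; this version is written in Flutter/Dart terms, and the threshold and limit values are just examples:

import 'package:cloud_firestore/cloud_firestore.dart';

const int minChars = 3; // example threshold: don't query on overly broad terms

Future<List<Map<String, dynamic>>> searchProduce(String toSearch) async {
  if (toSearch.length < minChars) return const [];

  final result = await FirebaseFirestore.instance
      .collection('Produce')
      .where('search_keywords', arrayContains: toSearch)
      .limit(20) // also cap the result set so even broad terms stay cheap
      .get();

  return result.docs.map((d) => d.data()).toList();
}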
Let's go forward.
In every case, the documents returned on each successive call, where an additional letter was typed, would be a subset of what was returned in the previous call; it would just be a smaller list of documents.
Yes, that's correct.
My question is since the documents retrieved on a successive query are a subset of those retrieved in a prior query, would I be charged reads based on how many documents each successive query returns?
Yes. You'll always be charged a number of reads equal to the number of documents returned by your query. It doesn't matter whether a similar query was performed before; every time you perform a new query, you're charged a number of reads equal to the number of documents you get back.
For example, let's assume you perform this query:
.whereArrayContains("search_keywords", "a")
And you get 100 documents back, and right after that you perform:
.whereArrayContains("search_keywords", "ap")
And you get only 30 documents, then you'll pay for 130 reads in total, not just 100. So it doesn't matter that the documents returned by the second query are a subset of the documents returned by the first query.
Or would Firestore have them in the cache and read them from there, since the successive result set is a subset of a prior result set?
No, it won't. It will read those documents from the cache only if the user loses internet connectivity; otherwise, it will always read the online versions of the documents that exist on the Firebase servers. The cached version of the documents is used only when the user is offline. I have also written an article on this topic called:
How to drastically reduce the number of reads when no documents are changed in Firestore?
In Doug's answer:
Am I charged with read operations everytime the location is changed?
He clearly says:
You are charged for the number of documents read on the server every time you call get().
So if you call get(), you pay, as reads, for the number of documents that are returned.
The following statement:
If local persistence is enabled in your client (it is by default), then the documents may come from the cache if the documents are also not changed on the server.
applies only when you are listening for real-time updates. According to the docs:
When you listen to the results of a query, you are charged for a read each time a document in the result set is added or updated. You are also charged for a read when a document is removed from the result set because the document has changed.
And I would add, if nothing has changed, you don't have to pay anything. Again, according to the same docs:
Also, if the listener is disconnected for more than 30 minutes (for example, if the user goes offline), you will be charged for reads as if you had issued a brand-new query.
So while the listener is active and nothing changes, the documents are served from the cache. Bear in mind that a get() operation is different from listening for real-time updates.
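To make the get()-versus-listener distinction concrete, here is a small sketch in Flutter/Dart terms (the equivalent options exist in the Android SDK; which documents are billed follows the rules quoted above):

import 'package:cloud_firestore/cloud_firestore.dart';

Future<void> readModes() async {
  final query = FirebaseFirestore.instance
      .collection('Produce')
      .where('search_keywords', arrayContains: 'ap');

  // 1. One-shot get(): billed for every document returned, every time.
  final fromServer = await query.get();
  print('server read: ${fromServer.docs.length} docs');

  // 2. Explicit cache-only read: no billed reads, but the data may be
  //    stale or missing if it was never fetched before.
  final fromCache = await query.get(GetOptions(source: Source.cache));
  print('cache read: ${fromCache.docs.length} docs');

  // 3. Real-time listener: billed for the initial result set, then only
  //    when documents in the result set are added, changed, or removed.
  query.snapshots().listen((snapshot) {
    print('live result set: ${snapshot.docs.length} matching products');
  });
}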
If, for each product document retrieved by the Firestore search, I download its corresponding image file from Firebase Storage, would it charge me for downloading that file on successive attempts, or would it recognize that I had previously downloaded that image and fetch it from cache automatically?
You'll always be charged if you download the image over and over again unless you are using a library that helps you cache the images. For Android, there is a library called Glide:
Glide is a fast and efficient open-source media management and image loading framework for Android that wraps media decoding, memory and disk caching, and resource pooling into a simple and easy-to-use interface.

Firestore count number of documents

Suppose I have a Firestore document with a followers sub-collection that holds one document per follower (the structure was originally shown as a screenshot).
In my web app, I would like to display the number of followers. If I just do a get() of the whole followers sub-collection, that will be costly in terms of read operations. I thought about the following solution:
Having a counter document with a counter field that is incremented by a Cloud Function every time a document is created inside the followers collection. But there is the limit of one write per second per document for that counter. The whole point of having a followers collection with one document per follower was to avoid the one-write-per-second limit (thanks to Doug Stevenson's blog post: The top 10 things to know about Firestore when choosing a database for your app).
The only workaround I can think of is the Distributed Counter extension. But from what I've read so far, the counter only works with the front-end SDKs. Would I be able to use the extension in a Cloud Function or in a Node.js backend to increase the followers counter?
The "one write per document per second" is a guideline and not a hard rule, so I'd highly recommend not immediately getting hung up on that.
Then again, if you think you'll consistently need to count more than can be kept in a single document, your options are:
Keep a distributed counter, as shown in the documentation on distributed counters (a rough sketch follows after this list).
Keep the counter somewhere else. For example, I typically keep counters in Realtime Database, which has much higher write throughput (but lower read concurrency per shard).
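For reference, the basic distributed-counter pattern from the Firestore documentation looks roughly like this, sketched here in Dart (the shard count, collection names, and 'count' field are illustrative; the Distributed Counter extension maintains something similar for you):

import 'dart:math';

import 'package:cloud_firestore/cloud_firestore.dart';

const int numShards = 10; // more shards => higher sustained write throughput

// Increments one randomly chosen shard of the followers counter.
Future<void> incrementFollowers() async {
  final shardId = Random().nextInt(numShards).toString();
  await FirebaseFirestore.instance
      .collection('counters')
      .doc('followers')
      .collection('shards')
      .doc(shardId)
      .set({'count': FieldValue.increment(1)}, SetOptions(merge: true));
}

// Reads the total by summing all shards (costs numShards document reads).
Future<int> followersCount() async {
  final shards = await FirebaseFirestore.instance
      .collection('counters')
      .doc('followers')
      .collection('shards')
      .get();
  return shards.docs
      .fold<int>(0, (sum, d) => sum + ((d.data()['count'] ?? 0) as int));
}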
But from what I've read so far, the counter only works with the front-end SDKs.
That's not true. The extension works for any query made to Firestore.
Would I be able to use the extension in a cloud function or in a node.js backend to increase the followers counter?
The extension works by monitoring documents added to and removed from the collection. It doesn't matter where the change comes from. You will still be able to use the computed counter from any code that's capable of querying the counter documents.

Does accessing the Firebase Firestore database dashboard count as read operations?

I am now in the development phase of the project. Currently the project only uses one Android app as the frontend, and the queries from Android use limit and pagination, but the total number of document reads is way above the expected number.
I am trying to figure out why the number of document reads is so big even though there is only one user (me). I am worried the project will not be feasible if the number of reads stays this big, which is why I need to understand Firestore's read behaviour.
When I access the Firestore dashboard and select a collection, it shows a blue loading indicator and then shows all the available documents. Currently the event collection has 52 documents, and I have opened it like this several times for debugging purposes.
So whenever I tap that event collection, I assume it counts as 52 read operations. So the read operations come not only from the Android device but also from the dashboard, and that's why the number of reads is so big? Am I right?
If that's the case...
Say I have 100,000 documents in the event collection; whenever I tap that collection, will I perform 100,000 read operations as well? Is there a way to limit these dashboard reads?
So the read operations will not only come from the Android device but also from the dashboard? That's why the number of reads is so big. Am I right?
Yes, you are right.
Say I have 100,000 documents in the event collection; whenever I tap that collection, will I perform 100,000 read operations as well?
No, you'll be charged only for the number of documents that belong to the first page. The console has a pagination mechanism implemented especially for that, so you will not be charged for all the documents that exist in your collection.
Is there a way to limit these dashboard reads?
That limitation already exists, but be aware that the further you scroll down, the more documents you load, which means more read operations are charged.
One thing to bear in mind about the Firebase console is that it reflects changes to visible documents in real time, and each one of those changes also costs you a read. So, if you leave the console open while documents are changing, you will accumulate reads over time, even if you aren't actively using the console. This is a common source of unexpected reads.

Cloud Firestore Data Structure

I am creating an application that uses cloud firestore to store data about "events" in our lab on several assets. We collected data for a few months and we are averaging about 2000 events per asset per month. Each event captures a few pieces of meta data that the user can query.
I imported all the data into firestore with a very simple layout at first.
Events (collection of event data)
-> EventData (documents that each contain a few metadata fields)
From my understanding, even if the collection of events becomes quite large, for billing and speed of queries this won't be a problem (assuming I do some sort of pagination on the query results). The composite indexes are also very manageable with this structure.
The problem I see is that if someone goes and looks at the Firestore console and brings that collection up, our read requests go through the roof. It seems that this does a full read of the entire collection... which of course will kill us on billing as time goes on. I don't see this as a problem forever, since eventually we should get to the point where everything is stable and we won't need to go into the console very often, but what if someone does when we have a million or more records?
My next thought was to structure the database like this:
Events -> Assets -> {Asset_Name} -> {year_month} -> {collection of documents with metadata fields}
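A hedged sketch of writing into that layout from a Flutter client. Firestore paths must alternate collection/document, so the exact mapping below, and the final 'entries' collection name, are assumptions for illustration only:

import 'package:cloud_firestore/cloud_firestore.dart';

// Events (col) / Assets (doc) / {asset_name} (col) / {year_month} (doc) / entries (col)
Future<void> addEvent(String assetName, Map<String, dynamic> metadata) async {
  final now = DateTime.now();
  final yearMonth = '${now.year}_${now.month.toString().padLeft(2, '0')}';

  await FirebaseFirestore.instance
      .collection('Events')
      .doc('Assets')
      .collection(assetName)
      .doc(yearMonth)
      .collection('entries')
      .add(metadata);
}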
This certainly solves the issue of the ever-growing collection of documents. The number of assets we have is fixed, and the number of events is (effectively) capped to a maximum amount per month as well. The problem with this setup, however, is managing composite indexes. There are about 5 indexes needed for my original setup; I think this alternative setup means I would need to set up the same 5 indexes for each collection of documents, for every asset, every month.
I thought maybe there could be a way to have a Cloud Function manage it for me (it doesn't appear there is an API for this). I think the number of indexes per project is also capped.
So, in the end, I am looking for recommendations on how to structure this database to limit reads if using the console, as well as keeping the indexes manageable. I am pretty new to NoSQL and perhaps I am just completely off.
I recommend you keep your structure as is if that's what's working for you. You should not need to optimize for reducing console reads. Console reads do count towards your usage but the console does not load the entire collection when you open the console.
The console loads just enough documents to let you scroll a bit and then it loads more documents if you scroll down. It will only load the entire collection if you scroll through the entire collection.
