Firestore - Caching behaviour with data pagination - firebase

Going through the documentation, I implemented pagination but I am confused how firestore's local cache behave with pagination. Suppose my query is to get first 20 documents. For testing, I change paging size to 25 and start application again to get exactly same 20 documents(cached before) + 5 new documents. How does the cache mechanism will behave with respect to number of reads in this case? Will it cost 5 new reads or 25 new reads? I tried several times to see if firebase console stats could help but the read counts there made no sense.
Console stats before the call show 68 reads but after second query it should be either (68+5) or (68+25), instead it shows 76 read operations. These stats didn't help me out to figure out the behavior.

The cache only has an effect for any query when:
The client is offline
The query specifically uses the cache as a source
All other cases, the cache is not used, and the server sends all documents. Each document is read and sent to the client, and you will be billed for all those document reads. Pagination doesn't change this behavior at all.
Read this to learn more about how the cache works.

Related

How can I used 10k read in firebase when I refresh a document that only contains 10 fields?

Recently, I have learned to deploy a website by React using Firebase.
Since yesterday the database returned that the limit is over, something like that.
Today I monitor it on purpose.
After I refresh the data which includes only one document that contains 10 fields,
and it shows that it has used 10k reads.
Appreciate someone can tell me that is the read count on times or data.

does accessing the firebase firestore database dashboard will be considered as read operation?

I am now in developing phase for the project. currently the project only using one Android app as the frontend. the query from Android using limit and pagination. but the total number of documents read is way above the expected number.
I am trying to figure this out, why the number of read documents is so big even though the user is only one (me). I am scared the project will not be feasible if the number of read is so big. thats why i need to figure out the firestore read behaviour
When I accessed the firestore dashboard, and select a collection like the image below, it will show blue loading indicator and then show all documents available. currently in the event collection I have 52 documents. I access all documents in the event collection like this for several times for debugging purpose.
so whenever i tap that event collection, I assume it will be counted as 52 read operation, so the read operation will not only come from Android device but also from the dashboard ? thats why the number of reads is so big. am I right ?
if thats the case....
say if I have 100000 documents in event collection, then whenever i tap that event collection, will i perform 100000 read operation as well ? is there a way to limit this dashboard read ?
so the read operation will not only come from the Android device but also from the dashboard? That's why the number of reads is so big. am I right?
Yes, you are right.
say if I have 100000 documents in event collection, then whenever I tap that event collection, will I perform 100000 read operation as well?
No, you'll be charged only for the number of documents that belong to the first page. Inside the Console, there is a pagination mechanism especially implemented for that. So you'll be not charged for all the documents that exist in your collection.
Is there a way to limit this dashboard read?
The limitation already exists but be aware that as much as you scroll down, you get more documents which means more read operations charged.
One thing to bear in mind about the Firebase console is that it reflects changes to visible documents in real time, and each one of those changes also costs you a read. So, if you leave the console open while documents are changing, you will accumulate reads over time, even if you aren't actively using the console. This is a common source of unexpected reads.

Cloud Firestore Data Structure

I am creating an application that uses cloud firestore to store data about "events" in our lab on several assets. We collected data for a few months and we are averaging about 2000 events per asset per month. Each event captures a few pieces of meta data that the user can query.
I imported all the data into firestore with a very simple layout at first.
Events (Collection of event data)
-> EventData (documents which contains a few fields for metadata)
From my understanding, even if the collection of events becomes quite large, for billing and speed of queries this won't be a problem (assuming I do some sort of pagination on the query results). The composite indexes are also very manageable with this structure.
The problem I see, is if someone goes and looks at the firestore console and brings that collection up, our read requests go through the roof. It seems that does a full read on the entire collection...which of course will kill us on billing as time goes on. I don't see this as a problem forever as eventually we should get to the point where everything is stable and won't need to go into the console very often, but what if someone does when we have a million or more records.
My next thought was to structure the database like this:
Events -> Assets -> {Asset_Name } -> {year_month} -> {Collection of
Document with field meta-data}
This certainly solves the issue of the ever growing collection of documents. The number of assets that we have is fixed, and the number of events is (effectively) capped to a maximum amount per month as well. The problem with this setup, however, is managing composite indexes. There are about 5 indexes needed for my original setup. I think this alternative setup means I would need to setup the same 5 indexes for each each collection of documents for every asset every month.
I thought maybe there could be a way to have a cloud function manage it for me (it doesn't appear there is an API for this). I think the number of indexes per project is also capped.
So, in the end, I am looking for recommendations on how to structure this database to limit reads if using the console, as well as keeping the indexes manageable. I am pretty new to NoSQL and perhaps I am just completely off.
I recommend you keep your structure as is if that's what's working for you. You should not need to optimize for reducing console reads. Console reads do count towards your usage but the console does not load the entire collection when you open the console.
The console loads just enough documents to let you scroll a bit and then it loads more documents if you scroll down. It will only load the entire collection if you scroll through the entire collection.

Firestore pricing

There are several questions asked about this topic but I cant find one that answers my question. As described here, there is no clear explanation as to whether the minimum charges are applicable to query.get() or real-time listeners as well. Quoted:
There is a minimum charge of one document read for each query that you perform, even if the query returns no results.
The reason am asking this question even though it may seem obvious for someone is due to the section; *for each query that you perform* in that statement which could mean a one time trigger e.g with get() method.
Scenario: If 10 users are listening to changes in a collection with queries i.e query.addSnapshotListener() then change occurs in one document which matches query filter of only two users, are the other eight charged a cost of one read too?
Database used: Firestore
In this scenario I would say no, the other eight would not be counted as reads because the documents they are listening to have not been updated or have not been added/removed from that collection based on their filters (query params). The reads aren't based on changes to the collection but rather changes to the stream of documents you are specifically listening to. Because that 1 document change was not part of the documents that the other 8 users were listening to then there is no new read for them. However, if that 1 document change led to that document now matching the query filters of those other 8, then yes there would be 8 new reads for those users. Hope that makes sense.
Also it's worth noting that things like have offlinePersistence enabled via the sdk and firestore's caching maximize the efficiency of limiting reads as well as using a singleton Observable that multiple instances in your app subscribe to as oppose to opening multiple streams of the same query throughout your app. Doesn't really apply to this question directory but again while in the same vein, it's worth noting.

Loading Bulk data in Firebase

I am trying to use the set api to set an object in firebase. The object is fairly large, the serialized json is 2.6 mb in size. The root node has around 90 chidren, and in all there are around 10000 nodes in the json tree.
The set api seems to hang and does not call the callback.
It also seems to cause problems with the firebase instance.
Any ideas on how to work around this?
Since this is a commonly requested feature, I'll go ahead and merge Robert and Puf's comments into an answer for others.
There are some tools available to help with big data imports, like firebase-streaming-import. What they do internally can also be engineered fairly easily for the do-it-yourselfer:
1) Get a list of keys without downloading all the data, using a GET request and shallow=true. Possibly do this recursively depending on the data structure and dynamics of the app.
2) In some sort of throttled fashion, upload the "chunks" to Firebase using PUT requests or the API's set() method.
The critical components to keep in mind here is that the number of bytes in a request and the frequency of requests will have an impact on performance for others viewing the application, and also count against your bandwidth.
A good rule of thumb is that you don't want to do more than ~100 writes per second during your import, preferably lower than 20 to maximize your realtime speeds for other users, and that you should keep the data chunks in low MBs--certainly not GBs per chunk. Keep in mind that all of this has to go over the internets.

Resources