First query snapshot of snapshotListener - firebase

According to the Firebase Firestore documentation, a snapshot listener delivers all of the documents that match our query when the listener is first attached.
Firestore documentation:
The first query snapshot contains added events for all existing documents that match the query. This is because you're getting a set of changes that bring your query snapshot current with the initial state of the query. This allows you, for instance, to directly populate your UI from the changes you receive in the first query snapshot, without needing to add special logic for handling the initial state.
As far as I understood, it's not possible to disable this feature but there are some workarounds.
My question is if this behavior counts as one read for every record received during the first initialization or not?

My question is if this behavior counts as one read for every record
received during the first initialization or not?
The answer is yes: the "initial state of the query" implies that all documents corresponding to the query are read.
However, as explained in the documentation:
The initial state can come from the server directly, or from a local
cache. If there is state available in a local cache, the query
snapshot will be initially populated with the cached data.
If the initial state comes from a local cache (see offline data persistence), it is not billed as reads.

Related

Firebase Firestore Read Costs - Clarification

I am using Firestore DB for an e-commerce app. I have a collection of products, each product has a document that has a "title" field and "search_keywords" field. The search keyword field stores an array. For example, if the title="apple", then the "search_keywords" field would store the following array: ["a","ap","app","appl","apple"]. When the user starts typing "apple" in the search box, I want to show the user, all products where "search_keywords" contains "a", then when they type the "p", I want to show all products where search keywords contain "ap"...and so on. Here is the snippet of code that gets called each time an additional letter is typed:
firebaseFireStore.collection("Produce").whereArrayContains("search_keywords", toSearch).get()
For example, in every case, the documents returned on each successive call, where an additional letter was typed, would be a subset of what was returned by the previous call. It would just be a smaller list of documents, all of which were already read by the previous query.
My question is: since the documents retrieved by a successive query are a subset of those retrieved by a prior query, would I be charged reads based on how many documents each successive query returns, or would Firestore have them in the cache and read them from there, since the successive result set is a subset of a prior result set?
This question has been on my mind for a while, and every time I search for it I can't seem to find a clear answer. Based on my research, the following two posts on Stack Overflow involve similar questions. The quotes below are relevant, but they seem to contradict each other: Alex Mamo says "it will always read the online version of the documents...[when online]", while Doug Stevenson says "if the local persistence is enabled on your client (it is by default) and the documents haven't been updated in the server...[it will get them from the cache]". I would appreciate any clarification on this if anyone knows the answer. Thanks.
"If the OP has offline persistence enabled, which is by default in Cloud Firestore, then he will be able to read the cache only while offline. When the OP has internet connectivity, it will always read the online version of the documents." –
Alex Mamo (https://stackoverflow.com/a/69320068/14556386)
"According to this answer by Doug Stevenson, the reads are only charged when performed upon the server, not your local cache. That is if the local persistence is enabled on your client (it is by default) and the documents haven't been updated in the server."
(https://stackoverflow.com/a/61381656/14556386)
EDIT: In addition, suppose that for each product document retrieved by the Firestore search, I download its corresponding image file from Firebase Storage. Would I be charged for downloading that file on successive attempts, or would it recognize that I had previously downloaded that image and fetch it from the cache automatically?
First of all, storing ["a", "ap", "app", "appl", "apple"] in an array and performing a whereArrayContains() query doesn't sound like a feasible idea. Why? Imagine you have a really big online shop with 100k products, of which 5k start with "a". Are you willing to pay for 5k reads every time a user types "a"? That's a very costly feature.
Most likely you should return the corresponding documents only once the user has typed, for example, two or even three characters. You'll reduce costs enormously. Or you might consider using the solution I have explained in the following article:
How to filter Firestore data cheaper?
Let's go forward.
For example, in every case, the documents that would be returned on each successive call where an additional letter was typed would be a subset of what was returned in the previous call, it would just be a smaller list of documents.
Yes, that's correct.
My question is since the documents retrieved on a successive query are a subset of those retrieved in a prior query, would I be charged reads based on how many documents each successive query returns?
Yes. You'll always be charged with a number of reads that is equal to the number of documents that are returned by your query. It doesn't matter if a query was previously performed, or not. Every time you perform a new query, you'll be charged with a number of reads that is equal to the number of documents you get.
For example, let's assume you perform this query:
.whereArrayContains("search_keywords", "a")
And you get 100 documents, and right after that you perform:
.whereArrayContains("search_keywords", "ap")
And you get only 30 documents, you'll have to pay 130 reads, and not only 100. So it doesn't matter if the documents that are returned by the second query are a subset of the documents that are returned by the first query.
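The arithmetic above can be sketched with a small in-memory model. This is not Firestore code: the `prefixes`, `run_query` and `billed_reads` helpers and the sample titles are hypothetical, and the only rule modelled is the one stated above, namely one billed read per document returned by each query.

```python
# Toy model (no Firestore SDK): every query is billed one read per
# returned document, even if an earlier query already returned it.

def prefixes(title):
    """Return every prefix of title, e.g. 'apple' -> ['a', 'ap', ..., 'apple']."""
    return [title[:i] for i in range(1, len(title) + 1)]

def run_query(products, keyword):
    """Return the titles whose search_keywords array contains keyword."""
    return [p["title"] for p in products if keyword in p["search_keywords"]]

def billed_reads(products, typed_keywords):
    """Sum the reads billed across successive queries, one query per keyword."""
    return sum(len(run_query(products, kw)) for kw in typed_keywords)

# Hypothetical sample data.
titles = ["apple", "apricot", "avocado", "banana"]
products = [{"title": t, "search_keywords": prefixes(t)} for t in titles]

# Typing "a" and then "ap": 3 matches + 2 matches = 5 billed reads, not 3.
print(billed_reads(products, ["a", "ap"]))  # 5
# Querying only once two characters are typed skips the largest query.
print(billed_reads(products, ["ap"]))       # 2
```

The second call illustrates why waiting for two or three characters before querying reduces the bill: the broadest, most expensive query is never issued.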
Or would Firestore have them in the cache and read them from there since the successive result set is a subset of a prior result set.
No, it won't. It will read those documents from the cache only if the user loses internet connectivity; otherwise it will always read the online versions of the documents that exist on the Firebase servers. The cached version of the documents is used only when the user is offline. I have also written an article on this topic called:
How to drastically reduce the number of reads when no documents are changed in Firestore?
In Doug's answer:
Am I charged with read operations everytime the location is changed?
He clearly says:
You are charged for the number of documents read on the server every time you call get().
So if you call get(), you pay one read for each document that is returned.
The following statement:
If local persistence is enabled in your client (it is by default), then the documents may come from the cache if the documents are also not changed on the server.
applies when you are listening for real-time updates. According to the docs:
When you listen to the results of a query, you are charged for a read each time a document in the result set is added or updated. You are also charged for a read when a document is removed from the result set because the document has changed.
And I would add, if nothing has changed, you don't have to pay anything. Again, according to the same docs:
Also, if the listener is disconnected for more than 30 minutes (for example, if the user goes offline), you will be charged for reads as if you had issued a brand-new query.
So if the listener is active, you always read the documents from the cache. Bear in mind that a get() operation is different than listening for real-time updates.
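As a rough illustration of the difference, the billing rules quoted above can be written as two small functions. This is only a model of those quoted rules, not Firestore code; the function names and the sample numbers are hypothetical.

```python
# Toy billing model based only on the rules quoted above.

RECONNECT_LIMIT_MINUTES = 30  # threshold quoted from the docs

def get_reads(result_set_size):
    """A get() against the server bills one read per document returned."""
    return result_set_size

def listener_reads(result_set_size, changed_docs, minutes_disconnected=0):
    """Reads billed to an already-attached listener for one round of changes."""
    if minutes_disconnected > RECONNECT_LIMIT_MINUTES:
        # Billed as if a brand-new query had been issued.
        return result_set_size
    # Otherwise only added, updated, or removed documents are billed.
    return changed_docs

print(get_reads(100))                                   # 100
print(listener_reads(100, changed_docs=3))              # 3
print(listener_reads(100, changed_docs=0))              # 0
print(listener_reads(100, 3, minutes_disconnected=45))  # 100
```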
if for each product document that was retrieved by the Firestore search, I download its corresponding image file from Firebase Storage. Would it charge me for downloading that file on successive attempts to download it or would it recognize that I had previously downloaded that image and fetch it from cache automatically?
You'll always be charged if you download the image over and over again unless you are using a library that helps you cache the images. For Android, there is a library called Glide:
Glide is a fast and efficient open-source media management and image loading framework for Android that wraps media decoding, memory and disk caching, and resource pooling into a simple and easy-to-use interface.

Do Firestore snapshot() listeners perform an initial read of all the data every time the app restarts?

When you use Firestore's snapshot() and set a listener, Cloud Firestore sends your listener an initial snapshot of the data, and then another snapshot each time the documents change.
However, if I close the app and reopen it, does Firestore read all the data it has already queried, or is there an internal sync system (for example, if it stores document metadata such as updatedAt, it could read only the documents that have been updated since x)?
In other words, if I use an onSnapshot() listener, I will make x document reads initially, then 1 read each time a document changes. My question is: if I close the app and a document changes, then when I open the app, is 1 read made, or x + 1?
It is important for me because I have a bunch of initial calls and I'm wondering how that'd affect the cost($).
It's also important to know for data modeling and how it affects the cost.
Every time you perform a new query against the server (this is the default), it will cost reads, and the documents will have to be transferred. It will not use the cache unless there is no connection, or you specifically target the cache for the query. Quitting and returning to the app doesn't change this behavior at all.
I suggest reading this: https://medium.com/firebase-developers/firestore-clients-to-cache-or-not-to-cache-or-both-8f66a239c329
It depends on the type of listener
OnChange() will read only when data changes
addListenerForSingleValueEvent will check just once, and if it is the onCreate section, it will be executed immediately
addValueEventListener will keep checking constantly, but will log as a read only if the data changes

How to check if a document exists with a given id in firestore, without costing money?

I have a scenario where I have the phone number of the user and I want to check whether the user is already registered on my app. To do this, I have a collection in Firestore. In this collection, I store the contact number of each user as a document. Whenever a user opens the app and enters their mobile number, the app sends a request to look up a specific document using
final snapShot = await Firestore.instance.collection('rCust').document(_phoneNumberController.text).get();
My database structure is as follows
Due to this, my Firestore billing is spiking really fast. With just 4-5 queries, my number of reads spiked from 75 to 293. It would be great if anyone could guide me on how to do this efficiently.
If you want to know if a document definitely exists on the server, it will always cost you a document read. There is currently no way to avoid this cost. It's the cost of accessing the massively scalable index that allows you to find 1 document among potentially billions.
You could try to query your local cache first, which doesn't cost anything. You do this by passing a Source.cache argument to get(). If you want to make the assumption that presence in the local cache always means that the document exists on the server, that will save you one document read. However, if the document is deleted on the server, the local cache query will be incorrect. You will still have to query the server to know for sure.
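The trade-off can be sketched with a toy model in which the cache and the server are plain sets. This does not call the Firestore SDK; `exists_cache_first`, the `stats` counter and the phone numbers are all hypothetical, and the sketch deliberately makes the assumption described above, namely that a cached document still exists on the server.

```python
# Toy model (no Firestore SDK): cache-first existence check.

def exists_cache_first(doc_id, cache, server, stats):
    """Check the local cache first; fall back to a billed server read."""
    if doc_id in cache:
        # Assumption: presence in the cache means it still exists on the server.
        return True
    stats["server_reads"] += 1      # a server lookup is billed as one read
    found = doc_id in server
    if found:
        cache.add(doc_id)           # remember it for later checks
    return found

server = {"+15550001", "+15550002"}  # hypothetical document IDs
cache = set()
stats = {"server_reads": 0}

print(exists_cache_first("+15550001", cache, server, stats))  # True (1 billed read)
print(exists_cache_first("+15550001", cache, server, stats))  # True (served from cache)

server.discard("+15550001")          # the document is deleted on the server
print(exists_cache_first("+15550001", cache, server, stats))  # True, but stale
print(stats["server_reads"])                                  # 1
```

The third call shows the cost of the shortcut: the cache answers without a billed read, but the answer is wrong once the document has been deleted on the server.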
To check if a document exists, you can use the .exists property of the documentSnapshot. In your case:
if(snapShot.exists) {
}
From that query, you are selecting a single document, not a collection.
Because we can't see your other code, I am assuming that your Firestore usage is actually not spiking due to your query, but due to you viewing your documents in the Firebase web console. Viewing the console on the web also incurs billing, and it lists documents 300 at a time.
You can check it like this:
if(snapShot.getResults().exists()) {
// ...
}
If you don't want to overwrite the whole document each time you send the phone number, and only want to update that one field, call update("fieldToUpdate", value) on the document instead of .set(value).

Can a transaction be used on a collection?

I am using Firestore and trying to remove a race condition in my Flutter app by using a transaction.
I have a subcollection to which a maximum of 2 documents should be added.
The race condition means that more than 2 documents may be added, because the client code uses setData. For example:
Firestore.instance.collection('collection').document('document').collection('subCollection').document(subCollectionDocument2).setData({
'document2': documentName,
});
I am trying to use a transaction to make sure that a maximum of 2 documents are added. So if the collection has changed (for example, a new document was added to the collection) while the transaction runs, the transaction will fail.
But I have read the docs, and it seems transactions are meant more for race conditions when setting a field in a document, not when adding a document to a subcollection.
For example, if I try to implement:
Firestore.instance.collection('collection').document('document').collection('subCollection').runTransaction((transaction) async {
}),
It gives this error:
error: The method 'runTransaction' isn't defined for the class 'CollectionReference'.
Can a transaction be used to monitor changes to a subcollection?
Does anyone know another solution?
Can a transaction be used to monitor changes to a subcollection?
Transactions in Firestore work by a so-called compare-and-swap operation. In a transaction, you read a document from the database, determine its current state, and then set its new state based on that. When you've done that for the entire transaction, you send the whole package of current-state-and-new-state documents to the server. The server then checks whether the current state in the storage layer still matches what your client started with, and if so it commits the new state that you specified.
Knowing this, the only way it is possible to monitor an entire collection in a transaction is to read all documents in that collection into the transaction. While that is technically possible for small collections, it's likely to be very inefficient, and I've never seen it done in practice. Then again, for just the two documents in your collection it may be totally feasible to simply read them in the transaction.
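The compare-and-swap idea can be sketched with a small in-memory model. This is not Firestore code: the `Store` class, the `begin` and `try_add` helpers, and the fixed document IDs "doc1" and "doc2" are assumptions made for illustration. The sketch reads both documents in the "transaction", and the commit is rejected if either one changed after it was read.

```python
# Toy model of a compare-and-swap transaction over two fixed document IDs.

class Store:
    def __init__(self):
        self.docs = {}  # doc_id -> (version, data)

    def read(self, doc_id):
        """Return (version, data); a missing document is version 0 with no data."""
        return self.docs.get(doc_id, (0, None))

    def commit(self, reads, writes):
        """Apply writes only if every document read is still at the version seen."""
        for doc_id, seen_version in reads.items():
            if self.read(doc_id)[0] != seen_version:
                return False  # someone else changed it: reject the commit
        for doc_id, data in writes.items():
            version = self.read(doc_id)[0]
            self.docs[doc_id] = (version + 1, data)
        return True

def begin(store, slots):
    """Read every slot, remembering the version each read observed."""
    return {s: store.read(s) for s in slots}

def try_add(store, snapshot, name):
    """Write name into the first free slot seen in snapshot, if any."""
    free = [s for s, (_, data) in snapshot.items() if data is None]
    if not free:
        return "full"
    reads = {s: version for s, (version, _) in snapshot.items()}
    return "committed" if store.commit(reads, {free[0]: name}) else "retry"

store = Store()
slots = ["doc1", "doc2"]
snap_a = begin(store, slots)
snap_b = begin(store, slots)                         # both clients see two free slots
print(try_add(store, snap_a, "alice"))               # committed
print(try_add(store, snap_b, "bob"))                 # retry (doc1 changed after B read it)
print(try_add(store, begin(store, slots), "bob"))    # committed
print(try_add(store, begin(store, slots), "carol"))  # full
```

In this model the cap of two documents holds only because the document IDs are fixed and every writer goes through the same check. A client that writes directly bypasses it.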
Keep in mind though that a transaction only ensures consistent data, it doesn't necessarily limit what a malicious user can do. If you want to ensure there are never more than two documents in the collection, you should look at a server-side mechanism.
The simplest mechanism (infrastructure wise) is to use Firestore's server-side security rules, but I don't think those will work to limit the number of documents in a collection, as Doug explained in his answer to Limit a number of documents in a subcollection in firestore rules.
The most likely solution in that case is (as Doug also suggests) to use Cloud Functions to write the documents in the subcollection. That way you can simply reject direct writes from the client, and enforce any business logic you want in your Cloud Functions code, which runs in a trusted environment.

Firebase web - transaction on query

Can I run a transaction on a query referring to multiple locations?
In the doc I see that for example startAt returns a firebase.database.Query which has a ref property of type firebase.database.Reference which has the transaction method.
So can I do:
ref.startAt(ver).ref.transaction(transactionUpdate).then(... ?
Would the transaction then operate on multiple locations and update them correctly?
What I'm trying to do is to get all locations since a particular version (key) and then mark them as 'read' so that a writing client will not update them. For that I need a transaction rather than a simple update.
Thx!
The answer is "no" to all questions.
The ref property of a Query gives you the reference of the node on which you set up the query. Consider how you built the query in the first place. In other words, ref.startAt(x).ref is equivalent to ref.
Manipulating a reference (navigating to children, adding query options, etc.) is completely independent of any query results. It's just local, trivial path manipulation, very similar to formatting a URL.
Transactions can only operate on a single node, by definition, using that node's value snapshots for incremental updates. They cannot "operate on multiple locations and update them correctly". These are not SQL transactions; the only thing they have in common is the name, which might, unfortunately, be confusing.
The starting node doesn't have to be a leaf node. But if you start a transaction on a "parent" node, the client will have to download every child to create a whole snapshot, potentially multiple times if any of them is modified by another client.
This is most certainly a very slow, fragile and expensive operation, both for the user and you, the owner of the database. In general, it's not recommended to run transactions if the node might grow unbounded.
I suggest revising the presented strategy. Updating "all children" just to store a "read" marker simply does not scale.
You could for example store the last read ID of the client in a single node, and write security rules to enforce that no data with an ID less than this may be modified.
