We've been using the Realtime Database to save data from mobile devices (iOS, Android, and now web). I earlier asked whether the order in which other clients see the data is guaranteed to be the same order in which the client wrote it (here: Does Firebase guarantee that data set using updateValues or setValue is available in the backend as one atomic unit? The title is a bit misleading, but the answer is there).
The answer was yes, and now that we're migrating to Firestore I'm wondering whether the same applies there too.
So, if I write documents 1, 2, and 3 from client A, is it guaranteed that client N will observe the writes (given a suitable listener) in the same order in which client A wrote them?
Does this apply to Cloud Functions too? We write 3 pieces of data to separate documents and then write a fourth document as a way to trigger a function to do some processing. So is it guaranteed that the 3 documents written earlier will be available when the function is triggered?
Note that the 4 documents are NOT written in the same transaction or batch, but as separate document.create calls.
It would be catastrophically bad if the order of writes was not maintained within an individual client. The client would not have a strong internal understanding of the state of the system after such a series of writes, and it would have to read every document back in order to validate the contents written.
So you can expect that the order will be maintained. However, if you aren't using a transaction, there are no guarantees about the order of writes to a document coming from multiple clients.
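To make the trigger pattern from the question concrete, here is a minimal Cloud Functions sketch; the collection names, the batchId field, and the function name are illustrative assumptions, not part of the original setup:

```typescript
import * as admin from "firebase-admin";
import * as functions from "firebase-functions";

admin.initializeApp();
const db = admin.firestore();

// Fires when the fourth ("trigger") document is created. Because a single
// client's writes are applied in order, the three data documents written
// before the trigger should be readable by the time this runs.
export const processBatch = functions.firestore
  .document("triggers/{batchId}") // hypothetical path
  .onCreate(async (snap, context) => {
    const batchId = context.params.batchId;
    // Read the three documents the client wrote before the trigger.
    const items = await db
      .collection("items") // hypothetical collection
      .where("batchId", "==", batchId)
      .get();
    items.forEach((doc) => functions.logger.info(doc.id, doc.data()));
  });
```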
Related
I have two screens (Scaffolds) and I'm thinking of using a single stream in both StreamBuilders. My question is: will I be charged double reads, or will the read be equivalent to one read?
The Firestore clients try to minimize the amount of data they have to read from the server. It depends on your code, of course, but in most cases there will be a lot of reuse of data that was already read, so you won't get charged again.
Some examples:
If both builders use the same stream, the data for the second one will come from what was already read into memory for the first builder. So there will be no charges for the second builder.
If both builders use their own stream, there is typically also a lot of reuse. If the streams are listening at the same time, the SDK will reuse the data between them where possible.
And if you have disk caching enabled (which is the default on iOS and Android), the streams will even share results if they're not active at the same time.
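To illustrate with the JavaScript web SDK (the Flutter SDK behaves the same way; the collection name and log messages are placeholders): two concurrent listeners on the same query are served by one underlying listen stream, so the documents are not billed twice.

```typescript
import { initializeApp } from "firebase/app";
import { getFirestore, collection, onSnapshot } from "firebase/firestore";

const app = initializeApp({ projectId: "demo-project" }); // placeholder config
const db = getFirestore(app);
const tasks = collection(db, "tasks"); // placeholder collection

// Two listeners on the same query, e.g. one per screen. While both are
// active, the SDK feeds them from one shared listen stream and its local
// cache, so the second listener does not trigger a second set of reads.
const unsubscribeA = onSnapshot(tasks, (snap) =>
  console.log("screen A sees", snap.size, "docs")
);
const unsubscribeB = onSnapshot(tasks, (snap) =>
  console.log("screen B sees", snap.size, "docs")
);
```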
Firebase's Cloud Firestore imposes limits on the number of document writes and reads (and deletes). For example, the Spark plan (free) allows 50K reads and 20K writes a day. Estimating how many writes and reads your app will perform is obviously important during development, as you will want to know the potential costs incurred.
Part of this estimation is knowing exactly what counts as a document read/write. This part is somewhat unclear from searching online.
One document can contain many different fields, so if an app is designed such that the user actions taken during a session require the fields within a single document to be updated, would it be cost-efficient to update all the fields in one single document write at the end of the session, rather than writing to the document every single time the user wants to update one field?
Similarly, would it not make sense to read the document once at the start of a session, getting the values of all fields, rather than reading them when each is needed?
I appreciate that this method will lead to the user seeing slightly out-of-date field values, and admittedly to the database not being updated promptly, but if such things aren't too much of a concern to you, couldn't such a method reduce your reads/writes by a large factor?
This all depends on what counts as a document write/read (does writing 20 fields within the same document in one go count as 20 writes?).
The cost of a write operation has no bearing on the number of fields you write. It's purely based on the number of times you call update() or set() on a document reference, whether independently, in a transaction, or in a batch.
If you choose to write N fields using N separate updates, you will be charged N writes. If you choose to write N fields using 1 update, you will be charged 1 write.
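As a sketch with the web SDK (the document path and field names are made up): writing three fields one at a time costs three writes, while one update() carrying all three costs one.

```typescript
import { getFirestore, doc, updateDoc } from "firebase/firestore";

async function saveProfile() {
  const db = getFirestore();
  const profile = doc(db, "users/alice"); // hypothetical document

  // Three separate updates: billed as 3 writes.
  await updateDoc(profile, { name: "Alice" });
  await updateDoc(profile, { age: 30 });
  await updateDoc(profile, { city: "Oslo" });

  // The same three fields in one update: billed as 1 write.
  await updateDoc(profile, { name: "Alice", age: 30, city: "Oslo" });
}
```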
There are several questions asked about this topic, but I can't find one that answers my question. As described here, there is no clear explanation as to whether the minimum charge applies to query.get() calls or to real-time listeners as well. Quoted:
There is a minimum charge of one document read for each query that you perform, even if the query returns no results.
The reason I am asking this question, even though the answer may seem obvious to some, is the phrase *for each query that you perform* in that statement, which could be read as meaning only a one-time fetch, e.g. with the get() method.
Scenario: if 10 users are listening to changes in a collection with queries, i.e. query.addSnapshotListener(), and a change occurs in one document which matches the query filter of only two users, are the other eight charged a cost of one read too?
Database used: Firestore
In this scenario I would say no, the other eight would not be charged reads, because the documents they are listening to have not been updated, added, or removed based on their filters (query parameters). The reads aren't based on changes to the collection but rather on changes to the set of documents you are specifically listening to. Because that one changed document was not among the documents the other 8 users were listening to, there is no new read for them. However, if that one change caused the document to now match the query filters of those other 8, then yes, there would be a new read for each of them. Hope that makes sense.
It's also worth noting that things like enabling offlinePersistence via the SDK and Firestore's caching help minimize reads, as does using a singleton Observable that multiple parts of your app subscribe to, as opposed to opening multiple streams of the same query throughout your app. That doesn't apply to this question directly, but it's worth noting while in the same vein.
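A minimal sketch of that singleton pattern with the web SDK (names are illustrative): all subscribers share one underlying listener instead of each opening its own stream of the same query.

```typescript
import {
  getFirestore, collection, onSnapshot,
  QuerySnapshot, DocumentData,
} from "firebase/firestore";

type Callback = (snap: QuerySnapshot<DocumentData>) => void;

const subscribers = new Set<Callback>();
let detach: (() => void) | null = null;

export function subscribeToTasks(cb: Callback): () => void {
  subscribers.add(cb);
  // Open the single shared listener lazily, on the first subscription.
  if (!detach) {
    const db = getFirestore();
    detach = onSnapshot(collection(db, "tasks"), (snap) => {
      subscribers.forEach((fn) => fn(snap));
    });
  }
  // The returned function unsubscribes; the last one out closes the listener.
  return () => {
    subscribers.delete(cb);
    if (subscribers.size === 0 && detach) {
      detach();
      detach = null;
    }
  };
}
```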
I need to delete very large collections in Firestore.
Initially I used client-side batch deletes, but when the documentation changed and started to discourage that with the comments
Deleting collections from an iOS client is not recommended.
Deleting collections from a Web client is not recommended.
Deleting collections from an Android client is not recommended.
https://firebase.google.com/docs/firestore/manage-data/delete-data?authuser=0
I switched to a cloud function as recommended in the docs. The cloud function gets triggered when a document is deleted and then deletes all documents in a subcollection as proposed in the above link in the section on "NODE.JS".
The problem I am running into now is that the cloud function seems to be able to manage around 300 deletes per second. With the maximum runtime of a cloud function being 9 minutes, I can manage up to 162,000 deletes this way. But the collection I want to delete currently holds 237,560 documents, which makes the cloud function time out about halfway through.
I cannot trigger the cloud function again with an onDelete trigger on the parent document, as it has already been deleted (its deletion is what triggered the initial call of the function).
So my question is: What is the recommended way to delete large collections in Firestore? According to the docs it's not client side but server side, but the recommended solution does not scale for large collections.
Thanks!
When you have too much work to perform in a single Cloud Function execution, you will need to either find a way to shard that work across multiple invocations, or continue the work in subsequent invocations after the first. This is not trivial, and you have to put some thought and work into constructing the best solution for your particular situation.
For a sharding solution, you will have to figure out how to split up the document deletes ahead of time and have your master function kick off subordinate functions (probably via pubsub), passing each one the arguments it needs to figure out which shard to delete. For example, you might kick off a function whose sole purpose is to delete documents that begin with 'a', another for 'b', and so on, by querying for them and then deleting them.
For a continuation solution, you might just start deleting documents from the beginning, go for as long as you can before timing out, remember where you left off, then kick off a subordinate function to pick up where the prior one stopped (see the sketch below).
You should be able to use one of these strategies to limit the amount of work done per function, but the implementation details are entirely up to you to work out.
If, for some reason, neither of these strategies is viable, you will have to manage your own server (perhaps via App Engine) and message it (via pubsub) to perform a single unit of long-running work in response to a Cloud Function.
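A minimal sketch of the continuation approach (the topic name, batch size, and message shape are assumptions): delete in small batches until a self-imposed time budget runs out, then re-publish the same Pub/Sub message so a fresh invocation picks up the remainder.

```typescript
import * as admin from "firebase-admin";
import * as functions from "firebase-functions";
import { PubSub } from "@google-cloud/pubsub";

admin.initializeApp();
const db = admin.firestore();
const pubsub = new PubSub();

const BATCH_SIZE = 300;
const TIME_BUDGET_MS = 8 * 60 * 1000; // stop well before the 9-minute limit

export const deleteCollection = functions.pubsub
  .topic("delete-batch") // hypothetical topic
  .onPublish(async (message) => {
    const { path } = message.json as { path: string };
    const start = Date.now();

    while (Date.now() - start < TIME_BUDGET_MS) {
      const snap = await db.collection(path).limit(BATCH_SIZE).get();
      if (snap.empty) return; // collection drained, no continuation needed

      const batch = db.batch();
      snap.docs.forEach((doc) => batch.delete(doc.ref));
      await batch.commit();
    }

    // Out of time: hand the remainder to a fresh invocation.
    await pubsub.topic("delete-batch").publishMessage({ json: { path } });
  });
```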
The Firestore docs say that both transactions and batched writes are atomic operations: either all changes are written or nothing is changed.
This question is about whether the changes of an atomic operation in Firestore can be partially observed, or whether the all-or-nothing guarantee applies to readers too.
Example:
Let's say that we have a Firestore database with at least two documents, X and Y.
Let's also say that there are at least two clients (A and B) connected to this database.
At some point client A executes a batched write that updates both document X and Y.
Later, client B reads document X and observes the change that client A made.
Now, if client B would read document Y too, is there a guarantee that the change made by A (in the same batched write operation) will be observed?
(Assuming that no other changes were made to those documents.)
I've tested it and I've never detected any inconsistencies. However, just testing this can't be enough; it comes down to the level of consistency provided by Firestore under all circumstances (high write frequency, large data sets, failover, etc.).
It might be the case that Firestore is allowed (for a limited amount of time) to expose the change of document X to client B but still not expose the change of document Y. Both changes will eventually be exposed.
The question is: will they be exposed as an atomic operation, or is this atomicity provided only for the write?
I've received an excellent response from Gil Gilbert in the Firebase Google Group.
In short: Firestore does guarantee that reads are consistent too, with no partial observations of the kind I was worried about.
However, Gil mentions two cases where a client could observe this kind of inconsistency anyway, due to offline caching and session handling.
Please refer to Gil's response (link above) for details.
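For reference, a minimal sketch of the scenario with the web SDK (document paths and field values are placeholders):

```typescript
import { getFirestore, doc, getDoc, writeBatch } from "firebase/firestore";

const db = getFirestore();
const x = doc(db, "docs/X"); // hypothetical paths
const y = doc(db, "docs/Y");

// Client A: update X and Y atomically in one batched write.
async function clientA() {
  const batch = writeBatch(db);
  batch.update(x, { value: 1 });
  batch.update(y, { value: 1 });
  await batch.commit();
}

// Client B: per the answer above, once a server read of X reflects the
// batch, a subsequent server read of Y reflects it too (offline-cache
// edge cases aside, as Gil's response notes).
async function clientB() {
  const snapX = await getDoc(x);
  const snapY = await getDoc(y);
  console.log(snapX.data(), snapY.data());
}
```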