Can we not query collections inside transactions? - firebase

Looking at https://firebase.google.com/docs/reference/js/firebase.firestore.Transaction I see four methods: delete, set, get, update.
I was about to construct a lovely little collection query and pass it to .get, but I see the docs say that .get "Reads the document referenced by the provided DocumentReference."
It appears this means we cannot get a collection, or query a collection, with a Transaction object.
I could query those with the query's .get() method instead of the transaction's .get() method, but if the collection changes out from under me, the transaction will end up in an inconsistent state without retrying.
It seems I am hitting a wall here. Is my understanding correct? Can we not access collections inside a transaction in a consistent way?

Your understanding is correct. When using the web and mobile SDKs, you have to identify the individual documents that you would like to ensure will not change before your transaction is complete. If those documents come from a collection query ahead of time, fine. But think for a moment about how not-scalable it would be if you had to track every document in a (very large) collection in order to complete your transaction.
However, with the backend SDKs, you can perform a query inside a transaction and effectively transact on all the documents returned by the query, up to the limit on the number of documents in a transaction (500).

You can run queries (not just fetch single documents) with a transaction's get() method, but only in server-side code. So if you really need to do that (say, to keep denormalized data consistent), you can put that code in a Cloud Function and use a server-side transaction.
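As a rough sketch of the server-side pattern described above: the interfaces below are hypothetical stand-ins that loosely follow the shape of the server SDK (a transaction whose get() accepts a query), so the snippet type-checks without any dependency. The function name markAllProcessed and the processed field are made up for illustration.

```typescript
// Hypothetical structural types, loosely modeled on the server SDK shapes.
// They exist only so this sketch compiles without installing anything.
interface DocRef { id: string }
interface DocSnap { ref: DocRef; data(): { [field: string]: unknown } }
interface QueryLike { description: string }
interface Tx {
  get(query: QueryLike): Promise<{ docs: DocSnap[] }>;
  update(ref: DocRef, changes: { [field: string]: unknown }): void;
}
interface Db {
  runTransaction<T>(body: (tx: Tx) => Promise<T>): Promise<T>;
}

// Read every document matched by the query inside the transaction,
// then write to each of them. All reads happen before any write.
async function markAllProcessed(db: Db, query: QueryLike): Promise<number> {
  return db.runTransaction(async (tx) => {
    const snapshot = await tx.get(query);
    snapshot.docs.forEach((doc) => tx.update(doc.ref, { processed: true }));
    return snapshot.docs.length;
  });
}
```

With a real server SDK you would pass its database object and a real query instead of these stand-ins, and the per-transaction document limit mentioned above would still apply.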

Related

Firestore data model for events planning app

I am new to Firestore and building an event planning app but I am unsure what the best way to structure the data is taking into account the speed of queries and Firestore costs based on reads etc. In both options I can think of, I have a users collection and an events collection
Option 1:
In the users collection, each user has an array of eventIds for events they are hosting and also events they are attending. Then I query the events collection for those eventIds of that user so I can list the appropriate events to the user
Option 2:
For each event in the events collection, there is a hostId and an array of attendeeIds. So I would query the events collection for events where hostId === user.id and where attendeeIds.includes(user.id).
I am trying to figure out which is best from a performance and a costs perspective taking into account there could be thousands of events to iterate through. Is it better to search events collections by an eventId as it will stop iterating when all events are found or is that slow since it will be searching for one eventId at a time? Maybe there is a better way to do this than I haven't mentioned above. Would really appreciate the feedback.
In addition to @Dharmaraj's answer, please note that neither solution is better than the other in terms of performance. In Firestore, query performance depends on the number of documents you request (read), not on the number of documents you are searching. It doesn't matter whether you read 10 documents from a collection of 100 documents or from a collection that contains 100 million documents; the response time will be the same.
From a billing perspective, yes, the first solution will imply an additional document to read, since you first need to actually read the user document. However, reading the array and getting all the corresponding events will also be very fast.
Please bear in mind, that in the NoSQL world, we are always structuring a database according to the queries that we intend to perform. So if a query returns the documents that you're interested in, and produces the fewest reads, then that's the solution you should go ahead with. Also remember, that you'll always have to pay a number of reads that is equal to the number of documents the query returns.
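To make the read counting concrete, here is a small back-of-the-envelope sketch. It only encodes what is said above (one read per returned document, plus one extra read for the user document in option 1). The function names and the event count are hypothetical, and it assumes the same set of events is returned in both options.

```typescript
// Option 1: one read for the user document, then one read per referenced event.
function readsForOption1(eventCount: number): number {
  return 1 + eventCount;
}

// Option 2: the events are queried directly, so only the returned events are billed.
function readsForOption2(eventCount: number): number {
  return eventCount;
}

const eventsForUser = 12; // hypothetical number of events the user hosts or attends
const option1Reads = readsForOption1(eventsForUser);
const option2Reads = readsForOption2(eventsForUser);
```

The difference is a constant single read, regardless of how many events exist in the collection overall.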
Regarding security, both solutions can be secured relatively easily. Now it's up to you to decide which one works better for your use case.
I would recommend going with option 2 because it might save you some reads:
You won't have to query the user's document in the first place and then run another query like where(documentId(), "in", [...userEvents]) or fetch each of them individually if you have many.
When trying to write security rules, you can directly check if an event belongs to the user trying to update the event by resource.data.hostId == request.auth.uid.
When using the first option, you'll have to query the user's document in security rules to check if this eventId is present in that events array (that may cost you another read). Check out the documentation for more information on billing.

Firestore listener mechanism efficiency

If I understand correctly, in the initialization phase of addSnapshotListener you get a list of all the documents (even if it is 500 trillion documents) from the QuerySnapshot if you call the getDocuments function.
Then, every time you modify or add a document to a collection, you get from QuerySnapshot all the documents that have been modified by calling the getDocumentChanges function or all the existing documents by calling getDocuments.
That means that both at initialization and after every change, I always get a list of all the documents. Is that right? Assuming I have 500 trillion documents in the same collection (just for the sake of exaggeration), will I get them all at every change and at every initialization of the app?
Is that really the case?
Or is there some kind of lazy instantiation?
If that is the case, then whenever I query the whole collection, do I always get the whole list first?
The QuerySnapshot always contains all documents that match the query (or collection reference). Even when there's an update and only a subset of the matching documents has changed, the QuerySnapshot still contains all of the documents, although in the communication between the SDK and the backend servers, Firestore only synchronizes the modified documents. If you only want to process the changes, you can process just the changes between the snapshots.
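To illustrate what "the changes between the snapshots" means, here is a minimal in-memory sketch. It is not the SDK's implementation (the SDK already exposes this through the document-changes call mentioned in the question). The Snapshot and Change types are hypothetical and model a snapshot as a plain map from document id to serialized data.

```typescript
// Hypothetical model of a query snapshot: document id -> serialized data.
type Snapshot = { [id: string]: string };
type Change = { type: "added" | "modified" | "removed"; id: string };

function has(snapshot: Snapshot, id: string): boolean {
  return Object.prototype.hasOwnProperty.call(snapshot, id);
}

// Compare two full snapshots and return only what differs between them.
function diffSnapshots(previous: Snapshot, current: Snapshot): Change[] {
  const changes: Change[] = [];
  Object.keys(current).forEach((id) => {
    if (!has(previous, id)) {
      changes.push({ type: "added", id });
    } else if (previous[id] !== current[id]) {
      changes.push({ type: "modified", id });
    }
  });
  Object.keys(previous).forEach((id) => {
    if (!has(current, id)) {
      changes.push({ type: "removed", id });
    }
  });
  return changes;
}

const previousSnap: Snapshot = { a: "v1", b: "v1", c: "v1" };
const currentSnap: Snapshot = { a: "v1", b: "v2", d: "v1" };
const detected = diffSnapshots(previousSnap, currentSnap);
// detected holds three entries: b was modified, d was added, c was removed.
```

Both snapshots hold every matching document, but only the three changed entries need processing.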

Firebase Firestore, Delete Collection with a Callable Cloud Function

The documentation at https://firebase.google.com/docs/firestore/solutions/delete-collections
says the following:
Consistency - the code above deletes documents one at a time. If you
query while there is an ongoing delete operation, your results may
reflect a partially complete state where only some targeted documents
are deleted. There is also no guarantee that the delete operations
will succeed or fail uniformly, so be prepared to handle cases of
partial deletion.
So how should this be handled correctly?
Does it mean preventing users from accessing the collection while the deletion is in progress?
Or, if the work is interrupted partway, does it mean calling the function again to continue from where it failed until the deletion is complete?
It's suggesting that you should check for failures, and retry until there are no documents remaining (or at least until you are satisfied with the result).
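A minimal sketch of that check-and-retry loop, assuming a hypothetical deleteDoc callback that reports success or failure per document. A real implementation would call the delete API and would typically re-query the collection between passes. The pass limit is also an assumption, added so the loop cannot run forever.

```typescript
// Hypothetical delete operation: returns true on success, false on failure.
type DeleteFn = (id: string) => boolean;

// Delete every id in `ids`, retrying failures until nothing remains or
// `maxPasses` is exhausted. Returns the ids that could not be deleted.
function deleteUntilEmpty(ids: string[], deleteDoc: DeleteFn, maxPasses: number): string[] {
  let remaining = ids.slice();
  for (let pass = 0; pass < maxPasses && remaining.length > 0; pass++) {
    // Keep only the ids whose delete failed; they are retried on the next pass.
    remaining = remaining.filter((id) => !deleteDoc(id));
  }
  return remaining;
}

// Simulated flaky backend: each id fails on its first attempt, then succeeds.
const attemptCounts: { [id: string]: number } = {};
const flakyDelete: DeleteFn = (id) => {
  attemptCounts[id] = (attemptCounts[id] || 0) + 1;
  return attemptCounts[id] > 1;
};

const leftover = deleteUntilEmpty(["a", "b", "c"], flakyDelete, 3);
```

Whatever comes back in the returned list is the partial-deletion case the documentation warns about, so the caller can decide whether to retry later or report it.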

Downside of using transactions in google firestore

I'm developing a Flutter App and I'm using the Firebase services. I'd like to stick only to using transactions as I prefer consistency over simplicity.
// Plain update: a single round trip.
await Firestore.instance.collection('user').document(id).updateData({'name': 'new name'});
// The same change wrapped in a transaction.
await Firestore.instance.runTransaction((transaction) async {
  transaction.update(Firestore.instance.collection('user').document(id), {'name': 'new name'});
});
Are there any (major) downsides to transactions? For example, are they more expensive (Firebase billing, not computationally)? After all there might be changes to the data on the Firestore database which will result in up to 5 retries.
For reference: https://firebase.google.com/docs/firestore/manage-data/transactions
"You can also make atomic changes to data using transactions. While
this is a bit heavy-handed for incrementing a vote total, it is the
right approach for more complex changes."
https://codelabs.developers.google.com/codelabs/flutter-firebase/#10
With the specific code samples you're showing, there is little advantage to using a transaction. If your document update makes a static change to a document, without regard to its existing data, a transaction doesn't make sense. The transaction you're proposing is actually just a slower version of the update, since it has to round-trip with the server twice in order to make the change. A plain update just uses a single round trip.
For example, if you want to append data to a string, two clients might overwrite each other's changes, depending on when they each read the document. Using a transaction, you can be sure that each append is going to take effect, no matter when the append was executed, since the transaction will be retried with updated data in the face of concurrency.
Typically, you should strive to get your work done without transactions if possible. For example, prefer to use FieldValue.increment() outside of a transaction instead of manually incrementing within a transaction.
Transactions are intended to be used when you have changes to make to a document (or, typically, multiple documents) that must take the current values of its fields into account before making the final write. This prevents two clients from clobbering each others' changes when they should actually work in tandem.
Please read more about transactions in the documentation to better understand how they work. It is not quite like SQL transactions.
Are there any (major) downsides to transactions?
I don't know any downsides.
For example, are they more expensive (Firebase billing, not computationally)?
No, a transaction costs the same as any other write operation. For example, if you create a transaction to increase a counter, you'll be charged for only one write operation.
I'm not sure I understand your last question completely but if a transaction fails, Cloud Firestore retries the transaction for sure.

Can transaction be used on collection?

I am using Firestore and trying to remove a race condition in a Flutter app by using a transaction.
I have a subcollection to which at most 2 documents should be added.
The race condition means that more than 2 documents may be added, because the client code uses setData. For example:
Firestore.instance.collection('collection').document('document').collection('subCollection').document(subCollectionDocument2).setData({
  'document2': documentName,
});
I am trying to use a transaction to make sure that at most 2 documents are added. So if the collection changes (for example, a new document is added to it) while the transaction runs, the transaction will fail.
But I have read the docs, and it seems transactions are meant more for race conditions when setting a field in a document, not when adding a document to a subcollection.
For example, if I try to implement:
Firestore.instance.collection('collection').document('document').collection('subCollection').runTransaction((transaction) async {
});
It gives this error:
error: The method 'runTransaction' isn't defined for the class 'CollectionReference'.
Can a transaction be used to monitor changes to a subcollection?
Does anyone know another solution?
Can a transaction be used to monitor changes to a subcollection?
Transactions in Firestore work by a so-called compare-and-swap operation. In a transaction, you read a document from the database, determine its current state, and then set its new state based on that. When you've done that for the entire transaction, you send the whole package of current-state-and-new-state documents to the server. The server then checks whether the current state in the storage layer still matches what your client started with, and if so it commits the new state that you specified.
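The compare-and-swap idea can be sketched with a small in-memory model. This is a simplified illustration of the description above, not Firestore's actual implementation. The Store class, the version counter, and the beforeCommit hook are all hypothetical and exist only to simulate a concurrent writer.

```typescript
// Hypothetical in-memory store: each document carries a version number
// that is bumped on every committed write.
type Doc = { version: number; value: string };

class Store {
  private docs: { [id: string]: Doc } = {};

  read(id: string): Doc {
    const doc = this.docs[id];
    return doc ? { version: doc.version, value: doc.value } : { version: 0, value: "" };
  }

  // Commit only if the document is still at the version the caller read.
  compareAndSwap(id: string, readVersion: number, newValue: string): boolean {
    if (this.read(id).version !== readVersion) {
      return false; // someone else wrote in between; the caller must retry
    }
    this.docs[id] = { version: readVersion + 1, value: newValue };
    return true;
  }
}

// Read-modify-write with retries, mirroring how a failed transaction is re-run.
// Returns the number of attempts that were needed.
function appendInTransaction(
  store: Store, id: string, suffix: string, maxAttempts: number, beforeCommit?: () => void
): number {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const current = store.read(id);
    if (beforeCommit) beforeCommit(); // simulation hook: lets another writer sneak in
    if (store.compareAndSwap(id, current.version, current.value + suffix)) {
      return attempt;
    }
  }
  throw new Error("transaction failed after " + maxAttempts + " attempts");
}

const store = new Store();
appendInTransaction(store, "doc", "A", 5);

// A concurrent writer commits "B" between our first read and our commit.
let interfered = false;
const attemptsUsed = appendInTransaction(store, "doc", "C", 5, () => {
  if (!interfered) {
    interfered = true;
    appendInTransaction(store, "doc", "B", 5);
  }
});
```

In this run the first attempt to append "C" is rejected because the document changed after it was read. The retry re-reads the document and commits on top of the other writer's change, so no append is lost.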
Knowing this, the only way it is possible to monitor an entire collection in a transaction is to read all documents in that collection into the transaction. While that is technically possible for small collections, it's likely to be very inefficient, and I've never seen it done in practice. Then again, for just the two documents in your collection it may be totally feasible to simply read them in the transaction.
Keep in mind though that a transaction only ensures consistent data, it doesn't necessarily limit what a malicious user can do. If you want to ensure there are never more than two documents in the collection, you should look at a server-side mechanism.
The simplest mechanism (infrastructure wise) is to use Firestore's server-side security rules, but I don't think those will work to limit the number of documents in a collection, as Doug explained in his answer to "Limit a number of documents in a subcollection in firestore rules".
The most likely solution in that case is (as Doug also suggests) to use Cloud Functions to write the documents in the subcollection. That way you can simply reject direct writes from the client, and enforce any business logic you want in your Cloud Functions code, which runs in a trusted environment.
