How can I write to the Realtime Database, Cloud Storage and Firestore at the same time using transactions? [duplicate] - firebase

I'm developing an app in Flutter. I have a method called CreateUser which takes as parameters the user's information, their profile picture, and a list of strings. I need to save the information in the Realtime Database, the picture in Cloud Storage, and the list in Firestore.
I would like all of these operations to succeed; if one of them fails, I would like the others to undo the data they wrote. How can I implement the rollback of the other operations? Can I use transactions?
I've tried using transactions, but I'm not sure whether I can use them across different databases.

I need to save the information in the Realtime Database, the picture in the Cloud Storage, and the list in the Firestore.
That's indeed possible by performing the operations one after another, starting each one only when the previous one has succeeded. For example, as soon as the write to the Realtime Database completes, upload the image to Storage from inside the completion callback. As soon as the upload to Storage succeeds, perform the last operation and write the data to Firestore.
I would like all these operations to be successful, if one of these should fail then I would like the others to undo the data they wrote.
There is no built-in mechanism for that. If you were hoping to add a Realtime Database write, a Firebase Storage file upload, and a Firestore write to a single batch operation and be sure that all three complete, so that your data stays consistent, please note that this is not possible. These operations belong to different Firebase services and, unfortunately, at the time of writing there is no way to make them atomic, meaning that they all succeed or all fail with an exception.
How can I implement the rollback of the other operations?
You have to write code for that, because none of the Firebase products support cross-product transactional operations. To solve this, you'll have to nest the calls during your write/upload operations and handle the error if the next operation fails. This means that you have to delete the data from the Realtime Database and the file from Storage if the write operation in Firestore fails, or delete only the data from the Realtime Database if the file upload to Storage fails.
But note that, at some point, there will be a failure in which the client can't complete one of the rollback delete operations. The most common approach to these inevitable failures is to make your code robust by handling exceptions and performing occasional cleanups in both places, Firebase Storage and Firestore, considering that the first operation is the one that writes data to the Realtime Database.
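The nesting and manual rollback described above can be sketched as a list of steps, each paired with an undo. This is only a minimal illustration, not a Firebase API: the dictionaries stand in for the three services, and `run_with_compensation`, `CreateUserFailed`, and the step names are hypothetical. Any undo that fails is reported in `leftovers` so that a later cleanup job can pick it up.

```python
from typing import Callable, List, Tuple

# (name, action, undo): all three are hypothetical stand-ins for real SDK calls.
Step = Tuple[str, Callable[[], None], Callable[[], None]]


class CreateUserFailed(Exception):
    """Raised when a step fails; lists undo operations that also failed."""

    def __init__(self, failed_step: str, cause: Exception, leftovers: List[str]):
        super().__init__(f"{failed_step} failed: {cause}")
        self.failed_step = failed_step
        self.cause = cause
        self.leftovers = leftovers  # steps whose data still needs cleanup


def run_with_compensation(steps: List[Step]) -> None:
    completed: List[Step] = []
    for name, action, undo in steps:
        try:
            action()
        except Exception as cause:
            leftovers = []
            # Undo the steps that already succeeded, newest first.
            for done_name, _, done_undo in reversed(completed):
                try:
                    done_undo()
                except Exception:
                    leftovers.append(done_name)
            raise CreateUserFailed(name, cause, leftovers) from cause
        completed.append((name, action, undo))


# In-memory dictionaries stand in for the three services.
rtdb, storage, firestore = {}, {}, {}


def failing_firestore_write() -> None:
    raise RuntimeError("simulated Firestore outage")


steps: List[Step] = [
    ("rtdb", lambda: rtdb.update(u1={"name": "Ada"}), lambda: rtdb.pop("u1", None)),
    ("storage", lambda: storage.update(u1=b"png"), lambda: storage.pop("u1", None)),
    ("firestore", failing_firestore_write, lambda: firestore.pop("u1", None)),
]

try:
    run_with_compensation(steps)
    error = None
except CreateUserFailed as exc:
    error = exc
```

In this run the simulated Firestore write fails, so the earlier Realtime Database and Storage writes are undone in reverse order. This is best-effort compensation, not atomicity: if the process dies between steps, nothing is undone, which is why the occasional cleanup is still needed.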
As discussed with the Firebase engineers, the reason is quite clear. Even though the Realtime Database and Cloud Firestore are both Firebase products, they are still different products. Besides that, Firebase Storage is a service within Google Cloud. So as of 2023-01-12 there is no way to do that. Hopefully, it will be available in the near future.
Can I use transactions?
No, and that's for the exact same reason as above.

One way I might address this is to use the Firestore document write operation to trigger a Workflow [1] that can handle the three operations and roll back depending on the failure state. That way you also have a continuous transaction record that follows the process.
If you want to give feedback to the user in the app, you could have your app wait for a DB record of completion (or error) to be written and report back to the user based on that.
[1] https://cloud.google.com/firestore/docs/solutions/workflows

Related

Using Firebase Realtime Database node keys as document ID's in Cloud Firestore

I am in the process of migrating my Realtime Database to Cloud Firestore. Ideally, I need to keep the same Realtime Database node keys that have been generated using push() and use it as the document ID in Firestore, but is this safe to do so?
I have read information at https://firebase.google.com/docs/firestore/best-practices and I am still unsure whether this will be safe. I am aware that auto generated document IDs in Cloud Firestore are in a different format to those automatically generated in Realtime Database.
Am I likely to run into problems by using Realtime Database generated keys such as -M_NHw525_IxMqiGPUvd as the document ID in Cloud Firestore?
I really appreciate any help, Thanks.
Firestore is sensitive to hot spots in its writing process, meaning that write throughput is best when the writes are randomly distributed across the address space. In other words, throughput is best when the IDs of the documents being written, and the values being written to the indexes, are randomly distributed.
Firebase Realtime Database push IDs start with an encoded timestamp, so they are definitely not randomly distributed. They are (by design) largely sequential: subsequent calls to push() typically lead to keys that are next to each other. This is exactly what they were designed for in Realtime Database, but it doesn't meet the requirement of a random distribution that is needed for maximizing write throughput in Firestore.
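To see the encoded timestamp for yourself, here is a small sketch that decodes the leading characters of a push ID. It assumes the layout from Firebase's published description of push IDs: the first 8 characters are a millisecond timestamp written in a 64-character alphabet ordered by ASCII value. The helper name `push_id_timestamp_ms` is made up for this example.

```python
from datetime import datetime, timezone

# Assumed push ID alphabet: 64 characters in ascending ASCII order.
PUSH_CHARS = "-0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz"


def push_id_timestamp_ms(push_id: str) -> int:
    """Decode the first 8 characters of a push ID as a base-64 number."""
    value = 0
    for ch in push_id[:8]:
        value = value * 64 + PUSH_CHARS.index(ch)
    return value


created_ms = push_id_timestamp_ms("-M_NHw525_IxMqiGPUvd")
created_at = datetime.fromtimestamp(created_ms / 1000, tz=timezone.utc)
```

Because the timestamp occupies the most significant characters, keys created close together in time share a long common prefix, which is exactly the sequential pattern described above.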
Whether you'll run into problems when using existing push keys for your Firestore writes really depends on the implementation. For example, during a data migration you'll want to be ready to throttle the writes (sooner) when they're not randomly distributed. Hopefully the above helps you to know what to keep an eye out for when performing the data migration.
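As a rough illustration of throttling a migration, the sketch below writes documents in fixed-size batches and pauses after each full batch. Everything here is an assumption for the example: `migrate_throttled` and `write_batch` are hypothetical names, and the batch size and pause are placeholder values, not Firestore limits.

```python
import time
from typing import Callable, Iterable, List


def migrate_throttled(
    docs: Iterable[dict],
    write_batch: Callable[[List[dict]], None],
    batch_size: int = 100,       # placeholder value, tune for your data
    pause_seconds: float = 1.0,  # placeholder value, raise it to throttle sooner
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Write docs in fixed-size batches, pausing after each full batch."""
    written = 0
    batch: List[dict] = []
    for doc in docs:
        batch.append(doc)
        if len(batch) == batch_size:
            write_batch(batch)
            written += len(batch)
            batch = []
            sleep(pause_seconds)
    if batch:  # flush the final partial batch
        write_batch(batch)
        written += len(batch)
    return written
```

Passing `sleep` as a parameter keeps the pause easy to adjust (or to replace in tests) if you see write errors or rising latency during the migration.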

Which firebase database to use for chat application, Firestore or Realtime Database?

I'm building an app which uses Firestore for storing most data. The app has a chat functionality and I was considering using Realtime Database for that. What are the benefits of using Firebase Firestore vs Realtime Database for this chat functionality? If there is no difference, should I use Firestore for everything?
P.S. I have already read the firebase comparison of the two https://firebase.google.com/docs/database/rtdb-vs-firestore and I am still not sure which way to go about this.
FB RTDB was designed for a chat application but is not so great for more than simple querying. Firestore was developed to improve the querying requirements and is newer. Newer doesn't necessarily mean better, depends on the use case. Their pricing models are very different, so you need to understand how your use case will be charged.
You can use both of course. They can work well together but if a simple chat requirement is all you need, I would use RTDB.
PS. The unique keys generated in RTDB for each new record are automatically in chronological order, which relates back to it being designed for a chat app. There is a caveat though: the chat messages may still get out of order, because the keys are generated on the device, and if the device clocks are slightly off and messages are being exchanged rapidly, you may get a mistiming. The way round this is to write each record with a server-time property and use that to sort the chat messages. Hope that helps your decision.
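A tiny sketch of that caveat, using made-up data: the keys and the `server_time_ms` field are hypothetical, and the second device's clock is assumed to run slightly behind.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Message:
    key: str             # client-generated key, ordered by the device clock
    server_time_ms: int  # hypothetical field filled in by the server on write
    text: str


# Device B's clock runs behind, so its reply gets a key that sorts first.
messages: List[Message] = [
    Message(key="-N002", server_time_ms=1_000, text="hi"),
    Message(key="-N001", server_time_ms=1_050, text="hello back"),
]

by_key = sorted(messages, key=lambda m: m.key)
by_server_time = sorted(messages, key=lambda m: m.server_time_ms)
```

Sorting by key puts the reply before the message it answers, while sorting by the server-written time restores the order in which the server received them.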
PPS. RTDB charges for data storage volumes and data download volumes. Firestore charges for storage and db reads and writes. There will be a lot of the latter in a chat app so I would recommend running some what-if scenarios in Excel.
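The same what-if exercise can be sketched in code instead of a spreadsheet. The function below only counts billable units; it deliberately contains no prices, so you would multiply the results by the current rates yourself. It assumes, as a simplification, that every message delivered to a recipient costs one Firestore document read, and that RTDB download volume is roughly deliveries times message size. The names `ChatLoad` and `estimate_daily_load` are made up for this example.

```python
from dataclasses import dataclass


@dataclass
class ChatLoad:
    firestore_writes: int
    firestore_reads: int
    rtdb_download_bytes: int


def estimate_daily_load(
    messages_per_day: int,
    recipients_per_message: int,
    bytes_per_message: int,
) -> ChatLoad:
    """Count billable units for one day of chat traffic (no prices applied)."""
    deliveries = messages_per_day * recipients_per_message
    return ChatLoad(
        firestore_writes=messages_per_day,
        firestore_reads=deliveries,
        rtdb_download_bytes=deliveries * bytes_per_message,
    )
```

Running it for a few traffic levels shows how quickly the read count grows with the number of recipients, which is the part of the Firestore bill to watch in a chat app.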

Is transaction really required in a distributed counter?

According to firestore documentation:
a transaction is a set of read and write operations on one or more documents.
Also:
Transactions will fail when the client is offline.
Now the limitation in firestore is that:
In Cloud Firestore, you can only update a single document about once per second, which might be too low for some high-traffic applications.
So using cloud functions and running transactions to increment/decrement counters when the traffic is high will fail.
So the documentation discusses using distributed counters instead.
According to the distributed counter algorithm:
create shards
choose a shard randomly
run a transaction to increment/decrement the counter
get all the shards and aggregate the result to show the value of a counter
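The steps above can be sketched in memory as follows. This is not Firestore code: a plain list stands in for the shard documents, and the class name `ShardedCounter` is made up. In Firestore, each increment would be a transaction (or atomic increment) on one shard document, and reading the total would fetch every shard.

```python
import random
from typing import List, Optional


class ShardedCounter:
    """In-memory stand-in for a counter split across several shard documents."""

    def __init__(self, num_shards: int, rng: Optional[random.Random] = None):
        self.shards: List[int] = [0] * num_shards  # step 1: create shards
        self._rng = rng or random.Random()

    def increment(self, delta: int = 1) -> None:
        # Step 2: choose a shard randomly, so writes spread across documents.
        index = self._rng.randrange(len(self.shards))
        # Step 3: in Firestore this update would run inside a transaction.
        self.shards[index] += delta

    def total(self) -> int:
        # Step 4: read all shards and aggregate them.
        return sum(self.shards)


counter = ShardedCounter(num_shards=10)
for _ in range(1000):
    counter.increment()
for _ in range(200):
    counter.increment(-1)
```

Spreading updates over N shards raises the sustainable update rate roughly N-fold, at the cost of N reads whenever the total is displayed.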
Scenario:
Consider a counter that has to be updated when a document is added and that is displayed in the UI. For good UX, I cannot block the UI when the network is offline, so I must allow documents to be created and updated even when the client is offline, and sync these changes once the client is back online so that everyone else listening receives the correct counter value.
Now transactions fail when the client is offline.
So my question for best user experience (even when offline) is:
Do you really require a transaction to increment a counter? I know transactions ensure that writes are atomic, so they either succeed or fail as a whole and prevent partial writes. But what's the point when they fail offline? I was thinking maybe write them to the local cache and sync once the network is back online.
Should this be done via client SDKs or via Cloud Functions?
Do you really require a transaction to increment a counter?
Definitely yes! Because we are creating apps that can be used in a multi user environment, transactions are mandatory, so we can provide consistent data.
But what's the point when they fail offline?
Transactions are not supported for offline use, that is, when the user's device has no network connection. This is because a transaction absolutely requires round-trip communication with the server in order to ensure that the code inside the transaction completes successfully. So transactions can only execute when you are online.
Should this be done via client SDKs or via Cloud Functions?
Please note, that the Firestore SDK for Android has a local cache that's enabled by default. According to the official documentation regarding Firestore offline persistence:
For Android and iOS, offline persistence is enabled by default. To disable persistence, set the PersistenceEnabled option to false.
So all read operations will come from the cache if there are no updates on the server. This is how Firestore handles offline data.
You can also write a Cloud Function that increments the counter when a new document is added and decrements it when a document is deleted.
I also recommend you to take a look:
How to count the number of documents under a collection in Firestore?
So you may also consider using Firebase realtime database for that. Cloud Firestore and Firebase realtime database work very well together.
Edit:
It allows one to upvote the answer even when the device is offline. After the network is online, it syncs to the server and the counter is updated. Is there a way I can do this in Firestore when the device is offline?
This also happens by default. If the user adds or deletes documents while offline, every operation is added to a queue. Once the user regains the connection, every change made while offline is sent to the Firebase servers. In other words, all pending writes are committed on the server.
Cloud Functions are triggered only when the change is received, and that can only happen when the device is online.
Yes, that's correct. Once the device regains its network connection, the document is added to or deleted from the database, at which point the function fires and increases or decreases the counter.
Edit2:
Suppose I have made around 100 operations offline, will that not put a load on the cloud functions when the device comes online? What's your thought on this?
When offline, pending writes that have not yet been synced to the server are held in a queue. If you do too many write operations without going online to sync them, that queue will grow fast, and it will slow down not only the write operations but also your read operations. So I suggest using this database for its online capabilities.
Regarding Cloud Functions for those 100 offline operations, there will be no issues. Firebase servers work very well with concurrent operations.

How firebase functions realtime-db triggers work with offline writes?

When a series of write operations is made to the Firebase Realtime Database while the client is offline, they are stored on the client and added to the database once it reconnects.
The behavior of Firebase Functions will depend on how this is written to the database. Will it just sync the two databases as a single write operation?
Or will it trigger all these write operations?
I just tried this. You could try it yourself and verify the results on your own.
Each offline write to the exact same location in the database triggered a call to an onUpdate trigger at the same location.
However, you should not expect the triggers to be executed in any particular order. There is no guarantee to the order of events delivered to a Cloud Functions trigger, and they may all be executed in parallel to some degree.

Firestore pricing clarifications for offline cached data

It seems odd to me that Firestore would charge me for read queries to locally cached data, but I can't find any clarification to the contrary in the Firestore Pricing document. If I force Firebase into offline mode and then perform reads on my locally cached data, am I still charged for each individual entity that I retrieve?
Second, offline users in my app write many small updates to a single entity. I want the changes to persist locally each time (in case they quit the app), but I only need eventually consistent saves to the cloud. When a user reconnects to the internet and Firestore flushes the local changes, will I be charged a single write request for the entity or one per update call that I made while offline?
Firestore could potentially fit my use case very well, but if offline reads and writes are charged at the same rate as online ones it would not be an affordable option.
As the official documentation says,
Cloud Firestore supports offline data persistence. This feature caches a copy of the Cloud Firestore data that your app is actively using, so your app can access the data when the device is offline. You can write, read, listen to, and query the cached data. When the device comes back online, Cloud Firestore synchronizes any local changes made by your app to the data stored remotely in Cloud Firestore.
So every client that uses a Firestore database with PersistenceEnabled set to true maintains its own internal (local) version of the database. When data is inserted or updated, it is first written to this local version. As a result, all writes to the database are added to a queue. This means that all the operations that were stored there will be committed on the Firebase servers once you are back online. This also means that those operations will be seen as independent operations and not as a whole.
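A minimal in-memory model of the behavior described in this answer: each offline write is applied locally and queued, and on reconnect every queued write is replayed on its own rather than merged. The class `OfflineWriteQueue` and its methods are hypothetical and only mirror the description above; they are not the Firestore SDK.

```python
from typing import Dict, List, Tuple


class OfflineWriteQueue:
    """Toy model of a local cache plus a queue of pending writes."""

    def __init__(self) -> None:
        self.local: Dict[str, dict] = {}
        self.pending: List[Tuple[str, dict]] = []

    def set(self, path: str, data: dict) -> None:
        self.local[path] = data            # visible locally right away
        self.pending.append((path, data))  # queued until back online

    def flush(self, server: Dict[str, dict]) -> int:
        """Replay every queued write; return how many writes the server saw."""
        sent = 0
        for path, data in self.pending:
            server[path] = data
            sent += 1
        self.pending.clear()
        return sent
```

Under this model, three offline updates to the same document reach the server as three separate writes, even though only the last value survives.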
But remember, don't use Firestore as an offline-only database. It is really designed as an online database that can work for short to intermediate periods of being disconnected. While offline, it keeps a queue of write operations. As this queue grows, local operations and app startup will slow down. Nothing major, but over time these may add up.
If the Cloud Firestore pricing model does not fit your use case very well, then use the Firebase Realtime Database. As also mentioned in this post from the official Firebase blog, one of the reasons you still might want to use the Realtime Database is:
As we noted above, Cloud Firestore's pricing model means that applications that perform very large numbers of small reads and writes per second per client could be significantly more expensive than a similarly performing app in the Realtime Database.
So it's up to you which option you choose.
According to this, if you want to work completely offline with Cloud Firestore, you can disable the network with:
FirebaseFirestore.getInstance().disableNetwork()
Note, however, that Firestore will raise a client-offline error on the first get request, and you have to treat that error as an empty response.
