Firestore offline data: Merging writes, maximum time of offline persistence - firebase

https://firebase.google.com/docs/firestore/manage-data/enable-offline
How does Firestore work with offline data?
How are writes merged by many clients editing the same data offline, that then come online at the same time?
How long is offline data persisted? If my user uses my app for 5 years offline, then comes back online, will this be an issue? Do offline changes persist after device restarts?
Does query performance of offline data degrade as the data set gets larger?
I'm specifically interested in the web Firestore client.
Do all the language clients implement the above in the same manner?
Thanks.

How are writes merged by many clients editing the same data offline, that then come online at the same time?
The write operations are applied on the Firebase servers in the order in which they happened. The last (most recent) operation is the one whose result is in the database once synchronization completes.
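To make the "last write wins" behaviour concrete, here is a minimal simulation. It is not Firestore code: `Write`, `Server` and `sync` are hypothetical names, and the sketch assumes field-level updates that the server applies in whatever order the clients reconnect.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Write:
    """One queued offline write (hypothetical model, not a Firestore type)."""
    client: str
    doc_id: str
    fields: dict


@dataclass
class Server:
    """Toy stand-in for the backend: a dict of documents."""
    docs: Dict[str, dict] = field(default_factory=dict)

    def apply(self, write: Write) -> None:
        # Field-level update: later writes overwrite earlier values
        # for the same field and leave other fields untouched.
        self.docs.setdefault(write.doc_id, {}).update(write.fields)


def sync(server: Server, queues: List[List[Write]]) -> None:
    """Flush each client's queue in the order the clients reconnect.

    Within one queue, writes keep the order in which they were made.
    """
    for queue in queues:
        for write in queue:
            server.apply(write)
```

With client A reconnecting before client B, a field both of them edited ends up with B's value, while a field only A touched keeps A's value.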
How long is offline data persisted? If my user uses my app for 5 years offline, then comes back online, will this be an issue?
The problem is not how long the device is offline, but how many operations you perform while it is offline. While offline, Firestore keeps all write operations in a queue. As this queue grows, local operations and app startup will slow down. Nothing major, but over time these may add up. The bigger problem is that, in the meantime, the data on the server stays unmodified. What then is the purpose of a realtime database? Firestore is really designed as an online database that can work for short to intermediate periods of being disconnected, not to stay offline for 5 years. Besides that, after 5 years the problem might be compatibility rather than the number of writes.
Do offline changes persist after device restarts?
The offline persistence is also called disk persistence. This type of persistence is enabled by default in Cloud Firestore and it means that recently listened data (as well as any pending writes from the app to the database) are persisted to disk. The data in this cache survives app restarts and device reboots.
Does query performance of offline data degrade as the data set gets larger?
Yes, it does, as explained above.
Do all the language clients implement the above in the same manner?
No. For iOS and Android, the offline feature works fine, while for web this feature is still experimental.

Related

Best strategy to develop back end of an app with large userbase, taking into account limitations of bandwidth, concurrent connections etc.?

I am developing an Android app which basically does this: on the landing (home) page it shows a couple of words. These words need to be updated on a daily basis. Secondly, there is an 'experiences' tab in which a list of user experiences (around 500) shows up with their profile pic, description, etc.
This basic app is expected to get around 1 million users daily who will open the app daily at least once to see those couple of words. Many may occasionally open up the experiences section.
Thirdly, the app needs to have a push notification feature.
I am planning to purchase a managed wordpress hosting, set up a website, and add a post each day with those couple of words, use the JSON-API to extract those words and display them on app's home page. Similarly for the experiences, I will add each as a wordpress post and extract them from the Wordpress database. The reason I am choosing wordpress is that it has ready made interfaces for data entry which will save my time and effort.
But I am stuck on this: will the WordPress DB be able to handle such a large number of queries? With such a large userbase and spiky traffic, I suspect I might exceed the maximum concurrent connections limit.
What's the best strategy in my case? Should I use WP, Firebase, or some other service? I also need to make sure the scheme is cost-effective.
My app is basically very similar to this one:
https://play.google.com/store/apps/details?id=com.ekaum.ekaum
For push notifications, I am planning to use third party services.
Kindly suggest the best strategy for designing the back end of this app.
Thanks in advance to everyone who is willing to help me with this.
I have never used Wordpress, so I don't know if or how it could handle that load.
You can still use WP for data entry, and write a scheduled function that would use WP's JSON API to copy that data into Firebase.
RTDB-vs-Firestore scalability states that RTDB can handle 200 thousand concurrent connections and Firestore 1 million concurrent connections.
However, if I get it right, your app doesn't need connections to be active (i.e. receive real-time updates). You can get your data once, then close the connection.
For RTDB, Enabling Offline Capabilities on Android states that
On Android, Firebase automatically manages connection state to reduce bandwidth and battery usage. When a client has no active listeners, no pending write or onDisconnect operations, and is not explicitly disconnected by the goOffline method, Firebase closes the connection after 60 seconds of inactivity.
So the connection should close by itself after 1 minute, if you remove your listeners, or you can force close it earlier using goOffline.
For Firestore, I don't know if it happens automatically, but you can do it manually.
In Firebase Pricing you can see that 100K Firestore document reads cost $0.06. 1M reads (for the two words) should cost $0.60 plus some network traffic. In RTDB, the cost depends on the amount of data, so it requires some calculation, but it shouldn't be much. I am not familiar with the finer details of the pricing, so you should do some more research.
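The read-cost estimate can be checked with a few lines of arithmetic. The price constant is the figure quoted in this answer and may well have changed since; the function name is made up for illustration, and network traffic is left out.

```python
# Price quoted in the answer above; verify against the current pricing page.
PRICE_PER_100K_READS_USD = 0.06


def read_cost_usd(reads: int,
                  price_per_100k: float = PRICE_PER_100K_READS_USD) -> float:
    """Estimated cost of `reads` document reads, ignoring network traffic."""
    return reads / 100_000 * price_per_100k


# One read per user per day for 1 million daily users:
daily = read_cost_usd(1_000_000)
monthly = daily * 30  # assuming a 30-day month
```

Under these assumptions the daily figure comes out at roughly $0.60, matching the estimate above, and a 30-day month at roughly $18.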
In the app you mentioned, the experiences don't seem to change very often. You might want to try to build your own caching manually, and add the required versioning info in the daily data.
Edit:
It would possibly be more efficient and less costly if you used Firebase Hosting, instead of RTDB/Firestore directly. See Serve dynamic content and host microservices with Cloud Functions and Manage cache behavior.
In short, you create an HTTP function that reads your database and returns the data you need. You configure hosting to call that function, and configure the cache such that subsequent requests are served the cached result via hosting (without extra function invocations).
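The effect of putting a cache in front of the function can be illustrated with a small simulation. This is not Firebase Hosting configuration or the Cloud Functions API: `CachedEndpoint` is a hypothetical class that only mimics the idea of serving a cached response until it expires.

```python
import time
from typing import Callable, Optional


class CachedEndpoint:
    """Toy model of a cache sitting in front of an expensive handler."""

    def __init__(self, handler: Callable[[], str], max_age_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._handler = handler
        self._max_age = max_age_seconds
        self._clock = clock              # injectable so it can be tested
        self._cached: Optional[str] = None
        self._expires_at = 0.0
        self.invocations = 0             # how often the handler really ran

    def get(self) -> str:
        now = self._clock()
        if self._cached is None or now >= self._expires_at:
            # Cache miss or expired entry: call the handler once and
            # remember the result for max_age_seconds.
            self._cached = self._handler()
            self.invocations += 1
            self._expires_at = now + self._max_age
        return self._cached
```

With a maximum age of a few minutes, a million requests for the daily words would trigger only a handful of handler invocations (and database reads) instead of one per request.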

Firebase Realtime Database's unknown pending issue occurs every day

We have a mobile app service on Firebase.
Our service has roughly 5,000 to 10,000 concurrent connections.
I know that this is well within the performance limits.
But we have an issue where the Realtime Database stalls, with requests staying pending for 1 to 3 minutes.
It occurs every single day at night, even with few connections.
We started logging the main realtime database with Elasticsearch because of these pending issues.
The logs show the issue in detail.
When the database stalls, our app suddenly becomes unusable for 1 to 3 minutes (db load 3% -> 100%)
and we see nearly all concurrent users disconnect and reconnect at the same time.
I have attached a relevant screen capture. This issue can also be checked in GCP's Stackdriver.
Our guess is that this is caused by how Firebase allocates performance, given the difference in usage between daytime and nighttime.
I've contacted support 3 days ago but haven't received a reply yet.
So I am wondering whether anyone has had the same problem or knows about this issue.
[Image: Stackdriver screen capture]
[Image: ELK screen capture]
It sounds like you're performing a bulk read or write operation every day at the same time. This type of operation typically comes from a batch process and is unrelated to the number of active users of your app. In fact, the bulk read/write will lock out the regular users of your app.
An example of such a batch process could be a bulk read that every night synchronizes the data in the Firebase Realtime Database with Elasticsearch, but there are many more options.
If there's indeed a batch process that is causing this load, you'll want to look into:
Either splitting the process into multiple smaller steps, so that they can intersperse with the traffic from your regular clients.
Or running the same type of process on a backup of the database. Since you'd be running against a local file, it won't interfere with the regular clients, and the backup is made by an out-of-band process (so not causing interference either).
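The first option, splitting the process into smaller steps, could look roughly like the sketch below. The helper names, the batch size and the pause length are assumptions for illustration; `process_batch` stands for whatever the nightly job actually does with each slice of data.

```python
import time
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def chunked(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `size` items."""
    if size <= 0:
        raise ValueError("size must be positive")
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def run_in_steps(items: Iterable[T],
                 process_batch: Callable[[List[T]], None],
                 batch_size: int = 100,
                 pause_seconds: float = 1.0,
                 sleep: Callable[[float], None] = time.sleep) -> int:
    """Process items in small batches, pausing between batches.

    The pause gives regular client traffic a chance to be served.
    Returns the number of batches processed.
    """
    steps = 0
    for batch in chunked(items, batch_size):
        if steps:
            sleep(pause_seconds)
        process_batch(batch)
        steps += 1
    return steps
```

Suitable values for the batch size and pause depend on how much load each batch puts on the database, so they would need to be tuned against the load graphs.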

Send data to DynamoDb over intermittent connection

I have an application that needs to send data to a cloud database (DynamoDb).
The app runs on a computer that can lose internet connectivity or be switched off at any time, but I must ensure that all data eventually gets to the cloud database.
I can assume the application will eventually be switched on, and will eventually get internet access back.
The app is written in VB.NET.
What are some schemes for achieving this, and are there any ready-made products that already achieve this?
You could implement a write-through cache using a local DynamoDB instance (or even using SQLite). But without getting specific details about what kind of data you'd be storing into the database, and what data should be made available "offline" it's hard to say exactly how you should structure your application. You'll definitely want to not keep everything local, unless the volume of data is really small overall.
Then there is the problem of resolving conflicts that may occur during network partitions (i.e. a client goes offline and makes some database modifications, while other clients also make modifications to the database; these need to be reconciled, and it's up to you and your users to determine how).
It's not a simple problem to solve.
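As a starting point for the local SQLite idea, here is a minimal "outbox" sketch: every record is first committed to a local SQLite file and is only deleted once the upload succeeds. It is written in Python for brevity rather than VB.NET; the table name, the function names and the `send` callback (which would wrap the real DynamoDB call) are all hypothetical. It does not address the conflict resolution problem described above.

```python
import json
import sqlite3
from typing import Callable


def open_outbox(path: str = "outbox.db") -> sqlite3.Connection:
    """Open (or create) the local queue of records awaiting upload."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS outbox ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def enqueue(conn: sqlite3.Connection, record: dict) -> None:
    """Persist a record locally before any network attempt."""
    conn.execute("INSERT INTO outbox (payload) VALUES (?)",
                 (json.dumps(record),))
    conn.commit()


def flush(conn: sqlite3.Connection, send: Callable[[dict], None]) -> int:
    """Upload queued records oldest first; stop at the first network error.

    Returns the number of records sent and removed from the queue.
    """
    sent = 0
    rows = conn.execute(
        "SELECT id, payload FROM outbox ORDER BY id").fetchall()
    for row_id, payload in rows:
        try:
            send(json.loads(payload))
        except OSError:
            break  # treated as "offline"; retry on the next flush
        conn.execute("DELETE FROM outbox WHERE id = ?", (row_id,))
        conn.commit()
        sent += 1
    return sent
```

Because a record is deleted only after `send` returns, a crash between the upload and the delete means the record is sent again on the next run. The upload should therefore be idempotent, for example by writing each record under a stable key.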

Can I use Firebase as offline NoSQL database? [duplicate]

I am wondering whether it is a sound strategy to use the firebase offline capabilities as a "free" cache.
Let's assume that I am in activity A, I fetch some data from firebase, and then I move to activity B, which needs the same data. If the app is configured with setPersistenceEnabled(true) and, if necessary, also with keepSynced(true), can I just re-query the same data in activity B, rather than passing it around?
I understand that there is a difference between the two approaches regarding reading-from-memory and reading-from-disk (firebase offline cache). But do I really get rid of all the network overhead by using firebase offline?
Relevant links:
Firebase Offline Capabilities and addListenerForSingleValueEvent
https://groups.google.com/forum/#!msg/firebase-talk/ptTtEyBDKls/XbNKD_K8CQAJ
Yes, you can easily re-query your Firebase Database in each activity instead of passing data around. If you enable disk persistence, this will be a local read operation. But since you attach a listener (or keep it attached through keepSynced()), it will cause network traffic.
But don't use Firebase as an offline-only database. It is really designed as an online database that can work for short to intermediate periods of being disconnected. While offline it will keep a queue of write operations. As this queue grows, local operations and app startup will slow down. Nothing major, but over time these may add up.

Latency for SQLite update propagation across iCloud

I've built a library-style SQLite iOS app using the code in the Recipes sample app, and it works: updates on one device are (eventually) reliably propagated to all other devices running the same app. I've been testing it with multiple events per hour all day long, and all the log transactions do get to every device. However, the time for the updates to propagate is highly variable. If I bring the app up and let it sit, it could be a relatively long time before the cloud sends update transactions to the app, and so what's on-screen remains old data for that same long time. Worse, there's no indication the data is out of date.
If I cause the app to post a change to the cloud, though, updates from the cloud propagate down relatively soon. That suggests that I could put in a hack that periodically posts pointless changes to the database, but even then I won't know if I've received all the changes.
First question: Do methods exist that will force transactions to propagate? This thread suggests not.
Second question: Is there a way to detect if the local database is out of date? I don't want to tickle the cloud copy incessantly, but doing so now and then until the database is current might not be such a bad idea.
1) If you find such methods, please let me know.
2) Same here. I haven't found any reliable way to detect whether the currently running instance of the iCloud app is not the first instance, i.e. whether other instances have already sent updates to iCloud, until the first push is received (which can take an unknown amount of time).

Resources