Message guarantees on rapidly updated entities in Firebase

I'd like to understand how Firebase and listening clients behave when a large number of updates are made to an entity in a short amount of time, and a client is listening to 'value' changes on that entity.
Say I have an entity in Firebase with some simple data.
{
  "entity": 1
}
And the value of that "entity" was updated very rapidly. Something like the below code that writes 1000 integers.
// pseudo-code for making 1000 writes as quickly as possible
for (var i = 0; i < 1000; i++) {
  ref.child('entity').set(i);
}
Ignoring transient issues, would a listening client using the 'on' API in a browser receive ALL 1000 notifications containing 0-999, or does Firebase have throttles in place?

First off, it's important to note that the Firebase realtime database is a state synchronization service, and is not a pub/sub service.
If you have a location that is updating rapidly, the service guarantees that eventually the state will be consistent across all clients, but not that all intermediate states will be surfaced. At most one event will fire for every update, but the server is free to 'squash' successive updates to the same location into one.
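A minimal sketch of what that means for a listening client, assuming the same ref as in the question: the 'value' listener below is guaranteed to eventually see the final state (999), but may fire far fewer than 1000 times.
// The listener observes a sequence of states, not every individual write;
// successive writes may be coalesced server-side into a single event
ref.child('entity').on('value', function(snapshot) {
  console.log('observed state:', snapshot.val());
});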
On the client making the updates, I think the current behavior is that every change fires a local event; this is a notable exception to the squashing above, though I could be wrong.
In order to achieve guaranteed delivery of every intermediate state, it's possible to push (childByAutoId in Objective-C) onto a list of events at a database location instead of simply updating the value directly. Check out the Firebase REST API docs on saving lists of data.
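A minimal sketch of that pattern, using a hypothetical entityEvents location: each write creates a new child instead of overwriting the previous value, so nothing can be squashed, and a 'child_added' listener sees every entry exactly once.
var eventsRef = firebase.database().ref('entityEvents');

// Writer: push each value as a new list entry with a unique, ordered key
for (var i = 0; i < 1000; i++) {
  eventsRef.push(i);
}

// Reader: 'child_added' fires once per entry, so all 1000 values arrive
eventsRef.on('child_added', function(snapshot) {
  console.log('value:', snapshot.val());
});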

Related

Firebase Firestore - delay between when data is written and when it is available for a query

I have noticed a delay between the time some data is written to Firestore and when it is available to be read via a query. In my tests I have seen this go up to 30 seconds.
Is this documented anywhere?
Are there ways to decrease this delay?
Is there a way to know the server timestamp corresponding to the data being returned? Or to have any indication about this delay in the data being returned from Firestore?
(say some data is written to the server at 1:00 - the document is created server-side at that time, I query it at 1:01, but due to the delay it returns the data as it was at 0:58 server-side, the timestamp would be 0:58)
Here I am not speaking about anything with high load, my tests were just about writing and then reading 1 small document.
Firestore can have some delays; I have noticed this too. The Realtime Database lives up to its name in this regard: the time lag is minimal, less than a second in most cases!
If you use Firestore on a native device, offline-first is enabled by default. That means data is first written to a cache on your device and then synced to the backend. On the web you can enable that too. To be notified when a write is saved to the backend, enable includeMetadataChanges on your realtime listeners:
db.collection("cities").doc("SF")
  .onSnapshot({
    // Listen for document metadata changes as well as content changes
    includeMetadataChanges: true
  }, (doc) => {
    // hasPendingWrites is true while the write is still local-only,
    // and flips to false once the backend has acknowledged it
    const source = doc.metadata.hasPendingWrites ? "local" : "server";
    console.log("data from", source, ":", doc.data());
  });
That way you will be notified when the data is written to the backend. You can read more about it here.
The delay should only occur between different devices. If you listen for the changes on the same device where you write the data, they will show up immediately.
One more thing to take care of: if you have offline persistence enabled, you should not await the writes. Just use them as if they were synchronous (even on the web with the offline feature enabled). I had awaits in my code before and it looked like the database was slow; by removing them, the code ran much faster.
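A minimal sketch of the difference, reusing the cities document from above with hypothetical data:
// Blocks until the backend acknowledges the write; with offline
// persistence enabled this can take arbitrarily long while offline
await db.collection("cities").doc("SF").set({ population: 870000 });

// Fire-and-forget: the local cache accepts the write immediately and
// the SDK syncs it to the backend in the background
db.collection("cities").doc("SF").set({ population: 870000 });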

Do Firebase listeners incur more reads than normal 'snapshots'?

I wonder if Firebase listeners incur more reads than 'normal' snapshot gets from the Firestore database. I assume a listener has to constantly check the DB to see if something new is in there, and therefore that would add to the reads on my app.
A get() is not necessarily more expensive than a snapshot listener. The difference between a get and a listen boils down to the fact that a get will read a document exactly once, while a listen will read it at least once, with each change to that document incurring another read, for as long as the listener is active.
A listener will only cause more read billing if it is left active and the document changes during that time.
You could instead poll the document with several gets over time, but each one of those gets will cost a document read, even if the document is unchanged. So you will have to determine for yourself what the best way is to get updates to a document, or even if you want updates at all.
You have two ways to retrieve data from Firestore:
Promise
Listener
When you use a promise with the get() method, you send a lazy request to the server. The listener method instead opens a connection channel that retrieves information in real time on every change. (That does not mean the process makes a new request every time.)
When you access data using the listener strategy, you open a communication channel that, of course, consumes connection resources. But you can unsubscribe it: check How to remove listener for DocumentSnapshot events (Google Cloud FireStore)
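A minimal sketch of the two strategies side by side, reusing the cities document from the earlier answer:
// Promise: exactly one document read is billed per get()
db.collection("cities").doc("SF").get().then((doc) => {
  console.log("one-time read:", doc.data());
});

// Listener: one read when attached, plus one per change while active
const unsubscribe = db.collection("cities").doc("SF")
  .onSnapshot((doc) => {
    console.log("live update:", doc.data());
  });

// Detach later so further changes stop incurring reads
unsubscribe();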
The whole point of a listener is that you don't want it to constantly check the database, like on an interval. Instead, a Firestore listener opens a websocket connection which, in contrast to HTTP, allows two-way communication: both hosts can act as both server and client. This means that instead of you constantly checking the database, the server tells you when there are updates, after which your snapshot callback code is run.
So as others have stated, attaching a listener consumes one read per document when the listener is attached, plus one per document update for as long as the listener is active.

Streams with initial state

I would like to expose something like a subscription or a "sticky query": the goal is to query DynamoDB and return the results via the WebSockets API in API Gateway. Then, whenever DynamoDB changes in a way that affects the query (I guess I could use Streams for that), I would like to notify the client(s). How can I make sure the client gets the initial list and all updates? I would like to make sure the client doesn't miss any updates right after the subscription is created and before the initial list of results is returned to it...
To inform your clients about changes in your DynamoDB table, DynamoDB Streams can be used. However, the information is only available for 24 hours.
Even if you write your updates to a Kinesis stream, the information will be available for a maximum of 7 days (according to the FAQ).
I suggest splitting your use case in two, as sketched below:
Create a service where you return the initial state of your "sticky query"
Create a stream which will notify your clients about updates to the "sticky query"
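A minimal sketch of how the two parts could fit together on the subscribe path, with hypothetical helpers (registerSubscription, sendToClient) and query details. Registering the subscription before running the initial query means stream events that arrive in the meantime can be buffered and replayed, so the client misses nothing:
const AWS = require('aws-sdk');
const dynamo = new AWS.DynamoDB.DocumentClient();

async function handleSubscribe(connectionId, queryParams) {
  // 1. Record the subscription first, so stream-triggered notifications
  //    for this query are routed (or buffered) for this connection
  await registerSubscription(connectionId, queryParams);

  // 2. Fetch the initial state of the "sticky query" from DynamoDB
  const initial = await dynamo.query(queryParams).promise();

  // 3. Send the initial list; buffered stream events are delivered
  //    afterwards, which is safe if updates are idempotent or versioned
  await sendToClient(connectionId, { type: 'initial', items: initial.Items });
}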

Firebase real time database transaction while offline

I am using the react-native-firebase package in a React Native application and am trying to understand how transactions work offline. I am trying to write a transaction using the following code
firebase.database().ref('locations').transaction(locations => {
  // ... my location modification logic
  return locations;
})
However, if I go offline before writing the transaction and have not accessed the reference previously and therefore have no cached data, locations is null.
There is this small tidbit in Firebase's official documentation
Note: Because your update function is called multiple times, it must be able to handle null data. Even if there is existing data in your remote database, it may not be locally cached when the transaction function is run, resulting in null for the initial value.
Which leads me to believe I should wrap the entire transaction logic inside
if (locations) {
  // ... my location modification logic
}
But I still don't fully understand this. Is the following assumption correct?
1. Submit transaction
2. If offline and cached data exists, apply the transaction against the cached data, then apply it to the current data in the remote database when connectivity resumes
3. If offline and no cached data exists, do not apply the transaction; once connectivity resumes, apply it to the current data in the remote database
4. If online, immediately apply the transaction
If these assumptions are correct, then the user will not immediately see their change in case #3, but in case #2 the transaction will 'optimistically' update their cached data and the user will feel like their action took place immediately. Is this how offline transactions work? What am I missing?
Firebase Realtime Database (and Firestore) don't support offline transactions at all. This is because a transaction absolutely must round-trip with the server at least once in order to safely commit the changes to the data, while also avoiding collisions with other clients that could be trying to change the same data.
If you're wondering why the SDK doesn't just persist the callback that handles the transaction, all that can be said is that persisting an instance of an object (and all of its dependent state, such as the values of all variables in scope) is actually very difficult, and not even possible in all environments. So you can expect that transactions only work while the client app is online and able to communicate with the server.
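As for the null handling the documentation quote describes, a minimal sketch of the usual defensive pattern (the modification logic is elided, as in the question):
firebase.database().ref('locations').transaction(locations => {
  if (locations === null) {
    // No locally cached value yet: returning it unchanged lets the server
    // reject the first attempt and re-run this function with real data
    return locations;
  }
  // ... my location modification logic
  return locations;
});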

Does Meteor Merge Box reuse documents when a subscription changes its arguments

If a subscription is rerun with the "same arguments" in a flush cycle, it reuses the observer on the server and the data in Minimongo:
If the subscription is run with the same arguments then the “new” subscription discovers the old “marked for destruction” subscription that’s sitting around, with the same data already ready, and simply reuses that. - Meteor Guide
Additionally, if two subscriptions both request the same document Merge Box will ensure the data is not sent multiple times across DDP.
Furthermore, if a subscription is marked for destruction and rerun with different arguments, the observer cannot be reused. My question is: if there are documents shared between the old and new subscription in the same flush cycle, will the overlapping documents be intelligently recycled on the client, or will they be sent over the wire a second time?
[Assume there are no other subscriptions that share this data.]
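For concreteness, a minimal sketch of the scenario with a hypothetical posts publication, where changing a reactive argument reruns the subscription within one flush cycle:
Tracker.autorun(() => {
  // Changing selectedTag tears down the old subscription and starts a
  // new one; the question is whether overlapping documents are resent
  Meteor.subscribe('posts', Session.get('selectedTag'));
});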
I believe the data will be reused, but I need to double-check.
