General questions about data transfer in Realtime Database - firebase

When I say "transfer" I am referring only to billable transferred data, which I believe is downloaded data only.
Let's say you're listening to ref('/posts/').onValue() and then you call ref('/posts').once(). Does Realtime Database know to transfer no data across the network, because the client already has the most up-to-date version of the data?
Now let's say you're listening to ref('/posts/').onValue() and a new update is found. Does Realtime Database transfer the delta or the entire document?
Now let's say you are doing a filter like ref('/posts/').orderByChild('timestamp').limitToLast(10) and your /posts/ ref has 500 entries. Does Realtime Database transfer 10 children to the client or all 500?
Now let's say you register thousands of listeners with Realtime Database. Are you billed for setting and removing listeners?
Note: I'm not sure if this should be multiple questions or if this shouldn't be posted on Stack Overflow. Such is the uncertainty of a stack newbie.

If you attach multiple listeners to a location at the same time, the data for that location will only be transferred once.
When there is an update to a part of a larger node that you listen to, Firebase tries to send only the delta over the wire. The exact size of the data sent depends on the total size of the node and the size of the update under it.
If you have an index on the queried property, only the query results will be transferred. If you don't have an index on the property, the Firebase client will log a warning, transfer all data at the location, and filter client-side.
Note that there are tools to learn about these things:
Use the Firebase Database profiler to learn more about read/write speed, bandwidth, and unindexed queries.
Enable debug logging or check the network tab of your browser to see the exact wire traffic between the client and the database.
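To make the indexing point concrete, here is a minimal sketch of the rules fragment that would index the question's /posts node on its timestamp child. The path and property name are taken from the question; everything else in a real rules file (read and write rules in particular) is left out and would need to be merged in.

```javascript
// Sketch: Realtime Database rules that index /posts by the "timestamp" child.
// Only the index is shown; existing ".read"/".write" rules are not included.
const rules = {
  rules: {
    posts: {
      // Lets orderByChild('timestamp') be evaluated on the server, so only
      // the matching children are sent to the client.
      ".indexOn": ["timestamp"],
    },
  },
};

// Print JSON that can be merged into the project's database rules.
const rulesJson = JSON.stringify(rules, null, 2);
console.log(rulesJson);
```

With this index in place, the limitToLast(10) query from the question should transfer roughly 10 children rather than all 500; the profiler mentioned above is the way to confirm that.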

Related

Using Realtime Database and Firestore together

I want to use Firestore in my app because its scaling limit is 1 million concurrent connections. I have found the pricing to be quite high, especially compared with Realtime Database, but I cannot use Realtime Database because it only scales to around 200k.
I was wondering whether I could use Firestore, accessed directly from the client, for the data that needs live document listeners, and use Realtime Database to store larger chunks of data that are queried indirectly through Cloud Functions.
My question is:
If the only way to read/write Realtime Database is through a Cloud Function called by the client, will this count as only 1 concurrent connection, since the client is not directly connected to the database?
Thank you
You wrote: "but cannot use [Realtime Database] as it only scales to around 200k."
Keep in mind that this limit is per database instance. On a paid project, you can create additional database instances to scale much further (even beyond the 1 million concurrent connections that Firestore supports), as long as you are able and willing to define how to distribute your users over the database instances (commonly referred to as a "sharding strategy").
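One possible sharding strategy is to hash each user ID to one of the database instances. The sketch below is only an illustration under stated assumptions: the instance URLs are hypothetical placeholders, and the hash is an FNV-1a style function chosen because it is short and deterministic, not because Firebase prescribes it.

```javascript
// Hypothetical database instance URLs; real ones are created per project.
const SHARD_URLS = [
  "https://example-app-shard-0.firebaseio.com",
  "https://example-app-shard-1.firebaseio.com",
  "https://example-app-shard-2.firebaseio.com",
];

// FNV-1a style 32-bit hash over the string's UTF-16 code units.
// Deterministic, so the same user ID always produces the same number.
function hashString(value) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < value.length; i++) {
    hash ^= value.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep as unsigned 32-bit
  }
  return hash;
}

// Pick the database instance for a user by hashing the user ID.
function shardUrlForUser(uid, shardUrls = SHARD_URLS) {
  if (shardUrls.length === 0) {
    throw new Error("at least one shard URL is required");
  }
  return shardUrls[hashString(uid) % shardUrls.length];
}

console.log(shardUrlForUser("user-123"));
```

Note that simple modulo hashing remaps most users when the number of instances changes, so a real deployment might prefer a stored user-to-instance mapping instead.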
On your actual question: each Cloud Functions instance counts as a single connection to the database. Keep in mind here that Cloud Functions auto-scale, so you will have as many connections from Cloud Functions as you have concurrently running Cloud Functions instances. So while it may well be more than a single connection, it is extremely unlikely you'll reach the limit of 200K connections through this means.

Offline access to Firestore sub-collections

Looking for advice on my data structure in Firebase.
My app: Plant care reminders
I'm thinking the basic data structure can look something like this.
So the user can have many plants, and for each plant it can have many tasks.
I believe I would have a collection of users top level in Firestore, then each userData document would have a sub-collection of plants. Subsequently each plant would have a sub-collection of tasks.
The app will display all the user's plants on one screen; the user can then click on a plant and view its tasks.
I would like the ability for the user to go offline for a period and still be able to access everything.
Is it wise to do one big query to retrieve all the data when the app loads, to make sure that Firestore has all their sub-collections cached if they go offline?
Or is it better to do a query on load to get the user's sub-collection of plants so they can see what they have, and then, when they click on a plant, do another query to get that plant's sub-collection of tasks?
If a user can see a plant, then goes offline and clicks that plant, is it possible to query the plant's sub-collection of tasks without a network connection?
Apologies for poor explanation, trying to wrap my head round offline data persistence with Firestore and nested sub-collections when Firestore does shallow queries.
Firestore's disk persistence functions as a cache, maintaining data the app has recently loaded, and all local write operations that haven't been synchronized to the server yet.
You wrote: "I would like the ability for the user to go offline for a period and still be able to access everything."
This is inherently not a great match for how the Firestore disk cache works. To make it work in your use-case, you'd need to make sure to read all data up front, which will drive up both read operations and bandwidth consumption, and will also make the local cache run more slowly than needed.
If you need a fully local database instead of a cache of recently read and locally modified data, Firestore might not be the best fit for this use-case. Consider using your own local database instead.

Does Firebase have a way to limit access to all public data in the security rules?

Update: Editing the question title/body based on the suggestion.
Firebase store makes everything that is publicly readable also publicly accessible to the browser with a script, so nothing stops any user from just saying db.get('collection') and saving all the data as theirs.
In a more traditional database setup, where an app's frontend pulls data from a backend, the user would at least have to go through the extra trouble of tweaking the UI and then scraping the frontend to pull more and more data (think of Twitter's "load more" button).
My question was whether it was possible to limit users from accessing the entire database in a click, while also keeping the data publicly available.
Old:
From what I understand, any user who can see data coming out of a Firebase datastore can also run a query to extract all of that data. That is not desirable when data itself is of any value, and yet Firebase is such an easy to use tool, it's great for pretty much everything else.
Is there a way, or a best practice, for structuring the data or access rules so that users can see the data but can't just run a script to download all of it?
Thanks!
Kato once implemented a simplistic rate limit for writes in Realtime Database security rules (see "Firebase rate limiting in security rules?"). Something similar could be possible in Cloud Firestore rules. But this approach won't work for reads, since you can't update the timestamp at the same time the read is performed.
You can however limit what queries a user can perform on your database. For example, to limit them to reading 50 documents at a time:
allow list: if request.query.limit <= 50;

Does Firebase Realtime Database count viewing data in the Developer Console as downloaded data?

I just cannot figure out where the megabytes of downloaded data from Firebase Realtime Database come from, when I'm only requesting a specific value at a particular path, 10-20 characters in size. The values arrive correctly. There were no more than a hundred such requests.
The code that requests the value:
Firebase ref = new Firebase("https://XXXXXXXX.firebaseio.com/");
ref.child("city").child("street").addValueEventListener(new ValueEventListener() {
...
String street = snapshot.getValue().toString();
Perhaps Firebase Realtime Database counts viewing this data in the Developer Console as downloaded data?
From: https://firebase.google.com/docs/database/usage/billing
Outbound traffic includes connection and encryption overhead from all database operations and data downloaded through database reads. Both database reads and writes can lead to connection costs on your bill. All traffic to and from your database, including operations denied by security rules, leads to billable costs.
And:
Firebase console data: Although this isn't usually a significant portion of Realtime Database costs, Firebase charges for data that you read and write from the Firebase console
And:
Protocol overhead: Some additional traffic between the server and clients is necessary to establish and maintain a session. Depending on the underlying protocol, this traffic might include: Firebase Realtime Database's realtime protocol overhead, WebSocket overhead, and HTTP header overhead. Each time a connection is established, this overhead, combined with any SSL encryption overhead, contributes to the connection costs. Although this isn't a lot of bandwidth for a single request, it can be a substantial part of your bill if your payloads are tiny or you make frequent, short connections.
I think my downloaded data comes from many small writes to the database and the overhead associated with them. I have an IoT application, and now I am not sure if Firebase is the right choice for it.
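The effect described in the documentation quotes can be illustrated with some rough arithmetic. The sketch below is not a billing calculator: every byte count passed in is a hypothetical placeholder rather than a measured Firebase figure, and the function name and parameters are made up for illustration. It only shows how per-message and per-connection overhead can dwarf tiny payloads.

```javascript
// Rough estimate of payload vs. overhead traffic.
// All inputs are assumptions supplied by the caller, not Firebase constants.
function estimateTraffic({
  payloadBytes,               // size of one value
  messages,                   // number of reads or writes
  perMessageOverheadBytes,    // assumed protocol framing per message
  connections,                // number of connections opened
  perConnectionOverheadBytes, // assumed handshake + TLS cost per connection
}) {
  const payloadTotal = payloadBytes * messages;
  const overheadTotal =
    perMessageOverheadBytes * messages +
    perConnectionOverheadBytes * connections;
  const total = payloadTotal + overheadTotal;
  return {
    payloadTotal,
    overheadTotal,
    total,
    // Fraction of all traffic that is overhead rather than data.
    overheadShare: total === 0 ? 0 : overheadTotal / total,
  };
}

// Hypothetical scenario: 100 requests for a 20-byte value, each on a
// fresh connection, with made-up overhead figures.
const estimate = estimateTraffic({
  payloadBytes: 20,
  messages: 100,
  perMessageOverheadBytes: 100,
  connections: 100,
  perConnectionOverheadBytes: 4000,
});
console.log(estimate);
```

Under these assumed numbers almost all of the traffic is overhead, which is why keeping one long-lived connection and batching small writes tends to matter more than shrinking the payload itself. Real figures should come from the database profiler or the usage tab.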

Firebase multi location updates "cost"

I've been wondering how costly the multi location updates are?
I am working with many different nodes so I will not have to read big data when I need a report for example.
Assuming I will have to update about 12 different locations (nodes) whenever I add an item to my database, will it use too many resources, or is it designed for such cases?
Firebase Realtime Database uses WebSockets to exchange information between the client and the database. Once the connection is established, it sends and receives all information over that connection. Of course everything has its limits, but I think 12 or even 120 updates will not be a problem.
There is no specific maximum of what you can do in a multi-location update.
If you have a problem with a specific multi-location update, post the minimum code that reproduces the problem and we can have a look.
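For reference, a multi-location update is a single object whose keys are full paths, which is then written in one call. The sketch below only builds that object; the node names (/items, /userItems, /itemsByCategory) and the helper name are hypothetical and stand in for the 12 locations mentioned in the question.

```javascript
// Build one update object whose keys are full database paths.
// Node names here are illustrative assumptions, not a required layout.
function buildFanOut(itemId, item, ownerId) {
  return {
    [`/items/${itemId}`]: item,
    [`/userItems/${ownerId}/${itemId}`]: true,
    [`/itemsByCategory/${item.category}/${itemId}`]: true,
  };
}

const updates = buildFanOut(
  "item1",
  { name: "Widget", category: "tools" },
  "user1"
);
console.log(Object.keys(updates));

// With the namespaced JavaScript SDK, the object would then be written with
// firebase.database().ref().update(updates), which applies all paths
// atomically: either every location is written or none is.
```

Because all paths travel in one message over the existing connection, adding a few more locations mainly increases the payload size rather than the number of round trips.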
