Firebase social media app: saving groups of users - firebase

In a Firebase social-media-esque app, here is how I'm storing groups along with its members:
Here, I have a reciprocating relationship in terms of memberships of groups. A group knows all of the users that are a part of it in group-users and vise-versa—a user knows all of the groups it's in via user-groups. The reason I thought to do this is it seems to be quick for a user to access all of its groups, and likewise it seems to be quick for a group to access all of its users. However, having this reciprocating data is costly in terms of space. My other option would be to keep just the user-groups node, and to retrieve all the members of a single group I would need to query that node by group ID. This option seems slower but would take less space in the database. Which option is more advised? Thanks.

The best data model always depends on the use-cases of your app. If you need access to both the groups-for-a-user and the users-for-a-group, the cheapest way to do so is to store both.
See:
Many to Many relationship in Firebase
the answer to this question in a video
this article about NoSQL data modeling
the Firebase video series Firebase for SQL developers

Related

How to structure Firestore Security Rules & Data Structure for granular access

I am building a community-type app based on Firestore where users should have granual control over what kind of information they share with whom.
Users can have properties such as name, birthdate, etc. and for each of them they can decide to share it with the one of the following groups/roles:
Private
Contacts
Admin (Admins of organizations that user is a member of)
Organization (Members of organizations that a users is a member of)
Public (All users of the app)
As documents in Firestore will always be retrieved as a whole, I already know that I somehow will have to segregate my user properties by access level.
I've got two approaches so far:
Approach 1
Store each user property in a separate document that contains a field access level
Store some metadata in, for example /user/12345/meta/roles, so that I can point the security rules to those documents to validate access
Benefits:
Easy structure
Flexibly
(Almost) no data duplication
Drawbacks:
Lots of document reads for getting a user's profile
Approach 2
Store user profile in, for example /user/12345/profile/private and duplicate the public information into /user/12345/profile/public, and do the same for each access level
Benefits:
Reduced document reads
Drawbacks:
Complexity
It feels wrong to duplicate that much data
Does anyone have any experience with this and any suggestions or alternative approaches they can share?
Follow-up question:
Let’s say I store the list of members of an organization in a subcollection, that is only accessible for members of the organization (for privacy reasons). Doesn’t that mean that when querying that list of members from client side, I have to do it „blindly“, meaning I can’t know if the user can access that document until I actually try? The fact that the query might fail would tell me that the user is not actually a member of that organization.
Would you consider this kind of query that is set up for failure bad practice? Are there any alternatives that still allow to keep the memberlist private?
I think you are moving from a SQL environment to NoSql now which is why you are finding the Approach 2 as not the right way to proceed.
Actually approach 2 is the right way to proceed there are couple of advantages
1.) Reduced Document Reads - More cost savings. Firestore charges by number of reads and writes if you are reducing no of reads and writes optimally its always the way to go for. Also the cost of storage due is increased reads will always be less than the actual cost of reads if you are scaling up your application.
2.) In NoSql database your are allowed to duplicate data provided it is going to increase the read / search speed from the database.
I am not seeing the second approach as complex because that's the tradeoff you are making when Choosing a NoSql over Sql

Firebase Realtime Database vs Cloud Firestore

Edit: After posting the question I thought I could also make this post a quick reference for those of you needs a quick peek at some of the differences between these two technologies which might help you decide on one of them eventually. I will be editing this question and adding more info as I learn more.
I have decided to use firebase for the backend of my project. For firestore is says "the next generation of the realtime database". Now I am trying to decide which way to go. Realtime database or cloud firestore?
Billing:
At a first glance, it looks like firestore charges per number of results returned, number of reads, number of writes/updates etc. Real-time database charges based on the data transmitted. The number of read-write operations is irrelevant. They both also charge on the data stored on the google servers too (I think in this respect firestore is cheaper one). Why am I mentioning this price point? Because from my point of view, although it might a lower weight, it is also a point to consider while choosing the one over the other.
Scaling:
Cloudstore seems to scale horizontally seamlessly. I think this is not possible with the real-time database.
Edit:
In the real-time database, you need to shard your data yourself using multiple databases. And you can only do this if you are in BLAZE pracing plan.
ref: https://firebase.google.com/docs/database/usage/sharding
Performance & Indexing:
Another thing is the real-time database data structure is different in both. The real-time database stores the data as a JSON object in any way we structure them. Firestore structures the data as collections and documents. And hence the querying also changes between the two.
I think firestore does auto indexing which increases the read performance greatly too (which will decrease read performance). I am not sure if this is also the case with the real-time database.
Edit:
The real-time database does not automatically index your data. You need to do it yourself after a solid inspection of your data and your needs.
ref:https://firebase.google.com/docs/database/security/indexing-data
What other differences can you think of?
What would be (or has been) your choice for different types of projects?
Do you still go with the real-time database or have you migrated from that to the firestore? If so why?
And one last thing. How would you compare the SDKs of these two?
Thanks a lot!
What other differences can you think of?
what i think, ok. I use realtime-database for 6 months experience and difference is, firestore easy for sorting data. As Example, i want to retrieving user name based timestamp.
Query firstQuery = firestore.collection("Names").orderBy("timestamp", Query.Direction.DESCENDING).limit(10); // load 10 names
What would be (or has been) your choice for different types of
projects?
For me, Realtime-Database for Data Streaming when i work with Arduino, i want to store Drone Speed.
And Firestore for SMART OFFICE, like Air Conditioner, or light-room and Enterprise like Inventory Quantities, etc.
Do you still go with the real-time database or have you migrated from
that to the firestore? If so why?
still go with real-time because i need TREE for displaying streaming data strucure instead of query TABLE like firestore.

Firestore database model for Notion-like modules [duplicate]

I have seen videos and read the documentation of Cloud firestore, from Google Firebase service, but I can't figure this out coming from realtime database.
I have this web app in mind in which I want to store my providers from different category of products. I want perform a search query through all my products to find what providers I have for such product, and eventually access that provider info.
I am planning to use this structure for this purpose:
Providers ( Collection )
Provider 1 ( Document )
Name
City
Categories
Provider 2
Name
City
Products ( Collection )
Product 1 ( Document )
Name
Description
Category
Provider ID
Product 2
Name
Description
Category
Provider ID
So my question is, is this approach the right way to access the provider info once I get the product I want?
I know this is possible in the realtime database, using the provider ID I could search for that provider in the providers section, but with Firestore I am not sure if its possible or if this is right approach.
What is the correct way to structure this kind of data in Firestore?
You need to know that there is no "perfect", "the best" or "the correct" solution for structuring a Cloud Firestore database. The best and correct solution is the solution that fits your needs and makes your job easier. Bear also in mind that there is also no single "correct data structure" in the world of NoSQL databases. All data is modeled to allow the use-cases that your app requires. This means that what works for one app, may be insufficient for another app. So there is not a correct solution for everyone. An effective structure for a NoSQL type database is entirely dependent on how you intend to query it.
The way you are structuring your data looks good to me. In general, there are two ways in which you can achieve the same thing. The first one would be to keep a reference of the provider in the product object (as you already do) or to copy the entire provider object within the product document. This last technique is called denormalization and is a quite common practice when it comes to Firebase. So we often duplicate data in NoSQL databases, to suit queries that may not be possible otherwise. For a better understanding, I recommend you see this video, Denormalization is normal with the Firebase Database. It's for Firebase Realtime Database but the same principles apply to Cloud Firestore.
Also, when you are duplicating data, there is one thing that needs to keep in mind. In the same way, you are adding data, you need to maintain it. In other words, if you want to update/delete a provider object, you need to do it in every place that it exists.
You might wonder now, which technique is best. In a very general sense, the best way in which you can store references or duplicate data in a NoSQL database is completely dependent on your project's requirements.
So you should ask yourself some questions about the data you want to duplicate or simply keep it as references:
Is the static or will it change over time?
If it does, do you need to update every duplicated instance of the data so they all stay in sync? This is what I have also mentioned earlier.
When it comes to Firestore, are you optimizing for performance or cost?
If your duplicated data needs to change and stay in sync in the same time, then you might have a hard time in the future keeping all those duplicates up to date. This will also might imply you spend a lot of money keeping all those documents fresh, as it will require a read and write for each document for each change. In this case, holding only references will be the winning variant.
In this kind of approach, you write very little duplicated data (pretty much just the Provider ID). So that means that your code for writing this data is going to be quite simple and quite fast. But when reading the data, you will need to load the data from both collections, which means an extra database call. This typically isn't a big performance issue for reasonable numbers of documents, but definitely does require more code and more API calls.
If you need your queries to be very fast, you may want to prefer to duplicate more data so that the client only has to read one document per item queried, rather than multiple documents. But you may also be able to depend on local client caches makes this cheaper, depending on the data the client has to read.
In this approach, you duplicate all data for a provider for each product document. This means that the code to write this data is more complex, and you're definitely storing more data, one more provider object for each product document. And you'll need to figure out if and how to keep up to date on each document. But on the other hand, reading a product document now gives you all information about the provider document in one read.
This is a common consideration in NoSQL databases: you'll often have to consider write performance and disk storage vs. reading performance and scalability.
For your choice of whether or not to duplicate some data, it is highly dependent on your data and its characteristics. You will have to think that through on a case-by-case basis.
So in the end, remember that both are valid approaches, and neither of them is pertinently better than the other. It all depends on what your use-cases are and how comfortable you are with this new technique of duplicating data. Data duplication is the key to faster reads, not just in Cloud Firestore or Firebase Realtime Database but in general. Any time you add the same data to a different location, you're duplicating data in favor of faster read performance. Unfortunately in return, you have a more complex update and higher storage/memory usage. But you need to note that extra calls in Firebase real-time database, are not expensive, in Firestore are. How much duplication data versus extra database calls is optimal for you, depends on your needs and your willingness to let go of the "Single Point of Definition mindset", which can be called very subjective.
After finishing a few Firebase projects, I find that my reading code gets drastically simpler if I duplicate data. But of course, the writing code gets more complex at the same time. It's a trade-off between these two and your needs that determines the optimal solution for your app. Furthermore, to be even more precise you can also measure what is happening in your app using the existing tools and decide accordingly. I know that is not a concrete recommendation but that's software development. Everything is about measuring things.
Remember also, that some database structures are easier to be protected with some security rules. So try to find a schema that can be easily secured using Cloud Firestore Security Rules.
Please also take a look at my answer from this post where I have explained more about collections, maps and arrays in Firestore.

Should I use redundancy or a simple query on a large dataset with Firebase Cloud Firestore database?

I have a collection, itemsCollection, which contains a very large amount of small itemDocs. Each itemDoc has a subcollection, statistics. Each itemDoc also has a field "owner" which indicates which user owns the itemDoc.
itemsCollection
itemDoc1
statistics
itemDoc2
statistics
itemDoc3
statistics
itemDoc4
statistics
...
I also have a collection, usersCollection, which contains basic user info.
usersCollection
user1
user2
user3
...
Since each itemDoc belongs to a specific user, it's necessary to display to each user which itemDocs they own. I have been using the query:
db.collection("itemsCollection").where("owner", "==", "user1")
I am wondering if this will scale effectively, i.e. whenever itemsCollection gets to be millions of records? If not, is the best solution to duplicate each itemDoc and its statistics subcollection as a subcollection in the user document, or should I be doing something else?
As Alex Dufter, the product manager from Firebase, explained in one of days at Firebase Dev Summit 2017 that Firestore was inspired in many ways by the feed-back that they had on the Firebase Realtime Database over the years. They faced two types of issues:
Data modelling and querying. Firebase Realtime Database cannot query over multiple properties because it ussaly involves duplication data or cliend-side filtering, which we all already know that is some kind of messy.
Realtime Database does not scale automatically.
With this new product, they say that you can now build an app and grow it to planetary scale without changing a single line of code. Cloud Firestore is also a NoSQL database that was build specifically for mobile and web app development. It's flexible to build all kinds of apps and scalable to grow to any size.
So because the new database was build knowing this iusses, duplication data is not nedeed anymore. So you will not have to worry about using that line of code, even if your data will grow to millions of records, it will scale automatically. But one thing you need to remember, if you will use multiple conditions, don't forget to use the indexes by simply adding them in the Firebase console. Here are two simple examples from the offical documentation:
citiesRef.whereEqualTo("state", "CO").whereEqualTo("name", "Denver");
citiesRef.whereEqualTo("state", "CA").whereLessThan("population", 1000000);

Can Firebase Realtime Database effectively loop through billions of posts and retrieve them by the users that posted them?

I am developing an iOS app with Firebase Realtime Database. The app will potentially have billions of posts with a number of images and data that needs to be retrieved based on the people a specific user follows (something like Instagram).
I understand that the best practice in Firebase is to structure data as flat as possible which would mean having a "Posts" node with potentially billion of entries, which I would then filter by a kind of 'posted_by' parameter. This begs two questions:
1) Will I be able to retrieve said posts with a query that returns posts by any of the users I follow? (By passing something like an array of the users I follow)
2) Will Firebase be effective enough to loop through potentially billions of posts to find the ones that match my criteria, or is there otherwise a better way to structure data so as to make the app as optimal as possible?
Thanks in advance for the answers.
Billions of entries are no problem.
You should check if Firebase is the most cost efficient solution if you have huge volume of data.
1) Firebase can do that, but you probably don't want the user to wait for all entries (when there are a lot for a single user), but instead request them "page" by "page" and only request more pages on demand when the user scrolls up/down.
2) If you ensure you have an index on the user id, then it doesn't have to go through each one individually. Searching by index is efficient.

Resources