Absurd Firestore Collections and Security Rules inheritance - firebase

I have Firestore Collections structure like this
../cards/{cardId}/data/{dataId}
to safely read data I need this call on Firestore Security Rules
get(/databases/$(database)/documents/cards/$(cardId)).data
then compare of card fields. It's basically 2 reads everytime I do this.
yet if I change my structure by making it all parent like this (but also has to make some similar fields on both model).
../cards/{cardId}
../data/{dataId}
It does need only 1 read. But I need 2 writes each time because of changes on both similar fields. Write call does less than reads which makes it cheaper, but this is annoying to code. And this makes Firestore inheritance useless.
I mean, can Firestore just have ability to read parent fields too with no cost? At least for Security Rules. Firestore basically is making index for each field right? So, can it just also understand the meaning of inheritance which is make the child know/have parent fields too?. Or this is just the limitation of NoSQL? It's really annoying every time I runs into this.

can Firestore just have ability to read parent fields too with no cost?
Stack Overflow isn't the right place to ask a question like this. But I'll speak confidently and say that, no, Firestore can't have any free writes (outside of the free monthly allowance), else it will not be a service that can sustain itself from the money it receives in revenue.
If you have feedback to send to the Firebase team, contact Firebase support directly. But I suspect that asking to relax the billing requirements of the system is not going to be a request they will entertain.
can it just also understand the meaning of inheritance which is make the child know/have parent fields too?.
What you're describing is not really "inheritance". It's simply nesting. Collections can be nested under documents within other collections. The relationship between those documents is only in the path prefix that they have in common. Other than that, each document stands fully on its own without any ties to any other documents in the system.
Or this is just the limitation of NoSQL?
It has nothing really to do with NoSQL. The relationship between collections and subcollections is just a way that you can organize data in the system. You choose the method of organization that best suits the queries (and security requirements, when using security rules) for your app. Whether that organization is nested or not, it's up to you.
But there is no way to organize your data to get free reads. Each document read always costs 1 read, no matter how it's organized.

Related

How to structure Firestore Security Rules & Data Structure for granular access

I am building a community-type app based on Firestore where users should have granual control over what kind of information they share with whom.
Users can have properties such as name, birthdate, etc. and for each of them they can decide to share it with the one of the following groups/roles:
Private
Contacts
Admin (Admins of organizations that user is a member of)
Organization (Members of organizations that a users is a member of)
Public (All users of the app)
As documents in Firestore will always be retrieved as a whole, I already know that I somehow will have to segregate my user properties by access level.
I've got two approaches so far:
Approach 1
Store each user property in a separate document that contains a field access level
Store some metadata in, for example /user/12345/meta/roles, so that I can point the security rules to those documents to validate access
Benefits:
Easy structure
Flexibly
(Almost) no data duplication
Drawbacks:
Lots of document reads for getting a user's profile
Approach 2
Store user profile in, for example /user/12345/profile/private and duplicate the public information into /user/12345/profile/public, and do the same for each access level
Benefits:
Reduced document reads
Drawbacks:
Complexity
It feels wrong to duplicate that much data
Does anyone have any experience with this and any suggestions or alternative approaches they can share?
Follow-up question:
Let’s say I store the list of members of an organization in a subcollection, that is only accessible for members of the organization (for privacy reasons). Doesn’t that mean that when querying that list of members from client side, I have to do it „blindly“, meaning I can’t know if the user can access that document until I actually try? The fact that the query might fail would tell me that the user is not actually a member of that organization.
Would you consider this kind of query that is set up for failure bad practice? Are there any alternatives that still allow to keep the memberlist private?
I think you are moving from a SQL environment to NoSql now which is why you are finding the Approach 2 as not the right way to proceed.
Actually approach 2 is the right way to proceed there are couple of advantages
1.) Reduced Document Reads - More cost savings. Firestore charges by number of reads and writes if you are reducing no of reads and writes optimally its always the way to go for. Also the cost of storage due is increased reads will always be less than the actual cost of reads if you are scaling up your application.
2.) In NoSql database your are allowed to duplicate data provided it is going to increase the read / search speed from the database.
I am not seeing the second approach as complex because that's the tradeoff you are making when Choosing a NoSql over Sql

Best way to save multiple collections under one user UID

I am writing an app where there is not a lot of interaction with other users. Set and retrieve your own data only.
In Firebase Firestore how could I model this so that everything fits under a users UID?
Something that would look like this?
users/{uid}/user/
users/{uid}/settings/
users/{uid}/weather/
If I want to achieve something like this, then I need to create another UID:
users/{uid}/user/{uid}/{userInfo}
This feels a bit off to me.
Is this wrong? Would it be better if I moved every subcollection into its own collection?
Is this faster / more efficient?
Any help is appreciated!
The most common approaches for me:
Store the profile information, settings and weather in the user document (your {uid}) itself. This most common for the profile information, but it's always worth considering for other types too: do they really need to be in their own documents?
Have a default name for a single subcollection for each user, and then have each information type as a document with a known name in there. So /users/$uid/documents/profile, /users/$uid/documents/settings, and /users/$uid/documents/weather. So now each information type is in a separate document, meaning you can for example secure access to them individually.
If the information for a certain type is repeated, I'd put that in documents in a known/named subcollection. So if there are many weathers, you'd get /users/$uid/weather/$weatherdocs. So with this you can now have an endless set of the specific type of information.
Neither of these is pertinently better/worse, as it all depends on the use-cases of your app.
There will be performance differences between these approaches, as they require a different number of network requests. If this is a concern for your app, I'd recommend testing all approaches above to measure their relative performance against your requirements.

Using Firestore document's auto-generated ID versus using a custom ID

I'm currently deciding on my Firestore data structure.
I'll need a products collection, and the products items will live inside of it as documents.
Here are my product's fields:
uniqueKey: string
description: array of strings
images: array of objects
price: number
QUESTION
Should I use Firestore auto-generated ID's to be the ID of my documents, or is it better to use my uniqueKey (which I'll query for in many occasions) as the document ID? Is there a best option between the 2?
I imagine that if I use my uniqueKey, it will make my life easier when retrieving a single document, but I'll have to query for more than 1 product on many occasions too.
Using my uniqueKey as ID:
db.collection("products").doc("myUniqueKey").get();
Using my Firestore auto-generated ID:
db.collection("products").where("uniqueKey", "==", "myUniqueKey").get();
Is this enough of a reason to go with my uniqueKey instead of the auto-generated one? Is there a rule of thumb here? What's the best practice in this case?
In terms of making queries from a client, using only the information you've given in the question, I don't see that there's much practical difference between a document get using its known ID, or a query on a field that is also unique. Either way, an index is used on the server side, and it costs exactly 1 document read. The document get() might be marginally faster, but it's not worthwhile to optimize like this (in my opinion).
When making decision about data modeling like this, it's more important to think about things like system behavior under load and security rules.
If you're reading and writing a lot of documents whose IDs have a sequential property, you could run into hotspotting on those writes. So, if you want to use your own ID, and you expect to be reading and writing them in that sequence under heavy load, you could have a problem. If you don't anticipate this to be the situation, then it likely doesn't matter too much whose ID you use.
If you are going to use security rules to limit access to documents, and you use the contents of other documents to help with that, you'll need to be able to uniquely identify those documents in your rule. You can't perform a query against a collection in rules, so you might need meaningful IDs that will give direct access when used by rules. If your own IDs can be used easily this way in security rules, that might be more convenient overall. If you're force to used Firestore's generated IDs, it might become inconvenient, difficult, or expensive to try to maintain a relationship between your IDs and Firestore's IDs.
In any event, the decision you're making is not just about which ID is "better" in a general sense, but which ID is better for your specific, anticipated situation, under load, with security in mind.

Firestore database model for Notion-like modules [duplicate]

I have seen videos and read the documentation of Cloud firestore, from Google Firebase service, but I can't figure this out coming from realtime database.
I have this web app in mind in which I want to store my providers from different category of products. I want perform a search query through all my products to find what providers I have for such product, and eventually access that provider info.
I am planning to use this structure for this purpose:
Providers ( Collection )
Provider 1 ( Document )
Name
City
Categories
Provider 2
Name
City
Products ( Collection )
Product 1 ( Document )
Name
Description
Category
Provider ID
Product 2
Name
Description
Category
Provider ID
So my question is, is this approach the right way to access the provider info once I get the product I want?
I know this is possible in the realtime database, using the provider ID I could search for that provider in the providers section, but with Firestore I am not sure if its possible or if this is right approach.
What is the correct way to structure this kind of data in Firestore?
You need to know that there is no "perfect", "the best" or "the correct" solution for structuring a Cloud Firestore database. The best and correct solution is the solution that fits your needs and makes your job easier. Bear also in mind that there is also no single "correct data structure" in the world of NoSQL databases. All data is modeled to allow the use-cases that your app requires. This means that what works for one app, may be insufficient for another app. So there is not a correct solution for everyone. An effective structure for a NoSQL type database is entirely dependent on how you intend to query it.
The way you are structuring your data looks good to me. In general, there are two ways in which you can achieve the same thing. The first one would be to keep a reference of the provider in the product object (as you already do) or to copy the entire provider object within the product document. This last technique is called denormalization and is a quite common practice when it comes to Firebase. So we often duplicate data in NoSQL databases, to suit queries that may not be possible otherwise. For a better understanding, I recommend you see this video, Denormalization is normal with the Firebase Database. It's for Firebase Realtime Database but the same principles apply to Cloud Firestore.
Also, when you are duplicating data, there is one thing that needs to keep in mind. In the same way, you are adding data, you need to maintain it. In other words, if you want to update/delete a provider object, you need to do it in every place that it exists.
You might wonder now, which technique is best. In a very general sense, the best way in which you can store references or duplicate data in a NoSQL database is completely dependent on your project's requirements.
So you should ask yourself some questions about the data you want to duplicate or simply keep it as references:
Is the static or will it change over time?
If it does, do you need to update every duplicated instance of the data so they all stay in sync? This is what I have also mentioned earlier.
When it comes to Firestore, are you optimizing for performance or cost?
If your duplicated data needs to change and stay in sync in the same time, then you might have a hard time in the future keeping all those duplicates up to date. This will also might imply you spend a lot of money keeping all those documents fresh, as it will require a read and write for each document for each change. In this case, holding only references will be the winning variant.
In this kind of approach, you write very little duplicated data (pretty much just the Provider ID). So that means that your code for writing this data is going to be quite simple and quite fast. But when reading the data, you will need to load the data from both collections, which means an extra database call. This typically isn't a big performance issue for reasonable numbers of documents, but definitely does require more code and more API calls.
If you need your queries to be very fast, you may want to prefer to duplicate more data so that the client only has to read one document per item queried, rather than multiple documents. But you may also be able to depend on local client caches makes this cheaper, depending on the data the client has to read.
In this approach, you duplicate all data for a provider for each product document. This means that the code to write this data is more complex, and you're definitely storing more data, one more provider object for each product document. And you'll need to figure out if and how to keep up to date on each document. But on the other hand, reading a product document now gives you all information about the provider document in one read.
This is a common consideration in NoSQL databases: you'll often have to consider write performance and disk storage vs. reading performance and scalability.
For your choice of whether or not to duplicate some data, it is highly dependent on your data and its characteristics. You will have to think that through on a case-by-case basis.
So in the end, remember that both are valid approaches, and neither of them is pertinently better than the other. It all depends on what your use-cases are and how comfortable you are with this new technique of duplicating data. Data duplication is the key to faster reads, not just in Cloud Firestore or Firebase Realtime Database but in general. Any time you add the same data to a different location, you're duplicating data in favor of faster read performance. Unfortunately in return, you have a more complex update and higher storage/memory usage. But you need to note that extra calls in Firebase real-time database, are not expensive, in Firestore are. How much duplication data versus extra database calls is optimal for you, depends on your needs and your willingness to let go of the "Single Point of Definition mindset", which can be called very subjective.
After finishing a few Firebase projects, I find that my reading code gets drastically simpler if I duplicate data. But of course, the writing code gets more complex at the same time. It's a trade-off between these two and your needs that determines the optimal solution for your app. Furthermore, to be even more precise you can also measure what is happening in your app using the existing tools and decide accordingly. I know that is not a concrete recommendation but that's software development. Everything is about measuring things.
Remember also, that some database structures are easier to be protected with some security rules. So try to find a schema that can be easily secured using Cloud Firestore Security Rules.
Please also take a look at my answer from this post where I have explained more about collections, maps and arrays in Firestore.

Creating Empty Documents in Firestore

I understand that empty documents within collections are removed automatically by the system in Firestore. However, I have a situation now where the name of the document serves a purpose. I have a collection named usernames, and within this, many documents with the ID being the username. For instance, usernames/bob_marley is what I might see in the database. The problem here is that, since the documents do not have any fields in them, they get removed automatically thereby defeating the purpose of the set-up. How should I be structuring my database in these cases?
Thank you
The easiest thing to do is simply not allow the document to ever become empty. Keep one property in it with (for example) "exists = true" and make sure it never gets removed. Use a security rule to prevent this, if you're concerned about users accidentally doing this to themselves.
Another thing to do is re-evaluate what exactly you're trying to do with an empty document in the system, and if it's worthwhile to think about how to structure your data in a way that best meets the queries you want to perform.

Resources