Restrict specific object key values with authentication in Firestore - firebase

I have an object stored in the Firestore database. Among other keys, it has a userId of the user who created it. I now want to store an email address, which is a sensitive piece of info, in the object. However, I only want this email address to be retrieved by the logged in user whose userId is equal to the userId of the object. Is it possible to restrict this using Firebase rules? Or will I need to store that email address in a /private collection under the Firebase object, apply restrictive firebase rules, and then retrieve it using my server?

TL;DR: Firestore document reads are all or nothing: you can't retrieve a partial document. Security rules therefore have no way to restrict access to a specific field. The best approach is to create a subcollection containing the sensitive fields and apply rules to it.
Taken from the documentation:
Reads in Cloud Firestore are performed at the document level. You either retrieve the full document, or you retrieve nothing. There is no way to retrieve a partial document. It is impossible using security rules alone to prevent users from reading specific fields within a document.
We solved this with two very similar approaches:
As you suggested, you can move the sensitive fields to a /private collection and apply rules there. However, this approach caused some issues for us because the /private collection is completely detached from the original doc. Resolving references meant multiple queries and extra calls to Firestore.
The second option, which is also what the documentation suggests and IMHO a bit better, is to use a subcollection. It is pretty much the same as a collection, but it keeps a hierarchical relationship with the parent document.
From the same docs:
If there are certain fields within a document that you want to keep hidden from some users, the best way would be to put them in a separate document. For instance, you might consider creating a document in a private subcollection
NOTE: The same docs also include a good step-by-step guide on how to create this kind of structure in Firestore, how to apply rules to it, and how to consume the collections in various languages.
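As a rough illustration of the split described above, the following plain-Java sketch separates a document into a public part and a private part. It does not call the Firestore SDK; the class name, the field names, and the idea of writing the private half to a document in a private subcollection are assumptions made for this example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

final class PrivateFieldSplit {

    // Holds the two halves of a document after the split.
    static final class Split {
        final Map<String, Object> publicFields = new HashMap<>();
        final Map<String, Object> privateFields = new HashMap<>();
    }

    // Moves every key listed in sensitiveKeys into privateFields;
    // everything else stays in publicFields.
    static Split split(Map<String, Object> doc, Set<String> sensitiveKeys) {
        Split result = new Split();
        for (Map.Entry<String, Object> e : doc.entrySet()) {
            if (sensitiveKeys.contains(e.getKey())) {
                result.privateFields.put(e.getKey(), e.getValue());
            } else {
                result.publicFields.put(e.getKey(), e.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical document with one sensitive field.
        Map<String, Object> doc = new HashMap<>();
        doc.put("userId", "u1");
        doc.put("title", "My object");
        doc.put("email", "u1@example.com");

        Split s = split(doc, Set.of("email"));
        System.out.println(s.publicFields.containsKey("email")); // false
        System.out.println(s.privateFields.get("email"));        // u1@example.com
    }
}
```

In a real app the public half would stay in the parent document and the private half would go to a document in the subcollection, with rules on that subcollection limiting reads to the owner.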

Related

Changing data model in existing cloud firestore collection?

Suppose I have a users collection. The users collection has a large number of documents in it. Now in my app, I have a feature request that forces me to add or remove a field in my users collection data model. How can I add a new field or remove an existing field from all my users documents? Is there any best practice that the community recommends here?
How can I add a new field or remove an existing field from all my users documents?
While @AdityaNandardhane's solution might work, please note that if you have a lot of documents, you have a lot of update operations to perform, which also means that you have to pay for a lot of writes.
So the best approach would be to perform the update, only when the user reads the document. When it comes to users, most likely the details of the users are displayed on a profile screen. This means that when the users want to check the profile, before displaying the data, check for the existence of the new field. If it doesn't exist, then perform the update operation, and right after that display the data, otherwise, just display the data. This means that you'll have to pay for an update operation only when needed. It doesn't make any sense to update all documents, of all users, since there may be users that will never use their accounts anymore. So there is no need to pay for them.
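A minimal sketch of this "update on read" idea, using an in-memory map as a stand-in for the user document. The class, the method name `ensureField`, and the field `newsletterOptIn` are hypothetical; a real app would read and update the document through the Firestore SDK.

```java
import java.util.HashMap;
import java.util.Map;

// In-memory stand-in for a user document; not the Firestore SDK.
final class LazyMigration {

    // Adds `field` with `defaultValue` if the document lacks it.
    // Returns true when a write would be needed, false otherwise.
    static boolean ensureField(Map<String, Object> doc, String field, Object defaultValue) {
        if (doc.containsKey(field)) {
            return false; // already migrated, no write to pay for
        }
        doc.put(field, defaultValue);
        return true; // caller should persist the updated document
    }

    public static void main(String[] args) {
        Map<String, Object> profile = new HashMap<>();
        profile.put("name", "Ada");

        boolean firstRead = ensureField(profile, "newsletterOptIn", false);
        boolean secondRead = ensureField(profile, "newsletterOptIn", false);

        System.out.println(firstRead);  // true
        System.out.println(secondRead); // false
    }
}
```

Only the first read of an unmigrated profile triggers a write; accounts that are never opened again are never updated.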
As I understand it, you can do the following:
1. Add a new field
If you are using Firebase Functions, you can create a function that runs an update query setting the new field to a default value, and then run it. You can do the same from Android with Kotlin/Java.
2. Remove an existing field
If you are using Firebase Functions, you can create a function that runs a query to delete the field, and then run it. You can do the same from Android with Kotlin/Java.
This is a suggestion to the best of my knowledge; look for a better approach if there is one.

Safety of exposing document IDs for "anyone with the link can view"-style functionality

I have an API endpoint that I'm using to return some data from a Cloud Firestore collection.
The data it returns is largely insensitive, but it's publicly callable, so I'm not using auth for this endpoint. I wouldn't want the collections to be listable, i.e. I want it to act like "anyone with the link can view" data.
I'm looking up data for a subcollection's document, so currently the call would look like something like this:
GET endpoint.example/?parentDoc=XXXX0000XXXX&subDoc=XXXX0000XXXXX
I was considering creating a separate "references" collection with a UUID or something to represent the two, in case revealing the document IDs like that is considered a bad practice(?) — e.g.
GET endpoint.example/?myOwnRef=123-234-123-234-ABC-DEF
Assuming I have the Firestore locked down with appropriate security rules, is it safe to assume that the only benefit I'd get from further hashing / creating my own (e.g. UUID) reference for the parent doc / subcollection doc is security by obscurity?
...Or is there more merit to further obscuring the IDs here if I'm after a private / shareable link style functionality to reference the data?
EDIT: As Doug Stevenson pointed out, this question refers to autogenerated Firestore document IDs.
It depends on if your document IDs actually contain any data in them. If they are just randomly generated, then good security rules should be sufficient to prevent someone from doing something they're not supposed to with a document if they know the ID. There is no advantage to hashing it, since it's already an opaque value.
If the ID does contain some data, then you are putting that data into the hands of someone who might do something with it that you'd not like, and you might want to remove that from view by hashing it.

data modeling & security rules advice firebase

I am trying to model 2 concepts in Firestore and also associate them:
collection: users
key/document_id: email
document: profile info
collection: topics
key/document_id: random
document: metadata with a field indicating email of user (to use for lookups)
My goal is to:
1. "Reference" topics in users for easy lookups, but I am not sure how to do it other than with a subcollection.
2. Based on the email, which will be passed as part of auth, have a security rule that allows writes in a collection only on a path or field based on that email.
Are both of the above feasible in Firebase? I'd appreciate any pointers!
Preamble: There isn't ONE and only ONE correct approach in NoSQL data modelling
Your approach seems valid, however I would suggest the following adaptations:
"Reference topics in users for easy lookups":
To "reference topics in users for easy lookups" you could duplicate the list of topics in an array in the user profile. You will then be able to use array-contains (and other array membership methods) for your queries. (Note however the limitation of the in operator).
Advantage of this approach: you only need to query one document to get all the topics of a user. Possible drawback: there is a limit on the size for a document (and for a single field value) which is maximum 1 MiB (1,048,576 bytes), see the doc.
You can easily keep in sync the topics array and the topics sub-collection by combining a batched write and the arrayUnion() and arrayRemove() methods.
Use the user ID instead of the email for doc Ids and Security Rules:
Instead of using the email as the users collection document ID and using it in Security Rules, use the user ID. See the examples in the doc.

Managing Denormalized/Duplicated Data in Cloud Firestore

If you have decided to denormalize/duplicate your data in Firestore to optimize for reads, what patterns (if any) are generally used to keep track of the duplicated data so that they can be updated correctly to avoid inconsistent data?
As an example, if I have a feature like a Pinterest Board where any user on the platform can pin my post to their own board, how would you go about keeping track of the duplicated data in many locations?
What about creating a relational-like table for each unique location that the data can exist that is used to reconstruct the paths that require updating.
For example, creating a users_posts_boards collection that is firstly a collection of userIDs with a sub-collection of postIDs that finally has another sub-collection of boardIDs with a boardOwnerID. Then you use those to reconstruct the paths of the duplicated data for a post (eg. /users/[boardOwnerID]/boards/[boardID]/posts/[postID])?
Also if posts can additionally be shared to groups and lists would you continue to make users_posts_groups and users_posts_lists collections and sub-collections to track duplicated data in the same way?
Alternatively, would you instead have a posts_denormalization_tracker that is just a collection of unique postIDs that includes a sub-collection of locations that the post has been duplicated to?
{
postID: 'someID',
locations: ( <---- collection
"path/to/post/location1",
"path/to/post/location2",
...
)
}
This would mean that you would basically need to have all writes to Firestore done through Cloud Functions that can keep a track of this data for security reasons....unless Firestore security rules are sufficiently powerful to allow add operations to the /posts_denormalization_tracker/[postID]/locations sub-collection without allowing reads or updates to the sub-collection or the parent postIDs collection.
I'm basically looking for a sane way to track heavily denormalized data.
Edit: oh yeah, another great example would be the post author's profile information being embedded in every post. Imagine the hellscape trying to keep all that up-to-date as it is shared across a platform and then a user updates their profile.
I'm answering this question because of your request from here.
When you duplicate data, there is one thing you need to keep in mind: in the same way you add data, you need to maintain it. In other words, if you want to update or delete an object, you need to do it in every place where it exists.
What patterns (if any) are generally used to keep track of the duplicated data so that they can be updated correctly to avoid inconsistent data?
To keep track of all operations that we need to do in order to have consistent data, we add all operations to a batch. You can add one or more update operations on different references, as well as delete or add operations. For that please see:
How to do a bulk update in Firestore
What about creating a relational-like table for each unique location that the data can exist that is used to reconstruct the paths that require updating.
In my opinion there is no need to add an extra "relational-like table", but if you feel comfortable with it, go ahead and use it.
Then you use those to reconstruct the paths of the duplicated data for a post (eg. /users/[boardOwnerID]/boards/[boardID]/posts/[postID])?
Yes, you need to pass to each document() method, the corresponding document id in order to make the update operation work. Unfortunately, there are no wildcards in Cloud Firestore paths to documents. You have to identify the documents by their ids.
Alternatively, would you instead have a posts_denormalization_tracker that is just a collection of unique postIDs that includes a sub-collection of locations that the post has been duplicated to?
I don't think that is necessary either, since it requires extra read operations. Since everything in Firestore is about the number of reads and writes, I think you should reconsider this approach. Please see Firestore usage and limits.
unless Firestore security rules are sufficiently powerful to allow add operations to the /posts_denormalization_tracker/[postID]/locations sub-collection without allowing reads or updates to the sub-collection or the parent postIDs collection.
Firestore security rules are powerful enough to do that. You can allow reads or writes, or even apply separate security rules to each CRUD operation you need.
I'm basically looking for a sane way to track heavily denormalized data.
The simplest way I can think of is to add the operations to a key-value data structure. Let's assume we have a map that looks like this:
// Maps each object to write to the document reference where it is stored
Map<Object, DocumentReference> map = new HashMap<>();
map.put(customObject1, reference1);
map.put(customObject2, reference2);
map.put(customObject3, reference3);
// And so on
Iterate through the map, add all those keys and values to a batch, commit the batch, and that's it.
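The loop described above could be sketched as follows. To keep the example runnable on its own, it uses a hypothetical in-memory `InMemoryBatch` class instead of the SDK's write batch, and it keys the map by document path rather than by custom object; both are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class FanOutSketch {

    // Hypothetical in-memory stand-in for a write batch: writes are queued
    // and only touch the store when commit() is called.
    static final class InMemoryBatch {
        private final Map<String, Map<String, Object>> store;
        private final List<Runnable> pending = new ArrayList<>();

        InMemoryBatch(Map<String, Map<String, Object>> store) {
            this.store = store;
        }

        void set(String path, Map<String, Object> data) {
            pending.add(() -> store.put(path, new HashMap<>(data)));
        }

        // Applies all queued writes and returns how many were applied.
        int commit() {
            pending.forEach(Runnable::run);
            int applied = pending.size();
            pending.clear();
            return applied;
        }
    }

    // Queues one write per duplicated location, then commits them together.
    static int fanOut(Map<String, Map<String, Object>> store,
                      Map<String, Map<String, Object>> copiesByPath) {
        InMemoryBatch batch = new InMemoryBatch(store);
        for (Map.Entry<String, Map<String, Object>> e : copiesByPath.entrySet()) {
            batch.set(e.getKey(), e.getValue());
        }
        return batch.commit();
    }

    public static void main(String[] args) {
        Map<String, Map<String, Object>> store = new HashMap<>();
        Map<String, Object> post = Map.of("title", "Updated title");

        // Hypothetical paths where the same post has been duplicated.
        Map<String, Map<String, Object>> copies = new LinkedHashMap<>();
        copies.put("posts/p1", post);
        copies.put("users/u2/boards/b7/posts/p1", post);

        System.out.println(fanOut(store, copies)); // 2
        System.out.println(store.size());          // 2
    }
}
```

With the real SDK the same loop would queue each write on a write batch obtained from the Firestore instance and commit once at the end.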

How can Firebase nodes be structured to restrict user access and allow admin to pull report data?

Context: I am putting together a time tracking application using Firebase as my backend. My current node structure has Time Entries and Clients at the root like so:
Time Entry
Entry ID
UserID
clientID, hours, date, description, etc
Clients
ClientID
name, projects, etc
This structure works fine if I'm just adding and pulling time entries based on the user, but I want to start putting together reports on a per client basis. Currently, this means making a separate HTTP request for each user and then filtering by the clientID to get at the data.
The rule structure for Firebase grants access to all child nodes once access is given to the parent node, so one big list doesn't work as it can't restrict users from seeing or editing each other's entries.
Question: Is there a way to structure the nodes that would allow for restricting users to only managing their own time entries, as well as allow for one query to pull all entries tied to a client?
** The only solution I could come up with was duplicating the entries into a single node used just for reporting purposes, but this doesn't seem like a sustainable option
@AL., your answer is what I ended up going with after scouring the docs and the web. Duplicating the data is the best route to take.
The new Firestore beta seems to provide some workarounds for this.
The way that I would do this is with Cloud Firestore.
Create a root collection clients and a document for each client. This partitions the data into easily manageable chunks, so that a client admin can see all data for their company.
Within the client document, create a sub-collection called timeEntries. When a user writes to this, they must include a userId field (you can enforce this in the rules) which is equal to request.auth.uid.
https://firebase.google.com/docs/firestore/security/rules-conditions#data_validation
You can now create read rules which allow an admin to query any document in the timeEntries sub-collection, but an individual user must query with userId = request.auth.uid in order to only return the entries that they have created.
https://firebase.google.com/docs/firestore/security/rules-conditions#security_rules_and_query_results
Within your users/{uid} collection or clients/{clientId} collection, you can easily create a flag to identify admin users and check this when reading data.
https://firebase.google.com/docs/firestore/security/rules-conditions#access_other_documents
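The access decisions described in this answer can be modelled in plain Java as below. This is not security rules syntax and it does not use the Firebase SDK; the class and method names are hypothetical, and the `isAdmin` flag stands for whatever admin marker is stored on the user or client document.

```java
final class TimeEntryAccess {

    // A user may only create entries whose userId matches their own uid.
    static boolean canCreate(String authUid, String entryUserId) {
        return authUid != null && authUid.equals(entryUserId);
    }

    // Admins may read any entry; other signed-in users only their own.
    static boolean canRead(String authUid, boolean isAdmin, String entryUserId) {
        if (authUid == null) {
            return false; // not signed in
        }
        return isAdmin || authUid.equals(entryUserId);
    }

    public static void main(String[] args) {
        System.out.println(canCreate("u1", "u1"));         // true
        System.out.println(canCreate("u1", "u2"));         // false
        System.out.println(canRead("admin1", true, "u1")); // true
        System.out.println(canRead("u2", false, "u1"));    // false
    }
}
```

In the actual rules, the same checks would compare request.auth.uid with the userId field of the entry and look up the admin flag from another document.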
