I'm working on an iOS app which has (whoa, surprise!) chat functionality. The whole app relies heavily on the Firebase tools; for the database I'm using the new Cloud Firestore solution.
Currently I'm in the process of tightening the security using the database rules, but I'm struggling a bit with my own data model :) This could mean that my data model is poorly chosen, but I'm really happy with it, except for implementing the rules part.
The conversation part of the model looks like this. At the root of my database I have a conversations collection:
/conversations/$conversationId
- owner // id of the user that created the conversation
- ts // timestamp when the conversation was created
- members: {
    $user_id_1: true // usually the same as 'owner'
    $user_id_2: true // the other person in this conversation
    ...
  }
- memberInfo: {
    // some extra info about user typing, names, last message etc.
    ...
  }
And then I have a subcollection on each conversation called messages. A message document is very simple and just holds information about each sent message.
/conversations/$conversationId/messages/$messageId
- body
- sender
- ts
And a screenshot of the model:
The rules on the conversation documents are fairly straightforward and easy to implement:
match /conversations/{conversationId} {
  allow read, write: if resource.data.members[(request.auth.uid)] == true;
  match /messages/{messageId} {
    allow read, write: if get(/databases/$(database)/documents/conversations/$(conversationId)).data.members[(request.auth.uid)] == true;
  }
}
Problem
My problem is with the messages subcollection in that conversation. The above works, but I don’t like using the get() call in there.
Each get() call performs a read action, and therefore affects my bill at the end of the month, see documentation.
This might become a problem if the app I'm building becomes a success. The document reads are of course really minimal, but doing one every time a user opens a conversation seems a bit inefficient. I really like the subcollection solution in my model, but I'm not sure how to efficiently implement the rules here.
I'm open to any data model change; my goal is to evaluate the rules without these get() calls. Any idea is very welcome.
Honestly, I think you're okay with your structure and get call as-is. Here's why:
If you're fetching a bunch of documents in a subcollection, Cloud Firestore is usually smart enough to cache values as needed. For example, if you were to ask to fetch all 200 items in "conversations/chat_abc/messages", Cloud Firestore would only perform that get operation once and re-use it for the entire batch operation. So you'll end up with 201 reads, and not 400.
As a general philosophy, I'm not a fan of optimizing for pricing in your security rules. Yes, you can end up with one or two extra reads per operation, but it's probably not going to cause you trouble the same way, say, a poorly written Cloud Function might. Those are the areas where you're better off optimizing.
If you want to save those extra reads, you can actually implement a "cache" based on custom claims.
You can, for example, save the chats the user has access to in the custom claims under the object "conversations". Keep in mind that custom claims have a limit of 1000 bytes, as mentioned in their documentation.
One workaround to the limit is to just save the most recent conversations in the custom claims, like the top 50. Then in the security rules you can do this:
allow read, write: if request.auth.token.conversations[conversationId] || get(/databases/$(database)/documents/conversations/$(conversationId)).data.members[(request.auth.uid)] == true;
This is especially great if you're already using Cloud Functions to moderate messages after they are posted; all you need to do is update the custom claims.
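As a rough sketch of how such a claims "cache" could be maintained from a Cloud Function: the helper names below (buildConversationClaims, cacheConversationsInClaims) are made up for illustration, and `auth` is assumed to be the admin.auth() object from the Firebase Admin SDK, passed in as a parameter. The sketch keeps the most recent conversations and stops adding IDs once the serialized claims would exceed the 1000-byte limit.

```javascript
// Custom claims payloads are limited to 1000 bytes (see the answer above).
const MAX_CLAIMS_BYTES = 1000;

// Build a { conversations: { id: true, ... } } claims object from a list of
// conversation IDs ordered most recent first, keeping as many as fit.
function buildConversationClaims(recentConversationIds, maxCount = 50) {
  const conversations = {};
  for (const id of recentConversationIds.slice(0, maxCount)) {
    conversations[id] = true;
    const size = Buffer.byteLength(JSON.stringify({ conversations }), "utf8");
    if (size > MAX_CLAIMS_BYTES) {
      delete conversations[id]; // this ID would push the payload over the limit
      break;
    }
  }
  return { conversations };
}

// Hypothetical helper: `auth` is expected to be admin.auth() from the
// Firebase Admin SDK, injected so the sketch stays self-contained.
async function cacheConversationsInClaims(auth, uid, recentConversationIds) {
  const claims = buildConversationClaims(recentConversationIds);
  // Note: setCustomUserClaims replaces any claims the user already has.
  await auth.setCustomUserClaims(uid, claims);
  return claims;
}
```

Conversations that do not fit in the claims are still covered by the get() fallback in the rule above. Also keep in mind that a client only sees updated claims after its ID token has been refreshed.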
Related
There are several articles (firestore and firebase realtime database) explaining how to build a user presence system but I cannot find a resource for a friend presence system.
A simple user presence system is not perfect for some applications such as chat apps where there are millions of users and each user wants to listen to only his/her friends. I've found similar questions:
exact same question on stackoverflow
exact same issue on github
Two ok solutions with a realtime database are: (solutions are from the above stackoverflow post)
Use many listeners (one for each friend) with a collection of users. Possibly have a cap on the number of friends to keep track of.
Each user has friends collections and whenever a user's status changes, his/her status changes wherever he/she shows up in some user's friends collection as well.
Is there a better way to do this? What kind of databases do chat apps like Discord, WhatsApp, etc. use to build their friend presence systems?
I came up with two approaches that might be worth looking into. Note that I have not tested how they will scale longer term, as I just pushed to prod. First step: write a user's presence to their user document (this needs Firebase, Cloud Functions, and Cloud Firestore, per https://firebase.google.com/docs/firestore/solutions/presence).
Then take either approach:
Create an array field on your user documents (users>{userID}) called friends. Every time you add a friend, add your ID to this array, and vice versa. Then, on the client, run a function like:
db.collection("users").where("friends", "array-contains", clientUserId).onSnapshot(...)
In doing so, all documents whose friends field contains the clientUserId will be listened to for real-time updates. For some reason, my team didn't approve of this design, but it works. If anyone can share their opinion as to why, I'd appreciate it.
Create a friends sub-collection like so: users>{userID}>friends. When you add a friend, add a document to your friends sub-collection with the ID equal to your friend's userID. When a user logs on, run a get query for all documents in this collection. Get the doc IDs and store them in an array (call it friendIDs). Now for the tricky part. It would be ideal if you could use the in operator with unlimited comparison values, because you could just run an onSnapshot like so:
this.unSubscribeFriends = db.collection("users").where(firebase.firestore.FieldPath.documentId(), "in", friendIDs).onSnapshot((querySnapshot) => { /* get presence data */ })
Since this onSnapshot is attached to this.unSubscribeFriends, you just need to call this once to detach the listener:
componentWillUnmount() {
  this.unSubscribeFriends && this.unSubscribeFriends()
}
Because a given user's friends can definitely grow into the hundreds, I had to create a new array called chunkedFriendsArray consisting of a chunked version of friendIDs (chunked as in: every 10 string IDs I splice into a new array, to bypass the in operator's limit of 10 comparison values). Thus, I had to map over chunkedFriendsArray and set an onSnapshot like the one above for every array of max length 10 inside chunkedFriendsArray. The problem with this is that all the listeners are attached to the same const (or this.unSubscribeFriends in my case). I have to call this.unSubscribeFriends as many times as there are chunked arrays in chunkedFriendsArray:
componentWillUnmount() {
this.state.chunkedFriendsArray.forEach((doc) => {
this.unSubscribeFriends && this.unSubscribeFriends()
})
}
It feels weird having many listeners attached to the same const (method this.unSubscribeFriends) and calling the same exact one to stop listening to them. I'm sure this will lead to bugs in my production code.
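One possible way around this, sketched under assumptions rather than tested in production: keep every unsubscribe function that onSnapshot returns in an array, and expose a single function that calls them all. The names chunk and listenToFriends are made up here; `usersCollection` is assumed to be db.collection("users") and `idField` to be firebase.firestore.FieldPath.documentId(), both passed in so the sketch does not depend on a particular SDK setup.

```javascript
// Split an array into consecutive chunks of at most `size` items.
function chunk(items, size) {
  const chunks = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Attach one listener per chunk of friend IDs and return a single function
// that detaches all of them.
function listenToFriends(usersCollection, idField, friendIDs, onChange, chunkSize = 10) {
  // Each onSnapshot call returns its own unsubscribe function; keep them all.
  const unsubscribers = chunk(friendIDs, chunkSize).map((ids) =>
    usersCollection.where(idField, "in", ids).onSnapshot(onChange)
  );
  return () => unsubscribers.forEach((unsubscribe) => unsubscribe());
}
```

With this, this.unSubscribeFriends = listenToFriends(...) can be assigned once, and the single-call componentWillUnmount from the unchunked version works unchanged, because each listener's own unsubscribe function is retained instead of being overwritten.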
There are other decentralized approaches, but the two I listed are my best attempts at avoiding having a bunch of decentralized presence data.
I have a Firestore collection called letters which holds public letters from users. In my app, I am using pagination to limit the results to 20 when users go to the public letters screen. My concern is that this works fine from within the app, but if some malicious user queries the database from, let's say, Postman, then I will be billed heavily for all those reads. I have all security rules in place (for example, the user must be authenticated), but this needs to be a public collection, so I can't think of anything else to restrict this. How can I restrict someone to reading about 20 documents at a time?
There is actually no way to restrict the consumption of a collection based on direct query volume. Renaud's answer proposes to use request.query.limit in security rules, but that does not stop a malicious user from simply making as many calls to the pagination API as they want. It just forces them to provide a limit() on each query. The caller can still consume the entire collection, and consume it as many times as they want.
Watch my video on the topic: https://youtu.be/9sOT5VOflvQ?t=330
If you want to enforce a hard limit on the total number of documents to read, you will need a backend to do that. Clients can request documents from the backend up to the limit it enforces. If the backend wants to allow pagination, it will have to somehow track the usage of the provided endpoint to prevent each caller from exhausting whatever limits or quotas you want to enforce.
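To make the idea of usage tracking concrete, here is a minimal sketch of a per-caller read quota that such a backend could consult before serving a page. Everything here is an assumption for illustration: the name createReadQuota, the window-based policy, and the in-memory Map. On a backend with several instances (Cloud Functions, for example) the counters would have to live in shared storage instead.

```javascript
// Track how many documents each caller has been served in the current window.
function createReadQuota(maxDocs, windowMs) {
  const usage = new Map(); // uid -> { windowStart, count }

  // Returns how many of the `requested` documents the caller may still read.
  return function grant(uid, requested, now = Date.now()) {
    let entry = usage.get(uid);
    if (!entry || now - entry.windowStart >= windowMs) {
      // First request from this caller, or the previous window has expired.
      entry = { windowStart: now, count: 0 };
      usage.set(uid, entry);
    }
    const allowed = Math.max(0, Math.min(requested, maxDocs - entry.count));
    entry.count += allowed;
    return allowed;
  };
}
```

The backend would then fetch at most the granted number of documents per call, for example const pageSize = grant(uid, 20), and return an empty page once the grant reaches 0.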
As explained in the doc:
The request.query variable contains the limit, offset, and orderBy properties of a query.
So you can write a rule like:
allow list: if request.query.limit <= 20;
Note that we use list, instead of read. The doc says:
You can break read rules into get and list rules. Rules for get apply to requests for single documents, and rules for list apply to queries and requests for collections.
Is there a native or efficient way to restrict the user to load a document from a collection only once every 24h?
//Daily Tasks
//User should have only read rights
//User should only be able to read one document every 24h
match /tasks/{documents} {
allow read: if isSignedIn() && request.query.elapsedHours > 24;
}
I was thinking that I might be able to do this using a timestamp in the user document. But this would consume unnecessary writing resources to make a write to the user document with every request for a task document. So before I do it this way, I wanted to find out if anyone had a better approach.
Any ideas? Thanks a lot!
There is no native solution, because security rules can't write back into the database to make a record of the query.
You could instead force access through a backend (such as Cloud Functions) that also records the time of access of the particular authenticated user, and compare against that every time. Note that it will incur an extra document read every call.
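A minimal sketch of what that backend check could look like. The names mayReadTask and readDailyTask are hypothetical, and `store` is a stand-in for whatever persistence the backend uses (for example a per-user Firestore document); it is assumed to provide getLastAccess(uid), setLastAccess(uid, ms) and getTask(taskId).

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// True when the user has never read a task, or last did so at least 24h ago.
function mayReadTask(lastAccessMs, nowMs) {
  return lastAccessMs == null || nowMs - lastAccessMs >= DAY_MS;
}

// Hypothetical handler: `store` is an assumed interface, not a Firebase API.
async function readDailyTask(store, uid, taskId, nowMs = Date.now()) {
  const lastAccessMs = await store.getLastAccess(uid); // the extra read per call
  if (!mayReadTask(lastAccessMs, nowMs)) {
    return { allowed: false, task: null };
  }
  await store.setLastAccess(uid, nowMs); // record this access for next time
  return { allowed: true, task: await store.getTask(taskId) };
}
```

For this to be enforceable, clients would also have to be denied direct reads on the tasks collection in the security rules, so that the backend is the only path to the data.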
There is no really "efficient" way to do this, nor a native one at the time of writing. Finding an actual solution to this "problem" won't be easy without further extensions.
There are, however, workarounds, such as Cloud Functions for Firebase, that open up new options for getting around various limitations Firestore has.
A native solution would be keeping track somewhere in the database when each user last accessed the document. This would, as you mentioned, create unnecessary reads and writes just for tracking.
I would prefer a caching mechanism on the client and allow the user to execute multiple reads. Don't forget that if the user clears the cache on the device, he has to query the document(s) again and won't get any data at all if you restrict him completely that way.
I think the best approach, given the high number of reads you get, is to cache on the client side and set only a limit control (see the request.query limit value). This would look something like the following:
match /tasks/{documents} {
  // Allow only signed in users to query multiple documents
  // with a limit explicitly set to less than or equal to 20 documents per read
  allow list: if isSignedIn() && request.query.limit <= 20;
  // Allow single document read to signed in users
  allow get: if isSignedIn();
}
Sometimes, we don't want to display to the end user the entire document. For example - let's say we have users collection and each user has an email property. The last thing we want to do is to display the users' emails to each other.
So in RTDB, it was simple, since it's not structured as collection/document.
On Cloud Firestore, it's not as simple as RTDB. You can't filter the document to your needs, as stated in their docs:
When writing queries to retrieve documents, keep in mind that security rules are not filters—queries are all or nothing.
So I thought about 2 alternatives:
1. Split the user collection into public and private sub-collections.
Pros - You can have different rules for public and private. In other words, you can prevent other users from viewing another user's private doc.
Cons - When you just want to get the entire document of the user, you'll have to make 2 reads instead of one, and then merge the responses (assuming you don't get any errors).
2. Handle filtration of the user in the back-end (using Admin SDK).
Pros - This way, I only read one document, and I don't have to split my document into objects.
Cons - I haven't found a way so I could subscribe to the desired cloud function (I want to listen for changes).
I know that there's no one-way-go to achieve certain goals, but I would like to know if there's a more conventional approach of achieving what I'm trying to do.
Your first option is the most common, but if it doesn't work for your case, then don't use it.
Bear in mind that with your #2 option, you add complexity on both the client and the server. And you lose client side caching, which might save you a lot in terms of performance and billing.
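For the first option (splitting into public and private sub-collections), the two reads and the merge could be sketched roughly as follows. The function names are made up, and `publicRef` and `privateRef` are assumed to be document references with a namespaced-style get() method, pointing at hypothetical paths such as users/{uid}/public/profile and users/{uid}/private/profile.

```javascript
// Combine the public and private halves of a user document.
// A missing half contributes no fields; private fields win on a name clash.
function mergeUserDocs(publicData, privateData) {
  return { ...(publicData || {}), ...(privateData || {}) };
}

// Read both halves in parallel and merge them. This is meant for the owner's
// own view: if the rules deny the private read, the whole call rejects.
async function loadFullUser(publicRef, privateRef) {
  const [publicSnap, privateSnap] = await Promise.all([
    publicRef.get(),
    privateRef.get(),
  ]);
  return mergeUserDocs(publicSnap.data(), privateSnap.data());
}
```

Other users would only ever read the public half, so they pay for a single read and never receive the email field.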
Imagine we have Chat application and in this application, we have many rooms, some private and some for everyone. Every room has an admin who can manage users (can invite and remove). Only members of the room can read and write messages. An Admin is a person who created a room in this scenario.
I want to create security rules on room creation and update it on membersChange so only members can read and write the content of the message board.
In this case, this is how it could look:
database/rooms/
  private1
    admin: memberX
    members: member1, member2
    //only admin can write into members fields
    messages
      message1...
      message2...
      message3...
      //only members can write and read messages
  private2
    admin: memberXY
    members: member1, member4
    //only admin can write into members fields
    messages
      message1...
      message2...
      message3...
      //only members can write and read messages
So is it possible to create and update security rules from cloud function instead of manually updating them in firebase console? Or is there any way to automate this process?
I noticed that I can deploy security rules using CLI. What should be the process here? When do I call it? How can I get members from the database?
EDIT:
for anyone who wants more information check How to Build a Secure App in Firebase
I would rethink this model. Instead of updating the security rules all the time, I see several viable approaches:
Option 1
You can save which users can access a specific room in Firestore, and then in the security rules you can access the document for the room and check whether the authenticated user is in the list of authorized users. The problem with this is cost, because this will fire an extra database read for every operation, which can get expensive.
Option 2
You can create custom claims for the user using a cloud function, like this:
admin.auth().setCustomUserClaims(uid, {"rooms": "room1,room2"})
Then on the security rules you can check if the user has the claims to a specific room:
match /rooms/{roomId} {
allow read: if roomId in request.auth.token.rooms.split(',');
}
I believe you can also save the claim as an array directly, but I haven't tested it.
For this option you need to take into consideration the size of the token, which has a limit and can cause performance problems if it's too big. Depending on your scenario you can create a smaller set of permissions and then set those to the rooms and the users.
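As a sketch of how the comma-separated rooms claim could be kept up to date when a member is added or removed: the helper names addRoom, removeRoom and grantRoomAccess are made up for illustration, and `auth` is assumed to be admin.auth() from the Firebase Admin SDK, passed in as a parameter.

```javascript
// Add a room to the comma-separated "rooms" claim (no duplicates).
function addRoom(roomsClaim, roomId) {
  const rooms = roomsClaim ? roomsClaim.split(",") : [];
  return rooms.includes(roomId) ? roomsClaim : [...rooms, roomId].join(",");
}

// Remove a room from the comma-separated "rooms" claim.
function removeRoom(roomsClaim, roomId) {
  return (roomsClaim || "")
    .split(",")
    .filter((room) => room && room !== roomId)
    .join(",");
}

// Hypothetical Cloud Function body; `auth` is expected to be admin.auth().
async function grantRoomAccess(auth, uid, roomId) {
  const user = await auth.getUser(uid);
  const claims = user.customClaims || {};
  const rooms = addRoom(claims.rooms, roomId);
  // setCustomUserClaims replaces all claims, so spread the existing ones back in.
  await auth.setCustomUserClaims(uid, { ...claims, rooms });
  return rooms;
}
```

Because the claims travel inside the ID token, a user who was just added or removed only sees the change after their token has been refreshed.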
Option 3
You could save the uid of the users who can access each document, and then check if the authenticated user's uid exists on that document. But this can get out of hand if you have too many users.
I would go with option 2 if it makes sense for your scenario. Or you could combine more than one of these techniques. My idea was to show a few of the possibilities so that you can choose what works for you.
Having different rules for each room and dynamically updating your rules is a bad idea. Here are a couple of problems that come to mind with this solution:
Who will be updating the rules?
What happens when two rooms get created at the same time?
What will happen when something goes wrong?
How will you maintain your rules when you have a million rooms?
Also, it may be a few minutes before changes to your rules take effect.
Instead you can, first of all, split your data structure into public rooms and private rooms: database/rooms/public/... and database/rooms/private/....
For securing your private rooms you can take a look at rules conditions and do something like: member can read/write IF his UID is in /members (pseudo code, won't work like this).
You can take a look at this question for an example.