Safest and least expensive way to delete comments in Firestore - firebase

I am using Firestore for my Flutter application. Users can post comments under articles, and they can also reply to other comments. Comments have the following structure,
where ancestorsId is a list containing all of the parent comment IDs.
Comments can be deleted by the poster or an admin.
When a comment is deleted, all of its child comments should be deleted as well.
How can I do that safely and at the lowest cost? Here is what I have considered so far:
Case 1: Using a Go server and Custom Claims
I can set the user's role as a custom claim. Then, when a user clicks the delete-comment button, the client sends a request to the server with the comment ID and the user's ID token. I can check whether the user is an admin from the token. If not, the server fetches the comment data and checks whether the comment's userId matches the token's user ID. If either of those two conditions holds, I can get all of the comment's children with a where query on the comments collection and delete all of the involved comments in a batch.
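A minimal sketch of that flow, shown here with the Node.js Admin SDK for brevity (the Go Admin SDK exposes equivalent token-verification, query and batch APIs); the field names ancestorsId and userId follow the question, and the admin custom claim is an assumption:

const admin = require('firebase-admin');
admin.initializeApp();

// Delete a comment and all of its descendants after checking permissions.
async function deleteCommentTree(idToken, commentId) {
  const decoded = await admin.auth().verifyIdToken(idToken);
  const db = admin.firestore();

  const commentRef = db.collection('comments').doc(commentId);
  const comment = await commentRef.get();
  if (!comment.exists) throw new Error('Comment not found');

  // Allow the deletion if the caller is an admin (custom claim) or the comment's author.
  const isAdmin = decoded.admin === true;   // assumes an `admin` custom claim
  const isAuthor = comment.get('userId') === decoded.uid;
  if (!isAdmin && !isAuthor) throw new Error('Permission denied');

  // Children carry the deleted comment's id in their ancestorsId array.
  const children = await db.collection('comments')
    .where('ancestorsId', 'array-contains', commentId)
    .get();

  // A single batch is limited to 500 writes; chunk if a thread can be larger.
  const batch = db.batch();
  batch.delete(commentRef);
  children.forEach(doc => batch.delete(doc.ref));
  await batch.commit();
}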
Problems:
Custom claims travel in the ID token, which lives for up to an hour. That could cause trouble if a rogue admin starts deleting everything, because their admin token can stay valid for up to an hour after the claim is revoked. But since I am using a server, I think I can manage this.
I need to read the comments before deleting them. That means two billed operations per comment, so the price is roughly twice that of the delete operations alone.
Case 2: Using FireStore rules
I could stick with a client-only approach and use Firestore rules instead of a server. That means I could no longer use custom claims, and I would have to store each user's role in a field of my users collection. I could use a delete rule like:
match /comments/{comment} {
  allow delete: if request.auth != null && (request.auth.uid == resource.data.userId || isAdmin(request));
}
where isAdmin is a function that checks if the user is an admin.
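For reference, a minimal sketch of what isAdmin could look like if the role lives on a users/{uid} document (the users collection and its role field are assumptions):

function isAdmin(request) {
  // One extra document read per evaluated request.
  return get(/databases/$(database)/documents/users/$(request.auth.uid)).data.role == 'admin';
}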
Problems:
isAdmin needs to read data from another document and therefore costs an extra read.
This solution doesn't delete child comments, and the rule doesn't allow a user to request deletion of another user's comment unless they are an admin.
What would be a solution that solves my issue at low cost without putting safety aside?

It seems to me that you really only have one solution that works, as the second approach leaves orphaned documents in the database.
But if you do consider the second approach valid for your app, you're trading the cost of reading-and-then-deleting some documents for the cost of leaving them in the database. While the cost of keeping a document in the database is low, you'll pay it every month, and since the number of orphaned documents keeps growing, the storage cost for them will keep growing too. So while deleting them now may seem more expensive, it's a one-time cost.
If you're worried about the cost of running Cloud Functions, keep in mind there's a pretty decent free tier for those. Even if that tier is not enough to run your code in production, it should at least be enough to give you a feeling for what the cost is going to be.
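As an illustration, a Cloud Function triggered when a comment is deleted could take care of the children, so the client only removes the parent; a rough sketch in Node.js, reusing the ancestorsId field from the question:

const functions = require('firebase-functions');
const admin = require('firebase-admin');
admin.initializeApp();

// When a comment is deleted, delete every comment that lists it as an ancestor.
exports.cascadeCommentDelete = functions.firestore
  .document('comments/{commentId}')
  .onDelete(async (snap, context) => {
    const db = admin.firestore();
    const children = await db.collection('comments')
      .where('ancestorsId', 'array-contains', context.params.commentId)
      .get();

    // Batches are capped at 500 writes; chunk for very large threads.
    const batch = db.batch();
    children.forEach(doc => batch.delete(doc.ref));
    return batch.commit();
  });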

Related

Fetching parent and child item in single query in DynamoDB

I have the following one-to-many relationship:
Account 1--* User
The Account contains global account-level information, which is mutable.
The User contains user-level information, which is also mutable.
When the user signs-in, they need both Account and User information. (I only know the UserId at this point).
I ideally want to design the schema such that a single query is necessary. However, I cannot determine how to do this without duplicating the Account into each User and thus requiring some background Lambda job to propagate changes to Account attributes across all User objects -- which, for the record, seems like more resource usage (and code to maintain) than simply normalizing the data and having 2 queries on each sign-in: fetch user, then fetch account (using an FK inside the user object that identifies the account).
Is it possible to design a schema that allows one query to fetch both and doesn't require a non-transactional background job to propagate updates? (Transactional batch updates are out of the question, since there's >25 users.) And if not, is the 2-query idea the best / an acceptable method?
I'll focus on one angle in your question - the 2-query idea. In many cases it is indeed an acceptable method, better than the alternatives. In many NoSQL deployments, every user-visible request results in significantly more than two database requests; indeed, it is often stated that this is the reason why NoSQL systems care about low tail latencies (i.e., even 99th percentile latencies should be low).
You didn't say why you wanted to avoid the 2-query solution. The 2-query implementation you presented has two downsides:
It is more costly: you need to do two reads instead of one, which (when the items are smaller than 4 KB) costs double what a single read costs.
Latency doubles if you have to finish the first query before you can start the second.
There may be tricks you can use to solve both problems, depending on more details of your use case:
For the latency: You didn't say what a "user id" is in your application. If it is some sort of unique numeric identifier, maybe it can be set up such that the account id can be determined from the user id directly, without a table lookup (e.g., the first bits of the user id are the account id). If this is the case, you can start both lookups at the same time and not double the latency. The cost will still be double, but not the latency.
For the cost: If there is a large number of users per account (you said there are more than 25 - I don't know if it's much more or not), it may be useful to cache the Account data, so that not every user lookup will need to read the Account data again - it might often be cached. If Account information rarely changes and consistency of it is not a big deal (I don't know if it is...), you can also get by with doing an "eventual consistency" read for the Account information - which costs half of the regular "consistent" read.
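A sketch of both tricks in Node.js with the AWS SDK v3; the table names, key names, and the way the account id is derived from the user id are all hypothetical:

const { DynamoDBClient } = require('@aws-sdk/client-dynamodb');
const { DynamoDBDocumentClient, GetCommand } = require('@aws-sdk/lib-dynamodb');

const ddb = DynamoDBDocumentClient.from(new DynamoDBClient({}));

// Hypothetical encoding: the user id starts with the account id, e.g. "ACC123#U456".
const accountIdFromUserId = (userId) => userId.split('#')[0];

async function loadSignInData(userId) {
  const accountId = accountIdFromUserId(userId);

  // Fire both reads in parallel so latency is one round trip, not two.
  const [user, account] = await Promise.all([
    ddb.send(new GetCommand({ TableName: 'Users', Key: { userId } })),
    // Eventually consistent read (the default) costs half of a strongly consistent one.
    ddb.send(new GetCommand({ TableName: 'Accounts', Key: { accountId }, ConsistentRead: false })),
  ]);
  return { user: user.Item, account: account.Item };
}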
I think the following scheme will be useful for you.
You will store both account and user records in the same table.
You want to get both account metadata and linked users in a single query
PK: account SK: recordId
=== Account record ===
account: 123512321 recordId: METADATA attributes: name, environment, ownerId...
=== User record ===
account: 123512321 recordId: USERID#34543543 attributes: name, email, phone...
With this denormalization of the data, you can retrieve both account metadata and related users in a single query. You can also change the account metadata without a need to apply any change to related users.
BONUS: you can also link other types of assets to the account record
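A hedged sketch of that single query with the AWS SDK v3 DocumentClient, assuming the partition key account and sort key recordId from the layout above (the table name is hypothetical):

const { DynamoDBClient } = require('@aws-sdk/client-dynamodb');
const { DynamoDBDocumentClient, QueryCommand } = require('@aws-sdk/lib-dynamodb');

const ddb = DynamoDBDocumentClient.from(new DynamoDBClient({}));

// One query returns the METADATA item and every USERID#... item for the account.
async function loadAccount(accountId) {
  const result = await ddb.send(new QueryCommand({
    TableName: 'AccountsAndUsers',                    // hypothetical table name
    KeyConditionExpression: '#pk = :acc',
    ExpressionAttributeNames: { '#pk': 'account' },
    ExpressionAttributeValues: { ':acc': accountId },
  }));
  const metadata = result.Items.find(item => item.recordId === 'METADATA');
  const users = result.Items.filter(item => item.recordId.startsWith('USERID#'));
  return { metadata, users };
}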

Firestore dynamically update security rules

Imagine we have a chat application, and in this application we have many rooms, some private and some open to everyone. Every room has an admin who can manage users (invite and remove). Only members of the room can read and write messages. An admin is the person who created the room in this scenario.
I want to create security rules on room creation and update them whenever the members change, so only members can read and write the content of the message board.
In this case, this is how it could look:
database/rooms/
private1
admin: memberX
members: member1, member2
//only admin can write into members fields
messages
message1...
message2...
message3...
//only members can write and read messages
private2
admin: memberXY
members: member1, member4
//only admin can write into members fields
messages
message1...
message2...
message3...
//only members can write and read messages
So is it possible to create and update security rules from a cloud function instead of manually updating them in the Firebase console? Or is there any way to automate this process?
I noticed that I can deploy security rules using the CLI. What should the process be here? When do I call it? How can I get the members from the database?
EDIT:
for anyone who wants more information check How to Build a Secure App in Firebase
I would rethink this model. Instead of updating the security rules all the time, I see several viable approaches:
Option 1
You can save which users can access a specific room in Firestore, and then in the security rules you can read the room's document and check whether the authenticated user is in the list of authorized users. The problem with this is cost, because it fires an extra database read for every operation, which can get expensive.
Option 2
You can create custom claims for the user using a cloud function, like this:
admin.auth().setCustomUserClaims(uid, {"rooms": "room1,room2"})
Then on the security rules you can check if the user has the claims to a specific room:
match /rooms/{roomId} {
allow read: if roomId in request.auth.token.rooms.split(',');
}
I believe you can also save the claim as an array directly, but I haven't tested it.
For this option you need to take into consideration the size of the token, which has a limit and can cause performance problems if it's too big. Depending on your scenario, you can define a smaller set of permissions and assign those to the rooms and the users.
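A hedged sketch of the claim stored as an array instead of a comma-separated string (the exact claims shape is an assumption; keep in mind the whole custom-claims payload is limited to 1000 bytes):

const admin = require('firebase-admin');
admin.initializeApp();

// Grant a user access to a set of rooms via a custom claim stored as an array.
async function setRoomAccess(uid, roomIds) {
  await admin.auth().setCustomUserClaims(uid, { rooms: roomIds });
  // The matching rule would then be roughly:
  //   allow read: if roomId in request.auth.token.rooms;
}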
Option 3
You could save the uid of the users who can access each document, and then check if the authenticated user's uid exists on that document. But this can get out of hand if you have too many users.
I would go with option 2 if it makes sense for your scenario. Or you could combine more than one of these techniques. My idea was to show a few of the possibilities so that you can choose what works for you.
Having different rules for each room and dynamically updating your rules is a bad idea. Here are a couple of problems that come to mind with this solution:
Who will be updating the rules?
What happens when two rooms get created at the same time?
What will happen when something goes wrong?
How will you maintain your rules when you have a million rooms?
Also, it may take a few minutes before changes to your rules take effect.
Instead you can, first of all, split your data structure into public rooms and private rooms: database/rooms/public/... and database/rooms/private/....
For securing your private rooms you can take a look at rules conditions and do something like: a member can read/write IF their UID is in /members (pseudo code, won't work like this).
You can take a look at this question for an example.
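For illustration, one possible shape of such a condition in Firestore security rules, assuming the private rooms live in their own collection and each room document stores an admin field and a members array of UIDs (collection and field names are assumptions):

match /privateRooms/{roomId} {
  // Only the room admin can change the room document (e.g. the members list).
  allow read: if request.auth != null && request.auth.uid in resource.data.members;
  allow write: if request.auth != null && request.auth.uid == resource.data.admin;

  match /messages/{messageId} {
    // Membership is checked on the parent room document (one extra read per request).
    allow read, write: if request.auth != null
      && request.auth.uid in get(/databases/$(database)/documents/privateRooms/$(roomId)).data.members;
  }
}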

Firebase User photoURL and displayName

I'm having a hard time trying to wrap my head around what I thought was a simple concept.
I have an app where I sign up a user and allow that user to set a 'photoURL' on their 'user' record in the Firebase Auth system. This works. When the user creates a post in my app, I want to display the title, image and 'photoURL' of the creator.
Currently, I save the post:
Post {
  id
  title
  image
  photoURL   <- from the currently logged-in user
}
I also allow users to visit the poster's page via routing /poster/'displayName'
So later, when a user updates their profile information like displayName or photoURL, do I need to go find all posts, comments, messages, replies and any other place that this user has a record and update the photoURL?
What I thought I would be able to do is say something like this (pseudo code; assumes each post stores its author's userId):
firebase.database().ref('posts').once('value').then(snapshot => {
  snapshot.forEach(postSnap => {
    const post = {
      title: postSnap.child('title').val(),
      image: postSnap.child('image').val(),
      // look up the author's current avatar instead of a stored copy
      avatar: firebase.database().ref('users/' + postSnap.child('userId').val() + '/photoURL')
    };
  });
});
Everything I read says I need to store that photoURL in my own 'Users' table. If I do that, then none of the posts get updated unless I write a server call to do that every time there is a change. The problem is, if I have 100K users and 10% of them change their photoURL, I then have to change it in posts, comments, replies and messages for each user. If the average user has 100 posts, 4,000 comments and 6,000 replies, we're looking at about 10K places * 10K users that have to be updated, and if the average server call is 137ms, then my costs are around $175.
The other option is to pull information from two tables and create a new object every time. This would lead to about double the server calls and time, thus doubling my costs.
Is this the best approach for this? I thought this would be a lot easier to just get the user photo and display name.
Sorry for the epic long post but I'm trying to learn. Thanks all!
What you're describing is a typical issue when working with NoSQL databases. On the one hand, data duplication makes your app and its queries run faster. On the other hand, if you want to change any of that duplicated data, it can be problematic to find and replace all occurrences.
There's no "best" way to determine what to do. It's completely up to your particular case. It sounds like, if you have extreme amounts of data duplication that could be costly to update, it would be better to simply query the user record every time rather than to do the updates. But again, it's ultimately up to you.

How can Firebase nodes be structured to restrict user access and allow admin to pull report data?

Context: I am putting together a time tracking application using Firebase as my backend. My current node structure has Time Entries and Clients at the root like so:
Time Entry
Entry ID
UserID
clientID, hours, date, description, etc
Clients
ClientID
name, projects, etc
This structure works fine if I'm just adding and pulling time entries based on the user, but I want to start putting together reports on a per client basis. Currently, this means making a separate HTTP request for each user and then filtering by the clientID to get at the data.
The rule structure for Firebase grants access to all child nodes once access is given to the parent node, so one big list doesn't work as it can't restrict users from seeing or editing each other's entries.
Question: Is there a way to structure the nodes that would allow for restricting users to only managing their own time entries, as well as allow for one query to pull all entries tied to a client?
The only solution I could come up with was duplicating the entries into a single node used just for reporting purposes, but this doesn't seem like a sustainable option.
@AL, your answer was what I ended up going with after scouring the docs across the web. Duplicating the data is the best route to take.
The new Firestore beta seems to provide some workarounds for this.
The way that I would do this is with Cloud Firestore.
Create a root collection clients and a document for each client. This partitions the data into easily manageable chunks, so that a client admin can see all data for their company.
Within the client document, create a sub-collection called timeEntries. When a user writes to this, they must include a userId field (you can enforce this in the rules) which is equal to request.auth.uid
https://firebase.google.com/docs/firestore/security/rules-conditions#data_validation
You can now create read rules which allow an admin to query any document in the timeEntries sub-collection, but an individual user must query with userId = request.auth.uid in order to only return the entries that they have created.
https://firebase.google.com/docs/firestore/security/rules-conditions#security_rules_and_query_results
Within your users/{uid} collection or clients/{clientId} collection, you can easily create a flag to identify admin users and check this when reading data.
https://firebase.google.com/docs/firestore/security/rules-conditions#access_other_documents
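A hedged sketch of those rules, assuming an isAdmin flag on the users/{uid} document (field names are assumptions); note that non-admin users must include where("userId", "==", uid) in their queries for the read rule to pass:

match /clients/{clientId}/timeEntries/{entryId} {
  // Entries must be created with the author's own uid.
  allow create: if request.auth != null
    && request.resource.data.userId == request.auth.uid;

  // Admins can read any entry; other users can only read their own.
  allow read: if request.auth != null
    && (resource.data.userId == request.auth.uid
        || get(/databases/$(database)/documents/users/$(request.auth.uid)).data.isAdmin == true);
}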

How to limit size of the data on the server side a user can fetch in Firebase?

I gave a talk about the basics of Firebase (http://szimek.github.io/presentation-firebase-intro) at our local meetup and got 2 interesting questions from the audience.
Imagine you have a Twitter-like app with billions of tweets and everyone has read access to them.
Is there a way to limit the size of the data (on the server side) a user can fetch? Even if I have a tweetsRef.limit(10) call, a user could easily change it to tweetsRef.limit(10e9) and try to fetch all the tweets.
How to prevent users from updating existing records (even if they were created by that user), but allow them to delete existing records (only if they were created by that user)?
If you are worried about limit manipulation you could just fetch the tweets on your server instead of the client (as you suggested) so users can't manipulate the limits.
For your second question, it depends on how you want to handle deletion. Often you don't actually want the object deleted, so you could just give the creating user write access on the deleted attribute. Alternatively, if you want them to actually delete the object, check to see that the user is the creator and that the value of newData is null.
Here is an example security rule from @Kato's comment below (writes/deletes allowed, updates prevented):
".write": "!data.exists() || !newData.exists()"
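A slightly fuller Realtime Database sketch along those lines, allowing creates and author-only deletes while blocking updates (the tweets path and the uid child are assumptions):

{
  "rules": {
    "tweets": {
      "$tweetId": {
        // Create: record doesn't exist yet and carries the author's uid.
        // Delete: record exists, the new data is null, and the caller is the author.
        ".write": "(!data.exists() && newData.child('uid').val() === auth.uid) || (data.exists() && !newData.exists() && data.child('uid').val() === auth.uid)"
      }
    }
  }
}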
