Are these two Firestore rules different at all in the number of reads that they spend from my quota? Note that isWebAdmin() does an exists(), which eats away from my read quota.
// example 1
match /companies/{company} {
  // rule 1
  allow list, write: if isWebAdmin();
  // rule 2
  allow get: if isInCompany(company)
    // when isInCompany is true, this is short-circuited away
    || isWebAdmin();
}
vs.
// example 2
match /companies/{company} {
  // rule 1
  allow read, write: if isWebAdmin();
  // rule 2
  allow get: if isInCompany(company);
}
Here is my (possibly faulty) reasoning: For most get requests isInCompany(company) will be true and isWebAdmin() will be false. Therefore, in example 2, even though the user is authorized to get with rule 2, rule 1 will also execute because get is also a read. So, while trying to give the admin access, I'm spending more reads for regular users who have access.
In example 1, I separate out get and list and treat them separately. In get requests, it will not run rule 1 at all. When running rule 2, since isInCompany(company) is true, isWebAdmin() won't execute because of short circuiting. So, in the common case I saved a read by avoiding calling isWebAdmin().
Is this correct? If so, simply slapping admin privileges adds gets for each user's regular operation. I find this a bit inconvenient. I guess if this is not the case, we should be billed by only the "effective" rule, not everything that was tested. Is that the case instead?
With Firebase security rules, boolean expressions do short circuit, which is a valid way of optimizing the costs of your rules. Use the more granular rules in example 1 for that.
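A toy model of the short-circuit argument may make the read counting easier to see. It is plain JavaScript, not the rules engine: isWebAdmin and isInCompany are stand-ins for the helpers in the question, the user objects are invented, and example2GetWorstCase encodes the asker's own assumption that the broader read rule's isWebAdmin() runs on a get.

```javascript
// Toy model only: counts how often a read-costing helper runs.
let billedReads = 0;

// Stand-in for isWebAdmin(), which calls exists() and so costs one read.
function isWebAdmin(user) {
  billedReads += 1;
  return user.admin === true;
}

// Stand-in for isInCompany(); assumed here to need no document read.
function isInCompany(user, company) {
  return user.company === company;
}

// Example 1: only the get rule applies, and || short-circuits.
function example1Get(user, company) {
  return isInCompany(user, company) || isWebAdmin(user);
}

// Example 2 under the asker's assumption: the read rule's isWebAdmin()
// is evaluated before the get rule is consulted.
function example2GetWorstCase(user, company) {
  return isWebAdmin(user) || isInCompany(user, company);
}

// Runs one rule function and reports how many reads it triggered.
function readsFor(ruleFn, user, company) {
  billedReads = 0;
  ruleFn(user, company);
  return billedReads;
}

const member = { company: 'acme', admin: false };
console.log(readsFor(example1Get, member, 'acme'));          // 0
console.log(readsFor(example2GetWorstCase, member, 'acme')); // 1
```

For a regular member the first layout never reaches the admin check, which is the saving the question describes.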
I'm really digging Firestore, but it's hard to find answers to specific questions, so here I am. This is just to be sure I understood properly how security rules work :)
Here's my schema:
/databases/{database}/documents/Bases/Base1 {
  roles: { // map
    user1: {admin: true}
  },
  Items: { // SubCollection
    item1: {
      name: "Hello World"
    },
    ...n,
    item10: {
      name: "Good Bye World"
    }
  }
}
I want my user1 to fetch all 10 items in Base1. The query is pretty simple: db.collection('Bases').doc('Base1').collection('Items').get()
But I also want to be sure that user1 is an admin in Base1. So I'm setting these security rules:
match /Bases/{baseId}/Items/{itemId} {
  allow read: if request.auth != null
    && get(/databases/$(database)/documents/Bases/$(baseId)).data.roles[request.auth.uid].admin == true;
}
Which works, all good. Here're the questions:
1/ I understand this rule's get() will cost me one read (which is very cheap, I know). Is it one read per query OR one read per document that needs to be validated? I.e. 10 reads in my case.
2/ I assume the answer to 1/ is that it'll cost 10 reads. But since I'm always querying the same $(baseId), will the cache kick in and, even if not guaranteed, drastically reduce the number of charged reads (theoretically 1 read even if I'm fetching 1000 docs)?
Any other advice on how to handle this kind of schema is welcome. I know read ops are very cheap, but I like to understand where I'm going :)
Thanks SO :)
The cost of a get() in security rules only applies once per query. It does not apply per document fetched.
Since you have one query, it will cost one read.
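As back-of-the-envelope arithmetic, that can be written down as a tiny estimate. This is only a sketch: estimateQueryReads and its parameters are invented names, and it assumes (per the answer above) that a document looked up by get() in a rule is charged once per query, on top of the usual one read per document the query returns.

```javascript
// Rough estimate, not a billing API: rule lookups are counted once per
// distinct dependent document, however many documents the query returns.
function estimateQueryReads(documentsReturned, dependentDocPaths) {
  const ruleReads = new Set(dependentDocPaths).size;
  return {
    documentReads: documentsReturned,
    ruleReads,
    total: documentsReturned + ruleReads,
  };
}

// The question's case: 10 items, one get() on the same Base1 document.
const estimate = estimateQueryReads(10, ['Bases/Base1']);
console.log(estimate); // { documentReads: 10, ruleReads: 1, total: 11 }
```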
If in a batch I update documents A and B and the rule for A does a getAfter(B) and the rule for B does a getAfter(A), am I charged with 2 reads for these or not? As they are part of the batch anyway.
Example rules:
match /collA/{docAid} {
  allow update: if getAfter(/databases/$(database)/documents/collA/$(docAid)/collB/$(request.resource.data.lastdocBidupdated)).data.timestamp == request.time
    && ...
}
match /collA/{docAid}/collB/{docBid} {
  allow update: if getAfter(/databases/$(database)/documents/collA/$(docAid)).data.timestamp == request.time
    && getAfter(/databases/$(database)/documents/collA/$(docAid)).data.lastdocBidupdated == docBid
    && ...
}
So are these 2 reads, 1 per rule, or no reads at all?
firebaser here
I had to check with our team for this. The first feedback is that it doesn't count against the maximum number of calls you can make in a single security rule evaluation run.
So the thinking is that it likely also won't count against documents read, since it doesn't actually read the document. That said: I'm asking around a bit more to see if I can get this confirmed, so hold on tight.
Are you using two different documents?
If so, two reads will be performed.
I'm crying myself to sleep on this one.
My getAfter is returning an object that only has 1 field, as every other field type is incorrect. I have no idea how to check this without any debugging tools (I can't see the data, so it's all guess and check).
Here is a watered down version of my rules for users.
match /users/{userId} {
  function isValidUser(user) {
    return user.id is string &&
      (user.address is string || user.address == null) &&
      (user.dateOfBirth is number || user.dateOfBirth == null) &&
      user.email is string &&
      user.name is string &&
      (user.phoneNumber is string || user.phoneNumber == null);
  }
  function isValidWrite(userId, user) {
    return signedIn() &&
      writeHasMatchingId(userId, user) &&
      isValidUser(user);
  }
  allow read: if signedIn();
  allow create: if signedInAndWriteHasMatchingId(userId) &&
    userHasId(userId) &&
    isValidUser(request.resource.data); // Tested
  allow update: if isValidWrite(
    userId,
    getAfter(/databases/$(database)/documents/users/$(userId))
  );
}
And this is the batched write I am trying to run.
const user1Ref = this.userCollection.doc(user1Id);
const user2Ref = this.userCollection.doc(user2Id);
const batchWrite = this.store.batch();
batchWrite.update(user1Ref, {
  "details.friend": user2Id,
});
batchWrite.update(user2Ref, {
  "details.wishlist": true,
});
batchWrite.commit();
If I comment out the isValidUser(user) line, the operation succeeds. If I leave any check inside isValidUser(user) active other than user.id is string, it fails.
Why would the getAfter document only have the id field and no others when they are listed in the Firebase console? Is there a way to output or debug the value of getAfter so I can see what it even is?
I'm answering based on just one line of your question:
Is there a way to output or debug the value of getAfter so I can see what it even is?
There kind of is - at least in 2020.
When one runs something in the Rules Playground (Rules Simulator, see bottom left), the steps taken in the rule evaluation are shown as a list of entries that can be expanded.
This list sometimes gives indications that help figure out what the rules evaluator is doing. It's a bit tedious that one needs to click the steps open individually, instead of seeing true/false at a glance. But it's better than nothing.
Note: I presume this feature is under development by Firebase. It sometimes seems to give wrong information - or I have failed to read it correctly. But it may help, and looks like a good place for providing such information to the developers. We really would like to see: with the current data, the built query document, and the rules, how does Firebase see it and why does the rule evaluate to true or false?
Another approach, not mentioned here yet and likely not available at the time the question was raised, is wrapping your rules with debug().
Why is this cool?
It lets you see the values suspected of not being right. I still use the same comment-out-and-narrow-down method that @ColdLogic nicely described in one of their comments.
Why is this not enough?
There is no tagging of which value was output; just e.g. int_value: 0. Debug would benefit from printing, e.g., the first 10 characters of the expression it's evaluating in the output.
Security Rules rejection reasons are still awfully short, such as false for 'update' @ L44.
The line number always points to the main expression being evaluated, never to a called function or to the && subexpression that really causes the failure.
Firebase could fix this (not change the output syntax; just give a more detailed line number). That would eliminate the need to comment-out-and-narrow-down.
The output goes to firestore-debug.log (fairly hidden), so one needs to open yet another terminal and keep an eye on it.
Debugging Security Rules is unnecessarily difficult - and I'm afraid it means people don't use their full potential. We should change this.
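Until the tooling improves, the comment-out-and-narrow-down loop can be shortened by mirroring a validation function outside the rules. Below is a minimal sketch in plain JavaScript of the isValidUser() checks from the question; failedUserChecks and the sample wrapper object are invented for illustration, and the sketch assumes that a missing field fails a check and that only an explicit null satisfies the == null branches.

```javascript
// Plain-JavaScript mirror of isValidUser(): returns the names of the
// checks that fail instead of a single true/false.
const isString = (value) => typeof value === 'string';
const isNumber = (value) => typeof value === 'number';

function failedUserChecks(user) {
  const checks = {
    id: isString(user.id),
    address: isString(user.address) || user.address === null,
    dateOfBirth: isNumber(user.dateOfBirth) || user.dateOfBirth === null,
    email: isString(user.email),
    name: isString(user.name),
    phoneNumber: isString(user.phoneNumber) || user.phoneNumber === null,
  };
  return Object.keys(checks).filter((name) => !checks[name]);
}

// Hypothetical sample: a wrapper object with the fields nested under `data`.
const wrapper = {
  id: 'user1',
  data: {
    id: 'user1',
    address: null,
    dateOfBirth: null,
    email: 'a@example.com',
    name: 'Ann',
    phoneNumber: null,
  },
};

console.log(failedUserChecks(wrapper));      // every check except `id` is reported
console.log(failedUserChecks(wrapper.data)); // []
```

If getAfter() hands back a wrapper of this shape, with the fields under .data (worth verifying against the rules reference), then passing it straight into the validator would produce exactly the "only id passes" symptom from the question.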
This is more of a theoretical question about how the database should be set up, and less about programming.
Let's say I have a news feed full of cards, each of which contains a message and a like count. Each user is able to like a message. I want each card to show a user whether they have already liked it. (The same way you can see the posts you liked on Facebook, even if you come back days later.)
How would you implement that with a Firestore-type database? Speed is definitely a concern.
Storing it locally isn't an option. My guess is that each card object would have to reference a collection that just keeps a list of the people who liked it. The only problem is that this means a lot more querying, which feels like it would be slow.
Is there a better way to do this?
TL;DR
This approach requires more to set up, i.e. a cron service, knowledge of Firestore Security Rules and Cloud Functions for Firebase. With that said, the following is the best approach I've come up with. Please note, only the pseudo-rules that are required are shown.
Firestore structure with some rules
/*
allow read
allow update if auth.uid == admin_uid and the
admin is updating total_likes ...
*/
messages/{message_key} : {
total_likes: <int>,
other_field:
[,...]
}
/*allow read
allow write if newData == {updated: true} and
docId exists under /messages
*/
messages_updated/{message_key} : {
updated: true
}
/*
allow read
allow create if auth.uid == liker_uid && !counted && !delete and
liker_uid/message_key match those in the docId...
allow update if auth.uid == admin_uid && the admin is
toggling counted from false -> true ...
allow update if auth.uid == liker_uid && the liker is
toggling delete ...
allow delete if auth.uid == admin_uid && delete == true and
counted == true
*/
likes/{liker_uid + '#' + message_key} : {
liker_uid:,
message_key:,
counted: <bool>,
delete: <bool>,
other_field:
[,...]
}
count_likes/{request_id}: {
message_key:,
request_time: <timestamp>
}
Functions
Function A
Triggered every X minutes to count message likes for potentially all messages.
1. query /messages_updated for BATCH_SIZE docs
2. for each, set its docId to true in a local object.
3. go to step 1 if BATCH_SIZE docs were retrieved (there's more to read in)
4. for each message_key in local object, add to /count_likes a doc w/ fields request_time and message_key.
Function B
Triggered onCreate of count_likes/{request_id}
1. Delete the created doc's message_key from /messages_updated.
2. let delta_likes = 0
3. query /likes for docs where message_key == the created doc's message_key and where counted == false.
   for each, try to update counted to true (in parallel, not atomically)
   if successful, increment delta_likes by 1.
4. query /likes for docs where message_key == the created doc's message_key, where delete == true and where counted == true.
   for each doc, try to delete it (in parallel, not atomically)
   if successful, decrement delta_likes by 1
5. if delta_likes != 0, transact the total likes for this message under /messages by delta_likes.
6. delete this doc from /count_likes.
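The counting part of Function B can be sketched against an in-memory array to see why a repeated invocation does not double count. This is only a model: countLikesForMessage is an invented name, the array stands in for docs under /likes, and no Firestore calls, parallelism or failure handling are included.

```javascript
// In-memory model of Function B's counting steps for one message.
function countLikesForMessage(likes, messageKey) {
  let deltaLikes = 0;

  // Uncounted likes are marked as counted; each one adds 1.
  for (const like of likes) {
    if (like.message_key === messageKey && like.counted === false) {
      like.counted = true;
      deltaLikes += 1;
    }
  }

  // Counted likes flagged for deletion are dropped; each one subtracts 1.
  const remaining = [];
  for (const like of likes) {
    const removable =
      like.message_key === messageKey && like.delete === true && like.counted === true;
    if (removable) {
      deltaLikes -= 1;
    } else {
      remaining.push(like);
    }
  }

  return { deltaLikes, remaining };
}

const likes = [
  { liker_uid: 'u1', message_key: 'm1', counted: false, delete: false },
  { liker_uid: 'u2', message_key: 'm1', counted: false, delete: false },
  { liker_uid: 'u3', message_key: 'm1', counted: true, delete: true },
  { liker_uid: 'u4', message_key: 'm2', counted: false, delete: false },
];

const firstRun = countLikesForMessage(likes, 'm1');
const secondRun = countLikesForMessage(firstRun.remaining, 'm1');
console.log(firstRun.deltaLikes, secondRun.deltaLikes); // 1 0
```

The second run finds nothing left to count or delete, so the delta is zero and the counter would not be touched again.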
Function C (optional)
Triggered every Y minutes to delete /count_likes requests that were never met.
query docs under /count_likes that have request_time older than Z.
for each doc, delete it.
On the client
to see if you liked a message, query under /likes for a doc where liker_uid equals your uid, where message_key equals the message's key and where delete == false. if a doc exists, you have liked it.
to like a message, batch.set a like under /likes and batch.set a /messages_updated. if this batch fails, try a batch_two.update on the like by updating its delete field to false and batch_two.set its /messages_updated.
to unlike a message, batch.update on the like by updating its delete field to true and batch.set its /messages_updated.
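The three client operations can be modelled with a Map and a Set to check that toggling behaves as described. This is a simplified sketch: likeDocId, like, unlike and hasLiked are invented names, a lookup by document ID replaces the query, and the two in-memory collections stand in for /likes and /messages_updated rather than calling Firestore.

```javascript
// Document ID scheme from the structure above: liker_uid + '#' + message_key.
const likeDocId = (likerUid, messageKey) => likerUid + '#' + messageKey;

// Create the like doc, or revive a soft-deleted one, and flag the message.
function like(likesById, messagesUpdated, likerUid, messageKey) {
  const id = likeDocId(likerUid, messageKey);
  const existing = likesById.get(id);
  if (existing === undefined) {
    likesById.set(id, {
      liker_uid: likerUid,
      message_key: messageKey,
      counted: false,
      delete: false,
    });
  } else {
    existing.delete = false;
  }
  messagesUpdated.add(messageKey);
}

// Soft-delete the like doc and flag the message for recounting.
function unlike(likesById, messagesUpdated, likerUid, messageKey) {
  const existing = likesById.get(likeDocId(likerUid, messageKey));
  if (existing === undefined) return; // nothing to unlike
  existing.delete = true;
  messagesUpdated.add(messageKey);
}

// A message counts as liked when the doc exists and is not soft-deleted.
function hasLiked(likesById, likerUid, messageKey) {
  const existing = likesById.get(likeDocId(likerUid, messageKey));
  return existing !== undefined && existing.delete === false;
}
```

Because the document ID is derived from the user and the message, liking twice can never create a second document, which is what makes spam toggling harmless.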
Pros of this approach
this can be extended to counters for other things, not just messages.
a user can see if they've liked something.
a user can only like something once.
a user can spam toggle a like button and this still works.
any user can see who's liked what message by querying /likes by message_key.
any user can see all the messages any user has liked by querying /likes by liker_uid.
only a cloud function admin updates your like counts.
if a function is fired multiple times for the same event, this function is safe, meaning like counts will not be incremented multiple times for the same like.
if a function is not fired for some event, this approach still works. It just means that the count will not update until the next time someone else likes the same message.
likes are denormalized to only one root level collection, instead of the two that would be required if you had the like under the message's likes subcollection and under the liker's messages_liked subcollection.
like counts for each message are updated in batches, ie if something has been liked 100 times, only 1 transaction of 100 is required, not 100 transactions of 1. this reduces write rate conflicts due to like counter transactions significantly.
Cons of this approach
Counts are only updated however often your cron job fires.
Relies on a cron service to fire and in general there's just more to set up.
Requires the function to authenticate with limited privileges to perform secure writes under /likes. In the Realtime Database this is possible. In Firestore, it's possible, but a bit hacky. If you can wait and don't want to use the hacky approach, use the regular unrestricted admin in development until Firestore supports authenticating with limited privileges.
May be costly depending on your standpoint. There are function invocations and read/write counts you should think about.
Things to consider
When you transact the count in Function B, you may want to consider trying this multiple times in case the max write rate of 1/sec is exceeded and the transaction fails.
In Function B, you may want to implement batch reading like in Function A if you expect to be counting a lot of likes per message.
If you need to update anything else periodically for the message (in another cron job), you may want to consider merging that function into Function B so the write rate of 1/sec isn't exceeded.
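For the first point, retrying the counter transaction could look roughly like the helper below. It is a generic sketch, not Firebase API: retryWithBackoff, its options and the injected sleep function are all invented here, and operation would be whatever async function wraps your transaction.

```javascript
// Default sleep based on setTimeout; can be replaced in tests.
const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Calls `operation` until it resolves or `maxAttempts` is reached,
// doubling the wait between attempts (baseDelayMs, 2x, 4x, ...).
async function retryWithBackoff(operation, options = {}) {
  const { maxAttempts = 5, baseDelayMs = 500, sleep = defaultSleep } = options;
  let lastError;
  for (let attempt = 0; attempt < maxAttempts; attempt += 1) {
    try {
      return await operation(attempt);
    } catch (error) {
      lastError = error;
      // No point in sleeping after the final failed attempt.
      if (attempt < maxAttempts - 1) {
        await sleep(baseDelayMs * 2 ** attempt);
      }
    }
  }
  throw lastError;
}
```

A call might look like retryWithBackoff(() => updateTotalLikes(messageKey, deltaLikes)), where updateTotalLikes is a hypothetical function that runs the transaction.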
I need help with a scenario where we do multi-path updates to fanned-out data. Between the time we work out the set of paths and the time we apply the update, a new path may be added somewhere, and the data at the newly added path would then be inconsistent.
For example, below is the data for blog posts. Posts can be tagged with multiple terms like "tag1", "tag2". In order to find how many posts are tagged with a specific tag, I can fan out the post data to the tags path as well:
/posts/postid1: {"Title": "Title 1", "body": "About Firebase", "tags": {"tag1": true, "tag2": true}}
/tags/tag1/postid1: {"Title": "Title 1", "body": "About Firebase"}
/tags/tag2/postid1: {"Title": "Title 1", "body": "About Firebase"}
Now consider the following happening concurrently:
1a) User1 wants to modify the title of postid1 and builds the following multi-path update:
/posts/postid1/Title: "Title 1 modified"
/tags/tag1/postid1/Title: "Title 1 modified"
/tags/tag2/postid1/Title: "Title 1 modified"
1b) At the same time, User2 wants to add tag3 to postid1 and builds the following multi-path update:
/posts/postid1/tags: {"tag1": true, "tag2": true, "tag3": true}
/tags/tag3/postid1: {"Title": "Title 1", "body": "About Firebase"}
Both updates can succeed one after the other, leaving the tags/tag3/postid1 data out of sync because it still has the old title.
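The interleaving can be replayed against a plain object to make the stale copy visible. This is an in-memory sketch only: applyMultiPathUpdate is an invented helper that mimics a multi-path update by setting each path in turn, and no Firebase SDK is involved.

```javascript
// Applies a flat { path: value } object to a nested in-memory tree.
function applyMultiPathUpdate(db, update) {
  for (const [path, value] of Object.entries(update)) {
    const keys = path.split('/').filter(Boolean);
    let node = db;
    for (const key of keys.slice(0, -1)) {
      if (typeof node[key] !== 'object' || node[key] === null) node[key] = {};
      node = node[key];
    }
    node[keys[keys.length - 1]] = value;
  }
}

const post = { Title: 'Title 1', body: 'About Firebase' };
const db = {
  posts: { postid1: { ...post, tags: { tag1: true, tag2: true } } },
  tags: { tag1: { postid1: { ...post } }, tag2: { postid1: { ...post } } },
};

// User2 reads the post before User1's write lands.
const staleCopy = { ...post };

// 1a) User1 renames the post at every path that exists at that moment.
applyMultiPathUpdate(db, {
  '/posts/postid1/Title': 'Title 1 modified',
  '/tags/tag1/postid1/Title': 'Title 1 modified',
  '/tags/tag2/postid1/Title': 'Title 1 modified',
});

// 1b) User2 adds tag3 using the copy read earlier.
applyMultiPathUpdate(db, {
  '/posts/postid1/tags': { tag1: true, tag2: true, tag3: true },
  '/tags/tag3/postid1': staleCopy,
});

console.log(db.posts.postid1.Title);     // Title 1 modified
console.log(db.tags.tag3.postid1.Title); // Title 1
```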
I can think of security rules to handle this, but I'm not sure whether this is correct or will work.
For example, we can have updatedAt and lastUpdated fields and check that we are updating the same version of the post that we read:
"posts": {
  "$postid": {
    ".write": true,
    ".read": true,
    ".validate": "newData.hasChildren(['userId', 'updatedAt', 'lastUpdated', 'Title']) && (!data.exists() || data.child('updatedAt').val() === newData.child('lastUpdated').val())"
  }
}
For tags we do not want to repeat that check; instead we can check that /tags/$tag/$postid/updatedAt is the same as /posts/$postid/updatedAt.
"tags": {
  "$tag": {
    "$postid": {
      ".write": true,
      ".read": true,
      ".validate": "newData.hasChildren(['userId', 'updatedAt', 'lastUpdated', 'Title']) && (newData.child('updatedAt').val() === root.child('posts').child($postid).child('updatedAt').val())"
    }
  }
}
With this, /posts/$postid has concurrency control built in: users can only overwrite the version they read.
Also, /posts/$postid becomes the source of truth, and the other fan-out paths check that their updatedAt field matches the one at that primary path.
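The version check that the posts rule is meant to enforce can be modelled as a compare-and-set in plain JavaScript. This sketch only mirrors the .validate expression for /posts/$postid; validatePostWrite, tryWrite and the sample writes are invented names and values, and it says nothing about how the database orders or isolates real writes.

```javascript
const REQUIRED_FIELDS = ['userId', 'updatedAt', 'lastUpdated', 'Title'];

// Mirrors: hasChildren([...]) && (!data.exists() || updatedAt === lastUpdated)
function validatePostWrite(existing, incoming) {
  const hasChildren = REQUIRED_FIELDS.every(
    (key) => incoming[key] !== undefined && incoming[key] !== null
  );
  if (!hasChildren) return false;
  return existing === null || existing.updatedAt === incoming.lastUpdated;
}

// Applies the write only when the validation passes.
function tryWrite(store, postId, incoming) {
  const existing = store[postId] !== undefined ? store[postId] : null;
  if (!validatePostWrite(existing, incoming)) return false;
  store[postId] = incoming;
  return true;
}

const store = {
  postid1: { userId: 'u0', Title: 'Title 1', updatedAt: 1, lastUpdated: 0 },
};

// Both users read version 1; User1 writes first.
const user1Write = { userId: 'u1', Title: 'Title 1 modified', updatedAt: 2, lastUpdated: 1 };
const user2Write = { userId: 'u2', Title: 'Title 1', updatedAt: 3, lastUpdated: 1 };

console.log(tryWrite(store, 'postid1', user1Write)); // true
console.log(tryWrite(store, 'postid1', user2Write)); // false
```

The second write is rejected because the stored updatedAt has moved on to 2 while the writer still claims to have read version 1.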
Will this bring consistency, or are there still problems? Could it degrade performance at scale?
Are multi-path updates and rules atomic together? By that I mean: for multi-path updates like 1a and 1b above, are all the rules evaluated together, or is each rule evaluated separately in isolation?
Unfortunately, Firebase does not provide any guarantees, or mechanisms, to provide the level of determinism you're looking for. I have had the best luck front-ending such updates with an API stack (GCF and Lambda are both very easy, server-less methods of doing this). The updates can be made in that layer, and even serialized if absolutely necessary. But there isn't a safe way to do this in Firebase itself.
There are numerous "hack" options you could apply. You could, for example, have a simple lock mechanism using a dedicated collection for tracking write locks. Clients could post to a lock collection, then verify that their key was the only member of that collection, before performing a write. But I hope you'll agree with me that such cooperative systems have too many potential edge cases, potential security issues, and so on. In Firebase, it is best to design such that this component is not a requirement in the first place.
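If you do go the API-layer route, serializing the updates inside a single process can be as simple as chaining promises. The sketch below is generic JavaScript with invented names (createSerialQueue, enqueue); it assumes all writes for a post pass through one server instance, which a multi-instance GCF or Lambda deployment would not give you without extra coordination.

```javascript
// Minimal promise-chain queue: each task starts only after the
// previous one has settled, whether it resolved or rejected.
function createSerialQueue() {
  let tail = Promise.resolve();
  return function enqueue(task) {
    const result = tail.then(() => task());
    // Swallow rejections on the chain itself so later tasks still run;
    // the caller still sees the rejection through `result`.
    tail = result.catch(() => {});
    return result;
  };
}
```

Each handler would then wrap its read-modify-write in enqueue(() => ...), so an update like 1b above would only read the post after 1a has finished writing.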