How do I prevent updates to nonexistent nodes? - firebase

I am using the REST API to PATCH nodes in my Firebase store. I would like to prevent updates to nodes that do not exist (because they were previously deleted). Right now, doing a PATCH to a nonexistent reference recreates it.
I looked into setting security rules, but newData.exists() does not discriminate between setting a new value and patching, so I couldn't figure out how to allow what I want without restricting new creations.
I can get a snapshot of the reference and check that before patching, but I was hoping for a more elegant way to do it without using two REST calls.
EDIT: some code!
My Firebase schema looks like:
requests:
  rq123:
    id: '123'
    sender: '1'
    recipient: '2'
    expiration: '1234567'
    filled: false
    filledDate: ''
New requests are written from a mobile client. My server can make updates to those request entries using the REST API. Using the python-firebase library, that looks like:
request_ref = firebase_root + '/requests/' + request.id
patch_data = {
    'filled': True,
    'filled_date': '7654321'
}
firebase_conn.patch(request_ref, patch_data)
Given the design of my app, I'd like to only perform that patch if the request entry still exists. It's clear that I can get a snapshot and perform the check that way before patching, but that seemed awkward to me.
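For reference, here's a minimal sketch of that two-call approach using the plain requests library against the REST API (the database URL is a placeholder):

import requests

FIREBASE_URL = 'https://<your-db>.firebaseio.com'

def patch_if_exists(request_id, patch_data):
    url = FIREBASE_URL + '/requests/' + request_id + '.json'
    # Call 1: read the node; the REST API returns JSON null if it doesn't exist
    if requests.get(url).json() is None:
        return False
    # Call 2: the node existed at read time, so patch it. Note the race window:
    # the node could still be deleted between the two calls.
    requests.patch(url, json=patch_data)
    return True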

As I already remarked in the comments, there is no difference between these cases:
Writing to a database location that doesn't exist yet
Writing to a database location that doesn't exist anymore
So you will have to create that distinction in your application.
You have a few options. Not all of them apply to the REST API, but I'll mention them for completeness anyway.
Transactional update
The Firebase SDKs (for JavaScript, Java/Android and iOS/OSX) have a method called transaction, which allows you to run a compare-and-set operation.
With that operation you could do something like:
ref.transaction(function(current) {
  if (current) {
    current.filled = true;
    current.filled_date = '7654321';
  }
  return current;
});
But since this method is not available on the REST API, it doesn't apply to your scenario.
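That said, the REST API does offer a compare-and-set analogue: conditional requests based on ETags. To my knowledge these apply to PUT and DELETE rather than PATCH, so you would write the merged object back with PUT. A hedged sketch with the requests library (placeholder URL):

import requests

url = 'https://<your-db>.firebaseio.com/requests/rq123.json'

# Ask the server to return an ETag for the current data at this location
resp = requests.get(url, headers={'X-Firebase-ETag': 'true'})
current = resp.json()
if current is not None:
    current['filled'] = True
    current['filledDate'] = '7654321'
    # The PUT only succeeds if the location is unchanged since the read;
    # otherwise the server responds with 412 Precondition Failed
    requests.put(url, json=current, headers={'if-match': resp.headers['ETag']})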
Mark deleted records
Alternatively you can mark deleted records, instead of actually deleting them:
requests:
  rq123:
    id: '123'
    sender: '1'
    recipient: '2'
    expiration: '1234567'
    filled: false
    filledDate: ''
    DELETED: true
You could also delete all the other properties when you pseudo-delete the request, i.e.
requests:
  rq123:
    DELETED: true
Then in your security rules, you can reject the write operation when this flag is present:
".write": "data.child('DELETED').val() != true"
There are many ways to flag the record. For example, it seems like in your case the record node will always have an id property. So you could also simply leave the record node as a marker, but remove all its properties:
requests:
  rq123: true
Since Firebase deletes nodes that don't have a value, I put true in here as the value.
With the above structure, we can allow only writes that either contain an id property (which is the case when you create the request) or target a node where an id property is already present (the PATCH request from the REST API):
".write": "newData.child('id').exists() || data.child('id').exists()"
Keep a list of deleted nodes
My final approach would be to keep a list of the deleted request keys:
requests:
rq123:
id: '123'
sender: '1'
recipient: '2'
expiration: '1234567'
filled: false
filledDate: ''
deleted:
rq456: true
rq789: true
Once again, we set a dummy value of true for the deleted nodes to prevent Firebase from deleting them.
With this structure, you can reject write operations when the key you're writing to exists in the list of deleted requests:
".write": "!root.child('deleted').child(newData.key()).exists()"
Each approach has its own advantages and disadvantages, so you'll have to decide for yourself which one is best for your scenario.

Related

JSON API Standard

I'm unsure which option is better when using the JSON:API standard for communication between backend and frontend. I need only one attribute from the author association, "username", and everything else should be hidden from the user who fetches this.
Case a)
data: [
  {
    id: "100",
    type: "resource1",
    attributes: {...},
    relationships: {author: {data: {id: "10", type: "author"}}}
  }
],
included: [
  {
    id: "10",
    type: "author",
    attributes: {username: "name"},
    relationships: {resources1: {data: [{id: "100", type: "resource1"}]}}
  }
]
Case b)
data: [
  {
    id: "100",
    type: "resource1",
    attributes: {authorName: "name", ...},
    relationships: {author: {data: {id: "10", type: "author"}}}
  }
],
included: []
Case a) looks more semantic, but it serves much more information in the payload.
Case b) makes it faster to get what I want from the author (the single attribute "username", exposed as the extra attribute authorName), and I also don't need to deal with associations on the frontend side.
Any thoughts which is better practice and why?
Strictly speaking, both case a and case b are valid per the JSON:API specification.
In case a, username is an attribute of the author resource. In case b, authorName is an attribute of resource1. The author resource may have a username attribute in case b as well; in that case you have duplicated state.
I would recommend using duplicated state only if you have very good reasons. Duplicated state increases complexity, both on the server side and on the client side. Keeping both attributes in sync comes with high costs. E.g. after a successful update request that changes the username of the author resource, you need to inform the client that resource1 changed as well, and the client needs to parse that response and update its local cache.
There are some cases in which duplicating state pays off. Calculated values that would otherwise require a client to fetch many resources are a typical example. E.g. you may decide to introduce an averageRating attribute on a product resource, because without it a client would need to fetch all related ratings of a product only to calculate it.
Trying to reduce payload size is almost never a good reason to accept the increased complexity. If you take compression and packet sizes at the network level into account, the raw payload size often doesn't make a big difference.

When are writeFields specified in Firestore requests and what replaces them?

The simulator now displays an error message trying to access request.writeFields.
Before that, writeFields in Firestore Security Rules simply did not work in real requests.
The message states the following:
The simulator only simulates client SDK calls; request.writeFields is always null for these simulations
Does this mean that writeFields are only specified in HTTP requests?
The documentation only states this:
writeFields: List of fields being written in a write request.
A problem that arises from this
I am searching for something that replaces this property because it is "always null".
To my knowledge, request.resource.data in update also contains fields that are not in the request but are already in the document.
Example
// Existing document:
document:
- name: "Peter"
- age: 52
- profession: "Baker"
// Update call:
document:
- age: 53
// request.resource.data in allow update contains the following:
document:
- name: "Peter"
- age: 53
- profession: "Baker"
But I only want age.
EDIT Mar 4, 2020: Map.diff() replaces writeFields functionality
The Map.diff() function gives the difference between two maps:
https://firebase.google.com/docs/reference/rules/rules.Map#diff
To use it in rules:
// Returns a MapDiff object
map1.diff(map2)
A MapDiff object has the following methods
addedKeys() // a set of strings of keys that are in after but not before
removedKeys() // a set of strings of keys that are in before but not after
changedKeys() // a set of strings of keys that are in both maps but have different values
affectedKeys() // a set of strings that's the union of addedKeys() + removedKeys() + changedKeys()
unchangedKeys() // a set of strings of keys that are in both maps and have the same value in both
For example:
// This rule only allows updates where "a" is the only field affected
request.resource.data.diff(resource.data).affectedKeys().hasOnly(["a"])
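For context, a minimal sketch of how that rule might sit in a full ruleset (the collection name docs and the field "a" are placeholders):

service cloud.firestore {
  match /databases/{database}/documents {
    match /docs/{docId} {
      // allow updates only when "a" is the only field affected
      allow update: if request.resource.data.diff(resource.data)
                       .affectedKeys().hasOnly(["a"]);
    }
  }
}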
EDIT Oct 4, 2018: writeFields is no longer supported by Firestore and its functionality will eventually be removed.
writeFields is still valid, as you can see from the linked documentation. What the error message in the simulator is telling you is that it's unable to simulate writeFields, as it only works with requests coming from client SDKs. The simulator itself seems to be incapable of simulating requests exactly as required in order for writeFields to be tested. So, if you write rules that use writeFields, you'll have to test them by using a client SDK to perform the read or write that would trigger the rule.

Tracking if a User 'likes' a post

This is more of a theoretical question about how the database should be set up, and less about programming.
Let's say I have a news feed full of cards, each containing a message and a like count. Each user is able to like a message. I want to display to a user whether they have already liked that particular card (the same way you can see the posts you liked on Facebook, even if you come back days later).
How would you implement that with this Firestore type database? Speed is definitely a concern..
Storing it locally isn't an option. My guess would be that each card object would have to reference a collection that just keeps a list of the people who liked it. The only thing is that that is a lot more querying, which feels like it would be slow.
Is there a better way to do this?
TL;DR
This approach requires more setup, i.e. a cron service, knowledge of Firestore Security Rules, and Cloud Functions for Firebase. With that said, the following is the best approach I've come up with. Please note, only the pseudo-rules that are required are shown.
Firestore structure with some rules
/*
  allow read
  allow update if auth.uid == admin_uid and the
    admin is updating total_likes ...
*/
messages/{message_key}: {
  total_likes: <int>,
  other_field:
  [,...]
}

/*
  allow read
  allow write if newData == {updated: true} and
    docId exists under /messages
*/
messages_updated/{message_key}: {
  updated: true
}

/*
  allow read
  allow create if auth.uid == liker_uid && !counted && !delete and
    liker_uid/message_key match those in the docId ...
  allow update if auth.uid == admin_uid && the admin is
    toggling counted from false -> true ...
  allow update if auth.uid == liker_uid && the liker is
    toggling delete ...
  allow delete if auth.uid == admin_uid && delete == true and
    counted == true
*/
likes/{liker_uid + '#' + message_key}: {
  liker_uid:,
  message_key:,
  counted: <bool>,
  delete: <bool>,
  other_field:
  [,...]
}

count_likes/{request_id}: {
  message_key:,
  request_time: <timestamp>
}
Functions
Function A
Triggered every X minutes to count message likes for potentially all messages (see the sketch after these steps).
1. query /messages_updated for BATCH_SIZE docs
2. for each, set its docId to true in a local object.
3. go to step 1 if BATCH_SIZE docs were retrieved (there's more to read in)
4. for each message_key in the local object, add to /count_likes a doc with the fields request_time and message_key.
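A hedged sketch of Function A using the google-cloud-firestore Python client (the client choice and BATCH_SIZE value are assumptions; collection names follow the structure above):

from google.cloud import firestore

db = firestore.Client()
BATCH_SIZE = 100

def request_like_counts():
    # steps 1-3: page through /messages_updated, collecting updated keys
    updated_keys = set()
    query = db.collection('messages_updated').order_by('__name__').limit(BATCH_SIZE)
    docs = list(query.stream())
    while docs:
        updated_keys.update(doc.id for doc in docs)  # doc.id is the message_key
        if len(docs) < BATCH_SIZE:
            break  # nothing more to read in
        docs = list(query.start_after(docs[-1]).stream())
    # step 4: enqueue one count request per updated message
    for message_key in updated_keys:
        db.collection('count_likes').add({
            'message_key': message_key,
            'request_time': firestore.SERVER_TIMESTAMP,
        })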
Function B
Triggered onCreate of count_likes/{request_id}
1. Delete the created doc's message_key from /messages_updated.
2. let delta_likes = 0
3. query /likes for docs where message_key == the created doc's message_key and where counted == false.
4. for each, try to update counted to true (in parallel, not atomically); if successful, increment delta_likes by 1.
5. query /likes for docs where message_key == the created doc's message_key, where delete == true and where counted == true.
6. for each doc, try to delete it (in parallel, not atomically); if successful, decrement delta_likes by 1.
7. if delta_likes != 0, transact the total likes for this message under /messages by delta_likes.
8. delete this doc from /count_likes.
Function C (optional)
Triggered every Y minutes to delete /count_likes requests that were never met (see the sketch after these steps).
1. query docs under /count_likes that have a request_time older than Z.
2. for each doc, delete it.
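A hedged sketch of Function C under the same assumptions (Z is a hypothetical staleness threshold):

import datetime

from google.cloud import firestore

db = firestore.Client()
Z = datetime.timedelta(hours=1)  # hypothetical value for Z

def delete_stale_count_requests():
    cutoff = datetime.datetime.now(datetime.timezone.utc) - Z
    # step 1: find count requests older than Z
    stale = db.collection('count_likes').where('request_time', '<', cutoff).stream()
    # step 2: delete each one
    for doc in stale:
        doc.reference.delete()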
On the client
to see if you liked a message, query under /likes for a doc where liker_uid equals your uid, where message_key equals the message's key and where delete == false. If such a doc exists, you have liked it.
to like a message, batch.set a like under /likes and batch.set its /messages_updated doc. If this batch fails, try a batch_two.update on the like, updating its delete field to false, and batch_two.set its /messages_updated doc (both flows are sketched after this list).
to unlike a message, batch.update the like, updating its delete field to true, and batch.set its /messages_updated doc.
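A sketch of the like/unlike flow above with the same Python client (note that this admin-style client bypasses security rules, so the failing-batch branch is only illustrative of what a client SDK governed by the rules would experience; the ids are placeholders):

from google.cloud import firestore

db = firestore.Client()
my_uid, message_key = 'user1', 'msg1'  # placeholder ids
like_id = my_uid + '#' + message_key

def like_message():
    # first attempt: create the like and flag the message as updated
    batch = db.batch()
    batch.set(db.collection('likes').document(like_id), {
        'liker_uid': my_uid,
        'message_key': message_key,
        'counted': False,
        'delete': False,
    })
    batch.set(db.collection('messages_updated').document(message_key), {'updated': True})
    try:
        batch.commit()
    except Exception:
        # the like already exists: flip its delete flag back to false instead
        batch_two = db.batch()
        batch_two.update(db.collection('likes').document(like_id), {'delete': False})
        batch_two.set(db.collection('messages_updated').document(message_key), {'updated': True})
        batch_two.commit()

def unlike_message():
    batch = db.batch()
    batch.update(db.collection('likes').document(like_id), {'delete': True})
    batch.set(db.collection('messages_updated').document(message_key), {'updated': True})
    batch.commit()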
Pros of this approach
this can be extended to counters for other things, not just messages.
a user can see if they've liked something.
a user can only like something once.
a user can spam toggle a like button and this still works.
any user can see who's liked what message by querying /likes by message_key.
any user can see all the messages any user has liked by querying /likes by liker_uid.
only a cloud function admin updates your like counts.
if a function is fired multiple times for the same event, this function is safe, meaning like counts will not be incremented multiple times for the same like.
if a function is not fired for some event, this approach still works. It just means that the count will not update until the next time someone else likes the same message.
likes are denormalized to only one root-level collection, instead of the two that would be required if you had the like under the message's likes subcollection and under the liker's messages_liked subcollection.
like counts for each message are updated in batches, i.e. if something has been liked 100 times, only 1 transaction of 100 is required, not 100 transactions of 1. This significantly reduces write-rate conflicts due to like-counter transactions.
Cons of this approach
Counts are only updated however often your cron job fires.
Relies on a cron service to fire and in general there's just more to set up.
Requires the function to authenticate with limited privileges to perform secure writes under /likes. In the Realtime Database this is possible. In Firestore, it's possible, but a bit hacky. If you can wait and don't want to use the hacky approach, use the regular unrestricted admin in development until Firestore supports authenticating with limited privileges.
May be costly depending on your standpoint. There are function invocations and read/write counts you should think about.
Things to consider
When you transact the count in Function B, you may want to consider trying this multiple times in case the max write rate of 1/sec is exceeded and the transaction fails.
In Function B, you may want to implement batch reading like in Function A if you expect to be counting a lot of likes per message.
If you need to update anything else periodically for the message (in another cron job), you may want to consider merging that function into Function B so the write rate of 1/sec isn't exceeded.

Firebase database rules – `data.exists()` always seems to be true, possible bug?

I am trying to secure my firebase database to allow the creation of new records, but not allow the deletion of existing records. Ultimately, I plan to utilise Firebase authentication in my app as well, and allow users to update existing records if they are the author, but I am trying to get the simple case working first.
However! No matter what I try in the database rules simulator, despite what the documentation seems to suggest, the value of data.exists() seems to always be true. From what I can understand from the documentation, the variable data represents a record in the database as it was before an operation took place. That is to say, for creates, data would not exist, and for updates/deletes, data would refer to a real record that exists in the database. This does not seem to be the case, to the point where I actually suspect a bug in Firebase, as setting the following rules on my database disallows all write operations:
{
  "rules": {
    ".read": true,
    ".write": "!data.exists()"
  }
}
No matter what values I put into the simulator, be it Location or Data. I have even written a small EmberJS app to verify whether the Simulator is telling the truth, and it, too, is denied permission for all write operations.
I really have no idea where to go from here as I am pretty much out of things to try. I tried deleting all records from my database, which lets the simulator think it can perform write operations, but my test app still gets PERMISSION_DENIED, so I don't know what's causing inconsistencies there.
Is my understanding of the predefined data variable correct? If so, why can't I write the rules I want? I have seen snippets literally trying to achieve my "create only, no-delete" rule that seem to line up with my understanding.
Last note: I am trying this in a totally new Firebase project with JUST the rules above, and only a few records of junk data lying around my database.
Because you have placed the !data.exists() at the root location of your database, data refers to the entire database. You will only be able to write to the database when it is completely empty.
You indicate that you run your tests with only "a few records of junk data lying around my database". Those records will cause data.exists() to be true.
You can achieve your goal by placing the !data.exists() rule in your tree at the specific location where you want to require that no data already exists. This is typically done at a location with a wildcard key, as in the example you linked:
{
  "rules": {
    // default rules are false if not specified
    "posts": {
      ".read": true, // everyone can read all posts
      "$postId": {
        // a new post can be created if it does not exist
        // existing posts can only be edited by their original "author"
        ".write": "!data.exists() && newData.exists() || data.child('author').val() == auth.uid",
        ".validate": "newData.hasChildren(['title', 'author', 'timestamp'])"
      }
    }
  }
}
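A quick way to convince yourself the wildcard version behaves as intended is two unauthenticated REST calls (sketch with the requests library; placeholder URL):

import requests

url = 'https://<your-db>.firebaseio.com/posts/post1.json'
body = {'title': 'Hello', 'author': 'someUid', 'timestamp': 1234567}

r1 = requests.put(url, json=body)  # allowed: post1 does not exist yet
r2 = requests.put(url, json=body)  # denied: data.exists() is now true and auth is null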

Maintaining consistency during multi-path updates when the paths are not deterministic and variable

I need help with a scenario where we do multi-path updates to fanned-out data. If, between calculating the set of paths and performing the update, a new path is added somewhere, the data in the newly added path would be inconsistent.
For example, below is the data for blog posts. Posts can be tagged with multiple terms like "tag1", "tag2". In order to find how many posts are tagged with a specific tag, I can fan out the posts data to the tags path as well:
/posts/postid1: {"Title": "Title 1", "body": "About Firebase", "tags": {"tag1": true, "tag2": true}}
/tags/tag1/postid1: {"Title": "Title 1", "body": "About Firebase"}
/tags/tag2/postid1: {"Title": "Title 1", "body": "About Firebase"}
Now consider, concurrently:
1a) User1 wants to modify the title of postid1 and builds the following multi-path update:
/posts/postid1/Title: "Title 1 modified"
/tags/tag1/postid1/Title: "Title 1 modified"
/tags/tag2/postid1/Title: "Title 1 modified"
1b) At the same time, User2 wants to add tag3 to postid1 and builds the following multi-path update:
/posts/postid1/tags: {"tag1": true, "tag2": true, "tag3": true}
/tags/tag3/postid1: {"Title": "Title 1", "body": "About Firebase"}
So apparently both updates can succeed one after the other, and we can end up with tags/tag3/postid1 out of sync, since it still has the old title.
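For concreteness, here is how update 1a could be issued as a single atomic multi-path update over REST (sketch with the requests library; placeholder URL). The atomicity covers only the paths listed at write time, which is exactly why a concurrently added /tags/tag3/postid1 is missed:

import requests

root_url = 'https://<your-db>.firebaseio.com/.json'
requests.patch(root_url, json={
    'posts/postid1/Title': 'Title 1 modified',
    'tags/tag1/postid1/Title': 'Title 1 modified',
    'tags/tag2/postid1/Title': 'Title 1 modified',
})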
I can think of security rules to handle this, but I'm not sure whether this is correct or will work.
For example, we can have updatedAt and lastUpdated fields and check that we are updating the same version of the post that we read:
posts":{
"$postid":{
".write":true,
".read":true,
".validate": "
newData.hasChildren(['userId', 'updatedAt', 'lastUpdated', 'Title']) && (
!data.exists() ||
data.child('updatedAt').val() === newData.child('lastUpdated').val())"
}
}
Also, for tags we do not want to check that again; instead we can check whether /tags/$tag/$postid/updatedAt is the same as /posts/$postid/updatedAt.
"tags":{
"$tag":{
"$postid":{
".write":true,
".read":true,
".validate": "
newData.hasChildren(['userId', 'updatedAt', 'lastUpdated', 'Title']) && (
newData.child('updatedAt').val() === root.child('posts').child('$postid').val().child('updatedAt').val())”
}
}
}
With this, /posts/$postid has concurrency control built in, and users can only write against the version they read.
Also, /posts/$postid becomes the source of truth, and the other fan-out paths check whether their updatedAt field matches that of the primary source-of-truth path.
Will this bring consistency, or are there still problems? And can it bring performance down when done at scale?
Are multi-path updates and rules atomic together? By that I mean: are the rules evaluated separately, in isolation, for multi-path updates like 1a and 1b above?
Unfortunately, Firebase does not provide any guarantees, or mechanisms, to provide the level of determinism you're looking for. I have had the best luck front-ending such updates with an API stack (GCF and Lambda are both very easy, server-less methods of doing this). The updates can be made in that layer, and even serialized if absolutely necessary. But there isn't a safe way to do this in Firebase itself.
There are numerous "hack" options you could apply. You could, for example, have a simple lock mechanism using a dedicated collection for tracking write locks. Clients could post to a lock collection, then verify that their key was the only member of that collection, before performing a write. But I hope you'll agree with me that such cooperative systems have too many potential edge cases, potential security issues, and so on. In Firebase, it is best to design such that this component is not a requirement in the first place.
