Firebase indexOn works with something not in rules JSON - firebase

I have a legacy Firebase project I contribute to. In it I have the following rules for the songs resource:
"songs": {
  ".indexOn": ["artist_timestamp"]
},
This allows me to do things like curl 'http://my-fire-base-ref/songs.json?orderBy="artist_timestamp"'
However, I can also use orderBy="$priority", which is a property we add to all song objects. This works even though it is not explicitly listed in the rules JSON. Is this a secretly allowed property?

The .priority of each node is implicitly indexed, so you don't need to define an index for it.
Why are you using priorities, though? While they still work, using named properties lets you accomplish the same thing with more readable code. See "What does priority mean in Firebase?"

According to the documentation for indexing data:
Firebase provides powerful tools for ordering and querying your data.
Specifically, Firebase allows you to do ad-hoc queries on a collection
of nodes using any common child key. As your app grows, the
performance of this query degrades. However, if you tell Firebase
about the keys you will be querying, Firebase will index those keys at
the servers, improving the performance of your queries.
This means you can order by any key at any time without specifying it as an index, but without an index defined for that key, performance may be very poor on large data sets.
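As a rough illustration of the two queries from the question, the following sketch builds the REST URLs. The helper name buildOrderByUrl is hypothetical, and the host is the placeholder from the question rather than a real endpoint.

```javascript
// Sketch: build a Realtime Database REST query URL for a given orderBy key.
// buildOrderByUrl is a hypothetical helper, not part of any Firebase SDK.
function buildOrderByUrl(baseUrl, path, orderBy) {
  const url = new URL(`${path}.json`, baseUrl);
  // The REST API expects the orderBy value wrapped in double quotes,
  // so JSON.stringify adds them before the value is URL-encoded.
  url.searchParams.set("orderBy", JSON.stringify(orderBy));
  return url.toString();
}

// Placeholder host taken from the question.
const byIndexedChild = buildOrderByUrl("https://my-fire-base-ref/", "songs", "artist_timestamp");
const byPriority = buildOrderByUrl("https://my-fire-base-ref/", "songs", "$priority");

console.log(byIndexedChild);
console.log(byPriority);
```

Both URLs have the same shape; only the first relies on the .indexOn entry, while the second uses the implicit priority index.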

Related

Best way to filter out a subset of document IDs with firebase/firestore JS SDK?

I wanted to get some community consensus on how to achieve the following with the Firebase JS SDK (e.g., in React):
Suppose I have a collection users and I wanted to paginate users that do not have document IDs matching a subset of IDs (O(100-1000)). This subset of excluded IDs is dynamic based on the authenticated user.
It seems the not-in query only supports up to 10 entries, so this is out of the question.
It also seems it's not possible to fetch all document IDs and filter on the client side, at least not in the 'firebase' JS SDK.
The only workaround I can think of is to have a document that keeps an array of all user document IDs, pull that document locally, and perform the filtering/pagination logic locally. The limitation here is that a document can be at most 1MB, so realistically the single document can store at most O(10K) IDs.
Firestore has a set of methods for pagination that may be useful to you. They are called "query cursors".
You can use them to define a start point with startAt() or startAfter() and an end point with endAt() or endBefore(). If needed, they can also be combined with the limit() method.
I strongly encourage you to check this tutorial. There you can find a quick video explaining the matter and lots of examples in all popular languages.
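One way to combine cursors with a large exclusion set is to page through the collection and drop excluded IDs on the client until a page is full. The sketch below assumes this approach; fetchPage is a hypothetical stand-in for a Firestore query ordered by document ID and combined with startAfter() and limit(), and the in-memory list only exists so the example runs on its own.

```javascript
// Sketch: fill one page of users while skipping a dynamic set of excluded IDs.
// fetchPage(afterId, limit) is hypothetical; it should return up to `limit`
// documents that come after `afterId` in document-ID order.
async function nextPageExcluding(fetchPage, excludedIds, pageSize, afterId = null) {
  const results = [];
  let cursor = afterId;
  while (results.length < pageSize) {
    const batch = await fetchPage(cursor, pageSize);
    for (const doc of batch) {
      cursor = doc.id; // advance the cursor past every document we have looked at
      if (!excludedIds.has(doc.id)) results.push(doc);
      if (results.length === pageSize) break;
    }
    if (batch.length < pageSize) break; // the collection has no more documents
  }
  return { results, cursor };
}

// In-memory stand-in for the users collection (illustrative data only).
const allUsers = ["a", "b", "c", "d", "e", "f", "g"].map((id) => ({ id }));

async function fetchPageFromMemory(afterId, limit) {
  const start = afterId === null ? 0 : allUsers.findIndex((u) => u.id === afterId) + 1;
  return allUsers.slice(start, start + limit);
}

nextPageExcluding(fetchPageFromMemory, new Set(["b", "c"]), 3)
  .then(({ results, cursor }) => console.log(results.map((u) => u.id), cursor));
```

Pass the returned cursor as afterId to get the following page. Note that excluded documents are still read from the server, so this trades extra reads for not having to maintain a separate ID list.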

Using Firestore document's auto-generated ID versus using a custom ID

I'm currently deciding on my Firestore data structure.
I'll need a products collection, and the products items will live inside of it as documents.
Here are my product's fields:
uniqueKey: string
description: array of strings
images: array of objects
price: number
QUESTION
Should I use Firestore auto-generated IDs as the IDs of my documents, or is it better to use my uniqueKey (which I'll query for on many occasions) as the document ID? Is there a best option between the two?
I imagine that if I use my uniqueKey, it will make my life easier when retrieving a single document, but I'll have to query for more than 1 product on many occasions too.
Using my uniqueKey as ID:
db.collection("products").doc("myUniqueKey").get();
Using my Firestore auto-generated ID:
db.collection("products").where("uniqueKey", "==", "myUniqueKey").get();
Is this enough of a reason to go with my uniqueKey instead of the auto-generated one? Is there a rule of thumb here? What's the best practice in this case?
In terms of making queries from a client, using only the information you've given in the question, I don't see that there's much practical difference between a document get using its known ID, or a query on a field that is also unique. Either way, an index is used on the server side, and it costs exactly 1 document read. The document get() might be marginally faster, but it's not worthwhile to optimize like this (in my opinion).
When making decisions about data modeling like this, it's more important to think about things like system behavior under load and security rules.
If you're reading and writing a lot of documents whose IDs have a sequential property, you could run into hotspotting on those writes. So, if you want to use your own ID, and you expect to be reading and writing them in that sequence under heavy load, you could have a problem. If you don't anticipate this to be the situation, then it likely doesn't matter too much whose ID you use.
If you are going to use security rules to limit access to documents, and you use the contents of other documents to help with that, you'll need to be able to uniquely identify those documents in your rule. You can't perform a query against a collection in rules, so you might need meaningful IDs that will give direct access when used by rules. If your own IDs can be used easily this way in security rules, that might be more convenient overall. If you're forced to use Firestore's generated IDs, it might become inconvenient, difficult, or expensive to try to maintain a relationship between your IDs and Firestore's IDs.
In any event, the decision you're making is not just about which ID is "better" in a general sense, but which ID is better for your specific, anticipated situation, under load, with security in mind.
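If a custom key turns out to be sequential (a counter or a timestamp, say), one option is to use a random document ID and keep the custom key as a regular field. The sketch below shows what such an ID might look like; it is only an illustration, not the Firestore SDK's own generator, and randomDocId is a hypothetical name.

```javascript
// Sketch: generate a random, non-sequential document ID.
// This is NOT the Firestore SDK's implementation, only an illustration.
const ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";

function randomDocId(length = 20) {
  let id = "";
  for (let i = 0; i < length; i++) {
    // Math.random() is enough for a sketch; prefer a cryptographic source in production.
    id += ALPHABET[Math.floor(Math.random() * ALPHABET.length)];
  }
  return id;
}

// Consecutive IDs share no ordering, so writes are not clustered by ID.
console.log(randomDocId());
console.log(randomDocId());
```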

Is it possible to query resource properties?

In the Firestore security rules you can access resource properties. I would like to use these properties in my queries, but I can't find any documentation on it.
Currently I am manually writing updatedAt timestamps into documents where I need them, but that is cumbersome and fragile, because it is easy to forget to update the timestamp. It also feels redundant, since the resource already has this data.
Is it, for example, possible to query all documents in a collection that have been updated since yesterday?
It is not possible to query on these; they are specific to the Security Rules layer.
While we can inspect the server update time for a specific document once retrieved, we cannot query for them since it is not indexed (and handled at a layer lower than our indexing engine).
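Since the timestamp has to live in the document to be queryable, one way to make it less fragile is to route every write through a single helper that always stamps it. The sketch below assumes that approach with an in-memory Map standing in for a collection; writeWithTimestamp and updatedSince are hypothetical names. With Firestore itself, the field would typically be set with serverTimestamp() and filtered with a where() clause on updatedAt.

```javascript
// Sketch: route every write through one helper so updatedAt is never forgotten.
// `store` is a hypothetical in-memory Map standing in for a collection.
function writeWithTimestamp(store, id, data, now = new Date()) {
  const merged = { ...(store.get(id) || {}), ...data, updatedAt: now };
  store.set(id, merged);
  return merged;
}

// In-memory equivalent of querying the updatedAt field for recent changes.
function updatedSince(store, since) {
  return [...store.entries()]
    .filter(([, doc]) => doc.updatedAt > since)
    .map(([id]) => id);
}

const store = new Map();
writeWithTimestamp(store, "old", { title: "stale" }, new Date("2024-01-01T00:00:00Z"));
writeWithTimestamp(store, "new", { title: "fresh" }, new Date("2024-01-03T00:00:00Z"));

// Only the document written after the cutoff is returned.
const recent = updatedSince(store, new Date("2024-01-02T00:00:00Z"));
console.log(recent);
```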

How to design a Cloud Firestore database schema

Migrating from the Realtime Database to Cloud Firestore requires a total redesign of the database. For this I created an example with some of the main design decisions.
See picture and the database design in the spreadsheet below.
My two questions are:
1 - When I have a one-to-many relation, is it also an option to store the information as an array within the document? See line 8 in the database design.
2 - Should I include only a reference, or duplicate all the information in the one-to-many relation? See line 38 in the database model.
https://docs.google.com/spreadsheets/d/13KtzSwR67-6TQ3V9X73HGsI2EQDG9FA8WMN9CCHKq48/edit?usp=sharing
In general: keep the data store as shallow as possible, i.e., avoid subcollections and nesting.
Data can be related one-to-one, one-to-many, or many-to-many. Firestore is an automatically indexed realtime datastore. Firestore is often subscribed to rather than just a one time query/response (the realtime nature of the system).
Regarding the Firestore data model, always consider "How will I query this data store?" Use subcollections, arrays, and maps sparingly (rarely) and only if you must (and you most likely don't need to). Use auto-IDs rather than human-readable IDs, e.g. use 000kztLDGafF4uKb8Cal rather than banana for document IDs.
As app functionality increases, server-side scripting with Cloud Functions for Firebase and/or the Admin SDK becomes an invaluable tool for managing (creating and indexing) many-to-many data relationships. For example, full-text search is not supported in Firestore, which can seem like a barrier to implementing robust search functionality in your app.
In conclusion, try to avoid subcollections, nesting, arrays, and maps. Follow the "keep it simple, stupid" (KISS) principle. Once your app scales up and/or requires more functionality, server-side scripting can be used to keep your app responsive (fast) while offering robust features.
For Question 1 there's a solution in the Firestore docs:
https://cloud.google.com/firestore/docs/solutions/arrays
Instead of using an array, you use a map of values and set them to 'true', which allows you to query for them, like so:
teachers: {
  "teacherid1": true,
  "teacherid2": true,
  "teacherid3": true
}
And for Question 2, you just need to save the teacher-ids because if you have those you can easily query for the corresponding data.
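The map-of-true pattern can be sketched in plain JavaScript as follows. The class names and the helpers toMembershipMap and classesTaughtBy are hypothetical, and the filter is an in-memory stand-in for an equality query on the field path teachers.teacherid1. Firestore later added an array-contains operator that covers the same case with a plain array.

```javascript
// Sketch: turn a list of IDs into the { id: true } map shown above.
function toMembershipMap(ids) {
  return Object.fromEntries(ids.map((id) => [id, true]));
}

// Illustrative documents; only the teacher IDs come from the answer.
const classes = [
  { name: "math", teachers: toMembershipMap(["teacherid1", "teacherid2"]) },
  { name: "art", teachers: toMembershipMap(["teacherid3"]) },
];

// In-memory equivalent of filtering on teachers.<teacherId> == true.
function classesTaughtBy(teacherId) {
  return classes.filter((c) => c.teachers[teacherId] === true).map((c) => c.name);
}

console.log(classesTaughtBy("teacherid1"));
```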

Firebase: server side logic and real time database limitations

Server side custom operations equivalent to Parse cloud code:
Parse has the possibility to write cloud code. From my understanding of it Firebase doesn't offer any tools to do so on the console.
The only way to do so would be to implement a web-service using the Firebase API and monitor nodes changes and implement the cloud code on my own server.
A - Is this correct?
Server side rules:
The legacy documentation of Firebase describes rules which seem to be limited to deciding which user can read/write as well as validation.
{
  "rules": {
    "foo": {
      // /foo/ is readable by the world
      ".read": true,
      // /foo/ is writable by the world
      ".write": true,
      // data written to /foo/ must be a string less than 100 characters
      ".validate": "newData.isString() && newData.val().length < 100"
    }
  }
}
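To make the .validate expression concrete, here is roughly the same check written as a plain JavaScript function. This is only an approximation for illustration; validateFoo is a hypothetical name, and the real rule is evaluated by Firebase on the server, not by client code.

```javascript
// Sketch: approximate the ".validate" rule above as a plain function.
// Accepts only strings shorter than 100 characters.
function validateFoo(newData) {
  return typeof newData === "string" && newData.length < 100;
}

console.log(validateFoo("hello"));         // a short string is accepted
console.log(validateFoo("x".repeat(100))); // 100 characters is too long
console.log(validateFoo(42));              // non-strings are rejected
```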
On Parse the complexity of the rules is greater. The programmer is able to create functions to perform custom operations.
Understanding the reason why Firebase is designed as it is:
I imagine that the reason for not having this complexity on Firebase is that probably a node based database is more complex than a table based one and it would be better if the developer has full control of this using the web-api and a custom server app.
B - Is this correct?
Real time database limitations:
The main limitation when using a real time database like Firebase seems to me that once you fetch some real time data if the data contains a two way redundancy the events triggered on one node are not propagated to the node containing the redundant information.
E.g. If a user node has keys id (ids of a different node at the same level of the user node) and if I display the list of keys that a user has on a table view in order to detect if the key list has changed I need to listen to changes in the keys node (and not only to changes in the user node).
C - Is this correct?
The question is a tad vague as there are no use cases but based on the comments, here you go.
A) Yes, Maybe.
Yes, there is no server side logic (code-wise).
Maybe, it depends on what you are trying to do.
B) Firebase rules are very flexible; rules can limit who can access data, read/write access, what kind of data, type of data, location of data, etc. It is neither more nor less complex than a 'table based one'. It's just a different way to verify, validate and store your data.
Just an FYI: Parse was backed by MongoDB, which is a document-based NoSQL database (it's not table-based). On the back end, Parse data was stored in a way similar to Firebase (it's actually BSON). Their front-end implementation was objects that wrapped around JSON structures, which gave the feeling that it was more table-like than Firebase, and that led to the direct ability to have relationships between PFObjects.
C) No. You can observe any node for changes. In your example, you should probably have the keys node as a node separate from the /user node and have users observe the /keys node.
To expand on that a bit, data doesn't necessarily need to be redundant, but it can be. You only need to observe changes for the specific data you are interested in.
Suppose you have a /users node and /favorite_food node of each user. If your UI is displaying a list of users and one of them changes their favorite food, you don't really care - you are just interested in the list of users. So you would observe the /users node and not the /favorite_food node.
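The idea that a listener only hears about the node it observes can be sketched with a tiny in-memory tree. TinyTree is a hypothetical, simplified model for illustration, not the Firebase client: it only notifies listeners registered on the written path or one of its ancestors.

```javascript
// Sketch: a minimal observable tree, not the Firebase SDK.
class TinyTree {
  constructor() {
    this.listeners = new Map(); // observed path -> array of callbacks
  }

  on(path, callback) {
    if (!this.listeners.has(path)) this.listeners.set(path, []);
    this.listeners.get(path).push(callback);
  }

  set(path, value) {
    // Notify listeners on the written path and on each of its ancestors.
    for (const [observed, callbacks] of this.listeners) {
      if (path === observed || path.startsWith(observed + "/")) {
        callbacks.forEach((cb) => cb(path, value));
      }
    }
  }
}

const tree = new TinyTree();
const userEvents = [];
tree.on("/users", (path) => userEvents.push(path));

tree.set("/favorite_food/alice", "pizza"); // the /users listener does not fire
tree.set("/users/bob", { name: "Bob" });   // the /users listener fires

console.log(userEvents);
```

A UI that also needs to react to favorite-food changes would register a second listener on /favorite_food, which matches the advice above about observing the /keys node separately.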

Resources