After reading through the Firestore documentation about indexing, I want to confirm that this is what a single-field index looks like:
Let's say I have a Firestore collection cars with the following documents:
car123: {
brand:"Mercedes",
model:"W123",
},
car423: {
brand:"BMW",
model:"x5"
},
carXyZ: {
brand:"Mercedes",
model:"S 500"
}
Would the indexes that Firestore creates automatically look more or less like this?
index for queries filtering by brand equals "Mercedes" = ["car123", "carXyZ"]
index for queries filtering by brand equals "BMW" = ["car423"]
index for queries filtering by model equals "S 500" = ["carXyZ"]
...and does this mean that each time a car is added, n indexes are updated, where n is the number of fields the car has, doubled because each index has one ASC and one DESC version?
The examples you provided are an oversimplification of what an actual index looks like. According to the documentation:
A single-field index stores a sorted mapping of all the documents in a
collection that contain a specific field. Each entry in a single-field
index records a document's value for a specific field and the location
of the document in the database.
So according to the cited excerpt, an index involves at least four things: a mapping of all the documents that contain the indexed field, the indexed field itself, each document's value for that field, and the location of the document in the database. As far as I know, what that structure actually looks like is not publicly documented.
Regarding your question, it is possible to infer that each time a new document is created with a field that is part of a single-field index, it is added to two single-field index structures: the ascending one and the descending one.
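As a rough mental model only (the real storage format is not public), the ascending index on brand for the cars in the question can be pictured as a list of (value, document path) entries sorted by value. The helper name buildSingleFieldIndex and the path format are made up for this sketch:

```javascript
// Toy model of a single-field index; NOT Firestore's real storage format.
const cars = {
  car123: { brand: "Mercedes", model: "W123" },
  car423: { brand: "BMW", model: "x5" },
  carXyZ: { brand: "Mercedes", model: "S 500" },
};

// Hypothetical helper: one entry per document that contains the field,
// sorted by field value, then by document path.
function buildSingleFieldIndex(collectionName, docs, field) {
  const compare = (x, y) => (x < y ? -1 : x > y ? 1 : 0);
  return Object.entries(docs)
    .filter(([, data]) => field in data)
    .map(([id, data]) => ({ value: data[field], path: `${collectionName}/${id}` }))
    .sort((a, b) => compare(a.value, b.value) || compare(a.path, b.path));
}

const brandAsc = buildSingleFieldIndex("cars", cars, "brand");
const brandDesc = [...brandAsc].reverse(); // the descending counterpart
console.log(brandAsc);
```

In this picture, a filter such as brand equals "Mercedes" corresponds to a contiguous run of entries in the sorted list rather than a separate list per value.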
Related
I have just read the documentation for Firestore indexing and now I have the following questions to make sure that I understood the concept correctly :
Assume that I have the following data structure:
{
"user_collection": {
"user1_document":{
"name": "Joe",
"age": 21
},
"user2_document":{
"name": "Sarah",
"age": 29
},
"user3_document":{
"name": "Sarah",
"age": 24
}
}
}
If I now perform a query that returns every document with the name Sarah, Firestore looks through every index record of the field name and returns every document where the name value equals "Sarah". Did I understand that correctly?
My next question is a little more specific: indexes are sorted (in ascending and descending order). Now, when a query looks for every document where the user's age is smaller than 20, would Firestore start with the age 21, notice that the smallest age in the user collection is 21, and therefore stop checking any further documents, OR would Firestore still go through all the remaining documents? Generally, is there any information about what algorithm Firestore uses to search indexes, such as binary search?
I know this information is irrelevant in terms of working with Firebase, but it just interests me.
If I now perform a query that returns every document with the name Sarah, Firestore looks through every index record of the field name and returns every document where the name value equals "Sarah". Did I understand that correctly?
Yes, and you'll pay one document read for each document the query returns. If your query yields no results, however, the official Firestore pricing documentation says:
Minimum charge for queries
There is a minimum charge of one document read for each query that you perform, even if the query returns no results.
So if, for example, you try to filter all users and you get no results, you're still charged for 1 read.
When a query is looking for every document where the user's age is smaller than 20, would Firestore start with the age 21, notice that the smallest age in the user collection is 21, and therefore stop checking any further document OR would Firestore still go through all the remaining documents?
No. When you're looking for every document where the user's age is less than 20, Firestore will return all documents where the age field holds a value that is less than 20. It would have returned documents where the field age holds a value of 20 if you were looking for every document where the user's age is less than or equal to 20.
Yes, in order to provide some results, Firestore will have to check all documents against a value.
Generally, is there any information about what algorithm Firestore uses to search indexes, like binary search?
I'm not aware of something public about the Firestore algorithm, but if I find something I will update my answer.
Please also note that in Firestore, we are not only charged based on the number of reads/writes/deletes we perform but also based on space. So we have to pay for what we consume, including storage overhead. What does that mean? It means that we have to pay for the metadata, automatic indexes, and composite indexes.
The single-field indexes can be considered, in short, a value -> docId mapping. Given your database structure, an index on the field 'name' would look like this:
"Joe": "user1_id",
"Sarah": "user2_id",
"Sarah": "user3_id",
For an index on the field age, the index structure (sorted by value) would be:
"21": "user1_id",
"24": "user3_id",
"29": "user2_id",
When you run a query and an index that supports it exists, Firestore only has to read the matching index entries.
Every document where the user's age is smaller than 20, would Firestore start with the age 21,
In the case of where("age", "<", 20), when no document matches the query, there are no matching index entries, so no data is returned and no other entries are read. However, it'll still cost you a read, as Alex mentioned.
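Firestore's actual search algorithm isn't public, so the following is only a sketch of how a range filter could be answered from a sorted index without touching every entry. It assumes the index is a plain sorted array and uses binary search; lowerBound and lessThan are hypothetical helpers, not SDK functions:

```javascript
// Toy ascending index on "age", sorted by value. Not Firestore's real format.
const ageIndex = [
  { value: 21, id: "user1_id" },
  { value: 24, id: "user3_id" },
  { value: 29, id: "user2_id" },
];

// Binary search: first position whose value is >= target (lower bound).
function lowerBound(index, target) {
  let lo = 0;
  let hi = index.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (index[mid].value < target) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// where("age", "<", limit): every entry before the lower bound matches.
function lessThan(index, limit) {
  return index.slice(0, lowerBound(index, limit)).map((e) => e.id);
}

console.log(lessThan(ageIndex, 20)); // []
console.log(lessThan(ageIndex, 25)); // [ 'user1_id', 'user3_id' ]
```

With age < 20 the search lands at position 0, so nothing is returned and none of the remaining entries need to be inspected.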
Additionally, if you want to query based on both the fields, you would need a composite index e.g. { name: ASC, age: ASC }:
{"Joe", 21}: "user1_id",
{"Sarah", 24}: "user3_id",
{"Sarah", 29}: "user2_id"
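To make the ordering concrete, here is a toy version of a { name: ASC, age: ASC } index built from the users in the question. It is only an illustration: byNameAndMinAge is a made-up helper, and it uses a linear filter for brevity where a real index would seek directly to the start of the range.

```javascript
// Toy composite index { name: ASC, age: ASC }; not Firestore's real format.
const users = {
  user1_id: { name: "Joe", age: 21 },
  user2_id: { name: "Sarah", age: 29 },
  user3_id: { name: "Sarah", age: 24 },
};

// Sort by name first, then by age within the same name.
const nameAgeIndex = Object.entries(users)
  .map(([id, u]) => ({ key: [u.name, u.age], id }))
  .sort((a, b) =>
    a.key[0] !== b.key[0] ? (a.key[0] < b.key[0] ? -1 : 1) : a.key[1] - b.key[1]
  );

// Equivalent of where("name", "==", name).where("age", ">", minAge).
// Matching entries sit next to each other in the sorted index.
function byNameAndMinAge(index, name, minAge) {
  return index
    .filter((e) => e.key[0] === name && e.key[1] > minAge)
    .map((e) => e.id);
}

console.log(byNameAndMinAge(nameAgeIndex, "Sarah", 20)); // [ 'user3_id', 'user2_id' ]
```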
Whenever you create a new document, all the related indexes are updated, so creating many indexes may slow down write operations. Databases (like MongoDB) generally use B-trees. If you are curious about Firestore, it might be a good idea to contact Firebase.
I have the following structure in a Firestore collection. The "ranks" collection is updated with documents named after the timestamps. In each document, I have the same fields and values. How can I query all documents for a specific field without parsing the entire document? I.e. I want all values in all documents where field is "aave"?
I am new to Firestore and I've been trying this for several weeks now. I tried limiting with where and considered using sub collection group queries but in my case data is not stored in sub collections. Sorry, for not being able to provide more context, since I couldn't get much closer.
Queries select specific values, or ranges of values, of a known field. There is no support for dynamic field names in a query in Firestore.
But if you want to get all documents where the field aave exists/has any value, you can make use of the fact that in the sort order of values they always start with null. So to get all documents where the field aave exists/has any value, you could do:
firebase.firestore().collection("ranks").where("aave", ">=", null)
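The reason a range filter can double as an existence check is that a document without the field has no entry in that field's index (an orderBy on the field filters for existence in the same way). A toy illustration, with invented document IDs and values:

```javascript
// Invented sample documents; the IDs (timestamps) and values are made up.
const ranks = {
  "1650000000": { aave: 12, btc: 1 },
  "1650003600": { btc: 1 }, // no "aave" field, so no index entry
  "1650007200": { aave: 15, btc: 2 },
};

// Only documents that contain the field get an entry in its index.
const aaveIndex = Object.entries(ranks)
  .filter(([, data]) => "aave" in data)
  .map(([id, data]) => ({ value: data.aave, id }))
  .sort((a, b) => a.value - b.value);

// Walking the whole index yields every document where "aave" exists.
const aaveValues = aaveIndex.map((e) => e.value);
console.log(aaveValues); // [ 12, 15 ]
```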
I was looking for a solution to Firestore's limitation on sequential indexed fields, which this doc describes as follows:
"Sequential indexed fields" means any collection of documents that
contains a monotonically increasing or decreasing indexed field. In
many cases, this means a timestamp field, but any monotonically
increasing or decreasing field value can trigger the write limit of
500 writes per second.
As per the solution, I can add a shard field in my collection which will contain random value and create a composite index with the timestamp. I am trying to achieve this with the existing fields I have in my Document.
My document has the following fields:
{
users: string[],
createdDate: Firebase Timestamp
....
}
I already have a composite index created: users Arrays createdDate Descending. Also, I have created Exemptions for the fields field from Automatic index settings. The users field will contain a list of Firebase auto-generated IDs, so it's definitely random. Now I am not sure whether the users field will do the job of the shard field from the example doc. That way we could avoid adding a new field and still increase the write rate. Can someone please help me with this?
While I don't have specific experience that says what you're trying to do definitely will or will not work the way you expect, I would assume that it works, based on the fact that the documentation says (emphasis mine):
Add a shard field alongside the timestamp field. Use 1..n distinct values for the shard field. This raises the write limit for the collection to 500*n, but you must aggregate n queries.
If each users array contains different and essentially random user IDs, then the array field values would be considered "distinct" (as two arrays are only equal if their elements are all equal to each other), and therefore suitable for sharding.
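For reference, the shard-field pattern from the quoted documentation can be sketched as below. This is an in-memory stand-in rather than real Firestore calls, and NUM_SHARDS, addDoc, queryShard and queryAllShards are hypothetical names. It shows the "aggregate n queries" cost: reads have to fan out over every shard value and merge the results.

```javascript
// Toy model of the shard-field approach; the array stands in for a collection.
const NUM_SHARDS = 3; // assumption: n = 3 distinct shard values
const collection = [];

// Write: attach a random shard value next to the sequential timestamp.
function addDoc(id, createdDate) {
  const shard = Math.floor(Math.random() * NUM_SHARDS);
  collection.push({ id, createdDate, shard });
}

// Read: one query per shard value (shard == s, ordered by createdDate desc)...
function queryShard(s) {
  return collection
    .filter((d) => d.shard === s)
    .sort((a, b) => b.createdDate - a.createdDate);
}

// ...then aggregate the n result sets and re-sort them on the client.
function queryAllShards() {
  const merged = [];
  for (let s = 0; s < NUM_SHARDS; s++) merged.push(...queryShard(s));
  return merged.sort((a, b) => b.createdDate - a.createdDate);
}

for (let t = 1; t <= 6; t++) addDoc(`doc${t}`, t);
console.log(queryAllShards().map((d) => d.id));
```

With a numeric shard the set of values to query is known up front. Using the users array as the shard would need some other way to enumerate the values when reading everything back.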
I have the following document structure in firebase:
{
typeId: number,
tripId: number,
locationId: string,
expenseId: number,
createdAt: timestamp
}
I want to query this collection using a different 'where' clause every time. Sometimes the user wants to filter by typeId, sometimes by locationId, and sometimes by all of the filters at once.
But it seems like I would need to create a compound index for each possible permutation? For example: typeId + expenseId, typeId + locationId, locationId + expenseId, etc. Otherwise it doesn't work.
What if I have 20 fields and I want to make it possible to search across all of these?
Could you please help me to construct a query and indexes for the following requirement: Possibility to query across all fields, query can contain one, two, three, all or no fields included in where clause and always has to be ordered descending order by createdAt.
Cloud Firestore automatically creates indexes for the individual fields of your documents. So it can already filter on each field without you having to add these indexes manually.
In many cases it is able to combine these indexes to allow queries on field combinations, by performing a so-called zig-zag merge join.
Custom additional indexes are typically only needed once you add an ordering clause to your query, in addition to filter clauses. If you have such a case, the Firestore client will log an error telling you exactly what index to create (with a link to the Firestore console that is prepopulated to create the index for you).
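The idea behind a zig-zag merge join can be sketched as a walk over two sorted lists of document IDs, always advancing whichever side is behind. This is a simplified illustration, not Firestore's implementation (a real one would seek forward rather than step one entry at a time). The function name and the sample index contents are made up:

```javascript
// Simplified zig-zag merge join over two single-field index results.
// Each input is a list of document IDs sorted ascending, as they would be
// within one equality value of a single-field index.
function zigZagMergeJoin(idsA, idsB) {
  const result = [];
  let i = 0;
  let j = 0;
  while (i < idsA.length && j < idsB.length) {
    if (idsA[i] === idsB[j]) {
      result.push(idsA[i]); // present in both indexes: a match
      i++;
      j++;
    } else if (idsA[i] < idsB[j]) {
      i++; // A is behind: advance it toward B's current ID
    } else {
      j++; // B is behind: advance it toward A's current ID
    }
  }
  return result;
}

// Hypothetical index contents for typeId == 1 and locationId == "berlin".
const typeIdEquals1 = ["doc1", "doc3", "doc4", "doc7"];
const locationIsBerlin = ["doc2", "doc3", "doc7", "doc9"];
console.log(zigZagMergeJoin(typeIdEquals1, locationIsBerlin)); // [ 'doc3', 'doc7' ]
```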
My firm asked me to get all the documents from the root collection, users. They must be sorted according to the value of a numeric field, titled amount, present in a first sub-level document and also in a second sub-level document.
I don't know how to fulfill this aim easily with Firestore NoSQL queries. What would you recommend to me?
Data structure
Here it is:
users
user_1 (document to get and sort)
user_2 (document to get and sort)
Each of these two users is structured like this:
collection_1
document_of_first_level
contains this field: amount
and contains also this collection: collection_2
document_of_second_level
contains this field: amount
So I should get a list containing user_1 and user_2. This is a sorted list. It's sorted according to the two fields named amount, which are contained under user_1 and also under user_2. Both of these fields are nested.
They must be sorted according to the value of a numeric field, titled amount, present in a first sub-level document and also in a second sub-level document.
As @DougStevenson also mentioned in his comment, you cannot achieve this with Cloud Firestore. There is no way to get documents from a subcollection (level 1) and another subcollection (level 2) in a single query. Firestore doesn't support queries across different subcollections in one go. Queries in Firestore are shallow, which means they only get items from the collection that the query is run against. A single query may only use properties of documents in a single collection.
So the simplest solution I can think of would be to query the database twice, once to get the documents within the subcollection (level 1) and a second time to get the documents within the subcollection (level 2), and then compare them client side.
The idea from your comment is a solution since adding a property under the level 1 document will allow you to query the database just once. In Firestore, you can simply chain multiple where() functions in a single query.
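The client-side step of the two-query approach could look roughly like this. The two arrays stand in for the results of the two queries. The paths and amounts are invented, and combining the two amounts by summing them per user is an assumption, since the question doesn't say how they should be combined:

```javascript
// Stand-ins for the results of two separate queries; paths and amounts are made up.
const levelOneDocs = [
  { path: "users/user_1/collection_1/document_of_first_level", amount: 10 },
  { path: "users/user_2/collection_1/document_of_first_level", amount: 40 },
];
const levelTwoDocs = [
  { path: "users/user_1/collection_1/document_of_first_level/collection_2/document_of_second_level", amount: 50 },
  { path: "users/user_2/collection_1/document_of_first_level/collection_2/document_of_second_level", amount: 5 },
];

// Assumption: the two amounts are combined by summing them per user.
function sortUsersByTotalAmount(...resultSets) {
  const totals = new Map();
  for (const doc of resultSets.flat()) {
    const userId = doc.path.split("/")[1]; // "users/<userId>/..."
    totals.set(userId, (totals.get(userId) || 0) + doc.amount);
  }
  return [...totals.entries()]
    .sort((a, b) => b[1] - a[1]) // highest total first
    .map(([userId]) => userId);
}

console.log(sortUsersByTotalAmount(levelOneDocs, levelTwoDocs)); // [ 'user_1', 'user_2' ]
```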