Firestore get documents where value not in array? - firebase

Is there a way to get all documents whose array field does not contain one or more given values? There is "array-contains", but is there something like "array-not-contains"?

You can only query Firestore based on indexes, so that all queries scale up to billions of documents without performance problems.
Indexes work by recording values that exist in your data set. An index can't possibly be efficient if it tracks things that don't exist, because the universe of non-existent values is extremely large compared to your data set and can't be indexed. Querying for the non-existence of some value would require a scan of all your documents, and that doesn't scale.

I don't think that is possible at the moment. I would try looking at this blog post for reference: "Better Arrays in Cloud Firestore".
You might need to convert your array to an object so that you can query by (property === false).
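The array-to-object idea can be sketched roughly as follows. This is only an illustration: `ALL_TAGS`, `toMembershipMap`, `tagMap` and `collectionRef` are hypothetical names, and it assumes the set of possible values is known in advance.

```javascript
// Hypothetical vocabulary of values you may need to query on.
const ALL_TAGS = ['red', 'green', 'blue'];

// Turn an array into a map holding an explicit true/false for every known value.
function toMembershipMap(values, vocabulary) {
  const map = {};
  for (const key of vocabulary) {
    map[key] = values.includes(key);
  }
  return map;
}

const tags = ['red', 'blue'];
const docToStore = { tags, tagMap: toMembershipMap(tags, ALL_TAGS) };
console.log(docToStore.tagMap); // { red: true, green: false, blue: true }

// Documents without "green" could then be matched with an equality filter such as:
//   collectionRef.where('tagMap.green', '==', false)
```

This only stays practical while the vocabulary is small, because every key in the map is a separate indexed field.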

Considering that your collection will have a low number of documents, you could store all of their ids in another document using an onCreate Cloud Function trigger, download this document from the client, and do the filtering client-side. You could also do all of this inside a Cloud Function if you're worried about performance.
You'll have 1 extra read, but that's no big deal. Each document can hold up to 1 MiB, which is a lot, so you shouldn't worry about it too much. If the ids get too big, you could also split them across several documents and merge them on the client or in the Cloud Function.
This works very well for small sets of data, but if you're expecting millions of documents, then there isn't much you can do.
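A minimal sketch of the client-side filtering step, assuming the extra document stores each id together with that document's array. The shape `{ entries: { id: [...] } }` and the name `idsNotContaining` are made up for illustration and are not part of any Firestore API.

```javascript
// Assumed shape of the downloaded summary document:
//   { entries: { [docId]: string[] } }
// Returns the ids whose array contains none of the excluded values.
function idsNotContaining(summary, excluded) {
  const banned = new Set(excluded);
  return Object.entries(summary.entries)
    .filter(([, values]) => !values.some((v) => banned.has(v)))
    .map(([id]) => id);
}

const summary = {
  entries: {
    docA: ['x', 'y'],
    docB: ['z'],
    docC: [],
  },
};

console.log(idsNotContaining(summary, ['x'])); // [ 'docB', 'docC' ]
```

The matching documents can then be fetched by id, so only the summary document and the results are read.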

Firestore recently added support for a not-in clause.
citiesRef.where('country', 'not-in', ['USA', 'Japan']);
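That will get every doc where country exists and has a value other than 'USA', 'Japan', or null. Note that not-in compares the field's whole value, so on an array field it does not behave like an "array-not-contains".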
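To make the matching rule concrete, here is a small local emulation of it. It is only a reasoning aid written for this answer, not how Firestore evaluates the query, and `matchesNotIn` and the sample data are invented for the example.

```javascript
// Local emulation of the rule described above: a document matches only if
// the field exists, is not null, and is not one of the excluded values.
function matchesNotIn(doc, field, excluded) {
  if (!(field in doc)) return false; // missing field never matches
  const value = doc[field];
  if (value === null) return false; // null never matches
  return !excluded.includes(value);
}

const cities = [
  { name: 'Tokyo', country: 'Japan' },
  { name: 'Paris', country: 'France' },
  { name: 'Atlantis' }, // no country field
  { name: 'Null Island', country: null },
];

const names = cities
  .filter((c) => matchesNotIn(c, 'country', ['USA', 'Japan']))
  .map((c) => c.name);

console.log(names); // [ 'Paris' ]
```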

Related

Large arrays in Firestore Database (Best practices)

I am populating a series of dates and temperatures that I was thinking of storing in a Firestore Database to later be consumed by the front-end with the following structure:
{
date: ['1920-01-01', '1920-01-02', '1920-01-03', '1920-01-04', '1920-01-05', ...],
values: [20, 18, 19.5, 20.5, ...]
}
The arrays may cover many years, so they become huge, with thousands of entries. Firestore started returning a "too many index entries for entity" error, and even when I do get the data uploaded, the panel view in the Firebase console collapses. That happens even with arrays of fewer than 3000 entries.
The data is consumed in the front-end with an array structure very similar to the one described above (I want to plot it using the Echarts library). I therefore found this structure to be the most natural one, as any alternative would require converting the data back to arrays in the front-end.
Nevertheless, Firestore clearly does not like this structure. What should I do? What is the best practice for dealing with this kind of data in Firestore?
The indexes required for the most basic queries in Firestore are automatically created for you. However, there are some limits involved, which is why you're getting the following error:
too many index entries for entity
You hit the maximum number of index entries for a document, which is 40,000. If you add too many elements to an array, or too many fields to a document, you can reach that limit.
So most likely the index entries generated for the elements of the date array plus those of the values array exceed that limit, hence the error.
To solve this, you might consider creating two separate documents, one for each array. If you still hit the maximum limit, then you might consider creating a document for each hour, and not for an entire day. In this way, you'll drastically reduce the number of elements that exist in an array.
If you don't find these solutions useful, then you have to set some "Single-field index exemptions" to avoid the above error.
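As a rough sketch of the "smaller documents" idea, the series could be split into one chunk per year before writing, and merged back into two arrays on the front-end. Chunking by year, the `splitByYear` and `mergeChunks` names, and the sample values are assumptions made for this example.

```javascript
// Split { date: [...], values: [...] } into one chunk per calendar year.
// Dates are assumed to be ISO strings such as '1920-01-01'.
function splitByYear(series) {
  const chunks = new Map();
  series.date.forEach((day, i) => {
    const year = day.slice(0, 4);
    if (!chunks.has(year)) chunks.set(year, { date: [], values: [] });
    const chunk = chunks.get(year);
    chunk.date.push(day);
    chunk.values.push(series.values[i]);
  });
  return chunks; // each entry could become one document, e.g. keyed by year
}

// Merge chunks back into the array structure the chart expects.
function mergeChunks(chunks) {
  const merged = { date: [], values: [] };
  for (const year of [...chunks.keys()].sort()) {
    merged.date.push(...chunks.get(year).date);
    merged.values.push(...chunks.get(year).values);
  }
  return merged;
}

const series = {
  date: ['1920-12-31', '1921-01-01', '1921-01-02'],
  values: [20, 18, 19.5],
};
const perYear = splitByYear(series);
console.log(perYear.size); // 2
```

Each chunk holds at most 366 entries per array, which keeps any single document far smaller than the original one.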
Firestore is not the best tool for dealing with time series. The best solution I found in Firestore was creating an independent document for each day in my data. Nevertheless, that raises the number of documents I need to fetch from the front-end and, therefore, the costs.
With large arrays in Firestore you easily reach the index limit and are forced to remove the index, which I feel is a big red flag suggesting that another tool is worth a look.
The solution I found, in case it is useful for anyone, was building my API in Flask with MongoDB as the database. Although it takes more effort than just using Firestore, it deals better with time series and brings more flexibility.

Indexes and array of maps in Firestore

I want to be sure that no, or very little, Firestore storage is used for indexing an array containing many maps. From what I understand after reading about Firestore index types, no indexes are created for an array of maps in a document, since it cannot be queried. Am I right in thinking this?
For example, here is an image of the array of maps:
There will be a lot of map elements in those progressionArray arrays but not enough to exceed 1MB per document. Since all progression data always needs to be loaded by the user, it seems best to me to store this data in an array to minimize Firestore reading costs (and index storage costs). Also there is no need to index this data since it will always all be loaded once by the user.
What are the indexing storage costs associated to this progressionArray? Are they zero like I think since it can not be queried?
Thank you!
The documentation says “A single-field index stores a sorted mapping of all the documents in a collection that contain a specific field.”, so indexes will be created for arrays of maps.
You can create an exemption for single field indexes as explained here.
The only cost the indexes have is the amount of storage it takes to save them. You can calculate the index cost with the values specified in this document.

Exclude list from another list of documents in Firestore - Swift

I have 2 collections in Firestore:
In the first I have the "alreadyLoaded" user ids,
In the second I have all userIDs,
How can I exclude the first set of elements from the second with a Firestore query?
The goal is to get only the users that I haven't already loaded (optionally paginating the results).
Is there an easy way to achieve this using Firestore?
EDIT:
The number of documents I'm talking about will eventually become huge
This is not possible using a single query at scale. The only way to solve this situation as you've described it is to fully query both collections, then write code in the client that removes from the full list of users every document that also appears in the "alreadyLoaded" results.
In fact, it's not possible to involve two collections in the same query at the same time. There are no joins. The only way to exclude documents from a query is to use a filter on the contents of the documents in the single collection.
Firestore might not be the best database for this kind of requirement if the collections are large and you're not able to precompute or cache the results.
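The client-side step amounts to a set difference over the two id lists. A minimal sketch follows, written in JavaScript rather than Swift and with hypothetical names (`excludeLoaded`, `pageSize`, `offset`); it assumes both lists have already been fetched in full.

```javascript
// Return one page of user ids that are not in the already-loaded list.
function excludeLoaded(allUserIds, alreadyLoadedIds, pageSize, offset = 0) {
  const loaded = new Set(alreadyLoadedIds); // O(1) membership checks
  const remaining = allUserIds.filter((id) => !loaded.has(id));
  return remaining.slice(offset, offset + pageSize);
}

const allUsers = ['u1', 'u2', 'u3', 'u4', 'u5'];
const alreadyLoaded = ['u2', 'u4'];

console.log(excludeLoaded(allUsers, alreadyLoaded, 2));    // [ 'u1', 'u3' ]
console.log(excludeLoaded(allUsers, alreadyLoaded, 2, 2)); // [ 'u5' ]
```

Note that this pagination only happens after both collections have been read, so it does not reduce the number of document reads.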

How to delete Single-field indexes that generated automatically by firestore?

Update (TL;DR):
If you reached here, you should recheck the way you built your DB.
Your document(s) probably expand over time (due to nested lists, etc.).
Original question:
I have a collection of documents that have a lot of fields. I do not query the documents at all, not even with simple queries.
I only read them with
db.collection("mycollection").doc(docName).get().then(....);
so I don't need any indexing for this collection.
The issue is that Firestore generates single-field indexes automatically, and because of the number of fields the indexing limit is exceeded:
If I try to add a field to one of the documents, it throws an error:
Uncaught (in promise) Error: Too many indexed properties for entity: app: "s~myapp",path < Element { type: "tags", name: "aaaa" }>
at new FirestoreError (index.cjs.js:346)
at index.cjs.js:6058
at W.<anonymous> (index.cjs.js:6003)
at Ab (index.js:23)
at W.g.dispatchEvent (index.js:21)
at Re.Ca (index.js:98)
at ye.g.Oa (index.js:86)
at dd (index.js:42)
at ed (index.js:39)
at ad (index.js:37)
I couldn't find any way to delete these single-field indexes or to tell Firestore to stop generating them.
I found this in the Firestore console:
but there is no way to turn it off, or to disable automatic indexing for a specific collection.
Is there any way to do it?
You can delete simple indexes in Firestore.
See this answer for more up-to-date information on creating and deleting indexes: "Firestore composite index permutation explosion?"
If you go to Indexes after selecting the Firestore database and then select the single-field indexes, there is an Add exemption button that lets you control which fields in a collection (or subcollection) have simple indexes generated by Firestore. You have to specify the collection followed by the field, and you must add every field individually because you cannot exempt a whole collection. There does not seem to be any validation of collection or field names.
The only way I can think of to check that this has worked is to run a query using the field, which should fail.
I do this on large string fields that contain normal text, as they would take a long time to index and I know I will never search using them.
Firestore creates two indexes for every simple field (ascending and descending), but an exemption can also remove just one of them if you will never need it, which helps performance and makes it less likely that you hit the index limits. In addition, you can select whether arrays are indexed or not. If you put a lot of entries in an array, you can very quickly hit the Firestore limits on the number of index entries, so care has to be taken. It is often best to take the indexes off arrays, since the designer may have no control over how many items are added, with the result that the maximum index limit is reached and the application gets an error, as the original poster explained.
You can also remove any simple indexes you are not using, even if the field is included in a composite index. The composite index will still work.
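As far as I know, the same exemptions can also be kept in the `firestore.indexes.json` file that the Firebase CLI deploys, under a `fieldOverrides` section. The helper below is a hypothetical sketch that builds such a structure. The collection and field names are placeholders, and the exact file format should be checked against the current Firebase documentation before relying on it.

```javascript
// Build an index configuration that exempts the given fields from all
// single-field indexing. Assumption: an override with an empty "indexes"
// list means that no single-field indexes are kept for that field.
function buildExemptions(collectionGroup, fieldPaths) {
  return {
    indexes: [], // composite indexes would be listed here
    fieldOverrides: fieldPaths.map((fieldPath) => ({
      collectionGroup,
      fieldPath,
      indexes: [],
    })),
  };
}

// Placeholder names: "mycollection", "tags" and "longText".
const config = buildExemptions('mycollection', ['tags', 'longText']);
console.log(JSON.stringify(config, null, 2));
```

The printed JSON is what would go into the index configuration file. As in the console, every field has to be listed individually.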
Other things to keep an eye on.
If you are indexing a timestamp field (or any field that increases or decreases sequentially between documents) and you are not using this to force a sequence in queries, then there is a maximum write rate of 500 writes per second for the collection. In this case, this limit can be removed by removing the increasing and decreasing indexes.
Note that unlike the Realtime Database, fields created with Auto-ID do not guarantee any ordering as they are generated by firestore to spread writes and avoid hotspots or bottlenecks where all writes (and therefore reads) end up at a single location. This means that a timestamp is often needed to generate ordering but you may be able to design your collections / sub-collections data layout to avoid the need for a timestamp. For example, if you are using a timestamp to find the last document added to a collection, it might be better to just store the ID of the last document added.
Large array or map fields can also cause the limit on index entries per document to be reached (20,000 when this answer was written; 40,000 per the current limits), so you can exempt the array from indexing (see screenshot below).
Once you have added one exemption, then you will get this screen.
See this link as well: https://firebase.google.com/docs/firestore/query-data/index-overview
The short answer is you can't do that right now with Firebase. However, this is a good signal that you need to restructure your database models to avoid hitting limits such as the 1MB per document.
The documentation talks about the limitations on your data:
"You can't run queries on nested lists. Additionally, this isn't as scalable as other options, especially if your data expands over time. With larger or growing lists, the document also grows, which can lead to slower document retrieval times."
See this page for more information about the advantages and disadvantages on the different strategies for structuring your data: https://firebase.google.com/docs/firestore/manage-data/structure-data
As stated in the Firestore documentation:
Cloud Firestore requires an index for every query, to ensure the best performance. All document fields are automatically indexed, so queries that only use equality clauses don't need additional indexes. If you attempt a compound query with a range clause that doesn't map to an existing index, you receive an error. The error message includes a direct link to create the missing index in the Firebase console.
Can you update your question with the structure data you are trying to save?
A workaround for your problem would be to create compound indexes. As a last resort, Firestore may simply not suit the needs of your app, and Firebase Realtime Database may be a better solution.
See the tradeoffs: RTDB vs Firestore
I don't believe the switch you are looking for currently exists, so I think that leaves the following options:
Globally disable the built-in indexes and create all indexes explicitly. This is painful, and explicit indexes have limits too.
Use a workaround where you treat your Cloud Firestore unfriendly content like a BLOB, like so:
To store:
const objIn = { text: 'my object with a zillion fields' };
const jsonString = JSON.stringify(objIn);
const container = { content: jsonString };
To retrieve:
const objOut = JSON.parse(container.content);
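The store and retrieve steps can be wrapped in two small helpers so the round trip is easy to check. `packForFirestore` and `unpackFromFirestore` are names invented for this sketch. The container it produces is a plain object with a single string field, so only that one field is visible to indexing.

```javascript
// Serialize an arbitrary JSON-compatible object into a single string field.
function packForFirestore(obj) {
  return { content: JSON.stringify(obj) };
}

// Restore the original object from the container read back from the database.
function unpackFromFirestore(container) {
  return JSON.parse(container.content);
}

const original = { text: 'my object with a zillion fields', tags: ['a', 'b'], nested: { n: 1 } };
const packed = packForFirestore(original);
const restored = unpackFromFirestore(packed);

console.log(Object.keys(packed)); // [ 'content' ]
console.log(restored.nested.n);   // 1
```

Keep in mind that the packed document still counts against the document size limit, and that values JSON cannot represent directly (a Date, for example, comes back as a string) do not survive the round trip unchanged.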
