Query subset of Firebase (NoSQL) data - firebase

I have a Firebase (NoSQL) collection of say 5,000 "players". Each day I want to query a subset of those players in order to perform some operation. My question is, what is the best way to do that?
As best as I can tell, there is no way to perform such a query within Firebase directly. So for example, I cannot say "Collection of 5,000 players, give me all of the players which match ANY of these identifiers". If that is an option, please advise.
One option I thought of would be to create a new collection each day with the identifiers of players I am interested in performing operations on. Would this be the preferred method in Firebase? For example, I'd create a collection like 20190105Game and it would contain the identifier subset. I'd query that collection first, then go to the Players collection to get collection.where("identifier", "==", "other_identifier").
Is there a better way?

If you want to filter a subset of the players, you have two options:
Include the condition for the subset in your query. E.g. playersRef.where("subset", "==", 2).where("othercondition", "==", "value").orderBy("somefield").limit(2)
Create a (sub)collection for the subset of players.
Neither is inherently better than the other; it all depends on your exact use-cases. I'd typically go for the first option, unless I have a use-case where that is impossible due to my other query or throughput requirements.
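For the "match ANY of these identifiers" case from the question, one way to apply the first option is the in operator, split into several queries when the list is long. The following is only a sketch: fetchPlayersByIds and chunk are hypothetical helper names, db is assumed to be a Firestore instance with the namespaced API, and the default of 10 values per in filter is taken from the documentation excerpt quoted further down this page (check the current limit for your SDK).

```javascript
// Split an array into consecutive chunks of at most `size` elements.
function chunk(items, size) {
  const chunks = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Hypothetical helper: fetch all players whose "identifier" field matches
// any of `ids`. `db` is assumed to expose db.collection(...).where(...).get().
async function fetchPlayersByIds(db, ids, maxPerQuery = 10) {
  // One "in" query per chunk, all running in parallel.
  const snapshots = await Promise.all(
    chunk(ids, maxPerQuery).map((group) =>
      db.collection("players").where("identifier", "in", group).get()
    )
  );
  // Flatten the per-query results into one array of plain objects.
  return snapshots.flatMap((snap) => snap.docs.map((doc) => doc.data()));
}
```

With 5,000 players and a daily subset of, say, 200 identifiers, this would issue 20 queries and read only the matching documents.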

Related

Firestore Query crashes while using whereNotEqualTo and multiple orderBy [duplicate]

Let's say I have a collection of cars and I want to filter them by price range and by year range. I know that Firestore has strict limitations due to performance reasons, so something like:
db.collection("products")
.where('price','>=', 70000)
.where('price','<=', 90000)
.where('year','>=', 2015)
.where('year','<=', 2018)
will throw an error:
Invalid query. All where filters with an inequality (<, <=, >, or >=) must be on the same field.
So is there any other way to perform this kind of query without local data managing? Maybe some kind of indexing or tricky data organization?
The error message and documentation are quite explicit on this: a Firestore query can only perform range filtering on a single field. Since you're trying to filter ranges on both price and year, that is not possible in a single Firestore query.
There are two common ways around this:
Perform filtering on one field in the query, and on the other field in your client-side code.
Combine the values of the two ranges into a single field in some way that allows your use-case with a single field. This is incredibly non-trivial, and the only successful example of such a combination that I know of is using geohashes to filter on latitude and longitude.
Given the difference in effort between these two, I'd recommend picking the first option.
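The first option could look roughly like this. It is a sketch under assumptions: filterRange and findProducts are hypothetical names, db is assumed to be a Firestore instance with the namespaced API, and the documents are assumed to carry numeric price and year fields as in the question.

```javascript
// Keep only documents whose `field` lies in [min, max] (inclusive).
function filterRange(docs, field, min, max) {
  return docs.filter((d) => d[field] >= min && d[field] <= max);
}

// Hypothetical wrapper: the price range is filtered by Firestore,
// the year range is filtered afterwards in client code.
async function findProducts(db, price, year) {
  const snap = await db
    .collection("products")
    .where("price", ">=", price.min)
    .where("price", "<=", price.max)
    .get();
  const docs = snap.docs.map((doc) => doc.data());
  return filterRange(docs, "year", year.min, year.max);
}
```

It usually pays to put the more selective range in the query, since every document matching the query-side range is read and billed even if the client-side filter discards it.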
A third option is to model your data differently, so as to make it easier to implement your use-case. The most direct implementation of this would be to put all products from 2015-2018 into a single collection. Then you could query that collection with db.collection("products-2015-2018").where('price','>=', 70000).where('price','<=', 90000).
A more general alternative would be to store the products in a collection for each year, and then perform 4 queries to get the results you're looking for: one for each of the collections products-2015, products-2016, products-2017, and products-2018.
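A sketch of the per-year variant, assuming the products-<year> naming used above. yearCollections and findByPriceAcrossYears are hypothetical helpers, and db is assumed to be a Firestore instance with the namespaced API.

```javascript
// Build the collection names products-<year> for an inclusive year range.
function yearCollections(fromYear, toYear) {
  const names = [];
  for (let y = fromYear; y <= toYear; y++) {
    names.push(`products-${y}`);
  }
  return names;
}

// Run the same price-range query against each yearly collection
// and concatenate the results.
async function findByPriceAcrossYears(db, fromYear, toYear, minPrice, maxPrice) {
  const snaps = await Promise.all(
    yearCollections(fromYear, toYear).map((name) =>
      db
        .collection(name)
        .where("price", ">=", minPrice)
        .where("price", "<=", maxPrice)
        .get()
    )
  );
  return snaps.flatMap((snap) => snap.docs.map((doc) => doc.data()));
}
```

Unlike client-side filtering, this reads only documents that satisfy both ranges, at the cost of one query per year.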
I recommend reading the document on compound queries and their limitations, and watching the video on Cloud Firestore queries.
You can't run range filters on multiple fields because of the documented query limitations, but at a small cost to the UI you can still get close by indexing the year into a category field, like this:
db.collection("products")
.where('price','>=', 70000)
.where('price','<=', 90000)
.where('yearCategory','in', ['new', 'old'])
Of course, new and old go out of date, so you can instead group the years into yearCategory values like yr-2014-2017, yr-2017-2020 and so on. The in operator can only take 10 elements per query, so this may give you an idea of how wide a range each category should cover.
You can write to yearCategory during insert or, if you have a large range such as a number of likes, then you'd want another process that polls these data and updates the category.
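As an illustration of computing yearCategory on insert, here is a sketch. The names yearCategory and categoriesForRange, the base year 2014 and the 3-year width are all assumptions, and the buckets here are non-overlapping (yr-2014-2016, yr-2017-2019, ...), which differs slightly from the labels above.

```javascript
// Assumed bucket layout: non-overlapping buckets of WIDTH years,
// aligned so that one bucket starts at BASE_YEAR.
const BASE_YEAR = 2014;
const WIDTH = 3;

// Category label to store on the document when it is written.
function yearCategory(year) {
  const start = BASE_YEAR + Math.floor((year - BASE_YEAR) / WIDTH) * WIDTH;
  return `yr-${start}-${start + WIDTH - 1}`;
}

// Distinct category labels needed to cover an inclusive year range;
// this is the array to pass to the 'in' filter.
function categoriesForRange(fromYear, toYear) {
  const categories = [];
  for (let y = fromYear; y <= toYear; y++) {
    const label = yearCategory(y);
    if (!categories.includes(label)) {
      categories.push(label);
    }
  }
  return categories;
}
```

For 2015 to 2018 this yields two categories, which together also cover 2014 and 2019, so the years at the edges still need trimming in client code if the exact range matters.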
In Flutter you can do something like this:
final snapshot = await db.collection("products").where('price', isGreaterThanOrEqualTo: 70000).get();
// Apply the remaining condition to the fetched documents in client code.
final filtered = snapshot.docs.where((doc) => doc['price'] <= 90000);
Add more client-side filters as needed. Firestore only lets you apply a limited set of filters in the query itself, so fetch the data with those and then filter the result according to your needs.

what would be efficient alternative of "JOIN" in Firestore(NoSQL)?

I have users collection & transactions collection.
I need to get the user's balance by calculating his/her transactions.
And I heard that you are allowed to make duplicates and denormalize your database to achieve less document read in one request. (reading many docs cost more)
My approaches:
Set the transactions collection as a "subcollection" of the user document, so that you only read that user's documents and compute the values needed on the client side.
Make those collections separate top-level collections and somehow make "JOIN" queries to get his/her transactions, then compute the value on the client side.
Just add a field named "balance" to the user's document and update it every time they make a transaction. (But this seems not quite adaptable to changes that might be made in the future.)
Which approach is efficient? Or Maybe are there totally different approaches?
Which approach is efficient?
The third one.
Or Maybe are there totally different approaches?
Of course, there are, but by far the third is the best and cheapest one. Every time a new transaction is performed simply increment the "balance" field using:
What is the recommended way of saving durations in Firestore?
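For reference, the web SDK provides FieldValue.increment for this kind of update. The sketch below is not taken from the answer: applyTransaction is a hypothetical helper, userRef is assumed to be the DocumentReference of the user's document, and FieldValue is passed in (for example firebase.firestore.FieldValue in the namespaced web SDK) so the snippet does not depend on a particular import style.

```javascript
// Hypothetical helper: adjust the stored balance by `amount`
// (negative for a debit) without reading the document first.
function applyTransaction(userRef, FieldValue, amount) {
  // increment() is resolved on the server, so concurrent updates
  // do not overwrite each other's changes to the balance.
  return userRef.update({ balance: FieldValue.increment(amount) });
}
```

If the transaction document and the balance must always agree, writing both in one batch or transaction is worth considering.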

Combining multiple Firestore queries to get specific results (with pagination)

I am working on a small app that allows users to browse items based on various filters they select in the view.
After looking through the Firebase documentation, I realised that the sort of compound query that I'm trying to create is not possible, since Firestore only supports a single "IN" operator per query. To get around this, the docs say to use multiple separate queries and then merge the results on the client side.
https://firebase.google.com/docs/firestore/query-data/queries#query_limitations
Cloud Firestore provides limited support for logical OR queries. The in, and array-contains-any operators support a logical OR of up to 10 equality (==) or array-contains conditions on a single field. For other cases, create a separate query for each OR condition and merge the query results in your app.
I can see how this would work normally but what if I only wanted to show the user ten results per page. How would I implement pagination into this since I don't want to be sending lots of results back to the user each time?
My first thought would be to paginate each separate query and then merge them but then if I'm only getting a small sample back from the db I'm not sure how I would compare and merge them with the other queries on the client side.
Any help would be much appreciated since I'm hoping I don't have to move away from firestore and start over in an SQL db.
Say you want to show 10 results on a page. You will need to get 10 results for each of the subqueries, and then merge the results client-side. You will be overreading quite a bit of data, but that's unfortunately unavoidable in such an implementation.
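The merge step could be sketched as follows. mergePage is a hypothetical helper; it assumes each result has already been mapped to a plain object with a unique id, and that every subquery was ordered by the same field used in compare.

```javascript
// Merge several result lists, drop duplicates by `id`,
// sort with `compare`, and keep a single page.
function mergePage(resultSets, pageSize, compare) {
  const byId = new Map();
  for (const results of resultSets) {
    for (const item of results) {
      byId.set(item.id, item); // a document matched by two subqueries is kept once
    }
  }
  return [...byId.values()].sort(compare).slice(0, pageSize);
}
```

Because each subquery returns its own first pageSize results in the shared order, the merged first page is complete. For the next page, each subquery would need a cursor (for example startAfter) placed just past the last item actually shown, which works cleanly only if the sort key has no ties.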
The (preferred) alternative is usually to find a data model that allows you to implement the use-case with a single query. It is impossible to say generically how to do that, but it typically involves adding a field for the OR condition.
Say you want to get all results where either "fieldA" is "Red" or "fieldB" is "Blue". By adding a field "fieldA_is_Red_or_fieldB_is_Blue", you could then perform a single query on that field. This may seem horribly contrived in this example, but in many use-cases it is more reasonable and may be a good way to implement your OR use-case with a single query.
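A sketch of maintaining such a field at write time. withOrFlag is a hypothetical helper and the collection name in the comment is an assumption; the field names come from the example above.

```javascript
// Hypothetical write-time helper: returns a copy of the item with the
// combined flag set, ready to be stored in Firestore.
function withOrFlag(item) {
  return {
    ...item,
    fieldA_is_Red_or_fieldB_is_Blue:
      item.fieldA === "Red" || item.fieldB === "Blue",
  };
}

// The OR condition then becomes a single equality filter, e.g.
// db.collection("items").where("fieldA_is_Red_or_fieldB_is_Blue", "==", true)
```

The flag has to be recomputed whenever fieldA or fieldB changes, so every write path should go through the same helper.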
You could just create a complex where
Take a look at the where property in https://www.npmjs.com/package/firebase-firestore-helper
Disclaimer: I am the creator of this library. It helps to manipulate objects in Firebase Firestore (and adds Cache)
Enjoy!

Firebase / NoSQL - How to aggregate data for statistics

I'm creating my first ever project with Firebase, and I've come to the point where I need some statistics based on user input. I know Firebase (or NoSQL databases in general) is not ideal for statistics, but it works for me in every other case, so I would like to give it a try.
What I have:
I work on an application where people can invite a friend to work for their company, so I have a collection of "referrals" where the ID of each referral is the UserID of the user to whom the referral belongs, and then there is a subcollection named "items" where the data is stored.
What my data looks like:
Each item has this data:
applicant
appliedDate
position (which includes the positionId and the department the position belongs to)
status
What I want is to let the user generate statistics based on:
date range
status
department
What I was thinking about:
It's probably not the best idea to let Firebase iterate over all referrals every time a user makes a request, as that may get really expensive. What I was thinking of is using Cloud Functions to update the statistics whenever something changes, e.g. when a new applicant applies I would increase the total counter by one, and do the same for the counter of the specific department. However, I feel like this may work for total numbers or for predefined queries such as "LAST MONTH", but once I don't know which dates the user will select, it starts to get tricky.
Any idea how can I design something like this?
Thanks a lot!
What you're considering is the idiomatic approach to calculating aggregates in Firestore, and in most NoSQL databases. If you follow this pattern, Firestore is quite well suited to storing statistics.
It's ad-hoc statistics, like the unknown date range, that are trickier. Usually this comes down to storing the right values, so that you no longer need to read an unknown number of documents to calculate a value.
For example, if you store counters for the statistics per month, week, day and hour, you can satisfy a wide range of date ranges with a limited number of read operations. You may need to read multiple documents, but the number of documents to read depends on the range, and not on the total number of documents in the database.
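To make that concrete, here is a sketch that breaks a date range into the counter documents to read. It covers only monthly and daily counters (weeks and hours are left out), counterIdsForRange is a hypothetical helper, and the ID formats "YYYY-MM" and "YYYY-MM-DD" are assumptions about how the counter documents are named.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;
const pad = (n) => String(n).padStart(2, "0");

// Returns the IDs of the counter documents (monthly "YYYY-MM" and daily
// "YYYY-MM-DD") that together cover the inclusive range [from, to].
// Both arguments are "YYYY-MM-DD" strings, interpreted as UTC dates.
function counterIdsForRange(from, to) {
  const ids = [];
  let cur = new Date(`${from}T00:00:00Z`);
  const end = new Date(`${to}T00:00:00Z`);
  while (cur <= end) {
    const y = cur.getUTCFullYear();
    const m = cur.getUTCMonth();
    // Day 0 of the next month is the last day of the current month.
    const lastOfMonth = new Date(Date.UTC(y, m + 1, 0));
    if (cur.getUTCDate() === 1 && lastOfMonth <= end) {
      // The whole month fits in the range: one monthly counter covers it.
      ids.push(`${y}-${pad(m + 1)}`);
      cur = new Date(Date.UTC(y, m + 1, 1));
    } else {
      // Partial month: fall back to the daily counter for this day.
      ids.push(`${y}-${pad(m + 1)}-${pad(cur.getUTCDate())}`);
      cur = new Date(cur.getTime() + DAY_MS);
    }
  }
  return ids;
}
```

A range from 2019-01-30 to 2019-03-02 resolves to five counter documents (two days in January, the month of February, two days in March), regardless of how many referrals exist. Filtering by status or department as well would need separate counters per status and department under the same scheme.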
Of course, for the most flexible ad-hoc querying, you may still want to consider another solution, such as BigQuery, which was made precisely for this use-case.
