How to work around the Firestore "array-contains-any" limitation - firebase

I am working on a Flutter project where I need to display a list of government agencies. I need to implement two levels of filters: 1. the state to which the agency belongs and 2. the service provided by the agency. I have choice chips to select agencies by service and a checkbox-based dropdown for filtering by state. As a result, I need to apply logical OR filters on the state and service fields. Since that's impossible, I thought of adding a helper field that holds both values in an array, so that I could work on a single field with an array-contains-any filter.
Now there is one more problem. Let's say there are twelve states and 5 services, and I select 6 states and 3 services. I would need to pass a total of 18 combined values (6 x 3) to the filter, which Firestore also does not support, as the maximum is 10.
Please share any approach I could take.

Don't use arrays; simply use maps. There is no limit on how many equality where calls you can chain. The key of the map will be the name of an agency and the value will be the boolean true. You can add as many maps as you want.
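A different workaround that is sometimes used when the helper field needs more values than one array-contains-any filter accepts is to split the values into batches, run one query per batch, and merge the results on the client. The sketch below only builds the batches and makes no Firestore calls. It is not the map-based approach described in the answer; the "state_service" key format and all names are hypothetical, and the limit of 10 is the figure given in the question.

```python
from itertools import product

MAX_VALUES_PER_QUERY = 10  # limit quoted in the question; check the current Firestore docs


def combined_keys(states, services):
    """Build one 'state_service' key per selected (state, service) pair."""
    return [f"{state}_{service}" for state, service in product(states, services)]


def batches(values, size=MAX_VALUES_PER_QUERY):
    """Split values into consecutive lists of at most `size` items."""
    return [values[i:i + size] for i in range(0, len(values), size)]


# Hypothetical selection: 6 states and 3 services, as in the question.
states = ["S1", "S2", "S3", "S4", "S5", "S6"]
services = ["health", "transport", "tax"]

keys = combined_keys(states, services)
key_batches = batches(keys)
# Each batch would be passed to its own array-contains-any query on the helper
# field, and the returned documents merged (de-duplicated by document ID).
print(len(keys), [len(b) for b in key_batches])
```

With 18 combined values this yields two batches, so two queries instead of one. The trade-off is extra reads and client-side merging.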

Related

Single multi-tenanted firestore or many single tenanted firestores?

I'm building a SaaS system that allows users to define their own data models and enter data according to those models. It's a bit like airtable.
One user might model a bookshop, and would have a Book model, with title and ISBN fields. Another user might model medical records, and would have "date of last visit" as a field.
In the case of the bookshop, I want users to be able to search on title and ISBN. In the case of the medical records, I want users to be able to search on the date of the last visit.
I am using Firestore as my backend.
Firestore requires an index to enable a search, so keeping every customer's indexes in a single instance will not scale as the number of customers increases.
My thought therefore was to have a Firestore instance for each customer, and those specific instances would have the necessary indexes.
I'm sure there are downsides to doing this though.
What would folks recommend to best solve this need?
What you are trying to achieve is somewhat unusual, since you do not provide even a few standard, common properties for each user of your bookshop.
When you want to perform a search in a Cloud Firestore database, you need the exact name of the property you want to search on. Having dynamic properties might not help you implement the search feature. However, you can create a document with a property of type array that holds the names of all the properties the users have chosen and perform a search on every property, but this solution will be far too expensive.
In my opinion, a possible solution is to create at least a few common properties, so that you have known properties to search on. When someone creates, for example, a bookshop, you can display all available properties the user can choose from at the start. Once shops are created, different users can have different shop properties. This means that if a user does not choose a property, a search on that property will not return their products. This will work only if you have predefined properties.

Firebase: How flat should my data structure be?

I'm building an app that tracks the user's location and updates Firebase. I've read the documentation about structuring data but still have a few questions.
I'm considering structuring the data in one of two ways, but can't decide between them.
users
  $id
    -position
    -other attr
vs:
user_position
  $id
users
  $id
    -other attr.
In what scenario would the first design work best, and in what scenario the second?
If you only keep one position per user (as seems to be the case, given that you use the singular user_position), there is no useful difference between the two structures. A user's position in that case is just another attribute, one that happens to have two values (lat and lon).
But if you want to keep multiple positions per user, then your first structure is mixing entity types: users and user_positions. This is an anti-pattern when it comes to Firebase Database.
The two most common reasons are:
Say you want to show a list of user names (or any specific, single-value attribute). With the first structure you will also need to read the list of all positions of all users, just to get the list of names. With the second structure, you just read the user's attributes. If that is still much more data than you need, consider also keeping a list of /user_names for optimal read performance.
Many developers end up wanting different access rules for the user positions and the other user attributes. In the first structure that is only possible by pushing the read permission from the top /users down to lower in the tree. In the second structure, you can just give separate permissions to /users and /user_positions.
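To make the first reason concrete, here is a small sketch that models the two layouts as plain Python dictionaries. It makes no Firebase calls, and the user IDs, names, and coordinates are made up. With multiple positions per user, reading /users in the nested layout pulls in every position as well, while the split layout keeps positions under a separate top-level key.

```python
# Layout 1: positions nested under each user (hypothetical data).
nested = {
    "users": {
        "u1": {"name": "Ann", "positions": {"p1": {"lat": 1.0, "lon": 2.0},
                                            "p2": {"lat": 1.1, "lon": 2.1}}},
        "u2": {"name": "Bob", "positions": {"p1": {"lat": 3.0, "lon": 4.0}}},
    }
}

# Layout 2: user attributes and positions under separate top-level keys.
split = {
    "users": {"u1": {"name": "Ann"}, "u2": {"name": "Bob"}},
    "user_positions": {
        "u1": {"p1": {"lat": 1.0, "lon": 2.0}, "p2": {"lat": 1.1, "lon": 2.1}},
        "u2": {"p1": {"lat": 3.0, "lon": 4.0}},
    },
}


def count_leaves(node):
    """Count leaf values under a node, a rough stand-in for data downloaded."""
    if isinstance(node, dict):
        return sum(count_leaves(child) for child in node.values())
    return 1


def user_names(tree):
    """Read /users and return the names, as a client listing users would."""
    return sorted(user["name"] for user in tree["users"].values())


print(user_names(nested), count_leaves(nested["users"]))
print(user_names(split), count_leaves(split["users"]))
```

Both layouts return the same names, but the nested one reads every stored coordinate to get them, and the gap grows with each position recorded.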

Indexing only individual values in property arrays (instead of indexing every combination of those values) in Google datastore

The data model I am planning would have a few property "fields" in place, including a "category/tags" property, which would be a list/array of a lot of tags.
I'm planning on querying on one category at a time. I am not interested in indexing which entities have combinations of categories, just individual categories.
I am NOT referring to simply excluding a particular property from indexing.
Bonus Question:
It seems Google Datastore doesn't like "monotonically increasing" property values (i.e. timestamps), presumably because they create hotspots on the machines while forming indexes. So would just storing the current calendar date help? I could see that making even more of a "hotspot", since every entity for 24 hours would have the same index value for that property. Is there some way of storing some data about when each entity was recorded?
Indeed, one should encounter no issues creating a builtin index, as mentioned in the above reply. Still, properties with array values can behave in surprising ways. When a query has more than one inequality filter on the property, an entity matches only if at least one of the array's individual values satisfies all of the filters. This does not apply to equality filters.
Sort order is also unusual: the first value seen in the index determines an entity's sort order.
I don't think a property index (aka Built-in Index) on an Array property creates the index with various value combinations. I believe each value in the Array is indexed individually. For example, if you have a Book with two tags, the index will have two entries, one for each tag. Adding another book with three tags would add 3 more entries to the Tags index. This index allows you to query for books based on a single tag as well as multiple tags.
The "combination of values" that you mentioned happens if you create a composite index containing more than one Array type (e.g. Authors and Tags of a Book), and all/most books have multiple authors and multiple tags.
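The difference in entry counts can be illustrated with a toy model. This is only a sketch of the counting argument above, not how Datastore actually stores its indexes; the function names, book IDs, authors, and tags are all hypothetical.

```python
from itertools import product


def single_property_entries(entity_id, values):
    """One index entry per array value, as described for a built-in index."""
    return [(value, entity_id) for value in values]


def composite_entries(entity_id, authors, tags):
    """One entry per (author, tag) combination for a composite index on both arrays."""
    return [(author, tag, entity_id) for author, tag in product(authors, tags)]


# Built-in index on Tags: a book with two tags, then a book with three tags.
tags_index = []
tags_index += single_property_entries("book1", ["fiction", "classic"])
tags_index += single_property_entries("book2", ["science", "history", "biography"])

# Composite index on Authors and Tags for a book with 2 authors and 3 tags.
composite = composite_entries("book2", ["A. Author", "B. Writer"],
                              ["science", "history", "biography"])

print(len(tags_index))  # 2 + 3 entries
print(len(composite))   # 2 authors x 3 tags
```

The single-property index grows by the number of tags per entity, while the composite one grows by the product of the array lengths.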
You should not have any issues creating a builtin index on your Category/Tag.
On your other question about indexing an entity's created/modified timestamp, I do see that the Best Practices page says to avoid indexing such a property:
"Do not index properties with monotonically increasing values (such as a NOW() timestamp). Maintaining such an index could lead to hotspots that impact Cloud Datastore latency for applications with high read and write rates."
I am not sure what the alternative would be. If you don't need to query or sort on the timestamp, you are fine storing it and excluding the property from indexing.

Getting information of linked topics on a single request

So, say I search for a City using Freebase API. Say, San Francisco:
https://www.googleapis.com/freebase/v1/topic/m/0d6lp?limit=20&filter=/common/topic/description&filter=/common/topic/article&filter=/location/location/geolocation&filter=/location/location/containedby&filter=/travel/travel_destination/tourist_attractions
I get a bunch of data, including '/location/location/containedby', which lists the other entities that contain this one. This is how I can find out which State and Country the city belongs to.
The problem is that I only get those entities' names and mids, but not '/common/topic/notable_for'. Therefore, I have to make a separate query for each entity, asking just for the notable_for property, to find out which of them is a Country, a State, or something else I don't need.
For example, this is one of the queries, which determines that United States of America is a country:
https://www.googleapis.com/freebase/v1/topic/m/09c7w0?filter=/common/topic/notable_for
This is executed between 3 and 6 times for each city.
Is there a way to tell the API to include more information about these linked entities on a certain Topic? In this case, for example, to include '/common/topic/notable_for' on the linked entities. It would save tons of queries, and time for the end user in my case.
Thank you for your time!
You can actually get these results using the new output parameter on the Freebase Search API. Like this:
query=/m/0d6lp
output=(description /location/location/geolocation (/location/location/containedby notable))
I'd suggest using the MQL Read API if you want better control over the information returned. Then you can specify nested queries that ask for the contained_by locations to be returned with their types (or you can explicitly filter to only those which are a country or a state).

Matching unique ids in two different databases

I have two different databases that are not connected in any way. In fact, one is a public school database and one is a HUD (housing) database. By law they are not allowed to share names and other specific identifying addresses. Birthdates and addresses are okay, along with zip codes and other more general ids. The users need to be able to query the other database to get non-specific information, so it would appear that they need to share the same unique id. I was considering such things as using birthdates and perhaps initials of the name, or perhaps the last 4 digits of the SSN along with the birthdate. The client was thinking of global positioning data, but I'm concerned about apartments next to one another or families moving. Any ideas?
First you need to determine what your measure of uniqueness will be. If either database contains two people who share the same value for your measure of uniqueness, you need to change your strategy. After that, put a constraint on both databases enforcing that these properties (Birthday, SSN) are what make a Person record unique.
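The first step, checking whether a candidate measure of uniqueness actually is unique, can be sketched as below. This is only an illustration under assumptions: the key (birthdate plus last four SSN digits) is one of the options floated in the question, and the field names and records are made up.

```python
from collections import Counter


def candidate_key(record):
    """Hypothetical measure of uniqueness: birthdate plus last four SSN digits."""
    return (record["birthdate"], record["ssn_last4"])


def duplicate_keys(records):
    """Return the candidate keys that occur more than once in one database."""
    counts = Counter(candidate_key(r) for r in records)
    return sorted(key for key, n in counts.items() if n > 1)


# Made-up records standing in for one of the two databases.
school_records = [
    {"birthdate": "2010-04-02", "ssn_last4": "1234"},
    {"birthdate": "2010-04-02", "ssn_last4": "5678"},
    {"birthdate": "2011-09-15", "ssn_last4": "1234"},
    {"birthdate": "2010-04-02", "ssn_last4": "1234"},  # collides with the first record
]

print(duplicate_keys(school_records))
```

An empty result for both databases suggests the key is safe to constrain on; any collision means the measure needs another attribute or a different strategy.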
