Filtering results with Geofire + Firebase - firebase

I'm trying to figure out how to run a filtered query with Geofire.
Suppose I have restaurants in different categories, and I want to add the category to my query. How do I go about this?
The approach I have now is to query the keys with Geofire, loop through each key to fetch the restaurant, and insert the matching restaurants into an array.
This seems so inefficient. Is there any other way to go about this?
Ideally I would have the filtered results, and only load each item when it's about to be shown.
Cheers!

Firebase queries can only filter by one condition. Geofire already does quite some "magic" to allow it to filter on both longitude and latitude. Adding another property to that equation might be possible, but is well beyond what Geofire handles by default. See GeoFire: How to add extra conditions within the query?
If you only ever want to access one category at a time, you can put the restaurants in a top-level node per category and point Geofire to one category.
/category1
  item1
    g: "pns0h0mf2u"
    l: [-53.435719, 140.808716]
  item2
    g: "u417k3dwub"
    l: [56.83069, 1.94822]
/category2
  item3
    g: "8m3rz3s480"
    l: [30.902225, -166.66809]
/items
  item1: ...
  item2: ...
  item3: ...
In the above example, we have two categories: category1 with 2 items and category2 with just 1 item. For each item, we see the data that Geofire uses: a geohash (g) and the latitude and longitude (l). We also keep a single list with the other properties of these 3 items.
But more commonly, you simply do the extra filtering in client-side code. If you're worried about the performance of that: measure it, share the code, JSON data and measurements.
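A minimal sketch of that client-side filtering, assuming the keys returned by the geo query have been collected into an array and the items have been loaded into a plain object keyed by item ID. The function name filterByCategory, the category field and the sample data are made up for illustration; none of them come from Geofire.

```javascript
// Hypothetical data: the /items node loaded into memory, keyed by item ID.
const items = {
  item1: { name: "Pizza Place", category: "italian" },
  item2: { name: "Sushi Bar", category: "japanese" },
  item3: { name: "Trattoria", category: "italian" },
};

// Keep only the keys whose item exists and belongs to the wanted category.
function filterByCategory(nearbyKeys, itemsById, category) {
  return nearbyKeys.filter((key) => {
    const item = itemsById[key];
    return item !== undefined && item.category === category;
  });
}

// Keys as they might arrive from a geo query (assumed, not real output).
const nearbyKeys = ["item1", "item2", "item3", "item4"];
const italianNearby = filterByCategory(nearbyKeys, items, "italian");
console.log(italianNearby);
```

Measuring how long this takes on realistic data is a quick way to tell whether the extra filtering is actually a bottleneck.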

This is an old question, but I've seen it in a few places on the web, so I thought I might share one trick I've used.
The Problem
If you have a large collection in your database, maybe containing hundreds of thousands of keys, for example, it might not be feasible to grab them all. If you're trying to filter results based on location in addition to other criteria, you're stuck with something like:
Execute the location query
Loop through each returned geofire key and grab the corresponding data in the database
Check each returned piece of data to see if it matches the other criteria
Unfortunately, that's a lot of network requests, which is quite slow.
More concretely, let's say we want to get all users within e.g. 100 miles of a particular location that are male and between ages 20 and 25. If there are 10,000 users within 100 miles, that means 10,000 network requests to grab the user data and compare their gender and age.
The Workaround:
You can store the data you need for your comparisons in the geofire key itself, separated by a delimiter. Then, you can just split the keys returned by the geofire query to get access to the data. You still have to filter through them, but it's much faster than sending hundreds or thousands of requests.
For instance, you could use the format:
UserID*gender*age, which might look something like facebook:1234567*male*24. The important points are
Separate data points by a delimiter
Use a valid character for the delimiter. Per the Firebase documentation, a key "can include any unicode characters except for . $ # [ ] / and ASCII control characters 0-31 and 127."
Use a character that is not going to be found elsewhere in your database - I used *, but that might not work for you. Do not use any characters from -0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz, since those are fair-game for keys generated by firebase's push()
Choose a consistent order for the data - in this case, UserID first, then gender, then age.
You can store up to 768 bytes of data in firebase keys, which goes a long way.
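The key format described above can be sketched as follows. This is only an illustration: the helper names encodeKey, decodeKey and matches are hypothetical, and the UserID*gender*age layout is the one from the example.

```javascript
const DELIMITER = "*";
// Characters that are not allowed in keys: . $ # [ ] / and ASCII control characters.
const FORBIDDEN = /[.$#\[\]\/\u0000-\u001f\u007f]/;

// Build a key such as "facebook:1234567*male*24".
function encodeKey(userId, gender, age) {
  const parts = [userId, gender, String(age)];
  for (const part of parts) {
    if (part.includes(DELIMITER) || FORBIDDEN.test(part)) {
      throw new Error(`Invalid key part: ${part}`);
    }
  }
  return parts.join(DELIMITER);
}

// Split a key back into its fields, relying on the fixed field order.
function decodeKey(key) {
  const [userId, gender, age] = key.split(DELIMITER);
  return { userId, gender, age: Number(age) };
}

// Filter on the decoded fields without any extra network request.
function matches(key, gender, minAge, maxAge) {
  const user = decodeKey(key);
  return user.gender === gender && user.age >= minAge && user.age <= maxAge;
}

// Keys as a geo query might return them (made-up sample data).
const keys = [
  encodeKey("facebook:1234567", "male", 24),
  encodeKey("facebook:7654321", "female", 22),
  encodeKey("facebook:1111111", "male", 31),
];
const selected = keys.filter((key) => matches(key, "male", 20, 25));
console.log(selected);
```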
Hope this helps!

Related

Pagination with Filtering using Query Operation in DynamoDB Template

I would like to be able to filter a paginated result using a query operation before the limit is taken into consideration. Is there any suggestion for getting correct pagination on filtered results?
I would like to implement a DynamoDB Scan OR Query with the following logic:
Scanning -> Filtering(boolean true or false) -> Limiting(for pagination)
However, I have only been able to implement a Scan OR Query with this logic:
Scanning -> Limiting(for pagination) -> Filtering(boolean true or false)
Note: I have already tried a Global Secondary Index, but it didn't work in my case because I have 5 different attributes to filter and limit by.
Unfortunately DynamoDB is not capable of this. Once you run a Query on one of your indexes, it will read every single item that satisfies your partition and sort key.
Let's check your example: you have a boolean and an index over that field. Let's say 50% of items are false and 50% are true. Once you search by that index, you will read through 50% of all items in the table (so it's almost like a Scan). If you set a limit, it will read only that number of items and then stop. You cannot use a combination of limit and skip/page/offset like in other databases.
There is some level of pagination (https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Query.Pagination.html), but it does not allow you to jump to, e.g., page 10; it only lets you go through the pages one by one. Also, I am not sure how it is priced: maybe AWS internally goes through all the items before preparing the results for you, so you would pay for reading 50% of the whole table even if you stop iterating before you reach the end.
There is also the limitation that an index can have a maximum of 2 fields (partition key and sort key).
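The page-by-page iteration can be used to approximate "filter first, then limit" in application code: keep reading raw pages until enough matching items have been collected. The sketch below simulates this against an in-memory array. fetchPage is a hypothetical stand-in for a single Query call, and the raw page size of 4 is arbitrary. It does not reduce the number of items that are read.

```javascript
// In-memory stand-in for a table; `active` is the boolean being filtered on.
const table = Array.from({ length: 20 }, (_, i) => ({ id: i + 1, active: i % 3 === 0 }));

// Hypothetical stand-in for one Query call: returns up to `limit` items after
// `exclusiveStartKey`, plus a key to resume from (undefined at the end).
function fetchPage(exclusiveStartKey, limit) {
  const start = exclusiveStartKey === undefined
    ? 0
    : table.findIndex((row) => row.id === exclusiveStartKey) + 1;
  const items = table.slice(start, start + limit);
  const more = start + limit < table.length;
  return { items, lastEvaluatedKey: more ? items[items.length - 1].id : undefined };
}

// Filter first, then limit: read raw pages until `pageSize` matches are found.
function filteredPage(predicate, pageSize, startKey, rawLimit = 4) {
  const matches = [];
  let cursor = startKey;
  for (;;) {
    const { items, lastEvaluatedKey } = fetchPage(cursor, rawLimit);
    for (const item of items) {
      cursor = item.id; // the resume point is the last item examined
      if (predicate(item)) matches.push(item);
      if (matches.length === pageSize) return { matches, nextKey: cursor };
    }
    if (lastEvaluatedKey === undefined) return { matches, nextKey: undefined };
  }
}

const isActive = (row) => row.active;
const first = filteredPage(isActive, 3, undefined);
const second = filteredPage(isActive, 3, first.nextKey);
console.log(first.matches.map((row) => row.id), second.matches.map((row) => row.id));
```

Pages still have to be walked in order; reaching the tenth filtered page means producing the nine before it.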
EXAMPLE
You wrote that you have 5 parameters you want to query by. The usual workaround for these limitations is to create and maintain extra fields that combine the parameters you want to query. Let's say you have a table of users with gender, age, name, surname and position. Let's say it's a huge database, so you have to think about the amount of data you can load. If you want to use DynamoDB, you have to think through all the queries you want to run.
You most likely want to search by name and surname, so you create an index with surname as the partition key and name as the sort key (in that case you can search by surname, or by both surname and name). That can work for a lot of names, but you find out that some name combinations are too common and you need to filter by position as well. In that case, you create a new field (column) called e.g. name-surname, and whenever you create or update an item, your app has to maintain this field so that it contains both values, e.g. will-smith. Then you can create another index that has name-surname as the partition key and position as the sort key, and use it for such searches.
However, you then find that some name-surname-position combinations return too many results, and rather than handling that at the application level you want to limit results by age as well. You can create an index with name-surname-position as the partition key and age as the sort key. At this point you may also realize that the old name-surname field and its index can be removed, as they no longer serve a purpose (name and surname are handled by another index, and for searching by just name-surname-position you can use this index).
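Maintaining those combined fields in the application could look roughly like this. It is a sketch under assumptions: the helper withCompositeKeys and the lowercase normalization are illustrative choices, and only the attribute names name-surname and name-surname-position come from the example above.

```javascript
// Normalize a value so that "Will" and " will " produce the same key part.
function normalize(value) {
  return String(value).trim().toLowerCase();
}

// Return a copy of the user with the combined attributes filled in.
// Call this on every create or update so the fields never go stale.
function withCompositeKeys(user) {
  const nameSurname = [user.name, user.surname].map(normalize).join("-");
  return {
    ...user,
    "name-surname": nameSurname,
    "name-surname-position": `${nameSurname}-${normalize(user.position)}`,
  };
}

// Made-up sample record.
const stored = withCompositeKeys({
  name: "Will",
  surname: "Smith",
  position: "Actor",
  age: 50,
  gender: "male",
});
console.log(stored["name-surname"], stored["name-surname-position"]);
```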
Do you sometimes want to query by gender as well? It's probably better to handle that at the application level (or as an extra filter in the database query) rather than creating a new index that must be maintained and paid for. Gender has only a few distinct values, so it's probably cheaper to load all results and hide the ones that don't match at the application level. An extra index costs you on every single insert, while this filter will only be used from time to time. Also, when someone already searches by name, surname and position, you don't expect many results anyway, so whether you get 20 results (all genders) or just 10 (male only) does not make much difference.
This ^^ was just example of how you can think and work with DynamoDB. How exactly you use it depends on your business logic.
Very important note: DynamoDB is a very simple database that can only do very simple queries. It has a little more functionality than Redis but a lot less than traditional databases. A valid outcome of thinking about your business model and use cases is that maybe you should NOT use DynamoDB at all, because it simply cannot satisfy your needs and queries.
Some basic thinking can look like this:
Is key-value persistent storage enough? Use DynamoDB
Is key-value persistent storage, where one item can have multiple keys and I can search and filter by a maximum of 2 fields, enough? Use DynamoDB
Is persistent storage, where I want to search a single table/collection by many keys with a lot of options, enough? Use MongoDB
Do I need to search through multiple tables or do complex joins or need transactions? Use traditional SQL database

Firestore Food-Ordering Application Database Design Questions

I'm working on a Flutter application that basically allows users to place orders to restaurants then go and pick-up those orders.
A restaurant has a List of MenuGroups and each group has a List of ExtraIngredients and List of MenuItems.
A MenuItem has several variants with different prices also List of Ingredients that come with that item and ExtraIngridients that can be added.
Currently, in Firestore I have a collection called restaurants, and each restaurant has a List of MenuGroups. Is there a way to make this more efficient?
For example, is it better to make the menuGroups a subcollection of the document?
I also want to implement an order queue number system (the first order starts at 1, the numbers go up to 99, then wrap back to 1).
Is it better to store that number in a field in the restaurant document (whenever there is a new order there will be 1 read to get the current number, then 1 write to increase it, and after reaching 99 it is set back to 1),
or in the order document itself (each order then has an extra field; 1 read gets the last order's number and the new order is written in one go, so there is no extra write operation just for the queue number)?
There is no single right answer to this, but a few rules help tackle it efficiently.
Put data in the same document if you want to show it together (neither too big nor too small).
Put data in collections when you want to search for an individual piece of that data, or when the data will grow.
Use a map if you want to search by a parameter of that data.
Use a map if you want to store related data (like the delivery addresses of the user).
A document write is counted per document, not per amount of data written: whether you increase your order counter by 1 or change the whole document, it counts as one write.
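Whichever document ends up holding the queue number, the wrap-around from 99 back to 1 is a small piece of arithmetic. A minimal sketch, with nextQueueNumber as a hypothetical helper name; the read-then-write around it would need to happen atomically (for example inside a transaction) so that two simultaneous orders do not get the same number.

```javascript
const MAX_QUEUE_NUMBER = 99;

// Given the last issued number, return the next one: 1, 2, ..., 99, 1, 2, ...
// A missing value means no order has been placed yet.
function nextQueueNumber(last) {
  if (last === undefined || last === null) return 1;
  return (last % MAX_QUEUE_NUMBER) + 1;
}

console.log(nextQueueNumber(undefined), nextQueueNumber(42), nextQueueNumber(99));
```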

Ionic storage - tables with key-value pairs?

Is there a way to use something like tables combined with key value pairs in ionic 2+?
Explanation: I know ionic supports sqlite, but I don't need actual sql queries nor table structures. However the key-value pairs quickly hit a dead end.
For example if I have records of posts, all with unique ids (e.g. uuid), I could save every post as key-value like
let posts = [post1, post2, post3]
posts.forEach(post => {
  this.storage.set(post.id, post)
})
However then I cannot retrieve the posts, because I don't know their ids.
Alternatively I could store the whole array like
let posts = [post1,post2,post3]
this.storage.set("posts",posts)
However then I cannot add, remove or edit a single post without first loading and then saving the whole array again. Especially with a lot of entries the rewriting becomes quite slow as I noticed.
It would be nice to have the option to group the key-value pairs into something like a table. Any chance to do so without using actual sql commands a la CREATE TABLE...?
I've seen that the storage offers the option to create different instances, but I'm unsure whether this fits the purpose.
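One possible way to get table-like grouping on top of plain key-value pairs is to prefix each key with a table name and list the keys to find a table's records. The sketch below is an assumption-laden illustration, not an Ionic API: MemoryStorage is an in-memory stand-in that mimics promise-based get, set, remove and keys calls, and Table is a hypothetical wrapper.

```javascript
// In-memory stand-in for a promise-based key-value store.
class MemoryStorage {
  constructor() { this.data = new Map(); }
  async get(key) { return this.data.has(key) ? this.data.get(key) : null; }
  async set(key, value) { this.data.set(key, value); return value; }
  async remove(key) { this.data.delete(key); }
  async keys() { return [...this.data.keys()]; }
}

// Hypothetical wrapper: every record lives under "<table>:<id>".
class Table {
  constructor(storage, name) {
    this.storage = storage;
    this.prefix = `${name}:`;
  }
  put(record) { return this.storage.set(this.prefix + record.id, record); }
  get(id) { return this.storage.get(this.prefix + id); }
  delete(id) { return this.storage.remove(this.prefix + id); }
  // Find all records of this table by listing keys with the table prefix.
  async all() {
    const keys = (await this.storage.keys()).filter((key) => key.startsWith(this.prefix));
    return Promise.all(keys.map((key) => this.storage.get(key)));
  }
}

async function demo() {
  const posts = new Table(new MemoryStorage(), "posts");
  await posts.put({ id: "a1", title: "First" });
  await posts.put({ id: "b2", title: "Second" });
  await posts.put({ id: "a1", title: "First (edited)" }); // rewrites one record only
  await posts.delete("b2");
  return posts.all();
}

demo().then((all) => console.log(all));
```

Adding, editing or removing a post touches a single key, and the IDs do not need to be known in advance because they can be recovered from the key list.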

A limit clarification for the new Firestore

So in the limits section (https://firebase.google.com/docs/firestore/quotas) of the new Firestore product from Firebase it says:
Maximum write rate to a collection in which documents contain
sequential values in an indexed field: 500 per second
We're pretty confused as to what that actually entails.
If we have, say, a root-level collection called users with 10 million entries in it, will this rate affect this collection in such a way, so only 500 users can update their data in any given second?
Can anyone clarify?
Sorry for the confusion; an example might help.
If your user documents contained a last-updated timestamp and you index on that timestamp then each new write would end up clustering around the same value (now) creating a hotspot in the index.
Similarly if you somehow assigned users a sequential value like a place in line or something like that this would also create a hotspot.
Incidentally this is why generated document IDs are random strings. This evenly distributes the writes on the primary key index.
If you avoid these kinds of patterns the sky's the limit, though during beta you'd hit the database-wide limit.
A quick additional note: for the moment all properties are indexed by default, so if you had a last-updated timestamp it would necessarily be indexed, and you would not be able to avoid the hotspotting.
Index disablement will be available down the road though.
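To illustrate the point about random document IDs: sequential values sort right next to each other, while random strings spread across the key space. The sketch below generates random alphanumeric IDs; the 20-character length, the alphabet and the use of Math.random() are assumptions made for illustration, not a description of how Firestore generates its IDs.

```javascript
const ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";

// Math.random() keeps the sketch dependency-free; it is not cryptographically secure.
function randomId(length = 20) {
  let id = "";
  for (let i = 0; i < length; i += 1) {
    id += ALPHABET[Math.floor(Math.random() * ALPHABET.length)];
  }
  return id;
}

// Sequential values cluster in an index; random IDs do not.
const sequentialIds = [1001, 1002, 1003].map(String);
const randomIds = [randomId(), randomId(), randomId()];
console.log(sequentialIds, randomIds);
```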

Query for nearby locations

I am using Firebase to store users with their last scanned latitude and longitude.
An entry looks like this:
"Bdhwu37Jdmd28DmenHahd221" : {
  "country_code" : "at",
  "firstname" : "John",
  "gender" : "m",
  "lat" : 11.2549387,
  "lon" : 17.3419559
}
Whenever a user presses a specific "search" button, I want my Firebase function to fetch the people nearest to the person who sent the request.
Since Firebase only allows querying by one field, I decided to add the country_code to get some range restriction, and I query on that field. But it is still super slow when I load every user of a specific country and then check for the smallest distance between a given user and all the other users in the same country.
Already with 5 users, the function takes like 40 seconds to achieve the results.
I have also read about compound Indexes, but I would need to somehow combine the latitude and the longitude and query for both fields.
Is there any way to get a second and third condition involved here (e.g. search for the same country_code, and then for a similar longitude and latitude), or do I have to solve this inside my server code?
The Firebase Database can only query by a single property. So the way to filter on latitude and longitude values is to combine them into a single property. That combined property must retain the filtering traits you want for numeric values, such as the ability to filter for a range.
While this at first may seem impossible, it actually has been done in the form of Geohashes. A few of its traits:
It is a hierarchical spatial data structure which subdivides space into buckets of grid shape
So: Geohashes divide space into a grid of buckets, each bucket identified by a string.
Geohashes offer properties like arbitrary precision and the possibility of gradually removing characters from the end of the code to reduce its size (and gradually lose precision).
The longer the string, the smaller the area that the bucket covers
As a consequence of the gradual precision degradation, nearby places will often (but not always) present similar prefixes. The longer a shared prefix is, the closer the two places are.
Strings starting with the same characters are close to each other.
Combine these traits and you can see why Geohashes are so appealing for use with the Firebase Database: they combine the latitude and longitude of a location into a single string, where strings that are lexicographically close to each other point to locations that are physically close to each other. Magic!
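To make the encoding concrete, here is a minimal sketch of the standard geohash algorithm: longitude and latitude are bisected alternately, and every 5 bits become one base-32 character. The function name encodeGeohash is chosen for illustration and this is not Geofire's own code.

```javascript
const BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

// Encode a latitude/longitude pair as a geohash of `precision` characters.
function encodeGeohash(lat, lon, precision = 10) {
  const latRange = [-90, 90];
  const lonRange = [-180, 180];
  let hash = "";
  let bits = 0;
  let value = 0;
  let useLon = true; // bits alternate: longitude first, then latitude

  while (hash.length < precision) {
    const range = useLon ? lonRange : latRange;
    const coord = useLon ? lon : lat;
    const mid = (range[0] + range[1]) / 2;
    if (coord >= mid) {
      value = (value << 1) | 1; // upper half of the current range
      range[0] = mid;
    } else {
      value = value << 1; // lower half of the current range
      range[1] = mid;
    }
    useLon = !useLon;
    bits += 1;
    if (bits === 5) {
      hash += BASE32[value]; // 5 bits make one base-32 character
      bits = 0;
      value = 0;
    }
  }
  return hash;
}

// Two points a few hundred metres apart and one far away (sample coordinates).
const here = encodeGeohash(57.64911, 10.40744, 6);
const nearby = encodeGeohash(57.6492, 10.408, 6);
const farAway = encodeGeohash(30.902225, -166.66809, 6);
console.log(here, nearby, farAway);
```

Truncating a hash gives the larger bucket that contains it, which is why a range query on a shared prefix returns locations from the same area.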
Firebase provides a library called Geofire, which uses Geohashes to implement a Geolocation system on top of its Realtime Database. The library is available for JavaScript, Java and Objective-C/Swift.
To learn more about Geofire, check out:
this blog post introducing Geofire 2
the demo app that used to show local buses moving on a map. The app doesn't work anymore (the data isn't being updated), but the code is still available.
this video and documentation on how to implement geoqueries on Cloud Firestore.
