Firebase indexing on huge lists (100000+ items) - firebase

I'm migrating my relational database to Firebase. In general, I have a planner for workers. They can add an item ('appointment') to their schedule. I've read the Firebase documentation and found a section on indexing.
So I've created the following structure (date = YYYYMMDD and time = HHMMSS):
{
"appointments" : {
"id1" : { "date" : "20141207", "time" : "170000", "worker" : "worker1" },
"id2" : { "date" : "20141208", "time" : "170000", "worker" : "worker1" }
}
}
I've added an index for date, time and worker, to be able to query data like this (e.g. fetch all appointments for today):
curl -X GET 'https://myapp.firebaseio.com/appointments.json?orderBy="date"&equalTo="20141207"'
This works as expected and does the job well. The problem is, the number of appointments can grow exponentially (about a year from now, there could be 100000+ appointments). Is it a good approach to use these indexes? Another option would be to store the date and time also separately, like this:
{
'20141207' :
{ '170000' : { 'id1' : true } },
'20141208' :
{ '170000' : { 'id2' : true } }
}
This would ensure that appointments can be fetched per day very quickly. Or is Firebase able to handle this using indexes alone?

The number of records in the path won't be an issue; Firebase is a scalable, real-time back end that handles hundreds of thousands of concurrent connections and millions of nodes. Querying should be fast. This is the point of an index and, like all things Firebase, must meet our standards of speed and excellence.
Be sure to read about '.indexOn' and to implement this in your security rules:
{
"rules": {
"appointments": {
".indexOn": ["date", "time", "worker"]
}
}
}
Also, your real limitation here will be the bandwidth of transferring data over the tubes, so be sure to limit your results in some manner and paginate:
curl -X GET 'https://myapp.firebaseio.com/appointments.json?orderBy="date"&equalTo="20141207"&limitToFirst=100'
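The query string in the curl call can also be assembled in code. Below is a minimal sketch, assuming Node.js with the global URLSearchParams; the helper name buildQueryUrl is made up for illustration, while orderBy, equalTo and limitToFirst are the REST query parameters. String values are JSON-encoded, which is why they end up wrapped in (percent-encoded) double quotes.

```javascript
// Hypothetical helper: builds a Realtime Database REST query URL.
// String values for orderBy/equalTo are JSON-encoded, so they carry double quotes.
function buildQueryUrl(baseUrl, path, { orderBy, equalTo, limitToFirst }) {
  const params = new URLSearchParams();
  params.set("orderBy", JSON.stringify(orderBy));
  if (equalTo !== undefined) params.set("equalTo", JSON.stringify(equalTo));
  if (limitToFirst !== undefined) params.set("limitToFirst", String(limitToFirst));
  return `${baseUrl}/${path}.json?${params.toString()}`;
}

// Fetch at most 100 appointments for one day.
const appointmentsUrl = buildQueryUrl("https://myapp.firebaseio.com", "appointments", {
  orderBy: "date",
  equalTo: "20141207",
  limitToFirst: 100,
});
console.log(appointmentsUrl);
```

The resulting URL can be passed to curl or any HTTP client; the double quotes appear as %22 after encoding.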

Related

Queries in Realtime-database (using LimitToLast) are very very slow

I'm using the Realtime Database (Firebase 7.3.2) with Unity.
When I use the LimitToLast() method, the query takes a long time (1.5 to 2 minutes) to return a response.
But when I load all the data, or run this query without LimitToLast, it does not take long.
Has anyone else run into this problem while developing with the Firebase Realtime Database?
My database contains 1700 rooms.
This is the query:
var result = await FirebaseDatabase.DefaultInstance.GetReference("Rooms")
.OrderByChild("CreationDate").LimitToLast(10).GetValueAsync();
And this is the structure of the Rooms collection in the database:
{
"Rooms" : {
"-Lp860kFH8TjdAsPpar1" : {
"CreationDate" : -14400,
"Title" : "Room 1",
...,
},
"-Lp860kFH8TjdAsPpbr2" : {
"CreationDate" : -14402,
"Title" : "Room 2",
...,
},
...
"-Lp860kFH8TjdAsPpar3" : {
"CreationDate" : -14404,
"Title" : "Room 1700",
...,
}
}
}
Are you sure you have defined an index in your Firebase Realtime Database security rules? If not, the query is executed as follows:
1. Download all the data from the "Rooms" branch to the Unity client.
2. Sort the data according to your ordering criteria on the Unity client.
3. Discard all except the last 10 children in this sorted data.
Nobody wants that just to get the last 10 children. The ordering and limiting should happen on the database server itself, which makes the query fast enough to return a result in milliseconds. For that, you'll have to index your data (here, an ".indexOn" for "CreationDate" under "Rooms" in your rules) and then run your queries.
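The three client-side steps can be pictured with a small stand-alone sketch. This is only a rough emulation in plain JavaScript, not the SDK's actual code; the function name limitToLastByChild is made up, and it assumes numeric child values and a count of at least 1.

```javascript
// Rough emulation of the unindexed, client-side path.
// `allRooms` stands for the whole "Rooms" branch, already downloaded (step 1).
function limitToLastByChild(allRooms, childKey, count) {
  return Object.entries(allRooms)
    // Step 2: sort ascending by the child value, breaking ties by key.
    .sort(([keyA, a], [keyB, b]) =>
      a[childKey] - b[childKey] || (keyA < keyB ? -1 : keyA > keyB ? 1 : 0))
    // Step 3: discard everything except the last `count` children.
    .slice(-count)
    .map(([key]) => key);
}

const rooms = {
  "-Lp860kFH8TjdAsPpar1": { CreationDate: -14400, Title: "Room 1" },
  "-Lp860kFH8TjdAsPpbr2": { CreationDate: -14402, Title: "Room 2" },
  "-Lp860kFH8TjdAsPpar3": { CreationDate: -14404, Title: "Room 1700" },
};
console.log(limitToLastByChild(rooms, "CreationDate", 2));
```

The work grows with the size of the whole branch, not with the 10 children you actually want, which is why the download dominates the query time.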

How to keep two paths in sync in firebase?

I have a complex application and I am using Firebase for my backend data.
I have an object that is used in two different contexts, one for viewing by the user and one for the server to do batch processing. Something like this:
users: {
USER_ID: {
...
txns: {
TXN_ID: {
// data I need
}
}
}
...
},
transactions: {
TXN_ID: {
USER_ID: {
// data I need
},
USER_ID2: {
// different data I need
}
}
}
In relational terms, txn_ids and user_ids are in a many-to-many relationship. Looking through the transactions node to find all the transactions that belong to a user is hard in Firebase, and looking through all the users to find a transaction ID is just as hard. So I denormalize the data. I think this is correct.
But how do I keep them in sync?
I've been using multi-path updates (MPUs), which is fine, but I'm worried about that approach being prone to bugs in the future.
I've considered writing a Firebase Cloud Function to keep them in sync, but I'm worried about that being more fragile than MPUs.
Advice?

Firebase .indexOn with complex DB structure

The current query you see below is not efficient because I have not set up the proper indexing. In the Xcode console I get the suggestion: Consider adding ".indexOn": "users/kxSWLGDxpYgNQNFd3Q5WdoC9XFk2" at /conversations. I have tried it and it works.
However, I need the user ID after users/ to be dynamic. I've added a link to another post below that tried a similar thing, but I just can't seem to get it working. All help would be much appreciated!
Note: The user ID in the console output above does not match the one in the screenshot below, but I believe that does not matter for solving the problem. Correct me if I'm wrong. Thanks!
Here is the structure of my DB in Firebase:
{
"conversationsMessagesID" : "-KS3Y9dMLXfs3FE4nlm7",
"date" : "2016-10-19 15:45:32 PDT",
"dateAsDouble" : 4.6601793282986E8,
"displayNames" : [ "Tester 1", "Tester 2" ],
"hideForUsers" : [ "SjZLsTGckoc7ZsyGV3mmwc022J93" ],
"readByUsers" : [ "mcOK5wVZoZYlFZZICXWYr3H81az2", "SjZLsTGckoc7ZsyGV3mmwc022J93" ],
"users" : {
"SjZLsTGckoc7ZsyGV3mmwc022J93" : true,
"mcOK5wVZoZYlFZZICXWYr3H81az2" : true
}
}
and the Swift query:
FIRDatabase.database().reference().child("conversations")
.queryOrderedByChild("users/\(AppState.sharedInstance.uid!)").queryEqualToValue(true)
Links to other post:
How to write .indexOn for dynamic keys in firebase?
It seems fairly simple to add the requested index:
{
"rules": {
"conversations": {
".indexOn": ["users/kxSWLGDxpYgNQNFd3Q5WdoC9XFk2", "users/SjZLsTGckoc7ZsyGV3mmwc022J93", "users/mcOK5wVZoZYlFZZICXWYr3H81az2"]
}
}
}
More likely your concern is that it's not feasible to add these indexes manually, since you're generating the user IDs in your code.
Unfortunately there is no API to generate indexes.
Instead you'll need to model your data differently to allow the query that you want to do. In this case, you want to retrieve the conversations for a specific user. So you'll need to store the conversations for each specific user:
"conversationsByUser": {
"SjZLsTGckoc7ZsyGV3mmwc022J93": {
"-KS3Y9dMLXfs3FE4nlm7": true
},
"mcOK5wVZoZYlFZZICXWYr3H81az2": {
"-KS3Y9dMLXfs3FE4nlm7": true
}
}
It may at first seem inefficient to store this data multiple times, but it is very common when using NoSQL databases. It is really no different from the database auto-generating the indexes for you, except that you have to write the code to update the indexes yourself.
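The index-maintenance code mentioned above could look roughly like this. It is a sketch in plain JavaScript rather than Swift, the function name buildConversationUpdates is made up, and it assumes each conversation lives under /conversations with a users map like the one in the question.

```javascript
// Hypothetical helper: builds one multi-location update that writes a
// conversation and its per-user index entries together.
function buildConversationUpdates(conversationId, conversation) {
  const updates = {};
  // The conversation itself.
  updates[`conversations/${conversationId}`] = conversation;
  // One index entry per participant, so each user's conversations can be read directly.
  for (const uid of Object.keys(conversation.users)) {
    updates[`conversationsByUser/${uid}/${conversationId}`] = true;
  }
  return updates;
}

const conversationUpdates = buildConversationUpdates("-KS3Y9dMLXfs3FE4nlm7", {
  users: {
    SjZLsTGckoc7ZsyGV3mmwc022J93: true,
    mcOK5wVZoZYlFZZICXWYr3H81az2: true,
  },
});
console.log(Object.keys(conversationUpdates));
```

A map of paths like this is the shape that a multi-location update on the root reference expects, so the conversation and its index entries are written in one operation.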

Structure a NoSQL database for a chat application (using FireBase)

Coming from years of using relational databases, I am trying to develop a pretty basic chat/messaging app using Firebase.
Firebase uses a NoSQL approach and stores its data as a JSON tree.
I did a lot of research in order to understand how to structure the database with performance in mind. I have tried to "denormalize" the structure and ended up with the following:
{
"chats" : {
"1" : {
"10" : {
"conversationId" : "x123332"
},
"17": {
"conversationId" : "x124442"
}
}
},
"conversations" : {
"x123332" : {
"message1" : {
"time" : 12344556,
"text" : "hello, how are you?",
"userId" : 10
},
"message2" : {
"time" : 12344560,
"text" : "Good",
"userId" : 1
}
}
}
}
The numbers 1, 10, 17 are sample user id's.
My question is, can this be structured in a better way? The goal is to scale up as the app users grow and still get the best performance possible.
Using a document-oriented database such as Firestore, you can store the conversations as shown below:
{
"chat_rooms":[
{
"cid":100,
"members":[1, 2],
"messages":[
{"from":1, "to":2, "text":"Hey Dude! Bring it"},
{"from":2, "to":1, "text":"Sure man"}
]
},
{
"cid":101,
"members":[3, 4],
"messages":[
{"from":3, "to":4, "text":"I can do that work"},
{"from":4, "to":3, "text":"Then we can proceed"}
]
}
]
}
A few examples of queries you could run against this structure (shown in MongoDB-style shell syntax):
Get all the conversations of a logged-in user with the user ID 1.
db.chat_rooms.find({ members: 1 })
Get all the documents containing messages sent by the user with ID 1.
db.chat_rooms.find({ "messages.from": 1 })
The same structure can also be implemented in a relational database (RDBMS) such as MySQL or MSSQL using table relationships, and it works for group chat rooms as well.
This structure is optimized to reduce the number of document reads, which can save you money on infrastructure.
In the example above, you pay for only 2 document reads even though there are 4 messages. If you stored every message as an individual document and queried by sender ID, reading all of them would cost 4 document reads, which adds up quickly once your database holds long conversation histories.
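To make the two lookups concrete, here is a plain JavaScript sketch that runs the same filters in memory over the sample data. No database is involved, and the function names roomsForMember and roomsWithMessagesFrom are made up for illustration.

```javascript
// Sample data copied from the structure above.
const chatRooms = [
  { cid: 100, members: [1, 2], messages: [
    { from: 1, to: 2, text: "Hey Dude! Bring it" },
    { from: 2, to: 1, text: "Sure man" },
  ] },
  { cid: 101, members: [3, 4], messages: [
    { from: 3, to: 4, text: "I can do that work" },
    { from: 4, to: 3, text: "Then we can proceed" },
  ] },
];

// Rooms in which the given user is a member.
function roomsForMember(rooms, userId) {
  return rooms.filter((room) => room.members.includes(userId));
}

// Rooms that contain at least one message sent by the given user.
function roomsWithMessagesFrom(rooms, userId) {
  return rooms.filter((room) => room.messages.some((msg) => msg.from === userId));
}

console.log(roomsForMember(chatRooms, 1).map((room) => room.cid));
console.log(roomsWithMessagesFrom(chatRooms, 3).map((room) => room.cid));
```

Both lookups return whole room documents, messages included, which is where the saving in document reads comes from.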
One case for storing messages could look something like this:
"userMessages":
{ "simplelogin:1":
{ "simplelogin:2":
{ "messageId1":
{ "uid": "simplelogin:1",
"body": "Hello!",
"timestamp": Firebase.ServerValue.TIMESTAMP },
"messageId2": {
"uid": "simplelogin:2",
"body": "Hey!",
"timestamp": Firebase.ServerValue.TIMESTAMP }
}
}
}
This structure comes from the fireslack example, a tutorial that builds a Slack-like app using Firebase:
https://thinkster.io/angularfire-slack-tutorial
If you want something more specific, more information would be helpful.

how to retrieve data ordered by key inside unspecified key with firebase

I have a snapshot for my reference in firebase like this:
"friendlist" : {
"user1" : {
"user3" : 1
},
"user2" : {
"user1" : 0
},
"user3" : {
"user1" : 1
}
}
The explanation for the reference:
Every user has a unique ID, and I use that ID as the key of the user's friend list. In the example above there are 3 users, each with their own friend list. Each friend list contains the IDs of other users. A value of 1 means the two users are already friends; a value of 0 means the user is requesting to become friends.
My problem is:
How do I get the IDs of all friend lists that contain "user1" with value 0? Can I do that in just one query?
I think I need to iterate through all friend lists, order each one by key, and look for "user1". Or is there a better approach?
Any answer would be appreciated, thanks!
Next time, it would help if you told us a bit more about what you've already tried, or at the very least specified which language/environment you're targeting.
But in JavaScript, you can get those users with:
var ref = new Firebase('https://yours.firebaseio.com/friendlist');
var query = ref.orderByChild('user1').equalTo(0);
query.once('value', function(usersSnapshot) {
usersSnapshot.forEach(function(userSnapshot) {
console.log(userSnapshot.key());
});
});
With the sample data you specified, this will print:
user2
You should add (and will get a warning about) an index for efficiently performing this query:
{
"rules": {
"friendlist": {
".indexOn": ["user1"]
}
}
}
Without this index, the Firebase client will just download all data to the client and do the filtering client-side. With the index, the query will be performed server-side.
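As a rough picture of that client-side fallback, the filter amounts to something like the following. This is a plain JavaScript emulation over the sample data, not the SDK's implementation, and the function name keysWhereChildEquals is made up.

```javascript
// Emulates orderByChild(childKey).equalTo(value) over data that has
// already been downloaded in full.
function keysWhereChildEquals(nodes, childKey, value) {
  return Object.keys(nodes).filter((key) => nodes[key][childKey] === value);
}

const friendlist = {
  user1: { user3: 1 },
  user2: { user1: 0 },
  user3: { user1: 1 },
};
console.log(keysWhereChildEquals(friendlist, "user1", 0)); // [ 'user2' ]
```

Every friend list has to be downloaded before this filter can run, which is the cost the index avoids.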
A better data model
You'll likely want to search for any friend, which turns the index into:
".indexOn": ["user1", "user2", "user3"]
But with this structure, you'll need to add an index whenever you add a user. Firebase SDKs don't have an API to add indexes, which is typically a good indication that your data structure is not fitting your needs.
When using a NoSQL database, your data structure should meet the needs of the application you're building. Since you are looking to query the friends of user1, you should store the data in that format too:
"friendlist" : {
"user1" : {
"user3" : 1
},
"user2" : {
"user1" : 0
},
"user3" : {
"user1" : 1
}
},
"friendsOf": {
"user1": {
"user2": 0,
"user3": 1
},
"user3": {
"user1": 1
}
}
As you can see, we now store two lists:
* friendlist is your original list
* friendsOf is the inverse of your original list
When you need to know who friended user 1, you can now read that data with:
ref.child('friendsOf').child('user1').on('value'...
Note that we no longer need a query for this, which makes the operation a lot more scalable on the database side.
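If the original list already holds data, the inverse list could be derived from it once. A minimal sketch in plain JavaScript, assuming the whole friendlist node has been read into an object; the function name invertFriendlist is made up.

```javascript
// Derives the inverse index: for every owner -> friend entry in friendlist,
// write a friend -> owner entry with the same status value.
function invertFriendlist(friendlist) {
  const friendsOf = {};
  for (const [owner, friends] of Object.entries(friendlist)) {
    for (const [friend, status] of Object.entries(friends)) {
      if (!friendsOf[friend]) friendsOf[friend] = {};
      friendsOf[friend][owner] = status;
    }
  }
  return friendsOf;
}

const originalList = {
  user1: { user3: 1 },
  user2: { user1: 0 },
  user3: { user1: 1 },
};
console.log(invertFriendlist(originalList));
```

For the sample data this yields the same friendsOf node as shown above: user1 is listed by user2 (0) and user3 (1), and user3 is listed by user1 (1).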
Atomic updates
With this new data model, you need to write data in two places when adding a friend relation. You can do this with two set()/update() operations. But in recent Firebase SDKs, you can also perform both writes in a single update like this:
function setRelationship(user1, user2, value) {
// ref must point at the database root here, since both paths start from it
var updates = {};
updates['friendlist/'+user1+'/'+user2] = value;
updates['friendsOf/'+user2+'/'+user1] = value;
ref.update(updates);
}
setRelationship('user3', 'user4', 1);
The above will send a single command to the Firebase server to write the relationship to both the friendlist and friendsOf nodes.

Resources