Assuming I have a list of data I would like to store with Firebase realtime database, and search it later.
What would be the best way to store the data and query it to get the best performance?
My data is a list of names (containing a lot of names).
["Bob", "Rob", ...]
Note that I have multiple clients searching at any given time.
If the names are supposed to be unique and order doesn't matter, you'll want to store them as a mathematical set. In the Firebase Realtime Database you'll model this as:
"usernames": {
"Bob": true,
"Rob": true
}
A few things of note about this model:
We use the names as keys, which means that each name is by definition unique (since each key can exist only once in its containing node).
The true values have no specific meaning. They are just needed, since Firebase can't store a key without a value.
Certain characters (such as . and /) cannot be used in keys. If a name contains such characters, you will have to filter them out (or encode them) in the key. For example, someone named Jes.sie will have to be stored as Jessie (lossy) or e.g. Jes%2Esie (with URL encoding).
In such cases you could store the original unfiltered/unencoded name as the value. So: "Jes%2Esie": "Jes.sie".
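The encoding step described above could look roughly like the following. This is a minimal sketch, not part of the Firebase SDK: `encodeKey`, `decodeKey`, and `toUsernameSet` are hypothetical helper names, and the sketch assumes percent-encoding of the characters the Realtime Database rejects in keys (`.`, `$`, `#`, `[`, `]`, `/`, and ASCII control characters), plus `%` itself so that decoding stays unambiguous.

```javascript
// Characters rejected in Realtime Database keys, plus "%" so that an
// already-present percent sign cannot be mistaken for an escape.
const FORBIDDEN_IN_KEY = /[.$#\[\]\/%\x00-\x1F\x7F]/g;

// Hypothetical helper: percent-encode only the characters matched above.
function encodeKey(name) {
  return name.replace(FORBIDDEN_IN_KEY, (ch) =>
    "%" + ch.charCodeAt(0).toString(16).toUpperCase().padStart(2, "0")
  );
}

// Hypothetical helper: reverses encodeKey. The escapes are ordinary
// percent-escapes, so decodeURIComponent can undo them.
function decodeKey(key) {
  return decodeURIComponent(key);
}

// Build the "set" node from a plain list, keeping the original
// spelling of each name as the value.
function toUsernameSet(names) {
  const usernames = {};
  for (const name of names) {
    usernames[encodeKey(name)] = name;
  }
  return usernames;
}

console.log(toUsernameSet(["Bob", "Rob", "Jes.sie"]));
```

Storing the original name as the value means the client never has to decode keys just to display a name; decoding is only needed if the key is the sole copy.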
A few more general notes about (text) searching in the Firebase Realtime Database:
Firebase can only do prefix matches; it has no support for searching for strings that contain or end with a certain substring. This means that in the original data it can search for everything starting with a B (with orderByKey().startAt("B").endAt("B\uF7FF")), but it can't search for everything ending with ob.
Searches are case-sensitive. If you want to be able to search case-insensitive, consider storing the keys as all-lowercase:
"usernames": {
"bob": "Bob",
"rob": "Rob",
"jes%2esie": "Jes.sie"
}
If you need better support for text search, consider integrating a third-party search engine. Common recommendations are Elasticsearch (self-hosted) or Algolia (cloud-based).
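To make the prefix and case-insensitivity notes concrete, here is a small in-memory sketch. It does not call Firebase: `prefixBounds` and `searchByPrefix` are hypothetical helpers, the filter merely imitates what orderByKey().startAt(...).endAt(...) selects, and the "robert" entry is sample data added for illustration. It also ignores the special ordering the database gives to keys that look like integers.

```javascript
// Hypothetical helper: the bounds a prefix query would use against
// all-lowercase keys. "\uF7FF" is a very high code point, so every key
// that starts with the prefix sorts at or below the upper bound.
function prefixBounds(prefix) {
  const start = prefix.toLowerCase();
  return { startAt: start, endAt: start + "\uF7FF" };
}

// In-memory stand-in for orderByKey().startAt(...).endAt(...):
// sort the keys, keep those inside the bounds, return the stored values.
function searchByPrefix(usernames, prefix) {
  const { startAt, endAt } = prefixBounds(prefix);
  return Object.keys(usernames)
    .sort()
    .filter((key) => key >= startAt && key <= endAt)
    .map((key) => usernames[key]);
}

const usernames = {
  bob: "Bob",
  rob: "Rob",
  robert: "Robert", // sample entry, not in the original data
  "jes%2esie": "Jes.sie",
};

console.log(searchByPrefix(usernames, "Ro")); // [ 'Rob', 'Robert' ]
console.log(searchByPrefix(usernames, "ob")); // [] (suffix matches are not possible)
```

Because the keys are lowercase and the values keep the original casing, the search term "Ro" still finds "Rob" and returns it with its display spelling.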
For more information on many of these topics, see:
this article on NoSQL data modeling
the video series Firebase for SQL developers
Cloud Firestore Case Insensitive Sorting Using Query (while written for Firestore, the same applies here)
The documentation in google cloud's datastore section is very confusing to me.
I was under the belief datastore may be used as a key-value storage. (Similar to mongoDB) But I guess I misunderstood how its keys work, since I can't use string keys outright, but I need to transform series of strings to a (list) key via some list => dataStore.key(list) transformation.
That's weird, and I don't understand why use a list instead of a string, and I don't understand why I don't use the list outright, and need to use datastore.key first, but I can do that. However, after playing with it a little bit, I discovered that the return value of datastore.key(list) would get me different values for the same list if I run it repeatedly!
So, now I need to somehow remember this key somewhere, but the reason I wanted to use datastore was that I was running in a service with no persistent memory to begin with.
Am I using datastore wrong? Can I use it at all for simple persistent key-value storage?
It appears that the issue was that I used a list that was too short.
Datastore expects collections (which Datastore itself calls kinds), which contain documents, each of which is a key-value mapping. So instead of having my key point directly at a mapping, I needed to name a collection and, within it, have keys mapped to mappings. (In other words, I needed one more element in the keys list.)
results = await this.dataStore.get(this.dataStore.key([collectionId, documentId]));
AND
await this.dataStore.save({ key: this.dataStore.key([collectionId, documentId]), data: someKeyValueObject });
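The two calls above can be wrapped into a small key-value helper. The following is only a sketch under stated assumptions: `KeyValueStore` and `FakeDatastore` are hypothetical names, the kind "Settings" is made up, and the fake exists solely so the example runs without credentials. The wrapper relies only on the client methods already used above (key, get, save), with get resolving to an array whose first element is the entity.

```javascript
// Thin key-value wrapper. `datastore` is expected to behave like the
// client used above: key(path), get(key) -> [entity], save({ key, data }).
class KeyValueStore {
  constructor(datastore, kind) {
    this.datastore = datastore;
    this.kind = kind; // first element of every key path
  }

  keyFor(name) {
    // Always two elements: [kind, name]. A one-element path has no
    // identifier of its own, which is the "too short" list from the answer.
    return this.datastore.key([this.kind, name]);
  }

  async get(name) {
    const [entity] = await this.datastore.get(this.keyFor(name));
    return entity; // undefined when nothing is stored under that name
  }

  async set(name, data) {
    await this.datastore.save({ key: this.keyFor(name), data });
  }
}

// In-memory fake so the sketch runs locally; NOT the real client.
class FakeDatastore {
  constructor() {
    this.entities = new Map();
  }
  key(path) {
    return { path };
  }
  async get(key) {
    return [this.entities.get(key.path.join("/"))];
  }
  async save({ key, data }) {
    this.entities.set(key.path.join("/"), data);
  }
}

async function demo() {
  const store = new KeyValueStore(new FakeDatastore(), "Settings");
  await store.set("theme", { value: "dark" });
  return store.get("theme");
}

demo().then((entity) => console.log(entity));
```

Since the key is rebuilt from the same [kind, name] pair on every call, nothing has to be remembered between invocations, which is what a service without persistent memory needs.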
I'm creating an application for generating documents with unique IDs. My issue is that the ID needs to be in a specific format (00000/A/B), so I can't use Firestore's document IDs.
The problem is also that I have those IDs in different places:
#1 case
users/{userID}/id = //UNIQUE ID HERE
#2 case
users/{userID}/members <= members is an array of objects where every member need a unique id
I was thinking about a separate collection of IDs where I can check which ones are taken, but maybe there is a better way to ensure an ID is unique across the whole app?
What you're considering is pretty much the only way to guarantee uniqueness of a value across the database.
In a few more structured steps, it'd be:
Use that value as the document ID in a secondary collection. This collection exists purely to ensure uniqueness of the IDs.
Let the user claim it, typically by writing their UID into the document.
Use security rules to ensure a user can only write a document if it doesn't exist yet, and (if needed) can only delete it when they own it.
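The claim logic in these steps can be sketched in memory as follows. This is an illustration only: `IdRegistry` is a hypothetical class standing in for the secondary collection, and the checks it performs are the ones that security rules (or a transaction) would have to enforce against the real database. One assumption worth calling out: Firestore document IDs cannot contain a forward slash, so an ID in the 00000/A/B format has to be encoded before it can serve as a document ID; the sketch replaces "/" with "_", which is only safe if the raw IDs never contain "_".

```javascript
// In-memory stand-in for the secondary collection: document ID -> owner UID.
class IdRegistry {
  constructor() {
    this.owners = new Map();
  }

  // "/" is not allowed in a document ID, so "00001/A/B" becomes "00001_A_B".
  // Assumes raw IDs never contain "_" (otherwise two IDs could collide).
  static toDocId(id) {
    return id.split("/").join("_");
  }

  // Succeeds only if nobody has claimed the ID yet.
  claim(id, uid) {
    const docId = IdRegistry.toDocId(id);
    if (this.owners.has(docId)) return false;
    this.owners.set(docId, uid);
    return true;
  }

  // Succeeds only if the caller owns the claim.
  release(id, uid) {
    const docId = IdRegistry.toDocId(id);
    if (this.owners.get(docId) !== uid) return false;
    return this.owners.delete(docId);
  }
}

const registry = new IdRegistry();
console.log(registry.claim("00001/A/B", "user-1")); // true
console.log(registry.claim("00001/A/B", "user-2")); // false, already taken
console.log(registry.release("00001/A/B", "user-2")); // false, not the owner
```

The same registry covers both cases from the question, because a top-level user ID and a member ID inside the members array would each be claimed in the one shared collection.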
The topic of unique values has been covered quite a few times before, although usually in the form of unique user names, so I recommend checking out:
Cloud Firestore: Enforcing Unique User Names
How to generate and guarantee unique values in firestore collection?
How to enforce Uniqueness in a Property of a document field in Google Cloud Firestore
Firestore unique index or unique constraint?
I want to make unique usernames in firebase/firestore
I have a collection where the documents are uniquely identified by a date, and I want to get the n most recent documents. My first thought was to use the date as a document ID, and then my query would sort by ID in descending order. Something like .orderBy(FieldPath.documentId, descending: true).limit(n). This does not work, because it requires an index, which can't be created because __name__ only indexes are not supported.
My next attempt was to use .limitToLast(n) with the default sort, which is documented here.
By default, Cloud Firestore retrieves all documents that satisfy the query in ascending order by document ID
According to that snippet from the docs, .limitToLast(n) should work. However, because I didn't specify a sort, it says I can't limit the results. To fix this, I tried .orderBy(FieldPath.documentId).limitToLast(n), which should be equivalent. This, for some reason, gives me an error saying I need an index. I can't create it for the same reason I couldn't create the previous one, but I don't think I should need to because they must already have an index like that in order to implement the default ordering.
Should I just give up and copy the document ID into the document as a field, so I can sort that way? I know it should be easy from an algorithms perspective to do what I'm trying to do, but I haven't been able to figure out how to do it using the API. Am I missing something?
Edit: I didn't realize this was important, but I'm using the flutterfire firestore library.
A few points. It is ALWAYS a good practice to use random, well-distributed document IDs in Firestore for scale and efficiency. Related to that, there is effectively NO WAY to query by documentId, except in a few circumstances (a range query is possible but VERY tricky, as it requires inequalities, and you can only do inequalities on one field). IF there's a reason to search on an ID, yes, it is PERFECTLY appropriate to store it in the document as well; in fact, my wrapper library always does this.
The correct notation, by the way, would be FieldPath.documentId() (method, not constant), or alternatively __name__, but I believe this only works in queries. The reason it requested a new index is that, without the (), it assumed you had a field named FieldPath with a subfield named documentId.
Further: FieldPath.documentId() does NOT generate the documentId at the server - it generates the FULL PATH to the document - see Firestore collection group query on documentId for a more complete explanation.
So net:
=> documentId's should be as random as possible within a collection; it's generally best to let Firestore generate them for you.
=> a valid exception is when you have ONE AND ONLY ONE sub-document under another - for example, every "user" document might have one and only one "forms of Id" document as a subcollection. It is valid to use the SAME ID as the parent document in this exceptional case.
=> anything you want to query should be a FIELD in a document, and generally a simple field.
=> WORD TO THE WISE: Firestore "arrays" are ABSOLUTELY NOT ARRAYS. They are ORDERED LISTS, generally in the order they were added to the array. The SDK presents them to the CLIENT as arrays, but Firestore itself does not STORE them as ACTUAL ARRAYS - THE NUMBER YOU SEE IN THE CONSOLE is the order, not an index. Matching elements in an array (arrayContains, e.g.) requires matching the WHOLE element - if you store an ordered list of objects, you CANNOT query the "array" on sub-elements.
From what I've found:
FieldPath.documentId does not match on the documentId, but on the refPath (which it gets automatically if passed a document reference).
As such, since the documents are to be sorted by timestamp, it would be better to store a timestamp field such as createdAt rather than a human-readable date string, because strings are sorted character by character rather than by the date they represent.
From there, you can simply sort by date and limit to last. You can keep the document ID's as you intend.
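The difference between sorting date strings and sorting timestamps can be seen in a few lines. This is an in-memory sketch rather than a Firestore query: `mostRecent` is a hypothetical helper, the sample documents and their `createdAt` values are made up, and against the real database the equivalent would be an orderBy on the createdAt field combined with a limit.

```javascript
// Human-readable date strings sort character by character,
// so October ends up before September here:
console.log(["9/1/2020", "10/1/2020"].sort()); // [ '10/1/2020', '9/1/2020' ]

// Sample documents: the ID stays a date, and createdAt is a real timestamp.
const docs = [
  { id: "2020-09-01", createdAt: new Date("2020-09-01T00:00:00Z") },
  { id: "2020-10-01", createdAt: new Date("2020-10-01T00:00:00Z") },
  { id: "2020-08-15", createdAt: new Date("2020-08-15T00:00:00Z") },
];

// Hypothetical helper: the n most recent documents, newest first.
// Copies the array so the caller's order is left untouched.
function mostRecent(documents, n) {
  return [...documents]
    .sort((a, b) => b.createdAt.getTime() - a.createdAt.getTime())
    .slice(0, n);
}

console.log(mostRecent(docs, 2).map((doc) => doc.id)); // [ '2020-10-01', '2020-09-01' ]
```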
I have a real-time database on firebase which consists of ListFields. Among these fields, one field, participants is a list of strings and two usernames. I want to make a query to firebase database such that it will return the documents in which a particular username is present in the participants list.
The structure of my document is as follows :
I want to make a query such that Firebase returns all the documents in which the participants list contains aniruddh. I am using Flutter with the flutterfire plugins.
Your current data structure makes it easy to find the participants for a conversation. It does however not make it easy to find the conversations for a user.
One alternative data structure that makes this easier is to store the participants in this format:
imgUrls: {},
participants: {
"aniruddh": true,
"trubluvin": true
}
Now you can technically query for the conversations of a user with something like:
db.child("conversations").orderByChild("participants/aniruddh").equalTo(true)
But this won't scale very well, as you'll need to define an index for each user.
The proper solution is to add a second data structure, known as an inverted index, that allows the look up of conversations for a user. In your case that could look like this:
userConversations: {
"aniruddh": {
"-LxzV5LzP9TH7L6BvV7": true
},
"trubluvin": {
"-LxzV5LzP9TH7L6BvV7": true
}
}
Now you can look up the conversations that a user is part of with a simple read operation. You could expand this data structure to contain more information on each conversation, such as the information you want to display in your list view.
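Keeping the inverted index in step with the conversations is the part that needs care. The sketch below is a hedged illustration in plain JavaScript: `buildUserConversations` and `addParticipantUpdate` are hypothetical helpers, and the node names follow the structures shown above (with "conversations" as the assumed parent of the conversation nodes). The second helper only builds the path-to-value object that a single multi-location update would write; it does not talk to the database.

```javascript
// Derive the inverted index (user -> conversation IDs) from the
// conversations node, e.g. for a one-off migration of existing data.
function buildUserConversations(conversations) {
  const index = {};
  for (const [conversationId, conversation] of Object.entries(conversations)) {
    for (const username of Object.keys(conversation.participants || {})) {
      index[username] = index[username] || {};
      index[username][conversationId] = true;
    }
  }
  return index;
}

// Paths to write together when a user joins a conversation, so the
// participants map and the inverted index cannot drift apart.
function addParticipantUpdate(conversationId, username) {
  return {
    [`conversations/${conversationId}/participants/${username}`]: true,
    [`userConversations/${username}/${conversationId}`]: true,
  };
}

const conversations = {
  "-LxzV5LzP9TH7L6BvV7": {
    imgUrls: {},
    participants: { aniruddh: true, trubluvin: true },
  },
};

console.log(buildUserConversations(conversations));
console.log(addParticipantUpdate("-LxzV5LzP9TH7L6BvV7", "aniruddh"));
```

Writing both paths in one update means a client never observes a conversation that lists a participant without the matching entry under userConversations.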
Also see my answers here:
Firebase query if child of child contains a value (for more explanation on why the queries won't work in your current structure, and why they won't scale in the first structure in my answer).
Best way to manage Chat channels in Firebase (for an alternative way of naming the chat rooms).