Firebase Firestore Easy to remember references - firebase

We are using Firebase Firestore for data storage. When a user creates a new room, we want the reference to be easy to remember so that a user can share the room ID/code with other users.
At present Firestore will create a unique reference such as:
DvfTMYED5cWdo5qIraZg
This is too long and difficult to remember or share. We could set a different reference manually, but they have to be unique. The other point is that users can create multiple rooms so the reference would have to change each time.
Is there a way to use shorter/better references for this use case?

Firebase/Firestore has nothing built in for shorter references, as they wouldn't have enough entropy to statistically guarantee uniqueness. But since creating chat rooms is likely a fairly low-volume operation, you can implement this in your app by:
Generating your own token for each room, for example a counter.
Checking in the database whether this room is available.
If the token is already taken, generate another one and try again.
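A minimal sketch of that generate/check/retry loop, assuming a plain Python dict stands in for the rooms collection. The names `ALPHABET`, `generate_code` and `create_room` are hypothetical and not part of any Firebase SDK. In a real Firestore app the check and the write would need to happen atomically (for example inside a transaction), otherwise two clients could claim the same code.

```python
import secrets

# Hypothetical alphabet: leaves out 0/O and 1/I/L, which are easy to confuse
# when a code is read aloud or typed from memory.
ALPHABET = "ABCDEFGHJKMNPQRSTUVWXYZ23456789"


def generate_code(length=6):
    """Return a random short code drawn from ALPHABET."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def create_room(rooms, name, length=6, max_attempts=10):
    """Create a room under a free code; `rooms` is a dict standing in for the collection."""
    for _ in range(max_attempts):
        code = generate_code(length)
        if code not in rooms:          # "is this room code available?"
            rooms[code] = {"name": name}
            return code
        # Code already taken: loop and try another one.
    raise RuntimeError("could not find a free room code")
```

With 6 characters from this alphabet there are 31^6 (roughly 887 million) possible codes, so retries should be rare at low volume.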
This is pretty much how auto-increment fields work on most databases. On Firestore you'd create a document where you keep the current counter value:
chat_rooms (collection)
COUNTERS: { last_room_id: 2 } (document)
chatroom_1: { room_id: 1, name: "Chat room for Stuart and Frank" } (document)
chatroom_2: { room_id: 2, name: "Public chat room" } (document)
When you now create a new room, you:
Start a transaction.
Read COUNTERS.
Read the last_room_id, and increment it.
Write the updated document back.
Create a new document for the new chat room.
Commit the transaction.
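The steps above could be sketched roughly as follows. This is an in-memory illustration only: `InMemoryStore` is a hypothetical stand-in for the `chat_rooms` collection, and a lock plays the role of the transaction so that the read-increment-write sequence cannot interleave with another caller.

```python
import threading


class InMemoryStore:
    """Hypothetical stand-in for the chat_rooms collection."""

    def __init__(self):
        self.docs = {"COUNTERS": {"last_room_id": 0}}
        self._lock = threading.Lock()

    def create_room(self, name):
        # The lock stands in for "start transaction ... commit".
        with self._lock:
            counters = self.docs["COUNTERS"]           # read COUNTERS
            room_id = counters["last_room_id"] + 1     # increment last_room_id
            counters["last_room_id"] = room_id         # write the counter back
            # Create the new chat room document under the new ID.
            self.docs[f"chatroom_{room_id}"] = {"room_id": room_id, "name": name}
        return room_id
```

Note that sequential IDs are easy to guess, which may or may not matter depending on whether rooms are meant to be private.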
Note that there are many ways to generate the codes. The counter approach above is a simple one, but I recommend checking out more options. Some interesting reading:
How to generate unique coupon codes?
Generating human-readable/usable, short but unique IDs
Unique Identifiers that are User-Friendly and Hard to Guess

Related

Organizing a Cloud Firestore database

I can't manage to determine what is the better way of organizing my database for my app :
My users can create items identified by a unique ID.
The queries I need :
- Query 1: Get all the items created by a user
- Query 2 : From the UID of an item, get its creator
My database is organized as following :
Users database
user1 : {
item1_uid,
item2_uid
},
user2 : {
item3_uid
}
Items database
item1_uid : {
title,
description
},
item2_uid : {
title,
description
},
item3_uid : {
title,
description
}
Query 1 is quite simple, but for query 2 I need to scan the whole users database and go through all the item IDs to see whether the one I am looking for is there. It works right now, but I'm afraid the request time will get slower as the database grows.
Should I add in the items data a row with the user id ? If yes the query will be simpler but I heard that I am not supposed to have twice the same data in the database because it can lead to conflicts when adding or removing items.
Should I add in the items data a row with the user id ?
Yes, this is a very common approach in the NoSQL world and is called denormalization. Denormalization is described, in this "famous" post about NoSQL data modeling, as "copying of the same data into multiple documents in order to simplify/optimize query processing or to fit the user’s data into a particular data model". In other words, the main driver of your data model design is the queries you plan to execute.
More concretely, you could have an extra field in your item documents which contains the ID of the creator. You could even have another one with, e.g., the name of the creator. This way, in one query, you can display the items and their creators.
Now, to keep these different documents in sync (for example, if you change the name of one user, you want it to be updated in the corresponding items), you can either use a Batched Write to modify several documents in one atomic operation, or rely on one or more Cloud Functions that detect changes to the user documents and reflect them in the item documents.
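To make the trade-off concrete, here is a small in-memory sketch of the denormalized model. The field names `creator_id` and `creator_name`, the sample user names, and the helper functions are all hypothetical; plain dicts stand in for the two collections. In Firestore, query 1 would roughly correspond to an equality filter on the creator field, and the rename would be done with a batched write or a Cloud Function as described above.

```python
def items_created_by(items, user_id):
    """Query 1: IDs of all items whose creator_id matches."""
    return sorted(i for i, item in items.items() if item["creator_id"] == user_id)


def creator_of(items, item_id):
    """Query 2: a single lookup instead of scanning every user."""
    return items[item_id]["creator_id"]


def rename_user(users, items, user_id, new_name):
    """Update the user and every denormalized copy of the name together."""
    affected = items_created_by(items, user_id)   # read phase
    users[user_id]["name"] = new_name             # write phase
    for item_id in affected:
        items[item_id]["creator_name"] = new_name
    return affected


# Sample data (hypothetical names and titles).
users = {"user1": {"name": "Alice"}, "user2": {"name": "Bob"}}
items = {
    "item1_uid": {"title": "t1", "creator_id": "user1", "creator_name": "Alice"},
    "item2_uid": {"title": "t2", "creator_id": "user1", "creator_name": "Alice"},
    "item3_uid": {"title": "t3", "creator_id": "user2", "creator_name": "Bob"},
}
```

The duplicated `creator_name` is the part that needs syncing; `creator_id` never changes, so it carries no consistency cost.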

How to keep documents with 2 partition keys in sync / referential integrity?

I have a cosmos db with high cardinality synthetic partition keys and type properties.
I need a setup where users can share documents between them.
for example, this is a document:
{
"id": "guid",
"title": "Example document to share",
"ownerUserId": "user1Guid",
"type": "usersDocument",
"partitionKey": "user_user1Guid_documents"
}
now, user wants to share this document with another user.
Assumptions:
one document can be shared with many users (thousands)
one user can have thousands of documents shared with him
For these 2 reasons:
I don't want to embed sharings into the shared documents or into user documents (writes would very soon become inefficient/expensive); I would prefer those m:n relations to be separate documents.
I don't want to put the shares for all users/documents into a single partition, as that will create hot spots very soon.
I need both queries:
1. ListDocumentsSharedWithMe
In this query, at query time, i know id of the user documents are shared with.
2. ListAllUsersISharedThisDocumentWith
In this query, at query time, I know the id of the document that has been shared with different users.
All this makes me think i should have 2 separate document types with separate partition
For listing all documents shared with me:
{
"id": "documentGuid",
"type": "sharedWithMe",
"partitionKey": "sharedWithMe_myUserGuid"
}
(this could also be a single document with collection of shared documents. important here is partitionKey)
Now I can easily do SQL like SELECT * FROM c WHERE c.type = "sharedWithMe" and run the query against the partition key containing my user guid.
For listing all users I shared some document with, it's similar:
{
"id": "userISharedWithGuid",
"type": "documentSharings",
"partitionKey": "documentShare_documentGuid"
}
Now I can easily do SQL like SELECT * FROM c WHERE c.type = "documentSharings" and run the query against the partition key containing my document guid.
Question:
When user shares a document with some user, both documents should be created with different partition keys (thus, no sp/transactions).
How to keep this “atomic-like” or avoid create/update anomalies?
Or is there any better way to model this?
I think your method makes sense; I do something similar, partitioning in multiple ways based on the scope of a query. I assume your main concern is a failure happening between saving the first and the last of a set of related documents? Unfortunately, the only way to manage the chain of documents as they save is within your application code. That is, we make sure we save in the order that makes it easiest to roll back, and then implement a rollback method within the exception handler. This works by keeping a collection of the saved documents in memory.
As you say, because you are writing across partitions, there is no transaction handling out of the box.
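A rough sketch of that save-then-roll-back idea, assuming a dict keyed by `(partitionKey, id)` stands in for the Cosmos DB container. `save_all` and `share_document` are hypothetical helpers, not SDK calls; the document shapes are taken from the question.

```python
def save_all(container, documents):
    """Save documents in order; on failure, delete the ones already saved."""
    saved = []  # keys written so far, kept in memory for rollback
    try:
        for doc in documents:
            key = (doc["partitionKey"], doc["id"])
            if key in container:
                raise ValueError(f"conflict on {key}")
            container[key] = doc
            saved.append(key)
    except Exception:
        # Compensate in reverse order, then re-raise for the caller.
        for key in reversed(saved):
            container.pop(key, None)
        raise


def share_document(container, document_id, target_user_id):
    """Write both sides of the m:n relation, one per partition."""
    save_all(container, [
        {"id": document_id, "type": "sharedWithMe",
         "partitionKey": f"sharedWithMe_{target_user_id}"},
        {"id": target_user_id, "type": "documentSharings",
         "partitionKey": f"documentShare_{document_id}"},
    ])
```

Keep in mind that the rollback itself can fail (for example if the process dies halfway), so this narrows the window for anomalies rather than closing it.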

Cloud Firestore and data modeling: From RDBMS to No-SQL

I am building an iOS app that is using Cloud Firestore (not Firebase realtime database) as a backend/database.
Google is trying to push new projects towards Cloud Firestore, and to be honest, developers with new projects should opt-in for Firestore (better querying, easier to scale, etc..).
My issue is the same that any relational database developer has when switching to a no-SQL database: data modeling
I have a very simple scenario, that I will first explain how I would configure it using MySQL:
I want to show a list of posts in a table view, and when the user clicks on one post to expand and show more details for that post (let say the user who wrote it). Sounds easy.
In a relational database world, I would create 2 tables: one named "posts" and one named "users". Inside the "posts" table I would have a foreign key indicating the user. Problem solved.
Poor Barry, never had the time to write a post :(
Using this approach, I can easily achieve what I described, and also, if a user updates his/her details, you will only have to change it in one place and you are done.
Let's now switch to Firestore. I like to think of RDBMS's table names as Firestore's collections and the content/structure of the table as the documents.
In my mind I have 2 possible solutions:
Solution 1:
Follow the same logic as the RDBMS: inside the posts collection, each document should have a key named "userId" and the value should be the documentId of that user. Then by fetching the posts you will know the user. Querying the database a second time will fetch all user related details.
Solution 2:
Data duplication: Each post should have a map (nested object) with a key named "user" and containing any user values you want. By doing this the user data will be attached to every post it writes.
Coming from the normalization realm of RDBMS this sounds scary, but a lot of no-SQL documents encourage duplication(?).
Is this a valid approach?
What happens when a user needs to update his/her email address? How easily you make sure that the email is updated in all places?
The only benefit I see in the second solution is that you can fetch both post and user data in one call.
Is there any other solution for this simple yet very common scenario?
ps: go easy on me, first time no-sql dev.
Thanks in advance.
Use solution 1. Guidance on nesting vs not nesting will depend on the N-to-M relationship of those entities (for example, is it 1 to many, many to many?).
If you believe you will never access an entity without accessing its 'parent', nesting may be appropriate. In Firestore (or other document-based NoSQL databases), you should decide whether to nest that entity directly in the document or in a subcollection based on the expected size of that nested entity. For example, messages in a chat should be a subcollection, as they may in total exceed the maximum document size.
Mongo, a leading noSQL db, provides some guides here
Firestore also provided docs
Hope this helps
@christostsang I would suggest a combination of option 1 and option 2. I like to duplicate data for the view layer and reference the user_id as you suggested.
For example, you will usually show a post and the created_by or author_name with the post. Rather than having to pay additional money and cycles for the user query, you could store both the user_id and the user_name in the document.
A model you could use would be an object/map in Firestore. Here is an example model for you to consider:
posts = {
id: xxx,
title: xxx,
body: xxx,
likes: 4,
user: {refId: xxx123, name: "John Doe"}
}
users = {
id: xxx,
name: xxx,
email: xxx,
}
Now when you retrieve the posts document(s) you also have the user/author name included. This makes things easy on a postList page, where you might show posts from many different users/authors without needing to query each user to retrieve their name. When a user clicks on a post and you want to show additional user/author information, like their email, you can perform the query for that one user on the postView page. FYI: you will need to consider the changes users make to their names, and whether you will update all posts to reflect a name change.

Cloud Firestore - ensuring data consistency

My database uses redundant data to speed up fetches and minimise the number of documents that need to be read for certain queries. For example I'd store the names of followed users in a map in a users document so I don't have to read another document to retrieve the names of each of the followed users.
User: (Collection) {
userID: (Document) {
//user state
name: ...
followingUsers: (Map) {
followingUserID: nameOfUser,
followingUserID: nameOfUser
}
}
}
If a user was to change their name, what is the best way to propagate these changes to all places with the redundant data?
Good question!
For starters, I'd recommend doing this kind of administrative task in a server SDK or cloud function, since you don't want a client to necessarily have the ability to start mucking with every single User doc.
The good news is that, once you start using the server SDKs, you can then put a query into a transaction. So let's say user_123 changes their name from "Jenny" to "Jen". Your transaction would look something like this in pseudo-code:
Start Transaction
transaction.get(usersRef.where("followingUsers.user_123", ">=", ""))
Loop through query results. Grab the doc_id from each doc and use that to start building out the writes in your transaction.
transaction.update("/users/<doc_id>/", {"followingUsers.user_123" : "Jen"})
Also make sure you add transaction.update("/users/user_123", {"name": "Jen"})
End transaction
This general approach would also work on the client-side, but you just wouldn't be able to do this in a transaction. (You could still put all of these changes into a batch write, though.)
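The read-then-write shape of that pseudo-code can be illustrated in memory like this. It is only a sketch: `rename_followed_user` is a hypothetical helper, a dict stands in for the users collection, and the membership test plays the role of the `where("followingUsers.user_123", ">=", "")` query, which matches documents where that map key exists.

```python
def rename_followed_user(users, user_id, new_name):
    """Propagate a name change to every followingUsers map that references user_id."""
    # Read phase: collect every user doc whose followingUsers map has this key.
    affected = sorted(
        doc_id for doc_id, doc in users.items()
        if user_id in doc.get("followingUsers", {})
    )
    # Write phase: update the redundant copies, then the user's own document.
    for doc_id in affected:
        users[doc_id]["followingUsers"][user_id] = new_name
    users[user_id]["name"] = new_name
    return affected
```

Doing all reads before any writes mirrors how Firestore transactions are structured, where reads have to come before writes.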

Firebase query for bi-directional link

I'm designing a chat app much like Facebook Messenger. My two current root nodes are chats and users. A user has an associated list of chats users/user/chats, and the chats are added by autoID in the chats node chats/a151jl1j6. That node stores information such as a list of the messages, time of the last message, if someone is typing, etc.
What I'm struggling with is where to define which two users are in the chat. Originally, I put a reference to the other user as the value of the chatId key in the users/user/chats node, but I thought that was a bad idea in case I ever wanted group chats.
What seems more logical is to have a chats/chat/members node in which I define userId: true, user2id: true. My issue with this is how to efficiently query it. For example, if the user is going to create a new chat with a user, we want to check if a chat already exists between them. I'm not sure how to do the query of "Find chat where members contains currentUserId and friendUserId" or if this is an efficient denormalized way of doing things.
Any hints?
Although the idea of having ids in the format id1---||---id2 definitely gets the job done, it may not scale if you expect to have large groups and you have to account for id2---||---id1 comparisons which also gets more complicated when you have more people in a conversation. You should go with that if you don't need to worry about large groups.
I'd actually go with using the autoId chats/a151jl1j6 since you get it for free. The recommended way to structure the data is to make the autoId the key in the other nodes with related child objects. So chats/a151jl1j6 would contain the conversation metadata, members/a151jl1j6 would contain the members in that conversation, messages/a151jl1j6 would contain the messages and so on.
"chats": {
"a151jl1j6": {}
},
"members": {
"a151jl1j6": {
"user1": true,
"user2": true
}
},
"messages": {
"a151jl1j6": {}
}
The part where this gets a little "inefficient" is querying for conversations that include both user1 and user2. The recommended way is to create an index of conversations for each user and then query the members data.
"user1":{
"chats":{
"a151jl1j6":true
}
}
This is a trade-off when it comes to querying relationships with a flattened data structure. The queries are fast since you are only dealing with a subset of the data, but you end up with a lot of duplicate data that need to be accounted for when you are modifying/deleting i.e. when the user leaves the chat conversation, you have to update multiple structures.
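As an illustration of "index per user, then check members", here is an in-memory sketch. `find_chat` is a hypothetical helper, and the two dicts mirror the `users/<id>/chats` index and the `members` node. It assumes a one-to-one chat is one whose member set is exactly the two users, so group chats containing both are skipped.

```python
def find_chat(user_chats, members, current_user_id, friend_user_id):
    """Return the ID of the existing one-to-one chat between two users, or None."""
    wanted = {current_user_id, friend_user_id}
    # Only look at chats the current user is in, not at every chat.
    for chat_id in user_chats.get(current_user_id, {}):
        if set(members.get(chat_id, {})) == wanted:
            return chat_id
    return None
```

The cost grows with the number of chats the current user is in, not with the total number of chats, which is the point of keeping the per-user index.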
Reference: https://firebase.google.com/docs/database/ios/structure-data#flatten_data_structures
I remember I had a similar issue some time ago. Here is how I solved it:
user 1 has a unique ID id1
user 2 has a unique ID id2
Instead of adding a new chat by autoId chats/a151jl1j6, the ID of the chat was id1---||---id2 (super-original human-readable delimiter)
(which is exactly what you've originally suggested)
Originally, I put a reference to the other user as the value of the chatId key in the users/user/chats node, but I thought that was a bad idea in case I ever wanted group chats.
There is a saying: https://en.wikipedia.org/wiki/You_aren%27t_gonna_need_it
There might be a limitation on how many userIDs can live in the path - you can always hash the value...
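A minimal sketch of that composite-ID idea, assuming the user IDs are sorted first so that id1/id2 and id2/id1 give the same key. `chat_id_for` and `hashed_chat_id_for` are hypothetical helper names; SHA-256 is just one possible choice of hash.

```python
import hashlib


def chat_id_for(user_ids, delimiter="---||---"):
    """Build an order-independent chat ID from the participants' user IDs."""
    return delimiter.join(sorted(set(user_ids)))


def hashed_chat_id_for(user_ids):
    """Fixed-length variant for when the joined IDs would get too long."""
    return hashlib.sha256(chat_id_for(user_ids).encode("utf-8")).hexdigest()
```

Because the ID is derived from the participants, checking whether a chat already exists becomes a direct lookup on that key instead of a query.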

Resources