Firebase Firestore Data Structuring - firebase

I am having trouble transitioning from Realtime Database to Cloud Firestore. I'm developing something like a story book app. So, I need to store Stories in my database. Each story contains a set of chapters, and each chapter contains a set of pages.
Back in Realtime Database I used to have a data structure along the lines of:
stories:
  storyId:
    title: "Isaac, the Toothless Dog"
    description: "Isaac wanders Paris seeking for true love."
    chapters:
      chapterId:
        title: "Into the Sewers"
        published: true
        pages:
          pageId:
            loose: false
            occupied: false
            body: "This is where the body of the page goes"
With Realtime Database I could just request the node storyId, and receive the whole map below with chapters and pages. With that map, in Dart, I would generate an object Story with all that information. That's what I want to achieve, using Cloud Firestore.
For Cloud Firestore, I have designed the following data structure:
Of course, the purple outer boxes represent Collections, and the blue inner boxes represent Documents. I use three dots (...) to show that there would be plenty more documents like the one next to it.
I like this data structure. It looks neat and clean, and I always know what I'm looking at.
However, Cloud Firestore queries are Shallow (a feature I actually like, serves me well in other points of the app), meaning I can only query from a single Collection. That means I can't generate that Story object (in Dart) without, at least, three queries to Firebase.
My question is: Is there a way to achieve this using only one query? If not, is there any way I can organize my data in order to achieve so?
I know I could put the pages and chapters information inside the Stories Collection (like what I had in Realtime Database), but Documents have a size limit of 1 MB and 20,000 fields. If a story contains multiple chapters, each one with multiple pages, that limit may very well be exceeded. That's why I want to keep them in separate Collections.
Any ideas? Thank you.
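For concreteness, the multi-read assembly described in this question could be sketched roughly as follows. This is a plain JavaScript model with an in-memory object standing in for Firestore, not the Firebase SDK; the paths, ids, the "Out Again" chapter and the helper names (getDocument, getCollection, loadStory) are all assumptions made up for illustration.

```javascript
// In-memory stand-in for Firestore: collection path -> { docId: data }.
// All paths, ids and field values below are hypothetical sample data.
const db = {
  "stories": {
    s1: { title: "Isaac, the Toothless Dog" },
  },
  "stories/s1/chapters": {
    c1: { title: "Into the Sewers", published: true },
    c2: { title: "Out Again", published: false },
  },
  "stories/s1/chapters/c1/pages": {
    p1: { body: "Page one" },
    p2: { body: "Page two" },
  },
  "stories/s1/chapters/c2/pages": {
    p1: { body: "Another page" },
  },
};

let queryCount = 0;

// Shallow read of ONE collection: subcollections are never included.
function getCollection(path) {
  queryCount += 1;
  const docs = db[path] || {};
  return Object.entries(docs).map(([id, data]) => ({ id, ...data }));
}

// Shallow read of one document.
function getDocument(collectionPath, id) {
  queryCount += 1;
  const data = (db[collectionPath] || {})[id];
  return data ? { id, ...data } : null;
}

// Assemble the nested Story object from separate shallow reads.
function loadStory(storyId) {
  const story = getDocument("stories", storyId);
  if (!story) return null;
  const chapters = getCollection(`stories/${storyId}/chapters`).map((chapter) => ({
    ...chapter,
    pages: getCollection(`stories/${storyId}/chapters/${chapter.id}/pages`),
  }));
  return { ...story, chapters };
}

const story = loadStory("s1");
console.log(story.chapters.length, queryCount);
```

In this model the story document, the chapters collection, and one pages collection per chapter are each read separately, so the number of round trips grows with the number of chapters.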

Related

Should I create a duplicate collection/document for each use-case? (Firebase/Firestore)

I'm trying to build an ecommerce app with firebase on the backend. I have a collection of 1000+ products, each of which is stored as a separate document, which have product specific info such as price, title etc.
document: {
    title: 'Some Title',
    price: '$99.99',
    genres: ['Horror', 'Action']
}
So in my app I need to display these products in many places, such as product carousels (similar to a bookshelf with arrow buttons at the ends), and also in a search results page.
On any given page, I assume that I will need to display at least 50 products, either as search results or in multiple carousels. I understand that I can use queries to get this data from Firebase. But since each document I retrieve counts as (at least) one Firestore read, I assume that a typical user session would run into 100+ reads, if not thousands.
It seems a little inefficient to me that I need to read multiple documents to get this data, when I could just store all that data in a single array, as its own document. That would mean I get charged for one document read, not 50, per page.
Is this how it is expected to be done? Should I create a new document containing the data I need for each specific use case?
P.S. I'm pretty new to backend dev, let alone firebase.
TL;DR Yes, you should create a new document with the needed data for each specific use case, but it’s not recommended to make it as a document with nested objects like arrays with 1000+ elements.
From a technical point of view, Cloud Firestore is optimized for storing large collections of small documents.
Depending on the use case, you can select the most appropriate Cloud Firestore data structure.
For example, the 10 best-selling books of the month can be a document with nested complex objects like arrays or maps. This structure could be useful for use cases with a small or predefined number of elements, but as stated in the documentation, if your data expands over time with larger or growing lists, the document also grows, which can lead to slower document retrieval times.
With thousands of records, a better choice can be to structure your data as subcollections. That is, you can create collections within documents when you have data that might expand over time, with the main advantage that, as your lists grow, the size of the parent document doesn't change.
Cloud Firestore also has several features to help you manage queries that return a large number of results:
Cursors, which allow you to resume a long-running query.
Page tokens, which help you paginate the query results.
Limits, which specify how many results to retrieve.
Offsets, which allow you to skip a fixed number of documents.
There are no additional costs for using cursors, page tokens, and limits. In fact, these features can help you save money by reading only the documents that you actually need.
As a best practice, do not use offsets. Instead, use cursors. Using an offset only avoids returning the skipped documents to your application, but these documents are still retrieved internally. The skipped documents affect the latency of the query, and your application is billed for the read operations required to retrieve them.
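To make the offset-versus-cursor billing difference concrete, here is a minimal sketch in plain JavaScript. It models the rule stated above on an in-memory array rather than calling the Firebase SDK; the product ids and the functions pageWithOffset and pageWithCursor are hypothetical names chosen for illustration.

```javascript
// Hypothetical collection of 100 documents, already sorted by id.
const products = Array.from({ length: 100 }, (_, i) => ({
  id: `product-${String(i).padStart(3, "0")}`,
}));

// Offset paging: skipped documents are still retrieved internally,
// so they count toward the billed reads.
function pageWithOffset(docs, offset, limit) {
  const page = docs.slice(offset, offset + limit);
  return { page, billedReads: offset + page.length };
}

// Cursor paging: resume right after the last document already seen,
// so only the returned documents are billed.
function pageWithCursor(docs, lastSeenId, limit) {
  const start =
    lastSeenId === null ? 0 : docs.findIndex((d) => d.id === lastSeenId) + 1;
  const page = docs.slice(start, start + limit);
  return { page, billedReads: page.length };
}

// Third page of 10 items, fetched both ways.
const viaOffset = pageWithOffset(products, 20, 10);
const viaCursor = pageWithCursor(products, "product-019", 10);

console.log(viaOffset.billedReads, viaCursor.billedReads);
```

Both calls return the same ten documents, but in this model the offset version is billed for the 20 skipped documents as well, and the gap widens with every further page.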

Determining number of Firebase reads for nested sub-collection

I have a mobile solution (iOS) that is using Firebase to aid in syncing of data between a user's devices. What I have works and allows me to keep clients in sync as I wanted to. However, from testing, my reads are a bit out of control for larger data sets and I need to do some optimization. To that end, I wanted to make sure that my understanding of how reads are counted was correct (I am still a newbie at Firebase).
My data is structured like this:
It's a bit nested, I agree, but for all the use cases it seems to be the best way to do things to minimize redundancy, e.g. there are relationships between Cats and Dogs and Birds, but I only store one copy of each, not multiple. In addition, each user's data is segregated from the other users' data and I need the ability to version the data. Put that all together, add the requirement to alternate collections and documents, and you get what you see.
Based on this structure, I can create queries like this:
Firestore.firestore().collection("userid1").document("data").collection("version0").document("Cats").collection("data").whereField("modifiedDate", isGreaterThanOrEqualTo: someDoubleValue).getDocuments(completion: completionCallback)
This gets me the data I need and seems to only return the number of items I think it should. However, am I correct in saying that if there are 100 Cat type documents (Cat1...Cat100), but only 3 of them have a modifiedDate that is greater than my query parameter, when the data is returned to me, I will only be "charged" for 3 reads? Or have I done something completely silly here and I am getting charged for all 100 even though I only get 3 documents back in the callback?
The billing doesn't work any differently for subcollections than it does for top-level collections. You are only billed for the documents transferred, not the entire set of documents in the collection (unless you do request every document).
Cloud Firestore scales massively, and it's expected that you might have a massive number of documents in a collection. Billing a read for each and every document in a collection for each query against that collection would be insanely expensive.
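As a rough illustration of that billing rule, the sketch below models the 100-cats scenario from the question in plain JavaScript. It does not call Firebase; runQuery, the timestamps and the threshold are made-up stand-ins, and the minimum of one read for an empty result is an assumption based on my understanding of Firestore pricing.

```javascript
// Hypothetical data: 100 cat documents, only the first 3 modified recently.
const cats = Array.from({ length: 100 }, (_, i) => ({
  id: `Cat${i + 1}`,
  modifiedDate: i < 3 ? 2000 : 1000,
}));

// Model of a filtered query: billed per document returned,
// not per document stored in the collection.
function runQuery(docs, minModifiedDate) {
  const results = docs.filter((d) => d.modifiedDate >= minModifiedDate);
  // Assumption: a query that matches nothing is still billed one read.
  return { results, billedReads: Math.max(1, results.length) };
}

const { results, billedReads } = runQuery(cats, 1500);
console.log(results.length, billedReads);
```

In this model the query over 100 documents returns 3 and is billed 3 reads, matching the expectation in the question.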

Firebase Firestore reads and pricing

I currently have a collection of documents in Firestore. Each of these documents holds an array of JSON objects. I believe it would be better to store these arrays as subcollections in each document. My only concern is the pricing aspect of reading the subcollection in.
As it's currently just an array on each document, I believe this counts only as one read (correct me if I'm wrong) when I fetch a document.
If I move to using a subcollection and read the entire collection with the code below, does this count as one read or multiple? I fear this could be expensive.
db.collection("cities").get().then(function(querySnapshot) {
    querySnapshot.forEach(function(doc) {
        // doc.data() is never undefined for query doc snapshots
        console.log(doc.id, " => ", doc.data());
    });
});
https://firebase.google.com/docs/firestore/query-data/get-data
Thanks for your help :)
Reading one document counts as one read, regardless of how big the document is.
There's actually no such thing as JSON in a document; Firebase will flatten your structure in the background, it just looks like JSON to you. Imagine your document has a key person.
Now the person object could look like this:
{
    name: "Phil",
    age: 25
}
Firestore will save all fields individually, so technically your document now has the fields person.name and person.age instead of just a person field.
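The dotted field paths mentioned above can be illustrated with a small sketch. This only shows the naming scheme (person.name, person.age); it is not a claim about Firestore's actual storage format, and flattenFields is a hypothetical helper, not part of any SDK.

```javascript
// Turn a nested object into a flat map of dotted field paths.
// Sketch only: handles plain objects, arrays and primitives.
function flattenFields(value, prefix = "") {
  const out = {};
  for (const [key, child] of Object.entries(value)) {
    const path = prefix ? `${prefix}.${key}` : key;
    const isPlainObject =
      child !== null && typeof child === "object" && !Array.isArray(child);
    if (isPlainObject) {
      Object.assign(out, flattenFields(child, path)); // recurse into nested maps
    } else {
      out[path] = child; // leaf value (arrays are kept whole)
    }
  }
  return out;
}

const flat = flattenFields({ person: { name: "Phil", age: 25 } });
console.log(flat); // { 'person.name': 'Phil', 'person.age': 25 }
```

However many paths the nested object expands to, it is still one document and therefore one read.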
What that means for you is that even if you have complex objects inside of a single document, it's still only one document and therefore counts as one read.
Loading a subcollection is billed separately, at one read per document returned. But imagine that instead of a small object like in my person example you have objects with sizes of multiple kilobytes or even megabytes. Not only will you fetch a huge payload every time you read the document, of which you probably only need a few attributes, your bill will also increase due to network egress, so those additional reads can be worth it.
The question of whether to use subcollections or not comes down to how big your document might get. But that's up to you to decide.
Edit:
For the use case you've described in your comment, it would probably be a good idea to store the comments both in the document itself and in a subcollection.
For example, your document could hold the top 5 comments directly, so that your network egress stays low, but you still have access to the most important comments instantly. Then, if you want to load more comments, you could query the subcollection for the full collection of comments. In NoSQL databases, redundant data is allowed and sometimes actually good.
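A minimal sketch of that "top 5 in the document, everything in the subcollection" idea, in plain JavaScript with in-memory stand-ins instead of the Firebase SDK. The names post, commentsSubcollection and addComment are hypothetical, and ranking "most important" by like count is an assumption made only for the example.

```javascript
const TOP_N = 5;

// Hypothetical parent document and its comments "subcollection".
const post = { title: "Hello", topComments: [] };
const commentsSubcollection = [];

function addComment(comment) {
  // The full list lives in the subcollection and can grow freely.
  commentsSubcollection.push(comment);
  // A redundant copy of the N most-liked comments is kept on the parent,
  // so reading the post alone is enough to show them.
  post.topComments = [...commentsSubcollection]
    .sort((a, b) => b.likes - a.likes)
    .slice(0, TOP_N);
}

for (let i = 1; i <= 8; i++) {
  addComment({ id: `c${i}`, likes: i });
}

console.log(post.topComments.map((c) => c.id), commentsSubcollection.length);
```

The parent document stays bounded at five embedded comments no matter how many are added, while the subcollection holds all of them for a "load more" query.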
Also I recommend firebase's video on this topic: https://www.youtube.com/watch?v=o7d5Zeic63s

Firestore social network data structure

How should I structure the database for a social network like Twitter, for example, where we can follow users and get all their tweets in our timeline? I have already checked this: Firestore - how to structure a feed and follow system, but the solutions in that post look flawed.
Firestore is different in that it requires redundant data to access data efficiently. But suppose I am following 1000 people. If I get their posts by querying in batches of 15 followed users, using limit(10) and then orderBy(timeStamp), there may be unread posts between queries, because we are paging by the timestamp of the last post. How should I structure the data for a social media app in Firestore?
When modeling a use-case on a NoSQL database, you tend to optimize for the features of your application, and for frequent read-operations.
So in a social media application your main feature may be that the user sees the recent posts of everyone they follow. To optimize this operation for frequent reads, you'll want to store the posts that each user should see in a document for that user. So when compared to Twitter, you'd pretty much have a document containing the Twitter feed for each user. Or if there's too much data for a single document, you might want to put that in a collection. I often explain this as modeling the screens of your app in the database.
This is very different from the typical data model in a relational database, so it's normal that it takes time to get used to. For a good introduction, I recommend:
Reading NoSQL data modeling.
Watching Firebase for SQL developers, even though it's for the Realtime Database, it explains how to map common SQL concepts to Firebase's NoSQL model.
Watching Getting to know Cloud Firestore
To develop a social media app like Twitter, Firestore queries alone are not enough.
Twitter generates a personalized timeline for every user.
This is where the cloud functions come into the picture.
You need a cloud function that monitors for new posts and copies them into the timelines of the author's followers.
You don't need to copy the entire tweet data. You can just copy the tweet id and other fields which require ordering, like timestamp.
So when I query my timeline, I will get all the tweet ids.
Then I can load the original tweet just before the user scrolls to it.
Only the id is copied because likes and dislikes should affect the original tweet.
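The fan-out approach described in this answer might be sketched as follows. This is a plain JavaScript model with in-memory objects, not a real Cloud Function; onTweetCreated, readTimeline, the user names and the data layout are all assumptions for illustration.

```javascript
// Hypothetical data: author -> list of followers.
const followers = { alice: ["bob", "carol"], bob: ["alice"] };
const tweets = {};    // tweetId -> full tweet (single source of truth)
const timelines = {}; // userId -> [{ tweetId, timestamp }]

// Stand-in for a Cloud Function triggered when a new tweet is written.
function onTweetCreated(tweet) {
  tweets[tweet.id] = tweet;
  for (const follower of followers[tweet.author] || []) {
    if (!timelines[follower]) timelines[follower] = [];
    // Copy only the id and the field needed for ordering.
    timelines[follower].push({ tweetId: tweet.id, timestamp: tweet.timestamp });
  }
}

// Read one user's timeline, newest first, resolving ids to the originals.
function readTimeline(userId, limit) {
  return [...(timelines[userId] || [])]
    .sort((a, b) => b.timestamp - a.timestamp)
    .slice(0, limit)
    .map((entry) => tweets[entry.tweetId]);
}

onTweetCreated({ id: "t1", author: "alice", text: "first", timestamp: 1, likes: 0 });
onTweetCreated({ id: "t2", author: "bob", text: "second", timestamp: 2, likes: 0 });
onTweetCreated({ id: "t3", author: "alice", text: "third", timestamp: 3, likes: 0 });

tweets.t1.likes += 1; // a like touches only the original tweet

console.log(readTimeline("bob", 10).map((t) => t.id));
```

Because timelines store ids rather than copies, the like added to t1 is visible in every follower's timeline without any further writes.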

Structure Data NoSQL

I'm quite new to NoSQL, that's why I come here to get your opinions.
I'm trying to understand if it's better to use nested object or subcollections in a specific case. I will try to explain my case.
I have to store several shops in my Db. Each shop will have an address, a phone number etc... So I have a Collection "Shop" and inside several Documents representing the shops.
Now, my shops have some contacts (2, 3 or 4 employees, for example). My question is, what should I do:
Store 2, 3 or 4 objects in my "shop" documents, like:
objectContact: {
    name: "Georges",
    age: 20....
}
Or create a subcollection "Contact" inside my "shop" documents, and then insert 2, 3 or 4 documents into this subcollection.
Which is better? Does either of these two solutions disable some tools/queries in NoSQL? Is either of them faster when it comes to writing/reading the data?
Thanks in advance,
Currently, all queries in Cloud Firestore are shallow, which means you can't get a subcollection together with its parent collection. You have to read it separately. So I would recommend that you store the contacts as nested objects in the shop document. It has a few limitations, though. Check this link for a detailed explanation of modelling data in Cloud Firestore.
