I currently have a collection of documents in Firestore. Each of these documents holds an array of JSON objects. I believe it would be better to store these arrays as subcollections of each document. My only concern is the pricing aspect of reading the subcollection in.
Since it's currently just an array on each document, I believe fetching a document counts as only one read (correct me if I'm wrong).
If I move to a subcollection and read the entire collection with the code below, does this count as one read or multiple? I fear this could be expensive.
db.collection("cities").get().then(function(querySnapshot) {
    querySnapshot.forEach(function(doc) {
        // doc.data() is never undefined for query doc snapshots
        console.log(doc.id, " => ", doc.data());
    });
});
https://firebase.google.com/docs/firestore/query-data/get-data
Thanks for your help :)
Reading one document counts as one read, regardless of how big the document is.
There's actually no such thing as JSON in a document. Firestore flattens your structure in the background; it just looks like JSON to you. Imagine your document has a key person.
Now the person object could look like this:
{
    name: "Phil",
    age: 25
}
Firestore will save all fields individually, so technically your document now has the fields person.name and person.age instead of just a person field.
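As a rough illustration of that mental model, here is a minimal sketch that turns a nested object into dot-separated field paths. The helper name flattenFields is made up for this example, and this is not a claim about how Firestore stores data internally; it only shows what "person.name and person.age" means.

```javascript
// Flatten nested maps into dot-separated field paths (illustration only).
function flattenFields(value, prefix = "") {
  const out = {};
  for (const [key, child] of Object.entries(value)) {
    const path = prefix ? `${prefix}.${key}` : key;
    const isMap =
      child !== null && typeof child === "object" && !Array.isArray(child);
    if (isMap) {
      // Recurse into nested maps and merge their flattened fields.
      Object.assign(out, flattenFields(child, path));
    } else {
      // Leaf values (strings, numbers, arrays, null) keep their full path.
      out[path] = child;
    }
  }
  return out;
}

const personDoc = { person: { name: "Phil", age: 25 } };
console.log(flattenFields(personDoc)); // { 'person.name': 'Phil', 'person.age': 25 }
```

However deep the nesting goes, it is still one document, so fetching it is still one read.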
What that means for you is that even if you have complex objects inside of a single document, it's still only one document and therefore counts as one read.
Loading a subcollection is billed separately, at one read per document returned. But imagine that, instead of a small object like in my person example, you have objects of many kilobytes each (a single document is capped at 1 MiB). Not only will you fetch a huge payload every time you query a document, of which you probably only need a few attributes, your bill will also increase due to network egress, so the additional reads can be worth it.
The question of whether to use subcollections or not comes down to how big your document might get. But that's up to you to decide.
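To make the trade-off concrete, here is a small back-of-the-envelope sketch. The function readsPerLoad and the layout names are hypothetical; it assumes one read for the parent document, one read per subcollection document returned, and the minimum charge of one read that applies to a query even when it returns nothing.

```javascript
// Estimate billed reads for loading one parent plus its items (rough sketch).
function readsPerLoad(layout, itemCount) {
  if (layout === "embedded") {
    // Items live in an array inside the parent: one document, one read.
    return 1;
  }
  // Items live in a subcollection: one read per returned document,
  // with a minimum of one read for the query itself.
  const queryReads = Math.max(1, itemCount);
  return 1 + queryReads;
}

console.log(readsPerLoad("embedded", 40));      // 1
console.log(readsPerLoad("subcollection", 40)); // 41
```

The read count is only half of the picture: the embedded layout transfers all 40 items every time, even when the screen needs none of them.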
Edit:
For the use case you've described in your comment, it would probably be a good idea to store the comments both in the document itself and in a subcollection.
For example, your document could hold the top 5 comments directly, so that your network egress stays low, but you still have access to the most important comments instantly. Then, if you want to load more comments, you could query the subcollection for the full collection of comments. In NoSQL databases, redundant data is allowed and sometimes actually good.
Also I recommend firebase's video on this topic: https://www.youtube.com/watch?v=o7d5Zeic63s
I have a user collection in firestore and each user object has an array of references to "tasks" that they have applied to. Tasks is a separate collection as well and each task object has a user ref array as well.
Collection Tasks:
doc: {
    name: "Do something",
    time: "Time",
    users: ["/users/u1", "/users/u2"]
}
Collection Users:
doc: {
    name: Username,
    tasks: [ "/tasks/docRef", "/tasks/anotherDoc" ]
}
I have a screen in my react-native app that lists all the tasks and when a task is clicked, it goes to a details screen that displays all the users in a list as well.
Is this the best approach to have this kind of data? Or should I have collections instead of arrays with references. I refrained from collections to prevent duplication of data but I'm not sure if they will be more efficient.
(From the comments) I wanted to inquire if this was the right approach to store the data?
1/ Should I just use uids and then query the collection for documents matching those ids?
Storing uids or DocumentReferences in the arrays will not make a difference in terms of ease of querying the corresponding documents. It would only make a difference in the size of the document containing this data (since DocumentReferences are longer than uids).
2/ Or create a sub-collection in the user document itself with references to the tasks?
In the NoSQL world you should not hesitate to denormalize your data.
So having the tasks list in a user doc AND the users list in a task doc, and synchronizing the two whenever either side changes, is a valid approach.
HOWEVER, you may encounter a problem if you have "a lot" of tasks for a given user or a lot of users involved in a task since you may hit the maximum size for a document which is 1 MiB (I agree that you need a LOT of tasks or users :-)).
To avoid that I would advise using sub-collections. This is also the preferred approach if you plan a high frequency of changes that could cause database contention or higher latency, see the documentation section about "Designing for scale".
If the user data you want to show in a task is limited (e.g. just their name plus a button to open each user Profile based on the uid) I would keep an array of users in the task document with this limited amount of data (of course after being sure that there is no risk that a doc reaches the limit of 1 MiB). And have a sub-collection of task documents under each user doc (as advised above).
Modeling well in firestore is a bit difficult, you have to think hard about your use case.
Don't worry about data duplication; this is very common in Firestore. Remember that you mainly pay for the number of reads and writes.
In your user array, you could keep the necessary data to save you from doing more reads on the user collection.
In your example, the user only has the username, you could keep the username and uid, saved in tasks. That way, no reads would be done on the user collection.
What if the user changes their username? Use a batched write and update all docs that contain that user.
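A minimal sketch of that batched update follows. It assumes db is a Firestore instance from the namespaced web SDK or the Admin SDK (both expose db.batch(), batch.update() and batch.commit()), and that each task stores its denormalized usernames in a map field keyed by uid. The field name userNames and the helper stageUsernameUpdate are hypothetical and not part of the question's data model.

```javascript
// Stage one update per task that embeds the user's name; nothing is written
// until the caller commits the returned batch.
function stageUsernameUpdate(db, taskIds, uid, newName) {
  const batch = db.batch();
  for (const taskId of taskIds) {
    const taskRef = db.collection("tasks").doc(taskId);
    // Dot notation updates a single key inside the (assumed) userNames map
    // without overwriting the other entries.
    batch.update(taskRef, { [`userNames.${uid}`]: newName });
  }
  return batch;
}

// Usage, inside an async function:
//   const batch = stageUsernameUpdate(db, ["docRef", "anotherDoc"], "u1", "New name");
//   await batch.commit(); // all updates succeed or fail together
```

Each staged update is billed as one write, and a single batch has an upper bound on how much it can carry, so a user with a very large number of tasks would need several batches.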
I'm trying to build an ecommerce app with firebase on the backend. I have a collection of 1000+ products, each of which is stored as a separate document, which have product specific info such as price, title etc.
document:{
title: 'Some Title',
price: '$99.99',
genres: ['Horror', 'Action']
}
So in my app I need to display these products in many places, such as product carousels (similar to a bookshelf with arrow buttons at the ends), and also in a search results page.
At any given page, I assume that I will need to display at least 50 products, either as search results, or multiple carousels. I understand that I can use queries to get this data from firebase. But since each document I retrieve counts as (at least) one Firestore read, I assume that a typical user session would run into 100+ reads, if not thousands.
It seems a little inefficient to me that I need to read multiple documents to get this data, when I could just store all that data in a single array, as its own document. That would mean I get charged for one document read, not 50, per page.
Is this how it is expected to be done? Should I create a new document containing the data I need for each specific use case?
P.S. I'm pretty new to backend dev, let alone firebase.
TL;DR Yes, you should create a new document with the needed data for each specific use case, but it’s not recommended to make it as a document with nested objects like arrays with 1000+ elements.
From a technical point of view, Cloud Firestore is optimized for storing large collections of small documents.
Depending on the use case, you can select the most appropriate Cloud Firestore data structure.
For example, the 10 best-selling books of the month can be a single document with nested complex objects like arrays or maps. This structure is useful for use cases with a small or predefined number of elements, but as the documentation states, if your data expands over time with larger or growing lists, the document also grows, which can lead to slower document retrieval times.
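As a sketch of what such a per-use-case document could hold, the helper below derives a bounded summary from the full product list. The name buildTopSellersDoc and the fields id and unitsSold are assumptions made for this example; only title and price appear in the question.

```javascript
// Build a small, fixed-size summary document from the full product list.
// `id` and `unitsSold` are hypothetical fields used only for this sketch.
function buildTopSellersDoc(products, limit = 10) {
  const items = [...products]                      // copy so the input is not mutated
    .sort((a, b) => b.unitsSold - a.unitsSold)     // best sellers first
    .slice(0, limit)                               // keep the list bounded
    .map(({ id, title, price }) => ({ id, title, price })); // only what the carousel shows
  return { items };
}

const sample = [
  { id: "a", title: "Some Title", price: "$99.99", unitsSold: 3 },
  { id: "b", title: "Other Title", price: "$9.99", unitsSold: 8 },
];
console.log(buildTopSellersDoc(sample, 1).items[0].id); // b
```

A carousel backed by this document costs one read per load, and because the list is capped, the document cannot grow without bound.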
With a thousand or more records, a better choice may be to structure your data as subcollections. That is, you can create collections within documents when you have data that might expand over time, with the main advantage that, as your lists grow, the size of the parent document doesn't change.
Cloud Firestore also has several features to help you manage queries that return a large number of results:
Cursors, which allow you to resume a long-running query.
Page tokens, which help you paginate the query results.
Limits, which specify how many results to retrieve.
Offsets, which allow you to skip a fixed number of documents.
There are no additional costs for using cursors, page tokens, and limits. In fact, these features can help you save money by reading only the documents that you actually need.
As a best practice, do not use offsets. Instead, use cursors. Using an offset only avoids returning the skipped documents to your application, but these documents are still retrieved internally. The skipped documents affect the latency of the query, and your application is billed for the read operations required to retrieve them.
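The billing difference can be sketched with an in-memory stand-in for an ordered collection. Nothing here calls Firestore; the array docs and the helpers pageWithOffset and pageWithCursor are made up for illustration, and billedReads simply follows the rule described above (skipped documents are still billed).

```javascript
// In-memory stand-in for an ordered collection of 1000 documents.
const docs = Array.from({ length: 1000 }, (_, i) => ({ id: i }));

// Offset paging: the skipped documents are still retrieved and billed.
function pageWithOffset(offset, limit) {
  const page = docs.slice(offset, offset + limit);
  return { page, billedReads: offset + page.length };
}

// Cursor paging: resume right after the last document already seen.
function pageWithCursor(lastId, limit) {
  const start =
    lastId === null ? 0 : docs.findIndex((d) => d.id === lastId) + 1;
  const page = docs.slice(start, start + limit);
  return { page, billedReads: page.length };
}

// Third page of 50 products, fetched both ways.
const viaOffset = pageWithOffset(100, 50);
const viaCursor = pageWithCursor(99, 50);
console.log(viaOffset.billedReads, viaCursor.billedReads); // 150 50
```

Both calls return the same 50 documents, but the offset version pays for the 100 it skipped, and that gap widens with every further page.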
I have some questions regarding firebase, which I think many of the beginners have.
Let's say I have this query:-
var collecRef = FirebaseFirestore.instance.collection('aCollection').where("a", isEqualTo: "b").orderBy(/* some more code */);
If I execute this, how many reads will it cost? If :-
There are 5 documents which match the condition (a==b)
There are no documents which match the condition.
Now,
If I want to update the data in a document using setData() with merge set to true, would it cost a write even if the data is unchanged? For example, in a document I have saved the user name of a user, and in my app, my users can change their names.
Now,
If they try to update their name with (setData()), and they haven't entered a DIFFERENT NAME(the name is same), would it cost a write?
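One document received from a query costs one read. The conditions don't matter, and the size of the collection doesn't matter, only the number of documents received. The one exception is a query that returns no documents, which is still billed a minimum of one read. So five matching documents cost five reads, and zero matching documents cost one.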
One call to setData costs one write. It doesn't matter what you write, or the current contents of the document.
Based on this other question and on this pricing list, I have the following questions:
What's the point of using collections when we have a limitation for reads, writes and deletes per document?
I have a collection with 2 different collections inside, would I increase everything x3?
Would it be better to move everything to the first collection as a single document?
The Firestore price for reading ONE document depends neither on the collection (or sub-collection) containing the document nor on the sub-collection(s) contained by the document.
As you can read in the SO answer/question you refer to, "Firestore queries are always 'shallow'", meaning that when you read a document, you pay for the document read but you don't pay at all for the documents that are in its sub-collection(s).
It's worth noting that the concept of sub-collection can be a bit "misleading".
Let's take an example: Imagine a doc1 document under the col1 collection
col1/doc1/
and another one subDoc1 under the subCol1 (sub-)collection
col1/doc1/subCol1/subDoc1
Actually, from a technical perspective, these two collections (col1 & subCol1) are not related to each other at all. They just share a part of their path but nothing else. One side effect of this is that if you delete a document, the documents in its sub-collection(s) still exist.
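A toy model may help here. The sketch below treats the database as a flat map from full paths to documents; this is only an assumption made for illustration, not a description of Firestore's storage, and the names store and isDocumentPath are made up.

```javascript
// Toy model: every document is keyed by its full path, nothing is nested.
const store = new Map([
  ["col1/doc1", { title: "parent" }],
  ["col1/doc1/subCol1/subDoc1", { title: "child" }],
]);

// Paths alternate collection/document, so an even number of segments
// identifies a document and an odd number identifies a collection.
function isDocumentPath(path) {
  return path.split("/").filter(Boolean).length % 2 === 0;
}

console.log(isDocumentPath("col1/doc1"));         // true
console.log(isDocumentPath("col1/doc1/subCol1")); // false

// Deleting the parent removes only that one key.
store.delete("col1/doc1");
console.log(store.has("col1/doc1/subCol1/subDoc1")); // true
```

Reading works the same way in this model: fetching the key col1/doc1 never touches, and never bills for, the keys that merely start with the same path.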
So, to answer your questions:
I have a collection with 2 different collections inside, would I increase everything x3?
It depends on what you exactly read. If you only read documents from the first (parent) collection, you will only pay for these document reads. You will only pay for the documents contained in the two sub-collections if you build two extra queries to read the documents in these 2 sub-collections. Again, you just have to consider these three (sub-)collections as totally independent and therefore you pay for each document you read in each of those collections.
Would it be better moving everything to the first collection as a single document
It really depends on your data model and on the queries you plan to execute. It is totally possible to "move everything in a single document", but you should take care of some limitations, in particular, the maximum size for a document which is 1 MiB.
Also, if your data model contains some complex hierarchical data it may be much easier to organize this data using sub-collections within documents instead of using nested objects or arrays in one document. For example, querying documents through data contained in Arrays has some limitations.
Again, there isn't a "one single truth": it all depends on your specific case. Note that, in the NoSQL world, your data model should be mainly designed in the light of the queries you plan to execute, without hesitating to denormalize data.
I need to create a firestore doc which also has a collection, ideally in a single write operation.
I'm not seeing anything like this in the documentation, so failing that any tips on getting the created doc id and then adding multiple documents to a collection?
Edit: I'm developing in typescript/js
You can use a transaction or batch write to perform atomic operations among multiple documents. In both cases, you need to know all the IDs for all the documents you want to create ahead of time. You don't need to create a document in order for there to be a subcollection. Documents don't actually "contain" subcollections. Subcollections are currently just a technique for organizing your data.
It wasn't really clear from your question, but if your first document requires a generated ID, you can use the example code in the documentation that generates a DocumentReference object, which you can populate later in your transaction or batch.
Since you didn't say which language or system you're working on, I don't know which code sample to show here from the docs, so you'll have to go to the documentation linked here to see how it works. You'll end up using a method called "doc" or "document" on a CollectionReference to generate the DocumentReference with the generated ID.
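Given the edit saying the code is TypeScript/JS, here is a minimal sketch of that approach. It assumes db is a Firestore instance from the namespaced web SDK or the Admin SDK, where calling doc() with no argument on a CollectionReference returns a DocumentReference with a generated ID (the modular web SDK spells these calls differently, e.g. writeBatch(db)). The collection names parents and items and the helper stageDocWithSubcollection are hypothetical.

```javascript
// Stage a parent document and its subcollection documents in one batch.
// Nothing is written until the caller commits the returned batch.
function stageDocWithSubcollection(db, parentData, children) {
  const batch = db.batch();

  // doc() with no argument generates the ID locally; no write happens yet.
  const parentRef = db.collection("parents").doc();
  batch.set(parentRef, parentData);

  for (const child of children) {
    // Each child gets its own generated ID under the parent's path.
    const childRef = parentRef.collection("items").doc();
    batch.set(childRef, child);
  }
  return { batch, parentRef };
}

// Usage, inside an async function:
//   const { batch, parentRef } = stageDocWithSubcollection(db, { name: "x" }, [{ n: 1 }]);
//   await batch.commit();        // all documents are created atomically
//   console.log(parentRef.id);   // the generated ID, known before the commit
```

The commit is atomic, but it is still billed as one write per document staged in the batch.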