What's the difference between Metaplex's set_and_verify_collection and set_and_verify_sized_collection_item? - collections

The Metaplex documentation descriptions of both instructions are the same: https://docs.rs/mpl-token-metadata/latest/mpl_token_metadata/instruction/index.html

They do basically the same thing; there is only one small difference:
set_and_verify_sized_collection_item is used for sized collections, while set_and_verify_collection is used for "normal"/unsized ones. You have to use different instructions because the sized one performs additional checks and updates.
Sized collections store the number of NFTs in the collection in the collection NFT's metadata.

Related

Should I create a duplicate collection/document for each use-case? (Firebase/Firestore)

I'm trying to build an ecommerce app with firebase on the backend. I have a collection of 1000+ products, each of which is stored as a separate document, which have product specific info such as price, title etc.
document:{
title: 'Some Title',
price: '$99.99',
genres: ['Horror', 'Action']
}
So in my app I need to display these products in many places, such as product carousels (similar to a bookshelf with arrow buttons at the ends), and also in a search results page.
On any given page, I assume that I will need to display at least 50 products, either as search results or as multiple carousels. I understand that I can use queries to get this data from firebase. But since each document I retrieve counts as (at least) one firestore read, I assume that a typical user session would run into 100+ reads, if not thousands.
It seems a little inefficient to me that I need to read multiple documents to get this data, when I could just store all that data in a single array, as its own document. That would mean I get charged for one document read, not 50, per page.
Is this how it is expected to be done? Should I create a new document containing the data I need for each specific use case?
P.S. I'm pretty new to backend dev, let alone firebase.
TL;DR Yes, you should create a new document with the needed data for each specific use case, but it’s not recommended to make it as a document with nested objects like arrays with 1000+ elements.
From a technical point of view, Cloud Firestore is optimized for storing large collections of small documents.
Depending on the use case, you can select the most appropriate Cloud Firestore data structure.
For example, the 10 best-selling books of the month can be a document with nested complex objects like arrays or maps. This structure could be useful for use cases with a small or predefined number of elements, but as stated here, if your data expands over time with larger or growing lists, the document also grows, which can lead to slower document retrieval times.
With thousands of records, a better choice can be to structure your data as subcollections. That is, you can create collections within documents when you have data that might expand over time, with the main advantage that, as your lists grow, the size of the parent document doesn't change.
Cloud Firestore also has several features to help you manage queries that return a large number of results:
Cursors, which allow you to resume a long-running query.
Page tokens, which help you paginate the query results.
Limits, which specify how many results to retrieve.
Offsets, which allow you to skip a fixed number of documents.
There are no additional costs for using cursors, page tokens, and limits. In fact, these features can help you save money by reading only the documents that you actually need.
As a best practice, do not use offsets. Instead, use cursors. Using an offset only avoids returning the skipped documents to your application, but these documents are still retrieved internally. The skipped documents affect the latency of the query, and your application is billed for the read operations required to retrieve them.
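The billing difference between offsets and cursors can be sketched with a small in-memory model. This is only an illustration in Python, not the Firestore SDK: PRODUCTS, page_with_offset, page_with_cursor and total_reads_for_pages are hypothetical names, and the "billed reads" are simply counted the way the paragraph above describes (skipped documents still count for offsets).

```python
# In-memory stand-in for a collection ordered by a "title" field.
# This is NOT the Firestore SDK; it only mimics how reads are counted.
PRODUCTS = [{"id": f"p{i:04d}", "title": f"Title {i:04d}"} for i in range(1000)]


def page_with_offset(offset, limit):
    """Offset paging: the skipped documents are still read (and billed)."""
    page = PRODUCTS[offset:offset + limit]
    billed_reads = offset + len(page)
    return page, billed_reads


def page_with_cursor(start_after_title, limit):
    """Cursor paging: reading starts right after the last document seen."""
    if start_after_title is None:
        start = 0
    else:
        # A real backend would seek through an index; a linear scan
        # keeps this sketch short.
        start = next(
            (i + 1 for i, doc in enumerate(PRODUCTS)
             if doc["title"] == start_after_title),
            len(PRODUCTS),
        )
    page = PRODUCTS[start:start + limit]
    billed_reads = len(page)
    return page, billed_reads


def total_reads_for_pages(num_pages, page_size, use_cursor):
    """Sum the billed reads for walking through num_pages pages in order."""
    total = 0
    last_title = None
    for n in range(num_pages):
        if use_cursor:
            page, reads = page_with_cursor(last_title, page_size)
        else:
            page, reads = page_with_offset(n * page_size, page_size)
        if page:
            last_title = page[-1]["title"]
        total += reads
    return total


if __name__ == "__main__":
    print("offset reads:", total_reads_for_pages(4, 50, use_cursor=False))
    print("cursor reads:", total_reads_for_pages(4, 50, use_cursor=True))
```

In this model, walking through four pages of 50 products costs 500 reads with offsets and 200 with cursors, and the gap widens with every further page.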

Firebase free account limitations using Firestore

Based on this other question and on this pricing list, I have the following question:
What's the point of using collections when we have a limitation for reads, writes and deletes per document?
I have a collection with 2 different collections inside, would I increase everything x3?
Would it be better for moving everything to the first collection as a single document?
The Firestore price for reading ONE document depends neither on the collection (or sub-collection) containing the document nor on the sub-collection(s) contained by the document.
As you can read in the SO answer/question you refer to, "Firestore queries are always 'shallow'", meaning that when you read a document, you pay for the document read but you don't pay at all for the documents that are in its sub-collection(s).
It's worth noting that the concept of sub-collection can be a bit "misleading".
Let's take an example: Imagine a doc1 document under the col1 collection
col1/doc1/
and another one subDoc1 under the subCol1 (sub-)collection
col1/doc1/subCol1/subDoc1
Actually, from a technical perspective, these two collections (col1 & subCol1) are not related to each other at all. They just share a part of their path but nothing else. One side effect of this is that if you delete a document, its sub-collection(s) still exist.
So, to answer your questions:
I have a collection with 2 different collections inside, would I increase everything x3?
It depends on what you exactly read. If you only read documents from the first (parent) collection, you will only pay for these document reads. You will only pay for the documents contained in the two sub-collections if you build two extra queries to read the documents in these 2 sub-collections. Again, you just have to consider these three (sub-)collections as totally independent and therefore you pay for each document you read in each of those collections.
Would it be better moving everything to the first collection as a single document
It really depends on your data model and on the queries you plan to execute. It is totally possible to "move everything in a single document", but you should take care of some limitations, in particular, the maximum size for a document which is 1 MiB.
Also, if your data model contains some complex hierarchical data it may be much easier to organize this data using sub-collections within documents instead of using nested objects or arrays in one document. For example, querying documents through data contained in Arrays has some limitations.
Again, there isn't a "one single truth": it all depends on your specific case. Note that, in the NoSQL world, your data model should be mainly designed in the light of the queries you plan to execute, without hesitating to denormalize data.
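To make the "shallow" billing and the independence of col1 and subCol1 concrete, here is a minimal Python sketch. It uses a plain dict as a stand-in for the database; read_collection and delete_document are hypothetical helpers written for this illustration, not Firestore SDK calls, and one returned document is counted as one billed read.

```python
# Toy model of Firestore paths; NOT the Firestore SDK.
# Keys are full document paths, values are the document data.
DB = {
    "col1/doc1": {"name": "parent 1"},
    "col1/doc2": {"name": "parent 2"},
    "col1/doc1/subCol1/subDoc1": {"name": "child a"},
    "col1/doc1/subCol1/subDoc2": {"name": "child b"},
    "col1/doc1/subCol2/subDoc1": {"name": "child c"},
}


def read_collection(db, collection_path):
    """Return only the documents directly inside collection_path.

    The read is shallow: documents in nested sub-collections are not
    returned, so they are not counted as reads either.
    """
    depth = collection_path.count("/") + 2  # segments in a direct child's path
    prefix = collection_path + "/"
    return {
        path: data
        for path, data in db.items()
        if path.startswith(prefix) and len(path.split("/")) == depth
    }


def delete_document(db, doc_path):
    """Delete one document; documents in its sub-collections are untouched."""
    db.pop(doc_path, None)


if __name__ == "__main__":
    db = dict(DB)
    print("reads for col1:", len(read_collection(db, "col1")))
    print("reads for subCol1:", len(read_collection(db, "col1/doc1/subCol1")))
    delete_document(db, "col1/doc1")
    print("subCol1 after deleting doc1:",
          len(read_collection(db, "col1/doc1/subCol1")))
```

Reading col1 here counts two reads, however many documents sit underneath doc1. The "x3" only appears if you also run the two extra queries on the sub-collections.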

Firestore subcollection vs array

First off, I know how Firestore works and have spent a lot of time evaluating different approaches for a good structure. Still, I am considering the following scenario:
There is a database of known recipes. Users can add recipes, but they have to be confirmed to be real recipes and not just some variations. So every user can choose recipes from the user-generated list of recipes to state that they know how to cook them (or add new ones).
Now I want users to share their list of recipes with others, but this is where I am not sure how this can be best accomplished using Firestore. The trick is that I want to show all the recipes at once, and don't want to paginate them.
I am currently evaluating two possibilities:
Subcollections
Whenever a user shares his list, the user looking at said list will have to load the entire list of the recipes which can result in a high amount of document reads (I suppose realistically ~50, in very rare cases maybe 1000).
Pros:
More natural structure
Easier to maintain (e.g. deleting a recipe, checking if a specific one exists)
Easier to add fields (e.g. timeOfCreation, comment, personalRating, ...)
Cons:
Can result in a high number of reads in the long run
Arrays
I could save every known recipe (the id and an imageURL) inside the user's document (or as a single subdocument "KnownRecipes") within an array. This array could be in form of
recipesKnown: [{rid: 293ndwa, imageURL: image1.com, timeAdded: 8371201332},
{rid: 9012831, imageURL: image1.com, timeAdded: 8371201871},
{rid: jd812da, imageURL: image1.com, timeAdded: 8371201118},
...
]
Pros:
I only need one document read whenever someone wants to see another user's list
Reading a user's list is probably faster
Cons:
It's hard to update a specific recipe (e.g. someone wants to change the imageURL: I need to change the list locally and send the entire document as an update to the server - since I cannot just change a single element in the array)
If a user ends up with around 1000 recipes (this may never happen, but it could), the 1 MiB document size limit of Firestore could be reached. A possible workaround would be to create a separate document and split the array across the two documents.
For me, the idea with Subcollections seems to be the more "clean" solution to this problem, but maybe I am missing some arguments on why one of those solutions would be superior over the other.
My most common queries are as follows (ordered descending by importance):
Which recipes can a user cook
Add a recipe a user can cook to the user's list
Who can cook a specific recipe (there is a Recipe -> Cooks subcollection)
Update an existing recipe a user can cook
The answer to your question depends on the level of scalability you want to achieve.
If by design the amount of sub-data you want to store is limited and very low, you should use arrays, since you reduce the number of document reads, which means lower costs.
If your sub-data is supposed to increase "unlimitedly" over time, you should use sub-collections.
If you're building a database which is not supposed to scale in any direction (Proof of concept, very small business, etc.) just go with what you feel more comfortable with.
I'm researching the same question...
One of the questions is whether the data held in the document will ever exceed 1 MB, which is the limit for a document. From a bit of research on how much plain text fits in 1 MB: it's a hell of a lot. Still, if the data grew far beyond that, it would eventually fail. So if you're thinking big, go with sub-collections.
If we follow the Firebase design logic, the answer would be sub-collections.
Still, I guess the major point is the data pulled. If you fetch the user, you will directly pull that whole MB of data. With a sub-collection it won't load, and even if you do load it, you can still lazy-load.
I guess for the kind of setup you are doing, sub-collections are the way to go.
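To get a feel for how much "a hell of a lot" is, here is a rough Python estimate of the size of a document holding the recipesKnown array from the question. The byte counts follow my reading of Firestore's documented storage size rules (strings are their UTF-8 length plus 1, numbers are 8 bytes, a document adds its name size plus 32 bytes); treat these constants, the example URLs and the users/someUserId path as assumptions, and check the official documentation before relying on the numbers.

```python
MAX_DOC_BYTES = 1_048_576  # the 1 MiB document limit discussed above


def string_size(text):
    """Assumed rule: UTF-8 encoded bytes + 1."""
    return len(text.encode("utf-8")) + 1


def value_size(value):
    """Approximate stored size of a field value (assumed rules, see lead-in)."""
    if isinstance(value, bool) or value is None:
        return 1
    if isinstance(value, (int, float)):
        return 8
    if isinstance(value, str):
        return string_size(value)
    if isinstance(value, list):
        return sum(value_size(item) for item in value)
    if isinstance(value, dict):
        return sum(string_size(k) + value_size(v) for k, v in value.items())
    raise TypeError(f"unsupported type in this sketch: {type(value).__name__}")


def document_size(path_segments, fields):
    """Approximate document size: name size + fields + 32 extra bytes."""
    name_size = sum(string_size(segment) for segment in path_segments) + 16
    return name_size + value_size(fields) + 32


def make_recipes(count):
    """Build `count` hypothetical entries shaped like the question's array."""
    return [
        {
            "rid": f"r{i:06d}",
            "imageURL": f"https://example.com/images/r{i:06d}.jpg",
            "timeAdded": 8371201332,
        }
        for i in range(count)
    ]


if __name__ == "__main__":
    size = document_size(["users", "someUserId"],
                         {"recipesKnown": make_recipes(1000)})
    print(f"{size} bytes, {size / MAX_DOC_BYTES:.1%} of the limit")
```

Under these assumptions each entry costs about 78 bytes, so 1000 recipes come to roughly 78 KB, well below the limit. Longer image URLs or extra per-recipe fields (comment, personalRating, ...) would eat into that margin quickly.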
Having a key is an additional pro/con of a collection:
A key can help avoid duplicates, but this requires deciding what counts as a duplicate (and that definition might change).
An array's keyless behavior can be emulated in a collection via auto-generated IDs.
P.S. @Thomas's list of pros/cons in the question has been quite helpful.

Is it ok to store a user id as the key of a field in a Firestore document?

Firestore charges for the number of indexes used. If I have a structure where there is a massive list of ratings different users gave, with the user ID as the key and the rating as the value, will that take up too many auto-created indexes? Is there a good structure around this?
For example, in the collection 'ratings', I shard the individual ratings that each user gives into different documents using a complex sharding mechanism I made that fills a document up to a maximum of around 20k fields, then starts filling up another document. Say I have 5 documents, each filled with 20k fields. One of those docs would look like this:
uid1: 3.3
uid2: 5
uid3: 1.234
...
Is there another structure I should be using to store loads of individual 'fields' in Firestore? I don't want to use loads of documents for each rating either as that is too expensive. Arrays aren't big enough to store loads of ratings either.
Arrays aren't big enough to store loads of ratings either
The problem isn't about the arrays, the problem is that the documents have limits. So there are some limits when it comes to how much data you can put into a document. According to the official documentation regarding usage and limits:
Maximum size for a document: 1 MiB (1,048,576 bytes)
As you can see, you are limited to 1 MiB total of data in a single document. When we are talking about storing text, you can store quite a lot, but as your array gets bigger, be careful about this limitation.
According to the official documentation regarding modelling data in Cloud Firestore:
Cloud Firestore is optimized for storing large collections of small documents.
So trying to shard a collection by filling up documents one by one is not such a good idea.
If you are trying to store ratings from multiple users in a single document, in other words storing a large amount of data in a single document that can be updated by lots of users, there is another limitation that you need to take care of: you are limited to 1 write per second on every document. So if you have a situation in which a lot of users are all trying to write data to the same document at once, you might start to see some of these writes fail. So be careful about this limitation too.
My recommendation is to store those ratings in an array if you think that the size of the document will stay within the 1 MiB limit; otherwise, use a collection of ratings for each object separately.

Advantages of firestore sub-collections

The firestore docs don't have an in depth discussion of the tradeoffs involved in using sub-collections vs top-level collections, but do point out that they are less flexible and less 'scalable'. Given that you sacrifice flexibility in setting up your data in sub-collections, there must be some definite plus sides besides a mentally satisfying structure.
For example how does the time for a firestore query on a single key across a large collection compare with getting all items from a much smaller collection?
Say we want to query a large collection 'People' for all people in a family unit. Alternatively, partition the data by family in the first place into family units.
People -> person: {family: 'Smith'}
versus
Families -> family: {name:'Smith'} -> People -> person
I would expect the latter to be more efficient, but is this correct? Are there any big-O estimates for each?
Any other advantages of sub-collections (eg for transactions)?
I’ve got some key points about subcollections that you need to be aware of when modeling your database.
1 – Subcollections give you a more structured database.
2 – Queries are indexed by default: query performance is proportional to the size of your result set, not your data set. So the size of your collection does not matter; the performance depends on the size of your result set.
3 – Each document has a max size of 1MB. For instance, if you have an array of orders in your customer document, it might be a good idea to create a subcollection of orders to each customer because you cannot foresee how many orders a customer will have. By doing this you don’t need to worry about the max size of your document.
4 – Pricing: Firestore charges you for document reads, writes and deletes. Therefore, when you create many subcollections instead of using arrays in the documents, you will need to perform more reads, writes and deletes, thus increasing your bill.
To answer the original question about efficiency:
Querying all people with the family 'Smith' from the people top-level collections really is not any slower than asking for all the people in the 'Smith' family sub-collection.
This is explained in the How to Structure Your Data episode of the Get to Know Cloud Firestore video series.
There are some trade-offs between top-level collections and sub-collections to be aware of. Depending on the specific queries you intend to use you may need to create composite indexes to query top-level collections or collection group indexes to query sub-collections. Both these index types count towards the 200 index exemptions limit.
These trade-offs are discussed in detail near the bottom of the Understanding Collection Group Queries blog post and in Maps, Arrays and Subcollections, Oh My! episode of the Get to Know Cloud Firestore video series.
I've linked to the relevant parts of both videos.
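The claim that both layouts perform about the same follows from queries being served by an index, so the work scales with the number of results rather than with the collection size. Here is a toy Python model of that idea. IndexedCollection is a hypothetical class for illustration only and says nothing about how Firestore is implemented internally.

```python
from collections import defaultdict


class IndexedCollection:
    """Toy collection with an automatic index on the 'family' field."""

    def __init__(self):
        self._docs = {}
        self._family_index = defaultdict(list)  # family name -> document ids

    def add(self, doc_id, family):
        self._docs[doc_id] = {"family": family}
        self._family_index[family].append(doc_id)

    def where_family(self, family):
        """Return matching documents and how many index entries were touched."""
        ids = self._family_index.get(family, [])
        return [self._docs[doc_id] for doc_id in ids], len(ids)

    def all_documents(self):
        """Return every document and how many documents were touched."""
        return list(self._docs.values()), len(self._docs)


def build_top_level_people(total, smiths):
    """One big 'People' collection: a few Smiths among many other people."""
    people = IndexedCollection()
    for i in range(total - smiths):
        people.add(f"person{i}", f"Family{i}")
    for i in range(smiths):
        people.add(f"smith{i}", "Smith")
    return people


def build_smith_subcollection(smiths):
    """The 'Families/Smith/People' layout: only the Smiths live here."""
    people = IndexedCollection()
    for i in range(smiths):
        people.add(f"smith{i}", "Smith")
    return people


if __name__ == "__main__":
    _, touched_top = build_top_level_people(10_000, 4).where_family("Smith")
    _, touched_sub = build_smith_subcollection(4).all_documents()
    print("entries touched, top-level query:", touched_top)
    print("entries touched, sub-collection read:", touched_sub)
```

In both cases the lookup touches four entries, one per Smith, even though the top-level collection holds 10,000 documents.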
I was wondering about the same thing. The documentation mainly talks about arrays vs sub-collections. My conclusion is that there are no clear advantages of using a sub-collection over a top-level collection. Sub collections had some clear technical limitations before, but I think those are removed with the recent introduction of collection group queries.
Here are some advantages of both approaches:
Sub collection:
Your database "feels" more structured as you will have fewer top-level collections listed.
No need to store a reference/foreign key/id of the parent document, as it is implied by the database structure. You can get to the parent via the sub collection document ref.
Top-level collection:
Documents are easier to delete. Using sub collections you need to make sure to first delete all sub collection documents before you delete the parent document. There is no API for this so you might need to roll your own helper functions.
Having the parent id directly in each (sub) document might make it easier to process query results, depending on the application.
Todd answered this in a Firebase YouTube video:
1) There's a limit to how many documents you can create per minute in a single collection if the documents have an always-increasing value (like a timestamp).
2) Very large collections don't do as well from a performance standpoint when you're offline. But they are generally good options to consider.
