When designing a state shape with related entities, the official Redux docs recommend referencing by ID rather than nesting: http://redux.js.org/docs/basics/Reducers.html#note-on-relationships.
In a one-to-many relationship, Normalizr will put the references in the "one" side of the relationship, e.g.:
"posts": {
"1": {
...
comments: ["1", "2", "3"]
...
Is this better than putting the reference in the "many" side? e.g.
"comments": {
"7": {
...
postId: "1"
...
Does it matter where I put the reference when creating a Redux store?
I'd suggest keeping the ID of the comments in the post.
This way, for any given post, you can access all of its comments by direct reference (index or property name, it doesn't matter), which is fast and easy. Each lookup by ID is O(1), so collecting the k comments of a post is O(k).
In the opposite scenario, you have to search through all of your comments for any given post. That's O(N) per post, where N is the total number of comments in the store. Plus, you may have to re-order the comments once you have them all.
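To make the difference concrete, here is a minimal sketch of both lookups against a normalized store. The state shape and the names `BlogState`, `commentsViaPost` and `commentsViaScan` are assumptions made up for illustration; they do not come from Redux or Normalizr.

```typescript
// Hypothetical normalized state; entity and field names are assumptions.
interface PostEntity {
  id: string;
  comments: string[]; // comment IDs, already in display order
}

interface CommentEntity {
  id: string;
  postId: string;
  text: string;
}

interface BlogState {
  posts: Record<string, PostEntity | undefined>;
  comments: Record<string, CommentEntity | undefined>;
}

// Reference on the "one" side: one keyed lookup per comment ID.
function commentsViaPost(state: BlogState, postId: string): CommentEntity[] {
  const post = state.posts[postId];
  if (post === undefined) {
    return [];
  }
  const found: CommentEntity[] = [];
  for (const commentId of post.comments) {
    const comment = state.comments[commentId];
    if (comment !== undefined) {
      found.push(comment);
    }
  }
  return found;
}

// Reference on the "many" side: every comment in the store is inspected.
function commentsViaScan(state: BlogState, postId: string): CommentEntity[] {
  const found: CommentEntity[] = [];
  for (const commentId of Object.keys(state.comments)) {
    const comment = state.comments[commentId];
    if (comment !== undefined && comment.postId === postId) {
      found.push(comment);
    }
  }
  return found;
}
```

The first function returns the comments in the order stored on the post. The second returns them in whatever order the keys of the comments object are enumerated, which is why a separate sort may be needed.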
Quick high level background:
DynamoDB with single table design
OpenSearch for full text search
DynamoDB Stream which indexes into OpenSearch on DynamoDB Create/Update/Delete via Lambda
The single table design approach has been working well for us so far, but we also haven't really had many-to-many relationships to deal with. However, a new relationship we recently needed to account for is Tags for Entry objects:
interface Entry {
readonly id: string
readonly title: string
readonly tags: Tag[]
}
interface Tag {
readonly id: string
readonly name: string
}
We want to stick to a single query/read to retrieve a list of Entries or a single Entry, but we also want a good balance between read simplicity and the cost of managing updates.
A few ways we've considered storing the data:
Store all tag data in the Entry
{
"id": "asdf1234",
"title": "Entry Title",
"tags": [
{
"id": "1234asdf",
"name": "stack"
},
{
"id": "4321hjkl",
"name": "over"
},
{
"id": "7657gdfg",
"name": "flow"
}
]
}
This approach makes reads easy, but updates become a pain: any time a tag is updated, we would need to find all Entries that reference that tag and update each of them.
Store only the tag ids in the Entry
{
"id": "asdf1234",
"title": "Entry Title",
"tags": ["1234asdf", "4321hjkl", "7657gdfg"]
}
With this approach, no updates would be required when a Tag is updated, but now we have to do multiple reads to return the full data - we would need to query each Tag by id to retrieve its data before returning the full content back to the client.
Store only the tag ids in the Entry but use OpenSearch to query and get data
This option, similar to the one above, would store only the tag ids on the Entry, but the stream Lambda would add the full Tag data to the Entry document it indexes on the search side. Updating a Tag would still require finding and updating every affected Entry (in search) individually, so the question is whether it's more cost-effective to just do that in DynamoDB.
This scenario presents an interesting uni-directional flow:
writes go straight to DynamoDB
DynamoDB stream -> Lambda -> transform the data -> index in OpenSearch
reads are exclusively done via OpenSearch
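A minimal sketch of the transform step in that flow, written as pure functions so it stays independent of any AWS SDK call. The names `StoredEntry`, `TagRecord`, `toSearchDocument` and `entriesToReindex` are hypothetical, and the sketch assumes the Lambda has already loaded the relevant tags into `tagsById` by some other means.

```typescript
// Hypothetical shapes; only the id/title/name fields come from the question.
interface StoredEntry {
  id: string;
  title: string;
  tags: string[]; // tag IDs only, as stored in DynamoDB
}

interface TagRecord {
  id: string;
  name: string;
}

interface EntrySearchDocument {
  id: string;
  title: string;
  tags: TagRecord[]; // full tag data, as indexed for search
}

// Expand tag IDs into full tag objects before indexing.
// Unknown tag IDs are skipped here; a real pipeline might log or retry instead.
function toSearchDocument(
  entry: StoredEntry,
  tagsById: Record<string, TagRecord | undefined>
): EntrySearchDocument {
  const tags: TagRecord[] = [];
  for (const tagId of entry.tags) {
    const tag = tagsById[tagId];
    if (tag !== undefined) {
      tags.push(tag);
    }
  }
  return { id: entry.id, title: entry.title, tags };
}

// When a tag changes, these are the entries whose search documents are stale.
function entriesToReindex(entries: StoredEntry[], changedTagId: string): string[] {
  return entries
    .filter((entry) => entry.tags.indexOf(changedTagId) !== -1)
    .map((entry) => entry.id);
}
```

With this split, a Tag update touches one item in DynamoDB, and the fan-out cost moves to re-indexing the entries returned by something like `entriesToReindex`.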
The overall question is: how do applications using NoSQL with single table design handle these many-to-many scenarios? Is the uni-directional flow described above a good idea, and is it worth it?
Things to consider:
our application leans more heavily on the read side
our application will also utilize search capability quite heavily
Tag updates will not be often
In my firebase db I have 3 collections:
Users
{user_id}: {name: "John Smith"}
Items
{item_id}: {value: 12345}
Actions
{action_id}: {action: "example", user: {user_id}, items:{item_id}}
Basically, instead of storing the Users and Items under the Actions Collection, I just keep an ID. But now I need a list of all actions and this also needs info from the Users and Items Collection. How can I efficiently query firebase so I can get a result that looks like this:
{
action: "example",
user: {
name: "John Smith"
},
item: {
value: 12345
}
}
Unfortunately, there is no such thing in Firebase or similar databases. Basically, you are looking for a traditional join, which is not a recommended thing to do in a NoSQL database.
If you want to do it in Firebase, you will need to:
Get the element you are looking for from your main collection, Actions in this case.
Then make another call to the Items collection for the item whose item_id matches the one stored on the action (and likewise to the Users collection for the user).
Then assign the result to the action, e.g. action["item"] = item_gotten.
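The steps above can be sketched roughly as follows. The sketch deliberately avoids naming any Firebase SDK function: `getUser` and `getItem` are hypothetical readers that stand in for whatever single-document read your SDK version provides, and the field names `user` and `item` on the action are assumptions.

```typescript
interface ActionRecord {
  action: string;
  user: string; // user ID
  item: string; // item ID
}

interface UserRecord {
  name: string;
}

interface ItemRecord {
  value: number;
}

interface JoinedAction {
  action: string;
  user: UserRecord | null;
  item: ItemRecord | null;
}

// Hypothetical readers; each one wraps a single-document read by ID.
interface ActionReaders {
  getUser: (id: string) => Promise<UserRecord | null>;
  getItem: (id: string) => Promise<ItemRecord | null>;
}

// Client-side join for one action: two extra reads, run in parallel.
async function joinAction(
  action: ActionRecord,
  readers: ActionReaders
): Promise<JoinedAction> {
  const [user, item] = await Promise.all([
    readers.getUser(action.user),
    readers.getItem(action.item),
  ]);
  return { action: action.action, user, item };
}

// Join a whole list of actions; this costs two reads per action.
function joinActions(
  actions: ActionRecord[],
  readers: ActionReaders
): Promise<JoinedAction[]> {
  return Promise.all(actions.map((action) => joinAction(action, readers)));
}
```

Note the cost: a list of 100 actions means up to 200 additional reads unless the readers cache results for IDs they have already seen.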
As I said, this is not a recommended approach. With a NoSQL database you usually expect a denormalized structure: your application saves the whole Item inside the Action JSON as well as in the Items collection. Yes, you will have duplicate data, but that is fine for this kind of model. You also shouldn't expect many changes to any one object that is copied around like this. If you are managing a large volume of such changes, you may be using the wrong kind of DB.
For aggregation queries reference, you might check: https://firebase.google.com/docs/firestore/solutions/aggregation
I will be receiving flat documents that will have slightly different schemas.
For example:
{
"FirstName": "Jim",
"LastName": "Bob"
}
And another one, that would simply have:
{
"FullName": "Jim Bob"
}
Is it possible to query the Person collection to retrieve the list of unique properties (not the values)?
[
"FirstName",
"LastName",
"FullName"
]
According to my research, this is not supported in the Cosmos DB query syntax so far. You could refer to this similar feedback and adopt the suggestions from the Cosmos DB team.
Also, I think you could get all the property names with the following workaround in code.
Create and init a hashmap.
Query the documents and get the results array.
Loop over the array and convert every JSON document to a map.
Push the keys of each map into the initial hashmap, which keeps the list of properties unique.
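A minimal sketch of that workaround, assuming the documents have already been queried and parsed into plain objects. The function name `uniquePropertyNames` is made up for illustration, and a `Set` plays the role of the hashmap.

```typescript
// Documents are treated as flat JSON objects, as in the question.
type FlatDocument = Record<string, unknown>;

// Collect every distinct property name across all documents,
// in the order in which each name is first seen.
function uniquePropertyNames(documents: FlatDocument[]): string[] {
  const seen = new Set<string>(); // plays the role of the hashmap
  for (const doc of documents) {
    for (const key of Object.keys(doc)) {
      seen.add(key); // adding an existing key is a no-op
    }
  }
  return Array.from(seen);
}
```

Documents read back from Cosmos DB also carry system properties such as `id`, `_rid` and `_ts`, so those names would appear in the result as well unless you filter them out.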
I am trying to develop a system with 2 different views, the first being "My books" and the second "Other people's books".
For the first view I have this working fine:
this.mybooks = angFire.database.list('/user-books', {
query: {
orderByChild: 'uid',
equalTo: firebase.auth().currentUser.uid
}
});
But I don't know how to create something that would work as a "notEqualTo" function to exclude all books owned by the current user.
This is my Firebase structure for reference:
{
"user-books" : {
"-KfJ9CprqgOWN9Ud_CvG" : {
"author" : "sdasd",
"city" : "asd",
"description" : "asdasd",
"title" : "asda",
"uid" : "bazvEYBL6sgfa6HSmqvtAlX3f0l2"
},
"-KfJARoEU_FDW80hg4ws" : {
"author" : "in",
"city" : "chaaat",
"description" : "the",
"title" : "Pogchamps",
"uid" : "tzROGF1Tk4NcobrTE70ZKQzoKom1"
}
}
}
The firebase data structure you are currently using, which nests books inside user objects, is probably not the best for your exact use. The firebase docs have a section devoted to that right here. One of the subsections is titled "avoid nesting data".
A better structure would normalise books and put them in a separate collection/table. To store a user's books, you can still have a books array on the user, but store only the keys (ids) from the books table in it.
Listing other people's books, then, just becomes listing all the books from the books table and filtering out the current user's books.
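A minimal sketch of that client-side filter, assuming the whole books collection has already been loaded as an object keyed by book ID. The names `BookRecord` and `otherPeoplesBooks` are hypothetical; only the field names come from the structure in the question.

```typescript
interface BookRecord {
  author: string;
  city: string;
  description: string;
  title: string;
  uid: string; // owner's user ID
}

// Keep every book whose owner is not the current user.
function otherPeoplesBooks(
  books: Record<string, BookRecord | undefined>,
  currentUid: string
): BookRecord[] {
  const result: BookRecord[] = [];
  for (const key of Object.keys(books)) {
    const book = books[key];
    if (book !== undefined && book.uid !== currentUid) {
      result.push(book);
    }
  }
  return result;
}
```

This downloads every book before filtering, so it is only reasonable while the collection stays small enough to load in full.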
Note that, especially in data stores like Firebase, the optimal data structure depends entirely on how you use the data. Storing the user-book link inside the user object is a good approach only if your app focuses on listing a user's books. If, on the contrary, the focus were to list the users for a book, it would be better to store arrays of user ids inside the books collection.
I am developing the browser front end of a social network application. It has lots of relational data, with one-to-many (1:m) and mostly many-to-many (m:m) relationships, as in the list below.
I want to use Flux data flow architecture in the application. I am using Vuex.js with Vue.js.
As expressed in the Redux.js docs, it is better to have a flat, normalized store state shape for various reasons when working with React, and I think the same holds for Vue.js.
posts have categories (m:m)
posts have tags (m:m)
post has comments (1:m)
posts have hashtags in them (m:m) // or users create hashtags
posts have mentions in them (m:m) // or users create mentions of users
users like posts (m:m)
users follow users, posts, post categories etc. (m:m)
users favorite posts (m:m)
etc.
I will need to show post feeds with all of the related data from other entities such as users, comments, categories, and tags. For this, treating each relation like a 1:many one and holding the data of the "many" side on the "one" side (the parent, so to speak), even where the relation is actually many-to-many, seems fine for the usual queries that compose the parent, that is, posts. However, I will also need to query the store state in the inverse direction, for example to get the posts with a certain category or tag.
In that case, it is not as easy as it is for posts. I need a relation entity that holds the ID pairs for the two connected entities, just like a join table or association table in an RDBMS, for ease of access and updating, to avoid digging deep into the state, and to avoid unnecessary re-renders (that last requirement is specific to React or Vue.js and the GUI).
How can I achieve this relatively easily and effectively, e.g. as one does for 1:many relations?
Pursuant to your last comment, I'll present the data structure I currently use for this circumstance:
Tag
{
"type": "tag",
"id": "tag1",
"name": "Tag One"
}
Tag To Post
{
"id": "someId",
"type": "postTag",
"tagId": "tag1",
"postId": "post1"
}
Post
{
"id": "post1",
"name": "Post 1"
}
I found that storing the relationship IDs on each side of an M:M potentially produces orphans. Managing these IDs in two places leads to duplicated steps and more cognitive overhead, as all functions managing the M:M happen in two places rather than one. Additionally, the relationship itself may need to contain data; where would this data go?
M:M Without Entity
{
"id": "post1",
"name": "Post 1",
"tagIds": [
{"id": "tag1", "extraAttribute": false} //this is awkward
]
}
Tag To Post - Additional Attributes
{
"id": "someId",
"extraAttribute": false,
"postId": "post1",
"type": "postTag",
"tagId": "tag1"
}
There are additional options available to speed up extracting tags with minor elbow grease.
Post
{
"id": "post1",
"name": "Post 1",
"tagIds" : ["tag1", "tag4"]
}
Hypothetically, a post would not have more than 20 tags, making this a generally negligible storage cost in exchange for fewer lookups. I have found no urgent need for this so far with a database of 10000 relationships.
Ease of access and updating
1:M is an object directly pointing at what it wants. M:M is two different entities pointing at their relationship. Model that relationship, and centralise the logic.
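As a rough illustration of centralising that logic, here is a sketch of selectors over a store that keeps the postTag relationship as its own entity. The `StoreState` shape and the function names are assumptions for illustration, not part of Vuex or Redux; the entity fields mirror the JSON above.

```typescript
interface TagEntity {
  type: "tag";
  id: string;
  name: string;
}

interface PostTagEntity {
  type: "postTag";
  id: string;
  tagId: string;
  postId: string;
}

interface PostEntity {
  id: string;
  name: string;
}

// Hypothetical store shape: entities keyed by ID, links kept as a flat list.
interface StoreState {
  posts: Record<string, PostEntity | undefined>;
  tags: Record<string, TagEntity | undefined>;
  postTags: PostTagEntity[];
}

// Forward direction: all tags attached to one post.
function tagsForPost(state: StoreState, postId: string): TagEntity[] {
  const result: TagEntity[] = [];
  for (const link of state.postTags) {
    const tag = link.postId === postId ? state.tags[link.tagId] : undefined;
    if (tag !== undefined) {
      result.push(tag);
    }
  }
  return result;
}

// Inverse direction: all posts carrying one tag, using the same link list.
function postsForTag(state: StoreState, tagId: string): PostEntity[] {
  const result: PostEntity[] = [];
  for (const link of state.postTags) {
    const post = link.tagId === tagId ? state.posts[link.postId] : undefined;
    if (post !== undefined) {
      result.push(post);
    }
  }
  return result;
}

// Removing a post also removes its link rows, so no orphaned links remain.
function removePost(state: StoreState, postId: string): StoreState {
  const posts: Record<string, PostEntity | undefined> = {};
  for (const id of Object.keys(state.posts)) {
    if (id !== postId) {
      posts[id] = state.posts[id];
    }
  }
  return {
    posts,
    tags: state.tags,
    postTags: state.postTags.filter((link) => link.postId !== postId),
  };
}
```

Both directions read the same link list, so there is only one place to update when a relationship is added or removed. Each selector scans all links, which matches the scale described above; an index by postId or tagId could be added later if that becomes a bottleneck.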
Rerenders
If your application renders long lists of data (hundreds or thousands
of rows), we recommend using a technique known as “windowing”. This
technique only renders a small subset of your rows at any given time,
and can dramatically reduce the time it takes to re-render the
components as well as the number of DOM nodes created.
https://reactjs.org/docs/optimizing-performance.html#virtualize-long-lists
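To show what windowing boils down to, here is a minimal sketch of the index calculation, assuming every row has the same fixed height. The function `visibleRange` and its parameters are made up for illustration and are not taken from any windowing library.

```typescript
interface VisibleRange {
  start: number; // index of the first row to render (inclusive)
  end: number; // index one past the last row to render (exclusive)
}

// Work out which rows intersect the viewport, plus a few extra rows
// ("overscan") on each side to reduce flicker while scrolling.
function visibleRange(
  scrollTop: number,
  viewportHeight: number,
  rowHeight: number,
  rowCount: number,
  overscan: number = 2
): VisibleRange {
  if (rowHeight <= 0) {
    throw new Error("rowHeight must be positive");
  }
  const firstVisible = Math.floor(scrollTop / rowHeight);
  const endVisible = Math.ceil((scrollTop + viewportHeight) / rowHeight);
  const start = Math.min(rowCount, Math.max(0, firstVisible - overscan));
  const end = Math.min(rowCount, endVisible + overscan);
  return { start, end };
}
```

A component would then render only `rows.slice(range.start, range.end)` and offset that slice by `range.start * rowHeight` pixels inside a container as tall as the full list.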
I feel solutions may be use case specific and subject to broader opinions. I ran into this issue utilising couch/pouch with Vuex and a very large table of 20,000 entries. Again, in production the issues were not extremely noticeable. Results will always vary.
A couple things I try here:
Load partial data sets: in-file (non-reactive) vs in memory (loaded in Vuex)
Sort, paginate, search in-file and load results