This is a general question; I'm making up an example just to better illustrate what I mean.
Assume I have a User model, and a Tournament model, and that the Tournament model has a key/value map of user ids and their scores. When exposing this as a GraphQL API, I could expose it more or less directly like so:
Schema {
tournament: {
scores: [{
user: User
score: Number
}]
}
user($id: ID) {
id: ID
name: String
}
}
This gives access to all the data. However, in many cases it might be useful to get a user's scores in a certain tournament, or going from the tournament, get a certain user's scores. In other words, there are many edges that seem handy that I could add:
Schema {
tournament: {
scores: [{
user: User
score: Number
}]
userScore($userID: ID): Number # New edge!
}
user($id: ID) {
id: ID
name: String
tournamentScore($tournamentID: ID): Number # New edge!
}
}
This would probably be more practical to consume for the client, covering more use cases in a handy way. On the other hand the more I expose, the more I have to maintain.
My question is: In general, is it better to be "generous" and expose many edges between nodes where applicable (because it makes it easier for the client), or is it better to code sparingly and only expose as much as needed to get the data (because it will be less to maintain)?
Of course, in this trivial example it won't make much difference either way, but I feel like these might be important questions when designing larger APIs.
I could write it as a comment but I can't help emphasizing the following point as an answer:
Always, always follow the YAGNI principle. The less there is to maintain, the better. Good API design is not about how large the API is; it's about how well it meets the needs and how easy it is to use.
You can always add new fields (what you call edges in your example) later, when you need them. KISS is good.
Or you could do this:
Schema {
tournament: {
scores(user_ids: [ID]): [{
user: User
score: Number
}]
}
user($id: ID) {
id: ID
name: String
tournaments(tournament_ids: [ID]): [{
tournament: Tournament
score: Number
}]
}
}
Since user_ids and tournament_ids are not mandatory, the client can decide whether to get all edges, some, or just one.
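A minimal sketch of how a resolver behind such an optional filter might behave, written as a plain function rather than against any particular GraphQL server library. The data shape (`scoresByUserId`) and the function name `resolveScores` are assumptions made for illustration; they are not part of the schema above.

```javascript
// Hypothetical in-memory data standing in for the Tournament model's
// key/value map of user ids to scores.
const tournament = {
  id: "t1",
  scoresByUserId: { u1: 10, u2: 25, u3: 7 },
};

// Hypothetical resolver for `tournament.scores(user_ids: [ID])`.
// When `user_ids` is omitted, every score entry is returned.
function resolveScores(tournament, { user_ids } = {}) {
  const entries = Object.entries(tournament.scoresByUserId).map(
    ([userId, score]) => ({ user: { id: userId }, score })
  );
  if (user_ids == null) {
    return entries;
  }
  const wanted = new Set(user_ids);
  return entries.filter((entry) => wanted.has(entry.user.id));
}

const all = resolveScores(tournament); // no filter: all edges
const some = resolveScores(tournament, { user_ids: ["u1", "u3"] });
const one = resolveScores(tournament, { user_ids: ["u2"] });
```

The same field covers all three use cases, so there is only one resolver to maintain per direction.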
I have a state object with the following branches (trying to adhere to "Normalizing the state shape"):
Users
An array of elements like
{
id: 1,
name: "Werner"
}
originating from some server.
User locations
An array of elements like
{
userId: 1,
latitude: 45,
longitude: 70
}
originating from some server.
The problem
The users might change depending on a number of actions: SET_USERS_ACTION, ADD_USER_ACTION, DELETE_USER_ACTION.
Every time something happens to the users, I want to update the user locations (which is an asynchronous operation, as the data needs to come from the server). The how of the matter is what I'm struggling with.
Obviously, I can't fetch the user locations in the reducer (when updating the users), as the reducer would no longer be pure in that case.
I might do it in the thunk, but that would mean I have to add user location considerations to every action creator involving user-actions, which seems like mixing concerns to me.
Additionally, once an action is added that changes the users array in some way, the developer needs to remember to also update the user locations. My experience is that stuff like this will almost always be forgotten at some point.
Further complications
To further complicate the matter, we don't always need to fetch the locations. Only if a component displaying a map with all users is active, does it make sense to fetch the user locations. Not every action is generated at a place where I know (beforehand) if that component is visible or not. One example is when we receive a notification from the server (with Web Sockets) that a user was added or removed.
What is the best way of solving this problem?
I'd suggest using https://github.com/kolodny/immutability-helper. The benefit of the update helper is that you can make many changes in a single call, without touching the state several times. For example:
import update from 'immutability-helper';
...
case SET_USERS_ACTION:
return update(
state,
{
users: {
[idx]: { status: { $set: 'ready' }}
},
locations: {
$push: [{...}]
}
}
);
break;
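For comparison, here is a rough sketch of the same kind of combined update written with plain object and array spread instead of the helper. The action type, `idx`, the `status` field, and the location payload are hypothetical, since the snippet above elides them.

```javascript
// Hypothetical state shape and action payload, for illustration only.
const initialState = {
  users: [{ id: 1, name: "Werner", status: "loading" }],
  locations: [],
};

function reducer(state = initialState, action) {
  switch (action.type) {
    case "SET_USER_READY": {
      const { idx, location } = action;
      return {
        ...state,
        // Copy the users array, replacing only the entry at idx.
        users: state.users.map((user, i) =>
          i === idx ? { ...user, status: "ready" } : user
        ),
        // Append without mutating the existing locations array.
        locations: [...state.locations, location],
      };
    }
    default:
      return state;
  }
}

const next = reducer(initialState, {
  type: "SET_USER_READY",
  idx: 0,
  location: { userId: 1, latitude: 45, longitude: 70 },
});
```

Both branches change in one pass and the previous state object is left untouched, which is what the helper's `$set` and `$push` commands express more compactly.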
I have a Comments collection and a Page collection. Comments belong to pages. Users can upvote the comments, and I want to display the aggregated sum of all the votes of the comments belonging to a page. What would be a good way to do this?
I was thinking of keeping the sum as an AutoValue inside the page collection. Would there be a way to occasionally trigger a recalculation of the AutoValue? I don't need the sum to be updated realtime, once every 5 minutes would suffice.
Or is this a bad idea? Would it be better to use a ReactiveVar in the template to do the calculation, or something else?
Edit: There's not much special about the setup, really. Simply a comment collection with a numeric 'votes' attribute and a pages collection with a numeric autovalue 'score' that should count the votes.
The pages:
Collections.Pages = new Mongo.Collection("pages");
var PageSchema = new SimpleSchema({
name: {
type: String,
min: 1
},
score: {
type: Number,
autoValue: function (doc) {
var maxValue = 1;
Collections.Comments.find({ pageId: doc.pageId }).map(function(mapDoc){
maxValue += mapDoc.votes;
});
return maxValue;
}
},
The comments:
Collections.Comments = new Mongo.Collection("comments");
var CommentSchema = new SimpleSchema({
pageId: {
type: String
},
name: {
type: String,
optional: true
},
votes: {
type: Number,
label: 'Total Votes',
defaultValue: 0
},
Maybe an alternative approach to periodic/timed recalculations might be to simply recalculate the value in one collection in response to a change in the other collection. You said you don't need realtime, but I don't imagine you'd mind if it was realtime.
I had a similar challenge and used the Meteor Collection Hooks package (see https://github.com/matb33/meteor-collection-hooks).
Example:
// Uses the Collections.Comments collection defined in the question.
Collections.Comments.after.update(function (userId, doc) {
// make update to aggregated value in Collections.Pages
});
i did something similar: i had News items with Comments, and i wanted to track the number of comments per news item w/o having to publish all the Comments.
i chose to give News a commentCount field. i had methods for adding and removing comments, and as part of that processing, i looked up the associated News item and incremented or decremented its count.
what you're finding with your schema solution is that there's no clear way to trigger the autoValue. (it's an interesting use of autoValue, btw, i'll have to keep that in mind for future use).
so i think you're left with these choices:
1. create upvote/downvote methods for the votes. in the method handler, do the calculations for total votes and store the updated value along with post. this is similar to what i did with News/Comments.
2. as David suggested, use collection hooks to do something similar to #1. though i do use collection hooks, it's usually when i don't have a clear hook into what i want to do, it's more of a catchall, or processing driven off something i don't totally control.
3. take care of it in the publish. when you publish the Page, also look up the vote count and simply add dynamically to the publish object. Note that this won't republish the Page when the votes change, so you would lose that reactivity; you did indicate that you were ok with periodic updates.
getting that updated would be a little tricky, because you would have to force the publisher to run again. e.g. through unsubscribing and resubscribing.
of those 3, based on what i understand of your problem, i like them in the order presented. #3 feels the least viable, but i mention it in case it fits in w/ something else you're doing.
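a rough sketch of the first option (a vote method that keeps the page total in step with each vote). the Maps below are in-memory stand-ins for the Pages and Comments collections and `voteOnComment` is a made-up name; in a real Meteor app this logic would sit inside a method and use collection updates instead.

```javascript
// In-memory stand-ins for the Pages and Comments collections (hypothetical;
// in Meteor these would be Mongo collections updated inside a method).
const pages = new Map([["p1", { name: "Home", score: 0 }]]);
const comments = new Map([
  ["c1", { pageId: "p1", votes: 0 }],
  ["c2", { pageId: "p1", votes: 0 }],
]);

// Apply a vote (+1 or -1) to a comment and adjust the page's score by the
// same amount, so the aggregate never has to be recomputed from scratch.
function voteOnComment(commentId, delta) {
  const comment = comments.get(commentId);
  if (!comment) {
    throw new Error(`Unknown comment: ${commentId}`);
  }
  comment.votes += delta;
  const page = pages.get(comment.pageId);
  page.score += delta;
  return page.score;
}

voteOnComment("c1", 1);
voteOnComment("c2", 1);
voteOnComment("c1", -1);
```

because the total is adjusted at write time, reading a page's score needs no access to the comments at all.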
I'm working on a game. Originally, the user was in a single dungeon, with properties:
// state
{
health: 95,
creatures: [ {}, {} ],
bigBoss: {},
lightIsOn: true,
goldReward: 54,
// .. you get the idea
}
Now there are many kingdoms, and many dungeons, and we may want to fetch this data asynchronously.
Is it better to represent that deeply-nested structure in the user's state, effectively caching all the other possible dungeons when they are loaded, and every time we want to update a property (e.g. action TURN_ON_LIGHT) we need to find exactly which dungeons we're talking about, or to update the top-level properties every time we move to a new dungeon?
The state below shows nesting. Most of the information is irrelevant to my presentational objects and actions, they only care about the one dungeon the user is currently in.
// state with nesting
{
health: 95,
kingdom: 0,
dungeon: 1,
kingdoms: [
{
dungeons: [
{
creatures: [ {}, {} ],
bigBoss: {},
lightIsOn: true,
goldReward: 54
},
{
creatures: [ {}, {}, {} ],
bigBoss: {},
lightIsOn: false,
goldReward: 79
},
{
//...
}
]
},
{
// ...
}
]
}
One of the things that's holding me back is that all the clean reducers, which previously could just take an action like TURN_ON_LIGHT and update the top-level property lightIsOn, allowing for very straightforward reducer composition, now have to reach into the state and update the correct property depending on the kingdom and dungeon that we are currently in. Is there a nice way of composing the reducers that would keep this clean?
The recommended approach for dealing with nested or relational data in Redux is to normalize it, similar to how you would structure a database. Use objects with IDs as keys and the items as values to allow direct lookup by IDs, use arrays of IDs to indicate ordering, and any other part of your state that needs to reference an item should just store the ID, not the item itself. This keeps your state flatter and makes it more straightforward to update a given item.
As part of this, you can use multiple levels of connected components in your UI. One typical technique with Redux is to have a connected parent component that retrieves the IDs of multiple items, and renders <SomeConnectedChild itemID={itemID} /> for each ID. That connected child would then look up its own data using that ID, and pass the data to any presentational children below it. Actions dispatched from that subtree would reference the item's ID, and the reducers would be able to update the correct normalized item entry based on that.
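As a rough illustration of that idea applied to the dungeon example, here is a minimal sketch. The `byId`/`allIds` layout, the ID values, and the `dungeonId` field on the action are assumptions chosen for the example, not a prescribed Redux API.

```javascript
// Hypothetical normalized shape for the kingdoms/dungeons example.
const initialState = {
  health: 95,
  currentDungeonId: "d1",
  kingdoms: {
    byId: { k0: { id: "k0", dungeonIds: ["d0", "d1"] } },
    allIds: ["k0"],
  },
  dungeons: {
    byId: {
      d0: { id: "d0", lightIsOn: true, goldReward: 54 },
      d1: { id: "d1", lightIsOn: false, goldReward: 79 },
    },
    allIds: ["d0", "d1"],
  },
};

// Reducer for a single dungeon: as simple as the original top-level one.
function dungeon(state, action) {
  switch (action.type) {
    case "TURN_ON_LIGHT":
      return { ...state, lightIsOn: true };
    default:
      return state;
  }
}

// Reducer for the dungeons slice: routes the action to one entry by ID.
function dungeons(state = initialState.dungeons, action) {
  const id = action.dungeonId;
  if (id === undefined || !state.byId[id]) {
    return state;
  }
  return {
    ...state,
    byId: { ...state.byId, [id]: dungeon(state.byId[id], action) },
  };
}

const nextDungeons = dungeons(initialState.dungeons, {
  type: "TURN_ON_LIGHT",
  dungeonId: "d1",
});
```

The per-dungeon reducer stays unaware of kingdoms and nesting; only the slice reducer knows how to find the right entry, and untouched entries keep their identity.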
The Redux FAQ has further discussion on this topic: http://redux.js.org/docs/FAQ.html#organizing-state-nested-data . Some of the articles on Redux performance at https://github.com/markerikson/react-redux-links/blob/master/react-performance.md#redux-performance describe the "pass an ID" approach, and https://medium.com/@adamrackis/querying-a-redux-store-37db8c7f3b0f is a good reference as well. Finally, I just gave an example of what a normalized state might look like over at https://github.com/reactjs/redux/issues/1824#issuecomment-228609501.
edit:
As a follow-up, I recently added a new section to the Redux docs, on the topic of "Structuring Reducers". In particular, this section includes chapters on "Normalizing State Shape" and "Updating Normalized Data".
I'm interested in a one-way-many association. To explain:
// Dog.js
module.exports = {
attributes: {
name: {
type: 'string'
},
favoriteFoods: {
collection: 'food',
dominant: true
}
}
};
and
// Food.js
module.exports = {
attributes: {
name: {
type: 'string'
},
cost: {
type: 'integer'
}
}
};
In other words, I want a Dog to be associated w/ many Food entries, but as for Food, I don't care which Dog is associated.
If I actually implement the above, believe it or not it works. However, the table for the association is named in a very confusing manner - even more confusing than normal ;)
dog_favoritefoods__food_favoritefoods_food, with id, dog_favoritefoods, and food_favoritefoods_food.
REST blueprints function with the Dog model just fine, I don't see anything that "looks bad" except for the funky table name.
So, the question is, is it supposed to work this way, and does anyone see something that might potentially go haywire?
I think you should be ok.
However, there doesn't really seem to be any reason not to complete the association as a many-to-many. Everything is already being created for that single collection: the join table and its attributes are already there. The only thing missing in this equation is the reference back on Food.
I could understand if putting the association on food were to create another table or create another weird join, but that has already been done. There really is no overhead to creating the other association.
So in theory you might as well create it, thus avoiding any potential conflicts, unless you have a really compelling reason not to.
Edited: Based on the comments below, note that you could see some overhead at lift from the blueprints and dynamic finders that get created.
Suppose I have a typical users & groups data model where a user can be in many groups and a group can have many users. It seems to me that the firebase docs recommend that I model my data by replicating user ids inside groups and group ids inside users like this:
{
"usergroups": {
"bob": {
"groups": {
"one": true,
"two": true
}
},
"fred": {
"groups": {
"one": true
}
}
},
"groupusers": {
"one": {
"users": {
"bob": true,
"fred": true
}
},
"two": {
"users": {
"bob": true
}
}
}
}
In order to maintain this structure, whenever my app updates one side of the relationship (e.g., adds a user to a group), it also needs to update the other side of the relationship (e.g., add the group to the user).
I'm concerned that eventually someone's computer will crash in the middle of an update or something else will go wrong and the two sides of the relationship will get out of sync. Ideally I'd like to put the updates inside a transaction so that either both sides get updated or neither side does, but as far as I can tell I can't do that with the current transaction support in firebase.
Another approach would be to use the upcoming firebase triggers to update the other side of the relationship, but triggers are not available yet and it seems like a pretty heavyweight solution to post a message to an external server just to have that server keep redundant data up to date.
So I'm thinking about another approach where the many-many user-group memberships are stored as a separate endpoint:
{
"memberships": {
"id1": {
"user": "bob",
"group": "one"
},
"id2": {
"user": "bob",
"group": "two"
},
"id3": {
"user": "fred",
"group": "one"
}
}
}
I can add indexes on "user" and "group", and issue firebase queries ".orderByChild("user").equalTo(...)" and ".orderByChild("group").equalTo(...)" to determine the groups for a particular user and the users for a particular group respectively.
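To make the intent of those two queries concrete, here is a small local sketch that performs the equivalent filtering over the memberships data in plain JavaScript. It does not call Firebase at all; the `lookup` helper is a hypothetical stand-in for what the indexed queries would return.

```javascript
// The memberships data from above.
const memberships = {
  id1: { user: "bob", group: "one" },
  id2: { user: "bob", group: "two" },
  id3: { user: "fred", group: "one" },
};

// Local stand-in for orderByChild(child).equalTo(value): keep the records
// whose `child` field matches, then project out the other field.
function lookup(child, value, project) {
  return Object.values(memberships)
    .filter((m) => m[child] === value)
    .map((m) => m[project]);
}

const groupsForBob = lookup("user", "bob", "group");
const usersInOne = lookup("group", "one", "user");
```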
What are the downsides to this approach? We no longer have to maintain redundant data, so why is this not the recommended approach? Is it significantly slower than the recommended replicate-the-data approach?
In the design you propose you'd always need to access three locations to show a user and her groups:
1. the users child to determine the properties of the user
2. the memberships to determine what groups she's a member of
3. the groups child to determine the properties of the group
In the denormalized example from the documentation, your code would only need to access #1 and #3, since the membership information is embedded into both users and groups.
If you denormalize one step further, you'd end up storing all relevant group information for each user and all relevant user information for each group. With such a data structure, you'd only need to read a single location to show all information for a group or a user.
Redundancy is not necessarily a bad thing in a NoSQL database, indeed precisely because it speeds things up.
For the moment I would go with a secondary process that periodically scans the data and reconciles any irregular data it finds. Of course that also means that regular client code needs to be robust enough to handle such irregular data (e.g. a group that points to a user, where that user's record doesn't point to the group).
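A minimal sketch of what such a reconciliation scan could look like, run over a snapshot of the data in the shape from the question. The function name `findMismatches` and the deliberately inconsistent sample data are assumptions for illustration; how the snapshot is loaded and how mismatches are repaired is left out.

```javascript
// Data in the shape from the question, with one deliberate inconsistency:
// group "two" lists bob, but bob's record does not list group "two".
const data = {
  usergroups: {
    bob: { groups: { one: true } },
    fred: { groups: { one: true } },
  },
  groupusers: {
    one: { users: { bob: true, fred: true } },
    two: { users: { bob: true } },
  },
};

// Return every user/group pair that appears on only one side.
function findMismatches({ usergroups, groupusers }) {
  const mismatches = [];
  for (const [user, { groups = {} }] of Object.entries(usergroups)) {
    for (const group of Object.keys(groups)) {
      if (!groupusers[group]?.users?.[user]) {
        mismatches.push({ user, group, missingFrom: "groupusers" });
      }
    }
  }
  for (const [group, { users = {} }] of Object.entries(groupusers)) {
    for (const user of Object.keys(users)) {
      if (!usergroups[user]?.groups?.[group]) {
        mismatches.push({ user, group, missingFrom: "usergroups" });
      }
    }
  }
  return mismatches;
}

const mismatches = findMismatches(data);
```

Each reported pair says which side is missing the link, so the repair step can either add the missing entry or remove the orphaned one.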
Alternatively you could set up some advanced .validate rules that ensure the two sides are always in sync. I've just always found that takes more time to implement, so never bothered.
You might also want to read this answer: Firebase data structure and url