Event sourcing: denormalizing relationships in projections

I'm working on a CQRS/ES architecture. We run multiple asynchronous projections into the read stores in parallel because some projections might be much slower than others and we want to stay more in sync with the write side for the faster projections.
I'm trying to understand the approaches for generating the read models and how much data duplication they might entail.
Let's take an order with items as a simplified example. An order can have multiple items, each item has a name. Items and orders are separate aggregates.
I could either save the read models in a more normalized fashion, where I create an entity or document for each item and order and reference them, or save them in a more denormalized manner, where an order contains its items.
Normalized
{
    Id: Order1,
    Items: [Item1, Item2]
}
{
    Id: Item1,
    Name: "Foosaver 9000"
}
{
    Id: Item2,
    Name: "Foosaver 7500"
}
Using a more normalized format would allow a single projection to process events that affect items and orders and update the corresponding objects. It would also mean that any change to an item's name affects all orders that reference it. A customer might, for example, get a delivery note for different items than the corresponding invoice (so obviously that model might not be good enough, and it leads us to the same issues as denormalizing...)
Denormalized
{
    Id: Order1,
    Items: [
        {Id: Item1, Name: "Foosaver 9000"},
        {Id: Item2, Name: "Foosaver 7500"}
    ]
}
Denormalizing, however, would require some source where I can look up the current related data - such as the item. This means that I either have to transport all the information I might need in the event, or I have to keep track of the data that I source for my denormalization. I might also have to do this once per projection - i.e. I might need a denormalized ItemForOrder as well as a denormalized ItemForSomethingElse, each containing only the bare minimum of properties that the denormalized entity or document needs (whenever it is created or modified).
If I shared the same Item in the read store, I could end up mixing item definitions from different points in time, because the projections for items and orders might not run at the same pace. In the worst case, the projection for items might not yet have created the item whose properties I need to source.
Generally, what approaches do I have when processing relationships from an event stream?
update 2016-06-17
Currently, I'm solving this by running a single projection per denormalised read model and its related data. If multiple read models have to share the same related data, I might put them into the same projection to avoid duplicating the related data needed for the lookup.
These related models might even be somewhat normalised, optimised for however I have to access them. My projection is the only thing that reads and writes to them, so I know exactly how they are read.
// related data
public class Item
{
    public Guid Id { get; set; }
    public string Name { get; set; }
    /* and whatever else is needed but not provided by events */
}

// denormalised info embedded in a document
public class ItemInfo
{
    public Guid Id { get; set; }
    public string Name { get; set; }
}

// denormalised data as a document
public class ItemStockLevel
{
    public ItemInfo Item { get; set; } // when this is a document
    public decimal Quantity { get; set; }
}

// or, for an RDBMS
public class ItemStockLevel
{
    public Guid ItemId { get; set; }
    public string ItemName { get; set; }
    public decimal Quantity { get; set; }
}
However, the more hidden issue here is when to update which related data. That depends heavily on the business process.
For example, I wouldn't want the item descriptions of an order to change after it has been placed. When the projection processes an event, it must only update the data that changed according to the business process.
Therefore, an argument could be made for putting this information into the event (and using the data as the client sent it?). If we find that we need additional data later, we might have to fall back to projecting the related data from the event stream and reading it from there...
This could be seen as a similar issue for pure CQRS architectures: when do you update the denormalised data in your documents? When do you refresh the data before presenting it to the user? Again, the business process might drive this decision.
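To make this concrete, here is a minimal sketch of such a projection, assuming hypothetical event types (ItemCreated, ItemRenamed, StockLevelChanged) and a simple document-store interface - an illustration of the approach rather than a definitive implementation:

// A single projection owns both the related-data lookup (Item) and the
// denormalised document (ItemStockLevel). The event types and the store
// interface are hypothetical.
public class ItemStockLevelProjection
{
    private readonly IDocumentStore _store;

    public ItemStockLevelProjection(IDocumentStore store) { _store = store; }

    public void Handle(ItemCreated e)
    {
        // Maintain the related data; only this projection reads and writes it.
        _store.Upsert(new Item { Id = e.ItemId, Name = e.Name });
    }

    public void Handle(ItemRenamed e)
    {
        var item = _store.Load<Item>(e.ItemId);
        item.Name = e.NewName;
        _store.Upsert(item);
        // Whether the rename also propagates to existing ItemStockLevel
        // documents is a business decision, as discussed above.
    }

    public void Handle(StockLevelChanged e)
    {
        // Denormalise at the moment the event is processed.
        var item = _store.Load<Item>(e.ItemId);
        _store.Upsert(new ItemStockLevel
        {
            Item = new ItemInfo { Id = item.Id, Name = item.Name },
            Quantity = e.Quantity
        });
    }
}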

First, I think you want to be careful in your aggregates about life cycles. In the usual shopping cart domain, the cart (Order) lifecycle spans that of the items. Udi Dahan wrote Don't Create Aggregate Roots, which I've found to mean that aggregates hold a reference to the aggregate that "created" them, rather than the other way around.
Therefore, I would expect the event history to look like
// Assuming Orders come from Customers
OrderCreated(orderId: Order1, customerId: Customer1)
ItemAdded(itemId: Item1, orderId: Order1, Name:"Foosaver 9000")
ItemAdded(itemId: Item2, orderId: Order1, Name:"Foosaver 7500")
Now, it's still the case that there are no guarantees here about ordering - that's going to depend on how the aggregates are designed in the write model, whether your event store linearizes events across different histories, and so on.
Notice that in your normalized views you could go from the order to the items, but not the other way around. Processing the events I've described gives you the mirror-image limitation: instead of orders with mysterious items, you have items with mysterious orders. Anybody who looks for an order either doesn't see it yet, sees it empty, or sees it with some number of items, and can follow links from those items into the key-value store.
Your normalized forms in your key-value store don't need to change from your example; the projection responsible for writing the normalized form of orders needs to be smart enough to watch the item streams too, but it's all good.
(Also note: we're eliding ItemRemoved here)
That's OK, but it misses the point that reads happen more often than writes. For hot queries, you'll want the denormalized form available: the data in the store is the DTO that you are going to send in response to the query. For example, if the query supports a report on the order (no edits allowed), then you don't need to send the item ids either.
{
    Title: "Your order #{Order1}",
    Items: [
        {Name: "Foosaver 9000"},
        {Name: "Foosaver 7500"}
    ]
}
One thing that you might consider is tracking the versions of the aggregates in question, so that when the user navigates from one view to the next, rather than getting a stale projection, the query pauses and waits for the new projection to catch up.
For instance, if your DTO were hypermedia, then it might look something like
{
    Title: "Your order #{Order1}",
    refreshUrl: /orders/Order1?atLeastVersion=20,
    Items: [
        {Name: "Foosaver 9000", detailsUrl: /items/Item1?atLeastVersion=7},
        {Name: "Foosaver 7500", detailsUrl: /items/Item2?atLeastVersion=9}
    ]
}
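A minimal sketch of how the query side might honour such an atLeastVersion hint, assuming a hypothetical store that records the projected version alongside each DTO:

// Wait (here by simple polling) until the projection has caught up to the
// version the client asked for. The store API and DTO shape are hypothetical.
public async Task<OrderDto> GetOrderAsync(string orderId, long atLeastVersion)
{
    while (true)
    {
        var dto = await _store.LoadAsync<OrderDto>(orderId);
        if (dto != null && dto.Version >= atLeastVersion)
            return dto;

        // A real implementation might subscribe to projection checkpoints,
        // or time out and return an error, instead of polling forever.
        await Task.Delay(TimeSpan.FromMilliseconds(100));
    }
}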

I also had this problem and tried different things. I read this suggestion, and while I have not tried it yet, I think it may be the best way to go: just enrich events before publishing them.
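For illustration, a minimal sketch of what enriching an event before publishing could look like, assuming a hypothetical ItemAdded event and item lookup; the idea is simply to copy whatever consumers will need into the event itself:

// Enrich the event before publishing so downstream projections never have
// to join against another store. The event type and lookup are hypothetical.
public class ItemAddedEnricher
{
    private readonly IItemLookup _items;

    public ItemAddedEnricher(IItemLookup items) { _items = items; }

    public ItemAdded Enrich(ItemAdded e)
    {
        // Capture the item name as it was when the command was handled.
        var item = _items.Get(e.ItemId);
        return new ItemAdded(e.ItemId, e.OrderId, item.Name);
    }
}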

Related

Organizing a Cloud Firestore database

I can't work out the best way of organizing my database for my app:
My users can create items identified by a unique ID.
The queries I need:
- Query 1: Get all the items created by a user
- Query 2: From the UID of an item, get its creator
My database is organized as follows:

Users database
user1: {
    item1_uid,
    item2_uid
},
user2: {
    item3_uid
}

Items database
item1_uid: {
    title,
    description
},
item2_uid: {
    title,
    description
},
item3_uid: {
    title,
    description
}
Query 1 is quite simple, but for query 2 I need to scan the whole users database, listing all the item IDs, to find the one I'm looking for. It works right now, but I'm afraid the request time will grow as the database does.
Should I add a field with the user ID to the items data? If so, the query will be simpler, but I've heard that I'm not supposed to store the same data twice in the database because it can lead to conflicts when adding or removing items.
Should I add a field with the user ID to the items data?
Yes, this is a very common approach in the NoSQL world and is called denormalization. Denormalization is described, in this "famous" post about NoSQL data modeling, as "copying of the same data into multiple documents in order to simplify/optimize query processing or to fit the user’s data into a particular data model". In other words, the main driver of your data model design is the queries you plan to execute.
More concretely, you could have an extra field in your item documents which contains the ID of the creator. You could even have another one with, e.g., the name of the creator: this way, in one query, you can display the items and their creators.
Now, for maintaining these different documents in sync (for example, if you change the name of one user, you want it to be updated in the corresponding items), you can either use a Batched Write to modify several documents in one atomic operation, or rely on one or more Cloud Functions that would detect the changes of the user documents and reflect them in the item documents.
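For illustration, a sketch of the batched-write approach using the Google.Cloud.Firestore .NET client (the question is not tied to a language; the collection and field names creatorId and creatorName are assumptions):

using Google.Cloud.Firestore;

// Rename a user and propagate the new name to all of their item documents
// in one atomic batched write (note: a single batch is limited to 500 writes).
public static async Task RenameUserAsync(FirestoreDb db, string userId, string newName)
{
    WriteBatch batch = db.StartBatch();
    batch.Update(db.Collection("users").Document(userId), "name", newName);

    // Query 2 becomes trivial once each item stores its creator's ID.
    QuerySnapshot items = await db.Collection("items")
                                  .WhereEqualTo("creatorId", userId)
                                  .GetSnapshotAsync();
    foreach (DocumentSnapshot item in items.Documents)
        batch.Update(item.Reference, "creatorName", newName);

    await batch.CommitAsync(); // all writes succeed or fail together
}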

Problem with one controller handling two forms (models)

I have to implement this form, which at first seemed easy to me. For the models part, I created a one-to-many relationship between GeneralInformation and CourseList (very obvious). In GeneralInformation I included the bottom section as well, with the 'Comments', 'Remarks', etc.
The problem is that before submitting the General Information section, you fill in the Course List, which gives an error for the FK constraint (again, obvious). For the Course List I'm using the DevExtreme data grid.
The only solution I came up with is to create another table which keeps the ID of GeneralInformation and each Course ID - a similar solution to many-to-many relationships. If this seems like a viable solution, how do you store the IDs of the Courses in the controller, and then the ID of GeneralInformation, to put them in the database? Right now I think I'm handling two models with the same controller, which might not be optimal, or might even go against the guidelines of .NET Core.
If someone has a better solution, it would be greatly appreciated.
Your model can look like this:
public class Model_Name // enter your model name
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    // other properties from the General Information form
    public List<Course> Courses { get; set; } // Course is a separate model
    public string Remarks { get; set; }
}
In the HTML, do not map Course to GeneralInformation. Post the form and pass all the fields as JSON like below:
{'FirstName':'Test','LastName':'Test','Courses':[{'Id':1}], 'Remarks':'test'}
So you're basically asking how to bind to a collection in a model.
Property binding would look like this:
<input asp-for="@Model.Collection[i].Property" />
What I did was store the course table (or grid) row as a partial view that takes an index as an argument along with the course model. At the same time, I maintain a count of the courses using a data-id attribute on each row. Every time the user wants to add a course, I use AJAX, with the index of the last course as a parameter, to fetch a new row rendered with an incremented index, and I append it to the table.
That way, when you submit your form, it will bind to a collection whatever its length is.
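For illustration, the indexed inputs could come from a Razor loop like this (assuming the model exposes the Courses list from the answer above, and a hypothetical Name property on Course; ASP.NET Core model binding rebuilds the collection from the indices embedded in the input names):

@* Renders names like Courses[0].Name, Courses[1].Name, ... so that model
   binding can reconstruct the list on post. *@
@for (var i = 0; i < Model.Courses.Count; i++)
{
    <input asp-for="@Model.Courses[i].Id" type="hidden" />
    <input asp-for="@Model.Courses[i].Name" />
}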

Writing a DTO class to build an object from information in different data sources

I'm following Domain Driven Design for this project.
I have got an object containing an image. Let's call it Product:
class Product {
    UniqueID id;
    ProductName name;
    ImageBytes imageBytes;
}
UniqueID, ProductName and ImageBytes are just validated objects that represent respectively a String, a String and a List<int>.
I would like to store the actual image in Firebase Storage and save the imageId in Firestore.
So, in my idea, I have an image in Firebase Storage with id xYF87Ejid0093RTcxaWpof and a doc in Firestore containing this id instead of the actual image.
The problem I'm stuck on is writing the Data Transfer Object of a Product. How should I convert an imageId into the actual image?
Please consider that I'm using DDD, so my DTO and my Entity classes are Unions (using Freezed).
I think I should have an intermediate class called FirestoreProduct at the Infrastructure level that looks like this:
class FirestoreProduct {
    UniqueID id;
    ProductName name;
    UniqueID imageId;
}
That way I can write a DTO that uses this class instead, and I can create the Product object in the repository class after I've downloaded the image.
Is there any better way to solve this problem in the DDD way?
Thanks in advance.
Do you really need the ImageBytes to perform the business logic of your Product entity? I would even guess that your Product is an aggregate root and thus has data and corresponding behaviour (business logic) in it.
So, from my point of view, the model of your FirestoreProduct is closer to a domain model than your Product class.
I consider your image a separate aggregate, which can reside in the same service but in different storage, or which could even live in a separate service.
Either way, the Product aggregate should only need a reference to the image. I would model it something like this:
class Product {
    ProductId id;
    ProductName name;
    ImageId imageId;
}
where ProductId and ImageId are value objects for strongly-typed ids.
I expect the storing/upload of a new image to be performed in a separate transaction from creating/updating the product itself. That means that when you create a new product, or perform some business logic that changes it, the image has already been uploaded to Firebase Storage and you only work with the image id in your Product aggregate.
Your Product DTO (you could also call it a view model), on the other hand, which you use for providing data to the UI (i.e. for reading data), can look different from the Product aggregate. This is okay and also makes total sense.
So the DTO would look something like this instead:
class ProductDto {
    UniqueID id;
    ProductName name;
    ImageBytes imageBytes;
}
Note: I don't know if ImageBytes is the right type for the DTO, as my Flutter knowledge is limited, but I hope you get the idea.
With that, you can bypass the Product aggregate's domain repository completely and have another service class that gives you all the data you need for reading/viewing the product data. As you do not change anything by reading data, you do not go through your domain model, and you optimize for reads.
The code which then builds the DTO will go to your persistence to query some product data, but also to Firebase to query the actual image. You could even load the actual Firebase image afterwards with a separate call from the UI if performance is an issue, for instance when you retrieve a whole list of product data for reading at once.
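To make that separation concrete, here is a minimal read-side sketch (in C#, since the question's types are language-agnostic; the repository and storage interfaces are hypothetical):

// A query service that bypasses the domain repository: it fetches the
// product record and the image separately and assembles the DTO.
public class ProductQueryService
{
    private readonly IProductRecords _records; // Firestore documents
    private readonly IImageStorage _images;    // Firebase Storage

    public ProductQueryService(IProductRecords records, IImageStorage images)
    {
        _records = records;
        _images = images;
    }

    public async Task<ProductDto> GetAsync(string productId)
    {
        var record = await _records.GetAsync(productId);          // id, name, imageId
        var bytes  = await _images.DownloadAsync(record.ImageId); // the actual image
        return new ProductDto(record.Id, record.Name, bytes);
    }
}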

Displaying a list of a customer's contacts in my details view

My overall goal is to be able to display all Customer information in my CustomerController -> Details view, and I have no idea how to pass all of this data:
Customer info - (fields directly in Customer table)
Customer contacts
Customer locations
Customer documents
I am able to display the customer info just fine because I am doing something like this and passing it to my view:
public ActionResult Details(int? id)
{
    Model.Customer customer = _custService.GetCustByID(id);
    return View(customer);
}
I have absolutely no idea how to go from just this, which gives me access to the direct Customer properties in my view, to displaying all of these other lists of related customer items that come from separate tables.
Here are a few details on my project setup:
I am using EF6 Database First, and I also create a DTO for each of my entities and use AutoMapper to map everything. My Customer.cs model has all of the customer's direct properties, and also a couple that look like this for the things that are one-to-many relationships:
public List<CustomerContact> Contacts { get; set; }
Now in my AutoMapper config I did the following since my table is named CustomerContacts in EF:
CreateMap<Customer, Model.Customer>()
    .ForMember(dest => dest.Contacts, opt => opt.MapFrom(src => src.CustomerContact));
CreateMap<Customer_GetAll_Result, Model.Customer>();
I am not sure if this is the proper way to do this with AutoMapper, or if there is anything else I have to tell it when accessing other tables. I think the only reason I have to map this property is that I want to change the name to just c.Contacts rather than c.CustomerContacts.
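For the lists to be populated at all, the EF6 query behind GetCustByID also has to load the navigation properties, for example with eager loading. A sketch, where the navigation property names are assumptions based on the question:

// Inside _custService.GetCustByID: eagerly load the related tables so the
// mapped Model.Customer has its lists filled. Requires using System.Data.Entity;
var entity = context.Customers
                    .Include(c => c.CustomerContact)
                    .Include(c => c.CustomerLocation)   // hypothetical names
                    .Include(c => c.CustomerDocument)
                    .SingleOrDefault(c => c.CustomerID == id);
return _mapper.Map<Model.Customer>(entity);

The view can then simply iterate the mapped collections, e.g. foreach over Model.Contacts.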
Side question:
As you can see, I am also trying to map values that come from my GetAll stored procedure, which my custService.GetAll() uses in my controller to bind a grid of customers. I don't think it matters in this case, but I assume that if I ever need to loop through the list that comes from the proc and get the contacts for each customer, they will not be available, since my stored proc only returns direct customer properties - which is why I can't do the mapping for that one. Is there any workaround for this?

Keep priority after update with AngularFire

I have a database with exercises, and each exercise has a field userId which references the user that the exercise belongs to. I want to retrieve every exercise that belongs to the current user, so I have assigned each exercise a priority corresponding to its userId. I can then retrieve every exercise for a specific user like this:
collection: function(user_id) {
    return angularFireCollection(new Firebase(FBURL + '/exercises').startAt(user_id).endAt(user_id));
},
Which is called from a controller like this:
$scope.findExercises = function() {
    $scope.exercises = Exercises.collection($scope.auth.id);
};
The problem is that when I update an exercise, the priority seems to be removed, since the exercise no longer turns up in the collection. The exercise is updated via the realtime binding angularFire(). How can I make updates and keep the priority, or reset it?
