What are some popular frameworks available for implementing CQRS, Event Sourcing and Saga for data consistency and distributed transactions?

I would like to know some popular frameworks that are available for implementing CQRS, ES, Saga in the application.
As a part of my research, I have to compare these frameworks and evaluate them based on various -ilities.

I have to compare these [event-sourcing] frameworks and evaluate them based on various -ilities.
The premise of the question is that you need a framework to implement event sourcing but, in fact, you do not.
Greg Young, one of the most influential proponents of event sourcing, frequently expresses his misgivings about frameworks. See, for instance, his QCon London 2013 keynote, especially around the 9-minute mark.
Event sourcing is conceptually simple and doesn't need the kind of magic that frameworks typically bring with them. For instance, rebuilding the state from a stream of events simply consists in a left fold over the stream in question. Moreover, you don't necessarily need a specialised database; I know people who have successfully implemented event sourcing by simply appending events to a file.
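For example, here is a minimal sketch in plain Java of that left fold, with hypothetical account events and no framework or database involved:

    import java.util.List;

    // Hypothetical event and state types, for illustration only.
    interface AccountEvent { long delta(); }
    record Deposited(long amount) implements AccountEvent { public long delta() { return amount; } }
    record Withdrawn(long amount) implements AccountEvent { public long delta() { return -amount; } }

    record AccountState(long balance) {
        // Apply a single event to produce the next state.
        AccountState apply(AccountEvent event) {
            return new AccountState(balance + event.delta());
        }
    }

    class Replay {
        // Rebuilding the current state is just a left fold over the event stream.
        static AccountState rebuild(List<AccountEvent> events) {
            AccountState state = new AccountState(0);   // initial state
            for (AccountEvent event : events) {         // fold from left to right
                state = state.apply(event);
            }
            return state;
        }
    }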
If your research aims at comparing event-sourcing frameworks, I would argue that you should consider the case where no framework is used at all.

Axon is a popular framework/server for building CQRS/ES applications.
EventStoreDB is a popular, purpose-built event store database for the event-sourcing part.
A simple starting point if you want to write your own framework/library is to check out some of the code I co-authored at https://www.cqrs.nu/
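For orientation, here is a rough sketch of what an event-sourced aggregate typically looks like in Axon (annotations and package names recalled from Axon 4.x; the gift-card domain and the command/event records are made up for illustration, so check the Axon documentation for the current API):

    import org.axonframework.commandhandling.CommandHandler;
    import org.axonframework.eventsourcing.EventSourcingHandler;
    import org.axonframework.modelling.command.AggregateIdentifier;
    import org.axonframework.modelling.command.AggregateLifecycle;

    // Hypothetical command and event for the example.
    record IssueCardCommand(String cardId, int amount) {}
    record CardIssuedEvent(String cardId, int amount) {}

    public class GiftCard {

        @AggregateIdentifier
        private String id;
        private int remainingValue;

        protected GiftCard() {
            // Required by Axon so it can recreate the aggregate and replay its events.
        }

        @CommandHandler
        public GiftCard(IssueCardCommand cmd) {
            // Decide, then apply an event; state only changes in the @EventSourcingHandler.
            AggregateLifecycle.apply(new CardIssuedEvent(cmd.cardId(), cmd.amount()));
        }

        @EventSourcingHandler
        public void on(CardIssuedEvent event) {
            this.id = event.cardId();
            this.remainingValue = event.amount();
        }
    }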

If you are looking for a managed solution, you can also check out what we at Serialized provide.

In addition to Axon, on the JVM there's also the Akka ecosystem (the cluster sharding, persistence, sharded daemon process, and projection modules are the most relevant to CQRS/ES/DDD). One benefit of Akka Persistence is the ability to choose from a variety of datastores to use as an event store (JDBC SQL databases and Cassandra are the most common, but there are many more supported). My experience with it has been that it is capable of exceptionally high availability and since it allows a stateful event-sourced application to be deployed as if it's stateless (e.g. in Kubernetes without needing an operator) there's a lot of deployment flexibility. Note that because it's built on the actor model, a lot of JVM observability tooling doesn't work particularly well with it (often assuming a stronger mapping of threads to tasks), so certain commercially-licensed observability tooling is recommended.
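To give a feel for the programming model, here is a rough counter-style sketch using the Akka Persistence Typed Java DSL (class and method names recalled from the Akka 2.6 documentation and not verified here; the choice of event store is configuration, not code):

    import akka.persistence.typed.PersistenceId;
    import akka.persistence.typed.javadsl.CommandHandler;
    import akka.persistence.typed.javadsl.EventHandler;
    import akka.persistence.typed.javadsl.EventSourcedBehavior;

    public class Counter extends EventSourcedBehavior<Counter.Increment, Counter.Incremented, Counter.State> {

        public record Increment(int amount) {}
        public record Incremented(int amount) {}
        public record State(int value) {}

        public Counter(PersistenceId persistenceId) {
            super(persistenceId);
        }

        @Override
        public State emptyState() {
            return new State(0);
        }

        @Override
        public CommandHandler<Increment, Incremented, State> commandHandler() {
            // Persist an event for each command; which event store is used is decided in configuration.
            return newCommandHandlerBuilder()
                    .forAnyState()
                    .onCommand(Increment.class, (state, cmd) -> Effect().persist(new Incremented(cmd.amount())))
                    .build();
        }

        @Override
        public EventHandler<State, Incremented> eventHandler() {
            // State is rebuilt by replaying persisted events through this handler.
            return newEventHandlerBuilder()
                    .forAnyState()
                    .onEvent(Incremented.class, (state, evt) -> new State(state.value() + evt.amount()))
                    .build();
        }
    }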
Additionally, Kalix provides a polyglot event-sourcing implementation (all you need is to express your domain logic in a language that supports gRPC).
Disclaimer: since answering this question (almost a year later), I became employed by Lightbend, the maintainers of Akka and the provider of Kalix.

Related

Axon Framework: Should microservices share events?

We are migrating a monolithic application to a more distributed architecture, and we decided to use AxonFramework.
In Axon, as messages are first-class citizens, you get to model them as POJOs.
Now I wonder: since one event can be dispatched by one service and listened to by any of the others, how should we handle event distribution?
My first impulse is to package them in a separate project as a JAR file, but this goes against a rule for microservices, that they should not share implementations.
Any suggestion is welcome.
Having some form of 'common' module is definitely not uncommon, although I'd personally use that 'common' module for that specific application alone.
I'd generally say you should regard your commands/events/queries as the API of your application. As such, it might be beneficial to share the event structure with other projects, but just not the actual POJO itself. You could, for example, think about using ProtoBuf for this use case, where ProtoBuf describes a schema for your events.
Another thing to think about is to not expose your whole 'event API'. Typically you'll have quite a few fine-grained events, things which other (micro)services in your environment are not interested in. There are, however, always a couple of 'very important events', or put differently, 'milestone events', which others definitely are interested in.
These milestone events in some scenarios aren't a direct POJO following from your domain, but rather an accumulation of several events.
It is thus not too uncommon to have a service which accumulates these and publishes another event to notify other services. Accumulating these fine-grained, internal events and publishing a milestone event in response to them is typically better suited as the event API within your microservice architecture.
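As a rough sketch of that accumulation idea in plain Java (the event types and the publisher interface are hypothetical, not part of any framework API):

    import java.util.HashSet;
    import java.util.Set;

    // Hypothetical fine-grained internal events.
    record ItemAdded(String orderId, String item) {}
    record PaymentReceived(String orderId) {}

    // Hypothetical coarse-grained milestone event shared with other services.
    record OrderCompleted(String orderId, Set<String> items) {}

    // Hypothetical outbound publisher (e.g. backed by a message broker).
    interface MilestonePublisher {
        void publish(OrderCompleted event);
    }

    // Accumulates fine-grained events and emits a milestone event once the
    // interesting condition is reached; only the milestone type is shared.
    class OrderMilestoneTranslator {
        private final Set<String> items = new HashSet<>();
        private final MilestonePublisher publisher;

        OrderMilestoneTranslator(MilestonePublisher publisher) {
            this.publisher = publisher;
        }

        void on(ItemAdded event) {
            items.add(event.item());
        }

        void on(PaymentReceived event) {
            publisher.publish(new OrderCompleted(event.orderId(), Set.copyOf(items)));
        }
    }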
So that's a couple of ideas there for you, hope they give you some insights.
I'd like to give a clear-cut solution to your question, but such an answer always hides behind 'it depends'.
You are right, the "official" rule is not to share models. So if you have distributed dev-teams, I would stick to it.
However, I tend not to follow it strictly when I have components that are decoupled but developed by the same team, or by teams with a high level of interaction ...

Infrastructure mobility via Onion architecture - practical implications

One of the key benefits provided by Onion architecture is the ability to swap out "infrastructure" elements, such as "Data Access, I/O, and Web Services" (http://jeffreypalermo.com/blog/the-onion-architecture-part-3/).
Jeff says in his post from 2008 that "the industry has modified data access techniques at least every three years".
Does anyone have an example of a reasonably large project where Onion architecture was used and swapping out of key infrastructure elements was subsequently undertaken?
I'm interested to understand:
How common is this scenario, in general?
My instinct tells me that while "data access techniques" may be modified every three years, changes to actual infrastructure for running solutions, which would allow this benefit to be realised, may be a lot less frequent?
What were the conditions that the solution was operating under originally?
What caused the change in the underlying infrastructure?
Are there lessons to be learned about the practical implications of changing infrastructure in this way, which may allow us to refine original implementations of the Onion architecture?
I'm interested to hear whether there were unexpected changes required beyond just replacing the infrastructure component and implementing the same interface. For example, did the new infrastructure require new arguments to be passed to previously defined methods e.g. SaveOrder(int ID) -> SaveOrder(int ID, bool AllowSiblings, bool SiblingCreated) when moving from a Relational to NoSQL DB model.
Did the implementation of this architecture + rework to migrate to new infrastructure significantly decrease the total effort required, if compared to a traditional, coupled approach?
Do developers find coupled, hard-referenced code easier to write and debug than loosely coupled, indirectly referenced code, but the eventual payoff for infrastructure changes makes this worth it?
Well, IMHO, the primary intent of such architecture styles (Hexagonal, Ports & Adapters, Onion …) is that they allow you to focus on your domain and on how you will deliver value, instead of focusing first on UI, frameworks or storage issues. They allow you to defer such decisions.
As Jeffrey says, the ability to swap out "infrastructure" elements is a nice side effect of this architecture style. Even if you will not switch from one RDBMS to another every 6 months, it's quite reassuring to know that it would be possible to do so without pain.
Rather than thinking about changing your storage mechanism on a regular basis or, as you said, "swapping out of key infrastructure elements", just think about the third-party services that you plug into your system. Those are likely to change on a regular basis; you might also switch from one provider to another. This is a much more common scenario, one we face on a regular basis. In this particular case, the domain behavior won't change, the interfaces will stay the same, and you won't have to change a single line of code in your core domain layer. Only the implementation made somewhere in your infrastructure layer might have to change. That's another noteworthy benefit of that kind of architecture!
Please read this nice Uncle Bob article about Clean Architecture, where he explains why the ability to defer critical infrastructure decisions is really cool!
--- EDIT ---
Could you provide an example of where you have swapped out a third party service?
We have tons of examples where we switched from one provider to another (from payment providers to live feed providers or whatever provider). The business stays the same, and the domain behaviors are still the same. Changing a provider should not have any kind of impact on your business. You don't have to change the way your business works, where the value really is, just because you change from one provider to another; that would make no sense. Isolating your domain behaviors in an independent core layer, with no dependencies on any third-party libraries, frameworks or provider services, definitely helps you deal with changes.
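As a small illustration of the provider-switching point (all names are hypothetical), the domain owns a port and each provider gets its own adapter in the infrastructure layer:

    import java.math.BigDecimal;

    // Port: defined in the core/domain layer, no provider types leak in.
    interface PaymentGateway {
        void charge(String customerId, BigDecimal amount);
    }

    // Adapter for provider A, lives in the infrastructure layer.
    class ProviderAPaymentGateway implements PaymentGateway {
        @Override
        public void charge(String customerId, BigDecimal amount) {
            // call provider A's SDK / HTTP API here
        }
    }

    // Adapter for provider B; swapping providers means swapping this class only.
    class ProviderBPaymentGateway implements PaymentGateway {
        @Override
        public void charge(String customerId, BigDecimal amount) {
            // call provider B's SDK / HTTP API here
        }
    }

    // The domain service depends only on the port, so it never changes when the provider does.
    class CheckoutService {
        private final PaymentGateway payments;

        CheckoutService(PaymentGateway payments) {
            this.payments = payments;
        }

        void checkout(String customerId, BigDecimal total) {
            payments.charge(customerId, total);
        }
    }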
I have the feeling that you're trying to convince yourself whether to go with Onion. You might be on the wrong track if you only think about migrating to new infrastructure-related stuff (DB, third-party stuff...). Focus on your domain instead. Ask yourself if your domain is complex enough to require such an architecture style. Don't use a bazooka to kill a fly. As Simon Brown says: "Principles are good, but make sure they're realistic and don't have a negative impact"!
If your application is quite small, with no complex business domain, go for a classic n-tier architecture; that's OK. Don't change things just for the sake of it or just because of any buzzword. But also keep in mind that an isolated core business layer without dependencies, as in Onion architecture, might be very easy to unit test!
Now for your additional questions:
Did the implementation of this architecture + rework to migrate to new infrastructure significantly decrease the total effort required, if compared to a traditional, coupled approach?
It depends! :-) In tightly coupled applications, as soon as there's a new infrastructure element to be migrated, there is little doubt that you'll have to modify code in every layer (including the business layer). But if the application is small, quite straightforward, and well organized with decent test coverage, this shouldn't be a big deal. Now, if it's quite big, with a more complex business domain, it might be a good idea to isolate that layer in a totally separate layer with no dependencies at all, ensuring that infrastructure changes won't cause any business regression.
Do developers find coupled, hard-referenced code easier to write and debug than loosely coupled, indirectly referenced code, but the eventual payoff for infrastructure changes makes this worth it?
Well, ask your teammates! Are they used to working with IoC? Remember that architecture design and choices must be a team decision; this must be something shared by the whole team.

Workflow Foundation - Can I make it fit?

I have been researching Workflow Foundation for a week or so now, but I have been aware of it, and of the concepts and use cases for it, for many years; I just never had the chance to dedicate any time to going deeper.
We now have some projects where we would benefit from centralized business logic exposed as services. As these projects require many different interfaces on different platforms, I can see the "Business Logic Silos" occurring.
I have had a play around with some proofs of concept to discover what is possible and how it can be achieved, and I must say, it's a bit of a fundamental phase shift for a regular C# developer.
There are 3 things that I want to achieve:
Runtime-instantiated state machines
Customizable by the user (perform different tasks in different orders and have unique functions called between states).
WCF exposed
So I have gone down the route of testing state machine workflows, xamlx WCF services, AppFabric-hosted services with persistence and monitoring, loading xamlx services from the database at runtime, etc., but all of these examples seem not to play nicely together. For example, a hosted state machine service, when in AppFabric, has issues with the sequence of service method calls, such as:
"Operation 'MethodName' on service instance with identifier 'efa6654f-9132-40d8-b8d1-5e611dd645b1' cannot be performed at this time. Please ensure that the operations are performed in the correct order and that the binding in use provides ordered delivery guarantees".
Also, if you load workflow service instances at runtime from a SQL store, they cannot be tracked in AppFabric.
I would like to Thank Ron Jacobs for all of his very helpful Hands On Labs and blog posts.
Are there any examples out there that anyone knows of that will tie together all of these concepts?
Am I trying to do something that is not possible or am I attempting this in the right way?
Thanks for all your help and any comments that you can make to assist.
Nick
Regarding the error, it seems like you have modified the WF once deployed (is that #2 in your list?), hence the error you mention.
Versioning (or in this case, modifying a WF after it's been deployed) is something that will be improved in the coming version, but I don't think it will achieve what you need in #2 (if that is what I understood), as the same WF is used for every instance.

Transaction management in Web services

Our client follows SOA principles and has designed web services that are very fine-grained, like createCustomer, deleteCustomer, etc.
I am not sure if fine-grained services are desirable, as they create transaction-related issues. For example, suppose a business requirement is that every Customer must have an Address when it's created. In this case, the presentation component will invoke createCustomer first and then createAddress. The services internally use plain JDBC to update the respective tables in the DB. As a service is invoked by an external component, it has no way of fulfilling the transactional requirement here, i.e. if createAddress fails, the createCustomer operation must be rolled back.
I guess one approach to deal with this is to either design coarse-grained services (that create a Customer and associated Address in one single JDBC transaction), or
perhaps simply create a reversing service (deleteCustomer) that reverses the action of createCustomer.
Any suggestions? Thanks.
The short answer: services should be designed for the convenience of the service client. If the client is told "call this, then don't forget to call that", you're making their lives too difficult. There should be a coarse-grained service.
A long answer: Can a Customer reasonably be entered with no Address? So we call
createCustomer( stuff but no address)
and the result is a valid (if maybe not ideal) state for a customer. Later we call
changeCustomerAddress ( customerId, Address)
and now the persisted customer is more useful.
In this scenario the API is just fine. The key point is that the system's integrity does not depend upon the client code "remembering" to do something, in this case to add the address. However, more likely we don't want a customer in the system without an address in which case I see it as the service's responsibility to ensure that this happens, and to give the caller the fewest possibilities of getting it wrong.
I would see a coarse-grained createCompleteCustomer() method as by far the best way to go - this allows the service provider to solve the problem once rather then require every client programmer to implement the logic.
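A minimal JDBC sketch of that coarse-grained operation (the table names, columns and method signature are made up for the example):

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;
    import javax.sql.DataSource;

    class CustomerService {
        private final DataSource dataSource;

        CustomerService(DataSource dataSource) {
            this.dataSource = dataSource;
        }

        // Creates the customer and its address in one local transaction:
        // either both rows are committed or neither is.
        void createCompleteCustomer(String customerId, String name, String address) throws SQLException {
            try (Connection con = dataSource.getConnection()) {
                con.setAutoCommit(false);
                try (PreparedStatement insertCustomer =
                             con.prepareStatement("INSERT INTO customer (id, name) VALUES (?, ?)");
                     PreparedStatement insertAddress =
                             con.prepareStatement("INSERT INTO address (customer_id, line1) VALUES (?, ?)")) {
                    insertCustomer.setString(1, customerId);
                    insertCustomer.setString(2, name);
                    insertCustomer.executeUpdate();

                    insertAddress.setString(1, customerId);
                    insertAddress.setString(2, address);
                    insertAddress.executeUpdate();

                    con.commit();
                } catch (SQLException e) {
                    con.rollback();   // neither the customer nor the address is persisted
                    throw e;
                }
            }
        }
    }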
Alternatives:
a) There are Web Services specs for Atomic Transactions, and major vendors do support these specs. In principle you could actually implement using fine-grained methods and true transactions. Practically, I think you enter a world of complexity when you go down this route.
b) A stateful interface (work, work, commit), as mentioned by @mtreit. Generally speaking, statefulness either adds complexity or obstructs scalability. Where does the service hold the intermediate state? If in memory, then we require affinity to a particular service instance and hence introduce scaling and reliability problems. If in some state or work-in-progress database, then we have significant additional implementation complexity.
OK, let's start:
Our client follows SOA principles and has designed web services that are very fine-grained, like createCustomer, deleteCustomer, etc.
No, the client has forgotten to reach the SOA principles and put up what most people do - a morass of badly defined interfaces. Following SOA principles, the client would have gone for a coarser interface (such as, for example, the OData mechanism to update data) or followed the advice of any book on multi-tiered architecture written in the last 25 years. SOA is just another word for what was invented with CORBA, and all the mistakes SOA dudes make today were basically well-known design stupidities 10 years ago with CORBA. Not that any of the people doing SOA today have ever heard of CORBA.
I am not sure if fine-grained services are desirable, as they create transaction-related issues.
Only for users and platforms not supporting web services. Seriously. Naturally you get transactional issues if you - ignore transactional issues in your programming. The trick here is that the people further up the food chain did not; just your client decided to ignore common knowledge (again, see my first remark on CORBA).
The people designing web services were well aware of transactional issues, which is why the web service specifications (WS-*) actually contain mechanisms for handling transactional integrity by moving commit operations up to the client calling the web service. The particular spec your client and you should read is WS-AtomicTransaction.
If you use the current technology to expose your web service (a.k.a. WCF on the MS platform; similar technologies exist in the Java world) then you can expose transaction flow information to the client and let the client handle transaction demarcation. This has its own share of problems - like clients keeping transactions open maliciously - but is still pretty much the only way to handle transactions that do get defined in the client.
As you give no platform and just mention Java, I am pointing you to an MS example of how that can look:
http://msdn.microsoft.com/en-us/library/ms752261.aspx
Web services, in general, are a lot more powerful and a lot more thought out than most people doing SOA ever realize. Most of the problems they see have been solved a long time ago. But then, SOA is just a buzzword for multi-tiered architecture, and most people who think it is the greatest thing since sliced bread just don't even know what was around 10 years ago.
As your customer I would be a lot more careful about the performance side. Fine-grained, non-semantic web services like he defines are a performance hog for non-casual use, because the number of times you cross the network to ask for / update small small small small stuff means the network latency kills you. Creating an order for, say, 10 goods can easily take 30-40 network calls in this scenario, which can really take a lot of time. SOA has preached, ever since the beginning (if you ignore the ramblings of those who don't know history), NOT to use fine-grained calls but to go for a coarse-grained exchange of documents and/or a semantic approach, much like the OData system.
If transactionality is required, a coarser-grained single operation that can implement transaction-semantics on the server is definitely going to be much simpler to implement.
That said, certainly it is possible to construct some scheme where the target of the operations is not committed until all of the necessary fine-grained operations have succeeded. For instance, have a Commit operation that checks some flag associated with the object on the server; the flag is not set until all of the necessary steps in the transaction have completed, and Commit fails if the flag is not set.
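A rough sketch of such a flag-based scheme (all names are hypothetical; persistence of the staged data, error handling and expiry of abandoned drafts are omitted):

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Server-side draft that tracks which fine-grained steps have completed.
    class CustomerDraft {
        volatile boolean customerCreated;
        volatile boolean addressAdded;

        boolean readyToCommit() {
            return customerCreated && addressAdded;
        }
    }

    class FineGrainedCustomerService {
        private final Map<String, CustomerDraft> drafts = new ConcurrentHashMap<>();

        void createCustomer(String draftId, String name) {
            drafts.computeIfAbsent(draftId, id -> new CustomerDraft()).customerCreated = true;
            // ... stage the customer row ...
        }

        void createAddress(String draftId, String address) {
            drafts.computeIfAbsent(draftId, id -> new CustomerDraft()).addressAdded = true;
            // ... stage the address row ...
        }

        // Commit refuses to make the customer visible until every required step has run.
        void commit(String draftId) {
            CustomerDraft draft = drafts.get(draftId);
            if (draft == null || !draft.readyToCommit()) {
                throw new IllegalStateException("Customer draft " + draftId + " is incomplete");
            }
            // ... move staged data into the real tables, then ...
            drafts.remove(draftId);
        }
    }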
Of course, if having light-weight, fine grained operations is an important design requirement, perhaps the need to have transactionality should be re-thought.

Asynchronously Decoupled Three-Tier Architecture

Maybe I just expected "three-tier architecture" to deliver a little more than just a clean separation of responsibilities in the source code (see here)...
My expectations of such a beast that can safely call itself a "three-tier architecture" are a lot higher... so, here they are:
If you were to build something like a "three tier architecture" system but this time with these, additional requirements and constraints:
Up and running at all times from a user's point of view
Except when the UI gets replaced
When other parts of the system are down, the UI has to handle that
Never get into an undefined state or one from which the system cannot recover automatically
The system has to be "pausable"
The middle-tier has to contain all the business logic
Obviously using an underlying Database, itself in the data-tier (if you like)
The business logic can use a big array of core services (here in the data-tier, not directly accessible by the UI, only through business logic tier facade)
Can be unavailable at times
Can be available as many parallel running, identical processes
The UIs may not contain any state other than the session in the case of web UIs, and possibly transient view backing models
Presentation-tier, logic-tier and data/core-services-tier have to be scalable independently
The only thing you can take for granted is the network
Note: The mentioned "core services" are heavy-weight components that access various external systems within the enterprise. An example would be the connection to an Active Directory or to a "stock market ticker"...
1. How would you do it?
If you don't have an answer right now, maybe read on and let me know what you think about this:
Sync considered harmful. Ties your system together in a bad way (Think: "weakest link"). Thread blocked while waiting for timeout. Not easy to recover from.
Use asynchronous messaging for all inter-process communication (between all tiers). This allows you to suspend the system anytime you like. When part of the system is down, no timeout happens. (See the sketch after this list.)
Have a central routing component where all requests get routed through and core services can register themselves.
Add heartbeat component that can e.g. inform the UI that a component is not currently available.
State is a necessary evil: Allow no state other than in the business logic tier. This way the beast becomes manageable. While the core services might well need to access data themselves, all that data should be fed in by the calling middle tier. This way the core services can be implemented in a fire and forget fashion.
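As a toy sketch of the fire-and-forget idea, here in-process queues stand in for a real message broker and all names are hypothetical:

    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;

    // Stand-in for a durable message broker between the tiers.
    record TaskMessage(String correlationId, String payload) {}

    class AsyncTiers {
        // UI -> business tier and business tier -> core service queues.
        static final BlockingQueue<TaskMessage> businessInbox = new LinkedBlockingQueue<>();
        static final BlockingQueue<TaskMessage> coreServiceInbox = new LinkedBlockingQueue<>();

        public static void main(String[] args) throws InterruptedException {
            // Core service: stateless, fire-and-forget, all data arrives in the message.
            Thread coreService = new Thread(() -> {
                try {
                    while (true) {
                        TaskMessage msg = coreServiceInbox.take();
                        System.out.println("core service handled " + msg.correlationId());
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            coreService.setDaemon(true);
            coreService.start();

            // Business tier: owns the state, enriches the request, forwards it.
            Thread businessTier = new Thread(() -> {
                try {
                    while (true) {
                        TaskMessage msg = businessInbox.take();
                        coreServiceInbox.put(new TaskMessage(msg.correlationId(), msg.payload() + "+enriched"));
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            businessTier.setDaemon(true);
            businessTier.start();

            // UI tier: sends the request and returns immediately; no blocking call, no timeout.
            businessInbox.put(new TaskMessage("req-1", "render dashboard"));
            Thread.sleep(200); // give the toy pipeline time to run before the JVM exits
        }
    }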
2. What do you think about this "solution"?
I think that, in the real world, high-availability systems are implemented using fail-over: for example, it isn't that the UI can continue to work without the business layer, instead it's that if the business layer becomes unavailable then the UI fails over to using a backup instance of the business layer.
Apart from that, they might operate using store-and-forward: e.g. a mail system might store a piece of mail, and retransmit it periodically, if it can't deliver it immediately.
Yep, it's the way most large websites do it. Look at NoSQL databases, Google's Bigtable architecture, etc.
1. This is the general approach I'd take.
I'd use a mixture of memcached, a NoSQL cloud (CouchDB or MongoDB) and enterprise-grade RDBMS systems (core data storage) for the data layer. I'd then write the service layer on top of the data layer. NoSQL database APIs are massively parallel (look at CouchDB with its nginx service layer parallelizer). I'd then provide old-school "each request is a web page" generating web servers and also direct access to the service layer for new-style AJAX applications; both of these would depend on the service layer.
P.S. The RDBMS is an important component here; it holds the authoritative copy of all the data in the memcached/NoSQL cloud. I would use an enterprise-grade RDBMS to do data-centre to data-centre replication. I don't know how the big boys do their cloud-based site replication; it would scare me if they did data-cloud to data-cloud replication :P
Some points:
You do not need a heartbeat; with NoSQL the approach taken is that if content becomes unavailable, you regenerate it onto another server using the authoritative copy of the data.
The burden of stateless web design is carried by the NoSQL and memcached layer, which is infinitely scalable. So you do not need to worry about this. Just have a good network infrastructure.
In terms of sync, when you are talking to the RDBMS you can expect acceptable synchronous response times. You should treat your cloud as an asynchronous resource; you will get help from the APIs that interface with your cloud, so you don't even have to think about this.
Advice I can give about networking and redundancy is this: do not go for fancy Ethernet bonding, as it's not worth it -- things always go wrong. Just set up redundant switches and Ethernet cards, and have multiple routes to all your machines. You can use OpenBSD and CARP for your routers, as they work great - routers are your worst point of failure -- OpenBSD solves this problem.
2. You've described the general components of a Web 2.0 farm, so no comment :D

Resources