Initializing the publish-subscribe pattern - initialization

I am implementing a message passing framework using the publish-subscribe pattern. One optimization I would like to use is maintaining only a single list of messages per topic queue, where each subscriber has a position in the list signifying their position in the message queue. Once an event has been delivered to all subscribers it gets removed from the list. I am using the publish-subscribe framework to maintain an event sourcing pattern; the messages contain the changed attributes of the underlying state.
However, I am running into a problem when a new subscriber needs to get its initial state, or otherwise needs a snapshot of the underlying state. I would like to insert the snapshot into the message stream to guarantee that it gets processed in the correct order relative to the state change messages. However, if done naively, all subscribers will get the snapshot every time any one of them requests one.
At its core, the problem is that the messages are identical for all subscribers, except for the small number of snapshots. I am wondering if there is a known pattern to solve this situation, or if I should solve it ad hoc?
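To make the setup concrete, here is a minimal sketch of a single per-topic list with one cursor per subscriber, trimmed once every subscriber has moved past a message. The `recipient` field is a hypothetical addition (not something from the question) showing one possible way to keep a snapshot in stream order while having other subscribers skip it; all class and method names are assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Message:
    payload: Any
    # Hypothetical field: None means "for everyone"; otherwise only the
    # named subscriber processes the message (used here for snapshots).
    recipient: Optional[str] = None


@dataclass
class Topic:
    messages: List[Message] = field(default_factory=list)
    base: int = 0  # absolute stream index of messages[0]
    cursors: Dict[str, int] = field(default_factory=dict)  # subscriber -> absolute index

    def subscribe(self, name: str) -> None:
        # A new subscriber starts at the current end of the stream.
        self.cursors[name] = self.base + len(self.messages)

    def publish(self, payload: Any, recipient: Optional[str] = None) -> None:
        self.messages.append(Message(payload, recipient))

    def poll(self, name: str) -> List[Any]:
        """Return the payloads this subscriber should process, in stream order."""
        start = self.cursors[name] - self.base
        pending = self.messages[start:]
        self.cursors[name] = self.base + len(self.messages)
        self._trim()
        # Messages addressed to someone else are skipped, not reordered.
        return [m.payload for m in pending if m.recipient in (None, name)]

    def _trim(self) -> None:
        # Drop messages that every subscriber has already moved past.
        if not self.cursors:
            return
        delivered = min(self.cursors.values()) - self.base
        del self.messages[:delivered]
        self.base += delivered


topic = Topic()
topic.subscribe("a")
topic.publish({"x": 1})
topic.subscribe("b")
topic.publish({"snapshot": {"x": 1}}, recipient="b")  # only b needs this
topic.publish({"x": 2})
print(topic.poll("a"))  # a skips the snapshot addressed to b
print(topic.poll("b"))  # b sees the snapshot before the later change
```

The snapshot still occupies a slot in the shared list, so its ordering relative to the state changes is preserved; the trade-off is that every subscriber has to check the recipient before processing.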

Related

Two conflicting long lived process managers

Let's assume we have two long-lived process managers. Both sagas operate over, for example, 10 million items. The first saga adds something to each item. The second saga removes it from each item. Given that both process managers need a few minutes to complete their jobs, I get into trouble if I run them simultaneously.
Some of those items would hold the value while the rest would not. The result is close to random and depends on the order of the commands that affect a particular item. I wondered if redispatching the "Remove" command in case of failure would solve the problem. I mean, if you try to remove a non-existent value, you should wait for the first saga to add the value. But while the process managers are working, someone else may dispatch a "Remove" or "Add" command. In such a case my approach would fail.
How might I solve this problem? :)
It seems that you would want the second saga to not run if the first saga is running (and presumably not run until some process which depends on whatever the first saga added being there). So the apparent solution would be to have a component (could be a microservice, could also be a record in a strongly consistent datastore like zookeeper/etcd/consul) that gives permission for the sagas to start executing. An example protocol might look like:
Saga sends a message to the component identifying the saga and conveying the intention to start
Component validates that no sagas might be running which would prevent this saga from running
Component responds with permission to start running
Subsequent saga attempts result in rejection until the running saga tells the component it's OK to run the other saga
Assuming that this component is reliably durable, the failure mode to worry about is that permission is granted but this component never processes the message that the saga finished (causes of this could include the permission message not getting delivered/processed or the saga crashing). No amount of acknowledgements or extra messages can solve this (it's basically the Two Generals' Problem).
A mitigation is to have this component (or something watching this component) alert if it seems that too much time has passed without saga completion. Whatever/whoever is responsible for ensuring liveness would then investigate to see if the saga is still running and if none is running, inform the component that it's OK to run the other saga. Note that this is not foolproof: it's quite possible for the decider in question to make what turns out to be the wrong decision.
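The protocol and the timeout mitigation above can be sketched as a small in-memory gate. This is only an illustration under stated assumptions: `SagaGate`, its method names, and the conflict table are hypothetical, and a real component would need durable, strongly consistent storage rather than a Python dict.

```python
import time
from typing import Dict, List, Optional, Set


class SagaGate:
    """Grants or refuses permission for sagas to start (in-memory sketch)."""

    def __init__(self, conflicts: Dict[str, Set[str]], max_runtime_s: float):
        self.conflicts = conflicts          # saga -> sagas it must not overlap with
        self.max_runtime_s = max_runtime_s  # after this, a run looks suspicious
        self.running: Dict[str, float] = {}  # saga -> time permission was granted

    def request_start(self, saga: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        blocked = any(other in self.running for other in self.conflicts.get(saga, set()))
        if saga in self.running or blocked:
            return False  # rejection: a conflicting saga may still be running
        self.running[saga] = now
        return True

    def finished(self, saga: str) -> None:
        # The saga (or an operator who investigated) reports completion.
        self.running.pop(saga, None)

    def overdue(self, now: Optional[float] = None) -> List[str]:
        # Sagas that have held permission for suspiciously long; these
        # should raise an alert rather than be released automatically.
        now = time.monotonic() if now is None else now
        return [s for s, t in self.running.items() if now - t > self.max_runtime_s]


gate = SagaGate({"add": {"remove"}, "remove": {"add"}}, max_runtime_s=600)
print(gate.request_start("add", now=0))     # permission granted
print(gate.request_start("remove", now=1))  # rejected while "add" holds permission
print(gate.overdue(now=601))                # "add" now looks stuck
```

Note that `overdue` only reports; deciding whether the saga really died is left to whoever is responsible for liveness, for the reasons given above.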
I feel like I need more context. Whilst you don't say it explicitly, is the problem that the second saga tries to remove values that haven't been added by the first?
If that was true, a simple solution would be to just use a third state.
What I mean by that is to define and declare item state more explicitly. You currently seem to have two states, with value and without value, but nothing to indicate whether an item is ready to be processed by the second saga because the first saga has already done its work on the item in question.
So all that needs to happen is that the second saga keeps looking for items where:
(with_value == true & ready_for_saga2 == true)
Ready_for_saga2 or "Saga 1 processing complete", whatever seems more appropriate in your context.
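A minimal sketch of that third state, assuming a simple in-memory item model (the `Item` class and the step functions are hypothetical names used only for illustration):

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Item:
    id: int
    with_value: bool = False
    ready_for_saga2: bool = False  # "Saga 1 processing complete"


def saga1_step(item: Item) -> None:
    # Saga 1 adds the value and marks the item as ready in one step.
    item.with_value = True
    item.ready_for_saga2 = True


def saga2_candidates(items: List[Item]) -> List[Item]:
    # Saga 2 only ever picks items that saga 1 has finished with.
    return [i for i in items if i.with_value and i.ready_for_saga2]


def saga2_step(item: Item) -> None:
    item.with_value = False
    item.ready_for_saga2 = False


items = [Item(1), Item(2), Item(3)]
saga1_step(items[0])
print([i.id for i in saga2_candidates(items)])  # only item 1 is eligible so far
```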
I'd say that the solution would vary based on which actual problem we're trying to solve.
Say it's an inventory, where "add" means items added to the inventory and "remove" means items requested for delivery. Then the order of commands does not matter that much, because you could just process the request for delivery when new items are added to the inventory.
This would lead to an aggregate root with two collections: Items and PendingOrders.
One process manager adds new inventory to Items - if any orders are pending, it will complete these orders in the same transaction and remove both the item and the order from the collections.
If the other process manager adds an order (tries to remove an item), it will either do it right away, if there are any items left, or it will add the order to the pending orders to be processed when new items arrive (and maybe notify someone about the delay while we're at it).
This way we end up with the same state regardless of the order of commands, but the actual real-world-problem has great influence on the model chosen.
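As a rough sketch of that aggregate (the `Inventory` class, its method names, and the `fulfilled` list are assumptions made for the example, and transactions are left out), both command orders converge on the same state:

```python
from collections import deque


class Inventory:
    """Aggregate root with two collections: items and pending orders."""

    def __init__(self) -> None:
        self.items = deque()
        self.pending_orders = deque()
        self.fulfilled = []  # (order, item) pairs, kept only for the demo

    def add_item(self, item: str) -> None:
        if self.pending_orders:
            # A waiting order is completed as soon as stock arrives.
            order = self.pending_orders.popleft()
            self.fulfilled.append((order, item))
        else:
            self.items.append(item)

    def request_item(self, order: str) -> None:
        if self.items:
            self.fulfilled.append((order, self.items.popleft()))
        else:
            # Nothing in stock: park the order until new items arrive.
            self.pending_orders.append(order)


add_first = Inventory()
add_first.add_item("item-1")
add_first.request_item("order-1")

remove_first = Inventory()
remove_first.request_item("order-1")
remove_first.add_item("item-1")

print(add_first.fulfilled == remove_first.fulfilled)  # same outcome either way
```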
If we have other real-world problems, we can model those too.
Let's say you have two users who each start a process that bulk-updates titles on inventory items. In this case you, and the users, have to decide how best to resolve this conflict: what will lead to the best real-world outcome.
If you want consistency across all the items (all or no items should be updated by a single bulk update), I would embed this knowledge in a new model. Let's call it UpdateTitlesProcesses. We have only one instance of this model in the system. The state is shared between processes. This model is effectively a command queue, and when a user initiates the bulk operation, it adds all the commands to the queue and starts processing each item one at a time.
When the second user initiates another title update, the business logic in our models will reject this, as there's already another update started. Or if the experts say that the last write should win, then we ditch the remaining commands from the first process and add the new ones (and similarly we should decide what should happen if a user issues a single title update, not bulk - should it be rejected, prioritized or put on hold?).
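A small sketch of such a shared command queue, assuming the two policies mentioned above (reject, or last write wins) are selected by a flag. The class name follows the answer's suggestion in singular form; everything else is hypothetical.

```python
from collections import deque
from typing import Dict, Iterable, Tuple


class UpdateTitlesProcess:
    """Single shared queue of (item_id, new_title) commands."""

    def __init__(self, last_write_wins: bool = False) -> None:
        self.queue = deque()
        self.last_write_wins = last_write_wins

    def start_bulk(self, commands: Iterable[Tuple[int, str]]) -> bool:
        if self.queue:
            if not self.last_write_wins:
                return False  # another bulk update is still in progress
            self.queue.clear()  # ditch the remaining commands of the old update
        self.queue.extend(commands)
        return True

    def process_next(self, titles: Dict[int, str]) -> bool:
        # Apply one queued command; returns False when the queue is empty.
        if not self.queue:
            return False
        item_id, title = self.queue.popleft()
        titles[item_id] = title
        return True


titles = {1: "old", 2: "old"}
process = UpdateTitlesProcess()
print(process.start_bulk([(1, "A"), (2, "A")]))  # first user is accepted
process.process_next(titles)
print(process.start_bulk([(1, "B"), (2, "B")]))  # second user is rejected
```

With `last_write_wins=True` the second call would instead replace the remaining commands, which leaves already-processed items with the first user's title; whether that is acceptable is exactly the business decision discussed above.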
So in short I'd say:
Make it clear which real world problem we are solving - and thus which conflict resolution outcome is best (probably a trade off, often also something that requires user interaction or notification).
Model this explicitly (where processes, actions and conflict handling are also part of the model).

Return entity updated by axon command

What is the best way to get the updated representation of an entity after mutating it with a command?
For example, let's say I have a project like digital-restaurant and I want to be able to update a field on the restaurant and return its current state to the client making the update (to retrieve any modifications by different processes).
When a restaurant is created, it is easy to retrieve the current state (i.e. the projection representation) after dispatching the create command, by subscribing to a FindRestaurantQuery and waiting until a record is returned (see Restaurant CommandController).
However, it isn't so simple to detect when the result of an UpdateCommand has been applied to the projection. For example, if we use the same trick and subscribe to the FindRestaurantQuery, we will be notified if the restaurant has been modified, but it may not be our command that triggered the modification (in the case where multiple processes are concurrently issuing update commands).
There seem to be two obvious ways to detect when a given update command has been applied to the projection:
1. Have a unique ID associated with every update command.
   - Subscribe to a query that is updated when the command ID has been applied to the projection.
   - Propagate the unique ID to the event that is applied by the aggregate.
   - When the projection receives the event, it can notify the query listener with the current state.
2. Before dispatching an update command, query the existing state of the projection.
   - Calculate the destination state given the contents of the update command.
In the case of (1): is there any situation (e.g. batching / snapshotting) where the event carrying the unique ID may be skipped over somehow, preventing the query listener from being notified?
Is there a more reliable / more idiomatic way to accomplish this use case?
Axon 4 with Spring boot.
Although fully asynchronous designs may be preferable for a number of reasons, it is a common scenario that back-end teams are forced to provide a synchronous REST API on top of asynchronous CQRS+ES back-ends.
The part of the demo application that is trying to solve this problem is located here https://github.com/idugalic/digital-restaurant/tree/master/drestaurant-apps/drestaurant-monolith-rest
The case you are mentioning is totally valid.
I would go with option 1.
My only concern is that you have to introduce a new unique ID attribute, associated with every update command, to the domain (events). This ID attribute does not have any domain/business value in my opinion. There is an Audit (who, when) attribute associated with every event already, and maybe you can use that to correlate commands and subscriptions. I believe that there is more value in this solution (identity is part of the domain), if this is not too relaxed for your case.
Please note that queries have to be extended with Audit in this case (you will know who requested the query).
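The flow of option 1 can be sketched independently of any framework. The following is plain Python, not Axon API: `Projection`, `wait_for`, `on_event`, and `handle_update` are hypothetical names, and the command and event are plain dicts. It only illustrates the ordering that matters: subscribe first, dispatch second, and copy the command ID onto the event.

```python
import uuid
from concurrent.futures import Future
from typing import Dict


class Projection:
    """Read model that notifies a waiter when its command's event is applied."""

    def __init__(self) -> None:
        self.state: Dict[str, str] = {}          # restaurant_id -> name
        self.waiters: Dict[str, Future] = {}     # command_id -> pending result

    def wait_for(self, command_id: str) -> Future:
        # Must be called BEFORE the command is dispatched, otherwise the
        # event could be applied before anyone is listening.
        future: Future = Future()
        self.waiters[command_id] = future
        return future

    def on_event(self, event: dict) -> None:
        self.state[event["restaurant_id"]] = event["name"]
        waiter = self.waiters.pop(event["command_id"], None)
        if waiter is not None:
            # Reply with the current state, which may include other writers' changes.
            waiter.set_result(self.state[event["restaurant_id"]])


def handle_update(command: dict) -> dict:
    # Stand-in for the aggregate: the command ID is propagated to the event.
    return {
        "restaurant_id": command["restaurant_id"],
        "name": command["name"],
        "command_id": command["command_id"],
    }


projection = Projection()
command = {"restaurant_id": "r1", "name": "Fancy", "command_id": str(uuid.uuid4())}
result = projection.wait_for(command["command_id"])
# An unrelated concurrent update does not resolve our future.
projection.on_event(handle_update({"restaurant_id": "r1", "name": "Other", "command_id": "x"}))
print(result.done())
projection.on_event(handle_update(command))
print(result.result(timeout=0))
```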

What's the best way to generate ledger change Events that include the Transaction Command?

The goal is to generate events on every participating node when a state is changed that includes the business action that caused the change. In our case, Business Action maps to the Transaction command and provides the business intent or what the user is doing in business terms. So in our case, where we are modelling the lifecycle of a loan, an action might be to "Close" the loan.
We model Event at a state level as follows: Each Event encapsulates a Transaction Command and is uniquely identified by a (TxnHash, OutputIndex) and a created/consumed status.
We would prefer a polling mechanism to generate events on demand, but an asynchronous approach to generate events on ledger changes would be acceptable. Either way, our challenge is in getting the Command from the Transaction.
We considered querying the States using the Vault Query API vaultQueryBy() for the polling solution (or vaultTrackBy() for the asynchronous Observable stream solution). We were able to create a flow that gets the txn for a state. This had to be done in a flow, as Corda deprecated the function that would have allowed us to do this in our Spring Boot client. In the client we use vaultQueryBy() to get a list of States. Then we call a flow that iterates over the states, gets the txHash from each StateRef and then calls serviceHub.validatedTransactions.getTransaction(txHash) to get the signedTransaction, from which we can ultimately retrieve the Command. Is this the best or recommended approach?
Alternatively, we have also thought of generating events of the Transaction by querying for transactions and then building the Event for each input and output state in the transaction. If we go this route what's the best way to query transactions from the vault? Is there an Observable Stream-based option?
I assume this mapping of states to command is a common requirement for observers of the ledger because it is standard to drive contract logic off the transaction command and quite natural to have the command map to the user intent.
What is the best way to generate events that encapsulate the transaction command for each state created or consumed on the ledger?
If I understand correctly, you're attempting to get notified when certain types of ledger updates occur (open, approved, closed, etc.).
First: Asynchronous notifications are best practice in Corda, polling should be avoided due to the added weight it puts on the node for constant querying and delays. Corda provides several mechanisms for Observables which you can use: https://docs.corda.net/api/kotlin/corda/net.corda.core.messaging/-corda-r-p-c-ops/vault-track-by.html
Second: Avoid querying transactions from the database as these are intended to be internal to the node. See this answer for background on why to avoid transaction querying. In general only tables that begin with "VAULT_*" are intended to be queried.
One way to solve your use case would be a "status" field which reflects the command that was used to produce the current state. For example: if a "Close" command was used to produce the state, its status field could be "closed". This way you could use the above vaultTrackBy to look at each state's status field and infer the action that occurred.
Just to finish up on my comment: while the approach met the requirements, the problem with this solution is that we have to add and maintain our own code across all relevant states to capture transaction-level information that is already tracked by the platform. I would think a better solution would be for the platform to provide consumers access to transaction-level information (selectively perhaps) just as it does for states. After all, the transaction is, in part, a business/functional construct that is meaningful at the client application level. For example, if I am "transferring" a loan, that may be a complex business transaction that involves many input and output states and may be an important construct/notion for the client application to manage.

Cosmos DB ChangeFeed Exception Handling

With Cosmos DB ChangeFeed, can anyone please provide some help with exception handling?
Let's say I have 10 documents in the change feed and a loop that iterates through the documents one by one. Let's assume an exception occurs after the 5th document has been processed.
What is going to happen with the change feed?
So far, it looks to me as if the entire change feed batch is swallowed, i.e. the remaining documents after the exception are gone.
I am wondering what the back-out strategy is for this. Is there a way I can completely back out the entire batch so that I do not lose any changes?
It is an old question, but hopefully others may find it useful.
To handle the error, the recommended pattern is to wrap your code in a try-catch. Catch the error and put that document on a (dead-letter) queue. Have a separate program deal with the documents that produced the error. This way, if you have a 100-document batch and just one document fails, you do not have to throw away the whole batch.
A second reason is that if you rely on getting those documents from the change feed again, you may lose the last snapshot of the document. The change feed keeps only the latest version of a document, and in between other processes can come and change it.
As you keep fixing your code, you will soon find no documents on dead-letter queue.
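A minimal sketch of the per-document try-catch with a dead-letter store. It is plain Python with no Cosmos DB or Azure Functions calls; `process_batch`, `handle`, and the list used as a dead-letter queue are hypothetical stand-ins for your own handler and a real queue.

```python
from typing import Callable, Iterable, List, Tuple


def process_batch(
    docs: Iterable[dict],
    handle: Callable[[dict], None],
    dead_letter: List[Tuple[dict, str]],
) -> int:
    """Process every document; park failures instead of aborting the batch."""
    processed = 0
    for doc in docs:
        try:
            handle(doc)
            processed += 1
        except Exception as exc:  # deliberately broad: one bad document must not stop the rest
            # Store the document as it was when it failed, plus the error.
            dead_letter.append((doc, repr(exc)))
    return processed


def handle(doc: dict) -> None:
    # Hypothetical handler that fails on one specific document.
    if doc["id"] == 5:
        raise ValueError("cannot process document 5")


dead_letter: List[Tuple[dict, str]] = []
batch = [{"id": i} for i in range(1, 11)]
print(process_batch(batch, handle, dead_letter))  # documents processed successfully
print([doc["id"] for doc, _ in dead_letter])      # documents parked for later
```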
The Azure Function is called automatically by the change feed system. If you want to roll back the change feed and control every aspect of it, you should consider using the Change Feed Processor SDK.
The recommendation from MS is to add a try-catch in your CosmosDB trigger function. If any document throws an exception, you have to store it somewhere.
Once you start storing failed messages in some location, you have to build metrics, alerts and a re-processing strategy.
Below is my strategy to handle this scenario. One function of mine listens to the DB change feed and pushes data into a "Topic" (without any processing). I created multiple subscriptions, so each subscription maintains its own dead-letter queue.

WF4 receive activity to be able to CreateInstance AND handle subsequent correlation

I want to create a workflow that will be persistent and which will consist of a Pick activity containing the following:
A Receive pick activity (ReceiveItem) which can Create a WF Instance using an email address parameter for correlation AND can also be called again later with the same email address and be picked up in correlation to start up the correct persisted WF. Each item is added to a queue for later processing
A MaxItems pick activity which will force the processing of the queue when it reaches a defined size and
A Timer pick activity which will simply process all queued items at the end of the day
Please note: I want to receive the second and subsequent items via ReceiveItem with the same email address parameter.
My question is:
Will this work as I suggest or am I going to get correlation collisions because the Receive activity can CreateInstance? Or will WF simply create a WF Instance at the beginning and then always correlate after that?
If this will not work, how could I implement this with one single Receive activity and still get the benefit of a single workflow handling both the receive and batch operations?
That will work just fine. Check this blog post for an example of how to do that. The complete XAML is listed at the bottom if you want to inspect all Receive settings.

Resources