What's the best way to create/use an ID throughout the processing of a message in BizTalk?

Our program so far: We have a process that involves multiple schemata, orchestrations and messages sent/received.
Our desire: To have an ID that links the whole process together when we log our progress into a SQL server table.
So far, we have a table that logs our progress, but when there are multiple messages it is very difficult to read, since BizTalk will sometimes process certain messages out of order.
E.g., we could have:
1 Beginning process for client1
2 Second item for client1
3 Third item for client1
4 Final item for client1
Easily followed if there's only one client being updated at a time. On the other hand, this will be much more likely:
1 Beginning process for client1
2 Beginning process for client2
3 Second item for client2
4 Third item for client2
5 Second item for client1
6 Third item for client1
7 Final item for client1
8 Final item for client2
It would be nice to have an ID throughout the whole thing so that the last listing could be ordered by this ID field.
What is the best and/or quickest way to do this? Our thought was to create an ID at the moment the first orchestration is triggered and keep passing that value on to all the schemata and later orchestrations. This seems like a lot of work, though, and would require us to modify all the schemata, which just seems wrong.
Should we even be wanting to have such an ID? Any other solutions that come to mind?

This may not exactly be the easiest way, but have you looked at this:
http://blogs.msdn.com/b/appfabriccat/archive/2010/08/30/biztalk-application-tracing-made-easy-with-biztalk-cat-instrumentation-framework-controller.aspx
Basically it's an instrumentation framework which allows you to event out from pipelines, maps, orchs, etc.
When you write out to the event trace you can use a "business key" which will tie multiple events together in a chain, similar to what you are describing.
Available here
http://btscatifcontroller.codeplex.com/

I'm not sure I fully understand all the details of your specific setup, but here goes:
If you can correlate the messages from the same client into a "long running" orchestration (which waits for subsequent messages from the same client), then the orchestration will have an automatically assigned ServiceId Guid, which will be kept throughout the orchestration.
As you say, for correlation purposes you would usually try to use natural keys within the existing incoming message schemas to correlate subsequent messages back to the running orchestration - this way you don't need to change the schemas. In your example, ClientId might be a good correlation key, provided that the same client cannot send multiple message 'sets' simultaneously. (And worst case, if you do add a new correlation key to the schemas, all systems involved in the orchestration will need to be changed to 'remember' this key and return it to you.) Again, assuming ClientId as the correlation key, in your example two orchestrations would be running simultaneously - one for Client 1 and one for Client 2.
However, for scalability and version control reasons, (very) long running orchestrations are generally to be avoided unless they are absolutely necessary (e.g. unless you can only trigger a process once all 4 client messages are received). If you decide to keep each message as a separate orchestration, or just mapped and filtered on a port, another way to 'track' the sets of messages is by using BAM - you can use a continuation to tie all the client messages back together, e.g. for the purpose of a report or such.

Take a look at BAM. It's designed to do exactly what you describe: Using Business Activity Monitoring
This book has got a very good chapter about BAM, and this tool, by one of the authors of the book, can help you develop your BAM solution. And finally, a nice BAM Poster.
Don't be put off by the initial complexity. When you get your head around it, BAM is one of the coolest features of BizTalk.
Hope this helps. Good luck.

BizTalk assigns various values in the message context that usually persist for the life of the processing of that message, such as the initial MessageId. Will that work for you?
In our application we have to use an externally provided ID (from the customer). We have a multi-part message with this ID in one of its parts. You might consider that as well.

You could create a UniqueId and StepId and pass them around in the message context. When a new process for a client starts set UniqueId to a Guid and StepId to 1. As it gets passed to the next process increment the StepId.
This would allow you to query events grouped by client ID and in the order (StepId) in which they happened.

Related

Two conflicting long lived process managers

Let's assume we have two long-lived process managers. Both sagas operate over 10 million items, for example. The first saga adds something to each item; the second saga removes it from each item. Given that both process managers need a few minutes to complete their job, if I run them simultaneously I get into trouble.
Some of those items would hold the value while the rest would not. The result is actually close to random and depends on the order of the commands affecting a particular item. I wondered whether redispatching the "Remove" command in case of failure would solve the problem; that is, if you try to remove a non-existing value, you should wait for the first saga to add it. But while the process managers are working, someone else may dispatch a "Remove" or "Add" command, in which case my approach would fail.
How can I solve such a problem? :)
It seems that you would want the second saga to not run if the first saga is running (and presumably not to run until some process that depends on the first saga's additions being there has run). So the apparent solution would be to have a component (could be a microservice, could also be a record in a strongly consistent datastore like zookeeper/etcd/consul) that gives permission for the sagas to start executing. An example protocol might look like this (a minimal Java sketch follows the list):
Saga sends a message to the component identifying the saga and conveying the intention to start
Component validates that no sagas might be running which would prevent this saga from running
Component responds with permission to start running
Subsequent saga attempts result in rejection until the running saga tells the component it's OK to run the other saga
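Here is that minimal sketch of the permission component, assuming for illustration that it lives in a single JVM; in practice its state would sit in the strongly consistent store mentioned above, and all names here are hypothetical:
    public final class SagaCoordinator {
        private String runningSagaId; // null when no saga holds permission
        // Steps 1-3: a saga identifies itself and asks to start.
        public synchronized boolean requestStart(String sagaId) {
            if (runningSagaId != null) {
                return false; // step 4: reject while another saga is running
            }
            runningSagaId = sagaId;
            return true;
        }
        // The running saga reports completion, releasing permission.
        public synchronized void reportFinished(String sagaId) {
            if (sagaId.equals(runningSagaId)) {
                runningSagaId = null;
            }
        }
    }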
Assuming that this component is reliably durable, the failure mode to worry about is that permission is granted but this component never processes the message that the saga finished (causes of this could include the permission message not getting delivered/processed or the saga crashing). No amount of acknowledgements or extra messages can solve this (it's basically the Two Generals' Problem).
A mitigation is to have this component (or something watching this component) alert if it seems that too much time has passed without saga completion. Whatever/whoever is responsible for ensuring liveness would then investigate to see if the saga is still running and if none is running, inform the component that it's OK to run the other saga. Note that this is not foolproof: it's quite possible for the decider in question to make what turns out to be the wrong decision.
I feel like I need more context. Whilst you don't say it explicitly, is the problem that the second saga tries to remove values that haven't been added by the first?
If that is the case, a simple solution would be to just use a third state.
What I mean by that is to define and declare item state more explicitly. You currently seem to have two states, with value and without value, but nothing to indicate whether an item is ready to be processed by the second saga because the first saga has already done its work on the item in question.
So all that needs to happen is that the second saga keeps looking for items where:
(with_value == true & ready_for_saga2 == true)
Call it ready_for_saga2 or "saga 1 processing complete", whatever seems more appropriate in your context.
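A tiny Java sketch of that selection, with hypothetical names (withValue and readyForSaga2 as flags on an item):
    import java.util.List;
    public class Saga2Selector {
        // Hypothetical item model; readyForSaga2 is set once saga 1 is done.
        public record Item(String id, boolean withValue, boolean readyForSaga2) {}
        // Saga 2 only picks up items that saga 1 has finished with.
        public static List<Item> selectForSaga2(List<Item> items) {
            return items.stream()
                    .filter(i -> i.withValue() && i.readyForSaga2())
                    .toList();
        }
    }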
I'd say that the solution would vary based on which actual problem we're trying to solve.
Say it's an inventory, where "add" means items added to the inventory and "remove" means items requested for delivery. Then the order of commands does not matter that much, because you can simply process the request for delivery when new items are added to the inventory.
This would lead to an aggregate root with two collections: Items and PendingOrders.
One process manager adds new inventory to Items - if any orders are pending, it will complete these orders in the same transaction and remove both the item and the order from the collections.
If the other process manager adds an order (tries to remove an item), it will either fulfil it right away, if there are any items left, or it will add the order to the pending orders to be processed when new items arrive (and maybe notify someone about the delay, while we're at it).
This way we end up with the same state regardless of the order of commands, but the actual real-world-problem has great influence on the model chosen.
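A rough Java sketch of that aggregate root, with hypothetical names; adding inventory completes pending orders in the same transaction, so the end state is independent of command order:
    import java.util.ArrayDeque;
    import java.util.Deque;
    public class Inventory {
        private final Deque<String> items = new ArrayDeque<>();         // item ids
        private final Deque<String> pendingOrders = new ArrayDeque<>(); // order ids
        public void addItem(String itemId) {
            String waiting = pendingOrders.poll();
            if (waiting != null) {
                completeOrder(waiting, itemId); // fulfil a pending order instead of stocking
            } else {
                items.push(itemId);
            }
        }
        public void requestDelivery(String orderId) {
            String item = items.poll();
            if (item != null) {
                completeOrder(orderId, item);
            } else {
                pendingOrders.push(orderId); // park the order until new items arrive
            }
        }
        private void completeOrder(String orderId, String itemId) {
            // emit an OrderCompleted(orderId, itemId) event, notify, etc.
        }
    }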
If we have other real-world problems, we can model those too.
Let's say you have two users that each starts a process that bulk updates titles on inventory items. In this case you - and the users - have to decide how best to resolve this conflict - what will lead to the best real world outcome.
If you want consistency across all the items - all or no items should be updated by a single bulk update - I would embed this knowledge in a new model. Let's call it UpdateTitlesProcesses. We have only one instance of this model in the system; its state is shared between processes. This model is effectively a command queue: when a user initiates the bulk operation, it adds all the commands to the queue and starts processing each item one at a time.
When the second user initiates another title update, the business logic in our models will reject this, as there's already another update started. Or if the experts say that the last write should win, then we ditch the remaining commands from the first process and add the new ones (and similarly we should decide what should happen if a user issues a single title update, not bulk - should it be rejected, prioritized or put on hold?).
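A sketch of that single-instance model in Java (all names hypothetical), using the "reject while running" policy; a last-write-wins variant would clear the queue and refill it instead of throwing:
    import java.util.ArrayDeque;
    import java.util.Queue;
    public class UpdateTitlesProcesses {
        public record TitleUpdateCommand(String itemId, String newTitle) {}
        private final Queue<TitleUpdateCommand> queue = new ArrayDeque<>();
        private boolean running;
        // A user starts a bulk update: enqueue all commands, or reject.
        public synchronized void startBulkUpdate(Iterable<TitleUpdateCommand> commands) {
            if (running) {
                throw new IllegalStateException("another bulk title update is already running");
            }
            running = true;
            commands.forEach(queue::add);
        }
        // The processor drains one command at a time; null means done.
        public synchronized TitleUpdateCommand next() {
            TitleUpdateCommand cmd = queue.poll();
            if (cmd == null) {
                running = false; // finished; the next bulk update may start
            }
            return cmd;
        }
    }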
So in short I'd say:
Make it clear which real world problem we are solving - and thus which conflict resolution outcome is best (probably a trade off, often also something that requires user interaction or notification).
Model this explicitly (where processes, actions and conflict handling are also part of the model).

Return entity updated by Axon command

What is the best way to get the updated representation of an entity after mutating it with a command?
For example, let's say I have a project like digital-restaurant and I want to be able to update a field on the restaurant and return its current state to the client making the update (to retrieve any modifications made by different processes).
When a restaurant is created, it is easy to retrieve the current state (i.e. the projection representation) after dispatching the create command by subscribing to a FindRestaurantQuery and waiting until a record is returned (see Restaurant CommandController).
However, it isn't so simple to detect when the result of an UpdateCommand has been applied to the projection. For example, if we use the same trick and subscribe to the FindRestaurantQuery, we will be notified when the restaurant has been modified, but it may not be our command that triggered the modification (in the case where multiple processes are concurrently issuing update commands).
There seem to be two obvious ways to detect when a given update command has been applied to the projection:
1. Have a unique ID associated with every update command, and subscribe to a query that is updated when the command ID has been applied to the projection: propagate the unique ID to the event that is applied by the aggregate, and when the projection receives the event, it can notify the query listener with the current state.
2. Before dispatching an update command, query the existing state of the projection and calculate the destination state given the contents of the update command.
In the case of (1): is there any situation (e.g. batching / snapshotting) where the event carrying the unique ID may be skipped over somehow, preventing the query listener from being notified?
Is there a more reliable / more idiomatic way to accomplish this use case?
Axon 4 with Spring Boot.
Although fully asynchronous designs may be preferable for a number of reasons, it is a common scenario that back-end teams are forced to provide synchronous REST API on top of asynchronous CQRS+ES back-ends.
The part of the demo application that is trying to solve this problem is located here https://github.com/idugalic/digital-restaurant/tree/master/drestaurant-apps/drestaurant-monolith-rest
The case you are mentioning is totally valid.
I would go with the option 1.
My only concern is that you have to introduce a new unique-ID attribute, associated with every update command, into the domain (events). In my opinion this ID attribute does not have any domain/business value. There is already an Audit (who, when) attribute associated with every event, and maybe you can use that to correlate commands and subscriptions. I believe that there is more value in this solution (the identity is part of the domain), if it is not too loose for your case.
Please note that queries have to be extended with Audit in this case (you will know who requested the query).
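For completeness, a hedged sketch of option 1 in Axon 4 with Spring Boot. The RestaurantView, FindRestaurantQuery and UpdateRestaurantCommand types and the lastProcessedCommandId field are assumptions for illustration, not part of the digital-restaurant code base; the projection is assumed to copy the command's ID onto the view before emitting the update through QueryUpdateEmitter:
    import java.time.Duration;
    import java.util.UUID;
    import org.axonframework.commandhandling.gateway.CommandGateway;
    import org.axonframework.messaging.responsetypes.ResponseTypes;
    import org.axonframework.queryhandling.QueryGateway;
    import org.axonframework.queryhandling.SubscriptionQueryResult;
    public class RestaurantUpdateService {
        public record FindRestaurantQuery(String restaurantId) {}
        public record UpdateRestaurantCommand(String restaurantId, String commandId, String newName) {}
        public record RestaurantView(String restaurantId, String name, String lastProcessedCommandId) {}
        private final CommandGateway commandGateway;
        private final QueryGateway queryGateway;
        public RestaurantUpdateService(CommandGateway commandGateway, QueryGateway queryGateway) {
            this.commandGateway = commandGateway;
            this.queryGateway = queryGateway;
        }
        public RestaurantView update(String restaurantId, String newName) {
            String commandId = UUID.randomUUID().toString();
            // subscribe before sending the command so no update is missed
            SubscriptionQueryResult<RestaurantView, RestaurantView> result =
                    queryGateway.subscriptionQuery(
                            new FindRestaurantQuery(restaurantId),
                            ResponseTypes.instanceOf(RestaurantView.class),
                            ResponseTypes.instanceOf(RestaurantView.class));
            try {
                // the command carries the unique ID; the aggregate copies it
                // onto the event and the projection stores it on the view
                commandGateway.sendAndWait(
                        new UpdateRestaurantCommand(restaurantId, commandId, newName));
                // wait only for the update produced by *this* command
                return result.updates()
                        .filter(v -> commandId.equals(v.lastProcessedCommandId()))
                        .blockFirst(Duration.ofSeconds(5));
            } finally {
                result.cancel();
            }
        }
    }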

Can I create a flow to migrate state to new version instead of using contract upgrade?

When I upgrade a contract there are 2 steps, Authorise and Initiate. Since both steps need to be done one by one per state (as I understand it), it takes a very long time when I have a large amount of data.
I ended up looping API calls to query batches of data and then looping calls to ContractUpgradeFlow one by one.
The result is that it took more than 11 hours and did not finish upgrading.
So the question is: what if I create a flow A that queries a list of StateV1 as input and creates a list of StateV2 as output?
Would it reduce the process for contract upgrade?
Should it be faster?
Is this considered the same result as a contract upgrade?
Would there be any effect on the next contract upgrade of StateV2 if I then want to use the Corda contract upgrade instead of flow A?
Yes, correct: with an explicit upgrade, if there is a lot of data it is going to take time, as a lot of things are happening behind the scenes. Each and every unconsumed state is taken; a new transaction is created; the old state with the old contract and the new state with the new contract are added to this transaction; the transaction is sent to each signer for signing; the appropriate constraints are set; and finally the entire signed transaction is sent to the notary.
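For reference, the standard per-state loop looks roughly like the following sketch over RPC, assuming the Corda 4 Java RPC API; StateV1 and ContractV2 (an UpgradedContract from StateV1 to StateV2) are assumed names. It always re-queries the first page, because upgraded states leave the UNCONSUMED set:
    import net.corda.core.contracts.StateAndRef;
    import net.corda.core.flows.ContractUpgradeFlow;
    import net.corda.core.messaging.CordaRPCOps;
    import net.corda.core.node.services.Vault;
    import net.corda.core.node.services.vault.PageSpecification;
    import net.corda.core.node.services.vault.QueryCriteria;
    public class UpgradeRunner {
        public static void upgradeAll(CordaRPCOps proxy) throws Exception {
            QueryCriteria unconsumed =
                    new QueryCriteria.VaultQueryCriteria(Vault.StateStatus.UNCONSUMED);
            Vault.Page<StateV1> page;
            do {
                page = proxy.vaultQueryByWithPagingSpec(
                        StateV1.class, unconsumed, new PageSpecification(1, 200));
                for (StateAndRef<StateV1> stateAndRef : page.getStates()) {
                    // authorise then initiate, one state at a time; each Initiate
                    // builds, signs and notarises a full upgrade transaction
                    proxy.startFlowDynamic(ContractUpgradeFlow.Authorise.class,
                            stateAndRef, ContractV2.class).getReturnValue().get();
                    proxy.startFlowDynamic(ContractUpgradeFlow.Initiate.class,
                            stateAndRef, ContractV2.class).getReturnValue().get();
                }
            } while (!page.getStates().isEmpty());
        }
    }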
“So the question is: what if I create a flow A that queries a list of StateV1 as input and creates a list of StateV2 as output?”
Yes, you can very well create a flow that queries a list of StateV1 as input and creates a list of StateV2 as output, but keep in mind that you will also have to take care of all the steps I have mentioned above, which are currently handled by the ContractUpgradeFlow.
“Would it reduce the process for contract upgrade?”
No, I don't think so, as you will have to handle all the steps mentioned above, which are currently handled by the ContractUpgradeFlow.
“Should it be faster?”
No, it will take the same time as the ContractUpgradeFlow.

Fetch 1 message at a given offset when debugging an app using spring-kafka

I understand that random access is inefficient. But the app failed, the record content is (let's assume) too big for logging or otherwise inappropriate to log (that's true even without the assumption), so all I have is the information that the record at this offset failed and why. Good, now I want to see the data, say to be able to reproduce the failure. How do I do that?
OK, I can use a ConsumerSeekAware consumer, but that will rewind the position and process all records from that position on. I don't want that; I want just 1 specific message. I can use a dedicated consumer in a dedicated consumer group for this use case so as not to influence others, and set ConsumerConfig.MAX_POLL_RECORDS_CONFIG to 1 so that each poll returns just 1 record, but this will not stop the following records from reaching the listener, since there is no way to call poll manually, programmatically. Right? Or is there such a way? Or another way to achieve this? Even if I try to reach into the spring-kafka internals, the org.apache.kafka.clients.consumer.Consumer seems to be made inaccessible on purpose, or at least I do not see a way.
Yes, you can just create your own consumer manually and poll it.
Get a reference to the consumer factory and call `createConsumer("tempGroup", "tempClient")`.
You would need to create a second consumer factory with max.poll.records=1.
You can copy the other properties from the main factory by calling getConfigurationProperties(), creating a new map from it, and creating a new DefaultKafkaConsumerFactory.
Close the consumer when you are done.
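Putting those steps together, a minimal sketch (the topic, partition, offset and the temp group/client names are example inputs):
    import java.time.Duration;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import org.apache.kafka.clients.consumer.Consumer;
    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.common.TopicPartition;
    import org.springframework.kafka.core.ConsumerFactory;
    import org.springframework.kafka.core.DefaultKafkaConsumerFactory;
    public class SingleRecordFetcher {
        public static ConsumerRecords<?, ?> fetchOne(
                ConsumerFactory<?, ?> mainFactory, String topic, int partition, long offset) {
            // copy the main factory's properties and force max.poll.records=1
            Map<String, Object> props = new HashMap<>(mainFactory.getConfigurationProperties());
            props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, 1);
            DefaultKafkaConsumerFactory<Object, Object> factory =
                    new DefaultKafkaConsumerFactory<>(props);
            try (Consumer<Object, Object> consumer =
                         factory.createConsumer("tempGroup", "tempClient")) {
                TopicPartition tp = new TopicPartition(topic, partition);
                consumer.assign(List.of(tp)); // manual assignment, no rebalancing
                consumer.seek(tp, offset);    // jump straight to the failed record
                return consumer.poll(Duration.ofSeconds(5)); // returns at most 1 record
            }
        }
    }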

WF4 receive activity to be able to CreateInstance AND handle subsequent correlation

I want to create a workflow that will be persistent and which will consist of a Pick activity containing the following:
A Receive pick activity (ReceiveItem) which can Create a WF Instance using an email address parameter for correlation AND can also be called again later with the same email address and be picked up in correlation to start up the correct persisted WF. Each item is added to a queue for later processing
A MaxItems pick activity which will force the processing of the queue when it reaches a defined size and
A Timer pick activity which will simply process all queued items at the end of the day
Please note: I want to receive the second and subsequent items via ReceiveItem with the same email address parameter.
My question is:
Will this work as I suggest or am I going to get correlation collisions because the Receive activity can CreateInstance? Or will WF simply create a WF Instance at the beginning and then always correlate after that?
If this will not work, how could I implement it with one single Receive activity and still get the benefit of a single workflow handling both the receive and batch operations?
That will work just fine. Check this blog post for an example of how to do that. The complete XAML is listed at the bottom if you want to inspect all Receive settings.
