Can I create a flow to migrate state to a new version instead of using contract upgrade? (Corda)

When I upgrade a contract there are two steps, Authorise and Initiate. Since both steps need to be done one by one per state (as I understand it), this takes a very long time when I have a large amount of data.
I ended up looping: calling an API to query a batch of states, then calling ContractUpgradeFlow on each state one by one.
The result is that it ran for more than 11 hours and still did not finish upgrading.
So the question is: what if I create a flow A that takes a list of StateV1 as input and produces a list of StateV2 as output?
Would it reduce the work of the contract upgrade process?
Should it be faster?
Would this be considered the same result as a contract upgrade?
Would there be any effect on the next contract upgrade for StateV2 if I later want to use the Corda contract upgrade instead of flow A?

Yes, correct: with an explicit upgrade, if there is a lot of data it is going to take time, because a lot happens behind the scenes.
Each and every unconsumed state is taken, a new transaction is created, the old state with the old contract and the new state with the new contract are added to this transaction, the transaction is sent to each signer for signing, the appropriate constraints are set, and finally the fully signed transaction is sent to the notary.
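For reference, the explicit upgrade boils down to a loop like the following over RPC (a minimal sketch; StateV1 and ContractV2 are placeholder names for your old state type and upgraded contract):

    import net.corda.core.contracts.StateAndRef;
    import net.corda.core.flows.ContractUpgradeFlow;
    import net.corda.core.messaging.CordaRPCOps;
    import net.corda.core.node.services.Vault;

    // proxy is an already-connected CordaRPCOps instance.
    void upgradeAll(CordaRPCOps proxy) throws Exception {
        Vault.Page<StateV1> page = proxy.vaultQuery(StateV1.class);
        for (StateAndRef<StateV1> stateAndRef : page.getStates()) {
            // Step 1: authorise the upgrade of this state to the new contract.
            proxy.startFlowDynamic(ContractUpgradeFlow.Authorise.class,
                    stateAndRef, ContractV2.class).getReturnValue().get();
            // Step 2: initiate the upgrade; one new transaction per state.
            proxy.startFlowDynamic(ContractUpgradeFlow.Initiate.class,
                    stateAndRef, ContractV2.class).getReturnValue().get();
        }
    }

Each iteration produces, signs and notarises its own transaction, which is why the wall-clock time grows linearly with the number of states.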
“So the question is: what if I create a flow A that takes a list of StateV1 as input and produces a list of StateV2 as output?”
Yes, you can certainly create such a flow, but keep in mind that you will also have to take care of all the steps I mentioned above, which are currently handled for you by ContractUpgradeFlow.
“Would it reduce the work of the contract upgrade process?”
No, I don’t think so, as you will have to handle all the steps mentioned above, which ContractUpgradeFlow currently handles for you.
“Should it be faster?”
No, it will take about the same time as ContractUpgradeFlow, since the same work has to happen either way.
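To make the scope concrete, here is a rough sketch of what such a flow A would itself have to take care of (all state, contract and command names here are assumptions, and in Corda 4 FinalityFlow would also need the counterparties' sessions):

    import co.paralleluniverse.fibers.Suspendable;
    import net.corda.core.contracts.StateAndRef;
    import net.corda.core.flows.*;
    import net.corda.core.identity.Party;
    import net.corda.core.transactions.SignedTransaction;
    import net.corda.core.transactions.TransactionBuilder;
    import java.util.List;

    // Hypothetical flow A: consumes a batch of StateV1 and issues StateV2.
    @InitiatingFlow
    @StartableByRPC
    public class MigrateStatesFlow extends FlowLogic<SignedTransaction> {
        private final List<StateAndRef<StateV1>> inputs;

        public MigrateStatesFlow(List<StateAndRef<StateV1>> inputs) {
            this.inputs = inputs;
        }

        @Suspendable
        @Override
        public SignedTransaction call() throws FlowException {
            Party notary = inputs.get(0).getState().getNotary();
            TransactionBuilder builder = new TransactionBuilder(notary);
            for (StateAndRef<StateV1> input : inputs) {
                builder.addInputState(input);
                // StateV2.fromV1 is an assumed conversion helper.
                builder.addOutputState(StateV2.fromV1(input.getState().getData()),
                        ContractV2.ID);
            }
            builder.addCommand(new ContractV2.Commands.Migrate(),
                    getOurIdentity().getOwningKey());
            SignedTransaction stx = getServiceHub().signInitialTransaction(builder);
            // Signing by other participants and notarisation still have to happen,
            // just as they do inside ContractUpgradeFlow.
            return subFlow(new FinalityFlow(stx));
        }
    }

The one structural difference from ContractUpgradeFlow is that this sketch migrates several states in a single transaction rather than one transaction per state.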

Related

Exactly-once semantics in Dataflow stateful processing

We are trying to cover the following scenario in a streaming setting:
calculate an aggregate (let’s say a count) of user events since the start of the job
The number of user events is unbounded (hence only using local state is not an option)
I'll discuss three options we are considering, where the first two options are prone to data loss and the final one is unclear. We'd like to get more insight into this final one. Alternative approaches are of course welcome too.
Thanks!
Approach 1: Session windows, datastore and idempotency
Sliding windows of x seconds
Group by userid
update datastore
Update datastore would mean:
1. Start trx
2. Datastore read for this user
3. Merge in new info
4. Datastore write
5. End trx
The datastore entry contains an idempotency id that equals the sliding window timestamp
Problem:
Windows can be fired concurrently and can hence be processed out of order, leading to data loss (confirmed by Google)
Approach 2: Session windows, datastore and state
Sliding windows of x seconds
Group by userid
update datastore
Update datastore would mean:
1. Pre-check: check if state for this key-window is true; if so, skip the following steps
2. Start trx
3. Datastore read for this user
4. Merge in new info
5. Datastore write
6. End trx
7. Store in state for this key-window that we processed it (true)
Re-execution will hence skip duplicate updates
Problem:
A failure between steps 5 and 7 will not write to local state, causing re-execution and potentially counting elements twice.
We can circumvent this by using multiple states, but then we could still drop data.
Approach 3: Global window, timers and state
Based on the article Timely (and Stateful) Processing with Apache Beam, we would create:
A global window
Group by userid
Buffer/count all incoming events in a stateful DoFn
Flush x time after the first event.
A flush would mean the same datastore update as in Approach 1
Problem:
The guarantees for exactly-once processing and state are unclear.
What would happen if an element had been written to state and the bundle were then re-executed? Is state restored to before that bundle?
Any links to documentation in this regard would be very much appreciated. E.g. how does fault-tolerance work with timers?
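For what it's worth, the Approach 3 shape in the Beam Java SDK looks roughly like this (a sketch; the 60-second flush delay and the KV<String, Long> element type are assumptions):

    import org.apache.beam.sdk.state.*;
    import org.apache.beam.sdk.transforms.DoFn;
    import org.apache.beam.sdk.values.KV;
    import org.joda.time.Duration;

    // Counts events per user and flushes once, some time after the first event.
    class BufferAndFlushFn extends DoFn<KV<String, Long>, KV<String, Long>> {

        @StateId("count")
        private final StateSpec<ValueState<Long>> countSpec = StateSpecs.value();

        @StateId("key")
        private final StateSpec<ValueState<String>> keySpec = StateSpecs.value();

        @TimerId("flush")
        private final TimerSpec flushSpec = TimerSpecs.timer(TimeDomain.PROCESSING_TIME);

        @ProcessElement
        public void process(ProcessContext c,
                            @StateId("count") ValueState<Long> count,
                            @StateId("key") ValueState<String> key,
                            @TimerId("flush") Timer flush) {
            Long current = count.read();
            if (current == null) {
                // First element for this key: schedule the flush x time from now.
                flush.offset(Duration.standardSeconds(60)).setRelative();
                key.write(c.element().getKey());
                current = 0L;
            }
            count.write(current + c.element().getValue());
        }

        @OnTimer("flush")
        public void flush(OnTimerContext c,
                          @StateId("count") ValueState<Long> count,
                          @StateId("key") ValueState<String> key) {
            // The transactional datastore update from Approach 1 would go here.
            c.output(KV.of(key.read(), count.read()));
            count.clear();
            key.clear();
        }
    }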
From your Approaches 1 and 2 it is unclear whether the concern is out-of-order merging or loss of data. I can think of the following.
Approach 1: Don't immediately merge the session window aggregates, because of the out-of-order problem. Instead, store them separately and, after a sufficient amount of time, merge the intermediate results in timestamp order.
Approach 2: Move the state update into the transaction. This way, any temporary failure will not let the transaction complete and merge the data. Subsequent successful processing of the session window aggregates will not result in double counting.
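In other words, the idempotency marker commits or rolls back together with the merged data. A generic sketch of that shape (the Datastore/Transaction API here is entirely hypothetical, not a real client library):

    // Idempotent merge: the window marker and the merged count are one atomic write.
    void updateUser(Datastore datastore, String userId, String windowId, long delta) {
        try (Transaction trx = datastore.begin()) {
            UserRecord record = trx.read(userId);
            if (record.processedWindows().contains(windowId)) {
                return; // Window already merged: a re-execution is a no-op.
            }
            trx.write(userId, record.withCount(record.count() + delta)
                                    .withProcessedWindow(windowId));
            trx.commit(); // A failure before this line rolls back marker and merge alike.
        }
    }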

Return entity updated by axon command

What is the best way to get the updated representation of an entity after mutating it with a command?
For example, let's say I have a project like digital-restaurant and I want to be able to update a field on the restaurant and return its current state to the client making the update (to pick up any modifications made by different processes).
When a restaurant is created, it is easy to retrieve the current state (i.e. the projection representation) after dispatching the create command, by subscribing to a FindRestaurantQuery and waiting until a record is returned (see the Restaurant CommandController).
However, it isn't so simple to detect when the result of an UpdateCommand has been applied to the projection. For example, if we use the same trick and subscribe to the FindRestaurantQuery, we will be notified when the restaurant has been modified, but it may not be our command that triggered the modification (in the case where multiple processes are concurrently issuing update commands).
There seem to be two obvious ways to detect when a given update command has been applied to the projection:
Option 1: Have a unique ID associated with every update command. Subscribe to a query that is updated when the command ID has been applied to the projection; propagate the unique ID to the event that is applied by the aggregate, and when the projection receives the event, it can notify the query listener with the current state.
Option 2: Before dispatching an update command, query the existing state of the projection, then calculate the destination state given the contents of the update command.
In the case of option 1: is there any situation (e.g. batching / snapshotting) where the event carrying the unique ID might be skipped over somehow, preventing the query listener from being notified?
Is there a more reliable / more idiomatic way to accomplish this use case?
Axon 4 with Spring Boot.
Although fully asynchronous designs may be preferable for a number of reasons, it is a common scenario that back-end teams are forced to provide a synchronous REST API on top of an asynchronous CQRS+ES back-end.
The part of the demo application that is trying to solve this problem is located here https://github.com/idugalic/digital-restaurant/tree/master/drestaurant-apps/drestaurant-monolith-rest
The case you are mentioning is totally valid.
I would go with option 1.
My only concern is that you have to introduce a new unique-ID attribute, associated with every update command, into the domain (events). In my opinion this ID attribute does not have any domain/business value. There is already an audit (who, when) attribute associated with every event, and maybe you can use that to correlate commands and subscriptions. I believe there is more value in that solution (the identity is part of the domain), if it is not too loose a guarantee for your case.
Please note that queries would have to be extended with audit information in this case (so you will know who requested the query).
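As a concrete shape for option 1, the dispatch-then-wait could look like this with Axon 4's gateways (FindRestaurantQuery, UpdateRestaurantCommand, RestaurantView and its lastProcessedCommandId field are assumptions about your model):

    import org.axonframework.commandhandling.gateway.CommandGateway;
    import org.axonframework.messaging.responsetypes.ResponseTypes;
    import org.axonframework.queryhandling.QueryGateway;
    import org.axonframework.queryhandling.SubscriptionQueryResult;
    import java.time.Duration;
    import java.util.UUID;

    // Dispatches an update and blocks until the projection reflects *our* command.
    RestaurantView updateAndReturn(CommandGateway commandGateway,
                                   QueryGateway queryGateway,
                                   String restaurantId) {
        String commandId = UUID.randomUUID().toString();

        // Subscribe before sending, so the update cannot be missed.
        SubscriptionQueryResult<RestaurantView, RestaurantView> result =
                queryGateway.subscriptionQuery(
                        new FindRestaurantQuery(restaurantId),
                        ResponseTypes.instanceOf(RestaurantView.class),
                        ResponseTypes.instanceOf(RestaurantView.class));
        try {
            commandGateway.send(new UpdateRestaurantCommand(restaurantId, commandId));
            // Ignore updates triggered by other processes' commands.
            return result.updates()
                    .filter(view -> commandId.equals(view.getLastProcessedCommandId()))
                    .blockFirst(Duration.ofSeconds(5));
        } finally {
            result.close();
        }
    }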

What's the best way to generate ledger change events that include the transaction command?

The goal is to generate events on every participating node when a state is changed, where the event includes the business action that caused the change. In our case, the business action maps to the transaction command and captures the business intent, i.e. what the user is doing in business terms. So in our case, where we are modelling the lifecycle of a loan, an action might be to "Close" the loan.
We model events at the state level as follows: each event encapsulates a transaction command and is uniquely identified by a (TxnHash, OutputIndex) pair and a created/consumed status.
We would prefer a polling mechanism to generate events on demand, but an async approach that generates events on ledger changes would be acceptable. Either way, our challenge is in getting the command from the transaction.
We considered querying the states using the Vault Query API vaultQueryBy() for the polling solution (or vaultTrackBy() for the async Observable stream solution). We were able to create a flow that gets the transaction for a state. This had to be done in a flow, as Corda deprecated the function that would have allowed us to do this in our Spring Boot client. In the client we use vaultQueryBy() to get a list of states; then we call a flow that iterates over the states, gets the txHash from each StateRef, and calls serviceHub.validatedTransactions.getTransaction(txHash) to get the SignedTransaction, from which we can ultimately retrieve the Command. Is this the best or recommended approach?
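For reference, the body of such a flow reduces to a couple of calls (a sketch; note that getTransaction may return null for transactions the node does not have):

    // Inside a FlowLogic subclass, given a StateAndRef<?> stateAndRef:
    SignedTransaction stx = getServiceHub().getValidatedTransactions()
            .getTransaction(stateAndRef.getRef().getTxhash());
    List<Command<?>> commands = stx.getTx().getCommands();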
Alternatively, we have also thought of generating events from the transactions themselves: querying for transactions and then building the event for each input and output state in the transaction. If we go this route, what's the best way to query transactions from the vault? Is there an Observable stream-based option?
I assume this mapping of states to commands is a common requirement for observers of the ledger, because it is standard to drive contract logic off the transaction command and quite natural to have the command map to the user intent.
What is the best way to generate events that encapsulate the transaction command for each state created or consumed on the ledger?
If I understand correctly, you're attempting to get notified when certain types of ledger updates occur (open, approved, closed, etc.).
First: asynchronous notifications are best practice in Corda; polling should be avoided due to the load that constant querying puts on the node and the delays it introduces. Corda provides several mechanisms for Observables which you can use: https://docs.corda.net/api/kotlin/corda/net.corda.core.messaging/-corda-r-p-c-ops/vault-track-by.html
Second: avoid querying transactions from the database, as these are intended to be internal to the node. See this answer for background on why to avoid transaction querying. In general, only tables that begin with "VAULT_*" are intended to be queried.
One way to solve your use case would be a "status" field which reflects the command that was used to produce the current state. For example: if a "Close" command was used to produce the state, its status field could be "closed". This way you could use the above vaultTrackBy to look at each state's status field and infer the action that occurred.
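A minimal sketch of that over RPC (LoanState, its status field, and the publishEvent sink are assumptions):

    import net.corda.core.messaging.CordaRPCOps;
    import net.corda.core.messaging.DataFeed;
    import net.corda.core.node.services.Vault;

    // Subscribe to vault updates for LoanState and emit an event per produced state.
    void trackLoans(CordaRPCOps proxy) {
        DataFeed<Vault.Page<LoanState>, Vault.Update<LoanState>> feed =
                proxy.vaultTrack(LoanState.class);
        feed.getUpdates().subscribe(update ->
                update.getProduced().forEach(stateAndRef -> {
                    LoanState state = stateAndRef.getState().getData();
                    // The status field reflects the producing command, e.g. "closed".
                    publishEvent(state.getStatus(), stateAndRef.getRef());
                }));
    }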
Just to finish up on my comment: while the approach met the requirements, the problem with this solution is that we have to add and maintain our own code across all relevant states to capture transaction-level information that is already tracked by the platform. I would think a better solution would be for the platform to give consumers access to transaction-level information (selectively, perhaps) just as it does for states. After all, the transaction is, in part, a business/functional construct that is meaningful at the client-application level. For example, if I am "transferring" a loan, that may be a complex business transaction involving many input and output states, and it may be an important construct/notion for the client application to manage.

A Queue is a collection for holding elements prior to processing

A Queue is a collection for holding elements prior to processing. But every collection needs some data before it can be processed. So why is this mentioned only for the Queue interface? ArrayList, LinkedList and the rest also need data inserted before the collection is processed. Can anyone help me with this?
I think it depends on the requirement for how the data should be processed.
Let's say you have a queue at a cinema for buying tickets. The tickets should be allocated to people in the order they arrived, so in this case a QUEUE is the preferred data structure, as it maintains FIFO (First In, First Out) order.
But in some other scenario you might want data to be processed in order of "priority"; in that case a plain QUEUE won't do, and you would want some priority-based ordering on the data structure before processing it.
So, there are different data structures which handle data in different ways, depending on the requirement.
You can look into the different data structures, and how they process and store data, to find which best suits your need.
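For example, the difference between the two orderings is easy to see in a few lines of Java:

    import java.util.LinkedList;
    import java.util.PriorityQueue;
    import java.util.Queue;

    public class QueueDemo {
        public static void main(String[] args) {
            // FIFO: elements are processed in arrival order.
            Queue<String> ticketLine = new LinkedList<>();
            ticketLine.offer("alice");
            ticketLine.offer("bob");
            System.out.println(ticketLine.poll()); // alice (first in, first out)

            // Priority: elements are processed by priority, not arrival order.
            Queue<Integer> byPriority = new PriorityQueue<>();
            byPriority.offer(5);
            byPriority.offer(1);
            System.out.println(byPriority.poll()); // 1 (smallest first)
        }
    }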
A queue is a waiting line. It was added to the JCF with Java 5.0. Reflecting the archetypal print queue, the Java docs state this: a Queue is a collection for holding elements prior to processing.
Think of an element of the queue as being in one of two states, "on hold (OH)" or "ready for processing (RFP)". The head of the queue is in the RFP state and all other elements are in the OH state. In other words, the queue holds its OH elements prior to their becoming the RFP element (the head). Once an element becomes RFP, it can be popped off and processed.

What's the best way to create/use an ID throughout the processing of a message in Biztalk?

Our situation so far: we have a process that involves multiple schemata, orchestrations and messages sent/received.
Our desire: to have an ID that links the whole process together when we log our progress into a SQL Server table.
So far we have a table that logs our progress, but when there are multiple messages it is very difficult to read, since BizTalk will sometimes process messages out of order.
E.g., we could have:
1 Beginning process for client1
2 Second item for client1
3 Third item for client1
4 Final item for client1
That's easily followed if there's only one client being updated at a time. On the other hand, this is much more likely:
1 Beginning process for client1
2 Beginning process for client2
3 Second item for client2
4 Third item for client2
5 Second item for client1
6 Third item for client1
7 Final item for client1
8 Final item for client2
It would be nice to have an ID throughout the whole thing, so that the last listing could be ordered by this ID field.
What is the best and/or quickest way to do this? Our thought was that, to add an ID, we would create one at the moment the first orchestration is triggered and keep passing that value to all the schemata and later orchestrations. This seems like a lot of work and would require us to modify all the schemata - which just seems wrong.
Should we even want such an ID? Any other solutions that come to mind?
This may not exactly be the easiest way, but have you looked at this:
http://blogs.msdn.com/b/appfabriccat/archive/2010/08/30/biztalk-application-tracing-made-easy-with-biztalk-cat-instrumentation-framework-controller.aspx
Basically it's an instrumentation framework which allows you to emit trace events from pipelines, maps, orchestrations, etc.
When you write out to the event trace you can use a "business key" which ties multiple events together in a chain, similar to what you are describing.
Available here
http://btscatifcontroller.codeplex.com/
I'm not sure I fully understand all the details of your specific setup, but here goes:
If you can correlate the messages from the same client into a "long running" orchestration (which waits for subsequent messages from the same client), then the orchestration will have an automatically assigned ServiceId Guid, which will be kept throughout the orchestration.
As you say, for correlation purposes you would usually try to use natural keys within the existing incoming message schemas to correlate subsequent messages back to the running orchestration - this way you don't need to change the schemas. In your example, ClientId might be a good correlation key, provided that the same client cannot send multiple message 'sets' simultaneously. (Worst case, if you do add a new correlation key to the schemas, all systems involved in the orchestration will need to be changed to 'remember' this key and return it to you.) Again, assuming ClientId as the correlation key, in your example two orchestrations would be running simultaneously - one for client 1 and one for client 2.
However, for scalability and version-control reasons, (very) long-running orchestrations are generally to be avoided unless they are absolutely necessary (e.g. unless you can only trigger a process once all 4 client messages are received). If you decide to keep each message as a separate orchestration, or just mapped and filtered on a port, another way to 'track' the sets of messages is by using BAM - you can use a continuation to tie all the client messages back together, e.g. for the purpose of a report or such.
Take a look at BAM. It's designed to do exactly what you describe; see Using Business Activity Monitoring.
This book has a very good chapter about BAM, and this tool, by one of the authors of the book, can help you develop your BAM solution. And finally, there is a nice BAM poster.
Don't be put off by the initial complexity. When you get your head around it, BAM is one of the coolest features of BizTalk.
Hope this helps. Good luck.
BizTalk assigns various values in the message context that usually persist for the life of the processing of that message, such as the initial MessageId. Will that work for you?
In our application we have to use an externally provided ID (from the customer). We have a multi-part message with this ID in one part of it. You might consider that as well.
You could create a UniqueId and a StepId and pass them around in the message context. When a new process for a client starts, set UniqueId to a GUID and StepId to 1. As the message gets passed to each next process, increment the StepId.
This would allow you to query events, grouped by client ID and in the order (StepId) in which they happened.
