I am new to Corda. My question is not about any particular implementation, but more of an architectural question.
What happens during back chain validation if one of the nodes involved permanently dies and fails to respond? How is that transaction validated?
I have seen this issue, which only talks about how high transaction volume can slow down validation. Does validation come to a grinding halt if one of the nodes fails permanently?
As per the Corda webinar on consensus, in the example about five minutes into the video, the back chain is Charlie -> Dan -> Alice -> Bob. In this chain, if either Charlie or Dan is unavailable, the proposed transaction cannot be validated. The same webinar further says that this is not a problem in other blockchains such as Ethereum.
Applications that can foresee the need for a highly available record keeper can surely accommodate such a node during the design phase, as suggested by Adel Rustum.
However, a globally deployed, privacy-conscious application that is reluctant to leak information could suffer many transaction-validation failures due to the vagaries of a wide-area network. Thoughts?
The short answer is that transaction verification will fail (if that node was the only node that had that transaction); and that's the point of using a DLT (or a blockchain). If you can't go back through the history of a certain block of data all the way to genesis, then you can't verify how that block and its ancestors were created.
As for the issue that you referenced in your question: Corda Enterprise 4.4 introduced a new feature called bulk back-chain fetching, which lets you modify the way the transactions needed to verify a certain transaction are fetched. Previously the fetching was depth-first; you can now change it to breadth-first and specify how many transactions to fetch in one call. More details in this video.
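To make the difference concrete, here is a minimal conceptual sketch of the two fetching strategies over a transaction graph. The `TxId`/`Tx` types and the `fetchBatch` call are hypothetical stand-ins for a network round trip, not Corda Enterprise APIs:

```kotlin
// Conceptual sketch only; TxId, Tx, and fetchBatch are hypothetical stand-ins.
typealias TxId = String

data class Tx(val id: TxId, val inputs: List<TxId>)

// Pretend each call is one network round trip to the counterparty.
fun fetchBatch(ids: List<TxId>): List<Tx> = TODO("network call")

// Depth-first: one transaction per round trip, following each input to its root.
fun resolveDepthFirst(id: TxId, seen: MutableSet<TxId> = mutableSetOf()) {
    if (!seen.add(id)) return
    val tx = fetchBatch(listOf(id)).single()
    tx.inputs.forEach { resolveDepthFirst(it, seen) }
}

// Breadth-first: fetch up to batchSize dependencies per round trip, level by level.
fun resolveBreadthFirst(rootId: TxId, batchSize: Int) {
    val seen = mutableSetOf(rootId)
    var frontier = listOf(rootId)
    while (frontier.isNotEmpty()) {
        val next = mutableListOf<TxId>()
        for (chunk in frontier.chunked(batchSize)) {
            for (tx in fetchBatch(chunk)) {
                tx.inputs.filter { seen.add(it) }.forEach { next.add(it) }
            }
        }
        frontier = next
    }
}
```

The saving comes from the number of round trips: the breadth-first variant amortises network latency over whole batches instead of paying it once per transaction in the chain.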
Back chain validation doesn't depend on the nodes that were part of transactions in the past. The back chain is validated only by the nodes that are part of the current, ongoing transaction. The other nodes that took part in a past transaction in the evolution of the state in question don't need to be contacted (or stay online) while the back chain is validated.
Back chain validation only involves checking that every past transaction on a particular state used as input to the current transaction is valid. Validity is checked by running the contracts again for those previous transactions; there is no real need to reach out to the parties of a previous transaction.
However, you do need to make sure that the parties involved in the current transaction are online and responding, since you need their signatures to complete the transaction successfully.
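To make the mechanics concrete, here is a conceptual sketch of that resolution step (Corda's real implementation lives in flows such as ResolveTransactionsFlow; the `LedgerTx` type and `lookupTransaction` function below are hypothetical):

```kotlin
// Conceptual sketch only; not Corda's implementation.
data class LedgerTx(val id: String, val inputIds: List<String>, val verify: () -> Unit)

// Supplied by the counterparty proposing the transaction (or local storage).
fun lookupTransaction(id: String): LedgerTx = TODO("fetch from the proposer")

// Walk the dependency graph and re-run every ancestor's contract code locally.
// Note that no signer of a past transaction is ever contacted.
fun validateBackChain(tx: LedgerTx) {
    val seen = mutableSetOf<String>()
    fun walk(t: LedgerTx) {
        if (!seen.add(t.id)) return
        t.inputIds.forEach { walk(lookupTransaction(it)) }
        t.verify() // contract rules checked by the current participants themselves
    }
    walk(tx)
}
```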
We're looking into using a new Axon feature, the dead-letter queue (DLQ).
In our previous application (Axon 4.5.x) we have a lot of event handlers updating projections. Our default is to rethrow exceptions when they occur, which triggers a rollback of the database updates. It is perhaps not best practice to rely on this behaviour (because it cannot roll back everything; e.g., sending an email from an event handler cannot be reverted, of course).
This behaviour seems to get lost when introducing the DLQ in our applications, which has a big impact on our current application flow (projections are now updated where they previously weren't). This makes upgrading not that easy.
Is it possible to keep the old behaviour (transaction rolled back in case of exceptions) together with DLQ processing?
We tried building a test application to exercise the new DLQ features. While playing around, everything looked fine in the case of exceptions (they were moved to the DLQ), but the projections still got updated (not rolled back as before).
We throw an exception after the .save() of the projection, simulating a database failure, to see whether the events involved (we have multiple event handlers per event updating projections) get rolled back.
You need to choose here, @davince. Storing a dead letter in the dead-letter queue likewise requires a database transaction.
To ensure the token still progresses and the dead letter is entered, the framework uses the existing transaction.
Furthermore, in practical terms, event handling was successful.
Hence, rolling back some of the parts wouldn't be feasible.
The best way to deal with this, as mentioned in the Reference Guide, is to make your event handlers idempotent:
Before configuring a SequencedDeadLetterQueue it is vital to validate whether your event handling functions are idempotent. As a processing group consists of several Event Handling Components (as explained in the intro of this chapter), some handlers may succeed in event handling while others will not. As a configured dead-letter queue does not stall event handling, a failure in one Event Handling Component does not cause a rollback for other event handlers. Furthermore, as the dead-letter support is on the processing group level, dead-letter processing will invoke all event handlers for that event within the processing group.
Thus, if your event handlers are not idempotent, processing letters may result in undesired side effects. Hence, we strongly recommend making your event handlers idempotent when using the dead-letter queue.
The principle of exactly-once delivery is no longer guaranteed; at-least-once delivery is the reality to cope with.
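For illustration, here is a minimal sketch of an idempotent projection handler, assuming Spring Data JPA; the `CardSummary` entity, its repository, and the `CardIssuedEvent` are hypothetical, and only the Axon annotations are real API:

```kotlin
import jakarta.persistence.Entity // javax.persistence on older stacks
import jakarta.persistence.Id
import org.axonframework.config.ProcessingGroup
import org.axonframework.eventhandling.EventHandler
import org.springframework.data.jpa.repository.JpaRepository
import org.springframework.stereotype.Component

@Entity
class CardSummary(@Id val cardId: String = "", val balance: Int = 0)

interface CardSummaryRepository : JpaRepository<CardSummary, String>

data class CardIssuedEvent(val cardId: String, val initialBalance: Int)

@Component
@ProcessingGroup("card-summary")
class CardSummaryProjector(private val repository: CardSummaryRepository) {

    @EventHandler
    fun on(event: CardIssuedEvent) {
        // Idempotent: handling the same event twice (e.g. when a dead letter
        // is retried) leaves the projection unchanged instead of failing on
        // a duplicate key or double-applying the update.
        if (!repository.existsById(event.cardId)) {
            repository.save(CardSummary(event.cardId, event.initialBalance))
        }
    }
}
```

The same pattern applies to updates: make them absolute (set a value) rather than relative (increment), or guard them with a version or sequence check.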
I understand this requires some rework on your end, @davince. Although the feature is powerful, it comes with a certain level of responsibility before you use it.
I hope the above clarifies this for you.
As an addition, I'd like to point out that the version upgrade in itself does not require you to use the dead-letter queue. Hence, this change shouldn't impose any strain on updating to the latest release.
Update 1
Sometimes you need to think about an issue a bit longer. I was just wondering about the following things in your setup; perhaps I can help out on that front:
What storage mechanism do you use to store projections in?
Where are you storing your tokens?
Where are you planning to store your dead-letters?
How are you invoking the storage layer from your event handlers?
Update 2
Thanks for sharing that you're using PostgreSQL for your projections, tokens, and dead letters. And that you're using JPA entities for storage.
It gives more certainty about your setup, as it may impact how your system would react in case of a rollback.
However, as your setup is rather vanilla/regular, the shared comment from the Reference Guide still applies.
This, sadly enough, means some work on your end, @davince. I hope the road forward to start using the SequencedDeadLetterQueue from Axon Framework is clear.
By the way, if you ever have recommendations on how the framework or the documentation may be improved, be sure to file issues in GitHub here and here, respectively.
As you can see from this other question, a flow is sent to the flow hospital when a unique database constraint is violated:
org.h2.jdbc.JdbcSQLIntegrityConstraintViolationException: Unique index or primary key violation:
This is clearly never going to resolve itself, so I want the flow to fail instead of going to the hospital.
It is currently hospitalized due to Corda's built-in rules.
Is it possible to modify these rules so that this exception is not sent to the hospital?
Unfortunately, according to the official documentation, this type of error goes to the flow hospital:
Database constraint violation (ConstraintViolationException): This scenario may occur due to natural contention between racing flows as Corda delegates handling using the database’s optimistic concurrency control. If this exception occurs, the flow will retry. After retrying a number of times, the errored flow is kept in for observation.
So there are two things you can do:
Go to the database and modify the existing record that's colliding with the record the flow is trying to add.
Go to your node's terminal and kill the flow (for example with the flow kill command in the node shell, or over RPC, as sketched below).
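If you prefer doing that programmatically, here is a minimal sketch using killFlow over RPC (available since Corda 4.3); the address, credentials, and run id are placeholders:

```kotlin
import net.corda.client.rpc.CordaRPCClient
import net.corda.core.flows.StateMachineRunId
import net.corda.core.utilities.NetworkHostAndPort
import java.util.UUID

// Kill a hospitalized flow by its state machine run id (visible in the node
// logs and in the shell's `flow watch` output).
fun killStuckFlow(flowRunId: String) {
    val client = CordaRPCClient(NetworkHostAndPort("localhost", 10006)) // placeholder address
    client.start("user1", "password").use { connection ->               // placeholder credentials
        val killed = connection.proxy.killFlow(StateMachineRunId(UUID.fromString(flowRunId)))
        println(if (killed) "Flow $flowRunId killed." else "Flow $flowRunId was not killed.")
    }
}
```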
The Corda introduction to consensus says "uniqueness consensus is provided by notaries."
Are we saying that, without a notary, it would be possible for A to convince B to commit a transaction to its ledger involving a state X as an input and, at the same time or later, convince C to commit a different transaction involving X to its ledger?
In this situation the ledger of A would be inconsistent with that of C (or B, or both, depending on which transaction, if any, A chooses to commit), and A would have created a situation that is inconsistent now and can never become consistent between A, B and C.
Presumably the Corda framework tries to prevent this kind of thing as far as possible, so is this all about honesty? I.e., are we talking about the situation where A completely subverts its own infrastructure, i.e. doesn't use Corda as intended, and lies in all the messages it sends to other parties?
Update: this question was initially asked due to my mistaken belief that notaries were an optional element of a Corda system. They are not, but their involvement may be optional for particular transactions, e.g. ones that have no input states (and therefore, by their nature, no double-spend issue).
The important thing that @joel makes clear in his answer is that the double-spend issue can be a problem even if all parties trust each other, i.e. even when no malicious behavior is expected.
Once a party in Corda determines that validity consensus has been reached for a transaction, it can immediately commit the transaction to its own ledger; it does not first try to reach some kind of additional BFT-style consensus with the other parties that they can and will definitely commit the transaction to their respective ledgers as well.
So in the above scenario A could honestly/mistakenly propose two different transactions to B and C. B and C would each reach validity consensus on their respective transactions and commit them to their own ledgers, with A only being confronted with the double-spend issue when it afterwards tried to commit the second of the two transactions to its own ledger.
The notary avoids such situations (whether the result of malicious intent or not).
There are two reasons you need a notary:
Malicious nodes: A node purposefully extracts a consumed state from its vault, consumes it in another transaction, and sends the transaction to a counterparty who didn't see the original transaction
Race conditions: Two nodes simultaneously propose transactions consuming the same state (see the sketch below)
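To see why a single point of uniqueness resolves both cases, here is a toy, in-memory uniqueness check; it illustrates the concept only and is nothing like Corda's actual notary implementation:

```kotlin
// Toy stand-in for a state reference (Corda has its own StateRef type).
data class Ref(val txId: String, val index: Int)

class ToyUniquenessService {
    // First transaction to consume each state reference wins.
    private val consumedBy = mutableMapOf<Ref, String>()

    @Synchronized // serialises racing requests, so exactly one proposal wins
    fun commit(txId: String, inputs: List<Ref>): Boolean {
        if (inputs.any { consumedBy[it] != null && consumedBy[it] != txId }) {
            return false // double-spend attempt, malicious or accidental
        }
        inputs.forEach { consumedBy[it] = txId }
        return true
    }
}
```

Whether A is malicious or just racing, only one of the two conflicting transactions can ever obtain the notary's signature, so B's and C's ledgers can never diverge permanently.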
How can I do a transaction rollback in Corda? Let's say I have a complex flow which includes two subflows, and I want to roll back the earlier transaction if the later one fails. How can I do that in Corda? Or do I need to redesign my complex flow, or invalidate the previously created state myself? E.g., I have a main flow; in it, I call a subflow which creates a new state (or updates some state). Now suppose, for some reason, the main flow fails. How do I roll back the transaction created by my earlier subflow?
Once a transaction has been notarised, it is final and cannot be rolled back. However, depending on how the transaction's contracts are written, it may be possible to consume the newly created state to create the old state again.
Regarding your comment, the broadcast cannot "fail" in Corda unless one of the nodes permanently leaves the network. ACKs are used to ensure messages between nodes are always received.
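As a sketch of that approach: a compensating flow consumes the newly created state and re-issues the previous state's data in a new transaction. `MyState`, `MyContract`, and the `Revert` command below are hypothetical; your real contract must explicitly permit this transition:

```kotlin
import co.paralleluniverse.fibers.Suspendable
import net.corda.core.contracts.StateAndRef
import net.corda.core.flows.FinalityFlow
import net.corda.core.flows.FlowLogic
import net.corda.core.flows.InitiatingFlow
import net.corda.core.flows.StartableByRPC
import net.corda.core.transactions.SignedTransaction
import net.corda.core.transactions.TransactionBuilder

@InitiatingFlow
@StartableByRPC
class RevertFlow(
    private val unwanted: StateAndRef<MyState>, // state created by the earlier subflow
    private val previous: MyState               // a copy of the data it replaced
) : FlowLogic<SignedTransaction>() {

    @Suspendable
    override fun call(): SignedTransaction {
        val builder = TransactionBuilder(unwanted.state.notary)
            .addInputState(unwanted)                 // consume the unwanted state
            .addOutputState(previous, MyContract.ID) // re-issue the old data
            .addCommand(MyContract.Commands.Revert(), ourIdentity.owningKey)
        builder.verify(serviceHub)
        val stx = serviceHub.signInitialTransaction(builder)
        // Add counterparty sessions here if other nodes participate in the state.
        return subFlow(FinalityFlow(stx, emptyList()))
    }
}
```

Note that this is not a rollback in the database sense: both the original transaction and the compensating one remain on the ledger; the old data merely becomes the latest unconsumed state again.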
Recently we have been troubleshooting a performance problem in our application. The code was initially built on the M13 version of the Corda tutorial code; we followed Corda's releases and it is now updated to V2.0. The business case is simple: Party A uploads a contract document with some metadata in a form, then sends the transaction to Party B. We defined some simple conditions in the verify function, so normally the transaction completes without any manual action. In our local environment this process took around 3 seconds (with one 2.9 MB attachment), but when we deploy it to our dev environment, where H2 is hosted on a separate server from the CorDapp, it always takes 15-20 seconds to complete, with one notary node.
We enabled the H2 trace log feature, and from the log we found that 165 SQL statements were executed: 114 selects, 31 inserts, 16 deletes, 2 updates and 2 alters. Our flow is mostly similar to the tutorial's code, except that our acceptor flow has a verify function similar to the initiator flow's, and we have an attachment whereas the tutorial doesn't.
Using the same approach, I executed one create-IOU transaction on the Corda example code, which is based on V1.0 (as there is no V2.0 example code, I could only do it on V1.0). For that transaction, 118 SQL statements were executed: 74 selects, 28 inserts, 14 deletes and 2 updates.
There are also lots of "SET LOCK_MODE" statements and COMMITs, and checkpoints are deleted and inserted frequently. So we would like your comments on the questions below; kindly help with this. Thanks.
Are so many SQL executions reasonable for a transaction, and must they all happen to complete one transaction?
As we may not be able to understand the purpose of each SQL execution, do you have any suggestions on what we should do next to find the root cause? Is 15-20 seconds per transaction normal when we host the H2 database and the notary on separate servers? Our CorDapps (Party A, Party B and the notary) and the H2 database are each hosted on separate Azure VMs.
Work on Corda has focussed thus far more on functionality than on performance. There are many performance improvements - in the database and elsewhere - that will be implemented in future releases.
However, there is no reason for a transaction to take 15-20 seconds.
How many times are you running the transaction? I ask because every time a node sees a new attachment as part of a transaction, it caches and stores it for later reference. This means the large attachment only needs to be sent across the wire for the first transaction. If you send 10 transactions referencing the same attachment, it will only be downloaded the first time; the other 9 will be much faster. Do you observe this improvement?
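For reference, here is a minimal sketch of how a flow can reuse an attachment the node has already stored, referencing it by hash so it never travels over the wire again; the uploader name and filename are placeholders:

```kotlin
import net.corda.core.crypto.SecureHash
import net.corda.core.node.ServiceHub
import net.corda.core.transactions.TransactionBuilder
import java.io.InputStream

// Attach a document (which must be a JAR/ZIP) to a transaction, importing it
// into the node's attachment storage only if it isn't already there.
fun addDocument(
    serviceHub: ServiceHub,
    builder: TransactionBuilder,
    knownHash: SecureHash?,   // hash from an earlier transaction, if any
    jarStream: () -> InputStream
): SecureHash {
    val attachmentId =
        if (knownHash != null && serviceHub.attachments.hasAttachment(knownHash)) {
            knownHash // already cached; nothing to upload
        } else {
            serviceHub.attachments.importAttachment(jarStream(), "uploader", "contract-doc.zip")
        }
    builder.addAttachment(attachmentId) // only the hash is placed in the transaction
    return attachmentId
}
```

Subsequent transactions that reference the same hash resolve the attachment from each node's local storage, which is why only the first transaction should pay the 2.9 MB transfer cost.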