How does a Raft node know it has voted after recovering from a crash? - raft

If a Raft node has voted for some candidate, then crashes before it could persist the vote info, will the server be able to vote again after it restarts?

The way this should work is to persist the vote before sending the reply.
In the worst case, when the candidate does not receive enough votes (because voters crashed right after persisting, or because votes were lost on the network), it simply starts the election again.
Please note the highlighted text from the Raft paper: https://raft.github.io/raft.pdf
This can be confirmed with the visualization at https://raft.github.io/.
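A minimal sketch of that ordering, assuming a made-up on-disk state file and ignoring the log up-to-date check: the node makes currentTerm and votedFor durable before the RequestVote reply leaves the process, so a crash may lose the reply, but never the recorded vote.

```python
import json, os

STATE_PATH = "raft_state.json"  # hypothetical location of this node's durable state

def persist_state(current_term, voted_for):
    """Write currentTerm/votedFor to stable storage and fsync BEFORE replying."""
    tmp = STATE_PATH + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"currentTerm": current_term, "votedFor": voted_for}, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, STATE_PATH)  # atomic rename: a crash leaves either the old or the new state

def handle_request_vote(state, req):
    """Simplified RequestVote handler (log up-to-date checks omitted)."""
    if req["term"] < state["currentTerm"]:
        return {"term": state["currentTerm"], "voteGranted": False}
    if req["term"] > state["currentTerm"]:
        state["currentTerm"], state["votedFor"] = req["term"], None
    granted = state["votedFor"] in (None, req["candidateId"])
    if granted:
        state["votedFor"] = req["candidateId"]
    persist_state(state["currentTerm"], state["votedFor"])  # durable before the reply is sent
    return {"term": state["currentTerm"], "voteGranted": granted}
```

On restart the node reloads raft_state.json, so it still knows it has already voted in that term even if the reply granting the vote never reached the candidate; at worst the candidate times out and starts a new election, as described above.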

Related

Back chain validation under node failures in R3 Corda

I am new to Corda. My question is not about any particular implementation, but more of an architectural question.
What happens during back chain validation if one of the nodes involved permanently dies and fails to respond? How is that transaction validated?
I have seen this issue, but it only talks about how transaction volume could slow down validation. Does validation come to a grinding halt if one of the nodes fails permanently?
As per the Corda webinar on Consensus, in the example about 5 minutes into the video, the back chain is Charlie -> Dan -> Alice -> Bob. In this chain, if either Charlie or Dan is unavailable, the proposed transaction cannot be validated. The same webinar further says that this is not a problem in other blockchains such as Ethereum.
Applications that can foresee the need for a highly available record keeper can surely accommodate such a node during the design phase, as suggested by Adel Rustum.
However, a globally deployed, privacy-conscious application that is reluctant to leak information could suffer many transaction-validation failures due to the vagaries of a wide-area network. Thoughts?
The short answer is that transaction verification will fail (if that node was the only node that had that transaction); and that's the point of using a DLT (or a blockchain). If you can't trace the history of a certain block of data back to genesis, then you can't verify how that block and its ancestors were created.
As for the issue that you referenced in your question: Corda Enterprise 4.4 introduced a new feature called bulk back-chain fetching, which lets you modify the way the transactions needed to verify a certain transaction are fetched. Previously it was depth-first; now you can change that to breadth-first and specify how many transactions you want to fetch in one call. More details in this video.
Back chain validation doesn't depend on the nodes that were part of a transaction in the past. The back chain is validated only by the nodes that are part of the current, ongoing transaction. The other nodes that took part in past transactions involving the evolution of the state in question don't need to be contacted (or stay online) while the back chain is validated.
Back chain validation only involves checking that all the past transactions on a particular state used as input in the current transaction are valid. Validity is checked by running the contracts again for those previous transactions. There is no real need to reach out to the parties of a previous transaction.
However, you do need to make sure that the parties involved in the current transaction are online and responding, since you need signatures from them to complete the transaction successfully.
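To make the mechanics concrete, here is a rough, language-agnostic sketch (plain Python, not the Corda API; all names are invented) of what back-chain resolution amounts to: the verifying node walks the inputs of the current transaction back through the transactions it already holds, re-running each contract locally, and only contacts a counterparty when a dependency transaction is missing from its local store.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class Tx:
    tx_id: str
    input_ids: List[str]            # ids of the transactions whose outputs this tx consumes
    contract_verify: Callable       # the contract code; re-run locally by every verifier

def verify_back_chain(tx: Tx, vault: Dict[str, Tx], request_from_counterparty):
    """Verify a transaction and every transaction in its back chain.

    vault: this node's local store of already-known transactions
    request_from_counterparty: called only when a dependency is missing locally
    """
    for dep_id in tx.input_ids:
        dep = vault.get(dep_id)
        if dep is None:
            # The only point where another node must be reachable: handing over
            # a missing dependency. The verification itself happens locally.
            dep = request_from_counterparty(dep_id)
            vault[dep_id] = dep
        verify_back_chain(dep, vault, request_from_counterparty)
    tx.contract_verify(tx)          # re-run the contract for this transaction
```

This is why only the current counterparties (and the notary) need to be online: once the dependency transactions have been handed over, re-verification is purely local work.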

Do we really need a notary that validates?

At the risk of sounding naive, I ask myself "Is a validating notary necessary?", given all the issues with it - transaction and dependency leaks and exposure of the state model, to name a few.
The answer I hear has something to do with the potential attack where a dishonest node tries to steal someone else's asset. For example, in a legitimate transaction, partyA sold partyB some asset S subject to a Move contract. Immediately afterwards, partyA creates a self-signed transaction that transfers S back to himself, subject to a dummy contract, in a bogus flow that does not even run the ledger transaction verify(). But when he calls FinalityFlow to get the simple notary to commit the transaction on the ledger, it will fail verifyContracts(), because S points to the Move contract, which says owner partyB must sign the bogus transaction.
So that does not convince me of the need for a validating notary.
Apparently, I must have missed something. Can someone enlighten me?
Thanks.
Sean
As you say, the advantage of the validating notary is that it prevents a malicious party from "wedging" a state - that is, creating an invalid transaction that consumes an unconsumed state that they are aware of. Although the transaction is invalid, the non-validating notary would not be aware of this, and would mark the state as spent.
You're correct that this invalid transaction would fail the FinalityFlow's call to verifyContracts(). However, we cannot force a malicious node to use FinalityFlow to send something to the notary. If the malicious node was sufficiently motivated, they could build the invalid transaction hash by hand and send that to the notary directly, for example.
However, note that in the non-validating notary case, there are still multiple layers of protection against wedging:
The malicious party has to be aware of the state(s) they want to wedge. Since information in Corda is only distributed on a need-to-know basis, the node would only be aware of a small subset of unconsumed states
If the state is wedged by accident, the node can show the invalid transaction to the notary. Upon seeing that the transaction is invalid, the notary will re-mark the state as unconsumed
As Corda is a permissioned network, the notary can keep a record of the legal identity of everyone who submits a transaction. If a malicious node is submitting invalid transactions to consume states, their identity will be known, and the dispute can be resolved outside of the platform, based on the agreements governing participation in the network
On the other hand, SGX addresses the data leak issue associated with validating notaries. If the validation of the transaction occurs within an SGX enclave, the validating notary gains no information about the contents of the transaction they are validating.
Ultimately, the choice between validating and non-validating notaries comes down to the threat model associated with a given deployment. If you are using Corda in a setting where you are sure the participants won't deliberately change their node's code or act maliciously, then a validating notary is not needed.
But if you assume somebody WILL try to cheat and would be willing to write their own code to do so, then a validating notary provides an extra layer of protection.
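As a toy model of the difference (plain Python, not Corda code; the transaction structure here is invented for illustration): a non-validating notary only tracks which input state references have already been consumed, so an invalid transaction can still wedge the states it names, whereas a validating notary runs the contract verification before recording the spend.

```python
class DoubleSpendError(Exception):
    pass

class NonValidatingNotary:
    def __init__(self):
        self.spent = {}                       # state ref -> id of the tx that consumed it

    def commit(self, tx):
        # Only a uniqueness check: an invalid tx can still "wedge" the states it names.
        for ref in tx["inputs"]:
            if ref in self.spent:
                raise DoubleSpendError(ref)
        for ref in tx["inputs"]:
            self.spent[ref] = tx["id"]

class ValidatingNotary(NonValidatingNotary):
    def commit(self, tx):
        tx["verify"]()                        # runs the contract code, so it sees the full transaction
        super().commit(tx)
```

The price of the extra check in the second class is exactly the data-leak concern discussed above: the validating notary must see the whole transaction (unless the check runs inside SGX).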
So Corda provides choices:
Choose to reveal more to a notary cluster if you trust the participants relatively less...
Choose to reveal less to a notary cluster if you consider the risk of revealing too much to the notaries to be the bigger problem
(and use SGX if you're paranoid about everybody!)

Race condition between two network clients

I have the following problem: I have two network clients, where one is a device that is to be "claimed" by its owner, and the other is the program which claims it. When the claimee hits the server, it announces it's available to be claimed, and then the claimer can claim it (after authenticating and supplying information only it could know, of course). However, if the claimer hits the server first, then I have a classic "lost signal" problem. The claimer can retry and that's fine, but I can end up with the following race condition, the main point in question:
Claimee hits the server and announces, then its connection fails
Claimer comes in, finds the announced record, and claims it
Claimee reconnects with a status of unclaimed, and overwrites the claim
I've thought of a few solutions:
1) Expire old claimee announces after 60 seconds, and have the claimer retry. This is still susceptible to the above problem, but shrinks the window to about 60 seconds. In addition, the claimee takes about 30-40 seconds to bootstrap, so it should pragmatically make the problem very hard to encounter, or reproduce.
2) Have claims issued by the claimer be valid for any claimee announce up to 30 seconds after the claim came in. This works, but starts to muddle the definition of a claimee announce: the announce is no longer always interpreted as "reset the claimee status," because for up to 30 seconds after the last claim it means "join to the last claim."
Those are the high points, but may not be enough of a description of the problem, so let me know if I can add any comments to elucidate further. These are workable solutions, but I'm looking for an analogy to a known problem perhaps, and to see if there are ideas I haven't thought of.
Maybe I didn't understand the problem description correctly, but you also have another problem: what if both are connected just fine and then the claimee fails? The claimer will need to deal with this issue as well, unless you're assuming that this scenario can never happen.
In general there are several ways to implement a solution for both problems, but probably the most reliable one would be inspired by the implementation used by Java's RMI.
When you send a message to the claimee, add a unique ID to it. When you don't get an answer, you can retry sending the message several times with the same ID (messages can get lost), and after some longer timeout you can accept that the claimee is unavailable. Then you can look up the connection information at the server again and restart the process.
For this you'd need to cache, on the claimer's side, all messages which haven't yet been processed. Additionally, on the claimee's side, you'll need to cache the last X message IDs and their results (if available). This is necessary so that the operations in a message are not performed multiple times, and so that the claimee can reply with the correct result again (since result messages can get lost too).
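A rough sketch of that scheme in Python (the class names, cache size, retry count, and timeouts are invented for illustration): the claimer retries a message with the same ID until it gets an answer, and the claimee keeps a small cache of recently seen IDs and their results so retries are answered without re-executing the operation.

```python
import time
from collections import OrderedDict

class Claimee:
    """Caches the last N message IDs and their results so retried messages
    can be answered again without repeating the operation."""
    def __init__(self, max_cached=100):
        self.max_cached = max_cached
        self.results = OrderedDict()            # message id -> cached result

    def handle(self, msg_id, operation):
        if msg_id in self.results:              # a retry of something already processed
            return self.results[msg_id]
        result = operation()                    # do the real work exactly once
        self.results[msg_id] = result
        if len(self.results) > self.max_cached:
            self.results.popitem(last=False)    # evict the oldest entry
        return result

def send_with_retries(send, msg_id, payload, attempts=5, delay=2.0):
    """Claimer side: resend the same message ID a few times, then give up
    and go back to the server for fresh connection information."""
    for _ in range(attempts):
        reply = send(msg_id, payload)
        if reply is not None:
            return reply
        time.sleep(delay)
    return None                                 # claimee presumed unavailable
```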

Many small writes to SQLite

I have an application which runs all the time and receives messages (the rate varies from several per second to none for hours). Every message should be put into a SQLite database. What's the best way to do this?
Opening and closing the database on each message doesn't sound good: if there are tens of them per second, it will be extremely slow.
On the other hand, opening the database once and just writing to it can lead to loss of data if the process unexpectedly terminates.
It sounds like whatever you do, you'll have to make a trade-off.
If safety is your top-most concern, then update the database on each message and take the speed hit.
If you want a compromise, then write to the database every so many messages. For instance, maintain a buffer and, every 100th message, issue an update wrapped in a transaction.
The transaction wrapping is important for two reasons. First, it maximizes speed. Second, it can help you recover from errors if you employ logging.
If you do the batch update above, you can add an additional level of safety by logging each message to a file as it comes in. You reset this log every time a database update is successfully issued. That way, if an update fails, you know it failed for the entire block (since you are using transactions), and your log will have the information that did not update. This will allow you to re-issue the update, or even see whether there was a problem with the data that caused the failure. This of course assumes that keeping a log is cheaper than updating the database, which can be the case depending on how you are connecting.
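For illustration, a minimal sketch of that batching approach using Python's built-in sqlite3 module (the table name, schema, and batch size are made up, and the on-disk message log is omitted):

```python
import sqlite3

conn = sqlite3.connect("messages.db")
conn.execute("CREATE TABLE IF NOT EXISTS messages (id INTEGER PRIMARY KEY, body TEXT)")

buffer = []
BATCH_SIZE = 100        # flush every 100th message

def on_message(body):
    buffer.append((body,))
    if len(buffer) >= BATCH_SIZE:
        flush()

def flush():
    if not buffer:
        return
    with conn:          # wraps the whole batch in one transaction (commit on success, rollback on error)
        conn.executemany("INSERT INTO messages (body) VALUES (?)", buffer)
    buffer.clear()      # only reset the buffer (and any message log) after a successful commit
```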
If your top rate is "several per second" then I don't see a real problem with opening and closing the db. This is especially true if it's critical that the data be recorded right away in case of server failure.
We use SQLite in a reporting product, and the best performance we have been able to eke out is recording rows in blocks of several thousand at a time. Our default setting is around 50k. That means our app waits until 50k rows of data are collected, then commits them as one transaction.
There is an easy algorithm to adjust your application's behaviour to the message rate:
When you have just written a message, check if there is any new message.
If yes, write that message too, and repeat.
Only when you have run out of immediately available messages, commit the transaction and close the database.
In that manner, every message will be saved immediately, unless the message rate becomes too high for that.
Note: closing the database will not increase data durability (that's what transaction commit is for), it will just free up a little bit of memory.
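A small sketch of that loop, assuming for illustration that incoming messages arrive on a standard queue.Queue and use a made-up one-column schema:

```python
import queue
import sqlite3

def writer_loop(messages: queue.Queue):
    while True:
        msg = messages.get()                        # block until at least one message arrives
        conn = sqlite3.connect("messages.db")
        conn.execute("CREATE TABLE IF NOT EXISTS messages"
                     " (id INTEGER PRIMARY KEY, body TEXT)")
        with conn:                                  # one transaction covering the whole burst
            conn.execute("INSERT INTO messages (body) VALUES (?)", (msg,))
            while True:
                try:
                    msg = messages.get_nowait()     # another message immediately available?
                except queue.Empty:
                    break                           # queue drained: commit on leaving the block
                conn.execute("INSERT INTO messages (body) VALUES (?)", (msg,))
        conn.close()                                # frees memory; durability came from the commit
```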

Projecting simultaneous database queries

I’m after some thoughts on how people go about calculating database load for the purposes of capacity planning. I haven’t put this on Server Fault because the question is related to measuring just the application rather than defining the infrastructure. In this case, it’s someone else’s job to worry about that bit!
I’m aware there are a huge number of variables here but I’m interested in how others go about getting a sense of rough order of magnitude. This is simply a costing exercise early in a project lifecycle before any specific design has been created so not a lot of info to go on at this stage.
The question I’ve had put forward from the infrastructure folks is “how many simultaneous users”. Let’s not debate the rationale of seeking only this one figure; it’s just what’s been asked for in this case!
This is a web front end, SQL Server backend with a fairly fixed, easily quantifiable audience. To nail this down to actual simultaneous requests in a very rough fashion, the way I see it, it comes down to increasingly granular units of measurement:
Total audience
Simultaneous sessions
Simultaneous requests
Simultaneous DB queries
This doesn't account for factors such as web app caching, partial page requests, record volume, etc., and there's some creative license needed to define the frequency of requests per user, the number of DB hits, and execution time, but it seems like a reasonable starting point. I'm also conscious of the need to scale for peak load, but that's something else that can be plugged into the simultaneous sessions if required.
This is admittedly very basic and I’m sure there’s more comprehensive guidance out there. If anyone can share their approach to this exercise or point me towards other resources that might make the process a little less ad hoc, that would be great!
I will try, but obviously without knowing the details it is quite difficult to give precise advice.
First of all, the infrastructure guys might have asked this question from the licensing perspective (SQL Server can be licensed per user or per CPU).
Now back to your question. "Total audience" is important if you can predict/work out this number. This can give you the worst case scenario when all users hit the database at once (e.g. 9am when everyone logs in).
If you store session information you would probably have at least 2 connections per user (1 session + 1 main DB). But this number can be (sometimes noticeably) reduced by connection pooling (depends on how you connect to the database).
Use a worst-case scenario: 50 system connections + 2 * number of users.
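For example (with made-up audience and concurrency figures), the arithmetic looks like this:

```python
total_audience     = 2000       # hypothetical fixed user base
peak_concurrency   = 0.25       # assume 25% of users active at the worst moment (e.g. 9am login)
system_connections = 50

simultaneous_users     = int(total_audience * peak_concurrency)        # 500
worst_case_connections = system_connections + 2 * simultaneous_users   # 1050
print(worst_case_connections)   # before any reduction from connection pooling
```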
Simultaneous requests/queries depend on the nature of the application. Need more details.
More simultaneous requests (to your front end) will not necessarily translate to more requests on the back end.
Having said all of that - for the costing purposes you need to focus on a bigger picture.
A SQL Server license (if my memory serves me right) will cost ~128K AUD (dual Xeon). Hot/warm standby? Double the cost.
Disk storage - how much storage will you need? Disks are relatively cheap but if you are going to use SAN the cost might become noticeable. Also - the more disks the better from the performance perspective.

Resources