In Corda, how are attachments hashed and verified?

I have the following questions about the SecureHash generated by Corda when uploading an attachment JAR file to the node:
Is the hash always unique based on the content of the file?
If the contents of the JAR file change, is the hash guaranteed to change (just like with a normal hash function)?
Can the counterparty verify the integrity of the hash (and check the attachment has not been tampered with) at their end?
Can the counterparty verify the authenticity of the jar (that it has been signed by the uploader)?

In theory, SecureHash is a wrapper for hashes generated using various hashing algorithms. In practice, only SecureHash.SHA256, which uses the SHA-2 hash function family, has been implemented as of Corda 3.1. A hash collision with this algorithm is extremely unlikely.
Likewise, it is extremely unlikely that the hash will stay the same if the JAR contents change.
Upon receiving an attachment for the first time, the node checks that it is a valid JAR file before importing it into its attachment storage. Instead of being sent the attachment hash by the counterparty, it calculates the attachment's hash itself and uses that hash as the attachment's ID.
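If it helps to see the mechanics, the sketch below (plain JCA, not Corda's own SecureHash API, and with a hypothetical file name) shows how a SHA-256 attachment ID can be computed directly from the JAR's bytes, which is essentially what the node does on import:

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;

public class AttachmentHashSketch {
    public static void main(String[] args) throws Exception {
        // Hash the raw bytes of the attachment JAR with SHA-256.
        byte[] jarBytes = Files.readAllBytes(Paths.get("my-attachment.jar")); // hypothetical path
        byte[] digest = MessageDigest.getInstance("SHA-256").digest(jarBytes);

        // Hex-encode the digest; an attachment ID has this general form.
        StringBuilder hex = new StringBuilder();
        for (byte b : digest) hex.append(String.format("%02X", b & 0xff));
        System.out.println(hex);
    }
}
```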
JAR signing is not implemented as of Corda 3.1. The node simply checks that the JAR file is valid.

Related

Clarifying Vault key decryption process

I'm trying to understand Vault workflow w.r.t. keys, e.g.: https://www.vaultproject.io/docs/concepts/seal
As I understand,
unseal (shared) keys are provided on init
they're used to acquire the combined key
combined key is then used to decrypt a root (master) key (which is apparently stored in the sealed vault)
root key is then used to decrypt the data encryption key (or a keyring which contains it?..)
the data encryption key is then used to en/decrypt the data in Vault
I get the unseal keys on init, how can I inspect the other keys? Is it impossible / are those keys just stored somewhere internally in Vault?
Unsealing is the process of obtaining the plaintext root key necessary to read the decryption key to decrypt the data, allowing access to the Vault.
Is the data encryption key / keyring also decrypted during the unseal, or is it... maybe decrypted on each Vault operation (so only the root key is stored somewhere in plaintext after the unseal)?
Is it ok that the root key is stored in plaintext after the unseal? Or is it still protected by passwords/tokens?.. Or if it's just transiently used to decrypt the data encryption key / keyring, then how are those protected? I guess it has something to do with the lock icon on the image :)
I'm somewhat confused about how it all works.
Vault, like any other software, has the concept of a super user. That concept is called the "root token" (not to be confused with the root key, which we'll come back to).
A full lecture on Vault internal architecture is way beyond the scope of a StackOverflow answer. Here is my attempt to clear up the confusion, leaving the details for you to explore.
Two takeaways to begin with:
Vault stores a strong hash of the root token (if it is still valid)
Vault does not store shards at all, anywhere, ever.
Super user access (aka root token)
When Vault is initialized, it will give you two sets of data:
The all-powerful root token
A number of Shamir shards (5 is the default, 1 in DEV mode)
The initial root token is always given out in plain text, never encrypted. It is meant to be used right away to perform the initial configuration.
Best practice is to use the root token only to:
Create a very powerful policy (let's call it almost-root)
Setup at least one authentication method
Bind one account of the authentication method to the almost-root policy
Revoke the root token (with vault token revoke -self)
At this point you can be almost-root, which should be behind some two-factor auth and strong auditing, and have a limited validity period (say, 30 minutes), etc. Your CI/CD tool is often able to be almost-root.
Now let's say that, for security reasons, the almost-root policy does not allow changing the audit devices (adding or removing them).
To change the audit device configuration, you need to get a new root token. To get one, you must generate it from the shards you got at initialization time. It is a security measure: no one can become root on their own. You must collude with others, which decreases the payoff of wrongdoing and increases the risk of getting caught.
But let's move on and talk about generating a root token.
Shards and getting the root token
Vault will give you the shards in plain text. Shards are points on a Shamir curve.
When you initialize Vault, you should send public keys so that the shards come out encrypted. You can provide public keys on the command line (as file names) or with keybase.io aliases.
You can then safely distribute one shard per "shard holder". Have them decrypt and test their shard right away by generating a root token and immediately revoking it. You don't want to find out later that a user misplaced his private key or whatever.
You must test the shards on a regular basis. People come and go, computers crash. When you get close to your threshold, you must rekey to get new shards to new "shard holders". If your enterprise has a physical safe, consider generating enough extra shards to generate a root token, decrypting them, storing them on a CD and printing them out, and putting all of that in a sealed envelope in the safe.
Generating a root token should be rare, but we actually use it at the end of every sprint to generate an almost-root token that we give our CI/CD tool. That means that in day-to-day use, nobody has super powers in Vault. Define "super powers" to fit your operational reality.
Internal encryption
So the root token does not participate in the encryption at all, or else the system would stop working when the root token is revoked. And you can't have 3 or more shard holders around for every decryption Vault does (and it can do a lot).
So Vault has an internal encryption key that is encrypted with key material held outside of Vault. Suffice it to say that it is either:
Encrypted with a key made from the shards
Encrypted with a KMS or HSM
When Vault starts, it cannot read its own storage, whatever storage backend you use. It is behind a cryptographic "barrier". You can give Vault credentials to your KMS so that it can decrypt the Key Encryption Key (KEK) that will give it its internal encryption key. That process is called auto-unseal.
Where do you store the Vault credentials to auto-unseal with a KMS or HSM you might ask? Good question, you have to start somewhere. Maybe you set your cloud policy to allow a given pod running Vault to access your KMS without a password.
If you can't setup auto-unseal, Vault will start in sealed state. You must enter a given number of shards (3 is the default) to allow Vault to generate that KEK and unseal itself. It will run like that until restarted or manually sealed.
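To make the "barrier" idea concrete, here is a minimal envelope-encryption sketch in Java. It only illustrates the KEK/DEK pattern described above (using AES-GCM for both layers); it is not Vault's actual implementation:

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.security.SecureRandom;

public class EnvelopeSketch {
    static final SecureRandom RNG = new SecureRandom();

    // Encrypt with AES-GCM, prefixing the random 12-byte IV to the ciphertext.
    static byte[] seal(SecretKey key, byte[] plaintext) throws Exception {
        byte[] iv = new byte[12];
        RNG.nextBytes(iv);
        Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
        c.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));
        byte[] ct = c.doFinal(plaintext);
        byte[] out = new byte[iv.length + ct.length];
        System.arraycopy(iv, 0, out, 0, iv.length);
        System.arraycopy(ct, 0, out, iv.length, ct.length);
        return out;
    }

    // Decrypt a blob produced by seal().
    static byte[] open(SecretKey key, byte[] blob) throws Exception {
        Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
        c.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(128, blob, 0, 12));
        return c.doFinal(blob, 12, blob.length - 12);
    }

    public static void main(String[] args) throws Exception {
        KeyGenerator kg = KeyGenerator.getInstance("AES");
        kg.init(256);
        SecretKey kek = kg.generateKey(); // stands in for the key rebuilt from shards, or held by a KMS/HSM
        SecretKey dek = kg.generateKey(); // the internal data-encryption key behind the barrier

        // At rest, only the *encrypted* DEK and encrypted data exist in storage.
        byte[] storedDek  = seal(kek, dek.getEncoded());
        byte[] storedData = seal(dek, "secret/foo = bar".getBytes());

        // "Unseal": once the KEK is available, it decrypts the DEK, which decrypts the data.
        SecretKey recoveredDek = new SecretKeySpec(open(kek, storedDek), "AES");
        System.out.println(new String(open(recoveredDek, storedData)));
    }
}
```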

Self-validating encrypted string - is method feasible?

I have a keystring which allows a customer to have additional features.
Obviously I would like the software to check that this string is valid, and not modified.
Is the following idea feasible:
get the key string as an encrypted value, and encode it in Base64
(my encrypted string is around 100 characters, for my purpose)
calculate the checksum (MD5), of course using a private salt
weave the checksum into the encrypted data
In principle :
xxxxCxxxxxxCxxxxxxxxCxxxxxxxxxxCxxxxxxxxxxxxxCxxx
the places to weave into the encrypted data could be determined by the first character of the encrypted string, creating up to 16 different patterns.
On checking the code validity I simply "unweave" the checksum, test if it's correct, and thereby know if the data has been modified.
Is my line of thought correct?
The cryptographic feature you're thinking of is called "authentication," and there are many well-established approaches. You should strongly avoid inventing your own, particularly using a long-outdated hash like MD5. When an encryption system is authenticated, it can detect changes to the ciphertext.
Your best approach is to use an authenticated cipher mode, such as AES-GCM. Used correctly, that combines encryption and authentication in a single operation. When decrypting with an authenticated scheme, the decryption will fail if the ciphertext has been modified.
If you don't have access to AES-GCM, the next option is AES-CBC+HMAC, which uses the more ubiquitous AES-CBC with a random IV, and appends a type of keyed hash (called an HMAC) to the end of the message to authenticate it. In order to authenticate, you need to remove the HMAC, use it to validate that the ciphertext is unmodified, and then proceed to decrypt normally. This scheme is generally called "encrypt then MAC."
The implementation details will depend on your language and frameworks.
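For example, in Java (JCA) an AES-GCM round trip might look like the sketch below; the key handling and sample plaintext are placeholders, and a real licence-key scheme would also need proper key management:

```java
import javax.crypto.AEADBadTagException;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.security.SecureRandom;

public class GcmExample {
    public static void main(String[] args) throws Exception {
        KeyGenerator kg = KeyGenerator.getInstance("AES");
        kg.init(256);
        SecretKey key = kg.generateKey();          // in practice this comes from your key management

        byte[] iv = new byte[12];                  // unique per encryption, stored next to the ciphertext
        new SecureRandom().nextBytes(iv);

        Cipher enc = Cipher.getInstance("AES/GCM/NoPadding");
        enc.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));
        byte[] ciphertext = enc.doFinal("feature-key-string".getBytes());

        ciphertext[3] ^= 1;                        // simulate tampering with one bit

        Cipher dec = Cipher.getInstance("AES/GCM/NoPadding");
        dec.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(128, iv));
        try {
            dec.doFinal(ciphertext);
        } catch (AEADBadTagException e) {
            System.out.println("modification detected, decryption refused");
        }
    }
}
```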

Authentication keys in smart cards

I use the JCManager tool to load applets onto my Java Cards. This software has three fields for authentication keys at the top, named S_ENC, S_MAC and DEK.
As I know, ENC stands for Encryption, MAC stands for Message Authentication Code and DEK stands for Data Encryption Key.
I want to know when they are used (at which step in the communication? INITIALIZE UPDATE, EXTERNAL AUTHENTICATE? ...).
Are all three of these keys used in every communication, or are some of them optional?
And where? (On the card, the terminal, or both?)
And also, I want to know: what is a KEK? Is there any KEK in smart cards?
Read the Global Platform Card specifications (registration required) on how the keys are used. The way they are used during authentication differs between versions of the Global Platform specification, so it's better to go straight to the source. For instance, E.4.2 of GPC 2.2 specifies:
Generating or verifying an authentication cryptogram uses the S-ENC session key and the signing method described in appendix B.1.2.1 - Full Triple DES.
The DEK - or a key derived from the given DEK - is used for additional encryption of confidential data, such as keys. It would for instance allow for wrapping of keys within a Hardware Security Module before sending them over the secure messaging channel (which may not encrypt at all, mind you). For older schemes it was required to also derive a DEK session key, which - paired with the awkward proprietary key derivation - made it near impossible to do so without programming the HSM specifically for Global Platform.
DEK is a more generic term than KEK (Key Encryption Key). It can be used for any data that needs to be kept confidential separately from the transport channel.
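As a rough illustration of that last point only (this is not the Global Platform session-key derivation), encrypting a new card key under a DEK with two-key 3DES in Java could look like the following; all key values are random placeholders:

```java
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;
import java.security.SecureRandom;
import java.util.Arrays;

public class DekWrapSketch {
    // Expand a 16-byte (2-key) 3DES key to the 24-byte K1||K2||K1 form the JCA expects.
    static byte[] expandTo24(byte[] k16) {
        byte[] k24 = Arrays.copyOf(k16, 24);
        System.arraycopy(k16, 0, k24, 16, 8);
        return k24;
    }

    public static void main(String[] args) throws Exception {
        SecureRandom rng = new SecureRandom();
        byte[] dek = new byte[16];        // placeholder DEK value
        byte[] newCardKey = new byte[16]; // placeholder key to be transported to the card
        rng.nextBytes(dek);
        rng.nextBytes(newCardKey);

        // Encrypt the new key value under the DEK before it travels over the channel.
        Cipher c = Cipher.getInstance("DESede/ECB/NoPadding");
        c.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(expandTo24(dek), "DESede"));
        byte[] wrapped = c.doFinal(newCardKey);

        System.out.println("wrapped key length: " + wrapped.length + " bytes");
    }
}
```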

How to safely de-duplicate files encrypted at the client's side?

Bitcasa's claim is to provide infinite storage for a fixed fee.
According to a TechCrunch interview, Bitcasa uses client-side convergent encryption. Thus no unencrypted data ever reaches the server. With convergent encryption, the encryption key is derived from the source data to be encrypted.
Basically, Bitcasa uses a hash function to identify identical files uploaded by different users to store them only once on their servers.
I wonder how the provider is able to ensure that no two different files get mapped to the same encrypted file or the same encrypted data stream, since hash functions aren't bijective.
Technical question: What do I have to implement, so that such a collision may never happen.
Most deduplication schemes make the assumption that hash collisions are so unlikely to happen that they can be ignored. This allows clients to skip reuploading already-present data. It does break down when you have two files with the same hash, but that's unlikely to happen by chance (and you did pick a secure hash function to prevent people from doing it intentionally, right?)
If you insist on being absolutely sure, all clients must reupload their data (even if it's already on the server), and once this data is reuploaded, you must check that it's identical to the currently-present data. If it's not, you need to pick a new ID rather than using the hash (and sound the alarm that a collision has been found in SHA1!)
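A minimal sketch of that hash-then-compare deduplication logic in Java, with an in-memory map standing in for the provider's blob store and the paranoid byte-for-byte check included:

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.util.HashMap;
import java.util.Map;

public class DedupSketch {
    // Content hash -> stored bytes; stands in for the provider's blob store.
    static final Map<String, byte[]> STORE = new HashMap<>();

    static String sha256Hex(byte[] data) throws Exception {
        byte[] d = MessageDigest.getInstance("SHA-256").digest(data);
        StringBuilder sb = new StringBuilder();
        for (byte b : d) sb.append(String.format("%02x", b & 0xff));
        return sb.toString();
    }

    static void upload(Path file) throws Exception {
        byte[] data = Files.readAllBytes(file);
        String id = sha256Hex(data);
        byte[] existing = STORE.get(id);
        if (existing == null) {
            STORE.put(id, data);                          // first copy: actually store it
        } else if (!MessageDigest.isEqual(existing, data)) {
            throw new IllegalStateException("hash collision on " + id); // paranoid byte-for-byte check
        }                                                 // else: duplicate, nothing new to store
    }

    public static void main(String[] args) throws Exception {
        Path f = Files.createTempFile("dedup", ".bin");
        Files.write(f, "same content".getBytes());
        upload(f);
        upload(f);                                        // second upload adds nothing
        System.out.println("stored blobs: " + STORE.size());
    }
}
```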

How to implement password protection for individual files?

I'm writing a little desktop app that should be able to encrypt a data file and protect it with a password (i.e. one must enter the correct password to decrypt). I want the encrypted data file to be self-contained and portable, so the authentication has to be embedded in the file (or so I assume).
I have a strategy that appears workable and seems logical based on what I know (which is probably just enough to be dangerous), but I have no idea if it's actually a good design or not. So tell me: is this crazy? Is there a better/best way to do it?
Step 1: User enters plain-text password, e.g. "MyDifficultPassword"
Step 2: App hashes the user-password and uses that value as the symmetric key to encrypt/decrypt the data file. e.g. "MyDifficultPassword" --> "HashedUserPwdAndKey".
Step 3: App hashes the hashed value from step 2 and saves the new value in the data file header (i.e. the unencrypted part of the data file) and uses that value to validate the user's password. e.g. "HashedUserPwdAndKey" --> "HashedValueForAuthentication"
Basically I'm extrapolating from the common way to implement web-site passwords (when you're not using OpenID, that is), which is to store the (salted) hash of the user's password in your DB and never save the actual password. But since I use the hashed user password for the symmetric encryption key, I can't use the same value for authentication. So I hash it again, basically treating it just like another password, and save the doubly-hashed value in the data file. That way, I can take the file to another PC and decrypt it by simply entering my password.
So is this design reasonably secure, or hopelessly naive, or somewhere in between? Thanks!
EDIT: clarification and follow-up question re: Salt.
I thought the salt had to be kept secret to be useful, but your answers and links imply this is not the case. For example, this spec linked by erickson (below) says:
Thus, password-based key derivation as defined here is a function of a password, a salt, and an iteration count, where the latter two quantities need not be kept secret.
Does this mean that I could store the salt value in the same place/file as the hashed key and still be more secure than if I used no salt at all when hashing? How does that work?
A little more context: the encrypted file isn't meant to be shared with or decrypted by others, it's really single-user data. But I'd like to deploy it in a shared environment on computers I don't fully control (e.g. at work) and be able to migrate/move the data by simply copying the file (so I can use it at home, on different workstations, etc.).
Key Generation
I would recommend using a recognized algorithm such as PBKDF2 defined in PKCS #5 version 2.0 to generate a key from your password. It's similar to the algorithm you outline, but is capable of generating longer symmetric keys for use with AES. You should be able to find an open-source library that implements PBE key generators for different algorithms.
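For instance, with the standard JCA provider, deriving an AES key from the password with PBKDF2 looks roughly like this (the iteration count and key length are illustrative choices, not recommendations from the spec):

```java
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.SecretKeySpec;
import java.security.SecureRandom;

public class Pbkdf2Sketch {
    public static void main(String[] args) throws Exception {
        char[] password = "MyDifficultPassword".toCharArray();

        byte[] salt = new byte[16];       // stored in the clear in the file header; need not be secret
        new SecureRandom().nextBytes(salt);

        int iterations = 100_000;         // illustrative; tune to your hardware
        PBEKeySpec spec = new PBEKeySpec(password, salt, iterations, 256);
        byte[] keyBytes = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                                          .generateSecret(spec).getEncoded();

        SecretKeySpec aesKey = new SecretKeySpec(keyBytes, "AES"); // use with AES for the file contents
        System.out.println("derived a " + keyBytes.length * 8 + "-bit key");
    }
}
```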
File Format
You might also consider using the Cryptographic Message Syntax as a format for your file. This will require some study on your part, but again there are existing libraries to use, and it opens up the possibility of inter-operating more smoothly with other software, like S/MIME-enabled mail clients.
Password Validation
Regarding your desire to store a hash of the password: if you use PBKDF2 to generate the encryption key, you could use a standard password hashing algorithm (big salt, a thousand rounds of hashing) to produce the stored validation value, so that the two values are different.
Alternatively, you could compute a MAC on the content. A hash collision on a password is more likely to be useful to an attacker; a hash collision on the content is likely to be worthless. But it would serve to let a legitimate recipient know that the wrong password was used for decryption.
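A small sketch of that MAC-on-content idea in Java, assuming you derive a separate MAC key from the password (the key material shown here is only a placeholder):

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.security.MessageDigest;

public class MacCheckSketch {
    // HMAC over the content, keyed with material derived from the password (separately from the encryption key).
    static byte[] hmac(byte[] macKey, byte[] content) throws Exception {
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(macKey, "HmacSHA256"));
        return mac.doFinal(content);
    }

    // Constant-time comparison of the stored tag against a freshly computed one.
    static boolean verify(byte[] macKey, byte[] content, byte[] storedTag) throws Exception {
        return MessageDigest.isEqual(hmac(macKey, content), storedTag);
    }

    public static void main(String[] args) throws Exception {
        byte[] macKey = "placeholder-mac-key-derive-me-properly".getBytes(); // hypothetical key material
        byte[] content = "file contents".getBytes();
        byte[] tag = hmac(macKey, content);
        System.out.println("tag matches: " + verify(macKey, content, tag));
    }
}
```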
Cryptographic Salt
Salt helps to thwart pre-computed dictionary attacks.
Suppose an attacker has a list of likely passwords. He can hash each and compare it to the hash of his victim's password, and see if it matches. If the list is large, this could take a long time. He doesn't want to spend that much time on his next target, so he records the result in a "dictionary" where a hash points to its corresponding input. If the list of passwords is very, very long, he can use techniques like a Rainbow Table to save some space.
However, suppose his next target salted their password. Even if the attacker knows what the salt is, his precomputed table is worthless—the salt changes the hash resulting from each password. He has to re-hash all of the passwords in his list, affixing the target's salt to the input. Every different salt requires a different dictionary, and if enough salts are used, the attacker won't have room to store dictionaries for them all. Trading space to save time is no longer an option; the attacker must fall back to hashing each password in his list for each target he wants to attack.
So, it's not necessary to keep the salt secret. Ensuring that the attacker doesn't have a pre-computed dictionary corresponding to that particular salt is sufficient.
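A toy demonstration of the point: the same password hashed under two publicly known salts yields unrelated digests, so a dictionary precomputed for one salt is useless against the other (a real system would use PBKDF2 or similar rather than a single SHA-256):

```java
import java.security.MessageDigest;
import java.util.Base64;

public class SaltDemo {
    static String saltedHash(String salt, String password) throws Exception {
        byte[] digest = MessageDigest.getInstance("SHA-256").digest((salt + password).getBytes());
        return Base64.getEncoder().encodeToString(digest);
    }

    public static void main(String[] args) throws Exception {
        // Same password, two publicly known salts: completely different outputs,
        // so one precomputed dictionary cannot cover both.
        System.out.println(saltedHash("salt-A", "MyDifficultPassword"));
        System.out.println(saltedHash("salt-B", "MyDifficultPassword"));
    }
}
```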
As Niyaz said, the approach sounds reasonable if you use a quality implementation of strong algorithms, like SHA-256 and AES for hashing and encryption. Additionally, I would recommend using a salt to reduce the possibility of creating a dictionary of all password hashes.
Of course, reading Bruce Schneier's Applied Cryptography is never wrong either.
If you are using a strong hash algorithm (SHA-2) and a strong Encryption algorithm (AES), you will do fine with this approach.
Why not use a compression library that supports password-protected files? I've used a password-protected zip file containing XML content in the past :}
Is there really a need to save the hashed password in the file? Can't you just use the password (or hashed password) with some salt and then encrypt the file with it? When decrypting, just try to decrypt the file with the password + salt. If the user gives the wrong password, the decrypted file won't be correct.
The only drawbacks I can think of are that if the user accidentally enters the wrong password and the decryption is slow, he has to wait to try again. And of course, if the password is forgotten, there's no way to decrypt the file.
