What are the benefits of HMAC over symmetric cryptography? - encryption

Somehow I don't get HMACs.
I once asked "Why do I need HMACs when we do have public key signatures?", and I think I got this one. Easier to compute, and so on ...
But, what I do not get is why we need HMACs at all, respectively what kind of problem they are solving.
From my understanding, HMACs ...
provide a way to make sure the message has not been tampered,
are "secured" by a secret, but symmetric key.
Hence for calculating the HMAC (either initially or for verification) I do need to know the secret key.
Now, if I can exchange this key in a secret way without it being tampered with, I could also exchange the message in the very same secret way without it being tampered with, couldn't I?
Okay, now you could argue that you only need to exchange the key once, but you may have multiple messages. That's fine.
But if we now have a secret key that must be kept secret by all parties, we could also directly use symmetric encryption using the very same secret key to encrypt the message, couldn't we?
Of course, an HMAC shall provide a solution against tampering, but if I only have an encrypted message without the secret key and a reasonable encryption algorithm, I can not change that encrypted message in a way that a) decryption still works, and b) a meaningful decrypted message appears.
So what do I need an HMAC actually for?
Or - where is the point that I am missing?

You're assuming that it is impossible to tamper with an encrypted message without knowing the key used for encryption. This is not the case and a dangerous assumption to make. There are several things possible even if you only have access to the ciphertext:
Corruption of a suffix of the message: this can leak information about the content through error messages, timing and possibly other ways.
Corruption of ranges of the message for some modes (ECB, CFB and possibly others): same as above but the attacker has more ways to trigger the wanted behaviour.
Flipping of arbitrary bits in a single block (not knowing their initial value though) and corruption of the following block (CFB): If some bits are known to the attacker he can set them to the value he wants.
Flipping of arbitrary bits in the whole message for stream ciphers or stream cipher equivalent modes for block ciphers: This can avoid corruption altogether.
Thus it is very important to verify that no attacker tampered with the message before processing even a single byte of the decrypted content. Since there are again some pitfalls in doing this with ad-hoc verification or simple hashing, there is a need for MACs, of which HMAC is one example.
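To make the stream-cipher case concrete, here is a minimal Python sketch. The `keystream` function is a toy stand-in built from SHA-256 (an assumption for illustration only, not a real cipher), and the message format and key names are made up. It shows an attacker who knows the plaintext format flipping bits without the key, and an HMAC over the ciphertext catching it.

```python
import hashlib
import hmac
import secrets


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream (SHA-256 over key, nonce, counter). Illustration only."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt(key: bytes, nonce: bytes, data: bytes) -> bytes:
    # Stream encryption and decryption are the same XOR operation.
    return xor(data, keystream(key, nonce, len(data)))


def make_tag(mac_key: bytes, nonce: bytes, ciphertext: bytes) -> bytes:
    return hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()


def tag_is_valid(mac_key: bytes, nonce: bytes, ciphertext: bytes, tag: bytes) -> bool:
    # compare_digest avoids leaking the position of the first mismatch.
    return hmac.compare_digest(make_tag(mac_key, nonce, ciphertext), tag)


enc_key = secrets.token_bytes(32)
mac_key = secrets.token_bytes(32)  # separate key for the MAC
nonce = secrets.token_bytes(16)

ciphertext = encrypt(enc_key, nonce, b"PAY 0100 EUR TO BOB")
tag = make_tag(mac_key, nonce, ciphertext)

# The attacker has no key, only a guess at the plaintext layout.
delta = xor(b"PAY 0100 EUR TO BOB", b"PAY 9100 EUR TO BOB")
forged = xor(ciphertext, delta)

decrypted_forgery = encrypt(enc_key, nonce, forged)        # decrypts cleanly
forgery_accepted = tag_is_valid(mac_key, nonce, forged, tag)
original_accepted = tag_is_valid(mac_key, nonce, ciphertext, tag)
```

Without the tag check the receiver would happily process the forged amount; with it, the forged ciphertext is rejected before any decryption happens.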

Related

How to encrypt and use a symmetric key with an asymmetric key pair

I have messages I need to be able to encrypt when being sent. They should only be able to be decrypted by the receiver.
Initially, I had a structure where the message is encrypted using the receiver's public key, and the receiver then uses their private key to decrypt their messages. However, since I was using RSA, the size of the message was quite limited.
I'm imagining two potential solutions, but am not quite sure how to implement the better one (option 2).
1. (Easy) just split up each message into many smaller parts, encrypt and store those. This would only change the query structure of my app but not the encryption structure.
2. I could encrypt the messages with symmetric keys, which is faster and works on any size. However, I would then need to encrypt that symmetric key with an asymmetric one. The problem then becomes that I can only decrypt the symmetric key when the asymmetric private one is provided, i.e. when the receiver wants to read their messages. So in that case, how would I actually encrypt the messages? Since I don't want the sender to be able to access a key used for decryption as well.
The problem then becomes that I can only decrypt the symmetric key when the asymmetric private one is provided, ie when the receiver wants to read their messages. So in that case, how would I actually encrypt the messages?
That's simple, you use an ephemeral, message specific, fully random symmetric key for data encryption before you encrypt it with the public key. Preferably you should explicitly destroy the symmetric key after that. You can prefix the wrapped (encrypted) symmetric key before the ciphertext of the message, as it will always have the same size in bytes as the modulus (i.e. the RSA key size in bytes).
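A rough sketch of just the framing part, in Python. It assumes a 2048-bit RSA key (so a 256-byte wrapped key); `pack`, `unpack` and `new_message_key` are hypothetical helper names, and the actual RSA wrapping and AES encryption are left to a real crypto library, so random bytes stand in for their outputs here.

```python
import secrets
from typing import Tuple

RSA_MODULUS_BYTES = 256  # assumption: 2048-bit RSA key


def new_message_key() -> bytes:
    """Fresh, fully random symmetric key for exactly one message."""
    return secrets.token_bytes(32)


def pack(wrapped_key: bytes, ciphertext: bytes) -> bytes:
    """Prefix the RSA-wrapped key; its length always equals the modulus size."""
    if len(wrapped_key) != RSA_MODULUS_BYTES:
        raise ValueError("wrapped key must be exactly one modulus in length")
    return wrapped_key + ciphertext


def unpack(blob: bytes) -> Tuple[bytes, bytes]:
    """Split a received blob back into (wrapped_key, ciphertext)."""
    if len(blob) < RSA_MODULUS_BYTES:
        raise ValueError("blob too short to contain a wrapped key")
    return blob[:RSA_MODULUS_BYTES], blob[RSA_MODULUS_BYTES:]


# Placeholders standing in for real library output (NOT actual encryption):
wrapped = secrets.token_bytes(RSA_MODULUS_BYTES)  # would be RSA-OAEP(message key)
body = secrets.token_bytes(1000)                  # would be the AES ciphertext

blob = pack(wrapped, body)
wrapped_again, body_again = unpack(blob)
```

Because the prefix length is fixed by the key size, no length field or delimiter is needed to find where the message ciphertext starts.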
The system you are thinking about, which is much better than splitting up messages for RSA, is called a hybrid cryptosystem. There are various other ways to accomplish the same thing such as RSA-KEM and - for elliptic curves - ECIES. Neither is often present in crypto libraries though.
If you decide to use RSA/AES for sending cryptograms then I would advise you to use OAEP and e.g. AES-CTR rather than AES-CBC, as RSA PKCS#1 v1.5 padding and CBC padding are both vulnerable to padding oracle attacks.
It is highly recommended to sign the messages, otherwise an adversary can encrypt fake messages. Encryption is only used to achieve message confidentiality, not message integrity & authenticity. An adversary may even try plaintext oracle attacks if any message can be sent. If you are not allowing a set of private keys that you control then you should sign-then-encrypt, not encrypt-then-sign.
And as always, prefer TLS or other explicit secure transport protocols if that's an option for transport security.

Asymmetrical encryption for more than two recipients?

I want to create an application, where multiple people should be able to communicate with each other securely (think of a decentral group chat) - sounds easy, but here is my problem:
As far as I understood, with asymmetrical encryption you have a public key and a private key. Everyone who wants to send a message to someone has to encrypt it with the public key and the recipient can decrypt it with the private key.
But if there are more than two people that should be able to read all messages, I don't know how this should work...
Either everyone has the public and the private key - which I think is a bad idea - or everyone has to have everybody's public key and has to send a separate message to each recipient.
Also, I want to make 100% sure that the one who sends a message really is who he pretends to be. (so nobody is able to "fake" messages)
Is there an encryption algorithm that solves my problem?
Controlling the extent of the recipient group
In a comment to Richard Schwartz's good answer, you ask
Is it possible with this algorithm to ensure that only one is able to invite others? As far as I understood, everybody could distribute the decrypted session key.
When applying the protocol in a group chat scenario, don't let the term "session key" mislead you. Rather, think of the key for symmetric encryption as a "message key": Each time someone sends a message to the group, they should generate a new random symmetric key, encrypt it with every legit receiver's public key separately and prepend all these cryptograms to the symmetrically encrypted message. This way, each sender decides independently whom they consider a part of the legit recipients group of their own sent messages.
This will give the protocol some more transmission overhead, but this probably won't matter in practice. What could matter is the 'cost' of getting larger amounts of 'good' randomness (entropy) to generate sufficiently unpredictable message keys. So an acceptable optimization might be that, if the group of legit recipients has remained the same, a sender might re-use the session key of their own previously sent message. They should never, though, re-purpose a session key received from another group member for sending messages of their own.
Of course, even if each sender decides independently whom they consider a legit recipient of their message, you can't keep any legit recipient from compromising messages they received: They can simply forward the messages unencrypted (or encrypted for someone not in the original recipient group) to whomever they want.
Ensuring authenticity
In an edit to your original question, you added
Also, I want to make 100% sure that the one who sends a message really is who he pretends to be. (so nobody is able to "fake" messages)
Encryption can't do that, but cryptography has another way to make sure that
the message actually comes from whom claimed to have sent it
the message hasn't been tampered with since
And the way of ensuring these things is signatures, which also are something that public-private-key cryptography enables. Let senders sign their messages with their private key. (Which usually means 'encrypting' a cryptographically secure hash of the message with the private key.) And let receivers verify the signatures (by 'decrypting' the signature with the sender's public key and comparing the result with a hash of the message they computed themselves.)
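Purely as a learning aid, the hash-then-'encrypt' idea can be sketched with textbook RSA in a few lines of Python. Everything here is an assumption made for illustration: the two Mersenne primes, the tiny key size, the lack of any padding scheme, and the function names. Real signatures use a vetted library with a proper scheme such as RSA-PSS or Ed25519; this toy is insecure.

```python
import hashlib

# Toy key material: two known Mersenne primes. Far too small and too
# structured for real use; chosen only so the example is self-contained.
P = 2**127 - 1
Q = 2**89 - 1
N = P * Q                               # public modulus
E = 65537                               # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))       # private exponent (Python 3.8+)


def digest_as_int(message: bytes) -> int:
    """SHA-256 of the message, reduced so it fits below the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """'Encrypt' the hash with the private exponent."""
    return pow(digest_as_int(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """'Decrypt' the signature with the public exponent and compare hashes."""
    return pow(signature, E, N) == digest_as_int(message)


signature = sign(b"meet at noon")
genuine_ok = verify(b"meet at noon", signature)
tampered_ok = verify(b"meet at nine", signature)
```

Only the holder of `D` can produce a value that verifies under `E`, which is what gives receivers both origin and integrity in one check.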
Don't roll your own anything (except when you should)
Richard's answer advises you not to roll your own (pseudo) random number generator. For anything you plan to use in production, I'd extend this to anything encryption:
Don't invent your own protocols
Don't invent your own ciphers, signatures or hash functions
Don't invent your own way of gathering entropy
Don't roll your own implementations of any of the above, even if invented by others
Instead, use well-established cryptography libraries. These are written and reviewed by experts in both cryptographic theory and in the practices of writing secure software. And while even these libraries are often enough found to have (sometimes embarrassing) security issues, nothing you'll come up with yourself will be nearly as secure as them.
Though, for learning, implementing any or all of the listed stuff (including pseudo random number generators) is great exercise and helps you understand at least some aspects of the underlying cryptography. And this understanding is important, as it's often difficult enough to correctly and securely use the well-established libraries, even when you do have some knowledge of the concepts they reveal through their interfaces.
And of course for innovating within cryptography, inventing new stuff (and getting it rigorously reviewed by the community of experts in the field) is necessary, too. That new stuff just shouldn't be used for anything serious before it has passed that review successfully.
I assume you mean asymmetric encryption, not asynchronous encryption.
In most cases, we don't actually use an asymmetric cipher to encrypt the content of messages. That's because messages can be large, and asymmetric ciphers are slow in comparison to symmetric ciphers. It's also because of the issue you are contending with here: in a multi-party communication, you'd like to be able to just send the message once and have everybody be able to read it. So the trick is that we combine asymmetric and symmetric techniques into a protocol that solves the problem.
First, we generate a random symmetric key which we can call the "session key". We're going to distribute this session key to all recipients, but we need to do this securely. Here's where we're actually going to use asymmetric encryption. We encrypt the session key once for each recipient using each of their public keys and an asymmetric cipher (such as RSA), and we send the encrypted session key to each recipient. We can send it to each recipient separately, or we can just build a structure that looks like this:
"recip1|recip1EncryptedSessionKey|recip2|recip2EncryptedSessionKey..."
and send the whole thing out to all recipients, each of whom will be able to parse it and decrypt their own encrypted copy of the session key. (This is generally how it's done in encrypted email: the list of all encrypted versions of the session key for all recipients is enclosed with the message, and everyone gets the exact same email.)
Once we've securely distributed the session key to all recipients, we can use the session key to encrypt each message just once with a symmetric cipher (such as AES), and send the same encrypted message to all recipients. Since they all have received a copy of the session key, they can all read it and act on it.
Note that as in all things having to do with encryption, it is crucial that the session key is really random. Don't just rely on a plain vanilla random number generator for it, and for heaven's sake don't roll your own. Make sure that you use a cryptographically secure pseudorandom number generator.
A real chat system would likely be quite a bit more complicated, probably with a mechanism for re-establishing a new session key periodically, and the details of a secure protocol can be quite intricate. For example, consider how you would protect against a bad guy stepping in and fooling everyone into using a session key of his choosing! But the basics are as above.
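In Python, for instance, that means reaching for the `secrets` module rather than `random` (whose Mersenne Twister generator is documented as unsuitable for security purposes). A minimal sketch, where `new_session_key` is just an illustrative helper name and 32 bytes assumes an AES-256 key:

```python
import secrets


def new_session_key(num_bytes: int = 32) -> bytes:
    """Draw a key from the operating system's CSPRNG."""
    return secrets.token_bytes(num_bytes)


session_key = new_session_key()
another_key = new_session_key()
```

Other languages have equivalents backed by the OS random source; the point is to use whichever one is documented as cryptographically secure.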

AES/Rijndael: search on encrypted data - static salt and IV

I want to do searching on encrypted data. Which means that there is the need to have the same ciphertext every time I encrypt the same plaintext. I.e. think of a list of encrypted names and I want to find all "Kevin"'s in it. I would now encrypt "Kevin" and search the database for the encrypted text. All hits will be "Kevin"'s — but still only the one who has the password knows.
Now my question: What about security if I use the same salt and IV (to get the effect described above) all the time? Is the encryption still secure? Or is there any other method to do searching on encrypted data?
If you want to do a deterministic encryption then you should use an encryption mode that has been designed for deterministic encryption (and not modify an encryption mode designed for something else). One possibility is the SIV encryption mode described in RFC 5297.
(Of course, deterministic encryption has its drawbacks, but discussing this is not part of this question.)

AES encryption and the need for Integrity

I did some research on the topic but could not find anything similar to my question. So I hope some of you great guys may help me out.
I want to use AES128 encryption (CFB-Mode) for the networking in my application between two individual clients. The data being exchanged consists only of textual strings of a specific structure, for example, the first bytes always tell the recipient the kind of message they are receiving, so they can process them. With AES I want to ensure the confidentiality of the message, but now the question of "integrity" arises.
Normally you would consider using a MAC. But isn't it guaranteed that nobody has altered the message, if the recipient is able to decrypt it correctly, i.e. that the message can be used correctly in his application because of the string's format? Wouldn't altering (even 1 bit) the encrypted message by a third party result in garbage during decryption?
Furthermore let's assume that the application is a multi-party peer-to-peer game, where two of the players are communicating with each other on a private but AES-encrypted channel. Now the originator of the message is not playing fair and intentionally sends a fraudulent encrypted message to convey an impression that the message has been altered by a random third party (to force a player to quit). Now the recipient would have no chance to determine if the message has been altered or if the sender is acting fraudulently, am I right? So integrity would not be of much use in such a situation and could be neglected?
This may sound like an odd and out-of-this-world example. But it's something I recently encountered in a similar application, and I am asking myself whether there is a solution to the problem or whether I have misunderstood the basic idea of AES encryption.
As you said, you may detect changes in the format of the plain text message after decryption. But at what level would it go wrong? Do you have something that is large and redundant enough to be tested? What are you going to do if the altered plain text results in some obscure exception somewhere down the line? With CFB (like most modes) an attacker can make sure that only the last part of the message is altered, for instance, and leave the first blocks intact.
And you are worried about cheats as well.
In my opinion, you are much better off using a MAC or HMAC algorithm, or a cipher mode that provides integrity/authentication on top of confidentiality (EAX or GCM for instance). If you are sure nobody else has the symmetric key, an authentication check (such as a MAC) will prove that the data has been signed by the correct key. So no, the user cannot claim that the data has been changed in transport if the authenticity checks succeed.
The next question becomes: can you trust that the symmetric key is only in possession of the other player? For this you might want to use some sort of PKI scheme (using asymmetric keys) together with a key exchange mechanism such as DH. But that is for later, if you decide to go that way.
This is a bit out of my depth, but...
Yes, modifying the encrypted bytes of an AES encrypted message should cause the decryption to fail (this has been my experience with the C# implementation). The client who decrypts will know the message is invalid. EDIT: apparently this is not the case. Looks like you'd need a CRC or hash to verify the message was successfully decrypted. The more serious problem is if the secret AES key is leaked (and in a peer-to-peer environment, the key has to be sent so the receiver can decrypt the message at all). Then a 3rd party can send messages as if they were a legitimate client, and they will be accepted as OK.
Integrity is much harder. I'm not entirely sure how robust you want things to be, but I suspect you want to use public key encryption. This allows you to include a hash of the message (like a signature or MAC) based on the private key to assert the message validity. The receiver uses the public key to verify the hash and thus the original message is OK. The main advantage of public key encryption over symmetric encryption like AES is you don't have to send the private key, only the public key. This makes it much harder to impersonate a client. SSL/TLS uses public key encryption.
In any case, once you have identified a client sending invalid messages, you're in the world of deciding to trust that client or not. That is, is the corruption due to malicious behaviour (what you're worried about)? Or a faulty client implementation (incompetence)? Or a faulty communications link? And this is where encryption (or at least my knowledge of it) won't help you any more!
Additional regarding integrity:
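If you assume no one else has access to your secret key, an HMAC suffices to ensure you detect changes. A CRC or plain hash appended before encryption is weaker: with CFB or a stream cipher, an attacker who knows or can guess part of the plaintext can flip bits and adjust the checksum to match. Simply take the body of your message, calculate the HMAC and append it as a footer. If it doesn't match when you verify, the message has been altered.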
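One caveat on the CRC option is worth seeing in code. CRC-32 is affine over XOR, so when the cipher is XOR-based an attacker can patch the encrypted checksum to match a modified body without knowing the key. The sketch below is an illustration under stated assumptions: `keystream` is a toy SHA-256 construction standing in for a stream cipher (not AES-CFB itself), and `seal`/`open_sealed` and the message text are made up.

```python
import hashlib
import secrets
import zlib


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream (SHA-256 over key, nonce, counter). Illustration only."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def seal(key: bytes, nonce: bytes, body: bytes) -> bytes:
    """Append CRC-32 as a footer, then encrypt body and footer together."""
    framed = body + zlib.crc32(body).to_bytes(4, "big")
    return xor(framed, keystream(key, nonce, len(framed)))


def open_sealed(key: bytes, nonce: bytes, ciphertext: bytes) -> bytes:
    """Decrypt and check the CRC footer; raise if it does not match."""
    framed = xor(ciphertext, keystream(key, nonce, len(ciphertext)))
    body, footer = framed[:-4], framed[-4:]
    if zlib.crc32(body).to_bytes(4, "big") != footer:
        raise ValueError("CRC mismatch")
    return body


key = secrets.token_bytes(32)
nonce = secrets.token_bytes(16)
ciphertext = seal(key, nonce, b"MOVE north")

# Attacker: no key, only a guess at the plaintext. For equal-length inputs
# crc(m ^ d) == crc(m) ^ crc(d) ^ crc(zeros), so the footer can be fixed up.
delta = xor(b"MOVE north", b"MOVE south")
crc_delta = zlib.crc32(delta) ^ zlib.crc32(bytes(len(delta)))
forged = xor(ciphertext, delta + crc_delta.to_bytes(4, "big"))

accepted_body = open_sealed(key, nonce, forged)  # passes the CRC check
```

An HMAC does not have this linear structure, which is why it is the safer footer even when the key is assumed to be secret.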
The assumption that the secret key remains secret is quite reasonable. Especially if after some number of messages you generate new ones. SSH and WiFi's WPA both generate new keys periodically.
If you can't assume the secret key is secret, then you need to go to PKI to sign the message. With the AES key in the hands of a malicious 3rd party, they'll just generate whatever messages they want with the key.
There may be some mileage in including a sequence number in your message based on a RNG. If you use the same RNG and same seed for both parties, they should be able to predict what sequence number comes next. A 3rd party would need to intercept the original seed, and know how many messages have been sent to send valid but forged messages. (This assumes no messages can ever be lost or dropped.)

Is it insecure to pass initialization vector and salt along with ciphertext?

I'm new to implementing encryption and am still learning basics, it seems.
I have need for symmetric encryption capabilities in my open source codebase. There are three components to this system:
A server that stores some user data, and information about whether or not it is encrypted, and how
A C# client that lets a user encrypt their data with a simple password when sending to the server, and decrypt with the same password when receiving
A JavaScript client that does the same and therefore must be compatible with the C# client's encryption method
Looking at various JavaScript libraries, I came across SJCL, which has a lovely demo page here: http://bitwiseshiftleft.github.com/sjcl/demo/
From this, it seems that what a client needs to know (besides the password used) in order to decrypt the ciphertext is:
The initialization vector
Any salt used on the password
The key size
Authentication strength (I'm not totally sure what this is)
Is it relatively safe to keep all of this data with the ciphertext? Keep in mind that this is an open source codebase, and there is no way I can reasonably hide these variables unless I ask the user to remember them (yeah, right).
Any advice appreciated.
Initialization vectors and salts are called such, and not keys, precisely because they need not be kept secret. It is safe, and customary, to encode such data along with the encrypted/hashed element.
What an IV or salt needs is to be used only once with a given key or password. For some algorithms (e.g. CBC encryption) there may be some additional requirements, fulfilled by choosing the IV randomly, with uniform probability and a cryptographically strong random number generator. However, confidentiality is not a needed property for an IV or salt.
Symmetric encryption is rarely enough to provide security; by itself, encryption protects against passive attacks, where the attacker observes but does not interfere. To protect against active attacks, you also need some kind of authentication. SJCL uses CCM or OCB2 encryption modes which combine encryption and authentication, so that's fine. The "authentication strength" is the length (in bits) of a field dedicated to authentication within the encrypted text; a strength of "64 bits" means that an attacker trying to alter a message has a maximum probability of 2^-64 to succeed in doing so without being detected by the authentication mechanism (and he cannot know whether he has succeeded without trying, i.e. having the altered message sent to someone who knows the key/password). That's enough for most purposes. A larger authentication strength implies a larger ciphertext, by (roughly) the same amount.
I have not looked at the implementation, but from the documentation it seems that the SJCL authors know their trade, and did things properly. I recommend using it.
Remember the usual caveats of passwords and Javascript:
Javascript is code which runs on the client side but is downloaded from the server. This requires that the download be integrity-protected in some way; otherwise, an attacker could inject some of his own code, for instance a simple patch which also logs a copy of the password entered by the user somewhere. In practice, this means that the SJCL code should be served across a SSL/TLS session (i.e. HTTPS).
Users are human beings and human beings are bad at choosing passwords. It is a limitation of the human brain. Moreover, computers keep getting more and more powerful while human brains stay more or less unchanged. This makes passwords increasingly weak towards dictionary attacks, i.e. exhaustive searches on passwords (the attacker tries to guess the user's password by trying "probable" passwords). A ciphertext produced by SJCL can be used in an offline dictionary attack: the attacker can "try" passwords on his own computers, without having to check them against your server, and he is limited only by his own computing abilities. SJCL includes some features to make offline dictionary attacks more difficult:
SJCL uses a salt, which prevents cost sharing (usually known as "precomputed tables", in particular "rainbow tables" which are a special kind of precomputed tables). At least the attacker will have to pay the full price of dictionary search for each attacked password.
SJCL uses the salt repeatedly, by hashing it with the password over and over in order to produce the key. This is what SJCL calls the "password strengthening factor". This makes the password-to-key transformation more expensive for the client, but also for the attacker, which is the point. Making the key transformation 1000 times longer means that the user will have to wait, maybe, half a second; but it also multiplies by 1000 the cost for the attacker.
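For a feel of how salt and strengthening factor fit together, and why they can sit in the clear next to the ciphertext, here is a small Python sketch. It assumes PBKDF2-HMAC-SHA256 as the stretching function (the text above does not name one), and the record layout and field names are illustrative, not SJCL's format.

```python
import base64
import hashlib
import json
import secrets


def derive_key(password: str, salt: bytes, iterations: int = 1000,
               key_bytes: int = 16) -> bytes:
    """Stretch a password into a key; cost grows linearly with iterations."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt,
                               iterations, dklen=key_bytes)


# Sender side: pick a fresh salt and publish the parameters with the data.
salt = secrets.token_bytes(8)
params = json.dumps({
    "salt": base64.b64encode(salt).decode("ascii"),
    "iterations": 1000,
    "key_bits": 128,
})
sender_key = derive_key("correct horse", salt)

# Receiver side: read the public parameters, then re-derive from the password.
loaded = json.loads(params)
receiver_key = derive_key("correct horse",
                          base64.b64decode(loaded["salt"]),
                          loaded["iterations"],
                          loaded["key_bits"] // 8)

# A different salt gives an unrelated key, so precomputed tables do not carry over.
other_key = derive_key("correct horse", secrets.token_bytes(8))
```

Nothing in `params` helps an attacker skip the work: each guess still costs a full run of the iterated hash, with that particular salt.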

Resources