This seems like a really simple problem. I just can't seem to figure it out.
A message was encrypted using a block cipher that seems to follow the Electronic Codebook (ECB) method. I know that it was processed in blocks of 3 characters at a time. I know what the message says and I know what the ciphertext says, but I want to know the keys. The problem says that it was encrypted using the same method twice but with different keys. Is it possible to find the keys without brute forcing it?
If not, then how would I minimize the time needed to brute force the key?
BTW: The key is in hex and can be at most 6 characters long, so the biggest possible key in decimal would be 16777215 (0xFFFFFF).
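One standard way to cut the work for double encryption when plaintext and ciphertext are both known is a meet-in-the-middle attack: encrypt a known plaintext block under every possible first key, decrypt the matching ciphertext block under every possible second key, and look for middle values that agree. With 24-bit keys that is roughly 2 × 2^24 cipher operations (plus a table of 2^24 entries) instead of 2^48 key pairs. The sketch below is only an illustration: the actual cipher is not given in the question, so `toy_encrypt` and `toy_decrypt` are a made-up invertible 24-bit stand-in, and every name here is hypothetical.

```python
MASK = (1 << 24) - 1                # one block = 3 characters = 24 bits
MULT = 0x2F3A5D                     # arbitrary odd constant (invertible mod 2**24)
MULT_INV = pow(MULT, -1, 1 << 24)   # modular inverse, needs Python 3.8+


def toy_encrypt(key, block):
    """Hypothetical stand-in for the unknown cipher (NOT the real one)."""
    return (((block ^ key) * MULT) + key) & MASK


def toy_decrypt(key, block):
    """Inverse of toy_encrypt."""
    return ((((block - key) & MASK) * MULT_INV) & MASK) ^ key


def to_blocks(data):
    """Split bytes into 3-byte blocks, each held as a 24-bit integer."""
    return [int.from_bytes(data[i:i + 3], "big") for i in range(0, len(data), 3)]


def meet_in_the_middle(pairs, key_space):
    """Return every (k1, k2) consistent with all known (plain, cipher) blocks."""
    (p0, c0), rest = pairs[0], pairs[1:]

    # Forward half: middle value -> all first keys that produce it from p0.
    middle = {}
    for k1 in range(key_space):
        middle.setdefault(toy_encrypt(k1, p0), []).append(k1)

    # Backward half: undo the second encryption and look for a meeting point.
    candidates = []
    for k2 in range(key_space):
        for k1 in middle.get(toy_decrypt(k2, c0), []):
            # Use the remaining known blocks to discard accidental matches.
            if all(toy_encrypt(k2, toy_encrypt(k1, p)) == c for p, c in rest):
                candidates.append((k1, k2))
    return candidates


# Small demonstration with 12-bit keys so that it runs instantly.
plain = to_blocks(b"attack at dawn!")
k1, k2 = 0xABC, 0x123
cipher = [toy_encrypt(k2, toy_encrypt(k1, b)) for b in plain]
found = meet_in_the_middle(list(zip(plain, cipher)), key_space=1 << 12)
print((k1, k2) in found, len(found))
```

For the real problem, `key_space` would be `1 << 24` and the two toy functions would be replaced by the actual block encryption and decryption. A weak cipher may have several equivalent key pairs, which is why the function returns a list rather than a single answer.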
Related
My understanding is that for most encryption algorithms there is always an output, regardless of the key. A wrong key will of course produce a wrong output. So when using brute force to decrypt encrypted data, how do hackers know when the key was correct? Is there a way other than analyzing the output data?
If this is the only way, I have this thought. When encrypting text, wouldn't it be safer to encrypt on the word level using a dictionary rather than on the bit level as is done today? Then the output would always consist of words. Hackers would need to use complicated and slow algorithms to check the grammar of the output words to determine whether this could be the real written text.
To answer the first part, I will simply quote my old answer to the Super User question "How does Truecrypt know it has the correct password?"
It knows the correct password because within that encrypted container
there is a known header.
When TrueCrypt decrypts a blob of data and the header matches what it
was expecting, it reports back that the decryption was successful. If
you use an incorrect password it will still "decrypt" the text, but it
will decrypt the header into gibberish and fail the decryption check.
As the specification shows, there are many things that must be true
for it to be a valid header (bytes 64-67 after decryption should
always be the ASCII string TRUE, bytes 132-251 must all be 0's,
etc.). If you decrypt a blob of data and it does not match that
header format, you know the decryption failed.
So they already do what you were suggesting about "checking the grammar": they attempt to decrypt the message, and if the message "has proper grammar" (the data follows the spec of the encrypted file format), the message was successfully decrypted.
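The header check described above can be sketched as follows. This is a simplified illustration, not the real TrueCrypt format or key derivation: the header layout (a 4-byte marker followed by 12 zero bytes), the SHA-256-based toy cipher, and all function names are assumptions made up for the example.

```python
import hashlib

MAGIC = b"TRUE"        # known marker expected after a correct decryption
ZERO_PAD = bytes(12)   # a field that must decrypt to all zero bytes
HEADER_LEN = len(MAGIC) + len(ZERO_PAD)


def keystream(password, length):
    """Toy keystream from SHA-256(password || counter). Illustration only."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(password + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(out[:length])


def toy_crypt(password, data):
    """XOR with the keystream; the same call encrypts and decrypts."""
    return bytes(d ^ k for d, k in zip(data, keystream(password, len(data))))


def looks_valid(header):
    """The 'grammar check': does the decrypted header follow the known format?"""
    return header[:len(MAGIC)] == MAGIC and header[len(MAGIC):HEADER_LEN] == ZERO_PAD


def find_password(blob, candidates):
    """Try each candidate; only the header needs to be decrypted and checked."""
    for password in candidates:
        if looks_valid(toy_crypt(password, blob[:HEADER_LEN])):
            return password
    return None


container = toy_crypt(b"hunter2", MAGIC + ZERO_PAD + b"the actual secret data")
print(find_password(container, [b"123456", b"password", b"hunter2"]))
```

Every candidate produces some output, but a wrong password turns the header into bytes that fail the format check, so no human has to read the result.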
For your 2nd part of "using a dictionary" there are a few important issues.
First, this would only work on plain unformatted text; no binary data or text metadata allowed. Second, and more importantly, how do you "create" this dictionary? If you create the dictionary on the fly using the words in the document, tell me what the dictionary would be for the following message:
We attack tomorrow!
You could pad the dictionary with extra words, but how do you choose the padding? If you used an existing fixed dictionary, what if a word is not in the dictionary? What do you do then? What about misspellings?
I have not even begun to touch on how this method is very likely to leak information. Like you said, English has a set of rules for grammar; some words come more often near the end of sentences and some come more often near the start. By looking at the numbers used as the indexes, you could potentially do a statistical analysis and rule out a portion of the dictionary as "unlikely" to be used.
I am sure there are many other problems with this, but I am only a beginner in crypto and I cannot think of any others off the top of my head.
There is an adage in cryptography: "It is easy for you to create a cypher that you yourself cannot break; it is quite hard for you to make a cypher that other people cannot break."
Assuming I have some plaintext and the corresponding encrypted data, is it possible to find the key faster than brute force? If so, how do I do this?
To clarify: I have plaintext p and encrypted data d. They can be strings or a byte array, or whatever you prefer. I just want to know if it is possible to obtain the key from this data.
See Attacks on the RC4 stream cipher. RC4 provides effective security when used carefully and properly, but it's not that hard to make a mistake either.
I am looking for a way of obtaining the key from this set of information. I know for a fact that we are using 16-byte blocks with CBC, and I have the first 16 bytes of plaintext and ciphertext, along with the IV that was used.
At the moment I can test whether a key is correct by comparing the output, but I cannot brute force 16-character keys for obvious reasons. From reading other posts, my understanding was that with the data I have it might be possible to get the key.
Any hint?
What you are trying to do is called a "known plaintext attack": you have both the cyphertext and the plaintext, and all that you lack is the key used. Unfortunately, all modern cyphers are designed to resist such attacks. Unless you have extremely sophisticated mathematical skills, you will not be able to find the key this way. AES is resistant to a known plaintext attack.
You will have to try some other method of determining the key. Has the key owner left it written on a piece of paper somewhere?
Note that if AES has been applied as it should be, then you cannot find the key. However, judging by the number of incorrect implementations on Stack Overflow, the key may well be a password, or a simple SHA-256 of a string. If you can obtain information about how the key was generated, applied, or stored, you may be able to get around even AES-256.
Otherwise your only attack vector is breaking AES or brute forcing the key. In that case I wish you good luck, because brute forcing a 256-bit key is completely out of the question, even with a quantum computer. That is unless vulnerabilities are found, of course: AES is not provably secure, after all, so there may be a vulnerability.
I understand that the cyphertext from a properly used one time pad cypher reveals absolutely no data about the encrypted message.
Does this mean that there is no way to distinguish a message encrypted with a one time pad from completely random noise? Or is there some theoretical way to determine that there is a message, even though you can't learn anything about it?
There is no way to determine whether a string has been encrypted with an OTP. You can produce any string of the same size by choosing an appropriate key.
For example (from the Wikipedia One Time Pad article), the plaintext "HELLO" can be encrypted with the key "XMCKL", giving ciphertext "EQNVZ". But it is possible to find keys which produce any 5 character plaintext, such as "LATER". There is no way to determine the original plaintext without the original key.
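The example above can be checked with a few lines of code. This is a minimal sketch assuming uppercase A-Z text and letter-wise addition modulo 26, as in the Wikipedia example; the function names are made up for illustration.

```python
A = ord("A")


def otp_encrypt(plaintext, key):
    """Add key letters to plaintext letters modulo 26 (A=0 ... Z=25)."""
    return "".join(
        chr((ord(p) - A + ord(k) - A) % 26 + A) for p, k in zip(plaintext, key)
    )


def otp_decrypt(ciphertext, key):
    """Subtract key letters from ciphertext letters modulo 26."""
    return "".join(
        chr((ord(c) - ord(k)) % 26 + A) for c, k in zip(ciphertext, key)
    )


def key_for(ciphertext, wanted_plaintext):
    """Key under which `ciphertext` would decrypt to `wanted_plaintext`."""
    # key = ciphertext - plaintext (mod 26), the same arithmetic as decryption
    return otp_decrypt(ciphertext, wanted_plaintext)


ciphertext = otp_encrypt("HELLO", "XMCKL")
fake_key = key_for(ciphertext, "LATER")
print(ciphertext, fake_key, otp_decrypt(ciphertext, fake_key))
```

Since a key exists for every 5-letter plaintext, the ciphertext alone gives an attacker no reason to prefer "HELLO" over "LATER".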
An OTP can be 'broken' if it is reused (and therefore is no longer a one time pad). The Venona Project is an example of what can happen when OTPs are reused.
A major drawback to OTPs is that you must securely distribute a key equal in size to the plaintext to be encoded.
If your one-time pad is completely random, then anything XOR'd with it also is (assuming your message has no/low correlation with the contents of the one-time pad).
Is it recommended that I use an initialization vector to encrypt/decrypt my data? Will it make things more secure? Is it one of those things that need to be evaluated on a case by case basis?
To put this into actual context, the Win32 cryptography function CryptSetKeyParam allows an initialization vector to be set on a key prior to encrypting/decrypting. Other APIs also allow for this.
What is generally recommended and why?
An IV is essential when the same key might ever be used to encrypt more than one message.
The reason is that, under most encryption modes, two messages encrypted with the same key can be analyzed together. In a simple stream cipher, for instance, XORing two ciphertexts encrypted with the same key results in the XOR of the two messages, from which the plaintext can be easily extracted using traditional cryptanalysis techniques.
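The stream cipher case is easy to demonstrate. The sketch below stands in for a stream cipher with a random keystream that is (wrongly) used for two messages; the messages and names are invented for the example.

```python
import os


def xor_bytes(a, b):
    """XOR two byte strings position by position."""
    return bytes(x ^ y for x, y in zip(a, b))


keystream = os.urandom(32)      # what a stream cipher derives from key (and IV)
m1 = b"attack at dawn"
m2 = b"retreat at six"

c1 = xor_bytes(m1, keystream)   # same keystream used twice: no IV, same key
c2 = xor_bytes(m2, keystream)

# The keystream cancels out: the attacker gets m1 XOR m2 without the key.
leak = xor_bytes(c1, c2)
print(leak == xor_bytes(m1, m2))

# If one message is known or guessed, the other falls out immediately.
print(xor_bytes(leak, m1))
```

With a fresh IV per message the two keystreams would differ, and XORing the ciphertexts would no longer cancel anything.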
A weak IV is part of what made WEP breakable.
An IV basically mixes some unique, non-secret data into the key to prevent the same key ever being used twice.
In most cases you should use an IV. Since the IV is generated randomly each time, if you encrypt the same data twice, the encrypted messages are going to be different and it will be impossible for an observer to say whether these two messages are the same.
Take a good look at a picture (see below) of CBC mode. You'll quickly realize that an attacker knowing the IV is like the attacker knowing a previous block of ciphertext (and yes they already know plenty of that).
Here's what I say: most of the "problems" with IV=0 are general problems with block encryption modes when you don't ensure data integrity. You really must ensure integrity.
Here's what I do: use a strong checksum (cryptographic hash or HMAC) and prepend it to your plaintext before encrypting. There's your known first block of ciphertext: it's the IV of the same thing without the checksum, and you need the checksum for a million other reasons.
Finally: any analogy between CBC and stream ciphers is not terribly insightful IMHO.
Just look at the picture of CBC mode, I think you'll be pleasantly surprised.
Here's a picture:
http://en.wikipedia.org/wiki/Block_cipher_modes_of_operation
If the same key is used multiple times for multiple different secrets, patterns could emerge in the encrypted results. The IV, which should be pseudo random and used only once with each key, is there to obfuscate the result. You should never use the same IV with the same key twice; that would defeat the purpose of it.
To avoid having to keep track of the IV, the simplest thing is to prepend or append it to the resulting encrypted secret. That way you don't have to think much about it. You will then always know that the first or last N bits are the IV.
When decrypting the secret you just split out the IV, and then use it together with the key to decrypt the secret.
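The prepend-and-split approach can be sketched as follows. The cipher here is a toy keystream built from SHA-256 and is only a stand-in for illustration (it is not CBC and not a vetted construction); `IV_LEN`, `encrypt`, and `decrypt` are names assumed for this example.

```python
import hashlib
import os

IV_LEN = 16  # bytes; an assumption for this sketch


def _keystream(key, iv, length):
    """Toy keystream: SHA-256(key || iv || counter). Illustration only."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + iv + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def encrypt(key, plaintext):
    """Pick a fresh random IV and prepend it to the encrypted secret."""
    iv = os.urandom(IV_LEN)
    stream = _keystream(key, iv, len(plaintext))
    body = bytes(p ^ k for p, k in zip(plaintext, stream))
    return iv + body


def decrypt(key, blob):
    """Split out the IV, then use it together with the key to decrypt."""
    iv, body = blob[:IV_LEN], blob[IV_LEN:]
    stream = _keystream(key, iv, len(body))
    return bytes(c ^ k for c, k in zip(body, stream))


key = b"0123456789abcdef"
first = encrypt(key, b"hello world")
second = encrypt(key, b"hello world")
print(first != second, decrypt(key, first), decrypt(key, second))
```

The IV travels in the clear; only the key is secret. Encrypting the same message twice gives different output because each call draws a new IV.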
I found the writeup of HTTP Digest Auth (RFC 2617) very helpful in understanding the use and need for IVs / nonces.
Is it one of those things that need to be evaluated on a case by case
basis?
Yes, it is. Always read up on the cipher you are using and how it expects its inputs to look. Some ciphers don't use IVs but do require salts to be secure. IVs can be of different lengths. The mode of the cipher can change what the IV is used for (if it is used at all) and, as a result, what properties it needs to be secure (random, unique, incremental?).
It is generally recommended because most people are used to using AES-256 or similar block ciphers in a mode called 'Cipher Block Chaining'. That's a good, sensible default go-to for a lot of engineering uses and it needs you to have an appropriate (non-repeating) IV. In that instance, it's not optional.
The IV allows for plaintext to be encrypted such that the encrypted text is harder to decrypt for an attacker. Each bit of IV you use will double the possibilities of encrypted text from a given plain text.
For example, let's encrypt 'hello world' using an IV one character long. The IV is randomly selected to be 'x'. The text that is then encrypted is 'xhello world', which yields, say, 'asdfghjkl'. If we encrypt it again, we first generate a new IV--say we get 'b' this time--and encrypt like normal (thus encrypting 'bhello world'). This time we get 'qwertyuio'.
The point is that the attacker cannot know in advance which IV will be used, and therefore must compute every possible IV for a given plain text to find the matching cipher text. In this way, the IV acts like a password salt. Most commonly, an IV is used with a chaining cipher (either a stream or block cipher). In a chaining block cipher, the result of each block of plain text is fed to the cipher algorithm to find the cipher text for the next block. In this way, the blocks are chained together.
So, if you have a random IV used to encrypt the plain text, how do you decrypt it? Simple. Pass the IV (in plain text) along with your encrypted text. Using our first example above, the final cipher text would be 'xasdfghjkl' (IV + cipher text).
Yes you should use an IV, but be sure to choose it properly. Use a good random number source to make it. Don't ever use the same IV twice. And never use a constant IV.
The Wikipedia article on initialization vectors provides a general overview.