How to decrypt a HEX buffer with RC4? - encryption

I just can't seem to decrypt a hex buffer that I'm pretty sure is encrypted with RC4, and for which I'm pretty sure I know the key. Being a beginner in cryptography, I just want to make sure I'm actually doing everything right before starting to think that my assumptions are wrong.
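const crypto = require('crypto');
// The ciphertext is hex-encoded; decode it to raw bytes first.
const buffer = Buffer.from('471b...', 'hex');
// RC4 is a stream cipher and takes no IV, hence the empty string.
const decipher = crypto.createDecipheriv('rc4', 'MyKey', '');
let decrypted = '';
// The input is already a Buffer, so the input encoding argument is ignored; pass undefined.
decrypted += decipher.update(buffer, undefined, 'utf8');
decrypted += decipher.final('utf8');
console.log(decrypted); // outputs stuff like "�Y6�k�"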
Is my hex buffer really encrypted in RC4 and/or is my key right?

We cannot tell. The algorithm or key is probably not correct unless the input message was binary rather than text, because it doesn't look like any of the known character encodings.
Ciphertext is indistinguishable from random, which makes detecting a modern cipher very hard (doubly so because you left out most of the ciphertext). RC4 output can be distinguished from random noise, but only with an intricate attack; that would presumably also identify the cipher, even without knowing the key.
Furthermore, RC4 can be initialized with almost any key size. Smaller key sizes may be relatively easy to brute force; bigger key sizes might take forever (quite literally, lasting beyond the heat death of the universe).
So answers in short:
RC4 - dunno;
key correct - probably not.
By the way: print out the plaintext in hexadecimals in case you want to check if it makes sense or not. ASCII is easy to distinguish that way, but other schemes are also likely to show a pattern in binary, which you cannot see if you just get a diamond with a question mark in it (or any other replacement character, though some fonts / terminals actually display the hex value within the font, which is nice).
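As a minimal sketch of that suggestion, the helpers below print a buffer as hex next to its ASCII rendering and report how much of it is printable. The names toHexDump and printableRatio are made up for this illustration, and it assumes the decrypted output is kept as a Buffer (that is, update and final are called without an output encoding).

```javascript
// Render a Buffer as offset, hex bytes and an ASCII column (non-printable bytes shown as '.').
function toHexDump(buf, bytesPerLine = 16) {
  const lines = [];
  for (let off = 0; off < buf.length; off += bytesPerLine) {
    const chunk = buf.subarray(off, off + bytesPerLine);
    const hex = Array.from(chunk, (b) => b.toString(16).padStart(2, '0')).join(' ');
    const ascii = Array.from(chunk, (b) =>
      b >= 0x20 && b <= 0x7e ? String.fromCharCode(b) : '.'
    ).join('');
    const offset = off.toString(16).padStart(8, '0');
    lines.push(`${offset}  ${hex.padEnd(bytesPerLine * 3 - 1)}  ${ascii}`);
  }
  return lines.join('\n');
}

// Fraction of bytes that are printable ASCII or common whitespace.
// A value close to 1 suggests ASCII text; a low value suggests binary data or a wrong key.
function printableRatio(buf) {
  if (buf.length === 0) return 0;
  let printable = 0;
  for (const b of buf) {
    if ((b >= 0x20 && b <= 0x7e) || b === 0x09 || b === 0x0a || b === 0x0d) printable++;
  }
  return printable / buf.length;
}

console.log(toHexDump(Buffer.from('Hello, RC4?\n', 'utf8')));
```

A low printable ratio does not prove the key is wrong (the plaintext may simply be binary), but it is easier to reason about than a row of replacement characters.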

Related

Is there a way to detect if a hex / base64 string is actually encrypted, or just encoded?

My question is: Is there a reliable way to detect if a hex / base64 string is actually encrypted, or just encoded?
(I did a quick search, but I only seem to find explanations of the difference between encryption and encoding; none of them say how to detect encryption in general.)
I don't need to know what kind of encryption it is, just detect whether it is encrypted or not and send error if not encrypted, thus enforce encryption.
String size may vary from couple of bytes to kilobytes...
Is there a C/C++ library available for that?
If you think you're working with encoded/encrypted plaintext, the most obvious thing to do would be to try and decode with various standard encodings, and see if what you get back looks like plain English, or at least what you're looking for.
Beyond that, there's a few things you could try:
If you had a perfectly encrypted string, it would be indistinguishable from random noise, so if you can see significant correlations in your string, you probably have imperfectly encrypted data, or straight up encoded plaintext.
To find this, you can compute the "Index of Coincidence" for the string, or look for repeated blocks. If you find repeats, it's either unencrypted or, if the repeats are a multiple of 16 bytes (or another suitable block length) long, it might be ECB encrypted (i.e. every block is encrypted independently with the same key, so identical plaintext blocks give identical ciphertext blocks).
I would say your best bet would be to see how random your string is, if it's really hard to find correlations, then it's probably well encrypted. If the same bits of encrypted/encoded text keep popping up, it's probably just encoded.
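The randomness check described above can be sketched roughly as follows. This is JavaScript rather than the C/C++ the question asks about, the function names are invented for the example, and the input is assumed to be already decoded from hex or base64 into raw bytes. For uniformly random bytes the index of coincidence is close to 1/256 and the entropy approaches 8 bits per byte; both figures are unreliable for inputs of only a few bytes.

```javascript
// Count how often each byte value occurs.
function byteCounts(buf) {
  const counts = new Array(256).fill(0);
  for (const b of buf) counts[b]++;
  return counts;
}

// Index of coincidence: probability that two bytes picked at random
// (without replacement) from the buffer are equal.
function indexOfCoincidence(buf) {
  const n = buf.length;
  if (n < 2) return 0;
  let sum = 0;
  for (const c of byteCounts(buf)) sum += c * (c - 1);
  return sum / (n * (n - 1));
}

// Shannon entropy in bits per byte (0 for constant data, at most 8).
function entropyBitsPerByte(buf) {
  const n = buf.length;
  if (n === 0) return 0;
  let h = 0;
  for (const c of byteCounts(buf)) {
    if (c > 0) {
      const p = c / n;
      h -= p * Math.log2(p);
    }
  }
  return h;
}

const sample = Buffer.from('48656c6c6f2c20776f726c6421', 'hex'); // "Hello, world!"
console.log(indexOfCoincidence(sample), entropyBitsPerByte(sample));
```

High entropy alone cannot separate encrypted data from compressed data, so a check like this can only reject obviously unencrypted input, not prove that encryption took place.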

Obtain AES CBC key when I have IV, plain text and cyphered text

I am looking for a way of obtaining the key from this set of information, I know for a fact that we are using 16 byte blocks with CBC and I have the first 16 byte plaintext and cyphered, along with the used IV.
At the moment I can test if a key is correct by comparing the output, but I cannot bruteforce 16 character keys for obvious reasons, reading other posts it was my understanding that having the data I have it might be possible to get the key.
Any hint?
What you are trying to do is called a "known plaintext attack": you have both the ciphertext and the plaintext, and all that you lack is the key used. Unfortunately, all modern ciphers are designed to resist such attacks. Unless you have extremely sophisticated mathematical skills, you will not be able to find the key this way. AES is resistant to a known plaintext attack.
You will have to try some other method of determining the key. Has the key owner left it written on a piece of paper somewhere?
Note that if AES has been applied as it should be, then you cannot find the key. However, judging by the number of incorrect implementations on Stack Overflow, the key may well be a password, or a simple SHA-256 hash of a string. If you can obtain information about how the key was generated, applied or stored, you may be able to get around even AES-256.
Otherwise your only attack vector is breaking AES or brute forcing the key. In that case I wish you good luck, because brute forcing a 256 bit key is completely out of the question, even with a quantum computer. Unless vulnerabilities are found, of course, AES is not provably secure after all. There may be a vulnerability.
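If the key does turn out to be derived from a guessable password, the known IV, plaintext block and ciphertext block are enough to test candidates. The sketch below assumes one particular weak scheme (key = first 16 bytes of SHA-256 of the password) purely for illustration; deriveKey, keyMatches and findPassword are hypothetical helpers, and the real derivation would have to be learned from the application.

```javascript
const crypto = require('crypto');

// ASSUMPTION: the key is the first 16 bytes of SHA-256(password).
// Replace this with whatever derivation the real system uses.
function deriveKey(password) {
  return crypto.createHash('sha256').update(password, 'utf8').digest().subarray(0, 16);
}

// Encrypt the known 16-byte plaintext block with AES-128-CBC under the
// candidate key and compare it with the known ciphertext block.
function keyMatches(key, iv, plainBlock, cipherBlock) {
  const cipher = crypto.createCipheriv('aes-128-cbc', key, iv);
  cipher.setAutoPadding(false); // exactly one full block, so no padding
  const out = Buffer.concat([cipher.update(plainBlock), cipher.final()]);
  return out.equals(cipherBlock);
}

// Try each candidate password; return the first match or null.
function findPassword(candidates, iv, plainBlock, cipherBlock) {
  for (const password of candidates) {
    if (keyMatches(deriveKey(password), iv, plainBlock, cipherBlock)) return password;
  }
  return null;
}
```

This only helps when the candidate list is small enough to enumerate; against a truly random 128-bit or 256-bit key the same loop is the brute force that the answer rules out.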

What encryption algorithm is best for small strings?

I have a string of 10-15 characters and I want to encrypt that string. The problem is I want to get a shortest encrypted string as possible. I will also want to decrypt that string back to its original string.
Which encryption algorithm fits best to this situation?
AES uses a 16-byte block size; it is admirably suited to your needs if your limit of 10-15 characters is firm. The PKCS#7 padding scheme would add 6 to 1 bytes to the data and generate an output of exactly 16 bytes. You don't really need to use an encryption mode (such as CBC) since you're only encrypting one block. There is an issue of how you'd be handling the keys, but there is always an issue of how you handle encryption keys.
If you must go with shorter data lengths for shorter strings, then you probably need to consider AES in CTR mode. This uses the key and a counter to generate a byte stream which is XOR'd with the bytes of the string. It would leave your encrypted string at the same length as the input plaintext string.
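To make the size difference concrete, here is a small sketch using Node's crypto module (an assumption, since the question names no language). encryptOneBlock and encryptCtr are illustrative names. The single-block variant uses ECB with the default PKCS#7 padding, so identical strings produce identical ciphertexts; the CTR variant keeps the plaintext length but needs a unique IV per message, which has to be stored or sent somewhere as well.

```javascript
const crypto = require('crypto');

// One padded AES block: any input of 1 to 15 bytes yields exactly 16 bytes.
function encryptOneBlock(key, text) {
  const cipher = crypto.createCipheriv('aes-128-ecb', key, null); // ECB takes no IV
  return Buffer.concat([cipher.update(text, 'utf8'), cipher.final()]);
}

// CTR mode: ciphertext has the same length as the plaintext.
// The 16-byte IV must never repeat for the same key.
function encryptCtr(key, iv, text) {
  const cipher = crypto.createCipheriv('aes-128-ctr', key, iv);
  return Buffer.concat([cipher.update(text, 'utf8'), cipher.final()]);
}

const key = crypto.randomBytes(16);
console.log(encryptOneBlock(key, 'hello world!').length); // 16
console.log(encryptCtr(key, crypto.randomBytes(16), 'hello world!').length); // 12
```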
You'll be hard pressed to find a general purpose compression algorithm that reliably reduces the length of such short strings, so compressing before encrypting is barely an option.
If it's just one short string, you could use a one-time pad, which mathematically provides perfect secrecy.
http://en.wikipedia.org/wiki/One-time_pad
Just be sure you don't use the key more than one time.
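A one-time pad is just an XOR with a random pad at least as long as the message. The sketch below assumes the pad comes from crypto.randomBytes and is shared with the recipient over some secure channel; otpXor is a made-up name, and the same function both encrypts and decrypts.

```javascript
const crypto = require('crypto');

// XOR the data with the pad. Applying it twice with the same pad restores the data.
function otpXor(data, pad) {
  if (pad.length < data.length) {
    throw new Error('pad must be at least as long as the data');
  }
  const out = Buffer.alloc(data.length);
  for (let i = 0; i < data.length; i++) out[i] = data[i] ^ pad[i];
  return out;
}

const message = Buffer.from('short string', 'utf8');
const pad = crypto.randomBytes(message.length); // use once, then discard
const ciphertext = otpXor(message, pad);
console.log(ciphertext.length === message.length); // true
console.log(otpXor(ciphertext, pad).toString('utf8')); // short string
```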
If the main goal is shortening, I would look for a compression library that allows a fixed dictionary built on a corpus of common strings.
Personally I do not have experience with that, but I bet LZMA can do that.

encryption of a single character

What is the minimum number of bits needed to represent a single character of encrypted text?
E.g., if I wanted to encrypt the letter 'a', how many bits would I require? (Assume there are many singly encrypted characters using the same key.)
Am I right in thinking that it would be the size of the key, e.g. 256 bits?
Though the question is somewhat fuzzy, first of all it would depend on whether you use a stream cipher or a block cipher.
For the stream cipher, you would get the same number of bits out that you put in - so the binary logarithm of your input alphabet size would make sense. The block cipher requires input blocks of a fixed size, so you might pad your 'a' with zeroes and encrypt that, effectively having the block size as a minimum, like you already proposed.
I'm afraid all the answers you've had so far are quite wrong! It seems I can't reply to them, but do ask if you need more information on why they are wrong. Here is the correct answer:
About 80 bits.
You need a few bits for the "nonce" (sometimes called the IV). When you encrypt, you combine key, plaintext and nonce to produce the ciphertext, and you must never use the same nonce twice. So how big the nonce needs to be depends on how often you plan on using the same key; if you won't be using the key more than 256 times, you can use an 8 bit nonce. Note that it's only the encrypting side that needs to ensure it doesn't use a nonce twice; the decrypting side only needs to care if it cares about preventing replay attacks.
You need 8 bits for the payload, since that's how many bits of plaintext you have.
Finally, you need about 64 bits for the authentication tag. At this length, an attacker has to try on average 2^63 bogus messages before they get one accepted by the remote end. Do not think that you can do without the authentication tag; this is essential for the security of the whole mode.
Put these together using AES in an authenticated encryption mode such as EAX or GCM, and you get 80 bits of ciphertext.
The key size isn't a consideration.
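The 8 + 8 + 64 bit accounting above can be sketched with AES-GCM in Node. Two things here are assumptions rather than part of the answer: the one-byte counter is expanded to GCM's conventional 96-bit nonce by zero padding (only the counter byte is transmitted), and sealByte and openByte are hypothetical names. With a one-byte counter the same key must not be used for more than 256 messages.

```javascript
const crypto = require('crypto');

const TAG_BYTES = 8; // 64-bit authentication tag

// Expand a one-byte message counter into a 12-byte nonce (zero padded).
function nonceFromCounter(counter) {
  if (!Number.isInteger(counter) || counter < 0 || counter > 255) {
    throw new RangeError('counter must fit in one byte');
  }
  const nonce = Buffer.alloc(12);
  nonce[11] = counter;
  return nonce;
}

// Output layout: 1 byte counter | 1 byte ciphertext | 8 byte tag = 80 bits.
function sealByte(key, counter, byte) {
  const cipher = crypto.createCipheriv('aes-128-gcm', key, nonceFromCounter(counter), {
    authTagLength: TAG_BYTES,
  });
  const body = Buffer.concat([cipher.update(Buffer.from([byte])), cipher.final()]);
  return Buffer.concat([Buffer.from([counter]), body, cipher.getAuthTag()]);
}

// Returns the decrypted byte; final() throws if the tag does not verify.
function openByte(key, message) {
  const decipher = crypto.createDecipheriv('aes-128-gcm', key, nonceFromCounter(message[0]), {
    authTagLength: TAG_BYTES,
  });
  decipher.setAuthTag(message.subarray(2));
  const plain = Buffer.concat([decipher.update(message.subarray(1, 2)), decipher.final()]);
  return plain[0];
}
```

Encrypting 'a' (0x61) this way yields a 10-byte message regardless of whether the key is 128 or 256 bits long.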
You can have the same number of bits as the plaintext if you use a one-time pad.
This is hard to answer. You should definitely first read up on some fundamentals. You can 'encrypt' an 'a' with a single bit (Huffman encoding-style), and of course you could use more bits too. A number like 256 bits without any context is meaningless.
Here's something to get you started:
Information Theory -- esp. check out Shannon's seminal paper
One Time Pad -- infamous secure, but impractical, encryption scheme
Huffman encoding -- not encryption, but demonstrates the above point

Should I use an initialization vector (IV) along with my encryption?

Is it recommended that I use an initialization vector to encrypt/decrypt my data? Will it make things more secure? Is it one of those things that need to be evaluated on a case by case basis?
To put this into actual context, the Win32 Cryptography function CryptSetKeyParam allows for the setting of an initialization vector on a key prior to encrypting/decrypting. Other APIs also allow for this.
What is generally recommended and why?
An IV is essential when the same key might ever be used to encrypt more than one message.
The reason is that, under most encryption modes, two messages encrypted with the same key can be analyzed together. In a simple stream cipher, for instance, XORing two ciphertexts encrypted with the same key results in the XOR of the two messages, from which the plaintext can be easily extracted using traditional cryptanalysis techniques.
A weak IV is part of what made WEP breakable.
An IV basically mixes some unique, non-secret data into the key to prevent the same key ever being used twice.
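The keystream reuse problem described above is easy to reproduce. The sketch below uses AES-128-CTR from Node's crypto module as a stand-in for "a simple stream cipher" (an assumption; the answer names no cipher) and deliberately reuses a constant IV to show the mistake. The helper names are invented for the example.

```javascript
const crypto = require('crypto');

// XOR two buffers up to the length of the shorter one.
function xorBuffers(a, b) {
  const n = Math.min(a.length, b.length);
  const out = Buffer.alloc(n);
  for (let i = 0; i < n; i++) out[i] = a[i] ^ b[i];
  return out;
}

function ctrEncrypt(key, iv, text) {
  const cipher = crypto.createCipheriv('aes-128-ctr', key, iv);
  return Buffer.concat([cipher.update(text, 'utf8'), cipher.final()]);
}

// Encrypt two messages with the same key AND the same IV, then show that
// XORing the ciphertexts cancels the keystream and leaves m1 XOR m2.
function demoKeystreamReuse() {
  const key = crypto.randomBytes(16);
  const iv = Buffer.alloc(16); // constant IV: this is the bug being demonstrated
  const m1 = 'attack at dawn';
  const m2 = 'retreat at ten';
  const leaked = xorBuffers(ctrEncrypt(key, iv, m1), ctrEncrypt(key, iv, m2));
  const expected = xorBuffers(Buffer.from(m1, 'utf8'), Buffer.from(m2, 'utf8'));
  return leaked.equals(expected);
}

console.log(demoKeystreamReuse()); // true
```

The attacker never needs the key: the XOR of the two plaintexts falls out of the ciphertexts alone.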
In most cases you should use an IV. Since the IV is generated randomly each time, if you encrypt the same data twice, the encrypted messages are going to be different and it will be impossible for an observer to say whether these two messages are the same.
Take a good look at a picture (see below) of CBC mode. You'll quickly realize that an attacker knowing the IV is like the attacker knowing a previous block of ciphertext (and yes they already know plenty of that).
Here's what I say: most of the "problems" with IV=0 are general problems with block encryption modes when you don't ensure data integrity. You really must ensure integrity.
Here's what I do: use a strong checksum (cryptographic hash or HMAC) and prepend it to your plaintext before encrypting. There's your known first block of ciphertext: it's the IV of the same thing without the checksum, and you need the checksum for a million other reasons.
Finally: any analogy between CBC and stream ciphers is not terribly insightful IMHO.
Just look at the picture of CBC mode, I think you'll be pleasantly surprised.
Here's a picture:
http://en.wikipedia.org/wiki/Block_cipher_modes_of_operation
If the same key is used multiple times for multiple different secrets, patterns could emerge in the encrypted results. The IV, which should be pseudo-random and used only once with each key, is there to obfuscate the result. You should never use the same IV with the same key twice; that would defeat the purpose of it.
To avoid having to keep track of the IV, the simplest thing is to prepend or append it to the resulting encrypted secret. That way you don't have to think much about it. You will then always know that the first or last N bits are the IV.
When decrypting the secret you just split out the IV, and then use it together with the key to decrypt the secret.
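The prepend-and-split approach might look like this in Node. AES-256-CBC, the 16-byte IV length and the function names are choices made for this sketch, not something the answer specifies.

```javascript
const crypto = require('crypto');

const IV_BYTES = 16; // CBC uses one AES block as the IV

// Generate a fresh random IV, encrypt, and return IV || ciphertext.
function encryptWithIv(key, plaintext) {
  const iv = crypto.randomBytes(IV_BYTES);
  const cipher = crypto.createCipheriv('aes-256-cbc', key, iv);
  const body = Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()]);
  return Buffer.concat([iv, body]);
}

// Split the IV off the front, then decrypt the rest with it.
function decryptWithIv(key, message) {
  const iv = message.subarray(0, IV_BYTES);
  const body = message.subarray(IV_BYTES);
  const decipher = crypto.createDecipheriv('aes-256-cbc', key, iv);
  return Buffer.concat([decipher.update(body), decipher.final()]).toString('utf8');
}

const key = crypto.randomBytes(32);
const first = encryptWithIv(key, 'hello world');
const second = encryptWithIv(key, 'hello world');
console.log(first.equals(second)); // false: same key and plaintext, different IV
console.log(decryptWithIv(key, first)); // hello world
```

Note that this sketch provides confidentiality only; it does not detect tampering with the ciphertext.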
I found the writeup of HTTP Digest Auth (RFC 2617) very helpful in understanding the use and need for IVs / nonces.
Is it one of those things that need to be evaluated on a case by case basis?
Yes, it is. Always read up on the cipher you are using and how it expects its inputs to look. Some ciphers don't use IVs but do require salts to be secure. IVs can be of different lengths. The mode of the cipher can change what the IV is used for (if it is used at all) and, as a result, what properties it needs to be secure (random, unique, incremental?).
It is generally recommended because most people are used to using AES-256 or similar block ciphers in a mode called 'Cipher Block Chaining'. That's a good, sensible default go-to for a lot of engineering uses and it needs you to have an appropriate (non-repeating) IV. In that instance, it's not optional.
The IV allows for plaintext to be encrypted such that the encrypted text is harder for an attacker to analyze. Each bit of IV you use doubles the number of possible ciphertexts for a given plaintext.
For example, let's encrypt 'hello world' using an IV one character long. The IV is randomly selected to be 'x'. The text that is then encrypted is 'xhello world', which yields, say, 'asdfghjkl'. If we encrypt it again, we first generate a new IV (say we get 'b' this time) and encrypt as before (thus encrypting 'bhello world'). This time we get 'qwertyuio'.
The point is that an attacker cannot precompute the ciphertext for a given plaintext, because it would have to be computed for every possible IV. In this way, the IV acts like a password salt; like a salt, it does not need to be secret. Most commonly, an IV is used with a chaining cipher (either a stream or block cipher). In a chaining block cipher, the ciphertext of each block is fed into the encryption of the next block. In this way, the blocks are chained together.
So, if you have a random IV used to encrypt the plaintext, how do you decrypt it? Simple. Pass the IV (in plain text) along with your encrypted text. Using our first example above, the final ciphertext would be 'xasdfghjkl' (IV + ciphertext).
Yes you should use an IV, but be sure to choose it properly. Use a good random number source to make it. Don't ever use the same IV twice. And never use a constant IV.
The Wikipedia article on initialization vectors provides a general overview.
