How to encrypt data with aes-cbc for random access use? - encryption

How can I decrypt a random block of encrypted data using aes-cbc?

First, some background:
AES has a block size of 128 bits regardless of the key size. "AES-128" means a key size of 128 bits and "AES-256" means a key size of 256 bits.
CBC mode requires an IV. The IV is used for the first block; each later block uses the previous ciphertext block in the same way. See Cipher Block Chaining (CBC).
Decryption must be done on a block boundary. The first block uses the IV, and each subsequent block uses the previous encrypted block as, in effect, its IV. Thus decryption can start somewhere other than the beginning of the encrypted data.
An assumption on my part is that gpg and openssl place the IV in front of the encrypted data. That is the usual procedure, but it is a guess and the actual format may be more complicated (I am too lazy to look that up). This would explain why decryption from the first block works but not from other starting locations.
For more information study the available documentation.
There is a good online AES calculator provided by Cryptomathic.

With info in zaph's answer I was able to do it in python like this:
from os import urandom
from Crypto.Cipher import AES
import hashlib
IV = urandom(16)
aes = AES.new(hashlib.sha256(b'123').digest(), AES.MODE_CBC, IV)
T = b'1234567890' * 160  # 1600 bytes, an exact multiple of the 16-byte block size
C = aes.encrypt(T)
# Now if we make a new aes instance with the same IV, we can decrypt the first block:
aes = AES.new(hashlib.sha256(b'123').digest(), AES.MODE_CBC, IV)
aes.decrypt(C[:16]) # It returns b'1234567890123456'
# But to decrypt block 4 (counting from 0) we need to instantiate aes with the ciphertext of block 3 as the iv parameter:
aes = AES.new(hashlib.sha256(b'123').digest(), AES.MODE_CBC, C[48:64])
aes.decrypt(C[64:80]) # It returns b'5678901234567890'
So to sum it up: to decrypt AES-CBC ciphertext from block n to block m (counting from 0, which is bytes n×16 to ((m+1)×16)-1), use the ciphertext of block n-1 (bytes (n-1)×16 to (n×16)-1) as the IV when starting decryption at block n. For n = 0, use the real IV. This way you can decrypt any chunk of the data without access to the rest of it: you only need the chunk itself and the one block that precedes it.
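The chaining rule itself can be sketched with only the standard library. In the sketch below, `toy_encrypt_block` and `toy_decrypt_block` are hypothetical stand-ins for the AES block operation (they are not secure and are not AES); they are there only so the example runs without a crypto library. The point is `cbc_decrypt_blocks`, which needs nothing but the requested blocks and the one ciphertext block before them.

```python
BLOCK = 16  # AES block size in bytes


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    # Toy stand-in for the AES block cipher: XOR with the key, rotate left by one byte.
    mixed = xor_bytes(block, key)
    return mixed[1:] + mixed[:1]


def toy_decrypt_block(key: bytes, block: bytes) -> bytes:
    # Inverse of toy_encrypt_block: rotate right by one byte, XOR with the key.
    unrotated = block[-1:] + block[:-1]
    return xor_bytes(unrotated, key)


def cbc_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    assert len(plaintext) % BLOCK == 0, "this sketch does no padding"
    prev, out = iv, []
    for i in range(0, len(plaintext), BLOCK):
        prev = toy_encrypt_block(key, xor_bytes(plaintext[i:i + BLOCK], prev))
        out.append(prev)
    return b"".join(out)


def cbc_decrypt_blocks(key: bytes, iv: bytes, ciphertext: bytes,
                       first: int, last: int) -> bytes:
    """Decrypt blocks first..last (0-based, inclusive) without touching the rest."""
    out = []
    for n in range(first, last + 1):
        # Block 0 chains from the IV, every other block from the previous ciphertext block.
        prev = iv if n == 0 else ciphertext[(n - 1) * BLOCK:n * BLOCK]
        current = ciphertext[n * BLOCK:(n + 1) * BLOCK]
        out.append(xor_bytes(toy_decrypt_block(key, current), prev))
    return b"".join(out)


key = bytes(range(16))
iv = bytes(range(16, 32))
plaintext = b"1234567890" * 160
ciphertext = cbc_encrypt(key, iv, plaintext)
print(cbc_decrypt_blocks(key, iv, ciphertext, 4, 4))
```

With a real library, the two toy functions would be replaced by a single-block AES operation; the slicing logic stays the same.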

Related

RSA/ECB/PKCS1 Padding & AES/CBC/PKCS5Padding Encryption / Decryption

I have an API to call where I have to encrypt my data using RSA/ECB/PKCS1 Padding & AES/CBC/PKCS5PADDING.
Sample Data: {"KEY":"VALUE"}
Step.1:
I have to generate a random number of 16 digit. eg: '1234567890123456'
Step.2:
Do RSA/ECB/PKCS1Padding to random number and base64Encode the result. we get "encrypted_key"
Step.3:
Concatenate random number & data:
DATA = 1234567890123456{"KEY":"VALUE"}
Step.4:
Do AES/CBC/PKCS5Padding on DATA (from Step 3) using random number(1234567890123456) as KEY & Base64Encoded random number as IV. we get "ENCRYPTED_DATA"
So, for Step 2 I am using the JSEncrypt javascript library.
For Step 4 I am using the CryptoJS.AES.encrypt() function. I am pretty sure that my JSEncrypt call is working, as the client is able to decrypt the key, but the client is not able to decrypt my data. I feel that I am making a mistake in how I use CryptoJS.
Can someone guide me on how to use the library properly?
What I am doing is:
KEY = '1234567890123456'
IV = MTIzNDU2Nzg5MDEyMzQ1Ng== (result of btoa('1234567890123456') )
DATA = "1234567890123456{"KEY":"VAL"}"
cryptedData = CryptoJS.AES.encrypt(DATA, KEY, {iv: IV, mode: CryptoJS.mode.CBC, padding: CryptoJS.pad.Pkcs7})
I am told to use PKCS5Padding in AES/CBC Encryption ( Step 4 ) but it seems that AES does not support PKCS5Padding but PKCS7Padding.
I think I am making a mistake in the way I am passing KEY & IV to CryptoJS.
Any help will be greatly appreciated.
To start, let's look at why you are doing this exercise. RSA is intended to encrypt only a limited amount of data. So we use "hybrid encryption", where the data is encrypted using a symmetric cipher with a random key, and the key itself is encrypted using RSA.
Encryption works on binary data. To transmit binary data safely, it is encoded in printable form (hex or base64).
Step.1: I have to generate a random number of 16 digit
What we see is 16 digits 0-9. That's not really safe. Generating 16 decimal digits gives a key space of 10^16, which is approximately 2^53 (if I did the math wrong, please comment).
You need to generate 16 random bytes (each byte 0-255, giving a key space of 2^128). That is your DEK (data encryption key).
You may encode the DEK in printable form; in hexadecimal encoding it will have 32 characters.
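As a rough illustration of the difference, here is a minimal sketch that generates a 16-byte key with Python's `secrets` module and compares it with the strength of 16 decimal digits. The variable names are made up for the example.

```python
import math
import secrets

# 16 random bytes from the OS CSPRNG: 128 bits of key material.
dek = secrets.token_bytes(16)

# Printable form: two hex characters per byte.
dek_hex = dek.hex()

# Strength of a key made of 16 decimal digits, in bits.
digit_key_bits = math.log2(10 ** 16)

print(len(dek), "bytes,", len(dek_hex), "hex characters")
print("16 decimal digits are worth about %.1f bits" % digit_key_bits)
```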
Step.2:
OK, you now have the encrypted and encoded encryption key (encoded_encryption_key).
Step 3, Step 4
And here you should understand what you are doing:
Encrypt DATA using the DEK (the random bytes in binary form, not their encoded text), and you get encrypted_data. You can encode the result as encoded_encrypted_data.
Concatenate the encrypted key and the encrypted data. It is up to you whether you concatenate before or after encoding. I suggest concatenating encoded_encryption_key and encoded_encrypted_data with some separator, because if the RSA key length changes, the length of encoded_encryption_key changes too.
Make sure to discuss with the client what format is expected exactly.
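One possible wire format along those lines, as a sketch only: both parts are base64-encoded and joined with a `.` separator, which never occurs in standard base64 output. The function names and the choice of separator are assumptions for illustration; the real format is whatever the client expects.

```python
import base64


def pack_message(encrypted_key: bytes, encrypted_data: bytes, sep: str = ".") -> str:
    """Encode both parts as base64 and join them with a separator."""
    parts = (encrypted_key, encrypted_data)
    return sep.join(base64.b64encode(p).decode("ascii") for p in parts)


def unpack_message(message: str, sep: str = ".") -> tuple:
    """Split on the separator and decode both parts back to bytes."""
    key_b64, data_b64 = message.split(sep)
    return base64.b64decode(key_b64), base64.b64decode(data_b64)
```

Because the parts are split on the separator rather than at a fixed offset, the receiver does not need to know the RSA key length in advance.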
Notes:
The IV needs to be 16 bytes long for AES, and I believe CryptoJS expects it as a parsed value (for example from hex) rather than a plain string, so using btoa may not be the best idea: the base64 text of 16 bytes is 24 characters long. I believe CryptoJS just trims the value to 16 bytes, but formally it is not correct.
CBC mode provides no integrity check on its own. I suggest adding an HMAC or a signature to the result (otherwise someone could change the ciphertext without you being able to detect the tampering).
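A minimal encrypt-then-MAC sketch using Python's standard `hmac` module is shown here. It assumes a 16-byte IV, a separate MAC key, and a layout of IV, then ciphertext, then a 32-byte HMAC-SHA256 tag; the layout and function names are choices made for the example, not a standard.

```python
import hashlib
import hmac

TAG_LEN = 32  # HMAC-SHA256 output size in bytes
IV_LEN = 16   # AES block size in bytes


def seal(mac_key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    """Append an HMAC over IV + ciphertext (encrypt-then-MAC)."""
    tag = hmac.new(mac_key, iv + ciphertext, hashlib.sha256).digest()
    return iv + ciphertext + tag


def open_sealed(mac_key: bytes, sealed: bytes) -> tuple:
    """Verify the tag before anything is decrypted; return (iv, ciphertext)."""
    body, tag = sealed[:-TAG_LEN], sealed[-TAG_LEN:]
    expected = hmac.new(mac_key, body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed: data was modified")
    return body[:IV_LEN], body[IV_LEN:]
```

The MAC key should be different from the encryption key, and the tag is checked with a constant-time comparison before any decryption is attempted.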
but it seems that AES does not support PKCS5Padding but PKCS7Padding.
Indeed, AES uses PKCS#7 padding. PKCS#5 padding is functionally the same, but it is defined only for 64-bit (8-byte) blocks. The PKCS5Padding designation is still used in Java as a heritage of DES encryption.
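To make the relationship concrete, here is a small sketch of the padding rule. The function names are made up for the example. With `block_size=8` it produces what Java calls PKCS5Padding for DES, and with `block_size=16` it produces the PKCS#7 padding used with AES.

```python
def pkcs7_pad(data: bytes, block_size: int = 16) -> bytes:
    """Append N bytes of value N so the length becomes a multiple of block_size."""
    pad_len = block_size - len(data) % block_size  # always 1..block_size
    return data + bytes([pad_len]) * pad_len


def pkcs7_unpad(padded: bytes, block_size: int = 16) -> bytes:
    """Check and strip the padding added by pkcs7_pad."""
    if not padded or len(padded) % block_size != 0:
        raise ValueError("invalid padded length")
    pad_len = padded[-1]
    if not 1 <= pad_len <= block_size:
        raise ValueError("invalid padding byte")
    if padded[-pad_len:] != bytes([pad_len]) * pad_len:
        raise ValueError("invalid padding")
    return padded[:-pad_len]
```

Note that input whose length is already a multiple of the block size gets a full extra block of padding, so the padding can always be removed unambiguously.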

Does RSA encrypt each character in a message separately?

I'm working on a homework assignment on paper where I must design an RSA cryptosystem and show the steps of encrypting/decrypting a message by hand.
I have selected and calculated my p, q, n, and phi(n). I am now encrypting the message "HELLO". I have started by breaking each letter into its ASCII equivalent such that H = 72, E = 69, and so forth.
My question is should I encrypt/decrypt each letter separately to/from ciphertext or is there a better way to do it while leaving it as one string?
It seems daunting to have to do the Extended Euclidean Algorithm by hand for all the letters in order to find the decryption key.
I ask this question because I assume that RSA handles this in a better way as to not have to run encryption on each character but encrypt it as a whole.
TL;DR Should I encrypt each letter separately or can I do it all at once?
RSA encrypts the whole message by converting the text into one very large integer, usually by placing the integer representations of the characters side by side, and then applying the public key to that integer. The integer must be smaller than the modulus n, so a longer message is split into blocks. If each character is encrypted separately, the encrypted message will be vulnerable to frequency analysis.
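A minimal textbook-RSA sketch of the "whole message as one integer" approach follows. The primes (two Mersenne primes) and the exponent 65537 are assumptions picked so the example runs; they are far too small for real use, and real RSA also needs a padding scheme. `pow(e, -1, phi)` requires Python 3.8 or later.

```python
import math

# Illustrative parameters only: 2**31 - 1 and 2**61 - 1 are both prime.
p, q = 2**31 - 1, 2**61 - 1
n = p * q
phi = (p - 1) * (q - 1)
e = 65537
assert math.gcd(e, phi) == 1
d = pow(e, -1, phi)  # modular inverse of e, i.e. the private exponent


def text_to_int(text: str) -> int:
    """Place the ASCII codes side by side as one big-endian integer."""
    return int.from_bytes(text.encode("ascii"), "big")


def int_to_text(value: int) -> str:
    length = (value.bit_length() + 7) // 8
    return value.to_bytes(length, "big").decode("ascii")


m = text_to_int("HELLO")   # bytes 72, 69, 76, 76, 79 as a single number
assert m < n               # the message integer must be smaller than the modulus
c = pow(m, e, n)           # encrypt once, for the whole string
recovered = int_to_text(pow(c, d, n))
print(recovered)
```

The decryption key d is computed once per key pair, not once per letter, so the Extended Euclidean Algorithm only has to be run a single time.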

AES- ECB using DataPower

I have a requirement wherein I get a hex string that is 32 characters long. I need to encrypt it with AES-128-ECB and get a hex string that is again 32 characters long.
I have been asked to convert the 32-character hex string to a binary stream (to get 16 bytes of data), encrypt it using AES-ECB (to get 16 bytes of encrypted data), and then convert these 16 bytes of encrypted data to a 32-character hex string.
I came across this article to achieve AES-ECB encryption.
https://www.ibm.com/developerworks/community/blogs/HermannSW/entry/gatewayscript_modules_aes?lang=en
Kindly let me know how to achieve this.
You already have the concept; what is missing is the actual code. For more detailed help, make a best-effort attempt and add that code to the question, along with error information and input/output test data (in hex).
Note that you need to ensure that padding is not added. Some AES implementations add padding by default, and will add a full block of (PKCS#7) padding to data that is an exact multiple of the block size (16 bytes for AES).
Note: ECB mode is insecure when the key is used more than once and there is similarity in the data. See ECB mode, scroll down to the Penguin.

XOR encryption/decryption when the key is more than one byte long?

Suppose that the character 'b' is used as a key for XOR encryption. In that case, encrypting a plain text is done by XOR-ing each byte (character) of the text by the ascii code of 'b'. Conversely, the plain text can be obtained from the ciphered text by XOR-ing by 'b's ascii code again. This is understood.
However, how exactly does one encrypt when the key (password) is a string of characters? Suppose that the encrypting password is 'adg'. In that case, is the plain text ciphered via XOR-ing each of its bytes with the value of a XOR d XOR g? If not, then how?
One way is to repeat the key until it covers the plain text.
e.g. key = RTTI, plaintext = "how exactly does one"
Text: how exactly does one
Key: RTTIRTTIRTTIRTTIRTTI
Each character in the plain text will be XOR'd with the corresponding key character below it.
There are many ways to implement "XOR encryption", so if you're trying to decode some existing data, you'll first need to figure out which kind it's encrypted with.
The most common scheme I've seen works basically like the classic Vigenère cipher; e.g. for the three-byte key abc, the first byte of plaintext is XORed with a, the second with b, the third with c; the fourth byte is then again XORed with a, the fifth with b, and so on, like this:
Plaintext: THIS IS SOME SECRET TEXT...
Key: abcabcabcabcabcabcabcabcabc
--------------------------------------
XOR: 5**2B*2B0./&A1&"0&5B7$:7OLM
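The scheme above can be sketched in a few lines of Python. The function name is made up for the example. Because XOR is its own inverse, the same function both encrypts and decrypts.

```python
from itertools import cycle


def xor_repeating_key(data: bytes, key: bytes) -> bytes:
    """XOR each data byte with the key byte at the same position, repeating the key."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


ciphertext = xor_repeating_key(b"THIS IS SOME SECRET TEXT...", b"abc")
print(ciphertext)
print(xor_repeating_key(ciphertext, b"abc"))
```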
One way to recognize this kind of repeating-key cipher (and also find out the key length) is to compute the index of coincidence between pairs of bytes N positions apart in the ciphertext. If the key length is L, then plotting the index of coincidence as a function of N should reveal a regular array of peaks at the values of N that are divisible by L. (Of course, this only works if the plaintext is something like normal text or code that has a biased byte frequency distribution; if it's completely random data, then this won't help.)
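A simplified version of that measurement is sketched here: for each shift N it counts the fraction of positions where the ciphertext byte equals the byte N positions later. The sample text, the key `K3y`, and the function names are assumptions chosen for the demonstration; real ciphertext is usually noisier and needs more data.

```python
def coincidence_rate(data: bytes, shift: int) -> float:
    """Fraction of positions i where data[i] == data[i + shift]."""
    pairs = len(data) - shift
    if shift <= 0 or pairs <= 0:
        return 0.0
    hits = sum(1 for i in range(pairs) if data[i] == data[i + shift])
    return hits / pairs


def coincidence_profile(data: bytes, max_shift: int) -> dict:
    return {n: coincidence_rate(data, n) for n in range(1, max_shift + 1)}


def xor_repeating_key(data: bytes, key: bytes) -> bytes:
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


plaintext = (b"it was the best of times it was the worst of times "
             b"it was the age of wisdom it was the age of foolishness")
ciphertext = xor_repeating_key(plaintext, b"K3y")

profile = coincidence_profile(ciphertext, 12)
best_shift = max(profile, key=profile.get)
for shift, rate in profile.items():
    print(shift, round(rate, 3))
```

In this example the non-zero rates appear only at shifts that are multiples of the three-byte key length, which is the regular pattern of peaks described above.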
Or you could just use hellman's xortool, which will automate all this for you. For example, running it on the ciphertext 5**2B*2B0./&A1&"0&5B7$:7OLM above, it says:
The most probable key lengths:
1: 17.3%
3: 40.7%
6: 21.5%
8: 6.5%
12: 5.4%
15: 4.6%
18: 4.0%
Key-length can be 3*n
If you have enough ciphertext, and can guess the most common byte in the plaintext, it will even spit out the key for you.

How to decide if the chosen password is correct?

If an encrypted file exists and someone wants to decrypt it, there are several methods to try.
For example, if you choose a brute force attack, that's easy: just try all possible keys and you will find the correct one. For this question, it doesn't matter that this might take too long.
But trying keys means the following steps:
Choose key
Decrypt data with key
Check if decryption was successful
Besides the problem that you would need to know the algorithm that was used for the encryption, I cannot imagine how one would do #3.
Here is why: After decrypting the data, I get some "other" data. In the case of an encrypted plain text file in a language that I can understand, I can now check whether the result is text in that language.
If it would be a known file type, I could check for specific file headers.
But since one tries to decrypt something secret, it is most likely unknown what kind of information there will be if correctly decrypted.
How would one check if a decryption result is correct if it is unknown what to look for?
As you suggest, one would expect the plaintext to be of some known format, e.g., a JPEG image, a PDF file, etc. The idea is that it is very unlikely that a given ciphertext can be decrypted into both a valid JPEG image and a valid PDF file (but see below).
But it is actually not that important. When one talks about a cryptosystem being secure, one (roughly) talks about the odds of you being able to guess the plaintext corresponding to a given ciphertext. So I pick a random message m and encrypt it as c = E(m). I give you c, and if you cannot guess m, then we say the cryptosystem is secure; otherwise it's broken.
This is just a simple security definition. There are other definitions that require the system to be able to hide known plaintexts (semantic security): you give me two messages, I encrypt one of them, and you will not be able to tell which message I chose.
The point is, that in these definitions, we are not concerned with the format of the plaintexts, all we require is that you cannot guess the plaintext that was encrypted. So there is no step 3 :-)
By not considering your step 3, we make the question of security as clear as possible: instead of arguing about how hard it is to guess which format you used (zip, gzip, bzip2, ...) we only talk about the odds of breaking the system compared to the odds of guessing the key. It is an old principle that you should concentrate all your security in the key -- it simplifies things dramatically when your only assumption is the secrecy of the key.
Finally, note that some encryption schemes make it impossible for you to verify whether you have the correct key, since all keys are legal. The one-time pad is an extreme example of such a scheme: you take your plaintext m, choose a perfectly random key k and compute the ciphertext as c = m XOR k. This gives you a completely random ciphertext, and it is perfectly secure (the only perfectly secure cryptosystem, btw).
When searching for an encryption key, you cannot know when you've found the right one. This is because c could be an encryption of any file with the same length as m: if you encrypt the message m' with the key k' = c XOR m', you'll see that you get the same ciphertext again, so you cannot know whether m or m' was the original message.
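The ambiguity can be demonstrated directly. In this sketch the two messages are arbitrary examples of equal length, and `xor_bytes` is a helper defined only for the illustration.

```python
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


m = b"ATTACK AT DAWN"
k = secrets.token_bytes(len(m))   # one-time pad key, as long as the message
c = xor_bytes(m, k)               # c = m XOR k

# Pick any other message of the same length and solve for a key that "explains" c.
m_alt = b"RETREAT AT TEN"
k_alt = xor_bytes(c, m_alt)       # k' = c XOR m'

print(xor_bytes(c, k))            # decrypts to m
print(xor_bytes(c, k_alt))        # decrypts to m_alt, equally valid
```

A brute-force search over keys would find both k and k_alt, with nothing in the ciphertext to say which one was really used.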
Instead of thinking of exclusive-or, you can think of the one-time pad like this: I give you the number 42 and tell you that it is the sum of two integers (negative, positive, you don't know). One integer is the message, the other is the key and 42 is the ciphertext. Like above, it makes no sense for you to guess the key: if you want the message to be 100, you claim the key is -58; if you want the message to be 0, you claim the key is 42, etc. The one-time pad works exactly like this, but on bit values instead.
About reusing the key in a one-time pad: let's say my key is 7 and you see the ciphertexts 10 and 20, corresponding to plaintexts 3 and 13. From the ciphertexts alone, you now know that the difference between the plaintexts is 10. If you somehow gain knowledge of one of the plaintexts, you can now derive the other! If the numbers correspond to individual letters, you can begin looking at several such differences and try to solve the resulting crossword puzzle (or let a program do it based on frequency analysis of the language in question).
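The same leak shows up with XOR. In the sketch below, the key bytes and both messages are made-up values; the point is that XOR-ing two ciphertexts made with the same pad cancels the key completely.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


# Integer analogy from the text: key 7, plaintexts 3 and 13.
key_int = 7
c_a, c_b = 3 + key_int, 13 + key_int
difference = c_b - c_a             # equals 13 - 3, the key has dropped out

# XOR version with a reused pad (illustrative key bytes).
pad = bytes.fromhex("8f1c3a5e7b9d2046a1c3")
m1 = b"PAY ALICE "
m2 = b"PAY BOB   "
c1 = xor_bytes(m1, pad)
c2 = xor_bytes(m2, pad)

leak = xor_bytes(c1, c2)           # equals m1 XOR m2, no key involved
recovered_m2 = xor_bytes(leak, m1) # knowing m1 reveals m2
print(difference, recovered_m2)
```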
You could use heuristics like the unix file command does to check for a known file type. If you have decrypted data that has no recognizable type, decrypting it won't help you anyway, since you can't interpret it, so it's still as good as encrypted.
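A very small version of that heuristic is sketched here: compare the start of the candidate plaintext against a few well-known file signatures. The table covers only a handful of formats and the function name is made up; the real file command knows far more signatures and rules.

```python
# A few well-known file signatures ("magic numbers") at offset 0.
MAGIC_NUMBERS = {
    b"%PDF-": "PDF document",
    b"\x89PNG\r\n\x1a\n": "PNG image",
    b"\xff\xd8\xff": "JPEG image",
    b"PK\x03\x04": "ZIP archive",
    b"\x1f\x8b": "gzip data",
}


def guess_file_type(data: bytes):
    """Return a type name if the data starts with a known signature, else None."""
    for magic, name in MAGIC_NUMBERS.items():
        if data.startswith(magic):
            return name
    return None
```

Short signatures such as the two-byte gzip header will match random data fairly often across many trial keys, so a hit is a hint to look closer rather than proof of a correct key.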
I wrote a tool a little while ago that checked if a file was possibly encrypted by simply checking the distribution of byte values, since encrypted files should be indistinguishable from random noise. The assumption here then is that an improperly decrypted file retains the random nature, while a properly decrypted file will exhibit structure.
#!/usr/bin/env python3
import math
import os
import sys

MAGIC_COEFF = 3


def get_random_bytes(filename):
    """Return the first block of the file plus up to BLOCKS randomly spaced blocks."""
    BLOCK_SIZE = 1024 * 1024
    BLOCKS = 10
    with open(filename, "rb") as f:
        data = bytearray(f.read(BLOCK_SIZE))
        if len(data) < BLOCK_SIZE:
            return data
        f.seek(0, 2)  # seek to the end to learn the file length
        file_len = f.tell()
        index = BLOCK_SIZE
        cnt = 0
        while index < file_len and cnt < BLOCKS:
            f.seek(index)
            data.extend(f.read(BLOCK_SIZE))
            # skip ahead by a random number (0-255) of blocks
            index += os.urandom(1)[0] * BLOCK_SIZE
            cnt += 1
    return data


def failed_n_gram(n, data):
    """Return True if some n-gram count is too far from the uniform expectation."""
    print("\t%d-gram analysis" % n)
    N = len(data) // n
    states = 2 ** (8 * n)
    print("\tN: %d states: %d" % (N, states))
    if N < states:
        print("\tinsufficient data")
        return False
    histo = [0] * states
    P = 1.0 / states
    # floor division, as in the original Python 2 code
    expected = float(N // states)
    # I forgot how this was derived, or what it is suppose to be
    # (sqrt(N*P*(1-P)) is the standard deviation of a binomial count)
    magic = math.sqrt(N * P * (1 - P)) * MAGIC_COEFF
    print("\texpected: %f magic: %f" % (expected, magic))
    idx = 0
    while idx < len(data) - n:
        # pack the n bytes starting at idx into one integer
        val = 0
        for x in range(n):
            val = (val << 8) | data[idx + x]
        histo[val] += 1
        idx += 1
        count = histo[val]
        if count - expected > magic:
            print("\tfailed: %s occured %d times" % (hex(val), count))
            return True
    # need this check because the absence of certain bytes is also
    # a sign something is up
    for i in range(len(histo)):
        count = histo[i]
        if expected - count > magic:
            print("\tfailed: %s occured %d times" % (hex(i), count))
            return True
    print("")
    return False


def main():
    for f in sys.argv[1:]:
        print(f)
        rand_bytes = get_random_bytes(f)
        # stop at the first n-gram size that flags the file
        for n in (3, 2, 1):
            if failed_n_gram(n, rand_bytes):
                break


if __name__ == "__main__":
    main()
I find this works reasonably well:
$ entropy.py ~/bin/entropy.py entropy.py.enc entropy.py.zip
/Users/steve/bin/entropy.py
1-gram analysis
N: 1680 states: 256
expected: 6.000000 magic: 10.226918
failed: 0xa occured 17 times
entropy.py.enc
1-gram analysis
N: 1744 states: 256
expected: 6.000000 magic: 10.419895
entropy.py.zip
1-gram analysis
N: 821 states: 256
expected: 3.000000 magic: 7.149270
failed: 0x0 occured 11 times
Here .enc is the source run through:
openssl enc -aes-256-cbc -in entropy.py -out entropy.py.enc
And .zip is self-explanatory.
A few caveats:
It doesn't check the entire file, just the first block (BLOCK_SIZE, 1 MB in the code above), then up to ten randomly spaced blocks from the rest of the file. So if a file were random data with, say, a JPEG appended, it could fool the program. The only way to be sure is to check the entire file.
In my experience, the code reliably detects when a file is unencrypted (since nearly all useful data has structure), but due to its statistical nature it may sometimes misdiagnose an encrypted/random file.
As it has been pointed out, this kind of analysis will fail for OTP, since you can make it say anything you want.
Use at your own risk, and most certainly not as the only means of checking your results.
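A simpler measure in the same spirit, not the tool above, is the Shannon entropy of the byte distribution. The sketch below is an assumption-level alternative: well-encrypted or compressed data should come out close to 8 bits per byte, while text and most structured formats score noticeably lower. Like the histogram test, it needs a reasonable amount of data to be meaningful.

```python
import math
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte values, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


print(byte_entropy(b"the quick brown fox jumps over the lazy dog"))
print(byte_entropy(bytes(range(256)) * 4))
```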
One way is to compress the source data with some standard algorithm like zip before encrypting it. If you can unzip the result after decryption, it was decrypted correctly. Compression is very often done by encryption programs prior to encryption, because it is another step the brute-forcer has to repeat for each trial and lose time on, and because encrypted data is almost surely incompressible (its size doesn't decrease if you compress it after encrypting with a chained algorithm).
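As a sketch of that check, assuming the plaintext was compressed with zlib before encryption (the choice of zlib and the function name are assumptions for the example): a candidate plaintext from a wrong key will almost always fail to decompress.

```python
import zlib


def looks_correctly_decrypted(candidate: bytes) -> bool:
    """Return True if the candidate plaintext is a valid zlib stream."""
    try:
        zlib.decompress(candidate)
        return True
    except zlib.error:
        return False


good = zlib.compress(b"some secret document" * 10)
print(looks_correctly_decrypted(good))
print(looks_correctly_decrypted(b"not a zlib stream"))
```

The zlib format ends with a checksum of the uncompressed data, so a stream that decompresses cleanly is strong evidence that the key was right.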
Without a more clearly defined scenario, I can only point to cryptanalysis methods. I would say it's generally accepted that validating the result is an easy part of cryptanalysis. Compared with decrypting even a known cipher, a thorough validation check costs little CPU.
Are you seriously asking a question like this?
Well, if it were known what's inside, then you would not need to decrypt it anyway, right?
This is less a programming question than a mathematical one. I took some encryption math classes at my university.
And you cannot confirm without a lot of data points.
Sure, you can check whether the result makes sense and is clearly meaningful in plain English (or whatever language is used), but to answer your question:
If you were able to decrypt, you should be able to encrypt as well.
So encrypt the result by reversing the decryption process, and if you get the same ciphertext you might be golden; if not, something is possibly wrong.
