When using the openssl_encrypt() function in PHP to encrypt a string with AES-256-CBC as the encryption method:
$encrypted = openssl_encrypt($data, "AES-256-CBC", $key, 0, $iv);
I tried different string lengths for $data, and the resulting length of $encrypted will increase when $data reaches a multiple of 16 bytes. But it seems the growth is not steady.
Is there a general formula that relates the length of $data and $encrypted?
Let me quote from https://stackoverflow.com/a/3717552/2393787
With CBC mode, the input data must have a length multiple of the block length, so it is customary to add PKCS#5 padding: if the block length is n, then at least 1 byte is added, at most n, such that the total size is a multiple of n, and the last added bytes (possibly all of them) have numerical value k where k is the number of added bytes. Upon decryption, it suffices to look at the last decrypted byte to recover k and thus know how many padding bytes must be ultimately removed.
Hence, with CBC mode and AES, assuming PKCS#5 padding, if the input data has length d then the encrypted length is (d + 16) & ~15. I am using C-like notation here; in plain words, the length is between d+1 and d+16, and multiple of 16.
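This means the raw ciphertext length follows directly from the input length: it grows in steps of 16 bytes rather than steadily. Note also that with the options argument set to 0, openssl_encrypt() returns the ciphertext Base64-encoded, so the returned string is 4 * ceil(c / 3) characters long, where c is the raw ciphertext length. If you need output that is not rounded up to a whole block, you should consider moving to a mode that needs no padding (for example CTR).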
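As a rough check of the quoted formula, here is a small Python sketch that computes the expected lengths. It assumes PKCS#5/PKCS#7 padding and assumes the returned string is Base64-encoded, which is what openssl_encrypt() does when the options argument is 0. The function names are made up for illustration.

```python
BLOCK = 16  # AES block size in bytes, independent of key size


def cbc_padded_len(d: int) -> int:
    """Raw ciphertext length for d input bytes: (d + 16) & ~15."""
    return (d + BLOCK) & ~(BLOCK - 1)


def base64_len(n: int) -> int:
    """Length of the padded Base64 encoding of n bytes: 4 * ceil(n / 3)."""
    return 4 * ((n + 2) // 3)


def encrypted_string_len(d: int) -> int:
    """Assumed length of the Base64 string for d input bytes."""
    return base64_len(cbc_padded_len(d))


for d in (0, 15, 16, 31, 32):
    print(d, cbc_padded_len(d), encrypted_string_len(d))
# 0 16 24
# 15 16 24
# 16 32 44
# 31 32 44
# 32 48 64
```

The Base64 step is why the growth looks uneven: the raw length jumps by 16 each time, but the encoded length jumps by 20 or 24 characters.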
The question is why signatureValue is so big if it is based on hashes?
Suppose Signature Algorithm is sha256RSA.
Shouldn't it be smaller, according to the following steps:
Calculate SHA256 hash from tbsCertificate. Output => 256 bits.
Sign 256 bits hash with RSA private key. Output => 256 bits?
But if you see the size of the signatureValue, it might have 2048, 4096, [bigger?] bits.
Signature size doesn't depend on hashing algorithm used to hash signed data. It depends on key size only.
The RSA signature is based on modular exponentiation, i.e. sig = m ^ d mod N, where:
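m is the message to be signed (in practice, the padded hash of the data, encoded as an integer smaller than N)
d is the private exponent (it is typically about as long as the modulus, i.e. close to 2048 bits for RSA 2048; it is the primes p and q that are about half that length)
N is the modulus
sig is the signature value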
For such a calculation the final result is the remainder of m ^ d divided by the modulus (whose length is, roughly, the RSA key size). Both m and d are large values, and raising one to the power of the other gives a huge number that would not fit within the modulus length, which is why the final mod operation is applied. As you can see, the hash does not appear as a term in this calculation. Sometimes the resulting value is shorter than the modulus. In such cases, the signature value is padded with leading zeros to match the modulus length.
From the raw signature you can infer the RSA key size, but you can't infer the hash algorithm that was used. This is why the signature algorithm identifier names both the asymmetric algorithm and the hash, such as sha256RSA; otherwise you would have to store the hash algorithm somewhere else in the message. Since the set of combinations (asymmetric algorithm and hash algorithm) is finite and quite small, it was good enough to assign a unique OID to each combination.
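To illustrate that the signature length follows the modulus and not the hash, here is a toy Python sketch. The key is the textbook example (p = 61, q = 53), far too small for real use, and reducing the digest mod N is a shortcut assumed only for this demo; real RSA signatures embed the digest in a padded encoding as long as the modulus. The name toy_sign is hypothetical.

```python
import hashlib

# Textbook-sized toy key; never use numbers this small in practice.
N = 61 * 53                      # modulus, 3233
E = 17                           # public exponent
D = 2753                         # private exponent: 17 * 2753 = 1 (mod 3120)
K = (N.bit_length() + 7) // 8    # modulus length in bytes (2 here)


def toy_sign(message: bytes, hash_name: str) -> bytes:
    digest = hashlib.new(hash_name, message).digest()
    # Toy shortcut (assumption): reduce the digest mod N instead of
    # applying a real signature padding scheme.
    m = int.from_bytes(digest, "big") % N
    sig = pow(m, D, N)           # sig = m ^ d mod N
    # Left-pad with zero bytes so the output always has K bytes.
    return sig.to_bytes(K, "big")


for name in ("sha1", "sha256", "sha512"):
    # The printed length is K for every hash algorithm.
    print(name, len(toy_sign(b"tbsCertificate", name)))
```

Whatever the digest size (20, 32 or 64 bytes here), the result is always K bytes, because the final mod N bounds it.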
I checked the storage size but I'm confused when it comes to storing numbers.
In case of Bytes, what does "Byte length" mean? If I store -128 what's the length? And in case of 12?
In case of Floating-point number and Integer, it doesn't matter if I store 325 or 9.9999999999999 it will always be 8 bytes?
In case of Array? Let's say we have ["ab", "bcd"], what's the size: (2+3=5) or (2+1)+(3+1)=7?
If you store an array of bytes, the size will simply be the length of that array. An array with a single byte value of -128 is still just one byte.
Yes, all numbers occupy the same 8-byte size, even if you don't see a fractional part.
The documentation says it's the sum of the array element sizes, so I would expect 7: the sum of the two individual string sizes, each being the number of UTF-8 encoded bytes + 1.
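The three rules from the answers above can be put into a small Python sketch. It only models what is stated here (byte arrays count their length, numbers count 8 bytes, strings count their UTF-8 bytes + 1, arrays sum their elements); the function name stored_size is hypothetical and other value types are deliberately left out.

```python
def stored_size(value) -> int:
    """Estimated storage size in bytes, per the rules described above."""
    if isinstance(value, bool):
        # Booleans are not covered by the rules above.
        raise TypeError("unsupported type: bool")
    if isinstance(value, (bytes, bytearray)):
        return len(value)                      # byte length
    if isinstance(value, (int, float)):
        return 8                               # every number takes 8 bytes
    if isinstance(value, str):
        return len(value.encode("utf-8")) + 1  # UTF-8 bytes + 1
    if isinstance(value, list):
        return sum(stored_size(v) for v in value)
    raise TypeError(f"unsupported type: {type(value).__name__}")


print(stored_size(bytes([0x80])))     # 1 (a single byte, e.g. -128 as signed)
print(stored_size(325))               # 8
print(stored_size(9.9999999999999))   # 8
print(stored_size(["ab", "bcd"]))     # 7
```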
I have data that needs to be stored in a database in encrypted form. The maximum length of the data before encryption is 50 chars (English or Arabic). I need to encrypt the data using AES-128 and store the output in the database as a Base64 string.
How to know the length of the data after encryption?
Try it with your specified algorithm, block size, IV size, and see what size output you get :-)
First, it depends on the encoding of the input text. Is it UTF-8? UTF-16?
Let's assume UTF-8 with English (ASCII) text, so 1 byte per character means 50 bytes of input data to your encryption algorithm. (100 bytes if UTF-16. Note that Arabic letters typically take 2 bytes each in UTF-8 as well, so 50 Arabic characters also come to about 100 bytes.)
Then you will pad to the block size for the algorithm. AES, regardless of key size, uses a block of 16 bytes. So we will be padded out to 64 bytes (or 112 for UTF-16).
Then we need to store the IV and header information. That is (usually, with default settings/IV sizes) another 16 bytes, so we are at 80 bytes (or 128 for UTF-16).
Finally we are encoding to Base64. I assume you want the string length, since otherwise it is wasteful to make it into a string. Base64 bloats the data according to the following formula: Ceil(bytes/3) * 4. So for us that is Ceil(80/3) * 4 = 27 * 4 = 108 characters (or 172 for UTF-16).
Again this is all highly dependent on your choices of how you encrypt, what the text is encoded as, etc.
I would try it with your scenario before relying on these numbers for anything useful.
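The arithmetic in this answer can be reproduced with a short Python sketch. It assumes PKCS#7 padding, a 16-byte IV stored in front of the ciphertext, and padded Base64; the function name and parameters are made up for illustration, so adjust them to whatever your library actually does.

```python
def base64_ciphertext_len(text: str, encoding: str = "utf-8",
                          iv_len: int = 16) -> int:
    """Estimated Base64 length of IV + AES ciphertext for the given text."""
    n = len(text.encode(encoding))       # bytes fed to the cipher
    padded = (n // 16 + 1) * 16          # padding adds 1..16 bytes
    total = padded + iv_len              # IV stored with the ciphertext
    return 4 * ((total + 2) // 3)        # Ceil(total / 3) * 4


english = "a" * 50
arabic = "\u0628" * 50                   # 50 x ARABIC LETTER BEH

print(base64_ciphertext_len(english))                # 108
print(base64_ciphertext_len(english, "utf-16-le"))   # 172
print(base64_ciphertext_len(arabic))                 # 172 (2 bytes per letter)
```

Sizing the column for the worst case (172 here, under these assumptions) is the safer choice when the text may be Arabic.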
Suppose that the character 'b' is used as a key for XOR encryption. In that case, encrypting a plain text is done by XOR-ing each byte (character) of the text by the ascii code of 'b'. Conversely, the plain text can be obtained from the ciphered text by XOR-ing by 'b's ascii code again. This is understood.
However, how exactly does one encrypt when the key (password) is a string of characters? Suppose that the encrypting password is 'adg'. In that case, is the plain text ciphered via XOR-ing each of its bytes with the value of a XOR d XOR g? If not, then how?
One way is to repeat the key until it covers the whole plaintext.
e.g. key = RTTI, plaintext = "how exactly does one"
Text: how exactly does one
Key:  RTTIRTTIRTTIRTTIRTTI
Each character in the plain text will be XOR'd with the corresponding key character below it.
There are many ways to implement "XOR encryption", so if you're trying to decode some existing data, you'll first need to figure out which kind it's encrypted with.
The most common scheme I've seen works basically like the classic Vigenère cipher; e.g. for the three-byte key abc, the first byte of plaintext is XORed with a, the second with b, the third with c; the fourth byte is then again XORed with a, the fifth with b, and so on, like this:
Plaintext: THIS IS SOME SECRET TEXT...
Key:       abcabcabcabcabcabcabcabcabc
           ---------------------------
XOR:       5**2B*2B0./&A1&"0&5B7$:7OLM
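A minimal Python sketch of this repeating-key scheme, reproducing the example above (the function name xor_repeating is my own):

```python
from itertools import cycle


def xor_repeating(data: bytes, key: bytes) -> bytes:
    """XOR each byte of data with the key, repeating the key as needed."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


ciphertext = xor_repeating(b"THIS IS SOME SECRET TEXT...", b"abc")
print(ciphertext.decode("ascii"))   # 5**2B*2B0./&A1&"0&5B7$:7OLM

# Applying the same key again restores the plaintext.
print(xor_repeating(ciphertext, b"abc").decode("ascii"))
```

Note that the key bytes are used one after another; they are not XORed together into a single byte first.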
One way to recognize this kind of repeating-key cipher (and also find out the key length) is to compute the index of coincidence between pairs of bytes N positions apart in the ciphertext. If the key length is L, then plotting the index of coincidence as a function of N should reveal a regular array of peaks at the values of N that are divisible by L. (Of course, this only works if the plaintext is something like normal text or code that has a biased byte frequency distribution; if it's completely random data, then this won't help.)
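A simple form of this check can be sketched in Python as follows. It just counts how often bytes N positions apart are equal, which is a simplified stand-in for a full index-of-coincidence computation; the function names are hypothetical, and on short ciphertexts the result will be noisy.

```python
def coincidence_rate(data: bytes, shift: int) -> float:
    """Fraction of positions i where data[i] == data[i + shift]."""
    pairs = len(data) - shift
    if shift <= 0 or pairs <= 0:
        return 0.0
    hits = sum(data[i] == data[i + shift] for i in range(pairs))
    return hits / pairs


def rank_shifts(data: bytes, max_shift: int = 20) -> list:
    """Shifts 1..max_shift, ordered from highest to lowest coincidence rate."""
    rates = {n: coincidence_rate(data, n) for n in range(1, max_shift + 1)}
    return sorted(rates, key=rates.get, reverse=True)
```

With enough ciphertext from biased plaintext, the shifts at the top of rank_shifts(ciphertext) should tend to be multiples of the key length.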
Or you could just use hellman's xortool, which will automate all this for you. For example, running it on the ciphertext 5**2B*2B0./&A1&"0&5B7$:7OLM above, it says:
The most probable key lengths:
1: 17.3%
3: 40.7%
6: 21.5%
8: 6.5%
12: 5.4%
15: 4.6%
18: 4.0%
Key-length can be 3*n
If you have enough ciphertext, and can guess the most common byte in the plaintext, it will even spit out the key for you.
I've some clear text which I want to encrypt using RSA_PKCS_V21 (using PolarSSL library). The problem is that I need to know size of cipher text before executing the algorithm (for dynamic memory allocation purpose).
I know RSA key size & clear text length.
I also want to know the limitation on input clear text length.
Any idea?
Just check the RSA PKCS#1 v2.1 standard, chapter 7.2:
RSAES-PKCS1-V1_5-ENCRYPT ((n, e), M)
Input:
   (n, e)   recipient's RSA public key (k denotes the length in octets
            of the modulus n)
   M        message to be encrypted, an octet string of length mLen,
            where mLen <= k - 11
So the maximum input length depends on the key size: k is the key (modulus) size expressed in octets. For a 1024 bit key you therefore have 1024 / 8 - 11 = 117 bytes as the maximum plain text length.
Note that the above is the maximum size for RSA with PKCS#1 v1.5 padding. For the newer OAEP padding the following can be found in chapter 7.1:
RSAES-OAEP-ENCRYPT ((n, e), M, L)
...
Input:
   (n, e)   recipient's RSA public key (k denotes the length in octets
            of the RSA modulus n)
   M        message to be encrypted, an octet string of length mLen,
            where mLen <= k - 2hLen - 2
   L        optional label to be associated with the message; the
            default value for L, if L is not provided, is the empty
            string
Here hLen is the output size in octets of the hash function chosen for OAEP (normally the same hash is also used for the mask generation function). If the default SHA-1 hash function is used then the maximum size of the message is k - 42 (as the output size of SHA-1 is 20 bytes, and 2 * 20 + 2 = 42).
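Both limits can be computed directly from the key size. Here is a minimal Python sketch of the two bounds quoted above; the function names are made up, and hashlib is used only to look up the digest size.

```python
import hashlib


def max_len_pkcs1_v15(key_bits: int) -> int:
    """Maximum message length in bytes: mLen <= k - 11."""
    k = (key_bits + 7) // 8                  # modulus length in octets
    return k - 11


def max_len_oaep(key_bits: int, hash_name: str = "sha1") -> int:
    """Maximum message length in bytes: mLen <= k - 2hLen - 2."""
    k = (key_bits + 7) // 8
    h_len = hashlib.new(hash_name).digest_size
    return k - 2 * h_len - 2


print(max_len_pkcs1_v15(1024))        # 117
print(max_len_oaep(1024))             # 86  (SHA-1, k - 42)
print(max_len_oaep(1024, "sha256"))   # 62
print(max_len_oaep(2048, "sha256"))   # 190
```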
Normally a randomly generated secret key is encrypted instead of the message. Then the message is encrypted with that secret key. This allows almost infinitely long messages, and symmetric crypto - such as AES in CBC mode - is much faster than asymmetric crypto. This combination is called hybrid encryption.
The output size for RSA encryption or signature generation with any padding is identical to the size of the modulus in bytes (rounded upwards, of course), so for a 1024 bit key you would expect 1024 / 8 = 128 octets / bytes.
Note that the output array of the calculated size may contain leading bytes set to zero; this should be considered normal.
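For the allocation question, the buffer size therefore depends only on the key size. A small Python sketch of that calculation, also showing where the leading zero bytes come from (the names are illustrative and not part of any library):

```python
def rsa_output_len(key_bits: int) -> int:
    """Ciphertext or signature length in bytes for a given modulus size."""
    return (key_bits + 7) // 8          # rounded up to whole bytes


k = rsa_output_len(1024)
print(k)                                 # 128

# A numerically small result is still written out as k bytes, so the
# buffer starts with zero bytes.
small_result = 12345
buf = small_result.to_bytes(k, "big")
print(len(buf), buf[:4].hex())           # 128 00000000
```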