Encryption of a string

There's a program that encrypts any value. The problem is, I can't work out what the algorithm is.
So, e.g.
input is 1, output is cwjtCNNxuYsB+fns/5h66g==
input is 2, output is UR/EJ8GNC/eG5zFXBwbXDw== and so on.
When the input becomes bigger, the output becomes bigger as well:
input London is the capital of Great Britain, output mnmxU29GVF+e+zn6Y8k246TdbF3wafzl7/ohdgA9KEvZNoLG02JW5HdcwZJNiZmA.
The strange things here are these "+", "/" and "=".
I can't work out how to classify such a cipher.

That output (with the "+", "/" and "=") is Base64 encoding.
Base64 is an encoding standard that represents binary data using 64 printable ASCII characters, i.e. a radix-64 representation.
It's widely used to make binary data, such as encrypted messages, easy to transport over text-based channels like email.
Your examples also appear to be encrypted underneath the Base64 layer. Note that every cipher produces longer ciphertext as the plaintext grows, so the growth alone tells you little.
I expect this is some type of symmetric block cipher.
AES, the Advanced Encryption Standard, has a block size of 128 bits. So no matter how small the input (even 1 bit), it will be padded to at least 128 bits. Given that your inputs of '1' and '2' are in fact encrypted to a 128-bit output, I expect this is likely AES.
Indeed, the output for "London is the capital of Great Britain" is also a multiple of 128 bits, at 384 bits of ciphertext.
mnmxU29GVF+e+zn6Y8k246TdbF3wafzl7/ohdgA9KEvZNoLG02JW5HdcwZJNiZmA
->
100110100111100110110001010100110110111101000110010101000101111110011110111110110011100111111010011000111100100100110110111000111010010011011101011011000101110111110000011010011111110011100101111011111111101000100001011101100000000000111101001010000100101111011001001101101000001011000110110100110110001001010110111001000111011101011100110000011001001001001101100010011001100110000000 == 384-bits/128-bit block size = 3 Blocks of data.
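The block-size observation can be checked mechanically. Here is a minimal Python sketch that Base64-decodes the three outputs from the question and reports their lengths; the 16-byte block size is an assumption (it matches AES, but other ciphers with 128-bit blocks would look the same), and the helper name is made up for illustration.

```python
import base64

# Outputs copied from the question, keyed by the input that produced them.
samples = {
    "1": "cwjtCNNxuYsB+fns/5h66g==",
    "2": "UR/EJ8GNC/eG5zFXBwbXDw==",
    "London is the capital of Great Britain":
        "mnmxU29GVF+e+zn6Y8k246TdbF3wafzl7/ohdgA9KEvZNoLG02JW5HdcwZJNiZmA",
}

BLOCK_BYTES = 16  # assumed block size: 128 bits, as in AES


def ciphertext_lengths(encoded):
    """Return (bytes, bits, whole blocks, leftover bytes) of the decoded data."""
    raw = base64.b64decode(encoded)
    blocks, leftover = divmod(len(raw), BLOCK_BYTES)
    return len(raw), len(raw) * 8, blocks, leftover


for plaintext, encoded in samples.items():
    n_bytes, n_bits, blocks, leftover = ciphertext_lengths(encoded)
    print(f"{plaintext!r}: {n_bytes} bytes, {n_bits} bits, "
          f"{blocks} block(s), {leftover} leftover byte(s)")
```

A leftover of zero for every sample is consistent with a 128-bit block cipher, but it does not prove which cipher was used.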

Related

How to create fixed length decryption

I would like to ask if there is a way to encrypt text (no matter how long it is) and ALWAYS get a fixed length decryption? I am not referring to hashing but to encryption/decryption.
Example:
Suppose that we want to encrypt (not hash) a text which is 60 characters long. The result will be a string which is 32 characters long. We can then decrypt the string to get the original text!
We now want to encrypt (not hash) a text which is 200 characters long. The result will be a string which is again 32 characters long. We can then decrypt the string to get the original text!
Is that somehow possible?
Thank you
As the comments indicate, this is impossible. For the underlying reason, see the pigeonhole principle. In your example, assuming 8-bit characters, there are 256^200 possible inputs and only 256^32 possible outputs. Therefore there must be at least one output that corresponds to more than one input, and that output cannot be reversed unambiguously. Since the number of inputs is massively larger than the number of outputs (and in the general case is unbounded), almost all ciphertexts are necessarily impossible to decrypt.
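To make the counting argument concrete, here is a small Python sketch. The first part just compares the two counts from the example. The second part is a scaled-down illustration with a made-up "compressing" function (`toy_compress` is hypothetical, not a real cipher): any function from 2-byte inputs to 1-byte outputs must map two different inputs to the same output.

```python
from itertools import product

# Counts from the example above, assuming 8-bit characters.
inputs = 256 ** 200
outputs = 256 ** 32
inputs_per_output = inputs // outputs  # average number of inputs per output


def toy_compress(msg):
    """Hypothetical fixed-length 'encryption': 2 bytes in, 1 byte out."""
    return bytes([msg[0] ^ msg[1]])


def find_collision(f, input_len=2):
    """Return two different inputs that f maps to the same output, if any."""
    seen = {}
    for combo in product(range(256), repeat=input_len):
        msg = bytes(combo)
        out = f(msg)
        if out in seen:
            return seen[out], msg
        seen[out] = msg
    return None


first, second = find_collision(toy_compress)
print(first.hex(), second.hex(), toy_compress(first).hex())
```

Whatever function replaces `toy_compress`, the search always finds a collision, because there are 65536 inputs and only 256 outputs.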

Is it possible to write an Enigma encryption algorithm that can use all alphanumeric as input but does not output ambiguous characters?

This is about Enigma encryption, I'm guessing the number of rotors doesn't matter but I'm using 3.
I am working with what's basically a coded version of the old mechanical enigma style encryption machines. The concept is rather old but before I get too far into learning it, I was wondering if it would be possible to be able to encrypt using all characters 0-9 a-z and A-Z but the encrypted text itself will only be a subset of these characters? I'm trying to replace a subset of characters (around 10 total) from the encrypted output, while still being able to get back to those characters if they were part of the input?
You can disambiguate by adding a 1-to-2-character mapping for the ambiguous symbols: O -> A1; 0 -> A2; and so on for the other ambiguous symbols; A -> AA. This is basically just like escaping in strings: we usually can’t put a newline inside a string, so we represent it as \n, and \ itself is represented as \\.
If you’re working with encrypted data (so the probabilities of all characters are uniformly distributed and characters cannot be predicted) then you can’t compress the ciphertext. If you can compress it, then you’ve noticed some kind of pattern in the text and partially broken the encryption.
If you want to reduce the ciphertext’s alphabet, then you must increase the length of the ciphertext, otherwise you’ve successfully compressed it.
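The escaping idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the ambiguous set is reduced to just O and 0, the escape character is A, and the function names are made up. It would be applied to the ciphertext after encryption and undone before decryption.

```python
ESCAPE = "A"

# Mapping from the answer above: O -> A1, 0 -> A2, and A itself -> AA.
ESCAPES = {"O": "A1", "0": "A2", "A": "AA"}
# Reverse lookup keyed by the character that follows the escape character.
UNESCAPES = {code[1]: symbol for symbol, code in ESCAPES.items()}


def escape(text):
    """Replace every ambiguous symbol with its two-character code."""
    return "".join(ESCAPES.get(ch, ch) for ch in text)


def unescape(text):
    """Invert escape(): an 'A' always introduces a two-character code."""
    out = []
    i = 0
    while i < len(text):
        if text[i] == ESCAPE:
            out.append(UNESCAPES[text[i + 1]])
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


original = "HELLO W0RLD A1"
encoded = escape(original)
print(encoded, len(original), len(encoded))
```

The encoded text is never shorter than the original and grows by one character for every escaped symbol, which is the length increase described above.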

RSA/ECB/PKCS1 Padding & AES/CBC/PKCS5Padding Encryption / Decryption

I have an API to call where I have to encrypt my data using RSA/ECB/PKCS1 Padding & AES/CBC/PKCS5PADDING.
Sample Data: {"KEY":"VALUE"}
Step.1:
I have to generate a random number of 16 digit. eg: '1234567890123456'
Step.2:
Do RSA/ECB/PKCS1Padding to random number and base64Encode the result. we get "encrypted_key"
Step.3:
Concatenate random number & data:
DATA = 1234567890123456{"KEY":"VALUE"}
Step.4:
Do AES/CBC/PKCS5Padding on DATA (from Step 3) using random number(1234567890123456) as KEY & Base64Encoded random number as IV. we get "ENCRYPTED_DATA"
So, for Step 2 I am using the JSEncrypt JavaScript library.
For Step 4 I am using the CryptoJS.AES.encrypt() function. I am pretty sure that my JSEncrypt call is working fine, as the client is able to decrypt the key, but the client is not able to decrypt my data. I feel that I am making a mistake while using CryptoJS.
Can someone guide me on how to use the library properly?
What I am doing is:
KEY = '1234567890123456'
IV = MTIzNDU2Nzg5MDEyMzQ1Ng== (result of btoa('1234567890123456') )
DATA = "1234567890123456{"KEY":"VAL"}"
cryptedData = Crypto.AES.encrypt(DATA, KEY, {iv: IV, mode: CryptoJS.mode.CBC,padding:CryptoJS.pad.Pkcs7})
I am told to use PKCS5Padding in AES/CBC Encryption ( Step 4 ) but it seems that AES does not support PKCS5Padding but PKCS7Padding.
I think I am making a mistake in the way I am passing KEY & IV to CryptoJS.
Any help will be greatly appreciated.
To start, let's see why you are doing this exercise. RSA is intended to encrypt only a limited amount of data. So we use "hybrid encryption", where the data is encrypted using a symmetric cipher with a random key, and the key itself is encrypted using RSA.
Encryption works on binary data. To transmit binary data safely, it is encoded to a printable form (hex or base64).
Step.1: I have to generate a random number of 16 digit
What we see is 16 digits 0-9. That's not really safe. Sixteen decimal digits give only 10^16 possible keys, which is approximately 2^53 (if I did the math wrong, please comment).
You need to generate 16 random bytes (each with a value 0-255, giving 2^128 possible keys). That is your DEK (data encryption key).
You may encode the DEK to a printable form; in hexadecimal encoding it will have 32 characters.
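The arithmetic, and the suggested alternative, can be sketched in Python as follows. The variable names are illustrative; `secrets.token_bytes` is the standard-library source of cryptographically strong random bytes.

```python
import math
import secrets

# Key space of 16 decimal digits, expressed in bits.
digit_key_bits = math.log2(10 ** 16)

# Key space of 16 random bytes: 8 bits per byte.
byte_key_bits = 16 * 8

# A random 16-byte DEK and its printable hexadecimal form.
dek = secrets.token_bytes(16)
dek_hex = dek.hex()

print(f"16 digits: about {digit_key_bits:.1f} bits")
print(f"16 bytes:  {byte_key_bits} bits")
print(f"DEK (hex): {dek_hex} ({len(dek_hex)} characters)")
```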
Step.2:
OK, you now have the encrypted key in encoded form: encoded_encryption_key.
Step 3, Step 4
And here you should understand what you are doing:
Encrypt DATA using the DEK (the random key in binary form, not its encoded form); you will get encrypted_data. You can encode the result to encoded_encrypted_data.
Concatenate the encrypted key and the encrypted data. It is up to you whether to concatenate before or after encoding. I suggest concatenating encoded_encryption_key and encoded_encrypted_data with some separator, because if the RSA key length changes, the length of encoded_encryption_key changes too.
Make sure to discuss with the client what format is expected exactly.
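As an illustration of the separator idea only, here is a Python sketch that joins two Base64-encoded parts with a colon and splits them again. The colon, the function names and the message layout are assumptions, not the format the client actually expects, and the two byte strings are random placeholders standing in for the real RSA and AES outputs.

```python
import base64
import os

SEPARATOR = ":"  # assumed separator; ':' never occurs in Base64 output


def pack(encrypted_key, encrypted_data):
    """Encode both parts separately and join them with the separator."""
    parts = (encrypted_key, encrypted_data)
    return SEPARATOR.join(base64.b64encode(p).decode("ascii") for p in parts)


def unpack(message):
    """Split on the separator and decode each part back to bytes."""
    encoded_key, encoded_data = message.split(SEPARATOR)
    return base64.b64decode(encoded_key), base64.b64decode(encoded_data)


# Placeholders only: 256 bytes is the output size of a 2048-bit RSA key,
# 32 bytes stands in for two AES blocks of ciphertext.
fake_encrypted_key = os.urandom(256)
fake_encrypted_data = os.urandom(32)

message = pack(fake_encrypted_key, fake_encrypted_data)
print(len(message), message[:40] + "...")
```

Because the parts are split on the separator rather than at a fixed offset, the receiver does not need to know the RSA key length in advance.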
Notes:
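The IV needs to be 16 bytes long for AES. CryptoJS expects the key and IV as WordArray objects (for example from CryptoJS.enc.Hex.parse or CryptoJS.enc.Utf8.parse) rather than plain strings; a plain string key is treated as a passphrase from which the actual key is derived. So passing the btoa output directly may not be the best idea, and it is also 24 characters long rather than 16 bytes. I believe CryptoJS just trims the value to 16 bytes, but formally that is not correct.
CBC mode needs some sort of integrity check. I suggest adding an HMAC or a signature to the result; otherwise someone could change the ciphertext without you being able to detect the tampering.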
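A minimal sketch of the HMAC suggestion, in Python with only the standard library. It assumes an encrypt-then-MAC layout of IV, then ciphertext, then a 32-byte HMAC-SHA256 tag, and a MAC key separate from the encryption key; the function names are made up and the ciphertext is a random placeholder rather than real AES output.

```python
import hashlib
import hmac
import secrets

TAG_LEN = 32  # HMAC-SHA256 output size in bytes
IV_LEN = 16   # AES block size in bytes


def add_tag(mac_key, iv, ciphertext):
    """Append an HMAC over IV + ciphertext (encrypt-then-MAC)."""
    tag = hmac.new(mac_key, iv + ciphertext, hashlib.sha256).digest()
    return iv + ciphertext + tag


def verify_and_strip(mac_key, message):
    """Check the tag in constant time; return (iv, ciphertext) if it is valid."""
    body, tag = message[:-TAG_LEN], message[-TAG_LEN:]
    expected = hmac.new(mac_key, body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed: message was modified")
    return body[:IV_LEN], body[IV_LEN:]


mac_key = secrets.token_bytes(32)     # separate from the encryption key
iv = secrets.token_bytes(IV_LEN)
ciphertext = secrets.token_bytes(48)  # placeholder for real AES-CBC output

message = add_tag(mac_key, iv, ciphertext)
print(verify_and_strip(mac_key, message) == (iv, ciphertext))
```

The receiver should verify the tag before attempting to decrypt, and reject the message if verification fails.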
"but it seems that AES does not support PKCS5Padding but PKCS7Padding."
Indeed, AES uses PKCS7 padding. PKCS5 padding is functionally the same, but it is defined only for 64-bit (8-byte) blocks. The PKCS5 name is still used in Java as a legacy of DES encryption.
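To show why the two names are interchangeable in practice, here is a short Python sketch of the padding rule itself: append N bytes, each with the value N, so that the length becomes a multiple of the block size. The function names are illustrative; with a block size of 8 the same code behaves as PKCS5 padding.

```python
def pkcs7_pad(data, block_size=16):
    """Append N bytes of value N so len(result) is a multiple of block_size."""
    n = block_size - len(data) % block_size  # always between 1 and block_size
    return data + bytes([n]) * n


def pkcs7_unpad(padded, block_size=16):
    """Validate and remove the padding added by pkcs7_pad."""
    if not padded or len(padded) % block_size != 0:
        raise ValueError("invalid padded length")
    n = padded[-1]
    if n < 1 or n > block_size or padded[-n:] != bytes([n]) * n:
        raise ValueError("invalid padding bytes")
    return padded[:-n]


short = pkcs7_pad(b"1")                  # AES-sized blocks (PKCS7)
exact = pkcs7_pad(b"1234567890123456")   # already a full block
des_style = pkcs7_pad(b"1", block_size=8)  # DES-sized blocks (PKCS5)
print(len(short), len(exact), len(des_style))
```

Note that input which already fills a whole block still gets a full extra block of padding, so the padding can always be removed unambiguously.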

RSA on ASCII message problems with '\0'

I want to encrypt and decrypt ASCII messages using an RSA algorithm written in assembly.
I read that for security and efficiency reasons the encryption is normally not called character-wise but a number of characters is grouped and encrypted together (e.g. wikipedia says that 3 chars are grouped).
Let us assume that we want to encrypt the message "aaa" grouping 2 characters.
"aaa" is stored as 61616100.
If we group two characters and encrypt the resulting halfwords the result for the 6161 block can in fact be something like 0053. This will result in an artificial second '\0' character which corrupts the resulting message.
Is there any way to work around this problem?
Using padding or anything similar is unfortunately not an option since I am required to use the same function for encrypting and decrypting.
The output of RSA is a number. Usually this number is encoded as an octet string (or byte array). You should not treat the result as a character string. You need to treat it as a byte array with the same length as the modulus (or at least the length of the modulus in bytes).
Besides zero (the null terminator), the bytes of the result may have any value, including non-printable ones such as control characters and 0x7F. If you want to treat the result as a printable string, convert it to hex or base64.
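The idea is shown below in Python rather than assembly, as a sketch only. It uses textbook RSA with tiny made-up primes (257 and 263) and no padding, which is insecure and purely illustrative. The point is that the ciphertext is handled as a fixed-length byte array the size of the modulus, so zero bytes inside it are harmless.

```python
# Toy parameters, far too small for real use; chosen so that n > 0xFFFF
# and a two-character block always fits below the modulus.
p, q = 257, 263
n = p * q
e = 17
d = pow(e, -1, (p - 1) * (q - 1))  # modular inverse, Python 3.8+

MODULUS_BYTES = (n.bit_length() + 7) // 8  # every block has this length


def encrypt_block(block):
    """Textbook RSA on one block, returned as a fixed-length byte array."""
    m = int.from_bytes(block, "big")
    c = pow(m, e, n)
    return c.to_bytes(MODULUS_BYTES, "big")  # leading zero bytes are kept


def decrypt_block(cipher_bytes, block_len=2):
    """Inverse of encrypt_block for blocks of block_len bytes."""
    c = int.from_bytes(cipher_bytes, "big")
    m = pow(c, d, n)
    return m.to_bytes(block_len, "big")


cipher = encrypt_block(b"aa")  # the 0x6161 block from the question
print(cipher.hex(), decrypt_block(cipher))
```

Because the length is known up front, nothing relies on a terminator, and the hex form can be stored or printed safely as text.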

repetition in encrypted data -- red flag?

I have some base-64 encoded encrypted data and noticed a fair amount of repetition. In a (approx) 200-character-long string, a certain base-64 character is repeated up to 7 times in several separate repeated runs.
Is this a red flag that there is a problem in the encryption? According to my understanding, encrypted data should never show significant repetition, even if the plaintext is entirely uniform (i.e. even if I encrypt 2 GB of nothing but the letter A, there should be no significant repetition in the encrypted version).
According to the binomial distribution, there is about a 2.5% chance that you'd see one character from a set of 64 appear seven times in a series of 200 random characters. That's a small chance, but not negligible. With more information, you might raise your confidence from 97.5% to something very close to 100% … or find that the cipher text really is uniformly distributed.
You say that the "character is repeated up to 7 times" in several separate repeated runs. That's not enough information to say whether the cipher text has a bias. Instead, tell us the total number of times the character appeared, and the total number of cipher text characters. For example, "it appeared a total of 3125 times in 1000 runs of 200 characters each."
Also, you need to be sure that you are talking about the raw output of a cipher. Cipher text is often encapsulated in an "envelope" like that defined by the Cryptographic Message Syntax. Of course, this enclosing structure will have predictable patterns.
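The binomial figure quoted above can be reproduced with a few lines of Python. This sketch assumes the 200 characters are independent and uniformly distributed over the 64 Base64 symbols, and it looks at one particular character; the variable names are illustrative.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


n = 200      # ciphertext characters
p = 1 / 64   # chance that any position holds the chosen character

expected_count = n * p
p_exactly_7 = binom_pmf(7, n, p)
p_at_least_7 = 1 - sum(binom_pmf(k, n, p) for k in range(7))

print(f"expected occurrences:  {expected_count:.3f}")
print(f"P(exactly 7 times):    {p_exactly_7:.4f}")
print(f"P(at least 7 times):   {p_at_least_7:.4f}")
```

With 64 different characters to choose from, the chance that some character reaches that count is higher still, which is another reason a single 200-character sample is not enough to conclude anything.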
Well, I guess it depends. Repetition in general is a bad thing if it represents the same data.
Since the data is Base64-encoded, have you looked at the underlying data to see whether something repeats with those counts?
To understand this better, you need to know what kind of encryption is used.
It could be just a coincidence that the characters repeat.
But if the repetition comes from repeated plaintext, it can be a red flag, because frequency counts can then be used to break the encryption.
What kind of encryption are you using? Homemade, or some industry standard?
It depends on how you are encrypting your data.
Base64 encoding a string may count as light obfuscation, but it is NOT encryption. The purpose of Base64 encoding is to allow any sort of binary data to be encoded as a safe ASCII string.

Resources