RSA on ASCII message problems with '\0' - encryption

I want to encrypt and decrypt ASCII messages using an RSA algorithm written in assembly.
I read that for security and efficiency reasons the encryption is normally not called character-wise but a number of characters is grouped and encrypted together (e.g. wikipedia says that 3 chars are grouped).
Let us assume that we want to encrypt the message "aaa" grouping 2 characters.
"aaa" is stored as 61616100.
If we group two characters and encrypt the resulting halfwords the result for the 6161 block can in fact be something like 0053. This will result in an artificial second '\0' character which corrupts the resulting message.
Is there any way to work around this problem?
Using padding or anything similar is unfortunately not an option since I am required to use the same function for encrypting and decrypting.

The output of RSA is a number. Usually this number is encoded as an octet string (or byte array). You should not treat the result as a character string. You need to treat it as a byte array with the same length as the modulus (or at least the length of the modulus in bytes).
Besides the result containing a zero (null-terminator) the characters may have any value, including non-printable characters such as control characters and 7F. If you want to treat the result as a printable string, convert to hex or base64.

Related

How to create fixed length decryption

I would like to ask if there is a way to encrypt text (no matter how long it is) and ALWAYS get a fixed length decryption? I am not referring to hashing but to encryption/decryption.
Example:
Suppose that we want to encrypt (not hash) a text which is 60 characters long. The result will be a string which is 32 characters long. We can then decrypt the string to get the original text!
We now want to encrypt (not hash) a text which is 200 characters long. The result will be a string which is again 32 characters long. We can then decrypt the string to get the original text!
Is that somehow possible?
Thank you
As the comments indicate, this is impossible. For the underlying reason that this is impossible, see the Pigeonhole Principle. In your example, there are 256^200 inputs and 256^32 outputs. Therefore there must be at least 1 output that has more than 1 input, and therefore is impossible to reverse. Since the number of inputs is massively larger than the number of outputs (and in the general case, is unbounded), almost all cipher texts are necessarily impossible to decrypt.

Is it possible to write an Enigma encryption algorithm that can use all alphanumeric as input but does not output ambiguous characters?

This is about Enigma encryption, I'm guessing the number of rotors doesn't matter but I'm using 3.
I am working with what's basically a coded version of the old mechanical enigma style encryption machines. The concept is rather old but before I get too far into learning it, I was wondering if it would be possible to be able to encrypt using all characters 0-9 a-z and A-Z but the encrypted text itself will only be a subset of these characters? I'm trying to replace a subset of characters (around 10 total) from the encrypted output, while still being able to get back to those characters if they were part of the input?
You can disambiguate by adding 1 to 2-character mapping for ambiguous symbols: O -> A1; 0 -> A2; other ambiguous symbols; A->AA. This is basically just like escaping in strings: we usually can’t put new line inside the string, so we represent it as \n. \ is represented as \\
If you’re working with encrypted data (so the probabilities of all characters are uniformly distributed and characters cannot be predicted) then you can’t compress the ciphertext. If you can compress it, then you’ve noticed some kind of pattern in the text and partially broken the encryption.
If you want to reduce the ciphertext’s alphabet, then you must increase the length of the ciphertext, otherwise you’ve successfully compressed it.

XOR encryption/decryption when the key is more than one byte long?

Suppose that the character 'b' is used as a key for XOR encryption. In that case, encrypting a plain text is done by XOR-ing each byte (character) of the text by the ascii code of 'b'. Conversely, the plain text can be obtained from the ciphered text by XOR-ing by 'b's ascii code again. This is understood.
However, how exactly does one encrypt when the key (password) is a string of characters? Suppose that the encrypting password is 'adg'. In that case, is the plain text ciphered via XOR-ing each of its bytes with the value of a XOR d XOR g? If not, then how?
A way is to repeat the key to cover plain text.
e.g. key = RTTI, plaintext = "how exactly does one"
Text: how exactly does one
Key: RTTIRTTIRTTIRTTIRTTI
Each character in the plain text will be XOR'd with the corresponding key character below it.
There are many ways to implement "XOR encryption", so if you're trying to decode some existing data, you'll first need to figure out which kind it's encrypted with.
The most common scheme I've seen works basically like the classic Vigenère cipher; e.g. for the three-byte key abc, the first byte of plaintext is XORed with a, the second with b, the third with c; the fourth byte is then again XORed with a, the fifth with b, and so on, like this:
Plaintext: THIS IS SOME SECRET TEXT...
Key: abcabcabcabcabcabcabcabcabc
--------------------------------------
XOR: 5**2B*2B0./&A1&"0&5B7$:7OLM
One way to recognize this kind of repeating-key cipher (and also find out the key length) is to compute the index of coincidence between pairs of bytes N positions apart in the ciphertext. If the key length is L, then plotting the index of coincidence as a function of N should reveal a regular array of peaks at the values of N that are divisible by L. (Of course, this only works if the plaintext is something like normal text or code that has a biased byte frequency distribution; if it's completely random data, then this won't help.)
Or you could just use hellman's xortool, which will automate all this for you. For example, running it on the ciphertext 5**2B*2B0./&A1&"0&5B7$:7OLM above, it says:
The most probable key lengths:
1: 17.3%
3: 40.7%
6: 21.5%
8: 6.5%
12: 5.4%
15: 4.6%
18: 4.0%
Key-length can be 3*n
If you have enough ciphertext, and can guess the most common byte in the plaintext, it will even spit out the key for you.

ASP.net query encryption method that doesn't produce slash character

I searched a lot to find an encryption algorithm which its encrypted results do not include slash character. Anything I've tested so far (like this, this and this) generate strings which include slash character and therefore they make asp.net (web forms) routing misunderstand the way it should interpret the route.
Can you please help by introducing a symmetric encryption algorithm which generate encrypted strings that can safely be used for encrypting query strings without misguiding asp.net routing?
Encryption algorithms generally produce random (looking) bytes. These bytes can have any value. You can encode this value, for instance using hexadecimals or base 64. With hexadecimals you have already code that only contains 0..9 and a..f (in upper or lower case). However, hexadecimal encoding is not very efficient, doubling the ciphertext.
Base 64 uses 64 characters: A..Z, a..z, 0..9, + and /, and sometimes a padding character =. It is however very easy to replace the URL unsafe + and / characters with other ones, e.g. - and _ according to RFC 4648. You can also remove any = characters at the end, although you may have to put them back (until you get a multiple of 4 base 64 characters) depending on the base 64 decoding routine. Base 64 uses 4 characters for 3 bytes, so it expands the ciphertext by 33%.

Special characters contained in AES encryption output

I tried to find the list of possible characters that are contained in the encrypted output after AES 256 bit encryption. But, it seems like they are not on the internet? Mind to help? thanks.
The output of an AES cipher is not character data, it is simply bytes. The output should be indistinguishable from random data.
You can represent the output as a string by encoding it as Base64 or Hex if you like.

Resources