AES Rijndael and little/big endian?

I am using the public domain reference implementation of AES Rijndael, commonly distributed under the name "rijndael-fst-3.0.zip". I plan to use this to encrypt network data, and I am wondering whether the results of encryption will differ on big/little endian architectures. In other words, can I encrypt a block of 16 bytes on a little endian machine and then decrypt that same block on big endian? And of course, the other way around as well.
If not, how should I go about swapping bytes?
Thanks in advance for your help.
Kind regards.

Byte order issues are relevant only in the context of mapping multi-byte constructs to a sequence of bytes, e.g. mapping a 4-byte sequence to a signed integer value is sensitive to byte order.
The AES algorithm is byte-centric and insensitive to endianness.

Rijndael is oblivious to byte order; it just sees the string of bytes you feed it. You should do the byte swapping outside of it as you always would (with ntohs or whatever interface your platform has for that purpose).
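To convince yourself, here is a minimal sketch using the Python "cryptography" package (the key and block values are made-up placeholders); because the cipher only ever sees raw bytes, the output is identical on little and big endian hosts:

    from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

    key = bytes(range(16))         # placeholder 128-bit key
    block = b"exactly 16 bytes"    # one AES block of raw bytes

    # ECB on a single block, used here only to show byte-for-byte determinism
    encryptor = Cipher(algorithms.AES(key), modes.ECB()).encryptor()
    ciphertext = encryptor.update(block) + encryptor.finalize()
    print(ciphertext.hex())        # same output regardless of host endianness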

How a stream cipher works

I am new here, and I am trying to understand encryption. I have done a lot of reading here and I cannot find an explanation that helps me understand.
When we are talking about stream ciphers, from what I understood, the encryption is done bit by bit.
Does that mean that the input text (let's say "Google") is encrypted character by character (because that would be byte by byte)? Or is it converted to binary first, and then the sequence of 0s and 1s is encrypted bit by bit?
Thank you.
When we are talking about stream ciphers, from what I understood, the encryption is done bit by bit.
I assume you are talking about the simple XOR-ing of plaintext with the keystream.
Stream ciphers are often defined (theoretically, as a formal definition) as a PRG (pseudorandom generator) producing output bit by bit, with each bit unpredictable to an attacker. I've seen such a definition in multiple courses. You could (in theory) apply the XOR operation bit by bit. As you've already found out, that would not be very practical on current computer architectures.
Or is it converted to binary first, and then the sequence of 0s and 1s is encrypted bit by bit?
In practice, stream ciphers maintain some internal state and produce their output as a stream of bytes or a byte array. As a result, the string is converted to a byte array and the XOR is applied to the whole array (byte by byte, or in whole chunks of bytes).
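As a sketch of that byte-wise view, using the Python "cryptography" package with ChaCha20 as the keystream generator (the all-zero key and nonce are placeholders; never use fixed values in practice):

    from cryptography.hazmat.primitives.ciphers import Cipher, algorithms

    key = bytes(32)    # placeholder 256-bit key
    nonce = bytes(16)  # placeholder nonce; must be unique per key in practice

    plaintext = "Google".encode("utf-8")  # the string becomes bytes first

    # The cipher keeps internal state and XORs its keystream into the input
    encryptor = Cipher(algorithms.ChaCha20(key, nonce), mode=None).encryptor()
    ciphertext = encryptor.update(plaintext)

    # Equivalent view: the keystream is what you get by encrypting zeros,
    # and the ciphertext is that keystream XORed with the plaintext bytes
    ks = Cipher(algorithms.ChaCha20(key, nonce), mode=None).encryptor()
    keystream = ks.update(bytes(len(plaintext)))
    assert ciphertext == bytes(p ^ k for p, k in zip(plaintext, keystream))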

How to convert a message to an integer to encrypt it with RSA?

In encryption methods like RSA, we operate on an integer which represents our message. I've toyed around with converting the string to an array of bytes and working one character at a time, but that seems overly slow and the RSA algorithm is designed to work with the entire message.
How do we convert a string to a representation (integer, big integer, etc.) to which we can apply our cryptographic algorithm?
In typical usage, you don't actually encrypt the entire message using RSA. Instead, you encrypt the encryption key for a symmetric block cipher (like AES) using RSA, then encrypt your stream of data using that block cipher.
Do not attempt to do this on your own! You have to be very careful with how you do the conversion, including setting up a secure padding scheme and using the block cipher correctly and in a secure mode. You might want to look at using language-provided crypto libraries or a standard library like OpenSSL.
Hope this helps!
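For illustration only, here is a sketch of that hybrid pattern with the Python "cryptography" package (the message and parameter choices are placeholders, not a vetted design):

    import os
    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.asymmetric import rsa, padding
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM

    # Recipient's RSA key pair (normally the sender only has the public key)
    private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
    public_key = private_key.public_key()

    # 1. Encrypt the bulk data with a fresh symmetric key
    session_key = AESGCM.generate_key(bit_length=256)
    nonce = os.urandom(12)
    ciphertext = AESGCM(session_key).encrypt(nonce, b"the actual message", None)

    # 2. Encrypt only the small session key with RSA-OAEP
    oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                        algorithm=hashes.SHA256(), label=None)
    wrapped_key = public_key.encrypt(session_key, oaep)

    # Receiver: unwrap the session key, then decrypt the data with it
    recovered_key = private_key.decrypt(wrapped_key, oaep)
    assert AESGCM(recovered_key).decrypt(nonce, ciphertext, None) == b"the actual message"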
Think about how integers and strings are represented in memory. A 32-bit integer takes up four 8-bit bytes, and a 64-bit integer takes up eight bytes. A string is stored as bytes too, and in the case of ASCII, each character is represented by one byte. (UTF-8 and UTF-16 are variable-length encodings, but it's still bytes.)
There is nothing to convert, because all datatypes are represented by bytes internally.
There's no reason this can't be extended to, say, 2048-bit integers for use with RSA.
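In Python, for instance, the conversion is built in ("Google" here is just an example string):

    message = "Google"

    # Interpret the UTF-8 bytes of the string as one big-endian integer
    m = int.from_bytes(message.encode("utf-8"), "big")
    print(m)  # one big integer encoding all six bytes at once

    # And back again: recover the bytes, then decode the string
    raw = m.to_bytes((m.bit_length() + 7) // 8, "big")
    assert raw.decode("utf-8") == message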

Best practices for passing encrypted data between different programming languages

I have read that if you want to encrypt a string using one programming language and decrypt that string using another programming language, then to ensure compatibility it is best to do some conversions prior to doing the encryption. I have read that it's a best practice to encrypt the byte array of a string rather than the string itself. Also, I have read that certain encryption algorithms expect each encrypted packet to be a fixed size. If the last packet to be encrypted isn't the required size, then encryption would fail. Therefore it seems like a good idea to encrypt data that has first been converted into a fixed length, such as hex.
I am trying to identify best practices that are generally useful regardless of the encryption algorithm being used. To maximize compatibility when encrypting and decrypting data across different languages and platforms, I would like a critique on the following steps as a process:
Encryption:
start with a plain text string
convert plain text string to byte array
convert byte array to hex
encrypt hex to encrypted string
end with an encrypted string
Decryption:
start with an encrypted string
decrypt encrypted string to hex
convert hex to byte array
convert byte array to plain text string
end with a plain text string
Really, the best practice for encryption is to use a high-level encryption framework; there are a lot of things you can do wrong working with the primitives. mfanto does a good job of mentioning important things you need to know if you don't use a high-level encryption framework. And I'm guessing that if you are trying to maximize compatibility across programming languages, it's because you need other developers to interoperate with the encryption, and then they need to learn the low-level details of working with encryption too.
So my suggestion for a high-level framework is the Google Keyczar framework, as it handles the details of algorithm choice, key management, padding, IVs, authentication tags, and wire format for you. And it exists for many different programming languages: Java, Python, C++, C#, and Go. Check it out.
I wrote the C# version, so I can tell you the primitives it uses behind the scenes are widely available in most other programming languages too, and it uses standards like JSON for key management and storage.
Your premise is correct, but in some ways it's a little easier than that. Modern crypto algorithms are meant to be language agnostic, and provided you have identical inputs with identical keys, you should get identical results.
It's true that for most ciphers and some modes, data needs to be a fixed length. Converting to hex won't do it, because the data needs to end on fixed boundaries. With AES for example, if you want to encrypt 4 bytes, you'll need to pad it out to 16 bytes, which a hex representation wouldn't do. Fortunately that'll most likely happen within the crypto API you end up using, with one of the standard padding schemes. Since you didn't tag a language, here's a list of padding modes that the AesManaged class in .NET supports.
On the flip side, encrypting data properly requires a lot more than just byte encoding. You need to choose the correct mode of operation (CBC or CTR is preferred), and then provide some type of message integrity. Encryption alone doesn't protect against tampering with data. If you want to simplify things a bit, then look at a mode like GCM, which handles both confidentiality, and integrity.
Your scheme should then look something like:
Convert the plain text string to a byte array. See rossum's comment for an important note about character encoding.
Generate a random symmetric key or use PBKDF2 to convert a passphrase to a key
Generate a random IV/nonce for use with GCM
Encrypt the byte array and store it, along with the Authentication Tag
You might optionally want to store the byte array as a Base64 string.
For decryption:
If you stored the byte array as a Base64 string, convert back to the byte array.
Decrypt encrypted byte array to plaintext
Verify the resulting Authentication Tag matches the stored Authentication Tag
Convert byte array to plain text string.
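Putting those steps together, a sketch in Python (AES-GCM from the "cryptography" package, PBKDF2 from the standard library's hashlib; the passphrase, iteration count, and the salt+nonce+ciphertext packaging are illustrative choices, not a standard):

    import base64, hashlib, os
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM

    password = b"correct horse battery staple"  # example passphrase

    # Encryption
    plaintext = "any text, in any language".encode("utf-8")  # string -> bytes
    salt = os.urandom(16)
    key = hashlib.pbkdf2_hmac("sha256", password, salt, 200_000)  # PBKDF2
    nonce = os.urandom(12)                                   # random IV/nonce
    ciphertext = AESGCM(key).encrypt(nonce, plaintext, None) # tag is appended

    # Optionally make the whole package text-safe for transport
    package = base64.b64encode(salt + nonce + ciphertext)

    # Decryption (possibly in another program or language)
    raw = base64.b64decode(package)
    salt, nonce, ct = raw[:16], raw[16:28], raw[28:]
    key = hashlib.pbkdf2_hmac("sha256", password, salt, 200_000)
    recovered = AESGCM(key).decrypt(nonce, ct, None)  # verifies the tag too
    assert recovered.decode("utf-8") == "any text, in any language"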
I use HashIds for this purpose. It's simple and supports a wide range of programming languages. We use it to pass encrypted data between our PHP, Node.js, and Golang microservices whenever we need to decrypt data at the destination.
I have read that it's a best practice to encrypt the byte array of a string rather than the string itself.
Cryptographic algorithms generally work on byte arrays or byte streams, so yes. You don't encrypt objects (strings) directly; you encrypt their byte representations.
Also, I have read that certain encryption algorithms expect each encrypted packet to be a fixed length in size. If the last packet to be encrypted isn't the required size, then encryption would fail.
This is an implementation detail of the particular encryption algorithm you choose. It really depends on what the API interface is to the algorithm.
Generally speaking, yes, cryptographic algorithms will break input into fixed-size blocks. If the last block isn't full, then they may pad the end with arbitrary bytes to get a full chunk. To distinguish between padded data and data which just happens to have what-looks-like-padding bytes at the end, they'll encode the padding length, or prepend or append the length of the plain text to the byte stream.
This is the kind of detail that should not be left up to the user, and a good encryption library will take care of these details for you. Ideally you just want to feed in your plain text bytes and get encrypted bytes out on the other side.
Therefore it seems like a good idea to encrypt data that has first been converted into a fixed length, such as hex.
Converting bytes to hex doesn't make it fixed length. It doubles the size, but that's not fixed. It makes the data ASCII-safe so it can be embedded into text files and e-mails easily, but that's not relevant here. (And Base64 is a better binary-to-ASCII encoding than hex anyway.)
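You can see both size effects with the Python standard library:

    import base64

    data = bytes(range(15))             # 15 raw bytes
    print(len(data.hex()))              # 30: hex doubles the size
    print(len(base64.b64encode(data)))  # 20: Base64 grows by only about 4/3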
In the interest of identifying best practices for ensuring compatibility with encrypting and decrypting data across different languages and platforms, I would like a critique on the following steps as a process:
Encryption:
start with a plain text string
convert plain text string to byte array
encrypt plain text byte array to encrypted byte array
end with an encrypted byte array
Decryption:
start with an encrypted byte array
decrypt encrypted byte array to plain text byte array
convert byte array to plain text string
end with a plain text string
(The hex steps from the question are dropped: there is no reason to convert to hex before encrypting, and the result of encryption is a byte array rather than a string.)
To encrypt, convert the plain text string into its byte representation and then encrypt these bytes. The result will be an encrypted byte array.
Transfer the byte array to the other program in the manner of your choosing.
To decrypt, decrypt the encrypted byte array into a plain text byte array. Construct your string from this byte array. Done.

What encryption algorithm is best for small strings?

I have a string of 10-15 characters and I want to encrypt that string. The problem is that I want the resulting encrypted string to be as short as possible. I will also want to decrypt that string back to its original form.
Which encryption algorithm fits best to this situation?
AES uses a 16-byte block size; it is admirably suited to your needs if your limit of 10-15 characters is firm. PKCS#7 padding would add 1 to 6 bytes to the data and generate an output of exactly 16 bytes. You don't really need to use an encryption mode (such as CBC) since you're only encrypting one block. There is an issue of how you'd be handling the keys - there is always an issue of how you handle encryption keys.
If you must go with shorter data lengths for shorter strings, then you probably need to consider AES in CTR mode. This uses the key and a counter to generate a byte stream which is XOR'd with the bytes of the string. It would leave your encrypted string at the same length as the input plaintext string.
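A minimal sketch of that length-preserving behaviour with the Python "cryptography" package (the key and counter block are generated on the spot; in real use the counter block must never repeat under one key):

    import os
    from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

    key = os.urandom(16)    # 128-bit AES key
    nonce = os.urandom(16)  # CTR counter block; must never repeat per key

    plaintext = b"short string"  # 12 bytes, less than one AES block
    encryptor = Cipher(algorithms.AES(key), modes.CTR(nonce)).encryptor()
    ciphertext = encryptor.update(plaintext) + encryptor.finalize()
    assert len(ciphertext) == len(plaintext)  # no padding, no expansion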
You'll be hard pressed to find a general purpose compression algorithm that reliably reduces the length of such short strings, so compressing before encrypting is barely an option.
If it's just one short string, you could use a one-time pad, which provides mathematically perfect secrecy.
http://en.wikipedia.org/wiki/One-time_pad
Just be sure you don't use the key more than one time.
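A sketch of the idea in Python (the key must be truly random, as long as the message, and used exactly once):

    import os

    message = b"secret msg"
    key = os.urandom(len(message))  # one-time key, same length as the message

    ciphertext = bytes(m ^ k for m, k in zip(message, key))    # encrypt: XOR
    recovered = bytes(c ^ k for c, k in zip(ciphertext, key))  # decrypt: XOR
    assert recovered == message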
If the main goal is shortening, I would look for a compression library that allows a fixed dictionary built on a corpus of common strings.
Personally I do not have experience with that, but I bet LZMA can do that.

Encryption of a single character

What is the minimum number of bits needed to represent a single character of encrypted text?
e.g., if I wanted to encrypt the letter 'a', how many bits would I require? (Assume there are many singly encrypted characters using the same key.)
Am I right in thinking that it would be the size of the key. eg 256 bits?
Though the question is somewhat fuzzy, first of all it would depend on whether you use a stream cipher or a block cipher.
For the stream cipher, you would get the same number of bits out that you put in - so the binary logarithm of your input alphabet size would make sense. The block cipher requires input blocks of a fixed size, so you might pad your 'a' with zeroes and encrypt that, effectively having the block size as a minimum, like you already proposed.
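The "pad with zeroes" above is a simplification; a standard scheme such as PKCS#7 records the padding length in the padding bytes themselves. A sketch with the Python "cryptography" package:

    from cryptography.hazmat.primitives import padding

    # Pad a single byte up to the 128-bit (16-byte) AES block size
    padder = padding.PKCS7(128).padder()
    padded = padder.update(b"a") + padder.finalize()
    assert len(padded) == 16  # 'a' followed by fifteen 0x0f padding bytes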
I'm afraid all the answers you've had so far are quite wrong! It seems I can't reply to them, but do ask if you need more information on why they are wrong. Here is the correct answer:
About 80 bits.
You need a few bits for the "nonce" (sometimes called the IV). When you encrypt, you combine key, plaintext and nonce to produce the ciphertext, and you must never use the same nonce twice. So how big the nonce needs to be depends on how often you plan on using the same key; if you won't be using the key more than 256 times, you can use an 8 bit nonce. Note that it's only the encrypting side that needs to ensure it doesn't use a nonce twice; the decrypting side only needs to care if it cares about preventing replay attacks.
You need 8 bits for the payload, since that's how many bits of plaintext you have.
Finally, you need about 64 bits for the authentication tag. At this length, an attacker has to try about 2^63 bogus messages on average before getting one accepted by the remote end. Do not think that you can do without the authentication tag; it is essential for the security of the whole mode.
Put these together using AES in a chaining mode such as EAX or GCM, and you get 80 bits of ciphertext.
The key size isn't a consideration.
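Spelling out the arithmetic behind "about 80 bits" (mainstream APIs enforce larger minimums, e.g. GCM implementations typically require at least a 64-bit nonce, so this is the theoretical floor rather than what a stock library will emit):

    nonce_bits = 8     # enough to send up to 256 messages under one key
    payload_bits = 8   # one character of plaintext
    tag_bits = 64      # forging then takes about 2^63 attempts on average

    assert nonce_bits + payload_bits + tag_bits == 80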
You can have the same number of bits as the plaintext if you use a one-time pad.
This is hard to answer. You should definitely first read up on some fundamentals. You can 'encrypt' an 'a' with a single bit (Huffman encoding-style), and of course you could use more bits too. A number like 256 bits without any context is meaningless.
Here's something to get you started:
Information Theory -- esp. check out Shannon's seminal paper
One Time Pad -- famously secure, but impractical, encryption scheme
Huffman encoding -- not encryption, but demonstrates the above point
