How stream cipher works - encryption

I am new here, and I am trying to understand encryption. I have done a lot of reading here and I can not find an explanation that could help me understand.
When we are talking about stream ciphers, from what I understood, the encryption is done bit by bit.
Does that mean that the input text (let's say "Google") is encrypted character by character(because that would be byte by byte) ? Or is it converted to binary first, then the sequence of 0 and 1 is encrypted bit bi bit?
Thank you.

When we are talking about stream ciphers, from what I understood, the encryption is done bit by bit.
I assume you are talking about the simple XOR-ing of plaintext with the cipherstream.
Stream ciphers are often defined (theoretically, as a formal definition) as PRG (pseudo random generator) producing bit by bit with non-guessable probability. I've seen such a definition in multiple courses. You could (in theory) apply the XOR operation bit byt bit. As you've already find out that would not be very practical in current computer architecture.
Or is it converted to binary first, then the sequence of 0 and 1 is encrypted bit bi bit?
Practically the cipher streams are having some internal state and produces the output as stream of bytes or a byte array. As a result the string is converted as a byte array and and XOR is applied to the whole array (byte by byte or whole chunks of bytes)

Related

What to expect and how to use OpenSSL 1-bit stream cipher modes

I am trying to use the 1-bit stream cipher mode in OpenSSL, specifically EVP_aes_128_cfb1.
I initialize the context with that parameter using, EVP_EncryptInit_ex().
The rest is unclear to me. I am expecting this to work as a bit-by-bit stream cipher, meaning that if I "feed" it with a stream of bits it will gradually "build" the ciphertext bit by bit, and that ciphertext should eventually be identical to a one-shot encryption of the entire thing when using a bigger bit width like 128 bits.
Whatever I try, this is not what I am getting, and to be honest I am not sure if the problem is in the expectation or in my usage of OpenSSL.
I am using EVP_EncryptUpdate(&ctx, p_target, &outlen, &current_byte, 1) but that means calling this 8 times with the same input byte, shifted properly. Then I expect the target byte, which also stays the same for 8 iteration at a time, to gradually be built one bit index at a time from MSB to LSB, with the context object orchestrating the state.
This does no yield the expected results, and I could not find working examples.
Help will be appreciated.
Thanks.

How to encrypt text using CRC32

As far as i know CRC32 is an error-detecting code.
I did not use it before but i recieve some task to use CRC32 to encrypt some text.
Is it possible to use CRC32 to encrypt ?
As noted, a CRC is linear and easily reversible, which is precisely what you don't want for secure encryption.
However it is possible to encode data with a CRC, if there were some reason to do so. You can, for example, take four bytes of data, compute the CRC-32 of that data, and transmit the CRC-32 instead of the data. Repeat for the next four bytes. That CRC-32 can easily be inverted to get the original four bytes.
I have no idea why that might be useful, but it can be done.
i recieve some task to use CRC32 to encrypt some text.
As already commented, crc32 is checksum, not even cryptographic hash. That means it lacks important features to be used in cryptography
Lets assume (wrongly) you could use crc32 as a hash function.
In theory (long shot) you could build a stream cipher e. g. chain-hashing a key. You could encrypt data with it. Just I am pretty sure the solution would be not secure enough. (even now I am an amateur and I am quite confident I see several attack vectors on such a cipher, crc32 is simply too linear)

Best practices for passing encrypted data between different programming languages

I have read that if you want to encrypt a string using one programming language and decrypt that string using another programming language, then to ensure compatibility it is best to do some conversions prior to doing the encryption. I have read that it's a best practice to encrypt the byte array of a string rather than the string itself. Also, I have read that certain encryption algorithms expect each encrypted packet to be a fixed length in size. If the last packet to be encrypted isn't the required size, then encryption would fail. Therefore it seems like a good idea to encrypt data that has first been converted into a fixed length, such as hex.
I am trying to identify best practices that are generally useful regardless of the encryption algorithm being used. To maximize compatibility when encrypting and decrypting data across different languages and platforms, I would like a critique on the following steps as a process:
Encryption:
start with a plain text string
convert plain text string to byte array
convert byte array to hex
encrypt hex to encrypted string
end with an encrypted string
Decryption:
start with an encrypted string
decrypt encrypted string to hex
convert hex to byte array
convert byte array to plain text string
end with a plain text string
Really the best practice for encryption is to use a high level encryption framework, there's a lot of things you can do wrong working with the primitives. And mfanto does a good a good job of mentioning important things you need to know if you don't use a high level encryption framework. And i'm guessing that if you are trying to maximize compatibility across programming languages, it's because you need other developers to inter-operate with the encryption, and then they need to learn the low level details of working with encryption too.
So my suggestion for high level framework is to use the Google Keyczar framework, as it handles the details of, algorithm, key management, padding, iv, authentication tag, wire format all for you. And it exists for many different programming Java, Python, C++, C# and Go. Check it out.
I wrote the C# version, so I can tell you the primitives it uses behind the scenes are widely available in most other programming languages too, and it uses standards like json for key management and storage.
Your premise is correct, but in some ways it's a little easier than that. Modern crypto algorithms are meant to be language agnostic, and provided you have identical inputs with identical keys, you should get identical results.
It's true that for most ciphers and some modes, data needs to be a fixed length. Converting to hex won't do it, because the data needs to end on fixed boundaries. With AES for example, if you want to encrypt 4 bytes, you'll need to pad it out to 16 bytes, which a hex representation wouldn't do. Fortunately that'll most likely happen within the crypto API you end up using, with one of the standard padding schemes. Since you didn't tag a language, here's a list of padding modes that the AesManaged class in .NET supports.
On the flip side, encrypting data properly requires a lot more than just byte encoding. You need to choose the correct mode of operation (CBC or CTR is preferred), and then provide some type of message integrity. Encryption alone doesn't protect against tampering with data. If you want to simplify things a bit, then look at a mode like GCM, which handles both confidentiality, and integrity.
Your scheme should then look something like:
Convert plain text to string to byte array. See #rossum's comment for an important note about character encoding.
Generate a random symmetric key or use PBKDF2 to convert a passphrase to a key
Generate a random IV/nonce for use with GCM
Encrypt the byte array and store it, along with the Authentication Tag
You might optionally want to store the byte array as a Base64 string.
For decryption:
If you stored the byte array as a Base64 string, convert back to the byte array.
Decrypt encrypted byte array to plaintext
Verify the resulting Authentication Tag matches the stored Authentication Tag
Convert byte array to plain text string.
I use HashIds for this purpose. It's simple and supports wide range of programming language. We use it to pass encrypted data between our PHP, Node.js, and Golang microservices whenever we need to decrypt data in the destination.
I have read that it's a best practice to encrypt the byte array of a string rather than the string itself.
Crytographic algorithms generally work on byte arrays or byte stream, so yes. You don't encrypt objects (strings) directly, you encrypt their byte representations.
Also, I have read that certain encryption algorithms expect each encrypted packet to be a fixed length in size. If the last packet to be encrypted isn't the required size, then encryption would fail.
This is an implementation detail of the particular encryption algorithm you choose. It really depends on what the API interface is to the algorithm.
Generally speaking, yes, crytographic algorithms will break input into fixed-size blocks. If the last block isn't full then they may pad the end with arbitrary bytes to get a full chunk. To distinguish between padded data and data which just happens to have what-look-like-padding bytes at the end, they'll prepend or append the length of the plain text to the byte stream.
This is the kind of detail that should not be left up to the user, and a good encryption library will take care of these details for you. Ideally you just want to feed in your plain text bytes and get encrypted bytes out on the other side.
Therefore it seems like a good idea to encrypt data that has first been converted into a fixed length, such as hex.
Converting bytes to hex doesn't make it fixed length. It doubles the size, but that's not fixed. It makes it ASCII-safe so it can be embedded into text files and e-mails easily, but that's not relevant here. (And Base64 is a better binary→ASCII encoding than hex anyways.)
In the interest of identifying best practices for ensuring compatibility with encrypting and decrypting data across different languages and platforms, I would like a critique on the following steps as a process:
Encryption:
plain text string
convert plain text string to byte array
convert byte array to hex
encrypt hex to encrypted string
encrypted string
plain text byte array to encrypted byte array
Decryption:
encrypted string
decrypt encrypted string to hex
convert hex to byte array
encrypted byte array
decrypt encrypted byte array to plain text byte array
convert byte array to plain text string
plain text string
To encrypt, convert the plain text string into its byte representation and then encrypt these bytes. The result will be an encrypted byte array.
Transfer the byte array to the other program in the manner of your choosing.
To decrypt, decrypt the encrypted byte array into a plain text byte array. Construct your string from this byte array. Done.

AES Rijndael and little/big endian?

I am using the public domain reference implementation of AES Rijndael, commonly distributed under the name "rijndael-fst-3.0.zip". I plan to use this to encrypt network data, and I am wondering whether the results of encryption will be different on big/little endian architectures? In other words, can I encrypt a block of 16 bytes on a little endian machine and then decrypt that same block on big endian? And of course, the other way around as well.
If not, how should I go about swapping bytes?
Thanks in advance for your help.
Kind regards.
Byte order issues are relevant only in context of mapping multi-byte constructs to a sequence of bytes e.g. mapping a 4-byte sequence to a signed integer value is sensitive to byte order.
The AES algorithm is byte centric and insensitive to endian issues.
Rijndael is oblivious to byte order; it just sees the string of bytes you feed it. You should do the byte swapping outside of it as you always would (with ntohs or whatever interface your platform has for that purpose).

encryption of a single character

What is the minimum number of bits needed to represent a single character of encrypted text.
eg, if I wanted to encrypt the letter 'a', how many bits would I require. (assume there are many singly encrypted characters using the same key.)
Am I right in thinking that it would be the size of the key. eg 256 bits?
Though the question is somewhat fuzzy, first of all it would depend on whether you use a stream cipher or a block cipher.
For the stream cipher, you would get the same number of bits out that you put in - so the binary logarithm of your input alphabet size would make sense. The block cipher requires input blocks of a fixed size, so you might pad your 'a' with zeroes and encrypt that, effectively having the block size as a minimum, like you already proposed.
I'm afraid all the answers you've had so far are quite wrong! It seems I can't reply to them, but do ask if you need more information on why they are wrong. Here is the correct answer:
About 80 bits.
You need a few bits for the "nonce" (sometimes called the IV). When you encrypt, you combine key, plaintext and nonce to produce the ciphertext, and you must never use the same nonce twice. So how big the nonce needs to be depends on how often you plan on using the same key; if you won't be using the key more than 256 times, you can use an 8 bit nonce. Note that it's only the encrypting side that needs to ensure it doesn't use a nonce twice; the decrypting side only needs to care if it cares about preventing replay attacks.
You need 8 bits for the payload, since that's how many bits of plaintext you have.
Finally, you need about 64 bits for the authentication tag. At this length, an attacker has to try on average 2^63 bogus messages minimum before they get one accepted by the remote end. Do not think that you can do without the authentication tag; this is essential for the security of the whole mode.
Put these together using AES in a chaining mode such as EAX or GCM, and you get 80 bits of ciphertext.
The key size isn't a consideration.
You can have the same number of bits as the plaintext if you use a one-time pad.
This is hard to answer. You should definitely first read up on some fundamentals. You can 'encrypt' an 'a' with a single bit (Huffman encoding-style), and of course you could use more bits too. A number like 256 bits without any context is meaningless.
Here's something to get you started:
Information Theory -- esp. check out Shannon's seminal paper
One Time Pad -- infamous secure, but impractical, encryption scheme
Huffman encoding -- not encryption, but demonstrates the above point

Resources