Making GCM/CBC ciphers streamable in golang - encryption

The GCM and CBC AES ciphers in Go can't be used along side with StreamWriter or StreamReader, which forces me to allocate the entire file into memory. Obviously, this is not ideal with large files.
I was thinking of making them streamable, by allocation some fixed size of blocks into memory, and feeding them to GCM or CBC, but I'm assuming that is probably a bad idea, since there must be a reason they've been designed this way.
Can someone explain why these operation modes can't be used without allocating the entire files into memory?

Simple answer - that's how they designed the API.
CBC and GCM are very different modes. GCM is AEAD mode (Authenticated Encryption with Associated Data). Do you actually need authentication? If not, for large files CBC is a good fit. You could also use CTR or OFB but they're streaming modes and are very strict about choosing an IV.
Before implementing anything I suggest you read about these modes. You have to at least understand, which parameters they need, for what purpose and how they should be generated.
CBC
BlockMode interface is a good fit for encrypting large files. You just need to encrypt block by block. CBC requires padding, but Go doesn't have the implementation. At least, I don't see one. You will have to deal with that somehow.
GCM
GCM uses AEAD interface, which only allows you to encrypt and decrypt a whole message. There is absolutely no reason why it should be implemented like that. GCM is a streaming mode, it's actually a good fit for streaming encryption. The only problem is authentication. GCM produces a tag at the end, which acts like a MAC. To make use of that tag you can't just encrypt an endless stream of data. You have to split it into chunks and authenticate them separately. Or do something else but at some point you have to read that tag and verify it, otherwise there's no point in using GCM.
What some libraries do, including Go, is they append that tag at the end implicitly on encryption and read and verify it on decryption. Personally, I think that's a very bad design. Tag should be available as a separate entity, you can't just assume that it will always be at the end. And that's not the only one of the problems in Go implementation.
Sorry for that rant. Recently I've dealt with a particulary bad implementation.
As a solution I suggest you split your stream into chunks and encrypt them separately with a unique nonce (that's very important). Each chunk will have a tag at the end, which you should verify. That way you can make use of GCM authentication. Yes, it's kind of ugly but Go doesn't give you access to inner methods, so that you could make your own encryption API.
As an alternative you could find a different implementation. Maybe even a C library. I can suggest mbedtls. For me, it's the best implementation I came across in terms of API.

Heres a stream implementation for reading from a BlockMode. Your mileage may vary.
type BlockReader struct {
buf []byte
block cipher.BlockMode
in io.Reader
}
func NewBlockReader(blockMode cipher.BlockMode,reader io.Reader) *BlockReader {
return &BlockReader{
block: blockMode,
in: reader,
}
}
func (b *BlockReader) Read(p []byte) (n int, err error) {
toRead := len(p)
mul := toRead/b.block.BlockSize()
size := mul*b.block.BlockSize()
if cap(b.buf) != size{
b.buf = make([]byte,toRead,toRead)
}
read, err := b.in.Read(b.buf)
if err != nil {
return 0,err
}
if read < b.block.BlockSize(){
return 0,io.ErrUnexpectedEOF
}
b.block.CryptBlocks(b.buf,b.buf)
return copy(p,b.buf),nil
}

Related

Golang generate same encryption across different machines

I have a go script that uses crypto/aes to encrypt and decrypt a plaintext.
https://play.golang.org/p/le_-uuzWN4
I want this script to be used across different machines and produce the same encrypted text. I thought that by having a custom IV it would result in a consistent encryption no matter where.
Right now it yields in different results on the go playground versus on https://repl.it/languages/go
Is it possible to produce consistent encryption or will it always be different due to internal implementation (like encryption salt, etc..)
Also, what exactly is IV, I'm still confused about that. The doc doesnt really explain what it is
I figured out why it was generating a different cipher text each time. The IV is randomized with this statement
if _, err := io.ReadFull(rand.Reader, iv); err != nil {
panic(err)
}
Removing it will keep the IV constant and it will generate the same encryption given the same Key an IV on any machine

Cryptography: Mixing CBC and CTR?

I have some offline files that have to be password-protected. My strategy is as follows:
Cipher Algorithm: AES, 128-bit block, 256-bit key (PBKDF2-SHA-256
10000 iterations with a random salt stored plainly elsewhere)
Whole file is divided into pages with page size 1024 bytes
For a complete page, CBC is used
For an incomplete page,
Use CBC with cipher text stealing if it has at least one block
Use CTR if it has less one block
With this setup, we can keep the same file size
IV or nonce will be based on the salt and deterministic. Since this is not for network communication, I reckon we don't need to concern about replay attacks?
Question: Will this kind of mixing lower the security? Would we better off just use CTR throughout the whole file?
You're better off just using CTR for the entire file. Otherwise, you're adding a lot of extra work, in supporting multiple modes (CBC, CTR, and CTS) and determining which mode to use. It's not clear there's any value in doing so, since CTR is perfectly fine for encrypting a large amount of data.
Are you planning on reusing the same IV for each page? You should expand a bit on what you mean by a page, but I'd recommend unique IV's for each page. Are these pages addressable somehow? You might want to look into some of the new disk encryption modes for an idea on generating unique IV's
You also really need to MAC your data. In CTR for example, if someone flips a bit of the ciphertext, it'll flip the bit when you decrypt, and you'll never know it was tampered with. You can use HMAC or if you want to simplify your entire scheme, use AES GCM mode, which combines CTR for encryption and GMAC for integrity
There are a few things you need to know about CTR mode. After you know them all you could happily apply a stream cipher in your situation:
never ever reuse a data key with the same nonce;
above, not even in time;
be aware that CTR mode really shows the size of the encrypted data; always encrypting full blocks can hide this somewhat (in general a 1024 byte block takes as much as a single bit block if the file system boundaries are honored);
CTR mode in itself does not provide authentication (for completion, as this was already discussed);
If you don't keep to the first two rules, an attacker will immediately see the place of the edit and the attacker will be able to retrieve data directly related to the plain text.
On a possitive node:
you can happily use the offset (in, e.g., blocks) in the file to be part of the nonce;
it is very easy to seek in files, buffer ciphertext and create multi-threaded code around CTR.
And in general:
it pays off to use a data specific key specific sets of files, in such a way that if a key is compromised or changed that you don't have to re-encrypt everything;
think very well about how your keys are used, stored, backed up etc. Key management is the hardest part;

Random access encrypted file

I'm implementing a web based file storage service (C#). The files will be encrypted when stored on the server, but the challenge is how to implement the decryption functionality.
The files can be of any size, from a few KB to several GB. The data transfer is done in chunks, so that the users download the data from say offset 50000, 75000 etc. This works fine for unencrypted files, but if encryption is used the entire file has to be decrypted before each chunk is read from the offset.
So, I am looking at how to solve this. My research so far shows that ECB and CBC can be used. ECB is the most basic (and most insecure), with each block encrypted separately. The way ECB works is pretty much what I am looking for, but security is a concern. CBC is similar, but you need the previous block decrypted before decrypting the current block. This is ok as long as the file is read from beginning to end, and you keep the data as you decrypt it on the server, but at the end of the day its not much better than decrypting the entire file server side before transferring.
Does anyone know of any other alternatives I should consider? I have no code at this point, because I am still only doing theoretical research.
Do not use ECB (Electronic Code Book). Any patterns in your plaintext will appear as patterns in the ciphertext. CBC (Cipher Block Chaining) has random read access (the calling code knows the key and the IV is the previous block's result) but writing a block requires rewriting all following blocks.
A better mode is Counter (CTR). In effect, each block uses the same key and the IV for each block is the sum of offset of that block from a defined start and an initial IV. For example, the IV for block n is IV + n. CTR mode is described in detail on page 15 in NIST SP800-38a. Guidance on key and IV derivation can be found in NIST SP800-108.
There are a few other similar modes such as (Galois Counter Mode) GCM.

Repeatedly encrypt(random padding + const int32) - compromise secret?

Will any encryption scheme safely allow me to encrypt the same integer repeatedly, with different random material prepended each time? It seems like the kind of operation that might get me in hot water.
I want to prevent spidering of items at my web application, but still have persistent item IDs/URLs so content links don't expire over time. My security requirements aren't really high for this, but I'd rather not do something totally moronic that obviously compromises the secret.
// performed on each ID before transmitting item search results to the client
public int64 encryptWithRandomPadding(int32 id) {
int32 randomPadding = getNextRandomInt32();
return encrypt(((int64)randomPadding << 32) + id), SECRET);
}
// performed on an encrypted/padded ID for which the client requests details
public int32 decryptAndRemoveRandomPadding(int64 idToDecrypt) {
int64 idWithPadding = decrypt(idToDecrypt, SECRET);
return (int32)idWithPadding;
}
static readonly string SECRET = "thesecret";
Generated IDs/URLs are permanent, the encrypted IDs are sparsely populated (less than 1 in uint32.Max are unique, and I could add another constant padding to reduce the likelyhood of a guess existing), and the client may run the same search and get the same results with different representative IDs each time. I think it meets my requirements, unless there's a blatant cryptographic issue.
Example:
encrypt(rndA + item1) -> tokenA
encrypt(rndB + item1) -> tokenB
encrypt(rndC + item2) -> tokenC
encrypt(rndD + item175) -> tokenD
Here, there is no way to identify that tokenA and tokenB both point to identical items; this prevents a spider from removing duplicate search results without retrieving them (while retrieving increments the usage meter). Additionally, item2 may not exist.
Knowing that re-running a search will return the same int32 padded multiple ways with the same secret, can I do this safely with any popular crypto algorithms? Thanks, crypto experts!
note: this is a follow-up to a question that didn't work out as I'd hoped: Encrypt integer with a secret and shared salt
If your encryption is secure, then random padding makes cracking neither easier nor harder. For a message this short, a single block long, either everything is compromised or nothing is. Even with a stream cipher, you'd still need the key to get any further; the point of good encryption is that you don't need extra randomness. Zero padding or other known messages at least a block long at the beginning are obviously to be avoided if possible, but that's not the issue here. It's pure noise, and once someone discovered that, they'd just skip ahead and start cracking from there.
Now, in a stream cipher, you can add all the randomness in the beginning and the later bytes will still be the same with the same key, don't forget that. This only actually does anything at all for a block cipher, otherwise you'd have to xor the random bits into the real value to get any use out of it.
However, you might be better off using a MAC as padding: with proper encryption, the encrypted mac won't give any information away, but it looks semi-randomish and you can use it to verify that there were no errors or malicious attacks during decryption. Any hash function you like can create the MAC, even a simple CRC-32, without giving anything away after encryption.
(A cryptographer might find a way to shave a bit or two off due to the relatedness, will tons of plaintexts if they knew beforehand how they were related, but that's still far beyond practicality.)
As you asked before, you can safely throw in an unecrypted salt in front of every message; a salt can only compromise an encrypted value if the implementation is broken or the key compromised, as long as the salt is properly mixed into the key, particularly if you can mix it into the expanded key schedule before decryption. Modern hash algorithms with lots of bits are really good at that, but even mixing into a regular input key will always have the same security as the key alone.

Been advised to use same IV in AES implementation

We've had to extend our website to communicate user credentials to a suppliers website (in the query string) using AES with a 256-bit key, however they are using a static IV when decrypting the information.
I've advised that the IV should not be static and that it is not in our standards to do that, but if they change it their end we would incur the [big] costs so we have agreed to accept this as a security risk and use the same IV (much to my extreme frustration).
What I wanted to know is, how much of a security threat is this? I need to be able to communicate this effectively to management so that they know exactly what they are agreeing to.
*UPDATE:*We are also using the same KEY throughout as well.
Thanks
Using a static IV is always a bad idea, but the exact consequences depend on the Mode of Operation in use. In all of them, the same plaintext will produce the same ciphertext, but there may be additional vulnerabilities: For example, in CFB mode, given a static key, the attacker can extract the cipherstream from a known plaintext, and use it to decrypt all subsequent strings!
Using a static IV is always a bad idea. Using a static key is always a bad idea. I bet that your supplier had compiled the static key into their binaries.
Sadly, I've seen this before. Your supplier has a requirement that they implement encryption and they are attempting to implement the encryption in a manner that's as transparent as possible---or as "checkbox" as possible. That is, they aren't really using encryption to provide security, they are using it to satisfy a checkbox requirement.
My suggestion is that you see if the supplier would be willing to forsake this home-brewed encryption approach and instead run their system over SSL. Then you get the advantage of using a quality standard security protocol with known properties. It's clear from your question that neither your supplier nor you should be attempting to design a security protocol. You should, instead, use one that is free and available on every platform.
As far as I know (and I hope others will correct me if I'm wrong / the user will verify this), you lose a significant amount of security by keeping a static key and IV. The most significant effect you should notice is that when you encrypt a specific plaintext (say usernameA+passwordB), you get the same ciphertext every time.
This is great for pattern analysis by attackers, and seems like a password-equivalent that would give attackers the keys to the kingdom:
Pattern analysis: The attacker can see that the encrypted user+password combination "gobbbledygook" is used every night just before the CEO leaves work. The attacker can then leverage that information into the future to remotely detect when the CEO leaves.
Password equivalent: You are passing this username+password in the URL. Why can't someone else pass exactly the same value and get the same results you do? If they can, the encrypted data is a plaintext equivalent for the purposes of gaining access, defeating the purpose of encrypting the data.
What I wanted to know is, how much of a security threat is this? I need to be able to communicate this effectively to management so that they know exactly what they are agreeing to.
A good example of re-using the same nonce is Sony vs. Geohot (on a different algorithm though). You can see the results for sony :) To the point. Using the same IV might have mild or catastrophic issues depending on the encryption mode of AES you use. If you use CTR mode then everything you encrypted is as good as plaintext. In CBC mode your first block of plaintext will be the same for the same encrypted data.

Resources