Similarily to RC4 (RC4_PRNG+XOR ), would it be secure to use another CSPRNG(Cryptographically secure pseudorandom number generator)[Isaac, BlumBlumShub, etc) instead of RC4's and XOR the data with the resulting keystream?
Essentially this is just using Blum Blum Shub (or whatever PRNG) as a stream cipher. This isn't how they're designed to be used, and they might be weak to attacks that make sense in a stream cipher context but not in a CSPRNG context (eg. related-key attacks).
If this is what you want, you're better off just using a modern stream cipher. For example, DJB's Salsa20 is well-regarded.
Well, it depends.
Most encryption algorithms do significantly more than XOR. But that's because the key is shorter than the plaintext. If the key is as large as the plaintext, and truly random, then it is impossible to crack it (it's called a One Time Pad).
So, you need to explain more.
But I'm going to guess that you're key length is not the same as your input length, and that even if it was, almost certainly the random number service you are using is not truly secure, so I'd advise against your approach (furthermore, it goes without saying (maybe) that the problem with OTP is key-exchange).
Swapping out the CSPRNG in this scheme would probably be just as secure, and have the exact same set of assumptions, weaknesses and practical issues.
Related
My question is that, suppose you have some AES-ECB encrypted hash and you want to decode it. You are also given a bunch of example plaintexts and hashes. For example:
I want: unknown_plaintext for the hash given_hash
and i have a bunch of known_plaintexts and hashes that have been encrypted with the same secret key. None of them (obviously) are the exact same to the given hash.
Please let me know if you can help. This is not for malicious intents, just to learn how Cryptography and AES systems work.
This is not computationally feasible. I.e., you can't do this.
Modern encryption algorithms like AES are resistant to known-plaintext attacks, which is what you are describing.
There has been some past success in a category called adaptive chosen plaintext attacks. Often these exploit an "oracle." In this scenario, an attacker can decrypt a single message by repeatedly asking the victim whether it can successfully decrypt a guess generated by the attacker. By being smart about choosing successive guesses, the attacker could decrypt the message with a million tries or so, which is a relatively small number. But even in this scenario, the attacker can't recover the key.
As an aside, ciphers don't generate hashes. They output cipher text. Hash functions (aka message digests) generate hashes.
For any respectable block cipher (and AES is a respectable block cipher), the only way to decrypt a ciphertext block (not "hash") is to know the key, and the only way to find the key from a bunch of plaintext-ciphertext pairs is by guessing a key and seeing if it maps a known plaintext onto the corresponding ciphertext. If you have some knowledge of how the key was chosen (e.g., SHA-256 of a pet's name), this might work; but if the key was randomly selected from the set of all possible AES keys, the number of guesses required to produce a significant probability of success is such a large number that you wander off into age-of-the-universe handwaving.
If you know that all the encrypted hashes are encrypted with the same key you can first try to find that key using your pairs of plaintexts and encrypted hashes. The most obvious way to do that would be to just take one of your plaintexts, first hash it and then try out all the possible keys to encrypt it until it matches the encrypted hash that you know. If the key you're looking for is just one of the many many possible AES keys this is set to fail, because it would take way too long to try all the keys.
Assuming you were able to recover the AES key somehow, you can decrypt that one hash you don't have a plaintext for and start looking for the plaintext.
The more you know about the plaintext, the easier this guesswork would be. You could just throw the decrypted hash into google and see what it spits out, query databases of known hashes or make guesses in the most eduated way possible. This step will again fail, if the hash is strong enough and the plaintext is random enough.
As other people have indicated, modern encryption algorithms are specifically designed to resist this kind of attack. Even a rather weak encryption algorithm like the Tiny Encryption Algorithm would require well over 8 million chosen plaintexts to do anything like this. Better algorithms like AES, Blowfish, etc. require vastly more than that.
As of right now, there are no practical attacks on AES.
If you're interested in learning about cryptography, the older Data Encryption Standard (DES) may actually be a more interesting place to start than AES; there's a lot of literature available about it and it was already broken (the code to do so is still freely available online - studying it is actually really useful).
If I use different encryption methods but provide no indication in the ciphertext output of which method I use (for example, attaching an unencrypted header to the ciphertext) does that make the ciphertext harder to decrypt than just the difficulty implied by, for example, the keylength? The lack of information as to what encryption protocol and parameters to use should add difficulty by requiring a potential decrypter to try some or all the various encryption methods and parameters.
Well, in general you should not rely on information in the algorithm / protocol itself. Such information is generic for any key you use, so you should consider it public knowledge. OK, so that's that out of the way.
Now say you use 16 methods and you somehow have created a protocol that keeps the used encryption method confidential (let's say by encrypting a single block half filled with random and a magic, decrypting blocks at the receiver until you find the correct one). Now if you would want to brute force the key used you would need 16 more tries. In other words, you just have increased the key length with 4 bits, as 2 ^ 4 = 16. So say you would have AES-256 equivalent ciphers. You would now have equivalent encryption of 256 + 4 = 260 bits. That hardly registers, especially since AES-256 is already considered safe against attacks using a quantum computer.
Now those 4 bits comes at a very high price. A highly complex protocol using multiple ciphers. Each of these ciphers have their weaknesses. None of them will have received as much scrutiny as AES, and if one breaks you are in trouble (at least for 1 out of 16 encrypted messages). Speeds will differ, parameters and block sizes will differ, platforms may not support them all...
All in all, just use AES-256 if you are not willing to accept AES-128. If you must, encrypt things twice using AES and SERPENT. Adding an authentication tag over IV & ciphertext probably makes much more of a difference though. See this answer by Thomas over at the security site.
Try GCM or EAX mode of operation. Much more useful.
I'm trying to make a symmetric encryption algorithm. My key is 256 bits, and the block size and resulting ciphertext are also 256 bit. Is there a disadvantage because the key, plaintext and ciphertext have the same size?
Do not use home-brewed cryptographic algorithms for anything worth protecting. This is a complex area, where techniques that you'd never dream of are routine for crackers. There are plenty of time-tested and expert-vetted algorithms around, use one of those (after looking it up for kwnown weaknesses, and possible recommendations).
Most (if not all) block ciphers assume that the message size is a multiple of the block size, much as yours. There's no intrinsic disadvantage to that, AFAIK, and it makes it much easier to process the data. If you don't want to process the data in blocks, you need a stream cipher.
As #vonbrand has mentioned, you should never use such a custom cipher to encrypt any kind of sensitive data, as it will be trivially broken. If all you want is to have a working block cipher, you're looking forAES, which is uncrackable, as far the top universities know.
My php script and my c# application will pass a hash string to each other that is 32 chars long, what is the best mode for this? I thought ECB but am unsure as it says if using more then 1 block dont use. How do I know how big the block is?
They will occasionally pass a large text file, which would be the best mode for encrypting this...CBC?
Any good useful reads welcome...
Thanks
One problem of ECB (among many other problems) is that it encrypts deterministically. That is each time you encrypt the same ID, you get the same ciphertext. Thus this mode does not prevent traffic analysis. An attacker may not be able to learn the IDs that are encrypted. However, he can still determine when and how frequently the same ID is sent.
CBC and OFB when used properly use a new random IV for each encryption thus encrypting the same ID differently each time. Since you also make sure that all IDs have the same length the result should ciphertexts where the attacker cannot distinguish repeating IDs from non-repeating ones.
ECB is the simplest mode and really not recommended (it's not quite as secure as other modes).
Personally I'd use CBC.
In Addition to Accipitridae's Answer. You would need to supply the IV to the decryption procedure. That will also be a overhead in case of CBC or OFB.
What is the difference between Obfuscation, Hashing, and Encryption?
Here is my understanding:
Hashing is a one-way algorithm; cannot be reversed
Obfuscation is similar to encryption but doesn't require any "secret" to understand (ROT13 is one example)
Encryption is reversible but a "secret" is required to do so
Hashing is a technique of creating semi-unique keys based on larger pieces of data. In a given hash you will eventually have "collisions" (e.g. two different pieces of data calculating to the same hash value) and when you do, you typically create a larger hash key size.
obfuscation generally involves trying to remove helpful clues (i.e. meaningful variable/function names), removing whitespace to make things hard to read, and generally doing things in convoluted ways to make following what's going on difficult. It provides no serious level of security like "true" encryption would.
Encryption can follow several models, one of which is the "secret" method, called private key encryption where both parties have a secret key. Public key encryption uses a shared one-way key to encrypt and a private recipient key to decrypt. With public key, only the recipient needs to have the secret.
That's a high level explanation. I'll try to refine them:
Hashing - in a perfect world, it's a random oracle. For the same input X, you always recieve the same output Y, that is in NO WAY related to X. This is mathematically impossible (or at least unproven to be possible). The closest we get is trapdoor functions. H(X) = Y for with H-1(Y) = X is so difficult to do you're better off trying to brute force a Z such that H(Z) = Y
Obfuscation (my opinion) - Any function f, such that f(a) = b where you rely on f being secret. F may be a hash function, but the "obfuscation" part implies security through obscurity. If you never saw ROT13 before, it'd be obfuscation
Encryption - Ek(X) = Y, Dl(Y) = X where E is known to everyone. k and l are keys, they may be the same (in symmetric, they are the same). Y is the ciphertext, X is the plaintext.
A hash is a one way algorithm used to compare an input with a reference without compromising the reference.
It is commonly used in logins to compare passwords and you can also find it on your reciepe if you shop using credit-card. There you will find your credit-card-number with some numbers hidden, this way you can prove with high propability that your card was used to buy the stuff while someone searching through your garbage won't be able to find the number of your card.
A very naive and simple hash is "The first 3 letters of a string".
That means the hash of "abcdefg" will be "abc". This function can obviously not be reversed which is the entire purpose of a hash. However, note that "abcxyz" will have exactly the same hash, this is called a collision. So again: a hash only proves with a certain propability that the two compared values are the same.
Another very naive and simple hash is the 5-modulus of a number, here you will see that 6,11,16 etc.. will all have the same hash: 1.
Modern hash-algorithms are designed to keep the number of collisions as low as possible but they can never be completly avoided. A rule of thumb is: the longer your hash is, the less collisions it has.
Obfuscation in cryptography is encoding the input data before it is hashed or encrypted.
This makes brute force attacks less feasible, as it gets harder to determine the correct cleartext.
That's not a bad high-level description. Here are some additional considerations:
Hashing typically reduces a large amount of data to a much smaller size. This is useful for verifying the contents of a file without having to have two copies to compare, for example.
Encryption involves storing some secret data, and the security of the secret data depends on keeping a separate "key" safe from the bad guys.
Obfuscation is hiding some information without a separate key (or with a fixed key). In this case, keeping the method a secret is how you keep the data safe.
From this, you can see how a hash algorithm might be useful for digital signatures and content validation, how encryption is used to secure your files and network connections, and why obfuscation is used for Digital Rights Management.
This is how I've always looked at it.
Hashing is deriving a value from
another, using a set algorithm. Depending on the algo used, this may be one way, may not be.
Obfuscating is making something
harder to read by symbol
replacement.
Encryption is like hashing, except the value is dependent on another value you provide the algorithm.
A brief answer:
Hashing - creating a check field on some data (to detect when data is modified). This is a one way function and the original data cannot be derived from the hash. Typical standards for this are SHA-1, SHA256 etc.
Obfuscation - modify your data/code to confuse anyone else (no real protection). This may or may not loose some of the original data. There are no real standards for this.
Encryption - using a key to transform data so that only those with the correct key can understand it. The encrypted data can be decrypted to obtain the original data. Typical standards are DES, TDES, AES, RSA etc.
All fine, except obfuscation is not really similar to encryption - sometimes it doesn't even involve ciphers as simple as ROT13.
Hashing is one-way task of creating one value from another. The algorithm should try to create a value that is as short and as unique as possible.
obfuscation is making something unreadable without changing semantics. It involves value transformation, removing whitespace, etc. Some forms of obfuscation can also be one-way,so it's impossible to get the starting value
encryption is two-way, and there's always some decryption working the other way around.
So, yes, you are mostly correct.
Obfuscation is hiding or making something harder to understand.
Hashing takes an input, runs it through a function, and generates an output that can be a reference to the input. It is not necessarily unique, a function can generate the same output for different inputs.
Encryption transforms the input into an output in a unique manner. There is a one-to-one correlation so there is no potential loss of data or confusion - the output can always be transformed back to the input with no ambiguity.
Obfuscation is merely making something harder to understand by intruducing techniques to confuse someone. Code obfuscators usually do this by renaming things to remove anything meaningful from variable or method names. It's not similar to encryption in that nothing has to be decrypted to be used.
Typically, the difference between hashing and encryption is that hashing generally just employs a formula to translate the data into another form where encryption uses a formula requiring key(s) to encrypt/decrypt. Examples would be base 64 encoding being a hash algorithm where md5 being an encryption algorithm. Anyone can unhash base64 encoded data, but you can't unencrypt md5 encrypted data without a key.