Randomness of IV: Why is it needed yet secrecy of the IV is not needed? - cbc-mode

The IV used in schemes such as CBC has to be random and unpredictable. But at the same time it does not have to be kept secret.
If the IV does not have to be secret, why does it have to be random then? I fail to make sense out of these seemingly contradicting requirements.
I have seen descriptions of attacks which exploit non-randomness, so I understand why randomness is needed. However, things get confusing when the requirements specify that the IV does not have to be secret! This seems to defeat the whole purpose of randomness.
Somebody help clarify this please.

I think you are reversing the roles.
When a cryptographic protocol is designed, it is designed with certain assumptions in mind. The more assumptions you use, the less useful the protocol is, as you are less likely to find scenarios in which the assumptions hold.
In the case of CBC, the IV was designed to not need to be secret. You can keep it a secret, if you like. The algorithm is definitely not less secure this way. It is not, however, a requirement.
Having a non-random IV, on the other hand, causes the entire protocol to be unsuitable for certain applications. When choosing between adding a requirement to the protocol and adding a requirement to its data, the right choice is obvious.
In other words, the IV does not need to be secret simply because the scheme remains secure when it is not.
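To make that concrete, here is a minimal sketch of the usual CBC pattern (assuming the pycryptodome library for AES): the IV is freshly random for every message, yet it is transmitted in the clear, prepended to the ciphertext.

    import os
    from Crypto.Cipher import AES          # pycryptodome, assumed available
    from Crypto.Util.Padding import pad, unpad

    def encrypt(key: bytes, plaintext: bytes) -> bytes:
        iv = os.urandom(16)                             # random and unpredictable...
        cipher = AES.new(key, AES.MODE_CBC, iv)
        return iv + cipher.encrypt(pad(plaintext, 16))  # ...but sent in the clear

    def decrypt(key: bytes, blob: bytes) -> bytes:
        iv, ct = blob[:16], blob[16:]                   # IV read straight off the wire
        cipher = AES.new(key, AES.MODE_CBC, iv)
        return unpad(cipher.decrypt(ct), 16)

An attacker who sees the IV learns nothing useful; what breaks CBC is an IV the attacker can predict before choosing a plaintext.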


Cryptography - Good to swap some bits in byte array?

I've recently been reading up on encryption a lot.
I wonder whether it would be good to toggle some bits (XOR with a bitmask) at a known position in the encrypted byte array, and to toggle them back before decrypting.
Because even if you know the algorithm and the key, it probably wouldn't be possible to decrypt the data without knowing where to toggle the bits, would it?
The bit-toggling becomes part of the algorithm, so "they know the algorithm" comes to include knowing which bits were toggled.
It does become marginally harder to find out "the algorithm", but this gain is small. If they can get their hands on the key, I think your problem is somewhere else...
There are some disadvantages to this as well. First, you may introduce security flaws into the system. I don't think this will happen, but I can't know that it won't, and in security you should assume you might cause security flaws unless you know you won't.
The second problem is that if you make a mistake here somewhere it is possible to corrupt data. Of course, rigid testing will make sure that a mistake won't make it to production, but it just isn't as safe as using the functionality of a security library.
Lastly, there is the problem that your code and data will be harder to work with. If you need to work with it in the future, or someone has to work with it, it'll probably take more effort than it otherwise would have.
Those aren't big things, but I'd say more than the gain. At the end of the day, this is little more than "security through obscurity", so no, I wouldn't say it is a good idea.
You need to look at Kerckhoffs's Principle. Any cryptosystem needs to be secure even if the attacker knows the entire algorithm; only the key must be kept secret. Your bit-twiddling is part of the algorithm, and so will be known to any enemy.
Consider AES. There are public papers describing in great detail exactly how AES works: NIST AES Description. That description of the algorithm is public, yet AES is still secure because it follows Kerckhoffs's principle. You need to make sure that your algorithm is secure. Simple bit-twiddling as you describe will not make an insecure algorithm secure. Any attacker will know which bits to twiddle and can untwiddle them and break the underlying insecure cypher.
As an alternative, you could add the bitmask you use to the key. This increases the key size and the processing time for very little security gain. There are usually better ways to increase the security of a cypher, such as adding more rounds.
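To illustrate why the twiddling adds so little: once the mask is known (and under Kerckhoffs's principle we must assume it is), undoing it is a single XOR. A toy sketch with a made-up mask:

    # illustrative mask, not a secret once "the algorithm" is known
    MASK = bytes([0b10100000] * 16)

    def twiddle(data: bytes) -> bytes:
        return bytes(b ^ m for b, m in zip(data, MASK))

    ciphertext = b"0123456789abcdef"
    assert twiddle(twiddle(ciphertext)) == ciphertext   # XOR is self-inverse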

How do people go about attacking an encrypted file? [closed]

I've been reading up on file encryption lately, and in many places I've seen warnings that encrypted files are susceptible to decryption by people so inclined, regardless of encryption algorithm strength.
However, I can't get my head around how someone would go about attempting to decrypt an encrypted file.
For example, let's say you've got an encrypted file and you'd like to know its contents. You have no idea what key was used to encrypt the file, nor the encryption algorithm used. What do you do? (Assume for this example that the encryption algorithm is a symmetric-key algorithm such as AES-256, i.e. a file encrypted with a key which requires said key to decrypt it.)
Additionally, how would your approach change if you knew the encryption algorithm used? (Assume in this case that the encryption algorithm used is AES-256, with a random key + salt).
There are two ways to answer this question: in the literal sense of how a perfect cryptosystem is attacked, and how real-world systems are attacked. One of the biggest problems you'll find as you begin to learn more about cryptography is that selecting algorithms is the easy part. It's managing those keys that becomes impossibly difficult.
The way in which you attack the basic primitives depends on the type of algorithm. In the case of data encrypted by symmetric ciphers like AES, you use brute-force attacks. That is, you effectively try every key possible until you find the right one. Unfortunately, barring changes in the laws of physics, trying every possible 256-bit key can't be done. From Wikipedia: "A device that could check a billion billion (10^18) AES keys per second would in theory require about 3×10^51 years to exhaust the 256-bit key space"
The problem with your question about coming across a seemingly encrypted file, with no knowledge of the methods used, is that it's a bit of a hard problem known as a Distinguishing Attack. One of the requirements of all modern algorithms is that their output should be indistinguishable from random data. If I encrypt something under both AES and Twofish, and then give you some random data, absent any other information like headers, there's no way for you to tell them apart. That being said....
You asked how knowledge of the algorithm changes the approach. One assumption cryptographers usually make is that knowledge of the algorithm shouldn't affect security at all; it should all depend on the secret key. Usually whatever protocol you're working with will tell you the algorithm specifications. If this weren't public, interoperability would be a nightmare. Cipher Suites, for example, are sets of algorithms that protocols like SSL support. NIST FIPS and NSA Suite B are sets of algorithms that have been standardized by the Federal Government and that most everyone follows.
In practice though, most crypto-systems have much larger problems.
Bad random number generation: Cryptography requires very good, unpredictable random number generators. Bad random number generators can completely collapse security, as in the case of Netscape's SSL implementation. You also have examples like the Debian RNG bug, where a developer changed code to silence a warning about uninitialized memory, which ultimately left the PRNG with almost no entropy, so Debian systems generated keys from a tiny, predictable pool.
Timing Attacks: Certain operations take longer to execute on a computer than others. Sometimes, attackers can observe this latency and deduce secret values. This has been demonstrated by remotely recovering a server's private key over a local network.
Attacks against the host: One way to attack a cryptosystem is to attack the host. By cooling memory, its contents can be preserved and inspected in a machine you control.
Rubber hose cryptanalysis: Maybe one of the easiest attacks, you threaten the party with physical harm or incarceration unless they reveal the key. There has been a lot of interesting case law on whether or not courts can force you to reveal crypto keys.
AES256 is effectively unbreakable.
From http://www.wilderssecurity.com/showthread.php?t=212324:
I don't think there's any credible speculation that any agency can break a properly implemented AES. There are no known cryptanalytic attacks, and actually bruteforcing AES-256 is probably beyond human capabilities within any of our lifetimes. Let's assume that 56 bit DES can be bruteforced in 1 sec, which is a ridiculous assumption to begin with. Then AES-256 would take 2^200 seconds, which is 5 x 10^52 years. So, you can see that without any known weakness in AES, it would be a total impossibility within any of our lifetimes, even with quantum computing. Our sun will explode, approximately 5 billion years from now, before we obtain enough computing power to bruteforce AES-256 without a known weakness. IF a weakness in AES is never found, there is absolutely no reason to ever look for another cipher besides AES. It will suffice for as long as humans occupy the planet.
Take a basic brute-force attack, for example. You ask a piece of software to try every single combination from 1 character to 15 characters using a-z, A-Z, and 0-9, and wait.
The software will start with 0 to 9... then 0a, 0b, 0c, and so on until it finds the password. Wikipedia will give you more detail.
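A toy sketch of that enumeration (the check function here is a hypothetical stand-in for whatever test tells you a candidate is correct):

    import itertools
    import string

    ALPHABET = string.digits + string.ascii_letters   # 0-9, a-z, A-Z

    def brute_force(check, max_len=4):
        # enumerate candidates shortest-first: 0, 1, ... 9, a, ... then 00, 01, ...
        for length in range(1, max_len + 1):
            for combo in itertools.product(ALPHABET, repeat=length):
                candidate = "".join(combo)
                if check(candidate):
                    return candidate
        return None

    print(brute_force(lambda p: p == "0c"))   # finds "0c" almost instantly

At 15 characters over 62 symbols the space is 62^15, roughly 8×10^26 candidates, which is why this only works against short or weak passwords.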
I partially agree with Andrew and partially with Jeremy.
If the encryption key is generated correctly (randomly generated, or derived from a complex password using a good key derivation function and a random salt), then AES-256 is effectively unbreakable (as Andrew said).
On the other hand, if a key isn't generated correctly (for example, a straight hash of a 4-digit PIN), brute force can be very efficient.
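For instance, if the key is just an unsalted hash of a 4-digit PIN, all 10,000 candidates can be tried in well under a second. A hedged sketch (the "captured" key here is made up for illustration):

    import hashlib

    def key_from_pin(pin: str) -> bytes:
        return hashlib.sha256(pin.encode()).digest()   # straight hash: no salt, no KDF

    target_key = key_from_pin("4821")   # pretend this key was recovered somehow

    for n in range(10_000):
        pin = f"{n:04d}"
        if key_from_pin(pin) == target_key:
            print("recovered PIN:", pin)
            break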
Regarding "You have no idea what the key used to encrypt the file is, nor the encryption algorithm used. "
In most cases, encrypted files have a header or footer which specifies something (the application used to encrypt the file, the encryption algorithm, or something else).
You can also try to figure out the algorithm from the padding (for example, 3DES uses 8-byte blocks and AES uses 16-byte blocks, so the padded ciphertext lengths differ).
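A rough sketch of that heuristic (note that any multiple of 16 is also a multiple of 8, so the hint only runs one way):

    def guess_cipher_family(ct_len: int) -> str:
        # block ciphers pad the plaintext to a multiple of their block size
        if ct_len % 16 == 0:
            return "consistent with AES (16-byte blocks) -- and with 3DES too"
        if ct_len % 8 == 0:
            return "consistent with DES/3DES (8-byte blocks), rules out padded AES"
        return "probably a stream cipher, or no block padding at all"

    print(guess_cipher_family(48))   # 48 % 16 == 0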

Using RSA for hash

I am pondering creating a hash function (like md5 or sha1) using the RSA crypto algorithm. I am wondering if there are any obvious reasons that this algorithm wouldn't work:
Generate RSA public/private keys.
Discard private key, never store it at all.
Begin with a hash with a length of the block size for the RSA encryption.
Encrypt message using public key, one block at a time.
For each encrypted block of the message, accumulate it to the hash using a specified algorithm (probably a combination of +, xor, etc.)
To verify a message has the same hash as a stored hash, use the saved public key and repeat the process.
Is this possible, secure, and practical?
Thanks for any comments.
RSA encryption is not deterministic: if you follow the RSA standard, you will see that some random bytes are injected. Therefore, if you encrypt the same message twice with RSA, chances are you will not get the same output twice.
Also, your "unspecified step 5" is likely to be weak. For instance, if you define a way to hash a block and then just XOR the blocks together, then A||B and B||A (for block-sized values A and B) will hash to the same value; that's a collision bonanza.
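A toy illustration of that collision, with a per-block function standing in for the RSA encryption of step 4 (any per-block function has the same problem under an XOR-only combiner):

    import hashlib

    def h(block: bytes) -> int:
        # stand-in for "encrypt this block"; folded in below with XOR only
        return int.from_bytes(hashlib.sha256(block).digest(), "big")

    def bad_hash(msg: bytes, block_size: int = 16) -> int:
        acc = 0
        for i in range(0, len(msg), block_size):
            acc ^= h(msg[i:i + block_size])   # the order of the blocks is lost here
        return acc

    A, B = b"A" * 16, b"B" * 16
    assert bad_hash(A + B) == bad_hash(B + A)   # two messages, one hash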
Academically, building hash functions out of number-theoretic structures (i.e. not a raw RSA, but reusing the same kind of mathematical element) has been tried; see this presentation from Lars Knudsen for some details. Similarly, the ECOH hash function was submitted for the SHA-3 competition, using elliptic curves at its core (but it was "broken"). The underlying hope is that hash function security could somehow be linked to the underlying number-theoretic hard problem, thus providing provable security. However, in practice, such hash functions are either slow, weak, or both.
There are already hashes that do essentially this, except perhaps not with the RSA algorithm in particular. They're called cryptographic hashes, and their salient point is that they're cryptographically secure - meaning that the same strength and security-oriented thought that goes into public key cryptographic functions has gone into them as well.
The only difference is, they've been designed from the ground-up as hashes, so they also meet the individual requirements of hash functions, which can be considered as additional strong points that cryptographic functions need not have.
Moreover, there are factors which are completely at odds between the two, for instance, you want hash functions to be as fast as possible without compromising security whereas being slow is oftentimes seen as a feature of cryptographic functions as it limits brute force attacks considerably.
SHA-512 is a great cryptographic hash and probably worthy of your attention. Whirlpool, Tiger, and RIPEMD are also excellent choices. You can't go wrong with any of these.
One more thing: if you actually want it to be slow, then you definitely DON'T want a hash function and are going about this completely wrong. If, as I'm assuming, what you want is a very, very secure hash function, then like I said, there are numerous options out there better suited than your example, while being just as or even more cryptographically secure.
BTW, I'm not absolutely convinced that there is no weakness with your mixing algorithm. While the output of each RSA block is intended to already be uniform with high avalanching, etc, etc, etc, I remain concerned that this could pose a problem for chosen plaintext or comparative analysis of similar messages.
Typically, it is best to use an algorithm that is publicly available and has gone through a review process. Even though there might be known weaknesses with such algorithms, that is probably better than the unknown weaknesses in a home-grown algorithm. Note that I'm not saying the proposed algorithm has flaws; it's just that even if a large number of answers here say it seems good, that doesn't guarantee it has none. Of course, the same thing can be said about algorithms such as MD5, SHA, etc. But at least with those, a large number of people have put them through rigorous analysis.
Aside from the previous "boilerplate" warnings against designing one's own cryptographic functions, it seems the proposed solution might be somewhat expensive in terms of processing time. RSA encryption on a large document could be prohibitive.
Without thinking too much about it, it seems like that would be cryptographically secure.
However, you'd have to be careful of chosen plaintext attacks, and if your input is large you may run into speed issues (as asymmetric crypto is significantly slower than cryptographic hashes).
So, in short: yes, this seems like it could be possible and secure… But unless there is a really compelling reason, I would use a standard HMAC if you want a keyed hash.
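For reference, a keyed hash with the standard HMAC construction takes a couple of lines with Python's built-in hmac module:

    import hashlib
    import hmac

    key = b"shared-secret-key"
    tag = hmac.new(key, b"message to authenticate", hashlib.sha256).hexdigest()
    print(tag)   # the verifier recomputes the tag with the same key and compares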
As mentioned above, step 4 has to be done deterministically, i.e. with the modulus and public-key exponent only.
If the hash in step 3 is private, the concept appears secure to me.
Concerning step 5: in the well-known CBC mode of block ciphers, the mix with the previous result is done before encryption (step 4), which might be better for avoiding collisions, e.g. with a lazy hash; XOR is fine.
I will apply this, as available implementations of known hash functions might have backdoors :)
Deterministic Java RSA is here.
EDIT
One should also mention that RSA is scalable without any limit: such a hash function can immediately serve as a Mask Generation Function.

Improving cipher's properties sanity check

I am reading about cryptography, and I was thinking about these properties of AES (which I use):
same message = same output
no message-length secrecy
possible insecurity if you know the messages (does this actually apply to AES?)
I hear that AES is secure, but what if I want to theoretically improve these properties?
I was thinking I could do this:
apply encryption algorithm A
XOR with random data D (making sure the output looks random in case of any cipher)
generate random data that are longer than the original message
use hashing function F to allocate slots in random data (this scrambles the order bytes)
Inputs: encryption algorithm A, data D to XOR with, and hashing function F
My questions are
does the proposed solution theoretically help with my concerns?
is this approach used somewhere?
Possible enhancements to this approach
I could also say that the next position chosen by the hashing function will be altered using a checksum of the last decoded byte after the XOR step (that way the message has to be decoded from beginning to end).
If I were to use this to have a conversation with someone, the data to XOR with could be the last message from the other person, but that's probably a vulnerability.
I am looking forward to your thoughts!
(This is only theoretical, I am not in need of more secure encryption, just trying to learn from you guys.)
Yeah.
Look. If you want to learn about cryptography, I suggest you read Applied Cryptography. Really, just do it. You will get a solid, definitive grounding, and an understanding of what is appropriate and what is not. It specifically talks about implementation, which is what you are after.
Some rules of thumb:
Don't make up your own scheme. This is almost universally true. There may be exceptions, but it's fair to say that you should only invent your own scheme if you've thoroughly reviewed all existing schemes and have specific quantifiable reasons for them not being good enough.
Model your attacker. Find out what scenarios you are intending to protect against, and structure your system so that it works to mitigate the potential attacks.
Complexity is your enemy. Don't make your system more complex than it needs to be.
Stay up to date. You can find a few mailing lists related to cryptography (and hashing); join them. From there you will learn interesting implementation details and be aware of the latest attacks.
As for specifically addressing your question, well, it's confusing. I don't understand your goal, nor do I understand steps 3 and 4. You might like to take a quick look here to gain an understanding of the different ways you can use a given encryption algorithm.
Hope this helps.
Your assumptions are incorrect.
same message != same output
The output will not be the same if you encrypt the same message twice.
This is because you are supposed to use different IVs.
Message length can be hidden by adding random data to the plaintext.
Attacks have been demonstrated against AES with a reduced number of rounds.
Full-round AES has not been compromised in any way.
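A quick demonstration of "same message != same output" (assuming the pycryptodome library for AES): encrypting the same plaintext twice with fresh random IVs yields different ciphertexts.

    import os
    from Crypto.Cipher import AES              # pycryptodome, assumed available
    from Crypto.Util.Padding import pad

    key, msg = os.urandom(16), b"the same message"
    ct1 = AES.new(key, AES.MODE_CBC, os.urandom(16)).encrypt(pad(msg, 16))
    ct2 = AES.new(key, AES.MODE_CBC, os.urandom(16)).encrypt(pad(msg, 16))
    assert ct1 != ct2   # fresh IVs: identical plaintexts, different ciphertexts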
Other than that, I suggest you follow Noon Silk's recommendation and read Applied Cryptography.
What's the point of the random data XOR? If it's truly random, how will you ever decrypt it? If you're saying the random data is part of the key, you might as well drop AES and use only the truly random key, as long as it's the same length as (or longer than) the data and is never used more than once to encrypt. It's called a one-time pad, the only theoretically unbreakable encryption algorithm I know about.
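A minimal sketch of that one-time pad, in case it helps:

    import os

    msg = b"attack at dawn"
    pad = os.urandom(len(msg))                    # truly random, as long as the message
    ct = bytes(m ^ p for m, p in zip(msg, pad))   # encrypt: XOR with the pad
    pt = bytes(c ^ p for c, p in zip(ct, pad))    # decrypt: XOR again; pad used once
    assert pt == msg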
If the random bits are pseudo-randomly generated, it's highly unlikely that your efforts will yield added security. Consider how many talented mathematicians were involved in designing AES...
EDIT: And I too highly recommend Applied Cryptography, it's an actually very readable and interesting book, not as dry as it may sound.

Which is the best encryption mechanism, Triple DES or RC4?

Triple DES or RC4?
I have the choice to employ either one.
As a high-level view, the following comments on both should be useful.
It is extremely easy to create a protocol based on RC4 (such as WEP) that is of extremely low strength (breakable with commodity hardware in minutes counts as extremely weak).
Triple DES is not great in that its strength comes at the cost of excessive CPU effort, but it is of considerably greater strength (both theoretically and in real-world attacks) than RC4, so it should be the default choice.
Going somewhat deeper:
In the field of encryption, with no well-defined target application, the definition of 'best' is inherently hard, since the 'fitness' of an algorithm is multi-variate:
Ease of implementation
Can you run it on commodity hardware?
Are implementations subject to accidental flaws that significantly reduce security while still allowing 'correctness' of behaviour?
Cost of implementation
Power/silicon/time to encode/decode.
Effort to break
Brute-force resilience: pretty quantifiable.
Resistance to cryptanalysis: less quantifiable; you might think so, but perhaps the right person hasn't had a try yet :)
Flexibility
Can you trade off one of the above for another?
What's the maximum key size (and thus the upper limit on brute force)?
What sort of input size is required to get decent encryption? Does it require salting?
Actually working out the effort to break something itself requires a lot of time and effort, which is why you (as a non-cryptographer) go with something already done rather than roll your own. It is also subject to change over time, hopefully solely as a result of improvements in the hardware available rather than fundamental flaws in the algorithm being discovered.
The core overriding concern is of course just that: is it secure? It should be noted that many older algorithms previously considered secure are no longer in that category. Some are so effectively broken that their use is simply pointless; you have no security whatsoever, simply obscurity (useful, but in no way comparable to real security).
Currently neither of the core algorithms of RC4 and TDES is in that category, but the naive implementation of RC4 is considered extremely flawed in protocols where the message data can be forced to repeat. RC4 has several more significant theoretical flaws than TDES.
That said, TDES is NOT better than RC4 in all the areas listed above. It is significantly more expensive to compute (and that expense is not justified; other less costly cryptosystems exist with security comparable to TDES).
Once you have a real world application you normally get one or both of the following:
Constraints on your hardware to do the task
Constraints imposed by the data you are encrypting ("this is used to transmit data which needs to be kept secret only for X days", for example)
Then you can state, with tolerances and assumptions, what can achieve this (or if you simply can't) and go with that.
In the absence of any such constraints we can only give you the following:
Ease of implementation
Both have publicly available secure free implementations for almost any architecture and platform available.
RC4 implementations may not be as secure as you think if the message can be forced to repeat (see the WEP issues). Judicious use of salting may reduce this risk, but this will NOT have been subject to the rigorous analysis that the raw implementations have been, and as such should be viewed with suspicion.
Cost of implementation
I have no useful benchmarks for RC4 (it is OLD). http://www.cryptopp.com/benchmarks.html has some useful figures to put TDES in context with RC5, which is slower than RC4; TDES is at least an order of magnitude slower than RC4. For comparison, RC4 can encrypt a stream at approximately 7 cycles per byte in a fast implementation on modern x86 processors.
Effort to break
Brute Force resilience of TDES is currently believed to be high, even in the presence of many encryption outputs.
RC4 brute-force resilience is orders of magnitude lower than TDES's, and is further extremely low in certain modes of operation (failure to discard the initial bytes of the stream).
Resistance to cryptanalysis: there are publicly known flaws in Triple DES, but they do not reduce its effectiveness against realistic attack in the next decade or two. The same is not true for RC4, where several flaws are known, and combined they have produced reliable attacks on several protocols based on it.
Flexibility
TDES has very little flexibility (and your library may not expose what little there is anyway).
RC4 has a lot more flexibility (the key used to initialize it can be arbitrarily long in theory, though the library may limit this).
Based on this, and your statement that you must use one or the other, you should consider the RC4 implementation only if the CPU cost of Triple DES makes it unrealistic to implement in your environment, or if the low level of security provided by RC4 is still considerably higher than your requirements specify.
I should also point out that systems exist which are empirically better in all areas than RC4 and TDES.
The eSTREAM project is evaluating various stream cyphers on the order of 5 or fewer cycles per byte, though the cryptanalysis work on them is not really complete.
Many faster, stronger block cyphers exist to compete with TDES. AES is probably the best known, and would be a candidate since it is of comparable (if not better) security but is much faster.
Sorry - Triple DES is no longer considered best practice. AES is simply a better algorithm, so if you can use it then you should. For an easy implementation, go here.
I strongly suggest that you learn more by reading up on TDES on Wikipedia. The money quote is:
"TDES is slowly disappearing from use,
largely replaced by the Advanced
Encryption Standard (AES)."
RC4 is, honestly, just not an acceptable option for any application where security is important.
Agreed -- DES is largely outdated, so unless there is a good reason to use it, go with AES. If that's not an option, TDES would be the better choice, unless you're dealing with streaming data (i.e., data which cannot be broken into blocks), in which case RC4 is the way to go (out of the given options).
Of course, I feel like I should mention... Cryptography is really, really hard to get right, and even the strongest algorithm can be broken easily if you get something even a little wrong (see, eg, older Kerberos or WEP).
This might not be the most informative answer, but during my 4 year employment term with a very large telco, Triple DES was the encryption standard for all sensitive applications, others were simply not allowed. It was Triple DES or the application does not go live. Hope that helps.
Both are secure, well... enough. RC4 is faster so if that's important to you...
After reading other people's answers (which are all correct), it's clear that it really depends on your context. There are so many other questions that could influence your decision. If it just needs to be foolproof, if it's not really something sensitive, and you have a lot of data and speed is the factor, go for RC4.
Otherwise, if you need something a bit more secure and easier to implement, or as you say "tougher to screw up" :), then go for 3DES, which is, as far as I remember, considered secure enough (!) until 2020-2030 or so.
Are those your only two options? If you can use AES (also known as Rijndael) then use it instead. DES is slow, and now considered obsolete (AES is the replacement for it).
RC4 sucks, don't use it. It's a stream cipher, but you can use a block cipher instead; just pad the final block of data (Google the PKCS#5 padding scheme).
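For reference, PKCS#5/PKCS#7 padding is simple enough to sketch in a few lines: append N bytes, each of value N, where N is how many bytes are needed to reach a full block.

    def pkcs7_pad(data: bytes, block_size: int = 16) -> bytes:
        n = block_size - (len(data) % block_size)   # always 1..block_size
        return data + bytes([n] * n)

    def pkcs7_unpad(data: bytes) -> bytes:
        return data[:-data[-1]]                     # last byte says how much to strip

    assert pkcs7_unpad(pkcs7_pad(b"hello")) == b"hello"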
Lately I've only seen DES being used in embedded devices (firmware), because the implementation is simple and it uses very little memory. Even in JavaME you can use AES.
One factor in deciding between 3DES and RC4 is language support. Java doesn't natively support RC4 and you would need to grab an open source library such as BouncyCastle to implement. MS doesn't have this same challenge.
