Theoretically hashing something 2^128 times with the MD5 algorithm [closed] - math

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 8 years ago.
This is a purely hypothetical question, but if you were to start with 128 bits, and then hash them 2^128 times, say with the MD5 algorithm, would you eventually come back to your original bits? Would all possible combinations have been used? And if not, are there certain numbers that "hash back to themselves" faster than others?
I assume this is practically impossible to achieve (after looking at my calculator's answer to 2^128...), and I'm pretty sure the answer would be different for different algorithms, but that doesn't stop one from theorizing, does it?
So yeah, that's it, hope someone out there will have some more knowledge on this topic. Looking forward to seeing the answer(s), thanks in advance!
Edit:
To clarify: what interests me the most in this question is whether it will go through all possible bit combinations or whether there are instead several smaller cycles, though any additional, relevant and interesting information is appreciated.

A good cryptographic hash should have some, but not too many, cycles in it; that makes it much harder to create rainbow tables for it. This is the case for MD5. The actual problem with MD5 is that it's a bit too easy to find hash collisions, that is, two different inputs with the same hash. This weakness makes it computationally feasible to inject malicious data into a file that is verified by its MD5 hash.
I think you think there's some Fermat's little theorem property of the MD5, but this is not the case. The hash function will probably start to walk in circles quite soon, and it should.
There's also a very memory efficient way to find MD5 cycles. Also have a look at MD5CRK.
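The memory-efficient idea can be sketched with Floyd's tortoise-and-hare cycle detection. This is only an illustration under assumptions: the helper names are made up here, and MD5 is truncated to 2 bytes so the walk finishes instantly. On full 128-bit MD5 the same loop would, for a random-looking function, be expected to need on the order of 2^64 steps.

```python
import hashlib

def truncated_md5(data: bytes, nbytes: int = 2) -> bytes:
    """MD5 cut down to nbytes (an assumption, to keep the state space tiny)."""
    return hashlib.md5(data).digest()[:nbytes]

def find_cycle(f, x0):
    """Floyd's algorithm: returns (tail_length, cycle_length) in O(1) memory."""
    # Phase 1: the hare moves twice as fast until both meet inside the cycle.
    tortoise, hare = f(x0), f(f(x0))
    while tortoise != hare:
        tortoise = f(tortoise)
        hare = f(f(hare))
    # Phase 2: restart the tortoise; they now meet at the cycle's entry point.
    tail = 0
    tortoise = x0
    while tortoise != hare:
        tortoise = f(tortoise)
        hare = f(hare)
        tail += 1
    # Phase 3: walk once around the cycle to measure its length.
    length = 1
    hare = f(tortoise)
    while tortoise != hare:
        hare = f(hare)
        length += 1
    return tail, length

start = b"\x00\x00"
tail, length = find_cycle(truncated_md5, start)
print(f"steps before entering the cycle: {tail}, cycle length: {length}")
```

A non-zero tail means the starting value is never revisited: the walk falls into a cycle that does not contain it, which is the typical picture for a hash iterated on its own output.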
If you really want a unique "hashing" of a 128-bit id, you should use an ordinary encryption algorithm, for instance AES, applied to the number with a secret key. This gives you a "random", unique sequence of numbers from an increasing id, since you can always decrypt the information in a unique way, given the same key that was used to encrypt the data.


Decoding MD5 Hash into unicode [duplicate]

This question already has answers here:
Is it possible to decrypt MD5 hashes?
(24 answers)
Closed 5 years ago.
ร encodes into 0f93821e0fbc6d3736da7df2c73024aa
I was wondering if it's possible to decode the hash back into the unicode form. If so, how can I approach this or how can I perform this.
Any help is appreciated, thanks.
MD5 is a hashing algorithm, which is by nature one-directional.
You just can't "decode" it.
The only option you have is bruteforcing.
The whole point of a hash is to present a fixed-length output for arbitrary input with the property that the same input results in the same output. Cryptographic hash functions like MD5, or SHA-1 are even designed so they cannot be reversed easily. Thus, no, you cannot do that.
Also, just as a thought exercise that shows that in the general case it just cannot work: MD5 is 128 bits long, so how could you possibly recover input that's larger than that? There are an infinite number of strings turning into the same digest, so while you could find a string that has the same hash, you're not guaranteed to find the one you started with.
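The counting argument can be made concrete with a small sketch. Assuming a digest truncated to 2 bytes (so only 65,536 outputs exist; the function names are illustrative), two different inputs with the same digest must turn up after at most 65,537 tries, and the digest alone cannot say which of them was the original.

```python
import hashlib
from itertools import count

def short_digest(data: bytes, nbytes: int = 2) -> bytes:
    """First nbytes of the MD5 digest (truncation is an assumption for speed)."""
    return hashlib.md5(data).digest()[:nbytes]

def find_collision(nbytes: int = 2):
    """Hash "0", "1", "2", ... until two inputs share a truncated digest."""
    seen = {}
    for i in count():
        message = str(i).encode()
        digest = short_digest(message, nbytes)
        if digest in seen:
            return seen[digest], message, digest
        seen[digest] = message

first, second, shared = find_collision()
print(first, second, shared.hex())
```

The same reasoning applies to the full 128-bit digest; the search is just far too long to run this naively.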
Whites11 has mentioned brute-forcing; however, take into account that this is not "decoding" the hash. It is simply hashing common inputs and comparing the two hashes to see if they match. Unless you have a set of common inputs that may actually match the hash, it's very unlikely you will get anywhere with it.
Hashes are intentionally one-directional. I can't think of why you would need to reverse one either; you may need to rethink the logic of whatever project you're doing.
To summarize, you can't decode a hash; this is intentional, and it's why hashing algorithms exist. Brute-forcing means hashing common inputs to see if they match your hash. It's commonly used for password cracking, done with common password data sets, so it may not be useful in your case.
http://www.md5online.org is a good example of bruteforcing, it is a database of previously bruteforce/tested hashes and their unicode inputs. You can try hashing a basic word like "password" and throwing it in there, it should show the original unicode input if it's a known hash!
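A lookup service of that kind can be approximated in a few lines. This is a minimal sketch, not how any particular site works: the word list and function name are made up for illustration, and inputs are assumed to be UTF-8 encoded before hashing.

```python
import hashlib

def build_lookup(candidates):
    """Map MD5 hex digest -> original input for every known candidate."""
    return {hashlib.md5(word.encode("utf-8")).hexdigest(): word
            for word in candidates}

# Hypothetical word list; real tables hold many millions of entries.
table = build_lookup(["123456", "password", "letmein", "qwerty"])

known = hashlib.md5("password".encode("utf-8")).hexdigest()
unknown = hashlib.md5("not in the list".encode("utf-8")).hexdigest()

print(table.get(known))    # the input is recovered because it was precomputed
print(table.get(unknown))  # None: nothing is "decoded", only looked up
```

The second lookup failing is the whole point: the table only knows inputs somebody hashed beforehand.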
Here are 2 excellent informative videos that cover hashing algorithms and brute-forcing hashes:
https://www.youtube.com/watch?v=b4b8ktEV4Bg
https://www.youtube.com/watch?v=7U-RbOKanYs

serpent encryption - better than rijndael? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 8 years ago.
Is Serpent-256 better than Rijndael-256 in terms of security? (speed doesn't matter)
Would Serpent encryption combined with SHA-512 be enough to safeguard sensitive data?
And to what extent? (SECRET, TOP SECRET, CLASSIFIED etc.)
Moreover, Rijndael has a max of 16 rounds. Serpent has 32 rounds, so it must be more secure.
As I've read that the Rijndael cipher is cryptographically broken, why isn't Serpent adopted more widely? Would it be that slow if implemented in hardware?
I would be very grateful for links to any other technical specifications about Serpent.
Thank you.
The number of rounds, by itself, doesn't determine the security of a cipher. You need to take the round function into account before the number of rounds means anything.
Nonetheless, I'd agree that there's a pretty decent chance that Serpent is more secure than AES. There are attacks currently known against AES that reduce the complexity by a factor of approximately 4 compared to a pure brute-force attack.
Cryptographers count that as a successful attack--but from a practical viewpoint, it's of precisely zero consequence. Even if you restrict yourself to AES-128, it's basically reducing complexity from 16 times the estimated life of the universe to only 4 times the estimated life of the universe (I'm sort of making up numbers here, but you get the general idea). With AES-256, the number is so much larger the factor of four shrinks to a new level of utterly meaningless insignificance.
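To put the "factor of approximately 4" in bits, here is a quick check that takes that figure at face value. The function name is illustrative only.

```python
import math

def effective_bits(key_bits: int, speedup_factor: float) -> float:
    """Key strength left after an attack that is speedup_factor times
    faster than brute force."""
    return key_bits - math.log2(speedup_factor)

for bits in (128, 256):
    print(f"AES-{bits}: about {effective_bits(bits, 4)} bits of work remain")
```

A factor of 4 removes only 2 bits, so roughly 2^126 and 2^254 operations remain respectively.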
Until/unless a dramatically better attack is found, real security is completely unaffected. In essentially every case, the problems you need to deal with and worry about are in how the cipher is used, how keys are generated, stored, and exchanged, etc. Changing from AES to Serpent (or Mars, Twofish, etc.) is extremely unlikely to improve your security (or anybody else's).
I should probably add: I'm probably as strong an advocate as anybody of having more cipher algorithms available and standardized. If you do a little looking, you can find where I'm cited in the papers submitted to NIST during AES standardization on that subject, giving use cases where including more than one algorithm in the standard would have been useful. Nonetheless, I have to admit that no (publicly known) current attack even comes close to giving a real reason to choose a different cipher algorithm.

How do people go about attacking an encrypted file? [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 10 years ago.
I've been reading up on file encryption lately, and in many places I've seen warnings that encrypted files are susceptible to decryption by people so inclined, regardless of encryption algorithm strength.
However, I can't get my head around how someone would go about attempting to decrypt an encrypted file.
For example, let's say you've got an encrypted file and you'd like to know its contents. You have no idea what the key used to encrypt the file is, nor the encryption algorithm used. What do you do? (Assume for this example that the encryption algorithm is a symmetric-key algorithm such as AES-256, i.e. a file encrypted with a key which requires said key to decrypt it.)
Additionally, how would your approach change if you knew the encryption algorithm used? (Assume in this case that the encryption algorithm used is AES-256, with a random key + salt).
There are two ways to answer this question: in the literal sense of how a perfect crypto system is attacked, and in terms of how real-world systems are attacked. One of the biggest problems you'll find as you begin to learn more about cryptography is that selecting algorithms is the easy part. It's how you manage the keys that becomes impossibly difficult.
The way in which you attack the basic primitives depends on the type of algorithm. In the case of data encrypted by symmetric ciphers like AES, you use brute-force attacks. That is, you effectively try every possible key until you find the right one. Unfortunately, barring changes in the laws of physics, trying every possible 256-bit key can't be done. From Wikipedia: "A device that could check a billion billion (10^18) AES keys per second would in theory require about 3×10^51 years to exhaust the 256-bit key space"
The problem with your question about coming across a seemingly encrypted file, with no knowledge of the methods used, is that it's a bit of a hard problem known as a Distinguishing Attack. One of the requirements of all modern algorithms is that their output should be indistinguishable from random data. If I encrypt something under both AES and Twofish, and then give you some random data, absent any other information like headers, there's no way for you to tell them apart. That being said....
You asked how knowledge of the algorithm changes the approach. One assumption cryptographers usually make is that knowledge of the algorithm shouldn't affect security at all; it should all depend on the secret key. Usually whatever protocol you're working with will tell you the algorithm specifications. If this weren't public, interoperability would be a nightmare. Cipher Suites, for example, are sets of algorithms that protocols like SSL support. NIST FIPS and the NSA Suite B are algorithms that have been standardized by the Federal Government, that most everyone follows.
In practice though, most crypto-systems have much larger problems.
Bad random number generation: Cryptography requires very good, unpredictable random number generators. Bad random number generators can completely collapse security, as in the case of Netscape's SSL implementation. You also have examples like the Debian RNG bug, where a developer changed code to silence a memory-checker warning, which ultimately led to Debian systems generating keys from a tiny, predictable set.
Timing Attacks: Certain operations take longer to execute on a computer than others. Sometimes, attackers can observe this latency and deduce secret values. This has been demonstrated by remotely recovering a server's private key over a local network.
Attacks against the host: One way to attack a cryptosystem is to attack the host. By cooling memory, its contents can be preserved and inspected in a machine you control.
Rubber hose cryptanalysis: Maybe one of the easiest attacks, you threaten the party with physical harm or incarceration unless they reveal the key. There has been a lot of interesting case law on whether or not courts can force you to reveal crypto keys.
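The first item in the list above, bad random number generation, can be illustrated with a deliberately flawed key generator. This is a hypothetical sketch: the generator below is invented for the example and only assumes that the sole source of "randomness" is a 15-bit process id, so an attacker can simply regenerate every possible key.

```python
import random

def weak_keygen(pid: int) -> bytes:
    """Hypothetical flawed generator: seeded only by a small process id."""
    rng = random.Random(pid)  # random.Random is NOT a cryptographic RNG
    return rng.getrandbits(128).to_bytes(16, "big")

def recover_seed(key: bytes, max_pid: int = 2 ** 15):
    """Attacker's side: replay the generator for every possible seed."""
    for pid in range(max_pid):
        if weak_keygen(pid) == key:
            return pid
    return None

victim_key = weak_keygen(12345)
found = recover_seed(victim_key)
print(found, weak_keygen(found) == victim_key)
```

The key is 128 bits long, but only 32,768 different keys can ever be produced. In Python, `secrets.token_bytes(16)` draws from the operating system's cryptographic random source instead.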
AES256 is effectively unbreakable.
From http://www.wilderssecurity.com/showthread.php?t=212324:
I don't think there's any credible speculation that any agency can
break a properly implemented AES. There are no known cryptanalytic
attacks, and actually bruteforcing AES-256 is probably beyond human
capabilities within any of our lifetimes. Let's assume that 56 bit DES
can be bruteforced in 1 sec, which is a ridiculous assumption to begin
with. Then AES-256 would take 2^200 seconds, which is 5 x 10^52 years.
So, you can see that without any known weakness in AES, it would be a
total impossibility within any of our lifetimes, even with quantum
computing. Our sun will explode, approximately 5 billion years from
now, before we obtain enough computing power to bruteforce AES-256
without a known weakness. IF a weakness in AES is never found, there
is absolutely no reason to ever look for another cipher besides AES.
It will suffice for as long as humans occupy the planet.
Take a basic brute-force attack, for example. You ask a piece of software to try every single combination of 1 to 15 characters drawn from a-z, A-Z and 0-9, and wait.
The software will start with 0 through 9, then a, b, c, and eventually 0a, 0b, 0c, until it finds the password. Wikipedia will give you more detail.
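A minimal sketch of such an enumerator follows. The ordering (digits, then lowercase, then uppercase, shortest candidates first) and all names are assumptions made for the example; `check` stands in for whatever tells the attacker that a guess was right.

```python
import string
from itertools import product

ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase  # 62 symbols

def candidates(max_len: int, alphabet: str = ALPHABET):
    """Yield every string of length 1..max_len, shortest first."""
    for length in range(1, max_len + 1):
        for combo in product(alphabet, repeat=length):
            yield "".join(combo)

def keyspace_size(max_len: int, alphabet_size: int = len(ALPHABET)) -> int:
    """Number of candidates the enumerator would produce in total."""
    return sum(alphabet_size ** n for n in range(1, max_len + 1))

def crack(check, max_len: int):
    """Return the first candidate accepted by check, or None."""
    for guess in candidates(max_len):
        if check(guess):
            return guess
    return None

print(crack(lambda guess: guess == "0c", max_len=2))  # prints 0c
print(keyspace_size(2))                               # prints 3906
print(f"{keyspace_size(15):.1e}")                     # candidates up to 15 characters
```

Two characters are found immediately, but the count for 15 characters is more than 10^26, which is why the "and wait" part matters.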
I partially agree with Andrew and partially with Jeremy.
If the encryption key is generated correctly (randomly generated, or derived from a complex password with a good key derivation function and a random salt), then AES-256 is effectively unbreakable (as Andrew said).
On the other hand, if the key isn't generated correctly, for example as just a straight hash of a 4-digit PIN, brute force can be very efficient.
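The 4-digit PIN case can be sketched as follows. Since the standard library has no AES, this example assumes the attacker can test a key guess against an HMAC tag computed with the same key; with real encrypted files the test would be "does the decryption look valid". All names are illustrative.

```python
import hashlib
import hmac

def weak_key_from_pin(pin: str) -> bytes:
    """The flawed derivation: a single unsalted hash of a 4-digit PIN."""
    return hashlib.sha256(pin.encode()).digest()

def make_tag(key: bytes, message: bytes) -> bytes:
    """Stand-in for 'something the attacker can check a key guess against'."""
    return hmac.new(key, message, hashlib.sha256).digest()

def recover_pin(message: bytes, observed_tag: bytes):
    """Try all 10,000 PINs; the 256-bit key length does not help."""
    for n in range(10_000):
        pin = f"{n:04d}"
        guess_tag = make_tag(weak_key_from_pin(pin), message)
        if hmac.compare_digest(guess_tag, observed_tag):
            return pin
    return None

message = b"header of the protected file"
observed = make_tag(weak_key_from_pin("4821"), message)
print(recover_pin(message, observed))  # prints 4821
```

A slow, salted key derivation function such as `hashlib.pbkdf2_hmac` raises the cost per guess, but with only 10,000 possible PINs the input itself remains the weak point.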
Regarding "You have no idea what the key used to encrypt the file is, nor the encryption algorithm used. "
In most cases, encrypted files have a header or a footer which specifies something (the application used to encrypt the file, the encryption algorithm, or something else).
You can also try to figure out the algorithm from the padding (for example, 3DES pads to 8-byte blocks while AES pads to 16-byte blocks).
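The padding hint can be turned into a rough heuristic: given several ciphertexts from the same source, look at what their lengths have in common. This is only a sketch with a made-up function name. It assumes a block cipher mode that pads to full blocks and a non-empty list of lengths, and a result of 16 is also consistent with an 8-byte block cipher, so it narrows the guess rather than proving anything.

```python
from functools import reduce
from math import gcd

def guess_block_size(ciphertext_lengths):
    """Guess 16 (AES-like) or 8 (3DES-like) from ciphertext lengths in bytes."""
    common = reduce(gcd, ciphertext_lengths)
    if common % 16 == 0:
        return 16
    if common % 8 == 0:
        return 8
    return None  # stream cipher, unpadded mode, or not enough samples

print(guess_block_size([32, 48, 160]))  # prints 16
print(guess_block_size([24, 40, 64]))   # prints 8
print(guess_block_size([13, 27]))       # prints None
```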

Instead of using common ciphers such as AES, Blowfish, or Twofish, how about creating my own cipher?

I don't know much about the heavy math behind cryptosystems. I get stuck when it gets deep into Z/nZ algebra, and sometimes with all those exponents of exponents. It's not that I don't like it; it's just that the information you find on the web is not easy to follow blindly.
I was wondering: how reliable can an algorithm be when it encodes a message into plain binary? If my algorithm is arbitrary and known only to me, how can a cryptanalyst study an encrypted file and decrypt it, with or without having the decoded file?
I'm thinking about not using ASCII text to code my message, and I have some ideas to make this algorithm/program.
Attacking an AES or Blowfish encrypted file is more trivial for a cryptanalyst than if the algorithm the file is encrypted with is unknown to him, but how does he proceed then?
I don't know if I understood correctly, but a CS teacher once told me that codes are harder to crack than ciphers.
What do you think ?
Attacking an AES or Blowfish encrypted file is more trivial for a cryptanalyst than if the algorithm the file is encrypted with is unknown to him...
What about:
Attacking an untested, self-written algorithm with no real research behind it is more trivial for a cryptanalyst than attacking a well-known, proven one that has been used correctly....
In short, DO NOT roll your own cryptography unless you're an expert; no, unless you're part of an expert group in that field.
Nintendo failed when they implemented RSA on their own in the Wii, Sony failed too when using it in the PS3 (they pretty much used XKCD's random number function for M...)
And you really think you can win by using security by obscurity?
PS: That doesn't mean that you should take the Wikipedia entry on RSA and roll your own implementation from that one (that's exactly where Sony and Big-N failed); no, use a tested, open source implementation.
You seem to be using two words interchangeably but remember that Encoding is Not Encryption
When the attacker has no idea which algorithm you used and it is safe, the cryptanalyst has a hard job. So it is unimportant whether you use AES or your own cipher, as long as it is as strong and safe as AES. Here is the but: cryptography is demanding, and therefore you have many ways to shoot yourself in the foot without knowing it. I would suggest using standard algorithms, maybe with some safe variations.
Common wisdom is that you should not build your own algorithms, and especially not rely on these algorithms remaining secret.
The conceptual reason is that good encryption is about quantified confidentiality. We do not want our secrets to get cracked, but in a more precise way we want to be able to tell how much it would cost to crack our secrets (and hopefully show that the cost is way too high to be envisioned by any entity on Earth). This is the real advance which occurred a few years after World War II: to understand the distinction between key and algorithm. The key concentrates the secret. The algorithm becomes the implementation.
Since the implementation is, well, implemented, it exists as some code or a device, which is tangible and stored even when it is not used. Keeping an implementation secret requires keeping track of the hard disk on which the code resides at all times. If the attacker sees the binary code, he may be able to reverse-engineer it, something which depends on his wits and patience. The point here is that it is very difficult to be able to say: "it costs X dollars to recover a description of the algorithm".
On the other hand, the key is short. It can be stored safely much more easily; e.g. you could memorize it, and avoid committing it to any permanent storage device. You then have to worry about your key only at times when you use it (and not when you do not, e.g. in the middle of the night, when you sleep). The number of possible keys is a simple mathematical problem. You can easily and accurately estimate the average cost of enumerating the possible keys until your key is found. The key is a sturdy foundation for quantified security.
So you should not roll your own algorithms because then you do not know how much security you get.
Also, most people who rolled their own algorithms found out, usually the hard way, that they did not get much security at all. Designing a good encryption algorithm is hard, because it cannot be automatically tested. Your code may run, and properly decrypt data that it encrypted, but it tells you nothing about how secure the algorithm is. The design of the AES was the result of a process which took several years and involved hundreds of skilled cryptographers (most of whom had a PhD and years of experience in academic research on symmetric encryption). That a lone developer could do as well, let alone better, in the secrecy of his own workshop, looks kind of... implausible.
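The gap between "it decrypts correctly" and "it is secure" shows up even in a tiny example. The cipher below is a deliberately naive homemade design (repeating-key XOR) invented for this sketch, and the attack assumes the attacker has one plaintext/ciphertext pair and knows or guesses the key length.

```python
from itertools import cycle

def toy_encrypt(data: bytes, key: bytes) -> bytes:
    """Homemade 'cipher': XOR the data with the key repeated as needed."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))

toy_decrypt = toy_encrypt  # XOR is its own inverse

def recover_key(plaintext: bytes, ciphertext: bytes, key_len: int) -> bytes:
    """Known-plaintext attack: plaintext XOR ciphertext reveals the key."""
    return bytes(p ^ c for p, c in zip(plaintext[:key_len], ciphertext[:key_len]))

key = b"s3cr3t"
known_message = b"Dear customer, your invoice is attached."
private_message = b"Meet at the usual place at noon."

c1 = toy_encrypt(known_message, key)
c2 = toy_encrypt(private_message, key)

# The functional test passes, which says nothing about security.
assert toy_decrypt(c1, key) == known_message

stolen = recover_key(known_message, c1, len(key))
print(toy_decrypt(c2, stolen))
```

One known message is enough to read every other message under the same key, even though every round-trip test of the code succeeds.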
The biggest part of your strategy is called "security through obscurity." You're making the gamble that, since nobody knows the precise details of your little variation on an idea, they won't be able to figure it out.
I'm not a security expert, but I can tell you that you probably won't come up with something incredibly new. Cryptography has been studied by people for millennia and your idea is highly unlikely to be original. Even if you're a relatively good programmer and code something really tricky, the question will come down to who you're up against. If you're just trying to protect your data from your kid sister, then it will probably be fine. On the other hand, if you're using it to send credit card numbers across the internet, then you're doomed to fail. It will be analysed in ways you didn't think of or don't know, and ultimately cracked.
Another way to think of it: algorithms like AES have been extensively studied by professionals in the field and their level of security is pretty well understood. Anything you come up with by yourself will not have the benefit of having been attacked by the best and brightest minds out there. You will have almost no idea of how good it actually is until people start reporting identity theft.

Improving cipher's properties sanity check

I am reading about cryptography, and I was thinking about these properties of AES (which I use):
same message = same output
no message length secrecy
possible insecurity if you know the messages (does this actually apply to AES?)
I hear that AES is secure, but what if I want to theoretically improve these properties?
I was thinking I could do this:
apply encryption algorithm A
XOR with random data D (making sure the output looks random in case of any cipher)
generate random data that are longer than the original message
use hashing function F to allocate slots in random data (this scrambles the order bytes)
Inputs: encryption algorithm A, data to XOR with D, and a hashing function F
My questions are
does the proposed solution theoretically help with my concerns?
is this approach used somewhere?
Possible enhancements to this approach
I could also say that the next position chosen by hashing function will be altered using a checksum of the last decoded byte after the XOR step (that way the message has to be decoded from beginning to end)
If I were to use this to have a conversation with someone, the data to XOR with could be the last message from the other person, but that's probably a vulnerability.
I am looking forward to your thoughts!
(This is only theoretical, I am not in need of more secure encryption, just trying to learn from you guys.)
Yeah.
Look. If you want to learn about cryptography, I suggest you read Applied Cryptography. Really, just do it. You will get some nice definitive learnings, and get an understanding of what is appropriate and what is not. It specifically talks about implementation, which is what you are after.
Some rules of thumb:
Don't make up your own scheme. This is almost universally true. There may be exceptions, but it's fair to say that you should only invent your own scheme if you've thoroughly reviewed all existing schemes and have specific quantifiable reasons for them not being good enough.
Model your attacker. Find out what scenarios you are intending to protect against, and structure your system so that it works to mitigate the potential attacks.
Complexity is your enemy. Don't make your system more complex than it needs to be.
Stay up to date. You can find a few mailing lists related to cryptography (and hashing); join them. From there you will learn interesting implementation details, and be aware of the latest attacks.
As for specifically addressing your question, well, it's confusing. I don't understand your goal, nor do I understand steps 3 and 4. You might like to take a quick look here to gain an understanding of the different ways you can use a given encryption algorithm.
Hope this helps.
Your assumptions are incorrect.
same message != same output
The output will not be the same if you encrypt the same message twice.
This is because you are supposed to use a different IV each time.
Message length can be hidden by adding random data to the plaintext.
Attacks have been demonstrated against AES with a reduced number of rounds.
Full-round AES has not been compromised in any way.
Other than that, I suggest you follow Noon Silk's recommendation and read Applied Cryptography.
What's the point of the random data XOR? If it's truly random, how will you ever decrypt it? If you're saying the random data is part of the key, you might as well drop AES and use only the truly random key - as long as it's the same length (or longer than) the data and is never used more than once to encrypt. It's called a one-time pad, the only theoretically unbreakable encryption algorithm I know about.
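For reference, a one-time pad is short enough to sketch in full. The function names are illustrative; the pad comes from Python's `secrets` module, must be as long as the message, has to reach the recipient over a secure channel, and must never be reused.

```python
import secrets

def otp_encrypt(message: bytes):
    """Return (ciphertext, pad); the pad is the key and is used exactly once."""
    pad = secrets.token_bytes(len(message))
    ciphertext = bytes(m ^ p for m, p in zip(message, pad))
    return ciphertext, pad

def otp_decrypt(ciphertext: bytes, pad: bytes) -> bytes:
    """XOR with the same pad restores the message."""
    if len(pad) < len(ciphertext):
        raise ValueError("pad must be at least as long as the ciphertext")
    return bytes(c ^ p for c, p in zip(ciphertext, pad))

message = b"attack at dawn"
ciphertext, pad = otp_encrypt(message)
print(otp_decrypt(ciphertext, pad))
```

This also shows why adding an XOR with random data on top of AES gains nothing by itself: whoever decrypts needs that random data, so it simply becomes more key material to manage.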
If the random bits are pseudo-randomly generated, it's highly unlikely that your efforts will yield added security. Consider how many talented mathematicians were involved in designing AES...
EDIT: And I too highly recommend Applied Cryptography, it's an actually very readable and interesting book, not as dry as it may sound.

Resources