Identifying and reverse engineering a CRC from hex

I'm working on identifying and reversing the CRC for a hex code that I have.
The hex code contains certain information that I have already identified, but I can't find the CRC, so I can't recompute it after making changes to the file, which is important for me.
I have a few files of the same type with minor to major changes between them, and I have been using those differences to find where the CRC might be located.
I got it down to a few possible locations; however, when I put the rest of the data into a CRC calculator to verify, the result doesn't match.
I have a hunch that CRC-CCITT was used, but no calculator and no variation of the code seems to confirm that.
Is there any other way to identify and reverse the CRC?
I thought about writing code for this, but I'm not sure that will help since I'm not absolutely sure which CRC is being used.
I created an XOR file that shows me the differences. It helped me verify that I'm looking for a 16-bit CRC, but not what the polynomial might be; also, the CRC might be in two different places.
Now I'm having trouble identifying the start and end addresses over which the CRC is calculated. The hex code I have is 18H lines*16 bytes, 25D lines*16 bytes, so it's very time-consuming to try every variation in a calculator hoping I find the CRC. Is there any way to identify the start and end addresses?
Here is the XOR file for reference:
00000000000001710000000000000005
00100d0032b100000000000000000000
0000000000000000000000007fc90000
00000000000000000000000000000000
0171000000000000000500100d000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000a00
0c400000003900000000000000090000
00000000000000000000000000000000
00000000000001a00000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
000000000000
The possible locations for the CRC (with the XOR values found there) are:
02C and 02D = 7FC9
014 and 015 = 32B1
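Trying every start/end address by hand is what makes this slow, so the search can be scripted. The following is only a sketch under stated assumptions: the function names, the short candidate table and the `stored` value are illustrative and not taken from the files above. It runs a bitwise 16-bit CRC incrementally from every start offset and reports each (start, end) range whose CRC equals the stored value.

```python
def reflect(value, width):
    """Reverse the bit order of a width-bit integer."""
    out = 0
    for _ in range(width):
        out = (out << 1) | (value & 1)
        value >>= 1
    return out


def crc16_update(crc, byte, poly, refin):
    """Feed one byte into a 16-bit, MSB-first CRC register."""
    if refin:
        byte = reflect(byte, 8)
    crc ^= byte << 8
    for _ in range(8):
        crc = ((crc << 1) ^ poly) if crc & 0x8000 else (crc << 1)
        crc &= 0xFFFF
    return crc


def crc16_finish(crc, refout, xorout):
    """Apply the optional output reflection and final XOR."""
    if refout:
        crc = reflect(crc, 16)
    return crc ^ xorout


def crc16(data, poly, init, refin, refout, xorout):
    crc = init
    for byte in data:
        crc = crc16_update(crc, byte, poly, refin)
    return crc16_finish(crc, refout, xorout)


# A few common parameter sets: (poly, init, refin, refout, xorout).
# The list is an assumption; the real file may use something else entirely.
CANDIDATES = {
    "CRC-16/CCITT-FALSE": (0x1021, 0xFFFF, False, False, 0x0000),
    "CRC-16/XMODEM": (0x1021, 0x0000, False, False, 0x0000),
    "CRC-16/KERMIT": (0x1021, 0x0000, True, True, 0x0000),
    "CRC-16/X-25": (0x1021, 0xFFFF, True, True, 0xFFFF),
    "CRC-16/ARC": (0x8005, 0x0000, True, True, 0x0000),
}


def find_ranges(blob, stored, params):
    """Return every (start, end) with CRC(blob[start:end]) == stored."""
    poly, init, refin, refout, xorout = params
    hits = []
    for start in range(len(blob)):
        crc = init
        for end in range(start, len(blob)):
            crc = crc16_update(crc, blob[end], poly, refin)
            if crc16_finish(crc, refout, xorout) == stored:
                hits.append((start, end + 1))
    return hits
```

With only 16 bits, a search over every range of a file of a few hundred bytes will produce some accidental matches, so a hit is only convincing if the same range and parameters also match in the other files. It is also worth trying the stored value with its two bytes swapped. Because a CRC is affine, the XOR of the CRCs of two equal-length inputs equals the CRC of the XORed inputs computed with init=0 and xorout=0, so the XOR file itself can be searched against a difference value, provided the candidate range really is the same length in both files.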

Related

Decrypting MD5 hashed text when salt is known

Let's say I have the following MD5 hashed password:
bec0932119f0b0dd192c3bb5e5984eec
I know that the original password was not salted in the usual way; instead it was simply wrapped in 'flag{}' before being MD5-summed.
How can I decrypt the MD5 in this case?
The other answer is not correct about what you are trying to do. Let's begin with the formal definitions of the resistances required of cryptographic hash functions. The definitions below are from Cryptographic Hash-Function Basics: Definitions, Implications, and Separations for Preimage Resistance, Second-Preimage Resistance, and Collision Resistance by P. Rogaway and T. Shrimpton:
preimage-resistance — for essentially all pre-specified outputs, it is computationally infeasible to find any input which hashes to that output, i.e., to find any preimage x' such that h(x') = y when given any y for which a corresponding input is not known.
2nd-preimage resistance, weak-collision — it is computationally infeasible to find any second input which has the same output as any specified input, i.e., given x, to find a 2nd-preimage x' != x such that h(x) = h(x').
collision resistance, strong-collision — it is computationally infeasible to find any two distinct inputs x, x' which hash to the same output, i.e., such that h(x) = h(x').
Collisions and password cracking are not related. What you are actually trying to do is find a pre-image that matches the given hash value and the known wrapper. The cost of a generic pre-image attack is O(2^n); for MD5, n = 128, so that is O(2^128). There is a pre-image attack on MD5 that is better than the generic one, with a cost of 2^123.4:
Finding Preimages in Full MD5 Faster Than Exhaustive Search
This attack is still beyond the reach of everybody (except quantum computers, and that is another story), even supercomputers or the combined power of the Bitcoin miners.
As pointed out, MD5 is no longer cryptographically secure since its collision resistance is broken; even SHA-1 is no longer secure.
Hashing is not encryption/decryption. That is a long story, so here is the short answer: encryption is reversible but hashes are not (consider the pigeonhole principle, and see one-way functions). [Minor note: a block cipher mode of operation like CTR mode doesn't require a PRP; it can work with a PRF and is designed that way.]
What can you do?
First, use the John the Ripper password cracker.
If the password is not found, then
build a fast pre-image search on MD5 up to some limit according to your budget. hashcat is a very powerful tool that you can use for this. Here is a hashcat performance figure:
With an Nvidia RTX 3090, hashcat can search 65322.5 MH/s (megahashes per second). That is roughly 2^16 MH/s. The calculations (time, device cost, electricity cost) can then be done for the target search space, if it is known.
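What such a search does can be sketched in a few lines of Python. This is a minimal illustration, not a replacement for John the Ripper or hashcat: the function names, the tiny word list and the demo password are all hypothetical, and the wrapper format `flag{...}` is taken from the question.

```python
import hashlib


def crack_wrapped_md5(target_hex, candidates, prefix="flag{", suffix="}"):
    """Return the first candidate whose wrapped MD5 equals target_hex, else None."""
    target = target_hex.lower()
    for word in candidates:
        wrapped = f"{prefix}{word}{suffix}".encode("utf-8")
        if hashlib.md5(wrapped).hexdigest() == target:
            return word
    return None


def seconds_to_exhaust(alphabet_size, length, hashes_per_second):
    """Worst-case time to try every string of the given length."""
    return alphabet_size ** length / hashes_per_second


# Demo against a hash we construct ourselves (hypothetical password).
demo_target = hashlib.md5(b"flag{hunter2}").hexdigest()
print(crack_wrapped_md5(demo_target, ["password", "letmein", "hunter2"]))

# Budget estimate at the quoted hashcat rate, assuming (hypothetically)
# 8 characters drawn from lowercase letters and digits.
print(seconds_to_exhaust(36, 8, 65322.5e6))
```

Under that assumed search space the estimate comes out at well under a minute on the quoted GPU, which is why the known wrapper adds no real protection: it only changes the string that is hashed, not the number of candidates to try.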
MD5 is a hash function; you cannot really decrypt the result (please look up the difference between hashing and encryption).
However, you may try to find a collision: an input giving the same hash. With some probability it will match the original input. Cryptographic hash functions are designed so that finding a collision is very difficult (infeasible); however, for MD5 this no longer holds (that's why MD5 is considered unsafe to use).
You may check Vlastimil Klima's Tunnels in Hash Functions: MD5 Collisions Within a Minute; it links to more references and tools related to the tunnel attack.

A few questions on HTTPS encryption

I have a couple of questions about HTTPS encryption:
What is the bitsize of the keys used? Is it standardised? (I could not find this information searching the web.)
Can the keys generated for the key length start with a zero, or would this key be counted as an (n - 1)-bit key?
During the initial handshake, the client and server agree, among other things, on an algorithm for symmetric encryption. The key size depends on the algorithm, e.g. AES-128 uses 128 bits, AES-256 uses 256 bits, DES uses 56 bits, etc.
The key itself is random, which also means that it can start with 0. If you would restrict the initial bit to 1, you would effectively leave one bit less for the random bits.
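The point about leading zeros can be illustrated with a short sketch. The helper name `key_bits` is made up for this example; the idea is that a symmetric key is a fixed-length byte string, so its size is its length, not the position of its highest set bit.

```python
import secrets


def key_bits(key):
    """Key size in bits: fixed by the byte length, not by the value."""
    return len(key) * 8


# A random 128-bit key, as an AES-128 session key would be.
key = secrets.token_bytes(16)

# A key that happens to start with a zero byte is still a 128-bit key,
# even though the same value read as an integer needs fewer bits.
zero_led = bytes([0x00]) + secrets.token_bytes(15)
print(key_bits(key), key_bits(zero_led))
print(int.from_bytes(zero_led, "big").bit_length())
```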

Is there a two-key symmetric commutative encryption function?

I'm wondering if there is a strong encryption function (like AES or so) that works like this:
symmetric
2 keys: plaintext -> 2 keys -> ciphertext; however, the order of the keys must not matter, i.e.
Key1(Key2(plaintext)) == Key2(Key1(plaintext))
i.e. "commutative"
(the same is required for decryption: you need both keys, and the order doesn't matter)
Thanks
This can be easily done by putting any block encryption algorithm into CTR mode. CTR mode with a single key looks like:
ciphertext = plaintext XOR cipher(key, counter)
Where counter is initialized to your IV and incremented for each block. Decryption is exactly the same operation. As such, if you CTR-encrypt twice with two keys, you get:
ciphertext = plaintext XOR cipher(key0, counter) XOR cipher(key1, counter)
And since XOR is commutative, you can reverse it in either order.
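The commutativity can be checked with a small sketch. To stay within the standard library, the block cipher is replaced here by HMAC-SHA256 used as the keystream function; that substitution, and the names `keystream` and `ctr_apply`, are assumptions of this example rather than part of the answer.

```python
import hashlib
import hmac


def keystream(key, iv, nbytes):
    """Counter-mode keystream: PRF(key, iv || counter) for counter = 0, 1, ..."""
    out = bytearray()
    counter = 0
    while len(out) < nbytes:
        block = hmac.new(key, iv + counter.to_bytes(8, "big"), hashlib.sha256).digest()
        out.extend(block)
        counter += 1
    return bytes(out[:nbytes])


def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def ctr_apply(key, iv, data):
    """Encrypt or decrypt: the operation is its own inverse."""
    return xor_bytes(data, keystream(key, iv, len(data)))


iv = b"\x00" * 16  # fixed only for the demo
key0, key1 = b"first key", b"second key"
plaintext = b"order of the keys does not matter"

c01 = ctr_apply(key1, iv, ctr_apply(key0, iv, plaintext))
c10 = ctr_apply(key0, iv, ctr_apply(key1, iv, plaintext))
print(c01 == c10)
print(ctr_apply(key0, iv, ctr_apply(key1, iv, c01)) == plaintext)
```

As with any CTR-style construction, the same IV must never be reused with the same keys for a different message, since that would reuse the keystream.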
This has the nice property that you don't need to have all keys in the same location. Consider: Alice, Bob, and Charlie are participating in a protocol in which Charlie will double encrypt data for both Alice and Bob (this protocol will assume all point-to-point communication is secured through usual SSL-like channels):
Alice and Bob perform an authenticated Diffie-Hellman exchange to produce the IV. This IV is then sent to Charlie.
Alice computes digest(key0, IV + ctr) for ctr = 0...number-of-ciphertext-blocks, and sends the result KS_A to Charlie
Bob computes digest(key1, IV + ctr) for ctr = 0...number-of-ciphertext-blocks, and sends the result KS_B to Charlie
Charlie computes KS_A XOR KS_B XOR plaintext, and sends the resulting ciphertext to both Alice and Bob.
Alice and Bob each sign a tuple (IV, hash(ciphertext), description-of-encrypted-data). This is attached to the ciphertext.
Later, to decrypt:
Charlie (performing the decryption) sends the signed (IV, hash(ciphertext)) tuples to each of Alice and Bob, as well as the ciphertext.
Alice verifies her signed tuple, computes KS_A, and sends ciphertext XOR KS_A = D_A to Charlie
Bob verifies his signed tuple, computes KS_B, and sends ciphertext XOR KS_B = D_B to Charlie
Charlie computes KS = D_A XOR D_B = KS_A XOR KS_B
Charlie computes plaintext = ciphertext XOR KS
The purpose of the signed tuple and the DH exchange here is to ensure Alice and Bob can't be tricked into decrypting the wrong stream by being sent a different IV. This may not be relevant in your usage scenario. Also, the role of Charlie may be played by Alice or Bob in a real implementation.
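The data flow of the protocol above can be traced in a short simulation. It is a sketch only: the signed tuples and the Diffie-Hellman exchange are left out, the IV is simply random, and `digest(key, IV + ctr)` is assumed to be HMAC-SHA256. All three parties run in one process, so nothing here models the secured channels.

```python
import hashlib
import hmac
import secrets


def keystream(key, iv, nbytes):
    """digest(key, IV + ctr) for ctr = 0, 1, ... concatenated and truncated."""
    out = bytearray()
    ctr = 0
    while len(out) < nbytes:
        out.extend(hmac.new(key, iv + ctr.to_bytes(8, "big"), hashlib.sha256).digest())
        ctr += 1
    return bytes(out[:nbytes])


def xor_bytes(*parts):
    """XOR any number of equal-length byte strings together."""
    out = bytearray(len(parts[0]))
    for part in parts:
        for i, b in enumerate(part):
            out[i] ^= b
    return bytes(out)


iv = secrets.token_bytes(16)            # stands in for the DH-derived IV
key_a = secrets.token_bytes(32)         # Alice's key (key0)
key_b = secrets.token_bytes(32)         # Bob's key (key1)
plaintext = b"data that needs both Alice and Bob"
n = len(plaintext)

# Encryption: Alice and Bob each send a keystream, Charlie combines them.
ks_a = keystream(key_a, iv, n)
ks_b = keystream(key_b, iv, n)
ciphertext = xor_bytes(ks_a, ks_b, plaintext)

# Decryption: each party strips only its own keystream from the ciphertext.
d_a = xor_bytes(ciphertext, keystream(key_a, iv, n))
d_b = xor_bytes(ciphertext, keystream(key_b, iv, n))
ks = xor_bytes(d_a, d_b)                # equals KS_A XOR KS_B
recovered = xor_bytes(ciphertext, ks)
print(recovered == plaintext)
```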
If you're worried about the potential security risks of CTR mode, one other option would be to use CTR-mode encryption on a session key, which in turn is used to encrypt in a more normal mode, such as CBC. That is:
sessionkey = RANDOM
IV_0 = RANDOM
IV_1 = RANDOM
enc_sessionkey = sessionkey XOR cipher(key0, IV_0) XOR cipher(key1, IV_0)
ciphertext = enc_sessionkey + IV_0 + IV_1 + cipherCBC(IV_1, sessionkey, plaintext)
Although some other posters have commented on secret sharing, this is overkill if you don't need the property that only a subset of the keys is needed for decryption, i.e., with secret sharing you might encrypt with three keys but require only any two to decrypt. If you want to require all keys, secret sharing schemes aren't really necessary.
It's not a commutative encryption, but there are well-proven algorithms for secret sharing (note, this is not the same thing as "key agreement.")
Two of the best known methods are Shamir's and Blakley's. In general, these algorithms take a secret and produce many "shares". When enough shares are available to reach a threshold, the secret can be recovered. In the simplest case, two shares are required, but the threshold can be higher.
To explain Shamir's method in simple terms, think about a line on a graph. If you know any two points on the line, you know everything about the line. Any string of bytes, like the encryption key of a symmetric cipher, is just a large number in base 256. Shamir's algorithm treats this secret as the line's "y-intercept" (the y-coordinate of the line when x=0). Then the line's slope is chosen randomly. The y-coordinates of the line at x=1, x=2, x=3, … are computed, and each point is given to a different share-holder.
If any two of these share-holders get together, they can draw a line through their two points, back to the y-axis. The y-coordinate at where it crosses the axis is the original secret. However, each share-holder has only one point; by themselves, they can't guess anything about the original secret.
The threshold can be increased by increasing the degree of the polynomial. For example, if a parabola is used instead of a line, three shares are needed instead of two.
There's more to a real implementation, like the use of modular arithmetic, but this is the concept behind it. Blakley's approach is similar, but it uses the intersection of planes to encode the secret.
You can play around with an implementation of Shamir's method online.
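The line-and-parabola picture translates fairly directly into code once the arithmetic is done modulo a prime. The sketch below is for illustration only: the function names and the choice of the prime 2^127 - 1 are assumptions of this example, and the secret must be smaller than that prime.

```python
import secrets

PRIME = 2**127 - 1  # a Mersenne prime; the secret must be smaller than this


def split_secret(secret, threshold, num_shares, prime=PRIME):
    """Split secret into points on a random polynomial of degree threshold - 1."""
    if not 0 <= secret < prime:
        raise ValueError("secret out of range for the chosen prime")
    # coeffs[0] is the y-intercept (the secret); the rest are random.
    coeffs = [secret] + [secrets.randbelow(prime) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, num_shares + 1):
        y = 0
        for c in reversed(coeffs):      # Horner evaluation mod prime
            y = (y * x + c) % prime
        shares.append((x, y))
    return shares


def recover_secret(shares, prime=PRIME):
    """Lagrange-interpolate the polynomial at x = 0 from the given shares."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % prime
                den = (den * (xi - xj)) % prime
        secret = (secret + yi * num * pow(den, -1, prime)) % prime
    return secret


# A line (threshold 2): any two of the five shares recover the secret.
shares = split_secret(123456789, threshold=2, num_shares=5)
print(recover_secret([shares[0], shares[3]]))
```

Raising `threshold` to 3 corresponds to the parabola case: three shares are then needed, and any two of them give no information about the y-intercept. `pow(den, -1, prime)` requires Python 3.8 or later.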
You can make a commutative encryption algorithm, but the encryption methods must then be limited to commutative operations. This limits the strength of the encryption function because it greatly reduces the set of encryption methods that can be used. Thus, if a hacker wanted to break your algorithm and knew it was commutative, it would greatly improve his chances of breaking it, because there are fewer decryption methods he would need to try. However, it might be okay for your purposes, depending on how much hacking you expect.
Also, I'm not sure if "secret splitting" is what you are going for, as mentioned by atk. I've looked at it briefly, but from what I've seen (at least for the basic case) you can't perform the operations separately, as both keys need to be provided together to perform the encrypt/decrypt actions. In other words you can't call encrypt with one person's key to get a result that you can call encrypt on with a second key. However, if you have both keys available at once, this might be a good method to try.
You're talking about secret splitting. Yes, there's been a lot of research on it. Wikipedia would be a good starting point.

What is the difference between CRC and checksum?

What is the difference between CRC and checksum?
CRC (Cyclic Redundancy Check) is a type of checksum, specifically a position-dependent checksum algorithm (others include Fletcher's checksum and Adler-32). As the name suggests, these detect positional changes as well, which makes them more robust, and thus more widely used, than other checksum methods.
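Position dependence is easy to see by swapping two bytes. In the sketch below, `additive_checksum` is a deliberately simple sum-of-bytes checksum written for this example, compared against `zlib.crc32` from the Python standard library; the two messages are made up.

```python
import zlib


def additive_checksum(data):
    """Sum of all bytes modulo 256: ignores the order of the bytes."""
    return sum(data) % 256


original = b"transfer 12 to account 34"
swapped = b"transfer 21 to account 34"   # two adjacent digits transposed

# The plain sum cannot tell the two messages apart ...
print(additive_checksum(original) == additive_checksum(swapped))
# ... while the CRC changes, because each byte's contribution depends on its position.
print(zlib.crc32(original) == zlib.crc32(swapped))
```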
CRC refers to a specific checksum algorithm. Other types of checksums are XOR, modulus, and all the various cryptographic hashes.
Check out HowStuffWorks for a good description of both and how they differ.
From the page:
Cyclic Redundancy Check (CRC)
CRCs are similar in concept to checksums, but they use polynomial division to determine the value of the CRC
More info is given at the link above including an example of how a checksum is calculated.
Jeff Atwood (founder of Stack Overflow) wrote in his Checksums and Hashes blog post:
I learned to appreciate the value of the Cyclic Redundancy Check (CRC) algorithm in my 8-bit, 300 baud file transferring days. If the CRC of the local file matched the CRC stored in the file (or on the server), I had a valid download. I also learned a little bit about the pigeonhole principle when I downloaded a file with a matching CRC that was corrupt!
A checksum is an error-detection scheme that typically refers to a cryptographic hash function, though it also includes CRC. Here are three different types of checksum:
Cyclic Redundancy Checks like CRC32 are fast but collision-prone. They are not robust to collision attacks, meaning that somebody can take a given CRC and easily find a second input that matches it.
Cryptographic hash functions like MD5 (weaker), SHA1 (weak), and SHA256 (strong) are specifically designed to be resistant to collision attacks. They are preferable to CRCs in every situation except speed; use the strongest algorithm you can computationally afford.
Key derivation functions like PBKDF2 and bcrypt are designed for passwords. They are checksums that are expensive to compute so that they're robust to brute-force attacks.
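One reason CRCs offer no resistance to deliberate manipulation is that they are affine over XOR: for equal-length inputs, the CRC of a XOR b XOR c is the XOR of the three individual CRCs. The sketch below checks this with `zlib.crc32`; the helper names and sample messages are assumptions of this example. No cryptographic hash has such a predictable structure.

```python
import zlib


def xor_bytes(a, b):
    if len(a) != len(b):
        raise ValueError("inputs must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


def crc_is_affine(a, b, c):
    """Check crc(a ^ b ^ c) == crc(a) ^ crc(b) ^ crc(c) for equal-length inputs."""
    combined = xor_bytes(xor_bytes(a, b), c)
    return zlib.crc32(combined) == zlib.crc32(a) ^ zlib.crc32(b) ^ zlib.crc32(c)


print(crc_is_affine(b"0123456789abcdef", b"fedcba9876543210", b"the quick brown "))
```

Because the effect of any change on the CRC can be computed in advance like this, an attacker can choose a compensating change elsewhere in the message that restores the original CRC value.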
See also this Crypto.SE question on CRC vs SHA1. Wikipedia has a hash function security summary page that discusses collision-proneness of various cryptographic hashes.

Generating short license keys with OpenSSL

I'm working on a new licensing scheme for my software, based on OpenSSL public / private key encryption. My past approach, based on this article, was to use a large private key size and encrypt an SHA1 hashed string, which I sent to the customer as a license file (the base64 encoded hash is about a paragraph in length). I know someone could still easily crack my application, but it prevented someone from making a key generator, which I think would hurt more in the long run.
For various reasons I want to move away from license files and simply email a 16-character base32 string that the customer can type into the application. Even using small private keys (which I understand are trivial to crack), it's hard to get the encrypted hash this small. Would there be any benefit to using the same strategy to generate an encrypted hash, but simply using the first 16 characters as a license key? If not, is there a better alternative that will create keys in the format I want?
DSA signatures are significantly shorter than RSA ones. A DSA signature is twice the size of the Q parameter; if you use the OpenSSL defaults, Q is 160 bits, so your signatures fit into 320 bits.
If you can switch to a base-64 representation (which only requires upper- and lower-case letters, the digits and two other symbols) then you will need 54 symbols (320 / 6, rounded up), which you could do with 11 groups of 5. Not quite the 16 that you wanted, but still within the bounds of being user-enterable.
Actually, it occurs to me that you could halve the number of bits required in the license key. DSA signatures are made up of two numbers, R and S, each the size of Q. However, the R values can all be pre-computed by the signer (you) - the only requirement is that you never re-use them. So this means that you could precalculate a whole table of R values - say 1 million of them (taking up 20MB) - and distribute these as part of the application. Now when you create a license key, you pick the next un-used R value, and generate the S value. The license key itself only contains the index of the R value (needing only 20 bits) and the complete S value (160 bits).
And if you're getting close to selling a million copies of the app - a nice problem to have - just create a new version with a new R table.
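The sizes involved can be checked with a few lines of arithmetic. The helper names below are made up for this sketch; the bit counts (160-bit Q, 20-bit index, one million table entries) come from the answer above.

```python
def symbols_needed(bits, bits_per_symbol):
    """Number of symbols required to carry the given number of bits."""
    return -(-bits // bits_per_symbol)   # ceiling division


Q_BITS = 160
FULL_SIGNATURE_BITS = 2 * Q_BITS         # R and S together
INDEXED_KEY_BITS = 20 + Q_BITS           # index into the R table, plus S
R_TABLE_BYTES = 1_000_000 * Q_BITS // 8  # one million precomputed R values

for label, bits in [("full signature", FULL_SIGNATURE_BITS),
                    ("index + S", INDEXED_KEY_BITS)]:
    print(label, bits, "bits:",
          symbols_needed(bits, 6), "base64 symbols,",
          symbols_needed(bits, 5), "base32 symbols")
print("R table size in bytes:", R_TABLE_BYTES)
```

With the precomputed table, the 180-bit key needs 30 base64 symbols or 36 base32 symbols, which is still more than twice the 16 base32 characters asked for in the question.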
Did you consider using an existing protection and key generation scheme? I know that EXECryptor (I am not advertising it at all, this is just some info I remember) offers strong protection, which together with a complementary product from the same vendor, StrongKey (if memory serves), offers short keys and protection against cracking. Armadillo is another product name that comes to mind, though I don't know what level of protection it offers now. It also had short keys earlier.
In general, cryptographically strong short keys are based on some aspects of ECC (elliptic curve cryptography). A large part of ECC is patented, and overall ECC is hard to implement right, so an industry solution is the preferred way to go.
Of course, if you don't need strong keys, you can go with just a hash of "secret word" (salt) + user name, and verify them in the application, but this is crackable in minutes.
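A minimal sketch of that weak scheme, producing exactly the 16-character base32 format from the question, might look as follows. The secret value and function names are hypothetical, and HMAC-SHA256 is used here as the keyed hash (the answer only says "a hash"). Since the secret has to ship inside the application to verify keys, anyone who extracts it can write a key generator.

```python
import base64
import hashlib
import hmac

# Hypothetical secret; in this scheme it is embedded in the application.
SECRET = b"example-secret-embedded-in-app"


def make_license_key(user_name, secret=SECRET):
    """Keyed hash of the user name, truncated to 80 bits = 16 base32 characters."""
    digest = hmac.new(secret, user_name.encode("utf-8"), hashlib.sha256).digest()
    return base64.b32encode(digest[:10]).decode("ascii")


def check_license_key(user_name, key, secret=SECRET):
    """Recompute the key for this user and compare in constant time."""
    expected = make_license_key(user_name, secret)
    return hmac.compare_digest(expected, key.strip().upper())


key = make_license_key("alice@example.com")
print(key, len(key))
print(check_license_key("alice@example.com", key.lower()))
```

Ten bytes encode to base32 without padding, which is why the truncation length is 10.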
Why use public key crypto? It gives you the advantage that nobody can reverse-engineer the executable to create a key generator, but key generators are a somewhat secondary risk compared to patching the executable to skip the check, which is generally much easier for an attacker, even with well-obfuscated executables.
Eugene's suggestion of using ECC is a good one - ECC keys are much shorter than RSA or DSA for a given security level.
However, 16 characters in base 32 is still only 5*16=80 bits, which is low enough that brute-forcing for valid keys might be practical, regardless of what algorithm you use.
