Generating short license keys with OpenSSL - encryption

I'm working on a new licensing scheme for my software, based on OpenSSL public / private key encryption. My past approach, based on this article, was to use a large private key size and encrypt an SHA1 hashed string, which I sent to the customer as a license file (the base64 encoded hash is about a paragraph in length). I know someone could still easily crack my application, but it prevented someone from making a key generator, which I think would hurt more in the long run.
For various reasons I want to move away from license files and simply email a 16 character base32 string the customer can type into the application. Even using small private keys (which I understand are trivial to crack), it's hard to get the encrypted hash this small. Would there be any benefit to using the same strategy to generate an encrypted hash, but simply using the first 16 characters as a license key? If not, is there a better alternative that will create keys in the format I want?

DSA signatures are significantly shorter than RSA ones. A DSA signature is twice the size of the Q parameter; if you use the OpenSSL defaults, Q is 160 bits, so your signatures fit into 320 bits.
If you can switch to a base-64 representation (which only requires upper- and lower-case letters, the digits and two other symbols) then you will need 54 symbols (320 / 6, rounded up), which you could do with 11 groups of 5. Not quite the 16 that you wanted, but still within the bounds of being user-enterable.
Actually, it occurs to me that you could halve the number of bits required in the license key. DSA signatures are made up of two numbers, R and S, each the size of Q. However, the R values can all be pre-computed by the signer (you) - the only requirement is that you never re-use them. So this means that you could precalculate a whole table of R values - say 1 million of them (taking up 20MB) - and distribute these as part of the application. Now when you create a license key, you pick the next un-used R value, and generate the S value. The license key itself only contains the index of the R value (needing only 20 bits) and the complete S value (160 bits).
And if you're getting close to selling a million copies of the app - a nice problem to have - just create a new version with a new R table.
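The size arithmetic in this answer can be checked with a short sketch. The helper name chars_needed is made up for illustration; it simply rounds a bit count up to whole symbols, at 5 bits per base32 character or 6 bits per base64 character.

```python
import math

def chars_needed(bits: int, bits_per_char: int) -> int:
    """Smallest number of symbols that can carry `bits` bits."""
    return math.ceil(bits / bits_per_char)

FULL_SIGNATURE_BITS = 160 + 160  # R and S, each the size of Q
INDEXED_KEY_BITS = 20 + 160      # 20-bit index into the R table, plus S

for label, bits in [("full DSA signature", FULL_SIGNATURE_BITS),
                    ("R index + S", INDEXED_KEY_BITS)]:
    print(f"{label}: {bits} bits,",
          chars_needed(bits, 5), "base32 chars,",
          chars_needed(bits, 6), "base64 chars")
```

Even the shortened form (index plus S) comes to 36 base32 characters or 30 base64 characters, so it is more typeable than a full signature but still well over the 16 characters asked for.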

Did you consider using some existing protection + key generation scheme? I know that EXECryptor (I am not advertising it at all, this is just some info I remember) offers strong protection which, together with a complementary product from the same guys, StrongKey (if memory serves), offers short keys and protection against cracking. Armadillo is another product name that comes to my mind, though I don't know what level of protection they offer now. But they also had short keys earlier.
In general, cryptographically strong short keys are based on some aspects of ECC (elliptic curve cryptography). A large part of ECC is patented, and overall ECC is hard to implement right, so an industry solution is the preferred way to go.
Of course, if you don't need strong keys, you can go with just a hash of "secret word" (salt) + user name, and verify them in the application, but this is crackable in minutes.
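For reference, the weak "secret word + user name" scheme could look roughly like the sketch below. It swaps the plain hash for HMAC-SHA256 (a keyed hash), and the secret, the function names and the 80-bit truncation are all assumptions chosen to land on the 16-character base32 format from the question. The weakness described above still applies: the secret has to ship inside the application, so anyone who extracts it can generate keys.

```python
import base64
import hashlib
import hmac

# Hypothetical embedded secret; extracting it from the binary breaks the scheme.
SECRET = b"secret word"

def make_key(user_name: str) -> str:
    """Derive a 16-character base32 key (80 bits) from the user name."""
    digest = hmac.new(SECRET, user_name.encode("utf-8"), hashlib.sha256).digest()
    return base64.b32encode(digest[:10]).decode("ascii")  # 10 bytes -> 16 chars

def check_key(user_name: str, key: str) -> bool:
    """Recompute the key for this user and compare in constant time."""
    expected = make_key(user_name).encode("ascii")
    supplied = key.strip().upper().encode("utf-8")
    return hmac.compare_digest(expected, supplied)

key = make_key("alice")
print(key, check_key("alice", key), check_key("bob", key))
```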

Why use public key crypto? It gives you the advantage that nobody can reverse-engineer the executable to create a key generator, but key generators are a somewhat secondary risk compared to patching the executable to skip the check, which is generally much easier for an attacker, even with well-obfuscated executables.
Eugene's suggestion of using ECC is a good one - ECC keys are much shorter than RSA or DSA for a given security level.
However, 16 characters in base 32 is still only 5*16=80 bits, which is low enough that brute-forcing for valid keys might be practical, regardless of what algorithm you use.

Related

Extending short key for AES256 (SNMPv3)

I am currently working on security of a switch that runs SNMPv3.
I am expected to code it in such a way, that any SHA (1 - 2-512) is compatible with any AES (128 - 256C).
Everything, like the algorithms alone, works pretty well. The problem is that it's been established that we are going to use SHA for key generation for both authentication and encryption.
When I want to use, let's say, SHA512 with AES256, there's no problem, since SHA has output of 64B and I need just 32B for key for AES256.
But when I want to use SHA1 with AES256, SHA1 produces only 20B, which is insufficient for the key.
I've searched the internet through and through and found out that it's common to use this combination (snmpget, openssl), but I haven't found a single word about how you are supposed to extend the key.
How can I extend the key from 20B to 32B so it works?
P. S.: Yes, I know SHA isn't KDF, yes, I know it's not that common to use this combination, but this is just how it is in my job assignment.
Here is a page discussing your exact question. In short, there is no standard way to do this (as you have already discovered), however, Cisco has adopted the approach outlined in section 2.1 of this document:
Chaining is described as follows. First, run the password-to-key algorithm with inputs of the passphrase and engineID as described in the USM document. This will output as many key bits as the hash algorithm used to implement the password-to-key algorithm. Secondly, run the password-to-key algorithm again with the previous output (instead of the passphrase) and the same engineID as inputs. Repeat this process as many times as necessary in order to generate the minimum number of key bits for the chosen privacy protocol. The outputs of each execution are concatenated into a single string of key bits.
When this process results in more key bits than are necessary, only the most significant bits of the string should be used.
For example, if password-to-key implemented with SHA creates a 40-octet string for use as key bits, only the first 32 octets will be used for usm3DESEDEPrivProtocol.
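A minimal sketch of the chaining described above, assuming the password-to-key algorithm is the one from the USM document (RFC 3414: hash 1,048,576 octets of the repeated passphrase, then localize with the engineID) and assuming "most significant bits" means the leading octets, as in the 3DES example. The function names and the engineID value are made up for illustration; please check the output against a known-good implementation before relying on it.

```python
import hashlib

def password_to_key(secret: bytes, engine_id: bytes, hash_name: str = "sha1") -> bytes:
    """USM password-to-key: hash 1 MiB of the repeated secret, then localize."""
    if not secret:
        raise ValueError("secret must not be empty")
    total = 1024 * 1024
    reps, rem = divmod(total, len(secret))
    ku = hashlib.new(hash_name, secret * reps + secret[:rem]).digest()
    # Localization: H(Ku || engineID || Ku)
    return hashlib.new(hash_name, ku + engine_id + ku).digest()

def chained_key(passphrase: bytes, engine_id: bytes, key_len: int,
                hash_name: str = "sha1") -> bytes:
    """Re-run password-to-key on its own output until key_len octets exist."""
    out = b""
    secret = passphrase
    while len(out) < key_len:
        secret = password_to_key(secret, engine_id, hash_name)
        out += secret
    return out[:key_len]  # keep only the leading octets

engine_id = bytes.fromhex("000000000000000000000002")  # example value only
aes256_key = chained_key(b"maplesyrup", engine_id, 32)
print(len(aes256_key), aes256_key.hex())
```

With SHA1 this runs password-to-key twice (20 + 20 octets) and keeps the first 32 octets for AES256.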

Private/Public key encryption algorithm for short messages, giving short results via ED25519?

I have short messages (<=256bit) that need to be encrypted and published as a (HTTP URL) QR code, along with the public key(s). Because of the QR requirement the result should also stay 256bits long - with the scheme, servername, and base64 encoding the resulting URL already has quite a length, and so the QR becomes "too" big easily.
RSA is out of the question for that key size.
libsodium provides crypto_box functions using ED25519; but for these I need to transport the nonce (24 bytes) as well, and the result is eg. 48 bytes - this makes the QR code already a bit unwieldy.
Furthermore, using one (constant) key pair and another randomly generated per message means the random key needs to be embedded as well, enlarging the result.
Using a single key pair doesn't work - If I encrypt with sec1 and pub1, I need to publish exactly these values for decrypting too.
So I'm pondering using plain, raw ED25519 en- and decryption. Are there any pitfalls like with RSA (padding, bad keys (like pub exp 3)) that I need to look out for?
My plan would be to take the input, do an SHA256 of it, use the hash value to pad the input to 256 bits, and then do a plain ED25519 encryption. (I'll prepend a key marker to the result to make key rotation possible.)
What can go wrong? After all, all the complexity in libsodium has to have a reason, right?
Thanks a lot for any help!

simple encryption tutorial?

I'm looking for a simple encryption tutorial, for encoding a string into another string. I'm looking for it in general mathematical terms or pseudocode; we're doing it in a scripting language that doesn't have access to libraries.
We have a Micros POS ( point of sale ) system and we want to write a script that puts an encoded string on the bottom of receipts. This string is what a customer would use to log on to a website and fill out a survey about the business.
So in this string, I would like to get a three-digit hard-coded location identifier, the date, and time; e.g.:
0010912041421
Where 001 is the location identifier, 09 the year, 12 the month, 04 the day, and 1421 the military time ( 2:21 PM ). That way we know which location the respondent visited and when.
Obviously if we just printed that string, it would be easy for someone to crack the 'code' and fill out endless surveys at our expense, without having actually visited our stores. So if we could do a simple encryption, and decode it with a pre-set key, that would be great. The decoding would take place on the website.
The encrypted string should also be about the same number of characters, to lessen the chance of people mistyping a long arbitrary string.
Encryption won't give you any integrity protection or authentication, which are what you need in this application. The customer knows when and where they made a purchase, so you have nothing to hide.
Instead, consider using a Message Authentication Code. These are often based on a cryptographic hash, such as SHA-1.
Also, you'll want to consider a replay attack. Maybe I can't produce my own code, but what's to stop me from coming back a few times with the same code? I assume you might serve more than one customer per minute, and so you'll want to accept duplicate timestamps from the same location.
In that case, you'll want to add a unique identifier. It might only be unique when combined with the timestamp. Or, you could simply extend the timestamp to include seconds or tenths of seconds.
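To make the MAC suggestion concrete, here is a rough sketch in Python (not the POS scripting language, which would need its own hash implementation). The key, the function names and the 6-digit tag length are assumptions for illustration: the receipt prints the plain location and timestamp followed by a short tag derived from HMAC-SHA1, and the website recomputes the tag.

```python
import hashlib
import hmac

# Hypothetical secret shared by the POS script and the website.
KEY = b"shared-secret-known-to-POS-and-website"

def tag_for(code: str, digits: int = 6) -> str:
    """Short decimal tag derived from HMAC-SHA1 of the visible code."""
    mac = hmac.new(KEY, code.encode("utf-8"), hashlib.sha1).digest()
    return str(int.from_bytes(mac[:4], "big") % 10**digits).zfill(digits)

def make_receipt_code(location: str, timestamp: str) -> str:
    code = location + timestamp
    return code + tag_for(code)

def verify_receipt_code(printed: str, digits: int = 6) -> bool:
    code, tag = printed[:-digits], printed[-digits:]
    return hmac.compare_digest(tag_for(code, digits).encode("ascii"),
                               tag.encode("utf-8"))

printed = make_receipt_code("001", "0912041421")
print(printed, verify_receipt_code(printed))
```

A 6-digit tag means a random guess succeeds about once in a million tries, so the tag length is a trade-off against typing effort. To handle the replay concern, the website would also need to remember which codes it has already accepted.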
First off, I should point out that this is probably a fair amount of work to go through if you're not solving a problem you are actually having. Since you're going to want some sort of monitoring/analysis of your survey functionality anyway, you're probably better off trying to detect suspicious behavior after the fact and providing a way to rectify any problems.
I don't know if it would be feasible in your situation, but this is a textbook case for asymmetric crypto.
Give each POS terminal its own private key
Give each POS terminal the public key of your server
Have the terminal encrypt the date, location, etc. info (using the server's public key)
Have the terminal sign the encrypted data (using the terminal's private key)
Encode the results into a human-friendly string (Base64?)
Print the string on the receipt
You may run into problems with the length of the human-friendly string, though.
NOTE You may need to flip flop the signing and encrypting steps; I don't have my crypto reference book(s) handy. Please look this up in a reputable reference, such as Applied Cryptography by Schneier.
Which language are you using/familiar with?
The Rijndael website has C source code to implement the Rijndael algorithm. They also have pseudocode descriptions of how it all works, which is probably the best you could go with. But most of the major algorithms have source code provided somewhere.
If you do implement your own Rijndael algorithm, then be aware that the Advanced Encryption Standard limits the key and block size. So if you want to be cross compatible you will need to use those sizes: a 128-bit block size and 128-, 192- or 256-bit key sizes.
Rolling your own encryption algorithm is something that you should never do if you can avoid it. So finding a real algorithm and implementing it if you have to is definitely a better way to go.
Another alternative that might be easier is DES, or 3DES more specifically. But I don't have a link handy. I'll see if I can dig one up.
EDIT:
This link has the FIPS standard for DES and Triple DES. It contains all the permutation tables and such, I remember taking some 1s and 0s through a round of DES manually once. So it is not too hard to implement once you get going, just be careful not to change around the number tables. P and S Boxes they are called if I remember correctly.
If you go with these then use Triple DES, not DES. 3DES actually uses two keys, doubling the key size of the algorithm; the short key is the only real weakness of DES, which as far as I know has not been cracked by anything other than brute force. 3DES goes through DES three times, using one key to encrypt, the other to decrypt, and the first one to encrypt again.
The Blowfish website also has links to implement the Blowfish algorithm in various languages.
I've found Cryptographic Right Answers to be a helpful guide in choosing the right cryptographic primitives to use under various circumstances. It tells you what crypto/hash to use and what sizes are appropriate. It contains links to the various cryptographic primitives it refers to.
One way would be to use AES - taking the location, year, month, and day - encrypting it using a secret key and then tacking on the last 4 digits (the military time) as the initialization vector. You can then convert it to some form of Base32. You'll end up with something that looks like a product key. It may be too long for you though.
A slight issue would be that you would probably want to use more digits on the military time though since you could conceivably get multiple transactions on the same day from the same location within the same minute.
What I want to use is XOR. It's simple enough that we can do it in the proprietary scripting language ( we're not going to be able to do any real encryption in it ), and if someone breaks it, then we can change the key easily enough.

How can SHA encryption be possible? [duplicate]

Duplicate:
Confused about hashes
How can SHA encryption create a unique 40 character hash for any string, when there are an infinite number of possible input strings but only a finite number of 40 character hashes?
SHA is not an encryption algorithm, it is a cryptographic hashing algorithm.
Check out this reference at Wikipedia
The simple answer is that it doesn't create a unique 40 character hash for any string - it's inevitable that different strings will have the same hash.
It does try to make sure that close-by strings will have very different hashes. 40 characters is a pretty long hash, so the chance of collision is quite low unless you're doing ridiculous numbers of them.
SHA doesn't create a unique 40 character hash for any string. If you create enough hashes, you'll get a collision (two inputs that hash to the same output) eventually. What makes SHA and other hash functions cryptographically useful is that there's no easy way to find two files that will have the same hash.
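The "create enough hashes and you'll get a collision" point is easy to see if the hash is shortened. This is only a sketch: it keeps the first 2 bytes (16 bits) of SHA-1 so that a collision turns up quickly, and the function names are made up. With only 65,536 possible truncated values, a collision is guaranteed within 65,537 distinct inputs.

```python
import hashlib

def short_hash(data: bytes, n_bytes: int = 2) -> bytes:
    """SHA-1 truncated to its first n_bytes bytes."""
    return hashlib.sha1(data).digest()[:n_bytes]

def find_collision(n_bytes: int = 2):
    """Hash "0", "1", "2", ... until two inputs share a truncated hash."""
    seen = {}
    counter = 0
    while True:
        msg = str(counter).encode("ascii")
        h = short_hash(msg, n_bytes)
        if h in seen:
            return seen[h], msg
        seen[h] = msg
        counter += 1

a, b = find_collision()
print(a, b, short_hash(a).hex(), short_hash(b).hex())
```

The full 160-bit output works the same way in principle; the space is just far too large for this kind of search to finish.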
To elaborate on jdigital's answer:
Since it's a hash algorithm and not an encryption algorithm, there is no need to reverse the operation. This, in turn, means that the result does not need to be unique; there are (in theory) an infinite number of strings that will result in the same hash. Finding out which ones those are is practically impossible, though.
Hash algorithms like SHA-1 or the SHA-2 family are used as "one-way" hashes in support of password-based authentication. It is not computationally feasible to find a message (password) that hashes to a given value. So, if an attacker obtains the list of hashed passwords, they can't determine the original passwords.
You are correct that, in general, there are an infinite number of messages that hash to a given value. It's still hard to find one though.
It does not guarantee that two strings will have unique 40 character hashes. What it does is provide an extremely low probability that two strings will have conflicting hashes, and makes it very difficult to create two conflicting documents without just randomly trying inputs.
Generally, a low enough probability of something bad happening is as good as a guarantee that it never will. As long as it's more likely that the world will end when a comet hits it, the chance of a colliding hash isn't generally worth worrying about.
Of course, secure hash algorithms are not perfect. Because they are used in cryptography, they are very valuable things to try and crack. SHA-1, for instance, has been weakened (you can find a collision in 2000 times fewer guesses than just doing random guessing); MD5 has been completely cracked, and security researchers have actually created two certificates which have the same MD5 sum, and got one of them signed by a certificate authority, thus allowing them to use the other one as if it had been signed by the certificate authority. You should not blindly put your faith in cryptographic hashes; once one has been weakened (like SHA-1), it is time to look for a new hash, which is why there is currently a competition to create a new standard hash algorithm.
The function is something like:
hash1 = SHA1(plaintext1)
hash2 = SHA1(plaintext2)
now, hash1 and hash2 can technically be the same. It's a collision. Not common, but possible, and not a problem.
The real magic is in the fact that it's impossible to do this:
plaintext1 = SHA1-REVERSE(hash1)
So you can never reverse it. Handy if you don't want to know what a password is, only that the user gave you the same one both times. Think about it. You have 1024 bytes of input. You get 160 bits (40 hex characters) of output. How can you EVER reconstruct those 1024 bytes from the 160 bits - you threw information away. It's just not possible (well, unless you design the algorithm to allow it, I guess....)
Also, if 160 bits isn't enough, use SHA256 or something with a bigger output. And Salt it. Salt is good.
Oh, and as an aside: any website which emails you your password is not hashing its passwords. It's either storing them unencrypted (run, run screaming), or encrypting them with a 2 way encryption (DES, AES, public-private key et al - trust them a little more)
There are ZERO reasons for a website to be able to email you your password, or need to store anything but the hash. /rant.
Nice observation. Short answer: it can't, and that leads to collisions, which can be exploited in birthday attacks.
The simple answer is: it doesn't create unique hashes. Look at the Pigeonhole principle. It's just so unlikely for there to be a collision that nobody has ever found one.

What is the difference between Obfuscation, Hashing, and Encryption?

What is the difference between Obfuscation, Hashing, and Encryption?
Here is my understanding:
Hashing is a one-way algorithm; cannot be reversed
Obfuscation is similar to encryption but doesn't require any "secret" to understand (ROT13 is one example)
Encryption is reversible but a "secret" is required to do so
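As a rough illustration of the first two items in that list, using only the Python standard library: ROT13 is undone by anyone who knows the method, with no secret involved, while a SHA-256 digest has a fixed length and offers no decode step. Encryption is left out because the standard library does not ship a cipher. The message text is arbitrary.

```python
import codecs
import hashlib

message = "Hello, World"

# Obfuscation: reversible by anyone, no key needed.
obfuscated = codecs.encode(message, "rot13")
recovered = codecs.decode(obfuscated, "rot13")

# Hashing: fixed-size output, no inverse function provided.
digest = hashlib.sha256(message.encode("utf-8")).hexdigest()

print(obfuscated)   # Uryyb, Jbeyq
print(recovered)    # Hello, World
print(len(digest))  # 64 hex characters, whatever the input length
```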
Hashing is a technique of creating semi-unique keys based on larger pieces of data. In a given hash you will eventually have "collisions" (i.e. two different pieces of data calculating to the same hash value) and when you do, you typically create a larger hash key size.
Obfuscation generally involves trying to remove helpful clues (i.e. meaningful variable/function names), removing whitespace to make things hard to read, and generally doing things in convoluted ways to make following what's going on difficult. It provides no serious level of security like "true" encryption would.
Encryption can follow several models, one of which is the "secret" method, called private key (symmetric) encryption, where both parties share a secret key. Public key encryption uses a shared one-way key to encrypt and a private recipient key to decrypt. With public key, only the recipient needs to have the secret.
That's a high level explanation. I'll try to refine them:
Hashing - in a perfect world, it's a random oracle. For the same input X, you always receive the same output Y, that is in NO WAY related to X. This is mathematically impossible (or at least unproven to be possible). The closest we get is one-way functions: H(X) = Y, where computing H^-1(Y) = X is so difficult that you're better off trying to brute force a Z such that H(Z) = Y
Obfuscation (my opinion) - Any function f, such that f(a) = b where you rely on f being secret. f may be a hash function, but the "obfuscation" part implies security through obscurity. If you never saw ROT13 before, it'd be obfuscation
Encryption - Ek(X) = Y, Dl(Y) = X where E is known to everyone. k and l are keys, they may be the same (in symmetric, they are the same). Y is the ciphertext, X is the plaintext.
A hash is a one way algorithm used to compare an input with a reference without compromising the reference.
It is commonly used in logins to compare passwords and you can also find it on your receipt if you shop using a credit card. There you will find your credit card number with some digits hidden; this way you can prove with high probability that your card was used to buy the stuff while someone searching through your garbage won't be able to find the number of your card.
A very naive and simple hash is "The first 3 letters of a string".
That means the hash of "abcdefg" will be "abc". This function can obviously not be reversed which is the entire purpose of a hash. However, note that "abcxyz" will have exactly the same hash, this is called a collision. So again: a hash only proves with a certain probability that the two compared values are the same.
Another very naive and simple hash is the 5-modulus of a number, here you will see that 6,11,16 etc.. will all have the same hash: 1.
Modern hash-algorithms are designed to keep the number of collisions as low as possible but they can never be completely avoided. A rule of thumb is: the longer your hash is, the fewer collisions it has.
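The two naive hashes above are small enough to write out directly. The function names are made up for illustration.

```python
def first_three(s: str) -> str:
    """Naive hash: the first 3 letters of a string."""
    return s[:3]

def mod_five(n: int) -> int:
    """Naive hash: the number modulo 5."""
    return n % 5

print(first_three("abcdefg"), first_three("abcxyz"))  # abc abc (a collision)
print([mod_five(n) for n in (6, 11, 16)])             # [1, 1, 1]
```

Both functions throw information away, which is why neither can be reversed and why collisions are unavoidable.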
Obfuscation in cryptography is encoding the input data before it is hashed or encrypted.
This makes brute force attacks less feasible, as it gets harder to determine the correct cleartext.
That's not a bad high-level description. Here are some additional considerations:
Hashing typically reduces a large amount of data to a much smaller size. This is useful for verifying the contents of a file without having to have two copies to compare, for example.
Encryption involves storing some secret data, and the security of the secret data depends on keeping a separate "key" safe from the bad guys.
Obfuscation is hiding some information without a separate key (or with a fixed key). In this case, keeping the method a secret is how you keep the data safe.
From this, you can see how a hash algorithm might be useful for digital signatures and content validation, how encryption is used to secure your files and network connections, and why obfuscation is used for Digital Rights Management.
This is how I've always looked at it.
Hashing is deriving a value from another, using a set algorithm. Depending on the algo used, this may be one way, may not be.
Obfuscating is making something harder to read by symbol replacement.
Encryption is like hashing, except the value is dependent on another value you provide the algorithm.
A brief answer:
Hashing - creating a check field on some data (to detect when data is modified). This is a one way function and the original data cannot be derived from the hash. Typical standards for this are SHA-1, SHA256 etc.
Obfuscation - modify your data/code to confuse anyone else (no real protection). This may or may not lose some of the original data. There are no real standards for this.
Encryption - using a key to transform data so that only those with the correct key can understand it. The encrypted data can be decrypted to obtain the original data. Typical standards are DES, TDES, AES, RSA etc.
All fine, except obfuscation is not really similar to encryption - sometimes it doesn't even involve ciphers as simple as ROT13.
Hashing is one-way task of creating one value from another. The algorithm should try to create a value that is as short and as unique as possible.
Obfuscation is making something unreadable without changing semantics. It involves value transformation, removing whitespace, etc. Some forms of obfuscation can also be one-way, so it's impossible to get the starting value.
Encryption is two-way, and there's always some decryption working the other way around.
So, yes, you are mostly correct.
Obfuscation is hiding or making something harder to understand.
Hashing takes an input, runs it through a function, and generates an output that can be a reference to the input. It is not necessarily unique, a function can generate the same output for different inputs.
Encryption transforms the input into an output in a unique manner. There is a one-to-one correlation so there is no potential loss of data or confusion - the output can always be transformed back to the input with no ambiguity.
Obfuscation is merely making something harder to understand by introducing techniques to confuse someone. Code obfuscators usually do this by renaming things to remove anything meaningful from variable or method names. It's not similar to encryption in that nothing has to be decrypted to be used.
Typically, the difference between hashing and encryption is that hashing generally just employs a formula to translate the data into another form, whereas encryption uses a formula requiring key(s) to encrypt/decrypt. For example, MD5 is a hash algorithm: it takes no key, and its output cannot be turned back into the input. Base64, by contrast, is neither: it is a plain encoding that anyone can decode without a key.

Resources