I'm looking for some sort of three-way encryption algorithm: three pieces of data (components) are encrypted, and each encrypted component can be decrypted only with a key that is derived from the other two encrypted components. I thought I read about an algorithm like this in my uni years, but I can't seem to find it. I would appreciate any pointers I can get on this.
To spell out what I mean, let's say I have the values A, B, C, whose encrypted versions are denoted A', B', C'. In order to decrypt A' into A, a decryption key has to be derived from B' and C'. In this sense each component stores the actual data to be hidden, but also stores part of the decryption key for the other components, so the encryption can be lifted only when all three components are known.
Initially I thought this could be accomplished with the commutative nature of the XOR operation. My idea was to create A' = A ^ sha1(B) ^ sha1(C), B' = B ^ sha1(A) ^ sha1(C), and so on, and then combine the keys and cancel out the sha1 components. But this doesn't seem to work. So I'm looking for an algorithm that works as described above.
Requirement: Given three data pieces data_A, data_B and data_C, encrypt each in such a manner that all three are required to create the encryption key necessary to decrypt any one.
Create a master_key and split it into three components (split_key_A, split_key_B and split_key_C).
Derive an encryption_key from the master_key with PBKDF2.
Individually encrypt each data_ with the encryption_key and prepend the encrypted data with the associated split_key_.
To decrypt, get all three encryptions, split off the split_keys and combine them into the master_key. Derive the encryption_key from the master_key with PBKDF2.
Decrypt any of the data_ with the encryption_key.
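The key-handling part of the steps above can be sketched as follows. This is only an illustration under stated assumptions: the answer does not say how the master_key is split, so an XOR split is assumed here (all three components are needed to rebuild it), and the function names, the salt and the iteration count are hypothetical. The actual encryption of each data_ piece is left out because it needs a cipher from a third-party library.

```python
import hashlib
import secrets

def split_master_key(master_key: bytes) -> tuple[bytes, bytes, bytes]:
    """XOR-split (an assumption): any two components alone reveal nothing."""
    split_key_a = secrets.token_bytes(len(master_key))
    split_key_b = secrets.token_bytes(len(master_key))
    split_key_c = bytes(
        m ^ a ^ b for m, a, b in zip(master_key, split_key_a, split_key_b)
    )
    return split_key_a, split_key_b, split_key_c

def combine_split_keys(split_key_a: bytes, split_key_b: bytes, split_key_c: bytes) -> bytes:
    """XOR the three components back together to rebuild the master_key."""
    return bytes(a ^ b ^ c for a, b, c in zip(split_key_a, split_key_b, split_key_c))

def derive_encryption_key(master_key: bytes, salt: bytes, iterations: int = 100_000) -> bytes:
    """Derive a 256-bit encryption_key from the master_key with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", master_key, salt, iterations, dklen=32)

master_key = secrets.token_bytes(32)
salt = secrets.token_bytes(16)  # hypothetical: stored alongside the ciphertexts
split_key_a, split_key_b, split_key_c = split_master_key(master_key)
encryption_key = derive_encryption_key(master_key, salt)

# Later: collect all three components and re-derive the same encryption_key.
recovered = combine_split_keys(split_key_a, split_key_b, split_key_c)
assert derive_encryption_key(recovered, salt) == encryption_key
```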
Note: I am not a cryptographic expert; do not rely on this for production work, as it may not be secure.
I have recently been learning about public/private key encryption in my computer science lessons, and how it works in terms of data encryption/decryption. We also covered how it can be used for digital signatures. However, we didn't go into too much detail on how the actual keys are generated themselves.
I know that it begins with a very large number, which is then passed through some kind of keygen algorithm which returns two distinctive keys, one of which is private and the other is public. Are these algorithms known or are they black box systems? And does one user always have the same pair of keys linked to them or do they ever change at any point?
It just seems like a very mathematical issue, as the keys are linked, yet one is not deducible from the other.
I know that it begins with a very large number, which is then passed through some kind of keygen algorithm which returns two distinctive keys, one of which is private and the other is public.
Well, that's not entirely correct. Most asymmetric algorithms are of course based on large numbers, but this is not a requirement. There are, for instance, algorithms based on hashing, and hashing is based on bits/bytes, not numbers.
But yes, asymmetric algorithms usually include a specific algorithm to perform the key pair generation. For instance, asymmetric encryption consists of a triple Gen, Enc and Dec, where Gen represents the key pair generation. And the key pair of course consists of a public and a private part.
RSA basically starts off by generating two large random primes; it doesn't necessarily start with a single number.
Are these algorithms known or are they black box systems?
They are known, and they are fundamental to the security of the system. You cannot use just any numbers to perform, e.g., RSA. Note that for RSA there are different algorithms and configurations possible; not every system will use the same Gen.
And does one user always have the same pair of keys linked to them or do they ever change at any point?
That depends on the key management of the system. Usually there is some way of refreshing or regenerating keys. For instance, X.509 certificates tend to have an end date (the date of expiry or expiration date), so you cannot even keep using the corresponding private key forever; you have to refresh the certificates and keys now and then.
It just seems like a very mathematical issue, as the keys are linked, yet one is not deducible from the other.
That's generally not correct. The public key is usually easy to derive from the private key. For RSA the public exponent may not be known, but it is usually set to a fixed number (65537). This together with the modulus - also part of the private key - makes the public key. For Elliptic Curve keys a private random value is first produced and the public key is directly derived from it.
Deriving the private key from the public key, on the other hand, must be computationally infeasible; that would make no sense otherwise - it would not be very private if you could.
In RSA the two generated numbers p and q are very large prime numbers of more or less the same size. Their product N = p * q is the modulus, and the public and private exponents are derived from p and q using modular arithmetic.
The following answer on crypto.stackexchange.com describes in more detail how we can start from a random (large) number and use the Fermat and Miller-Rabin tests to reach a number that is very probably prime.
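As a rough sketch of what such a Gen step can look like for textbook RSA: pick random odd candidates, keep the ones that pass Miller-Rabin, then derive the exponents. The function names, the number of rounds and the key sizes are assumptions chosen for illustration, and real libraries add further checks and padding on top of this.

```python
import math
import secrets

def is_probable_prime(n: int, rounds: int = 40) -> bool:
    """Miller-Rabin test: False means composite, True means probably prime."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37):
        if n % p == 0:
            return n == p
    # Write n - 1 as 2^s * d with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = secrets.randbelow(n - 3) + 2  # random base in [2, n - 2]
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # a is a witness that n is composite
    return True

def random_prime(bits: int) -> int:
    """Draw random odd numbers with the top bit set until one passes the test."""
    while True:
        candidate = secrets.randbits(bits) | (1 << (bits - 1)) | 1
        if is_probable_prime(candidate):
            return candidate

def rsa_keypair(bits: int = 1024, e: int = 65537):
    """Return ((N, e), (N, d)): the public and private key of textbook RSA."""
    while True:
        p, q = random_prime(bits // 2), random_prime(bits // 2)
        phi = (p - 1) * (q - 1)
        if p != q and math.gcd(e, phi) == 1:
            return (p * q, e), (p * q, pow(e, -1, phi))

public_key, private_key = rsa_keypair(512)  # small size, just to keep the demo fast
message = 42
ciphertext = pow(message, public_key[1], public_key[0])
assert pow(ciphertext, private_key[1], private_key[0]) == message
```

Note how the public key (N, e) is trivially available to whoever ran the generation, while going from (N, e) back to d requires knowing p and q.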
I was considering hashing small blocks of sensitive ID data, but I need to maintain the full uniqueness of the data blocks as a whole once obfuscated.
So, I came up with the idea of encrypting some publicly-known input data (say, 128 bits of zeroes), and use the data I want to obfuscate as the key/password, then throw it away, thus protecting the original data from ever being discovered.
I already know about hashing algorithms, but my problem is that I need to maintain full uniqueness (generally speaking a 1:1 mapping of input to output) while still making it impossible to retrieve the actual input. A hash cannot serve this function because information is lost during the process.
It is not necessary that the data be retrieved once "encrypted". It is only to be used as an ID number from then on.
An actual GUID/UUID is not suitable here because I need to manually control the identifiers on a per-identifier basis. The IDs cannot be unknown or arbitrarily generated data.
EDIT: To clarify exactly what these identifiers are made of:
(unencrypted) 64bit Time Stamp
ID Generation Counter (one count for each filetype)
Random Data (to make multiple encrypted keys dissimilar)
MAC Address (or if that's not available, set top bit + random digits)
Other PC-Specific Information (from registry)
The whole thing should add up to 192 bits, but the encrypted section's content size(s) could vary (this is by no means a final specification).
Given:
A static IV value
Any arbitrary 128bit key
A static 128 bits of input
Are AES keys treated in a fashion that would result in a 1:1 key<---->output mapping, given the same input and IV value?
No. AES is, in the abstract, a family of permutations from which the key selects one. For any one of those permutations (i.e. for encryption under a given AES key) you will not get collisions, because permutations are bijective.
However, for two different permutations (i.e. encryption under different AES keys, which is what you have), there is no guarantee whatsoever that you don't get a collision. Indeed, because of the birthday paradox, the likelihood of a collision is probably higher than you think.
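To put a number on the birthday effect, here is a small estimate. It assumes (this is an assumption, not something AES guarantees) that encrypting a fixed block under different keys behaves like drawing independent, uniformly random 128-bit values; the function name is made up for this sketch.

```python
import math

def collision_probability(num_ids: int, output_bits: int = 128) -> float:
    """Birthday approximation: p ~ 1 - exp(-n(n-1) / 2^(bits+1))."""
    exponent = -num_ids * (num_ids - 1) / 2 ** (output_bits + 1)
    return -math.expm1(exponent)

for n in (2**32, 2**48, 2**64):
    print(f"{n:>22} IDs -> collision probability {collision_probability(n):.3e}")
```

Under that assumption, collisions become likely once the number of IDs approaches 2^64, far below the 2^128 possible outputs.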
If your IDs are short (< 1024 bits) you could just do an RSA encryption of them, which would give you what you want. You'd just need to forget the private key.
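A toy check of why this gives a 1:1 mapping, assuming unpadded ("textbook") RSA, since randomized padding would make the output non-deterministic. The tiny modulus 3233 = 61 * 53 with exponent 17 and the function name are illustrative only; a real modulus would have to be large enough that nobody can factor it.

```python
def rsa_obfuscate(value: int, n: int, e: int) -> int:
    """Map value to value^e mod n; value must be smaller than the modulus."""
    if not 0 <= value < n:
        raise ValueError("value must be in the range [0, n)")
    return pow(value, e, n)

# Toy public key: n = 61 * 53, e = 17 (far too small for real use).
N_TOY, E_TOY = 3233, 17

# Every input in [0, n) lands on a distinct output, so the map is a bijection.
outputs = {rsa_obfuscate(v, N_TOY, E_TOY) for v in range(N_TOY)}
assert len(outputs) == N_TOY
```

Because the mapping is deterministic, anyone holding the public key can still test guesses for an ID, so the random part of the identifier has to carry enough entropy.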
How hard is it, for a given ciphertext generated by a given (symmetric or asymmetric) encryption algorithm from a plaintext/key pair, to find a different plaintext/key pair that yields the same ciphertext?
And how hard is it to find two plaintext/key pairs that lead to the same ciphertext?
What led to this question, is another question that might turn out to have nothing to do with the above questions:
If you have a ciphertext and a key and want to decrypt it using some decryption routine, the routine usually tells you whether the key was correct. But how does it know? Does it look for some pattern in the resulting plaintext that indicates that the decryption was successful? Does there exist another key that results in a different plaintext which contains the pattern and is also reported "valid" by the routine?
Follow-up question inspired by answers and comments:
If the allowed plaintext/key pairs were restricted in one (or both) of the following ways:
1) The plaintext starts with the KCV (Key check value) of the key.
2) The plaintext starts with a hash value of some plaintext/key combination
Would this make collision finding infeasible? Is it even clear that such a plaintext/key pair exists?
The answer to your question, the way you phrased it, is that there is no collision resistance whatsoever.
Symmetric case
Let's presume you got a plain text PT with a length that is a multiple of the block length of the underlying block cipher. You generate a random IV and encrypt the plain text using a key K, CBC mode and no padding.
Producing a plain text PT' and key K' that produces the same cipher text CT is easy. Simply select K' at random, decrypt CT using key K' and IV, and you get your colliding PT'.
This gets a bit more complicated if you also use padding, but it is still possible. If you use PKCS#5/7 padding, just keep generating keys until you find one such that the last octet of your decrypted text PT' is 0x01. Each attempt succeeds with probability 1/256, so this will take about 256 attempts on average.
To make such collision finding infeasible, you have to use a message authentication code (MAC).
Asymmetric case
Something similar applies to RSA public key encryption. If you use no padding (which obviously isn't recommended and possibly not even supported by most cryptographic libraries), and use a public key (N,E) for encrypting PT into CT, simply generate a second key pair (N',E',D') such that N' > N; then PT' = CT^D' (mod N') will encrypt into CT under (N',E').
If you are using PKCS#1 v1.5 padding for your RSA encryption, the most significant octet after the RSA private key operation has to be 0x02, which it will be with a probability of approximately one in 256. Furthermore, the first 0x00-valued octet has to occur no sooner than at index 9, which will happen with a high probability (approximately 0.97). Hence, you will have to generate on average some 264 random RSA key pairs of the same bit size before you hit one that for some plain text could have produced the same cipher text.
If you are using RSA-OAEP padding, however, the private key decryption will fail (except with negligible probability) unless the cipher text was generated using the corresponding public key.
If you're encrypting some plaintext (length n), then there are $2^n$ unique input strings, and each must result in a unique ciphertext (otherwise it wouldn't be reversible). Therefore, all possible strings of length n are valid ciphertexts. But this is true for all keys. Therefore, for any given ciphertext, there are $2^k$ ways of obtaining it, each with a different key of length k.
Therefore, to answer your first question: very easy! Just pick an arbitrary key, and "decrypt" the ciphertext. You will get the plaintext that matches the key.
I'm not sure what you mean by "the routine usually tells you if the key was correct".
One simple way to check the validity of a key is to add a known part to the plaintext before encryption. If the decryption doesn't reproduce that, it's not the right key.
The known part should not be a constant, since that would be an instant crib. But it could, for example, be a hash of the plaintext; if hashing the decrypted text yields the same hash value, the key is probably correct (with the exception of hash collisions).
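Both points of this answer can be tried out with a toy cipher. The sketch below uses a SHA-256-based keystream XORed onto the data as a stand-in for a real cipher (an assumption made only so the example runs with the standard library; it is not a vetted construction), and the function names are hypothetical.

```python
import hashlib
import secrets

def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 of key || counter, concatenated and truncated."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def toy_crypt(key: bytes, data: bytes) -> bytes:
    """XOR data with the keystream; the same call encrypts and decrypts."""
    return bytes(d ^ k for d, k in zip(data, keystream(key, len(data))))

def seal(key: bytes, plaintext: bytes) -> bytes:
    """Append a hash of the plaintext before encrypting."""
    return toy_crypt(key, plaintext + hashlib.sha256(plaintext).digest())

def open_sealed(key: bytes, ciphertext: bytes):
    """Return the plaintext if the embedded hash matches, otherwise None."""
    data = toy_crypt(key, ciphertext)
    plaintext, tag = data[:-32], data[-32:]
    return plaintext if hashlib.sha256(plaintext).digest() == tag else None

key = secrets.token_bytes(16)
other_key = secrets.token_bytes(16)

# Without a check, any key "decrypts" to some plaintext that re-encrypts to CT.
ciphertext = toy_crypt(key, b"attack at dawn")
other_plaintext = toy_crypt(other_key, ciphertext)
assert toy_crypt(other_key, other_plaintext) == ciphertext

# With the embedded hash, a wrong key is detected.
sealed = seal(key, b"attack at dawn")
assert open_sealed(key, sealed) == b"attack at dawn"
assert open_sealed(other_key, sealed) is None
```

The embedded hash only tells a wrong key from the right one; it does not protect against deliberate tampering, which is what a MAC is for.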
Say I have a scheme that derives a key from N different inputs. Each of the inputs may not be completely secure (e.g. bad passwords), but in combination they are secure. The simple way to do this is to concatenate all of the inputs in order and use a hash of the result.
Now I want to allow key derivation (or rather key decryption) given only N-1 out of the N inputs. A simple way to do this is to generate a random key K, generate N temporary keys from the N different subsets of the input that each have one input missing (i.e. Hash(input_{1}, ..., input_{N-1}), Hash(input_{0}, input_{2}, ..., input_{N-1}), Hash(input_{0}, input_{1}, input_{3},..., input_{N-1}), ..., Hash(input_{0}, ..., input_{N-2})), then encrypt K with each of the N keys and store all of the results.
Now I want a generalized solution, where I can decrypt the key using K out of N inputs. The naive way to expand the scheme above requires storing (N choose N-K) values, which quickly becomes infeasible.
Is there a good algorithm for this, that does not entail this much storage?
I have thought about a way to use something like Shamir's Secret Sharing Scheme, but cannot think of a good way, since the inputs are fixed.
Error Correcting Codes are the most direct way to deal with the problem. They are not, however, particularly easy to implement.
The best approach would be using a Reed Solomon Code. When you derive the password for the first time you also calculate the redundancy required by the code and store it. When you want to recalculate the key you use the redundancy to correct the wrong or missing inputs.
To encrypt / create:
Take the N inputs. Turn each into a block in a good/secure way. Use Reed Solomon to generate M redundancy blocks from the N-block combination. You now have N+M blocks, of which you need only a total of N to regenerate the original N blocks.
Use the N blocks to encrypt or create a secure key.
If the first, store the encrypted key and the M redundancy blocks. If the second, store only the M redundancy blocks.
To decrypt / retrieve:
Take N - R correct input blocks, where R <= M. Combine them with the redundancy blocks you stored to create the original N blocks. Use the original N blocks to decrypt or create the secure key.
(Thanks to https://stackoverflow.com/users/492020/giacomo-verticale : This is essentially what he/she said, but I think a little more explicit / clearer.)
Shamir's Secret Sharing is a technique used when you want to split a secret into multiple shares such that only a combination of at least k shares reveals the initial secret. If you are not sure the party creating the shares is honest and you want to verify this, you use verifiable secret sharing. Both are based on polynomial interpolation.
One approach would be to generate a purely random key (or derive it by hashing all of the inputs, if you want to avoid an RNG for some reason), split it using a k-of-n threshold scheme, and encrypt each share using the individual password inputs (e.g. send them through PBKDF2 with 100000 iterations and then encrypt/MAC with AES-CTR/HMAC). This would require less storage than storing hash subsets: roughly N * (share size + salt size + MAC size).
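The k-of-n threshold step could look roughly like the following sketch of Shamir's scheme over a prime field. The field prime (2^521 - 1) and the function names are assumptions made for illustration; encrypting each share under its password-derived key is not shown.

```python
import secrets

# Assumed field: the Mersenne prime 2^521 - 1, larger than any 256-bit key.
PRIME = 2**521 - 1

def split_secret(secret: int, k: int, n: int) -> list[tuple[int, int]]:
    """Return n points (x, f(x)) of a random degree k-1 polynomial with f(0) = secret."""
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(k - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation of f(x) mod PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares

def recover_secret(shares: list[tuple[int, int]]) -> int:
    """Lagrange-interpolate f(0) from at least k distinct shares."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % PRIME
                den = den * (xi - xj) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret

key = secrets.randbits(256)
shares = split_secret(key, k=3, n=5)
assert recover_secret(shares[:3]) == key
assert recover_secret([shares[0], shares[2], shares[4]]) == key
```

Each of the 5 shares would then be encrypted under the key derived from one input, so that any 3 correct inputs unlock enough shares to rebuild the key.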
Rather than simply allowing a few errors out of a large number of inputs, you should divide the inputs up into groups and allow some number of errors in each group. If you were to allow 4 errors out of 64 inputs, you would have to have 15,249,024 encrypted keys counting ordered choices of the 4 missing inputs (64 x 63 x 62 x 61), or 635,376 counting unordered subsets (64 choose 4). If you break that up into two groups of 32, allowing two errors per group, then you would only need to have 1984 encrypted keys by the same ordered count (2 x 32 x 31), or 992 by the unordered count.
Once you have decrypted the key information from each group, use that as input to decrypt the key that you ultimately want.
Also, the keys acquired from each group must not be trivial in comparison to the key that you ultimately want. Do not simply break up a 256-bit key into eight 32-bit key pieces. Doing this would allow someone who could decrypt 7 of those key pieces to attempt a brute-force attack on the last piece. If you want access to a 256-bit key, then you must work with 256-bit keys for the whole procedure.
Is it possible to encrypt in one order and decrypt in another? For example I've got the following:
plain_text.txt
Public/Private Key pair 1
Public/Private Key pair 2
Example
Encryption:
public1(public2(plain_text.txt))
Decryption:
private1(private2(encrypted))
Is there any encryption algorithm that allows this? Is it even possible?
In most cases you can't change the order of the decryption.
Schemes that allow decryption to be reordered are called commutative cryptosystems.
One public key cryptosystem that can be used to build a commutative cryptosystem is
the ElGamal encryption.
Here is just the main idea: Assume $g$ is a generator of
a suitable group $G$, for which computing discrete logarithms is hard.
Let $x_A$ and $x_B$ be two private keys, and let
$h_A = g^{x_A}$ and
$h_B = g^{x_B}$
be the corresponding public keys. Both key pairs use the same group
$G$ (i.e. the same modulus $p$ if we use $G = (\mathbb{Z}/p\mathbb{Z})^*$). It is one advantage of the
ElGamal scheme that it is still secure if two users share the same group (or modulus).
RSA, on the other hand, will be insecure.
Encrypting a message $m$ with $h_A$ gives the ciphertext
$(m \, h_A^{r},\; g^{r})$,
where $r$ is chosen at random.
Note that knowing the secret key $x_A$ allows decryption because
$(g^{r})^{x_A} = h_A^{r}$.
To encrypt the ciphertext a second time, one would first re-encrypt the existing
ciphertext with A's public key:
choose a random $r'$ and compute
$(m \, h_A^{r} h_A^{r'},\; g^{r} g^{r'}) =
(m \, h_A^{r+r'},\; g^{r+r'})$.
The result is just another valid encryption with A's public key.
This re-encryption is necessary to avoid an attack that works for example
against RSA with equal modulus as shown below.
Next, one would encrypt with B's public key, giving
$(m \, h_A^{r+r'} h_B^{s},\; g^{r+r'},\; g^{s})$.
Decryption is possible in either order, e.g. knowing $x_A$ allows one to compute
$(g^{r+r'})^{x_A} = h_A^{r+r'}$
and hence one can compute
$(m \, h_B^{s},\; g^{s})$,
which is just what we want: an encryption of $m$ with B's public key.
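The layering described above can be tried out with a small sketch. The parameters are toy assumptions (a 127-bit Mersenne prime modulus and base 3, not checked to generate a suitable subgroup), the function names are made up, and none of the subtleties of a secure implementation are addressed.

```python
import secrets

# Toy parameters (assumptions for illustration only): far too small and
# unvetted for real use.
P = 2**127 - 1
G = 3

def keygen():
    """Return (private x, public h = g^x mod p)."""
    x = secrets.randbelow(P - 2) + 1
    return x, pow(G, x, P)

def add_layer(body, h):
    """Multiply body by h^r for a fresh random r; return (new body, g^r)."""
    r = secrets.randbelow(P - 2) + 1
    return body * pow(h, r, P) % P, pow(G, r, P)

def rerandomize(body, g_r, h):
    """Turn (m h^r, g^r) into (m h^(r+r'), g^(r+r')) for a fresh r'."""
    r2 = secrets.randbelow(P - 2) + 1
    return body * pow(h, r2, P) % P, g_r * pow(G, r2, P) % P

def remove_layer(body, g_r, x):
    """Divide out (g^r)^x = h^r using the private key x."""
    return body * pow(pow(g_r, x, P), -1, P) % P

x_a, h_a = keygen()
x_b, h_b = keygen()
m = 123456789

body, gr_a = add_layer(m, h_a)             # (m h_A^r, g^r)
body, gr_a = rerandomize(body, gr_a, h_a)  # (m h_A^(r+r'), g^(r+r'))
body, gr_b = add_layer(body, h_b)          # adds the factor h_B^s and the value g^s

# The two layers can be removed in either order.
a_then_b = remove_layer(remove_layer(body, gr_a, x_a), gr_b, x_b)
b_then_a = remove_layer(remove_layer(body, gr_b, x_b), gr_a, x_a)
assert a_then_b == b_then_a == m
```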
There are a number of subtleties that need to be observed to get a secure implementation.
And getting this right isn't easy.
For more info see for example the PhD thesis of Stephen Weis, which contains a chapter on commutative encryption.
There are a number of problems if the same idea is tried with "textbook RSA". First, to make the encryption commutative it is necessary that both users A and B share the same modulus.
E.g. A uses $(n, e_A, d_A)$ and B uses $(n, e_B, d_B)$, where $n$ is the modulus, $e_A, e_B$ are the public keys and $d_A, d_B$ the secret keys. However, knowing for example $(n, e_A, d_A)$ allows one to factor $n$, and hence compute B's secret key, which is of course one big flaw.
Now we could encrypt $m$ as
$m^{e_A} \bmod n$,
encrypt again as
$m^{e_A e_B} \bmod n$,
decrypt with A's secret key giving
$m^{e_B} \bmod n$,
and decrypt again with B's secret key to get $m$. That looks fine until one notices that
an attacker who can intercept the two ciphertexts $c = m^{e_A} \bmod n$ and $c' = m^{e_B} \bmod n$ can (provided $e_A$ and $e_B$ are coprime) use the extended Euclidean algorithm to find $r, s$ such that
$r e_A + s e_B = 1$
and then compute
$m = c^{r} (c')^{s} \bmod n$.
The idea also works against the solution using RC4 proposed in another answer. Weis's thesis contains a detailed description of the attack.
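The common-modulus attack on textbook RSA can be checked with toy numbers. The modulus 3233 = 61 * 53 and the exponents 17 and 7 are illustrative assumptions, as are the function names; the negative exponent in `pow` relies on Python 3.8 or later.

```python
def egcd(a: int, b: int):
    """Return (g, r, s) with r*a + s*b == g == gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, r, s = egcd(b, a % b)
    return g, s, r - (a // b) * s

def common_modulus_attack(n: int, e_a: int, e_b: int, c_a: int, c_b: int) -> int:
    """Recover m from c_a = m^e_a mod n and c_b = m^e_b mod n, with no private key."""
    g, r, s = egcd(e_a, e_b)
    if g != 1:
        raise ValueError("the attack needs coprime public exponents")
    # One of r, s is negative; pow() then uses the modular inverse of the base.
    return pow(c_a, r, n) * pow(c_b, s, n) % n

n, e_a, e_b = 3233, 17, 7  # toy shared modulus and two public exponents
m = 65
c_a = pow(m, e_a, n)
c_b = pow(m, e_b, n)
assert common_modulus_attack(n, e_a, e_b, c_a, c_b) == m
```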
With most public implementations of RSA it would not be possible. The decryption routine expects the plaintext to be in a specific format (i.e. properly padded) and fails if it's not. Likewise, on encryption it would apply padding to the plaintext instead of using the blob as it is.
/*
The math of RSA allows for that, AFAIK, as long as the moduli of the two keys are coprime (which is true almost always). But you'll probably have to roll your own implementation.
*/
Another problem is that the numeric value of the plaintext block should be smaller than the modulus. So the modulus of the first key should be smaller than that of the second key; otherwise there is no guarantee that the first ciphertext would be a proper plaintext for the second encryption round.
OpenSSL has, I vaguely recall, a no-padding mode. You might have some luck with that.
EDIT: in general, coming up with your own cryptographic primitives is a bad idea in 99.9% of cases. If your interest is purely academic, then be my guest; but if you're after a specific piece of applied functionality (i.e. encrypt something so that the consent of two nontrusting parties is needed to decrypt), then you're definitely on the wrong track.
EDIT2: the math of RSA allows for that if the moduli are identical. Scratch paragraph two. But having two keys share the same modulus compromises security very much. If Alice has private key (m, d) and Cindy has private key (m, d') - assuming same m - then Alice can determine d' in O(m) time, given a single plaintext/ciphertext pair from Cindy. Not good.
With public key/private key encryption, the answer is no. PubK1(PubK2(plain.text)) => encrypted.text. You must decrypt with PrivK2(PrivK1(encrypted.text)).
However, if you use a symmetric stream cipher such as RC4, then you could change the order of the decryption (A xor B xor C = C xor B xor A). But that is not a public/private key algorithm obviously.
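The order independence of XOR-based stream encryption is easy to check. RC4 is not in the Python standard library, so this sketch substitutes a toy SHA-256-based keystream (an assumption made only for illustration, not a recommended cipher); the function names and sample keys are made up.

```python
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream standing in for RC4: SHA-256 of key || counter."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def xor_layer(key: bytes, data: bytes) -> bytes:
    """Add or remove one encryption layer by XORing with the key's keystream."""
    return bytes(d ^ k for d, k in zip(data, keystream(key, len(data))))

message = b"contents of plain_text.txt"
key1, key2 = b"first key", b"second key"

wrapped = xor_layer(key1, xor_layer(key2, message))

# The layers come off in either order because XOR is commutative.
assert xor_layer(key2, xor_layer(key1, wrapped)) == message
assert xor_layer(key1, xor_layer(key2, wrapped)) == message
```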
This would only be true if the encryption algorithm behaved as a specific kind of mathematical group. Most (all?) block encryption algorithms are not such groups.
AFAIK this should be possible with slight modification to RSA. I do not know of any tool which can actually do it though.
No, it does not work. Very simply, you cannot guarantee unique decryption because one modulus is bigger than the other.
EDIT: I'm assuming this is RSA. If not, then I'd have to think about some of the others.
EDIT2: If you are always willing to use the smaller modulus first, then it does work.