So, I'm experimenting a bit with secure python programming if there is such a thing :D). This is half a real world project, and half just a training exercise. So both theory and practical advice is appreciated.
I am using an AES salted encryption script which encrypts text in the manner ...
data = "hello"
encrypted_text = encryption(data, salt_word)
print encrypted_text (responds with "huio37*\xhuws%hwj2\xkuyq\x#5tYtd\xhdtye")
plain_text = decryption(encrypted_text, salt_word)
print plain_text (responds with "hello")
QUESTION: If you know the values of "encrypted_text" and "plain_text "...can you reverse engineer the "salt_word". AND if so, how hard is it (12 seconds on a PC, or 20 years on a Cray?)
My understanding from the entire point of AES is, no you can't. But I'm just not that familiar.
I'm using an insignificantly modified version of the script here:
Encrypt / decrypt data in python with salt
Basically, it uses the "salt" to encrypt "data". salt is a string, and data comes out as a string of encrypted characters.
Using decrypt, with the same salt string, returns the data to normal text.
When you say salt with AES, what specifically are you talking about? Is this the IV to CBC or CTR mode? Is it a salt used for key generation with a passphrase (PBKDF2)?
In either case, salt's aren't usually kept secret. IV's can be transmitted in the clear, and with hashing schemes, are usually stored along with the hash (otherwise it'd be impossible to compute the hash a second time).
I've given a previous answer as to why it's safe to transmit IV's in the clear, which you can find here.
Related
I have a tcl/tk based tool, which uses network password for authentication. Issue is that, it is saving password in the logs/history. So objective is to encrypt the password.
I tried to use aes package. But at the very beginning aes::init asks for keydata and initialization vector (16 byte). So how to generate IV and keydata. Is is some Random number? I am a novice in encryption algorithms.
If you have the password in the logs/history, why not fix the bug of logging/storing it in the first place?
Otherwise there are distinct things you might want:
A password hashing scheme like PBKDF2, bcrypt, argon2 etc. to store a password in a safe way and compare some user input to it. This is typically the case when you need to implement some kind of authentication with passwords on the server side.
A password encryption and protection scheme like AES. You need a password to authenticate to some service automatically, and it requires some form of cleartext password.
You have some secret data and need to securly store it to in non cleartext form.
If you have case 1, don't use the aespackage, it is the wrong tool for the job. If you have case 2, the aes package might help you, but you just exchanged the problem of keeping the password secret with the other problem of keeping the key secret (not a huge win). So the only viable case where aes is an option might be 3.
Lets assume you need to store some secret data in a reversible way, e.g. case 3 from above.
AES has a few possible modes of operation, common ones you might see are ECB, CBC, OFB, GCM, CTR. The Tcllib package just supports ECB and CBC, and only CBC (which is the default) is really an option to use.
Visit Wikipedia for an example why you should never use ECB mode.
Now back to your actual question:
Initialization Vector (IV)
This is a random value you pick for each encryption, it is not secret, you can just publish it together with the encrypted data. Picking a random IV helps to make two encrypted blocks differ, even if you use the same key and cleartext.
Secret Key
This is also a random value, but you must keep it secret, as it can be used for encryption and decryption. You often have the same key for multiple encryptions.
Where to get good randomness?
If you are on Linux, BSD or other unixoid systems just read bytes from /dev/urandom or use a wrapper for getrandom(). Do NOT use Tcls expr {rand()} or similar pseudorandom number generators (PRNG). On Windows TWAPI and the CryptGenRandom function would be the best idea, but sadly there is no Tcl high level wrapper included.
Is that enough?
Depends. If you just want to hide a bit of plaintext from cursory looks, maybe. If you have attackers manipulating your data or actively trying to hack your system, less so. Plain AES-CBC has a lot of things you can do wrong, and even experts did wrong (read about SSL/TLS 1.0 problems with AES-CBC).
Final words: If you are a novice in encryption algorithms, be sure you understand what you want and need to protect, there are a lot of pitfalls.
If I read the Tcler's Wiki page on aes, I see that I encrypt by doing this:
package require aes
set plaintext "Some super-secret bytes!"
set key "abcd1234dcba4321"; # 16 bytes
set encrypted [aes::aes -dir encrypt -key $key $plaintext]
and I decrypt by doing:
# Assuming the code above was run...
set decrypted [aes::aes -dir decrypt -key $key $encrypted]
Note that the decrypted text has NUL (zero) bytes added on the end (8 of them in this example) because the encryption algorithm always works on blocks of 16 bytes, and if you're working with non-ASCII text then encoding convertto and encoding convertfrom might be necessary.
You don't need to use aes::init directly unless you are doing large-scale streaming encryption. Your use case doesn't sound like it needs that sort of thing. (The key data is your “secret”, and the initialisation vector is something standardised that usually you don't need to set.)
For
`BDK = "0123456789ABCDEFFEDCBA9876543210"` `KSN = "FFFF9876543210E00008"`
The ciphertext generated was below
"C25C1D1197D31CAA87285D59A892047426D9182EC11353C051ADD6D0F072A6CB3436560B3071FC1FD11D9F7E74886742D9BEE0CFD1EA1064C213BB55278B2F12"`
which I found here. I know this cipher-text is based on BDK and KSN but how this 128 length cipher text was generated? What are steps involved in it or algorithm used for this? Could someone explain in simple steps. I found it hard to understand the documents I got while googled.
Regarding DUKPT , there are some explanations given on Wiki. If that doesn't suffice you, here goes some brief explanation.
Quoting http://www.maravis.com/library/derived-unique-key-per-transaction-dukpt/
What is DUKPT?
Derived Unique Key Per Transaction (DUKPT) is a key management scheme. It uses one time encryption keys that are derived from a secret master key that is shared by the entity (or device) that encrypts and the entity (or device) that decrypts the data.
Why DUKPT?
Any encryption algorithm is only as secure as its keys. The strongest algorithm is useless if the keys used to encrypt the data with the algorithm are not secure. This is like locking your door with the biggest and strongest lock, but if you hid the key under the doormat, the lock itself is useless. When we talk about encryption, we also need to keep in mind that the data has to be decrypted at the other end.
Typically, the weakest link in any encryption scheme is the sharing of the keys between the encrypting and decrypting parties. DUKPT is an attempt to ensure that both the parties can encrypt and decrypt data without having to pass the encryption/decryption keys around.
The Cryptographic Best Practices document that VISA has published also recommends the use of DUKPT for PCI DSS compliance.
How DUKPT Works
DUKPT uses one time keys that are generated for every transaction and then discarded. The advantage is that if one of these keys is compromised, only one transaction will be compromised. With DUKPT, the originating (say, a Pin Entry Device or PED) and the receiving (processor, gateway, etc) parties share a key. This key is not actually used for encryption. Instead, another one time key that is derived from this master key is used for encrypting and decrypting the data. It is important to note that the master key should not be recoverable from the derived one time key.
To decrypt data, the receiving end has to know which master key was used to generate the one time key. This means that the receiving end has to store and keep track of a master key for each device. This can be a lot of work for someone that supports a lot of devices. A better way is required to deal with this.
This is how it works in real-life: The receiver has a master key called the Base Derivation Key (BDK). The BDK is supposed to be secret and will never be shared with anyone. This key is used to generate keys called the Initial Pin Encryption Key (IPEK). From this a set of keys called Future Keys is generated and the IPEK discarded. Each of the Future keys is embedded into a PED by the device manufacturer, with whom these are shared. This additional derivation step means that the receiver does not have to keep track of each and every key that goes into the PEDs. They can be re-generated when required.
The receiver shares the Future keys with the PED manufacturer, who embeds one key into each PED. If one of these keys is compromised, the PED can be rekeyed with a new Future key that is derived from the BDK, since the BDK is still safe.
Encryption and Decryption
When data needs to be sent from the PED to the receiver, the Future key within that device is used to generate a one time key and then this key is used with an encryption algorithm to encrypt the data. This data is then sent to the receiver along with the Key Serial Number (KSN) which consists of the Device ID and the device transaction counter.
Based on the KSN, the receiver then generates the IPEK and from that generates the Future Key that was used by the device and then the actual key that was used to encrypt the data. With this key, the receiver will be able to decrypt the data.
Source
First, let me quote the complete sourcecode you linked and of which you provided only 3 lines...
require 'bundler/setup'
require 'test/unit'
require 'dukpt'
class DUKPT::DecrypterTest < Test::Unit::TestCase
def test_decrypt_track_data
bdk = "0123456789ABCDEFFEDCBA9876543210"
ksn = "FFFF9876543210E00008"
ciphertext = "C25C1D1197D31CAA87285D59A892047426D9182EC11353C051ADD6D0F072A6CB3436560B3071FC1FD11D9F7E74886742D9BEE0CFD1EA1064C213BB55278B2F12"
plaintext = "%B5452300551227189^HOGAN/PAUL ^08043210000000725000000?\x00\x00\x00\x00"
decrypter = DUKPT::Decrypter.new(bdk, "cbc")
assert_equal plaintext, decrypter.decrypt(ciphertext, ksn)
end
end
Now, you're asking is how the "ciphertext" was created...
Well, first thing we know is that it is based on "plaintext", which is used in the code to verify if decryption works.
The plaintext is 0-padded - which fits the encryption that is being tested by verifying decryption with this DecrypterTest TestCase.
Let's look at the encoding code then...
I found the related encryption code at https://github.com/Shopify/dukpt/blob/master/lib/dukpt/encryption.rb.
As the DecrypterTEst uses "cbc", it becomes apparent that the encrypting uses:
#cipher_type_des = "des-cbc"
#cipher_type_tdes = "des-ede-cbc"
A bit more down that encryption code, the following solves our quest for an answer:
ciphertext = des_encrypt(...
Which shows we're indeed looking at the result of a DES encryption.
Now, DES has a block size of 64 bits. That's (64/8=) 8 bytes binary, or - as the "ciphertext" is a hex-encoded text representation of the bytes - 16 chars hex.
The "ciphertext" is 128 hex chars long, which means it holds (128 hex chars/16 hex chars=) 8 DES blocks with each 64 bits of encrypted information.
Wrapping all this up in a simple answer:
When looking at "ciphertext", you are looking at (8 blocks of) DES encrypted data, which is being represented using a human-readable, hexadecimal (2 hex chars = 1 byte) notation instead of the original binary bytes that DES encryption would produce.
As for the steps involved in "recreating" the ciphertext, I tend to tell you to simply use the relevant parts of the ruby project where you based your question upon. Simply have to look at the sourcecode. The file at "https://github.com/Shopify/dukpt/blob/master/lib/dukpt/encryption.rb" pretty much explains it all and I'm pretty sure all functionality you need can be found at the project's GitHub repository. Alternatively, you can try to recreate it yourself - using the preferred programming language of your choice. You only need to handle 2 things: DES encryption/decryption and bin-to-hex/hex-to-bin translation.
Since this is one of the first topics that come up regarding this I figured I'd share how I was able to encode the ciphertext. This is the first time I've worked with Ruby and it was specifically to work with DUKPT
First I had to get the ipek and pek (same as in the decrypt) method. Then unpack the plaintext string. Convert the unpacked string to a 72 byte array (again, forgive me if my terminology is incorrect).
I noticed in the dukpt gem author example he used the following plain text string
"%B5452300551227189^HOGAN/PAUL ^08043210000000725000000?\x00\x00\x00\x00"
I feel this string is incorrect as there shouldn't be a space after the name (AFAIK).. so it should be
"%B5452300551227189^HOGAN/PAUL^08043210000000725000000?\x00\x00\x00\x00"
All in all, this is the solution I ended up on that can encrypt a string and then decrypt it using DUKPT
class Encrypt
include DUKPT::Encryption
attr_reader :bdk
def initialize(bdk, mode=nil)
#bdk = bdk
self.cipher_mode = mode.nil? ? 'cbc' : mode
end
def encrypt(plaintext, ksn)
ipek = derive_IPEK(bdk, ksn)
pek = derive_PEK(ipek, ksn)
message = plaintext.unpack("H*").first
message = hex_string_from_unpacked(message, 72)
encrypted_cryptogram = triple_des_encrypt(pek,message).upcase
encrypted_cryptogram
end
def hex_string_from_unpacked val, bytes
val.ljust(bytes * 2, "0")
end
end
boomedukpt FFFF9876543210E00008 "%B5452300551227189^HOGAN/PAUL^08043210000000725000000?"
(my ruby gem, the KSN and the plain text string)
2542353435323330303535313232373138395e484f47414e2f5041554c5e30383034333231303030303030303732353030303030303f000000000000000000000000000000000000
(my ruby gem doing a puts on the unpacked string after calling hex_string_from_unpacked)
C25C1D1197D31CAA87285D59A892047426D9182EC11353C0B82D407291CED53DA14FB107DC0AAB9974DB6E5943735BFFE7D72062708FB389E65A38C444432A6421B7F7EDD559AF11
(my ruby gem doing a puts on the encrypted string)
%B5452300551227189^HOGAN/PAUL^08043210000000725000000?
(my ruby gem doing a puts after calling decrypt on the dukpt gem)
Look at this: https://github.com/sgbj/Dukpt.NET, I was in a similar situation where i wondered how to implement dukpt on the terminal when the terminal has its own function calls which take the INIT and KSN to create the first key, so my only problem was to make sure the INIT key was generated the same way on the terminal as it is in the above mentioned repo's code, which was simple enough using ossl encryption library for 3des with ebc and applying the appropriate masks.
Right now, this is what I am doing:
1. SHA-1 a password like "pass123", use the first 32 characters of the hexadecimal decoding for the key
2. Encrypt with AES-256 with just whatever the default parameters are
^Is that secure enough?
I need my application to encrypt data with a password, and securely. There are too many different things that come up when I google this and some things that I don't understand about it too. I am asking this as a general question, not any specific coding language (though I'm planning on using this with Java and with iOS).
So now that I am trying to do this more properly, please follow what I have in mind:
Input is a password such as "pass123" and the data is
what I want to encrypt such as "The bank account is 038414838 and the pin is 5931"
Use PBKDF2 to derive a key from the password. Parameters:
1000 iterations
length of 256bits
Salt - this one confuses me because I am not sure where to get the salt from, do I just make one up? As in, all my encryptions would always use the salt "F" for example (since apparently salts are 8bits which is just one character)
Now I take this key, and do I hash it?? Should I use something like SHA-256? Is that secure? And what is HMAC? Should I use that?
Note: Do I need to perform both steps 2 and 3 or is just one or the other okay?
Okay now I have the 256-bit key to do the encryption with. So I perform the encryption using AES, but here's yet another confusing part (the parameters).
I'm not really sure what are the different "modes" to use, apparently there's like CBC and EBC and a bunch of others
I also am not sure about the "Initialization Vector," do I just make one up and always use that one?
And then what about other options, what is PKCS7Padding?
For your initial points:
Using hexadecimals clearly splits the key size in half. Basically, you are using AES-128 security wise. Not that that is bad, but you might also go for AES-128 and use 16 bytes.
SHA-1 is relatively safe for key derivation, but it shouldn't be used directly because of the existence/creation of rainbow tables. For this you need a function like PBKDF2 which uses an iteration count and salt.
As for the solution:
You should not encrypt PIN's if that can be avoided. Please make sure your passwords are safe enough, allow pass phrases.
Create a random number per password and save the salt (16 bytes) with the output of PBKDF2. The salt does not have to be secret, although you might want to include a system secret to add some extra security. The salt and password are hashed, so they may have any length to be compatible with PBKDF2.
No, you just save the secret generated by the PBKDF2, let the PBKDF2 generate more data when required.
Never use ECB (not EBC). Use CBC as minimum. Note that CBC encryption does not provide integrity checking (somebody might change the cipher text and you might never know it) or authenticity. For that, you might want to add an additional MAC, HMAC or use an encryption mode such as GCM. PKCS7Padding (identical to PKCS5Padding in most occurences) is a simple method of adding bogus data to get N * [blocksize] bytes, required by block wise encryption.
Don't forget to prepend a (random) IV to your cipher text in case you reuse your encryption keys. An IV is similar to a salt, but should be exactly [blocksize] bytes (16 for AES).
First off, I would like to ask if any of you know of an encryption algorithm that uses a key to encrypt the data, but no key to decrypt the data. This seems highly unlikely, if not impossible to me, so sorry if it's a stupid question.
My final question is, say you have access to the plain text data before it is encrypted, the key used to encrypt the plain text data, and the resulting encrypted data, would figuring out which algorithm used to encrypt the data be feasible?
First off, I would like to ask
if any of you know of an encryption
algorithm that uses a key to encrypt
the data, but no key to decrypt the
data.
No. There are algorithms that use a different key to decrypt than to encrypt, but a keyless method would rely on secrecy of the algorithm, generally regarded as a poor idea.
My final question is, say you have
access to the plain text data before
it is encrypted, the key used to
encrypt the plain text data, and the
resulting encrypted data, would
figuring out which algorithm used to
encrypt the data be feasible?
Most likely yes, especially given the key. A good crypto algorithm relies on the secrecy of the key, and the key alone. See kerckhoff's principle.
Also if a common algorithm is used it would be a simple matter of trial and error, and besides cryptotext often is accompanied by metadata which tells you algorithm details.
edit: as per comments, you may be thinking of digital signature (which requires a secret only on the sender side), a hash algorithm (which requires no key but isn't encryption), or a zero-knowledge proof (which can prove knowledge of a secret without revealing it).
Abstractly, we can think of the encryption system this way:
-------------------
plaintext ---> | algorithm & key | ---> ciphertext
-------------------
The system must guarantee the following:
decrypt(encrypt(plaintext, algorithm, key), algorithm, key) = plaintext
First off, I would like to ask
if any of you know of an encryption
algorithm that uses a key to encrypt
the data, but no key to decrypt the
data.
Yes, in such a system the key is redundant; all the "secrecy" lies in the algorithm.
My final question
is, say you have access to the plain
text data before it is encrypted, the
key used to encrypt the plain text
data, and the resulting encrypted
data, would figuring out which
algorithm used to encrypt the data be
feasible?
In practice, you'll probably have a small space of algorithms, so a simple brute-force search is feasible. However, there may be more than one algorithm that fits the given information. Consider the following example:
We define the following encryption and decryption operations, where plaintext, ciphertext, algorithm, and key are real numbers (assume algorithm is nonzero):
encrypt(plaintext, algorithm, key) = algorithm x (plaintext + key) = ciphertext
decrypt(ciphertext, algorithm, key) = ciphertext/algorithm - key = plaintext
Now, suppose that plaintext + key = 0. We have ciphertext = 0 for any choice of algorithm. Hence, we cannot deduce the algorithm used.
First off, I would like to ask if any of you know of an encryption algorithm that uses a key to encrypt the data, but no key to decrypt the data.
What are you getting at? It's trivial to come up with a pair of functions that fits the letter of the specification, but without knowing the intent it's hard to give a more helpful answer.
say you have access to the plain text data before it is encrypted, the key used to encrypt the plain text data, and the resulting encrypted data, would figuring out which algorithm used to encrypt the data be feasible?
If the algorithm is any good the output will be indistinguishable from random noise, so there is no analytic solution to this. As a practical matter, there are only so many trusted algorithms in wide use. Trying each one in turn would be quick, but would be complicated by the fact that an implementation has some freedom with regard to things like byte order (little-endian vs big-endian), key derivation (if you had a pass-phrase instead of the actual cryptographic key itself), encryption modes and padding.
As frankodwyer points out, this situation is not part of usual threat models. This would work in your favor, as it makes it more likely that the algorithm is a well-known one.
The best you could do without a known key in the decoder would be to add a bit of obscurity. For example, if the first step of the decode algorythm is to strip out everything except for every tenth character, then your encode key may be used to seed some random garbage for nine out of every ten characters. Thus, with different keys you could achieve different encoded results which would be decoded to the same message, with no key necessary for the decoder.
However, this does not add much real security and should not be solely relied on to protect crucial data. I'm just thinking of a case where it would be possible to do so yes I suppose it could - if you were just trying to prove a point or add one more level of security.
I don't believe that there is such an algorithm that would use a key to encrypt, but not to decrypt. (Silly answers like a 26 character Caesar cipher aside...)
To your second question, yes; it just depends on how much time you're willing to spend on it. In theoretical cryptography it is assumed that the algorithm can always be determined. Whether that be through theft of the algorithm or a physical machine, or as in your case having a plain text and cipher text pair.
Is it recommended that I use an initialization vector to encrypt/decrypt my data? Will it make things more secure? Is it one of those things that need to be evaluated on a case by case basis?
To put this into actual context, the Win32 Cryptography function, CryptSetKeyParam allows for the setting of an initialization vector on a key prior to encrypting/decrypting. Other API's also allow for this.
What is generally recommended and why?
An IV is essential when the same key might ever be used to encrypt more than one message.
The reason is because, under most encryption modes, two messages encrypted with the same key can be analyzed together. In a simple stream cipher, for instance, XORing two ciphertexts encrypted with the same key results in the XOR of the two messages, from which the plaintext can be easily extracted using traditional cryptanalysis techniques.
A weak IV is part of what made WEP breakable.
An IV basically mixes some unique, non-secret data into the key to prevent the same key ever being used twice.
In most cases you should use IV. Since IV is generated randomly each time, if you encrypt same data twice, encrypted messages are going to be different and it will be impossible for the observer to say if this two messages are the same.
Take a good look at a picture (see below) of CBC mode. You'll quickly realize that an attacker knowing the IV is like the attacker knowing a previous block of ciphertext (and yes they already know plenty of that).
Here's what I say: most of the "problems" with IV=0 are general problems with block encryption modes when you don't ensure data integrity. You really must ensure integrity.
Here's what I do: use a strong checksum (cryptographic hash or HMAC) and prepend it to your plaintext before encrypting. There's your known first block of ciphertext: it's the IV of the same thing without the checksum, and you need the checksum for a million other reasons.
Finally: any analogy between CBC and stream ciphers is not terribly insightful IMHO.
Just look at the picture of CBC mode, I think you'll be pleasantly surprised.
Here's a picture:
http://en.wikipedia.org/wiki/Block_cipher_modes_of_operation
link text
If the same key is used multiple times for multiple different secrets patterns could emerge in the encrypted results. The IV, that should be pseudo random and used only once with each key, is there to obfuscate the result. You should never use the same IV with the same key twice, that would defeat the purpose of it.
To not have to bother keeping track of the IV the simplest thing is to prepend, or append it, to the resulting encrypted secret. That way you don't have to think much about it. You will then always know that the first or last N bits is the IV.
When decrypting the secret you just split out the IV, and then use it together with the key to decrypt the secret.
I found the writeup of HTTP Digest Auth (RFC 2617) very helpful in understanding the use and need for IVs / nonces.
Is it one of those things that need to be evaluated on a case by case
basis?
Yes, it is. Always read up on the cipher you are using and how it expects its inputs to look. Some ciphers don't use IVs but do require salts to be secure. IVs can be of different lengths. The mode of the cipher can change what the IV is used for (if it is used at all) and, as a result, what properties it needs to be secure (random, unique, incremental?).
It is generally recommended because most people are used to using AES-256 or similar block ciphers in a mode called 'Cipher Block Chaining'. That's a good, sensible default go-to for a lot of engineering uses and it needs you to have an appropriate (non-repeating) IV. In that instance, it's not optional.
The IV allows for plaintext to be encrypted such that the encrypted text is harder to decrypt for an attacker. Each bit of IV you use will double the possibilities of encrypted text from a given plain text.
For example, let's encrypt 'hello world' using an IV one character long. The IV is randomly selected to be 'x'. The text that is then encrypted is then 'xhello world', which yeilds, say, 'asdfghjkl'. If we encrypt it again, first generate a new IV--say we get 'b' this time--and encrypt like normal (thus encrypting 'bhello world'). This time we get 'qwertyuio'.
The point is that the attacker doesn't know what the IV is and therefore must compute every possible IV for a given plain text to find the matching cipher text. In this way, the IV acts like a password salt. Most commonly, an IV is used with a chaining cipher (either a stream or block cipher). In a chaining block cipher, the result of each block of plain text is fed to the cipher algorithm to find the cipher text for the next block. In this way, each block is chained together.
So, if you have a random IV used to encrypt the plain text, how do you decrypt it? Simple. Pass the IV (in plain text) along with your encrypted text. Using our fist example above, the final cipher text would be 'xasdfghjkl' (IV + cipher text).
Yes you should use an IV, but be sure to choose it properly. Use a good random number source to make it. Don't ever use the same IV twice. And never use a constant IV.
The Wikipedia article on initialization vectors provides a general overview.