What is XOR Encryption? - encryption

I have heard about people starting encryption and thought it may be something I would like, so I checked XOR and can't make any sense of it.
So can someone explain to me what XOR is ?

You take a key, such as 0101, and XOR it with your message (in binary form) to get the encrypted message.
0101 XOR <-- key
1011 <---- original message
----
1110 <-- send message
You send 1110 to your receiver. That receiver, then takes the received string and XORs it with the key to obtain the original message:
1110 XOR <--- received message
0101 <-- key
----
1011 <--- original message

XOR, or 'exclusive or', is a two-operand logical operation defined as:
(a and not b) or (not a and b)
a b result
0 0 0
1 0 1
0 1 1
1 1 0
The critical feature of XOR with respect to encryption is that it is reversible, i.e. where C = A XOR B, you can get back A using A = C XOR B.
So for a stream of plaintext A, and a key of the same length B, you can generate cryptotext C, and send that to the recipient.
The recipient, who has a copy of B in his safe, can do C XOR B and regenerate A.
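A minimal Python sketch of that round trip, using the 4-bit values 0101 and 1011 purely as illustrative inputs (the variable names are mine):

```python
# Illustrative values only: a 4-bit key B and a 4-bit plaintext A.
key = 0b0101
message = 0b1011

ciphertext = message ^ key      # C = A XOR B
recovered = ciphertext ^ key    # A = C XOR B

print(format(ciphertext, "04b"))  # 1110
print(format(recovered, "04b"))   # 1011
```

The same `^` operator is used in both directions, which is the whole point.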

XOR is a logical operation, pronounced exclusive or. It can be used to cipher messages simply and fast. You can see a truth table for this operation here: http://mathworld.wolfram.com/XOR.html
quasi-pseudo code implementation (via http://www.evanfosmark.com/2008/06/xor-encryption-with-python/):
#!/usr/bin/env python3
from itertools import cycle

def xor_crypt_string(data, key):
    # XOR each character with the matching character of the repeated key
    return ''.join(chr(ord(x) ^ ord(y)) for (x, y) in zip(data, cycle(key)))

my_data = "Hello. This is a secret message! How fun."
my_key = "firefly"
# Do the actual encryption
encrypted = xor_crypt_string(my_data, key=my_key)
print(encrypted)
print('---->')
# This will obtain the original data from the encrypted
original = xor_crypt_string(encrypted, key=my_key)
print(original)
Output:
. BY2F
FRR
DF$IB
---->
Hello. This is a secret message! How fun.

I wrote a blog about XOR encryption http://programmingconsole.blogspot.in/2013/10/xor-encryption-for-alphabets.html
Mathematically, the XOR cipher is an additive cipher, an encryption algorithm that operates according to the following principle:
(A * !B) + (!A * B)
A B A XOR B
0 0 0
1 0 1
0 1 1
1 1 0
The XOR operator is a logical operator, just like the AND (*) and OR (+) operators.
To decrypt the cipher we just need to XOR the ciphertext with the key to regain the original text.
The XOR operator is an extremely common component in complex encryption algorithms.
Such encryption can easily be broken with frequency analysis if a constant repeating key is used.
But if we change the key after each encryption, breaking it becomes computationally very hard.
Such a cipher is called a stream cipher, in which every next bit is encrypted using a different pseudo-random key bit. This kind of encryption was used by the Germans in their Lorenz cipher.
With a truly random key stream the cipher is theoretically unbreakable, but impractical to use, since the key must be as long as the message.
I would recommend you to watch
BBC: Code Breakers Bletchley Parks lost Heroes documentary
It will give you real insight into the world of cryptography and encrypted bits. How important is cryptography? Well, it was the cause for the invention of computers.

On the simplest level, reversible operations such as XOR (pronounced "exclusive OR") form the foundation of most cryptography.
XOR acts like a toggle switch where you can flip specific bits on and off. If you want to "scramble" a number (a pattern of bits), you XOR it with a "secret" number. If you take that scrambled number and XOR it again with the same secret number, you get your original number back.
Encrypt a number (210) with a secret "key" (145).
210 XOR 145 gives you 67 ←-- your "scrambled" result
|
+ now unscramble it +
|
↓
67 XOR 145 gives you 210 ←-- and back to your original number
This is a very rudimentary example. When you encrypt a sequence of numbers (or text or any pattern of bits) with XOR, you have a very basic cipher algorithm.

XOR is short for 'exclusive or'. A XOR B is true if A is true, or if B is true, but not if both A and B are true.
It is used for cryptography because A XOR B XOR A is equal to B, so you can use A as a key for both encryption and decryption.

It should be noted, that this method of encryption can hardly be considered secure. If you encrypt any common file (PNGs, JPGs, etc.) where the header is well known, the key can easily be derived from the encrypted content and the known header.
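To make the known-header attack concrete, here is a rough Python sketch. The 8-byte PNG signature is the standard one; the key, the placeholder image body and the helper name `xor_bytes` are made up for the example.

```python
from itertools import cycle

# Standard 8-byte PNG file signature.
PNG_SIGNATURE = bytes([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A])

def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR data with a repeating key."""
    return bytes(d ^ k for d, k in zip(data, cycle(key)))

# Hypothetical victim data: a PNG signature followed by placeholder bytes.
secret_key = b"firefly"
plaintext = PNG_SIGNATURE + b"...rest of the image data..."
ciphertext = xor_bytes(plaintext, secret_key)

# The attacker knows only the ciphertext and the well-known signature.
keystream_prefix = bytes(c ^ p for c, p in zip(ciphertext, PNG_SIGNATURE))
print(keystream_prefix)  # b'fireflyf'
```

Eight known bytes are enough here to expose the whole 7-byte key plus the start of its repetition.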

XOR encryption can also be used in cipher block chaining. XOR CBC is used as an addition to many encryption implementations. There is a google code project that makes use of this by itself, although XOR alone is not very secure: http://code.google.com/p/xorencryption/

Related

RSA/ECB/PKCS1 Padding & AES/CBC/PKCS5Padding Encryption / Decryption

I have an API to call where I have to encrypt my data using RSA/ECB/PKCS1 Padding & AES/CBC/PKCS5PADDING.
Sample Data: {"KEY":"VALUE"}
Step.1:
I have to generate a random number of 16 digit. eg: '1234567890123456'
Step.2:
Do RSA/ECB/PKCS1Padding to random number and base64Encode the result. we get "encrypted_key"
Step.3:
Concatenate random number & data:
DATA = 1234567890123456{"KEY":"VALUE"}
Step.4:
Do AES/CBC/PKCS5Padding on DATA (from Step 3) using random number(1234567890123456) as KEY & Base64Encoded random number as IV. we get "ENCRYPTED_DATA"
So, for Step 1 I am using JSEncrypt javascript library.
For Step 4 I am using the CryptoJS.AES.encrypt() function. I am pretty sure that my JSEncrypt function is running fine, as the client is able to decrypt it, but the client is not able to decrypt my data. I feel that I am making a mistake while using CryptoJS.
Can someone guide me properly on how to use the library.
What I am doing is:
KEY = '1234567890123456'
IV = MTIzNDU2Nzg5MDEyMzQ1Ng== (result of btoa('1234567890123456') )
DATA = "1234567890123456{"KEY":"VAL"}"
cryptedData = Crypto.AES.encrypt(DATA, KEY, {iv: IV, mode: CryptoJS.mode.CBC,padding:CryptoJS.pad.Pkcs7})
I am told to use PKCS5Padding in AES/CBC Encryption ( Step 4 ) but it seems that AES does not support PKCS5Padding but PKCS7Padding.
I think I am making a mistake in the way I am passing KEY & IV to CryptoJS.
Any help will be greatly appreciated.
To start, let's see why you are doing this exercise. RSA is intended to encrypt only a limited amount of data. So we use "hybrid encryption", where the data is encrypted using a symmetric cipher with a random key and the key itself is encrypted using RSA.
Encryption works on binary data. To safely transmit binary data, it is encoded to a printable form (hex or base64).
Step.1: I have to generate a random number of 16 digit
What we see is 16 digits 0-9. That's not really safe. With 16 decimal digits you get a key space of 10^16, which is approximately 2^53 (if I did the math wrong, please comment).
You need to generate 16 random bytes (each a value 0-255, resulting in a key space of 2^128). That is your DEK (data encryption key).
You may encode the DEK to a printable form; in hexadecimal encoding it will have 32 characters.
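As a quick illustration in Python (the question itself uses JavaScript, so treat this only as a sketch of the idea; `dek` and `dek_hex` are names I picked):

```python
import math
import secrets

# 16 decimal digits give roughly 53 bits of key space.
print(round(math.log2(10**16), 2))  # 53.15

# 16 random bytes from a cryptographically secure source give 128 bits.
dek = secrets.token_bytes(16)
dek_hex = dek.hex()  # printable form of the same key

print(len(dek), len(dek_hex))  # 16 32
```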
Step.2:
OK, you now have the encrypted key, encoded_encryption_key.
Step 3, Step 4
And here you should understand what you are doing:
Encrypt DATA using the DEK (the raw random bytes, not the encoded form); you will get encrypted_data. You can encode the result to encoded_encrypted_data.
Concatenate the encrypted key and the encrypted data. It is up to you to choose whether you concatenate before or after encoding. I suggest you concatenate encoded_encryption_key and encoded_encrypted_data with some separator, because if the RSA key length changes, the length of encoded_encryption_key changes too.
Make sure to discuss with the client what format is expected exactly.
Notes:
The IV needs to be 16 bytes long for AES, and for CryptoJS I believe it needs to be hex encoded, so using btoa may not be the best idea. I believe CryptoJS just trims the value to 16 bytes, but formally it is not correct.
CBC mode needs some sort of integrity check. I suggest adding an HMAC or a signature to the result (otherwise someone could change the ciphertext without you being able to detect the tampering).
but it seems that AES does not support PKCS5Padding but PKCS7Padding.
Indeed AES supports PKCS7. PKCS5 is functionally the same, but defined for 64-bit blocks. The designation is still used in Java as a heritage from DES encryption.
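To show why the two names describe the same padding, here is a small Python sketch of the padding rule with the block size as a parameter (8 bytes for the PKCS5 case, 16 for AES). The function names are my own, not from any library.

```python
def pkcs7_pad(data: bytes, block_size: int = 16) -> bytes:
    """Append N bytes of value N so the length becomes a multiple of block_size."""
    pad_len = block_size - (len(data) % block_size)
    return data + bytes([pad_len]) * pad_len

def pkcs7_unpad(padded: bytes) -> bytes:
    """Strip the padding again, rejecting obviously malformed input."""
    pad_len = padded[-1]
    if pad_len < 1 or pad_len > len(padded):
        raise ValueError("invalid padding")
    if padded[-pad_len:] != bytes([pad_len]) * pad_len:
        raise ValueError("invalid padding")
    return padded[:-pad_len]

data = b'1234567890123456{"KEY":"VALUE"}'
padded = pkcs7_pad(data)
print(len(data), len(padded))  # 31 32
```

Note that input which is already a multiple of the block size still gets a full block of padding, otherwise unpadding would be ambiguous.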

XOR encryption/decryption when the key is more than one byte long?

Suppose that the character 'b' is used as a key for XOR encryption. In that case, encrypting a plain text is done by XOR-ing each byte (character) of the text by the ascii code of 'b'. Conversely, the plain text can be obtained from the ciphered text by XOR-ing by 'b's ascii code again. This is understood.
However, how exactly does one encrypt when the key (password) is a string of characters? Suppose that the encrypting password is 'adg'. In that case, is the plain text ciphered via XOR-ing each of its bytes with the value of a XOR d XOR g? If not, then how?
One way is to repeat the key until it covers the plain text.
e.g. key = RTTI, plaintext = "how exactly does one"
Text: how exactly does one
Key: RTTIRTTIRTTIRTTIRTTI
Each character in the plain text will be XOR'd with the corresponding key character below it.
There are many ways to implement "XOR encryption", so if you're trying to decode some existing data, you'll first need to figure out which kind it's encrypted with.
The most common scheme I've seen works basically like the classic Vigenère cipher; e.g. for the three-byte key abc, the first byte of plaintext is XORed with a, the second with b, the third with c; the fourth byte is then again XORed with a, the fifth with b, and so on, like this:
Plaintext: THIS IS SOME SECRET TEXT...
Key: abcabcabcabcabcabcabcabcabc
--------------------------------------
XOR: 5**2B*2B0./&A1&"0&5B7$:7OLM
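The table above can be reproduced with a few lines of Python; `repeating_key_xor` is just a name I chose for the sketch:

```python
from itertools import cycle

def repeating_key_xor(data: bytes, key: bytes) -> bytes:
    """XOR each byte of data with the key, repeating the key as needed."""
    return bytes(d ^ k for d, k in zip(data, cycle(key)))

plaintext = b"THIS IS SOME SECRET TEXT..."
ciphertext = repeating_key_xor(plaintext, b"abc")
print(ciphertext.decode("ascii"))  # 5**2B*2B0./&A1&"0&5B7$:7OLM

# Applying the same function again decrypts.
print(repeating_key_xor(ciphertext, b"abc").decode("ascii"))
```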
One way to recognize this kind of repeating-key cipher (and also find out the key length) is to compute the index of coincidence between pairs of bytes N positions apart in the ciphertext. If the key length is L, then plotting the index of coincidence as a function of N should reveal a regular array of peaks at the values of N that are divisible by L. (Of course, this only works if the plaintext is something like normal text or code that has a biased byte frequency distribution; if it's completely random data, then this won't help.)
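A rough sketch of that idea in Python. This is a simplified coincidence count (the fraction of equal byte pairs N positions apart), not a full index-of-coincidence computation, and the function names are hypothetical:

```python
def coincidence_rate(ciphertext: bytes, shift: int) -> float:
    """Fraction of positions i where ciphertext[i] == ciphertext[i + shift]."""
    pairs = len(ciphertext) - shift
    if pairs <= 0:
        return 0.0
    matches = sum(
        1 for i in range(pairs) if ciphertext[i] == ciphertext[i + shift]
    )
    return matches / pairs

def rank_shifts(ciphertext: bytes, max_shift: int = 20) -> list:
    """Candidate shifts, highest coincidence rate first."""
    shifts = range(1, min(max_shift, len(ciphertext) - 1) + 1)
    return sorted(
        shifts, key=lambda n: coincidence_rate(ciphertext, n), reverse=True
    )

# Tiny artificial input with an obvious period of 3.
print(rank_shifts(b"abcabcabc", 5)[0])  # 3
```

On real ciphertext you would need far more data than the 27 bytes above before the peaks at multiples of the key length stand out clearly.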
Or you could just use hellman's xortool, which will automate all this for you. For example, running it on the ciphertext 5**2B*2B0./&A1&"0&5B7$:7OLM above, it says:
The most probable key lengths:
1: 17.3%
3: 40.7%
6: 21.5%
8: 6.5%
12: 5.4%
15: 4.6%
18: 4.0%
Key-length can be 3*n
If you have enough ciphertext, and can guess the most common byte in the plaintext, it will even spit out the key for you.

How to alter CBC encrypted text to change the message

I'm currently in the process of learning about encryption and I'm hoping to find more clarification on what I learned.
Suppose the message "100 dollars should be moved from account 123456 to 555555" was encrypted using aes-128-cbc and a random IV. My professor says it's possible to alter the encrypted text so that when it's decrypted again, the message reads "900 dollars should be moved from account 123456 to 555555". How do you go about doing this?
I tried figuring it out on my own by generating my own key and iv, encrypting the message, then converting it to hex characters to work with. From there can I swap out some characters then decrypt? I tried playing around with this but something always seemed to go wrong.
We're using a basic linux command line for this.
Any help or explanation would be awesome!
Suppose the string was encrypted using a one-time-pad and the resulting ciphertext is "B8B7D8CB9860EBD0163507FD00A9F923D45...". We know that the first byte of plaintext, the digit 1, has ASCII code 0x31. The first byte of the ciphertext is 0xB8. If k0 denotes the first byte of the key, then 0x31 xor k0 = 0xB8. Decoding a one-time-pad is just xor-ing the ciphertext with the key. So, the person decoding gets the first byte of the plaintext as 0x31 = 0xB8 xor k0. If we xor the first byte of ciphertext with m0, then the person decoding the ciphertext will get (0xB8 xor m0) xor k0. But this is just (0xB8 xor k0) xor m0, as xor is commutative and associative. The last expression can be reduced to 0x31 xor m0. Now we want to change the resulting byte to 0x39, the ASCII code for the digit 9. So we need to solve 0x31 xor m0 = 0x39. That is simple: just xor with 0x31 on both sides, which gives m0 = 0x31 xor 0x39 = 0x08.
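The one-time-pad argument can be checked with a short Python sketch. The pad here is freshly generated random bytes, so the ciphertext will not match the hex string quoted above; only the reasoning is the same.

```python
import secrets

plaintext = b"100 dollars"
pad = secrets.token_bytes(len(plaintext))  # the key, unknown to the attacker
ciphertext = bytes(p ^ k for p, k in zip(plaintext, pad))

# Attacker: knows the first plaintext byte is '1' (0x31) and wants '9' (0x39).
m0 = 0x31 ^ 0x39
tampered = bytes([ciphertext[0] ^ m0]) + ciphertext[1:]

# Receiver decrypts the tampered ciphertext with the untouched pad.
decrypted = bytes(c ^ k for c, k in zip(tampered, pad))
print(hex(m0), decrypted)  # 0x8 b'900 dollars'
```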
The same principle applies when using CBC mode. You can modify the IV in a similar way to change the decoded message.
@user515430's reasoning above is based on the fact that every ciphertext C depends linearly on the plaintext P (since C = P ⊕ K).
Actually, as @polettix points out, in CBC encryption we have that, e.g. for the 6th block of a certain text, C₆ = E(P₆ ⊕ C₅, K), given a key K; and if E(·) is a good encryption function we should lose such linearity.
But in CBC decryption, the 6th block of plaintext is obtained as P₆ = D(C₆, K) ⊕ C₅, so it depends linearly not on C₆, but on C₅.
To put it another way, if you want to change a plaintext block in CBC, just change the previous ciphertext block.
See also https://crypto.stackexchange.com/q/30407/36884 (for the record, Cryptography StackExchange is the right site for this kind of question).
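Here is a self-contained Python sketch of the IV trick. To avoid depending on a crypto library it uses a toy 4-round Feistel construction as the block cipher, which is an assumption of mine and NOT AES; the CBC chaining, which is what matters for the attack, is the standard one.

```python
import hashlib
import secrets

BLOCK = 16

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def _round(key: bytes, i: int, half: bytes) -> bytes:
    # Toy round function; only a stand-in for a real block cipher.
    return hashlib.sha256(key + bytes([i]) + half).digest()[:BLOCK // 2]

def encrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:8], block[8:]
    for i in range(4):
        left, right = right, xor(left, _round(key, i, right))
    return left + right

def decrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:8], block[8:]
    for i in reversed(range(4)):
        left, right = xor(right, _round(key, i, left)), left
    return left + right

def cbc_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    pad = BLOCK - len(plaintext) % BLOCK
    plaintext += bytes([pad]) * pad          # PKCS#7-style padding
    out, prev = b"", iv
    for i in range(0, len(plaintext), BLOCK):
        prev = encrypt_block(key, xor(plaintext[i:i + BLOCK], prev))
        out += prev
    return out

def cbc_decrypt(key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    out, prev = b"", iv
    for i in range(0, len(ciphertext), BLOCK):
        block = ciphertext[i:i + BLOCK]
        out += xor(decrypt_block(key, block), prev)   # P_i = D(C_i) xor C_(i-1)
        prev = block
    return out[:-out[-1]]                    # strip the padding

key, iv = secrets.token_bytes(16), secrets.token_bytes(16)
message = b"100 dollars should be moved from account 123456 to 555555"
ciphertext = cbc_encrypt(key, iv, message)

# The attacker never sees the key: flipping one IV byte flips the
# matching byte of the first plaintext block.
tampered_iv = bytes([iv[0] ^ ord("1") ^ ord("9")]) + iv[1:]
print(cbc_decrypt(key, tampered_iv, ciphertext))
```

The '1' sits in the first block, so the IV is the "previous block" to modify and nothing else is disturbed. Changing a byte in a later block means editing the preceding ciphertext block, which also turns that preceding block's plaintext into garbage.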

Sign with RSA-1024 an SHA-256 digest: what is the size?

I was wondering:
1) if I compute the digest of some data with SHA-512 => resulting in a hash of 64 bytes
2) and then I sign this hash with RSA-1024 => so a block of 128 bytes, which is bigger than the 64 bytes of the digest
=> does it mean in the end my signed hash will be exactly 128 bytes?
Thanks a lot for any info.
With RSA, as specified by PKCS#1, the data to be signed is first hashed with a hash function, then the result is padded (a more or less complex operation which transforms the hash result into a modular integer), and then the mathematical operation of RSA is applied on that number. The result is a n-bit integer, where n is the length in bits of the "modulus", usually called "the RSA key size". Basically, for RSA-1024, n is 1024. A 1024-bit integer is encoded as 128 bytes, exactly, as per the encoding method described in PKCS#1 (PKCS#1 is very readable and not too long).
Whether a n-bit RSA key can be used to sign data with a hash function which produces outputs of length m depends on the details of the padding. As the name suggests, padding involves adding some extra data around the hash output, hence n must be greater than m, leaving some room for the extra data. A 1024-bit key can be used with SHA-512 (which produces 512-bit strings). You could not use a 640-bit key with SHA-512 (and you would not, anyway, since 640-bit RSA keys can be broken -- albeit not trivially).
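The size arithmetic can be written out as follows. This assumes the PKCS#1 v1.5 signature padding (the DigestInfo header in front of the hash plus at least 11 bytes of framing); other padding schemes such as PSS have different overheads.

```python
modulus_bits = 1024
signature_bytes = (modulus_bits + 7) // 8   # the signature is as long as the modulus

# Assumption: PKCS#1 v1.5 signature padding with SHA-512.
digest_bytes = 64           # SHA-512 output
digest_info_prefix = 19     # DER header placed in front of the hash
min_padding = 11            # 0x00 0x01, at least eight 0xFF bytes, 0x00
min_modulus_bytes = digest_bytes + digest_info_prefix + min_padding

print(signature_bytes, min_modulus_bytes * 8)  # 128 752
```

Under that assumption a 640-bit (80-byte) modulus is too small for SHA-512, while 1024 bits leaves room to spare.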

Why is XOR used in cryptography?

Why is only XOR used in cryptographic algorithms, and other logic gates like OR, AND, and NOR are not used?
It isn't exactly true to say that XOR is the only logical operation used throughout all cryptography; however, it is the only one that gives you a two-way encryption when used on its own.
Here is that explained:
Imagine you have a string of binary digits, 10101,
and you XOR the string 10111 with it; you get 00010.
Now your original string is encoded and the second string becomes your key.
If you XOR your key with your encoded string you get your original string back.
XOR allows you to easily encrypt and decrypt a string; the other logic operations don't.
If you have a longer string you can repeat your key until it's long enough.
For example, if your string was 1010010011 then you'd simply write your key twice, making it 1011110111, and XOR it with the new string.
Here's a wikipedia link on the XOR cipher.
I can see 2 reasons:
1) (Main reason) XOR does not leak information about the original plaintext.
2) (Nice-to-have reason) XOR is an involutory function, i.e., if you apply XOR twice, you get the original plaintext back (i.e, XOR(k, XOR(k, x)) = x, where x is your plaintext and k is your key). The inner XOR is the encryption and the outer XOR is the decryption, i.e., the exact same XOR function can be used for both encryption and decryption.
To exemplify the first point, consider the truth-tables of AND, OR and XOR:
And
0 AND 0 = 0
0 AND 1 = 0
1 AND 0 = 0
1 AND 1 = 1 (Leak!)
Or
0 OR 0 = 0 (Leak!)
0 OR 1 = 1
1 OR 0 = 1
1 OR 1 = 1
XOR
0 XOR 0 = 0
0 XOR 1 = 1
1 XOR 0 = 1
1 XOR 1 = 0
Everything on the first column is our input (ie, the plain text). The second column is our key and the last column is the result of your input "mixed" (encrypted) with the key using the specific operation (ie, the ciphertext).
Now, imagine an attacker got access to some encrypted byte, say: 10010111, and he wants to get the original plaintext byte.
Let's say the AND operator was used in order to generate this encrypted byte from the original plaintext byte. If AND was used, then we know for certain that every time we see the bit '1' in the encrypted byte then the input (ie, the first column, the plain text) MUST also be '1' as per the truth table of AND. If the encrypted bit is a '0' instead, we do not know if the input (ie, the plain text) is a '0' or a '1'. Therefore, we can conclude that the original plain text is: 1 _ _ 1 _ 111. So 5 bits of the original plain text were leaked (ie, could be accessed without the key).
Applying the same idea to OR, we see that every time we find a '0' in the encrypted byte, we know that the input (ie, the plain text) must also be a '0'. If we find a '1' then we do not know if the input is a '0' or a '1'. Therefore, we can conclude that the input plain text is: _ 00 _ 0 _ _ _. This time we were able to leak 3 bits of the original plain text byte without knowing anything about the key.
Finally, with XOR, we cannot get any bit of the original plaintext byte. Every time we see a '1' in the encrypted byte, that '1' could have been generated from a '0' or from a '1'. Same thing with a '0' (it could come from both '0' or '1'). Therefore, not a single bit is leaked from the original plaintext byte.
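The three cases can be reproduced with a small Python sketch; `leaked_bits` is a hypothetical helper that marks each plaintext bit an attacker can infer from the ciphertext alone, with `_` for unknown.

```python
def leaked_bits(cipher_byte: int, op: str) -> str:
    """Show which plaintext bits follow from the ciphertext without the key."""
    out = []
    for bit in format(cipher_byte, "08b"):
        if op == "AND" and bit == "1":
            out.append("1")   # an AND result of 1 forces the plaintext bit to 1
        elif op == "OR" and bit == "0":
            out.append("0")   # an OR result of 0 forces the plaintext bit to 0
        else:
            out.append("_")   # both plaintext values remain possible
    return "".join(out)

for op in ("AND", "OR", "XOR"):
    print(op, leaked_bits(0b10010111, op))
# AND 1__1_111
# OR _00_0___
# XOR ________
```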
Main reason is that if a random variable with unknown distribution R1 is XORed with a random variable R2 with uniform distribution the result is a random variable with uniform distribution, so basically you can randomize a biased input easily which is not possible with other binary operators.
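A small exact check of that claim in Python, assuming the two variables are independent. The biased distribution over 2-bit values is made up for the example.

```python
from fractions import Fraction
from itertools import product

def xor_distribution(dist_a: dict, dist_b: dict) -> dict:
    """Distribution of A XOR B for independent A and B."""
    out = {}
    for (a, pa), (b, pb) in product(dist_a.items(), dist_b.items()):
        out[a ^ b] = out.get(a ^ b, 0) + pa * pb
    return out

# Hypothetical biased input over the values 0..3.
biased = {0: Fraction(7, 10), 1: Fraction(1, 10),
          2: Fraction(1, 10), 3: Fraction(1, 10)}
uniform = {v: Fraction(1, 4) for v in range(4)}

result = xor_distribution(biased, uniform)
print(result)  # every value comes out with probability 1/4
```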
The output of XOR always depends on both inputs. This is not the case for the other operations you mention.
I think it is because XOR is reversible. If you want to create a hash, then you'll want to avoid XOR.
XOR is the only gate that's used directly because, no matter what one input is, the other input always has an effect on the output.
However, it is not the only gate used in cryptographic algorithms. That might be true of old-school cryptography, the type involving tons of bit shuffles and XORs and rotating buffers, but for prime-number-based crypto you need all kinds of mathematics that is not implemented through XOR.
XOR acts like a toggle switch where you can flip specific bits on and off. If you want to "scramble" a number (a pattern of bits), you XOR it with a number. If you take that scrambled number and XOR it again with the same number, you get your original number back.
210 XOR 145 gives you 67 <-- Your "scrambled" result
67 XOR 145 gives you 210 <-- ...and back to your original number
When you "scramble" a number (or text or any pattern of bits) with XOR, you have the basis of much of cryptography.
XOR uses fewer transistors (4 NAND gates) than more complicated operations (e.g. ADD, MUL) which makes it good to implement in hardware when gate count is important. Furthermore, an XOR is its own inverse which makes it good for applying key material (the same code can be used for encryption and decryption) The beautifully simple AddRoundKey operation of AES is an example of this.
For symmetric crypto, the only real choices for operations that mix bits with the cipher and do not increase length are add with carry, add without carry (XOR) and compare (XNOR). Any other operation either loses bits, expands, or is not available on CPUs.
The XOR property (a xor b) xor b = a comes in handy for stream ciphers: to encrypt a n bit wide data, a pseudo-random sequence of n bits is generated using the crypto key and crypto algorithm.
Sender:
Data: 0100 1010 (0x4A)
pseudo random sequence: 1011 1001 (0xB9)
------------------
ciphered data 1111 0011 (0xF3)
------------------
Receiver:
ciphered data 1111 0011 (0xF3)
pseudo random sequence: 1011 1001 (0xB9) (receiver has key and computes same sequence)
------------------
0100 1010 (0x4A) Data after decryption
------------------
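A toy Python sketch of the sender/receiver picture above. The keystream generator (SHA-256 over key plus counter) is purely my assumption for illustration and not a vetted stream cipher; the point is only that both sides derive the same sequence from the shared key and XOR with it.

```python
import hashlib

def toy_keystream(key: bytes, length: int) -> bytes:
    # Assumption for illustration: hash of key + counter as the generator.
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def stream_xor(key: bytes, data: bytes) -> bytes:
    """Encrypts or decrypts: XOR data with the keystream derived from key."""
    return bytes(d ^ k for d, k in zip(data, toy_keystream(key, len(data))))

shared_key = b"shared key"
ciphered = stream_xor(shared_key, b"\x4a")   # sender
restored = stream_xor(shared_key, ciphered)  # receiver
print(restored)  # b'J'

# The single-byte example from the diagram:
print(hex(0x4A ^ 0xB9), hex(0xF3 ^ 0xB9))  # 0xf3 0x4a
```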
Let's consider the three common bitwise logical operators
Let's say we can choose some number (let's call it the mask) and combine it with an unknown value
AND is about forcing some bits to zero (those that are set to zero in the mask)
OR is about forcing some bits to one (those that are set to one in the mask)
XOR is more subtle: you can't know for sure the value of any bit of the result, whatever mask you choose. But if you apply your mask twice you get back your initial value.
In other words, the purpose of AND and OR is to remove some information, and that's definitely not what you want in cryptographic algorithms (symmetric or asymmetric cipher, or digital signature). If you lose information you won't be able to get it back (decrypt), or a signature would tolerate some minute changes in the message, thus defeating its purpose.
All that said, that is true of cryptographic algorithms, not of their implementations. Most implementations of cryptographic algorithms also use many ANDs, usually to extract individual bytes from 32- or 64-bit internal registers.
You typically get code like this (a nearly random extract of aes_core.c):
rk[ 6] = rk[ 0] ^
(Te2[(temp >> 16) & 0xff] & 0xff000000) ^
(Te3[(temp >> 8) & 0xff] & 0x00ff0000) ^
(Te0[(temp ) & 0xff] & 0x0000ff00) ^
(Te1[(temp >> 24) ] & 0x000000ff) ^
rcon[i];
rk[ 7] = rk[ 1] ^ rk[ 6];
rk[ 8] = rk[ 2] ^ rk[ 7];
rk[ 9] = rk[ 3] ^ rk[ 8];
8 XORs and 7 ANDs if I count right
XOR is a logical operation used in cryptography. There are other operations: AND, OR, NOT, modulo, etc. XOR is the most important and the most used.
If it's the same, it's 0.
If it's different, it's 1.
Example:
Message : Hello
Binary Version of Hello : 01001000 01100101 01101100 01101100 01101111
Key-stream : 110001101010001101011010110011010010010111
Cipher text using XOR : 10001110 11000110 00110110 10100001 01001010
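The example can be verified with a few lines of Python. Only the first 40 key-stream bits are consumed, since the key-stream given above is two bits longer than the message.

```python
message = b"Hello"
keystream_bits = "110001101010001101011010110011010010010111"

message_bits = "".join(format(byte, "08b") for byte in message)
# zip stops at the shorter input, so the extra key-stream bits are unused.
cipher_bits = "".join(
    str(int(m) ^ int(k)) for m, k in zip(message_bits, keystream_bits)
)

print(" ".join(cipher_bits[i:i + 8] for i in range(0, len(cipher_bits), 8)))
# 10001110 11000110 00110110 10100001 01001010
```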
Applications: The one-time pad (Vernam cipher) uses the exclusive-or function: the receiver has the same key-stream and receives the ciphertext over a covert transport channel. The receiver then XORs the ciphertext with the key-stream to reveal the plaintext, Hello. In a one-time pad, the key-stream should be at least as long as the message.
Fact: The one-time pad is the only truly unbreakable encryption.
Exclusive or is used in the Feistel structure, which is used in the block cipher DES.
Note: with a uniformly random key bit, the XOR output has a 50% chance of being 0 or 1.
I think it's simply because, given some random set of binary numbers, a large number of OR operations would tend towards all ones; likewise, a large number of AND operations would tend towards all zeroes. Whereas a large number of XORs produces a random-ish selection of ones and zeroes.
This is not to say that AND and OR are not useful - just that XOR is more useful.
The prevalence of OR/AND and XOR in cryptography is for two reasons:
One, these are lightning-fast instructions.
Two, they are difficult to model using conventional mathematical formulas.

Resources