Explanation of Self-healing property of CBC (Cipher Block Chaining) - encryption

Wikipedia:
CBC mode has the self-healing property: if one block of the cipher is
altered, the error propagates for at most two blocks.
Made up Example:
Let the block size be 64 bits. The original plaintext is:
3231343336353837 3231343336353837 3231343336353837 • • •
The correct cipher text is:
ef7c4bb2b4ce6f3b f6266e3a97af0e2c 746ab9a6308f4256 • • •
If the ciphertext is corrupted, with the byte '0x4b' changed to '0x4c':
ef7c4cb2b4ce6f3b f6266e3a97af0e2c 746ab9a6308f4256 • • •
Then it is decrypted to:
efca61e19f4836f1 3231333336353837 3231343336353837 • • •
Question:
I am having a hard time understanding the self-healing property of CBC (Cipher Block Chaining). I thought that a made-up example might help, but I am now more confused. Any help would be great.

Personally, I find the decryption graphics very helpful for this kind of question. From Wikipedia (public domain image):
Now let's add some corruption:
The red dots represent partially corrupted inputs, while the solid red line represents complete block corruption.
Some notation before we start: I'll number the original plaintext blocks as p1 through p3, the corrupted ones as p1' through p3', the correct ciphertext blocks as c1 through c3 and the corrupted ones as c1' through c3':
3231343336353837 3231343336353837 3231343336353837 • • •
p1               p2               p3
ef7c4bb2b4ce6f3b f6266e3a97af0e2c 746ab9a6308f4256 • • •
c1               c2               c3
ef7c4cb2b4ce6f3b f6266e3a97af0e2c 746ab9a6308f4256 • • •
c1'              c2'=c2           c3'=c3
efca61e19f4836f1 3231333336353837 3231343336353837 • • •
p1'              p2'              p3'=p3
There is also some IV that you have not given in your example.
Let's take a look at the first block: Three bits in the input of the block cipher are changed (0x4b ^ 0x4c = 0x07 = 4+2+1). As a block cipher is designed to be a pseudo-random permutation - that is, a bijective function indistinguishable from a random permutation (without knowledge of the key k) - we get a completely (pseudo-)random block as output of the decryption function:
dec( c1 ,k) = p1 XOR IV
<=> dec(ef7c4bb2b4ce6f3b,k) = 3231343336353837 XOR IV
dec( c1' ,k) = p1' XOR IV
<=> dec(ef7c4cb2b4ce6f3b,k) = efca61e19f4836f1 XOR IV
As a next step the IV is XORed, so we end up with
dec( c1 ,k) XOR IV = p1
<=> dec(ef7c4bb2b4ce6f3b,k) XOR IV = 3231343336353837
dec( c1' ,k) XOR IV = p1'
<=> dec(ef7c4cb2b4ce6f3b,k) XOR IV = efca61e19f4836f1
which shows that the whole block was destroyed (complete red block at the bottom).
Now, on to the second block: We start again by decrypting the ciphertext block, which works fine as no corruption has occurred in the block:
dec( c2 ,k) = p2 XOR c1
<=> dec(f6266e3a97af0e2c,k) = 3231343336353837 XOR ef7c4bb2b4ce6f3b
Notice that this formula uses non-corrupted blocks everywhere. As a reminder, this block was generated like this during encryption:
c2 = enc( p2 XOR c1 ,k)
<=> f6266e3a97af0e2c = enc(3231343336353837 XOR ef7c4bb2b4ce6f3b,k)
The next step is again the application of the XOR with the previous block (this time not the IV, but c1'). This previous block c1' is corrupted:
dec( c2 ,k) XOR c1' = p2 XOR c1 XOR c1'
<=> dec(f6266e3a97af0e2c,k) XOR ef7c4cb2b4ce6f3b = 3231343336353837 XOR ef7c4bb2b4ce6f3b XOR ef7c4cb2b4ce6f3b
Now we can actually calculate c1 XOR c1' (the error) as c1 XOR c1' = 0000070000000000 and replace that everywhere:
dec( c2 ,k) XOR c1' = p2 XOR 0000070000000000
<=> dec(f6266e3a97af0e2c,k) XOR ef7c4cb2b4ce6f3b = 3231343336353837 XOR 0000070000000000
And at last simplify p2 XOR 0000070000000000 = p2':
dec( c2 ,k) XOR c1' = p2'
<=> dec(f6266e3a97af0e2c,k) XOR ef7c4cb2b4ce6f3b = 3231333336353837
You see that the original corruption (0x07) to the first ciphertext block c1' is transferred verbatim to the second plaintext block p2', but that block remains otherwise intact (as visualized by a mostly white block in the graphic, with a single square being red). This peculiar property of CBC can lead to attacks against real-world systems, like padding oracle attacks.
The third block is boring as hell: No inputs to the decryption and XOR have changed, thus p3'=p3 and everything is fine there.
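If you prefer to see this in code, here is a minimal sketch of the same experiment in Python, assuming the third-party cryptography package is installed; it uses AES, i.e. 128-bit blocks, instead of the 64-bit blocks of the example above:

import os
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

key = os.urandom(16)
iv = os.urandom(16)
plaintext = b"A" * 16 + b"B" * 16 + b"C" * 16   # three 128-bit blocks

encryptor = Cipher(algorithms.AES(key), modes.CBC(iv)).encryptor()
ciphertext = encryptor.update(plaintext) + encryptor.finalize()

# Flip three bits of one byte in the first ciphertext block (like 0x4b -> 0x4c).
corrupted = bytearray(ciphertext)
corrupted[2] ^= 0x07

decryptor = Cipher(algorithms.AES(key), modes.CBC(iv)).decryptor()
recovered = decryptor.update(bytes(corrupted)) + decryptor.finalize()

print(recovered[0:16])    # block 1: completely pseudo-random garbage
print(recovered[16:32])   # block 2: b"B" * 16, except byte 2 is XORed with 0x07
print(recovered[32:48])   # block 3: b"C" * 16, intact -- the chain has "healed"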

When decrypting in CBC mode, a block is decrypted by first deciphering the block in question using the key, and then XORing it with the previous ciphertext block. Take a look at the CBC mode drawing on the wiki.
As you only need the current and the previous block for decrypting in CBC mode, the effect of a changed byte in the ciphertext only affects the block it's in and the following block (if that exists).

Related

How to generate g and p for DH safely on Arduino Nano

I want to do a DH key exchange between a PC and an Arduino. I want to generate a prime p and a base g on the Arduino that should be 2048 bits long.
You don't generate p and g.
They are specified by the Diffie-Hellman group choice.
You only need to generate a random integer in the range from zero to the group modulus minus 1.
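For illustration, here is a minimal Python sketch of that generation step; the group parameters below are placeholders only and must be replaced by the values of a standardized group (e.g. an RFC 3526 MODP group):

import secrets

# p and g must be copied verbatim from the chosen standard group;
# the values below are placeholders, NOT real 2048-bit parameters.
p = 0xFFFFFFFB
g = 2

a = secrets.randbelow(p - 1)   # random private exponent
A = pow(g, a, p)               # public value sent to the peer

# After receiving the peer's public value B, the shared secret is pow(B, a, p).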

Convolutional Neural Network (LeNet 5). Construction of C3, C5 layers

http://i60.tinypic.com/no7tye.png
Fig. 1 Convolutional Neural Network (LeNet5)
On the Convolutional Neural Network (LeNet-5) of Fig. 1, the Convolution (C1) and Max Pooling (Subsampling) (S2, S4) layers are computed in an iterative manner. But I did not understand how to correctly compute the C3 (Convolution) layer.
http://tinypic.com/r/fvzp86/8
Fig. 2 Proceeding C1 layer
Firstly, as input we receive a MNIST 32*32 grayscale image of a digit, treating it as a byte array of size 32*32. In the C1 layer we have 6 distinct kernels filled with small random values. Each kernel from 1 to 6 is used to build one of 6 feature maps (one kernel per feature map). Moving a receptive field of size 5*5 with a 1-pixel stride from left to right, we multiply the values in the image array by the kernel values, add the bias, and pass the result through a sigmoid function. The result is element i,j of the feature map currently being constructed. Once we have reached the end of the image array, we have finished building the current feature map.
http://i57.tinypic.com/rk0jk9.jpg
Fig. 3 Proceeding S2 layer
Next we start to produce the S2 layer; again there will be 6 feature maps, as we use a 2*2 receptive field individually on each of the 6 feature maps of C1 (using max pooling, i.e. selecting the maximal value in the 2*2 receptive field). Processing of C1, S2 and S4 is conducted in this iterative manner.
http://i58.tinypic.com/ifsidu.png
Fig. 4 Connection list of C3 layer
But next we need to compute the C3 layer. According to various papers there exists a connection map. Could you please say what is meant by this connection list? Does this mean that we will still use a 5*5 receptive field as in C1? For example, we see that in the first row the marked feature maps correspond to columns (0,4,5,6,9,10,11,12,14,15). Does this mean that to construct feature maps 0,4,5,6,9,10,11,12,14,15 of C3 we perform the convolution over the first feature map of S2 with a 5*5 receptive field? Which concrete kernel will be used during that convolution, or do we again need to randomly generate 16 kernels filled with small numbers as we did in C1? If so, we see that feature maps 0,4,5,6,9,10,11,12,14,15 of C3 are colored light grey, light grey, dark grey, light grey, dark grey, light grey, dark grey, light grey, light grey, dark grey. It can clearly be seen that the first feature map of S2 is light grey, but only 0,4,6,10,12,14 are colored light grey. So maybe the 16 feature maps of C3 are built in a different way. Could you please also say how to produce the C5 layer; will it have a certain connection list as well?
Disclaimer: I have just started with this topic so please do point out mistakes in my concept!
In the original LeNet paper, on page 8, you can find a connection map that links different layers of S2 to layers of C3. This connection list tells us which layers of S2 are being convolved with the kernel (details coming up) to produce the layers of C3.
You will notice that each layer of S2 is involved in producing exactly 10 (not all 16) layers of C3. This shows that the size of kernel is (5x5x6) x 10.
In C1 we had a (5x5) x 6 kernel i.e. 5x5 with 6 feature maps. This is 2D convolution. In C3 we have (5x5x6) x 10 kernel i.e. a "kernel-box" with 10 feature maps. These 10 feature maps and the kernel-box combine to produce 16 layers rather than 6 as these are not fully connected.
Regarding generation of kernel weights, it depends on the algo. It can be random, pre-defined or using some scheme e.g. xavier in caffe.
What confused me is that the kernel details are not well defined and have to be derived from the given information.
Update:
How is C5 Produced?
Layer C5 is a convolutional layer with 120 feature maps. C5 feature maps have size of 1x1 as a 5x5 kernel is applied on S4. In the case of a 32x32 input, we can also say that S4 and C5 are fully connected. Size of Kernel applied on S4 to get C5 is (5x5x16) x 120 (bias not shown). Details on how these 120 kernel-boxes connect to S4 are not given explicitly in the paper. However, as a hint, it is mentioned that S4 and C5 are fully connected.
The key point in the paper concerning "C5" seems to be that the 5x5 kernel is applied to ALL 16 of S4's feature maps - a fully connected layer.
"Each unit is connected to a 5x5 neighborhood on all 16 of S4's feature maps".
Since we have 120 output units, we should have 120 bias unit connections (or else the architecture details don't tally).
We then connect all the 25x16 input units to produce one of the feature map outputs.
So in total we have
num_connections = (25x16+1)x120 = 48000+120 = 48120
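As a sanity check (a rough sketch, with random numpy arrays standing in for real activations), the S4 -> C5 step can be written as a single contraction with a (5x5x16) x 120 kernel, which reproduces both the 1x1 output size and the connection count:

import numpy as np

s4 = np.random.rand(16, 5, 5)             # 16 feature maps of size 5x5
kernels = np.random.rand(120, 16, 5, 5)   # one (5x5x16) kernel-box per C5 map
biases = np.random.rand(120)

# Each C5 unit sees ALL of S4, so the convolution collapses to a dot product.
c5 = np.tensordot(kernels, s4, axes=([1, 2, 3], [0, 1, 2])) + biases
print(c5.shape)                           # (120,) -- 120 feature maps of size 1x1

num_connections = (5 * 5 * 16 + 1) * 120
print(num_connections)                    # 48120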
I have understood the forward pass of S2 to C3 to have 60*(5x5) + 16*1 = 1516 trainable params. Here I've separated out the x-times from the *-times since 5x5 is the dimension of each 2D kernel. Since there are 60 X's in the table, that means there are 60 such kernels:
From column 0 to 5 in the table we thus have 3*(5x5) kernels that are convolved (actually cross-correlated) with each specified feature map from S2; thus for each feature map (0 to 5) of C3 you get three 10x10 images, since 14x14 - 5x5 + 1x1 = 10x10. These are then summed together with a scalar bias to form a final 10x10 feature map in C3.
From column 6 to 14 you get 4*(5x5) kernels that are "convolved" with each specified feature map from S2 and then combined as before into feature maps 6 to 14 of C3.
Finally in column 15 you have 6*(5x5) kernels.
Together this is (6*3 + 9*4 + 1*6)*(5x5) = 60*(5x5), i.e. 60 pieces of 5x5 kernels. When adding the 16 scalar biases you get 60*5*5 + 16 = 1516 trainable parameters which agrees with the number specified in the paper.
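The same count can be reproduced in a few lines of Python, using only the fan-in of each C3 map stated above (6 maps fed by 3 S2 maps, 9 by 4, and 1 by all 6):

fan_in = [3] * 6 + [4] * 9 + [6]   # S2 maps feeding each of the 16 C3 maps

num_kernels = sum(fan_in)          # 60 distinct 5x5 kernels
num_weights = num_kernels * 5 * 5  # 1500
num_biases = len(fan_in)           # one scalar bias per C3 map, 16 in total
print(num_weights + num_biases)    # 1516 trainable parameters

# Output size: a 14x14 S2 map convolved with a 5x5 kernel (stride 1, no padding)
# gives 14 - 5 + 1 = 10, i.e. a 10x10 C3 map.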
Hope this helps.

Shortest keyboard distance typing

Can anyone help me with this problem?
We have a grid of MxN characters from some specific alphabet, S={A,B,C,D} for example.
The cursor is positioned at (1,1), and we can move the cursor using the arrow keys (up, down, left, right) and press enter to select a character (just like entering a nickname in old games). What is the minimum cost of operations, where they are all weighted the same (e.g. moving right is equally costly as selecting a char), for typing some input string from alphabet S? There can also be multiple occurrences of the same character in the matrix.
Example:
alphabet S={A,B,C,D}
matrix :
ABDC
CADB
ABAA
and input string ADCABDA.
My incomplete solution would be:
Construct a directed grid graph and find the shortest path from (1,1) to the end character, with in-between characters similar to towns in TSP, and from optimal subpaths construct the optimal final path using dynamic programming. The problem is that you could end up with many possible end characters, and I have no idea how to construct a longer optimal path from smaller optimal subpaths.
You should construct a graph with nodes something like this:
        A1       A1       A1
        A2 D1 C1 A2 B1 D1 A2
Start   A3 D2 C2 A3 B2 D2 A3   End
        A4       A4 B3    A4
        A5       A5       A5
where there are edges connecting each node in a column to each node in the next column. Start is (1,1) and End is wherever. The edge weights are the "taxicab" distances between each pair of keys.
Now it's a fairly straightforward dynamic programming problem. You can start at either end; it's probably conceptually simpler to start at Start. Keep track of the minimum cost so far to reach each node.
You could use 3D dynamic programming, where each state is (x, y, l) - (x, y) representing current position and l representing what letter you are at.
To explain further. You start at position (0, 0, 0). First letter is "A". You can try all A's and we know that distance will be Manhattan distance (http://en.wikipedia.org/wiki/Taxicab_geometry). Solution for (0, 0, 0) would be minimum of all possibilities.
At each step repeat the above process. Note the importance of memoizing each step. In the sample code below, memo acts as a function; you would use an array in reality.
Here is sample pseudo-code:
f(x, y, l):
    if memo(x, y, l) != -1:
        return memo(x, y, l)          # Check if already calculated.
    if l == length(word):
        return memo(x, y, l) = 0      # Ending condition.
    memo(x, y, l) = inf
    next_letter = word[l]
    for each (x2, y2) in grid that contains next_letter:
        distance = |x2 - x| + |y2 - y|
        next_calc = f(x2, y2, l+1)
        memo(x, y, l) = min(memo(x, y, l), distance + next_calc)
    return memo(x, y, l)
Set all memo to -1, so we know that no states are calculated.
Solution is f(0, 0, 0).
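Here is a runnable Python version of that idea, using the grid and the word from the question; functools.lru_cache stands in for the memo array, and the enter presses (a constant len(word)) are added at the end:

from functools import lru_cache

grid = ["ABDC",
        "CADB",
        "ABAA"]
word = "ADCABDA"

# For each letter, precompute all positions where it occurs in the grid.
positions = {}
for i, row in enumerate(grid):
    for j, ch in enumerate(row):
        positions.setdefault(ch, []).append((i, j))

@lru_cache(maxsize=None)
def f(x, y, l):
    # Minimum movement cost to type word[l:] with the cursor currently at (x, y).
    if l == len(word):
        return 0
    best = float("inf")
    for (x2, y2) in positions[word[l]]:
        distance = abs(x2 - x) + abs(y2 - y)   # taxicab / Manhattan distance
        best = min(best, distance + f(x2, y2, l + 1))
    return best

moves = f(0, 0, 0)
print(moves)               # arrow-key presses only
print(moves + len(word))   # including one enter press per selected letter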
Let me know which steps I need to clarify further.

Local interpolation of surfaces using normal vectors

I need to interpolate a 3D surface given its points and normal vectors.
Given a point on its surface, I need to find where that point would be in space once the interpolation has been accounted for. I need to be able to do this for each triangle in isolation.
Here's what I'm trying to describe. I need the position of the point once the curve / surface has been interpolated.
If I was working in 2D:
3D:
I've come across this paper "Simple local interpolation of surfaces using normal vectors - Takashi Nagata" which I think demonstrates exactly what I'm looking for (section 2.2. Interpolation of a patch using normals), but the math is just beyond me.
What I'm trying to extract from it is a set of equations where the position and normals of the points comprising the triangle go in, as well as the point on the triangle, and the adjusted point comes out (like magic).
The paper looks like it's trying to fit a quadratic surface so that it matches the points and normals you have. The resulting surface is given by
p(s,t) = c00 + c10 s + c01 t + c11 s t + c20 s^2 + c02 t^2
where s,t are the two variables and c00 etc. are all vectors with three coordinates. s,t are chosen so that at s=0,t=0 it's your first point, s=1,t=0 is your second point and s=1,t=1 is your third point. Assuming we can find the various c00's, you can pick some values of s,t in the triangle to give a middle point; s=2/3, t=1/3 might be a fine candidate.
Finding c00 etc. will take some work. You probably need to implement eqn 15, which gives a curvature, as a function
vec3 c(vec3 D, vec3 n0, vec3 n1) {
    vec3 v  = (n0 + n1)/2;            // 12a
    vec3 dv = (n0 - n1)/2;            // 12b
    double d  = D.dot(v);             // 13a
    double dd = D.dot(dv);            // 13b
    double c  = n0.dot(n0 - 2*dv);    // 14a
    double dc = n0.dot(dv);           // 14b
    vec3 res;
    if( c == -1 || c == 1 )           // degenerate case: normals (anti)parallel
        res = vec3(0, 0, 0);
    else
        res = dd / (1 - dc) * v + d / dc * dv;
    return res;
}
assuming you have a vec3 class which can do basic vector operators.
With that defined, use 35, 36 to define the starting vectors and normals. Use 39 to define differences between pairs of points d1, d2, d3 and curvatures c1, c2, c3. Then use eq 44
x(η, ζ ) = x00(1 − η) + x10(η − ζ ) + x11ζ
− c1(1 − η)(η − ζ ) − c2(η − ζ )ζ − c3(1 − η)ζ
and you're done.
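As a rough sketch of how the quoted eq 44 could be evaluated in code (numpy arrays standing in for the vec3 class; x00, x10, x11 are the corner points and c1, c2, c3 the curvature vectors from the c() function above):

import numpy as np

def patch_point(x00, x10, x11, c1, c2, c3, eta, zeta):
    # Direct transcription of eq 44: a blend of the three corners
    # minus the three curvature correction terms.
    return (x00 * (1 - eta) + x10 * (eta - zeta) + x11 * zeta
            - c1 * (1 - eta) * (eta - zeta)
            - c2 * (eta - zeta) * zeta
            - c3 * (1 - eta) * zeta)

# With zero curvature vectors the patch degenerates to the flat triangle:
x00 = np.array([0.0, 0.0, 0.0])
x10 = np.array([1.0, 0.0, 0.0])
x11 = np.array([1.0, 1.0, 0.0])
zero = np.zeros(3)
print(patch_point(x00, x10, x11, zero, zero, zero, eta=2/3, zeta=1/3))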
For the record, and because I wanted to have this information somewhere on the web:
This is the 2D interpolation using the paper posted by the OP.
Here 1a and 1b are the boundary conditions, and equations 4a and 4b are the x and y components of the vector c needed for the interpolation.

How do I separate an encryption key into parts?

I have a 128 bit encryption key that I would like to break up into three parts that when XOR'ed together reproduce the key.
How do I do this?
Pick two other 128 bit values at random (random_1 and random_2), then work out the equations to see how it works:
key ^ random_1 = xor_1
Now split xor_1 the same way:
xor_1 ^ random_2 = xor_2
Flipping that equation around, we get:
xor_1 = xor_2 ^ random_2
Now substitute back into the first equation:
key = random_1 ^ xor_2 ^ random_2
So your code will just do xor_2 = key ^ random_1 ^ random_2, and you distribute random_1, random_2 and xor_2 (everything but the key).
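A minimal sketch of that in Python (secrets supplies the random values):

import secrets

key = secrets.token_bytes(16)   # the 128-bit key to split

random_1 = secrets.token_bytes(16)
random_2 = secrets.token_bytes(16)
xor_2 = bytes(k ^ r1 ^ r2 for k, r1, r2 in zip(key, random_1, random_2))

# Distribute random_1, random_2 and xor_2; any two of them alone reveal
# nothing about the key, but XORing all three reproduces it.
recombined = bytes(a ^ b ^ c for a, b, c in zip(random_1, random_2, xor_2))
assert recombined == key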
Just XOR the salt values in and then XOR them out to reverse it.
If key' = key ^ salt1 ^ salt2, then key = key' ^ salt1 ^ salt2.
It's pretty trivial to implement, but it's also pretty trivial to reverse engineer.
What are you trying to protect with this, and who are you trying to protect it from?
