Vigenere Cipher - decryption (by hand) - math

This is a Vigenere ciphertext:
EORLL TQFDI HOEZF CHBQN IFGGQ MBVXM SIMGK NCCSV
WSXYD VTLQS BVBMJ YRTXO JCNXH THWOD FTDCC RMHEH
SNXVY FLSXT ICNXM GUMET HMTUR PENSU TZHMV LODGN
MINKA DTLOG HEVNI DXQUG AZGRM YDEXR TUYRM LYXNZ
ZGJ
The index of coincidence gave a key length of six (6). I know this is right because I used an online Java applet to decrypt the whole thing using the key 'QUARTZ'.
However, in this question we are only told the first and last two letters of the Key - 'Q' and 'TZ.'
So far I have split the ciphertext into slices using this awesome applet. So the first slice is 0, k, 2k, 3k, 4k; the second is 1, k + 1, 2k + 1, 3k + 1; et cetera.
KeyPos=0: EQEQQSCXQJJHDEYIUTSVMTVUMTYJ
KeyPos=1: OFZNMICYSYCWCHFCMUULILNGYUX
KeyPos=2: RDFIBMSDBRNOCSLNERTONOIADYN
KeyPos=3: LICFVGVVVTXDRNSXTPZDKGDZERZ
KeyPos=4: LHHGXKWTBXHFMXXMHEHGAHXGXMZ
KeyPos=5: TOBGMNSLMOTTHVTGMNMNDEQRRLG
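For reference, the slicing above, and the index-of-coincidence check that suggested a key length of six, can be sketched in a few lines of Python. This is only a sketch: the function names are mine, not the applet's, and the applet may compute things differently. The IC of English text is roughly 0.066 and that of uniformly random letters about 0.038, so the key length whose slices average closest to the English value is the likely candidate.

```python
from collections import Counter

CIPHERTEXT = """
EORLL TQFDI HOEZF CHBQN IFGGQ MBVXM SIMGK NCCSV
WSXYD VTLQS BVBMJ YRTXO JCNXH THWOD FTDCC RMHEH
SNXVY FLSXT ICNXM GUMET HMTUR PENSU TZHMV LODGN
MINKA DTLOG HEVNI DXQUG AZGRM YDEXR TUYRM LYXNZ
ZGJ
"""

def split_into_slices(ciphertext: str, key_length: int) -> list[str]:
    """Slice i holds the letters at positions i, i + k, i + 2k, ..."""
    letters = [c for c in ciphertext.upper() if c.isalpha()]
    return ["".join(letters[i::key_length]) for i in range(key_length)]

def index_of_coincidence(text: str) -> float:
    """Probability that two letters drawn at random from text are equal."""
    n = len(text)
    if n < 2:
        return 0.0
    counts = Counter(text)
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))

def average_ic(ciphertext: str, key_length: int) -> float:
    """Mean IC over all slices for one candidate key length."""
    slices = split_into_slices(ciphertext, key_length)
    return sum(index_of_coincidence(s) for s in slices) / key_length

if __name__ == "__main__":
    # Print the average IC for each candidate key length.
    for k in range(1, 11):
        print(k, round(average_ic(CIPHERTEXT, k), 4))
    # Print the six slices in the same layout as the listing above.
    for pos, s in enumerate(split_into_slices(CIPHERTEXT, 6)):
        print(f"KeyPos={pos}: {s}")
```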
My idea was to calculate the highest-frequency letter in each block, hoping that the most frequent letter would give me some clue as to how to find 'U,' 'A' and 'R.' However, the most frequent letters in these blocks are:
KeyPos=0: Q,4 T,3 E,3 J,3
KeyPos=1: C,4 U,3 Y,3
KeyPos=2: N,4 O,3 R,3 D,3 B,2
KeyPos=3: V,4 D,3 Z,3
KeyPos=4: H,6 X,6 M,3 G,3
KeyPos=5: M,4 T,4 N,3 G,3
Which yields QCNVHM, or QUNVHM (being generous), neither of which is that close to QUARTZ. There are online applets that can crack this with no problem, so the text can't be too short to yield decent frequency counts from the blocks.
I guess I must be approaching this the wrong way. I just hoped one of you might be able to offer some clue as to where I am going wrong.
p.s. This is for a digital crypto class.

Interesting question...
I don't have a programmatic solution for cracking the original ciphertext, but I was able to solve it with a little mind power and some helpful JavaScript.
I started by using this page and the information you supplied. Provide the ciphertext, a key length of 6 and hit initialize. What's nice about the approach here is that unknowns in either the plaintext or key are left as hyphens.
Update the key, adding only what you know (Q---TZ), and click 'update plaintext'. At this point we know:
o---sua---opo---oca---nha---enc---rom---dth---ama---int---ept---our---mun---tio---ewi---eus---the---ond---loc---onf---now---hed---off---ere---nsw---esd---tmi---ght
Here's where I applied a bit of brain power. You start recognizing bits of the plaintext: the, now and off make an appearance. At the end, there's ght, which made me think the prior letter is likely a vowel, as in light or thought. I replaced the corresponding hyphen with u and clicked update keyword to find what key letter would have produced that combination. The matching letter turns out to be F. I then updated the plaintext to see the results. They didn't look promising. So I tried i instead, which resulted in:
o--usua--ropo--loca--onha--eenc--prom--edth--eama--eint--cept--gour--mmun--atio--wewi--beus--gthe--cond--yloc--ionf--mnow--thed--poff--mere--insw--nesd--atmi--ight
Now we're getting somewhere. At the start I see something that might be usual, and further in I see int--cept and near the end w--nesd-- at mi--ight. Voila. Filling in the letters for wednesday and updating the keyword yielded QUARTZ.
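The two operations the page performs can be sketched in Python. This is my own reconstruction of the idea, not the page's JavaScript, and the names `partial_decrypt` and `key_letter_for` are made up for illustration. Unknown key positions stay as hyphens, and a guessed plaintext letter is turned back into the key letter that would produce it.

```python
A = ord("A")

def partial_decrypt(ciphertext: str, key: str, unknown: str = "-") -> str:
    """Decrypt with a partly known key; unknown key positions yield hyphens."""
    letters = [c for c in ciphertext.upper() if c.isalpha()]
    out = []
    for i, c in enumerate(letters):
        k = key[i % len(key)].upper()
        if k == unknown:
            out.append(unknown)
        else:
            # Vigenere decryption: plaintext = ciphertext - key (mod 26).
            out.append(chr((ord(c) - ord(k)) % 26 + A).lower())
    return "".join(out)

def key_letter_for(cipher_letter: str, plain_guess: str) -> str:
    """Key letter that maps plain_guess onto cipher_letter."""
    return chr((ord(cipher_letter.upper()) - ord(plain_guess.upper())) % 26 + A)

if __name__ == "__main__":
    print(partial_decrypt("EORLL TQFDI HO", "Q---TZ"))  # o---sua---op
    # The ciphertext letter just before the final "ght" is Z (key position 3).
    print(key_letter_for("Z", "u"))  # F, the guess that did not work out
    print(key_letter_for("Z", "i"))  # R, the fourth letter of QUARTZ
```

Each guess fixes one key position, so every sixth letter of the plaintext fills in at once, which is what makes the manual approach converge so quickly.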
... So, how to port this approach to code? Not sure about the best way to do that just yet. The idea of using the known characters in the key, partially decrypting the ciphertext and brute forcing the rest is appealing. But without a dictionary handy, I'm not sure what the best brute-forcing method would be...
To be continued (maybe)...

An algorithm wouldn't just consider the most frequent letters but the frequency pattern of the whole alphabet. Technically you compute the index of coincidence for each possible shift and consider the maximal ones.
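A minimal sketch of that idea, assuming the usual published English letter frequencies (the table and the function names are mine): for each of the 26 possible key letters, undo the shift and correlate the slice's letter counts with the English distribution. On slices of only 27 letters the right key letter will not always come first, so the sketch returns a ranking rather than a single answer; the known letters Q, T and Z can then be used to sanity-check it.

```python
from collections import Counter
import string

ALPHABET = string.ascii_uppercase

# Approximate relative frequencies of letters in English text, in percent.
ENGLISH_FREQ = dict(zip(ALPHABET, [
    8.167, 1.492, 2.782, 4.253, 12.702, 2.228, 2.015, 6.094, 6.966,
    0.153, 0.772, 4.025, 2.406, 6.749, 7.507, 1.929, 0.095, 5.987,
    6.327, 9.056, 2.758, 0.978, 2.360, 0.150, 1.974, 0.074,
]))

def shift_scores(slice_text: str) -> dict[str, float]:
    """Correlation between the slice and English, for every key letter."""
    counts = Counter(slice_text.upper())
    scores = {}
    for shift, key_letter in enumerate(ALPHABET):
        score = 0.0
        for plain_index, plain in enumerate(ALPHABET):
            # Under this key letter, `plain` would be enciphered as `cipher`.
            cipher = ALPHABET[(plain_index + shift) % 26]
            score += counts[cipher] * ENGLISH_FREQ[plain]
        scores[key_letter] = score
    return scores

def rank_key_letters(slice_text: str, top: int = 3) -> list[str]:
    """Most plausible key letters for one slice, best first."""
    scores = shift_scores(slice_text)
    return sorted(scores, key=scores.get, reverse=True)[:top]

if __name__ == "__main__":
    # Slice for key position 1 from the question.
    print(rank_key_letters("OFZNMICYSYCWCHFCMUULILNGYUX"))
```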

Related

When checking a Bitcoin block, why do you get a leading prefix of zeros once you find the correct nonce?

I have recently been looking into Bitcoin and the proof of work system.
In this system, when mining, a user has a "challenge string" that they need to concatenate with the CORRECT "proof string" (nonce) and then hash. The outcome of that hash starts with a prefix of leading zeros, and that's how they verify the block.
My question is, when combining that "challenge string" and the CORRECT "proof string"(nonce), why does the corresponding hash of those values start with the prefix of zeros? How does that work?
The combination of "challenge string" and "proof string" is sent to a hashing function, which is a one-way function and results in an effectively random string. The condition on this string is that it must have x zeros at the beginning. Nothing makes the correct nonce produce those zeros; a nonce counts as correct only because its hash happens to meet the condition. The difficulty of guessing increases over time, which simply means that x increases.
The job of a miner is to keep guessing the "proof string" until the condition on the hash is met.
So, it is a pure guessing game. GPUs are very good at computing a huge number of hashes in parallel. That is why miners around the world are using top-class GPUs to mine Bitcoin.
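A toy version of the guessing game, as a sketch only. It assumes a single SHA-256 over the challenge followed by the nonce written as decimal text, and counts leading zeros in hex digits; real Bitcoin hashes a binary block header twice with SHA-256 and compares the result against a numeric target, so the function name and encoding here are illustrative.

```python
import hashlib
from itertools import count

def mine(challenge: bytes, zero_hex_digits: int) -> tuple[int, str]:
    """Try nonces 0, 1, 2, ... until the hash starts with enough zeros."""
    prefix = "0" * zero_hex_digits
    for nonce in count():
        digest = hashlib.sha256(challenge + str(nonce).encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest

if __name__ == "__main__":
    nonce, digest = mine(b"challenge string", 4)
    print(nonce, digest)
```

Each extra hex zero multiplies the expected number of guesses by 16, while checking a claimed nonce always costs a single hash.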
Not specific to Bitcoin but Bitcoin uses the same mechanism, see https://en.wikipedia.org/wiki/Hashcash for a description of how Proof Of Work "works" and why it works that way.
The short answer is that you are searching for a partial hash collision (https://en.wikipedia.org/wiki/Collision_resistance): an input whose hash begins with the same run of zeros as the target. The more leading zeros are required, the more difficult it is to find one.
In Bitcoin, the difficulty is chosen algorithmically based on the compute power of the Bitcoin network. It typically goes up, but should the compute power go down, the difficulty will also go down. The algorithm adjusts the difficulty to keep the time between blocks at around 10 minutes.

Reverse Engineering hash/encryption function

I have 5 numeric codes. They vary in length (8-10 digits). For each numeric code I have a corresponding alphanumeric code. The alphanumeric codes are always 8 characters long.
Now the problem. I know that by some process each numeric code is converted into its corresponding 8-character alphanumeric code, but I do not know the process used. At first I thought that the alphanumeric codes might be randomly generated using a seed from the numeric code, but that did not seem to work. Now I am thinking that some sort of hashing algorithm is being used to convert the numerics to the alphanumerics.
My question is
1) Can I brute force solve this?
2) If yes, then what algorithms should I look into that can convert a numeric code to an 8-character alphanumeric code?
3) Is there some other way to solve this?
Notes: The alpha-numeric codes are not case sensitive. I do not mind if a brute force search returns a few false positives because I will be able to narrow them down myself.
Clarification: I think the first guy misunderstood something. I know the exact values of these numeric and alphanumeric codes. I simply am not sharing them on the site. I'm not trying to randomly map codes to codes; I'm trying to find an algorithm that maps my specific codes to the outputs.
No, you cannot brute force this.
There are an unlimited number of functions that will map 5 inputs to 5 outputs. How would you know whether you found the right function? For example, you can use these 5 pairs as constraints for a polynomial of degree n. There are an infinite number of possible polynomial solutions.
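To make the polynomial point concrete, here is a small sketch with five made-up pairs (not the asker's codes, which were never posted). `lagrange` builds the degree-4 polynomial through the points, and `another_fit` adds a multiple of (x - x1)...(x - x5), which is zero at every known input, so each choice of `k` gives a different function that fits the same data.

```python
from fractions import Fraction

# Hypothetical input/output pairs, for illustration only.
POINTS = [(1, 10), (2, 7), (3, 42), (4, 0), (5, 99)]

def lagrange(points):
    """Return the unique lowest-degree polynomial through the points."""
    def p(x):
        total = Fraction(0)
        for i, (xi, yi) in enumerate(points):
            term = Fraction(yi)
            for j, (xj, _) in enumerate(points):
                if i != j:
                    term *= Fraction(x - xj, xi - xj)
            total += term
        return total
    return p

def another_fit(points, k):
    """A different polynomial that agrees with the points exactly."""
    base = lagrange(points)
    def q(x):
        vanish = 1
        for xi, _ in points:
            vanish *= (x - xi)  # zero at every known input
        return base(x) + k * vanish
    return q

if __name__ == "__main__":
    p, q = lagrange(POINTS), another_fit(POINTS, 3)
    print([int(p(x)) for x, _ in POINTS])  # reproduces the known outputs
    print([int(q(x)) for x, _ in POINTS])  # so does this one
    print(p(6), q(6))                      # but they disagree elsewhere
```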
If you can narrow the functions down, then there are additional constraints on the problem.
If you assume a hash function is used, you can try guessing that there is no salting, and the search space is over well known hash functions. If there is salting, you are stuck brute forcing all possible salts over all possible hash functions. With just the salts, you are probably looking at > 2^128 values. A brute force attack is not going to be useful.
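The unsalted case is cheap to test. The sketch below is written under two assumptions of mine: that the code is hashed as its decimal text, and that the 8-character output is a piece of the hex digest. An alphanumeric code using letters beyond a-f would need another encoding (base32, base36, ...) handled on top of this. It runs one known pair through every fixed-length algorithm that `hashlib` guarantees.

```python
import hashlib

def matching_hashes(plain: str, expected: str) -> list[str]:
    """Names of hashlib algorithms whose hex digest contains `expected`."""
    expected = expected.lower()
    hits = []
    for name in sorted(hashlib.algorithms_guaranteed):
        if name.startswith("shake"):
            continue  # variable-length output, would need a length guess
        try:
            digest = hashlib.new(name, plain.encode()).hexdigest()
        except ValueError:
            continue  # algorithm disabled on this platform
        if expected in digest:
            hits.append(name)
    return hits

if __name__ == "__main__":
    # Hypothetical pair, built from MD5 so that the demo finds something.
    code = "12345678"
    output = hashlib.md5(code.encode()).hexdigest()[:8]
    print(matching_hashes(code, output))
```

An empty result for all five real pairs would not prove much, but a hit on all five would settle the question.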
If a symmetric cipher is used, you have an instance of the known-plaintext problem: you hold matching inputs and outputs but not the key. Modern ciphers are intentionally designed with this attack in mind and use 128 bits or more of key space. Brute forcing all keys is not going to work.
You do not state anything about the function. Is it reversible? Is it randomized?

How to find Hash/Cipher

Is there any tool or method to figure out what this hash/cipher function is?
I only have a 500-item list of inputs and outputs. I know that all of the inputs are numeric and that the output is always a 2-byte value in hexadecimal representation.
here's some samples:
794352:6657
983447:efbf
479537:0796
793670:dee4
1063060:623c
1063059:bc1b
1063058:b8bc
1063057:b534
1063056:b0cc
1063055:181f
1063054:9f95
1063053:f73c
1063052:a365
1063051:1738
1063050:7489
I looked around and couldn't find any hash this short. Is this a hash folded on itself (with XOR, maybe)? Or maybe a simple trivial cipher?
Is there any tool or method for finding the output for other numbers?
(I want to figure this out; my next option would be training a neural network or regression, so I thought I'd ask before taking any drastic action.)
Edit: The Numbers are directory names, and for accessing them, the Hex parts are required.
Actually, Wikipedia's page on hashes lists three CRCs and three checksum methods that it could be. It could also be only half the output from some more complex hashing mechanism. Cross your fingers and hope that it's of the former. Hashes are specifically meant to be difficult (if not impossible) to reverse engineer.
What it's being used for should be a very strong hint about whether or not it's more likely to be a checksum/CRC or a hash.
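Checking the cheap candidates against the sample pairs is quick. In this sketch the list of candidates and the assumption that the number is hashed as ASCII decimal text are mine; the real function may use a binary encoding, a different CRC polynomial or a salt, in which case nothing here will match.

```python
import binascii
import zlib

# A few of the pairs from the question.
SAMPLES = {"794352": "6657", "983447": "efbf", "479537": "0796"}

# 16-bit candidates built from standard-library checksums.
CANDIDATES = {
    "crc_hqx, init 0": lambda d: binascii.crc_hqx(d, 0),
    "crc_hqx, init 0xFFFF": lambda d: binascii.crc_hqx(d, 0xFFFF),
    "crc32, low 16 bits": lambda d: zlib.crc32(d) & 0xFFFF,
    "crc32, high 16 bits": lambda d: zlib.crc32(d) >> 16,
    "adler32, low 16 bits": lambda d: zlib.adler32(d) & 0xFFFF,
    "byte sum mod 65536": lambda d: sum(d) & 0xFFFF,
}

def consistent_candidates(samples: dict[str, str]) -> list[str]:
    """Candidates that reproduce every sample output exactly."""
    hits = []
    for name, fn in CANDIDATES.items():
        if all(f"{fn(k.encode()):04x}" == v.lower() for k, v in samples.items()):
            hits.append(name)
    return hits

if __name__ == "__main__":
    print(consistent_candidates(SAMPLES))
```

With 500 pairs available, any candidate that survives all of them is almost certainly the real function, since a 16-bit output matches by chance only once in 65536 tries per pair.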

Finding similar hashes

I'm trying to find 2 different plain text words that create very similar hashes.
I'm using the hashing method 'whirlpool', but I don't really need my question to be answered for whirlpool; if you can answer using md5 or something easier, that's ok.
The similarity I'm looking for is that they contain the same number of each character (it doesn't matter how much they're jumbled up)
i.e
plaintext 'test'
hash 1: abbb5 has 1 a , 3 b's , one 5
plaintext 'blahblah'
hash 2: b5bab must have the same, but it doesn't matter in what order.
I'm sure I can read up on how they're created and break it down and reverse it, but I am just wondering if what I'm talking about occurs.
I'm wondering because I haven't found a match of what I'm explaining (I created a PoC to run through random words / letters till it recreated a similar match), but then again it would take forever doing it the way I was doing it, and I was wondering if anyone with real knowledge of hashes / encryption could help me out.
So you can do it like this:
1) create an empty sorted map;
2) create a 64 bit counter (you don't need more than 2^63 inputs, in all probability, since you would be dead before they would be calculated - unless quantum computing really takes off);
3) use the counter as input, probably easiest to encode it in 8 bytes;
4) use this as input for your hash function;
5) encode output of hash in hex (use ASCII bytes, for speed);
6) sort the hex characters numerically / alphabetically (same thing really);
7) check if the sorted hex result is a key in the map;
8) if it is, show the hex result, the old counter from the map & the current counter (and stop);
9) if it isn't, put the sorted hex result in the map, with the counter as value;
10) increase counter, goto 3.
That's all folks. Results for SHA-1:
011122344667788899999aaaabbbcccddeeeefff for both 320324 and 429678
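A sketch of those steps in Python. The big-endian 8-byte encoding is my guess, so the counters it finds need not be the two quoted above, and a plain dict stands in for the sorted map since only key lookups are needed. The `hex_chars` parameter is an addition of mine that truncates the digest so a demo finishes instantly; leave it as `None` to search on the full hash.

```python
import hashlib
from itertools import count

def find_anagram_hashes(algorithm="sha1", hex_chars=None, limit=None):
    """Find two counters whose hex digests use the same characters.

    Returns (first_counter, second_counter, sorted_hex) or None if
    `limit` counters were tried without success.
    """
    seen = {}  # sorted hex string -> counter that produced it
    for counter in count():
        if limit is not None and counter >= limit:
            return None
        data = counter.to_bytes(8, "big")  # assumed encoding of the counter
        digest = hashlib.new(algorithm, data).hexdigest()[:hex_chars]
        key = "".join(sorted(digest))
        if key in seen:
            return seen[key], counter, key
        seen[key] = counter

if __name__ == "__main__":
    # Truncated to 8 hex characters so the demo is quick.
    print(find_anagram_hashes("sha1", hex_chars=8))
```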
I don't know why you want to do this for hex, the hashes will be so large that they won't look too much alike. If your alphabet is smaller, your code will run (even) quicker. If you use whole output bytes (i.e. 00 to FF instead of 0 to F) instead of hex, it will take much more time - a quick (non-optimized) test on my machine shows it doesn't finish in minutes and then runs out of memory.

Cracking the Playfair cipher

I have the ciphertext and an encrypting program (with the key hardcoded in). How would I go about finding the key? Surely the availability of the encryptor must open up possibilities beyond brute-forcing it.
Yes, knowing the algorithm may help in decoding the cypher text, but only if there is a flaw in the algorithm that may be exploited. (The good news is that Playfair has some flaws that can be exploited.)
Here are a few good starting points to read.
Wikipedia (read it all - particularly Cryptanalysis)
Basic Cryptanalysis (look at chapter 7)
The second one is not what I would call a light read, but interesting if you're into cyphers.
I've found a way in five lines (obviously re-evaluated a bit, and admittedly very long lines):
(a,b,c)="".join((input("CODE: ")).split()),input("Polybius Square: "),""
for i in a:
    c+=str(int(((b.find(i))-((b.find(i))%5))/5))+str((b.find(i))%5)
for j in range(0,(int(len(c)/2))):
    print((b[((5*(int((c[:(int(len(c)/2))])[j])))+(int((c[(int(len(c)/2)):])[j])))]).lower(),end="")
NB: When prompted to enter the polybius square, first enter row 1, then row 2 etc, no spaces
Then you just have to remove unnecessary 'x's and voila!
Try:
(a,b,f,g,c)="".join(input("CODE: ").split()),input("Polybius S: "),"","",1
for(i)in(a):
    if(c%2)==0:
        g+=i
    else:
        f+=i
    c+=1
for(j)in(range(0,len(f))):
    if(b.find(f[j])%5)!=(b.find(g[j])%5)and(int(((b.find(f[j]))-(b.find(f[j])%5))/5))!=(int(((b.find(g[j]))-(b.find(g[j])%5))/5)):
        print(b[((int(((b.find(f[j]))-(b.find(f[j])%5))/5))*5)+(b.find(g[j])%5)],end="")
        print(b[((int(((b.find(g[j]))-(b.find(g[j])%5))/5))*5)+(b.find(f[j])%5)],end="")
    elif(b.find(f[j])%5)==(b.find(g[j])%5)and(int(((b.find(f[j]))-(b.find(f[j])%5))/5))!=(int(((b.find(g[j]))-(b.find(g[j])%5))/5)):
        print(b[((((int(((b.find(f[j]))-(b.find(f[j])%5))/5))-1)%5)*5)+b.find(f[j])%5],end="")
        print(b[((((int(((b.find(g[j]))-(b.find(g[j])%5))/5))-1)%5)*5)+b.find(g[j])%5],end="")
    elif(b.find(f[j])%5)!=(b.find(g[j])%5)and(int(((b.find(f[j]))-(b.find(f[j])%5))/5))==(int(((b.find(g[j]))-(b.find(g[j])%5))/5)):
        print(b[((int(((b.find(f[j]))-(b.find(f[j])%5))/5))*5)+((b.find(f[j])%5)-1)%5],end="")
        print(b[((int(((b.find(g[j]))-(b.find(g[j])%5))/5))*5)+((b.find(g[j])%5)-1)%5],end="")
It's not pretty, but it works
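For anyone who wants to read along, here is a sketch of the same three decryption rules with the row and column made explicit through `divmod`. It is a rewrite of mine rather than the poster's code, it returns the text instead of printing it, and it assumes the square is passed as 25 uppercase letters, row by row. The example square is the one built from the keyword "playfair example" in the Wikipedia article.

```python
def playfair_decrypt(ciphertext: str, square: str) -> str:
    """Decrypt Playfair ciphertext given the 5x5 key square as one string."""
    letters = [c for c in ciphertext.upper() if c.isalpha()]
    out = []
    # Walk through the ciphertext one digraph at a time.
    for first, second in zip(letters[0::2], letters[1::2]):
        r1, c1 = divmod(square.index(first), 5)
        r2, c2 = divmod(square.index(second), 5)
        if r1 == r2:
            # Same row: each letter moves one column to the left.
            c1, c2 = (c1 - 1) % 5, (c2 - 1) % 5
        elif c1 == c2:
            # Same column: each letter moves one row up.
            r1, r2 = (r1 - 1) % 5, (r2 - 1) % 5
        else:
            # Rectangle: keep the rows, swap the columns.
            c1, c2 = c2, c1
        out.append(square[r1 * 5 + c1] + square[r2 * 5 + c2])
    return "".join(out)

if __name__ == "__main__":
    square = "PLAYFIREXMBCDGHKNOQSTUVWZ"
    print(playfair_decrypt("BMODZBXDNABEKUDMUIXMMOUVIF", square))
    # HIDETHEGOLDINTHETREXESTUMP
```

As with the version above, the padding X still has to be removed by hand. Note that this only decrypts once the square is known; it does not recover the key.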
