Cracking the Playfair cipher

Cracking the Playfair cipher - encryption

I have the ciphertext and an encrypting program (with the key hardcoded in). How would I go about finding the key? Surely the availability of the encryptor must open up possibilities beyond brute-forcing it.

Yes knowing the algorithm may help in decoding the cypher text, but only if there is a flaw in the algorithm that may be exploited. (the good news is Playfair has some flaws that can be exploited)
Here are a few good starting points to read.
Wikipedia (read it all - particularly Cryptanalysis)
Basic Cryptanalysis (look at chapter 7)
The second one is not what I would call a light read, but interesting if you're into cyphers.

I've found a way in five lines (obviously re-evaluated a bit, and admittedly very long lines):
(a,b,c)="".join((input("CODE: ")).split()),input("Polybius Square: "),""
for i in a:
c+=str(int(((b.find(i))-((b.find(i))%5))/5))+str((b.find(i))%5)
for j in range(0,(int(len(c)/2))):
print((b[((5*(int((c[:(int(len(c)/2))])[j])))+(int((c[(int(len(c)/2)):])[j])))]).lower(),end="")
NB: When prompted to enter the polybius square, first enter row 1, then row 2 etc, no spaces
Then you just have to remove unnecessary 'x's and voila!

Try:
(a,b,f,g,c)="".join(input("CODE: ").split()),input("Polybius S: "),"","",1
for(i)in(a):
if(c%2)==0:
g+=i
else:
f+=i
c+=1
for(j)in(range(0,len(f))):
if(b.find(f[j])%5)!=(b.find(g[j])%5)and(int(((b.find(f[j]))-(b.find(f[j])%5))/5))!=(int(((b.find(g[j]))-(b.find(g[j])%5))/5)):
print(b[((int(((b.find(f[j]))-(b.find(f[j])%5))/5))*5)+(b.find(g[j])%5)],end="")
print(b[((int(((b.find(g[j]))-(b.find(g[j])%5))/5))*5)+(b.find(f[j])%5)],end="")
elif(b.find(f[j])%5)==(b.find(g[j])%5)and(int(((b.find(f[j]))-(b.find(f[j])%5))/5))!=(int(((b.find(g[j]))-(b.find(g[j])%5))/5)):
print(b[((((int(((b.find(f[j]))-(b.find(f[j])%5))/5))-1)%5)*5)+b.find(f[j])%5],end="")
print(b[((((int(((b.find(g[j]))-(b.find(g[j])%5))/5))-1)%5)*5)+b.find(g[j])%5],end="")
elif(b.find(f[j])%5)!=(b.find(g[j])%5)and(int(((b.find(f[j]))-(b.find(f[j])%5))/5))==(int(((b.find(g[j]))-(b.find(g[j])%5))/5)):
print(b[((int(((b.find(f[j]))-(b.find(f[j])%5))/5))*5)+((b.find(f[j])%5)-1)%5],end="")
print(b[((int(((b.find(g[j]))-(b.find(g[j])%5))/5))*5)+((b.find(g[j])%5)-1)%5],end="")
It's not pretty, but it works

Related

cipher shift to encrypt a number and letters

I am searching for a simple code that encrypt a random number and letters with cipher shift. Example: -3 abcdefghijklmnopqrstuvwxyz output xyzabcdefghijklmnopqrstuvw

Be aware, that this is no encryption.
You are simply obfuscating the input. But this is being revealed quite easily.
Your question itself is very ambiguous. In what language do you want to look for? Please ask questions more specific.
Furthermore you could have look onto wikipedia:
https://en.wikipedia.org/wiki/Caesar_cipher
This is what you want to do. The code itself comes with the language.
Best,
Bent

Human readable alternative for UUIDs

I am working on a system that makes heavy use of pseudonyms to make privacy-critical data available to researchers. These pseudonyms should have the following properties:
They should not contain any information (e.g. time of creation, relation to other pseudonyms, encoded data, …).
It should be easy to create unique pseudonyms.
They should be human readable. That means they should be easy for humans to compare, copy, and understand when read out aloud.
My first idea was to use UUID4. They are quite good on (1) and (2), but not so much on (3).
An variant is to encode UUIDs with a wider alphabet, resulting in shorter strings (see for example shortuuid). But I am not sure whether this actually improves readability.
Another approach I am currently looking into is a paper from 2005 titled "An optimal code for patient identifiers" which aims to tackle exactly my problem. The algorithm described there creates 8-character pseudonyms with 30 bits of entropy. I would prefer to use a more widely reviewed standard though.
Then there is also the git approach: only display the first few characters of the actual pseudonym. But this would mean that a pseudonym could lose its uniqueness after some time.
So my question is: Is there any widely-used standard for human-readable unique ids?

Not aware of any widely-used standard for this. Here’s a non-widely-used one:
Proquints
https://arxiv.org/html/0901.4016
https://github.com/dsw/proquint
A UUID4 (128 bit) would be converted into 8 proquints. If that’s too much, you can take the last 64 bits of the UUID4 (= just take 64 random bits). This doesn’t make it magically lose uniqueness; only increases the likelihood of collisions, which was non-zero to begin with, and which you can estimate mathematically to decide if it’s still OK for your purposes.

Here you go UUID Readable
Generate Easy to Remember, Readable UUIDs, that are Shakespearean and Grammatically Correct Sentences

This article suggests to use the first few characters from a SHA-256 hash, similarly to what git does. UUIDs are typically based on SHA-1, so this is not all that different. The tradeoff between property (2) and (3) is in the number of characters.
With d being the number of digits, you get 2 ** (4 * d) identifiers in total, but the first collision is expected to happen after 2 ** (2 * d).
The big question is really not about the kind of identifier you use, it is how you handle collisions.

How to find Hash/Cipher

is there any tool or method to figure out what is this hash/cipher function?
i have only a 500 item list of input and output plus i know all of the inputs are numeric, and output is always 2 Byte long hexadecimal representation.
here's some samples:
794352:6657
983447:efbf
479537:0796
793670:dee4
1063060:623c
1063059:bc1b
1063058:b8bc
1063057:b534
1063056:b0cc
1063055:181f
1063054:9f95
1063053:f73c
1063052:a365
1063051:1738
1063050:7489
i looked around and couldn't find any hash this short, is this a hash folded on itself? (with xor maybe?) or maybe a simple trivial cipher?
is there any tool or method for finding the output of other numbers?
(i want to figure this out; my next option would be training a Neural Network or Regression, so i thought i ask before taking any drastic action )
Edit: The Numbers are directory names, and for accessing them, the Hex parts are required.

Actually, Wikipedia's page on hashes lists three CRCs and three checksum methods that it could be. It could also be only half the output from some more complex hashing mechanism. Cross your fingers and hope that it's of the former. Hashes are specifically meant to be difficult (if not impossible) to reverse engineer.
What it's being used for should be a very strong hint about whether or not it's more likely to be a checksum/CRC or a hash.

Finding similar hashes

I'm trying to find 2 different plain text words that create very similar hashes.
I'm using the hashing method 'whirlpool', but I don't really need my question to be answered in the case or whirlpool, if you can using md5 or something easier that's ok.
The similarities i'm looking for is that they contain the same number of letters (doesnt matter how much they're jangled up)
i.e
plaintext 'test'
hash 1: abbb5 has 1 a , 3 b's , one 5
plaintext 'blahblah'
hash 2: b5bab must have the same, but doesnt matter what order.
I'm sure I can read up on how they're created and break it down and reverse it, but I am just wondering if what I'm talking about occurs.
I'm wondering because I haven't found a match of what I'm explaining (I created a PoC to run threw random words / letters till it recreated a similar match), but then again It would take forever doing it the way i was dong it. and was wondering if anyone with real knowledge of hashes / encryption would help me out.

So you can do it like this:
create an empty sorted map \
create a 64 bit counter (you don't need more than 2^63 inputs, in all probability, since you would be dead before they would be calculated - unless quantum crypto really takes off)
use the counter as input, probably easiest to encode it in 8 bytes;
use this as input for your hash function;
encode output of hash in hex (use ASCII bytes, for speed);
sort hex on number / alphabetically (same thing really)
check if sorted hex result is a key in the map
if it is, show hex result, the old counter from the map & the current counter (and stop)
if it isn't, put the sorted hex result in the map, with the counter as value
increase counter, goto 3
That's all folks. Results for SHA-1:
011122344667788899999aaaabbbcccddeeeefff for both 320324 and 429678
I don't know why you want to do this for hex, the hashes will be so large that they won't look too much alike. If your alphabet is smaller, your code will run (even) quicker. If you use whole output bytes (i.e. 00 to FF instead of 0 to F) instead of hex, it will take much more time - a quick (non-optimized) test on my machine shows it doesn't finish in minutes and then runs out of memory.

Vigenere Cipher - decryption (by hand)

This is a Vigenere cipher-text
EORLL TQFDI HOEZF CHBQN IFGGQ MBVXM SIMGK NCCSV
WSXYD VTLQS BVBMJ YRTXO JCNXH THWOD FTDCC RMHEH
SNXVY FLSXT ICNXM GUMET HMTUR PENSU TZHMV LODGN
MINKA DTLOG HEVNI DXQUG AZGRM YDEXR TUYRM LYXNZ
ZGJ
The index of coincidence gave a shift of six (6): I know this is right (I used an online Java applet to decrypt the whole thing using the key 'QUARTZ').
However, in this question we are only told the first and last two letters of the Key - 'Q' and 'TZ.'
So far I have split the ciphertext into slices using this awesome applet. So the first slice is 0, k, 2k, 3k, 4k; the second is 1, k + 1, 2k + 1, 3k + 1; et cetera.
KeyPos=0: EQEQQSCXQJJHDEYIUTSVMTVUMTYJ
KeyPos=1: OFZNMICYSYCWCHFCMUULILNGYUX
KeyPos=2: RDFIBMSDBRNOCSLNERTONOIADYN
KeyPos=3: LICFVGVVVTXDRNSXTPZDKGDZERZ
KeyPos=4: LHHGXKWTBXHFMXXMHEHGAHXGXMZ
KeyPos=5: TOBGMNSLMOTTHVTGMNMNDEQRRLG
My idea was to calculate the highest-frequency letter in each block, hoping that the most frequent letter would give me some clue as to how to find 'U,' 'A' and 'R.' However, the most frequent letters in these blocks are:
KeyPos=0: Q,4 T,3 E,3, J,3
KeyPos=1: C,4 U,3 Y,3
KeyPos=2: N,4 O,3 R,3 D,3 B,2
KeyPos=3: V,4 D,3 Z,3
KeyPos=4: H,6 X,6 M,3 G,3
KeyPos=5: M,4 T,4 N,3 G,3
Which yields QCNVHM, or QUNVHM (being generous), neither of which are that close to QUARTZ. There are online applets that can crack this no problem, so it mustn't be too short a text to yield decent frequency counts from the blocks.
I guess I must be approaching this the wrong way. I just hoped one of you might be able to offer some clue as to where I am going wrong.
p.s. This is for a digital crypto class.

Interesting question...
I don't have a programmatic solution for cracking the original ciphertext, but I was able to solve it with a little mind power and some helpful JavaScript.
I started by using this page and the information you supplied. Provide the ciphertext, a key length of 6 and hit initialize. What's nice about the approach here is that unknowns in either the plaintext or key are left as hyphens.
Update the key, adding only what you know Q---TZ and click 'update plaintext'. At this point we know:
o---sua---opo---oca---nha---enc---rom---dth---ama---int---ept---our---mun---tio---ewi---eus---the---ond---loc---onf---now---hed---off---ere---nsw---esd---tmi---ght
Here's where I applied a bit of brain power. You start recognizing bits of the plaintext. the, now and off make an appearance. At the end, there's ght - this made me think the prior letter is likely a vowel. For example light or thought. I replaced the corresponding hyphen with u and clicked update keyword to find what letter would have produced that combination. The matching letter turns out to be F. I think updated the plaintext to see the results. They didn't look promising. So I tried i instead which resulted in:
o--usua--ropo--loca--onha--eenc--prom--edth--eama--eint--cept--gour--mmun--atio--wewi--beus--gthe--cond--yloc--ionf--mnow--thed--poff--mere--insw--nesd--atmi--ight
Now we're getting somewhere. At the start I see something that might be usual, and further in I see int--cept and near the end w--nesd-- at mi--ight. Voila. Filling in the letters for wednesday and updating the keyword yielded QUARTZ.
... So, how to port this approach to code? Not sure about the best way to do that just yet. The idea of using the known characters in the key, partially decrypting the ciphertext and brute forcing the rest is appealing. But without a dictionary handy, I'm not sure what the best brute-forcing method would be...
To be continued (maybe)...

An algorithm wouldn't just consider the most frequent letters but the frequency pattern of the whole alphabet. Technically you compute the index of coincidence for each possible shift and consider the maximal ones.

Develop Reference

r css asp.net wordpress firebase qt symfony nginx http apache-flex