When checking a Bitcoin block, why do you get a leading prefix of zeros once you find the correct nonce?

When checking a Bitcoin block, why do you get a leading prefix of zeros once you find the correct nonce? - encryption

I have recently been looking into Bitcoin and the proof of work system.
In this system, when mining, a user has a "challenge string" that they need to concatenate with the CORRECT "proof string"(nonce) and hash, the outcome of that hash starts with a prefix of leading zeros and that's how they verify the block.
My question is, when combining that "challenge string" and the CORRECT "proof string"(nonce), why does the corresponding hash of those values start with the prefix of zeros? How does that work?

The combination of "Challenge string" and "proof of string" is sent to a hashing function, which is a one way function and results in a "random string" (The condition on "random string" is that it should have x number of zeroes in the beginning. The difficulty of guessing increases day to day, which is nothing but x increases).
The job of a Miner is to guess the "proof of string" until the condition on the "random string" is met.
So, it is a pure guessing game. GPUs are very good at generating random numbers very quickly. That is why miners around the world are using top class GPUs to mine the bitcoin transactions.

Not specific to Bitcoin but Bitcoin uses the same mechanism, see https://en.wikipedia.org/wiki/Hashcash for a description of how Proof Of Work "works" and why it works that way.
The short answer is that you are creating a hash collision (https://en.wikipedia.org/wiki/Collision_resistance) and the more leading zeros there are the more difficult it is to create the collision.
In Bitcoin, the difficulty is algorithmically chosen based on the compute on the bitcoin network. This typically only goes up, but should the compute go down the dificulty will also go down. The algorithm adjusts dificulty to keep transaction verification time around 10 minutes.

Related

Recognising hash/checksum

Hello everyone,
In one of projects I've found that data encrypts with algorithm I have never meet before. It converts input string to 10-symbol numeric string. It's definitely not one of popular hashes (md5, sha1, etc) or check sums (crc16, crc32, etc), I've checked all I known.
Example:
Input string: "davidc"; Output string: "2172453193".
Length of output string is fixed - 10 symbols. It contains only digits ("0123456789"). This is all details i can add =/
Maybe someone met something similar to that or maybe know algorithm - You'll economy lot of my time.
With love <3

why don't we use simple encryption?

What i mean by this is that if i create a Lua program that randomly assigns numbers and letters to a three digit code wouldn't this code then be almost unbreakable(like if someone that wasn't supposed to got it) unless you have the program? sorry if this was already asked could some1 direct me to it.

Simple encryption is not used because it is not sufficiently secure. We use a level of encryption necessary to meet the required security level to successfully defend against attackers.
Attackers range from a curious friend to nation states, think the NSA, GCHQ, KGB & etc.
"Schneier's Law": Anyone, from the most clueless amateur to the best cryptographer, can create an algorithm that he himself can't break.

What you are describing is a called a substitution cipher.
These are broken by using frequency analysis. Because each letter and number is always assigned to the same code, letters in the input will lead to the corresponding codes appearing in the output with the same frequency. The cryptanalyst will study the kinds of data he expects to be input to the cipher, and find common symbols and patterns, then match those with frequent patterns in the output. For example, if the input is English text, the cryptanalyst knows that the most frequent codes represent E, T, A, O, I, … and the most common sequences are "THE", "BE", "OF", "AND", etc.
Codes in the output should occur with uniform probability. When there is a bias in the output, it can be exploited to break the code. One way to avoid this, in basic terms, is to use a different "code book" for each letter in the input. So "E" doesn't always translate to the same code; it would depend on the position of the "E" in the message.

What is the name for encoding/encrypting with noise padding?

I want code to render n bits with n + x bits, non-sequentially. I'd Google it but my Google-fu isn't working because I don't know the term for it.
For example, the input value in the first column (2 bits) might be encoded as any of the output values in the comma-delimited second column (4 bits) below:
0 1,2,7,9
1 3,8,12,13
2 0,4,6,11
3 5,10,14,15
My goal is to take a list of integer IDs, and transform them in a way they can still be used for persistent URLs, but that can't be iterated/enumerated sequentially, and where a client cannot determine programmatically if a URL in a search result set has been visited previously without visiting it again.

I would term this process "encoding". You'll see something similar done to permit the use of communications channels that have special symbols that are not permitted in data. Examples: uuencoding and base64 encoding.
That said, you still need to (and appear at first blush to have) ensure that there is only one correct de-code; and accept the increase in size of the output (in the case above, the output will be double the size, bit-for-bit as the input).
I think you'd be better off encrypting the number with a cheap cypher + a constant secret key stored on your server(s), adding a random character or four at the end, and a cheap checksum, and simply reject any responses that don't have a valid checksum.
<encrypt(secret)>
<integer>+<random nonsense>
</encrypt>
+
<checksum()>
<integer>+<random nonsense>
</checksum>
Then decrypt the first part (remember, cheap == fast), validate the ciphertext using the checksum, throw off the random nonsense, and use the integer you stored.
There are probably some cryptographic no-no's here, but let's face it, the cost of this algorithm being broken is a touch on the low side.

Vigenere Cipher - decryption (by hand)

This is a Vigenere cipher-text
EORLL TQFDI HOEZF CHBQN IFGGQ MBVXM SIMGK NCCSV
WSXYD VTLQS BVBMJ YRTXO JCNXH THWOD FTDCC RMHEH
SNXVY FLSXT ICNXM GUMET HMTUR PENSU TZHMV LODGN
MINKA DTLOG HEVNI DXQUG AZGRM YDEXR TUYRM LYXNZ
ZGJ
The index of coincidence gave a shift of six (6): I know this is right (I used an online Java applet to decrypt the whole thing using the key 'QUARTZ').
However, in this question we are only told the first and last two letters of the Key - 'Q' and 'TZ.'
So far I have split the ciphertext into slices using this awesome applet. So the first slice is 0, k, 2k, 3k, 4k; the second is 1, k + 1, 2k + 1, 3k + 1; et cetera.
KeyPos=0: EQEQQSCXQJJHDEYIUTSVMTVUMTYJ
KeyPos=1: OFZNMICYSYCWCHFCMUULILNGYUX
KeyPos=2: RDFIBMSDBRNOCSLNERTONOIADYN
KeyPos=3: LICFVGVVVTXDRNSXTPZDKGDZERZ
KeyPos=4: LHHGXKWTBXHFMXXMHEHGAHXGXMZ
KeyPos=5: TOBGMNSLMOTTHVTGMNMNDEQRRLG
My idea was to calculate the highest-frequency letter in each block, hoping that the most frequent letter would give me some clue as to how to find 'U,' 'A' and 'R.' However, the most frequent letters in these blocks are:
KeyPos=0: Q,4 T,3 E,3, J,3
KeyPos=1: C,4 U,3 Y,3
KeyPos=2: N,4 O,3 R,3 D,3 B,2
KeyPos=3: V,4 D,3 Z,3
KeyPos=4: H,6 X,6 M,3 G,3
KeyPos=5: M,4 T,4 N,3 G,3
Which yields QCNVHM, or QUNVHM (being generous), neither of which are that close to QUARTZ. There are online applets that can crack this no problem, so it mustn't be too short a text to yield decent frequency counts from the blocks.
I guess I must be approaching this the wrong way. I just hoped one of you might be able to offer some clue as to where I am going wrong.
p.s. This is for a digital crypto class.

Interesting question...
I don't have a programmatic solution for cracking the original ciphertext, but I was able to solve it with a little mind power and some helpful JavaScript.
I started by using this page and the information you supplied. Provide the ciphertext, a key length of 6 and hit initialize. What's nice about the approach here is that unknowns in either the plaintext or key are left as hyphens.
Update the key, adding only what you know Q---TZ and click 'update plaintext'. At this point we know:
o---sua---opo---oca---nha---enc---rom---dth---ama---int---ept---our---mun---tio---ewi---eus---the---ond---loc---onf---now---hed---off---ere---nsw---esd---tmi---ght
Here's where I applied a bit of brain power. You start recognizing bits of the plaintext. the, now and off make an appearance. At the end, there's ght - this made me think the prior letter is likely a vowel. For example light or thought. I replaced the corresponding hyphen with u and clicked update keyword to find what letter would have produced that combination. The matching letter turns out to be F. I think updated the plaintext to see the results. They didn't look promising. So I tried i instead which resulted in:
o--usua--ropo--loca--onha--eenc--prom--edth--eama--eint--cept--gour--mmun--atio--wewi--beus--gthe--cond--yloc--ionf--mnow--thed--poff--mere--insw--nesd--atmi--ight
Now we're getting somewhere. At the start I see something that might be usual, and further in I see int--cept and near the end w--nesd-- at mi--ight. Voila. Filling in the letters for wednesday and updating the keyword yielded QUARTZ.
... So, how to port this approach to code? Not sure about the best way to do that just yet. The idea of using the known characters in the key, partially decrypting the ciphertext and brute forcing the rest is appealing. But without a dictionary handy, I'm not sure what the best brute-forcing method would be...
To be continued (maybe)...

An algorithm wouldn't just consider the most frequent letters but the frequency pattern of the whole alphabet. Technically you compute the index of coincidence for each possible shift and consider the maximal ones.

Is there any decryption algorithm that uses a dictionary to decrypt an encrypted algorithm?

Well I have been working on an assigment and it states:
A program has to be developed, and coded in C language, to decipher a document written
in Italian that is encoded using a secret key. The secret key is obtained as random
permutation of all the uppercase letters, lowercase letters, numbers and blank space. As
an example, let us consider the following two strings:
Plain: “ABCDEFGHIJKLMNOPQRSTUVXWYZabcdefghijklmnopqrstuvwxyz0123456789 ”
Code: “BZJ9y0KePWopxYkQlRjhzsaNTFAtM7H6S24fC5mcIgXbnLOq8Uid 3EDv1ruVGw”
The secret key modifies only letters, numbers, and spaces of the original document, while
the remaining characters are left unchanged. The document is stored in a text file whose
length is unknown.
The program has to read the document, find the secret key (which by definition is
unknown; the above table is just an example and it is not the key used for preparing the
sample files available on the web course) using a suitable decoding algorithm, and write
the decoded document to a new text file.
And I know that I have to upload an English dictionary into the program but I don't why it has been asked (may be not in that statement but I have to do THAT). My question is, while I can do that program using simple encryption/decryption algorithm then what's the use of uploading the English dictionary in our program? So is there any decryption algorithm that uses a dictionary to decrypt an encrypted algorithm? Or can somebody tell me what approach or algorithm should I use to solve that problem???
An early reply (and also authentic one) will be highly appreciated from you.
Thank you guys.

This is a simple substitution cipher. It can be broken using frequency analysis. The Wikipedia articles explain both concepts thoroughly. What you need to do is:
Find the statistical frequency of characters in Italian texts. If you can't find this published anywhere, you can build it yourself by analyzing a large corpus of Italian texts.
Analyze the frequency of characters in the cipher text, and match it to the statistical data.
The first Wikipedia article links to a set of tools that implement all of the above. You just need to use and possibly adapt it to your use case.

Your cipher is a substitution cipher. That is it substitutes one letter for another.
consider the cipher text
"yjr,1drv2ry1od1q1..."
We can use a dictionary to find the plaintext.
Find punctuation, since a space always follows a comma, you can find the substitution rule for spaces.
which gives you.
"yjr, drv2ry od q..."
Notice the word lengths. Since there only two 1 letter words in the english language the q is probably i or a. "yjr" is probably "why", "the", "how" etc.
We try why with the result
"why, dyv2yw od q..."
There are no english words with two y's, and end in w.
So we try "the" and get
"the, dev2et od q..."
We conclude that the is a likely answer.
Now we search our dictionary for words that start look like ?e??et.
rinse repeat.
That is, find some set of words which fit into the lengths available and do not break each others substitution rules.
Personally I just do the frequency analysis suggested above.

Frequency analysis, as both other respondents said, is the way to go, and you can use digrams and trigrams to make it much stronger. Just grab tons of Italian text from the web and churn ahead! It's really pretty simple programming.