I am looking for a hash function with the following properties:
It maps an arbitrary string uniformly between 0 and 1
The hash function output is independent of the length of the string
The hash function accepts a random seed
For a given random seed, the mapping from string to (0,1) is deterministic: if Alice and Bob calculate the hash for a given string and random seed, they will both get the same value
I am not worried about security. I don't care if someone could, in theory, reconstruct the set of strings given the random seed and hash value
Would be great to get some ideas.
If you don't like this 'solution' explain why not and you'll get better answers.
Take the ASCII code table and throw away the codes for non-characters such as 'bell', you'll be left with approximately 100 characters.
Make a 1:1 mapping between characters and 2-digit numbers, e.g. you might start with
space <-> 00
! <-> 01
A <-> 33
...
Z <-> 58
...
a <-> 65
I expect you get the picture. Now encode the first 32 (or however many) characters of your string in the obvious way, e.g.
`Aa aa` -> `3365006565`
and pad any strings shorter than 32 characters with 00. (I couldn't be bothered typing all the 00s for the example.)
Generate a random number in the range [1,64] from the seed and rotate the numeric string left by that number of places.
Slap a decimal point on the front of what is left and you have the sought-for real number.
I believe that this satisfies your requirements.
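The steps above can be sketched in Python as follows. This is only a minimal sketch: the function names `encode_digits` and `seeded_fraction` are made up for illustration, the 2-digit code is assumed to be the ASCII code minus 32 (which reproduces the sample mapping space = 00, A = 33, a = 65), and the rotation amount is assumed to come from Python's `random.Random(seed)`.

```python
import random

def encode_digits(text, width=32):
    """Encode the first `width` characters as 2-digit codes, padded with 00."""
    for ch in text[:width]:
        if not 32 <= ord(ch) <= 126:
            raise ValueError(f"not a printable ASCII character: {ch!r}")
    # Assumption: code = ASCII value - 32, so space -> 00, A -> 33, a -> 65.
    digits = "".join(f"{ord(ch) - 32:02d}" for ch in text[:width])
    return digits.ljust(2 * width, "0")

def seeded_fraction(text, seed, width=32):
    """Rotate the digit string by a seed-derived amount and read it as 0.xxx."""
    digits = encode_digits(text, width)
    # Same seed -> same shift, so Alice and Bob agree on the result.
    shift = random.Random(seed).randint(1, 2 * width)
    rotated = digits[shift:] + digits[:shift]
    return float("0." + rotated)

if __name__ == "__main__":
    print(encode_digits("Aa aa")[:10])
    print(seeded_fraction("Aa aa", seed=42))
```

Note that converting to `float` keeps only about 16 significant digits, so most of the 64 digits are lost at that step; keeping the rotated string (or a `decimal.Decimal`) would avoid that.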
I'm learning some basic cryptography-related programming and I'm now learning Diffie-Hellman-Merkle key exchange. I was watching a video by Computerphile where they explain the mathematics of it.
In the video, they said that you should use n that is 2000 or 4000 bits long. I've seen this key length be discussed as bits in several other places, like with AES. But I don't understand what "length in bits" means. What does it mean to have a key that is 2000 bits long, so if I need to write a program that creates or uses keys that are of certain length, what would I need to do?
If you need to write a program that creates keys of a certain length, you pass the desired key length to the program. You need to know in which unit the length is expressed (for example, some interfaces might require bytes rather than bits), but you usually don't need to know what the key length means under the hood.
The concrete meaning of the key length depends on the cryptographic scheme, and for some schemes, there can be an ambiguity as to what the “key length” is. It's typically one of three things:
The length of a string that is associated with the algorithm, such as a key.
A number n such that an integer parameter of the algorithm is picked between 2^(n-1) and (2^n)-1.
A number n such that an integer parameter of the algorithm is picked between 1 (or some other small bound) and (2^n)-1.
Both of the last two cases are called “n-bit numbers”. In cryptography, “n-bit number” sometimes means a number that can be written with n digits in base 2, and sometimes a number which requires exactly n digits in base 2. In other words, “n-bit number” sometimes means a number whose bit-size is exactly n, and sometimes a number whose bit-size is at most n. You have to check the exact requirement in the description of each cryptographic scheme.
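To make the two senses of "n-bit number" concrete, here is a small Python sketch. The helper names `at_most_n_bits` and `exactly_n_bits` are hypothetical; it only illustrates the size ranges and is not a key generator for any particular scheme.

```python
import secrets

def at_most_n_bits(n):
    """Random integer in [0, 2**n - 1]: fits in n bits, may need fewer."""
    return secrets.randbits(n)

def exactly_n_bits(n):
    """Random integer in [2**(n-1), 2**n - 1]: the top bit is forced to 1."""
    if n < 2:
        raise ValueError("n must be at least 2")
    return secrets.randbits(n - 1) | (1 << (n - 1))

if __name__ == "__main__":
    x = exactly_n_bits(2000)
    y = at_most_n_bits(2000)
    # int.bit_length() reports how many base-2 digits the number needs.
    print(x.bit_length(), y.bit_length())
```

Note that `secrets.randbits` can return 0, so a scheme that requires a lower bound of 1 (or higher) would need to reject and retry.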
Depending on the cryptographic scheme, a different number is conventionally chosen as the “key length”. For any specific scheme, a larger key length is harder to break, but you can't compare key lengths between different schemes.
For most symmetric schemes, the key is a randomly generated string (each bit of the string has an independent ½ chance of being 0 or 1), and the length is the number of bits of the string. For example, AES-128 is AES using a 128-bit (16-byte) key. There is only one exception worth mentioning: DES keys are expressed as 64-bit strings, but only 56 of those bits are random (the other 8 are calculated from the random 56), and so DES is sometimes considered to have a “56-bit” key length and sometimes a “64-bit” key length.
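For the symmetric case, "creating a key of a given length" can be as simple as asking the operating system for that many random bits. A minimal sketch, assuming the key length is given in bits and is a multiple of 8 (the function name `new_symmetric_key` is just for illustration):

```python
import secrets

def new_symmetric_key(length_bits=128):
    """Return a random key of the requested length, e.g. 128 bits = 16 bytes."""
    if length_bits % 8 != 0:
        raise ValueError("length_bits must be a multiple of 8")
    return secrets.token_bytes(length_bits // 8)

if __name__ == "__main__":
    key = new_symmetric_key(128)
    print(len(key), "bytes =", len(key) * 8, "bits")
```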
For Diffie-Hellman, the key length n is the exact size of the group (conventionally written p). Both the private key and the public key are numbers between 1 and p, so they're at-most n-bit numbers. This is as simple as it goes in terms of key length for asymmetric cryptography.
For RSA, the key length n is the exact size of the modulus, which is one part of the public key (the public key is a pair of numbers: the modulus and the public exponent). For example, 4096-bit RSA means that the modulus is between 2^4095 and 2^4096-1. The private key is also an n-bit number, but in the at-most sense.
For DSA, there are two numbers that can be called the key length, because the private key and the public key are chosen in intervals that have different sizes. The public key length is the size of the larger prime p; the public key is a number between 2 and p-2. The private key length is the size of the smaller prime q; the private key is a number between 1 and q-1.
For elliptic curve cryptography, the domain parameters of the algorithm are called a curve: a set of points, and a parametrization of this set of points. A private key is a parameter value that designates a point on the curve, and a public key is a pair of integers that are the coordinates of a point on the curve. In general, since the private key and the public key live in different mathematical spaces, there are two numbers that could be called the “key size”. A private key is a number between 1 and n-1 for some m-bit number n, and a public key is a point with two coordinates, each of which is between 0 and q for some ℓ-bit number q. In general, m and ℓ don't have to be equal. However, n and q are usually close (if they aren't, it's a waste of performance for a given security level), and so m and ℓ are usually equal and can be called the “key length” without ambiguity.
Every bit can be either 1 or 0. It's the basic unit in the digital world. As you may know, everything digital ends up being either 1 or 0. Each 1 and 0 is a bit.
So something of length n bits means that it has n 1s and 0s.
Is it possible to encrypt a 30-digit number into a 10-digit number? I have a number like
23456-32431-23233-76543-98756-54543 and I need it to look like a 10-digit encrypted format.
Is it possible to encrypt a 30-digit number into a 10-digit number?
Purely mathematically, you cannot, assuming you want to represent 30 decimal digits of any value using 10 decimal digits. You are simply trying to pour a pint into a shot glass.
Anything will do, I need to compress the digits.
Compression would be possible if some of the stated assumptions did not hold.
If you could represent the output as text (any character or binary), you could encode the decimal value in binary/base64 form, which would allow a shorter representation (though still not a 3:1 ratio).
Compression would work well if the input values (or part of the input) were not random. If the digits, or a significant part of the input, are not uniformly distributed, or if part of the input represents a limited counter, then those parts or digits could be represented with a limited number of bits.
You may know more about your data, so only you could tell anything about the data distribution.
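As a rough illustration of the binary/base64 point above: 30 decimal digits need 100 bits, which is 13 bytes, or 18 characters of unpadded base64. The sketch below assumes URL-safe base64 and uses made-up helper names (`pack_digits`, `unpack_digits`); it shortens the text but comes nowhere near 10 characters.

```python
import base64

def pack_digits(digits):
    """Pack a decimal string of up to 30 digits into unpadded URL-safe base64."""
    value = int(digits)
    raw = value.to_bytes(13, "big")  # 10**30 < 2**100, so 13 bytes suffice
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")

def unpack_digits(text, width=30):
    """Reverse pack_digits, restoring leading zeros up to `width` digits."""
    padded = text + "=" * (-len(text) % 4)  # put the stripped padding back
    value = int.from_bytes(base64.urlsafe_b64decode(padded), "big")
    return f"{value:0{width}d}"

if __name__ == "__main__":
    original = "234563243123233765439875654543"
    packed = pack_digits(original)
    print(packed, len(packed))
    print(unpack_digits(packed) == original)
```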
Out of curiosity, how do goo.gl and bit.ly work?
The "shortening" sites are key-value stores, mapping a generated short value to the stored full URL. So it's a mapping, not compression.
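A toy version of such a key-value mapping might look like the sketch below. Everything here is an assumption for illustration: the class name `ShortCodeStore`, the in-memory dictionary (a real service would use a database), and the choice of a zero-padded counter as the short code.

```python
import itertools

class ShortCodeStore:
    """Maps generated 10-digit codes to the stored full values."""

    def __init__(self):
        self._counter = itertools.count(1)
        self._by_code = {}

    def shorten(self, value):
        # The code carries no information about the value; it is only a lookup key.
        code = f"{next(self._counter):010d}"
        self._by_code[code] = value
        return code

    def expand(self, code):
        return self._by_code[code]

if __name__ == "__main__":
    store = ShortCodeStore()
    code = store.shorten("23456-32431-23233-76543-98756-54543")
    print(code, "->", store.expand(code))
```

The short code only works for whoever can query the store, which is why this is a mapping rather than compression.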
I'm trying to find 2 different plain text words that create very similar hashes.
I'm using the hashing method 'whirlpool', but I don't really need my question to be answered for the case of whirlpool; if you can do it using md5 or something easier, that's ok.
The similarity I'm looking for is that they contain the same number of each letter (it doesn't matter how much they're jumbled up)
i.e
plaintext 'test'
hash 1: abbb5 has 1 a , 3 b's , one 5
plaintext 'blahblah'
hash 2: b5bab must have the same, but doesnt matter what order.
I'm sure I can read up on how they're created and break it down and reverse it, but I am just wondering if what I'm talking about occurs.
I'm wondering because I haven't found a match of what I'm describing (I created a PoC to run through random words / letters until it produced a similar match), but then again it would take forever doing it the way I was doing it, and I was wondering if anyone with real knowledge of hashes / encryption could help me out.
So you can do it like this:
1. create an empty sorted map
2. create a 64 bit counter (you don't need more than 2^63 inputs, in all probability, since you would be dead before they would be calculated - unless quantum crypto really takes off)
3. use the counter as input, probably easiest to encode it in 8 bytes;
4. use this as input for your hash function;
5. encode output of hash in hex (use ASCII bytes, for speed);
6. sort hex on number / alphabetically (same thing really)
7. check if sorted hex result is a key in the map
8. if it is, show hex result, the old counter from the map & the current counter (and stop)
9. if it isn't, put the sorted hex result in the map, with the counter as value
10. increase counter, goto 3
That's all folks. Results for SHA-1:
011122344667788899999aaaabbbcccddeeeefff for both 320324 and 429678
I don't know why you want to do this for hex, the hashes will be so large that they won't look too much alike. If your alphabet is smaller, your code will run (even) quicker. If you use whole output bytes (i.e. 00 to FF instead of 0 to F) instead of hex, it will take much more time - a quick (non-optimized) test on my machine shows it doesn't finish in minutes and then runs out of memory.
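The search loop described above can be sketched in Python roughly like this. It is not the code that produced the counters quoted above: the function name `find_anagram_digests`, the big-endian 8-byte counter encoding, the plain `dict` instead of a sorted map, and the optional `hex_chars` truncation (handy for quick experiments) are all assumptions.

```python
import hashlib

def find_anagram_digests(hash_name="sha1", hex_chars=None, limit=10_000_000):
    """Find two counters whose hex digests contain the same characters."""
    seen = {}  # sorted hex digest -> first counter that produced it
    for counter in range(limit):
        data = counter.to_bytes(8, "big")  # assumption: big-endian encoding
        digest = hashlib.new(hash_name, data).hexdigest()[:hex_chars]
        key = "".join(sorted(digest))
        if key in seen:
            return key, seen[key], counter
        seen[key] = counter
    return None  # no match within `limit` inputs

if __name__ == "__main__":
    # Truncating to 8 hex characters keeps the demo fast.
    print(find_anagram_digests("sha1", hex_chars=8))
```

Because the input encoding is a guess, the counters found this way need not match the ones reported above.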
I have a table with an auto-increment 32-bit integer primary key in a database, which will produce numbers ranging 1-4294967295.
I would like to keep the convenience of an auto-generated primary key, while having my numbers on the front-end of an application look like randomly generated.
Is there a mathematical function which would allow a two-way, one-to-one transformation between an integer and another?
For example a function would take a number, and translate it to another:
1 => 1538645623
2 => 2043145593
3 => 393439399
And another function the way back:
1538645623 => 1
2043145593 => 2
393439399 => 3
I'm not necessarily looking for an implementation here, but rather a hint on what I suppose, must be a well-known mathematical problem somewhere :)
Mathematically this is almost exactly the same problem as cryptography.
You: I want to go from an ID (string of bits) to another number (string of bits) and back again in a non-obvious way.
Cryptography: I want to go from plaintext (string of bits) to another string of bits and back again (reversible) in a non-obvious way.
So for a simple solution, can I suggest just plugging in whatever cryptography algorithm is most convenient in your language, and encrypt and decrypt your id?
If you wanted to be a bit cleverer you can do what is called "salting" in addition to cryptography. Take your id as a 32 bit (or whatever) number. Concatenate it with a random 32 bit number. Encrypt the result. To reverse, just decrypt, and throw away the random part.
Of course, if someone was seriously attacking this, this might be vulnerable to known plaintext/differential cryptanalysis attacks as you have a very small known plaintext space, but it sounds like you aren't trying to defend against serious attacks.
First remove the offset of 1, so you get numbers in the range 0 to 2^32-2. Let m = 2^32-1.
Choose some a that is relatively prime to m. Since it is relatively prime, it has an inverse a' such that a * a' = 1 (mod m). Also choose some b. Choose big numbers to get a good mixing effect.
Then you can compute your desired pseudo-random number by y = (a * x + b) % m, and get back the original by x = ((y - b) * a') % m.
This is essentially one step of a linear congruential generator (LCG) for pseudo-random numbers.
Note that this is not secure, it is only obfuscation. For example, if a user can get two numbers in sequence then he can recover a and b easily.
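A minimal Python sketch of this affine mapping follows. The constants `A` and `B` are arbitrary example values chosen here (not from the answer), the function names are made up, and `pow(A, -1, M)` for the modular inverse needs Python 3.8 or later.

```python
import math

M = 2**32 - 1          # m from the text; ids 1..2^32-1 become x in 0..m-1
A = 2654435761         # example multiplier; must be relatively prime to M
B = 1013904223         # example offset; any value works
assert math.gcd(A, M) == 1
A_INV = pow(A, -1, M)  # a' with (A * A_INV) % M == 1

def obfuscate(row_id):
    """Map a primary key in 1..2^32-1 to a scrambled number in 0..m-1."""
    x = row_id - 1                # remove the offset of 1
    return (A * x + B) % M

def deobfuscate(public_id):
    """Invert obfuscate(): recover the original primary key."""
    x = ((public_id - B) * A_INV) % M
    return x + 1                  # restore the offset of 1

if __name__ == "__main__":
    for row_id in (1, 2, 3):
        y = obfuscate(row_id)
        print(row_id, "=>", y, "=>", deobfuscate(y))
```

As noted above, this is obfuscation only: anyone who sees outputs for consecutive ids can solve for the constants.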
In most cases, web apps use a hash of a randomly generated number as a reference to a table row. This hash can be stored as a number and displayed as a string to the end user.
This hash is unique and serves as the public identifier; the id is only used inside the application itself and is never shown to the outside world.
If I have a hash like this: 0d47aeda9d97686ab3da96bae2c93d078a5ab253
how do I do the math to find out the number of possibilities to try if I go from 0000000000000000000000000000000000000000 to 9999999999999999999999999999999999999999, which is the general length of a SHA-1?
The number of possibilities would be 2^(X) where X is the number of bits in the hash.
In the normal hexadecimal string representation of the hash value like the one you gave, each character is 4 bits, so it would be 2^(4*len) where len is the string length of the hash value. In your example, you have a 40 character SHA1 digest, which corresponds to 160 bits, or 2^160 == 1.4615016373309029182036848327163e+48 values.
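The arithmetic can be checked directly with Python's arbitrary-precision integers; this is just a quick sketch using the digest from the question.

```python
digest = "0d47aeda9d97686ab3da96bae2c93d078a5ab253"

bits = 4 * len(digest)       # each hex character encodes 4 bits
count = 2 ** bits            # number of possible digest values

if __name__ == "__main__":
    print(len(digest), "hex characters =", bits, "bits")
    print(count)                      # exact value of 2^160
    print(count == 16 ** len(digest)) # same count expressed as 16^40
```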
An SHA-1 hash is 160 bits, so there are 2^160 possible hashes.
Your hexadecimal digit range is 0 through f.
Then it's simply 16^40, or 16 raised to however many characters it contains.
Recall that a hash function accepts inputs of arbitrary length. A good cryptographic hash function will seem to assign a "random" hash result to any input. So if the digest is N bits long (for SHA-1, N=160), then every input will be hashed to one of 2^N possible results, in a manner we'll treat as random.
That means that the expectation for finding a preimage for your hash result is running through 2^N inputs. They don't have to be specifically the range that you suggested - any 2^N distinct inputs are fine.
This also means that 2^N inputs don't guarantee that you'll find a preimage - each try is random, so you might miss your 1-in-2^N chance in every single one of those 2^N inputs (just like flipping a coin twice doesn't guarantee you'll get heads at least once). But you can figure out how many inputs are required to find a preimage for the hash with probability p or greater - with p being as close to one as you desire (just not actually 1).
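Under the same "each try is an independent 1-in-2^N chance" model, the probability of at least one hit in k tries is 1 - (1 - 2^-N)^k. The sketch below computes that and the number of tries needed for a target probability p; the function names are made up, and `log1p`/`expm1` are used only to keep floating-point accuracy with such tiny per-try odds.

```python
import math

def success_probability(n_bits, tries):
    """Chance that at least one of `tries` random inputs hits the target digest."""
    per_try = 2.0 ** -n_bits
    # 1 - (1 - per_try)**tries, computed without losing precision
    return -math.expm1(tries * math.log1p(-per_try))

def tries_needed(n_bits, p):
    """Smallest number of tries giving success probability of about p (0 < p < 1)."""
    per_try = 2.0 ** -n_bits
    return math.ceil(math.log1p(-p) / math.log1p(-per_try))

if __name__ == "__main__":
    print(success_probability(1, 2))           # two coin flips
    print(success_probability(160, 2 ** 160))  # 2^N tries against SHA-1
    print(tries_needed(160, 0.99) / 2 ** 160)  # as a multiple of 2^N
```

With 2^N tries the chance of success is about 1 - 1/e, roughly 63%, which matches the point that 2^N inputs do not guarantee a hit.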
Maximum variations, with repetition and with attention to order, are defined as n^k. In your case this would mean 10^40, which can't be correct for SHA1. According to Wikipedia, SHA1 has a maximum complexity of 2^80 for a collision-based attack, and using different techniques researchers have already been successful with a complexity of 2^51, so 10^40 seems a bit much.