Is it possible to create a unique 6-digit number using a GUID?
I only seem to be able to get 10 digits using:
byte[] buffer = Guid.NewGuid().ToByteArray();        // the GUID's 16 bytes
Console.WriteLine(BitConverter.ToUInt32(buffer, 6)); // 4 bytes from offset 6: up to 10 decimal digits
GUIDs are not guaranteed to be unique. However, the range from which they are drawn is so large that the probability of a collision is very low. This is not the case for a six-digit number, unless you pick each number sequentially or use some other scheme that ensures there are no collisions.
EDIT
See http://en.m.wikipedia.org/wiki/Birthday_problem for the probability of a collision.
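For a rough sense of scale, assuming uniformly random six-digit values (N = 10^6 possibilities), the usual birthday approximation gives

P(collision among k values) ≈ 1 - e^(-k(k-1)/(2N))

Setting P = 1/2 and solving for k gives k ≈ sqrt(2N * ln 2) ≈ 1178, so after only about 1,200 random six-digit numbers a collision is already more likely than not.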
Related
I'm learning some basic cryptography-related programming, and I'm now studying the Diffie-Hellman-Merkle key exchange. I was watching a video by Computerphile where they explain the mathematics of it.
In the video, they said that you should use an n that is 2000 or 4000 bits long. I've seen key length discussed in bits in several other places, like with AES. But I don't understand what "length in bits" means. What does it mean to have a key that is 2000 bits long, and if I need to write a program that creates or uses keys of a certain length, what would I need to do?
If you need to write a program that creates keys of a certain length, you pass the desired key length to the program. You need to know in which unit the length is expressed (for example, some interfaces might require bytes rather than bits), but you usually don't need to know what the key length means under the hood.
The concrete meaning of the key length depends on the cryptographic scheme, and for some schemes, there can be an ambiguity as to what the “key length” is. It's typically one of three things:
The length of a string that is associated with the algorithm, such as a key.
A number n such that an integer parameter of the algorithm is picked between 2^(n-1) and (2^n)-1.
A number n such that an integer parameter of the algorithm is picked between 1 (or some other small bound) and (2^n)-1.
Both of the last two cases are called “n-bit numbers”. In cryptography, “n-bit number” sometimes means a number that can be written with n digits in base 2, and sometimes a number which requires exactly n digits in base 2. In other words, “n-bit number” sometimes means a number whose bit-size is exactly n, and sometimes a number whose bit-size is at most n. You have to check the exact requirement in the description of each cryptographic scheme.
Depending on the cryptographic scheme, a different number is conventionally chosen as the “key length”. For any specific scheme, a larger key length is harder to break, but you can't compare key lengths between different schemes.
For most symmetric schemes, the key is a randomly generated string (each bit of the string has an independent ½ chance of being 0 or 1), and the length is the number of bits of the string. For example, AES-128 is AES using a 128-bit (16-byte) key. There is only one exception worth mentioning: DES keys are expressed as 64-bit strings, but only 56 of those bits are random (the other 8 are calculated from the random 56), and so DES is sometimes considered to have a “56-bit” key length and sometimes a “64-bit” key length.
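As a concrete illustration (a minimal C# sketch using the built-in System.Security.Cryptography API), AES-128 just means the randomly generated key is 128 bits, i.e. 16 bytes, long:

using System.Security.Cryptography;

using var aes = Aes.Create();
aes.KeySize = 128;                     // request a 128-bit key
aes.GenerateKey();                     // 16 random bytes
Console.WriteLine(aes.Key.Length * 8); // prints 128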
For Diffie-Hellman, the key length n is the exact size of the prime modulus (conventionally written p) that defines the group. Both the private key and the public key are numbers between 1 and p, so they're at-most-n-bit numbers. This is as simple as it gets in terms of key length for asymmetric cryptography.
For RSA, the key length n is the exact size of the modulus, which is one part of the public key (the public key is a pair of numbers: the modulus and the public exponent). For example, 4096-bit RSA means that the modulus is between 2^4095 and 2^4096-1. The private key is also an n-bit number, but in the at-most sense.
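You can see this directly with .NET's built-in RSA class (a sketch that only prints sizes): the modulus of a 2048-bit key occupies exactly 256 bytes.

using System.Security.Cryptography;

using var rsa = RSA.Create(2048);                // generate a 2048-bit key pair
RSAParameters pub = rsa.ExportParameters(false); // public part: modulus + exponent
Console.WriteLine(pub.Modulus.Length * 8);       // prints 2048
Console.WriteLine(pub.Exponent.Length);          // 3 bytes: the common exponent 65537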
For DSA, there are two numbers that can be called the key length, because the private key and the public key are chosen in intervals that have different sizes. The public key length is the size of the larger prime p; the public key is a number between 2 and p-2. The private key length is the size of the smaller prime q; the private key is a number between 1 and q-1.
For elliptic curve cryptography, the domain parameters of the algorithm are called a curve: a set of points, and a parametrization of this set of points. A private key is a parameter value that designates a point on the curve, and a public key is a pair of integers that are the coordinates of a point on the curve. In general, since the private key and the public key live in different mathematical spaces, there are two numbers that could be called the "key size". A private key is a number between 1 and n-1 for some m-bit number n, and a public key is a point with two coordinates, each of which is between 0 and q-1 for some ℓ-bit number q. In general, m and ℓ don't have to be equal. However, n and q are usually close (if they aren't, it's a waste of performance for a given security level), so m and ℓ are usually equal and can be called the "key length" without ambiguity.
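For illustration, on the widely used NIST P-256 curve m = ℓ = 256, so everything is simply called 256-bit; a minimal C# sketch:

using System.Security.Cryptography;

using var ecdsa = ECDsa.Create(ECCurve.NamedCurves.nistP256);
ECParameters p = ecdsa.ExportParameters(includePrivateParameters: true);
Console.WriteLine(p.D.Length * 8);   // private scalar: 32 bytes = 256 bits
Console.WriteLine(p.Q.X.Length * 8); // public point, x coordinate: 256 bits
Console.WriteLine(p.Q.Y.Length * 8); // public point, y coordinate: 256 bits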
Every bit can be either 1 or 0. It's the basic unit of the digital world: ultimately, everything digital ends up as 1s and 0s, and each 1 or 0 is a bit.
So something that is n bits long consists of n 1s and 0s.
Is it possible to encrypt a 30-digit number into a 10-digit number? I have a number like
23456-32431-23233-76543-98756-54543 and I need it to look like a 10-digit encrypted format.
Is it possible to encrypt a 30-digit number into a 10-digit number?
Purely mathematically, you cannot, assuming you want to represent 30 decimal digits of any value using 10 decimal digits: there are 10^30 possible inputs but only 10^10 possible outputs, so by the pigeonhole principle no lossless mapping exists. You are simply trying to pour a pint into a shot glass.
Anything will do; I just need the digits compressed.
Compression would be possible if some of the stated assumptions did not hold.
If you could represent the output as text (any characters, or binary), you could encode the decimal value in binary/base64 form, which allows a shorter representation (still not a 1:3 ratio); see the sketch after this list.
Compression would also work well if the input values (or part of the input) were not random. If the digits, or a significant part of the input, are not uniformly distributed, or if part of the input is a limited counter, then those parts could be represented with fewer bits.
You may know more about your data, so only you can say anything about its distribution.
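A minimal C# sketch of the binary/base64 idea, assuming the dashes can be dropped: 30 decimal digits fit in at most 100 bits, i.e. 13 bytes, which base64 renders in 18 significant characters (20 with '=' padding) instead of 30.

using System.Numerics;

var n = BigInteger.Parse("234563243123233765439875654543");        // the 30 digits
byte[] bytes = n.ToByteArray(isUnsigned: true, isBigEndian: true); // 13 bytes here
string encoded = Convert.ToBase64String(bytes);                    // 20-char string
Console.WriteLine(encoded);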
Out of curiosity, how do goo.gl and bit.ly work?
The "shortening" sites are a key-value storage, mapping a generated short value to stored full url. So it's mapping, not any compression.
What I'm asking is whether there exists a way to generate unique random numbers without helper structures.
I mean, do there already exist mathematical functions (or algorithms) that natively generate each number of a field exactly once? (I would rather not write some kind of hash function specific to this problem.)
This is because I want to generate a lot of unique integers chosen between 0 and 10,000,000,000 (about 60% of the field), so a random repetition is not at all improbable, and storing previously generated numbers in a structure for subsequent lookup (even a well-optimized one, like a bit array) could be too expensive in both space and time.
P.S.
(Note that when I write random I really mean pseudo-random.)
If you want to ensure uniqueness then do not use a hash function, but instead use an encryption function to encrypt the numbers 0, 1, 2, 3 ... Since encryption is reversible then every number (up to the block size) is uniquely encrypted and will produce a unique result.
You can either write a simple Feistel cypher with a convenient block size or else use the Hasty Pudding cypher, which allows a large range of block sizes. Whenever an input number generates too large an output, then just go to the next input number.
Changing the key of the cypher will generate a different series of output numbers. The same series of numbers can be regenerated whenever needed by remembering the key and starting again with 0, 1, 2, ... There is no need to store the entire sequence. As you say, the sequence is pseudo-random, and so it can be regenerated easily if you know the key.
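A minimal C# sketch of the Feistel approach, with a deliberately toy (not cryptographically vetted) round function: since 10,000,000,000 < 2^34, use a 34-bit block split into two 17-bit halves, encrypt 0, 1, 2, ..., and skip any output at or above the limit.

using System.Collections.Generic;

static ulong Feistel34(ulong value, uint key)
{
    const int half = 17;
    const uint mask = (1u << half) - 1;        // 17-bit half mask
    uint left  = (uint)(value >> half) & mask;
    uint right = (uint)value & mask;
    for (uint round = 0; round < 4; round++)
    {
        // Toy round function; the Feistel structure is a permutation regardless.
        uint f = (right * 2654435761u + key + round) & mask;
        (left, right) = (right, left ^ f);
    }
    return ((ulong)left << half) | right;
}

// Every value yielded is unique because Feistel34 permutes [0, 2^34).
static IEnumerable<ulong> UniqueSequence(uint key)
{
    const ulong limit = 10_000_000_000UL;
    for (ulong i = 0; i < (1UL << 34); i++)
    {
        ulong v = Feistel34(i, key);
        if (v < limit) yield return v;
    }
}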
Instead of pseudo-random numbers, you could try so-called quasi-random numbers, which are more accurately called low-discrepancy sequences. [1]
[1] https://en.wikipedia.org/wiki/Low-discrepancy_sequence
I used Oracle dictionary views to find column differences, if any, between two schemas. While syncing data-type discrepancies I found that both the NUMBER and INTEGER data types are stored in all_tab_columns/user_tab_columns/dba_tab_columns as NUMBER only, so it is difficult to sync discrepancies where one schema/column has the NUMBER data type and the other has INTEGER.
A comparison of the schemas shows a data-type mismatch. Please suggest whether there is any alternative apart from using dictionary views, or whether any specific properties of the dictionary views can be used to identify whether a data type is INTEGER.
The best explanation I've found is this:
What is the difference between INTEGER and NUMBER? When should we use NUMBER and when should we use INTEGER? I just wanted to update my comments here...
NUMBER always stores values exactly as entered; its scale ranges from -84 to 127. INTEGER, by contrast, rounds to a whole number; its scale is 0. INTEGER is equivalent to NUMBER(38,0); that is, INTEGER is a constrained NUMBER whose decimal places are rounded off, whereas NUMBER is unconstrained.
INTEGER(12.2) => 12
INTEGER(12.5) => 13
INTEGER(12.9) => 13
INTEGER(12.4) => 12
NUMBER(12.2) => 12.2
NUMBER(12.5) => 12.5
NUMBER(12.9) => 12.9
NUMBER(12.4) => 12.4
INTEGER is always slower than NUMBER, since INTEGER is a NUMBER with an added constraint and it takes additional CPU cycles to enforce that constraint. I have never observed a difference, but there might be one when loading several million records into an INTEGER column. If we need to ensure that the input consists of whole numbers, then INTEGER is the best option; otherwise, we can stick with the NUMBER data type.
INTEGER is only there for the SQL standard, i.e. it is deprecated by Oracle.
You should use NUMBER instead.
Oracle stores INTEGER as NUMBER behind the scenes anyway.
Most commonly, when ints are stored for IDs and such, they are defined with no parameters, so in theory you could look at the scale and precision columns of the metadata views to see whether decimal values can be stored; however, 99% of the time this will not help.
As was commented above, you could look for NUMBER(38,0) columns or similar (i.e. columns that allow no decimal places), but this will only tell you which columns cannot take decimals, not which columns were defined so that INTs can be stored.
Suggestion:
Do a data profile on the NUMBER columns. Something like this (returns 1 if the column holds at least one value with a fractional part):
select max( case when trunc(column_name, 0) = column_name then 0 else 1 end ) as has_dec_vals
from table_name
This is what I got from the Oracle documentation, though it is for Oracle 10g Release 2:
When you define a NUMBER variable, you can specify its precision (p) and scale (s) so that it is sufficiently, but not unnecessarily, large. Precision is the number of significant digits. Scale can be positive or negative. Positive scale identifies the number of digits to the right of the decimal point; negative scale identifies the number of digits to the left of the decimal point that can be rounded up or down.
The NUMBER data type is supported by Oracle Database standard libraries and operates the same way as it does in SQL. It is used for dimensions and surrogates when a text or INTEGER data type is not appropriate. It is typically assigned to variables that are not used for calculations (like forecasts and aggregations), and it is used for variables that must match the rounding behavior of the database or require a high degree of precision. When deciding whether to assign the NUMBER data type to a variable, keep the following facts in mind in order to maximize performance:
Analytic workspace calculations on NUMBER variables are slower than on other numerical data types because NUMBER values are calculated in software (for accuracy) rather than in hardware (for speed).
When data is fetched from an analytic workspace to a relational column that has the NUMBER data type, performance is best when the data already has the NUMBER data type in the analytic workspace because a conversion step is not required.
If I have a hash, say like this: 0d47aeda9d97686ab3da96bae2c93d078a5ab253,
how do I do the math to find the number of possibilities to try if I start with 0000000000000000000000000000000000000000 and go to 9999999999999999999999999999999999999999, which is the general length of a SHA-1 hash?
The number of possibilities would be 2^(X) where X is the number of bits in the hash.
In the normal hexadecimal string representation of the hash value, like the one you gave, each character is 4 bits, so it would be 2^(4*len), where len is the string length of the hash value. In your example, you have a 40-character SHA-1 digest, which corresponds to 160 bits, or 2^160 ≈ 1.46 × 10^48 values.
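A quick C# sketch of that arithmetic using BigInteger:

using System.Numerics;

string digest = "0d47aeda9d97686ab3da96bae2c93d078a5ab253";
BigInteger possibilities = BigInteger.Pow(2, 4 * digest.Length); // 16^40 == 2^160
Console.WriteLine(possibilities);
// 1461501637330902918203684832716283019655932542976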
An SHA-1 hash is 160 bits, so there are 2^160 possible hashes.
Your hexadecimal digit range is 0 through f.
Then it's simply 16^40, or 16^n for however many characters n it contains.
Recall that a hash function accepts inputs of arbitrary length. A good cryptographic hash function will seem to assign a "random" hash result to any input. So if the digest is N bits long (for SHA-1, N=160), then every input will be hashed to one of 2^N possible results, in a manner we'll treat as random.
That means that the expected effort for finding a preimage for your hash result is running through 2^N inputs. They don't have to be from the specific range that you suggested; any 2^N distinct inputs are fine.
This also means that 2^N inputs don't guarantee that you'll find a preimage - each try is random, so you might miss your 1-in-2^N chance in every single one of those 2^N inputs (just like flipping a coin twice doesn't guarantee you'll get heads at least once). But you can figure out how many inputs are required to find a preimage for the hash with probability p or greater - with p being as close to one as you desire (just not actually 1).
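As a sketch of that calculation, treating each try as an independent 1-in-2^N event:

P(success within k tries) = 1 - (1 - 2^(-N))^k ≈ 1 - e^(-k/2^N)

so reaching success probability p takes about k = 2^N * ln(1/(1-p)) inputs; for p = 1/2 that is roughly 0.69 * 2^N, and no finite k makes p exactly 1.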
The maximum number of variations, with repetition and with attention to order, is n^k. In your case that would mean 10^40, which can't be correct for SHA-1. Reading Wikipedia, it says SHA-1 has a maximum complexity for a collision-based attack of 2^80; using different techniques, researchers have already succeeded with a complexity of about 2^51, so 10^40 seems a bit much.