Obscure / encrypt an order number as another number: symmetrical, "random" appearance? - encryption

Client has a simple increasing order number (1, 2, 3...). He wants end-users to receive an 8- or 9-digit (digits only -- no characters) "random" number. Obviously, this "random" number actually has to be unique and reversible (it's really an encryption of the actualOrderNumber).
My first thought was to just shuffle some bits. When I showed the client a sample sequence, he complained that subsequent obfuscOrderNumbers were increasing until they hit a "shuffle" point (point where the lower-order bits came into play). He wants the obfuscOrderNumbers to be as random-seeming as possible.
My next thought was to deterministically seed a linear congruential pseudo-random-number generator and then take the actualOrderNumber th value. But in that case, I need to worry about collisions -- the client wants an algorithm that is guaranteed not to collide in at least 10^7 cycles.
My third thought was "eh, just encrypt the darn thing," but if I use a stock encryption library, I'd have to post-process it to get the 8-or-9 digits only requirement.
My fourth thought was to interpret the bits of actualOrderNumber as a Gray-coded integer and return that.
My fifth thought was: "I am probably overthinking this. I bet someone on StackOverflow can do this in a couple lines of code."

Pick an 8- or 9-digit number at random, say 839712541. Then, take your order number's binary representation (for this example, I'm not using 2's complement), pad it out to the same number of bits (30), reverse it, and xor the flipped order number and the magic number. For example:
1 = 000000000000000000000000000001
Flip = 100000000000000000000000000000
839712541 = 110010000011001111111100011101
XOR = 010010000011001111111100011101 = 302841629
2 = 000000000000000000000000000010
Flip = 010000000000000000000000000000
839712541 = 110010000011001111111100011101
XOR = 100010000011001111111100011101 = 571277085
To get the order numbers back, xor the output number with your magic number, convert to a bit string, and reverse.
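The steps above can be sketched in a few lines of Python. This is only an illustration of the reverse-and-XOR idea; the function names are made up for the example, while the constant and the 30-bit width come from the worked example.

```python
MAGIC = 839712541   # the example constant from the text
BITS = 30           # bit width used in the example

def reverse_bits(value, bits=BITS):
    """Reverse the lowest `bits` bits of value."""
    if not 0 <= value < (1 << bits):
        raise ValueError("value does not fit in the chosen bit width")
    return int(format(value, f"0{bits}b")[::-1], 2)

def obfuscate(order_number):
    # Flip the bit string, then XOR with the magic number.
    return reverse_bits(order_number) ^ MAGIC

def deobfuscate(code):
    # Undo the XOR first, then flip the bit string back.
    return reverse_bits(code ^ MAGIC)

print(obfuscate(1))  # 302841629
print(obfuscate(2))  # 571277085
```

Note that a 30-bit value can be as large as 1073741823, so with this width some outputs may have 10 digits; a 29-bit width keeps every result within 9 digits.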

Hash function? http://www.partow.net/programming/hashfunctions/index.html

Will the client require the distribution of obfuscated consecutive order numbers to look like anything in particular?
If you do not want to complicate yourself with encryption, use a combination of bit shuffling with a bit of random salting (if you have bits/digits to spare) XOR-superimposed over some fixed constant (or some function of something that would be readily available alongside the obfuscated order ID at any time, such as perhaps the customer_id who placed the order?)
EDIT
It appears that all the client desires is for an outside party to not be able to infer the progress of sales. In this case a shuffling solution (bit-mapping, e.g. original bit 1 maps to obfuscated bit 6, original bit 6 maps to obfuscated bit 3, etc.) should be more than sufficient. Add some random bits if you really want to make it harder to crack, provided that you have the additional bits available (e.g. assuming original order numbers go only up to 6 digits, but you're allowed 8-9 in the obfuscated order number, then you can use 2-3 digits for randomness before performing bit-mapping). Possibly XOR the result for additional intimidation (an inquisitive party might attempt to generate two consecutive obfuscated orders, XOR them against each other to get rid of the XOR constant, and would then have to deduce which of the non-zero bits come from the salt, and which ones came from an increment, and whether he really got two consecutive order numbers or not... He would have to repeat this for a significant number of what he'd hope are consecutive order numbers in order to crack it.)
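A minimal sketch of the bit-mapping idea, assuming a 29-bit width so every result stays within 9 digits. The permutation table and the XOR constant are hypothetical choices for illustration; any fixed permutation of the bit positions and any constant below 2^29 would do, and the random salt digits are left out.

```python
BITS = 29                  # 2**29 - 1 = 536870911, so results fit in 9 digits
XOR_MASK = 0x155A3C7       # hypothetical constant, must be < 2**BITS
# Hypothetical bit mapping: input bit i moves to position PERM[i].
# 11 is coprime to 29, so this visits every position exactly once.
PERM = [(i * 11 + 7) % BITS for i in range(BITS)]

def obfuscate(order_number):
    if not 0 <= order_number < (1 << BITS):
        raise ValueError("order number does not fit in 29 bits")
    out = 0
    for i in range(BITS):
        if (order_number >> i) & 1:
            out |= 1 << PERM[i]
    return out ^ XOR_MASK

def deobfuscate(code):
    shuffled = code ^ XOR_MASK
    original = 0
    for i in range(BITS):
        if (shuffled >> PERM[i]) & 1:
            original |= 1 << i
    return original
```

Because the mapping is a permutation of bit positions followed by a fixed XOR, it is one-to-one, so no collision check is needed.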
EDIT2
You can, of course, allocate completely random numbers for the obfuscated order IDs, store the correspondence to persistent storage (e.g. DB) and perform collision detection as well as de-obfuscation against same storage. A bit of overkill if you ask me, but on the plus side it's the best as far as obfuscation goes (and you implement whichever distribution function your soul desires, and you can change the distribution function anytime you like.)

In a 9-digit number, the first digit is a random index between 0 and 7 (or 1-8). Put another random digit at that position (in the examples below, the position is counted from the right, starting at 1). The rest is the "real" order number:
Orig order: 100
Random index: 5
Random digit: 4 (guaranteed, rolled a dice :) )
Result: 500040100
Orig Nr: 101
Random index: 2
Random digit: 6
Result: 200001061
You can decide that the 5th (or any other) digit is the index.
Or, if you can live with real order numbers of 6 digits, then you can introduce "secondary" index as well. And you can reverse the order of the digits in the "real" order nr.
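A small sketch of this scheme, assuming order numbers of at most 7 digits and an index in the range 1-8 counted from the right, which is what the two examples above imply. The function names are made up for the illustration.

```python
import random

def encode(order_number, index=None, digit=None):
    """Build a 9-digit code: index digit + 7-digit order number with one random digit inserted."""
    if not 0 <= order_number < 10**7:
        raise ValueError("order number must have at most 7 digits")
    index = random.randint(1, 8) if index is None else index
    digit = random.randint(0, 9) if digit is None else digit
    padded = f"{order_number:07d}"
    pos = len(padded) - (index - 1)          # insertion point, counted from the right
    body = padded[:pos] + str(digit) + padded[pos:]
    return int(str(index) + body)

def decode(code):
    text = str(code)
    index, body = int(text[0]), text[1:]
    pos = len(body) - index                  # where the random digit sits
    return int(body[:pos] + body[pos + 1:])

print(encode(100, index=5, digit=4))  # 500040100
print(encode(101, index=2, digit=6))  # 200001061
```

Since the index and the inserted digit are random, the same order number can map to several codes, but every code decodes to exactly one order number.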

I saw this rather late, (!) hence my rather belated response. It may be useful to others coming along later.
You said: "My third thought was "eh, just encrypt the darn thing," but if I use a stock encryption library, I'd have to post-process it to get the 8-or-9 digits only requirement."
That is correct. Encryption is reversible and guaranteed to be unique for a given input. As you point out, most standard ciphers do not have the right block size. There is one, however, the Hasty Pudding Cipher, which can have any block size from 1 bit upwards.
Alternatively you can write your own. Given that you don't need something the NSA can't crack, then you can construct a simple Feistel cipher to meet your needs.
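As a rough illustration of what such a home-made Feistel cipher might look like, here is a sketch over a 30-bit block (two 15-bit halves). The key, the round count, and the SHA-256-based round function are all assumptions made for the example, and "cycle walking" (re-encrypting until the result is below 10^9) is an added technique to keep the output within 9 digits. It is obfuscation-grade, not a vetted cipher.

```python
import hashlib

KEY = b"hypothetical-secret-key"  # assumption: any secret byte string
HALF_BITS = 15                    # two 15-bit halves give a 30-bit block
HALF_MASK = (1 << HALF_BITS) - 1
DOMAIN = 10**9                    # keep results within 9 decimal digits
ROUNDS = 4                        # assumption: a small fixed round count

def _round_value(half, round_no):
    """Keyed round function: hash (key, round, half) down to 15 bits."""
    data = KEY + bytes([round_no]) + half.to_bytes(2, "big")
    return int.from_bytes(hashlib.sha256(data).digest()[:2], "big") & HALF_MASK

def _permute(block, round_order):
    left, right = block >> HALF_BITS, block & HALF_MASK
    for round_no in round_order:
        left, right = right, left ^ _round_value(right, round_no)
    # Emit the halves swapped, so running the rounds in reverse order undoes this.
    return (right << HALF_BITS) | left

def encrypt(n):
    if not 0 <= n < DOMAIN:
        raise ValueError("n must be in [0, 10**9)")
    out = _permute(n, range(ROUNDS))
    while out >= DOMAIN:          # cycle walking: re-encrypt until back in range
        out = _permute(out, range(ROUNDS))
    return out

def decrypt(code):
    if not 0 <= code < DOMAIN:
        raise ValueError("code must be in [0, 10**9)")
    out = _permute(code, range(ROUNDS - 1, -1, -1))
    while out >= DOMAIN:          # walk the same cycle backwards
        out = _permute(out, range(ROUNDS - 1, -1, -1))
    return out
```

A Feistel network is a permutation of the block space regardless of the round function, so the mapping is collision-free by construction. Small results can be zero-padded for display if a fixed digit count is wanted.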

If your order ID is unique, you can simply generate a prefix and prepend it to (or mix it with) the order ID.
Something like this:
long pre = DateTime.Now.Ticks % 100;
string prefix = pre.ToString("D2"); // always two digits, so the prefix can be stripped off again
string number = prefix + YOURID.ToString();

<?PHP
// Digit substitution table: plain digit => cipher digit (a permutation of 0-9).
$cry = array(0=>5,1=>3,2=>9,3=>2,4=>7,5=>6,6=>1,7=>8,8=>0,9=>4);

// Encode the digit string $e into $k digits: one length digit, then the
// substituted digits interleaved with random padding digits.
function enc($e,$cry,$k){
    $ct = (string)$e; // index digits as characters, even if an integer was passed
    if(strlen($ct) > 9)die("max encrypt digits is 9"); // the length is stored as a single digit
    if(strlen($ct) >= $k)die("Request encrypt must be lesser than its length");
    if(strlen($ct) == 0)die("must pass some numbers");
    $jump = ($k-1)-strlen($ct); // number of random padding digits to insert
    $ency = $cry[strlen($ct)];
    $n = 0;
    for($a=0;$a<$k-1;$a++){
        if($jump > 0){
            if($a%2 == 1){
                $ency .=rand(0,9);
                $jump -=1;
            }else{
                if(isset($ct[$n])){
                    $ency.=$cry[$ct[$n]];
                    $n++;
                }else{
                    $ency .=rand(0,9);
                    $jump -=1;
                }
            }
        }else{
            $ency.= $cry[$ct[$n]];
            $n++;
        }
    }
    return $ency;
}

// Decode a string produced by enc(): read the length digit, then skip the padding.
function dec($e,$cry){
    $ar = str_split($e,1);
    $len = array_search($ar[0], $cry); // first digit encodes the original length
    $jump = strlen($e)-($len+1);       // number of padding digits to skip
    $val = "";
    for($i=1;$i<strlen($e);$i++){
        if($i%2==0){
            if($jump >0){
                $jump--; // padding digit: skip it
            }else{
                $val .=array_search($e[$i], $cry);
            }
        }else{
            if($len > 0){
                $val .=array_search($e[$i], $cry);
                $len--;
            }else{
                $jump--;
            }
        }
    }
    return $val;
}

if(isset($_GET["n"])){
    $n = $_GET["n"];
}else{
    $n = 1000;
}
$str = "1253";
$str = enc($str,$cry,15);
echo "Encrypted Value : ".$str ."<br/>";
$str = dec($str,$cry);
echo "Decrypted Value : ".$str ."<br/>";
?>

Related

Finding similar hashes

I'm trying to find 2 different plain text words that create very similar hashes.
I'm using the hashing method 'whirlpool', but I don't really need my question to be answered in the case of whirlpool; if you can do it using md5 or something easier, that's OK.
The similarity I'm looking for is that they contain the same number of each character (it doesn't matter how much they're jumbled up),
i.e.
plaintext 'test'
hash 1: abbb5 has 1 a , 3 b's , one 5
plaintext 'blahblah'
hash 2: b5bab must have the same, but it doesn't matter in what order.
I'm sure I can read up on how they're created and break it down and reverse it, but I am just wondering if what I'm talking about occurs.
I'm wondering because I haven't found a match for what I'm describing (I created a PoC to run through random words / letters until it produced a similar match), but then again it would take forever doing it the way I was doing it, and I was wondering if anyone with real knowledge of hashes / encryption could help me out.
So you can do it like this:
1. create an empty sorted map
2. create a 64 bit counter (you don't need more than 2^63 inputs, in all probability, since you would be dead before they would be calculated - unless quantum crypto really takes off)
3. use the counter as input, probably easiest to encode it in 8 bytes;
4. use this as input for your hash function;
5. encode output of hash in hex (use ASCII bytes, for speed);
6. sort hex on number / alphabetically (same thing really)
7. check if sorted hex result is a key in the map
8. if it is, show hex result, the old counter from the map & the current counter (and stop)
9. if it isn't, put the sorted hex result in the map, with the counter as value
10. increase counter, goto 3
That's all folks. Results for SHA-1:
011122344667788899999aaaabbbcccddeeeefff for both 320324 and 429678
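The procedure can be sketched in Python as follows. The big-endian 8-byte counter encoding, the search limit, and the optional `hex_chars` truncation (handy for quick experiments) are assumptions of this sketch, so it will not necessarily reproduce the counters quoted above.

```python
import hashlib

def sorted_hex(counter, algo="sha1", hex_chars=None):
    """Hash an 8-byte counter and return the hex digits in sorted order."""
    digest = hashlib.new(algo, counter.to_bytes(8, "big")).hexdigest()
    return "".join(sorted(digest[:hex_chars]))

def find_anagram_pair(algo="sha1", hex_chars=None, limit=1_000_000):
    """Return (sorted_hex, first_counter, second_counter) or None if the limit is hit."""
    seen = {}                      # sorted hex string -> counter that produced it
    for counter in range(limit):
        key = sorted_hex(counter, algo, hex_chars)
        if key in seen:
            return key, seen[key], counter
        seen[key] = counter
    return None

# Quick experiment on the first 4 hex digits only; pass hex_chars=None for full hashes.
print(find_anagram_pair(hex_chars=4))
```

A plain dictionary is used instead of a sorted map, since only key lookups are needed.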
I don't know why you want to do this for hex, the hashes will be so large that they won't look too much alike. If your alphabet is smaller, your code will run (even) quicker. If you use whole output bytes (i.e. 00 to FF instead of 0 to F) instead of hex, it will take much more time - a quick (non-optimized) test on my machine shows it doesn't finish in minutes and then runs out of memory.

How to calculate password strength?

I'm using a certain electronic currency and they use pass phrases as passwords.
Basically every password is 12 English words long. How can I calculate how secure this is?
I don't know much about these things, but 12 words seem rather feasible with a dictionary attack (at least in my mind).
Naturally I'm worried how secure this is so instead of just asking if it is, I'd like to know methods on calculating it myself (you can spoil the answer, of course).
Any advice, links, literature recommendations, etc are welcome!
PS: How long would it take for an average computer to find a valid pass phrase given the details above? I need to know whether I have to keep making new accounts regularly to transfer funds to, if it really doesn't take that much effort. I'd also appreciate any information on how to calculate that, but that is not the main issue here. Thanks again!
It's all a question of entropy. How many different symbols are there to test?
Traditionally, passwords are a string of characters. Symbols are then characters. If you use lower case letters only, a-z is a range of 26 possible letters. With upper case and numbers, you get 62 symbols. With all special symbols that are in the ASCII set (so without fancy encodings) you get over 90 possible symbols already. In your case, a symbol is a word.
From this question on Oxford dictionaries’ website I would gather there are 115000 words that you could expect (without obsolete and derivatives).
To compute the number of combinations, you have to realize that for each possible symbol at a given position, you have the choice of every possible character at another position. With strings of characters, if your password starts with a $, you still have any character for the other positions. This means that we have to multiply the number of possible symbols for each symbol position. Thus with 2 characters that have s possible symbols, you have s*s possibilities. In general, for c characters you would have s^c possibilities for a password.
Note that this means that in the case of dictionary words, you must pick random words instead of making sentences!
In your case, there are 115000^12 possibilities, which is about 5.3 × 10^60. So a huge lot.
The time to brute-force a password is then given by how much time t it takes to test a password, and the number of attempts, in your case t × 2.65 × 10^60 if you enumerate all combinations in a random order, and t × 5.3 × 10^60 if you try word combinations completely at random.
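The arithmetic is easy to reproduce. In this sketch the time per guess (one nanosecond) is a purely hypothetical figure chosen for illustration; substitute whatever rate you consider realistic.

```python
import math

def passphrase_stats(symbols, length, seconds_per_guess):
    """Return (combinations, entropy in bits, average brute-force time in seconds)."""
    combinations = symbols ** length
    entropy_bits = length * math.log2(symbols)
    # On average an attacker finds the phrase after trying half of all combinations.
    average_seconds = combinations / 2 * seconds_per_guess
    return combinations, entropy_bits, average_seconds

# 115000 dictionary words, 12 words per phrase, hypothetical 1 ns per guess.
combinations, entropy_bits, average_seconds = passphrase_stats(115_000, 12, 1e-9)
print(f"{combinations:.2e} combinations, {entropy_bits:.1f} bits of entropy")
print(f"average brute-force time: {average_seconds / (3600 * 24 * 365):.2e} years")
```

These figures only hold if the 12 words are chosen independently and uniformly at random from the whole list.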
Here I have created a function in React to calculate the strength of a password based on the following conditions:
A password must contain at least one uppercase letter
A password must contain at least one lowercase letter
A password must contain at least one special character
A password must contain at least one number
The length of the password must be 8 or above
export const PasswordStrenght = (password: string): number => {
  // Initial percentage
  let percentage: number = 0;
  // Special character regex
  const specialChars = /[ `!@#$%^&*()_+\-=\[\]{};':"\\|,.\/?~]/;
  // Each satisfied condition contributes 20%
  const ownWeight: number = 20;
  // At least one number
  if (/\d/.test(password)) {
    percentage = percentage + ownWeight;
  }
  // At least one lowercase letter
  if (/[a-z]/.test(password)) {
    percentage = percentage + ownWeight;
  }
  // At least one uppercase letter
  if (/[A-Z]/.test(password)) {
    percentage = percentage + ownWeight;
  }
  // At least one special character
  if (specialChars.test(password)) {
    percentage = percentage + ownWeight;
  }
  // Length of at least 8
  if (password.length >= 8) {
    percentage = percentage + ownWeight;
  }
  return percentage;
};

Data Masking in SAS: Scrambling Sensitive observations at character level

I'm working with client data in SAS with sensitive customer identification information. The challenge is to mask the field in such a way that it remains numeric/alphabetic/alphanumeric. I found a way of using the bitwise functions in SAS (BXOR, BOR, BAND), but the output is full of special characters which SAS can't handle/sort/merge etc.
I also thought of scrambling the field itself, based on a key, but haven't been able to see it through. Following are the challenges:
1) It HAS to be key based
2) HAS to be reversible.
3) Masked/scrambled field has to be numeric/alphabetic/alphanumeric only so it can be used in SAS.
4) The field to be masked has both letters and numbers, with varying lengths, across millions of observations.
Any tips on how to achieve this masking/scrambling would be greatly appreciated :(
Here is a simple key-based solution. I present the data step solution here, and then will present a FCMP version in a bit. I keep everything in the range of 48 to 127 (Numbers, letters, and common characters such as # > < etc.); that's not quite alphanumeric but I can't imagine why it would matter in this case. You could reduce it further to only truly alphanumeric using this same method, but it would make the key much worse (only 62 values) and be clunky to work with (as you have 3 noncontiguous ranges).
data construct_key;
  length keystr $1500;
  do _t = 1 to 1500;
    _rannum = ceil(ranuni(7)*80);
    *if _rannum=12 then _rannum=-15;
    substr(keystr,_t,1)=byte(47+_rannum);
  end;
  call symput('keystr',keystr);
run;
%put %bquote(&keystr);
data encrypted;
  set sashelp.class;
  retain key "&keystr";
  length name_encrypt $30;
  do _t = 1 to length(name);
    substr(name_encrypt,_t,1) = byte(mod(rank(substr(name,_t,1)) + rank(substr(key,1,1))-94,80)+47);
    key = substr(key,2);
  end;
  keep name:;
run;
data unencrypted;
  set encrypted;
  retain key "&keystr";
  length name_unenc $30;
  do _t = 1 to length(name_encrypt);
    substr(name_unenc,_t,1) = byte(
      mod(80+rank(substr(name_encrypt,_t,1)) - rank(substr(key,1,1)),80)
      +47);
    key = substr(key,2);
  end;
run;
In this solution, there is a medium level of encryption - a key with 80 possible values is not strong enough to deter a truly sophisticated hacker, but is strong enough for most purposes. You need to pass either the key itself or the seed to the key algorithm in order to unencrypt; if you use this multiple times, make sure to pick a new seed each time (and not something related to the data). If you seed with zero (or a nonpositive integer) you will effectively guarantee a new key each time, but you will have to pass the key itself rather than the seed, which may present some data security issues (obviously, the key itself can be obtained by a malicious user, and would have to be stored in a different location than the data). Passing the key by way of the seed is probably better, as you could pass that verbally over the telephone or through some sort of prearranged list of seeds.
I'm not sure I recommend this sort of approach in general; a superior approach may well be to simply encrypt the entire SAS dataset using a superior encryption method (PGP, for example). Your exact solution may vary, but if you have for example some customer information that isn't actually necessary for most steps of your process, you may be better off separating that information from the rest of the (non-sensitive) data and only incorporating that when it's needed.
For example, I have a process whereby I pull sample for a client for a healthcare survey. I select valid records from a dataset that has no information for the customer except a numeric unique identifier; once I have narrowed the sample down to the valid records, then I attach the customer information from a separate dataset and create the mailing files (which are stored in an encrypted directory). That keeps the data nonsensitive for as long as possible. It's not perfect - the unique numeric identifier still means there is a tie back, even if it's not to anything someone would know outside of the project - but it keeps things safe as long as possible on our end.
Here is the FCMP version:
%let keylength=5;
%let seed=15;
proc fcmp outlib=work.funcs.test;
  subroutine encrypt(value $,key $);
    length key $&keylength.;
    outargs value,key;
    do _t = 1 to lengthc(value);
      substr(value,_t,1) = byte(mod(rank(substr(value,_t,1)) + rank(substr(key,1,1))-62,96)+31);
      key = substr(key,2)||substr(key,1,1);
    end;
  endsub;
  subroutine unencrypt(value $,key $);
    length key $&keylength.;
    outargs value,key;
    do _t = 1 to lengthc(value);
      substr(value,_t,1) = byte(mod(96+rank(substr(value,_t,1)) - rank(substr(key,1,1)),96)+31);
      key = substr(key,2)||substr(key,1,1);
    end;
  endsub;
  subroutine gen_key(seed,keystr $);
    outargs keystr;
    length keystr $&keylength.;
    do _t = 1 to &keylength.;
      _rannum = ceil(ranuni(seed)*80);
      substr(keystr,_t,1)=byte(47+_rannum);
    end;
  endsub;
quit;
options cmplib=work.funcs;
data encrypted;
  set sashelp.class;
  length key $&keylength.;
  retain key ' '; *the missing is to avoid the uninitialized variable warning;
  if _n_ = 1 then call gen_key(&seed,key);
  call encrypt(name,key);
  drop key;
run;
data unencrypted;
  set encrypted;
  length key $&keylength.;
  retain key ' ';
  if _n_ = 1 then call gen_key(&seed,key);
  call unencrypt(name,key);
run;
This is somewhat more robust; it allows characters from 32 to 127 rather than from 48, meaning it deals with space successfully. (Tab will still not decode properly - it would become a 'k'.) You pass the seed to call gen_key and then it uses that key for the remainder of the process.
It goes without saying that this is not guaranteed to function for your purposes and/or to be a secure solution and you should consult with a security professional if you have substantial security needs. This post is not warranted for any purpose and any and all liability arising from its use is disclaimed by the poster.
SAS have an article on their website on how to encrypt specific variables. Hopefully this will help you.
link

Generating binary numbers with length n with same amount of 1's and 0's

Question same as in the title.
I've tried two approaches. The first is straightforward:
generate all bitmasks from 2^{n-1} to 2^n, and for every bitmask check whether it has the same number of 1's and 0's; if yes, work on it.
And that's the problem, because I have to work on those bitmasks, not only count them.
I came up with a second approach which runs in O(2^{n/2}) time, but it seems it's not generating all bitmasks and I don't know why.
The second approach is this:
generate all bitmasks from 0 to 2^{n/2}, and to get a valid bitmask (call it B) I build B#~B,
where # is concatenation and ~ is bitwise negation.
So for example I have n=6, so I'm going to generate bitmasks with a length of 3.
For example I have B=101, so ~B will be 010
and the final bitmask would be 101010; as we see, we have the same number of 1's and 0's.
Is this method good or am I implementing something wrong? Maybe another interesting approach exists?
Thanks
Chris
Try a recursive approach:
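#include <iostream>
using namespace std;

// Print (as decimal numbers) every mask built from n0 zeros and n1 ones.
void printMasks(int n0, int n1, int mask) {
    if (!n0 && !n1) {
        // No bits left to place: the mask is complete.
        cerr << mask << endl;
        return;
    }
    mask <<= 1;
    if (n0) {
        // Append a 0 bit.
        printMasks(n0-1, n1, mask);
    }
    if (n1) {
        // Append a 1 bit.
        printMasks(n0, n1-1, mask | 1);
    }
}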
Call printMasks passing it the desired number of 0's and 1's. For example, if you need 3 ones and 3 zeros, call it like this:
printMasks(3, 3, 0);
It's possible, given a binary number, to produce the next higher binary number which has the same number of 'ones', using a constant number of operations on words large enough to hold all the bits (assuming that division by a power of two counts as one operation).
Identify the positions of the least significant '1' (hint: what happens if you decrement the number) and the least significant '0' above that (hint: what happens if you add the "least significant 1" to the original number?) You should change that least significant '0' to a '1', and set the proper number of least-significant bits to '1', and set the intervening bits to '0'.
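One well-known way to implement the hints above is the bit trick often called Gosper's hack. The sketch below uses it to enumerate all n-bit masks with n/2 ones in increasing order; the function names are made up for the illustration.

```python
def next_same_popcount(x):
    """Return the next larger integer with the same number of 1 bits as x (x > 0)."""
    lowest_one = x & -x                 # isolate the least significant 1
    ripple = x + lowest_one             # turns the lowest 0 above it into a 1
    # Move the remaining ones of the cleared run down to the least significant bits.
    return ripple | (((x ^ ripple) >> 2) // lowest_one)

def balanced_masks(n):
    """Yield every n-bit mask with exactly n/2 ones, in increasing order."""
    if n <= 0 or n % 2:
        raise ValueError("n must be a positive even number")
    mask = (1 << (n // 2)) - 1          # smallest mask: all ones in the low half
    while mask < (1 << n):
        yield mask
        mask = next_same_popcount(mask)

for mask in balanced_masks(4):
    print(format(mask, "04b"))
```

Each step costs a constant number of word operations, so the total work is proportional to the number of valid masks rather than to 2^n.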

Pseudo-random numbers from a 32-bit auto-increment INTEGER

I have a table with an auto-increment 32-bit integer primary key in a database, which will produce numbers ranging 1-4294967295.
I would like to keep the convenience of an auto-generated primary key, while having the numbers on the front-end of the application look randomly generated.
Is there a mathematical function which would allow a two-way, one-to-one transformation between an integer and another?
For example a function would take a number, and translate it to another:
1 => 1538645623
2 => 2043145593
3 => 393439399
And another function the way back:
1538645623 => 1
2043145593 => 2
393439399 => 3
I'm not necessarily looking for an implementation here, but rather a hint on what I suppose, must be a well-known mathematical problem somewhere :)
Mathematically this is almost exactly the same problem as cryptography.
You: I want to go from an id(string of bits) to another number (string of bits) and back again in a non-obvious way.
Cryptography: I want to go from plaintext (string of bits) to another string of bits and back again (reversible) in a non-obvious way.
So for a simple solution, can I suggest just plugging in whatever cryptography algorithm is most convenient in your language, and encrypt and decrypt your id?
If you wanted to be a bit cleverer you can do what is called "salting" in addition to cryptography. Take your id as a 32 bit (or whatever) number. Concatenate it with a random 32 bit number. Encrypt the result. To reverse, just decrypt, and throw away the random part.
Of course, if someone was seriously attacking this, this might be vulnerable to known plaintext/differential cryptanalysis attacks as you have a very small known plaintext space, but it sounds like you aren't trying to defend against serious attacks.
First remove the offset of 1, so you get numbers in the range 0 to 2^32-2. Let m = 2^32-1.
Choose some a that is relatively prime to m. Since it is relatively prime, it has an inverse a' so that a * a' = 1 (mod m). Also choose some b. Choose big numbers to get a good mixing effect.
Then you can compute your desired pseudo-random number by y = (a * x + b) % m, and get back the original by x = ((y - b) * a') % m.
This is essentially one step of a linear congruential generator (LCG) for pseudo-random numbers.
Note that this is not secure, it is only obfuscation. For example, if a user can get two numbers in sequence then he can recover a and b easily.
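A minimal sketch of this transformation, with hypothetical values for a and b (any a coprime to m and any b below m would work). It relies on Python 3.8+, where `pow(a, -1, m)` computes the modular inverse.

```python
M = 2**32 - 1            # modulus from the text
A = 2654435761           # hypothetical multiplier; must be coprime to M
B = 1013904223           # hypothetical offset, any value in [0, M)
A_INV = pow(A, -1, M)    # modular inverse a'; raises ValueError if A is not invertible

def obfuscate(row_id):
    """Map an id in 1..2^32-1 to a scrambled value in 0..2^32-2."""
    if not 1 <= row_id <= M:
        raise ValueError("id out of range")
    return (A * (row_id - 1) + B) % M

def reveal(code):
    """Invert obfuscate(): recover the original id."""
    return ((code - B) * A_INV) % M + 1
```

Because a is invertible modulo m, the map is a bijection on 0..m-1, so no two ids share a code.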
In most cases web apps use a hash of a randomly generated number as a reference to a table row. This hash can be stored as a number and displayed as a string for the end user.
The hash serves as the unique public identifier, while the id is only used inside the application itself and is never shown to the outside world.

Resources