Point of checkdigits in MRZs? - standards

Not sure if this is the right subreddit to ask this question, but I will give it a shot. There is the ICAO standard for Machine Readable Zones as described here https://en.wikipedia.org/wiki/Machine-readable_passport. I don't see the point for check digits there.
For example, if I have F instead of 5 somewhere in the second line of the MRZ, all the check digits stay the same. What is the point of those check digits in the ICAO standard in the first place? In particular, I don't see the point of the final check digit, since you could also calculate it from the check digits of the second line rather than from all the letters and numbers.
Could someone explain why we need those checkdigits?

To be fair, this is not a subreddit. Anyway, there are multiple reasons for the check digits in the MRZ. The first is that automatic readers can check whether the code was read correctly. The second is that they prevent a lot of fraud and identity theft. Some people who alter their travel documents do not know that check digits are in place, so they get caught because they fail to update them.
Some countries now include PDF417 barcodes and/or QR codes to improve machine reads. But keep in mind that not all governments/countries have access to high-tech devices, so the machine readable zone is still mandatory for a check with the naked eye.
Source: I work for a travel document verification company.

MRZ check digits are calculated on subsections of the MRZ, and each one serves as a check for its section. A final (composite) check digit is calculated over those sections together with their check digits, and serves as a double check of the individual checks.
The two values below have the same check digit of 8:
123456780
128456785
Here the subsection check digit still matches after the tampering, but the final check digit will detect it. Therefore, the final check digit adds robustness.
That said, I wonder whether this visual check digit is still mandatory, because the BAC protocol of an eMRTD NFC chip also performs a much stronger cryptographic check of the MRZ value.
UPDATE: My original claim that the composite check digit adds robustness against tampering is incorrect. Consider the TD1 MRZ below:
IDSLV0012345678<<<<<<<<<<<<<<<
9306026F2708252SLV<<<<<<<<<<<4
JOHN<SMEAGOL<<WENDY<LIESSETTEF
An OCR scanner can give either 0012345678 or OO12345678 for the document number portion, and all check digits pass, including the composite check digit. There is no way to tell which document number is correct. It seems that MRZ check digits have edge cases that cannot be helped.
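The cases discussed in this thread can be reproduced with a short sketch of the check digit calculation. It assumes the usual ICAO 9303 scheme (weights 7, 3, 1 repeating, digits at face value, A to Z as 10 to 35, the filler < as 0) and the TD1 composite layout; the function names are made up for illustration.

```python
WEIGHTS = (7, 3, 1)  # repeating weights, assumed from ICAO 9303


def char_value(c: str) -> int:
    """Numeric value of one MRZ character."""
    if "0" <= c <= "9":
        return int(c)
    if "A" <= c <= "Z":
        return ord(c) - ord("A") + 10
    if c == "<":
        return 0
    raise ValueError(f"invalid MRZ character: {c!r}")


def check_digit(field: str) -> int:
    """Weighted sum of the character values, modulo 10."""
    return sum(char_value(c) * WEIGHTS[i % 3] for i, c in enumerate(field)) % 10


def td1_composite(line1: str, line2: str) -> int:
    """Composite check digit of a TD1 document (assumed layout:
    line 1 positions 6-30, line 2 positions 1-7, 9-15 and 19-29)."""
    return check_digit(line1[5:30] + line2[0:7] + line2[8:15] + line2[18:29])


# F (value 15) and 5 differ by 10, so they never change a check digit.
print(check_digit("5"), check_digit("F"))

# The two tampered values from the answer share the same check digit.
print(check_digit("123456780"), check_digit("128456785"))

# Reading "00" as "OO" keeps the field and composite check digits unchanged.
line2 = "9306026F2708252SLV<<<<<<<<<<<4"
for line1 in ("IDSLV0012345678<<<<<<<<<<<<<<<", "IDSLVOO12345678<<<<<<<<<<<<<<<"):
    print(check_digit(line1[5:14]), td1_composite(line1, line2))
```

Because everything is reduced modulo 10, any set of substitutions whose weighted differences sum to a multiple of 10 goes undetected, which is what happens in all three cases.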


Are initial and final permutation in DES always done in the same order?

Everywhere on the internet, it is found that the 58th bit position takes first position in initial permutation. Also, the 40th bit position takes first position in final permutation. Is this always the same in every case? I mean is this done randomly or in a particular order (the same order)?
Ciphers don't work randomly. In the case of DES, the tables (PC1, PC2, IP, E, P, IP-1, as well as the shifts and the S-boxes) are always the same. You can find them on Wikipedia; here's the official NIST documentation.
This documentation also contains a lot of test sets to validate correct DES implementations (with a small error... I challenge you to find it out!).
Anyway, this is another useful resource to fully understand DES.
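As a small illustration of why the order is fixed, the sketch below writes out the standard IP table and derives the final permutation from it as its inverse. The helper name permute is made up for this example, and bit positions are 1-based as in the DES documentation.

```python
# Initial permutation table of DES: output bit i is input bit IP[i-1].
IP = [
    58, 50, 42, 34, 26, 18, 10, 2,
    60, 52, 44, 36, 28, 20, 12, 4,
    62, 54, 46, 38, 30, 22, 14, 6,
    64, 56, 48, 40, 32, 24, 16, 8,
    57, 49, 41, 33, 25, 17, 9, 1,
    59, 51, 43, 35, 27, 19, 11, 3,
    61, 53, 45, 37, 29, 21, 13, 5,
    63, 55, 47, 39, 31, 23, 15, 7,
]

# The final permutation (IP-1) is the inverse of IP, so it can be derived
# from the table above instead of being chosen independently.
FP = [0] * 64
for out_pos, in_pos in enumerate(IP, start=1):
    FP[in_pos - 1] = out_pos


def permute(bits, table):
    """Apply a DES-style permutation table to a sequence of 64 bits."""
    return [bits[pos - 1] for pos in table]


block = list(range(64))  # stand-in for 64 distinguishable bits
print(IP[0], FP[0])
print(permute(permute(block, IP), FP) == block)
```

The 58 and 40 mentioned in the question are simply the first entries of the two tables, and applying IP followed by IP-1 returns the original block.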

Encrypting small messages

I need to implement a coupon-code feature. Because of the number of codes required and some other constraints, I can't store them in a database. In addition, the displayed codes need to be short (around 10 characters).
My original idea was to use a cryptographic function to create codes by encrypting an ongoing counter, but I'm at a loss as to which method to use.
Because of the counter, I would be encoding only a couple of bytes, and I am aware that many algorithms are not secure when used with very short messages.
Is my approach a good idea?
What algorithm could I use?
I'm not sure if this is what you're after and, as per my comment, you have no real guarantee of security, but one possible answer is to seed a PRNG with some number and give out the first x numbers as codes. As long as x is much smaller than the total number of possible outcomes, the chance of repetition is small, and codes can be validated by regenerating the sequence (you may want to hash parts of it for speed).
If you use base 62 ([a-z A-Z 0-9]) with 10 characters, there are over 839 quadrillion possible outcomes. If you were to give everyone on the planet a unique code, you would have used roughly 0.0000009% of your addressable space.
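A minimal sketch of this idea follows. Instead of a general-purpose PRNG it uses HMAC-SHA256 over a counter as the deterministic generator, which is a substitution on my part, and the names SECRET, code_for and issued_codes are hypothetical.

```python
import hashlib
import hmac
import string

ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase  # base 62
CODE_LEN = 10
SECRET = b"replace-with-a-real-secret"  # hypothetical seed/key


def code_for(secret: bytes, index: int) -> str:
    """Derive the index-th code deterministically from the secret."""
    digest = hmac.new(secret, index.to_bytes(8, "big"), hashlib.sha256).digest()
    n = int.from_bytes(digest, "big")
    chars = []
    for _ in range(CODE_LEN):
        n, r = divmod(n, len(ALPHABET))
        chars.append(ALPHABET[r])
    return "".join(chars)


def issued_codes(secret: bytes, count: int) -> set:
    """Regenerate the first `count` codes; no database is needed."""
    return {code_for(secret, i) for i in range(count)}


print(len(ALPHABET) ** CODE_LEN)        # size of the code space
valid = issued_codes(SECRET, 1000)
print(code_for(SECRET, 42) in valid)    # a code that was handed out
print("aaaaaaaaaa" in valid)            # a guessed code is almost certainly absent
```

Validation here regenerates the whole set, so for large x it would be worth caching it; as the answer notes, none of this gives a strong security guarantee.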

How to generate a unique GUID from two unique GUIDs, which are order-insignificant

I have an application whereby users have their own IDs.
The IDs are unique.
The IDs are GUIDs, so they include letters and numbers.
I want a formula whereby, if I have both IDs, I can find their combined GUID regardless of which order I use them in.
These GUIDs are 16 digits long, for the example below I will pretend they are 4.
user A: x43y
user B: f29a
If I use formula X which takes two arguments: X(a,b) I want the produced code to give the same result regardless whether a = UserA or UserB's GUID.
I do not require a method to recover either user's ID, given the other one, from this formula, i.e. it is a one-way method.
Thank you for any answers or direction
So I'll turn my comment into an answer. Then this question can get answered, the answer accepted (if it is good enough) and we can all move on.
Sort the GUIDs lexicographically and append the second to the first. The result is unique, and has all the other characteristics you've asked for.
Can you compress it (I know you wrote shorten, but bear with me) down to 16 characters? No, you can't; not, that is, if you want to be able to decompress it again and recover the original bits. (You've written that you don't need to be able to recover the original GUIDs, so skip the next paragraph if you want to.)
A GUID is, essentially, a random sequence of 128 bits. Random sequences can't, by definition, be compressed. If a sequence of 128 bits is compressible, it can't be random: there would have to be some algorithm for inflating the compressed version back to 128 bits. I know that since GUIDs are generated algorithmically they're not truly random. However, in practice there is almost no point in regarding them as anything other than truly random; I certainly don't think you should waste your time trying to compress them.
Given that the total population of possible GUIDs is large, you might be satisfied by a method which takes the first half of each individual GUID and assembles a pseudo-GUID from them. Depending on how many GUIDs your system is likely to be working with, and your appetite for risk, this might satisfy your practical needs.
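A minimal sketch of the sort-and-append approach follows. The second function is an assumption of mine rather than part of the answer: it hashes the ordered pair down to 128 bits, which keeps the result GUID-sized but is no longer guaranteed unique (collisions are merely very unlikely). Both function names are hypothetical.

```python
import hashlib
import uuid


def combine_ordered(a: uuid.UUID, b: uuid.UUID) -> str:
    """Sort the two GUIDs lexicographically and append the second to the first."""
    lo, hi = sorted((a.hex, b.hex))
    return lo + hi


def combine_hashed(a: uuid.UUID, b: uuid.UUID) -> uuid.UUID:
    """Optional variant: squeeze the ordered pair into a GUID-sized value."""
    digest = hashlib.sha256(combine_ordered(a, b).encode("ascii")).digest()
    return uuid.UUID(bytes=digest[:16])


user_a = uuid.uuid4()
user_b = uuid.uuid4()
print(combine_ordered(user_a, user_b) == combine_ordered(user_b, user_a))
print(combine_hashed(user_a, user_b) == combine_hashed(user_b, user_a))
```

The first form is 64 hex characters long and exactly unique per pair; the second trades that guarantee for a fixed 128-bit size.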

How do I find out the complexity needed for a Unix password

My Unix password has timed out, and I need to enter a new one, so I get this bit as soon as I login:
Current Password:####
New Password:
but anything I type is too simple (apparently), even 1y4y5re987wnf
Is there something I can type to find out the rules around the password?
You should ask your system administrator: it may be that 1y4y5re987wnf is rejected because it contains no special characters or no capital letters. On some systems (Solaris, for example) you can also find more information about the password requirements in the file /etc/default/passwd.
I would expect you to need at least one capital letter, at least one lower case letter, at least one digit, preferably at least one punctuation or control character, and at least 8 characters in total. However, without knowing a lot more about which version of Unix (Linux, etc) you are using, no-one can be more precise than that. Different systems will impose different rules. Different systems will likely impose different upper limits on the length of a password.

simple encryption tutorial?

I'm looking for a simple encryption tutorial, for encoding a string into another string. I'm looking for it in general mathematical terms or pseudocode; we're doing it in a scripting language that doesn't have access to libraries.
We have a Micros POS ( point of sale ) system and we want to write a script that puts an encoded string on the bottom of receipts. This string is what a customer would use to log on to a website and fill out a survey about the business.
So in this string, I would like to get a three-digit hard-coded location identifier, the date, and time; e.g.:
0010912041421
Where 001 is the location identifier, 09 the year, 12 the month, 04 the day, and 1421 the military time ( 2:21 PM ). That way we know which location the respondent visited and when.
Obviously if we just printed that string, it would be easy for someone to crack the 'code' and fill out endless surveys at our expense, without having actually visited our stores. So if we could do a simple encryption, and decode it with a pre-set key, that would be great. The decoding would take place on the website.
The encrypted string should also be about the same number of characters, to lessen the chance of people mistyping a long arbitrary string.
Encryption won't give you any integrity protection or authentication, which are what you need in this application. The customer knows when and where they made a purchase, so you have nothing to hide.
Instead, consider using a Message Authentication Code. These are often based on a cryptographic hash, such as SHA-1.
Also, you'll want to consider a replay attack. Maybe I can't produce my own code, but what's to stop me from coming back a few times with the same code? I assume you might serve more than one customer per minute, and so you'll want to accept duplicate timestamps from the same location.
In that case, you'll want to add a unique identifier. It might only be unique when combined with the timestamp. Or, you could simply extend the timestamp to include seconds or tenths of seconds.
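A minimal sketch of the MAC idea, assuming a shared secret key on the POS and the website and a tag truncated to six decimal digits to keep the printed string short. It uses HMAC-SHA256 rather than SHA-1, and the function names are made up for illustration.

```python
import hashlib
import hmac


def make_receipt_code(payload: str, key: bytes, tag_digits: int = 6) -> str:
    """Append a short decimal authentication tag to the plain payload."""
    mac = hmac.new(key, payload.encode("ascii"), hashlib.sha256).digest()
    tag = int.from_bytes(mac[:8], "big") % 10 ** tag_digits
    return f"{payload}{tag:0{tag_digits}d}"


def verify_receipt_code(code: str, key: bytes, tag_digits: int = 6) -> bool:
    """Recompute the tag from the payload part and compare in constant time."""
    payload = code[:-tag_digits]
    expected = make_receipt_code(payload, key, tag_digits)
    return hmac.compare_digest(code, expected)


key = b"shared-secret-between-pos-and-website"  # hypothetical key
code = make_receipt_code("0010912041421", key)
print(code, verify_receipt_code(code, key))
```

A six-digit tag means a random guess passes about once in a million tries, so the website would still want to rate-limit attempts and remember which codes have already been used to stop replays.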
First off, I should point out that this is probably a fair amount of work to go through if you're not solving a problem you are actually having. Since you're going to want some sort of monitoring/analysis of your survey functionality anyway, you're probably better off trying to detect suspicious behavior after the fact and providing a way to rectify any problems.
I don't know if it would be feasible in your situation, but this is a textbook case for asymmetric crypto.
Give each POS terminal its own private key
Give each POS terminal the public key of your server
Have the terminal encrypt the date, location, etc. info (using the server's public key)
Have the terminal sign the encrypted data (using the terminal's private key)
Encode the results into human-friendly string (Base64?)
Print the string on the receipt
You may run into problems with the length of the human-friendly string, though.
NOTE You may need to flip flop the signing and encrypting steps; I don't have my crypto reference book(s) handy. Please look this up in a reputable reference, such as Applied Cryptography by Schneier.
Which language are you using/familiar with?
The Rijndael website has C source code implementing the Rijndael algorithm. They also have pseudocode descriptions of how it all works, which is probably the best you could go with. But most of the major algorithms have source code provided somewhere.
If you do implement your own Rijndael algorithm, be aware that the Advanced Encryption Standard limits the key and block sizes. So if you want to be cross-compatible you will need to use those sizes: I think a 128-bit block size and 128-, 192-, or 256-bit key sizes.
Rolling your own encryption algorithm is something that you should never do if you can avoid it. So finding a real algorithm and implementing it if you have to is definitely a better way to go.
Another alternative that might be easier is DES, or 3DES more specifically. But I don't have a link handy. I'll see if I can dig one up.
EDIT:
This link has the FIPS standard for DES and Triple DES. It contains all the permutation tables and such, I remember taking some 1s and 0s through a round of DES manually once. So it is not too hard to implement once you get going, just be careful not to change around the number tables. P and S Boxes they are called if I remember correctly.
If you go with these, use Triple DES rather than DES. 3DES in its common form uses two keys, doubling the key size of the algorithm; the short key is the only real weakness of DES, which as far as I know has not been cracked by anything other than brute force. 3DES runs DES three times: encrypt with the first key, decrypt with the second, and encrypt again with the first.
The Blowfish website also has links to implement the Blowfish algorithm in various languages.
I've found Cryptographic Right Answers to be a helpful guide in choosing the right cryptographic primitives to use under various circumstances. It tells you what crypto/hash to use and what sizes are appropriate. It contains links to the various cryptographic primitives it refers to.
One way would be to use AES - taking the location, year, month, and day - encrypting it with a secret key and then tacking on the last 4 digits (the military time) as the initialization vector. You can then convert it to some form of Base32. You'll end up with something that looks like a product key. It may be too long for you though.
A slight issue would be that you would probably want to use more digits on the military time though since you could conceivably get multiple transactions on the same day from the same location within the same minute.
What I want to use is XOR. It's simple enough that we can do it in the proprietary scripting language ( we're not going to be able to do any real encryption in it ), and if someone breaks it, then we can change the key easily enough.
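For reference, a repeating-key XOR can be sketched as below; the key and function name are made up. The last line shows why breaking it is easy: XORing one receipt code with its known plaintext (the customer knows when and where they bought something) yields the key.

```python
from itertools import cycle


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR every byte of data with the key, repeating the key as needed."""
    return bytes(d ^ k for d, k in zip(data, cycle(key)))


key = b"s3cret"                 # hypothetical pre-set key
plain = b"0010912041421"        # location, date and time from the question
cipher = xor_bytes(plain, key)

print(cipher.hex())                              # what would be printed, hex-encoded
print(xor_bytes(cipher, key) == plain)           # decoding is the same operation
print(xor_bytes(cipher, plain)[:len(key)])       # known plaintext reveals the key
```

The output stays the same length as the input, but it contains arbitrary bytes, so it would need hex or a similar encoding before printing on a receipt.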
