Please help identify multi-byte character encoding scheme on ASP Classic page - asp-classic

I'm working with a 3rd party (Commidea.com) payment processing system and one of the parameters being sent along with the processing result is a "signature" field. This is used to provide a SHA1 hash of the result message wrapped in an RSA encrypted envelope to provide both integrity and authenticity control. I have the API from Commidea but it doesn't give details of encoding and uses artificially created signatures derived from Base64 strings to illustrate the examples.
I'm struggling to work out what encoding is being used on this parameter and hoped someone might recognise the quite distinctive pattern. I initially thought it was UTF8 but having looked at the individual characters I am less sure.
Here is a short sample of the content which was created by the following code where I am looping through each "byte" in the string:
sig = Request.Form("signature")
For x = 1 To LenB(sig)
s = s & AscB(MidB(sig,x,1)) & ","
Next
' Print s to a debug log file
When I look in the log I get something like this:
129,0,144,0,187,0,67,0,234,0,71,0,197,0,208,0,191,0,9,0,43,0,230,0,19,32,195,0,248,0,102,0,183,0,73,0,192,0,73,0,175,0,34,0,163,0,174,0,218,0,230,0,157,0,229,0,234,0,182,0,26,32,42,0,123,0,217,0,143,0,65,0,42,0,239,0,90,0,92,0,57,0,111,0,218,0,31,0,216,0,57,32,117,0,160,0,244,0,29,0,58,32,56,0,36,0,48,0,160,0,233,0,173,0,2,0,34,32,204,0,221,0,246,0,68,0,238,0,28,0,4,0,92,0,29,32,5,0,102,0,98,0,33,0,5,0,53,0,192,0,64,0,212,0,111,0,31,0,219,0,48,32,29,32,89,0,187,0,48,0,28,0,57,32,213,0,206,0,45,0,46,0,88,0,96,0,34,0,235,0,184,0,16,0,187,0,122,0,33,32,50,0,69,0,160,0,11,0,39,0,172,0,176,0,113,0,39,0,218,0,13,0,239,0,30,32,96,0,41,0,233,0,214,0,34,0,191,0,173,0,235,0,126,0,62,0,249,0,87,0,24,0,119,0,82,0
Note that every other value is a zero except occasionally where it is 32 (0x20). I'm familiar with UTF8 where it represents characters above 127 by using two bytes but if this was UTF8 encoding then I would expect the "32" value to be more like 194 (0xC2) or (0xC3) and the other value would be greater than 0x80.
Ultimately what I'm trying to do is convert this signature parameter into a hex encoded string (eg. "12ab0528...") which is then used by the RSA/SHA1 function to verify the message is intact. This part is already working but I can't for the life of me figure out how to get the signature parameter decoded.
For historical reasons we are having to use classic ASP and the SHA1/RSA functions are javascript based.
Any help would be much appreciated.
Regards,
Craig.
Update: Tried looking into UTF-16 encoding on Wikipedia and other sites. Can't find anything to explain why I am seeing only 0x20 or 0x00 in the (assumed) high order byte positions. I don't think this is relevant any more as the example below shows other values in this high order position.
Tried adding some code to log the values using Asc instead of AscB (Len,Mid instead of LenB,MidB too). Got some surprising results. Here is a new stream of byte-wise characters followed by the equivalent stream of word-wise (if you know what I mean) characters.
21,0,83,1,214,0,201,0,88,0,172,0,98,0,182,0,43,0,103,0,88,0,103,0,34,33,88,0,254,0,173,0,188,0,44,0,66,0,120,1,246,0,64,0,47,0,110,0,160,0,84,0,4,0,201,0,176,0,251,0,166,0,211,0,67,0,115,0,209,0,53,0,12,0,243,0,6,0,78,0,106,0,250,0,19,0,204,0,235,0,28,0,243,0,165,0,94,0,60,0,82,0,82,0,172,32,248,0,220,2,176,0,141,0,239,0,34,33,47,0,61,0,72,0,248,0,230,0,191,0,219,0,61,0,105,0,246,0,3,0,57,32,54,0,34,33,127,0,224,0,17,0,224,0,76,0,51,0,91,0,210,0,35,0,89,0,178,0,235,0,161,0,114,0,195,0,119,0,69,0,32,32,188,0,82,0,237,0,183,0,220,0,83,1,10,0,94,0,239,0,187,0,178,0,19,0,168,0,211,0,110,0,101,0,233,0,83,0,75,0,218,0,4,0,241,0,58,0,170,0,168,0,82,0,61,0,35,0,184,0,240,0,117,0,76,0,32,0,247,0,74,0,64,0,163,0
And now the word-wise data stream:
21,156,214,201,88,172,98,182,43,103,88,103,153,88,254,173,188,44,66,159,246,64,47,110,160,84,4,201,176,251,166,211,67,115,209,53,12,243,6,78,106,250,19,204,235,28,243,165,94,60,82,82,128,248,152,176,141,239,153,47,61,72,248,230,191,219,61,105,246,3,139,54,153,127,224,17,224,76,51,91,210,35,89,178,235,161,114,195,119,69,134,188,82,237,183,220,156,10,94,239,187,178,19,168,211,110,101,233,83,75,218,4,241,58,170,168,82,61,35,184,240,117,76,32,247,74,64,163
Note the second pair of byte-wise characters (83,1) seems to be interpreted as 156 in the word-wise stream. We also see (34,33) as 153 and (120,1) as 159 and (220,2) as 152. Does this give any clues as to the encoding? Why are these 15[2369] values apparently being treated differently from other values?
What I'm trying to figure out is whether I should use the byte-wise data and carry out some post-processing to get back to the intended values or if I should trust the word-wise data with whatever implicit decoding it is apparently performing. At the moment, neither seems to give me a match between data content and signature so I need to change something.
Thanks.

Quick observation tells me that you are likely dealing with UTF-16. Start from there.
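One reading that fits the numbers in the question: the signature is raw binary, each byte was decoded as Windows-1252 text somewhere along the way, and VBScript holds the result as UTF-16LE, which is why AscB shows pairs. Under that assumption 0x9C became U+0153 (83,1), 0x99 became U+2122 (34,33), 0x9F became U+0178 (120,1) and 0x98 became U+02DC (220,2), matching the Asc values 156, 153, 159 and 152. A minimal Python sketch (not ASP; recover_bytes is a made-up name) that reverses the mapping and produces the hex string:

```python
def recover_bytes(logged):
    """Turn the AscB byte pairs back into the original single bytes.

    Assumption: each original byte was decoded as Windows-1252 and is now
    stored as one UTF-16LE code unit (low byte first).
    """
    text = bytes(logged).decode("utf-16-le")
    out = bytearray()
    for ch in text:
        try:
            out += ch.encode("cp1252")
        except UnicodeEncodeError:
            # Python's cp1252 codec leaves 0x81, 0x8D, 0x8F, 0x90 and 0x9D
            # undefined; Windows passes them through as the same code point.
            if ord(ch) > 0xFF:
                raise
            out.append(ord(ch))
    return bytes(out)


# First 13 byte pairs of the second sample in the question.
sample = [21, 0, 83, 1, 214, 0, 201, 0, 88, 0, 172, 0, 98, 0,
          182, 0, 43, 0, 103, 0, 88, 0, 103, 0, 34, 33]
recovered = recover_bytes(sample)
print(list(recovered))   # [21, 156, 214, 201, 88, 172, 98, 182, 43, 103, 88, 103, 153]
print(recovered.hex())   # 159cd6c958ac62b62b67586799
```

If this reading is right, the word-wise (Asc) values are already the intended bytes, so hex-encoding them directly should give the string the RSA/SHA1 code expects.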

Related

How does 95cd 21eb fc from Farbrausch's "fuenf" translate into

In 2001 German scene group Farbrausch released a demo called "fuenf" (in your face). pouet.net It contains a 5 byte executable which could be considered a troll approach rather than a demo. If you run it you hear a weird sound and it could crash your computer. At least it produces a sound. Whatever.
The hexadecimal content is:
95cd 21eb fc
And the binary representation is:
10010101 11001101 00100001 11101011 11111100
Using xxd I also get the printable chars from the content, which are:
..!..
And that makes me a little confused. Looking up the values in the ASCII table (e.g. here), I get this as a result:
•Í!ëü
At least the exclamation mark is correct.
But how does 95cd21ebfc translate into ..!..?
Side note:
file -bi fuenf.com says the encoding is not known:
charset=unknown-8bit
And iconv -f ISO-8859-1 -t UTF-8 fuenf.com returns
Í!ëü
Which leads to the assumption that xxd simply cannot decode the content and therefore just uses a default result, like the dot?
First of all, this is not a text file, so looking at it as one makes no sense. It's instructions.
Secondly, even if it could be interpreted as text, you would need to know the encoding. It's definitely not ASCII, because that only defines symbols in the range 0-127 (and the 3rd byte here is the only one in that range, which maps to '!'). The "extended ASCII" table you link to is only one of many possible code pages that give meaning to the values 128-255. Calling it "extended ASCII" is misleading, because it suggests that the ASCII standard was officially extended to cover those values, which it was not. For a while, computer vendors just did whatever they wanted with those additional characters, and some of them became quasi-standards by virtue of being included in DOS, Windows, etc. Or they got standardized by ISO (you tried iso-8859-1, which is one such standard).
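As for the dots: the right-hand column of xxd shows a byte as a character only when it is printable ASCII and prints "." for everything else, so no decoding is attempted at all. A small Python sketch of that rule (ascii_column is an illustrative name, not part of xxd), next to a Windows-1252 decode, which appears to be the table the question looked the values up in:

```python
data = bytes.fromhex("95cd21ebfc")


def ascii_column(raw):
    """Mimic the text column of a hex dump: printable ASCII or a dot."""
    return "".join(chr(b) if 0x20 <= b <= 0x7E else "." for b in raw)


print(ascii_column(data))      # ..!..
print(data.decode("cp1252"))   # •Í!ëü
```

Only 0x21 falls in the printable ASCII range, which is why the exclamation mark is the one character both views agree on.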

Given final block not properly padded. Such issues can arise if a bad key is used during decryption

Hi guys, I encrypted a school project, but the text file where I saved my AES key has been deleted. I had taken a picture of it beforehand, so I typed the key into a new file. But the new AES key file does not match the one in the JPEG picture. I can't find which character is wrong. Could you please help me?
Pic : https://i.stack.imgur.com/pAXzl.jpg
Text file : http://textuploader.com/dfop6
If you directly convert bytes with arbitrary values to Unicode you may lose information, because some bytes will not correspond to a Unicode character, and others will map to whitespace or other characters that cannot be easily distinguished in printed form.
Of course there may be ways to brute force your way out of this, but this could easily result in very complex code and possibly near infinite running time. Better start over, and if you want to use screen shots or similar printed text: base 64 or hex encode your results; those can be easily converted back.
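A minimal Python sketch of that advice, using a made-up 16 byte key (the real key from the question is not known here): both hex and base 64 use only plain printable characters and convert back to exactly the same bytes.

```python
import base64

# Hypothetical AES-128 key, for illustration only.
key = bytes(range(16))

hex_text = key.hex()
b64_text = base64.b64encode(key).decode("ascii")
print(hex_text)   # 000102030405060708090a0b0c0d0e0f
print(b64_text)

# Either text form can be retyped from a screenshot and decoded again.
restored_from_hex = bytes.fromhex(hex_text)
restored_from_b64 = base64.b64decode(b64_text)
print(restored_from_hex == key and restored_from_b64 == key)
```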

Using Coldfusion's Encrypt function to encrypt a hex block and return a block-length result

My company is working on a project that will put card readers in the field. The readers use DUKPT TripleDES encryption, so we will need to develop software that will decrypt the card data on our servers.
I have just started to scratch the surface on this one, but I find myself stuck on a seemingly simple problem... In trying to generate the IPEK (the first step to recreating the symmetric key).
The IPEK's a 16 byte hex value created by concatenating two triple DES encrypted 8 byte hex strings.
I have tried ECB and CBC (zeros for IV) modes with and without padding, but the result of each individual encoding is always 16 bytes or more (2 or more blocks) when I need a result that's the same size as the input. In fact, throughout this process, the ciphertexts should be the same size as the plaintexts being encoded.
<cfset x = encrypt("FFFF9876543210E0",binaryEncode(binaryDecode("0123456789ABCDEFFEDCBA98765432100123456789ABCDEF", "hex"), "base64") ,"DESEDE/CBC/PKCS5Padding","hex",BinaryDecode("0000000000000000","hex"))>
Result: 3C65DEC44CC216A686B2481BECE788D197F730A72D4A8CDD
If you use the NoPadding flag, the result is:
3C65DEC44CC216A686B2481BECE788D1
I have also tried encoding the plaintext hex message as base64 (as the key is). In the example above that returns a result of:
DE5BCC68EB1B2E14CEC35EB22AF04EFC.
If you do the same, except using the NoPadding flag, it errors with "Input length not multiple of 8 bytes."
I am new to cryptography, so hopefully I'm making some kind of very basic error here. Why are the ciphertexts generated by these block cipher algorithms not the same lengths as the plaintext messages?
For a little more background, as a "work through it" exercise, I have been trying to replicate the work laid out here:
https://www.parthenonsoftware.com/blog/how-to-decrypt-magnetic-stripe-scanner-data-with-dukpt/
I'm not sure if it is related and it may not be the answer you are looking for, but I spent some time testing bug ID 3842326. When using different attributes CF is handling seed and salt differently under the hood. For example if you pass in a variable as the string to encrypt rather than a constant (hard coded string in the function call) the resultant string changes every time. That probably indicates different method signatures - in your example with one flag vs another flag you are seeing something similar.
Adobe's response is that, given the resulting string can be decrypted in either case, this is not really a bug, more of a behavior to note. Can your resultant string be decrypted?
The problem is encrypt() expects the input to be a UTF-8 string. So you are actually encrypting the literal characters F-F-F-F-9.... rather than the value of that string when decoded as hexadecimal.
Instead, you need to decode the hex string into binary, then use the encryptBinary() function. (Note, I did not see an iv mentioned in the link, so my guess is they are using ECB mode, not CBC.) Since the function also returns binary, use binaryEncode to convert the result to a more friendly hex string.
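The sizes in the question follow from that. A rough Python sketch of the length arithmetic only (no encryption is performed; pkcs5_padded_len is an illustrative helper, and 8 bytes is the Triple DES block size):

```python
BLOCK = 8  # Triple DES block size in bytes


def pkcs5_padded_len(n, block=BLOCK):
    """PKCS5 always adds 1..block bytes, so an aligned input grows by a full block."""
    return n + (block - n % block)


text = "FFFF9876543210E0"
as_characters = text.encode("utf-8")   # what encrypt() sees: 16 bytes
as_hex_value = bytes.fromhex(text)     # what the KSN really is: 8 bytes

print(len(as_characters), pkcs5_padded_len(len(as_characters)))  # 16 24
print(len(as_hex_value), pkcs5_padded_len(len(as_hex_value)))    # 8 16
```

24 bytes is the 48 hex digit result with PKCS5Padding, 16 bytes is the 32 hex digit result with NoPadding, and only the 8 byte binary input with no padding gives a single block.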
Edit: Switching to ECB + "NoPadding" yields the desired result:
ksnInHex = "FFFF9876543210E0";
bdkInHex = "0123456789ABCDEFFEDCBA98765432100123456789ABCDEF";
ksnBytes = binaryDecode(ksnInHex, "hex");
bdkBase64 = binaryEncode(binaryDecode(bdkInHex, "hex"), "base64");
bytes = encryptBinary(ksnBytes, bdkBase64, "DESEDE/ECB/NoPadding");
leftRegister = binaryEncode(bytes, "hex");
... which produces:
6AC292FAA1315B4D
In order to do this we want to start with our original 16 byte BDK
... and XOR it with the following mask ....
Unfortunately, most of the CF math functions are limited to 32 bit integers. So you probably cannot do that next step using native CF functions alone. One option is to use java's BigInteger class. Create a large integer from the hex strings and use the xor() method to apply the mask. Finally, use the toString(radix) method to return the result as a hex string:
bdkText ="0123456789ABCDEFFEDCBA9876543210";
maskText = "C0C0C0C000000000C0C0C0C000000000";
// use radix=16 to create integers from the hex strings
bdk = createObject("java", "java.math.BigInteger").init(bdkText, 16);
mask = createObject("java", "java.math.BigInteger").init(maskText, 16);
// apply the mask and convert the result to hex (upper case)
newKeyHex = ucase( bdk.xor(mask).toString(16) );
WriteOutput("<br>newKey="& newKeyHex);
writeOutput("<br>expected=C1E385A789ABCDEF3E1C7A5876543210");
That should be enough to get you back on track. Given some of CF's limitations here, java would be a better fit IMO. If you are comfortable with it, you could write a small java class and invoke that from CF instead.
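For comparison, the same XOR step as a short Python sketch, using the values from the snippet above. One detail worth noting: BigInteger's toString(16) does not keep leading zeros, so a result that starts with a zero nibble would come back shorter than 32 digits; the sketch pads explicitly.

```python
bdk_text = "0123456789ABCDEFFEDCBA9876543210"
mask_text = "C0C0C0C000000000C0C0C0C000000000"

# Parse both hex strings as integers, XOR them, and format the result
# as 32 upper-case hex digits, zero-padded on the left.
new_key = int(bdk_text, 16) ^ int(mask_text, 16)
new_key_hex = format(new_key, "032X")

print(new_key_hex)  # C1E385A789ABCDEF3E1C7A5876543210
```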

What is the encoding of this data?

Who can tell me the encoding type of these data? It doesn't seem to be base64.
/zZ/u00GIaP9HW010G000G01003/sm1302WS7YCU6IWZ8ICjAoWmF6H1F3StF7jONKbaaO2Pbe+
0Z8gWjER3eAhQhOgCoF/Bskxr////cy7////w/+Rz//Z/sm130IijBJmrF7P1GNRufOob+FZu+F
Zu+FZu+FZu+FZu+FZu+FZu+FZu+FZu+FZu+FZu+FZu+FZu+FZu+FZu+FZu+FZ/m00H200X07430
I800X410n41/yG07m000GK10G410G400000000000420mG51WS82GeB/yG0jH000W430m840mK5
10G0005z0G8300GH1H8XCK464r5X1o9n53A1aQ488qAnmHLIqV0aCs9oWWaA5XSO6Heb9YSeAIe
qDJOtE3awGqH5HaT8IKfJL5LMLrXPMcDaPMPdQ6bgStHrTdTuUNg3X8M6XuY9YfAJb9MMbvYPcg
AZfAMcfwYfghApjBMsjxYvkiB3nCN6nyZ9ojBJrDNMrzZPsk7Yu+JbvkVewUhnylFqzVRt+Fdw/
yG07m400m410G410G410G00000000420mG51WS82GeB/yG0jH400W4210G310S510G00G9t0042
0n441I4n1X91KGTXSHCYCe4854AHeR712ICpKl0LOdBH2XOaDE4byHSO6Hec9oWfAZKsDpWvEaD
4HKP7I4bAKrHLLbTOMLfZP6LcPsXfQdDqTNPtU7bwWeE4XOQ7Y8cAafEKbPQNc9cQegEafQQdgA
cgihEqjRQtkBcwmiF4nSR7oCdAqjFKrTRNsDdQukFavURdwEdgylFqzVRt+Fdw/ze030C1008H0
n40Fm2vQMjkzf0pGHVSKab1ad5I/PBPkbl5Z/S7D5dyrb1dfvQ/ZnIotCAIB6x7B38LLB5lm7Qc
G9zajcwMyMFzmSr18sd82pnn1586H5a4aP7EFIfblRUMRoVCsj/TO5IVph7kBAUH1TnkMJPl18s
bU/JFuvxydwWpLYZi9s8YItR7OAkV/m1LFIsjPLLrjujX08FbWPhdxUEIvlnvESbzsuZE1dgSd+
jT7DF77jyni1ZXG0IMFq52JUmCRzajcwMyMFy0S7D7sIsRfRnO/m1mSqXlRSo17VPaP0TIkVpxK
iy+C95jQHssgfFV6Sds0fyhMuYbBFPBUHsyTh4+M2imK31pzAjImsKKPbaWYMDUfyiS/fM8xgcg
ix7v1EIJrutLSdqoUxccdSX2I0e7Ex0ndhmE/Vy0ngQIjOQBqCTZSblAXYPKE2H6C4/N5IVPBPk
bl5Z/071pMHRwQ3V97vmEq1sKZ3O/0d+VlMpBSHHZzv8gZhZkVmzAWFGRzajcwMyMFzmSqVPBPk
bl5Z/S7DBzfYPrGahk+xkKZT+TJVV/0Dt+T0Qd6qKKKYZgxFvhA3FJor/7Yi+rbaRKBif5vpbzf
F2XL1nr/BZlYj2p+QoWpqyjVnugl5ljRYLNHtXcSoAw8JrwWu/3wqo1VEZakaIxjbZaF+hSuODW
yOFzAfHtKgj1O+KZgGkL2bICW7a+eF9EEsQkp1hwvX2HbOeM3cHq8px3FwrFBPmN4WaTCaSWXYC
dru/3ds50p1byrBpVRAoC+S8WzEm0wZZkFK7a6j4oksgkVACiYe0g361m2VcFrFrpLo6pWZVUY3
e02UJZ6C0+d+UcAYn91TFAoj97C536DUX0nqzEl+UkbDsk9wYpJ8P45vRg49mid3BdwaSVvxKv5
bCiapnAt92yu9K7W3sxzidZfKTtklWi4KR1IGnaT201xPxrU+//0Blyw9Q90SvgCPMyas/SPYm9
mESPFFy08VJ7NdRUQIApCio1olB2FE2Czi+t+SK2pWPjs7FkP6EUdlx3yXKWXZC8Y0Fb3W0a/m0
wKfNIGRb3Hcy/xHCxPTt+PSzktuSdyglI97l+q6CiLN0mCa/GVv/AajxQA5I8a037B7+yVyAYbb
dIut6DfBSZ0239pKeQrUX3VJ6TOeoZHHkGJ98E1MZz/m3tVvrHkNUyZycE1nd1tEk0Ajn9jYHCv
LL0p/UflOgMoHo5555G0KKKK0555501HHHG0KKKK0555501HHHG0KKKK0555507/za
(Line endings inserted for readability)
It appears to be base64. If you add a single equal ('=') for padding to the end your decoder should be happy (see https://en.wikipedia.org/wiki/Base64#Padding).
Decoded, it's 1568 bytes, which mod 16 is zero. The histogram of byte value occurrence is flat. So I'd guess something encrypted with a 128-bit block cipher like AES.
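A small Python sketch of the padding step, assuming the standard base64 alphabet (decode_unpadded and decoded_len are illustrative names). It strips the inserted line endings, adds whatever "=" padding is missing, and shows the length arithmetic for 2091 characters:

```python
import base64


def decode_unpadded(text):
    """Remove whitespace, restore '=' padding, then decode."""
    compact = "".join(text.split())
    compact += "=" * (-len(compact) % 4)
    return base64.b64decode(compact)


def decoded_len(n_chars):
    """Byte count for n_chars base64 characters (n_chars % 4 must not be 1)."""
    pad = -n_chars % 4
    return (n_chars + pad) // 4 * 3 - pad


print(decode_unpadded("aGVs\nbG8"))   # b'hello'
print(decoded_len(2091))              # 1568
print(decoded_len(2091) % 16)         # 0
```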
It does look like base64 to me. Most variants of base64 include the following characters:
A-Z a-z 0-9 + / (and = for padding)
However, if it were proper base64 it would end with a single = as padding, as the 2091 characters don't exactly fit a number of bytes.
Your data doesn't seem to decode to anything readable, so it might be binary data, or encrypted (or both). Only with thorough knowledge of cryptography systems, and a lot of hints and luck, might some expert be able to figure out the encryption used (if any), but that's beyond the scope of this site.
Without more information as to the source of the data, we can only guess.

GS1 support in a QR encoder/decoder?

Very few QR encoders/decoders have (explicit) support for so-called GS1 encoding. Zint is one of the exceptions (under QR select GS-1 Data Mode), but its license prevents me from using it. Commercial offers, mainly from Tec-It, are costly, especially because I'm not interested in all other kinds of barcodes they support.
Is there a way to add GS1 support to any QR encoder/decoder without changing its source? For example, could I apply some algorithm to transform textual GTIN AI data into compatible binary? I think it should be possible, because after all, it's still QR. Please note that I am not a data coding expert - I'm just looking for a way to deal with this standard without paying a small fortune. So far, I found postscriptbarcode which does have support for it, and seems to use its own QR engine, but output quality is so-so and my PostScript skills are far too limited to figure out the algorithm.
As long as the library supports decoding of the FNC1 special character, it can be used to read GS1 codes. The FNC1 character is not a byte in the data-stream, but more of a formatting symbol.
The specification says that a leading FNC1-character is used to identify GS1 barcodes, and should be decoded as "]d2" (GS1 DataMatrix), "]C1" (GS1-128), "]e0" (GS1 DataBar Omnidirectional) or "]Q3" (GS1 QR Code). Any other FNC1-characters should be decoded as ASCII GS-characters (byte value 29).
Depending on the library, the leading FNC1 might be missing, or decoded as GS (not critical), or the embedded FNC1-characters might be missing (critical). The embedded FNC1-characters are used to delimit variable-length fields.
You can read the full specification here (pdf). The algorithm for decoding the data can be found under heading 7.9 Processing of Data from a GS1 Symbology using GS1 Application Identifiers (page 426).
The algorithm goes something like this:
Peek at the first character.
If it is ']',
    If string does not start with ']C1' or ']e0' or ']d2' or ']Q3',
        Not a GS1 barcode.
        Stop.
    Consume the characters.
Else if it is <GS>,
    Consume character.
Else,
    No symbology identifier, assume GS1.
While not end of input,
    Read the first two digits.
    If they are in the table of valid codes,
        Look up the length of the AI-code.
        Read the rest of the code.
        Look up the length of the field.
        If it is variable-length,
            Read until the next <FNC1> or <GS>.
        Else,
            Read the rest of the field.
        Peek at the next character.
        If it is <FNC1> or <GS>, consume it.
        Save the read field.
    Else,
        Error: Invalid AI
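The steps above can be sketched in Python roughly as follows. This is a simplified illustration, not a full implementation: AI_TABLE is a tiny hand-picked subset with two-digit AIs only (a real decoder needs the complete table from the GS1 specification, including longer AIs), and it assumes the library has already turned embedded FNC1 characters into GS (byte value 29).

```python
GS = "\x1d"  # ASCII group separator, byte value 29

# Illustrative subset: AI -> fixed data length, or None for variable length.
AI_TABLE = {"01": 14, "15": 6, "10": None, "30": None}
SYMBOLOGY_IDS = ("]C1", "]e0", "]d2", "]Q3")


def parse_gs1(data):
    """Split a decoded GS1 string into (AI, value) pairs."""
    if data.startswith("]"):
        if not data.startswith(SYMBOLOGY_IDS):
            raise ValueError("not a GS1 barcode")
        data = data[3:]            # consume the symbology identifier
    elif data.startswith(GS):
        data = data[1:]            # leading FNC1 decoded as GS
    fields = []
    pos = 0
    while pos < len(data):
        ai = data[pos:pos + 2]
        if ai not in AI_TABLE:
            raise ValueError(f"invalid AI {ai!r}")
        pos += 2
        length = AI_TABLE[ai]
        if length is None:         # variable length: read up to the next GS
            end = data.find(GS, pos)
            if end == -1:
                end = len(data)
        else:                      # fixed length
            end = pos + length
        fields.append((ai, data[pos:end]))
        pos = end
        if data[pos:pos + 1] == GS:
            pos += 1               # consume the separator
    return fields


print(parse_gs1("]Q301049123451234591597033130128" + GS + "10ABC123"))
```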
The binary data in the QR Code is encoded as 4-bit tokens, with embedded data.
0111 -> Start Extended Channel Interpretation (ECI) Mode (special encodings).
0001, 0010, 0100, 1000 -> start numeric, alphanumeric, raw 8-bit, kanji encoded data.
0011 -> structured append (combine two or more QR Codes to one data-stream).
0101 -> FNC1 initial position.
1001 -> FNC1 other positions.
0000 -> End of stream (can be omitted if not enough space).
After an encoding specification comes the data-length, followed by the actual data. The meaning of the data bits depends on the encoding used. In between the data-blocks, you can squeeze FNC1 characters.
The QR Code specification (ISO/IEC 18004) unfortunately costs money (210 Franc). You might find some pirate version online though.
To create GS1 QR Codes, you need to be able to specify the FNC1-characters in the data. The library should either recognize the "]Q3" prefix and GS-characters, or allow you to write FNC1 tokens via some other method.
If you have some way to write the FNC1-characters, you can encode GS1 data as follows:
Write initial FNC1.
For each field,
    Write the AI-code as decimal digits.
    Write field data.
    If the code is a variable-length field,
        If not the last field,
            Write FNC1 to terminate the field.
If possible, you should order the fields such that a variable-length field comes last.
As noted by Terry Burton in the comments, the FNC1 symbol in a GS1 QR Code can be encoded as % in alphanumeric data, and as GS in byte mode. To encode an actual percent symbol, you write it as %%.
To encode (01) 04912345123459 (15) 970331 (30) 128 (10) ABC123, you first combine it into the data string 01049123451234591597033130128%10ABC123 (% indicator is the encoded FNC1 symbol). This string is then written as
0101 - Initial FNC1, GS1 mode indicator
0001 - QR numeric mode
0000011101 - Data length (29)
<data bits for "01049123451234591597033130128">
0010 - QR alphanumeric mode
000001001 - Data length (9)
<data bits for "%10ABC123">
(Example from the ISO 18004:2006 specification)
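A short Python sketch of the encoding side, under a few assumptions: VARIABLE_LENGTH_AIS is an illustrative subset rather than the full GS1 table, the initial FNC1 is left to the QR library (it is the 0101 mode indicator, not part of the data string), and the count widths of 10 bits (numeric) and 9 bits (alphanumeric) apply to QR versions 1 to 9.

```python
# Illustrative subset of variable-length AIs; the real list is in the GS1 spec.
VARIABLE_LENGTH_AIS = {"10", "30"}


def build_gs1_data(fields, fnc1="%"):
    """Join (AI, value) pairs, adding FNC1 after variable-length fields."""
    out = []
    last = len(fields) - 1
    for i, (ai, value) in enumerate(fields):
        if fnc1 == "%":
            value = value.replace("%", "%%")   # literal percent sign
        out.append(ai + value)
        if ai in VARIABLE_LENGTH_AIS and i != last:
            out.append(fnc1)
    return "".join(out)


fields = [("01", "04912345123459"), ("15", "970331"),
          ("30", "128"), ("10", "ABC123")]
data = build_gs1_data(fields)
print(data)  # 01049123451234591597033130128%10ABC123

# Split at the FNC1 as in the example: a numeric segment, then an alphanumeric one.
numeric_part, alnum_part = data[:29], data[29:]
numeric_count = format(len(numeric_part), "010b")
alnum_count = format(len(alnum_part), "09b")
print(numeric_count)  # 0000011101
print(alnum_count)    # 000001001
```

For byte mode, passing fnc1="\x1d" would write GS instead of %.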
