How do I convert a signed 16-bit hexadecimal to decimal? - r

I have a very long sequence of data stored as unsigned 16-bit PCM (it's audio). Using Frhed, I can view the file as hex. I'd like to convert the hex into decimal. So far I've exported into a csv file and attempted to convert in R using as.integer(). However, this doesn't seem to take into account that the values are signed.

You can convert hex text strings to decimal integers using strtoi. You will need to specify that the base is 16.
HexData = c("A167", "AE8F", "6A99", "0966", "784D", "A637", "102D", "4521")
strtoi(HexData, base=16L)
[1] 41319 44687 27289 2406 30797 42551 4141 17697
This is assuming unsigned data. Your question mentions both signed and unsigned so I am not quite sure which you have.
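If your samples are in fact signed 16-bit (two's complement) values, you can reinterpret the unsigned results by subtracting 65536 from anything at or above 32768. A minimal sketch continuing from the HexData example above:
unsigned <- strtoi(HexData, base=16L)
signed <- ifelse(unsigned >= 32768L, unsigned - 65536L, unsigned)
signed
[1] -24217 -20849  27289   2406  30797 -22985   4141  17697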

Related

QChar stores a negative Latin1 code for multiply sign '×'

I want to get the Latin1 code for multiply sign ×, but when I check the value inside the QChar it has -41'×'.
My code:
QString data = "×";
QChar m = data.at(0);
unsigned short ascii = (unsigned short)m.toLatin1();
When I debug, in the second line I see the QChar value is -41'×'.
I changed the code:
unsigned int ascii = m.unicode();
But I get the value 215 rather than the 158 I expect.
The multiply sign × is not an ASCII character, as you can see by checking man ascii if you are on a Unix system.
What its value is depends on the encoding; see here for its UTF representations.
For example, in UTF-8 it is encoded as the two bytes 0xC3 0x97.
As mentioned on the Unicode page I linked, 215 is the decimal value that represents this character in UTF-16 encoding, which is what m.unicode() returns.
I don't know why you expect 158.
There is an ascii multiply sign though, which is *.
If you check the Latin1 code table, you'll see that × is indeed encoded as 215, which reads as -41 when interpreted as a signed char (215 - 256 = -41). Qt is giving you the correct result.
Your mistakes are:
Assuming that Latin1 is equivalent to ASCII. Latin1 merely contains ASCII; it is a superset that defines twice as many codes as ASCII does (256 versus 128).
Assuming that × is representable in ASCII. It is not.
I have no clue where you got the idea that Latin1-encoded × should be 158. Surely it didn't come from the Latin1 code table! Incidentally, the Latin1 code for × and its Unicode code point are numerically identical (215, i.e. U+00D7).
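To double-check those numbers, here is a quick arithmetic sketch in R (the language used in the question at the top of this page); it only verifies the values, nothing Qt-specific:
strtoi("D7", base=16L)   # Latin1 / Unicode code for the multiply sign
[1] 215
strtoi("D7", base=16L) - 256   # same byte read as a signed 8-bit value
[1] -41
The -41 shown in the debugger is simply the byte 0xD7 interpreted as a signed char.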

How can I use SyncSort to convert data to unsigned packed format?

I have a requirement to convert numeric data (stored as character on input) to either packed signed or packed unsigned formats. I can convert to packed/signed using the "PD" format, but I'm having a difficult time getting unsigned packed data.
For instance, I need a ZD number like 14723 converted to (shown here in vertical hex):
042
173
Using PD, I get this (which is fine):
0173
042C
Any suggestions? We do not have COBOL at this shop and are relying on SyncSort to handle these data conversions. I'm not seeing a "PK" option in SyncSort, but I've missed things before!
So you don't want a packed-decimal, which always has a sign (even when F for unsigned) in the low-order half-byte. You want Binary Coded Decimal (BCD).
//STEP0100 EXEC PGM=SORT
//SYSOUT DD SYSOUT=*
//SORTOUT DD SYSOUT=*
//SYSIN DD *
OPTION COPY
INREC IFTHEN=(WHEN=INIT,OVERLAY=(1,5,ZD,MUL,+10,TO=PD,LENGTH=4)),
IFTHEN=(WHEN=INIT,BUILD=(1,3))
//SORTIN DD *
14723
Will give you, in vertical hex:
042
173
The MUL with +10 shifts the digits one nibble to the left (147230 packs as X'0147230C'), so BUILD=(1,3) keeps the first three bytes, X'014723', which is the unsigned BCD value you want. To use an existing BCD, look at field-type PD0.

What is the difference between char and nchar datatype when installing oracle 11g with unicode char set option?

I installed Oracle 11g with the Unicode character set option, and I found that I can insert Unicode characters into a CHAR datatype column. So my question is:
what is the difference between the CHAR and NCHAR datatypes when installing Oracle 11g with the Unicode option?
There are two main differences.
1. The default length semantic. By default, CHAR(30) != NCHAR(30), but CHAR(30 CHAR) = NCHAR(30).
The default length semantic (as specified by the NLS_LENGTH_SEMANTICS parameter) is used for CHAR but not for NCHAR, and its default value is BYTE. The length of NCHAR is always in characters. This is important because NCHAR(30) will always hold 30 Unicode characters - as will CHAR(30 CHAR) - but CHAR(30) will by default only hold 30 bytes, which may or may not equal 30 Unicode characters.
2. AL32UTF8 (the default Unicode database character set) and AL16UTF16 (the NLS_NCHAR_CHARACTERSET default) are not equivalent. Both are variable-length Unicode character sets but store characters differently, so storage requirements vary between the two: the former uses 1, 2, 3 and sometimes 4 bytes per character, the latter 2 and sometimes 4 bytes per character. Your mileage will vary depending on the characters you store.
Additionally, NCHAR support is limited in many client applications and some Oracle components, so if you use AL32UTF8 for the database character set, Oracle's advice is to stick to CHAR and not use NCHAR at all.

What causes XOR encryption to return a "blank"?

What causes certain characters to become blank when using XOR encryption? Furthermore, how can this be compensated for when decrypting?
For instance:
....
void basic_encrypt(char *to_encrypt) {
    char c;
    while (*to_encrypt) {
        *to_encrypt = *to_encrypt ^ 20;
        to_encrypt++;
    }
}
will return "nothing" for the character k. Clearly, character decay is problematic for decryption.
I assume this is caused by the bit operator, but I am not very good with binary so I was wondering if anyone could explain.
Is it converting an element, k, in this case, to some spaceless ASCII character? Can this be compensated for by choosing some y < x < z operator where x is the operator?
Lastly, if it hasn't been compensated for, is there a realistic decryption strategy for filling in blanks besides guess and check?
'k' has the ASCII value 107 = 0x6B. 20 is 0x14, so
'k' ^ 20 == 0x7F == 127
if your character set is ASCII compatible. 127 is \DEL in ASCII, which is a non-printable character, so won't be displayed if you print it out.
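To check the arithmetic quickly, a small sketch in R (the language used at the top of this page); it just reproduces the XOR above:
bitwXor(utf8ToInt("k"), 20L)   # 0x6B XOR 0x14
[1] 127
127 is the non-printable DEL character, which is why nothing visible appears.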
You need to know the difference between bytes and characters to understand what is happening. On the one hand you have the C char type, which simply represents a byte, not a character.
In the old days each character was mapped to one byte or octet value in a character encoding table, or code page. Nowadays we have encodings that take more bytes for certain characters, e.g. UTF-8, or even encodings that always take more than one byte, such as UTF-16. Those last two are Unicode encodings, which means that each character has a certain number value and the encoding is used to encode this number into bytes.
Many computers will interpret bytes as ISO/IEC 8859-1 (Latin-1), sometimes extended to Windows-1252. These code pages have holes for control characters, or byte values that are simply not used. Now it depends on the runtime system how these values are handled. Java by default substitutes a ? character in place of the missing character. Other runtimes will simply drop the value or - of course - execute the control code. Some terminals may use the ESC control code to set the color or to switch to another code page (making a mess of the screen).
This is why ciphertext should be converted to another encoding, such as hexadecimal or Base64. These encodings make sure that the result is readable text. That takes care of the ciphertext. You will have to choose a character set for your plaintext too, e.g. simply perform ASCII or UTF-8 encoding before encryption.
Getting a zero value from encryption does not matter, because once you re-XOR with the same key you get the original value back. If a plaintext byte happens to equal the key byte:
key == value
value XOR key == 0 [encryption]
(value XOR key) XOR key == value [decryption]
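A minimal round-trip sketch in R (using integer byte values rather than a C string, so the intermediate zero byte is harmless here); the single-byte key is an assumption for illustration:
key <- utf8ToInt("k")              # key byte happens to equal the plaintext byte
cipher <- bitwXor(utf8ToInt("k"), key)
cipher
[1] 0
bitwXor(cipher, key)               # decryption recovers the original byte
[1] 107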
If you're using a zero-terminated string mechanism, then you have two main strategies for preventing 'character degradation':
store the length of the string before encryption and make sure to decrypt at least that number of characters on decryption
check for a zero character after decoding the character

ColdFusion: Integer "0" doesn't convert to ASCII character

I have a string (comprised of a userID and a date/time stamp), which I then encrypt using ColdFusion's Encrypt(inputString, myKey, "Blowfish/ECB/PKCS5Padding", "Hex").
In order to interface with a 3rd party I then have to perform the following:
Convert each character pair within the resultant string into a HEX value.
HEX values are then represented as integers.
Resultant integers are then output as ASCII characters.
All the ASCII characters combine to form a Bytestring.
Bytestring is then converted to Base64.
Base64 is URL encoded and finally sent off (phew!)
It all works seamlessly, APART FROM when the original cfEncrypted string contains a "00".
The HEX value 00 translates (via the InputBaseN function) to the integer 0, which then refuses to translate correctly into an ASCII character!
The resultant Bytestring (and therefore URL string) is messed up and the 3rd party is unable to decipher it.
It's worth mentioning that I do declare: <cfcontent type="text/html; charset=iso-8859-1"> at the top of the page.
Is there any way to correctly output 00 as ASCII? Could I avoid having "00" within the original encrypted string? Any help would be greatly appreciated :)
I'm pretty sure ColdFusion (and the Java underneath) use a null-terminated string type. This means that every string contains one and only one asc(0) char, which is the string terminator. If you try to insert an asc(0) into a string, CF throws an error because you are trying to create a malformed string element.
I'm not sure what the end solution is. I would play around with toBinary() and toString(), and talk to your 3rd party vendor about workarounds like sending the raw hex values or similar.
Actually there is a very easy solution. The credit card company that is processing your request needs you to convert the value to lower-case hex letters. The only characters involved are :, -, and 0-9; do an if/else and convert them manually into a string.
