Understanding IR codes for Samsung TV - arduino

Can somebody help me understand how I could use raw IR data in a project that uses the ESP8266-HTTP-IR-Blaster library?
I've built a NodeMCU board with an IR sender and receiver according to https://github.com/mdhiggins/ESP8266-HTTP-IR-Blaster
Everything works fine as long as I use the captured codes, for example:
http://NodeMCU-IP/msg?code=E0E040BF:SAMSUNG:32
This is the code for the Power button (E0E040BF).
As I'm using it in a home automation system, it would be very beneficial to have dedicated ON and OFF sequences. I found them here: http://www.remotecentral.com/cgi-bin/mboard/rc-discrete/thread.cgi?5780 , but I am unable to translate, modify, or send them as RAW data.
Using Node-RED, I tried many ways and also added an MQTT client to the original project, but it does not accept these codes however I try. It does not send them.
I also tried sending the code as JSON, which didn't help.
[
{
"type":"raw",
"data":"[0000, 006D, 0000, 0022, 00AC, 00AC, 0015, 0040, 0015, 0040, 0015, 0040, 0015, 0015, 0015, 0015, 0015, 0015, 0015, 0015, 0015, 0015, 0015, 0040, 0015, 0040, 0015, 0040, 0015, 0015, 0015, 0015, 0015, 0015, 0015, 0015, 0015, 0015, 0015, 0015, 0015, 0015, 0015, 0015, 0015, 0040, 0015, 0040, 0015, 0015, 0015, 0015, 0015, 0040, 0015, 0040, 0015, 0040, 0015, 0040, 0015, 0015, 0015, 0015, 0015, 0040, 0015, 0040, 0015, 0015, 0015, 0689]",
"khz":38
}
]
Any idea what I could try next?

A good introduction to the 'Pronto format' that you show above is at Remote Central
For the specific example above, for a Samsung OFF code given at your remote central link, the full code is given as a sequence of 16-bit numbers represented in hexadecimal with spaces in between:
0000 006D 0000 0022 00AC 00AC 0015 0040 0015 0040 0015 0040 0015 0015 0015 0015 0015 0015 0015 0015 0015 0015 0015 0040 0015 0040 0015 0040 0015 0015 0015 0015 0015 0015 0015 0015 0015 0015 0015 0015 0015 0015 0015 0015 0015 0040 0015 0040 0015 0015 0015 0015 0015 0040 0015 0040 0015 0040 0015 0040 0015 0015 0015 0015 0015 0040 0015 0040 0015 0015 0015 0689
You can break that down as a preamble (broken out here for interest, but not needed if you already have a working Samsung code):
0x0000 - This is raw code format data
0x006D - Frequency 109 decimal = 38.028kHz (see above link for calculation)
0x0000 - No burst pairs in first sequence
0x0022 - 34 decimal - 34 burst pairs of signal follow
00AC 00AC - First burst of signal - on for 0xAC (172 decimal) cycles at 38kHz, off for the same amount
After that come 32 "burst pairs" of data (which is likely the only bit you need if you already have other codes for the same device)
0015 0689 - Final burst of signal - on for 0x15 (21 decimal) cycles, off for 0x689 (1673 decimal) cycles, guaranteeing 44ms without any IR before the next code can be transmitted
To interpret the data manually, copy it out (e.g. into a text editor) in groups of 8 numbers:
0015 0040 0015 0040 0015 0040 0015 0015
0015 0015 0015 0015 0015 0015 0015 0015
0015 0040 0015 0040 0015 0040 0015 0015
0015 0015 0015 0015 0015 0015 0015 0015
0015 0015 0015 0015 0015 0015 0015 0040
0015 0040 0015 0015 0015 0015 0015 0040
0015 0040 0015 0040 0015 0040 0015 0015
0015 0015 0015 0040 0015 0040 0015 0015
Then:
Ignore the columns where all the numbers are the same (the 1st, 3rd, 5th and 7th columns above, which represent the on time: 0x15 = 21 decimal cycles of IR at 38kHz)
For the remaining columns (which represent the off-time), replace the big numbers (0x40 in this case) with '1' and the small (0x15) with '0'.
For the first line
0015 0040 0015 0040 0015 0040 0015 0015
ignoring the on-time columns (1st, 3rd, 5th and 7th) leaves:
0040 0040 0040 0015
replacing those with 1's and 0's
1 1 1 0
and if you convert that into hexadecimal, it's 'E'
Next line is '0', then 'E' then '0' (already it's comforting to see it starting with the same E0E0 that starts your other Samsung code above...), and the remaining lines make it
E0E019E6
Doing the same with the ON code gives you
E0E09966
And as I've needed to solve the same problem just recently for the same codes, I can confirm that my Samsung TV responds to those codes as OFF and ON.
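The manual procedure above can also be sketched in a few lines of Python. This is only an illustration, assuming a raw (type 0000) Pronto code whose first burst pair is the lead-in and whose last pair is the trailing gap. The function names are made up for this example, and the 0.241246 µs time base is the commonly quoted Pronto clock constant, not something stated in this thread.

```python
PRONTO_CLOCK_US = 0.241246  # assumed Pronto time base, in microseconds

# OFF code from the question, rebuilt from the 8-number groups listed above.
PRONTO_OFF = " ".join([
    "0000 006D 0000 0022 00AC 00AC",
    "0015 0040 0015 0040 0015 0040 0015 0015",
    "0015 0015 0015 0015 0015 0015 0015 0015",
    "0015 0040 0015 0040 0015 0040 0015 0015",
    "0015 0015 0015 0015 0015 0015 0015 0015",
    "0015 0015 0015 0015 0015 0015 0015 0040",
    "0015 0040 0015 0015 0015 0015 0015 0040",
    "0015 0040 0015 0040 0015 0040 0015 0015",
    "0015 0015 0015 0040 0015 0040 0015 0015",
    "0015 0689",
])


def parse_pronto(code):
    """Return (carrier period in microseconds, list of (on, off) cycle counts)."""
    words = [int(w, 16) for w in code.split()]
    if words[0] != 0x0000:
        raise ValueError("only raw Pronto codes (type 0000) are handled")
    period_us = words[1] * PRONTO_CLOCK_US
    count = words[2] + words[3]          # burst pairs in both sequences
    flat = words[4:4 + 2 * count]
    if len(flat) != 2 * count:
        raise ValueError("code is shorter than its header claims")
    return period_us, list(zip(flat[0::2], flat[1::2]))


def pronto_to_hex(code):
    """Decode the data bits: a long off-time is 1, a short off-time is 0."""
    _, pairs = parse_pronto(code)
    data = pairs[1:-1]                   # drop the lead-in and the trailing gap
    value = 0
    for on, off in data:
        value = (value << 1) | (1 if off > 2 * on else 0)
    return "{:0{}X}".format(value, len(data) // 4)


def pronto_to_micros(code):
    """Flatten the burst pairs into on/off durations in microseconds."""
    period_us, pairs = parse_pronto(code)
    return [round(n * period_us) for pair in pairs for n in pair]
```

Calling pronto_to_hex(PRONTO_OFF) should reproduce the E0E019E6 value worked out by hand. pronto_to_micros gives the same signal as a list of microsecond timings. Whether the IR blaster firmware accepts such a list as raw data is not confirmed anywhere in this thread.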
Not surprisingly, there are a variety of software tools to convert between formats, and a huge range of formats to describe the same signal (explained very well by xkcd). For example, irdb on GitHub will decode the above string to "Protocol NECx2, device 7, subdevice 7, OBC 152". It's up to you to know that you have to:
bit-reverse the device number '07' to get 'E0'
bit-reverse the subdevice number (also '07') to get 'E0'
convert 152 to hexadecimal and reverse the bits to get '19'
calculate the last two digits as ( 0xFF - the bit-reversed OBC ), 0xFF - 0x19 = 0xE6, giving the final 8 bits 'E6'
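The four steps above are small enough to write down as code. The following Python sketch only restates that arithmetic. The function names are invented for the example, and it assumes 8-bit device, subdevice and OBC values as in the decoded string above.

```python
def reverse_bits8(value):
    """Reverse the bit order of an 8-bit value, e.g. 0x07 -> 0xE0."""
    result = 0
    for _ in range(8):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def necx2_to_hex(device, subdevice, obc):
    """Build the 32-bit hex code from device, subdevice and OBC."""
    command = reverse_bits8(obc)
    parts = [
        reverse_bits8(device),
        reverse_bits8(subdevice),
        command,
        0xFF - command,      # last byte is the complement of the command byte
    ]
    return "".join("{:02X}".format(p) for p in parts)
```

With device 7, subdevice 7 and OBC 152 this gives E0E019E6, the OFF code. Running the steps backwards on the Power code from the question, E0E040BF, gives OBC 2.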

Related

How could I determine this checksum pattern?

Let's assume that I have 512-byte packets plus a 16-bit CRC at the end. I would like to determine what the CRC parameters are.
It's a Fujitsu chip. I write the flash with a programmer, the programmer calculates the CRC for me, and I read out the CRC with an oscilloscope. I have the ability to check every possible combination.
My test messages are 512 zeros except for one byte that I set to the values 0 to 17 in decimal. The one byte is one of the first four or last two in the packet. Here are the resulting CRCs in hexadecimal, where the rows are the value of the byte, and the columns are which byte is set:
00 01 02 03 510 511
00 00 00 00 00 00 00
01 0x8108 0x0100 0x3020 0xC6B0 0xF1F0 0x8108
02 0x8318 0x0200 0x6040 0x0C68 0x62E8 0x8318
03 0x0210 0x0300 0x5060 0xCAD8 0x9318 0x0210
04 0x8738 0x0400 0xC080 0x18D0 0xC5D0 0x8738
05 0x0630 0x0500 0xF0A0 0xDE60 0x3420 0x0630
06 0x0420 0x0600 0xA0C0 0x14B8 0xA738 0x0420
07 0x8528 0x0700 0x90E0 0xD208 0x56C8 0x8528
08 0x8F78 0x0800 0x0008 0x31A0 0x0AA8 0x8F78
09 0x0E70 0x0900 0x3028 0xF710 0xFB58 0x0E70
10 0x0C60 0x0A00 0x6048 0x3DC8 0x6840 0x0C60
11 0x8D68 0x0B00 0x5068 0xFB78 0x99B0 0x8D68
12 0x0840 0x0C00 0xC088 0x2970 0xCF78 0x0840
13 0x8948 0x0D00 0xF0A8 0xEFC0 0x3E88 0x8948
14 0x8B58 0x0E00 0xA0C8 0x2518 0xAD90 0x8B58
15 0x0A50 0x0F00 0x90E8 0xE3A8 0x5C60 0x0A50
16 0x9FF8 0x1000 0x0010 0x6340 0x1550 0x9FF8
17 0x1EF0 0x1100 0x3030 0xA5F0 0xE4A0 0x1EF0
As you can see, the first and last bytes give the same value. I tried several variations of CRC-16, but without much luck. The closest one was CRC-16 with polynomial 0x1021 and initial value 0.
The fact that every single CRC ends in 0 or 8 strongly suggests that it is not a 16-bit CRC, but rather a 13-bit CRC. Indeed, all of the sequences check against a 13-bit CRC with polynomial 0x1021 not reflected, initial value zero, and final exclusive-or zero.
We can't be sure about the initial value and final exclusive-or unless you can provide at least one packet with a length other than 512. With only examples of a single length, there are 8,191 other combinations of initial values and final exclusive-ors that would produce the exact same CRCs.

Decode Epson print (ESC-i) command decoding/encoding

I'm trying to understand the algorithm used for compression value = 1 with the Epson ESCP2 print command, "ESC-i". I have a hex dump of a raw print file which looks, in part, like the hexdump below (note little-endian format issues).
000006a 1b ( U 05 00 08 08 08 40 0b
units; (page1=08), (vt1=08), (hz1=08), (base2=40 0b=0xb40=2880)
...
00000c0 691b 0112 6802 0101 de00
esc i 12 01 02 68 01 01 00
print color1, compress1, bits1, bytes2, lines2, data...
color1 = 0x12 = 18 = light cyan
compress1 = 1
bits1 (bits/pixel) = 0x2 = 2
bytes2 is ??? = 0x0168 = 360
lines2 is # lines to print = 0x0001 = 1
00000c9 de 1200 9a05 6959
00000d0 5999 a565 5999 6566 5996 9695 655a fd56
00000e0 1f66 9a59 6656 6566 5996 9665 9659 6666
00000f0 6559 9999 9565 6695 9965 a665 6666 6969
0000100 5566 95fe 9919 6596 5996 5696 9666 665a
0000110 5956 6669 0456 1044 0041 4110 0040 8140
0000120 9000 0d00
1b0c 1b40 5228 0008 5200 4d45
FF esc # esc ( R 00 REMOTE1
The difficulty I'm having is how to decode the data, starting at 00000c9, given 2 bits/pixel and the count of 360. It's my understanding this is some form of tiff or rle encoding, but I can't decode it in a way that makes sense. The output was produced by gutenprint plugin for GIMP.
Any help would be much appreciated.
The byte count is not a count of the bytes in the compressed input stream; it is the number of bytes once the stream has been expanded to its uncompressed form. So when expanded, there should be a total of 360 bytes. Each control byte in the input is read as a signed value. If it is non-negative, it is a count of literal bytes that follow, and the count is the byte value + 1. If it is negative, it is a repeat count for the single byte that immediately follows, and the count is the absolute value + 1. The 0D at the end is a terminating carriage return for the line as a whole.
The input stream is only considered as a string of whole bytes, despite the fact that the individual pixel/nozzle controls are only 2 bits each. So it is not really possible to use a repeat count for something like a 3-nozzle sequence; a repeat count must always specify a full byte 4-nozzle combination.
The above example then specifies:
0xde00 => repeat 0x00 35 times
0x12 => use the next 19 bytes as is
0xfd66 => repeat 0x66 4 times
0x1f => use the next 32 bytes as is
etc.
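The rule described above can be sketched as a short Python decoder. This is an illustration rather than Epson's or gutenprint's actual code: the function name is invented, the byte string is my reading of the little-endian word dump in the question (each 16-bit word swapped back into byte order), and a control byte of 0x80 is simply treated like any other negative count.

```python
def unpack_rle(stream, expected_len):
    """Expand the run-length data until expected_len bytes have been produced.

    Returns (expanded bytes, number of input bytes consumed).
    """
    out = bytearray()
    pos = 0
    while len(out) < expected_len:
        control = stream[pos]
        pos += 1
        if control < 128:
            count = control + 1              # literal run: copy the next bytes
            out += stream[pos:pos + count]
            pos += count
        else:
            count = 257 - control            # same as abs(signed value) + 1
            out += bytes([stream[pos]]) * count
            pos += 1
    return bytes(out), pos


# Data from offset 0xc9 of the dump, regrouped by control byte.
STREAM = bytes.fromhex(
    "de 00"
    "12 05 9a 59 69 99 59 65 a5 99 59 66 65 96 59 95 96 5a 65 56"
    "fd 66"
    "1f 59 9a 56 66 66 65 96 59 65 96 59 96 66 66 59 65"
    "   99 99 65 95 95 66 65 99 65 a6 66 66 69 69 66 55"
    "fe 95"
    "19 99 96 65 96 59 96 56 66 96 5a 66 56 59 69 66 56"
    "   04 44 10 41 00 10 41 40 00 40"
    "81 00"
    "90 00"
    "0d"
)

expanded, used = unpack_rle(STREAM, 360)
```

With this reading of the dump, the expansion stops at exactly 360 bytes and leaves only the trailing 0D unread, which is consistent with the byte count in the ESC i header.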

UTF-8 hex to unicode code point (only math)

Let's take this table with characters and HEX encodings in Unicode and UTF-8.
Does anyone know how it is possible to convert UTF-8 hex to Unicode code point using only math operations?
E.g. let's take the first row. Given 227, 129, 130, how do I get 12354?
Is there any simple way to do it by using only math operations?
Unicode code point    UTF-8                         Char
30 42 (12354)         e3 (227) 81 (129) 82 (130)    あ
30 44 (12356)         e3 (227) 81 (129) 84 (132)    い
30 46 (12358)         e3 (227) 81 (129) 86 (134)    う
* Source: https://www.utf8-chartable.de/unicode-utf8-table.pl?start=12288&unicodeinhtml=hex
This video is the perfect source (watch from 6:15), but here is its summary and a code sample in Go. The letters mark the bits taken from each UTF-8 byte; hopefully that makes sense. Once you understand the logic, it's easy to apply bitwise operators:
1-byte (ASCII), char E
UTF-8 bytes: 0xxx xxxx, for example 0100 0101 or 0x45
Unicode code point: 0xxx xxxx, for example 0100 0101 or U+0045
Explanation: no conversion needed, the value is the same in UTF-8 and as a Unicode code point
2-byte, char Ê
UTF-8 bytes: 1. 110x xxxx 2. 10yy yyyy, for example 1100 0011 1000 1010 or 0xC38A
Unicode code point: 0xxx xxyy yyyy, for example 0000 1100 1010 or U+00CA
Explanation: 1. Last 5 bits of the 1st byte 2. Last 6 bits of the 2nd byte
3-byte, char あ
UTF-8 bytes: 1. 1110 xxxx 2. 10yy yyyy 3. 10zz zzzz, for example 1110 0011 1000 0001 1000 0010 or 0xE38182
Unicode code point: xxxx yyyy yyzz zzzz, for example 0011 0000 0100 0010 or U+3042
Explanation: 1. Last 4 bits of the 1st byte 2. Last 6 bits of the 2nd byte 3. Last 6 bits of the 3rd byte
4-byte, char 𐄟
UTF-8 bytes: 1. 1111 0xxx 2. 10yy yyyy 3. 10zz zzzz 4. 10ww wwww, for example 1111 0000 1001 0000 1000 0100 1001 1111 or 0xF090_849F
Unicode code point: 000x xxyy yyyy zzzz zzww wwww, for example 0000 0001 0000 0001 0001 1111 or U+1011F
Explanation: 1. Last 3 bits of the 1st byte 2. Last 6 bits of the 2nd byte 3. Last 6 bits of the 3rd byte 4. Last 6 bits of the 4th byte
2-byte UTF-8
func decode2(byte1, byte2 byte) rune {
	// 110x xxxx: keep the low 5 bits, 10yy yyyy: keep the low 6 bits
	int1 := uint16(byte1&0b_0001_1111) << 6
	int2 := uint16(byte2 & 0b_0011_1111)
	return rune(int1 + int2)
}
3-byte UTF-8
func decode3(byte1, byte2, byte3 byte) rune {
	// 1110 xxxx: keep the low 4 bits of the first byte
	int1 := uint16(byte1&0b_0000_1111) << 12
	int2 := uint16(byte2&0b_0011_1111) << 6
	int3 := uint16(byte3 & 0b_0011_1111)
	return rune(int1 + int2 + int3)
}
4-byte UTF-8
func decode4(byte1, byte2, byte3, byte4 byte) rune {
	// 1111 0xxx: keep the low 3 bits of the first byte
	int1 := uint(byte1&0b_0000_0111) << 18
	int2 := uint(byte2&0b_0011_1111) << 12
	int3 := uint(byte3&0b_0011_1111) << 6
	int4 := uint(byte4 & 0b_0011_1111)
	return rune(int1 + int2 + int3 + int4)
}
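Since the question asks for "only math", the same decoding can be written with subtraction and multiplication instead of masks and shifts. Subtracting the marker value of each byte (192, 224 or 240 for a lead byte, 128 for a continuation byte) does the job of the mask, and multiplying by 64, 4096 or 262144 does the job of the shift. The Python sketch below is a hedged illustration: the function name is made up, and it assumes well-formed UTF-8 input with no validation.

```python
def utf8_to_code_point(*b):
    """Convert the bytes of ONE well-formed UTF-8 sequence to its code point."""
    n = len(b)
    if n == 1:
        return b[0]
    if n == 2:
        return (b[0] - 192) * 64 + (b[1] - 128)
    if n == 3:
        return (b[0] - 224) * 4096 + (b[1] - 128) * 64 + (b[2] - 128)
    if n == 4:
        return ((b[0] - 240) * 262144 + (b[1] - 128) * 4096
                + (b[2] - 128) * 64 + (b[3] - 128))
    raise ValueError("a UTF-8 sequence has 1 to 4 bytes")
```

For the first row of the question's table, utf8_to_code_point(227, 129, 130) works out as 3 * 4096 + 1 * 64 + 2 = 12354.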

DICOM File Meta Information Version padded with zeros

I've got a DICOM file that starts like this, and most of it makes sense to me based on the spec parts 5, 6 and 10, but the File Meta Information Version element (0002,0001) has me foxed.
00000000: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000010: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000020: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000030: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000040: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000050: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000060: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000070: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000080: 4449 434d 0200 0000 554c 0400 ce00 0000 DICM....UL......
00000090: 0200 0100 4f42 0000 0200 0000 0001 0200 ....OB..........
000000a0: 0200 5549 1e00 312e 322e 3834 302e 3130 ..UI..1.2.840.10
000000b0: 3030 382e 352e 312e 342e 312e 312e 3737 008.5.1.4.1.1.77
000000c0: 2e31 2e36 0200 0300 5549 3800 312e 322e .1.6....UI8.1.2.
This is the bit that I don't understand:
00000090: 0200 0100 4f42 0000 0200 0000 0001
The first four bytes are the (0002,0001) tag, and the next two are the VR 4f42 = OB. I expect 0200 for the value length (2 bytes) and 0001 for the version, but what are the two sets of 0000 in between?
I haven't found any specification of padding here, and in any case all mention of padding that I've found in the spec only extends as far as padding onto two-byte boundaries, never four or more.
And if the zeros were leading zeros in 32-bit quantities then I'd expect them to come after the 0200 and 0100, not before. And of course then the length would have to be 0400, not 0200.
The file was created by OrthancWSIDicomizer.exe, part of the Orthanc DICOM offering.
What am I missing? (Apart from the obvious: a deep understanding of DICOM!)
"I haven't found any specification of padding here"
The data element is correctly encoded. In the Explicit VR Little Endian transfer syntax, the layout of the element header depends on the data element's value representation. According to Table 7.1-1, if the VR is one of "OB", "OD", "OF", "OL", "OV", "OW", "SQ", "UC", "UR", "UT", or "UN", then the two VR bytes are followed by "Reserved (2 bytes) set to a value of 0000H", and the length of the element is encoded in 4 bytes instead of just 2. In this case, the reserved bytes are the ones at offset 0x96. The subsequent 4 bytes, at offset 0x98, represent the length of the element: 0200 0000 (little endian for 2).
The complete data element with header is 14 bytes long: 0200 0100 4f42 0000 0200 0000 0001, (0002,0001) File Meta Information Version, OB, with the value 1.
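To make the two header layouts concrete, here is a small Python sketch that reads one Explicit VR Little Endian data element from a byte string. It is an illustration under stated assumptions, not a DICOM parser: the function name is invented, the set of long-form VRs is simply the list quoted above, and undefined lengths (FFFFFFFFH) and nested sequences are not handled.

```python
import struct

# VRs that use the 12-byte header (2 reserved bytes + 4-byte length),
# taken from the list quoted in the answer above.
LONG_FORM_VRS = {"OB", "OD", "OF", "OL", "OV", "OW",
                 "SQ", "UC", "UR", "UT", "UN"}


def read_element(buf, offset=0):
    """Return ((group, element), vr, length, value, next_offset)."""
    group, element = struct.unpack_from("<HH", buf, offset)
    vr = buf[offset + 4:offset + 6].decode("ascii")
    if vr in LONG_FORM_VRS:
        _reserved, length = struct.unpack_from("<HI", buf, offset + 6)
        header_size = 12
    else:
        (length,) = struct.unpack_from("<H", buf, offset + 6)
        header_size = 8
    start = offset + header_size
    return (group, element), vr, length, buf[start:start + length], start + length


# The bytes at 0x84 to 0xa1 of the dump in the question: (0002,0000) then (0002,0001).
SAMPLE = bytes.fromhex(
    "0200 0000 554c 0400 ce00 0000"
    "0200 0100 4f42 0000 0200 0000 0001"
)

first = read_element(SAMPLE)
second = read_element(SAMPLE, first[4])
```

Here the first element uses the short 8-byte header (UL, length 4) and the second uses the long 12-byte header (OB, length 2), so the second element occupies 14 bytes in total.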

Incorrect wav header generated by sox

I was using sox to convert a 2-channel, 48000 Hz, 24-bit WAV file (new.wav) to a mono WAV file (post.wav).
Here are the related commands and outputs:
[Farmer#Ubuntu recording]$ soxi new.wav
Input File : 'new.wav'
Channels : 2
Sample Rate : 48000
Precision : 24-bit
Duration : 00:00:01.52 = 72901 samples ~ 113.908 CDDA sectors
File Size : 447k
Bit Rate : 2.35M
Sample Encoding: 24-bit Signed Integer PCM
[Farmer#Ubuntu recording]$ sox new.wav -c 1 post.wav
[Farmer#Ubuntu recording]$ soxi post.wav
Input File : 'post.wav'
Channels : 1
Sample Rate : 48000
Precision : 24-bit
Duration : 00:00:01.52 = 72901 samples ~ 113.908 CDDA sectors
File Size : 219k
Bit Rate : 1.15M
Sample Encoding: 24-bit Signed Integer PCM
It looks fine. But let us check the header of post.wav:
[Farmer#Ubuntu recording]$ xxd post.wav | head -10
00000000: 5249 4646 9856 0300 5741 5645 666d 7420 RIFF.V..WAVEfmt
00000010: 2800 0000 feff 0100 80bb 0000 8032 0200 (............2..
00000020: 0300 1800 1600 1800 0400 0000 0100 0000 ................
00000030: 0000 1000 8000 00aa 0038 9b71 6661 6374 .........8.qfact
00000040: 0400 0000 c51c 0100 6461 7461 4f56 0300 ........dataOV..
I compared this against the standard WAV file header structure.
The first line is no problem.
The second line starts with "2800 0000", the size of the "fmt " sub-chunk: 0x00000028 (as this is little endian) = 40 bytes. But there are 52 bytes between the "fmt " size field and the "data" sub-chunk.
The third line shows that "ExtraParamSize" is 0x0016 = 22 bytes. But the part after the 16 standard bytes is actually 36 bytes long (from the third line's "1600" to the fifth line's "0100").
So what are the extra 36 bytes?
OK, I found the answer.
Look at the second line: the audio format is "feff", i.e. the value 0xFFFE, so this is not the standard PCM wave format but the extensible format (WAVE_FORMAT_EXTENSIBLE).
A detailed introduction to the WAV header can be found at this link. The article is well written; thanks to the author.
Since this is a non-PCM format WAV, it is fine for the "fmt " chunk to occupy 40 bytes. It is followed by a "fact" chunk and then the "data" chunk, so everything makes sense.
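The layout can be double-checked by walking the chunks of the header shown in the xxd output. The Python sketch below is only an illustration: the function and field names are invented for the example, and it assumes the usual RIFF layout of a 4-byte chunk ID plus a 4-byte little-endian size, with odd-sized chunks padded to an even length.

```python
import struct

# First 80 bytes of post.wav, copied from the xxd output in the question.
HEADER = bytes.fromhex(
    "5249 4646 9856 0300 5741 5645 666d 7420"
    "2800 0000 feff 0100 80bb 0000 8032 0200"
    "0300 1800 1600 1800 0400 0000 0100 0000"
    "0000 1000 8000 00aa 0038 9b71 6661 6374"
    "0400 0000 c51c 0100 6461 7461 4f56 0300"
)


def iter_chunks(buf):
    """Yield (chunk id, chunk body) for each sub-chunk of a RIFF/WAVE buffer."""
    if buf[0:4] != b"RIFF" or buf[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE header")
    pos = 12
    while pos + 8 <= len(buf):
        chunk_id, size = struct.unpack_from("<4sI", buf, pos)
        yield chunk_id, buf[pos + 8:pos + 8 + size]
        pos += 8 + size + (size & 1)     # chunks are padded to an even size


def parse_fmt(body):
    """Decode the 16 standard bytes and, if present, the extensible part."""
    tag, channels, rate, byte_rate, block_align, bits = struct.unpack_from(
        "<HHIIHH", body, 0)
    info = {"format_tag": tag, "channels": channels, "sample_rate": rate,
            "byte_rate": byte_rate, "block_align": block_align,
            "bits_per_sample": bits}
    if tag == 0xFFFE and len(body) >= 40:
        cb_size, valid_bits, channel_mask = struct.unpack_from("<HHI", body, 16)
        # The first two bytes of the 16-byte SubFormat GUID hold the real format.
        (sub_format_tag,) = struct.unpack_from("<H", body, 24)
        info.update(cb_size=cb_size, valid_bits=valid_bits,
                    channel_mask=channel_mask, sub_format_tag=sub_format_tag)
    return info


chunks = dict(iter_chunks(HEADER))
fmt = parse_fmt(chunks[b"fmt "])
(fact_samples,) = struct.unpack("<I", chunks[b"fact"])
```

On this header the walk finds "fmt " (40 bytes), "fact" (4 bytes) and "data", which accounts for every byte between the "fmt " size field and the "data" chunk: 16 standard bytes, a 2-byte extension size, 22 extension bytes, and the 12-byte "fact" chunk. The "fact" value is 72901, the sample count that soxi reports.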

Resources