Convert a list of raw vectors into a data frame - r

I have a list of raw vectors named "output". Something like this:
[1] 58 0a 00 00 00 03 00 04 00 03 00 03 05 00 00 00 00 05 55 54 46 2d 38 00 00 00 fe
[1] 58 0a 00 00 00 03 00 04 00 03 00 03 05 00 00 00 00 05 55 54 46 2d 38 00 01 03 19 00 00 04 02 00 00 00 01 00 04 00 09 00 00 00 04 6d 65 74 61 00 00 02 13 00 00 00 03 00 00 00 10 00 00 00 01 00
[1] ...
They have different lengths and are of type "raw".
I need a dataframe with one vector in each cell:
ID  vectors
1   58 0a 00 00 00 03 00 04 00 03 00 03 05 00 00 00 00 05 55 54 46 2d 38 00 00 00 fe
2   58 0a 00 00 00 03 00 04 00 03 00 03 05 00 00 00 00 05 55 54 46 2d 38 00 01 03 19 00 00 04 02 00 00 00 01 00 04 00 09 00 00 00 04 6d 65 74 61 00 00 02 13 00 00 00 03 00 00 00 10 00 00 00 01 00
I have tried this:
as.data.frame(output)
#Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, :
#  arguments imply differing number of rows: 27, 3132, 4141, 4267, 3701, 3943, 5200
df <- data.frame(matrix(unlist(output), nrow=length(output)))
#Warning message:
#In matrix(unlist(output), nrow = length(output)) :
#  data length [32954] is not a sub-multiple or multiple of the number of rows [14]
Is there a way to solve my problem?

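You have to wrap the list in I() when creating the data.frame, so that it is kept as-is as a list column instead of being split into one column per vector.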
output <- list(raw(2), raw(3))
DF <- data.frame(ID=1:2, vectors = I(output))
str(DF)
#'data.frame': 2 obs. of 2 variables:
# $ ID : int 1 2
# $ vectors:List of 2
# ..$ : raw 00 00
# ..$ : raw 00 00 00
# ..- attr(*, "class")= chr "AsIs"
DF
# ID vectors
#1 1 00, 00
#2 2 00, 00, 00

This can also be done with tibble, which accepts a list as a column directly:
library(tibble)
output <- list(raw(2), raw(3))
tibble(ID = 1:2, vectors = output)
# A tibble: 2 x 2
#      ID vectors
#   <int> <list>
# 1     1 <raw [2]>
# 2     2 <raw [3]>

Related

Hex to datetime from a BLE custom service

I'm trying to decode a hex string to a datetime format from a BLE custom UUID. I don't have any information except the decoded value and datetime, that's why I put similar examples in milliseconds and minutes. It will be complicated to see the value of another year.
If you look closely, you will notice that each row contains only two values, each repeated twice. After a leading 0x01 (maybe a flag), bytes 2-5 are repeated in bytes 6-9, and bytes 10-11 are repeated in bytes 12-13. The rest is zero.
Similar minutes:
01 8E 6C 00 00 8E 6C 00 00 62 25 62 25 00 00 00 00 00 00 00 - 1669024172,5141 (Nov 21 2022 09:49:32,5387632)
01 9C 6C 00 00 9C 6C 00 00 62 49 62 49 00 00 00 00 00 00 00 - 1669024172,68598 (Nov 21 2022 09:49:32,7638451)
01 fd 6c 00 00 fd 6c 00 00 64 0f 64 0f 00 00 00 00 00 00 00 - 1669024174,23293 (Nov 21 2022 09:49:34,3170414)
01 0d 6d 00 00 0d 6d 00 00 6e 2f 6e 2f 00 00 00 00 00 00 00 - 1669024174,5767 (Nov 21 2022 09:49:34,5414078)
01 f0 6d 00 00 f0 6d 00 00 62 21 62 21 00 00 00 00 00 00 00 - 1669024178,01436 (Nov 21 2022 09:49:38,0746688)
01 00 6e 00 00 00 6e 00 00 67 81 67 81 00 00 00 00 00 00 00 - 1669024178,35812 (Nov 21 2022 09:49:38,2775026)
Similar milliseconds:
01 16 1d 00 00 16 1d 00 00 4d f4 4d f4 00 00 00 00 00 00 00 - 1669023854,70251 (Nov 21 2022 09:44:14,7728813)
01 1e 6e 00 00 1e 6e 00 00 67 c5 67 c5 00 00 00 00 00 00 00 - 1669024178,70189 (Nov 21 2022 09:49:38,7714679)
01 e2 6d 00 00 e2 6d 00 00 62 4d 62 4d 00 00 00 00 00 00 00 - 1669024177,84247 (2022-11-21 10:49:37,8263242)
01 5f dd 06 00 5f dd 06 00 cb 46 cb 46 00 00 00 00 00 00 00 - 1669110577,66758 (2022-11-22 10:49:37,8231004)
01 00 6e 00 00 00 6e 00 00 67 81 67 81 00 00 00 00 00 00 00 - 1669024178,35812 (2022-11-21 10:49:38,2775026)
01 7c dd 06 00 7c dd 06 00 c8 0c c8 0c 00 00 00 00 00 00 00 - 1669024178      (2022-11-22 10:49:38,2722736)
It looks like:
It's not encoded relative to the Unix epoch (Jan 1 1970).
The first value is increasing (0xXX6C, 0xXX6E, 0xXX6E), so maybe it's the integer part.
I tried to decode it as year, minutes, seconds, but that does not seem to be the layout.
Sometimes the first value is longer than 2 bytes (3 bytes):
01 DE 61 04 00 DE 61 04 00 4A C1 4A C1 00 00 00 00 00 00 00 - 1669104436 (Nov 22 2022 09:07:16,1902042)
I tried these solutions:
https://stackoverflow.com/questions/1389046/what-is-the-specification-of-hexadecimal-date-format-in-sql-server Error: System.ArgumentOutOfRangeException (trying one of my hex)
http://erbhavi.blogspot.com/search/label/Converting%20DateTime%20to%20Hex%20in%20C%23 DateTime yourDateTime = new DateTime( ticks1970 + gmt * 10000000L ); //Error CS0019: Operator '*' cannot be applied to operands of type 'string' and 'long'
I'll update the question if I find more info.

How to read the MPEG2VideoDescriptor in an MXF file?

Here follows the hex dump of the MPEG2VideoDescriptor:
06 0e 2b 34 02 53 01 01 0d 01 01 01 01 01 51 00
83 00 00 f3 3c 0a 00 10 a3 be 51 b2 00 05 e7 11
bf 82 21 97 f7 a0 14 ed 30 06 00 04 00 00 00 02
30 01 00 08 00 00 ea 60 00 00 03 e9 80 00 00 04
01 c9 c3 80 30 04 00 10 06 0e 2b 34 04 01 01 02
0d 01 03 01 02 04 61 01 32 15 00 01 05 32 0e 00
08 00 00 00 10 00 00 00 09 32 0d 00 10 00 00 00
02 00 00 00 04 00 00 00 1a 00 00 00 00 32 0c 00
01 00 32 08 00 04 00 00 02 d0 32 09 00 04 00 00
05 00 32 02 00 04 00 00 02 d0 32 03 00 04 00 00
05 00 32 01 00 10 06 0e 2b 34 04 01 01 03 04 01
02 02 01 04 03 00 33 02 00 04 00 00 00 02 33 08
00 04 00 00 00 01 33 03 00 01 04 33 01 00 04 00
00 00 08 33 0b 00 01 00 33 07 00 02 00 00 33 04
The first 16 bytes:
06 0e 2b 34 02 53 01 01 0d 01 01 01 01 01 51 00 (UID)
Next 4 bytes is the BER size:
83 00 00 f3 (0xf3 bytes long)
Next 4 bytes:
3c 0a 00 10 (0x3c0a means Instance UUID and 0x0010 is the size)
Then follows the UUID:
a3 be 51 b2 00 05 e7 11 bf 82 21 97 f7 a0 14 ed
Next 4 bytes:
30 06 00 04 (0x3006 means Linked Track ID and 0x0004 is the size)
Next 4 bytes is the Linked Track ID: 00 00 00 02
Next 4 bytes: 30 01 00 08 (0x3001 means Sample Rate and 0x0008 is the size)
The following 8 bytes are actually frame rate numerator and denominator:
0000ea60 == 60000 and 000003e9 == 1001.
Next comes the part I do not understand: 80 00 00 04.
Can somebody please explain what it means?
The next four bytes are 01 c9 c3 80 and it is definitely the bitrate (30000000), but how can I know that for sure?
Edit:
Does 80 00 00 04 mean the following:
0x8000 is a dynamic tag. According to SMPTE 377, tags 0x8000-0xFFFF are dynamically allocated. The 0x0004 is the size (4 bytes). If that's true, how can I tell that the following 4 bytes 01 c9 c3 80 are actually the bitrate? It could be anything, couldn't it?
First you have to understand how local tags work.
Local tags 0x8000 and above are user defined.
You have to look at the primer pack of the header partition.
The primer pack translates the local tag to a global UL which may or may not be vendor specific.
Think of the primer pack as a translation table between the 2-byte local tag and the 16-byte UL. Once you know the UL that 0x8000 maps to in this file, you know what the 4 bytes that follow mean.
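The lookup can be sketched in Python. This is only an illustration under stated assumptions: the item layout (2-byte tag followed by a 2-byte big-endian length) is the one walked through in the question, the bytes are the first four items of the dump, and the `primer` dictionary with its placeholder string is hypothetical, since the real tag-to-UL mapping has to be read from the primer pack of the file in question.

```python
import struct

def parse_local_set(payload: bytes):
    """Split a local-set payload into (tag, value) pairs.

    Assumes 2-byte tags and 2-byte big-endian lengths, as in the dump.
    """
    items = []
    pos = 0
    while pos + 4 <= len(payload):
        tag, length = struct.unpack_from(">HH", payload, pos)
        pos += 4
        items.append((tag, payload[pos:pos + length]))
        pos += length
    return items

# First four items of the descriptor (after the 16-byte key and BER length).
payload = bytes.fromhex(
    "3c0a0010" "a3be51b20005e711bf822197f7a014ed"
    "30060004" "00000002"
    "30010008" "0000ea60000003e9"
    "80000004" "01c9c380"
)
items = parse_local_set(payload)

# Hypothetical primer table: local tag -> UL taken from the primer pack.
primer = {0x8000: "<UL listed for 0x8000 in this file's primer pack>"}

for tag, value in items:
    if tag >= 0x8000:
        # Dynamic tag: its meaning is only known through the primer pack.
        print(hex(tag), primer.get(tag, "not in primer"), value.hex())
    else:
        print(hex(tag), "statically assigned tag", value.hex())
```

The parser itself cannot tell that 01 c9 c3 80 is a bitrate. It only finds the tag, and the primer entry for that tag is what gives the value its meaning.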

Data modified on AWS API Gateway Response body

I am trying to return a hexadecimal string as the response from my AWS Lambda function. When it reaches the client, the data seems to have been modified.
Data :
47 49 46 38 39 61 01 00 01 00 80 00 00 00 00 00
ff ff ff 21 f9 04 01 00 00 01 00 2c 00 00 00 00
01 00 01 00 00 08 04 00 03 04 04 00 3b
Hexadecimal Escaped Data (Sent Data):
"\x47\x49\x46\x38\x39\x61\x01\x00\x01\x00\x80\x00\x00\x00\x00\x00"
"\xff\xff\xff\x21\xf9\x04\x01\x00\x00\x01\x00\x2c\x00\x00\x00\x00"
"\x01\x00\x01\x00\x00\x08\x04\x00\x03\x04\x04\x00\x3b"
Received Data
47 49 46 38 39 61 01 00 01 00 c2 80 00 00 00 00
00 c3 bf c3 bf c3 bf 21 c3 b9 04 01 00 00 01 00
2c 00 00 00 00 01 00 01 00 00 08 04 00 03 04 04
00 3b
How to fix this?
Last time I checked it was not very explicit in the docs, but API Gateway is really made for JSON (or similar), and support for binary is 'on the roadmap' but doesn't seem to be a priority. It converts everything it sends to UTF-8.
Comparing your original data with the received data byte by byte shows this:
47 49 46 38 39 61 01 00 01 00 80 00 00 00 00 00 ff ff ff 21 f9 04 01 00 00 01 00 2c 00 00 00 00 01 00 01 00 00 08 04 00 03 04 04 00 3b
47 49 46 38 39 61 01 00 01 00 c2 80 00 00 00 00 00 c3 bf c3 bf c3 bf 21 c3 b9 04 01 00 00 01 00 2c 00 00 00 00 01 00 01 00 00 08 04 00 03 04 04 00 3b
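Every byte up to 0x7f is unchanged because its Unicode code point encodes to the same single byte in UTF-8 (U+0047 -> 47). From 0x80 upward the problem arises, because each byte is treated as a code point and encoded as two bytes: U+0080 -> c2 80, U+00F9 -> c3 b9, U+00FF -> c3 bf, and so on.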
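The effect can be reproduced with a short Python sketch. It assumes the conversion behaves like "interpret each byte as a code point (Latin-1), then encode as UTF-8", which is an inference from the two dumps and not something taken from AWS documentation.

```python
# Bytes as sent by the Lambda function (from the question).
original = bytes.fromhex(
    "47 49 46 38 39 61 01 00 01 00 80 00 00 00 00 00"
    "ff ff ff 21 f9 04 01 00 00 01 00 2c 00 00 00 00"
    "01 00 01 00 00 08 04 00 03 04 04 00 3b"
)

# Bytes as received by the client (from the question).
received = bytes.fromhex(
    "47 49 46 38 39 61 01 00 01 00 c2 80 00 00 00 00"
    "00 c3 bf c3 bf c3 bf 21 c3 b9 04 01 00 00 01 00"
    "2c 00 00 00 00 01 00 01 00 00 08 04 00 03 04 04"
    "00 3b"
)

# Treat every byte as a code point, then encode the text as UTF-8.
reencoded = original.decode("latin-1").encode("utf-8")

print(reencoded == received)          # True
print(len(original), len(received))   # 45 50
```

The five extra bytes correspond to the five input bytes at or above 0x80 (one 0x80, three 0xff, one 0xf9), each of which grows to two bytes.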
We had a similar problem recently: binary data was corrupted and larger when sent through Gateway than with direct access to our backend. That was because many bytes were replaced by the Unicode replacement character U+FFFD, which is 0xEF 0xBF 0xBD in UTF-8.
How to fix it? We just stopped using Gateway, but if you can afford your data to be bigger, you can base64-encode it.
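A minimal Python sketch of the base64 workaround follows. It only shows the encoding step and the matching decode on the client; how the encoded string is wired into the Lambda response is not covered here.

```python
import base64

# The binary payload from the question.
data = bytes.fromhex(
    "47 49 46 38 39 61 01 00 01 00 80 00 00 00 00 00"
    "ff ff ff 21 f9 04 01 00 00 01 00 2c 00 00 00 00"
    "01 00 01 00 00 08 04 00 03 04 04 00 3b"
)

# Server side: base64 output contains only ASCII characters.
encoded = base64.b64encode(data)

# A UTF-8 conversion in transit leaves pure ASCII untouched.
in_transit = encoded.decode("ascii").encode("utf-8")

# Client side: decode to get the original bytes back.
decoded = base64.b64decode(in_transit)

print(decoded == data)            # True
print(len(data), len(encoded))    # 45 60
```

The cost is size: base64 emits 4 characters for every 3 input bytes, so the payload grows by about a third.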

the meaning of bits in rawToBits?

> as.raw(15)
[1] 0f
> rawToBits(as.raw(15))
[1] 01 01 01 01 00 00 00 00
> rawToBits(0f)
Error: unexpected symbol in "rawToBits(0f"
> rawToBits("0f")
Error in rawToBits("0f") : argument 'x' must be a raw vector
> rawToBits("0x0f")
Error in rawToBits("0x0f") : argument 'x' must be a raw vector
I have some questions:
1) Is 0f data of type raw?
2) Why doesn't rawToBits(as.raw(15)) give 11110000? Isn't 15 equal to 11110000?
15=0f=1*2^0+1*2^1+1*2^2+1*2^3
What is the meaning of the 0s in [1] 01 00 00 00 00 00 00 00 when you run rawToBits(as.raw(1))?
The manual says it returns a raw vector with entries 0 or 1. What do the entries 0 and 1 mean?
Why is rawToBits(as.raw(2)) not 10 00 00 00 00 00 00 00?
Just typing 0f doesn't give you something of type raw.
> str(as.raw(15))
raw 0f
> str(0f)
Error: unexpected symbol in "str(0f"
> str("0f")
chr "0f"
If you want to know what's going on with the bits, try some other values to get a better idea. Each element of the result is one bit of the input (00 or 01), listed with the least significant bit first:
> rawToBits(as.raw(1))
[1] 01 00 00 00 00 00 00 00
> rawToBits(as.raw(2))
[1] 00 01 00 00 00 00 00 00
> rawToBits(as.raw(4))
[1] 00 00 01 00 00 00 00 00
> rawToBits(as.raw(8))
[1] 00 00 00 01 00 00 00 00
> rawToBits(as.raw(1 + 2 + 4 + 8))
[1] 01 01 01 01 00 00 00 00
> rawToBits(as.raw(15))
[1] 01 01 01 01 00 00 00 00
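The least-significant-bit-first ordering seen in these outputs can be reproduced outside R. The following Python sketch is only an analogue for illustration: `bits_lsb_first` is a made-up helper name, not an R or Python library function.

```python
def bits_lsb_first(value):
    """Return the 8 bits of a byte, least significant bit first."""
    return [(value >> i) & 1 for i in range(8)]

print(bits_lsb_first(1))    # [1, 0, 0, 0, 0, 0, 0, 0]
print(bits_lsb_first(2))    # [0, 1, 0, 0, 0, 0, 0, 0]
print(bits_lsb_first(15))   # [1, 1, 1, 1, 0, 0, 0, 0]

# The usual written form puts the most significant bit first instead.
print(format(15, "08b"))    # 00001111
```

Reading the list from right to left gives the conventional binary notation, so 15 is 00001111 and the R output shows the same bits in reverse order.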

Comparing two mach-o files

I have two Mach-O files and I need to find the differences (the hexadecimal values that differ) between them. Is there any tool available for doing this?
I tried using "DiffMerge", but it does not seem to support the encoding format.
How about:
hexdump binary1 > dump1.txt
hexdump binary2 > dump2.txt
diff dump1.txt dump2.txt
This will give you a diff, and you can use the offset in the first column in Hex Fiend or whichever editor you prefer to investigate further.
2c2
< 0000010 21 00 00 00 bc 0f 00 00 85 00 20 00 01 00 00 00
---
> 0000010 20 00 00 00 84 0f 00 00 85 00 20 00 01 00 00 00
248,254c248,250
< 0000f80 0c 00 00 00 38 00 00 00 18 00 00 00 02 00 00 00
< 0000f90 00 00 01 00 00 00 01 00 40 65 78 65 63 75 74 61
< 0000fa0 62 6c 65 5f 70 61 74 68 2f 69 6e 6a 65 63 74 2e
< 0000fb0 64 79 6c 69 62 00 00 00 26 00 00 00 10 00 00 00
< 0000fc0 58 76 18 00 a4 23 00 00 1d 00 00 00 10 00 00 00
< 0000fd0 e0 09 19 00 60 37 00 00 00 00 00 00 00 00 00 00
< 0000fe0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
---
> 0000f80 26 00 00 00 10 00 00 00 58 76 18 00 a4 23 00 00
> 0000f90 1d 00 00 00 10 00 00 00 e0 09 19 00 60 37 00 00
> 0000fa0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
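If a script is more convenient than reading the diff, a byte-wise comparison can be sketched in Python. This is a minimal sketch, not a replacement for the hexdump approach: `differing_offsets` is a hypothetical helper, and the sample bytes are the two lines at offset 0x10 from the diff above.

```python
def differing_offsets(a, b):
    """Return (offset, byte_a, byte_b) for each position where a and b differ.

    Only the common length is compared; a length mismatch must be
    checked separately.
    """
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]

# The 16 bytes at offset 0x10 in each file, taken from the diff output.
first = bytes.fromhex("21 00 00 00 bc 0f 00 00 85 00 20 00 01 00 00 00")
second = bytes.fromhex("20 00 00 00 84 0f 00 00 85 00 20 00 01 00 00 00")

base = 0x10  # file offset of the first byte in this excerpt
for offset, x, y in differing_offsets(first, second):
    print(f"{base + offset:#09x}: {x:02x} != {y:02x}")
```

For whole files, the two byte strings would come from something like `pathlib.Path("binary1").read_bytes()`. Note that this compares position by position, so if bytes were inserted or removed, everything after that point is reported as different.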
