I need to serialize some R objects into an XDR format that complies with RFC 4506. I know about serialize:
serialize("Hello world", NULL) # xdr = TRUE
# [1] 58 0a 00 00 00 02 00 03 04 02 00 02 03 00 00 00 00 10 00 00 00 01 00 04 00 09 00 00 00 0b 48 65 6c 6c 6f 20 77 6f 72 6c 64
With a bit of googling, it seems that R formats to RFC 1832, not RFC 4506. I can't seem to find any libraries that specifically handle XDR serialization to different formats.
Can anyone point me to a good library, or failing that some good resources for doing this by hand? I haven't had any experience with XDR before today, and I am already aware of this, which is a bit dry.
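For the "by hand" route, a minimal sketch of the RFC 4506 encoding rules for three basic types may help. It is in Python rather than R, uses only the standard struct module, and the function names are made up for illustration. Note that RFC 4506 pads strings to a multiple of four bytes, which the serialize() output above does not do (the 11 bytes of "Hello world" are not followed by a pad byte), so the R stream as a whole appears to be R's own format that borrows XDR only for its primitive values.

```python
import struct

def xdr_int(value):
    """Signed 32-bit integer, big-endian."""
    return struct.pack(">i", value)

def xdr_double(value):
    """IEEE 754 double-precision float, big-endian."""
    return struct.pack(">d", value)

def xdr_string(text):
    """4-byte length, then the bytes, zero-padded to a multiple of 4."""
    data = text.encode("ascii")
    padding = (-len(data)) % 4
    return struct.pack(">I", len(data)) + data + b"\x00" * padding

encoded = xdr_string("Hello world")
print(len(encoded), encoded.hex(" "))
# 16 00 00 00 0b 48 65 6c 6c 6f 20 77 6f 72 6c 64 00
```

Composite types (arrays, structures, unions) are built by concatenating these units, so every item starts on a 4-byte boundary.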
I am struggling to get Windows to load the default WinUSB driver for my device. Please note that I am looking for a solution that uses the BOS descriptor (and not the old 0xEE string index).
The device enumerates and Windows tells me that it is installing the device, but the WinUSB driver is not loaded. I have tried everything that I can think of, but still I can't get Windows to load the driver. I even uninstall the device and delete the USB flags in the registry whenever I re-try, but to no avail. Is there anyone who can help me to get this to work?
I don't want WebUSB capabilities or anything additional. This is a non-composite device.
This is my BOS descriptor (as sent over USB):
05 0F 21 00 01 1C 10 05 00 DF 60 DD D8 89
45 C7 4C 9C D2 65 9D 9E 64 8A 9F 00 00 03
06 B2 00 01 00
And this is my MS OS 2.0 descriptor set:
0A 00 00 00 00 00 03 06 B2 00 08 00 01 00 ..............
00 00 A8 00 08 00 02 00 00 00 A0 00 14 00 ..............
03 00 57 49 4E 55 53 42 00 00 00 00 00 00 ..WINUSB......
00 00 00 00 84 00 04 00 07 00 2A 00 44 00 ..........*.D.
65 00 76 00 69 00 63 00 65 00 49 00 6E 00 e.v.i.c.e.I.n.
74 00 65 00 72 00 66 00 61 00 63 00 65 00 t.e.r.f.a.c.e.
47 00 55 00 49 00 44 00 73 00 00 00 50 00 G.U.I.D.s...P.
7B 00 46 00 37 00 32 00 46 00 45 00 30 00 {.F.7.2.F.E.0.
44 00 34 00 2D 00 43 00 42 00 43 00 42 00 D.4.-.C.B.C.B.
2D 00 34 00 30 00 37 00 44 00 2D 00 38 00 -.4.0.7.D.-.8.
38 00 31 00 34 00 2D 00 39 00 45 00 44 00 8.1.4.-.9.E.D.
36 00 37 00 33 00 44 00 30 00 44 00 44 00 6.7.3.D.0.D.D.
36 00 42 00 7D 00 00 00 00 00 6.B.}.....
The layout is:
typedef struct _SMSOS20DescriptorSet
{
SDeviceDescSetHeader sDescriptorSetHeader;
SConfigurationSubsetHeader sConfSubsetHeader;
SFunctionSubsetHeader sFuncSubsetHeader;
SDeviceCompatibleIdDescriptor sCompIdDescriptor;
SDeviceRegDescDeviceInterfaceGUID sRegistryDescDevInterfaceGuid;
} SMSOS20DescriptorSet;
I have followed these guides and documents:
https://learn.microsoft.com/en-us/windows-hardware/drivers/usbcon/automatic-installation-of-winusb#winusb-device-installation-by-using-the-in-box-winusbinf
MS_OS_2_0_desc.docx
https://thewindowsupdate.com/2018/10/12/how-to-install-winusb-sys-without-a-custom-inf/
https://learn.microsoft.com/en-us/windows-hardware/drivers/usbcon/winusb-installation#automatic-installation-of--winusb-without-an-inf-file
UPDATE:
When you have a non-composite device with only a single configuration, you should not use any subset headers (neither the 'Configuration subset header' nor the 'Function subset header'). So the correct layout in this case is:
typedef struct _SMSOS20DescriptorSet
{
SDeviceDescSetHeader sDescriptorSetHeader;
SDeviceCompatibleIdDescriptor sCompIdDescriptor;
SDeviceRegDescDeviceInterfaceGUID sRegistryDescDevInterfaceGuid;
} SMSOS20DescriptorSet;
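As a cross-check of the lengths involved, here is a rough sketch that builds a descriptor set with this layout (no subset headers). It is written in Python purely to make the byte arithmetic visible; the helper names are hypothetical, and the field values (descriptor types 0x00, 0x03, 0x04, Windows version 0x06030000, REG_MULTI_SZ = 7, the GUID) are the ones visible in the dump above.

```python
import struct

GUID = "{F72FE0D4-CBCB-407D-8814-9ED673D0DD6B}"  # copied from the dump above

def compatible_id(compat_id=b"WINUSB"):
    # wLength=20, wDescriptorType=0x03, CompatibleID[8], SubCompatibleID[8]
    # ("8s" pads the short values with NUL bytes)
    return struct.pack("<HH8s8s", 20, 0x03, compat_id, b"")

def registry_property(name, value):
    # Name is a NUL-terminated UTF-16LE string; a REG_MULTI_SZ value ends
    # with one NUL for the string plus one NUL for the list.
    name_raw = (name + "\0").encode("utf-16-le")
    data_raw = (value + "\0\0").encode("utf-16-le")
    length = 10 + len(name_raw) + len(data_raw)  # five 16-bit fields + payloads
    return (struct.pack("<HHHH", length, 0x04, 7, len(name_raw)) + name_raw
            + struct.pack("<H", len(data_raw)) + data_raw)

def descriptor_set():
    features = compatible_id() + registry_property("DeviceInterfaceGUIDs", GUID)
    # wLength=10, wDescriptorType=0x00, dwWindowsVersion, wTotalLength
    header = struct.pack("<HHIH", 10, 0x00, 0x06030000, 10 + len(features))
    return header + features

blob = descriptor_set()
print(len(blob), blob[:10].hex(" "))
# 162 0a 00 00 00 00 00 03 06 a2 00
```

With the two 8-byte subset headers gone, the total drops from 0xB2 to 0xA2, so the set length advertised in the BOS platform capability descriptor (the B2 00 near the end of the BOS dump) would presumably have to change to match.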
It's currently 04:40 AM and I am stuck on something I simply do not understand. I am trying to look up a domain's nameservers directly by using the DNS protocol. If I run host -t ns google.com 1.1.1.1 and monitor it with Wireshark, I can see the full DNS query. However, I cannot figure out why some ASCII characters are used one time but not another. Here is an example:
0000 70 4d 7b 94 dd e0 00 d8 61 a9 c5 ec 08 00 45 00 pM{.....a.....E.
0010 00 38 d6 ff 00 00 80 11 9f 50 c0 a8 01 bb 01 01 .8.......P......
0020 01 01 e8 40 00 35 00 24 a0 19 9e f7 01 00 00 01 ...@.5.$........
0030 00 00 00 00 00 00 06 67 6f 6f 67 6c 65 03 63 6f .......google.co
0040 6d 00 00 02 00 01 m.....
In this DNS query, I am looking up the nameservers for google.com. The actual query starts at 06 67.
06 in ASCII is ACK/Acknowledgment.
Now, if we take a look at gmail.com instead:
0000 70 4d 7b 94 dd e0 00 d8 61 a9 c5 ec 08 00 45 00 pM{.....a.....E.
0010 00 37 d7 00 00 00 80 11 9f 50 c0 a8 01 bb 01 01 .7.......P......
0020 01 01 e8 58 00 35 00 23 8f cc 6f e2 01 00 00 01 ...X.5.#..o.....
0030 00 00 00 00 00 00 05 67 6d 61 69 6c 03 63 6f 6d .......gmail.com
0040 00 00 02 00 01 .....
the query starts at 05 67 instead.
05 is ENQ/Enquiry.
Why are they different? If I try to send 06 instead of 05 the DNS server gives me no response but Wireshark tells me:
Unknown extended label
I've seen 05, 06, and 09 so far. 09 is my biggest "wat" of all time, because it's a HT/Horizontal Tab.
Anyone with a lot of DNS knowledge who can help me here? I'm not looking for "just use dig/nslookup/host command". I'm currently trying to research a bit on the DNS protocol, and this is a thing I do not understand.
Good read where I got a lot of help: http://dev.lab427.net/dns-query-wth-netcat.html
For a binary protocol like this, you can't assume each byte corresponds to the matching ASCII character.
Take a look at section 4.1.2 of the DNS RFC (https://www.ietf.org/rfc/rfc1035.txt).
The domain name in a DNS request is broken up into "labels". For each label, the first byte is the length of the label, then the bytes for the string are written.
For your Google.com example, the labels are "google" and "com". The 06 is the number of bytes in the first label. This is followed by the bytes for "google". Then the 03 is the number of bytes in the "com" label. After the "com" bytes, the 00 byte is the NULL label to mark the end.
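The label scheme described above can be sketched in a few lines of Python. The function names are made up for this illustration, and the decoder deliberately ignores compression pointers, which a real resolver has to handle.

```python
def encode_qname(domain):
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    out = bytearray()
    for label in domain.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 1 <= len(raw) <= 63:
            raise ValueError(f"bad label length: {label!r}")
        out.append(len(raw))  # the 06 / 05 / 03 bytes seen in the capture
        out += raw
    out.append(0)             # root label terminates the name
    return bytes(out)

def decode_qname(packet, offset=0):
    """Read labels starting at offset; return (name, offset after the name)."""
    labels = []
    while True:
        length = packet[offset]
        offset += 1
        if length == 0:
            break
        if length > 63:
            # Top two bits set to 01 or 11: extended label or compression pointer.
            raise ValueError(f"not a plain label length: 0x{length:02x}")
        labels.append(packet[offset:offset + length].decode("ascii"))
        offset += length
    return ".".join(labels), offset

print(encode_qname("google.com").hex(" "))
# 06 67 6f 6f 67 6c 65 03 63 6f 6d 00
```

This also suggests why sending 06 in front of gmail fails: the parser consumes six bytes ("gmail" plus the 03), then treats the "c" of "com" (0x63) as the next length byte. Its top bits are 01, which is not a plain label length, and that is presumably what Wireshark reports as "Unknown extended label".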
With a large file (1GB) created by saving a large data.frame (or data.table) is it possible to very quickly load a small subset of rows from that file?
(Extra for clarity: I mean something as fast as mmap, i.e. the runtime should be approximately proportional to the amount of memory extracted, but constant in the size of the total dataset. "Skipping data" should have essentially zero cost. This can be very easy, or impossible, or something in between, depending on the serialization format.)
I hope that the R serialization format makes it easy to skip forward through the file to the relevant portions of the file.
Am I right in assuming that this would be impossible with a compressed file, simply because gzip requires to uncompress everything from the beginning?
saveRDS(object, file = "", ascii = FALSE, version = NULL,
compress = TRUE, refhook = NULL)
But I'm hoping binary (ascii=F) uncompressed (compress=F) might allow something like this. Use mmap on the file, then quickly skip to the rows and columns of interest?
I'm hoping it has already been done, or there is another format (reasonably space efficient) that allows this and is well-supported in R.
I've used things like gdbm (from Python) and even implemented a custom system in Rcpp for a specific data structure, but I'm not satisfied with any of this.
After posting this, I worked a bit with the package ff (CRAN) and am very impressed with it (not much support for character vectors though).
Am I right in assuming that this would be impossible with a compressed
file, simply because gzip requires to uncompress everything from the
beginning?
Indeed, for a short explanation let's take some dummy method as starting point:
AAAAVVBABBBC gzip would do something like: 4A2VBA3BC
Obviously you can't extract all A from the file without reading it all as you can't guess if there's an A at end or not.
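The dummy method above is a simple run-length encoding, sketched here in Python with a made-up function name. Real gzip uses DEFLATE (LZ77 back-references plus Huffman coding) rather than run lengths, but it shares the relevant property: a position in the compressed stream does not map to a predictable position in the original data.

```python
from itertools import groupby

def toy_rle(text):
    """Replace each run of identical characters by <count><char>; runs of 1 keep just the char."""
    parts = []
    for char, run in groupby(text):
        count = len(list(run))
        parts.append(char if count == 1 else f"{count}{char}")
    return "".join(parts)

print(toy_rle("AAAAVVBABBBC"))  # 4A2VBA3BC
```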
For the other question, "Loading part of a saved file", I can't see a solution off the top of my head. Using write.csv and read.csv (or fwrite and fread from the data.table package) with the skip and nrows parameters could be an alternative.
In any case, using any function on a file that has already been read means loading the whole file into memory before filtering, which takes just as long as reading the file and then subsetting from memory.
You may craft something in Rcpp, taking advantage of streams for reading data without loading them in memory, but reading and parsing each entry before deciding if it should be kept or not won't give you a real better throughput.
saveRDS will save a serialized version of the data, for example:
> myvector <- c("1","2","3")
> serialize(myvector,NULL)
[1] 58 0a 00 00 00 02 00 03 02 03 00 02 03 00 00 00 00 10 00 00 00 03 00 04 00 09 00 00 00 01 31 00 04 00 09 00 00 00 01 32 00 04 00 09 00 00
[47] 00 01 33
It is of course parsable, but that means reading it byte by byte according to the format.
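To make "reading according to the format" concrete, here is a small Python sketch that walks the 49 bytes shown above. The layout it assumes (two magic bytes, three 32-bit version fields, then a flags word and a length for each item) is inferred from this dump and from the version 2 format; the function name is hypothetical, and NA strings and attributes are not handled.

```python
import struct

# Bytes copied from the serialize(myvector, NULL) output above.
DUMP = bytes.fromhex(
    "580a" "00000002" "00030203" "00020300"  # "X\n", format version, R versions
    "00000010" "00000003"                    # character vector (type 16), length 3
    "00040009" "00000001" "31"               # string (type 9), length 1, "1"
    "00040009" "00000001" "32"
    "00040009" "00000001" "33"
)

def walk_character_vector(data):
    """Return (offset, text) for each element of a serialized character vector."""
    assert data[:2] == b"X\n", "not a binary XDR stream"
    pos = 2 + 3 * 4                      # skip the three version integers
    flags, count = struct.unpack_from(">II", data, pos)
    assert flags & 0xFF == 16, "expected a character vector"
    pos += 8
    found = []
    for _ in range(count):
        item_flags, length = struct.unpack_from(">II", data, pos)
        assert item_flags & 0xFF == 9, "expected a string element"
        pos += 8
        found.append((pos, data[pos:pos + length].decode("ascii")))
        pos += length                    # the next item starts right after this text
    return found

print(walk_character_vector(DUMP))
# [(30, '1'), (39, '2'), (48, '3')]
```

The offset of element n is only known after the lengths of elements 1 to n-1 have been read, which is the reason a direct seek is not possible.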
On the other hand, you could write as csv (or write.table for more complex data) and use an external tool before reading, something along the line:
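z <- tempfile()
write.table(df, z, row.names = FALSE)
# paste() puts a space between the awk script and the file name;
# intern = TRUE makes system() return the selected lines instead of an exit code
shortdf <- read.table(text = system(paste("awk 'NR > 5 && NR < 10 { print }'", shQuote(z)), intern = TRUE))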
You'll need a Linux system with awk, which is able to parse millions of lines in a few milliseconds, or a Windows build of awk.
The main advantage is that awk can filter each line of data on a regex or on other conditions.
As a complement for the data.frame case: a data.frame is more or less a list of vectors (in the simple case). This list is saved sequentially, so if we have a data frame like:
> str(ex)
'data.frame': 3 obs. of 2 variables:
$ a: chr "one" "five" "Whatever"
$ b: num 1 2 3
Its serialization is:
> serialize(ex,NULL)
[1] 58 0a 00 00 00 02 00 03 02 03 00 02 03 00 00 00 03 13 00 00 00 02 00 00 00 10 00 00 00 03 00 04 00 09 00 00 00 03 6f 6e 65 00 04 00 09 00
[47] 00 00 04 66 69 76 65 00 04 00 09 00 00 00 08 57 68 61 74 65 76 65 72 00 00 00 0e 00 00 00 03 3f f0 00 00 00 00 00 00 40 00 00 00 00 00 00
[93] 00 40 08 00 00 00 00 00 00 00 00 04 02 00 00 00 01 00 04 00 09 00 00 00 05 6e 61 6d 65 73 00 00 00 10 00 00 00 02 00 04 00 09 00 00 00 01
[139] 61 00 04 00 09 00 00 00 01 62 00 00 04 02 00 00 00 01 00 04 00 09 00 00 00 09 72 6f 77 2e 6e 61 6d 65 73 00 00 00 0d 00 00 00 02 80 00 00
[185] 00 ff ff ff fd 00 00 04 02 00 00 00 01 00 04 00 09 00 00 00 05 63 6c 61 73 73 00 00 00 10 00 00 00 01 00 04 00 09 00 00 00 0a 64 61 74 61
[231] 2e 66 72 61 6d 65 00 00 00 fe
Translated to ascii for an idea:
X
one five Whatever?ð## names a b row.names
ÿÿÿý class
data.frameþ
We have the header of the file, then the header of the list, then each vector composing the list. As we have no clue how much space the character vector will take, we can't skip to arbitrary data; we have to parse each header (the bytes just before the text data give its length). Even worse, to get the corresponding numeric values we have to reach the numeric vector's header, whose position can't be determined without parsing each character header and summing the lengths.
So in my opinion, crafting something is possible, but it will probably not be much quicker than reading the whole object, and it will be brittle with respect to the save format (R already has 3 formats for saving objects).
Some reference here
Same view as the serialize output in ascii format (more readable to get how it is organized):
> write(rawToChar(serialize(ex,NULL,ascii=TRUE)),"")
A
2
197123
131840
787
2
16
3
262153
3
one
262153
4
five
262153
8
Whatever
14
3
1
2
3
1026
1
262153
5
names
16
2
262153
1
a
262153
1
b
1026
1
262153
9
row.names
13
2
NA
-3
1026
1
262153
5
class
16
1
262153
10
data.frame
254
I wrote a program for parsing SQLite files. I can parse all the data, from b-tree pages down to records, columns and values, but I also need to parse the table schemas. I found that something like the database schema is stored in page 1 (the root page), and I can see it with a hex editor. I also found the structure of sqlite_master, and I read it exactly as explained in http://sqlite.org/fileformat2.html
How can I find the first byte of the sqlite_master table in the db file? How can I detect the starting byte of the schema? Is there anything related to it in the SQLite DB header?
Edit 1 (more info):
For example:
I opened the SQLite db with a hex editor (the page size is 4096 bytes, and I marked the page header in the image):
I marked the root page header, which starts with 05, meaning the page is an interior table b-tree page (see the B-tree Page Header Format at http://sqlite.org/fileformat2.html). It has 5 cells, which you can see from the cell pointer array: 0FFB, 0FF6, 0FF1, 0FEC, 0FE7 (the array starts right after the header). Each cell is 5 bytes long, and the cells start at 0FE7. The schema that you can see in the picture (in the text part) starts around 232~240, and when I checked other dbs the schema was in a different place...
Edit 2:
You can download an example file from https://www.dropbox.com/s/lanky02kneyb74w/31bb7ba8914766d4ba40d6dfb6113c8b614be442
Edit 3:
In my file you can see
$ hexdump -C 31bb7ba8914766d4ba40d6dfb6113c8b614be442
00000000 53 51 4c 69 74 65 20 66 6f 72 6d 61 74 20 33 00 |SQLite format 3.|
00000010 10 00 02 02 00 40 20 20 00 00 00 02 00 00 00 3f |.....@  .......?|
00000020 00 00 00 00 00 00 00 00 00 00 00 47 00 00 00 04 |...........G....|
00000030 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00 00 |................|
00000040 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000050 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 02 |................|
00000060 00 2d e2 25 05 00 00 00 05 0f e7 00 00 00 00 3d |.-.%...........=|
00000070 0f fb 0f f6 0f f1 0f ec 0f e7 08 7f 07 9d 08 3c |...............<|
00000080 07 01 06 22 05 92 04 fe 03 fc 04 c1 03 4d 02 b8 |...".........M..|
00000090 02 0a 02 75 01 32 01 c7 00 e9 00 e9 00 00 00 00 |...u.2..........|
000000a0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
000000e0 00 00 00 00 00 00 00 00 00 47 18 06 17 5b 35 01 |.........G...[5.|
000000f0 00 69 6e 64 65 78 73 71 6c 69 74 65 5f 61 75 74 |.indexsqlite_aut|
00000100 6f 69 6e 64 65 78 5f 41 42 4d 75 6c 74 69 56 61 |oindex_ABMultiVa|
00000110 6c 75 65 45 6e 74 72 79 4b 65 79 5f 31 41 42 4d |lueEntryKey_1ABM|
00000120 75 6c 74 69 56 61 6c 75 65 45 6e 74 72 79 4b 65 |ultiValueEntryKe|
Page Header (offset 0x64):
05 <- interior table b-tree page
0000 <- Byte offset into the page of the first freeblock
0005 <- Number of cells on this page
0FE7 <- Offset to the first byte of the cell content area
00 <- Number of fragmented free bytes
0000003D (61) <- The right-most pointer
Cell Array Pointers & Cell Contents:
(Table Interior Cell Format)
Cell Pointer| Page number of left child | Rowid
------------|---------------------------|-------
0FFB | 0000001A (26) | 15
0FF6 | 0000001C (28) | 2D
0FF1 | 00000031 (49) | 3C
0FEC | 00000039 (57) | 48
0FE7 | 0000003C (60) | 4C <- equal to (Offset to the first byte of the cell content area) in page header
I realize your question was asked over a year ago and you probably resolved it, but I would like to submit an answer in case anyone else has this same question. I was in the same situation as you, Mehdi. I wanted to read a SQLite database file, and was looking for the master table / schema. It appeared to be in page 1, but the header was not pointing to it. There were two reasons for my confusion.
(1) There was a lot of "dead" data in my SQLite database file that was not being used. I believe as the database was created and grew, the location of the actual active data moved, and the old location was not overwritten with zeros. Doing a search for some of the "CREATE TABLE" statements found multiple results in different locations of the file. I later determined the actual schema was split up and located on pages 18, 10, and 8 (which the page 1 interior table pointed to). I would have detected this earlier, if not for reason #2.
(2) I had miscalculated the byte position of a page, which confused me. Where p = page # and s = page size, I thought it was [p * s], but actually it's [(p-1) * s] (except that on page 1 the b-tree page header starts at byte 100, after the database file header). In other words, I thought the page numbering started at zero instead of 1.
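The offset rule from point (2) fits in two small helpers. This is only an illustrative Python sketch with made-up function names; the 4096-byte page size in the example is the one from the question, not necessarily the one in my file.

```python
FILE_HEADER_SIZE = 100  # the database file header occupies the start of page 1

def page_start(page_number, page_size):
    """Byte offset of a page in the file; pages are numbered from 1."""
    if page_number < 1:
        raise ValueError("SQLite page numbers start at 1")
    return (page_number - 1) * page_size

def btree_header_start(page_number, page_size):
    """Byte offset of the b-tree page header, which on page 1 follows the file header."""
    start = page_start(page_number, page_size)
    return start + FILE_HEADER_SIZE if page_number == 1 else start

print(page_start(18, 4096), btree_header_start(1, 4096))
# 69632 100
```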
As an additional note, I believe the http://sqlite.org/fileformat2.html page is missing some vital info. Specifically, it doesn't explain where the "root page" number is in the schema table (it's in field 4). I couldn't find this information on the sqlite.org page.
The documentation you linked to says in section 2.6:
Page 1 of a database file is the root page of a table b-tree that holds a special table named "sqlite_master"
and in section 1.5:
A b-tree page is divided into regions in the following order:
The 100-byte database file header (found on page 1 only)
The 8 or 12 byte b-tree page header …
For example, with this database:
$ sqlite3 test.db "create table hello(world);"
$ hexdump -C test.db
00000000 53 51 4c 69 74 65 20 66 6f 72 6d 61 74 20 33 00 |SQLite format 3.|
00000010 04 00 01 01 00 40 20 20 00 00 00 01 00 00 00 02 |.....@  ........|
00000020 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00 04 |................|
00000030 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00 00 |................|
00000040 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000050 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 |................|
00000060 00 2d e6 03 0d 00 00 00 01 03 cf 00 03 cf 00 00 |.-æ.......Ï..Ï..|
00000070 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
000003c0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 2f |.............../|
000003d0 01 06 17 17 17 01 3f 74 61 62 6c 65 68 65 6c 6c |......?tablehell|
000003e0 6f 68 65 6c 6c 6f 02 43 52 45 41 54 45 20 54 41 |ohello.CREATE TA|
000003f0 42 4c 45 20 68 65 6c 6c 6f 28 77 6f 72 6c 64 29 |BLE hello(world)|
00000400 0d 00 00 00 00 04 00 00 00 00 00 00 00 00 00 00 |................|
00000410 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
... the page header at offset 0x64 has these values:
0d: page is a leaf table b-tree page
0000: freeblock offset
0001: number of cells
03cf: offset of cell content
00: fragmented free bytes
03cf: first cell pointer
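The same field-by-field reading can be sketched in Python with the struct module. The function name and the dictionary keys are made up for this illustration; the two byte strings are copied from the hexdumps in this thread (the leaf page of test.db and the interior page from the question).

```python
import struct

PAGE_TYPES = {0x02: "interior index", 0x05: "interior table",
              0x0A: "leaf index", 0x0D: "leaf table"}

def parse_btree_header(buf, offset=0):
    """Decode the 8- or 12-byte b-tree page header and the cell pointer array."""
    page_type, freeblock, cells, content, fragments = struct.unpack_from(
        ">BHHHB", buf, offset)
    header = {
        "type": PAGE_TYPES[page_type],
        "first_freeblock": freeblock,
        "cell_count": cells,
        "content_start": content or 65536,  # a stored 0 means 65536
        "fragmented_bytes": fragments,
    }
    size = 8
    if page_type in (0x02, 0x05):           # interior pages carry 4 extra bytes
        (header["right_most_pointer"],) = struct.unpack_from(">I", buf, offset + 8)
        size = 12
    header["cell_pointers"] = list(
        struct.unpack_from(f">{cells}H", buf, offset + size))
    return header

leaf = bytes.fromhex("0d0000000103cf00" "03cf")
interior = bytes.fromhex("05000000050fe700" "0000003d" "0ffb0ff60ff10fec0fe7")
print(parse_btree_header(leaf))
print(parse_btree_header(interior))
```

When reading from a real file, offset would be 100 for page 1 and the page's starting byte for every other page.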
And at offset 3cf, you have a standard table b-tree leaf cell, containing the only row of the sqlite_master table:
sqlite> select * from sqlite_master;
type name tbl_name rootpage sql
---------- ---------- ---------- ---------- -------------------------
table hello hello 2 CREATE TABLE hello(world)
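For completeness, the cell at 0x3cf can be decoded by hand as well. The following Python sketch (hypothetical function names) handles only what this example needs: varints, small integers and text, with no overflow pages, and it assumes the UTF-8 text encoding that the file header above indicates. The cell bytes are copied from the hexdump.

```python
def read_varint(buf, pos):
    """SQLite varint: 7 bits per byte, high bit = more bytes follow; a 9th byte uses all 8 bits."""
    value = 0
    for i in range(9):
        byte = buf[pos + i]
        if i == 8:
            return (value << 8) | byte, pos + 9
        value = (value << 7) | (byte & 0x7F)
        if byte < 0x80:
            return value, pos + i + 1

def decode_leaf_cell(buf):
    """Return (rowid, values) for a table b-tree leaf cell without overflow."""
    _payload_len, pos = read_varint(buf, 0)
    rowid, pos = read_varint(buf, pos)
    header_start = pos
    header_len, pos = read_varint(buf, pos)
    serial_types = []
    while pos < header_start + header_len:
        serial_type, pos = read_varint(buf, pos)
        serial_types.append(serial_type)
    values = []
    for serial_type in serial_types:
        if serial_type >= 13 and serial_type % 2 == 1:  # text of (N-13)/2 bytes
            size = (serial_type - 13) // 2
            values.append(buf[pos:pos + size].decode("utf-8"))
        elif 1 <= serial_type <= 4:                     # 1- to 4-byte signed integer
            size = serial_type
            values.append(int.from_bytes(buf[pos:pos + size], "big", signed=True))
        else:
            raise NotImplementedError(f"serial type {serial_type}")
        pos += size
    return rowid, values

CELL = (bytes.fromhex("2f0106171717013f") + b"tablehellohello"
        + b"\x02" + b"CREATE TABLE hello(world)")
print(decode_leaf_cell(CELL))
# (1, ['table', 'hello', 'hello', 2, 'CREATE TABLE hello(world)'])
```

The fourth value (2) is the rootpage column, i.e. the page where the hello table's own b-tree starts.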
I have a very long-running task that I need to move from my website into a web service. However, every time I try to call the web service I get this error message: The request failed with HTTP status 401: Unauthorized.
I assumed this could just be an issue with the credentials as I have seen something similar when sending emails so I did this:
Dim wsCustomer As New blueprintdev.RosterEmailService()
Dim basicAuthenticationInfo As New System.Net.NetworkCredential("user", "pass")
wsCustomer.UseDefaultCredentials = False
wsCustomer.PreAuthenticate = True
wsCustomer.Credentials = basicAuthenticationInfo
txtTestResult.Text = wsCustomer.Test()
This still gives me the same issue.
I also tried this using default credentials, and that still does not help.
I have no idea how to do this properly, so I have mostly worked from tutorials etc. Can anyone tell me what I'm doing wrong here?
Thanks
Additional notes: the project I'm working on is one I inherited when I joined the company, and it can't be built in Visual Studio, so stepping through the code with breakpoints is not an option for me without massively hacking the project and removing a lot of code.
Update: Running Fiddler shows me 3 entries. When I look at the Auth tab I see:
No Proxy-Authenticate Header is present.
WWW-Authenticate Header (Negotiate) appears to be a Kerberos reply:
A1 15 30 13 A0 03 0A 01 03 A1 0C 06 0A 2B 06 01 ¡.0. ....¡...+..
04 01 82 37 02 02 0A ..7...
Then
No Proxy-Authenticate Header is present.
WWW-Authenticate Header (Negotiate) appears to be a Kerberos reply:
A1 81 E2 30 81 DF A0 03 0A 01 01 A2 81 D7 04 81 ¡â0ß ....¢×.
D4 4E 54 4C 4D 53 53 50 00 02 00 00 00 0E 00 0E ÔNTLMSSP........
00 38 00 00 00 15 82 89 E2 C1 20 C3 44 5E 99 21 .8....âÁ ÃD^!
A0 00 00 00 00 00 00 00 00 8E 00 8E 00 46 00 00 ..........F..
00 06 01 B1 1D 00 00 00 0F 41 00 43 00 41 00 44 ...±.....A.C.A.D
00 45 00 4D 00 59 00 02 00 0E 00 41 00 43 00 41 .E.M.Y.....A.C.A
00 44 00 45 00 4D 00 59 00 01 00 06 00 44 00 45 .D.E.M.Y.....D.E
00 56 00 04 00 1A 00 6C 00 6F 00 63 00 61 00 6C .V.....l.o.c.a.l
00 2E 00 41 00 63 00 61 00 64 00 65 00 6D 00 79 ...A.c.a.d.e.m.y
00 03 00 22 00 44 00 45 00 56 00 2E 00 6C 00 6F ...".D.E.V...l.o
00 63 00 61 00 6C 00 2E 00 41 00 63 00 61 00 64 .c.a.l...A.c.a.d
00 65 00 6D 00 79 00 05 00 1A 00 6C 00 6F 00 63 .e.m.y.....l.o.c
00 61 00 6C 00 2E 00 41 00 63 00 61 00 64 00 65 .a.l...A.c.a.d.e
00 6D 00 79 00 07 00 08 00 2E 76 48 74 53 21 CD .m.y......vHtS!Í
01 00 00 00 00 .....
And finally this one highlighted in red
No Proxy-Authenticate Header is present.
No WWW-Authenticate Header is present.
Does this mean the authentication data is being lost somewhere between client and server? Or that I am not passing anything to begin with?
If you can access the service through a browser I suggest passing in your username including domain as the credentials.
The username would look something like this: DOMAIN\USER
I've had a similar problem and the username and password had to be explicitly added.
ie.
wsCustomer.username = "something"
wsCustomer.password = "something else"
I guess you're already doing that with:
Dim basicAuthenticationInfo As New System.Net.NetworkCredential("user", "pass")
but maybe worth a try