Debugging SQLite R*tree - r

I have an SQLite database containing an R*tree virtual table. This table is behaving rather oddly and I'm at a loss as to what is wrong. I would appreciate any pointers to aspects I could investigate!
> dbGetQuery(con, 'PRAGMA integrity_check')
integrity_check
1 ok
It seems fine...
> dbGetQuery(con, 'SELECT * FROM peakLoc LIMIT 5')
peakID scanStart scanEnd mzMin mzMax
1 18481 5540 5904 435.1880 435.2095
2 18429 5555 5644 408.7411 408.7459
3 18251 5621 5710 432.7190 432.7285
4 16415 6081 6173 432.2292 432.2470
5 16391 6089 6351 454.1823 454.1960
The general look of the R*tree table
> dbGetQuery(con, 'SELECT MIN(scanEnd), MAX(scanEnd) FROM peakLoc')
MIN(scanEnd) MAX(scanEnd)
1 51 19369
The bounds of scanEnd
> dbGetQuery(con, 'SELECT * FROM peakLoc WHERE scanEnd > 5000 LIMIT 5')
peakID scanStart scanEnd mzMin mzMax
1 20987 4839 6284 410.1729 410.2035
2 6705 9827 10132 738.8564 738.8674
3 15190 6482 6756 615.3235 615.3395
4 15189 6482 6756 509.2193 509.2258
5 12001 7449 7710 855.4534 855.4631
So far so good...
> dbGetQuery(con, 'SELECT * FROM peakLoc WHERE scanEnd > 6000 LIMIT 5')
[1] peakID scanStart scanEnd mzMin mzMax
<0 rows> (or 0-length row.names)
Where are the records?
The same happens for greater-than comparisons on the other columns once the comparison value passes some seemingly arbitrary threshold. This behaviour is only present in the R*tree table; the regular tables work fine...
Have I stumbled upon a constraint in the R*tree module that I do not know about? All records in the R*tree come from one big insert, and I have not touched the underlying tables that the R*tree relies on...
edit:
On request from CL I've tried to create a reproducible example. At least on my system the following produces an R*tree with the same behaviour:
set.seed(1)
library(RSQLite)
con <- dbConnect(dbDriver('SQLite'), ':memory:')
dbGetQuery(con, 'CREATE VIRTUAL TABLE test USING rtree(id, xmin, xmax, ymin, ymax)')
x <- abs(rnorm(100))
y <- abs(rnorm(100))
data <- data.frame(id=1:100, xmin=x, xmax=x+2, ymin=y, ymax=y+3)
dbGetPreparedQuery(con, 'INSERT INTO test VALUES ($id, $xmin, $xmax, $ymin, $ymax)', bind.data=data)
dbGetQuery(con, 'SELECT max(xmax) FROM test')
dbGetQuery(con, 'SELECT * FROM test WHERE xmax > 4 LIMIT 5')
dbGetQuery(con, 'SELECT * FROM test WHERE +xmax > 4 LIMIT 5')
edit 2:
A database created with the commands given in the first edit can be downloaded from this link:
https://dl.dropboxusercontent.com/u/2323585/testdb.sqlite

Related

SQL parser error while querying a Riak time series database

I am getting the following error:
{0,riak_ql_parser, <<"Used group as a measure of time in 712903232group. Only s, m, h and d are allowed.
My Query:
select memberId,COUNT(memberId) from Emp18 where start>1478925732000 and start< 1478925939000 and end>1478913322000 and memberId<712903232 group by memberId;
However, the same query without the GROUP BY clause does return a response:
select memberId,COUNT(memberId) from Emp18 where start>1478925732000 and start< 1478925939000 and end>1478913322000 and memberId<712903232;
The output is:
+---------+-----+---------------+
|memberId |steps|COUNT(memberId)|
+---------+-----+---------------+
|712903230| 350 | 4 |
+---------+-----+---------------+

Hash Table + Binary Search

I'm using a hash table to store some values. Here are the details:
There will be roughly 1M items to store (the keys are not known in advance, so a perfect hash is not possible).
The table has 10M slots.
The hash function is MurmurHash3.
I ran some tests: storing 1M values, I get 350,000 collisions and 30 elements in the most heavily loaded slot.
Are these results good?
Would it make sense to implement binary search for the lists that build up in colliding slots?
What's your advice for improving performance?
EDIT: Here is my code
var
  HashList: array [0..10000000 - 1] of Integer;

for I := 0 to High(HashList) do
  HashList[I] := 0;
for I := 1 to 1000000 do
begin
  Y := MurmurHash3(UIntToStr(I));
  Y := Y mod Length(HashList);
  Inc(HashList[Y]);
  if HashList[Y] > 1 then
    Inc(TotalCollisionsCount);
  if HashList[Y] > MostCollidingSlotItemCount then
    MostCollidingSlotItemCount := HashList[Y];
end;
Writeln('Total: ' + IntToStr(TotalCollisionsCount) + ' Max: ' + IntToStr(MostCollidingSlotItemCount));
Here is the result I get:
Total: 48169 Max: 5
Am I missing something?
This is what you get when you put 1M items randomly into 10M cells
calendar_size=10000000 nperson = 1000000
E/cell |    Ncell |       frac |   Nelem |       frac | h/cell |     hops | Cumhops
-------+----------+------------+---------+------------+--------+----------+--------
    0: |  9048262 | (0.904826) |       0 | (0.000000) |      0 |        0 |       0
    1: |   905064 | (0.090506) |  905064 | (0.905064) |      1 |   905064 |  905064
    2: |    45136 | (0.004514) |   90272 | (0.090272) |      3 |   135408 | 1040472
    3: |     1488 | (0.000149) |    4464 | (0.004464) |      6 |     8928 | 1049400
    4: |       50 | (0.000005) |     200 | (0.000200) |     10 |      500 | 1049900
-------+----------+------------+---------+------------+--------+----------+--------
    5: | 10000000 |            | 1000000 |            |        | 1.049900 | 1049900
The left column is the number of items in a cell. The second column is the number of cells holding that many items. The last row gives the totals: 10M cells, 1M items, and 1,049,900 hops in total, i.e. about 1.05 hops per item.
Regarding binary search: for chains this short (maximum chain length 4, with most chains of length 1), linear search outperforms binary search. The break-even point is probably somewhere between 10 and 100 elements per chain.
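The table above is close to what a Poisson approximation predicts for 1M items thrown uniformly at random into 10M cells. The following is a minimal Python sketch of that calculation, not the program that produced the table; the function names `expected_occupancy` and `expected_collisions` are made up for illustration.

```python
import math


def expected_occupancy(n_items: int, n_cells: int, max_k: int = 5) -> list[float]:
    """Expected number of cells holding exactly k items (k = 0..max_k),
    using the Poisson approximation with mean n_items / n_cells."""
    lam = n_items / n_cells
    return [
        n_cells * math.exp(-lam) * lam**k / math.factorial(k)
        for k in range(max_k + 1)
    ]


def expected_collisions(n_items: int, n_cells: int) -> float:
    """Expected number of items that land in an already occupied cell,
    i.e. items minus the expected number of occupied cells."""
    lam = n_items / n_cells
    occupied = n_cells * (1.0 - math.exp(-lam))
    return n_items - occupied


if __name__ == "__main__":
    for k, cells in enumerate(expected_occupancy(1_000_000, 10_000_000)):
        print(f"{k}: {cells:12.1f}")
    print("collisions:", round(expected_collisions(1_000_000, 10_000_000)))
```

Under this approximation the expected collision count is roughly 48,000, which is in line with the `Total: 48169` measured in the question and far from the 350,000 reported at first.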

RODBC: able to connect to db but can't find table object

I am trying to connect to an SQLite database using RODBC in R. RODBC is able to connect to the database, but sqlTables returns "0 rows" instead of the list of tables. The database has 20 tables.
System: R 3.1.2, Windows 7, Rstudio
Code snippet
> library(RODBC)
> odbcGetInfo(bbdb1)
DBMS_Name
"SQLite"
DBMS_Ver
"3.8.6"
Driver_ODBC_Ver
"03.00"
Data_Source_Name
"bbdb1"
Driver_Name
"sqlite3odbc.dll"
Driver_Ver
"0.999"
ODBC_Ver
"03.80.0000"
Server_Name
"C:\\Users\\shals\\Documents\\R in a nutshell\\nutshell\\data\\bb1"
> sqlListTables(bbdb1)
Error: could not find function "sqlListTables"
> sqlTables(bbdb1)
[1] TABLE_CAT TABLE_SCHEM TABLE_NAME TABLE_TYPE REMARKS
<0 rows> (or 0-length row.names)
> sqlPrimaryKeys(bbdb1,func,errors=FALSE,as.is=TRUE,catalog=NULL,schema=NULL)
Error in sqlPrimaryKeys(bbdb1, func, errors = FALSE, as.is = TRUE, catalog = NULL, :
object 'func' not found
Can anyone please explain why sqlTables returns 0 rows when there are 20 tables in the database?
I changed the connection call as shown below, after which the code worked fine.
bbdb1 <- odbcConnect(dsn="bbdb",believeNRows = FALSE,rows_at_time = 1)

Update Columns with Using Case

I have an ASP.NET project in which I need to update one column according to the values of two other columns.
The structure of my table looks like this:
ID - NAME - GK - PG - GK+PG
1 - mike - 1 - 1 [the sql should write here 540 but writes 180 for the case]
2 - john - 2 - 1 [the sql should write here 1080 but writes 180 for the case]
3 - sue - 1 - 2 [the sql should write here 1080 but writes 180 for the case]
Here is the .cs code for it.
// Verbatim string (@"...") so the SQL can span several lines.
string strSQL = @"UPDATE [info] SET
    [GK_PG] = (CASE
        WHEN ([GK]='1') THEN '180'
        WHEN ([PG]='1') THEN '180'
        WHEN ([GK]='2') THEN '540'
        WHEN ([PG]='2') THEN '540'
        WHEN ([GK]='3') THEN '1080'
        WHEN ([PG]='3') THEN '1080'
        WHEN ([GK]='1' AND [PG]='1') THEN '540'
        WHEN ([GK]='2' AND [PG]='1') THEN '1080'
        WHEN ([GK]='1' AND [PG]='2') THEN '1080'
        ELSE 0
    END)
    WHERE [DATE] BETWEEN #DATE1 AND #DATE1 AND WORK_TYPE='IN'";
This code only ever writes '180' to my [GK_PG] column; it never reaches the other cases.
Waiting for your answer.
Thank you.
I've solved it by separating the GK and PG cases into two different button click events.
Thanks anyway.
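A likely reason only '180' was written: a CASE expression returns the result of the first WHEN whose condition is true, so for the three sample rows the single-column tests on GK='1' or PG='1' match before the combined tests are ever reached. Below is a minimal sketch of reordering the branches so the most specific conditions come first. It uses Python's built-in sqlite3 module with an in-memory table as a stand-in for the real database, and it leaves out the original WHERE clause on [DATE] and WORK_TYPE; both choices are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE info (ID INTEGER, NAME TEXT, GK TEXT, PG TEXT, GK_PG TEXT)"
)
# Sample rows copied from the question
conn.executemany(
    "INSERT INTO info (ID, NAME, GK, PG) VALUES (?, ?, ?, ?)",
    [(1, "mike", "1", "1"), (2, "john", "2", "1"), (3, "sue", "1", "2")],
)

# Combined conditions first, because CASE stops at the first matching WHEN.
conn.execute(
    """
    UPDATE info SET GK_PG = CASE
        WHEN GK = '1' AND PG = '1' THEN '540'
        WHEN GK = '2' AND PG = '1' THEN '1080'
        WHEN GK = '1' AND PG = '2' THEN '1080'
        WHEN GK = '1' OR PG = '1' THEN '180'
        WHEN GK = '2' OR PG = '2' THEN '540'
        WHEN GK = '3' OR PG = '3' THEN '1080'
        ELSE '0'
    END
    """
)

rows = conn.execute("SELECT NAME, GK_PG FROM info ORDER BY ID").fetchall()
print(rows)
```

With this ordering the three sample rows get 540, 1080 and 1080, matching the values the question expects, and a single UPDATE is enough.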

Generating complete SKUs in Classic ASP

Hi, I have products that are made up of a couple of options. Each option has a SKU code. You can only select one option from each SKU group, and the options have to be concatenated in the order of the SKUGroup.
For example, I would have a list of options in a database table that looks like this:
OptID PID SKU Price SKUGroup
156727 93941 C 171.00 1
156728 93941 BN 171.00 1
156729 93941 PN 171.00 1
156718 93940 W 115.20 2
156719 93940 CA 115.20 2
156720 93940 BA 115.20 2
156721 93940 BNA 115.20 2
156722 93940 BN 115.20 2
156723 93940 BS 115.20 2
156716 93939 CHR 121.50 3
156717 93939 NK 138.00 3
And a few finished product SKUs would look something like:
C-W-CHR 407.70
C-W-NK 424.20
C-CA-CHR 407.70
C-CA-NK 424.20
I am trying to make a script that will create a listing of every possible combination of SKU and the price of the combined options.
I need this done in Classic ASP (vbscript) and I'm not that familiar with it. So I'm looking for all the help I can get.
Thanks!
I would start by connecting to the database and creating three recordsets.
Set connection = CreateObject("ADODB.Connection")
connection.Open ConnectionString
Set rsOption1 = CreateObject("ADODB.recordset")
Set rsOption2 = CreateObject("ADODB.recordset")
Set rsOption3 = CreateObject("ADODB.recordset")
rsOption1.Open "SELECT * FROM TableName WHERE SKUGroup = 1", connection, 3,3
rsOption2.Open "SELECT * FROM TableName WHERE SKUGroup = 2", connection, 3,3
rsOption3.Open "SELECT * FROM TableName WHERE SKUGroup = 3", connection, 3,3
Then you can use nested loops to get the combinations. Something like this (untested, so it probably will not work as is, but it gives you an idea of how to do it; it also assumes that exactly one option must be selected from each group):
For i = 0 To rsOption1.RecordCount - 1
    ' Move i, 1 positions the cursor i rows after the first record (1 = adBookmarkFirst)
    rsOption1.Move i, 1
    For j = 0 To rsOption2.RecordCount - 1
        rsOption2.Move j, 1
        For k = 0 To rsOption3.RecordCount - 1
            rsOption3.Move k, 1
            ' Fields(2) is the SKU column, Fields(3) is the Price column
            Response.Write rsOption1.Fields(2).Value & "-" & rsOption2.Fields(2).Value & _
                "-" & rsOption3.Fields(2).Value & " " & _
                FormatCurrency(rsOption1.Fields(3).Value + rsOption2.Fields(3).Value + rsOption3.Fields(3).Value) & "<br>"
        Next
    Next
Next
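The nested loops above compute the Cartesian product of the option groups. As a language-neutral illustration of the same idea (a Python sketch, not Classic ASP), the version below handles any number of SKU groups. The option rows are copied from the question's table; the function name `all_skus` is hypothetical.

```python
from decimal import Decimal
from itertools import groupby, product

# (SKU, price, SKUGroup) rows copied from the question's table
options = [
    ("C", "171.00", 1), ("BN", "171.00", 1), ("PN", "171.00", 1),
    ("W", "115.20", 2), ("CA", "115.20", 2), ("BA", "115.20", 2),
    ("BNA", "115.20", 2), ("BN", "115.20", 2), ("BS", "115.20", 2),
    ("CHR", "121.50", 3), ("NK", "138.00", 3),
]


def all_skus(rows):
    """Return (sku, total_price) for every combination that picks exactly
    one option from each SKUGroup, concatenated in SKUGroup order."""
    rows = sorted(rows, key=lambda r: r[2])  # groupby needs sorted input
    groups = [list(g) for _, g in groupby(rows, key=lambda r: r[2])]
    result = []
    for combo in product(*groups):
        sku = "-".join(opt[0] for opt in combo)
        # Decimal avoids floating-point rounding noise in the prices
        price = sum(Decimal(opt[1]) for opt in combo)
        result.append((sku, price))
    return result


if __name__ == "__main__":
    for sku, price in all_skus(options):
        print(sku, price)
```

For the sample data this yields 3 x 6 x 2 = 36 combinations, starting with C-W-CHR at 407.70.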

Resources