How to design a database (SQLite) for a multi-condition query?

Suppose 1,000,000 records arranged as:
c1_v1 c2_v1 c3_v1 d1
c1_v1 c2_v1 c3_v2 d2
c1_v1 c2_v1 c3_v3 d3
...
c1_v1 c2_v2 c3_v1 d999
c1_v1 c2_v2 c3_v2 d1000
...
c1_v999 c2_v999 c3_v998 d999999
c1_v999 c2_v999 c3_v999 d1000000
Say that we need all three conditions (c1_vx, c2_vx, c3_vx) to query the result (dx), but a single condition value such as c1_v1 may be the same across different records. An alternative layout of the records:
c1_v1
c2_v1
c3_v1 : d1
c3_v2 : d2
c3_v3 : d3
...
c2_v2
c3_v1 : d999
c3_v2 : d1000
...
c1_v999
c2_v999
c3_v998: d999999
c3_v1000: d1000000
How to design the tables for the fastest query? (Just query; don't care about insert/update/delete.)
Thanks!

A typical query operation looks like:
select d from t_table where c1 = 'UA1000_2048X32_MCSYN' and c2 = '1.234' and c3 = '2.345';
Well, then you just need a composite index on {c1, c2, c3}.
Ideally, you'd also cluster the table, so retrieving d just involves an index seek without a table heap access, but I don't think SQLite supports clustering. Alternatively, consider creating a covering index on {c1, c2, c3, d} instead.
c1 is a string like UA1000_2048X32_MCSYN; c2 and c3 are real (double) numbers
I'd refrain from trying to equate numbers with strings in your query - some DBMSes can't use an index in these situations, and SQLite might be one of them. Instead, just write the query in the most natural way, without single quotes around number literals:
select d from t_table
where c1 = 'UA1000_2048X32_MCSYN' and c2 = 1.234 and c3 = 2.345;
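Putting both suggestions together, the covering index and the rewritten query might look like the sketch below; this is only an illustration against the names given in the question, not a benchmarked recommendation:
-- Covering index: c1, c2, c3 are the lookup columns; d is included so the
-- query can be answered from the index alone, without touching the table.
CREATE INDEX idx_t_table_c1_c2_c3_d ON t_table (c1, c2, c3, d);

-- Equality on all three leading columns turns the lookup into a single index seek.
SELECT d
FROM t_table
WHERE c1 = 'UA1000_2048X32_MCSYN'
  AND c2 = 1.234
  AND c3 = 2.345;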

Related

Kusto equivalent of SQL NOT IN

I am trying to identify what records exist in table 1 that are not in table 2 (so essentially using NOT IN)
let outliers =
Table 2
| project UniqueEventGuid;
Table 1
|where UniqueEventGuid !in (outliers)
|project UniqueEventGuid
but getting 0 records back even though I know there are orphans in table 1.
Is the !in not the right syntax?
Thanks in advance!
!in operator:
"In tabular expressions, the first column of the result set is selected."
In the following example I intentionally ordered the columns such that the query results in an error due to mismatched data types.
In your case, the data types might match, so the query is valid, but the results are wrong.
let t1 = datatable(i:int, x:string)[1,"A", 2,"B", 3,"C" ,4,"D" ,5,"E"];
let t2 = datatable(y:string, i:int)["d",4 ,"e",5 ,"f",6 ,"g",7];
t1
| where i !in (t2)
Relop semantic error: SEM0025: One of the values provided to the '!in' operator does not match the left side expression type 'int', consider using explicit cast
If that is indeed the case, you can reorder the columns or project only the relevant one.
Note the use of double parentheses.
let t1 = datatable(i:int, x:string)[1,"A", 2,"B", 3,"C" ,4,"D" ,5,"E"];
let t2 = datatable(y:string, i:int)["d",4 ,"e",5 ,"f",6 ,"g",7];
t1
| where i !in ((t2 | project i))
i x
1 A
2 B
3 C
Another option is to use a leftanti join:
let t1 = datatable(i:int, x:string)[1,"A", 2,"B", 3,"C" ,4,"D" ,5,"E"];
let t2 = datatable(y:string, i:int)["d",4 ,"e",5 ,"f",6 ,"g",7];
t1
| join kind=leftanti t2 on i
i x
2 B
3 C
1 A

Informix 11.5: How to count the number of members of a column separately

I have a table that has a column like this:
table1:
c1 c2 c3
.  a  .
.  a  .
.  a  .
.  a  .
.  b  .
.  b  .
.  c  .
How do I get a result like the following?
-- a b c
count(a) count(b) count(c)
Of course, there is an auxiliary table like the one below:
--field table
d1 d2
a
b
c
Transferring comments into an answer.
If there were an entry in table1.c2 with d as the value, is it correct to guess/assume that you'd want a fourth column of output with the name d and the count of the d values as its value? And there'd be an extra row in the auxiliary table too. That's pretty tricky.
You'd probably be better off with a result table with N rows, one for each value in the table1.c2 column, with the first column identifying the value and the second the count:
SELECT c2, COUNT(c2) FROM table1 GROUP BY c2 ORDER BY c2
To generate a single row with the names and counts as shown requires a dynamically built SQL statement — you write an SQL statement that generates the SQL (or the key components of the SQL) for a second statement that you actually execute to get the result. The main reason for it being dynamic like that is that the number of columns in the result set is not known until you run a query that determines which values exist in table1.c2. That's non-trivial — doable, but non-trivial.
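As a rough illustration of the dynamic approach (an untested sketch against 11.50, assuming c2 is a character column), a first statement can generate the per-value subqueries that you then assemble into the second statement:
-- Step 1: generate one subquery per distinct c2 value, e.g.
--   (SELECT COUNT(*) FROM table1 WHERE c2 = 'a') AS a,
SELECT DISTINCT '(SELECT COUNT(*) FROM table1 WHERE c2 = ''' || TRIM(c2)
       || ''') AS ' || TRIM(c2) || ','
  FROM table1;
-- Step 2: paste the generated lines between SELECT and FROM dual,
-- drop the trailing comma on the last line, and execute that statement.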
I forget whether 11.50 has a built-in sysmaster:sysdual table. I ordinarily use a regular one-column, one-row table called dual. You can get the result you want, if your Table1.C2 has values a through e in it, with:
SELECT (SELECT COUNT(*) FROM Table1 WHERE c2 = 'a') AS a,
(SELECT COUNT(*) FROM Table1 WHERE c2 = 'b') AS b,
(SELECT COUNT(*) FROM Table1 WHERE c2 = 'c') AS c,
(SELECT COUNT(*) FROM Table1 WHERE c2 = 'd') AS d,
(SELECT COUNT(*) FROM Table1 WHERE c2 = 'e') AS e
FROM dual;
This gets the information you need. I don't think it is elegant, but "works" beats "doesn't work".

How to query extended path between nodes with Cypher in Neo4J?

I'm using Neo4J / Cypher to store / retrieve some data based on a graph model.
Let's suppose the following model: I have a set of nodes (type=child) that are connected through a relationship (type=CONNECTED_TO).
C1 -[:CONNECTED_TO]-> C2 -[:CONNECTED_TO]-> C3 -[:CONNECTED_TO]-> C4
If I want to query a path starting from C1 to C4 without knowing intermediates:
MATCH p=
(a:child {id:'c1Id'}) -[:CONNECTED_TO*0..]-(z:child {id:'c4Id'})
RETURN p
So far so good.
Now suppose that each child is contained in a parent and I want to start the query from parent ID
P1 -[:CONTAINS]-> C1
P2 -[:CONTAINS]-> C2
P3 -[:CONTAINS]-> C3
P4 -[:CONTAINS]-> C4
The query looks like:
MATCH p=
(a:parent {id:'p1Id'})
-[:CONTAINS]->
(cStart:child)
-[:CONNECTED_TO*0..]-
(cEnd:child)
<-[:CONTAINS]-
(z:parent {id:'p4Id'})
RETURN p
This gives me the right result, namely the following path:
P1 -[:CONTAINS]-> C1 -[:CONNECTED_TO]-> C2 -[:CONNECTED_TO]-> C3 -[:CONNECTED_TO]-> C4 <-[:CONTAINS]- P4
What I would like to do is to query this path from P1 to P4 using the child topology, but I also want to retrieve all the parents containing the intermediate children.
How can I improve my last Cypher query so that it additionally returns:
P2 -[:CONTAINS]-> C2
P3 -[:CONTAINS]-> C3
Is it possible? Maybe my model design is not appropriate for this use case? If so, how can I improve it to address this query?
Tx
You can use a list comprehension construct:
MATCH p=
(a:parent {id:'p1Id'})
-[:CONTAINS]->
(cStart:child)
-[:CONNECTED_TO*0..]-
(cEnd:child)
<-[:CONTAINS]-
(z:parent {id:'p4Id'})
RETURN p,
[n IN nodes(p)[1..-1] | (n)<-[:CONTAINS]-(:parent)][0]

SDO_NN cannot be evaluated without using index when used inside an IN statement

If I run the following query:
select B3.bid as id ,B3.bshape as shape
from Buildings B3
where B3.bid in
(
select distinct B1.bid from Buildings B1,
(
select * from Buildings B where B.bname in (select BOF.bname from Buildings_On_Fire BOF)
) B2 where sdo_nn(B1.bshape, B2.bshape, 'distance=100') = 'TRUE' and B1.bname != b2.bname
)
I receive the following errors:
ERROR at line 1:
ORA-13249: SDO_NN cannot be evaluated without using index
ORA-06512: at "MDSYS.MD", line 1723
ORA-06512: at "MDSYS.MDERR", line 17
ORA-06512: at "MDSYS.PRVT_IDX", line 9
However, if I just run the following subquery:
select distinct B1.bid from Buildings B1,
(
select * from Buildings B where B.bname in (select BOF.bname from Buildings_On_Fire BOF)
) B2 where sdo_nn(B1.bshape, B2.bshape, 'distance=100') = 'TRUE' and B1.bname != b2.bname
This executes fine. I have verified the spatial indexes, and they seem to be valid.
I am new to Oracle and have no idea what to do next. Please help.
If there is a solution which doesn't require changing the above query, that would be best.
A bit late for an answer, but here comes ...
The error you get is because the optimizer did not use the spatial index to solve the SDO_NN operator. Contrary to the other spatial operators (SDO_RELATE, SDO_WITHIN_DISTANCE), SDO_NN cannot be resolved without the help of the index.
Then again, I suspect your query is incorrectly formulated. If I understand correctly, what you want to do is find all buildings that are within a distance of 100 (what? meters?) of any building that is on fire. For that, use the SDO_WITHIN_DISTANCE operator.
Let's assume your tables are like this:
buildings (bid number, bname varchar2(30), bshape sdo_geometry)
buildings_on_fire (bid number, bname varchar2(30))
The select will then be like this:
select b1.bid as id, b1.bshape as shape
from buildings b1,
buildings b2,
buildings_on_fire bof
where b2.bname = bof.bname
and b1.bname <> b2.bname
and sdo_within_distance (
b1.bshape, b2.bshape,
'distance=100 unit=meter'
) = 'TRUE';
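If you really want to keep the original SDO_NN formulation, the usual workaround for ORA-13249 is to force the spatial index with an INDEX hint on the table whose geometry is the first SDO_NN argument. This is only a sketch: BUILDINGS_BSHAPE_SIDX is a hypothetical placeholder for your actual spatial index name, and whether this is enough still depends on the optimizer:
select B3.bid as id, B3.bshape as shape
  from Buildings B3
 where B3.bid in
 (
   -- the hint asks the optimizer to drive B1 through its spatial index
   select /*+ INDEX(B1 BUILDINGS_BSHAPE_SIDX) */ distinct B1.bid
     from Buildings B1,
          (select * from Buildings B
            where B.bname in (select BOF.bname from Buildings_On_Fire BOF)) B2
    where sdo_nn(B1.bshape, B2.bshape, 'distance=100') = 'TRUE'
      and B1.bname != B2.bname
 );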

sqlite alter using select and match from other tables

I know how to do these sorts of things using Perl, Python or even MySQL, but I can't seem to figure out how to do this with SQLite. Hoping maybe somebody here can help.
UPDATED NOTE: I'm limited to sqlite version 2.8.17
I have:
create table Ta (
a1 INTEGER PRIMARY KEY,
a2 VARCHAR(12) );
create table Tb (
b1 VARCHAR(12) PRIMARY KEY,
b2 INTEGER,
b3 VARCHAR(8),
b4 VARCHAR(8) );
What I would like to do via the command line and in a basic sql script is this:
Go through all of the rows in Tb and, where b2 == a1, replace the value stored in b1 with the corresponding value in a2.
Simplified, it's something like:
b1 = select a2 from Ta where a1 = b2
Any ideas?
How about this?
UPDATE Tb
SET
b1 = (SELECT a2
FROM Ta
WHERE Tb.b2 = Ta.a1 )
WHERE
EXISTS (
SELECT *
FROM Ta
WHERE Tb.b2 = Ta.a1 );
Unless I missed something in your question, you simply have to update:
update tb set b1 = (select a2 from Ta where a1 = b2);
UPDATE 1
The OP mentions that they are using SQLite 2.8.17, so "cross-table" updates are not supported.
I found this link which provides a workaround. It requires that the field joined on be primary keys, which is the case for this question.
Here is the statement:
insert or replace into tb (b2, b1)
select ta.a1, ta.a2
from ta, tb
where ta.a1=tb.b2;
I haven't tested it other than verifying that it executes without errors. To the best of my SQL knowledge it should do the same as the update statement I posted prior to this update.
UPDATE 2
There is a problem with the above, as the OP pointed out: it will insert new records rather than update existing records in Tb. I do now see a contradiction in what the OP is trying to do:
Assume that sqlite 3.x.y is used. A simple update statement can get the job done. The problem is that it will fail as soon as more than one record in Tb has the same b2 value that exists in Ta.a1:
sqlite> create table ta (
...> a_key INTEGER PRIMARY KEY,
...> a_val TEXT);
sqlite> create table tb (
...> b_key TEXT PRIMARY KEY,
...> b_val INTEGER);
sqlite> insert into ta values (1, 'a');
sqlite> insert into tb values ('z', 1);
sqlite> insert into tb values ('y', 1);
sqlite> update tb set b_key=(select a_val from ta where a_key=b_val);
Error: column b_key is not unique
So the solution here is to make Tb.b2 unique:
create table Tb (
b1 VARCHAR(12) PRIMARY KEY,
b2 INTEGER UNIQUE,
b3 VARCHAR(8),
b4 VARCHAR(8));
Making Tb.b2 unique makes the solution in my first update work properly and prevents the uniqueness violation exposed above.
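For illustration, here is the session above continued with the UNIQUE constraint in place (an untested sketch based on SQLite 3 semantics; 2.8 may differ slightly). The insert-or-replace workaround then rewrites the existing row instead of adding a new one:
sqlite> create table ta (
   ...>   a_key INTEGER PRIMARY KEY,
   ...>   a_val TEXT);
sqlite> create table tb (
   ...>   b_key TEXT PRIMARY KEY,
   ...>   b_val INTEGER UNIQUE);
sqlite> insert into ta values (1, 'a');
sqlite> insert into tb values ('z', 1);
sqlite> -- the new row ('a', 1) collides with ('z', 1) on the UNIQUE b_val,
sqlite> -- so "or replace" drops the old row instead of raising an error
sqlite> insert or replace into tb (b_val, b_key)
   ...>   select ta.a_key, ta.a_val from ta, tb where ta.a_key = tb.b_val;
sqlite> select * from tb;
a|1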
