How to write complex recursive maria db query - recursion

Im trying to write a recursive query for a use on a old and poorly designed database - and so the queries get quite complex.
Here is the (relevant) table relationships
Because people asked - here is the creation code for these tables:
CREATE TABLE CircuitLayout(
CircuitLayoutID int,
PRIMARY KEY (CircuitLayoutID)
);
CREATE TABLE LitCircuit (
LitCircuitID int,
CircuitLayoutID int,
PRIMARY KEY (LitCircuitID)
FOREIGN KEY (CircuitLayoutID) REFERENCES CircuitLayout(CircuitLayoutID)
);
CREATE TABLE CircuitLayoutItem(
CircuitLayoutItemID int,
CircuitLayoutID int,
TableName varchar(255),
TablePK int,
PRIMARY KEY (CircuitLayoutItemID)
FOREIGN KEY (CircuitLayoutID) REFERENCES CircuitLayout(CircuitLayoutID)
);
TableName refers to another table in the database and thus TablePK is a primary key from the specified table
One of the valid options for TableName is LitCircuit
I'm trying to write a query that will select a circuit and any circuit it is related to
I am having trouble understanding the syntax for recursive ctes
my non-functional attempt is this:
WITH RECURSIVE carries AS (
SELECT LitCircuit.LitCircuitID AS recurseList FROM LitCircuit
JOIN CircuitLayoutItem ON LitCircuit.CircuitLayoutID = CircuitLayoutItem.CircuitLayoutID
WHERE CircuitLayoutItem.TableName = "LitCircuit" AND CircuitLayoutItem.TablePK IN (00340)
UNION
SELECT LitCircuit.LitCircuitID AS CircuitIDs FROM LitCircuit
JOIN CircuitLayout ON LitCircuit.CircuitLayoutID = CircuitLayoutItem.CircuitLayoutID
WHERE CircuitLayoutItem.TableName = "LitCircuit" AND CircuitLayoutItem.TablePK IN (SELECT recurseList FROM carries)
)
SELECT * FROM carries;
the "00340" is a dummy number for testing, and it would get replaced with an actual list in usage
What i'm attempting to do is get a list of LitCircuitIDs based on one or many LitCircuitIDs - that's the anchor member, and that works fine.
What I want to do is take this result and feed it back into itself.
I lack an understanding of how to access data from the anchor member:
I don't know if it is a table with the columns from the select in the anchor or if it is simply a list of resulting values
I dont understand if or where I need to include "carries" in the FROM part of a query
If I were to write this function in python I would do it like this:
def get_circuits(circuit_list):
result_list = []
for layout_item_key, layout_item in CircuitLayoutItem.items():
if layout_item['TableName'] == "LitCircuit" and layout_item['TablePK'] in circuit_list:
layout = layout_item['CircuitLayoutID']
for circuit_key, circuit in LitCircuit.items():
if circuit["CircuitLayoutID"] == layout:
result_list.append(circuit_key)
result_list.extend(get_circuits(result_list))
return result_list
How do I express this in SQL?

danblack's comment made me realize something I was missing:
Here is what I was trying to do:
WITH RECURSIVE carries AS (
SELECT LitCircuit.LitCircuitID FROM LitCircuit
JOIN CircuitLayoutItem ON LitCircuit.CircuitLayoutID = CircuitLayoutItem.CircuitLayoutID
WHERE CircuitLayoutItem.TableName = 'LitCircuit' AND CircuitLayoutItem.TablePK IN (00340)
UNION ALL
SELECT LitCircuit.LitCircuitID FROM carries
JOIN CircuitLayoutItem ON carries.LitCircuitID = CircuitLayoutItem.TablePK
JOIN LitCircuit ON CircuitLayoutItem.CircuitLayoutID = LitCircuit.CircuitLayoutID
WHERE CircuitLayoutItem.TableName = 'LitCircuit'
)
SELECT DISTINCT LitCircuitID FROM carries;
I did not think of the CTE as a table to query against - rather just a result set, so I did not realize you have to SELECT from it - or in general treat it like a table.

Related

How to check if a parent/child relationship exists in a tree?

Checking if the following tables have a certain relationship among their records would be useful:
-- Table: privilege_group
CREATE TABLE privilege_group (
privilege_group_id integer NOT NULL CONSTRAINT privilege_group_pk PRIMARY KEY AUTOINCREMENT,
name text NOT NULL,
CONSTRAINT privilege_group_name UNIQUE (name)
);
-- Table: privilege_relationship
CREATE TABLE privilege_relationship (
privilege_relationship_id integer NOT NULL CONSTRAINT privilege_relationship_pk PRIMARY KEY AUTOINCREMENT,
parent_id integer NOT NULL,
child_id integer NOT NULL,
CONSTRAINT privilege_relationship_parent_child UNIQUE (parent_id, child_id),
CONSTRAINT privilege_relationship_parent_id FOREIGN KEY (parent_id)
REFERENCES privilege_group (privilege_group_id),
CONSTRAINT privilege_relationship_child_id FOREIGN KEY (child_id)
REFERENCES privilege_group (privilege_group_id),
CONSTRAINT privilege_relationship_check CHECK (parent_id != child_id)
);
Parents can have many children, children can have many parents. Writing code to process records outside of the database is always possible, but is it possible to use a depth-first (or breadth-first) search to check if a child has a particular parent?
My related question received a comment from CL. that mentions the WITH clause, but my experience with hierarchical queries is rather limited and insufficient to understand, select, and apply the examples on the page to my goal:
Only worked with hierarchical queries in Oracle.
Only used to implement "range" number generators (like in Python).
Only seen how to process records in a broad-to-narrow pattern.
Not sure if an expanding result set in a hierarchical query is possible.
Unsure of how to select a depth-first or breadth-first search strategy.
Could someone show me how to find out if a child has a parent if the names of both are known?
This is a standard tree search (using UNION instead of UNION ALL to prevent infinite loops):
WITH RECURSIVE ParentsOfG1(id) AS (
SELECT privilege_group_id
FROM privilege_group
WHERE name = 'G1'
UNION
SELECT parent_id
FROM privilege_relationship
JOIN ParentsOfG1 ON id = child_id
)
SELECT id
FROM ParentsOfG1
WHERE id = (SELECT privilege_group_id
FROM privilege_group
WHERE name = 'P2');
Depth/breadth-first does not matter for this.
An alternative to CL.'s answer could be this query which has been reformatted and adjusted to use bound parameters that could be plugged into a project that needs to check certain relationships:
WITH RECURSIVE parent_of_child(id)
AS (
SELECT privilege_group_id
FROM privilege_group
WHERE name = :child
UNION
SELECT parent_id
FROM privilege_relationship
JOIN parent_of_child
ON id = child_id)
SELECT id
FROM parent_of_child
WHERE id = (
SELECT privilege_group_id
FROM privilege_group
WHERE name = :parent)

SQLite Query plan

Is there a way to manipulate the query plan generated in SQLite?
I 'l try to explain my problem:
I have 3 tables:
CREATE TABLE "index_term" (
"id" INT,
"term" VARCHAR(255) NOT NULL,
PRIMARY KEY("id"),
UNIQUE("term"));
CREATE TABLE "index_posting" (
"doc_id" INT NOT NULL,
"term_id" INT NOT NULL,
PRIMARY KEY("doc_id", "field_id", "term_id"),,
CONSTRAINT "index_posting_doc_id_fkey" FOREIGN KEY ("doc_id")
REFERENCES "document"("doc_id") ON DELETE CASCADE,
CONSTRAINT "index_posting_term_id_fkey" FOREIGN KEY ("term_id")
REFERENCES "index_term"("id") ON DELETE CASCADE);;
CREATE INDEX "index_posting_term_id_idx" ON "index_posting"("term_id");
CREATE TABLE "published_files" (
"doc_id" INTEGER NOT NULL,,
"uri_id" INTEGER,
"user_id" INTEGER NOT NULL,
"status" INTEGER NOT NULL,
"title" VARCHAR(1024),
PRIMARY KEY("uri_id"));
CREATE INDEX "published_files_doc_id_idx" ON "published_files"("doc_id");
about 600.000 entries in the index_term, about 4 Millions in the index_posting and 300.000 in the published_files table.
Now when i want to find the number of unique doc_ids in index_posting which reference some terms i use the following SQL.
select count(distinct index_posting.doc_id) from index_term, index_posting
where
index_posting.term_id = index_term.id and index_term.term like '%test%'
The result is displayed in reasonable time (0.3 secs). Asking Explain Query plan returns
0|0|0|SCAN TABLE index_term
0|1|1|SEARCH TABLE index_posting USING INDEX index_posting_term_id_idx (term_id=?)
When i want to filter the count in the way that it only includes doc_ids of index_posting if there exists a published_files entry:
select count(distinct index_posting.doc_id) from index_term, index_posting,
published_files where
index_posting.term_id = index_term.id and index_posting.doc_id = published_files.doc_id and index_term.term like '%test%'
The query takes almost 10 times as long. Asking Explain Query plan returns
0|0|1|SCAN TABLE index_posting
0|1|0|SEARCH TABLE index_term USING INDEX sqlite_autoindex_index_term_1 (id=?)
0|2|2|SEARCH TABLE published_files AS pf USING COVERING INDEX published_files_doc_id_idx (doc_id=?)
So as far as i understand SQLITE changed here its query plan doing a full table scan of index_posting and a lookup in index_term instead of the other way around.
As a workaround i did do a
analyze index_posting;
analyze index_term;
analyze published_files;
and now it seems correct,
0|0|0|SCAN TABLE index_term
0|1|1|SEARCH TABLE index_posting USING INDEX index_posting_term_id_idx (term_id=?)
0|2|2|SEARCH TABLE published_files USING COVERING INDEX published_files_doc_id_idx (doc_id=?)
but my question is - is there a way to force SQLITE to always use the correct query plan?
TIA
ANALYZE is not a workaround; it's supposed to be used.
You can use CROSS JOIN to enforce a certain order of the nested loops, or use INDEXED BY to force a certain index to be used.
However, you asked for "the correct query plan", which might not be same as the one enforced by these mechanisms.

sqlite3 join-filter order-by performance

I'm trying to do a query like that on an sqlite3 database:
select node.loc, node.weight from node
inner join filt on (node.id = filt.node_id)
inner join filt T5 on (node.id = T5.node_id)
where (filt.word = 'aaa' and T5.word = 'aasvogel')
order by node.weight desc limit 10;
On mysql, such query works fine and fast (<0.2s); on sqlite3, on the same data, it runs for ~2s.
What could be the problem and what can I do to improve its performance?
The files I made to test sqlite3 can be found here: https://github.com/HoverHell/sqlperftst1
The table definitions in particular:
CREATE TABLE "node" (
"id" integer PRIMARY KEY,
"loc" varchar(255) NOT NULL,
"weight" real
);
CREATE INDEX "node_loc" ON "node" ("loc");
CREATE INDEX "node_weight" on "node" ("weight");
CREATE TABLE "filt" (
"id" integer PRIMARY KEY,
"node_id" integer NOT NULL,
"word" varchar(120) NOT NULL
);
CREATE INDEX "filt_word" ON "filt" ("word");
CREATE INDEX "filt_node_id" ON "filt" ("node_id");
UPD: Perofrmance comparison on realistic data and queries:
This query can be improved by creating an index on both columns used for lookups:
CREATE INDEX filt_word_node_id ON file(word, node_id);
However, the shape of tje data is unusual, which makes the query optimizer misestimate the selectivity of the word lookups.
Run ANALYZE to fix this.

sqlite3 if exists alternative

I am trying to run simple query for sqlite which is update a record if not exists.
I could have used Insert or replace but article_tags table doesn't have any primary key as it is a relational table.
How can I write this query for sqlite as if not exists is not supported.
And I don't have idea how to use CASE for this ?
Table Structure:
articles(id, content)
article_tags(article_id, tag_id)
tag(id, name)
SQLITE Incorrect Syntax Tried
insert into article_tags (article_id, tag_id ) values ( 2,7)
if not exists (select 1 from article_tags where article_id =2 AND tag_id=7)
I think the correct way to do this would be to add a primary key to the article_tags table, a composite one crossing both columns. That's the normal way to do many-to-many relationship tables.
In other words (pseudo-DDL):
create table article_tags (
article_id int references articles(id),
tag_id int references tag(id),
primary key (article_id, tag_id)
);
That way, insertion of a duplicate pair would fail rather than having to resort to if not exists.
This seems to work, although I will, as others have also said, suggest that you add some constraints on the table and then simply "try" to insert.
insert into article_tags
select 2, 7
where not exists ( select *
from article_tags
where article_id = 2
and tag_id = 7 )

SQLite Schema Information Metadata

I need to get column names and their tables in a SQLite database. What I need is a resultset with 2 columns: table_name | column_name.
In MySQL, I'm able to get this information with a SQL query on database INFORMATION_SCHEMA. However the SQLite offers table sqlite_master:
sqlite> create table students (id INTEGER, name TEXT);
sqlite> select * from sqlite_master;
table|students|students|2|CREATE TABLE students (id INTEGER, name TEXT)
which results a DDL construction query (CREATE TABLE) which is not helpful for me and I need to parse this to get relevant information.
I need to get list of tables and join them with columns or just get columns along with table name column. So PRAGMA table_info(TABLENAME) is not working for me since I don't have table name. I want to get all column metadata in the database.
Is there a better way to get that information as a result set by querying database?
You've basically named the solution in your question.
To get a list of tables (and views), query sqlite_master as in
SELECT name, sql FROM sqlite_master
WHERE type='table'
ORDER BY name;
(see the SQLite FAQ)
To get information about the columns in a specific table, use PRAGMA table_info(table-name); as explained in the SQLite PRAGMA documentation.
I don't know of any way to get tablename|columnname returned as the result of a single query. I don't believe SQLite supports this. Your best bet is probably to use the two methods together to return the information you're looking for - first get the list of tables using sqlite_master, then loop through them to get their columns using PRAGMA table_info().
Recent versions of SQLite allow you to select against PRAGMA results now, which makes this easy:
SELECT
m.name as table_name,
p.name as column_name
FROM
sqlite_master AS m
JOIN
pragma_table_info(m.name) AS p
ORDER BY
m.name,
p.cid
where p.cid holds the column order of the CREATE TABLE statement, zero-indexed.
David Garoutte answered this here, but this SQL should execute faster, and columns are ordered by the schema, not alphabetically.
Note that table_info also contains
type (the datatype, like integer or text),
notnull (1 if the column has a NOT NULL constraint)
dflt_value (NULL if no default value)
pk (1 if the column is the table's primary key, else 0)
RTFM: https://www.sqlite.org/pragma.html#pragma_table_info
There are ".tables" and ".schema [table_name]" commands which give kind of a separated version to the result you get from "select * from sqlite_master;"
There is also "pragma table_info([table_name]);" command to get a better result for parsing instead of a construction query:
sqlite> .tables
students
sqlite> .schema students
create table students(id INTEGER, name TEXT);
sqlite> pragma table_info(students);
0|id|INTEGER|0||0
1|name|TEXT|0||0
Hope, it helps to some extent...
Another useful trick is to first get all the table names from sqlite_master.
Then for each one, fire off a query "select * from t where 1 = 0". If you analyze the structure of the resulting query - depends on what language/api you're calling it from - you get a rich structure describing the columns.
In python
c = ...db.cursor()
c.execute("select * from t where 1=0");
c.fetchall();
print c.description;
Juraj
PS. I'm in the habit of using 'where 1=0' because the record limiting syntax seems to vary from db to db. Furthermore, a good database will optimize out this always-false clause.
The same effect, in SQLite, is achieved with 'limit 0'.
FYI, if you're using .Net you can use the DbConnection.GetSchema method to retrieve information that usually is in INFORMATION_SCHEMA. If you have an abstraction layer you can have the same code for all types of databases (NOTE that MySQL seems to swich the 1st 2 arguments of the restrictions array).
Try this sqlite table schema parser, I implemented the sqlite table parser for parsing the table definitions in PHP.
It returns the full definitions (unique, primary key, type, precision, not null, references, table constraints... etc)
https://github.com/maghead/sqlite-parser
The syntax follows sqlite create table statement syntax: http://www.sqlite.org/lang_createtable.html
This is an old question but because of the number of times it has been viewed we are adding to the question for the simple reason most of the answers tell you how to find the TABLE names in the SQLite Database
WHAT DO YOU DO WHEN THE TABLE NAME IS NOT IN THE DATABASE ?
This is happening to our app because we are creating TABLES programmatically
So the code below will deal with the issue when the TABLE is NOT in or created by the Database Enjoy
public void toPageTwo(View view){
if(etQuizTable.getText().toString().equals("")){
Toast.makeText(getApplicationContext(), "Enter Table Name\n\n"
+" OR"+"\n\nMake Table First", Toast.LENGTH_LONG
).show();
etQuizTable.requestFocus();
return;
}
NEW_TABLE = etQuizTable.getText().toString().trim();
db = dbHelper.getWritableDatabase();
ArrayList<String> arrTblNames = new ArrayList<>();
Cursor c = db.rawQuery("SELECT name FROM sqlite_master WHERE
type='table'", null);
if (c.moveToFirst()) {
while ( !c.isAfterLast() ) {
arrTblNames.add( c.getString( c.getColumnIndex("name")) );
c.moveToNext();
}
}
c.close();
db.close();
boolean matchFound = false;
for(int i=0;i<arrTblNames.size();i++) {
if(arrTblNames.get(i).equals(NEW_TABLE)) {
Intent intent = new Intent(ManageTables.this, TableCreate.class
);
startActivity( intent );
matchFound = true;
}
}
if (!matchFound) {
Toast.makeText(getApplicationContext(), "No Such Table\n\n"
+" OR"+"\n\nMake Table First", Toast.LENGTH_LONG
).show();
etQuizTable.requestFocus();
}
}

Resources