How to get a hierarchical tree path in SQLite? - sqlite

Imagine a simple table that defines a tree structure.
create table nodes (
id integer primary key,
name text not null,
parent integer
)
Some example nodes:
id name parent
1 foo NULL
2 bar 1
3 baz 1
4 stuff 3
Node 1 is the parent of 2 and 3. Node 3 is the parent of 4. Is it possible to write a SQL query in SQLite that returns:
id path
1 foo
2 foo/bar
3 foo/baz
4 foo/baz/stuff

You can perform recursion in SQLite using recursive common table expressions.
An example query that would return the node paths:
with recursive paths(id, name, path) as (
    -- anchor: root nodes have no parent
    select id, name, name from nodes where parent is null
    union
    -- recursive step: append each child's name to its parent's path
    select nodes.id, nodes.name, paths.path || '/' || nodes.name
    from nodes join paths on nodes.parent = paths.id
)
select id, path from paths;
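To try the query end to end, here is a minimal sketch using Python's standard sqlite3 module. The sample rows are an assumption inferred from the expected output in the question, and `PATHS_SQL` is just an illustrative name. An `order by id` is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table nodes (
        id integer primary key,
        name text not null,
        parent integer
    );
    -- Sample rows inferred from the expected output in the question.
    insert into nodes (id, name, parent) values
        (1, 'foo', null),
        (2, 'bar', 1),
        (3, 'baz', 1),
        (4, 'stuff', 3);
""")

PATHS_SQL = """
    with recursive paths(id, name, path) as (
        select id, name, name from nodes where parent is null
        union
        select nodes.id, nodes.name, paths.path || '/' || nodes.name
        from nodes join paths on nodes.parent = paths.id
    )
    select id, path from paths order by id
"""

rows = conn.execute(PATHS_SQL).fetchall()
for node_id, path in rows:
    print(node_id, path)
```

With the sample rows above this prints the four id/path pairs from the question, ending with `4 foo/baz/stuff`.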

Related

What is the best way to design a tag-based data table with Sqlite?

Json received from the server has this form.
[
{
"id": 1103333,
"name": "James",
"tagA": [
"apple",
"orange",
"grape"
],
"tagB": [
"red",
"green",
"blue"
],
"tagC": null
},
{
"id": 1103336,
"name": "John",
"tagA": [
"apple",
"pinapple",
"melon"
],
"tagB": [
"black",
"white",
"blue"
],
"tagC": [
"London",
"New York"
]
}
]
An object can have multiple tags, and a tag can be associated with multiple objects.
In this list, I want to find an object whose tagA is apple or grape and tagB is black.
This is the first table I used to write.
create table response(id integer primary key, name text not null, tagA text,
tagB text, tagC text)
select * from response where (tagA like '%apple%' or tagA like '%grape%') and (tagB like '%black%')
The problem with this table design is that searching is very slow, because FTS functionality is not fully supported when using an ORM library such as Room.
The next thing I thought about was to create a table for each tag.
create table response(id integer primary key, name text not null)
create table tagA(objectID integer, value text, primary key(objectID, value))
create table tagB(objectID integer, value text, primary key(objectID, value))
create table tagC(objectID integer, value text, primary key(objectID, value))
select * from response where id in (
select objectID from tagA where value in ('apple','grape')
intersect
select objectID from tagB where value in ('black'))
This greatly increases the insertion time and the size of the APK (roughly twice as much per additional table), and the search speed is still far behind that of an FTS virtual table.
I would like to avoid FTS tables as much as possible, because they leave more things for me to manage myself.
I have probably missed something (indexes, etc.), but I cannot figure out what.
How can I optimize the database without using the FTS method?
You could use a reference table (aka mapping table along with a multitude of other names) to allow a many-many relationship between tags (single table for all) and objects (again single table).
So you have the objects table, with an id for each object, and the tags table, again with an id for each tag. So something like :-
DROP TABLE IF EXISTS object_table;
CREATE TABLE IF NOT EXISTS object_table (id INTEGER PRIMARY KEY, object_name);
DROP TABLE IF EXISTS tag_table;
CREATE TABLE IF NOT EXISTS tag_table (id INTEGER PRIMARY KEY, tag_name);
You'd populate both e.g.
INSERT INTO object_table (object_name) VALUES
('Object1'),('Object2'),('Object3'),('Object4');
INSERT INTO tag_table (tag_name) VALUES
('Apple'),('Orange'),('Grape'),('Pineapple'),('Melon'),
('London'),('New York'),('Paris'),
('Red'),('Green'),('Blue'); -- and so on
Then you'd have the mapping table, something like :-
DROP TABLE IF EXISTS object_tag_mapping;
CREATE TABLE IF NOT EXISTS object_tag_mapping (object_reference INTEGER, tag_reference INTEGER);
Over time, as tags are assigned to objects or vice versa, you add the mappings e.g. :-
INSERT INTO object_tag_mapping VALUES
(1,4), -- obj1 has tag Pineapple
(1,1), -- obj1 has Apple
(1,8), -- obj1 has Paris
(1,10), -- obj1 has green
(4,1),(4,3),(4,11), -- some tags for object 4
(2,8),(2,7),(2,4), -- some tags for object 2
(3,1),(3,2),(3,3),(3,4),(3,5),(3,6),(3,7),(3,8),(3,9),(3,10),(3,11); -- all tags for object 3
You could then have queries such as :-
SELECT object_name,
group_concat(tag_name,' ~ ') AS tags_for_this_object
FROM object_tag_mapping
JOIN object_table ON object_reference = object_table.id
JOIN tag_table ON tag_reference = tag_table.id
GROUP BY object_name
;
group_concat is an aggregate function (applied per GROUP) that concatenates all values found for the specified column with (optional) separator.
The result of the query being :-
The following could be a search based upon tags (not that you'd likely use both tag_name and a tag_reference) :-
SELECT object_name, tag_name
FROM object_tag_mapping
JOIN object_table ON object_reference = object_table.id
JOIN tag_table ON tag_reference = tag_table.id
WHERE tag_name = 'Pineapple' OR tag_reference = 9
;
This would result in :-
Note this is a simple overview e.g. you may want to consider having the mapping table as a WITHOUT ROWID table, perhaps have a composite UNIQUE constraint.
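As a rough illustration of that note, the sketch below declares the mapping table WITHOUT ROWID with a composite primary key, so each (object, tag) pair can be stored only once. The index name `idx_mapping_by_tag`, the helper `add_mapping`, and the reduced sample data are assumptions made for this example, not part of the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE object_table (id INTEGER PRIMARY KEY, object_name TEXT);
    CREATE TABLE tag_table (id INTEGER PRIMARY KEY, tag_name TEXT UNIQUE);
    -- Composite primary key: each (object, tag) pair can exist only once.
    CREATE TABLE object_tag_mapping (
        object_reference INTEGER NOT NULL REFERENCES object_table(id),
        tag_reference INTEGER NOT NULL REFERENCES tag_table(id),
        PRIMARY KEY (object_reference, tag_reference)
    ) WITHOUT ROWID;
    -- Hypothetical extra index for lookups that start from a tag.
    CREATE INDEX idx_mapping_by_tag
        ON object_tag_mapping (tag_reference, object_reference);
    INSERT INTO object_table (object_name) VALUES ('Object1'), ('Object2');
    INSERT INTO tag_table (tag_name) VALUES ('Apple'), ('Orange');
    INSERT INTO object_tag_mapping VALUES (1, 1), (2, 1), (2, 2);
""")

def add_mapping(object_id, tag_id):
    """Return True if the mapping was added, False if it already existed."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO object_tag_mapping VALUES (?, ?)",
                (object_id, tag_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False

duplicate_added = add_mapping(1, 1)  # pair already present
new_added = add_mapping(1, 2)        # new pair

objects_with_orange = [row[0] for row in conn.execute("""
    SELECT object_name
    FROM object_tag_mapping
    JOIN object_table ON object_reference = object_table.id
    JOIN tag_table ON tag_reference = tag_table.id
    WHERE tag_name = 'Orange'
    ORDER BY object_name
""")]
```

Whether the extra index pays off depends on the data and the queries, so it is worth checking with EXPLAIN QUERY PLAN before keeping it.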
Additional re comment :-
How do I implement a query that contains two or more tags at the same time?
This is a little more complex if you want specific tags but still doable. Here's an example using a CTE (Common Table Expression) along with a HAVING clause (a where clause applied after the output has been generated, so can be applied to aggregates) :-
WITH cte1(otm_oref,otm_tref,tt_id,tt_name, ot_id, ot_name) AS
(
SELECT * FROM object_tag_mapping
JOIN tag_table ON tag_reference = tag_table.id
JOIN object_table ON object_reference = object_table.id
WHERE tag_name = 'Pineapple' OR tag_name = 'Apple'
)
SELECT ot_name, group_concat(tt_name), count() AS cnt FROM CTE1
GROUP BY otm_oref
HAVING cnt = 2
;
This results in :-
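For experimenting with this kind of "has all of these tags" search, here is a small sketch using Python's sqlite3 module and the sample data from above. It is a variation on the CTE query, not the same statement: it filters with IN and compares count(DISTINCT tag_name) against the number of requested tags. The function name `objects_with_all_tags` is made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE object_table (id INTEGER PRIMARY KEY, object_name);
    CREATE TABLE tag_table (id INTEGER PRIMARY KEY, tag_name);
    CREATE TABLE object_tag_mapping (object_reference INTEGER, tag_reference INTEGER);
    INSERT INTO object_table (object_name) VALUES
        ('Object1'),('Object2'),('Object3'),('Object4');
    INSERT INTO tag_table (tag_name) VALUES
        ('Apple'),('Orange'),('Grape'),('Pineapple'),('Melon'),
        ('London'),('New York'),('Paris'),
        ('Red'),('Green'),('Blue');
    INSERT INTO object_tag_mapping VALUES
        (1,4),(1,1),(1,8),(1,10),
        (4,1),(4,3),(4,11),
        (2,8),(2,7),(2,4),
        (3,1),(3,2),(3,3),(3,4),(3,5),(3,6),(3,7),(3,8),(3,9),(3,10),(3,11);
""")

def objects_with_all_tags(tags):
    """Return names of objects that carry every tag in `tags`."""
    placeholders = ",".join("?" for _ in tags)
    sql = f"""
        SELECT object_name
        FROM object_tag_mapping
        JOIN object_table ON object_reference = object_table.id
        JOIN tag_table ON tag_reference = tag_table.id
        WHERE tag_name IN ({placeholders})
        GROUP BY object_table.id
        -- an object qualifies only if all requested tags matched
        HAVING count(DISTINCT tag_name) = ?
        ORDER BY object_name
    """
    return [row[0] for row in conn.execute(sql, (*tags, len(tags)))]

both = objects_with_all_tags(["Apple", "Pineapple"])
print(both)
```

With the mappings above, only Object1 and Object3 carry both Apple and Pineapple, so the call prints `['Object1', 'Object3']`.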

How to check if a parent/child relationship exists in a tree?

Checking if the following tables have a certain relationship among their records would be useful:
-- Table: privilege_group
CREATE TABLE privilege_group (
privilege_group_id integer NOT NULL CONSTRAINT privilege_group_pk PRIMARY KEY AUTOINCREMENT,
name text NOT NULL,
CONSTRAINT privilege_group_name UNIQUE (name)
);
-- Table: privilege_relationship
CREATE TABLE privilege_relationship (
privilege_relationship_id integer NOT NULL CONSTRAINT privilege_relationship_pk PRIMARY KEY AUTOINCREMENT,
parent_id integer NOT NULL,
child_id integer NOT NULL,
CONSTRAINT privilege_relationship_parent_child UNIQUE (parent_id, child_id),
CONSTRAINT privilege_relationship_parent_id FOREIGN KEY (parent_id)
REFERENCES privilege_group (privilege_group_id),
CONSTRAINT privilege_relationship_child_id FOREIGN KEY (child_id)
REFERENCES privilege_group (privilege_group_id),
CONSTRAINT privilege_relationship_check CHECK (parent_id != child_id)
);
Parents can have many children, children can have many parents. Writing code to process records outside of the database is always possible, but is it possible to use a depth-first (or breadth-first) search to check if a child has a particular parent?
My related question received a comment from CL. that mentions the WITH clause, but my experience with hierarchical queries is rather limited and insufficient to understand, select, and apply the examples on the page to my goal:
Only worked with hierarchical queries in Oracle.
Only used to implement "range" number generators (like in Python).
Only seen how to process records in a broad-to-narrow pattern.
Not sure if an expanding result set in a hierarchical query is possible.
Unsure of how to select a depth-first or breadth-first search strategy.
Could someone show me how to find out if a child has a parent if the names of both are known?
This is a standard tree search (using UNION instead of UNION ALL to prevent infinite loops):
WITH RECURSIVE ParentsOfG1(id) AS (
SELECT privilege_group_id
FROM privilege_group
WHERE name = 'G1'
UNION
SELECT parent_id
FROM privilege_relationship
JOIN ParentsOfG1 ON id = child_id
)
SELECT id
FROM ParentsOfG1
WHERE id = (SELECT privilege_group_id
FROM privilege_group
WHERE name = 'P2');
Depth/breadth-first does not matter for this.
An alternative to CL.'s answer could be this query which has been reformatted and adjusted to use bound parameters that could be plugged into a project that needs to check certain relationships:
WITH RECURSIVE parent_of_child(id)
AS (
SELECT privilege_group_id
FROM privilege_group
WHERE name = :child
UNION
SELECT parent_id
FROM privilege_relationship
JOIN parent_of_child
ON id = child_id)
SELECT id
FROM parent_of_child
WHERE id = (
SELECT privilege_group_id
FROM privilege_group
WHERE name = :parent)
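As one way to plug the parameterised query into a project, the sketch below runs it through Python's sqlite3 module, which accepts named parameters as a dict. The schema is trimmed to the columns the query needs, and the sample groups (P2 is a parent of P1, P1 is a parent of G1, G2 is unrelated) and the function name `has_ancestor` are assumptions for illustration. The query is wrapped in SELECT EXISTS so it returns a single 0 or 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE privilege_group (
        privilege_group_id integer NOT NULL PRIMARY KEY AUTOINCREMENT,
        name text NOT NULL UNIQUE
    );
    CREATE TABLE privilege_relationship (
        parent_id integer NOT NULL REFERENCES privilege_group (privilege_group_id),
        child_id integer NOT NULL REFERENCES privilege_group (privilege_group_id),
        UNIQUE (parent_id, child_id),
        CHECK (parent_id != child_id)
    );
    -- Hypothetical sample data: ids 1..4 are assigned in insertion order.
    INSERT INTO privilege_group (name) VALUES ('P1'), ('P2'), ('G1'), ('G2');
    INSERT INTO privilege_relationship (parent_id, child_id) VALUES
        (2, 1),  -- P2 is a parent of P1
        (1, 3);  -- P1 is a parent of G1
""")

ANCESTOR_SQL = """
    WITH RECURSIVE parent_of_child(id) AS (
        SELECT privilege_group_id FROM privilege_group WHERE name = :child
        UNION
        SELECT parent_id
        FROM privilege_relationship
        JOIN parent_of_child ON id = child_id
    )
    SELECT EXISTS (
        SELECT 1 FROM parent_of_child
        WHERE id = (SELECT privilege_group_id
                    FROM privilege_group
                    WHERE name = :parent)
    )
"""

def has_ancestor(child, parent):
    """True if `parent` is reachable from `child` by following parent links."""
    params = {"child": child, "parent": parent}
    (found,) = conn.execute(ANCESTOR_SQL, params).fetchone()
    return bool(found)

print(has_ancestor("G1", "P2"))  # reachable via P1
print(has_ancestor("G1", "G2"))  # unrelated group
```

Because the recursion starts from the child itself, `has_ancestor("G1", "G1")` is also true; add a check for equal names if a group should not count as its own parent.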

Exclude parent vertex from results

I have a simple graph with one parent and three children:
Querying for the children, I also get back the parent:
select name
from (
traverse in()
from (
select
from group
where name = 'Parent'
)
)
Results:
name
Parent
Child 1
Child 2
Child 3
How can I exclude the parent from the results in the query? I'd rather not process the results in my application code.
Thanks.
To get only the children name, I suggest a query like this:
select in('belongsTo').name as Name from Group where name = "Parent" unwind Name
Excluding where depth is zero seems to do the trick:
select name
from (
traverse in()
from (
select
from group
where name = 'Parent'
)
)
where $depth > 0
Results in:
name
Child 1
Child 2
Child 3

start with prior reference on last select

I have a problem with an insert that is based on a select query.
I have a table in the database with a parent-child relationship that looks like the following:
A
B
C
G
L
F
C
G
L
Notice how element C is reused: it is available twice, each time with a different parent id. Element G, however, exists only once, since the id of C is the same in both cases. The select prints everything as expected with the following query:
select id,
parent_id,
label
from table
start with parent_id is null
connect by nocycle prior id = parent_id
order siblings by sort
I have around 2500 elements in this table, but in the end around 4000 are displayed, because a few elements should be displayed multiple times at different places.
So, to identify both the first and the second G as unique elements, I have written the following insert statement:
insert into other_tale (id, parent_id, label)
select create_id new_id,
prior ???,
label
from table
start with parent_id is null
connect by nocycle prior id = parent_id
order siblings by sort;
Here I am calling a procedure to generate a new id for each row that has been found. Now I am stuck at the part where I need to receive the new id of the parent element. I know that I can refer to the prior parent row in the table being selected, but am I able to somehow refer to the column new_id of the parent element in the select?
Create a package with one associative array, id_cache, and two functions, f_clear_cache and f_generate_id.
f_clear_cache deletes the cached ids: id_cache.delete.
f_generate_id takes an id as its argument and returns the new_id:
check whether the new_id was already generated: id_cache.exists(id)
if not, generate the new_id and cache it: id_cache(id) := new_id
return the new_id: return id_cache(id)
Finally, use the function in your SQL statement:
insert into other_tale (id, parent_id, label)
select my_package.f_generate_id(id),
my_package.f_generate_id(parent_id),
label
...
Note: do not forget to call f_clear_cache when you want to generate a new set of ids within the same session.
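The caching logic itself is language independent. The following is a Python sketch of the idea only, not the PL/SQL package: the class name `IdCache`, the starting value 1000, the sample rows, and the handling of a missing parent (None stays None) are all assumptions made for illustration.

```python
import itertools

class IdCache:
    """Hands out one new id per original id and remembers the mapping."""

    def __init__(self, start=1000):
        self._next_id = itertools.count(start)  # stand-in for a sequence
        self._cache = {}

    def clear(self):
        """Forget all mappings (the role of f_clear_cache)."""
        self._cache.clear()

    def generate_id(self, old_id):
        """Return the new id for old_id (the role of f_generate_id)."""
        if old_id is None:  # assumption: root rows keep a null parent
            return None
        if old_id not in self._cache:
            self._cache[old_id] = next(self._next_id)
        return self._cache[old_id]

# Hypothetical rows in the order a hierarchical query might return them:
# (id, parent_id, label). Element 3 appears under two different parents.
rows = [(1, None, "A"), (2, 1, "B"), (3, 2, "C"), (4, 1, "F"), (3, 4, "C")]

cache = IdCache()
remapped = [
    (cache.generate_id(row_id), cache.generate_id(parent_id), label)
    for row_id, parent_id, label in rows
]
for row in remapped:
    print(row)
```

Note that the cache is keyed by the original id, so an element reached through two different parents still receives a single new id (1002 for both C rows here). If each occurrence needs its own id, the key would have to include something path-specific, which the answer above does not cover.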

Please help me with this traversing query

CatID parID catName
1 -1 A
2 1 B
3 2 C
4 3 D
I want to write a query which returns the parent child relationship in string format.
In the above table, A has parID -1, which means it has no parent. B has parID 1, which means that A is its parent.
So finally the string should look like this:
A=>B=>C=>D
This is the way I want to generate a query.
I'll pass CatID, and it will traverse until it gets a -1.
declare @CatID int;
set @CatID = 4;
with C as
(
    -- anchor: start at the requested category
    select parID,
           cast(catName as varchar(max)) as catName
    from YourTable
    where CatID = @CatID
    union all
    -- recursive step: prepend the parent's name
    select T.parID,
           T.catName + '=>' + C.catName
    from YourTable as T
      inner join C
        on T.CatID = C.parID
)
select catName
from C
where parID = -1
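The same approach can be tried without a SQL Server instance. The sketch below is an SQLite adaptation run through Python's sqlite3 module, so it uses `||` for concatenation and a named parameter instead of a declared variable. It is not the T-SQL query itself, and the function name `breadcrumb` is made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table YourTable (CatID integer primary key, parID integer, catName text);
    insert into YourTable values (1, -1, 'A'), (2, 1, 'B'), (3, 2, 'C'), (4, 3, 'D');
""")

BREADCRUMB_SQL = """
    with recursive C(parID, catName) as (
        -- anchor: start at the requested category
        select parID, catName from YourTable where CatID = :cat_id
        union all
        -- recursive step: prepend the parent's name
        select T.parID, T.catName || '=>' || C.catName
        from YourTable as T
        join C on T.CatID = C.parID
    )
    select catName from C where parID = -1
"""

def breadcrumb(cat_id):
    """Return the path from the root to cat_id, or None if cat_id is unknown."""
    row = conn.execute(BREADCRUMB_SQL, {"cat_id": cat_id}).fetchone()
    return row[0] if row else None

print(breadcrumb(4))
```

For the sample table this prints `A=>B=>C=>D`.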
As a partial answer, it sounds like you need a recursive query. Here is a StackOverflow thread with some good information on recursive queries. As to how to use a query to turn it into a single string, I don't know... that part may be more optimized for a programming language.
You need to define a function and then call it in a recursive loop.
You can use MPTT (Modified Preorder Tree Traversal) to store nested tree or hierarchical data.
This article describes how to get a hierarchical "breadcrumb" within a single query.

Resources