sqlalchemy self-referencing many-to-many with "select" as association table - sqlite

Problem description
I'm using sqlalchemy (v1.2) declarative, and I have a simple class Node with an id and a label. I would like to build a self-referencing many-to-many relationship where the association table is not a database table, but a dynamic select statement. This statement selects from two joined aliases of Node and returns rows of the form (left_id, right_id), defining the relationship. The code I have so far works if I access the relationship through an instance object, but when I try to filter by the relationship the joins are messed up.
The "classical" self-referential many-to-many relation
For reference, let's start with the example from the documentation on Self-Referential Many-to-Many Relationship, which uses an association table:
node_to_node = Table(
    "node_to_node", Base.metadata,
    Column("left_node_id", Integer, ForeignKey("node.id"), primary_key=True),
    Column("right_node_id", Integer, ForeignKey("node.id"), primary_key=True)
)

class Node(Base):
    __tablename__ = 'node'
    id = Column(Integer, primary_key=True)
    label = Column(String, unique=True)
    right_nodes = relationship(
        "Node",
        secondary=node_to_node,
        primaryjoin=id == node_to_node.c.left_node_id,
        secondaryjoin=id == node_to_node.c.right_node_id,
        backref="left_nodes"
    )

    def __repr__(self):
        return "Node(id={}, Label={})".format(self.id, self.label)
Joining Node to itself through this relationship:
>>> NodeAlias = aliased(Node)
>>> print(session.query(Node).join(NodeAlias, Node.right_nodes))
SELECT node.id AS node_id, node.label AS node_label
FROM node JOIN node_to_node AS node_to_node_1
ON node.id = node_to_node_1.left_node_id
JOIN node AS node_1
ON node_1.id = node_to_node_1.right_node_id
Everything looks fine.
The many-to-many relation through an association select statement
As an example we implement a relationship next_two_nodes which connects a node to the two nodes with id+1 and id+2 (if they exist). The complete code for testing.
Here is a function which generates the select statement for the "dynamic" association table:
_next_two_nodes = None

def next_two_nodes_select():
    global _next_two_nodes
    if _next_two_nodes is None:
        _leftside = aliased(Node, name="leftside")
        _rightside = aliased(Node, name="rightside")
        _next_two_nodes = select(
            [_leftside.id.label("left_node_id"),
             _rightside.id.label("right_node_id")]
        ).select_from(
            join(
                _leftside, _rightside,
                or_(
                    _leftside.id + 1 == _rightside.id,
                    _leftside.id + 2 == _rightside.id
                )
            )
        ).alias()
    return _next_two_nodes
Note that the function caches the result in a global variable, so that successive calls always return the same object instead of using new aliases. Here is my attempt to use this select in a relationship:
class Node(Base):
    __tablename__ = 'node'
    id = Column(Integer, primary_key=True)
    label = Column(String, unique=True)
    next_two_nodes = relationship(
        "Node", secondary=next_two_nodes_select,
        primaryjoin=(lambda: foreign(Node.id)
                     == remote(next_two_nodes_select().c.left_node_id)),
        secondaryjoin=(lambda: foreign(next_two_nodes_select().c.right_node_id)
                       == remote(Node.id)),
        backref="previous_two_nodes",
        viewonly=True
    )

    def __repr__(self):
        return "Node(id={}, Label={})".format(self.id, self.label)
Some test data:
nodes = [
    Node(id=1, label="Node1"),
    Node(id=2, label="Node2"),
    Node(id=3, label="Node3"),
    Node(id=4, label="Node4")
]
session.add_all(nodes)
session.commit()
Accessing the relationship through an instance works as expected:
>>> node = session.query(Node).filter_by(id=2).one()
>>> node.next_two_nodes
[Node(id=3, Label=Node3), Node(id=4, Label=Node4)]
>>> node.previous_two_nodes
[Node(id=1, Label=Node1)]
However, filtering on the relationship does not give the expected result:
>>> session.query(Node).join(NodeAlias, Node.next_two_nodes).filter(NodeAlias.id == 3).all()
[Node(id=1, Label=Node1),
Node(id=2, Label=Node2),
Node(id=3, Label=Node3),
Node(id=4, Label=Node4)]
I would expect only Node1 and Node2 to be returned. And indeed, the SQL statement of the join is wrong:
>>> print(session.query(Node).join(NodeAlias, Node.next_two_nodes))
SELECT node.id AS node_id, node.label AS node_label
FROM node JOIN (SELECT leftside.id AS left_node_id, rightside.id AS right_node_id
FROM node AS leftside JOIN node AS rightside
ON leftside.id + 1 = rightside.id OR leftside.id + 2 = rightside.id) AS anon_1
ON anon_1.left_node_id = anon_1.left_node_id
JOIN node AS node_1 ON anon_1.right_node_id = node_1.id
Comparing with the working example above, instead of ON anon_1.left_node_id = anon_1.left_node_id it should clearly read ON node.id = anon_1.left_node_id. My primaryjoin seems to be wrong, but I cannot figure out how to connect the last dots.

After more debugging I found that "clause adaptation" is replacing my ON clause. I'm not sure about the details, but for some reason SQLAlchemy thinks that I am referring to node.id from the select rather than from the original Node table. The only way I found to suppress clause adaptation was to select in text form:
select(
    [literal_column("leftside.id").label("left_node_id"),
     literal_column("rightside.id").label("right_node_id")]
)...
This way the columns are no longer tied to the Node entity, so the adaptation does not kick in, and filtering works as expected. It feels like a hack with unforeseeable side effects; maybe someone knows a cleaner way...
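For completeness, here is a self-contained sketch of that workaround as I understand it (written with the modern positional select() call; under v1.2 pass the columns as a list, as in the snippets above). The join is spelled out as a text fragment, so nothing in the subquery references the Node mapper and clause adaptation has nothing to adapt:

```python
from sqlalchemy import select, literal_column, text

_next_two_nodes = None

def next_two_nodes_select():
    global _next_two_nodes
    if _next_two_nodes is None:
        # Textual columns and join: not tied to the Node entity at all.
        _next_two_nodes = select(
            literal_column("leftside.id").label("left_node_id"),
            literal_column("rightside.id").label("right_node_id"),
        ).select_from(
            text(
                "node AS leftside JOIN node AS rightside "
                "ON leftside.id + 1 = rightside.id "
                "OR leftside.id + 2 = rightside.id"
            )
        ).alias()
    return _next_two_nodes
```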

Related

How to load joined table in SQLAlchemy where joined table does not provide foreign key(relationship) in ORM

I have tables like below
import sqlalchemy as sa

class A(Base):
    __tablename__ = 'a_table'
    id = sa.Column(sa.Integer, primary_key=True)
    name = sa.Column(sa.String)

class B(Base):
    __tablename__ = 'b_table'
    id = sa.Column(sa.Integer, primary_key=True)
    a_id = sa.Column(sa.Integer)
and this query:
# Basic query
query = sa.select(B).join(A, A.id == B.a_id)
result = await session.execute(query)
results = result.scalars().all()
How should I change it to get the desired result?
query = sa.select(B).join(A, A.id == B.a_id)
result = session.execute(query)
results = result.scalars().all()
# Problem
# SOME_KEY should be indicated in the query as a loading column
# SOME_KEY's type should be the A class
# I want the below:
results[0].SOME_KEY.name  # it should give the joined `A` entity's property value
I have read the documentation and looked at the loading techniques, but could not find a solution; they mostly cover relationships.
Arbitrary query with multiple objects per result
with Session(engine) as session:
    for (b, a) in session.execute(select(B, A).join(A, B.a_id == A.id)).all():
        print(b, a)
Relationship without ForeignKey
from sqlalchemy.orm import Session, declarative_base, aliased, relationship, remote, foreign

class A(Base):
    __tablename__ = 'a_table'
    id = Column(Integer, primary_key=True)
    name = Column(String)
    b_list = relationship('B', primaryjoin="remote(A.id) == foreign(B.a_id)", back_populates='a')

class B(Base):
    __tablename__ = 'b_table'
    id = Column(Integer, primary_key=True)
    a_id = Column(Integer)
    a = relationship('A', primaryjoin="remote(A.id) == foreign(B.a_id)", back_populates='b_list')

with Session(engine) as session:
    for (b,) in session.execute(select(B).join(B.a)).all():
        print(b, b.a_id, b.a, b.a.id, b in b.a.b_list)

can I search two sqlite fts5 indexes across two tables?

I have a tasks application with two tables. One table has the task name, date, owner etc and the other has the comments for the task linked to the task number so there can be multiple comments attached to a single task.
Both tables have FTS5 indexes. Within my app I want to search both tables for a word and present the rows to the user. I have the below working for each table individually but how do I construct a query that returns data from both FTS5 tables?
(python3.6)
c.execute("select * from task_list where task_list = ? ", [new_search])
c.execute("select * from comments where comments = ? ", [new_search])
Thanks @tomalak, I never thought of doing that; I was focused on the query. Here's what I came up with, and it works for my purposes. There are probably better ways to achieve the same result, but I'm a beginner. This is a Tkinter app.
def db_search():
    rows = ''
    conn = sqlite3.connect('task_list_database.db')
    c = conn.cursor()
    d = conn.cursor()
    new_search = entry7.get()
    c.execute("select * from task_list where task_list = ? ", [new_search])
    d.execute("select * from comments where comments = ? ", [new_search])
    rows1 = c.fetchall()
    rows2 = d.fetchall()
    rows = rows1 + rows2
    clear_tree(tree)
    for row in rows:
        tree.insert("", END, values=row)
    conn.close()
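The two result sets can also be fetched in a single query with UNION ALL over both FTS5 tables. A minimal sketch with in-memory data (the column names task_name, owner, task_no and comment are made up for illustration; MATCH is equivalent to the table-name = form used above, and a literal 'source' column tells the hits apart):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
c = conn.cursor()
# Two independent FTS5 tables, as in the question (columns invented here).
c.execute("CREATE VIRTUAL TABLE task_list USING fts5(task_name, owner)")
c.execute("CREATE VIRTUAL TABLE comments USING fts5(task_no, comment)")
c.execute("INSERT INTO task_list VALUES ('fix server', 'alice')")
c.execute("INSERT INTO comments VALUES ('1', 'server rebooted twice')")

new_search = "server"
rows = c.execute(
    "SELECT 'task' AS source, task_name AS hit FROM task_list WHERE task_list MATCH ? "
    "UNION ALL "
    "SELECT 'comment' AS source, comment AS hit FROM comments WHERE comments MATCH ?",
    [new_search, new_search],
).fetchall()
print(rows)  # one hit from each table, tagged with its source
conn.close()
```

Keeping one query means a single fetchall() and no list concatenation in Python.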

ASP.NET MVC 5 Join on Between, no primary key

I have some interesting code to look at.
I have three tables:
Table A has 4 columns:
TablePK
UserID
TableBFKID
Score
Table B has 3 columns:
TablePK
Name
ShortName
Table c has 4 columns:
TablePK
ScoreMin
ScoreMax
Modifier
So when the full join happens it looks like this:
SELECT B.ShortName
     , A.Score
     , C.Modifier
FROM TableA A
INNER JOIN TableB B ON A.TableBFKID = B.TablePK
INNER JOIN TableC C ON A.Score BETWEEN C.ScoreMin AND C.ScoreMax
The results would look like this:
ShortName, Score, Modifier. EX:
CHA, 19, 4
Now I know how to do an Entity Framework join if there is an actual PK or FK, or even if there is only a 0:1 relationship.
But how do you do the join when there is neither a PK nor an FK?
LINQ, including LINQ to Entities, only supports equi-joins.
But you can run SQL directly:
var res = myContext.Database.SqlQuery<TResult>("SQL goes here", parameters...);
and EF will map the columns of the result set to the properties of TResult (which needs no other connection to the context: it does not need to be an entity type with a DbSet-typed property in the context).
In this case I wouldn't try to join them; just use a sub-select in your LINQ to select from the un-joined table where the scores are between your wanted ranges.
var results = context.TableA
    .Select(a => new {
        score = a.Score, // add all needed columns
        tableCs = context.TableC.Where(c => a.Score >= c.ScoreMin && a.Score <= c.ScoreMax)
    });
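For what it's worth, the BETWEEN join itself is plain SQL and easy to sanity-check outside EF. A minimal sketch with Python's sqlite3, using the table and column names from the question (row values made up to reproduce the CHA, 19, 4 example):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
# Schemas as described in the question; no PK/FK constraints needed for the join.
cur.execute("CREATE TABLE TableA (TablePK INTEGER, UserID INTEGER, TableBFKID INTEGER, Score INTEGER)")
cur.execute("CREATE TABLE TableB (TablePK INTEGER, Name TEXT, ShortName TEXT)")
cur.execute("CREATE TABLE TableC (TablePK INTEGER, ScoreMin INTEGER, ScoreMax INTEGER, Modifier INTEGER)")
cur.execute("INSERT INTO TableA VALUES (1, 10, 1, 19)")
cur.execute("INSERT INTO TableB VALUES (1, 'Charisma', 'CHA')")
cur.execute("INSERT INTO TableC VALUES (1, 18, 19, 4)")

# Equi-join to B, range join to C: no key relationship required.
rows = cur.execute("""
    SELECT B.ShortName, A.Score, C.Modifier
    FROM TableA A
    JOIN TableB B ON A.TableBFKID = B.TablePK
    JOIN TableC C ON A.Score BETWEEN C.ScoreMin AND C.ScoreMax
""").fetchall()
print(rows)  # [('CHA', 19, 4)]
conn.close()
```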

Replace nulls by default values in oracle

Please consider the following Oracle beginner's case:
Table "X" contains customer data:
ID Variable_A Variable_B Variable_C Variable_D
--------------------------------------------------
1 100 null abc 2003/07/09
2 null 2 null null
Table "Dictionary" contains what we can regard as default values for customer data:
Variable_name Default_Value
----------------------------
Variable_A 50
Variable_B 0
Variable_C text
Variable_D sysdate
The goal is to examine a row in "X" by a given ID and replace null values with the default values from "Dictionary". The concrete question is about the optimal solution: for now my own solution loops with a MERGE INTO statement, which I think is not optimal. The code should also stay flexible, so it does not have to change when a new column is added to "X".
The direct way is to use
update X set
    variable_a = coalesce(variable_a, (select default_value from Dictionary where variable_name = 'Variable_A')),
    variable_b = coalesce(variable_b, (select default_value from Dictionary where variable_name = 'Variable_B')),
    ... and so on ...
Generally it should be fast enough.
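The pattern can be exercised on SQLite as a stand-in for Oracle (COALESCE and scalar subqueries behave the same way); a minimal sketch with two columns and the question's table names, sample values made up:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE X (ID INTEGER, Variable_A INTEGER, Variable_B INTEGER)")
cur.execute("CREATE TABLE Dictionary (Variable_name TEXT, Default_Value INTEGER)")
cur.executemany("INSERT INTO X VALUES (?, ?, ?)", [(1, 100, None), (2, None, 2)])
cur.executemany("INSERT INTO Dictionary VALUES (?, ?)",
                [("Variable_A", 50), ("Variable_B", 0)])

# Each null column falls back to its default via a scalar subquery.
cur.execute("""
    UPDATE X SET
      Variable_A = COALESCE(Variable_A,
          (SELECT Default_Value FROM Dictionary WHERE Variable_name = 'Variable_A')),
      Variable_B = COALESCE(Variable_B,
          (SELECT Default_Value FROM Dictionary WHERE Variable_name = 'Variable_B'))
""")
rows = cur.execute("SELECT * FROM X ORDER BY ID").fetchall()
print(rows)  # [(1, 100, 0), (2, 50, 2)]
conn.close()
```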
Since you don't know which fields of table X will be null, you should provide every row with every default value. And since each field of X may be a different data type, the Dictionary table should have each default value in a field of the appropriate type. Such a layout is shown in this fiddle.
A query which shows each row of X fully populated with either the value in X or its default becomes relatively simple.
select ID,
nvl( Var_A, da.Int_Val ) Var_A,
nvl( Var_B, db.Int_Val ) Var_B,
nvl( Var_C, dc.Txt_Val ) Var_C,
nvl( Var_D, dd.Date_Val ) Var_D
from X
join Dict da
on da.Name = 'VA'
join Dict db
on db.Name = 'VB'
join Dict dc
on dc.Name = 'VC'
join Dict dd
on dd.Name = 'VD';
Turning this into an Update statement is a little more complicated but is simple enough once you've used it a few times:
update X
set (Var_A, Var_B, Var_C, Var_D) =(
select nvl( Var_A, da.Int_Val ),
nvl( Var_B, db.Int_Val ),
nvl( Var_C, dc.Txt_Val ),
nvl( Var_D, dd.Date_Val )
from X InnerX
join Dict da
on da.Name = 'VA'
join Dict db
on db.Name = 'VB'
join Dict dc
on dc.Name = 'VC'
join Dict dd
on dd.Name = 'VD'
where InnerX.ID = X.ID )
where Var_A is null
   or Var_B is null
   or Var_C is null
   or Var_D is null;
There is a problem with this: the default for Date types is given as sysdate, which means it will show the date and time the default table was populated, not the date and time the Update was performed. This, I assume, is not what you want. You could try to make this all work using dynamic SQL, but that would be a lot more complicated; much too complicated for what you want to do here.
I see only two realistic options: either store a meaningful date as the default (9999-12-31, for example) or just know that every default for a date type will be sysdate and use that in your updates. That would be accomplished in the above Update just by changing one line:
nvl( Var_D, sysdate )
and getting rid of the last join.

SQL Multiple Row Subquery

I have a table Studies that I perform a SELECT on.
I then need to perform a further SELECT on the recordset returned. I've tried this (simplified for clarity):
SELECT * FROM Studies
WHERE Id = '2' OR Id = '3' OR Id = '7';
SELECT * FROM Studies
WHERE (Name = 'Test')
AND Id IN (SELECT * FROM Studies WHERE Id = '2' OR Id = '3' OR Id = '7');
But I keep getting the following SQL error:
Only a single result allowed for a SELECT that is part of an expression
Where am I going wrong? If it's not evident from my code - I am relatively new to database programming.
Thanks
You can't return more than one column in an IN (...) subquery, so you have to change the * (return all columns) to ID. But your query does not need a subquery at all; you can just add the IDs to the first query. You usually want to avoid subqueries where you can, for performance reasons.
SELECT *
FROM Studies
WHERE Name = 'Test'
  AND ID IN ('2', '3', '7')
Or if you want to keep your structure:
SELECT *
FROM Studies
WHERE (Name = 'Test')
AND ID IN (SELECT ID FROM Studies WHERE ID = '2' OR ID = '3' OR ID = '7');
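Both the flat IN list and the subquery form return the same rows; a quick sqlite3 check with made-up sample rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE Studies (Id TEXT, Name TEXT)")
cur.executemany("INSERT INTO Studies VALUES (?, ?)",
                [("2", "Test"), ("3", "Other"), ("7", "Test"), ("9", "Test")])

# Flat IN list.
flat = cur.execute(
    "SELECT * FROM Studies WHERE Name = 'Test' AND Id IN ('2', '3', '7') ORDER BY Id"
).fetchall()
# Subquery form, with * narrowed to the single Id column.
nested = cur.execute(
    "SELECT * FROM Studies WHERE Name = 'Test' "
    "AND Id IN (SELECT Id FROM Studies WHERE Id = '2' OR Id = '3' OR Id = '7') "
    "ORDER BY Id"
).fetchall()
print(flat)  # [('2', 'Test'), ('7', 'Test')]
assert flat == nested
conn.close()
```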
