Is it possible to update a table by adding values from other two tables in SQLite - sqlite

I am trying out SQLite and encountered a problem. There are 3 Tables A, B, and C.
I want to update Table A with a value computed from B and C (in the example below, B minus C).
Table A.
James null.
Table B.
James 5.
Table C
James 2
so with the update, I want table A to have
James 3. (5-2)
Thank You

SQLite does not support joins in an UPDATE statement (versions before 3.33.0 have no UPDATE ... FROM), so you can do it by accessing the corresponding rows of tables B and C directly with correlated subqueries, like this:
update A
set value =
(select value from B where name = A.name) -
(select value from C where name = A.name)
If you want to update only the row with name = 'James' then add:
where name = 'James'
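The statement above can be tried end to end from Python's built-in sqlite3 module. This is a minimal sketch, assuming the columns are called name and value (the question does not give column names) and using an in-memory database:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (name TEXT, value INTEGER);
    CREATE TABLE B (name TEXT, value INTEGER);
    CREATE TABLE C (name TEXT, value INTEGER);
    INSERT INTO A VALUES ('James', NULL);
    INSERT INTO B VALUES ('James', 5);
    INSERT INTO C VALUES ('James', 2);
""")

# Each subquery is correlated: it looks up the row whose name matches
# the row of A that is currently being updated.
conn.execute("""
    UPDATE A
    SET value = (SELECT value FROM B WHERE B.name = A.name)
              - (SELECT value FROM C WHERE C.name = A.name)
    WHERE name = ?
""", ("James",))

result = conn.execute("SELECT name, value FROM A").fetchall()
print(result)  # [('James', 3)]
```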

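This uses only standard correlated subqueries, so it should work in most databases. Note that it adds the two sums; use - instead of + to get the 5 - 2 result from the question: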
UPDATE
"A"
SET
"x" =
(
SELECT
SUM("x")
FROM "B"
WHERE "B"."id"="A"."id"
) +
(
SELECT
SUM("x")
FROM "C"
WHERE "C"."id"="A"."id"
)

I believe the following demonstrates that Yes you can:-
DROP TABLE IF EXISTS ta;
DROP TABLE IF EXISTS tb;
DROP TABLE IF EXISTS tc;
CREATE TABLE IF NOT EXISTS ta (name TEXT, numb INTEGER);
CREATE TABLE IF NOT EXISTS tb (name TEXT, numb INTEGER);
CREATE TABLE IF NOT EXISTS tc (name TEXT, numb INTEGER);
INSERT INTO ta VALUES ('JAMES',null),('Mary',100);
INSERT INTO tb VALUES ('JAMES',5),('Sue',33);
INSERT INTO tc VALUES ('JAMES',2),('Anne',45);
UPDATE ta SET numb =
(SELECT sum(numb) FROM tb WHERE name = 'JAMES')
-
(SELECT sum(numb) FROM tc WHERE name = 'JAMES')
WHERE name = 'JAMES';
SELECT * FROM ta;
SELECT * FROM tb;
SELECT * FROM tc;
This :-
Drops the tables if they exist allowing it to be rerun (simplifies modifications if need be).
column names name and numb have been assumed as they weren't given.
Creates the 3 tables (note table names used for the demo are ta, tb and tc)
Adds some data (note that additional rows have been added to show, after a fashion, how rows are distinguished)
Updates column numb of table A (ta) where the name column has a value of JAMES according to the sum of the numb column from all rows with the same name (JAMES) from table tb minus the sum of the numb column from all rows with the same name (JAMES) from table tc
This may not be exactly what you want, as it assumes that you want to sum all rows with the same name per table (tb and tc)
Queries all the tables (first is shown below as that is the table that has been updated.)
The first result showing that the row has been updated from null to 3 (5 - 2) and that the row for Mary has remained as it was :-
The following change to the UPDATE takes the name from the ta row being updated instead of hard-coding 'JAMES' in each subquery (the hard-coded version above is perhaps easier to follow):
UPDATE ta SET numb = (SELECT sum(numb) FROM tb WHERE name = ta.name) - (SELECT sum(numb) FROM tc WHERE name = ta.name) WHERE name = 'JAMES';
Note that should there not be an associated row (i.e. with the same name) in either tb or tc then the result will be null (whether or not sum is used).
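If a missing row in tb or tc should count as 0 rather than turn the whole result into null, each subquery can be wrapped in COALESCE. This is a sketch of that variation, not part of the answer above; the extra Mary row in tb is made-up demo data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ta (name TEXT, numb INTEGER);
    CREATE TABLE tb (name TEXT, numb INTEGER);
    CREATE TABLE tc (name TEXT, numb INTEGER);
    INSERT INTO ta VALUES ('JAMES', NULL), ('Mary', 100);
    INSERT INTO tb VALUES ('JAMES', 5), ('Mary', 7);
    INSERT INTO tc VALUES ('JAMES', 2);   -- deliberately no row for Mary
""")

# sum() over zero rows yields NULL, and NULL - x is NULL,
# so COALESCE substitutes 0 for a missing side.
conn.execute("""
    UPDATE ta SET numb =
        COALESCE((SELECT sum(numb) FROM tb WHERE name = ta.name), 0)
      - COALESCE((SELECT sum(numb) FROM tc WHERE name = ta.name), 0)
""")

totals = dict(conn.execute("SELECT name, numb FROM ta"))
print(totals)  # {'JAMES': 3, 'Mary': 7}
```

Without the COALESCE calls, Mary's value would have become null because tc has no row for her.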

Related

SQLite How to Join to an extra Table

In this SQLite example, I am selecting rows from tables a, b, and c using the common column 'aa'.
SELECT
a.aa,
a.ab,
a.ac,
b.ba,
b.bb,
b.bc,
c.ca,
c.cb,
c.cc
FROM
a
INNER JOIN b ON b.ba = a.aa
INNER JOIN c ON c.ca = a.aa
WHERE a.ab = 'blahblah'
This works OK. Now, I need to add an extra table and an extra JOIN. Table 'd' has a column 'd.dc' that is common with table 'c' and its column 'c.cc'.
When the correct row is selected in 'c', I want to be able to read the data in 'd.dd'.
SELECT
a.aa,
a.ab,
a.ac,
b.ba,
b.bb,
b.bc,
c.ca,
c.cb,
c.cc,
d.dc,
d.dd
FROM
a
INNER JOIN b ON b.ba = a.aa
INNER JOIN c ON c.ca = a.aa
INNER JOIN d ON d.dc = c.cc
WHERE a.ab = 'blahblah'
This does not work OK. Please can you tell me how to correct it?
I have also tried
FOREIGN KEY (cc) REFERENCES d(dc)
in the table 'c' definition, but it makes no apparent difference.
Here are my table definitions:
CREATE TABLE `a` ( `aa` TEXT PRIMARY KEY UNIQUE, `ab` TEXT, `ac` TEXT );
CREATE TABLE `b` ( `ba` TEXT PRIMARY KEY UNIQUE, `bb` TEXT, `bc` TEXT );
CREATE TABLE `c` ( `ca` TEXT PRIMARY KEY UNIQUE, `cb` TEXT, `cc` TEXT, FOREIGN KEY (cc) REFERENCES d(dc) );
CREATE TABLE `d` ( `dc` TEXT PRIMARY KEY UNIQUE, `dd` TEXT );
The strange results that I got were rather hard to describe, but one thing I noticed was that the only few rows returned were those where c.cc all had the same value, whereas in fact there should have been many more rows, and c.cc should have had a variety of values.
Perhaps the following will explain how to JOIN using the SQL (but this is really just tailored to meet the rules):-
DROP TABLE IF EXISTS a;
DROP TABLE IF EXISTS b;
DROP TABLE IF EXISTS c;
DROP TABLE IF EXISTS d;
CREATE TABLE `a` ( `aa` TEXT PRIMARY KEY UNIQUE, `ab` TEXT, `ac` TEXT );
CREATE TABLE `b` ( `ba` TEXT PRIMARY KEY UNIQUE, `bb` TEXT, `bc` TEXT );
CREATE TABLE `c` ( `ca` TEXT PRIMARY KEY UNIQUE, `cb` TEXT, `cc` TEXT, FOREIGN KEY (cc) REFERENCES d(dc) );
CREATE TABLE `d` ( `dc` TEXT PRIMARY KEY UNIQUE, `dd` TEXT );
INSERT INTO d VALUES('blah_c_cc_001','blahblah_d_dd'); -- MUST BE INSERTED BEFORE C else FK CONFLICT
INSERT INTO a VALUES('blah_a_aa_001','blahblah','blahblah_a_ac');
INSERT INTO b VALUES('blah_a_aa_001','blahblah_b_bb','blahblah_b_bc');
INSERT INTO c VALUES('blah_a_aa_001','blahblah_c_cb','blah_c_cc_001');
INSERT INTO d VALUES('blah_c_cc_002','blahblah_d_dd'); -- MUST BE INSERTED BEFORE C else FK CONFLICT
INSERT INTO a VALUES('blah_a_aa_002','blahblah','blahblah_a_ac');
INSERT INTO b VALUES('blah_a_aa_002','blahblah_b_bb','blahblah_b_bc');
INSERT INTO c VALUES('blah_a_aa_002','blahblah_c_cb','blah_c_cc_002');
INSERT INTO d VALUES('blah_c_cc_004','blahblah_d_dd');
INSERT INTO a VALUES('blah_a_aa_003','blahblah','blahblah_a_ac');
INSERT INTO b VALUES('blah_a_aa_003','blahblah_b_bb','blahblah_b_bc');
INSERT INTO c VALUES('blah_a_aa_003','blahblah_c_cb','blah_c_cc_004');
SELECT *
FROM
a
INNER JOIN b ON b.ba = a.aa
INNER JOIN c ON c.ca = a.aa
INNER JOIN d ON d.dc = c.cc
WHERE a.ab = 'blahblah'
;
Note: * is used for brevity.
This results in :-
Basically the rule is that for an INNER (normal/simple) JOIN there must be matched rows, so in your query the following applies:
TABLE b must have a value in column ba that matches the aa column in table a as well as the matching row in table a having a value of blahblah in column ab
AND
Table c must have a value in column ca that matches the aa column in table a as well as the matching row in table a having a value of blahblah
AND
Table d must have a value in column dc that matches the cc column in table c AND the matching row in table c must itself match a row in table a that has a value of blahblah in column ab
The FOREIGN KEY has no impact on the SELECT query, other than it restricting the insertion of a row in table c in that the cc column being inserted must match the value of column dc in one of the rows in table d.
OK, I found my problem. The original code was OK, but the problem was in the external data that I was using. It wasn't matching because I had not bothered to TRIM it before loading it to my database.
My strange results occurred because if Table 'd' did not have a value in column dc that matches the cc column in table c, then no row at all was returned. I had assumed that if Table 'd' did not have a value in column dc that matches the cc column in table c, then the SELECT d.dd would just return an empty result, whereas in fact it blocked all output.
Message to Self: Always TRIM your data. Also, when monitoring results, always surround with quote-marks, so you can see the extra whitespace.
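The behaviour described above can be reproduced in a few lines. The sketch below uses made-up two-row tables (the keys and values are hypothetical) and shows three things: an INNER JOIN silently drops the row whose key has trailing whitespace, a LEFT JOIN keeps it with a null d.dd, and trim() in the join condition recovers the match:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE c (ca TEXT PRIMARY KEY, cb TEXT, cc TEXT);
    CREATE TABLE d (dc TEXT PRIMARY KEY, dd TEXT);
    INSERT INTO c VALUES ('k1', 'cb1', 'x');
    INSERT INTO c VALUES ('k2', 'cb2', 'y ');  -- trailing space, as in untrimmed data
    INSERT INTO d VALUES ('x', 'dd_x');
    INSERT INTO d VALUES ('y', 'dd_y');
""")

# INNER JOIN: rows of c without a match in d disappear entirely.
inner = conn.execute(
    "SELECT c.ca, d.dd FROM c INNER JOIN d ON d.dc = c.cc ORDER BY c.ca"
).fetchall()

# LEFT JOIN: every row of c is kept; d.dd is NULL where nothing matches.
left = conn.execute(
    "SELECT c.ca, d.dd FROM c LEFT JOIN d ON d.dc = c.cc ORDER BY c.ca"
).fetchall()

# Trimming the key restores the match.
trimmed = conn.execute(
    "SELECT c.ca, d.dd FROM c INNER JOIN d ON d.dc = trim(c.cc) ORDER BY c.ca"
).fetchall()

print(inner)    # [('k1', 'dd_x')]
print(left)     # [('k1', 'dd_x'), ('k2', None)]
print(trimmed)  # [('k1', 'dd_x'), ('k2', 'dd_y')]
```

Trimming once at load time is usually preferable to trimming in every join condition.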

General Sqlite foreign keys explanation needed

There is an overwhelming chance that this might be an incredibly stupid question, so bear with me :)
I have over the last couple of weeks been learning and implementing Sqlite on some data for a project. I love the concept of keys, but there is however one thing that I cannot wrap my head around.
How do you reference the foreign key when inserting a big dataset in the db? I'll give you an example:
I'm inserting say 300 rows of data, each row containing ("a","b","c","d","e","f","g"). Everything is going into the same table (original_table).
Now that I have my data in the db, I want to create another table (secondary_table) for the values "c". I then naturally want original_table to have a foreign key which links to the secondary_table's primary key.
I understand that you can create a foreign key before inserting, and then replace "c" with the corresponding integer before you insert. This however seems very inefficient, as you would have to replace huge amounts of data before inserting.
So my question is how can I have the foreign key replace the text in an already created table?
Cheers
So my question is how can I have the foreign key replace the text in
an already created table?
yes/no
That is, you can replace column C with the reference to the secondary table (as has been done below, in addition to adding the new suggested column), BUT without dropping and recreating the table you CANNOT redefine the column's attributes, so you cannot give it a type affinity of INTEGER (not really an issue) or specify that it has the FOREIGN KEY constraint.
A mass update is probably not an issue for something like 300 rows (it is not even done within a transaction here).
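If the C column itself must end up as an INTEGER foreign key, the usual route is to rebuild the table. The following is a minimal sketch of that idea, not the method used in the rest of this answer; rebuilt_table, the three-column layout and the colour values are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default in SQLite
conn.executescript("""
    CREATE TABLE original_table (A TEXT, B TEXT, C TEXT);
    INSERT INTO original_table VALUES
        ('a1', 'b1', 'red'), ('a2', 'b2', 'blue'), ('a3', 'b3', 'red');

    -- One row per distinct C value.
    CREATE TABLE secondary_table (id INTEGER PRIMARY KEY, c_value TEXT UNIQUE);
    INSERT INTO secondary_table (c_value)
        SELECT DISTINCT C FROM original_table ORDER BY C;

    -- New table in which C is an INTEGER reference from the start.
    CREATE TABLE rebuilt_table (
        A TEXT,
        B TEXT,
        C INTEGER REFERENCES secondary_table(id)
    );
    INSERT INTO rebuilt_table
        SELECT o.A, o.B, s.id
        FROM original_table AS o
        JOIN secondary_table AS s ON s.c_value = o.C;

    DROP TABLE original_table;
    ALTER TABLE rebuilt_table RENAME TO original_table;
""")

rows = conn.execute("SELECT A, C FROM original_table ORDER BY A").fetchall()
print(rows)  # [('a1', 2), ('a2', 1), ('a3', 2)]

# The constraint is now enforced: 99 is not an id in secondary_table.
try:
    conn.execute("INSERT INTO original_table VALUES ('a4', 'b4', 99)")
    fk_enforced = False
except sqlite3.IntegrityError:
    fk_enforced = True
```

On a real database the rebuild would normally be wrapped in a transaction, and any indexes, triggers or views on the old table would need to be recreated.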
How do you reference the foreign key when inserting a big dataset in
the db?
Here's the SQL for how you could do this but instead of trying to play around with column C add a new column that effectively makes column C redundant. However, the new column will have INTEGER type affinity and also have the FOREIGN KEY constraint applied.
300 rows is nothing, the example code uses 3000 rows, although column C only contains a short text value.
:-
-- Create the original table with column c having a finite number of values (0-25)
DROP TABLE IF EXISTS original_table;
CREATE TABLE IF NOT EXISTS original_table (A TEXT, B TEXT, C TEXT, D TEXT, E TEXT, F TEXT, G TEXT);
-- Load the original table with some data
WITH RECURSIVE counter(cola,colb,colc,cold,cole,colf,colg) AS (
SELECT random() % 26 AS cola, random() % 26 AS colb,abs(random() % 26) AS colc,random() % 26 AS cold,random() % 26 AS cole,random() % 26 AS colf,random() % 26 AS colg
UNION ALL
SELECT random() % 26 AS cola, random() % 26 AS colb,abs(random()) % 26 AS colc,random() % 26 AS cold,random() % 26 AS cole,random() % 26 AS colf,random() % 26 AS colg
FROM counter LIMIT 3000
)
INSERT INTO original_table SELECT * FROM counter;
SELECT * FROM original_table ORDER BY C ASC; -- Query 1 the original original_table
-- Create the secondary table by extracting values from the C column of the original table
DROP TABLE IF EXISTS secondary_table;
CREATE TABLE IF NOT EXISTS secondary_table (id INTEGER PRIMARY KEY, c_value TEXT);
INSERT INTO secondary_table (c_value) SELECT DISTINCT C FROM original_table ORDER BY C ASC;
SELECT * FROM secondary_table; -- Query 2 the new secondary table
-- Add the new column as a Foreign key to reference the new secondary_table
ALTER TABLE original_table ADD COLUMN secondary_table_reference INTEGER REFERENCES secondary_table(id);
SELECT * FROM original_table; -- Query 3 the altered original_table but without any references
-- Update the original table to apply the references to the secondary_table
UPDATE original_table
SET secondary_table_reference = (SELECT id FROM secondary_table WHERE c_value = C)
-- >>>>>>>>>> NOTE USE ONLY 1 OR NONE OF THE FOLLOWING 2 LINES <<<<<<<<<<
, C = null; -- OPTIONAL TO CLEAR COLUMN C
-- , C = (SELECT id FROM secondary_table WHERE c_value = C) -- ANOTHER OPTION SET C TO REFERENCE SECONDARY TABLE
;
SELECT * FROM original_table; -- Query 4 the final original table i.e. with references applied (column C now not needed)
Hopefully comments explain.
Results :-
Query 1 The original table without the secondary table :-
Query 2 The secondary table as generated from the original table :-
Query 3 The altered original_table without references applied :-
Query 4 The original table after application of references (applied to new column and old C column) :-
Timings (would obviously depend on numerous factors) :-
-- Create the original table with column c having a finite number of values (0-25)
DROP TABLE IF EXISTS original_table
> OK
> Time: 0.94s
CREATE TABLE IF NOT EXISTS original_table (A TEXT, B TEXT, C TEXT, D TEXT, E TEXT, F TEXT, G TEXT)
> OK
> Time: 0.353s
-- Load the original table with some data
WITH RECURSIVE counter(cola,colb,colc,cold,cole,colf,colg) AS (
SELECT random() % 26 AS cola, random() % 26 AS colb,abs(random() % 26) AS colc,random() % 26 AS cold,random() % 26 AS cole,random() % 26 AS colf,random() % 26 AS colg
UNION ALL
SELECT random() % 26 AS cola, random() % 26 AS colb,abs(random()) % 26 AS colc,random() % 26 AS cold,random() % 26 AS cole,random() % 26 AS colf,random() % 26 AS colg
FROM counter LIMIT 3000
)
INSERT INTO original_table SELECT * FROM counter
> Affected rows: 3000
> Time: 0.67s
SELECT * FROM original_table ORDER BY C ASC
> OK
> Time: 0.012s
-- Query 1 the original original_table
-- Create the secondary table by extracting values from the C column of the original table
DROP TABLE IF EXISTS secondary_table
> OK
> Time: 0.328s
CREATE TABLE IF NOT EXISTS secondary_table (id INTEGER PRIMARY KEY, c_value TEXT)
> OK
> Time: 0.317s
INSERT INTO secondary_table (c_value) SELECT DISTINCT C FROM original_table ORDER BY C ASC
> Affected rows: 26
> Time: 0.24s
SELECT * FROM secondary_table
> OK
> Time: 0s
-- Query 2 the new secondary table
-- Add the new column as a Foreign key to reference the new secondary_table
ALTER TABLE original_table ADD COLUMN secondary_table_reference INTEGER REFERENCES secondary_table(id)
> OK
> Time: 0.31s
SELECT * FROM original_table
> OK
> Time: 0.01s
-- Query 3 the altered original_table but without any references
-- Update the original table to apply the references to the secondary_table
UPDATE original_table
SET secondary_table_reference = (SELECT id FROM secondary_table WHERE c_value = C)
-- , C = null; -- OPTIONAL TO CLEAR COLUMN C
, C = (SELECT id FROM secondary_table WHERE c_value = C)
> Affected rows: 3000
> Time: 0.743s
SELECT * FROM original_table
> OK
> Time: 0.01s
-- Query 4 the final original table i.e. with references applied (column C now not needed)
> not an error
> Time: 0s
Supplementary Query
The following query utilises the combined tables :-
SELECT A,B,D,E,F,G, secondary_table.c_value FROM original_table JOIN secondary_table ON secondary_table_reference = secondary_table.id;
To result in :-
Note the data will not correlate with the previous results as this was run as a separate run and the data is generated randomly.

pyodbc-access column by name

I have 2 tables with 150 columns and trying to join those tables and fetch the result set one by one and process them:
qry = '''select a.*, b.*
from table_a a
full outer join table_b b
on a.id = b.id'''
table_row = conn.execute(qry) #execute method yields a generator
Now I need to iterate over the result set and determine the values of each and every column of table-1 and table-2.
For example, if table-1 and table-2 both have a column named name, I need to compare the two values.
How can I access the result set by column name? I'm using pyodbc,
i.e. resultset.table1.name = resultset.table2.name
Use the ISO information schema views (I'm using SQL Server in the example) to return column names for each table, substituting database and schema parameter values as appropriate.
Merge the resulting lists into a set containing column names present in both tables.
Use this set to build a string representing column names to select from each table, aliasing each column by prefixing with a table name. Defining column aliases will allow you to differentiate columns by table.
Execute select query and print values for comparison.
Code sample
# assumes connection, cursor already setup
# build SQL for retrieving column names
sql = '''SELECT COLUMN_NAME
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_CATALOG = ? AND TABLE_SCHEMA = ?
AND TABLE_NAME = ?'''
# get column names from table_a
rows = cursor.execute(sql, ('database', 'schema', 'table_a')).fetchall()
table_a_columns = [column[0] for column in rows]
# get column names from table_b
rows = cursor.execute(sql, ('database', 'schema', 'table_b')).fetchall()
table_b_columns = [column[0] for column in rows]
# get unique matching columns from lists
matches = set(table_a_columns).intersection(table_b_columns)
# get string of column names to use in query, setting column alias prefixed with
# table name for each column
column_alias = 'a.{0} as a_{0}, b.{0} as b_{0}'
columns = ', '.join([column_alias.format(column) for column in matches])
sql = 'SELECT {} FROM table_a a FULL OUTER JOIN table_b b ON a.id = b.id'
sql = sql.format(columns)
# print values to compare
for row in cursor.execute(sql):
    print(row)
There's probably a less complicated way, but it's eluding me.
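Once the columns are aliased, one driver-neutral way to read them by name is to build a dict per row from cursor.description, which is part of the Python DB-API that pyodbc implements. The sketch below uses sqlite3 as a stand-in driver so it runs without an ODBC data source; rows_as_dicts is a hypothetical helper, the sample data is made up, and a LEFT JOIN replaces the FULL OUTER JOIN only to stay compatible with older SQLite builds:

```python
import sqlite3  # stand-in driver so the sketch runs without an ODBC source

def rows_as_dicts(cursor):
    """Yield each row of an executed DB-API cursor as a {column_name: value} dict."""
    names = [col[0] for col in cursor.description]
    for row in cursor:
        yield dict(zip(names, row))

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (id INTEGER, name TEXT);
    CREATE TABLE table_b (id INTEGER, name TEXT);
    INSERT INTO table_a VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO table_b VALUES (1, 'Ann'), (2, 'Rob');
""")

cursor = conn.execute("""
    SELECT a.id AS id, a.name AS a_name, b.name AS b_name
    FROM table_a a LEFT JOIN table_b b ON a.id = b.id
    ORDER BY a.id
""")

# Collect the ids whose name differs between the two tables.
mismatches = [r["id"] for r in rows_as_dicts(cursor) if r["a_name"] != r["b_name"]]
print(mismatches)  # [2]
```

With pyodbc itself the dict step may be unnecessary, since its Row objects also expose columns as attributes (for example row.a_name), provided each alias is a valid Python identifier.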

SQLITE equivalent for Oracle's ROWNUM?

I'm adding an 'index' column to a table in SQLite3 to allow the users to easily reorder the data, by renaming the old database and creating a new one in its place with the extra columns.
The problem I have is that I need to give each row a unique number in the 'index' column when I INSERT...SELECT the old values.
A search I did turned up a useful term in Oracle called ROWNUM, but SQLite3 doesn't have that. Is there something equivalent in SQLite?
You can use one of the special column names ROWID, OID or _ROWID_ to get the rowid of a row. See http://www.sqlite.org/lang_createtable.html#rowid for further details (and note that these names can be hidden by ordinary columns called ROWID and so on).
Many people here seems to mix up ROWNUM with ROWID. They are not the same concept and Oracle has both.
ROWID is a unique ID of a database ROW. It's almost invariant (changed during import/export but it is the same across different SQL queries).
ROWNUM is a calculated field corresponding to the row number in the query result. It's always 1 for the first row, 2 for the second, and so on. It is absolutely not linked to any table row, and the same table row could have very different rownums depending on how it is queried.
SQLite has a ROWID but no ROWNUM. The only equivalent I found is the ROW_NUMBER() window function, available since SQLite 3.25.0 (see http://www.sqlitetutorial.net/sqlite-window-functions/sqlite-row_number/).
You can achieve what you want with a query like this:
insert into new
select *, row_number() over ()
from old;
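A runnable sketch of the same idea, assuming a SQLite library of version 3.25.0 or later (window functions are unavailable before that). The column name idx and the sample rows are made up, and an explicit ORDER BY inside OVER is added so the numbering is deterministic:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE old (col1, col2);
    INSERT INTO old VALUES ('d', 3), ('b', 23), ('w', 45);

    CREATE TABLE new (col1, col2, idx INTEGER);
    -- row_number() numbers the rows 1, 2, 3, ... in the order given to OVER.
    INSERT INTO new
        SELECT col1, col2, row_number() OVER (ORDER BY col1, col2)
        FROM old;
""")

numbered = conn.execute("SELECT idx, col1, col2 FROM new ORDER BY idx").fetchall()
print(numbered)  # [(1, 'b', 23), (2, 'd', 3), (3, 'w', 45)]
```

With an empty OVER () the rows are numbered in whatever order the query happens to produce them, which is not guaranteed.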
No, SQLite doesn't have a direct equivalent to Oracle's ROWNUM.
If I understand your requirement correctly, you should be able to add a numbered column based on ordering of the old table this way:
create table old (col1, col2);
insert into old values
('d', 3),
('s', 3),
('d', 1),
('w', 45),
('b', 5465),
('w', 3),
('b', 23);
create table new (colPK INTEGER PRIMARY KEY AUTOINCREMENT, col1, col2);
insert into new select NULL, col1, col2 from old order by col1, col2;
The new table contains:
.headers on
.mode column
select * from new;
colPK       col1        col2
----------  ----------  ----------
1           b           23
2           b           5465
3           d           1
4           d           3
5           s           3
6           w           3
7           w           45
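Inserting NULL into the INTEGER PRIMARY KEY column makes SQLite assign the next integer, so each additional row gets the previous row's value incremented by 1. AUTOINCREMENT is not required for that; it only guarantees that the values of deleted rows are never reused.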
I believe you want to use the LIMIT clause in SQLite.
SELECT * FROM TABLE can return thousands of records.
However, you can constrain this by adding the LIMIT keyword.
SELECT * FROM TABLE LIMIT 5;
This will return the first 5 records of the query result, if that many are available.
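Use this code to create a row number running from 0 to (row count - 1):
-- Ties on col1 are broken by rowid. This is an assumption: the original compared
-- col1 with itself, which can never match, and mixed the table names main and Table_name.
SELECT (SELECT COUNT(*)
FROM Table_name AS t2
WHERE t2.col1 < t1.col1) + (SELECT COUNT(*)
FROM Table_name AS t3
WHERE t3.col1 = t1.col1 AND t3.rowid < t1.rowid) AS rowNum, * FROM Table_name AS t1 ORDER BY t1.col1 ASC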

Correct SQLite syntax - UPDATE SELECT with WHERE EXISTS

I am trying to update selected values in a column in a SQLite table. I only want to update the cells in the maintable where the criteria are met, and the cells must be updated to individual values, taken from a subtable.
I have tried the following syntax, but I get only a single cell update. I have also tried alternatives where all cells are updated to the first selected value of the subtable.
UPDATE maintable
SET value=(SELECT subtable.value FROM maintable, subtable
WHERE maintable.key1=subtable.key1 AND maintable.key2=subtable.key2)
WHERE EXISTS (SELECT subtable.value FROM maintable, subtable
WHERE maintable.key1=subtable.key1 AND maintable.key2=subtable.key2)
What is the appropriate syntax?
You can do this with an update select, but (at least in older SQLite versions) you can only do one field at a time. It would be nice if SQLite supported joins in an UPDATE statement, but before version 3.33.0, which added UPDATE ... FROM, it does not.
Here is a related SO question, How do I UPDATE from a SELECT in SQL Server?, but for SQL Server. There are similar answers there.
sqlite> create table t1 (id int, value1 int);
sqlite> insert into t1 values (1,0),(2,0);
sqlite> select * from t1;
1|0
2|0
sqlite> create table t2 (id int, value2 int);
sqlite> insert into t2 values (1,101),(2,102);
sqlite> update t1 set value1 = (select value2 from t2 where t2.id = t1.id) where t1.value1 = 0;
sqlite> select * from t1;
1|101
2|102
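In this case, it only updates one value from subtable for each row of maintable.
The error is including maintable in the FROM clause of the subquery: that makes the subquery independent of the row being updated, so it should select from subtable only and refer to the outer maintable row in its WHERE clause.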
UPDATE maintable
SET value=(SELECT subtable.value
FROM subtable
WHERE maintable.key1=subtable.key1 );
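Applied to the two-key case from the question, and keeping the WHERE EXISTS guard so that unmatched rows are left alone rather than set to null, the statement might look like the sketch below. The sample rows are made up:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE maintable (key1 INTEGER, key2 INTEGER, value TEXT);
    CREATE TABLE subtable  (key1 INTEGER, key2 INTEGER, value TEXT);
    INSERT INTO maintable VALUES (1, 1, 'old'), (1, 2, 'old'), (2, 1, 'old');
    INSERT INTO subtable  VALUES (1, 1, 'new11'), (2, 1, 'new21');
""")

# Both subqueries select FROM subtable only; maintable.key1 / maintable.key2
# refer to the row of maintable that is currently being updated.
conn.execute("""
    UPDATE maintable
    SET value = (SELECT subtable.value FROM subtable
                 WHERE subtable.key1 = maintable.key1
                   AND subtable.key2 = maintable.key2)
    WHERE EXISTS (SELECT 1 FROM subtable
                  WHERE subtable.key1 = maintable.key1
                    AND subtable.key2 = maintable.key2)
""")

updated = conn.execute(
    "SELECT key1, key2, value FROM maintable ORDER BY key1, key2"
).fetchall()
print(updated)  # [(1, 1, 'new11'), (1, 2, 'old'), (2, 1, 'new21')]
```

The row (1, 2) has no counterpart in subtable, so the EXISTS guard leaves its value untouched.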
By default update with joins does not exist in SQLite; But we can use the with-clause + column-name-list + select-stmt from https://www.sqlite.org/lang_update.html to make something like this:
CREATE TABLE aa (
_id INTEGER PRIMARY KEY,
a1 INTEGER,
a2 INTEGER);
INSERT INTO aa VALUES (1,10,20);
INSERT INTO aa VALUES (2,-10,-20);
INSERT INTO aa VALUES (3,0,0);
--a bit unpleasant because we have to select manually each column and it's just a lot to write
WITH bb (_id,b1, b2)
AS (SELECT _id,a1+2, a2+1 FROM aa WHERE _id<=2)
UPDATE aa SET a1=(SELECT b1 FROM bb WHERE bb._id=aa._id),a2=(SELECT b2 FROM bb WHERE bb._id=aa._id)
WHERE _id in (SELECT _id from bb);
--so now it should be (1,10,20)->(1,12,21) and (2,-10,-20)->(2,-8,-19), and it is
SELECT * FROM aa;
--even better with one select for each row!
WITH bb (_id,b1, b2)
AS (SELECT _id,a1+2, a2+1 from aa WHERE _id<=2)
UPDATE aa SET (a1,a2)=(SELECT b1,b2 FROM bb WHERE bb._id=aa._id)
WHERE _id in (SELECT _id from bb);
--so now it should be (1,12,21)->(1,14,22) and (2,-8,-19)->(2,-6,-18), and it is
SELECT * FROM aa;
--you can skip the WITH altogether
UPDATE aa SET (a1,a2)=(SELECT bb.a1+2, bb.a2+1 FROM aa AS bb WHERE aa._id=bb._id)
WHERE _id<=2;
--so now it should be (1,14,22)->(1,16,23) and (2,-6,-18)->(2,-4,-17), and it is
SELECT * FROM aa;
Hopefully SQLite is smart enough not to run the subquery separately for each column, and according to the documentation it is. When setting multiple columns using one select (cases 2 and 3), an id with no matching row (i.e. without the WHERE _id IN ... line) will give an error that cannot be suppressed using OR IGNORE; case 1 will instead set the columns to null (for all ids > 2), which is also bad.
You need to use an INSERT OR REPLACE statement, something like the following:
Assume maintable has 4 columns: key, col2, col3, col4
and you want to update col3 with the matching value from subtable
INSERT OR REPLACE INTO maintable
SELECT maintable.key, maintable.col2, subtable.value, maintable.col4
FROM maintable
JOIN subtable ON subtable.key = maintable.key
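A runnable sketch of this approach follows. It assumes that key is declared as the PRIMARY KEY of maintable: without a PRIMARY KEY or UNIQUE constraint there is nothing for REPLACE to conflict with, and the statement would append duplicate rows instead. The sample rows are made up:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE maintable (key INTEGER PRIMARY KEY, col2 TEXT, col3 TEXT, col4 TEXT);
    CREATE TABLE subtable  (key INTEGER PRIMARY KEY, value TEXT);
    INSERT INTO maintable VALUES (1, 'a', 'old', 'x'), (2, 'b', 'old', 'y');
    INSERT INTO subtable  VALUES (1, 'new');

    -- Rows produced by the join replace the existing rows with the same key;
    -- rows without a match in subtable are not selected and stay as they are.
    INSERT OR REPLACE INTO maintable
        SELECT maintable.key, maintable.col2, subtable.value, maintable.col4
        FROM maintable
        JOIN subtable ON subtable.key = maintable.key;
""")

replaced = conn.execute("SELECT * FROM maintable ORDER BY key").fetchall()
print(replaced)  # [(1, 'a', 'new', 'x'), (2, 'b', 'old', 'y')]
```

Keep in mind that REPLACE deletes the old row and inserts a new one, which can matter if delete triggers or foreign keys with ON DELETE actions are involved.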

Resources