SQLite group-by behaviour - sqlite

If a column in the SELECT clause is omitted from the GROUP BY clause, does SQLite group by the remaining columns (by default), and then return the value of the omitted column in the first row it evaluates?
For example, finding the TransactionType associated with the highest Value per ProductId:
CREATE TABLE IF NOT EXISTS ProductTransaction
(
Id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
ProductId INTEGER NOT NULL,
TransactionType INTEGER NOT NULL,
Value INTEGER NOT NULL
);
INSERT INTO ProductTransaction (ProductId, TransactionType, Value)
VALUES (1, 7, 23), (1, 3, 12), (2, 4, 43), (1, 7, 5), (1, 10, 23),
(3, 3, 23), (3, 2, 31), (1, 1, 23), (2, 5, 50), (2, 6, 14), (1, 4, 23);
SELECT ProductId
, TransactionType
, MAX(Value)
FROM ProductTransaction
GROUP BY ProductId;
DELETE FROM ProductTransaction;
Running the previous statements gives me the TransactionType of 7 for ProductId 1 (Highest value 23).
However, if I add an index:
CREATE INDEX IF NOT EXISTS IDX_TransType ON ProductTransaction(ProductId ASC, TransactionType ASC);
It returns the TransactionType 1, presumably because it's now ordering the rows according to the index. Modifying the index supports this theory:
CREATE INDEX IF NOT EXISTS IDX_TransType ON ProductTransaction(ProductId ASC, TransactionType DESC);
It will now return TransactionType 10 for ProductId 1.
Is this behaviour by design, or is it just an unreliable side-effect?
EDIT: It seems that it's an unreliable side-effect. From the documentation:
Each expression in the result-set is then evaluated once for each
group of rows. If the expression is an aggregate expression, it is
evaluated across all rows in the group. Otherwise, it is evaluated
against a single arbitrarily chosen row from within the group. If
there is more than one non-aggregate expression in the result-set,
then all such expressions are evaluated for the same row.
https://www.sqlite.org/lang_select.html#resultset

Since SQLite 3.7.11, using MAX() or MIN() will force any non-aggregated columns to come from the same row that matches the MAX()/MIN().
However, when there are multiple rows with the same largest/smallest value, it is still unspecified which of those rows the other columns' values come from. (SQLite's behaviour is consistent in this regard, but can change between versions or with different database schemas.)
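A minimal sketch of that behaviour, using Python's built-in sqlite3 module. The function name max_value_rows is only for illustration, and the bundled SQLite is assumed to be 3.7.11 or later. Groups with a unique maximum give a well-defined TransactionType; for ProductId 1, where the value 23 occurs four times, any of the tied rows may be returned:

```python
import sqlite3


def max_value_rows():
    """Run the question's query on an in-memory copy of the sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE ProductTransaction (
            Id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
            ProductId INTEGER NOT NULL,
            TransactionType INTEGER NOT NULL,
            Value INTEGER NOT NULL
        );
        INSERT INTO ProductTransaction (ProductId, TransactionType, Value)
        VALUES (1, 7, 23), (1, 3, 12), (2, 4, 43), (1, 7, 5), (1, 10, 23),
               (3, 3, 23), (3, 2, 31), (1, 1, 23), (2, 5, 50), (2, 6, 14),
               (1, 4, 23);
    """)
    # TransactionType is a bare (non-aggregated) column; with MAX() present
    # it is taken from a row that holds the maximum Value of the group.
    rows = conn.execute("""
        SELECT ProductId, TransactionType, MAX(Value)
        FROM ProductTransaction
        GROUP BY ProductId
        ORDER BY ProductId
    """).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in max_value_rows():
        print(row)
```

ProductId 2 and 3 each have a single row with the maximum, so their TransactionType is determined. For ProductId 1 the result is one of 7, 10, 1 or 4, and code should not depend on which.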

Related

VALUES clause in SQLAlchemy without column name (to be compatible with sqlite)

I have seen the answers to "VALUES clause in SQLAlchemy", but none of them is satisfactory. Basically, SQLAlchemy forces you to give each column a name, building the query as
SELECT * FROM (VALUES (1, 2, 3)) AS sq (colname1, colname2);
instead of using the default names "column1, column2, ..." that apply when you don't specify (colname1, colname2). The problem is that specifying the column names this way is not compatible with SQLite. Do you know any way of doing that? I am thinking of using a raw text query. The problem with that is that my full query is
SELECT pairs.column1 AS element_id,
pairs.column2 as variant_id,
products_elements.name as element_name,
elements_variants.name as variant_name
FROM (
VALUES (1, 2),
(2, 2),
(3, 1)
) AS pairs
JOIN (products_elements, elements_variants) ON (
products_elements.id = pairs.column1
AND elements_variants.id = pairs.column2
);
and I don't know how to embed the values. Thanks
If you want a raw query you can name the columns with a CTE:
WITH pairs(colname1, colname2) AS (VALUES (1, 2), (2, 2), (3, 1))
SELECT pairs.colname1 AS element_id,
pairs.colname2 AS variant_id,
products_elements.name AS element_name,
elements_variants.name AS variant_name
FROM pairs
JOIN products_elements ON products_elements.id = pairs.colname1
JOIN elements_variants ON elements_variants.id = pairs.colname2;
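As a hedged sketch of how the values could be embedded in that raw query, the pairs can be passed as bound parameters, with one "(?, ?)" group generated per pair. This uses Python's sqlite3 module directly rather than SQLAlchemy. The helper names lookup_pairs and demo, and the table contents, are assumptions made for illustration; only the table and column names come from the question.

```python
import sqlite3


def lookup_pairs(conn, pairs):
    """Join a list of (element_id, variant_id) pairs against both tables."""
    if not pairs:
        return []  # an empty VALUES list would be a syntax error
    # Only placeholders are interpolated into the SQL text; the actual
    # values are bound separately.
    placeholders = ", ".join("(?, ?)" for _ in pairs)
    sql = f"""
        WITH pairs(colname1, colname2) AS (VALUES {placeholders})
        SELECT pairs.colname1 AS element_id,
               pairs.colname2 AS variant_id,
               products_elements.name AS element_name,
               elements_variants.name AS variant_name
        FROM pairs
        JOIN products_elements ON products_elements.id = pairs.colname1
        JOIN elements_variants ON elements_variants.id = pairs.colname2
        ORDER BY element_id, variant_id
    """
    flat = [value for pair in pairs for value in pair]
    return conn.execute(sql, flat).fetchall()


def demo():
    """Hypothetical data, only to make the sketch runnable."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE products_elements (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE elements_variants (id INTEGER PRIMARY KEY, name TEXT);
        INSERT INTO products_elements VALUES (1, 'frame'), (2, 'wheel'), (3, 'seat');
        INSERT INTO elements_variants VALUES (1, 'steel'), (2, 'carbon');
    """)
    rows = lookup_pairs(conn, [(1, 2), (2, 2), (3, 1)])
    conn.close()
    return rows


if __name__ == "__main__":
    for row in demo():
        print(row)
```

The same SQL string and flat parameter list could be handed to any driver that uses qmark-style placeholders; other drivers may need a different placeholder style.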

Sqlite Group By of subquery returns only one row

We observed that SQLite always returns only one row if we apply GROUP BY to a subquery and do not use an aggregate function such as COUNT or SUM.
Here is a toy example:
Given table
CREATE TABLE ExampleTable (
id INT PRIMARY KEY,
rank INT NOT NULL
);
with data
INSERT INTO ExampleTable(id, rank) VALUES (1, 1);
INSERT INTO ExampleTable(id, rank) VALUES (2, 2);
INSERT INTO ExampleTable(id, rank) VALUES (3, 2);
the query
SELECT rank, COUNT(*) FROM (select id, rank from ExampleTable) GROUP BY rank;
returns
rank|count
2|2
1|1
However, without the COUNT operation Sqlite returns only 1 row.
SELECT rank FROM (select id, rank from ExampleTable) GROUP BY rank;
=>
rank
1
Is this a bug or expected behavior?
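By the usual SQL semantics, GROUP BY rank yields one row per distinct rank whether or not an aggregate is selected, so two rows (1 and 2) would be expected here. The following sketch, using Python's sqlite3 module, reports the SQLite version in use together with the rows actually returned, so the observation can be checked against a given build. The function name distinct_ranks is illustrative only.

```python
import sqlite3


def distinct_ranks():
    """Return (sqlite_version, rows) for the non-aggregate GROUP BY query."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE ExampleTable (
            id INT PRIMARY KEY,
            rank INT NOT NULL
        );
        INSERT INTO ExampleTable(id, rank) VALUES (1, 1);
        INSERT INTO ExampleTable(id, rank) VALUES (2, 2);
        INSERT INTO ExampleTable(id, rank) VALUES (3, 2);
    """)
    rows = conn.execute(
        "SELECT rank FROM (SELECT id, rank FROM ExampleTable) GROUP BY rank"
    ).fetchall()
    conn.close()
    # Sorted so the comparison does not depend on the output order.
    return sqlite3.sqlite_version, sorted(rows)


if __name__ == "__main__":
    version, rows = distinct_ranks()
    print("SQLite", version, "returned", rows)
```

If a build returns a single row for this query, that differs from the expected grouping result and is worth comparing against a newer SQLite release.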

Delete data using merge statement in Oracle

How can I only delete data using MERGE in Oracle?
I am using the below code:
Merge
into
target_table
using
source_table
on (...)
when matched
then delete
But I am getting a "missing keyword" error at the last line.
Your MERGE is missing the UPDATE clause.
Let's look at a sample MERGE:
CREATE TABLE employee (
employee_id NUMBER(5),
first_name VARCHAR2(20),
last_name VARCHAR2(20),
dept_no NUMBER(2),
salary NUMBER(10));
INSERT INTO employee VALUES (1, 'Dan', 'Morgan', 10, 100000);
INSERT INTO employee VALUES (2, 'Helen', 'Lofstrom', 20, 100000);
INSERT INTO employee VALUES (3, 'Akiko', 'Toyota', 20, 50000);
INSERT INTO employee VALUES (4, 'Jackie', 'Stough', 20, 40000);
INSERT INTO employee VALUES (5, 'Richard', 'Foote', 20, 70000);
INSERT INTO employee VALUES (6, 'Joe', 'Johnson', 20, 30000);
INSERT INTO employee VALUES (7, 'Clark', 'Urling', 20, 90000);
CREATE TABLE bonuses (
employee_id NUMBER, bonus NUMBER DEFAULT 100);
INSERT INTO bonuses (employee_id) VALUES (1);
INSERT INTO bonuses (employee_id) VALUES (2);
INSERT INTO bonuses (employee_id) VALUES (4);
INSERT INTO bonuses (employee_id) VALUES (6);
INSERT INTO bonuses (employee_id) VALUES (7);
COMMIT;
Now that we have a sample data structure, let's do some merging:
MERGE INTO bonuses b
USING (
SELECT employee_id, salary, dept_no
FROM employee
WHERE dept_no =20) e
ON (b.employee_id = e.employee_id)
WHEN MATCHED THEN
UPDATE SET b.bonus = e.salary * 0.1
DELETE WHERE (e.salary < 40000)
;
So this command uses the MERGE syntax with the merge_update_clause:
MERGE INTO (table/view)
USING (table/view)
ON (condition)
WHEN MATCHED THEN
UPDATE SET (column..expression)
DELETE WHERE (condition)
I guess what I'm hinting at is that you are missing your UPDATE SET clause as well as the DELETE conditions. I recommend following up on the MERGE syntax.
Edit: SQLFiddle is back so here you go.

SQLite Insert and Replace with condition

I cannot figure out how to write a query for SQLite.
What I need:
1) Replace the record (matched on the primary key) if a condition (a comparison of the new and old field values) holds.
2) Insert the entry if no entry with that primary key exists in the database.
Importantly, it has to work very fast!
I cannot come up with an efficient query.
Edit.
MyInsertRequest - the desired expression.
Script:
CREATE TABLE testtable (a INT PRIMARY KEY, b INT, c INT)
INSERT INTO testtable VALUES (1, 2, 3)
select * from testtable
1|2|3
-- Adds an entry, because the primary key is not yet in the table
++ MyInsertRequest VALUES (2, 2, 3) {if c>4 then replace}
select * from testtable
1|2|3
2|2|3
-- Adds
++ MyInsertRequest VALUES (3, 8, 3) {if c>4 then replace}
select * from testtable
1|2|3
2|2|3
3|8|3
-- Does nothing, because such a record (from primary key field 'a')
-- is in the database and c>4 does not hold
++ MyInsertRequest VALUES (1, 2, 3) {if c>4 then replace}
select * from testtable
1|2|3
2|2|3
3|8|3
-- Does nothing
++ MyInsertRequest VALUES (3, 34, 3) {if c>4 then replace}
select * from testtable
1|2|3
2|2|3
3|8|3
-- replace, because such a record (from primary key field 'a')
-- is in the database and c>2
++ MyInsertRequest VALUES (3, 34, 1) {if c>2 then replace}
select * from testtable
1|2|3
2|2|3
3|34|1
Isn't INSERT OR REPLACE what you need? E.g.:
INSERT OR REPLACE INTO table (cola, colb) values (valuea, valueb)
When a UNIQUE constraint violation occurs, the REPLACE algorithm
deletes pre-existing rows that are causing the constraint violation
prior to inserting or updating the current row and the command
continues executing normally.
You have to put the condition in a unique constraint on the table. It will automatically create an index to make the check efficient.
e.g.
-- here the condition is on columnA, columnB
CREATE TABLE sometable (columnPK INT PRIMARY KEY,
columnA INT,
columnB INT,
columnC INT,
CONSTRAINT constname UNIQUE (columnA, columnB)
);
INSERT INTO sometable VALUES (1, 1, 1, 0);
INSERT INTO sometable VALUES (2, 1, 2, 0);
select * from sometable
1|1|1|0
2|1|2|0
-- insert a line with a new PK, but with existing values for (columnA, columnB)
-- the line with PK 2 will be replaced
INSERT OR REPLACE INTO sometable VALUES (12, 1, 2, 6)
select * from sometable
1|1|1|0
12|1|2|6
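The example above can be reproduced with Python's sqlite3 module as a quick check. This is a minimal sketch; the function name replace_on_unique_conflict is illustrative only.

```python
import sqlite3


def replace_on_unique_conflict():
    """Show INSERT OR REPLACE deleting the row that violates UNIQUE(columnA, columnB)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE sometable (columnPK INT PRIMARY KEY,
                                columnA INT,
                                columnB INT,
                                columnC INT,
                                CONSTRAINT constname UNIQUE (columnA, columnB));
        INSERT INTO sometable VALUES (1, 1, 1, 0);
        INSERT INTO sometable VALUES (2, 1, 2, 0);
        -- New PK 12, but (columnA, columnB) = (1, 2) already exists on PK 2,
        -- so the row with PK 2 is deleted before the new row is inserted.
        INSERT OR REPLACE INTO sometable VALUES (12, 1, 2, 6);
    """)
    rows = conn.execute("SELECT * FROM sometable ORDER BY columnPK").fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in replace_on_unique_conflict():
        print(row)
```

Note that REPLACE removes the conflicting row and inserts a new one, so the surviving row carries the new primary key 12 rather than the old key 2.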
Assuming your requirements are:
Insert a new row when a doesn't exist;
Replace the row when a exists and the existing c is greater than the new c;
Do nothing when a exists and the existing c is less than or equal to the new c;
INSERT OR REPLACE fits the first two requirements.
For the last requirement, the only way I know to make an INSERT ineffective is to supply an empty rowset.
A SQLite command like the following would do the job:
INSERT OR REPLACE INTO sometable SELECT newdata.* FROM
(SELECT 3 AS a, 2 AS b, 1 AS c) AS newdata
LEFT JOIN sometable ON newdata.a=sometable.a
WHERE newdata.c<sometable.c OR sometable.a IS NULL;
The new data (3,2,1 in this example) is LEFT JOINed with the current table data.
Then WHERE "de-selects" the row when the new c is not less than the existing c, and keeps it when the row is new, i.e. when sometable.a IS NULL.
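A sketch of the LEFT JOIN approach applied to the question's testtable, using Python's sqlite3 module with bound parameters in place of the literal 3, 2, 1. It follows this answer's assumed rule (replace only when the new c is smaller than the existing c), which is an interpretation of the question's "{if c>N then replace}" notation. The names CONDITIONAL_REPLACE and run_scenario are illustrative.

```python
import sqlite3

# Insert the new row, or replace the existing one only if the new c is
# smaller than the stored c; otherwise the SELECT yields no rows.
CONDITIONAL_REPLACE = """
    INSERT OR REPLACE INTO testtable
    SELECT newdata.* FROM
        (SELECT ? AS a, ? AS b, ? AS c) AS newdata
        LEFT JOIN testtable ON newdata.a = testtable.a
    WHERE newdata.c < testtable.c OR testtable.a IS NULL
"""


def run_scenario():
    """Replay the question's sequence of requests and return the final table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE testtable (a INT PRIMARY KEY, b INT, c INT)")
    conn.execute("INSERT INTO testtable VALUES (1, 2, 3)")
    requests = [
        (2, 2, 3),   # new key: inserted
        (3, 8, 3),   # new key: inserted
        (1, 2, 3),   # key exists, new c not smaller: ignored
        (3, 34, 3),  # key exists, new c not smaller: ignored
        (3, 34, 1),  # key exists, new c (1) < old c (3): replaced
    ]
    for request in requests:
        conn.execute(CONDITIONAL_REPLACE, request)
    rows = conn.execute("SELECT * FROM testtable ORDER BY a").fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in run_scenario():
        print(row)
```

With these requests the final table matches the last listing in the question: rows 1|2|3, 2|2|3 and 3|34|1.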
I tried the other answers because I was also looking for a solution to this problem.
This should work, however I am unsure about the performance implications. I believe that the first column may need to be unique (for example, as a primary key), or else it will simply insert a new record each time.
INSERT OR REPLACE INTO sometable
SELECT columnA, columnB, columnC FROM (
SELECT columnA, columnB, columnC, 1 AS tmp FROM sometable
WHERE sometable.columnA = 1 AND
sometable.columnB > 9
UNION
SELECT 1 AS columnA, 1 As columnB, 404 as columnC, 0 AS tmp)
ORDER BY tmp DESC
LIMIT 1
In this case one dummy query is executed and UNION-ed onto a second query, which has a performance impact depending on how it is written and how the table is indexed. Ordering and limiting the results is another potential performance cost. However, I expect that the second query should only return one record and therefore it should not be too much of a performance hit.
You can also omit the ORDER BY tmp LIMIT 1 and it works with my version of sqlite, but it may impact performance since it can end up updating the record twice (writing the original value then the new value if applicable).
The other problem is that you end up with a write to the table even if the condition states that it should not be updated.
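A different technique from the answers above, offered only as a sketch: SQLite 3.24.0 and later support an UPSERT clause (ON CONFLICT ... DO UPDATE ... WHERE), which leaves the existing row untouched when the condition is false. The rule used here (update only when the new c is smaller than the stored c) is the assumption from the earlier answer, not something the question states exactly, and the names UPSERT and run_upsert_scenario are illustrative.

```python
import sqlite3

# "excluded" refers to the row that was proposed for insertion.
UPSERT = """
    INSERT INTO testtable (a, b, c) VALUES (?, ?, ?)
    ON CONFLICT(a) DO UPDATE SET b = excluded.b, c = excluded.c
    WHERE excluded.c < testtable.c
"""


def run_upsert_scenario():
    """Replay the question's requests using UPSERT instead of INSERT OR REPLACE."""
    if sqlite3.sqlite_version_info < (3, 24, 0):
        raise RuntimeError("UPSERT needs SQLite 3.24.0 or later")
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE testtable (a INT PRIMARY KEY, b INT, c INT)")
    conn.execute("INSERT INTO testtable VALUES (1, 2, 3)")
    requests = [
        (2, 2, 3),   # new key: inserted
        (3, 8, 3),   # new key: inserted
        (1, 2, 3),   # conflict, condition false: row left as it is
        (3, 34, 3),  # conflict, condition false: row left as it is
        (3, 34, 1),  # conflict, 1 < 3: row updated in place
    ]
    for request in requests:
        conn.execute(UPSERT, request)
    rows = conn.execute("SELECT * FROM testtable ORDER BY a").fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in run_upsert_scenario():
        print(row)
```

Unlike REPLACE, the DO UPDATE branch modifies the existing row rather than deleting and re-inserting it.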

Counting the same column of different value sets in a single group by clause

I have a table (SQLite DB) like this,
CREATE TABLE parser (ip text, user text, code text);
Now I need to count, per ip, how many rows have a code of 1, 2, or 3 and how many do not.
But so far I can only do this with two separate SQL statements, not in one.
e.g.
select count(*) as cnt, ip
from parser
where code in (1, 2, 3)
group by ip
order by cnt DESC
limit 10
And a not in query.
So, can I merge the two queries into a single one?
This will give you two counts per ip: one for the rows where code is 1, 2 or 3, and another for all the rest (everything but 1, 2, 3, including NULL).
SELECT ip,
COUNT(CASE WHEN code IN (1, 2, 3) THEN 1 ELSE NULL END) AS cnt_in,
COUNT(CASE WHEN code IN (1, 2, 3) THEN NULL ELSE 1 END) AS cnt_rest
FROM parser
GROUP BY ip
ORDER BY cnt_in DESC ;
This will give you 3 counts: one for 1, 2, 3, another for the rest of the non-NULL values, and a third for rows that have NULL in code:
SELECT ip,
COUNT(CASE WHEN code IN (1, 2, 3) THEN 1 END) AS cnt_in,
COUNT(CASE WHEN code NOT IN (1, 2, 3) THEN 1 END) AS cnt_not_in,
COUNT(CASE WHEN code IS NULL THEN 1 END) AS cnt_null
FROM parser
GROUP BY ip
ORDER BY cnt_in DESC ;
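A small sketch of the three-count query run through Python's sqlite3 module. The sample rows are invented for illustration, and the code literals are quoted as strings because the column is declared text; that quoting is a choice of this sketch, not part of the answer.

```python
import sqlite3


def count_codes_per_ip():
    """Count codes in ('1','2','3'), other non-NULL codes, and NULL codes per ip."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE parser (ip text, user text, code text)")
    conn.executemany(
        "INSERT INTO parser VALUES (?, ?, ?)",
        [
            ("10.0.0.1", "a", "1"),
            ("10.0.0.1", "a", "2"),
            ("10.0.0.1", "b", "9"),
            ("10.0.0.1", "b", None),
            ("10.0.0.2", "c", "3"),
            ("10.0.0.2", "c", "7"),
            ("10.0.0.2", "c", "8"),
        ],
    )
    # COUNT ignores NULL, and a CASE without ELSE yields NULL, so each
    # COUNT only sees the rows matching its own condition.
    rows = conn.execute("""
        SELECT ip,
               COUNT(CASE WHEN code IN ('1', '2', '3') THEN 1 END) AS cnt_in,
               COUNT(CASE WHEN code NOT IN ('1', '2', '3') THEN 1 END) AS cnt_not_in,
               COUNT(CASE WHEN code IS NULL THEN 1 END) AS cnt_null
        FROM parser
        GROUP BY ip
        ORDER BY cnt_in DESC
    """).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in count_codes_per_ip():
        print(row)
```

A NULL code falls into neither the IN nor the NOT IN branch, because both comparisons evaluate to NULL; that is why it needs its own cnt_null column.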
If you want to limit the first result (as in your code) to its top 10 rows and the second result to its own top 10 rows, you can use two subqueries and a UNION ALL. SQLite does not accept a parenthesized SELECT directly as an operand of UNION, so each subquery is wrapped in SELECT * FROM:
SELECT * FROM
( SELECT ip,
COUNT(*) AS cnt,
'in' AS type
FROM parser
WHERE code IN (1, 2, 3)
GROUP BY ip
ORDER BY cnt DESC
LIMIT 10
)
UNION ALL
SELECT * FROM
( SELECT ip,
COUNT(*) AS cnt,
'not in' AS type
FROM parser
WHERE code NOT IN (1, 2, 3)
GROUP BY ip
ORDER BY cnt DESC
LIMIT 10
) ;
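A sketch of the two-subquery approach in a form SQLite accepts, run through Python's sqlite3 module. In SQLite, ORDER BY and LIMIT may only follow the last SELECT of a compound statement, so each ordered, limited query is wrapped in SELECT * FROM (...). The function name top_n_per_type, the sample rows, and the quoted code literals are assumptions of this sketch.

```python
import sqlite3

TOP_N = """
    SELECT * FROM (
        SELECT ip, COUNT(*) AS cnt, 'in' AS type
        FROM parser
        WHERE code IN ('1', '2', '3')
        GROUP BY ip
        ORDER BY cnt DESC
        LIMIT ?
    )
    UNION ALL
    SELECT * FROM (
        SELECT ip, COUNT(*) AS cnt, 'not in' AS type
        FROM parser
        WHERE code NOT IN ('1', '2', '3')
        GROUP BY ip
        ORDER BY cnt DESC
        LIMIT ?
    )
"""


def top_n_per_type(n):
    """Return the top-n ips for 'in' codes plus the top-n ips for 'not in' codes."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE parser (ip text, user text, code text)")
    conn.executemany(
        "INSERT INTO parser VALUES (?, ?, ?)",
        [
            ("10.0.0.1", "a", "1"),
            ("10.0.0.1", "a", "2"),
            ("10.0.0.1", "b", "9"),
            ("10.0.0.1", "b", None),
            ("10.0.0.2", "c", "3"),
            ("10.0.0.2", "c", "7"),
            ("10.0.0.2", "c", "8"),
        ],
    )
    rows = conn.execute(TOP_N, (n, n)).fetchall()
    conn.close()
    # Sorted because the order of a UNION ALL result is not guaranteed
    # without an outer ORDER BY.
    return sorted(rows)


if __name__ == "__main__":
    for row in top_n_per_type(1):
        print(row)
```

Rows with a NULL code appear in neither branch, since both IN and NOT IN evaluate to NULL for them.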
Tested at SQL-Fiddle
