SUM IFS equivalent for SQLite - sqlite

The below crashes my DB Browser. Essentially I am trying to sum sales ("sales") by a sales person ("name") that occurred between two dates ("beg_period" and "end_period") pulled from a separate table.
SELECT ta.name, ta.beg_period, ta.end_period,
(SELECT SUM(tb.sales)
FROM sales_log tb
WHERE ta.name = tb.name
AND tb.date BETWEEN ta.beg_period AND ta.end_period
)
FROM performance ta
;

The nested query can be re-written as a single query with a standard join.
SELECT ta.name, ta.beg_period, ta.end_period, SUM(tb.sales)
FROM performance ta INNER JOIN sales_log tb
ON ta.name = tb.name
WHERE tb.date BETWEEN ta.beg_period AND ta.end_period
GROUP BY ta.name, ta.beg_period, ta.end_period;
My guess is that the original query was okay (however inefficient), but DB Browser just didn't know how to interpret the subquery for whatever parsing it attempts, etc. In other words, just because it crashed DB Browser doesn't mean that it would crash sqlite library. Try another sqlite database manager.

Related

How to introduce indexing to sqlite query in android?

In my android application, I use Cursor c = db.rawQuery(query, null); to query data from a local sqlite database, and one of the query string looks like the following:
SELECT t1.* FROM table t1
WHERE NOT EXISTS (
SELECT 1 FROM table t2
WHERE t2.start_time = t1.start_time AND t2.stop_time > t1.stop_time
)
however, the issue is that the query gets very slow when the database gets huge. Trying to look into introducing indexing to speed up the query, but so far, not been very successful, therefore, would be great to have some help here, as it's also hard to find examples for this for android applications.
You can create a composite index for the columns start_time and stop_time:
CREATE INDEX idx_name ON table_name(start_time, stop_time);
You can read in The SQLite Query Optimizer Overview:
The ON and USING clauses of an inner join are converted into
additional terms of the WHERE clause prior to WHERE clause analysis
...
and:
If an index is created using a statement like this:
CREATE INDEX idx_ex1 ON ex1(a,b,c,d,e,...,y,z);
Then the index might be used if the initial columns of the index
(columns a, b, and so forth) appear in WHERE clause terms. The initial
columns of the index must be used with the = or IN or IS operators.
The right-most column that is used can employ inequalities.
You may have to uninstall the app from the device so that the db is deleted and rerun to recreate it, or increase the version number of the db so that you can create the index in the onUpgrade() method.

Is it possible to run a Teradata query in Excel that uses Volatile tables?

My Teradata query creates a volatile that is used to join to existing views. When linking query to excel the following error pops up: "Teradata: [Teradata Database] [3932] Only an ET or null statement is legal after a DDL Statement". Is there a workaround for this for someone that does not have write permissions in teradata to create a real view or table? I want to avoid linking to Teradata in SQL and running an open query to pull in the data needed.
This is for Excel 2016 64bit and using Teradata version 15.10.1.12
Normally this error will occur if you are using ANSI mode or have issued a BT (Begin Transaction) in BTET mode.
Here are a few workarounds to try:
Issue an ET; statement (commit) after the create volatile table statement. If you are using ANSI mode, use COMMIT; instead of ET;. If you are unsure, try each one in turn. Only one will be valid but both do the same thing. Make sure your Volatile table includes ON COMMIT PRESERVE ROWS
Try using BT ET mode (a.k.a. Teradata mode) when establishing the session. I do not remember where but there will be a setting in the ODBC configuration for this.
Try using a Global Temporary table. These work similarly to Volatile tables except you define them once and the definition sticks around. That is, you can create it in, say BTEQ, or SQL assistant etc. The definition is common to all users and sessions (i.e. your Excel session), but the content is transient and unique to each session (like a volatile table).
Move the select part of your insert into the volatile table into the query that selects the data from the volatile table. See simple example below.
If you do not have create Global Temporary table permissions, ask your DBA.
Here is a simple example to illustrate point 4.
Current:
create volatile table tmp (id Integer)
ON COMMIT PRESERVE ROWS;
insert into tmp
select customer_number
from customer
where X = Y and yr = 2019
;
select a,b,c
from another_tbl A join TMP T ON
A.id = T.id
;
Becomes:
select a,b,c
from another_tbl A join (
select customer_number
from customer
where X = Y and yr = 2019
) AS T
ON
A.id = T.id
;
Or better yet, just Join your tables directly.
Note The first sequence (create table, Insert into and select) is a three statement series. This will return 3 "result sets". The first two will be row counts the last will be the actual data. Most programs (including I think Excel) can not process multiple result set responses. This is one of the reasons it is difficult to use Teradata Macros with client tools like Excel.
The latter solution (a single select) avoids this potential problem.

Strange result in DB2. Divergences queries

A strange thing, that I don't know the cause, is happenning when trying to collect results from a db2 database.
The query is the following:
SELECT
COUNT(*)
FROM
MYSCHEMA.TABLE1 T1
WHERE
NOT EXISTS (
SELECT
*
FROM
MYSCHEMA.TABLE2 T2
WHERE
T2.PRIMARY_KEY_PART_1 = T1.PRIMARY_KEY_PART_2
AND T2.PRIMARY_KEY_PART_2 = T1.PRIMARY_KEY_PART_2
)
It is a very simple one.
The strange thing is, this same query, if I change COUNT(*) to * I will get 8 results and using COUNT(*) I will get only 2. The process was repeated some more times and the strange result is still continuing.
At this example, TABLE2 is a parent table of the TABLE1 where the primary key of the TABLE1 is PRIMARY_KEY_PART_1 and PRIMARY_KEY_PART_2, and the primary key of the TABLE2 is PRIMARY_KEY_PART_1, PRIMARY_KEY_PART_2 and PRIMARY_KEY_PART_3.
There's no foreign key between them (because they were legacy ones) and they have a huge amount of data.
The DB2 query SELECT VERSIONNUMBER FROM SYSIBM.SYSVERSIONS returns:
7020400
8020400
9010600
And the client used is SquirrelSQL 3.6 (without the rows limit marked).
So, what is the explanation to this strange result?
Without the details (including, at least, the exact Db2 version and DDL for both tables and their indexes) it can be just anything, and even with that details only IBM support will be really able to say, what is the actual reason.
Generally this looks like damaged data (e.g. differences in index vs table data).
Worth to open the support case with IBM.

System or catalog tables - DBA_ / DBC - outputs are volatile

I am trying to get list of all tables/views (in other words all objects) where a particular field is referenced using the system or catalog tables. I am using the following query.
select *
from dba_col_comments
where column_name like('SXX_AXXX_%')
order by 1;
However, the output is volatile. When I repeatedly run the same query without any changes the output is varies. For instance, it produced 9300 records and then 9350 after a couple of minutes and then 9347 after a couple of minutes.
I am observing the same behaviour in Teradata as well.
My theory would be - in a production enironment temporary objects that are created are probably getting an entry in the system/catalog tables.
Any thoughts/directions?
In Teradata you will find that as global temporary tables are instantiated (referenced by an SQL statement) records should be added to the data dictionary table TVM. These records are then dropped after the session logs off leaving just the base table record associated with the original CREATE GLOBAL TEMPORARY TABLE statement that was submitted.
You can find these instances using the view DBC.AllTempTables.
In Teradata, volatile tables are not maintained within the data dictionary.
EDIT - Your mileage may vary but this should get you started on Teradata
SELECT D1.DatabaseNameI AS DatabaseName_
, T1.TVMNameI AS TableName_
, F1.FieldName AS ColumnName_
FROM "DBC".TVM T1
INNER JOIN
"DBC".Dbase D1
ON D1.DatabaseId = T1.DatabaseId
INNER JOIN
"DBC".TVFields F1
ON F1.DatabaseId = T1.DatabaseId
AND F1.TableId = T1.TVMId
WHERE F1.FieldName = 'MyColumn'
--AND D1.DatabaseNameI IN ('{Database1}', ... '{Database99}') -- Filter on databases
AND F1.FieldType in ('i', 'i1', 'i2', 'i8') -- Integer, ByteInt, SmallInt, BigInt
--AND T1.TableKind IN ('T') -- Optional Filter to just tables.
AND NOT EXISTS
(SELECT 'x'
FROM "DBC".TempTables TT1
WHERE Tt1.TableId = T1.TVMId
)
;

How to increase performace on a sqlite3 database that uses QT?

I am using QSQlQuery on a sqlite3 database. To fetch a particular item , I was populating the result from 4 different tables. I thought joining the tables would increase the performance/speed and get the result faster. So I joined 2 tables initially but it takes longer time to fetch the data after joining the tables (?)
Any suggestion on how to improve the performance would be really appreciated. Also, I was looking at the http://qt-project.org/doc/qt-4.8/qsqlquery.html and it is mentioned that using setForwardOnly would increase the performance on some databases. Any idea if it would work for SQLite3?
Thanks!
According to this link,
http://sqlite.org/cvstrac/wiki?p=PerformanceTuning
SQLite implements JOIN USING by translating the USING clausing into some extra WHERE clause terms. It does the same with NATURAL JOIN and JOIN ON. So while those constructs might be helpful to the human reader, they don't really make any difference to SQLite's query optimizer.
-I was wrong to join two tables and expect the fetch to be faster. It does not work with SQLite database. Instead using a "where" clause and joining two results directly definitely has some positive impact on the performance.
(example :
select * from A,B where A.id = B.id where A.id = 1; instead of
select * from A left outer join B on A.id = B.id where A.id = 1)
The SQLite translates first statement to second before compiling and you could save on the small amount of CPU time by directly using the second statement

Resources