Teradata MAX function with duplicate rows

Which value will be returned if we use the following expression, and why?
MAX (CASE WHEN NAME='ABC' THEN GENDER END) AS SEX
where the table contains the following two rows:
GENDER NAME
M ABC
F ABC

Here is a test you can run on your own system (Teradata Studio, SQL Assistant, AtanaSuite, BTEQ, or whatever you use):
CREATE VOLATILE TABLE test2
(
f1 CHAR(1),
f2 CHAR(3)
) PRIMARY INDEX (f1) ON COMMIT PRESERVE ROWS;
INSERT INTO test2 VALUES ('M', 'ABC');
INSERT INTO test2 VALUES ('F', 'ABC');
SELECT MAX(CASE WHEN f2='ABC' THEN f1 END) FROM test2;
DROP TABLE test2;
This will output M.
What is happening is that BEFORE aggregation, an intermediate result set is generated from all rows. Its single column is set either to the value of f1 or to NULL, depending on your CASE expression:
Intermediate Result Set:
col1
-----
M
F
Both rows return something, since both have a NAME equal to ABC. Now we take the MAX(), as that's the next step in this SQL's order of operations.
The MAX of the two values M and F is M, because M sorts after F (lexicographical order).
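Outside Teradata, the same behaviour can be reproduced with a quick sketch using Python's sqlite3 module (SQLite is only a stand-in here, but MAX over these single-character values follows the same lexicographical comparison):

```python
import sqlite3

# SQLite stand-in for the volatile-table test above.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE test2 (f1 CHAR(1), f2 CHAR(3))")
con.executemany("INSERT INTO test2 VALUES (?, ?)", [("M", "ABC"), ("F", "ABC")])

# Both rows satisfy the CASE, so MAX picks between 'M' and 'F'.
(result,) = con.execute(
    "SELECT MAX(CASE WHEN f2 = 'ABC' THEN f1 END) FROM test2"
).fetchone()
print(result)  # M
con.close()
```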

Related

Informix 11.5: How to count the number of members of a column separately

I have a table with a column like this:
table1:
c1  c2  c3
.   a   .
.   a   .
.   a   .
.   a   .
.   b   .
.   b   .
.   c   .
How can I get a result like the following?
a         b         c
count(a)  count(b)  count(c)
Of course, there is an auxiliary table like the one below:
-- field table
d1  d2
a
b
c
Transferring comments into an answer.
If there were an entry in table1.c2 with d as its value, is it correct to guess/assume that you'd want a fourth column of output, named d, with the count of the d values as its value? And there'd be an extra row in the auxiliary table too. That's pretty tricky.
You'd probably be better off with a result table with N rows, one for each value in the table1.c2 column, with the first column identifying the value and the second the count:
SELECT c2, COUNT(c2) FROM table1 GROUP BY c2 ORDER BY c2
To generate a single row with the names and counts as shown requires a dynamically built SQL statement — you write an SQL statement that generates the SQL (or the key components of the SQL) for a second statement that you actually execute to get the result. The main reason for it being dynamic like that is that the number of columns in the result set is not known until you run a query that determines which values exist in table1.c2. That's non-trivial — doable, but non-trivial.
I forget whether 11.50 has a built-in sysmaster:sysdual table. I ordinarily use a regular one-column, one-row table called dual. You can get the result you want, if your Table1.C2 has values a through e in it, with:
SELECT (SELECT COUNT(*) FROM Table1 WHERE c2 = 'a') AS a,
(SELECT COUNT(*) FROM Table1 WHERE c2 = 'b') AS b,
(SELECT COUNT(*) FROM Table1 WHERE c2 = 'c') AS c,
(SELECT COUNT(*) FROM Table1 WHERE c2 = 'd') AS d,
(SELECT COUNT(*) FROM Table1 WHERE c2 = 'e') AS e
FROM dual;
This gets the information you need. I don't think it is elegant, but "works" beats "doesn't work".
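The dynamic approach described above can be sketched as follows (Python, with SQLite as a stand-in for Informix, and the table/column names from the question): first query which distinct c2 values exist, then build and execute the pivot statement. Note the column aliases are interpolated directly into the SQL, which is only safe here because the values are valid identifiers:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE table1 (c1 TEXT, c2 TEXT, c3 TEXT)")
con.executemany(
    "INSERT INTO table1 (c2) VALUES (?)",
    [("a",), ("a",), ("a",), ("a",), ("b",), ("b",), ("c",)],
)

# Step 1: find out which values exist -- this is what forces the SQL
# for step 2 to be built dynamically.
values = [row[0] for row in
          con.execute("SELECT DISTINCT c2 FROM table1 ORDER BY c2")]

# Step 2: generate one COUNT subquery per value, then run the result.
cols = ", ".join(
    "(SELECT COUNT(*) FROM table1 WHERE c2 = ?) AS " + v for v in values
)
counts = con.execute("SELECT " + cols, values).fetchone()
print(dict(zip(values, counts)))  # {'a': 4, 'b': 2, 'c': 1}
con.close()
```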

How can I accumulate values from rows, assigning each row to the closest of a supplied list of dates?

There is table that has a date and cnt column e.g.
timestamp cnt
------------------
1547015021 14
1547024080 2
This table can be created using :-
DROP TABLE IF EXISTS roundit_base;
CREATE TABLE IF NOT EXISTS roundit_base (timestamp INTEGER, cnt INTEGER);
INSERT INTO roundit_base VALUES (1547015021,14),(1547024080,2);
The result should be the sum of the cnt column of rows that are the closest timestamp to a list of supplied timestamps, e.g. the supplied data could be
1546905600 - 0
1546992000 - 0
1547078400 - 0
...
The result should be along the lines of
1546905600 - 0
1546992000 - 14
1547078400 - 2
That is, two columns:-
the timestamp, from the list of supplied timestamps, that the respective rows from the database are closest to, and
the sum of the cnt column of those rows, per supplied timestamp.
Note that the results differ from the expected results above, in that the calculation places both 1547015021 and 1547024080 as being closest to the supplied timestamp 1546992000.
The following could be the basis of an SQLite based solution :-
WITH
-- The supplied list of timestamps
v (cv,dflt) AS (
VALUES (1546905600,0),(1546992000,0),(1547078400,0)
),
-- Join the two sets calculating the difference
cte1 AS (
SELECT *, abs(cv - timestamp) AS diff FROM roundit_base INNER JOIN v
),
-- Find the closest (smallest difference) for each timestamp
cte2 AS (
SELECT *, min(diff) FROM cte1 GROUP BY timestamp
)
-- For each comparative value (cv), sum the counts of the rows allocated/assigned to it
SELECT cv,
CASE
WHEN
(SELECT sum(cnt) FROM cte2 WHERE cv = v.cv) IS NOT NULL
THEN
(SELECT sum(cnt) FROM cte2 WHERE cv = v.cv)
ELSE 0
END AS cnt
FROM v;
The above results in :-
cv          cnt
----------  ---
1546905600  0
1546992000  16
1547078400  0
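For reference, the query can be replayed end to end via Python's sqlite3 module (the WITH query is copied from the answer above), confirming that both source rows are allocated to 1546992000:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE roundit_base (timestamp INTEGER, cnt INTEGER)")
con.executemany("INSERT INTO roundit_base VALUES (?, ?)",
                [(1547015021, 14), (1547024080, 2)])

rows = con.execute("""
    WITH
    v (cv, dflt) AS (
        VALUES (1546905600,0),(1546992000,0),(1547078400,0)
    ),
    cte1 AS (
        SELECT *, abs(cv - timestamp) AS diff FROM roundit_base INNER JOIN v
    ),
    cte2 AS (
        -- SQLite bare-column behaviour: with a lone min() aggregate, the
        -- other columns come from the row holding the minimum diff
        SELECT *, min(diff) FROM cte1 GROUP BY timestamp
    )
    SELECT cv,
           CASE
               WHEN (SELECT sum(cnt) FROM cte2 WHERE cv = v.cv) IS NOT NULL
               THEN (SELECT sum(cnt) FROM cte2 WHERE cv = v.cv)
               ELSE 0
           END AS cnt
    FROM v
""").fetchall()
print(dict(rows))
con.close()
```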

Sqlite3 column division

I have the following columns (FirstCol, SecondCol, ThirdCol) in a sqlite3 db file:
1 Inside 100
1 Outside 200
2 Inside 46
2 Outside 68
First column has type INT, second has type TEXT and third one has type INT.
For each FirstCol value (in this case just 1 and 2) I need to obtain the Outside value divided by the Inside value, which is to say 200/100 where FirstCol=1 and 68/46 where FirstCol=2.
I don't mind whether this is done with a single query or by creating a new table; I just need that result.
Thanks.
You have to look up the values from different rows with correlated subqueries:
SELECT FirstCol,
(SELECT ThirdCol
FROM MyTable
WHERE FirstCol = T.FirstCol
AND SecondCol = 'Outside'
) /
(SELECT ThirdCol
FROM MyTable
WHERE FirstCol = T.FirstCol
AND SecondCol = 'Inside'
) AS Result
FROM (SELECT DISTINCT FirstCol
FROM MyTable) AS T;
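A runnable check via Python's sqlite3 module (MyTable and the column names come from the question). One caveat worth noting: because ThirdCol is declared INT, the division above is integer division in SQLite, so FirstCol=2 yields 68/46 = 1. Multiplying one operand by 1.0 forces a REAL result if a fractional ratio is wanted:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE MyTable (FirstCol INT, SecondCol TEXT, ThirdCol INT)")
con.executemany("INSERT INTO MyTable VALUES (?, ?, ?)",
                [(1, "Inside", 100), (1, "Outside", 200),
                 (2, "Inside", 46), (2, "Outside", 68)])

# As written in the answer: both operands are INT, so SQLite truncates.
int_rows = con.execute("""
    SELECT FirstCol,
           (SELECT ThirdCol FROM MyTable
             WHERE FirstCol = T.FirstCol AND SecondCol = 'Outside') /
           (SELECT ThirdCol FROM MyTable
             WHERE FirstCol = T.FirstCol AND SecondCol = 'Inside') AS Result
      FROM (SELECT DISTINCT FirstCol FROM MyTable) AS T
""").fetchall()
print(dict(int_rows))  # {1: 2, 2: 1}

# Variant with * 1.0 to get REAL division (68/46 is about 1.478).
real_rows = con.execute("""
    SELECT FirstCol,
           (SELECT ThirdCol FROM MyTable
             WHERE FirstCol = T.FirstCol AND SecondCol = 'Outside') * 1.0 /
           (SELECT ThirdCol FROM MyTable
             WHERE FirstCol = T.FirstCol AND SecondCol = 'Inside') AS Result
      FROM (SELECT DISTINCT FirstCol FROM MyTable) AS T
""").fetchall()
print(dict(real_rows))
con.close()
```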

Updating single specified values from another table in SQLite

I have two SQLite tables A and B defined as:
CREATE TABLE A (orig_cat INTEGER, type INTEGER,gv_ID INTEGER);
INSERT INTO A (orig_cat,type) VALUES (1,1);
INSERT INTO A (orig_cat,type) VALUES (2,2);
INSERT INTO A (orig_cat,type) VALUES (3,2);
INSERT INTO A (orig_cat,type) VALUES (4,2);
INSERT INTO A (orig_cat,type) VALUES (1,3);
INSERT INTO A (orig_cat,type) VALUES (2,3);
INSERT INTO A (orig_cat,type) VALUES (3,3);
UPDATE A SET gv_ID=rowid+99;
and
CREATE TABLE B (col_t INTEGER, orig_cat INTEGER, part INTEGER);
INSERT INTO B VALUES (1,1,1);
INSERT INTO B VALUES (3,1,2);
INSERT INTO B VALUES (2,2,1);
INSERT INTO B VALUES (1,2,2);
INSERT INTO B VALUES (3,3,1);
INSERT INTO B VALUES (4,3,2);
I'd like to replace the values in column col_t of table B, where part=2, with selected values from column gv_ID of table A. I can get the desired values with a SELECT command:
SELECT gv_ID
FROM (SELECT * FROM B where part=2) AS B_sub
JOIN (SELECT * FROM A WHERE type=3) AS A_sub
ON B_sub.orig_cat=A_sub.orig_cat;
How can I use that so that the values of col_t in the rows where part=2 get replaced with the values 104, 105 and 106 (which is what the selection returns)?
You can use a correlated subquery:
UPDATE B
SET col_t = (SELECT gv_ID FROM A WHERE A.orig_cat = B.orig_cat AND A.type = 3)
WHERE B."part" = 2;
I've assumed that the pair (A.orig_cat, A.type) is unique.
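As a minimal sketch, the whole exchange can be replayed with Python's sqlite3 module (schema and UPDATE copied from above) to confirm the part=2 rows end up as 104, 105 and 106:

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.executescript("""
    CREATE TABLE A (orig_cat INTEGER, type INTEGER, gv_ID INTEGER);
    INSERT INTO A (orig_cat,type)
        VALUES (1,1),(2,2),(3,2),(4,2),(1,3),(2,3),(3,3);
    UPDATE A SET gv_ID = rowid + 99;   -- gv_ID becomes 100..106

    CREATE TABLE B (col_t INTEGER, orig_cat INTEGER, part INTEGER);
    INSERT INTO B VALUES (1,1,1),(3,1,2),(2,2,1),(1,2,2),(3,3,1),(4,3,2);

    -- the correlated-subquery UPDATE from the answer
    UPDATE B
       SET col_t = (SELECT gv_ID FROM A
                     WHERE A.orig_cat = B.orig_cat AND A.type = 3)
     WHERE B.part = 2;
""")
updated = cur.execute(
    "SELECT col_t FROM B WHERE part = 2 ORDER BY orig_cat").fetchall()
print(updated)  # [(104,), (105,), (106,)]
con.close()
```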

SQLite: How do I find the daily/weekly/monthly/yearly average of a row count

I've just started learning SQLite, and had a question.
Here is an example of what I mean.
This is my CSV:
date
2010-10-24
2010-10-31
2010-11-01
2011-02-14
2011-02-15
2011-02-16
2011-10-01
2012-01-15
2012-05-12
2012-05-14
2012-08-12
2012-08-26
My code:
SELECT STRFTIME('%Y-%m', date) AS 'month', COUNT() AS 'month_count'
FROM tableName
GROUP BY STRFTIME('%Y-%m', date);
The result (in comma-delimited form):
month, month_count
2010-10, 2
2010-11, 1
2011-02, 3
2011-10, 1
2012-01, 1
2012-05, 2
2012-08, 2
What I'm looking for now is a way to get the average 'month_count' per month across the whole period, which is of course different from just the average of 'month_count'. That is, the former equals 0.55, while the latter equals 1.71, and I'm trying to calculate the former.
I tried using AVG(COUNT()), though that obviously made no logical sense.
I'm guessing I'd have to store the code-generated table as a temporary file, then get the average from it, though I'm not sure how to properly write it.
Does anyone know what I'm missing?
Try the code below:
create table test(date date);
insert into test values ('2010-10-24');
insert into test values ('2010-10-31');
insert into test values ('2010-11-01');
insert into test values ('2011-02-14');
insert into test values ('2011-02-15');
insert into test values ('2011-02-16');
insert into test values ('2011-10-01');
insert into test values ('2012-01-15');
insert into test values ('2012-05-12');
insert into test values ('2012-05-14');
insert into test values ('2012-08-12');
insert into test values ('2012-08-26');
SELECT a.tot_months          -- total number of rows (dates), despite the alias
     , b.month_diff          -- months elapsed between the first and last date
     , cast(a.tot_months as float) / b.month_diff avg_count
FROM (SELECT COUNT(*) tot_months FROM test) a
   , (SELECT cast((strftime('%m',max(date))+12*strftime('%Y',max(date))) as int) -
             cast((strftime('%m',min(date))+12*strftime('%Y',min(date))) as int) as 'month_diff'
      FROM test) b
;
Output:
C:\scripts>sqlite3 < foo.sql
12|22|0.545454545454545
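The answer can be replayed via Python's sqlite3 module (with tot_months renamed to tot_rows for clarity, since it actually counts rows rather than months):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE test (date DATE)")
dates = ["2010-10-24", "2010-10-31", "2010-11-01", "2011-02-14",
         "2011-02-15", "2011-02-16", "2011-10-01", "2012-01-15",
         "2012-05-12", "2012-05-14", "2012-08-12", "2012-08-26"]
con.executemany("INSERT INTO test VALUES (?)", [(d,) for d in dates])

row = con.execute("""
    SELECT a.tot_rows, b.month_diff,
           CAST(a.tot_rows AS REAL) / b.month_diff AS avg_count
      FROM (SELECT COUNT(*) AS tot_rows FROM test) a,
           -- month+12*year turns each date into a month ordinal,
           -- so the difference is the number of months spanned
           (SELECT CAST(strftime('%m',max(date)) + 12*strftime('%Y',max(date)) AS INT)
                 - CAST(strftime('%m',min(date)) + 12*strftime('%Y',min(date)) AS INT)
                   AS month_diff
              FROM test) b
""").fetchone()
print(row)  # 12 rows over 22 months, about 0.545 per month
con.close()
```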