Teradata SQL selecting successive batch of rows - teradata

I have 300000 entries in my db and am trying to access entries 50000-100000 (so 50000 total).
My query is as follows:
query = 'SELECT TOP 50000* FROM database ORDER BY col_name QUALIFY ROW_NUMBER() BETWEEN 50000 and 100000'
However, I only found the BETWEEN keyword in one source, and I suspect I am not using it correctly, since the error says it can't be used on a non-ordered set. I assume the QUALIFY then gets evaluated before the ORDER BY.
So I tried something along the lines of
query_second_try = 'SELECT TOP 50000* FROM database QUALIFY ROW_NUMBER() OVER (ORDER BY col_name)'
to see if this would fix the problem (without taking into account the specific rows I want to select). It did not.
I have tried using QUALIFY with RANK, but this doesn't seem to be exactly what I need either; I think the BETWEEN approach would be a better fit.
Can someone push me in the right direction here?
I am essentially trying to do the equivalent of 'ORDER BY col_name OFFSET BY 50000' in teradata.
Any help would be appreciated.

A few problems here.
row_number requires an ORDER BY in its OVER clause, and the ordering needs to be granular enough to be deterministic. You can also play around with rank, dense_rank, and row_number, depending on what you want to do with ties.
You're also mixing TOP N and QUALIFY.
Try this:
select *
from <table>
qualify row_number() over (order by <column(s)>) between X and Y
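A sketch applied to the numbers in the question (my_table stands in for the actual table; col_name is the ordering column from the question; bounds 50001-100000 skip the first 50000 rows and return exactly 50000):

select *
from my_table
qualify row_number() over (order by col_name) between 50001 and 100000;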

Related

CosmosDB: DISTINCT for only one Column

I have the following query:
SELECT DISTINCT c.deviceId, c._ts FROM c
ORDER BY c._ts DESC
I would like to receive only one pair (c.deviceId, c._ts) per deviceId, but because the c._ts value is distinct for all entries, I am getting all the value pairs for all deviceIds; in other words, my whole DB.
I have tried to use the question Distinct for only one value as a guide, but I see that CosmosDB does not support GROUP BY.
Is there a way to do this in cosmosDB?
Though it's a common requirement, I think, I can't implement it on my side any more than you can: the DISTINCT keyword can't be applied to one single column across the whole query result.
The GROUP BY feature has been in active development for a long period; based on the latest comment in that feedback thread, it is coming soon.
If your need is urgent, as a workaround you could follow this case and use the documentdb-lumenize package, which supports aggregate functions.
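For reference, once GROUP BY ships, the deduplication could presumably be written directly; a sketch using the properties from the question (not supported at the time this answer was written):

SELECT c.deviceId, MAX(c._ts) AS lastSeen
FROM c
GROUP BY c.deviceId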

Cant navigate into a specific column sqlite

The first link shows the problems.
I am very inexperienced in SQLite and need some help.
Thanks in advance!
https://imgur.com/v7BdVe3
These are all the tables displayed open
https://imgur.com/a/VMOxAuc
https://imgur.com/a/LrDcCBQ
The schema
https://imgur.com/a/bv5KTHN
This is as far as I could get, which is close, but I couldn't figure out how to also restrict it to marina 1.
SELECT BOAT_NAME, OWNER.OWNER_NUM, LAST_NAME, FIRST_NAME
FROM OWNER INNER JOIN MARINA_SLIP ON OWNER.OWNER_NUM = MARINA_SLIP.OWNER_NUM;
If you know anything about the other questions, feel free to help me with those too. Thanks!
I believe that you want
SELECT BOAT_NAME, OWNER.OWNER_NUM, LAST_NAME, FIRST_NAME
FROM OWNER INNER JOIN MARINA_SLIP ON OWNER.OWNER_NUM = MARINA_SLIP.OWNER_NUM
WHERE MARINA_NUM = 1
ORDER BY BOAT_NAME;
The second question involves multiple joins.
The third question asks you to use the count(*) function. Note that this is an aggregate function and returns the number of rows for each group as per the GROUP BY clause (if there is no GROUP BY clause then there is just the one group, i.e. all resultant rows).
The fourth question progresses a little further, asking you to extend the GROUP BY clause with a HAVING clause (see the link above for GROUP BY).
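As a sketch of that count(*)/GROUP BY/HAVING pattern, assuming the MARINA_SLIP columns used above (the actual assignment questions may ask for something different):

-- number of boats per owner, keeping only owners with more than one boat
SELECT OWNER_NUM, COUNT(*) AS BOAT_COUNT
FROM MARINA_SLIP
GROUP BY OWNER_NUM
HAVING COUNT(*) > 1;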

How to put a part of a code as a string in table to use it in a procedure?

I'm trying to resolve the issue below:
I need to prepare a table that consists of 3 columns:
user_id,
month,
value.
Each of the over 200 users has different values of the parameters that determine the expected value, which are: LOB, CHANNEL, SUBSIDIARY. So I decided to store them in the table ASYSTENT_GOALS_SET. But I wanted to avoid multiplying rows and thought it would be nice to put all the conditions as a part of the code that I would use in the WHERE clause further on in the procedure.
So, as an example - instead of multiple rows:
I created such an entry:
So far I have created a testing table ASYSTENT_TEST (where I collect month and value for a certain user) and written a piece of a procedure using BULK COLLECT.
declare
  type test_row is record (
    month NUMBER,
    value NUMBER
  );
  type test_tab is table of test_row;
  BULK_COLLECTOR test_tab;
  p_lob varchar2(10) := 'GOSP';
  p_sub varchar2(14);
  p_ch  varchar2(10) := 'BR';
begin
  select subsidiary into p_sub from ASYSTENT_GOALS_SET where user_id = '40001001';
  execute immediate 'select mc, sum(ppln_wartosc) plan from prod_nonlife.mis_report_plans
    where report_id = (select to_number(value) from prod_nonlife.view_parameters where view_name = ''MIS'' and parameter_name = ''MAX_REPORT_ID'')
    and year = 2017
    and month between 7 and 9
    and ppln_jsta_symbol in (:subsidiary)
    and dcs_group in (:lob)
    and kanal in (:channel)
    group by month order by month'
    bulk collect into BULK_COLLECTOR
    using p_sub, p_lob, p_ch;
  forall x in BULK_COLLECTOR.first .. BULK_COLLECTOR.last
    insert into ASYSTENT_TEST values BULK_COLLECTOR(x);
end;
So now, when in the table ASYSTENT_GOALS_SET the column SUBSIDIARY (varchar) contains the string 12_00_00 (which is the code of one of the subsidiaries), everything works fine. But the problem is when a user works in two subsidiaries, let's say 12_00_00 and 13_00_00. I have no clue how to write it down. Should the SUBSIDIARY column contain:
'12_00_00','13_00_00'
or
"12_00_00","13_00_00"
or maybe
12_00_00','13_00_00
I have tried a lot of options after digging into topics like "dealing with single/escaping/double quotes".
Maybe I should change something in the EXECUTE IMMEDIATE as well?
Or maybe my approach to the issue is completely wrong from the very beginning (hopefully not :) ).
I would be grateful for support.
I didn't create the table function described there, but that article inspired me to go back and try the regexp_substr function again.
I changed ppln_jsta_symbol in (:subsidiary) to:

ppln_jsta_symbol in (select regexp_substr((select subsidiary from ASYSTENT_GOALS_SET where user_id=''fake_num''), ''[^,]+'', 1, level)
                     from dual
                     connect by regexp_substr((select subsidiary from ASYSTENT_GOALS_SET where user_id=''fake_num''), ''[^,]+'', 1, level) is not null)

Now it works like a charm! Thank you #Dessma very much for your time and suggestion!
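Outside the dynamic SQL string (so with ordinary single quotes) the splitting idiom reads more clearly. Note that it assumes SUBSIDIARY holds a plain comma-separated value such as 12_00_00,13_00_00, with no quote characters stored at all; a sketch using the user id from the question:

-- one row per comma-separated code stored in SUBSIDIARY
select regexp_substr((select subsidiary from ASYSTENT_GOALS_SET where user_id = '40001001'),
                     '[^,]+', 1, level) as subsidiary_code
from dual
connect by regexp_substr((select subsidiary from ASYSTENT_GOALS_SET where user_id = '40001001'),
                         '[^,]+', 1, level) is not null;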
"I wanted to avoid multiplying rows and thought it would be nice to put all conditions as a part of the code that I would use in 'where' clause further in procedure"
This seems a misguided requirement. You shouldn't worry about the number of rows: databases are optimized for storing and retrieving rows.
What they are not good at is dealing with "multi-value" columns. As your own solution proves, it is not nice, it is very far from nice, in fact it is a total pain in the neck. From now on, every time anybody needs to work with subsidiary they will have to invoke a function. Adding, changing or removing a user's subsidiary is much harder than it ought to be. Also there is no chance of enforcing data integrity i.e. validating that a subsidiary is valid against a reference table.
Maybe none of this matters to you. But there are very good reasons why Codd mandated "no repeating groups" as a criterion of First Normal Form, the foundation step of building a sound data model.
The correct solution, industry best practice for almost forty years, would be to recognise that SUBSIDIARY exists at a different granularity to CHANNEL and so should be stored in a separate table.
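A minimal sketch of that separate table (hypothetical names, columns inferred from the question):

-- one row per user/subsidiary pair instead of a delimited string
create table ASYSTENT_GOALS_SUBSIDIARY (
  user_id    varchar2(10) not null,
  subsidiary varchar2(14) not null,
  primary key (user_id, subsidiary)
);

-- the condition in the dynamic query then becomes a plain semi-join, e.g.:
-- and ppln_jsta_symbol in (select subsidiary
--                          from ASYSTENT_GOALS_SUBSIDIARY
--                          where user_id = :user_id)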

sqlite optimization on inner join with tables values around 18K

I have two tables, tool and tool_attribute.
tool has 12 columns and tool_attribute has 5.
Information I need from the tables:
tool - refid, serial, type, id
tool_attribute - key, value, id (there will be multiple entries for this)
Right now I have around 18264 rows in tool and 255696 in tool_attribute.
Current Query :
select
tool.refid,
tool.serial,
tool_attribute.value,
tool.type
from tool
inner join tool_attribute
on tool.id = tool_attribute.id
where
(tool_attribute.value LIKE '%t00%' or
tool.serial LIKE '%t00%')
group by tool.refid
order by tool.serial asc;
This takes around 750ms, which is quite fast, but I want to make it much faster. I run this code on a low-memory Windows 6.0 device, so it takes too much time.
Is there any way I could make it faster?
You can try adding indices to the columns involved in the join:
CREATE INDEX idx_tool ON tool (id);
CREATE INDEX idx_tool_attr ON tool_attribute (id);
The LIKE conditions in your WHERE clause would preclude any chance of using an index on the columns involved, I think. The reason is that a LIKE expression of the form '%something' defeats a B-tree search, which matches a prefix from left to right. If you could rephrase your WHERE logic as something like LIKE 'something%' then an index could be used there as well.
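A sketch of the index-friendly variant (assuming the serial values actually start with the search term; a prefix LIKE changes the meaning of the query, so this only applies if that's acceptable):

-- note: SQLite's LIKE optimization also needs a compatible collation,
-- e.g. PRAGMA case_sensitive_like = ON, for the index to be usable
CREATE INDEX idx_tool_serial ON tool (serial);

SELECT refid, serial FROM tool WHERE serial LIKE 't00%';  -- index-friendly prefix match
SELECT refid, serial FROM tool WHERE serial LIKE '%t00%'; -- leading wildcard: full scan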

SQLite - getting number of rows in a database

I want to get the number of rows in my table using max(id). When it returns NULL - if there are no rows in the table - I want to return 0. And when there are rows I want to return max(id) + 1.
My rows are numbered from 0 and auto-incremented.
Here is my statement:
SELECT CASE WHEN MAX(id) != NULL THEN (MAX(id) + 1) ELSE 0 END FROM words
But it always returns 0. What have I done wrong?
You can query the actual number of rows with:
SELECT Count(*) FROM tblName
see https://www.w3schools.com/sql/sql_count_avg_sum.asp
If you want to use MAX(id) instead of the count then, after reading the comments from Pax, the following SQL will give you what you want:
SELECT COALESCE(MAX(id)+1, 0) FROM words
In SQL, NULL = NULL is not true (it evaluates to NULL), so you usually have to use IS NULL:
SELECT CASE WHEN MAX(id) IS NULL THEN 0 ELSE (MAX(id) + 1) END FROM words
But if you want the number of rows, you should just use count(id), since your solution will give 10 if your rows are (0,1,3,5,9) when it should give 5.
If you can guarantee the ids always run from 0 to N, max(id)+1 may be faster, depending on the index implementation (it may be faster to traverse the right side of a balanced tree than to traverse the whole tree, counting).
But that's very implementation-specific and I would advise against relying on it, not least because it ties your performance to a specific DBMS.
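A quick illustration of the gap problem (a sketch; any SQLite shell will do):

CREATE TABLE words (id INTEGER PRIMARY KEY);
INSERT INTO words (id) VALUES (0), (1), (3), (5), (9);

SELECT COALESCE(MAX(id) + 1, 0) FROM words;  -- returns 10
SELECT COUNT(*) FROM words;                  -- returns 5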
Not sure if I understand your question, but max(id) won't give you the number of rows at all. For example, if you have only one row with id = 13 (let's say you deleted the previous rows), you'll have max(id) = 13 but the number of rows is 1. The correct (and fastest) solution is to use count(). BTW, if you wonder why there's a star, it's because you can count rows based on a criterion.
I had the same problem, if I understand your question correctly: I wanted to know the last inserted id after every insert operation in SQLite. I tried the following statement:
select * from table_name order by id desc limit 1
id is the first column and the primary key of table_name, so the statement above shows me the record with the largest id.
But the premise is that you never delete any rows, so that the number of ids equals the number of rows.
Extending VolkerK's answer: to make the code a little more readable, you can use AS to name the count, as in the example below:
SELECT COUNT(*) AS c from profile
This makes for much easier reading in some frameworks. For example, I'm using Exponent's (React Native) SQLite integration, and without the AS clause the code is pretty ugly.
