Creating even ranges based on values in an oracle table - plsql

I have a large table (about 100k rows) whose PRIMARY KEY is of datatype NUMBER. The column is populated by a random number generator.
So my question is: is there a SQL query that can partition the table evenly into ranges of key values? For example, if my column values look like this:
1
2
3
4
5
6
7
8
9
10
And I would like this to be broken into three partitions, then I would expect an output like this:
Range 1 1-3
Range 2 4-7
Range 3 8-10

It sounds like you want the WIDTH_BUCKET() function.
This query will give you the start and end range for a table of 1250 rows split into 20 buckets based on id:
with bkt as (
select id
, width_bucket(id, 1, 1251, 20) as id_bucket
from t23
)
select id_bucket
, min(id) as bkt_start
, max(id) as bkt_end
, count(*)
from bkt
group by id_bucket
order by 1
;
The two middle parameters specify the min and max values; the last parameter specifies the number of buckets. The output is the rows between the minimum and maximum bounds split as evenly as possible into the specified number of buckets. Be careful with the min and max parameters; I've found poorly chosen bounds can have an odd effect on the split.
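To make the bucket arithmetic concrete, here is a minimal Python sketch of the equal-width rule that WIDTH_BUCKET applies. The function name `width_bucket` and the integer-only arithmetic are assumptions made for illustration; this is not Oracle's implementation. It treats the upper bound as exclusive, which is why the query above passes 1251 for ids that run up to 1250.

```python
def width_bucket(value, low, high, buckets):
    """Emulate equal-width bucketing for integers, assuming low < high."""
    if value < low:
        return 0                # underflow bucket
    if value >= high:
        return buckets + 1      # overflow bucket
    # Each bucket covers (high - low) / buckets units of the value range.
    return (value - low) * buckets // (high - low) + 1


if __name__ == "__main__":
    # Count how many of the ids 1..1250 land in each of the 20 buckets.
    counts = {}
    for id_ in range(1, 1251):
        bucket = width_bucket(id_, 1, 1251, 20)
        counts[bucket] = counts.get(bucket, 0) + 1
    print(counts)
```

Because the buckets are equal in width rather than in row count, ids that are spread unevenly across the range will produce buckets of uneven size.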

This solution works without the WIDTH_BUCKET function. While it is more verbose and certainly less efficient, it will split the data as evenly as possible, even if some ID values are missing.
CREATE TABLE t AS
SELECT rownum AS id
FROM dual
CONNECT BY level <= 10;
WITH
data AS (
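-- number the rows in id order; bare ROWNUM does not guarantee any ordering
SELECT id, ROW_NUMBER() OVER (ORDER BY id) AS row_num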
FROM t
),
total AS (
SELECT count(*) AS total_rows
FROM data
),
parts AS (
SELECT rownum as part_no, total.total_rows, total.total_rows / 3 as part_rows
FROM dual, total
CONNECT BY level <= 3
),
bounds AS (
SELECT parts.part_no,
parts.total_rows,
parts.part_rows,
COALESCE(LAG(data.row_num) OVER (ORDER BY parts.part_no) + 1, 1) AS start_row_num,
data.row_num AS end_row_num
FROM data
JOIN parts
ON data.row_num = ROUND(parts.part_no * parts.part_rows, 0)
)
SELECT bounds.part_no, d1.ID AS start_id, d2.ID AS end_id
FROM bounds
JOIN data d1
ON d1.row_num = bounds.start_row_num
JOIN data d2
ON d2.row_num = bounds.end_row_num
ORDER BY bounds.part_no;
PART_NO START_ID END_ID
---------- ---------- ----------
1 1 3
2 4 7
3 8 10
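The same idea, splitting by row position rather than by value, can be sketched outside the database. This is a minimal Python illustration, not a translation of the query: the function name `split_ranges` is hypothetical, and it assumes there are at least as many ids as parts.

```python
def split_ranges(ids, parts):
    """Return one (start_id, end_id) pair per part, splitting by row position.

    Assumes len(ids) >= parts so that every part receives at least one row.
    """
    ordered = sorted(ids)
    total = len(ordered)
    ranges = []
    start = 0
    for part_no in range(1, parts + 1):
        # Round part_no * total / parts to the nearest row number (halves round up),
        # using integer arithmetic to avoid floating-point surprises.
        end = (2 * part_no * total + parts) // (2 * parts)
        ranges.append((ordered[start], ordered[end - 1]))
        start = end
    return ranges


if __name__ == "__main__":
    print(split_ranges(range(1, 11), 3))
```

Since the boundaries are row positions, gaps in the id values do not affect how many rows each part receives.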


SQLITE get next row after ORDERBY

I need to get the next row from an ORDER BY query.
I have two columns, ID (primary key) and Age (float), in a table T, and I need something like the following:
SELECT ID FROM T WHERE !> (inputted ID) + 1 rowID/Next row <! ORDER BY Age LIMIT 1
(The ordering would be by Age and then by primary key, though I suspect that if the Age values are the same SQLite would default to ordering by primary key anyway.)
Essentially it would select the next row after the inputted ID in the ordered table; it's the "next row / rowID + 1" part that I am not sure how to get.
As suggested, here is an example data set:
https://dbfiddle.uk?rdbms=sqlite_3.27&fiddle=19685ac20cc42041a59d318a01a2010f
ID Age
1 12.2
2 36.8
3 22.5
4 41
5 16.7
I am attempting to get the following row from the list ordered by age, given a specific ID:
ID Age
1 12.2
5 16.7
3 22.5
2 36.8
4 41
Something similar to
SELECT ID FROM OrderedInfo WHERE ID = 5 ORDER BY Age ASC LIMIT 1 OFFSET 1;
My expected result would be '3' from the example data above
I have expanded the data set to include duplicate entries, as I didn't explicitly state that it could contain such data. forpas's answer works for the first example, which has no duplicate entries. Thanks for your help.
https://dbfiddle.uk?rdbms=sqlite_3.27&fiddle=f13d7f5a44ba414784547d9bbdf4997e
Use a subquery for the ID that you want in the WHERE clause:
SELECT *
FROM OrderedInfo
WHERE Age > (SELECT Age FROM OrderedInfo WHERE ID = 5)
ORDER BY Age LIMIT 1;
See the demo.
If there are duplicate values in the column Age, use a CTE that returns the row for the given ID and join it to the table, so that you can extend the conditions to break ties on ID:
WITH cte AS (SELECT ID, Age FROM OrderedInfo WHERE ID = 5)
SELECT o.*
FROM OrderedInfo o INNER JOIN cte c
ON o.Age > c.Age OR (o.Age = c.Age AND o.ID > c.ID)
ORDER BY o.Age, o.ID LIMIT 1;
See the demo.
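To try the tie-breaking query on the sample data, here is a small sketch using Python's built-in sqlite3 module. The helper names `make_ordered_info` and `next_id` are hypothetical, and row 6 (a second row with Age 16.7) is an assumed duplicate added for illustration; the contents of the second fiddle are not reproduced here.

```python
import sqlite3

# The query from the answer, with the ID supplied as a bound parameter.
NEXT_ROW_SQL = """
WITH cte AS (SELECT ID, Age FROM OrderedInfo WHERE ID = ?)
SELECT o.ID
FROM OrderedInfo o INNER JOIN cte c
  ON o.Age > c.Age OR (o.Age = c.Age AND o.ID > c.ID)
ORDER BY o.Age, o.ID LIMIT 1
"""


def make_ordered_info():
    """Build an in-memory table with the question's rows plus one assumed duplicate."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE OrderedInfo (ID INTEGER PRIMARY KEY, Age REAL)")
    conn.executemany(
        "INSERT INTO OrderedInfo (ID, Age) VALUES (?, ?)",
        [(1, 12.2), (2, 36.8), (3, 22.5), (4, 41), (5, 16.7), (6, 16.7)],
    )
    return conn


def next_id(conn, current_id):
    """Return the ID that follows current_id in (Age, ID) order, or None at the end."""
    row = conn.execute(NEXT_ROW_SQL, (current_id,)).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    conn = make_ordered_info()
    for current in (1, 5, 6, 4):
        print(current, "->", next_id(conn, current))
```

Ordering by both Age and ID gives every row a unique position, so walking from row to row never skips or repeats a duplicate.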

Why does COUNT return NULL instead of `0` in this query?

I have the query
select d.did, count ( h.did ), unique_interested
from dealer as d
left outer join house as h
on h.did = d.did
left outer join (
-- cid = customer id
select hid, count (cid) as unique_interested
from is_interested
group by hid
) as ok
on h.hid = ok.hid
group by d.did
order by d.did asc
;
which is supposed to select the number of houses that each dealer is dealing, and the number of unique customers interested in said houses (as in the number of customers per dealer). This should happen even if the dealers have no houses to deal at the moment, which is why I'm using left outer joins when constructing the table the columns will be picked from.
Now, running this query against my database produces the following output:
d.did  count ( h.did)  unique_interested
-----  --------------  -----------------
1      3
2      3               1
3      0
As you can see, instead of printing 0 in the last column, count returns null when there is a null in one of the apartments produced by the last part of the join (as in, cid is null):
select hid, count ( cid ) as unique_interested
from is_interested
group by hid
I know this is because there are apartments in the table produced by from, that no-one is interested in. But shouldn't count produce 0 instead of the actual column value null in every case?
Any explanation as to why this is happening would be appreciated, as it would lead me towards an answer to another question, which is "Why am I not getting the right number of unique interested customers per dealer from the table is_interested?", as with the current state of my database, the output should look more like:
d.did  count ( h.did)  unique_interested
-----  --------------  -----------------
1      3               2
2      3               2
3      0               0
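One likely explanation, offered as a sketch rather than a definitive diagnosis: COUNT never returns NULL for a group that exists, but a house with no rows in is_interested produces no group at all in the derived table, so the outer join fills unique_interested with NULL. The outer query also selects unique_interested without aggregating it, so it shows the value from a single joined row rather than a per-dealer total. The Python sqlite3 example below assumes a minimal schema and invented sample rows (the real tables are not shown in the question) and counts distinct customers per dealer directly.

```python
import sqlite3


def make_dealer_db():
    """Create an in-memory database with assumed columns and sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE dealer (did INTEGER PRIMARY KEY);
        CREATE TABLE house (hid INTEGER PRIMARY KEY, did INTEGER);
        CREATE TABLE is_interested (cid INTEGER, hid INTEGER);
        INSERT INTO dealer (did) VALUES (1), (2), (3);
        INSERT INTO house (hid, did) VALUES (1, 1), (2, 1), (3, 1), (4, 2), (5, 2), (6, 2);
        INSERT INTO is_interested (cid, hid) VALUES (10, 1), (11, 1), (10, 2), (10, 4), (12, 5);
    """)
    return conn


def dealer_summary(conn):
    """Return (did, houses, unique_interested) per dealer, with 0 instead of NULL."""
    return conn.execute("""
        SELECT d.did,
               COUNT(DISTINCT h.hid) AS houses,
               COUNT(DISTINCT i.cid) AS unique_interested
        FROM dealer AS d
        LEFT OUTER JOIN house AS h ON h.did = d.did
        LEFT OUTER JOIN is_interested AS i ON i.hid = h.hid
        GROUP BY d.did
        ORDER BY d.did
    """).fetchall()


if __name__ == "__main__":
    for row in dealer_summary(make_dealer_db()):
        print(row)
```

COUNT(DISTINCT ...) ignores the NULLs introduced by the outer joins, so a dealer with no houses, or houses nobody is interested in, gets 0 rather than NULL.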

Sqlite Increment column value based on previous row's value

How do I increment a column value based on the previous row's value in SQLite? I need to do this for 1000+ rows. The first row holds a value, say 100, and each of the next 1000 rows should be 2 more than the row before it.
Row# ColName
1 100
2 102
3 104
4 106
I tried something like this:
UPDATE Table SET ColName = (SELECT MAX(ColName) FROM Table) + 2
but this puts 102 in all rows.
Assuming that this table has a rowid column, it is possible to count how many previous rows there are:
UPDATE MyTable
SET ColName = (SELECT MAX(ColName)
FROM MyTable
) +
(SELECT COUNT(*)
FROM MyTable AS Previous
WHERE Previous.rowid < MyTable.rowid
) * 2
WHERE ColName IS NULL
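The row-counting approach can be tried with Python's built-in sqlite3 module. This sketch is a variation on the statement above, not a copy of it: it reads the starting value once in Python and passes it as a parameter, and the helper names `make_increment_db` and `fill_increments` are hypothetical. It assumes the row holding the starting value has the lowest rowid and that the remaining rows are NULL.

```python
import sqlite3


def make_increment_db():
    """Create MyTable with one starting value followed by empty rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE MyTable (ColName INTEGER)")
    conn.executemany(
        "INSERT INTO MyTable (ColName) VALUES (?)",
        [(100,), (None,), (None,), (None,)],
    )
    return conn


def fill_increments(conn, step=2):
    """Set each NULL row to base + step * (number of rows with a smaller rowid)."""
    (base,) = conn.execute("SELECT MAX(ColName) FROM MyTable").fetchone()
    conn.execute(
        """
        UPDATE MyTable
        SET ColName = ? + ? * (SELECT COUNT(*)
                               FROM MyTable AS Previous
                               WHERE Previous.rowid < MyTable.rowid)
        WHERE ColName IS NULL
        """,
        (base, step),
    )
    conn.commit()


if __name__ == "__main__":
    conn = make_increment_db()
    fill_increments(conn)
    print(conn.execute("SELECT rowid, ColName FROM MyTable ORDER BY rowid").fetchall())
```

The count depends only on rowids, which the update does not change, so each row's new value is independent of the order in which rows are processed.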

SQLITE syntax error performing an Update

I'm trying to improve a transit scheduling table by adding a column and flagging some rows to indicate they are the last stop for each trip.
Each trip will have many rows showing its stops and their sequence along the trip. I want to update the LastStop column with a '1' if the Sequence number is the highest for that trip.
I think the following SQL is on the right track, but I am getting "no such column: s1.stop_sequence", so I can't tell whether it is until this error, which is not obvious to me, is resolved. I am a SQL lightweight, barely beyond novice level. Stop_Sequence is definitely the correct name for the column.
UPDATE stop_times
SET LastStop = '1'
WHERE stop_sequence =(
SELECT max(st.stop_sequence)
FROM stop_times s1
WHERE s1.trip_id = trip_id
)
AND
trip_id = s1.trip_id
AND
stop_ID = s1.stop_id;
A simplified version of sample data is below.
TripID Stop Sequence LastStop
665381 1766 1
665381 3037 2
665381 3038 3 1
667475 1130 1
667475 2504 2 1
644501 2545 1
644501 3068 2
644501 2754 3
644501 3069 4
644501 2755 5 1
You cannot refer to a column in the subquery from the outer query: the alias s1 exists only inside the subquery.
Furthermore, the filter trip_id = s1.trip_id is duplicated, and you do not want to filter on stop_id because that would prevent the MAX from looking at any other stops of the trip.
Try this:
UPDATE stop_times
SET LastStop = '1'
WHERE Stop_Sequence = (SELECT MAX(Stop_Sequence)
FROM stop_times s1
WHERE s1.Trip_ID = stop_times.Trip_ID)
Alternatively, a last stop is a stop for which no other stop with a larger sequence number in the same trip exists:
UPDATE stop_times
SET LastStop = '1'
WHERE NOT EXISTS (SELECT 1
FROM stop_times s1
WHERE s1.Trip_ID = stop_times.Trip_ID
AND s1.Stop_Sequence > stop_times.Stop_Sequence)
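The correlated MAX form above can be checked against the question's sample data with Python's built-in sqlite3 module. This is a minimal sketch: the helper names `make_stop_times` and `flag_last_stops` are hypothetical, and the column types are assumptions, since the question does not show the table definition.

```python
import sqlite3

# (Trip_ID, Stop_ID, Stop_Sequence) rows taken from the sample data in the question.
SAMPLE_STOPS = [
    (665381, 1766, 1), (665381, 3037, 2), (665381, 3038, 3),
    (667475, 1130, 1), (667475, 2504, 2),
    (644501, 2545, 1), (644501, 3068, 2), (644501, 2754, 3),
    (644501, 3069, 4), (644501, 2755, 5),
]


def make_stop_times():
    """Create stop_times in memory with LastStop left NULL for every row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE stop_times ("
        "Trip_ID INTEGER, Stop_ID INTEGER, Stop_Sequence INTEGER, LastStop TEXT)"
    )
    conn.executemany(
        "INSERT INTO stop_times (Trip_ID, Stop_ID, Stop_Sequence) VALUES (?, ?, ?)",
        SAMPLE_STOPS,
    )
    return conn


def flag_last_stops(conn):
    """Mark the row with the highest Stop_Sequence in each trip."""
    conn.execute(
        """
        UPDATE stop_times
        SET LastStop = '1'
        WHERE Stop_Sequence = (SELECT MAX(Stop_Sequence)
                               FROM stop_times s1
                               WHERE s1.Trip_ID = stop_times.Trip_ID)
        """
    )
    conn.commit()


if __name__ == "__main__":
    conn = make_stop_times()
    flag_last_stops(conn)
    print(conn.execute(
        "SELECT Trip_ID, Stop_ID FROM stop_times WHERE LastStop = '1' ORDER BY Trip_ID"
    ).fetchall())
```

Qualifying the outer column as stop_times.Trip_ID is what makes the subquery correlated; an unqualified trip_id inside the subquery would resolve to s1's own column.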
This will work for you, as long as the sequence field is always less than 1000 (use a bigger multiplier if it is not):
UPDATE stop_times
SET laststop = 1
WHERE tripid*1000+sequence IN (
SELECT tripid*1000+sequence FROM (
SELECT tripid, max(sequence) AS sequence
FROM stop_times
GROUP BY 1
)
)
I would have written this using tuple (row value) syntax, but SQLite did not support it when this was written; row values were added in SQLite 3.15.0:
UPDATE stop_times
SET laststop = 1
WHERE (tripid, sequence) IN (
SELECT tripid, sequence FROM (
SELECT tripid, max(sequence) AS sequence
FROM stop_times
GROUP BY 1
)
)
Sorry, no SQLFiddle - it does not seem to work for me today.

Group by not returning 0 value

My table contains pk_id,reviewer_id,rating.
There are 4 type of rating.
1-very good.
2-good.
3-bad.
4-very bad.
I want to count how many ratings of each type a given reviewer has given.
For example:
Suppose Akee, who has id 200, has given 2 very good, 4 good, 3 bad and zero very bad ratings to different code.
I want this result:
count--- rate
2---------1
4---------2
3---------3
0---------4
My query is
SELECT COUNT(RATE),RATE
FROM CODE_REVIEW WHERE CODE_REVIEWER_ID= 200
GROUP BY RATE;
It is showing result
count--- rate
2---------1
4---------2
3---------3
I want to show the fourth row as well, that is, rating 4 with a count of zero.
How can this be done?
If Rate is not the primary key in another table, then you need to define your own list of rates so that MySQL knows which rate values can occur:
SELECT Rates.Rate,
COUNT(Code_Review.Rate) AS CountOfRate
FROM ( SELECT 1 AS Rate UNION ALL
SELECT 2 AS Rate UNION ALL
SELECT 3 AS Rate UNION ALL
SELECT 4
) AS Rates
LEFT JOIN Code_Review
ON Code_Review.Rate = Rates.Rate
AND CODE_REVIEWER_ID = 200
GROUP BY Rates.Rate
Try this query:
SELECT coalesce(c.cnt, 0), r.rate
FROM (SELECT 1 AS rate UNION ALL SELECT 2
UNION ALL SELECT 3 UNION ALL SELECT 4) AS r
LEFT JOIN (SELECT COUNT(RATE) AS cnt, RATE
FROM CODE_REVIEW WHERE CODE_REVIEWER_ID= 200
GROUP BY RATE) AS c
ON r.rate = c.rate;
The first subquery creates a list of possible rates. You can avoid it if you have a table which defines all rates;
Second subquery is yours;
LEFT JOIN guarantees that all rates will be shown;
coalesce() is needed to convert NULL into 0.
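The LEFT JOIN approach can be tried with Python's built-in sqlite3 module. This is a sketch under assumptions: the answers target MySQL, the column types are guessed from the question, the helper names `make_reviews` and `rating_counts` are hypothetical, and the rows for reviewer 201 are invented to show that the reviewer filter belongs in the ON clause.

```python
import sqlite3


def make_reviews():
    """Create Code_Review in memory: reviewer 200 gives 2, 4 and 3 ratings of 1, 2 and 3."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Code_Review ("
        "pk_id INTEGER PRIMARY KEY, code_reviewer_id INTEGER, rate INTEGER)"
    )
    rows = [(200, 1)] * 2 + [(200, 2)] * 4 + [(200, 3)] * 3 + [(201, 4)] * 2
    conn.executemany(
        "INSERT INTO Code_Review (code_reviewer_id, rate) VALUES (?, ?)", rows
    )
    return conn


def rating_counts(conn, reviewer_id):
    """Return (rate, count) for every rate 1..4, including rates with no reviews."""
    return conn.execute(
        """
        SELECT Rates.Rate, COUNT(Code_Review.rate) AS CountOfRate
        FROM (SELECT 1 AS Rate UNION ALL SELECT 2
              UNION ALL SELECT 3 UNION ALL SELECT 4) AS Rates
        LEFT JOIN Code_Review
          ON Code_Review.rate = Rates.Rate
         AND Code_Review.code_reviewer_id = ?
        GROUP BY Rates.Rate
        ORDER BY Rates.Rate
        """,
        (reviewer_id,),
    ).fetchall()


if __name__ == "__main__":
    print(rating_counts(make_reviews(), 200))
```

Putting the reviewer filter in the ON clause rather than in WHERE matters: a WHERE filter would discard the NULL-extended row for rate 4 and the zero would disappear again.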
Assuming that you do not have a separate table where the rates are defined (this relies on every rate appearing at least once in code_review for some reviewer):
SELECT * from (
SELECT distinct(m.rate), countrate from code_review m
LEFT JOIN
(SELECT COUNT(rate) as countrate,rate FROM code_review
WHERE code_reviewer_id=200 GROUP BY rate) t
ON m.rate=t.rate) a
You could do it something like this:
SELECT
rates.RATE
, COALESCE(SUM(Ratings200.COUNT), 0) COUNT
FROM
(
SELECT 1 RATE UNION ALL
SELECT 2 RATE UNION ALL
SELECT 3 RATE UNION ALL
SELECT 4 RATE
) Rates
LEFT JOIN
(
SELECT
RATE
, COUNT(RATE) COUNT
FROM
CODE_REVIEW
WHERE
CODE_REVIEWER_ID= 200
GROUP BY RATE
) Ratings200
ON Ratings200.RATE = Rates.RATE
GROUP BY rates.RATE
If you can, you should push to get it in column format, as it is as simple as:
SELECT
SUM(rate = 1) AS `1`,
SUM(rate = 2) AS `2`,
SUM(rate = 3) AS `3`,
SUM(rate = 4) AS `4`
FROM
code_review
WHERE
code_reviewer_id = 200
But if you really need a row format, you could do:
SELECT
a.rate,
COUNT(b.rate) AS cnt
FROM
(
SELECT 1 AS rate UNION ALL
SELECT 2 AS rate UNION ALL
SELECT 3 AS rate UNION ALL
SELECT 4 AS rate
) a
LEFT JOIN
code_review b ON a.rate = b.rate AND code_reviewer_id = 200
GROUP BY
a.rate
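Another option is to union the real counts with a zero row for each rate and keep the larger value per rate:
SELECT
Rate,
MAX(totCount) AS totCount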
FROM
(
Select
Rate,
count(Rate) as totCount
from
Code_Review
where
CODE_REVIEWER_ID = 200
group by
Rate
union
select 4, 0
union
select 3, 0
union
select 2, 0
union
select 1, 0
) AS T
group by
T.Rate
