I want to write a query to duplicate each row based on the value in the repeat column. The input table data looks like as below
Products, Repeat
----------------
A, 3
B, 5
C, 2
Now in the output data, the product A should be repeated 3 times, B should be repeated 5 times and C should be repeated 2 times. The output will look like as below
Products, Repeat
----------------
A, 3
A, 3
A, 3
B, 5
B, 5
B, 5
B, 5
B, 5
C, 2
C, 2
Can any body suggest me how to do it in Teradata?
There are several ways to get this result:
#1: classical SQL would be a JOIN to a number-table with sequential values ON n BETWEEN 1 AND Repeat, depending on the actula data might need a lot of CPU
#2: a Recursive Select, effcient as long as Repeat is a small number
WITH RECURSIVE cte AS
(
SELECT t.*, repeat - 1 AS lvl -- counting down to zero
FROM tab AS t
WHERE repeat > 0
UNION ALL
SELECT Products, repeat, lvl-1 FROM cte
WHERE lvl > 0
)
SELECT Products, repeat_
FROM cte
#3: Abusing EXPAND ON, proprietary syntax for PERIOD, but most efficient:
SELECT *
FROM tab
EXPAND ON PERIOD(DATE '2000-01-01', DATE '2000-01-01' + repeat_) AS pd
Related
I am just learning SQL and I got a task, that I need to find the final length of a discontinuous line when I have imput such as:
start | finish
0 | 3
2 | 7
15 | 17
And the correct answer here would be 9, because it spans from 0-3 and then I am suppsed to ignore the parts that are present multiple times so from 3-7(ignoring the two because it is between 0 and 3 already) and 15-17. I am supposed to get this answer solely through an sql query(no functions) and I am unsure of how. I have tried to experiment with some code using with, but I can't for the life of me figure out how to ignore all the multiples properly.
My half-attempt:
WITH temp AS(
SELECT s as l, f as r FROM lines LIMIT 1),
cte as(
select s, f from lines where s < (select l from temp) or f > (select r from temp)
)
select * from cte
This really only gives me all the rows tha are not completly usless and extend the length, but I dont know what to do from here.
Use a recursive CTE that breaks all the (start, finish) intervals to as many 1 unit length intervals as is the total length of the interval and then count all the distinct intervals:
WITH cte AS (
SELECT start x1, start + 1 x2, finish FROM temp
WHERE start < finish -- you can omit this if start < finish is always true
UNION
SELECT x2, x2 + 1, finish FROM cte
WHERE x2 + 1 <= finish
)
SELECT COUNT(DISTINCT x1) length
FROM cte
See the demo.
Result:
length
9
I have a big table which is 100k rows in size and the PRIMARY KEY is of the datatype NUMBER. The way data is populated in this column is using a random number generator.
So my question is, can there be a possibility to have a SQL query that can help me with getting partition the table evenly with the range of values. Eg: If my column value is like this:
1
2
3
4
5
6
7
8
9
10
And I would like this to be broken into three partitions, then I would expect an output like this:
Range 1 1-3
Range 2 4-7
Range 3 8-10
It sounds like you want the WIDTH_BUCKET() function. Find out more.
This query will give you the start and end range for a table of 1250 rows split into 20 buckets based on id:
with bkt as (
select id
, width_bucket(id, 1, 1251, 20) as id_bucket
from t23
)
select id_bucket
, min(id) as bkt_start
, max(id) as bkt_end
, count(*)
from bkt
group by id_bucket
order by 1
;
The two middle parameters specify min and max values; the last parameter specifies the number of buckets. The output is the rows between the minimum and maximum bows split as evenly as possible into the specified number of buckets. Be careful with the min and max parameters; I've found poorly chosen bounds can have an odd effect on the split.
This solution works without width_bucket function. While it is more verbose and certainly less efficient it will split the data as evenly as possible, even if some ID values are missing.
CREATE TABLE t AS
SELECT rownum AS id
FROM dual
CONNECT BY level <= 10;
WITH
data AS (
SELECT id, rownum as row_num
FROM t
),
total AS (
SELECT count(*) AS total_rows
FROM data
),
parts AS (
SELECT rownum as part_no, total.total_rows, total.total_rows / 3 as part_rows
FROM dual, total
CONNECT BY level <= 3
),
bounds AS (
SELECT parts.part_no,
parts.total_rows,
parts.part_rows,
COALESCE(LAG(data.row_num) OVER (ORDER BY parts.part_no) + 1, 1) AS start_row_num,
data.row_num AS end_row_num
FROM data
JOIN parts
ON data.row_num = ROUND(parts.part_no * parts.part_rows, 0)
)
SELECT bounds.part_no, d1.ID AS start_id, d2.ID AS end_id
FROM bounds
JOIN data d1
ON d1.row_num = bounds.start_row_num
JOIN data d2
ON d2.row_num = bounds.end_row_num
ORDER BY bounds.part_no;
PART_NO START_ID END_ID
---------- ---------- ----------
1 1 3
2 4 7
3 8 10
Sorry because I dont think good title for my problem.
I have table a(f1 integer, date Long), date increase, and the data
f1 date
1 1
2 2
3 3
...
I need to sum f1 by date, with record 1{1,1} the sum f1 is 1,with record 2 the sum f1 is 1+2, record 3 the sum f1 is 1+2+3...
How can I do that?
This requires a correlated subquery:
SELECT date,
(SELECT SUM(f1)
FROM a AS a2
WHERE a2.date <= a.date
) AS f1_sum
FROM a
ORDER BY date;
But it's inefficient. Consider just scanning the table, sorted by the date, and summing f1 as you're reading it.
I don't know if I'm being dumb here but I can't seem to find an efficient way to do this. I wrote a very long and inefficient query that does what I need, but what I WANT is a more efficient way.
I have 2 result sets that displays an ID (a PK which is generic/from the same source in both sets) and a FLAG (A - approve and V - Validate).
Result Set 1
ID FLAG
1 V
2 V
3 V
4 V
5 V
6 V
Result Set 2
ID FLAG
2 A
5 A
7 A
8 A
I want to "merge" these two sets to give me this output:
ID FLAG
1 V
2 (V/A)
3 V
4 V
5 (V/A)
6 V
7 A
8 A
Neither of the 2 result sets will at any time have all the ID's to make a simple left join with a case statement on the other result set an easy solution.
I'm currently doing a union between the two sets to get ALL the ID's. Thereafter I left join the 2 result sets to get the required '(V/A)' by use of a case statement.
There must be a more efficient way but I just can't seem to figure it out now as I'm running low on amps... I need a holiday... :-/
Thanks in advance!
Use a FULL OUTER JOIN:
SELECT ID,
CASE
WHEN t1.FLAG IS NULL THEN t2.FLAG
WHEN t2.FLAG IS NULL THEN t1.FLAG
ELSE '(' || t1.FLAG || '/' || t2.FLAG || ')'
END AS MERGED_FLAG
FROM TABLE1 t1
FULL OUTER JOIN TABLE2 t2
USING (ID)
ORDER BY ID
See this SQLFiddle.
Share and enjoy.
I think that you can use xmlagg. Here an exemple :
SELECT deptno,
SUBSTR (REPLACE (REPLACE (XMLAGG (XMLELEMENT ("x", ename)
ORDER BY ename),'</x>'),'<x>','|'),2) as concated_list
FROM emp
GROUP BY deptno
ORDER BY deptno;
Bye
how can i generate 6 numbers between 1 and 2 where 4 of the numbers will be 1 and the other 2 will be 2 in a random order i.e.
results
1
2
1
1
1
2
and also in a different ratio i.e. 3:2:1 for numbers between 1 and 3 for 12 numbers
i.e.
results
1
1
2
3
1
2
1
3
1
1
3
3
results don't have to be in this order but in the ratios as above in oracle SQL or PL/SQL
To get the ratios perfect you could do something like this - generate all the numbers, then sort in random order:
SELECT r
FROM (SELECT CASE
WHEN ROWNUM <=4 THEN 1
ELSE 2
END AS r
FROM DUAL
CONNECT BY LEVEL <= 6)
ORDER BY DBMS_RANDOM.value;
R
----------------------
2
1
1
2
1
1
I think this will work in straight SQL; it's horrifically inefficient, and a PL/SQL one might be less so. It's also completely static; differing ratios call for a different number of values selected.
select value
from (
select mod(value, 2) + 1 as value,
row_number() over (partition by
case mod(value, 2) = 1
then 1
else 0
end) as twos_row,
row_number() over (partition by
case mod(value, 2) = 0
then 1
else 0
end) as ones_row
from (select dbms_crypto.randominteger as value
from dba_objects
order by object_id
)
)
where twos_rows <= 2
or ones_rows <= 4
The inner-most select grabs a big stack of random numbers. The next query out determines whether that random value would be a 2 or a 1 by mod'ing the earlier random value. The last level of nesting just filters out all the rows after the correct number of that type of row has been returned.
This is untested and fragile. If you need a solution that's reliable and performance, I'd recommend PL/SQL, where you
loop
pick off random numbers
determine what partition in your set of values they'd fit into
keep them if that partition hasn't been satisfied
exit when all partitions have been satisfied.