I'm trying to improve a transit scheduling table by adding a column and flagging some rows to indicate they are the last stop for each trip.
Each trip will have many rows showing its stops and their sequence along the trip. I want to update the LastStop column with a '1' if the Sequence number is the highest for that trip.
I think the following SQL is on the right track, but I am getting a "no such column: s1.stop_sequence" error, so I cannot tell whether the approach is sound until this error, which is not obvious to me, is resolved. I am a SQL lightweight, barely beyond novice level. Stop_Sequence is definitely the correct name for the column.
UPDATE stop_times
SET LastStop = '1'
WHERE stop_sequence =(
SELECT max(st.stop_sequence)
FROM stop_times s1
WHERE s1.trip_id = trip_id
)
AND
trip_id = s1.trip_id
AND
stop_ID = s1.stop_id;
A simplified version of sample data is below.
TripID  Stop  Sequence  LastStop
665381  1766  1
665381  3037  2
665381  3038  3         1
667475  1130  1
667475  2504  2         1
644501  2545  1
644501  3068  2
644501  2754  3
644501  3069  4
644501  2755  5         1
You cannot refer to a column of the subquery's table from the outer query: the alias s1 is only visible inside the subquery where it is declared.
Furthermore, the filter trip_id = s1.trip_id is duplicated, and you do not want to filter on stop_id because that would prevent the MAX from looking at any other stops of the trip.
Try this:
UPDATE stop_times
SET LastStop = '1'
WHERE Stop_Sequence = (SELECT MAX(Stop_Sequence)
FROM stop_times s1
WHERE s1.Trip_ID = stop_times.Trip_ID)
Alternatively, a last stop is a stop for which no other stop with a larger sequence number in the same trip exists:
UPDATE stop_times
SET LastStop = '1'
WHERE NOT EXISTS (SELECT 1
FROM stop_times s1
WHERE s1.Trip_ID = stop_times.Trip_ID
AND s1.Stop_Sequence > stop_times.Stop_Sequence)
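Both forms of the update can be checked against the question's sample data. The following is a minimal sketch using Python's standard sqlite3 module and an in-memory database; the column names Trip_ID, Stop_ID and Stop_Sequence and the helper flagged_stops are assumptions made for the illustration, not the asker's real schema.

```python
import sqlite3

# Sample rows from the question: (Trip_ID, Stop_ID, Stop_Sequence)
ROWS = [
    (665381, 1766, 1), (665381, 3037, 2), (665381, 3038, 3),
    (667475, 1130, 1), (667475, 2504, 2),
    (644501, 2545, 1), (644501, 3068, 2), (644501, 2754, 3),
    (644501, 3069, 4), (644501, 2755, 5),
]

# Correlated subquery: the row's sequence equals the trip's maximum.
UPDATE_MAX = """
UPDATE stop_times
SET LastStop = '1'
WHERE Stop_Sequence = (SELECT MAX(Stop_Sequence)
                       FROM stop_times s1
                       WHERE s1.Trip_ID = stop_times.Trip_ID)
"""

# NOT EXISTS: no other stop of the same trip has a larger sequence.
UPDATE_NOT_EXISTS = """
UPDATE stop_times
SET LastStop = '1'
WHERE NOT EXISTS (SELECT 1
                  FROM stop_times s1
                  WHERE s1.Trip_ID = stop_times.Trip_ID
                    AND s1.Stop_Sequence > stop_times.Stop_Sequence)
"""


def flagged_stops(update_sql):
    """Run one UPDATE on a fresh table and return the flagged (trip, stop) pairs."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE stop_times "
        "(Trip_ID INTEGER, Stop_ID INTEGER, Stop_Sequence INTEGER, LastStop TEXT)"
    )
    con.executemany(
        "INSERT INTO stop_times (Trip_ID, Stop_ID, Stop_Sequence) VALUES (?, ?, ?)",
        ROWS,
    )
    con.execute(update_sql)
    rows = con.execute(
        "SELECT Trip_ID, Stop_ID FROM stop_times "
        "WHERE LastStop = '1' ORDER BY Trip_ID"
    ).fetchall()
    con.close()
    return rows


print(flagged_stops(UPDATE_MAX))
print(flagged_stops(UPDATE_NOT_EXISTS))
```

Both statements should flag exactly one row per trip, the one with the highest sequence number.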
This will work for you, as long as the sequence value is always less than 1000 (use a bigger multiplier if it is not):
UPDATE stop_times
SET laststop = 1
WHERE tripid*1000+sequence IN (
SELECT tripid*1000+sequence FROM (
SELECT tripid, max(sequence) AS sequence
FROM stop_times
GROUP BY 1
)
)
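The reason the multiplier has to exceed the largest sequence value is that otherwise two different (trip, sequence) pairs can map to the same combined number. A small sketch of that arithmetic, where composite_key is a hypothetical helper that mirrors the tripid*1000+sequence expression:

```python
def composite_key(trip_id, sequence, multiplier=1000):
    """Combine trip id and sequence into one number, as the SQL expression does."""
    return trip_id * multiplier + sequence


# With sequences below the multiplier, distinct pairs give distinct keys.
safe_a = composite_key(1, 999)   # 1999
safe_b = composite_key(2, 0)     # 2000

# Once a sequence reaches the multiplier, two different pairs collide.
clash_a = composite_key(1, 1000)  # 2000
clash_b = composite_key(2, 0)     # 2000

print(safe_a != safe_b, clash_a == clash_b)
```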
I would have written this using tuple (row value) syntax, but SQLite versions before 3.15.0 do not support it:
UPDATE stop_times
SET laststop = 1
WHERE (tripid, sequence) IN (
SELECT tripid, sequence FROM (
SELECT tripid, max(sequence) AS sequence
FROM stop_times
GROUP BY 1
)
)
Sorry, no SQLFiddle - it does not seem to work for me today.
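For what it is worth, SQLite 3.15.0 and later accept row values, so the tuple form can be tried directly. A minimal sketch with Python's sqlite3 module follows; the table layout and the two sample trips are taken from the question, and the subquery is written with two plain columns rather than a parenthesised pair so that it matches the two-column row value on the left.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE stop_times "
    "(tripid INTEGER, stop INTEGER, sequence INTEGER, laststop INTEGER)"
)
con.executemany(
    "INSERT INTO stop_times (tripid, stop, sequence) VALUES (?, ?, ?)",
    [
        (665381, 1766, 1), (665381, 3037, 2), (665381, 3038, 3),
        (667475, 1130, 1), (667475, 2504, 2),
    ],
)

# Row-value IN: needs SQLite 3.15.0 or newer.
con.execute(
    """
    UPDATE stop_times
    SET laststop = 1
    WHERE (tripid, sequence) IN (
        SELECT tripid, MAX(sequence)
        FROM stop_times
        GROUP BY tripid
    )
    """
)

last_stops = con.execute(
    "SELECT tripid, stop FROM stop_times WHERE laststop = 1 ORDER BY tripid"
).fetchall()
print(last_stops)
```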
Related
I have a big table which is 100k rows in size and the PRIMARY KEY is of the datatype NUMBER. The way data is populated in this column is using a random number generator.
So my question is: is there a SQL query that can partition the table evenly by ranges of the key values? E.g., if my column values look like this:
1
2
3
4
5
6
7
8
9
10
And I would like this to be broken into three partitions, then I would expect an output like this:
Range 1 1-3
Range 2 4-7
Range 3 8-10
It sounds like you want the WIDTH_BUCKET() function.
This query will give you the start and end range for a table of 1250 rows split into 20 buckets based on id:
with bkt as (
select id
, width_bucket(id, 1, 1251, 20) as id_bucket
from t23
)
select id_bucket
, min(id) as bkt_start
, max(id) as bkt_end
, count(*)
from bkt
group by id_bucket
order by 1
;
The two middle parameters specify min and max values; the last parameter specifies the number of buckets. The output is the rows between the minimum and maximum bounds split as evenly as possible into the specified number of buckets. Be careful with the min and max parameters; I've found poorly chosen bounds can have an odd effect on the split.
This solution works without the width_bucket function. While it is more verbose and certainly less efficient, it will split the data as evenly as possible, even if some ID values are missing.
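To see why the bounds matter, here is a rough Python model of equal-width bucketing as WIDTH_BUCKET does it: values below the minimum go to bucket 0, values at or above the maximum go to an overflow bucket numbered one past the last. The functions width_bucket and bucket_ranges are illustrative stand-ins written for this sketch, not Oracle code.

```python
import math


def width_bucket(value, low, high, buckets):
    """Equal-width bucket number over [low, high), with under/overflow buckets."""
    if value < low:
        return 0                      # underflow bucket
    if value >= high:
        return buckets + 1            # overflow bucket
    return math.floor((value - low) * buckets / (high - low)) + 1


def bucket_ranges(ids, low, high, buckets):
    """Return {bucket: (start_id, end_id, count)}, like the grouped query."""
    ranges = {}
    for i in ids:
        b = width_bucket(i, low, high, buckets)
        start, end, count = ranges.get(b, (i, i, 0))
        ranges[b] = (min(start, i), max(end, i), count + 1)
    return dict(sorted(ranges.items()))


# Upper bound one past the largest id: every id lands in buckets 1..3.
print(bucket_ranges(range(1, 11), 1, 11, 3))

# Upper bound equal to the largest id: id 10 spills into overflow bucket 4.
print(bucket_ranges(range(1, 11), 1, 10, 3))
```

Note that equal-width buckets are even in value range, not necessarily in row count, which is why gaps in the ids can skew the split.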
CREATE TABLE t AS
SELECT rownum AS id
FROM dual
CONNECT BY level <= 10;
WITH
data AS (
SELECT id, rownum as row_num
FROM t
),
total AS (
SELECT count(*) AS total_rows
FROM data
),
parts AS (
SELECT rownum as part_no, total.total_rows, total.total_rows / 3 as part_rows
FROM dual, total
CONNECT BY level <= 3
),
bounds AS (
SELECT parts.part_no,
parts.total_rows,
parts.part_rows,
COALESCE(LAG(data.row_num) OVER (ORDER BY parts.part_no) + 1, 1) AS start_row_num,
data.row_num AS end_row_num
FROM data
JOIN parts
ON data.row_num = ROUND(parts.part_no * parts.part_rows, 0)
)
SELECT bounds.part_no, d1.ID AS start_id, d2.ID AS end_id
FROM bounds
JOIN data d1
ON d1.row_num = bounds.start_row_num
JOIN data d2
ON d2.row_num = bounds.end_row_num
ORDER BY bounds.part_no;
PART_NO START_ID END_ID
---------- ---------- ----------
1 1 3
2 4 7
3 8 10
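The idea behind the query above is to cut by row position rather than by value: part k ends at row ROUND(k * total / parts). A small Python sketch of the same arithmetic is below; partition_bounds is a hypothetical helper, it sorts the ids first (the SQL relies on ROWNUM order instead), and it assumes there are at least as many rows as parts.

```python
def partition_bounds(ids, parts):
    """Return (part_no, start_id, end_id) tuples, cutting by row position."""
    ordered = sorted(ids)
    total = len(ordered)
    bounds = []
    start = 0  # zero-based index of the first row in the current part
    for part_no in range(1, parts + 1):
        # ROUND(part_no * total / parts) with halves rounded up, in integers
        end = (2 * part_no * total + parts) // (2 * parts)
        bounds.append((part_no, ordered[start], ordered[end - 1]))
        start = end
    return bounds


print(partition_bounds(range(1, 11), 3))

# Missing id values do not disturb the row counts per part.
print(partition_bounds([1, 2, 5, 8, 9, 12, 20, 21, 30, 40], 3))
```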
The PROD_AMT I'd like to get is as follows: when ACCT_NBR, PROD_NBR and PROD_AMT are all the same, I only need one PROD_AMT, which is 100 (from distinct); when ACCT_NBR is the same but PROD_NBR differs, the PROD_AMT I need is 90 (30+60).
SELECT ACCT_NBR
,COUNT(DISTINCT CASE WHEN PROD_NBR = 1 THEN SUM(DISTINCT PROD_AMT)
WHEN PROD_NBR > 1 THEN SUM(PROD_AMT)
END) AS AMT
FROM TABLE
ACCT_NBR PROD_NBR PROD_AMT
3007 001 30
3007 002 60
1000 003 100
1000 003 100
There's probably a few ways to solve this. Using a subquery to determine which records should be summed vs which ones should be distinct, you could use:
SELECT
acct_nbr,
CASE WHEN sumflag = 'X' THEN SUM(prod_amt) ELSE MAX(prod_amt) END as amt
FROM
(
SELECT
acct_nbr,
prod_nbr,
prod_amt,
CASE WHEN COUNT(*) OVER (PARTITION BY Acct_nbr, prod_nbr, prod_amt) = 1 THEN 'X' ELSE NULL END AS sumflag
FROM
table
)t1
GROUP BY acct_nbr, sumflag
I'm just using MAX() here since it doesn't matter... all the values that will be aggregated with max() we know are duplicates, so it's a wash.
You could get similar results with a UNION query where one query would do the summing in the event that the records are distinct, and the other would just return distinct prod_amt's where the records are duplicates.
While the above example is nice if you truly have different aggregation needs depending on complex logic, for your question there's a simpler way of doing the same thing that doesn't use window functions:
SELECT
acct_nbr,
sum(prod_amt) AS amt
FROM
(
SELECT DISTINCT
acct_nbr,
prod_nbr,
prod_amt
FROM
table
)t1
GROUP BY 1
If you need to adapt this to a complex statement you could just sling your complex statement in as subquery where table is above like:
SELECT
acct_nbr,
sum(prod_amt) AS amt
FROM
(
SELECT DISTINCT
acct_nbr,
prod_nbr,
prod_amt
FROM
(
YOUR REALLY COMPLEX QUERY GOES IN HERE
)t2
)t1
GROUP BY 1
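The distinct-then-sum approach can be tried against the question's four sample rows. This is a minimal sketch with Python's sqlite3 module; the table name prod is an assumption (TABLE itself is a reserved word), and prod_nbr is included in the DISTINCT so that only rows identical in all three columns are collapsed.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE prod (acct_nbr INTEGER, prod_nbr TEXT, prod_amt INTEGER)")
con.executemany(
    "INSERT INTO prod VALUES (?, ?, ?)",
    [
        (3007, "001", 30),
        (3007, "002", 60),
        (1000, "003", 100),
        (1000, "003", 100),  # exact duplicate, should be counted once
    ],
)

totals = con.execute(
    """
    SELECT acct_nbr, SUM(prod_amt) AS amt
    FROM (
        SELECT DISTINCT acct_nbr, prod_nbr, prod_amt
        FROM prod
    ) t1
    GROUP BY acct_nbr
    ORDER BY acct_nbr
    """
).fetchall()
print(totals)
```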
I have gone through a similar post in Stack overflow...
but my query is :
If my query generates a flag at run time, how can I increase Grp_number (also generated at run time) every time the flag changes?
my Oracle query:
Select emp_id,
Case when MOD(rownum/3)=1 and rownum>1 then 'Y' else 'N' as flag
from Transaction_table
Desired o/p Data format:
emp_id Flag GRP_number
1 N 1
2 N 1
3 N 1
4 Y 2
5 N 2
6 N 2
7 Y 3
You cannot reference a column alias from another expression in the same select list. You need to use a subquery to avoid the INVALID IDENTIFIER error.
Do it like -
WITH DATA AS(
SELECT emp_id,
CASE
WHEN MOD(rownum, 3) = 1
AND rownum > 1
THEN 'Y'
ELSE 'N'
END AS flag
FROM Transaction_table
)
-- A running sum of gap gives the group number
SELECT emp_id, flag, SUM(gap) over (ORDER BY emp_id) grp
FROM(
-- gap is 1 when flag differs from the previous row's flag (or there is no previous row)
SELECT emp_id, flag,
CASE WHEN flag = lag(flag) over (ORDER BY emp_id)
THEN 0
ELSE 1
END gap
FROM DATA)
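Note that a gap computed from "flag differs from the previous row" starts a new group on every change, including Y back to N, whereas the desired output in the question only starts a new group at each 'Y' row. A sketch of that second reading, as a running count of 'Y' rows, is below. It uses Python's sqlite3 module (window functions need SQLite 3.25 or newer), and emp_id stands in for Oracle's ROWNUM, which is an assumption that only holds while emp_id runs 1, 2, 3, ... in row order.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE transaction_table (emp_id INTEGER)")
con.executemany(
    "INSERT INTO transaction_table VALUES (?)", [(i,) for i in range(1, 8)]
)

groups = con.execute(
    """
    WITH data AS (
        SELECT emp_id,
               CASE WHEN emp_id % 3 = 1 AND emp_id > 1 THEN 'Y' ELSE 'N' END AS flag
        FROM transaction_table
    )
    SELECT emp_id,
           flag,
           -- group number = 1 + number of 'Y' rows seen so far
           1 + SUM(CASE WHEN flag = 'Y' THEN 1 ELSE 0 END)
                   OVER (ORDER BY emp_id) AS grp_number
    FROM data
    ORDER BY emp_id
    """
).fetchall()

for row in groups:
    print(row)
```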
I have a query in Teradata. I want to add an additional column that would be a VARCHAR.
It should say whether the selected record is even or odd
select id, name, CASE newColumn WHEN --- ???
from my table
Like this
id name newColumn
1 asdf odd
2 ts df even
32 htssdf odd
4 asdfsd even
23 gftht odd
How can I do this
Based on your example, I can't tell how you are sorting the results. You would need to define a sort order. Let's assume you would do it based on the id number.
SELECT id, name,
ROW_NUMBER() OVER(ORDER BY id) row_id,
CASE WHEN ROW_NUMBER() OVER(ORDER BY id) MOD 2 = 0 THEN 'Even' ELSE 'Odd' END newColumn
FROM my table
The row_id is incrementally assigned based on the id field being sorted ascending. You then use the MOD function to determine if there's a remainder after dividing the number by a value (in this case 2). Result would look like the following:
id name row_id newColumn
1 asdf 1 Odd
2 ts df 2 Even
4 asdfsd 3 Odd
23 gftht 4 Even
32 htssdf 5 Odd
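The same row-number-and-remainder idea can be reproduced outside Teradata. The sketch below uses Python's sqlite3 module, where the remainder operator is % rather than MOD and window functions need SQLite 3.25 or newer; the table name my_table is an assumption.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE my_table (id INTEGER, name TEXT)")
con.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [(1, "asdf"), (2, "ts df"), (32, "htssdf"), (4, "asdfsd"), (23, "gftht")],
)

labelled = con.execute(
    """
    SELECT id,
           name,
           ROW_NUMBER() OVER (ORDER BY id) AS row_id,
           CASE WHEN ROW_NUMBER() OVER (ORDER BY id) % 2 = 0
                THEN 'Even' ELSE 'Odd' END AS newColumn
    FROM my_table
    ORDER BY id
    """
).fetchall()

for row in labelled:
    print(row)
```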
I need some help to build SQL Query. I have table having data like:
ID Date Name
1 1/1/2009 a
2 1/2/2009 b
3 1/3/2009 c
I need to get result something like...
1 1/1/2009 a
2 1/2/2009 b
3 1/3/2009 c
4 1/4/2009 Null
5 1/5/2009 Null
6 1/6/2009 Null
7 1/7/2009 Null
8 1/8/2009 Null
............................
............................
............................
30 1/30/2009 Null
31 1/31/2009 Null
I want query something like..
Select * from tbl where month(Date)=1 AND year(Date)=2010
Above is not completed query.
I need to get all the record of particular month, even if some date missing..
I guess there must be equi Join in the query, I am trying to build this query using Equi join
Thanks
BIG EDIT
Now I understand the OP's question.
Use a common table expression and a left join to get this effect.
DECLARE @FirstDay DATETIME;
-- Set start time
SELECT @FirstDay = '2009-01-01';
WITH Days AS
(
SELECT @FirstDay as CalendarDay
UNION ALL
SELECT DATEADD(d, 1, CalendarDay) as CalendarDay
FROM Days
WHERE DATEADD(d, 1, CalendarDay) < DATEADD(m, 1, @FirstDay)
)
SELECT DATEPART(d,d.CalendarDay), d.CalendarDay, t.Name FROM Days d
LEFT JOIN tbl t
ON
d.CalendarDay = t.Date
ORDER BY
d.CalendarDay;
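The calendar-plus-left-join pattern is not specific to SQL Server. As a rough illustration, here is the same idea in SQLite through Python's sqlite3 module, using WITH RECURSIVE and the date() function in place of DATEADD; dates are stored as ISO text, which is an assumption of this sketch rather than something the question states.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE tbl (id INTEGER, date TEXT, name TEXT)")
con.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?)",
    [(1, "2009-01-01", "a"), (2, "2009-01-02", "b"), (3, "2009-01-03", "c")],
)

calendar = con.execute(
    """
    WITH RECURSIVE days(calendar_day) AS (
        SELECT date(:first_day)
        UNION ALL
        SELECT date(calendar_day, '+1 day')
        FROM days
        WHERE date(calendar_day, '+1 day') < date(:first_day, '+1 month')
    )
    SELECT CAST(strftime('%d', d.calendar_day) AS INTEGER) AS day_no,
           d.calendar_day,
           t.name              -- NULL for days with no matching row
    FROM days d
    LEFT JOIN tbl t ON t.date = d.calendar_day
    ORDER BY d.calendar_day
    """,
    {"first_day": "2009-01-01"},
).fetchall()

print(len(calendar), calendar[0], calendar[3], calendar[-1])
```

Every day of the month appears once, and days without data carry None (NULL) in the name column.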
I have left my original answer below.
You need DATEPART, sir.
SELECT * FROM tbl WHERE DATEPART(m,Date) = 1
If you want to choose month and year, then you can use DATEPART twice or go for a range.
SELECT * FROM tbl WHERE DATEPART(m,Date) = 1 AND DATEPART(yyyy,Date) = 2009
Range :-
SELECT * FROM tbl WHERE Date >= '2009-01-01' AND Date < '2009-02-01'
See this link for more info on DATEPART.
http://msdn.microsoft.com/en-us/library/ms174420.aspx
You can use a greater-than-or-equal and a less-than comparison.
Like so:
select * from tbl where date >= '2009-01-01' and date < '2009-02-01'
However, it is unclear if you want month 1 from all years?
You can check more examples and functions on "Date and Time Functions" from MSDN
Create a temporary table containing all the days of the month in question.
Then do a left outer join between that table and your data table on the date, restricting the temporary table to the desired month (tempTable.month = @month).
You now have a result with every day of the desired month: the records that match a date, plus empty records for the dates that have no data.
I hope that's what you want.