Avoid repetition of subquery

Avoid repetition of subquery - sqlite

I have a table messages that contains a column message_internaldate. Now I'd like to count the messages within certain time periods (each hour of a day) over several months. I could manage to get the sum of messages per hour by having lots of subqueries (24) but I hope that there is a more brainy way to do that. The subqueries are similar except that the time period changes. Any suggestions?
e.g. for the first two hours
SELECT T1, T2 FROM
(
SELECT sum(T1c) as T1 FROM
(
SELECT strftime('%H:%M',message_internaldate) AS T1s ,count(*) as T1c FROM messages WHERE
message_internaldate BETWEEN '2005-01-01 00:00:00' AND '2012-12-31 00:00:00'
AND strftime('%H:%M',message_internaldate) BETWEEN '01:00'AND '01:59'
GROUP BY strftime('%H:%M',message_internaldate)
)
)
,
(
SELECT sum(T2c) as T2 FROM
(
SELECT strftime('%H:%M',message_internaldate) AS T2s ,count(*) as T2c FROM messages WHERE
message_internaldate BETWEEN '2005-01-01 00:00:00' AND '2012-12-31 00:00:00'
AND strftime('%H:%M',message_internaldate) BETWEEN '02:00'AND '02:59'
GROUP BY strftime('%H:%M',message_internaldate)
)
)
...

Your problem is that you want to have the individual hours as columns.
To get them as rows, try a query like this:
SELECT strftime('%H', message_internaldate) AS hour,
strftime('%H:%M', message_internaldate) AS Ts,
COUNT(*) AS Tc
FROM messages
WHERE message_internaldate BETWEEN '2005-01-01 00:00:00' AND '2012-12-31 23:59:59'
GROUP BY 1, 2

Related

Impala - Working hours between two dates in impala

I have two time stamps #starttimestamp and #endtimestamp. How to calculate number of working hours between these two
Working hours is defined below:
Mon- Thursday (9:00-17:00)
Friday (9:00-13:00)
Have to work in impala

think i found a better solution.
we will create a series of numbers using a large table. You can get a time dimension type table too. Make it doenst get truncated. I am using a large table from my db.
Use this series to generate a date range between start and end date.
date_add (t.start_date,rs.uniqueid) -- create range of dates
join (select row_number() over ( order by mycol) as uniqueid -- create range of unique ids
from largetab) rs
where end_date >=date_add (t.start_date,rs.uniqueid)
Then we will calculate total hour difference between the timestamp using unix timestamp considering date and time.
unix_timestamp(endtimestamp - starttimestamp )
Exclude non working hours like 16hours on M-T, 20hours on F, 24hours on S-S.
case when dayofweek ( dday) in (1,7) then 24
when dayofweek ( dday) =5 then 20
else 16 end as non work hours
Here is complete SQL.
select
end_date, start_date,
diff_in_hr - sum(case when dayofweek ( dday) in (1,7) then 24
when dayofweek ( dday) =5 then 20
else 16 end ) total_workhrs
from (
select (unix_timestamp(end_date)- unix_timestamp(start_date))/3600 as diff_in_hr , end_date, start_date,date_add (t.start_date,rs.uniqueid) as dDay
from tdate t
join (select row_number() over ( order by mycol) as uniqueid from largetab) rs
where end_date >=date_add (t.start_date,rs.uniqueid)
)rs2
group by 1,2,diff_in_hr

Getting a column with count from a nested query

as a newbie in SQL I am lost regarding nested queries.
I am trying to achieve the following: getting a table grouped by month with a count from all values of a column, then the same count filtered by status.
So for instance in January I could have the following result:
Jan 22 Count= 100 Count with Status filter= 57
I tried several variations along these lines:
SELECT
FORMAT ( [CreatedDate.table2] , 'yyyyMM' ) as create_month,
RecordTypeName__c,
count(*) as count_all,
count_filtered
FROM
(SELECT
FORMAT ( [CreatedDate] , 'yyyyMM' ) as create_month,
RecordTypeName__c,
Count(*) AS count_filtered
FROM DM_AccessNoAgg.DimLead
WHERE [CreatedDate] >= '2022-01-01'
AND [Status]='Qualifiziert'
GROUP BY RecordTypeName__c,FORMAT ( [CreatedDate] , 'yyyyMM' )
)
Basically I am using the same value in both cases, just that the second count has to be filtered. What's the best method to get this done?
Thanks for your help!
Pauline.

Oracle to_date return incorrect result

I have a table 4 columns - Code, Status, EffectiveDate (EFF_DT), EndDate (END_DT). All the columns are Varchar2 type. EFF_DT and END_DT has ISO format date (YYYY-MM-DD) with NULL values for few rows. Need to get the rows which has END_DT greater than today's date.
While executing the below query, it returns all the Not NULL rows for END_DT. Do not compare the END_DT at all.
select code, status, EFF_DT, END_DT
from (
select CODE, EFF_DT, Status,to_date("END_DT" ,'YYYY-MM-DD' ) as END_DT
from xxx.ZZZ
) TAB
where to_date(TAB.END_DT ,'DD-MM-YY' ) > to_date(CAST(CURRENT_TIMESTAMP as Date), 'DD-MM-YY')
ORDER BY 1 ASC
But the below query compares the END_DT and returns the result properly -
SELECT "TAB"."CODE" , "TAB"."STATUS" AS "STATUS" , "TAB"."EFF_DT" , "TAB"."END_DT"
FROM xxx.ZZZ "TAB"
WHERE ( (to_date("TAB"."END_DT",'YYYY-MM-DD') > to_date(CAST(CURRENT_TIMESTAMP as Date), 'YY-MM-DD')) )
ORDER BY 1 ASC
What is going wrong with the 1st query?
I see difference in return value of END_DT.
For the 1st query, the data is coming like -
while as for the 2nd query, the data is coming like

Only choosing oldest date (BigQuery)

I'd like to only choose the oldest date. Using Max/Min doesn't work because it's at row level, and I couldn't figure out a way to use over or NTH as this query will be run each day with a different number of server, w_id and z_id.
The following query:
select server, w_id, z_id, date(datetime) as day
from( SELECT server, w_id, datetime, demand.b_id as id, demand.c_type, z_id,
FROM TABLE_DATE_RANGE(v3_data.v3_,DATE_ADD(CURRENT_DATE(),-2,"day"),
DATE_ADD(CURRENT_DATE(),-1,"day"))
where demand.b_id is not null and demand.c_type = 'rtb'
group by 1,2,3,4,5,6
having datetime >= DATE_ADD(CURRENT_DATE(),-2,"day")
)
group by 1,2,3,4
having count(day)<2
order by z_id, day
Gives results:
Row server w_id z_id day
1 A 722 1837 2016-04-19
2 SPORTS 51 2534 2016-04-19
3 A 1002 2546 2016-04-18
4 A 1303 3226 2016-04-19
5 A 1677 4369 2016-04-18
6 NEW 13608 9370 2016-04-19
So from the above I'd only like 2016-04-18.

I think a GROUP_CONCAT might get the job done quite simply here:
SELECT
server,
w_id,
z_id,
day,
FROM (
SELECT
server,
w_id,
z_id,
GROUP_CONCAT(day) day,
FROM (
SELECT
server,
w_id,
DATE(datetime) day,
demand.b_id AS id,
demand.c_type,
z_id,
FROM
TABLE_DATE_RANGE(v3_data.v3_,DATE_ADD(CURRENT_DATE(),-2,"day"), DATE_ADD(CURRENT_DATE(),-1,"day"))
WHERE
demand.b_id IS NOT NULL
AND demand.c_type = 'rtb'
AND DATE(datetime) >= DATE(DATE_ADD(CURRENT_DATE(),-2,"day"))
GROUP BY
1,2,3,4,5,6
ORDER BY
day) # Critical to order this dimension to make the GROUP_CONCAT permutations unique
GROUP BY
server,
w_id,
z_id,
# day is aggregated in GROUP_CONCAT and so it does not get included in the GROUP BY
)
WHERE
day = DATE(DATE_ADD(CURRENT_DATE(),-2,"day"))

Most inner select is your untouched original one
The rest is wrapper taking care of min_day
Not tested - as done on go - but at least should give you an idea
SELECT server, w_id, z_id, [day]
FROM (
SELECT server, w_id, z_id, [day], MIN([day]) OVER() AS min_day
FROM (
SELECT server, w_id, z_id, DATE(datetime) AS [day]
FROM (
SELECT server, w_id, datetime, demand.b_id AS id, demand.c_type, z_id,
FROM TABLE_DATE_RANGE(v3_data.v3_,DATE_ADD(CURRENT_DATE(),-2,"day"), DATE_ADD(CURRENT_DATE(),-1,"day"))
WHERE demand.b_id IS NOT NULL AND demand.c_type = 'rtb'
GROUP BY 1,2,3,4,5,6
HAVING datetime >= DATE_ADD(CURRENT_DATE(),-2,"day")
)
GROUP BY 1,2,3,4
HAVING COUNT([day])<2
)
)
WHERE [day] = min_day
ORDER BY z_id, [day]

Both solutions have been helpful, but I believe neither worked the way I wanted and the following does:
select server, w_id, id, demand.c_type,z_id,
NTH(1, day) First, NTH(2, day) Second,
from(
SELECT
server,
w_id,
DATE(datetime) as day,
demand.b_id AS id,
demand.c_type,
z_id,
FROM
TABLE_DATE_RANGE([black-beach-789:v3_data.v3_],DATE_ADD(CURRENT_DATE(),-2,"day"), DATE_ADD(CURRENT_DATE(),-1,"day"))
WHERE
demand.b_id IS NOT NULL
AND demand.c_type = 'rtb'
AND DATE(datetime) >= DATE(DATE_ADD(CURRENT_DATE(),-2,"day"))
GROUP BY
1,2,3,4,5,6
order by day
)
group by 1,2,3,4,5
having first = date(DATE_ADD(CURRENT_DATE(),-2,"day")) and Second is null

SQLite Efficient Running Total

I have a table of transactions in SQLite
number date Category Amount runningBalance
I want the running balance column to have a running sum of the amount column after the table is sorted by Date first and number second.
I can do this with a select when reading. But this table has the potential to get very large and I don't want to recalculate every time. I want to make a trigger where all the transactions following (by date then number) the inserted/edited transaction have their runningBalance value updated.
This will mean that the calculations are reduced... as more recent transactions are likely to be edited more often, and older ones rarely. It also will spread the computation over writes so that reads are near instant.
Can anyone provide assistance on how to set up such a trigger?
so far this is what I have but it does not give desired results. And recalculates all every time. Not just the ones following the change.
CREATE TRIGGER RunningTotal AFTER UPDATE ON Transactions FOR EACH ROW
BEGIN
UPDATE Transactions
SET RunningBalance = (
SELECT (
SELECT sum(Amount)
FROM TopInfo t2
WHERE t2.Date <= t1.Date
)
FROM Transactions t1
);
END;
Thanks!

I've managed to find a way that works. Not sure how efficient it is though. Love to hear if anyone knows a more efficient way to update the Balance column.
CREATE TRIGGER Balance AFTER UPDATE OF Amount ON Transactions FOR EACH ROW
BEGIN
UPDATE Transactions
SET Balance = (
SELECT Balance
FROM (
SELECT TransactionID,
(
SELECT sum(t2.Amount)
FROM Transactions t2
WHERE t2.Date <= t1.Date
ORDER BY Date
)
AS Balance
FROM Transactions t1
WHERE TransactionID = Transactions.TransactionID
ORDER BY Date
)
)
WHERE Transactions.Date >= NEW.Date;
END;
UPDATE:
CREATE TRIGGER Balance AFTER UPDATE OF Amount ON Transactions FOR EACH ROW
BEGIN
UPDATE Transactions
SET Balance = (
SELECT Balance
FROM (
SELECT TransactionID,
(
SELECT sum(t2.Amount)
FROM Transactions t2
WHERE CASE WHEN t2.Date = t1.Date THEN t2.TransactionID <= t1.TransactionID ELSE t2.Date <= t1.Date END
ORDER BY Date,
TransactionID
)
AS Balance
FROM Transactions t1
WHERE TransactionID = Transactions.TransactionID
ORDER BY Date,
TransactionID
)
)
WHERE Transactions.Date >= NEW.Date;
END;
I've Done Some more with running total and have come up with 2 ways. The second is much slower than the first. Any ideas why???
method 1
SELECT TransactionID,Date, Account, Amount,
(SELECT sum(t2.Amount)
FROM Transactions t2
WHERE
CASE WHEN t2.Date = t1.Date
THEN t2.TransactionID <= t1.TransactionID
AND t2.Account == t1.Account
ELSE t2.Date <= t1.Date
AND t2.Account == t1.Account
END
ORDER BY Date, TransactionID)
AS Balance
FROM Transactions t1
ORDER BY Date, TransactionID
Method2
SELECT n.TransactionID, n.Date, n.Account, n.Amount,
SUM(o.Amount) As running_total
FROM Transactions n LEFT JOIN Transactions o
ON (
CASE WHEN o.Date = n.Date
THEN n.TransactionID >= o.TransactionID
AND o.Account == n.Account
ELSE n.Date >= o.Date
AND o.Account == n.Account
END
)
GROUP BY n.Account, n.Date, n.TransactionID
ORDER BY n.Date, n.TransactionID;

Develop Reference

r css asp.net wordpress firebase qt symfony nginx http apache-flex

Avoid repetition of subquery - sqlite

Related

Impala - Working hours between two dates in impala

Getting a column with count from a nested query

Oracle to_date return incorrect result

Only choosing oldest date (BigQuery)

SQLite Efficient Running Total

Categories

Resources