Moving average filter in postgresql - postgresql-9.1

I have a query that computes a moving average over the last 7 days of a table. The table has two columns: date_of_data, of type date, which forms a series with a one-day interval, and val, which is a float.
with B as (
    select date_of_data, val
    from mytable
    group by date_of_data
    order by date_of_data
)
select date_of_data,
       val,
       avg(val) over (order by date_of_data rows 7 preceding) as mean7
from B
order by date_of_data;
I want to compute a centered moving filter over 7 days. That is, for every row the window should contain the 3 preceding rows, the row itself, and the 3 succeeding rows. I cannot find a way to include the succeeding rows. Can anybody help me with this?

Try this:
select date_of_data,
val,
avg(val) over(order by date_of_data ROWS BETWEEN 3 preceding AND 3 following) as mean7
from mytable
order by date_of_data;
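The centered frame can be checked on a small data set. The following is a minimal sketch, not the asker's setup: it uses Python's built-in sqlite3 module instead of PostgreSQL (assuming the bundled SQLite is 3.25 or newer, which is when window functions were added), and the seven sample rows are made up for illustration.

```python
import sqlite3

# Hypothetical sample data: seven consecutive days with values 1.0 .. 7.0.
rows = [(f"2024-01-{d:02d}", float(d)) for d in range(1, 8)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (date_of_data TEXT, val REAL)")
conn.executemany("INSERT INTO mytable VALUES (?, ?)", rows)

# Centered frame: 3 rows before, the current row, and 3 rows after.
query = """
    SELECT date_of_data,
           val,
           avg(val) OVER (ORDER BY date_of_data
                          ROWS BETWEEN 3 PRECEDING AND 3 FOLLOWING) AS mean7
    FROM mytable
    ORDER BY date_of_data
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Near the edges the frame simply shrinks: the first row averages only itself and the 3 following rows, so only the middle row of this sample sees a full 7-row window.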

Related

How to run ROWS UNBOUNDED PRECEDING on specific rows only

I have an SQL query that calculates the running total of 2 columns. Every week, new data is added to DUMMY_TABLE, and every time I compute the running total for that table, it is recalculated over all preceding rows, which I don't really need. I only need the running total for the newly inserted data, and recomputing it for all previous rows is a waste of resources. Is there any way to return the running total only for the newly inserted week?
I tried using WHERE, but it filters out the earlier rows, so they no longer contribute to the total. The reason I am asking is that if the table holds 10K records and I insert new data every week, the running total is recalculated for all 10K records plus the new data.
SELECT
RID,
FYFW,
VOL,
FAILED_VOL,
SUM(VOL) OVER (PARTITION BY RID, SUBSTR(TRIM(FYFW), 1, 4) ORDER BY RID,FYFW ROWS UNBOUNDED PRECEDING) AS YTD_VOL,
SUM(FAILED_VOL) OVER (PARTITION BY RID, SUBSTR(TRIM(FYFW), 1, 4) ORDER BY RID,FYFW ROWS UNBOUNDED PRECEDING) AS YTD_FAILED_VOL
FROM DUMMY_TABLE
GROUP BY 1,2,3,4
ORDER BY 1,2;

Teradata make FOLLOWING dynamic based on a column

I have a table with columns item, store, date, and fcst. For each item/store/day combination I need to sum the forecast over the next x days, but x differs for each item/store combination. The following code does not work, and I was advised that X FOLLOWING has to be replaced with a static integer FOLLOWING:
Doesn’t run:
SELECT
ITEMNBR,
STORENBR,
DT,
SUM(FCST) OVER (PARTITION BY STORENBR, ITEMNBR ORDER BY DT ROWS BETWEEN CURRENT ROW AND X FOLLOWING) AS VALUE
FROM TABLENAME
Does run:
SELECT
ITEMNBR,
STORENBR,
DT,
SUM(FCST) OVER (PARTITION BY STORENBR, ITEMNBR ORDER BY DT ROWS BETWEEN CURRENT ROW AND 7 FOLLOWING) AS VALUE
FROM TABLENAME
Any fix or suggested workaround?

How can I query for min/max values over several subintervals at once in sqlite3?

I have an sqlite3 table with a timestamp column and another column with a certain value.
To get the overall min/max of the values I can query:
SELECT timestamp, max(value), min(value) from myTable;
The result is a single line.
But is there a way to query for the individual min/max values for equally sized subintervals?
For example, I want to consider intervals of size 10 (including left, but excluding right boundary):
[0,10), [10,20), [20,30), ...
The desired result would then be several lines and look something like this:
timestamp  max(value)  min(value)
0          3           1
10         5           2
20         13          0
30         42          24
Of course, it is easy to split up the overall query into several ones to get this result line-by-line, but is there a way to get it all in a single query?
I found a solution myself:
To get the min/max values for all subsequent 10s intervals, the following query appears to do the trick:
SELECT (timestamp / 10) * 10 as intervalStartTime, min(value), max(value) FROM myTable GROUP BY intervalStartTime;
This is assuming that timestamp is of integer type, so (timestamp / 10) is truncated before it is multiplied again.
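The integer-division bucketing can be tried out with Python's built-in sqlite3 module. This is a minimal sketch under stated assumptions: the table is named myTable as in the question, both columns are integers, and the sample rows are invented so that the result reproduces the desired output shown in the question.

```python
import sqlite3

# Hypothetical sample rows: (timestamp, value).
samples = [
    (0, 1), (4, 3), (9, 2),
    (10, 5), (15, 2),
    (20, 13), (29, 0),
    (30, 42), (38, 24),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (timestamp INTEGER, value INTEGER)")
conn.executemany("INSERT INTO myTable VALUES (?, ?)", samples)

# Integer division truncates, so (timestamp / 10) * 10 maps every
# timestamp in [10k, 10k + 10) to the bucket start 10k.
query = """
    SELECT (timestamp / 10) * 10 AS intervalStartTime,
           max(value),
           min(value)
    FROM myTable
    GROUP BY intervalStartTime
    ORDER BY intervalStartTime
"""
buckets = conn.execute(query).fetchall()
for bucket in buckets:
    print(bucket)
```

Intervals that contain no rows do not appear in the result at all, because GROUP BY only produces groups for values that actually occur.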

Finding the number of occurrences of a distinct value in a column

I am exploring a new table in SQL and was wondering what the best way is to find the number of occurrences of each value. In essence, I would like to better understand the distribution of values in a column.
At first I did a SELECT TOP 10000 on the table, and for the particular column I am interested in I get 2-3 different values. Let's call them A, B, C.
But when I do a SELECT DISTINCT on that column I get 5 million separate values.
What I want to know is the distribution of the values in the column.
An example of the output I am looking for:
Distinct Value of Column   Count of Occurrence
A                          A lot
B                          A lot
C                          A lot
D                          1
E                          1
F                          1
G                          1
What you're looking for is GROUP BY.
Example:
SELECT category, COUNT(*) FROM CATALOGS GROUP BY category;
This gives you the number of rows per category.
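To see the distribution with the most frequent values first, the counts can also be sorted. The following is a minimal sketch using Python's built-in sqlite3 module; the CATALOGS table and category column follow the answer's example, while the sample values and the occurrences alias are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CATALOGS (category TEXT)")

# Hypothetical data: a few frequent values and a tail of one-off values.
values = ["A"] * 5 + ["B"] * 4 + ["C"] * 3 + ["D", "E"]
conn.executemany("INSERT INTO CATALOGS VALUES (?)", [(v,) for v in values])

# One output row per distinct value, most frequent first.
query = """
    SELECT category, COUNT(*) AS occurrences
    FROM CATALOGS
    GROUP BY category
    ORDER BY occurrences DESC, category
"""
distribution = conn.execute(query).fetchall()
for value, count in distribution:
    print(value, count)
```

With millions of distinct values, adding a LIMIT to the sorted query keeps the output to the handful of dominant values.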

SQLite Ranking Time Stamps

I am new to SQL and am having trouble with a (fairly simple) query to rank time stamps.
I have one table with survey data from 2014. I am trying to determine the 'learning curve' for good customer satisfaction performance. I want to order and rank each survey at an agent level based on the time stamp of the survey. This would let me see what the average performance is when an agent has 5 total surveys, 10, 20 etc.
I imagine it should be something like (table name is tablerank):
select T1.*,
(select count(*)
from tablerank as T2
where T2.call_date > T1.call_date
) as SurveyRank
from tablerank as T1
where p1.Agent_ID = T2.Agent_ID;
For each agent, it would list each survey in order and tag the earliest survey with 1, the second earliest with 2, and so on. Then I could pivot the data in Excel and see the learning curve based on survey count rather than tenure or time (surveys are rare, and sometimes an agent only gets 1 or 2 in a month).
A correlated subquery must have the correlation in the subquery itself; any table names/aliases from the subquery (such as T2) are not visible in the outer query.
For ranking, you want to count earlier surveys, and you want to include the current survey so that the first one gets the rank number 1, so you need to use <= instead of >:
SELECT *,
       (SELECT COUNT(*)
        FROM tablerank AS T2
        WHERE T2.Agent_ID = T1.Agent_ID
          AND T2.call_date <= T1.call_date
       ) AS SurveyRank
FROM tablerank AS T1;
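The per-agent ranking can be verified on a few rows with Python's built-in sqlite3 module. This is a minimal sketch: the table and column names follow the question, but the sample surveys are invented, and call_date is assumed to be stored as ISO 8601 text (YYYY-MM-DD) so that string comparison matches chronological order.

```python
import sqlite3

# Hypothetical surveys: (Agent_ID, call_date), deliberately out of order.
surveys = [
    ("a1", "2014-01-05"),
    ("a1", "2014-03-10"),
    ("a1", "2014-02-01"),
    ("a2", "2014-01-20"),
    ("a2", "2014-01-07"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablerank (Agent_ID TEXT, call_date TEXT)")
conn.executemany("INSERT INTO tablerank VALUES (?, ?)", surveys)

# For each survey, count the same agent's surveys that are not later.
query = """
    SELECT T1.Agent_ID,
           T1.call_date,
           (SELECT COUNT(*)
            FROM tablerank AS T2
            WHERE T2.Agent_ID = T1.Agent_ID
              AND T2.call_date <= T1.call_date
           ) AS SurveyRank
    FROM tablerank AS T1
    ORDER BY T1.Agent_ID, T1.call_date
"""
ranked = conn.execute(query).fetchall()
for row in ranked:
    print(row)
```

If an agent has two surveys with identical call_date values, both receive the same, higher rank, so a more precise timestamp or a tie-breaking column would be needed for strictly sequential numbering.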
