SQL: Chunking hourly samples into 7-day averages - r

GIVENS:
Tools: SQL Server, SSMS 2016, R
Data: Hourly samples starting 2017-12-31 23:00:00 thru 2021-02-05 08:00:00
WANT: To chunk data into 7-day blocks, ideally coinciding with week of year, and grab averages for each 7-day period. Willing to sacrifice some data at the front and/or back end. Would like to reduce data frequency from 24x365 points per year down to perhaps 52 points per year. For end use in R.
PROBLEM(S):
A) SQL datepart(week,...) method does not consider 1st seven days of 2018 as week 1. Considers that week starts on a certain day of week, not necessarily on Jan. 1.
B) I suspect SQL datepart(week,...) will assign repeating week value across several years of data. So if I group by datepart(week...), won't it combine week 1 of 2018, 2019, 2020, 2021?
Here's my starting query (AvgDate is for debug purposes):
SELECT datepart(week,Date) Week,
FORMAT(AVG(HeadElev), '###.###') as AvgHeadEl,
COUNT(HeadElev) as Count,
FORMAT(AVG(datepart(Day, Date)), '##.###') as AvgDate
FROM [dbo].[Chickamauga] as CWL
WHERE '20171231' < Date AND Date <= '20181231'
GROUP BY datepart(week,Date)
ORDER BY Week
GO
Here's what my table looks like (I've split date & time from original data):
CREATE TABLE [dbo].[SomeLake](
[Date] [date] NULL,
[HourCT] [time](0) NULL,
[HeadElev] [float] NULL,
[TailElev] [float] NULL,
[Flow] [float] NULL
) ON [PRIMARY]
Again, I'm trying to create simple 7-day blocks of samples and grab averages. (Not moving averages; I only want 1 data point per 7-day block.) I'm trying to reduce the data frequency from hourly down to weekly.
End goal is to import into R and use time series functions that cannot accept high per-year frequencies like 365. I'm trying to bring the frequency down to 52, i.e. weekly data.
THANK YOU for your kind assistance!

create simple 7-day blocks of samples and grab averages.
Group by something like:
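1 + (datepart(dy, some_date) - 1) / 7 AS week
which takes the day of year (1 to 366), shifts it to start at 0, and uses integer division to group days into 7-day buckets numbered from 1. Also group by datepart(year, some_date) so that the same bucket number from different years is not merged. The last bucket of each year (53) holds only the 1 or 2 leftover days.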
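A minimal sketch of the same bucketing idea in Python, useful for checking the grouping logic outside SQL Server. It subtracts 1 from the day of year before dividing so that days 1 to 7 share a block, and it keys on the year as well as the block so that repeated block numbers from different years stay separate; both are choices made for this sketch. The sample data is hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def weekly_averages(samples):
    """Average (timestamp, value) samples per (year, 7-day block)."""
    sums = defaultdict(lambda: [0.0, 0])
    for ts, value in samples:
        day_of_year = ts.timetuple().tm_yday   # 1..366, like datepart(dy, ...)
        block = 1 + (day_of_year - 1) // 7     # days 1-7 -> 1, days 8-14 -> 2, ...
        acc = sums[(ts.year, block)]           # the year in the key keeps 2018 and 2019 apart
        acc[0] += value
        acc[1] += 1
    return {key: total / count for key, (total, count) in sorted(sums.items())}

# Hypothetical hourly data covering 2018 and 2019; each value is its day of year.
samples = []
t = datetime(2018, 1, 1, 0)
while t < datetime(2020, 1, 1):
    samples.append((t, float(t.timetuple().tm_yday)))
    t += timedelta(hours=1)

averages = weekly_averages(samples)
print(averages[(2018, 1)])   # mean of days 1..7
print(len(averages))         # 53 blocks per year
```

Block 53 contains only day 365 (plus day 366 in a leap year), so it may be worth dropping before handing the series to R with a frequency of 52.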

Related

defining business day where hours are not same as standard days

While working on a sales report for an entertainment company (bars and nightclubs), I normally just sum sales to get the daily total. However, I was told that their business day starts at 6 am each day and closes at 5:59:59 am the next day. Basically, sales reported Monday are the sales from 6 am Sunday thru 5:59:59 am Monday.
The company operates throughout the US, so we have multiple time zones as well.
the table has the following columns:
Transaction id, location, Transaction_datetimeLocal, TransactionDateTimeUTC, Transaction amount
How do I define / filter the calculation to run from 6 am one day to 5:59:59 am the next day using Power BI / DAX?
TIA
In Power BI you have your table with the local time. You need to add a calculated column with the following DAX formula:
Business Time = 'Table'[Local Time] - TIME(6, 0, 0)
From this new column you could then create your business date with
Business Date = 'Table'[Business Time].[Date]
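The shift-then-truncate idea can be checked outside Power BI with a short Python sketch. It assumes the business day opens at 06:00 local time and works on the local timestamp, so no time zone conversion is involved; the function name is hypothetical.

```python
from datetime import datetime, timedelta

# Assumption: the business day opens at 06:00 local time.
BUSINESS_DAY_START = timedelta(hours=6)

def business_date(local_dt):
    """Shift the local timestamp back 6 hours, then keep only the date part."""
    return (local_dt - BUSINESS_DAY_START).date()

print(business_date(datetime(2021, 3, 1, 5, 59, 59)))  # 2021-02-28
print(business_date(datetime(2021, 3, 1, 6, 0, 0)))    # 2021-03-01
```

A sale at 05:59:59 still belongs to the business day that opened the previous morning, while a sale at 06:00:00 starts a new one.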

Dax formula - calculate 6 rolling month back average

I have a table named [Tasks] which is linked to a date table named [Dimdate]. They are linked through the task's reference date, [DC_date].
The objective is a measure that calculates the average over the 6-month interval before a specific date.
For example: for a date in August 2020, the formula should calculate the average from March 2020 to August 2020.
Here is the DAX formula I have integrated in Visual Studio, but it still doesn't work:
CALCULATE(CALCULATE( [average] ;FILTER(
[TASKS];
DATESINPERIOD(Dimdate[Date];MAX(Dimdate[Date]);-6;Month)));
CROSSFILTER(TASKS[Dc_Date];Dimdate[Date];None))
The relationship on TASKS[Dc_Date] should not be removed, otherwise the Time Intelligence function will not work. Unless there are other existing filters to be removed, this code should be enough:
CALCULATE(
[average];
DATESINPERIOD(
Dimdate[Date];
MAX( Dimdate[Date] );
-6;
MONTH
)
)
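To make the window concrete, here is a rough Python sketch of a trailing 6-month average. It is meant to approximate the date range selected above (from the day after "end date minus 6 months" through the end date), not to reproduce DAX exactly; behavior at month ends may differ. The helper names and the sample rows are hypothetical.

```python
import calendar
from datetime import date, timedelta

def shift_months(d, months):
    """Move a date by whole months, clamping the day to the target month's length."""
    index = d.year * 12 + (d.month - 1) + months
    year, month = divmod(index, 12)
    month += 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))

def rolling_average(rows, end, months=6):
    """Average the values dated within the `months` months ending at `end` (inclusive)."""
    start = shift_months(end, -months) + timedelta(days=1)
    window = [value for day, value in rows if start <= day <= end]
    return sum(window) / len(window) if window else None

# Hypothetical (date, value) rows standing in for the Tasks table.
rows = [
    (date(2020, 2, 15), 100.0),  # before the window
    (date(2020, 3, 1), 10.0),    # first day of the window
    (date(2020, 8, 31), 30.0),   # last day of the window
    (date(2020, 9, 1), 500.0),   # after the window
]
print(rolling_average(rows, date(2020, 8, 31)))  # 20.0
```

With an end date of 31 August 2020 the window runs from 1 March 2020 to 31 August 2020, which matches the March-to-August example in the question.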

Average temperature by day of year with SQLite DDB

I'm looking for a good solution to show the average temperature day by day of the year from a SQLite Database.
In my database, to keep it simple, I have a date column and a temp column, with one row for each day over the last 5 years.
I would like to get, for each day of the year, the average temperature from my database.
I found the query to calculate the average for one day, but I don't know how to do that for each day:
SELECT avg(min) FROM historique WHERE strftime('%m-%d', date) = '04-01';
Could you help me, please?
You must use GROUP BY:
SELECT strftime('%m-%d', date) day, avg(min)
FROM historique
GROUP BY day
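The grouped query can be tried against an in-memory SQLite database from Python. The table and column names follow the question; the four rows are made-up values, and the ORDER BY is added here only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE historique (date TEXT, min REAL)")
# Made-up readings: two years of data for two calendar days.
conn.executemany(
    "INSERT INTO historique (date, min) VALUES (?, ?)",
    [
        ("2019-04-01", 4.0),
        ("2020-04-01", 6.0),
        ("2019-04-02", 7.0),
        ("2020-04-02", 9.0),
    ],
)

# One row per month-day, averaged across all years.
rows = conn.execute(
    "SELECT strftime('%m-%d', date) AS day, avg(min) "
    "FROM historique GROUP BY day ORDER BY day"
).fetchall()
print(rows)  # [('04-01', 5.0), ('04-02', 8.0)]
```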

Apache Drill: Group by week

I tried to group my daily data by week (given a reference date) to generate a smaller panel data set.
I used postgres before and there it was quite easy:
CREATE TABLE videos_weekly AS SELECT channel_id,
CEIL(DATE_PART('day', observation_date - '2016-02-10')/7) AS week
FROM videos GROUP BY channel_id, week;
But it seems like it is not possible to subtract a timestamp with a date string in Drill. I found the AGE function, which returns an interval between two dates, but how to convert this into an integer (number of days or weeks)?
DATE_SUB may help you here. Following is an example:
SELECT extract(day from date_sub('2016-11-13', cast('2015-01-01' as timestamp)))/7 FROM (VALUES(1));
This will return number of weeks between 2015-01-01 and 2016-11-13.
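As a cross-check on the arithmetic, the same "days since a reference date, divided by 7" calculation can be sketched in Python. This only illustrates the bucketing; it is not Drill syntax, and the function name is hypothetical. It rounds down, whereas the Postgres query in the question rounds up with CEIL.

```python
from datetime import date

def weeks_since(reference, observed):
    """Whole 7-day blocks elapsed between a reference date and an observation date."""
    return (observed - reference).days // 7

print(weeks_since(date(2015, 1, 1), date(2016, 11, 13)))  # 97
```

The two dates are 682 days apart, so the observation falls in block 97 when counting from 0.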

time series with 10 min frequency in R

My data is the memory consumption of an application at 10-minute intervals for the last 26 days. My start date is Oct 6th 2013 and my end date is November 2nd 2013. I've read the data into a data frame and cleaned it up. Now I am trying to create a time series, something along the lines of my_ts<-ts(mydata[3],start=c(2013,10),frequency=10)
I am sure this is not the correct frequency. Can someone point me in the right direction so I can plot the time series?
In R, frequency actually means the period of the seasonality, i.e., frequency = number of observations per season. In your case, the "season" is presumably one day. So you want
ts(mydata[3],start=c(2013,10),frequency=24*60/10)
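The frequency value is just the season length divided by the sampling interval. A tiny Python sketch of that arithmetic, assuming a one-day season as above (the function name is hypothetical):

```python
MINUTES_PER_DAY = 24 * 60

def observations_per_season(season_minutes, interval_minutes):
    """Number of samples in one season; this is the value passed as ts(frequency=...)."""
    if season_minutes % interval_minutes != 0:
        raise ValueError("the sampling interval must divide the season length evenly")
    return season_minutes // interval_minutes

print(observations_per_season(MINUTES_PER_DAY, 10))  # 144
```

If the seasonality were weekly rather than daily, the same calculation with a 7-day season would give 1008 observations per season.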
