DAX formula to calculate a rolling 6-month average - datetime

I have a table named [Tasks] which is linked to a date table named [Dimdate]. They are linked on the task's reference date, [DC_date].
The objective is a measure that calculates the average over the 6-month interval before a given date.
For example: for a date in August 2020, the formula should calculate the average from March 2020 to August 2020.
Here is the DAX formula I currently have in Visual Studio, but it still doesn't work:
CALCULATE(
    CALCULATE(
        [average];
        FILTER(
            [TASKS];
            DATESINPERIOD( Dimdate[Date]; MAX( Dimdate[Date] ); -6; MONTH )
        )
    );
    CROSSFILTER( TASKS[Dc_Date]; Dimdate[Date]; None )
)

The relationship on TASKS[Dc_Date] should not be removed; otherwise the Time Intelligence functions would not work. Unless there are other existing filters to be removed, this code should be enough:
CALCULATE(
    [average];
    DATESINPERIOD(
        Dimdate[Date];
        MAX( Dimdate[Date] );
        -6;
        MONTH
    )
)

Related

Analyzing disparate time series in R

Are there tools in R that simplify the analysis of lagged and disparate time series? For example:
Daily values that only occur on weekdays (no entry on weekends or holidays)
vs
Bi-annual values
What I'm seeking is ways to:
Complete the missing daily values (with interpolated, or last value rolled forward, etc.)
Look for correlation between daily values and the bi-annual value (only the values that came before the bi-annual event)
As an example:
10-year treasury note interest rate (daily on non-holiday weekdays) as "X" and i-bond fixed rate as "Y" (set May 1/Nov 1)
Any suggestions appreciated.
I've built a test dataset manually for "x" and used functions in zoo to populate the missing values (interpolated), but I'm hoping for a less "brute-force" method of analyzing the disparate time series. I've used lag functions in the past, but those were on time series with matching intervals.
What Jon commented is what I had in mind:
expand a weekday time series to full week using missing value function(s) in zoo
Sample the daily value - say April 15 for the May 1 rate, October 15 for the November 1 rate
Ideally be able to automate - say loop through April 1-30 and Oct 1-30 to look for the highest R-squared for the model of choice (linear, polynomial, etc.)
Not have to build discrete datasets for each of the above - but if that is what is required I can do it programmatically - I've done that with stock data in the past. I was looking for a more efficient means of selecting the datasets ad hoc during the analysis.
I don't have code to post, because I'm clueless as to the feature/function that would make the date selection I'm after possible (at least in R).
Thanks for the input so far. It has already been useful in helping me look at alternative methods to achieve what I'm after.
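A minimal sketch of the zoo-based fill-and-sample approach described above, with illustrative dates and values (na.approx interpolates; na.locf would roll the last value forward instead):

```r
library(zoo)

# Hypothetical weekday series (weekend dates absent)
d <- as.Date(c("2020-04-13", "2020-04-14", "2020-04-15",
               "2020-04-16", "2020-04-17", "2020-04-20"))
x <- zoo(c(0.72, 0.75, 0.63, 0.61, 0.65, 0.66), d)

# Expand to a full daily calendar; the added dates appear as NA
full <- merge(x, zoo(, seq(start(x), end(x), by = "day")))

# Fill the gaps by linear interpolation (or na.locf(full) for last-value-forward)
filled <- na.approx(full)

# Sample the daily value on a chosen lead date before the semiannual event
filled[as.Date("2020-04-15")]
```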

Does the order of an ARMA(p,0) model mean it uses the last p lags or any previous p lags?

I'm having a problem with the definition of order of AR, MA and ARMA time series forecasting processes. Imagine we have a time series with data from January to December, and we're in July, trying to predict August. When we say AR(2), are we using lags relating to July and June, or can those two months be any month between January and June?
I tried checking multiple sources but they define order as different things.
The order indicates how many of the immediately preceding data points are used for forecasting. If you use an AR(2) model and want to predict August, the last two data points available and known to you are July and June. If you then want to predict September, your model will use the data from August and July, and so on.
You can select the data points with the order. If you simply give a number, all previous data points in that range will be used: e.g. an order of two uses the last 2 lags, and an order of 5 uses the last 5 lags. You can also specify individual lags in brackets: an order of [1,3,5] means your model uses the last, third-last and fifth-last lags - in your case July, May and March, leaving out the second- and fourth-last lags. That would then be an AR([1,3,5]) model.
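To make the distinction concrete, here is a small R sketch on simulated data; the fixed argument of arima is one way to fit a subset-lag AR model in base R (NA = estimate, 0 = constrain to zero; the final NA is the intercept):

```r
set.seed(1)
y <- arima.sim(model = list(ar = c(0.5, 0.3)), n = 120)

# AR(2): always the two most recent lags
fit_ar2 <- arima(y, order = c(2, 0, 0))

# AR([1,3,5]): lags 1, 3 and 5 only, with lags 2 and 4 fixed to zero
fit_subset <- arima(y, order = c(5, 0, 0),
                    fixed = c(NA, 0, NA, 0, NA, NA),
                    transform.pars = FALSE)
```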

SQL: Chunking hourly samples into 7-day averages

GIVENS:
Tools: SQL Server, SSMS 2016, R
Data: Hourly samples starting 2017-12-31 23:00:00 thru 2021-02-05 08:00:00
WANT: To chunk data into 7-day blocks, ideally coinciding with the week of year, and grab averages for each 7-day period. Willing to sacrifice some data at the front and/or back end. Would like to reduce the data frequency from roughly 24 x 365 points per year down to perhaps 52 points per year. For end use in R.
PROBLEM(S):
A) The SQL datepart(week, ...) method does not consider the first seven days of 2018 to be week 1. It assumes a week starts on a certain day of the week, not necessarily on Jan. 1.
B) I suspect SQL datepart(week, ...) will assign repeating week values across several years of data. So if I group by datepart(week, ...), won't it combine week 1 of 2018, 2019, 2020, and 2021?
Here's my starting query (AvgDate is for debug purposes):
SELECT datepart(week, Date) AS Week,
       FORMAT(AVG(HeadElev), '###.###') AS AvgHeadEl,
       COUNT(HeadElev) AS Count,
       FORMAT(AVG(datepart(day, Date)), '##.###') AS AvgDate
FROM [dbo].[Chickamauga] AS CWL
WHERE '20171231' < Date AND Date <= '20181231'
GROUP BY datepart(week, Date)
ORDER BY Week
GO
Here's what my table looks like (I've split date & time from original data):
CREATE TABLE [dbo].[SomeLake](
    [Date]     [date] NULL,
    [HourCT]   [time](0) NULL,
    [HeadElev] [float] NULL,
    [TailElev] [float] NULL,
    [Flow]     [float] NULL
) ON [PRIMARY]
Again, trying to create simple 7-day blocks of samples and grab averages. (Not moving averages; I only want 1 data point per 7-day block.) I'm trying to reduce the data frequency from hourly down to weekly.
The end goal is to import into R and use time series functions that cannot accept high per-year frequencies like 365. Trying to bring the frequency down to 52, i.e. weekly data.
THANK YOU for your kind assistance!
create simple 7-day blocks of samples and grab averages.
Group by something like:
1 + (datepart(dy, some_date) - 1) / 7 AS week
which takes the day-of-year and performs integer division to group the days into 7-day buckets, numbered starting at 1 (days 1-7 fall in bucket 1, days 8-14 in bucket 2, and so on). Grouping by datepart(year, some_date) as well keeps week 1 of different years separate.
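Since the end use is R, here is a minimal sketch of the same bucketing done after import, on a synthetic stand-in for the hourly table (column names follow the CREATE TABLE above):

```r
# Synthetic stand-in for the hourly samples
cwl <- data.frame(
  Date     = rep(seq(as.Date("2018-01-01"), as.Date("2018-01-28"), by = "day"),
                 each = 24),
  HeadElev = rnorm(28 * 24, mean = 682, sd = 0.5)
)

# Same idea as the SQL answer: integer division of day-of-year into 7-day
# buckets, grouped per year so weeks from different years stay separate
cwl$year <- as.integer(format(cwl$Date, "%Y"))
cwl$week <- 1 + (as.integer(format(cwl$Date, "%j")) - 1) %/% 7

weekly <- aggregate(HeadElev ~ year + week, data = cwl, FUN = mean)
head(weekly)
```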

How to do "Today Minus Date Field" in Google Data Studio?

I have a column long_term_remaining_days, and I want to create a Date field that calculates TODAY - long_term_remaining_days and displays a date, for example 1 Jun 2020.
I tried to do as in Excel and used the formula below, but it doesn't work:
TODAY() - long_term_remaining_days
One way it can be achieved is by using the new PARSE_DATE, UNIX_DATE and CURRENT_DATE functions (introduced in the 17 Sep 2020 update to Dates and Times).
1) Calculated Date
Copy-paste the Calculated Field below. It converts CURRENT_DATE to UNIX_DATE (the number of days since the epoch, 01 Jan 1970), subtracts long_term_remaining_days (the respective Number field), multiplies by 86,400 (the number of seconds in a day) to produce a UNIX timestamp, and then converts the result to a Google Data Studio date using the PARSE_DATE function:
PARSE_DATE(
  "%s",
  CAST(
    ((UNIX_DATE(CURRENT_DATE()) - long_term_remaining_days) * 86400)
    AS TEXT
  )
)
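As a worked example with hypothetical values: if CURRENT_DATE() is 9 Jun 2020, UNIX_DATE returns 18422; with long_term_remaining_days = 8, the expression gives (18422 - 8) * 86400 = 1,590,969,600 seconds, which PARSE_DATE renders as 1 Jun 2020.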
(The original answer includes a Google Data Studio report and a GIF demonstrating the result.)

Create time series in R with data for week days only

I have a data set containing daily energy usage (by date) from 01 Jan 2016 through 07 Nov 2017. One of the fields is a flag for non-working day (nwd), with values of 0 and 1 indicating whether or not the date is a working day.
The structure of the data looks like this:
Date,usage,avgtemp,nwd
2016-01-01,105986,28.5,1
2016-01-02,105548,29.2,1
...
2017-11-07,98457,23.5,0
I created a data frame with these values - no problems. I then created 2 other data frames, one with nwd = 1 and the other with nwd = 0, for non-working and working days respectively.
I am trying to generate a time series (using the zoo or xts package - I am open to either) for each of these 2 data frames, so that I can run the non-stationarity tests (ADF/PP) on them and then do the ARIMA modelling to build a forecast model of the usage.
Can I use a time series for such data sets where the data is not quite regular? Each of these series will have gaps - the working-day series may have fewer than 5 continuous days in a week if there are holidays in between, and the same applies to the non-working-day series.
I cannot summarise this at a weekly level, as I need to forecast at a daily level and possibly at a half-hourly level subsequently. I might even want to do ARDL modelling later, using 'avgtemp' as one of the regressors.
P.S.
Found a post that is somewhat similar to mine, but I can't seem to get it going based on the responses there:
how to convert data frame into time series in R
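A minimal sketch of one way to build the two irregular series with xts, assuming df is the data frame described above (xts tolerates gapped, irregular date indexes):

```r
library(xts)

# df: data frame with Date, usage, avgtemp, nwd as described above
df$Date <- as.Date(df$Date)

working     <- df[df$nwd == 0, ]  # working days
non_working <- df[df$nwd == 1, ]  # non-working days

# xts accepts irregular (gapped) date indexes, so holiday/weekend gaps are fine
usage_wd  <- xts(working$usage,     order.by = working$Date)
usage_nwd <- xts(non_working$usage, order.by = non_working$Date)

# Example follow-up: an ADF test on the working-day series
# tseries::adf.test(as.numeric(usage_wd))
```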
