How to include 'Time' in Date Hierarchy in Power BI - datetime

I am working on a report in Power BI. One of the tables in my data model collects sensor data. It has the following columns:
Serial (int) e.g. 123456789
Timestamp (datetime) e.g. 12/20/2016 12:04:23 PM
Reading (decimal) e.g. 123.456
A new record is added every few minutes, with the current reading from the sensor.
Power BI automatically creates a Hierarchy for the datetime column, which includes Year, Quarter, Month and Day. So, when you add a visual to your report, you can easily drill down to each of those levels.
I would like to include the "Time" part of the data in the hierarchy, so that you can drill down one more level after "Day", and see the detailed readings during that period.
I have already set up a Date table, using the CALENDARAUTO() function, added all of the appropriate columns, and related it to my Readings table in order to summarize the data by date - which works great. But it does not include the "Time" dimension.
I have looked at the following SO questions, but they didn't help:
Time-based drilldowns in Power BI powered by Azure Data Warehouse
Creating time factors in PowerBI
I also found this article, but it was confusing:
Power BI Date & Time Dimension Toolkit
Any ideas?
Thanks!

Unfortunately, I cannot comment on the previous answer, so I have to add this as a separate answer:
Yes, there is a way to automatically generate date and time tables. Here's some example code I use in my reports:
let
Source = List.Dates(startDate, Duration.Days(DateTime.Date(DateTime.LocalNow()) - startDate)+1, #duration(1,0,0,0)),
convertToTable = Table.FromList(Source, Splitter.SplitByNothing(), {"Date"}, null, ExtraValues.Error),
calcDateKey = Table.AddColumn(convertToTable, "DateKey", each Date.ToText([Date], "yyyyMMdd")),
yearIndex = Table.AddColumn(calcDateKey, "Year", each Date.Year([Date])),
monthIndex = Table.AddColumn(yearIndex, "MonthIndex", each Date.Month([Date])),
weekIndex = Table.AddColumn(monthIndex, "WeekIndex", each Date.WeekOfYear([Date])),
DayOfWeekIndex = Table.AddColumn(weekIndex, "DayOfWeekIndex", each Date.DayOfWeek([Date], 1)),
DayOfMonthIndex = Table.AddColumn(DayOfWeekIndex, "DayOfMonthIndex", each Date.Day([Date])),
Weekday = Table.AddColumn(DayOfMonthIndex, "Weekday", each Date.ToText([Date], "dddd")),
setDataType = Table.TransformColumnTypes(Weekday,{{"Date", type date}, {"DateKey", type text}, {"Year", Int64.Type}, {"MonthIndex", Int64.Type}, {"WeekIndex", Int64.Type}, {"DayOfWeekIndex", Int64.Type}, {"DayOfMonthIndex", Int64.Type}, {"Weekday", type text}})
in
setDataType
Just paste it into an empty query. The code uses a parameter called startDate, so you want to make sure you have something similar in place.
And here's the snippet for a time-table:
let
Source = List.Times(#time(0,0,0) , 1440, #duration(0,0,1,0)),
convertToTable = Table.FromList(Source, Splitter.SplitByNothing(), {"DayTime"}, null, ExtraValues.Error),
createTimeKey = Table.AddColumn(convertToTable, "TimeKey", each Time.ToText([DayTime], "HHmmss")),
hourIndex = Table.AddColumn(createTimeKey, "HourIndex", each Time.Hour([DayTime])),
minuteIndex = Table.AddColumn(hourIndex, "MinuteIndex", each Time.Minute([DayTime])),
setDataType = Table.TransformColumnTypes(minuteIndex,{{"DayTime", type time}, {"TimeKey", type text}, {"HourIndex", Int64.Type}, {"MinuteIndex", Int64.Type}})
in
setDataType
If you use the DateKey and TimeKey (as suggested in the first answer) in your fact table, you can easily build the date/time hierarchy by placing the time element below the date element in the visualization, like this:
[Image: date-time-hierarchy]

You will want separate date & time tables. You don't want to put the time into the date table, because the time is repeated every day.
A Time dimension follows the same principle as a Date dimension, except that instead of a row for every day, you have a row for every minute or every second, depending on how exact you want to be. I wouldn't recommend including seconds unless you absolutely need them, as that greatly increases the number of rows (86,400 instead of 1,440) and impacts performance. There would be no reference to date in the time table.
E.g.
Time | Time Text| Hour | Minute | AM/PM
---------|----------|------|--------|------
12:00 AM | 12:00 AM | 12 | 00 | AM
12:01 AM | 12:01 AM | 12 | 01 | AM
12:02 AM | 12:02 AM | 12 | 02 | AM
... | ... | ... | ... | ...
I include a time/text column since Power BI has a habit of adding a date from 1899 to time data types. You can add other columns if they'd be helpful to you too.
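To make the shape of such a table concrete, here is a minimal sketch in plain Python (not Power BI code) that builds one row per minute of the day. The column names TimeKey, TimeText, Hour, Minute and AmPm are illustrative assumptions; pick whatever fits your model.

```python
def build_time_dimension():
    """Return one row per minute of the day (1440 rows), with no date part."""
    rows = []
    for minute_of_day in range(24 * 60):
        hour, minute = divmod(minute_of_day, 60)
        am_pm = "AM" if hour < 12 else "PM"
        hour_12 = hour % 12 or 12  # 0 -> 12 AM, 13 -> 1 PM
        rows.append({
            "TimeKey": f"{hour:02d}{minute:02d}00",            # 24-hour key, e.g. "000100"
            "TimeText": f"{hour_12}:{minute:02d} {am_pm}",     # text label, avoids date artifacts
            "Hour": hour_12,
            "Minute": minute,
            "AmPm": am_pm,
        })
    return rows


time_dimension = build_time_dimension()
```

A second-grain table would follow the same pattern with 86,400 rows, which is why minute grain is usually the better starting point.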
In your fact table, you'll want to split your datetime column into separate date & time columns, so that you can join the date to the date table & the time to the time table. The time will likely need to be converted to the nearest round minute or second so that every time in your data corresponds to a row in your time table.
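As a rough illustration of that split, the following Python sketch turns one timestamp into a date key and a minute-grain time key. It rounds down to the minute rather than to the nearest minute, which is an assumption made for simplicity; the key formats are also just one possible choice.

```python
from datetime import datetime


def split_timestamp(ts):
    """Split a datetime into a date key and a minute-grain time key."""
    floored = ts.replace(second=0, microsecond=0)  # round down to the minute
    date_key = floored.strftime("%Y%m%d")          # joins to the date table
    time_key = floored.strftime("%H%M%S")          # joins to the time table
    return date_key, time_key


# Example using the timestamp from the question
date_key, time_key = split_timestamp(datetime(2016, 12, 20, 12, 4, 23))
```

Every reading taken between 12:04:00 and 12:04:59 then maps to the same row of a minute-grain time table.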
It's worth keeping but hiding the original datetime field in your data in case you later want to calculate durations that span days.
In Power BI, you'd add the time attribute (or the hour (and minute) attribute) under the month/day attributes on your axis to make a column chart that can be drilled from year > quarter > month > day > hour > minute. Power BI doesn't care that the attributes come from different tables.
You can read more about time dimensions here: http://www.kimballgroup.com/2004/02/design-tip-51-latest-thinking-on-time-dimension-tables/
Hope this helps.

My approach was to create a new column with the following formula:
<new-column-name>=Format([<your-datetime-column>],"hh:mm:ss")
This creates a new column, which you can then select together with your-datetime-column to get a drill-down effect.

I created a new custom column, set its formula to [Timestamp], and changed its type to datetime.
#"Added Custom" = Table.AddColumn(#"Added Conditional Column16", "TestTimestamp", each [Timestamp]),
#"Changed Type" = Table.TransformColumnTypes(#"Added Custom",{{"TestTimestamp", type datetime}}),

Related

Pyspark GroupBy time span

I have data with a start and end date e.g.
+---+----------+------------+
| id| start| end|
+---+----------+------------+
| 1|2021-05-01| 2022-02-01|
| 2|2021-10-01| 2021-12-01|
| 3|2021-11-01| 2022-01-01|
| 4|2021-06-01| 2021-10-01|
| 5|2022-01-01| 2022-02-01|
| 6|2021-08-01| 2021-12-01|
+---+----------+------------+
I want a count for each month of how many observations were "active", in order to display that in a plot. By active I mean that the observation's start and end dates include the given month. The result for the example data should look like this:
[Image: Example of a plot for the active times]
I have looked into the pyspark Window function, but I don't think that can help me with my problem. So far my only idea is to specify an extra column for each month in the data and indicate whether the observation is active in that month and work from there. But I feel like there must be a much more efficient way to do this.
You can use the Spark SQL sequence function. It builds the list of dates from start to end at the given interval.
Then you can use explode to flatten the list into one row per month, and count.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
# Make sure your Spark session is set to UTC.
# This SQL won't work well with a month interval if the timezone is set to a place that observes daylight saving time.
spark = (SparkSession
    .builder
    .config('spark.sql.session.timeZone', 'UTC')
    # ... other config
    .getOrCreate())
# Build the list of months between start and end, then emit one row per month.
df = (df.withColumn('range', F.expr('sequence(to_date(`start`), to_date(`end`), interval 1 month)'))
    .withColumn('observation', F.explode('range')))
# Count how many observations are active in each month.
df = df.groupby('observation').count()
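To see what the sequence, explode and count steps compute, here is a small plain-Python sketch (no Spark) run on the example data from the question. The helper name month_range is hypothetical and only stands in for what sequence does with a one-month interval.

```python
from collections import Counter
from datetime import date


def month_range(start, end):
    """Yield the first day of each month from start to end, inclusive."""
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        yield date(year, month, 1)
        month += 1
        if month > 12:
            year, month = year + 1, 1


# (start, end) pairs from the question's example table
observations = [
    (date(2021, 5, 1), date(2022, 2, 1)),
    (date(2021, 10, 1), date(2021, 12, 1)),
    (date(2021, 11, 1), date(2022, 1, 1)),
    (date(2021, 6, 1), date(2021, 10, 1)),
    (date(2022, 1, 1), date(2022, 2, 1)),
    (date(2021, 8, 1), date(2021, 12, 1)),
]

# "Explode" each observation into its months, then count per month
active_per_month = Counter(
    month for start, end in observations for month in month_range(start, end)
)

for month in sorted(active_per_month):
    print(month.strftime("%Y-%m"), active_per_month[month])
```

The Spark version does the same thing, but distributed: one row per (observation, month) after explode, then a grouped count.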

Using KQL (Kusto query language), how to group datetimes into weeks (or 7-day chunks)?

I am running KQL (Kusto query language) queries against Azure Application Insights. I have certain measurements that I want to aggregate weekly. I am trying to figure out how to split my data into weeks.
To illustrate what I seek, here is a query that computes daily averages of the duration column.
requests
| where timestamp > ago(7d)
| summarize
avg(duration)
by
Date = format_datetime(timestamp, "yyyy-MM-dd")
This produces something similar to this:
In the above I have converted datetimes to string and thus effectively "rounded them down" to the precision of one day. This may be ugly, but it's the easiest way I could think of in order to group all results from a given day. It would be trivial to round down to months or years with the same technique.
But what if I want to group datetimes by week? Is there a nice way to do that?
I do not care whether my "weeks" start on Monday or Sunday or January 1st or whatever. I just want to group a collection of KQL datetimes into 7-day chunks. How can I do that?
Thanks in advance!
Looks like you are looking for the "bin()" function:
requests
| where timestamp > ago(7d)
| summarize
avg(duration)
by
bin(timestamp, 1d) // one day, for 7 days change it to 7d
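For intuition, binning amounts to rounding each timestamp down to a multiple of the bin width. The Python sketch below shows that idea for 7-day chunks. The origin (1970-01-01) is an arbitrary choice for the illustration, and Kusto's own bin alignment may differ; the sample data is made up.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def floor_to_bin(ts, width, origin=datetime(1970, 1, 1)):
    """Round ts down to the start of its fixed-width bin, counted from origin."""
    return origin + ((ts - origin) // width) * width


# Hypothetical (timestamp, duration) samples
samples = [
    (datetime(2024, 1, 1, 10), 120.0),
    (datetime(2024, 1, 3, 9), 80.0),
    (datetime(2024, 1, 9, 9), 50.0),
]

# Group durations by the start of their 7-day bin, then average each group
by_week = defaultdict(list)
for ts, duration in samples:
    by_week[floor_to_bin(ts, timedelta(days=7))].append(duration)

weekly_avg = {start: sum(vals) / len(vals) for start, vals in by_week.items()}
```

Because the bin start is a real datetime rather than a week number, the chunks stay distinct across year boundaries.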
I found out that I can use the week_of_year function to split datetimes by week number:
requests
| where timestamp > ago(30d)
| summarize
avg(duration)
by
Week = week_of_year(timestamp)
| sort by Week

Aggregate/Summarize Timeseries data in Azure Data Explorer using Kusto

I have a requirement where I need to regularize/aggregate data that is polled every second into 1-minute intervals. I also have two columns that need to be aggregated, say SensorName and SensorValue. I am able to bin the timestamp to 1 minute, but I am not able to get the corresponding two columns. How do I do that? Below is the query I used and the output I get.
Table
| where TimeStamp between (datetime(2020-09-01)..datetime(2020-09-30))
| summarize by bin(TimeStamp , 1min)
Based on my understanding of the question (which could be wrong, as there's no clear specification of sample input/schema and matching output), you could try following this example. It calculates the average sensor value for each sensor name, using an aggregation span of 1 minute:
Table
| where TimeStamp between (datetime(2020-09-01)..datetime(2020-09-30))
| summarize avg(SensorValue) by SensorName, bin(TimeStamp, 1min)
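The grouping that this query performs can be sketched in plain Python as follows: each reading is keyed by sensor name and by its timestamp truncated to the minute, and the values in each group are averaged. The sample readings are invented for the illustration.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical (timestamp, sensor name, sensor value) readings polled every second
readings = [
    (datetime(2020, 9, 1, 0, 0, 1), "temp", 20.0),
    (datetime(2020, 9, 1, 0, 0, 2), "temp", 22.0),
    (datetime(2020, 9, 1, 0, 0, 2), "pressure", 1.0),
    (datetime(2020, 9, 1, 0, 1, 5), "temp", 30.0),
]

# Key each reading by (sensor name, minute bucket)
groups = defaultdict(list)
for ts, name, value in readings:
    minute_bucket = ts.replace(second=0, microsecond=0)
    groups[(name, minute_bucket)].append(value)

# Average the values inside each bucket
averages = {key: sum(vals) / len(vals) for key, vals in groups.items()}
```

Swapping the average for a minimum, maximum or count only changes the last step, just as it would only change the aggregation function in the summarize clause.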

How to add Two Columns (Date/Time) Together in Tableau

I have a dataset which was imported from Excel to Tableau. In Excel the data is listed as "8:15:00 AM". When it's imported into Tableau it's now Date & Time as "12/30/1899 10:45:00 AM".
What I'm trying to perform is an addition problem between two date & time columns. An example being:
Sleep: 12/30/1899 10:45:00
Eating: 12/30/1899 00:45:00
Sleep + Eating should yield 10:45 + 00:45 = 11:30
After much googling and video watching, I have not found a solution.
Creating a calculated field as such should solve the problem:
DATEADD(
'hour', DATEPART('hour', [Eating]), DATEADD(
'minute', DATEPART('minute', [Eating]), DATEADD(
'second', DATEPART('second', [Eating]), [Sleep])))
From there, the time display of the field can be adjusted as necessary:
Right-click field name > Default Properties > Date Format
You can achieve the desired result by creating three calculated fields. Let's call one of the calculated fields [hours] and another one [minutes]. A third field will use the values from [hours] and [minutes] in order to properly display the results. Let's call this third field [Sleep + Eating].
To create [hours]:
//[hours] calculated field
DATEPART('hour', [Sleep])+ DATEPART('hour', [Eating])
To create [minutes]:
//[minutes] calculated field
DATEPART('minute', [Sleep])+DATEPART('minute', [Eating])
At this point [hours] and [minutes] are just the sums of the integer values from [Sleep] and [Eating]. To display these sums in the format you requested, you'll need a third calculated field that concatenates the hours and the minutes, and that also handles situations where the minutes add up to 60 or more, or to less than 10. The [Sleep + Eating] calculated field achieves this.
Here is the [Sleep + Eating] calculated field:
//[Sleep + Eating] calculated field
IF [minutes]< 10 THEN STR([hours])+":0"+STR([minutes])
ELSEIF [minutes]< 60 THEN STR([hours])+":"+STR([minutes])
ELSEIF [minutes]>= 70 THEN STR([hours]+1)+":"+STR([minutes]-60)
ELSE STR([hours]+1)+":0"+STR([minutes]-60)
END
Other solutions that try to treat your data like a date field (see @Daniel Sims' answer) could be problematic: if the hours sum to more than 23, the result will add a day. For example, let's say you had these values in Excel:
If you use my solution, the result will be 25:02. If you use @Daniel Sims' answer, the result would be 12/31/1899 1:02:00 AM, which I don't think is what you want.
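The carry logic in the calculated fields above can be expressed more compactly as arithmetic on total minutes. Here is a minimal Python sketch of that idea (not Tableau syntax); the function name add_clock_times is hypothetical.

```python
def add_clock_times(a, b):
    """Add two (hours, minutes) pairs and format the total as H:MM."""
    total_minutes = (a[0] + b[0]) * 60 + a[1] + b[1]
    hours, minutes = divmod(total_minutes, 60)  # carry minutes over into hours
    return f"{hours}:{minutes:02d}"             # zero-pad minutes below 10


# Sleep 10:45 plus Eating 00:45, as in the question
total = add_clock_times((10, 45), (0, 45))
print(total)
```

Because the hours are never wrapped at 24, a sum such as 23:30 plus 1:32 comes out as 25:02 rather than rolling over into the next day.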

Apache Drill: Group by week

I tried to group my daily data by week (given a reference date) to generate a smaller panel data set.
I used postgres before and there it was quite easy:
CREATE TABLE videos_weekly AS SELECT channel_id,
CEIL(DATE_PART('day', observation_date - '2016-02-10')/7) AS week
FROM videos GROUP BY channel_id, week;
But it seems that it is not possible to subtract a date string from a timestamp in Drill. I found the AGE function, which returns an interval between two dates, but how do I convert this into an integer (number of days or weeks)?
DATE_SUB may help you here. Following is an example:
SELECT extract(day from date_sub('2016-11-13', cast('2015-01-01' as timestamp)))/7 FROM (VALUES(1));
This will return the number of weeks between 2015-01-01 and 2016-11-13.
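For reference, the week arithmetic from the Postgres query in the question (days since a reference date, divided by 7 and rounded up) looks like this in plain Python. This only illustrates the intended calculation; it makes no claim about how Drill evaluates intervals, and the function name week_number is hypothetical.

```python
import math
from datetime import date


def week_number(observation_date, reference_date):
    """Number of 7-day weeks since reference_date, rounded up."""
    days = (observation_date - reference_date).days
    return math.ceil(days / 7)


# Reference date taken from the question's Postgres query
reference = date(2016, 2, 10)
first_week = week_number(date(2016, 2, 17), reference)   # 7 days later
second_week = week_number(date(2016, 2, 18), reference)  # 8 days later
```

Grouping by this value per channel reproduces the weekly panel described in the question.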
