Power BI - Calculating Sum Between 2 Times with DAX

I have half-hourly consumption data and I need to calculate the sum of consumption that takes place between 2:30 AM and 5:00 AM.
I have achieved this in Excel with a SUMIF statement. How do I do this with DAX, though?

Assuming you have a table with columns similar to those in the sample table below, I've included the DAX to calculate the sum of consumption between the times given. This also assumes that you want to calculate this sum for ALL days between 2:30 AM and 5:00 AM.
Sample Table (Table)
id  consumption  timestamp
1   4            2022-05-28 02:00
2   4            2022-05-28 02:30
3   5            2022-05-28 03:00
4   5            2022-05-28 03:30
5   6            2022-05-28 04:00
6   6            2022-05-28 04:30
7   5            2022-05-28 05:00
8   5            2022-05-28 05:30
Solution Measure
Consumption Sum =
CALCULATE(
    SUM('Table'[consumption]),
    TIME(
        HOUR('Table'[timestamp]),
        MINUTE('Table'[timestamp]),
        SECOND('Table'[timestamp])
    ) >= TIMEVALUE("02:30:00"),
    TIME(
        HOUR('Table'[timestamp]),
        MINUTE('Table'[timestamp]),
        SECOND('Table'[timestamp])
    ) <= TIMEVALUE("05:00:00")
)
A similar result could be achieved using the SUMX function if that's more intuitive for you.
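To sanity-check the measure outside Power BI, the same filter can be sketched row by row in Python. This is only an illustration under assumptions: the names rows, START, END and consumption_between are hypothetical, the data is the sample table from this answer, and both boundaries are treated as inclusive, as in the measure.

```python
from datetime import datetime, time

# Sample rows copied from the sample table: (id, consumption, timestamp)
rows = [
    (1, 4, "2022-05-28 02:00"),
    (2, 4, "2022-05-28 02:30"),
    (3, 5, "2022-05-28 03:00"),
    (4, 5, "2022-05-28 03:30"),
    (5, 6, "2022-05-28 04:00"),
    (6, 6, "2022-05-28 04:30"),
    (7, 5, "2022-05-28 05:00"),
    (8, 5, "2022-05-28 05:30"),
]

START = time(2, 30)  # 2:30 AM, inclusive
END = time(5, 0)     # 5:00 AM, inclusive


def consumption_between(rows, start, end):
    """Sum consumption for rows whose time of day lies in [start, end], on any date."""
    total = 0
    for _id, consumption, stamp in rows:
        # Drop the date part so the comparison applies to every day
        time_of_day = datetime.strptime(stamp, "%Y-%m-%d %H:%M").time()
        if start <= time_of_day <= end:
            total += consumption
    return total


print(consumption_between(rows, START, END))  # 31
```

For the sample table the six rows from 02:30 through 05:00 qualify, so the expected total is 4 + 5 + 5 + 6 + 6 + 5 = 31.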

Related

Upsample data with mean

I am trying to upsample my datetime data and fill in the gap with a mean rather than forward or backward fill.
Sample df-
TIME VALUE
01:00 4
02:00 8
03:00 2
desired output-
TIME VALUE
01:00 4
01:30 6
02:00 8
02:30 5
03:00 2
Currently I do a straightforward resample('30min') and want to fill the NaN values
TIME VALUE
01:00 4
01:30 NaN
02:00 8
02:30 NaN
03:00 2
With the mean rather than backward or forward fill.
I figured out one way to solve the problem:
df2 = df.resample('30min', on='Time')
final = df2.interpolate(method='linear')
But I would be keen to look at other ways to do this apart from interpolation!
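One alternative, sketched here in plain Python rather than pandas, is to insert a midpoint between each pair of neighbouring rows and give it the mean of the two neighbours. The function name upsample_with_mean is hypothetical, and the sketch assumes the rows are sorted by time and that there is at least one row. For a single missing point between two known values this gives the same numbers as linear interpolation. In pandas the same idea is usually written by averaging a forward-filled and a backward-filled copy of the resampled data.

```python
from datetime import datetime


def upsample_with_mean(rows):
    """rows: list of ("HH:MM", value) sorted by time.

    Returns the rows with one midpoint inserted between each pair of
    neighbours, valued at the mean of the two neighbouring values.
    """
    fmt = "%H:%M"
    out = []
    for (t0, v0), (t1, v1) in zip(rows, rows[1:]):
        a = datetime.strptime(t0, fmt)
        b = datetime.strptime(t1, fmt)
        midpoint = a + (b - a) / 2  # halfway between the two timestamps
        out.append((t0, v0))
        out.append((midpoint.strftime(fmt), (v0 + v1) / 2))
    out.append(rows[-1])  # the last original row has no right-hand neighbour
    return out


sample = [("01:00", 4), ("02:00", 8), ("03:00", 2)]
result = upsample_with_mean(sample)
for row in result:
    print(row)
# ('01:00', 4)
# ('01:30', 6.0)
# ('02:00', 8)
# ('02:30', 5.0)
# ('03:00', 2)
```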

How to calculate time elapsed between rows in R

I have date and time as separate columns, which I combined into a single column using library(lubridate).
Now I want to create a new column that calculates the elapsed time between two consecutive rows for each unique ID.
I tried diff, but the error I am getting is that the new column has +1 rows compared to the original data set.
s1$DT <- with(s1, mdy(Date.of.Collection) + hm(MILITARY.TIME)) # this worked; needs library(lubridate)
s1$ElapsedTime <- diff(s1$DT)
units(s1$ElapsedTime) <- "hours"
Subject.ID time DT Time elapsed
1 Dose 8/1/2018 8:15 0
1 time point1 8/1/2018 9:56 0.070138889
1 time point2 8/2/2018 9:56 1.070138889
2 Dose 9/4/2018 10:50 0
2 time point1 9/11/2018 11:00 7.006944444
3 Dose 10/1/2018 10:20 0
3 time point1 10/2/2018 14:22 1.168055556
3 time point2 10/3/2018 12:15 2.079861111
From your comment, you don't need a "diff"; in conventional R-speak, a "diff" would be T1-T0, T2-T1, T3-T2, ..., Tn - Tn-1.
For you, one of these will work to give you T1,2,...,n - T0.
Base R
do.call(
  rbind,
  by(patients, patients$Subject.ID, function(x) {
    x$elapsed <- x$realDT - x$realDT[1]
    units(x$elapsed) <- "hours"
    x
  })
)
# Subject.ID time1 DT Time elapsed realDT
# 1.1 1 Dose 8/1/2018 8:15 0.000000 hours 2018-08-01 08:15:00
# 1.2 1 time_point1 8/1/2018 9:56 1.683333 hours 2018-08-01 09:56:00
# 1.3 1 time_point2 8/2/2018 9:56 25.683333 hours 2018-08-02 09:56:00
# 2.4 2 Dose 9/4/2018 10:50 0.000000 hours 2018-09-04 10:50:00
# 2.5 2 time_point1 9/11/2018 11:00 168.166667 hours 2018-09-11 11:00:00
# 3.6 3 Dose 10/1/2018 10:20 0.000000 hours 2018-10-01 10:20:00
# 3.7 3 time_point1 10/2/2018 14:22 28.033333 hours 2018-10-02 14:22:00
# 3.8 3 time_point2 10/3/2018 12:15 49.916667 hours 2018-10-03 12:15:00
dplyr
library(dplyr)
patients %>%
  group_by(Subject.ID) %>%
  mutate(elapsed = `units<-`(realDT - realDT[1], "hours")) %>%
  ungroup()
data.table
library(data.table)
patDT <- copy(patients)
setDT(patDT)
patDT[, elapsed := `units<-`(realDT - realDT[1], "hours"), by = "Subject.ID"]
Notes:
The "hours" in the $elapsed column is just an artifact of dealing with a time-difference thing, it should not affect most operations. To get rid of it, make sure you're in the right units ("hours", "secs", ..., see ?units) and use as.numeric.
The only reasons I used as.POSIXct (see the data section below) are that I'm not a lubridate user, and the data as provided is not in a time format. You shouldn't need it if your Time is a proper time format, in which case you'd use that field instead of my hacky realDT.
On similar lines, if you do calculate realDT and use it, you really don't need both realDT and the pair of DT and Time.
The data I used:
patients <- read.table(header=TRUE, stringsAsFactors=FALSE, text="
Subject.ID time1 DT Time elapsed
1 Dose 8/1/2018 8:15 0
1 time_point1 8/1/2018 9:56 0.070138889
1 time_point2 8/2/2018 9:56 1.070138889
2 Dose 9/4/2018 10:50 0
2 time_point1 9/11/2018 11:00 7.006944444
3 Dose 10/1/2018 10:20 0
3 time_point1 10/2/2018 14:22 1.168055556
3 time_point2 10/3/2018 12:15 2.079861111")
# this is necessary for me because DT/Time here are not POSIXt (they're just strings)
patients$realDT <- as.POSIXct(paste(patients$DT, patients$Time), format = "%m/%d/%Y %H:%M")
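As a cross-check of the numbers above, the same "each timestamp minus the subject's first timestamp" logic can be sketched in Python. The names records and elapsed_hours are hypothetical, and the sketch assumes the rows are already ordered so that each subject's Dose row comes first.

```python
from datetime import datetime

# (Subject.ID, DT, Time) copied from the data used in this answer
records = [
    (1, "8/1/2018", "8:15"),
    (1, "8/1/2018", "9:56"),
    (1, "8/2/2018", "9:56"),
    (2, "9/4/2018", "10:50"),
    (2, "9/11/2018", "11:00"),
    (3, "10/1/2018", "10:20"),
    (3, "10/2/2018", "14:22"),
    (3, "10/3/2018", "12:15"),
]


def elapsed_hours(records):
    """Return (subject, hours since that subject's first row) for every row."""
    first_seen = {}
    out = []
    for subject, date, clock in records:
        stamp = datetime.strptime(f"{date} {clock}", "%m/%d/%Y %H:%M")
        # The first timestamp seen for a subject becomes its reference point (T0)
        start = first_seen.setdefault(subject, stamp)
        hours = (stamp - start).total_seconds() / 3600
        out.append((subject, round(hours, 6)))
    return out


for subject, hours in elapsed_hours(records):
    print(subject, hours)
```

The printed hours should agree with the elapsed column in the Base R output (for example 1.683333 and 25.683333 for subject 1).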

Accounting for Time in R

I have a log of times for 2 periods (1 & 2) in a data frame. I need to account for the time accumulated for each person based on a third column 'in' vs 'out'. I then need to create an additional column to track the sum of accumulated time for both periods.
Period Time Subs
1 10:00 'Peter in'
1 .
1 .
1 8:00 'Peter out' #In this period he has accumulated 2 minutes
2 10:00 'Peter in'
2 .
2 2:00 'Peter out' #In this period he has accumulated 8 minutes
I know I need to use an if and ifelse statement but I'm not sure how to start. I started and stopped learning R and now I'm trying to pick back up where I left off.
It depends a lot on how your data is formatted, of course. Suppose you have something like this:
df <- data.frame(Period=c(1,1,1,1,2,2,2), Time=c("10:00",NA,NA,"8:00","10:00",NA,"2:00"))
> df
Period Time
1 1 10:00
2 1 <NA>
3 1 <NA>
4 1 8:00
5 2 10:00
6 2 <NA>
7 2 2:00
If the Time variable is formatted as character, you can extract the minutes (the part before the colon) like so:
df$Min <- as.numeric(sapply(strsplit(as.character(df$Time), ":"), "[[", 1))
> df
Period Time Min
1 1 10:00 10
2 1 <NA> NA
3 1 <NA> NA
4 1 8:00 8
5 2 10:00 10
6 2 <NA> NA
7 2 2:00 2
This is much easier if you can have the Min column already as numeric!
Then, an easy way to return the total time accumulated for each period is the diff of the range for each period, within a tapply() call.
tapply(df$Min, df$Period, function(x) diff(range(x, na.rm=TRUE)))
1 2
2 8
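The same "range per period" idea can be sketched in Python for comparison. The function name accumulated_by_period is hypothetical, missing times are represented as None, and, as in the R code, only the part before the colon is used.

```python
def accumulated_by_period(periods, times):
    """Return {period: max minus min of the leading time component}, skipping missing times."""
    bounds = {}
    for period, clock in zip(periods, times):
        if clock is None:  # plays the role of na.rm=TRUE
            continue
        minutes = int(clock.split(":")[0])
        low, high = bounds.get(period, (minutes, minutes))
        bounds[period] = (min(low, minutes), max(high, minutes))
    return {period: high - low for period, (low, high) in bounds.items()}


periods = [1, 1, 1, 1, 2, 2, 2]
times = ["10:00", None, None, "8:00", "10:00", None, "2:00"]
print(accumulated_by_period(periods, times))  # {1: 2, 2: 8}
```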

Given time column, how can I create time bins in R?

Given a dataframe with say 3 columns:
date time respond
1/1/2018 15:40 1
4/5/2017 08:25 0
3/4/2016 09:00 1
5/4/2017 09:25 1
....
I want to bin my time column into, say, 24 bins, one for each hour. For example, if I have 50 samples, I want all times between one hour and the next (08:00 - 09:00) to fall into the 08:00 bin, and so on.
Once I have this, I want to count how many responders I have within each bin:
bin08:00 = 10 responders
bin09:00 = 134 responders
and plot it using ggplot2.
Also, please guide me on how to create a different bin map:
from 08:00 to 12:00 - hourly bins.
12:00 - 15:00 - 15-minute bins, etc.
Please guide me on how I can do this.
One way to do this is to use strptime to format your time column as POSIX objects, and then use format on those objects to round down to the hour like so:
library(dplyr)
df$hour <- format(strptime(df$time, "%H:%M"), "%H:00")
df %>% group_by(hour) %>% summarize(respond = sum(respond))
# # A tibble: 3 x 2
# hour respond
# <chr> <int>
# 1 08:00 0
# 2 09:00 2
# 3 15:00 1
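The rounding-down step generalises to other bin widths, such as the 15-minute bins asked about. Here is a minimal Python sketch of that idea. The names bin_label and responders_per_bin are hypothetical, and the sketch applies one bin width to all rows; mixing hourly and 15-minute bins by time of day would need an extra rule for choosing the width.

```python
from collections import defaultdict


def bin_label(clock, width_minutes=60):
    """Round an "HH:MM" string down to the start of its bin."""
    hours, minutes = map(int, clock.split(":"))
    total = hours * 60 + minutes
    start = total - total % width_minutes  # floor to a multiple of the width
    return f"{start // 60:02d}:{start % 60:02d}"


def responders_per_bin(rows, width_minutes=60):
    """Sum the respond flag per bin; rows are (time, respond) pairs."""
    counts = defaultdict(int)
    for clock, respond in rows:
        counts[bin_label(clock, width_minutes)] += respond
    return dict(sorted(counts.items()))


# time and respond columns from the question's sample rows
rows = [("15:40", 1), ("08:25", 0), ("09:00", 1), ("09:25", 1)]
print(responders_per_bin(rows))      # {'08:00': 0, '09:00': 2, '15:00': 1}
print(responders_per_bin(rows, 15))  # {'08:15': 0, '09:00': 1, '09:15': 1, '15:30': 1}
```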

time frequency in R

Good afternoon, colleagues! I have a problem with the following task: I need to plot a time-series graph using the parameter "frequency", which defines the time between two observations in my graph. The data are shown below:
date time open high low close
1 1999.04.08 11:00 1.0803 1.0817 1.0797 1.0809
2 1999.04.08 12:00 1.0808 1.0821 1.0806 1.0807
3 1999.04.08 13:00 1.0809 1.0814 1.0801 1.0813
4 1999.04.08 14:00 1.0819 1.0845 1.0815 1.0844
5 1999.04.08 15:00 1.0839 1.0857 1.0832 1.0844
6 1999.04.08 16:00 1.0842 1.0852 1.0824 1.0834
By default the frequency in this data is 1 hour, but I have two questions:
- How can I determine this frequency from the data automatically, so that it also works for other data? (I tried selecting the time column and calculating frequency = time[2]-time[1], but I got an error.)
- If the required frequency is 3 hours, how do I select the data at that frequency (in other words: the 1st observation, then the 4th, then the 7th, etc.)?
Thank you!
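Both parts of the question can be illustrated with a short Python sketch (not R, and not an answer from the original thread). It assumes the date and time columns are combined into one string and parsed into real timestamps before subtracting, and that observations are evenly spaced. The names infer_step and select_every are hypothetical.

```python
from datetime import datetime, timedelta

# date, time and close columns from the question's sample data
stamps = [
    "1999.04.08 11:00", "1999.04.08 12:00", "1999.04.08 13:00",
    "1999.04.08 14:00", "1999.04.08 15:00", "1999.04.08 16:00",
]
closes = [1.0809, 1.0807, 1.0813, 1.0844, 1.0844, 1.0834]


def infer_step(stamps):
    """Spacing between the first two observations, as a timedelta."""
    first, second = (datetime.strptime(s, "%Y.%m.%d %H:%M") for s in stamps[:2])
    return second - first


def select_every(stamps, values, target):
    """Keep rows spaced by target, assuming target is a multiple of the inferred step."""
    stride = target // infer_step(stamps)  # e.g. 3 hours // 1 hour == 3
    return list(zip(stamps, values))[::stride]


print(infer_step(stamps))  # 1:00:00
print(select_every(stamps, closes, timedelta(hours=3)))
# [('1999.04.08 11:00', 1.0809), ('1999.04.08 14:00', 1.0844)]
```

Subtracting the raw time strings is a likely cause of the error mentioned in the question; parsing them into timestamps first makes the difference well defined.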
