Splunk Understanding Difference with epoch time and adding min - case

I have been able to read the blogs and understand somewhat how the calculations are taking place, but need further clarity.
Case: Ingestion_Time_Logged will output the CreationTime event on the 7th or 37th minute, regardless of when the actual CreationTime event occurred. Example:
If the CreationTime event occurred at "2021-03-06 07:38:59.000", then Ingestion_Time_Logged will be the next 7th or 37th minute. In this case Ingestion_Time_Logged will be
"2021-03-06 08:07:59.000". The code works fine. I am just trying to understand how it is calculating it.
What I understand so far:
% is the modulo operator, which returns the remainder of a division. Instead of returning the quotient, the modulo operation returns the whole-number remainder.
latestComposed_min%30 divides the minutes by 30 and returns the remainder. I used it to find whether the minute is between 0 and 6 or between 30 and 36. It is the same as the condition below, but simpler and more efficient.
if(latestComposed_min < 7 OR (latestComposed_min>=30 AND latestComposed_min < 37))
Minutes 7 and 37 are excluded so that they stay as they are, since they are already correct; if we did not exclude them, they would be pushed forward to the next :37 or :07 slot. Because both comparisons are negated (!=), AND must be used between them; with OR the result would always be true, which is wrong.
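The equivalence claimed for the modulo test can be checked by brute force over all 60 minute values. This is only a sketch in Python rather than SPL, and the function names are made up for illustration. It also shows that the range form matches `% 30 < 7` exactly only if minute 30 is included in the second range.

```python
def modulo_test(minute: int) -> bool:
    # The compact form: remainder after dividing the minute by 30.
    return minute % 30 < 7

def range_test(minute: int) -> bool:
    # The spelled-out form, with minute 30 included.
    return minute < 7 or 30 <= minute < 37

def strict_range_test(minute: int) -> bool:
    # Same as above but with a strict "> 30", which drops minute 30.
    return minute < 7 or 30 < minute < 37

matching = [m for m in range(60) if modulo_test(m)]
mismatches = [m for m in range(60) if modulo_test(m) != range_test(m)]
strict_mismatches = [m for m in range(60) if modulo_test(m) != strict_range_test(m)]

print(matching)           # minutes 0-6 and 30-36
print(mismatches)         # []
print(strict_mismatches)  # [30]
```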
The piece I am struggling with:
| eval Ingestion_Time_Logged=strftime(case(latestCreated_min%30 < 7, CreationTime_epoch-CreationTime_epoch%1800+420+latestCreated_sec, latestCreated_min!=37 AND latestCreated_min!=7, CreationTime_epoch-CreationTime_epoch%1800+2220+latestCreated_sec,1=1,CreationTime_epoch),"%Y-%m-%d %H:%M:%S.%6N")
Full Code:
| makeresults
| eval CreationTime="2021-03-06 07:38:59.000"
| eval CreationTime_epoch=strptime(CreationTime, "%Y-%m-%d %H:%M:%S.%6N")
| eval latestCreated_hour=tonumber(strftime(CreationTime_epoch, "%H"))
| eval latestCreated_min=tonumber(strftime(CreationTime_epoch, "%M"))
| eval latestCreated_sec=round(CreationTime_epoch%60,6)
| eval Ingestion_Time_Logged=strftime(case(latestCreated_min%30 < 7, CreationTime_epoch-CreationTime_epoch%1800+420+latestCreated_sec, latestCreated_min!=37 AND latestCreated_min!=7, CreationTime_epoch-CreationTime_epoch%1800+2220+latestCreated_sec,1=1,CreationTime_epoch),"%Y-%m-%d %H:%M:%S.%6N")
| table Ingestion_Time_Logged, CreationTime, CreationTime_epoch, latestCreated_hour, latestCreated_min

Let's break it down
Here, we're setting the value of the Ingestion_Time_Logged field to the result of the strftime function. That is, we're converting an epoch time into a string.
| eval Ingestion_Time_Logged=strftime(
The epoch to convert is determined by a case statement.
case(
If the created minute (38 in the example) is 0-6 or 30-36
latestCreated_min%30 < 7,
then round down to the start of the half hour. The %1800 is the same as %30 above, only in seconds rather than minutes. Subtracting that remainder from the epoch time shifts it back to the top or bottom of the hour. The 420 adds 7 minutes, and then we add the seconds back in.
CreationTime_epoch-CreationTime_epoch%1800+420+latestCreated_sec,
If the created minute is neither 7 nor 37 (our case)
latestCreated_min!=37 AND latestCreated_min!=7,
then we do the same as above, except using 37 minutes, advancing the time to the next :07 or :37.
CreationTime_epoch-CreationTime_epoch%1800+2220+latestCreated_sec,
This is the default (catch-all) case, which takes the CreationTime_epoch field as-is.
1=1,CreationTime_epoch),
This is how the timestamp string will be formatted.
"%Y-%m-%d %H:%M:%S.%6N")
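The breakdown above can be replayed outside Splunk. The following is a minimal Python sketch, not SPL; the function name `ingestion_time_logged` is hypothetical, and it assumes the timestamp is treated as UTC so that half-hour boundaries in epoch seconds line up with :00 and :30 on the clock.

```python
from datetime import datetime, timezone

def ingestion_time_logged(creation: datetime) -> datetime:
    # Treat the naive timestamp as UTC (an assumption of this sketch).
    epoch = creation.replace(tzinfo=timezone.utc).timestamp()
    minute = creation.minute
    sec = epoch % 60                        # latestCreated_sec
    half_hour_start = epoch - epoch % 1800  # back to :00 or :30

    if minute % 30 < 7:
        # Minutes 0-6 or 30-36: move to :07 / :37 of the same half hour.
        result = half_hour_start + 420 + sec
    elif minute not in (7, 37):
        # Minutes 8-29 or 38-59: move to the next :37 / :07.
        result = half_hour_start + 2220 + sec
    else:
        # Minute 7 or 37: keep the time as-is (the 1=1 default).
        result = epoch
    return datetime.fromtimestamp(result, tz=timezone.utc).replace(tzinfo=None)

print(ingestion_time_logged(datetime(2021, 3, 6, 7, 38, 59)))  # 2021-03-06 08:07:59
print(ingestion_time_logged(datetime(2021, 3, 6, 7, 2, 10)))   # 2021-03-06 07:07:10
print(ingestion_time_logged(datetime(2021, 3, 6, 7, 37, 15)))  # 2021-03-06 07:37:15
```

For the example in the question, 07:38:59 drops back to 07:30:00, gains 2220 seconds (37 minutes) to reach 08:07:00, and then gets its 59 seconds back.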

Related

Datetime column from Table1 is not matching the DateTime column from Table 2

Hello I have an issue of matching two different datetime columns.
I need to compare the two of them (and their data), but at the moment of putting them in the same table (using a datetime relation) I do not get the match I need:
What I need:
| Datetime_1 | Datetime_2 |
| ---------- | ---------- |
| 01/01/2023 08:00:00 AM | |
| ... | ... |
| 01/11/2023 12:00:00 AM | 01/11/2023 12:00:00 AM |
| 01/11/2023 01:00:00 AM | 01/11/2023 01:00:00 AM |
| ... | ... |
| 31/01/2023 12:00:00 PM | 31/01/2023 12:00:00 PM |
What I get:
Datetime_1 goes from 01/01/2023 12:00:00AM to 01/31/2023 11:00:00PM (with steps of 1h) and Datetime_2 goes from 01/11/2023 8:15:00 PM to 02/06/2023 7:45:00 PM (with steps of 30min).
I created a relationship between the two of them and did not receive any error.
I already set both lists to Date/Time format in Power Query and in the Data panel.
However, I noticed that my main datetime list doesn't have the hierarchy icon in the Fields panel, while the secondary datetime list has it (but without the hour section).
Also, as I mentioned before, my list has a range between Jan and Feb. I do not understand why this range continues and matches some dates in my main datetime list.
Troubleshooting
Part of the difficulty troubleshooting this is the two columns are formatted differently. Just for now, make sure both are formatted as Long Date Time. When comparing the relationship, do not drag the hierarchy (for the one that has it) into the table but rather just the date itself. When you do, you will see the full timestamp for both columns and the issue will become more clear.
Power BI & Relationships on DateTime
Power BI will only match related rows if the date and time match exactly, so 4/15/2023 12:00:00 AM will not match 4/15/2023 12:00:01 AM. You mentioned one side of the relationship has 30 minute steps while the other has 1 hour steps. Power BI is not going to match up a 1:30am and 1:00am value for you. If you want that 1:30 value to match up to 1:00, create another column truncating the :30 minutes and build your relationship on the truncated column.
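The truncation idea can be illustrated outside Power BI. This is only a Python sketch of the logic, not Power Query or DAX, and `truncate_to_hour` is a hypothetical helper name: once the minutes are dropped, a 1:30 value and a 1:00 value produce the same key and can be matched exactly.

```python
from datetime import datetime

def truncate_to_hour(ts: datetime) -> datetime:
    # Drop minutes, seconds and microseconds so that 01:30 becomes 01:00.
    return ts.replace(minute=0, second=0, microsecond=0)

hourly_value = datetime(2023, 1, 11, 1, 0)
half_hour_value = datetime(2023, 1, 11, 1, 30)

print(hourly_value == half_hour_value)                    # False: exact match fails
print(hourly_value == truncate_to_hour(half_hour_value))  # True: truncated key matches
```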
Time Dimension
I'm not sure of your application so don't know if this will work, but when dealing with time, I try to separate Date and Time into separate columns and have both a Date and Time dimension. Below is my time dimension DAX. You can generate any minute-precise interval with it. Notice the last defined column "timekey". I create a column in my fact table to relate to this key.
DimTime =
var every_n_minutes = 15 /* between 0 and 60; remainders in last hourly slice */
/* DO NOT CHANGE BELOW THIS LINE */
var slice_per_hour = trunc(DIVIDE(60,every_n_minutes),0)
var rtn =
    ADDCOLUMNS(
        SELECTCOLUMNS(
            GENERATESERIES(0, 24*slice_per_hour - 1, 1),
            "hour24", TRUNC(DIVIDE([Value],slice_per_hour),0),
            "mins", MOD([Value],slice_per_hour) * every_n_minutes
        ),
        "hour12", MOD([hour24] + 11,12) + 1,
        "asTime", TIME([hour24],[mins],0),
        "timekey", [hour24] * 100 + [mins]
    )
return rtn
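To make the arithmetic in the DAX above easier to follow, here is a rough Python rendering of the same steps. It is a sketch only: the function names are hypothetical, and `timekey_for` is one possible way (an assumption, not part of the DAX) to derive the matching key for a fact-table timestamp.

```python
from datetime import datetime

def build_time_dimension(every_n_minutes: int = 15) -> list:
    # Mirrors the DAX: one row per slice, 24 * slices_per_hour rows in total.
    slices_per_hour = 60 // every_n_minutes
    rows = []
    for value in range(24 * slices_per_hour):
        hour24 = value // slices_per_hour
        mins = (value % slices_per_hour) * every_n_minutes
        rows.append({
            "hour24": hour24,
            "mins": mins,
            "hour12": (hour24 + 11) % 12 + 1,
            "timekey": hour24 * 100 + mins,
        })
    return rows

def timekey_for(ts: datetime, every_n_minutes: int = 15) -> int:
    # Round the minute down to its slice and build the same hhmm-style key.
    return ts.hour * 100 + (ts.minute // every_n_minutes) * every_n_minutes

dim = build_time_dimension(15)
print(len(dim))                                       # 96
print(dim[5])                                         # hour24 1, mins 15, timekey 115
print(timekey_for(datetime(2023, 1, 11, 20, 15)))     # 2015
```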
As requested, turning this into an answer. The reason you're getting these results is that your timestamps will never line up. Yes, it let you create the join, but my guess is that this is only because both fields have the same formatting. Also, it is best practice to separate your dates and times into separate date and time dimensions, then join them via a fact table.

Informix FROM_UNIXTIME alternative

I was searching for a way to group data by interval (e.g. every 30 minutes) using the date defined in that table, so I need to convert that datetime to milliseconds so that I can divide it by the interval I need, as in this query:
SELECT FLOOR(UNIX_TIMESTAMP(timestamp)/(15 * 60 * 1000)) AS timekey
FROM table
GROUP BY timekey;
This query runs perfectly on SQL Server, but on Informix it gives me the error
Routine (unix_timestamp) can not be resolved.
because the routine is not defined in IBM Informix server.
So I need a direct way to get the Unix epoch time from a DATETIME YEAR TO FRACTION(3) timestamp column in IBM Informix server, like UNIX_TIMESTAMP in SQL Server.
If the timestamp column is of type DATETIME YEAR TO SECOND or similar, then you can convert it to a DECIMAL(18,5) number of seconds since the Unix Epoch, aka 1970-01-01 00:00:00Z (UTC; time zone offset +00:00) using a procedure such as this:
{
# "#(#)$Id: tounixtime.spl,v 1.6 2002/09/25 18:10:48 jleffler Exp $"
#
# Stored procedure TO_UNIX_TIME written by Jonathan Leffler (previously
# jleffler#informix.com and now jleffler#us.ibm.com). Includes fix for
# bug reported by Tsutomu Ogiwara <Tsutomu.Ogiwara#ctc-g.co.jp> on
# 2001-07-13. Previous version used DATETIME(0) SECOND TO SECOND
# instead of DATETIME(0:0:0) HOUR TO SECOND, and when the calculation
# extended the shorter constant to DATETIME HOUR TO SECOND, it added the
# current hour and minute fields, as documented in the Informix Guide to
# SQL: Syntax manual under EXTEND in the section on 'Expression'.
# Amended 2002-08-23 to handle 'eternity' and annotated more thoroughly.
# Amended 2002-09-25 to handle fractional seconds, as companion to the
# new stored procedure FROM_UNIX_TIME().
#
# If you run this procedure with no arguments (use the default), you
# need to worry about the time zone the database server is using because
# the value of CURRENT is determined by that, and you need to compensate
# for it if you are using a different time zone.
#
# Note that this version works for dates after 2001-09-09 when the
# interval between 1970-01-01 00:00:00+00:00 and current exceeds the
# range of INTERVAL SECOND(9) TO SECOND. Returning DECIMAL(18,5) allows
# it to work for all valid datetime values including fractional seconds.
# In the UTC time zone, the 'Unix time' of 9999-12-31 23:59:59 is
# 253402300799 (12 digits); the equivalent for 0001-01-01 00:00:00 is
# -62135596800 (11 digits). Both these values are unrepresentable in
# 32-bit integers, of course, so most Unix systems won't handle this
# range, and the so-called 'Proleptic Gregorian Calendar' used to
# calculate the dates ignores locale-dependent details such as the loss
# of days that occurred during the switch between the Julian and
# Gregorian calendar, but those are minutiae that most people can ignore
# most of the time.
}
CREATE PROCEDURE to_unix_time(d DATETIME YEAR TO FRACTION(5)
                              DEFAULT CURRENT YEAR TO FRACTION(5))
    RETURNING DECIMAL(18,5);
    DEFINE n  DECIMAL(18,5);
    DEFINE i1 INTERVAL DAY(9) TO DAY;
    DEFINE i2 INTERVAL SECOND(6) TO FRACTION(5);
    DEFINE s1 CHAR(15);
    DEFINE s2 CHAR(15);
    LET i1 = EXTEND(d, YEAR TO DAY) - DATETIME(1970-01-01) YEAR TO DAY;
    LET s1 = i1;
    LET i2 = EXTEND(d, HOUR TO FRACTION(5)) -
             DATETIME(00:00:00.00000) HOUR TO FRACTION(5);
    LET s2 = i2;
    LET n = s1 * (24 * 60 * 60) + s2;
    RETURN n;
END PROCEDURE;
Some of the commentary about email addresses is no longer valid – things have changed in the decade and a half since I wrote this.
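The procedure's arithmetic (whole days since 1970-01-01 times 86400, plus the seconds elapsed within the day) can be sketched in Python to make it easier to check. This is an illustration only, not Informix code; `bucket_key` is a hypothetical helper showing how the result could feed the interval grouping from the question, and the timestamps are assumed to already be in UTC.

```python
from datetime import date, datetime

def to_unix_time(d: datetime) -> float:
    # Part 1: whole days between the date and 1970-01-01 (i1 in the procedure).
    days = (d.date() - date(1970, 1, 1)).days
    # Part 2: seconds elapsed within that day, including the fraction (i2).
    seconds_in_day = d.hour * 3600 + d.minute * 60 + d.second + d.microsecond / 1e6
    return days * (24 * 60 * 60) + seconds_in_day

def bucket_key(d: datetime, interval_minutes: int = 30) -> int:
    # Group key: number of whole intervals since the epoch.
    return int(to_unix_time(d) // (interval_minutes * 60))

print(to_unix_time(datetime(2001, 9, 9, 1, 46, 40)))      # 1000000000.0
print(to_unix_time(datetime(9999, 12, 31, 23, 59, 59)))   # 253402300799.0
print(bucket_key(datetime(2021, 3, 6, 7, 38, 59)) ==
      bucket_key(datetime(2021, 3, 6, 7, 30, 0)))         # True: same 30-minute bucket
```

The two extreme values agree with the figures quoted in the procedure's header comment.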

Swift 3 Error in time interval between two dates

I am trying to determine the interval between two dates that I create using DateComponents. If I make the first date 1 year prior to the second, I get 365 days, 0 hours, 0 minutes and 0 seconds. If I make the dates further apart (400 years here), suddenly my date is off by 11 minutes 56 seconds. Here is the code:
import Foundation
var mycal = Calendar(identifier: .iso8601)
var datum = DateComponents(year:1600, month:1, day:1, hour:12, minute:0, second:0)
let j2000 = DateComponents(year:2000, month:1, day:1, hour:12, minute:0, second:0)
let datum_date = mycal.date(from: datum)
let j2000_date = mycal.date(from: j2000)
let interval = mycal.dateComponents([.day, .hour, .minute, .second], from:j2000_date!, to:datum_date!)
print("Datum: \(datum_date!)") //1600-01-01 19:48:04 +0000
print("j2000: \(j2000_date!)") //2000-01-01 20:00:00 +0000
Note the last two lines: the comments show what each print produces. I've tried it with the Gregorian calendar too, with the same problem. I'm not sure exactly how far back the inconsistency starts; I've gone back far enough to produce it, and it sometimes seems to "stick" as I change the code to move closer in time again. Strangely, the "interval" appears to show the correct number of days (here -146097), but the date shown is incorrect and I will likely need that in my calculations. Anyone have any ideas?
The difference could be related to leap year adjustments, but that would give a difference of 11 minutes and 14 seconds (there would still be about 42 seconds unaccounted for, 26 of which could be leap seconds).
see: https://www.infoplease.com/leap-year-101-next-when-list-days-calendar-years-calculation-last-rules
In theory, if you compute a multi-year time difference with a precision of minutes and seconds, you should get variations of 5 hours 48 minutes and 46 seconds in 3 out of every 4 years and come within 11 minutes and 14 seconds on the fourth year. I don't know how macOS (Unix) deals with that; there are probably a bunch of considerations that it needs to take into account (especially beyond 400 years, where that 11 minutes 14 seconds gets adjusted).
If that level of precision is required by your use case, I would suggest reading up on the minute details of time calculations. Given that dates are stored internally as a number of seconds, going back to a precise day and time over these long periods must require some special math acrobatics.
See Apple's documentation here: https://developer.apple.com/reference/foundation/nscalendar
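The figures in this thread can be cross-checked with a few lines of plain calendar arithmetic. The sketch below is Python rather than Swift and deliberately uses naive datetimes (no time zone), so it only verifies the numbers quoted above; it does not reproduce or explain Foundation's behaviour.

```python
from datetime import datetime, timedelta

# 400 Gregorian years between the two dates, using naive proleptic-Gregorian dates.
datum = datetime(1600, 1, 1, 12, 0, 0)
j2000 = datetime(2000, 1, 1, 12, 0, 0)
interval = datum - j2000
print(interval.days)  # -146097, the day count reported in the question

# Offset between the two printed times of day (20:00:00 vs 19:48:04).
observed = timedelta(hours=20) - timedelta(hours=19, minutes=48, seconds=4)
print(observed)  # 0:11:56

# Yearly figure mentioned in the answer: 6 h minus 5 h 48 min 46 s.
drift = timedelta(hours=6) - timedelta(hours=5, minutes=48, seconds=46)
print(drift)             # 0:11:14
print(observed - drift)  # 0:00:42
```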

Descriptive statistics of time variables

I want to compute simple descriptive statistics (mean, etc.) of the times when people go to bed. I ran into two problems. The original data comes from an Excel file in which just the time that people went to bed was typed in, in 24-hour format. My problem is that R so far doesn't recognize when people went to bed at 1.00 am the next day. That is, a person who went to bed at 10 pm should be 3 hrs apart from one who went to bed at 1.00 am (and not 21 hrs).
In my data frame the variable in_bed is in POSIXct format, so I thought I would apply an if-function saying that if the time is before 12:00 then I want to add 24 hrs.
My function is:
Patr$in_bed <- if(Patr$in_bed < "1899-12-30 12:00") {
Patr$in_bed + 24*60*60
}
My data frame looks like this
in_bed
1 1899-12-30 22:13:00
2 1899-12-30 23:44:00
3 1899-12-30 00:08:00
If I run my function my variable gets deleted and the following error message gets printed:
Warning message:
In if (Patr$in_bed < "1899-12-30 12:00") { :
the condition has length > 1 and only the first element will be used
What am I doing wrong, or does anyone have a better idea? And can I run commands such as mean on variables in POSIXct format, and if not, how do I do it?
When you compare Patr$in_bed (a vector) with "1899-12-30 12:00" (a single value), you get a logical vector. But the if statement requires a single logical value, so it generates a warning and considers only the first element of the vector.
You can try:
Patr$in_bed <- Patr$in_bed + 24*60*60 * (Patr$in_bed < as.POSIXct("1899-12-30 12:00"))
Explanation: the comparison in the parentheses returns a logical vector, which is converted to integer (0 for FALSE and 1 for TRUE). The dates for which the statement is true then get +24*60*60, and the other dates get +0.
But since the POSIXct format includes the date, I don't see the purpose of adding 24 hrs. For instance,
as.POSIXct("1899-12-31 01:00:00") - as.POSIXct("1899-12-30 22:00:00")
returns a time difference of 3 hours, not 21.
To answer your last question: yes, you can compute the mean of a POSIXct vector, simply with:
mean(Patr$in_bed)
Hope it helps,
Jérémy
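For readers who want to check the numbers, the same "add a day to early-morning times, then average" logic can be sketched in Python. This is not R and not the answerer's code; the helper names are hypothetical, and the three values are the ones from the question's data frame.

```python
from datetime import datetime, timedelta

def shift_after_midnight(times, cutoff_hour=12):
    # Times before the cutoff are assumed to belong to the following day.
    return [t + timedelta(days=1) if t.hour < cutoff_hour else t for t in times]

def mean_datetime(times):
    # Average the offsets from the first element, then add them back.
    ref = times[0]
    total = sum((t - ref for t in times), timedelta())
    return ref + total / len(times)

in_bed = [
    datetime(1899, 12, 30, 22, 13),
    datetime(1899, 12, 30, 23, 44),
    datetime(1899, 12, 30, 0, 8),
]

shifted = shift_after_midnight(in_bed)
print(shifted[2])              # 1899-12-31 00:08:00
print(mean_datetime(shifted))  # 1899-12-30 23:21:40
```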

Dates subtraction: has the event occurred or not?

If I have an event that happens every day at a fixed time, how can I find out whether it has already occurred by subtracting from datetime.now()?
Say we have a daily meeting at 15:35. Today John came early, at 12:45, but Alex was 2 h 5 min late (he came at 17:40).
from datetime import datetime
meet_dt = datetime(year=2015, month=8, day=19, hour=15, minute=35)
john_dt = datetime(year=2015, month=8, day=19, hour=12, minute=45)
alex_dt = datetime(year=2015, month=8, day=19, hour=17, minute=40)
print(meet_dt - john_dt) # came before > 2:50:00
print(meet_dt - alex_dt) # came after > -1 day, 21:55:00
If I subtract the smaller date from the bigger one, everything is fine, but the other way round I receive -1 day, 21:55:00. Why not -2:05:00, and where does the minus day come from?
Because timedeltas are normalized.
All of the parts of the timedelta other than the days field are always nonnegative, as described in the documentation. So -2:05:00 is stored as -1 day plus 21:55:00.
Incidentally, if you want to see what happened first, don't do this subtraction. Just compare directly with <:
# assumes `import datetime` and that `then` holds the event's datetime
if then < datetime.datetime.now():
    # then is in the past
    pass
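The normalization can be seen by inspecting the fields of the resulting timedelta. The snippet below is a small sketch using the meeting and arrival times from the question; `total_seconds()` and `abs()` are two standard ways to get a more readable signed or unsigned difference.

```python
from datetime import datetime, timedelta

meet_dt = datetime(year=2015, month=8, day=19, hour=15, minute=35)
alex_dt = datetime(year=2015, month=8, day=19, hour=17, minute=40)

delta = meet_dt - alex_dt

# Only the days field carries the sign; seconds stay within 0..86399.
print(delta.days, delta.seconds)  # -1 78900
print(delta)                      # -1 day, 21:55:00

# The signed difference in seconds, and its magnitude as a timedelta.
print(delta.total_seconds())      # -7500.0
print(abs(delta))                 # 2:05:00

# A direct comparison answers "did Alex arrive after the meeting started?"
print(alex_dt > meet_dt)          # True
```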

Resources