I need to save a time interval in a column in a table. Based on http://docs.sqlalchemy.org/en/rel_0_8/core/types.html, I can use the Interval type for that. My database is SQLite, and I don't quite understand this description in the documentation:
"The Interval type deals with datetime.timedelta objects. In PostgreSQL, the
native INTERVAL type is used; for others, the value is stored as a date which
is relative to the “epoch” (Jan. 1, 1970)."
Can anybody tell me how I should do that?
So from what I gather from the question, you want to store an interval, take it out of the database, and use it again, but you also want to understand how it is stored?
Concerning the storage: this is easier to explain with Unix timestamps than with DateTimes. Suppose you want to store timedelta(1), i.e. a delta of one day. What is stored in the database is the time since the "epoch": second 0 of Unix time, or 1970-01-01 00:00:00 as a date (this is where Unix timestamps start counting seconds). If you don't know about the epoch or timestamps, read the Wikipedia article on Unix time.
So we want to store one day of difference? The documentation says the value is stored relative to the epoch. We just learned the epoch is second 0, so a day later would be 60 seconds per minute, 60 minutes per hour, 24 hours per day: 60 * 60 * 24 = 86400. Stored as an integer, this is easy to understand: if you find the value 86400 in your database, it means 1 day, 0 hours, 0 minutes, 0 seconds.
Reality is a bit different: it does not store an integer but a DateTime object. From that perspective, the epoch is 1970-01-01 00:00:00. So what is a delta of one day since the epoch? That is easy: 1970-01-02 00:00:00. You can see it is a day later.
An hour later? 1970-01-01 01:00:00.
Two days, four hours, 30 seconds? 1970-01-03 04:00:30.
And you could even do it yourself:
from datetime import datetime, timedelta

epoch = datetime.utcfromtimestamp(0)        # 1970-01-01 00:00:00
delta = timedelta(days=1)
one_day = datetime.utcfromtimestamp(86400)  # 1970-01-02 00:00:00

print("Date to be stored in database:", epoch + delta)  # 1970-01-02 00:00:00
print("Timedelta from date:", one_day - epoch)          # 1 day, 0:00:00
As you can see, the calculation is easy, and this is all that is done behind the scenes. Take a look at this full example.
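First, a minimal mapping and session setup so the example below is self-contained (a sketch: the IntervalItem class name comes from the snippet below; everything else is an assumption, not part of the original answer):

from datetime import timedelta
from sqlalchemy import create_engine, Column, Integer, Interval
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker

Base = declarative_base()

class IntervalItem(Base):
    __tablename__ = 'interval_item'
    id = Column(Integer, primary_key=True)
    interval = Column(Interval)  # on SQLite, stored relative to the epoch

engine = create_engine('sqlite://')  # in-memory database for the example
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()
delta = timedelta(days=1)

With the mapping in place, storing and retrieving works like this: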
interval = IntervalItem(interval=delta)
session.add(interval)
session.commit()

i = session.query(IntervalItem).first()
print("Timedelta from database:", i.interval)  # 1 day, 0:00:00
You can see it is no different from the example above, except that it goes through the database. The only thing to keep in mind is this note:
Note that the Interval type does not currently provide date arithmetic operations
on platforms which do not support interval types natively.
That means you should be careful how you use it; for example, date arithmetic inside a query might not be a good idea on SQLite. But you should just play around with it.
Calling an API, I need to specify time in milliseconds. I use datetime in Python 3 to convert to and from human-readable dates and times. But when debugging, I get different results depending on which website I use to convert, so I'm having a hard time telling whether my timestamps are wrong. The API doesn't combine time and date in its predicates; it uses milliseconds for both.
Let's look at the date predicate 1656547200000
If I go to https://currentmillis.com it says it's June 30th. All good.
If I go to https://www.epochconverter.com it says it's June 30th. All good.
Let's look at the time predicate from 12600000 to 26700000.
If I go to https://currentmillis.com it says it's UTC (24h) 03:30 - 07:25
If I go to https://www.epochconverter.com it says it's UTC (24h) 20:00 - 00:40
Why the different results?
Historically, UNIX systems reckon time as temporal units (seconds, millis, nanos, etc.) elapsed since an absolute origin instant termed "the Epoch": January 1st, 1970 00:00:00 UTC. This is formally recognized in POSIX.
So "UNIX time" is that system of reckoning, and "Epoch timestamps" are points in time in that system.
Now, you appear to be conflating temporal units in your use of Epoch timestamps.
In the case of your "short" timestamp, 12600000 seconds since the Epoch is a different point in time than 12600000 milliseconds since the Epoch. That's why you see them resolve to different times of day: your converters are interpreting them in different units. If you'd included the date in your output, you'd have seen the two points in time are almost five months apart.
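You can see this in a couple of lines of Python, interpreting the same number both ways (a quick illustration):

from datetime import datetime, timezone

ts = 12600000

# Interpreted as seconds since the Epoch (what epochconverter assumed):
print(datetime.fromtimestamp(ts, tz=timezone.utc))         # 1970-05-26 20:00:00+00:00

# Interpreted as milliseconds since the Epoch (what currentmillis assumed):
print(datetime.fromtimestamp(ts / 1000, tz=timezone.utc))  # 1970-01-01 03:30:00+00:00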
I was asked to create a query to pull a near-real-time report from an Informix database (I have select access only; I cannot create a stored procedure), and I felt like I had succeeded pretty well until I realized that there was a discrepancy in a datetime field. It seems that the program populating the db is hard-coded to enter the time in the datetime field in UTC, five hours off of local time (when the local time was 2:30, it entered a row saying John Doe completed the task at 7:30). In my report I am supposed to calculate the number of seconds (as an int) since the user completed the task (the field is "completionTime"), and I was originally just using:
sysdate - completionTime interval seconds(9) to seconds cast to char then cast to int
When I realized the mistake in the timezone of the completionTime field, I just subtracted the offset as an integer (I was already converting the interval to an integer, so I just adjusted the answer by 18000). This worked just fine until Daylight Saving Time started; then all of a sudden local time was 4 hours off of UTC (14400 seconds instead of 18000).
Since I can only select from the db, I next tried an inefficient case statement (my query went from <0.5 seconds to 3-5 seconds for only 25 rows). Following a suggestion from another forum, I converted the time to an integer of seconds since the Unix epoch, then used the dbinfo('utc_to_datetime') function to convert it back to a datetime in the right timezone.
This approach works, but the calculation looks terrible to me:
cast(cast(cast((sysdate - dbinfo("utc_to_datetime",
    cast(cast(cast((completionTime - TO_DATE('Friday January 1, 2010 0:00', '%A %B %d, %Y %R'))
        as interval second(9) to second) as char(10)) as int)
    + 1262304000))
    as interval second(9) to second) as char(10)) as int)
Notice that I am calculating the length of time from completionTime back to 2010-01-01 and then adding about 1.26 billion seconds (going all the way back to the Unix epoch in one step is too big for Informix's interval second(9) to second, hence the two steps) so that I can plug the result into the dbinfo("utc_to_datetime") function to convert it back to a datetime in the right timezone, then subtract it from sysdate. The worst part (besides the six casts) is that the completion times I am dealing with are all within 24 hours of sysdate, and most are within 10 minutes, yet I am adding on 1.26 billion seconds just so I can use the only function I can find that converts between timezones.
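(For reference, that constant is exactly the number of seconds from the Unix epoch to 2010-01-01; a quick check in Python:)

from datetime import datetime, timezone

epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
base_2010 = datetime(2010, 1, 1, tzinfo=timezone.utc)
print(int((base_2010 - epoch).total_seconds()))  # 1262304000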
My question is: is this really the best way to do this? By the way, this works very quickly, and my query is back down to a reasonable execution time (<0.5 seconds); I'm just looking at this query and thinking that there has got to be a better way.
Jared
Maybe instead of sysdate you can use DBINFO('utc_current'):
SELECT DBINFO('utc_current') - (completionTime interval seconds(9) to seconds) FROM ...
I am at a standstill with this problem. I outlined it in another question (Creating data histograms/visualizations using ipython and filtering out some values), which meandered a bit, so I'd like to fix the question and give it more context, since I am sure others must have a workaround for this or have the same problem. I've also seen similar, though not identical, questions asked and can't quite adapt any of the solutions given so far.
I have columns in my data frame for Start Time and End Time and created a 'Duration' column for time lapsed. I'm using ipython.
The Start Time/End Time columns have fields that look like:
2014/03/30 15:45
A date and then a time in hh:mm
when I type:
pd.to_datetime(df['End Time']) and
pd.to_datetime(df['Start Time'])
I get fields resulting that look like:
2014-03-30 15:45:00
same date but with hyphens and same time but with :00 seconds appended
I then decided to create a new column for the difference between the End and Start times. The 'Duration' (time lapsed) column was created with one command:
df['Duration'] = pd.to_datetime(df['End Time'])-pd.to_datetime(df['Start Time'])
The format of the fields in the duration column is:
01:14:00
no date, just the time lapsed in the format hh:mm:ss; 01:14:00 indicates 74 minutes lapsed in the above example.
When I type:
df.Duration.dtype
dtype('m8[ns]') is returned, whereas when I type
df.Duration.head(4)
0 00:14:00
1 00:16:00
2 00:03:00
3 00:09:00
Name: Duration, dtype: timedelta64[ns]
is returned which seems to indicate a different dtype for Duration.
How can I convert the format I have in the Duration column to a single integer value of minutes lapsed? I see no methods I can use; I'd write a function, but I wouldn't know how to treat an input of hh:mm:ss. This must be a common requirement of data analysis. Should I be converting these dates and times differently if my end goal is a single integer indicating minutes lapsed? Should I just be using Excel? I have so far spent a day on this problem, and it should be a simple one to solve.
Update:
THANK YOU!! (Jeff and Dataswede) I added a column with the command:
df['Durationendminusstart'] = pd.to_timedelta(df.Duration, unit='ns').astype('timedelta64[m]')
which seems to give me the Duration (minutes lapsed) as wanted, so that huge part is solved!
What is still not clear is why there were two different dtypes for the same column depending on how I asked; oh well, right now it doesn't matter.
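(As it happens, they are the same dtype: 'm8[ns]' is just NumPy's shorthand spelling for 'timedelta64[ns]'.) An equivalent route to whole minutes goes through Series.dt.total_seconds(); a minimal sketch, assuming Duration is a timedelta64[ns] column:

import pandas as pd

df = pd.DataFrame({'Start Time': ['2014/03/30 15:45'],
                   'End Time': ['2014/03/30 16:59']})
df['Duration'] = pd.to_datetime(df['End Time']) - pd.to_datetime(df['Start Time'])

# dt.total_seconds() gives elapsed seconds as floats; integer-divide for minutes.
df['minutes_lapsed'] = (df['Duration'].dt.total_seconds() // 60).astype(int)
print(df['minutes_lapsed'].iloc[0])  # 74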
I'm using the following syntax
TIMESTAMPDIFF(2, CHAR(CREATED - TIMESTAMP('1970-01-01 00:00:00')))
where CREATED is of type TIMESTAMP and the database is DB2. The intention is to convert the timestamp to milliseconds since the epoch. If there is a better function, that would be helpful.
Sample data:
For 2011-10-04 13:54:50 the returned value is 1316613290, but the actual value should be 1317732890 (obtained from http://www.epochconverter.com).
Query to run
SELECT TIMESTAMPDIFF(2, CHAR(TIMESTAMP('2011-10-04 13:54:50') - TIMESTAMP('1970-01-01 00:00:00'))) FROM SYSIBM.SYSDUMMY1;
This is a result of the fact that TIMESTAMPDIFF returns an estimate of the difference between the timestamps, not the exact value you might expect.
From the reference, page 435 (assuming for iSeries):
The following assumptions are used when converting the element values
to the requested interval type:
One year has 365 days.
One year has 52 weeks.
One year has 12 months.
One quarter has 3 months.
One month has 30 days.
One week has 7 days.
One day has 24 hours.
One hour has 60 minutes.
One minute has 60 seconds.
One second has 1000000 microseconds.
And the actual calculation used is:
seconds + (minutes + (hours + ((days + (months * 30) + (years * 365)) * 24)) * 60) * 60
This is, for obvious reasons, inexact. Not helpful.
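Plugging the question's sample data into that formula reproduces the wrong value exactly. Between the Epoch and 2011-10-04 13:54:50, DB2's timestamp subtraction yields a duration of 41 years, 9 months, 3 days, 13:54:50; a quick check in Python (the figures are from the question, the script is just an illustration):

from datetime import datetime, timezone

years, months, days, hours, minutes, seconds = 41, 9, 3, 13, 54, 50

# The estimate formula quoted above:
estimate = seconds + (minutes + (hours + ((days + months * 30 + years * 365) * 24)) * 60) * 60
print(estimate)  # 1316613290, the incorrect value from the question

# The true number of seconds since the Epoch:
actual = int(datetime(2011, 10, 4, 13, 54, 50, tzinfo=timezone.utc).timestamp())
print(actual)    # 1317732890, the expected value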
This appears to be a direct consequence of the way the timestamp arithmetic results are returned. That is:
SELECT
TIMESTAMP('1971-03-02 00:00:00') - TIMESTAMP('1970-01-01 00:00:00')
FROM sysibm/sysdummy1
returns:
10,201,000,000.000000
Which can be divided into:
1 year
02 months
01 days
00 hours
00 minutes
00 seconds
000000 microseconds
Which is imprecise period/duration information. While there are a multitude of situations where this type of data is useful, this isn't one of them.
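In other words, the subtraction yields a packed decimal duration of the form yyyymmddhhmmss.ffffff rather than a true quantity of time. A quick way to see the breakdown (a sketch in Python):

# 10,201,000,000.000000 zero-padded to fourteen digits: yyyymmddhhmmss
dur = "00010201000000"
years, months, days = int(dur[0:4]), int(dur[4:6]), int(dur[6:8])
hours, minutes, seconds = int(dur[8:10]), int(dur[10:12]), int(dur[12:14])
print(years, months, days, hours, minutes, seconds)  # 1 2 1 0 0 0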
Short answer: the exact value cannot be correctly calculated in the database, and in fact should not be.
Long answer:
The calculations are possible but rather complex, and definitely not suited for in-database calculation. I'm not going to reproduce them here (look up JodaTime if you're interested, specifically the various Chronology subclasses). Your biggest problem is going to be the fact that months aren't all the same length. Also, you're going to run into major problems if your timestamps are anything other than UTC; more specifically, Daylight Saving Time is going to play havoc with the calculation. Why? Because the offsets can change at any time, for any country.
Maybe you could explain why you need the number of milliseconds? Hopefully you're using Java (or able to do so), and can use java.time. But if you're on an iSeries, it's probably RPG...
According to the v9.7 info center, TIMESTAMPDIFF returns an estimated time difference, based on 365 days in a year (not true ~25% of the time), 30 days in a month (not true about two-thirds of the time, though it averages out a bit better than that), 24 hours in a day (not true a couple days of the year in some timezones), 60 minutes in an hour (hooray, one right!), and 60 seconds in a minute (true >99.9% of the time; we do get leap seconds).
So, no, this is not the way to get epoch time in DB2. Thus far, I've resorted to getting the time as a timestamp, and converting it in the client.
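For example, if the client happens to be Python, the conversion is a one-liner once the timestamp arrives (a sketch; it assumes the column value is already in UTC):

from datetime import datetime, timezone

row_value = "2011-10-04 13:54:50"  # as fetched from the TIMESTAMP column
dt = datetime.strptime(row_value, "%Y-%m-%d %H:%M:%S").replace(tzinfo=timezone.utc)
print(int(dt.timestamp()))  # 1317732890, seconds since the Epoch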
Part of your error occurs because of the inaccuracy of the TIMESTAMPDIFF function, as others have pointed out.
The other source of error occurs because the Epoch is based on GMT, so you have to take your local timezone into account.
So, you can do this with the following expression:
(DAYS(timestamp('2011-10-04-13.54.50.000000') - current timezone)
    - DAYS('1970-01-01-00.00.00.000000')) * 86400
+ MIDNIGHT_SECONDS(timestamp('2011-10-04-13.54.50.000000') - current timezone)
You can write a simple UDF to simplify this:
create or replace function epoch (in db2ts timestamp)
returns bigint
language sql
deterministic
no external action
return (days(db2ts - current timezone) - days('1970-01-01-00.00.00.000000')) * 86400 + midnight_seconds(db2ts - current timezone);
Good luck,
I exported my Firefox bookmarks, and the 'dateAdded' fields look like this:
1260492675000000
1260492675000000
1266542833000000
They're too big to be a Unix timestamp, and I can't make sense of them. What are they? (I want to convert it into something usable/readable.)
It is PRTime.
This type is a 64-bit integer representing the number of microseconds since the NSPR epoch, midnight (00:00:00) 1 January 1970 Coordinated Universal Time (UTC). A time after the epoch has a positive value, and a time before the epoch has a negative value.
That is PRTime as described in Mozilla's NSPR documentation.
You can extract the time using the f3e tool if you can find a link to it.
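If you just want something readable, divide by 1,000,000 to get an ordinary Unix timestamp and convert it; a quick sketch in Python:

from datetime import datetime, timezone

prtime = 1260492675000000  # microseconds since the Epoch
dt = datetime.fromtimestamp(prtime / 1000000, tz=timezone.utc)
print(dt)  # 2009-12-11 00:51:15+00:00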