Date conversion query - datetime

I've come into possession of hundreds of ASCII data files where the date and time are in separate columns, like so:
date time
1-Jan-08 23:05
I need to convert this to a usable R date-time object, subtract 8 hours (time zone conversion from UTC to Pacific), and then turn it into Unix time. I need to do this because the data are collected every evening (from 5pm through 2am the following morning), so a regular date/time format would confound days: one evening of data collection would span two calendar days. I'd like to consider each day's events separately.
Using Unix time will allow me to calculate time differences between events that occur each day (I will probably retain a date field in addition to the Unix time). Can someone suggest an efficient way to do this?
Here is some data to use (this is in UTC):
dummy=data.frame(date="1-Jan-08",time="23:05")

Paste them together (which works vectorised) and then parse, e.g.
datetime <- paste(dummy$date, dummy$time)
parsed <- strptime(datetime, "%d-%b-%y %H:%M")
which you can also assign as columns in the data frame.
Edit: strptime() has an optional tz argument (the default, tz="", means the current time zone). Since this data is in UTC, pass tz="UTC".
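The answer above covers the parsing step only. The remaining steps from the question (shift to Pacific time, get Unix time, keep a date field per evening) can be sketched as follows. This is a rough illustration in Python rather than R, assuming a fixed UTC-8 offset as in the question (daylight saving is ignored) and an English locale for the month abbreviation; the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: fixed UTC-8 offset, as stated in the question (no daylight saving).
PACIFIC = timezone(timedelta(hours=-8))


def parse_utc(date_str, time_str):
    """Paste the two columns together and parse them as a UTC timestamp."""
    # %b matches abbreviated month names such as "Jan" (locale dependent).
    naive = datetime.strptime(f"{date_str} {time_str}", "%d-%b-%y %H:%M")
    return naive.replace(tzinfo=timezone.utc)


def unix_time_and_local_date(date_str, time_str):
    """Return (Unix seconds, Pacific calendar date) for one row."""
    utc = parse_utc(date_str, time_str)
    local = utc.astimezone(PACIFIC)
    return int(utc.timestamp()), local.date()


if __name__ == "__main__":
    # Both rows belong to the same Pacific evening even though the UTC dates differ.
    print(unix_time_and_local_date("1-Jan-08", "23:05"))
    print(unix_time_and_local_date("2-Jan-08", "01:30"))
```

Unix time itself does not depend on the time zone, so only the retained date field needs the 8-hour shift; differences between two Unix times are elapsed seconds either way.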

Related

seconds since date to date in R

I have a dataset file with a time variable in "seconds since 1981-01-01 00:00:00". What I need is to convert this time into calendar date (YYYY-MM-DD HH:mm:ss). I've seen a lot of different ways to do this for time since epoch (1970) (timestamp, calendar.timegm, etc) but I'm failing to do this with a different reference date.
One option is to simply add 347155200 s (the 11 years from 1970-01-01 to 1981-01-01 in UTC, including three leap days) to each value in seconds. This then lets you convert the values as you would seconds since 1970-01-01.
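A small sketch of the same idea in Python, computing the offset between the two reference dates instead of hard-coding it. It assumes the 1981 reference is in UTC; the names are illustrative only.

```python
from datetime import datetime, timedelta, timezone

REFERENCE = datetime(1981, 1, 1, tzinfo=timezone.utc)  # assumed to be UTC
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Seconds between the Unix epoch and the dataset's reference date.
OFFSET_SECONDS = int((REFERENCE - EPOCH).total_seconds())


def seconds_since_1981_to_datetime(seconds):
    """Convert 'seconds since 1981-01-01 00:00:00' to a calendar datetime."""
    return REFERENCE + timedelta(seconds=seconds)


def seconds_since_1981_to_unix(seconds):
    """Shift the value so it can be treated as ordinary Unix time."""
    return seconds + OFFSET_SECONDS


if __name__ == "__main__":
    print(OFFSET_SECONDS)
    print(seconds_since_1981_to_datetime(90061).strftime("%Y-%m-%d %H:%M:%S"))
```

Adding a timedelta to the reference date directly avoids the offset altogether, which is usually the less error-prone route.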

How to convert number into date-time format in Excel (I have an equivalent solution in R)

I'm trying to crawl data from the Singapore Stock Exchange, whose web content is loaded dynamically from an API call returning JSON.
I'm having some problems with the dates, which the JSON gives as a number. For example, 1575491760000 is supposed to be 2019-12-04 20:36:00 GMT.
After some trial and error, I've figured out a solution using R:
as.POSIXct(1575491760000/1000, origin="1970-01-01", tz = 'GMT')
# not sure why need to divide the number by 1000 here but i guess this is the way to make it work
and the above code does return "2019-12-04 20:36:00 GMT" in R.
However, is there a way to do the above conversion in Excel? I've tried a few different approaches, but none of them can handle such a long number (date + time format). I'd appreciate it if anyone could provide a specific solution!
Here's the Excel equivalent.
=DATE(1970,1,1) + 1575491760000/(1000*60*60*24)
# 12/4/19 20:36:00 with cell formatting set to m/d/yy h:mm:ss
This timestamp increments by one for every millisecond since 1/1/1970 (standard Unix time counts seconds, which is why the R code divides by 1000). Excel datetimes increment by one for every day since 1/1/1900.
So to convert from the millisecond timestamp to Excel, divide by the number of milliseconds in a day (1000*60*60*24) and add the result to the date 1/1/1970 (25569 under the hood in Excel).
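The arithmetic can be checked outside Excel. The sketch below, in Python, converts the same millisecond timestamp both to a calendar datetime and to an Excel serial number; the function names are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

MS_PER_DAY = 1000 * 60 * 60 * 24
EXCEL_SERIAL_1970 = 25569  # Excel's serial number for 1970-01-01


def ms_to_datetime(ms):
    """Millisecond timestamp -> UTC datetime."""
    return datetime(1970, 1, 1, tzinfo=timezone.utc) + timedelta(milliseconds=ms)


def ms_to_excel_serial(ms):
    """Millisecond timestamp -> Excel serial date (days, with fractional time)."""
    return EXCEL_SERIAL_1970 + ms / MS_PER_DAY


if __name__ == "__main__":
    ms = 1575491760000
    print(ms_to_datetime(ms).strftime("%Y-%m-%d %H:%M:%S"))
    print(ms_to_excel_serial(ms))
```

The fractional part of the serial number is the time of day, which is why the cell needs a date-time format to display it.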

Convert Epoch time in Sqlite (Dbeaver) and parse out time

I have a dataset with epoch time, but am having difficulty casting it to a timestamp. I also need to parse out the timestamp to add another column just for the time of day, so that I can group data by the time of day the transaction occurred.
I also need to convert to different time zones, taken from another column (i.e., GMT-8 for some rows of data, GMT-7 for others, etc.).
Example:
1520555554 is March 8th, 2018 16:32 (Pacific Time Zone, GMT-8)
I need to convert from epoch to time stamp, and create another column parsing out 16:32 as the time of day.
For the last part, use strftime() or another one of the sqlite date and time functions. For example,
SELECT strftime('%H:%M', timestamp_column, 'unixepoch') FROM ...
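Dealing with time zones is going to be more complicated. SQLite has no built-in support for named time zones; its modifiers only cover fixed offsets such as '-8 hours' and the 'localtime' and 'utc' conversions for the machine's own zone.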
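If the per-row offset is stored as a whole number of hours, a fixed-offset modifier can be built from that column. The sketch below runs this against an in-memory database through Python's sqlite3 module; the table and column names (transactions, ts, utc_offset) are assumptions, not taken from the question.

```python
import sqlite3


def time_of_day_by_row(rows):
    """rows: iterable of (epoch_seconds, utc_offset_hours) -> list of 'HH:MM' strings."""
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical schema: epoch seconds plus a per-row offset in hours.
        conn.execute("CREATE TABLE transactions (ts INTEGER, utc_offset INTEGER)")
        conn.executemany("INSERT INTO transactions VALUES (?, ?)", rows)
        cur = conn.execute(
            # utc_offset || ' hours' builds a modifier such as '-8 hours'.
            "SELECT strftime('%H:%M', ts, 'unixepoch', utc_offset || ' hours') "
            "FROM transactions ORDER BY rowid"
        )
        return [r[0] for r in cur]
    finally:
        conn.close()


if __name__ == "__main__":
    print(time_of_day_by_row([(1520555554, -8), (1520555554, -7)]))
```

This handles fixed offsets only; daylight saving rules would still have to be resolved before the offset is stored.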

timedeltas and datetimes subtraction and converting to duration in minutes

I am at a standstill with this problem. I outlined it in another question ( Creating data histograms/visualizations using ipython and filtering out some values ) which meandered a bit so I'd like to fix the question and give it more context since I am sure others must have a workaround for this or have the problem. I've also seen similar, not identical, questions asked and can't quite adapt any of the solutions thus far given.
I have columns in my data frame for Start Time and End Time and created a 'Duration' column for time lapsed. I'm using ipython.
The Start Time/End Time columns have fields that look like:
2014/03/30 15:45
A date and then a time in hh:mm
When I type:
pd.to_datetime(df['End Time']) and
pd.to_datetime(df['Start Time'])
I get fields resulting that look like:
2014-03-30 15:45:00
same date but with hyphens and same time but with :00 seconds appended
I then decided to create a new column for the difference between the End and Start times. The 'Duration' or time lapsed column was created by typing in one command:
df['Duration'] = pd.to_datetime(df['End Time'])-pd.to_datetime(df['Start Time'])
The format of the fields in the duration column is:
01:14:00
no date just a time lapsed in the format hh:mm:ss
to indicate time lapsed or 74 mins in the above example.
When I type:
df.Duration.dtype
dtype('<m8[ns]') is returned, whereas, when I type
df.Duration.head(4)
0 00:14:00
1 00:16:00
2 00:03:00
3 00:09:00
Name: Duration, dtype: timedelta64[ns]
is returned which seems to indicate a different dtype for Duration.
How can I convert the format I have in the Duration column to a single integer value of minutes (time lapsed)? I see no methods that I can use, I'd write a function but wouldn't know how to treat the input of hh:mm:ss. This must be a common requirement of data analysis, should I be going about converting these dates and times differently if my end goal is to get a single integer indicating minutes lapsed? Should I just be using Excel?... because I have so far spent a day on this problem and it should be a simple problem to solve.
**update:
THANK YOU!! (Jeff and Dataswede) I added a column with the command:
df['Durationendminusstart'] = pd.to_timedelta(df.Duration,unit='ns').astype('timedelta64[m]')
which seems to give me the Duration (minutes lapsed) as wanted so that huge part is solved!
What still is not clear is why there were two different dtypes for the same column depending how I asked, oh well right now it doesn't matter.**
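On the open point: m8[ns] is NumPy's short code for timedelta64[ns], so the two outputs describe the same dtype. As for getting whole minutes, the underlying operation is "total seconds divided by 60". The sketch below shows it with the standard library only, assuming the 'YYYY/MM/DD HH:MM' layout from the question; in pandas the analogous expression on a timedelta column would be df['Duration'].dt.total_seconds() / 60.

```python
from datetime import datetime

# Layout of the Start Time / End Time fields shown in the question.
FORMAT = "%Y/%m/%d %H:%M"


def duration_minutes(start, end):
    """Elapsed whole minutes between two 'YYYY/MM/DD HH:MM' strings."""
    delta = datetime.strptime(end, FORMAT) - datetime.strptime(start, FORMAT)
    return int(delta.total_seconds() // 60)


if __name__ == "__main__":
    print(duration_minutes("2014/03/30 15:45", "2014/03/30 16:59"))
    print(duration_minutes("2014/03/30 23:50", "2014/03/31 00:04"))
```

Working from total seconds also copes with durations that cross midnight or exceed a day, which a plain hh:mm:ss string split would not.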

How to compare two dates in SQLite?

I kind of assumed it was a string, so I compared it as a string, but not surprisingly it failed. I believe that's how it works in MySQL. I could be wrong, as I haven't worked with it in a while. In either case, how can I check if dates are equal in SQLite? I will be using it in a WHERE clause.
SELECT a._id, b._id, b.start_date,a.event_name, b.start_time,
b.end_date, b.end_time, b.location FROM events_info b INNER JOIN events a ON
a._id=b.event_id WHERE b.start_time = '6:00';
(added space to make it easier to look at)
SQLite doesn't have a dedicated DATETIME type. Normally what people do is make sure they store the date as a formatted string that is consistent; for example, YYYY-MM-DD hh:mm:ss. If you do so, as long as you're consistent, then you can compare dates directly:
SELECT * FROM a WHERE q_date < '2013-01-01 00:00:00';
This works because even though the comparison is technically an alphabetical comparison and not a numeric one, dates in a consistent format like this sort alphabetically as well as numerically.
For such a schema, I would suggest storing dates in 24-hour format (the above example is midnight). Pad months, days, and hours with zeros. If your dates will span multiple timezones, store them all in UTC and do whatever conversion you need client-side to convert them to the local time zone.
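A quick way to see why the zero padding matters is to compare plain string sorting with true chronological sorting. This is a small Python illustration, not SQLite itself; the helper name is hypothetical.

```python
from datetime import datetime

FORMAT = "%Y-%m-%d %H:%M:%S"


def sorts_chronologically(stamps):
    """True if sorting the strings as text gives the same order as sorting by time."""
    by_text = sorted(stamps)
    by_time = sorted(stamps, key=lambda s: datetime.strptime(s, FORMAT))
    return by_text == by_time


if __name__ == "__main__":
    padded = ["2013-01-01 00:00:00", "2012-06-04 05:06:00", "2012-12-31 23:59:59"]
    unpadded = ["2012-6-4 5:06:00", "2012-12-31 23:59:59"]
    print(sorts_chronologically(padded))    # zero-padded: text order matches time order
    print(sorts_chronologically(unpadded))  # unpadded: '12' sorts before '6' as text
```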
Normally dates and times are stored all in one column. If you have to have them separated for whatever reason, just make sure your dates are all consistent and your times are all consistent. For example, dates should all be YYYY-MM-DD and times should all be hh:mm:ss.
The reason that YYYY-MM-DD hh:mm:ss is the preferred format is because when you go from the largest date interval (years) to the smallest (seconds), you can index and sort them very easily and with high performance.
SELECT * FROM a WHERE q_date = '2012-06-04 05:06:00';
would use the index (assuming q_date is indexed) to home in on the date/time instead of having to do a full table scan. Or if they're in two separate columns:
SELECT * FROM a WHERE q_date = '2012-06-04' AND q_time = '05:06:00';
The key is to make sure that the dates and times are in a consistent format going into the database. For user-friendly presentation, do all conversion client-side, not in the database. (For example, convert '2012-06-04 05:06:00' to "1:06am Eastern 6/4/2012".)
If this doesn't answer your question, could you please post the exact format that you're using to store your dates and times, and two example dates that you're trying to compare that aren't working the way you expect them to?
SQLite has no native date type to compare directly. One option is to convert both sides to seconds since the epoch and compare them as integers.
Example
SELECT * FROM Table
WHERE
CAST(strftime('%s', date_field) AS integer) <=CAST(strftime('%s', '2015-01-01') AS integer) ;
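The query above can be tried against an in-memory database. The sketch below uses Python's sqlite3 module; the events table is invented for the demonstration, and the dates are assumed to be stored as ISO 8601 text so that strftime('%s', ...) can parse them.

```python
import sqlite3


def rows_on_or_before(dates, cutoff):
    """Return the stored dates whose epoch seconds are <= those of the cutoff."""
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical table holding ISO 8601 date strings.
        conn.execute("CREATE TABLE events (date_field TEXT)")
        conn.executemany("INSERT INTO events VALUES (?)", [(d,) for d in dates])
        cur = conn.execute(
            "SELECT date_field FROM events "
            "WHERE CAST(strftime('%s', date_field) AS INTEGER) "
            "<= CAST(strftime('%s', ?) AS INTEGER) "
            "ORDER BY date_field",
            (cutoff,),
        )
        return [r[0] for r in cur]
    finally:
        conn.close()


if __name__ == "__main__":
    print(rows_on_or_before(["2014-12-31", "2015-01-01", "2015-01-02"], "2015-01-01"))
```

Wrapping the column in a function call generally prevents an ordinary index on that column from being used, so this form trades speed for tolerance of mixed input formats.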
From Datatypes In SQLite Version 3:
1.2 Date and Time Datatype
SQLite does not have a storage class set aside for storing dates and/or times. Instead, the built-in Date And Time Functions of SQLite are capable of storing dates and times as TEXT, REAL, or INTEGER values:
TEXT as ISO8601 strings ("YYYY-MM-DD HH:MM:SS.SSS").
REAL as Julian day numbers, the number of days since noon in Greenwich on November 24, 4714 B.C. according to the proleptic Gregorian calendar.
INTEGER as Unix Time, the number of seconds since 1970-01-01 00:00:00 UTC.
Applications can choose to store dates and times in any of these formats and freely convert between formats using the built-in date and time functions.
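To make the three representations concrete, the following sketch asks SQLite (through Python's sqlite3 module) for the Julian day and Unix time of an ISO 8601 string, then converts the Unix time back to text. The function name is illustrative only.

```python
import sqlite3


def representations(iso_text):
    """Return (Julian day REAL, Unix time INTEGER, round-tripped TEXT) for an ISO string."""
    conn = sqlite3.connect(":memory:")
    try:
        julian_day, epoch = conn.execute(
            "SELECT julianday(?), CAST(strftime('%s', ?) AS INTEGER)",
            (iso_text, iso_text),
        ).fetchone()
        # 'unixepoch' tells datetime() to read the number as seconds since 1970.
        (round_trip,) = conn.execute(
            "SELECT datetime(?, 'unixepoch')", (epoch,)
        ).fetchone()
        return julian_day, epoch, round_trip
    finally:
        conn.close()


if __name__ == "__main__":
    print(representations("1970-01-01 00:00:00"))
    print(representations("2012-06-04 05:06:00"))
```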
If you look at the examples in Date And Time Functions, something like this should get you close to what you want (which, I'm assuming, is 6:00 of the current day):
WHERE b.start_time = datetime('now', 'start of day', '+6 hours')
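Which function wraps these modifiers matters: date() returns only the calendar date, so the '+6 hours' part shows up only with datetime() or time(). A minimal check through Python's sqlite3 module is sketched below; whether the result matches the stored start_time still depends on the format that column uses.

```python
import sqlite3


def six_am_today():
    """Return the same modifiers evaluated with date(), datetime() and time()."""
    conn = sqlite3.connect(":memory:")
    try:
        return conn.execute(
            "SELECT date('now', 'start of day', '+6 hours'), "
            "datetime('now', 'start of day', '+6 hours'), "
            "time('now', 'start of day', '+6 hours')"
        ).fetchone()
    finally:
        conn.close()


if __name__ == "__main__":
    as_date, as_datetime, as_time = six_am_today()
    print(as_date)      # calendar date only (UTC)
    print(as_datetime)  # same date followed by the 06:00:00 time
    print(as_time)      # time of day only
```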
