I have dates and times stored in two columns. The first has the date as "20180831." The time is stored as the number of seconds from midnight; 3am would be stored as 10,800.
I need a combined date time column and am having a hard time with something that should be simple.
I can get the dates in no problem but lubridate "hms" interprets the time field as a period, not a 'time' per se.
I tried converting the date to POSIXct format and then using that as the origin for the time field, but POSIXct does not set the time to midnight; instead it sets it to either 1800 or 1900 hours depending on the date. I need it set to midnight for all rows; I don't want any daylight saving time adjustment.
Here's the code:
First I made a function because there are several date and time fields I have to do this for.
mkdate<-function(x){
a<-as.Date(as.character(x),format='%Y%m%d')
a<-as.POSIXct(a)
return(a)
}
df$date<-mkdate(df$date) #applies date making function to date field
df$datetime<-as.POSIXct(df$time,origin=df$date)
I'm sure this has to do with time zones. I'm in Central time zone and I have experimented with adding the "tz" specification into these commands in both the mkdate function and in the time code creating "datetime" column.
I've tried:
tz="America/Chicago"
tz="CST"
tz="UTC"
Help would be much appreciated!
Edited with example:
x<-c(20180831,20180710,20160511,20170105,20180101) #these are dates.
as.POSIXct(as.Date(as.character(x),format="%Y%m%d"))
Above code converts dates to seconds from the Jan 1 1970. I could convert this to numeric and add my 'seconds' value to this field BUT it is not correct. This is what I see instead as the output:
[1] "2018-08-30 19:00:00 CDT" "2018-07-09 19:00:00 CDT" "2016-05-10 19:00:00 CDT" "2017-01-04 18:00:00 CST" "2017-12-31 18:00:00 CST"
Look at the first date - it should be 8/31 but instead it is 8/30. Somewhere in there there is a timezone adjustment taking place. It's moving the clock back 5 or 6 hours because I am on central time. The first entry should be 2018-08-31 00:00:00. I would then convert it to numeric and add the seconds field on and convert back to POSIXct format. I've tried including tz specification all over the place with no luck.
Sys.getlocale("LC_TIME")
returns "English_United States.1252"
I believe the following does what you want.
My locale is the following, so the results are different from yours.
Sys.getlocale("LC_TIME")
#[1] "Portuguese_Portugal.1252"
The difference will be due to the daylight savings time, the summer hour.
As for your problem, all you have to do is remember that objects of class "POSIXct" are coded as the number of seconds since an origin, and that origin is usually midnight of 1970-01-01. So you have to add your seconds since midnight to the number of seconds that the converted date represents.
x <- "20180831"
xd <- mkdate(x)
y <- 10800
as.POSIXct(as.integer(xd) + y, origin = "1970-01-01")
#[1] "2018-08-31 04:00:00 BST"
as.POSIXct(as.integer(xd) + y, origin = "1970-01-01", tz = "America/Chicago")
#[1] "2018-08-30 22:00:00 CDT"
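If the goal is a wall-clock time with no daylight saving shift at all, one possible variation is to do the whole calculation in UTC. This is only a sketch: the `secs` values are made up for illustration, and it assumes you are happy to have the result labelled as UTC rather than Central time.

```r
# Dates from the question, plus made-up seconds-since-midnight values
dates <- c(20180831, 20180710, 20160511, 20170105, 20180101)
secs  <- c(10800, 0, 86399, 3600, 43200)

# Parse each date as midnight UTC, so no local offset is applied
midnight_utc <- as.POSIXct(as.character(dates), format = "%Y%m%d", tz = "UTC")

# POSIXct is a count of seconds, so adding seconds since midnight is plain arithmetic
datetime <- midnight_utc + secs

format(datetime, "%Y-%m-%d %H:%M:%S")
#[1] "2018-08-31 03:00:00" "2018-07-10 00:00:00" "2016-05-11 23:59:59"
#[4] "2017-01-05 01:00:00" "2018-01-01 12:00:00"
```

Because UTC has no daylight saving time, 10800 seconds always lands on 03:00 regardless of the date.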
There are very many ways to do this:
mktime = function(a, b)modifyList(strptime(a, '%Y%m%d'), list(sec = as.numeric(gsub(',', '', b))))
mktime("20180831",'10,800')
[1] "2018-08-31 03:00:00 PDT"
mktime('20180301','10800')
[1] "2018-03-01 03:00:00 PST"
mktime('20180321','10800')
[1] "2018-03-21 03:00:00 PDT"
Looking at the above output, there is no daylight saving time adjustment. Irrespective of the date, the time still shows 3 AM, including on dates after the switch from standard time to daylight time. This also takes your LOCAL timezone into consideration.
I am inexperienced in working with date formats in R, and I am struggling to understand the different behaviour with the first of April... is it an April fool?? :)
The strings have the same format, but it seems that the first day can't be parsed using as.POSIXct (while other dates show no issues), and as.POSIXlt does not return the time zone for it?
(as.POSIXct("1/04/2012 02:58", format = "%d/%m/%Y %H:%M")) # this doesn't work
(as.POSIXct("2/04/2012 02:58", format = "%d/%m/%Y %H:%M")) # this works
(as.POSIXct("01/04/2012 02:58", format = "%d/%m/%Y %H:%M")) # this doesn't
(as.POSIXct("02/04/2012 02:58", format = "%d/%m/%Y %H:%M")) # this does...
(as.POSIXlt("1/04/2012 02:58", format = "%d/%m/%Y %H:%M")) # This works, but does not return a time zone
(as.POSIXlt("2/04/2012 02:58", format = "%d/%m/%Y %H:%M")) # This works, and returns a time zone
(as.POSIXlt("01/04/2012 02:58", format = "%d/%m/%Y %H:%M")) # This works, but does not return a time zone
(as.POSIXlt("02/04/2012 02:58", format = "%d/%m/%Y %H:%M")) # This works, and returns a time zone
Any direction as to why? Thanks!
This is almost certainly a daylight savings time issue. Not sure why POSIXct and POSIXlt behave differently though. From your profile, it looks like you're in Mexico.
From here:
most of Mexico, including capital Mexico City, will set the clocks 1 hour forward 3 weeks later, on Sunday, April 1, 2012.
So the problem is that 2:58 AM on 1 April 2012 did not exist in the time zone that is currently active in your locale.
Unless there is something specific having to do with the POSIXct/POSIXlt difference, this should probably be closed as a duplicate of e.g.:
What is wrong with this date and time?
R POSIXct returns NA with "03/12/2017 02:17:13"
PosixCT conversion in R fails
Weird as.POSIXct behavior depending on daylight savings time
Strange strptime behavior in R
as.POSIX error, can not convert a particular date
Weird POSIX behaviour for two closely time strings with and without specifying the format
And this r help question
If you want to deal with this e.g. by setting all times to UTC (i.e. ignoring your local time zone settings), I believe there are lots of suggestions on Stack Overflow (now that you know to search for "daylight savings time" it should be easy to find them).
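As a rough sketch of the UTC approach (not the only way to handle it), you can pass `tz` explicitly when parsing. The `America/Mexico_City` zone below is an assumption based on the asker's profile, and `fmt`, `parsed_utc` and `parsed_local` are just illustrative names.

```r
x <- c("1/04/2012 02:58", "2/04/2012 02:58")
fmt <- "%d/%m/%Y %H:%M"

# Parsed as UTC, every clock time exists, so nothing is lost to a DST gap
parsed_utc <- as.POSIXct(x, format = fmt, tz = "UTC")
format(parsed_utc, "%Y-%m-%d %H:%M")
#[1] "2012-04-01 02:58" "2012-04-02 02:58"

# Passing the zone explicitly (instead of relying on the session default)
# makes it reproducible which DST rules are applied
parsed_local <- as.POSIXct(x, format = fmt, tz = "America/Mexico_City")
is.na(parsed_local)  # flags any entries that could not be parsed in that zone
```

How the nonexistent local time is reported can depend on the platform and R version, so the last line is left without an expected output.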
obligatory xkcd
@Ben Bolker is correct that this is a daylight saving time issue. Specifically, this is what I call a nonexistent time issue. In Mexico City, on April 1st 2012, there was a DST gap of 1 hour where the clocks jumped from 01:59:59 AM straight to 03:00:00 AM, skipping the two o'clock hour entirely. So 02:58:00 AM is a nonexistent time on that day.
These problems can be really frustrating, so in the clock package I've made parsing issues like this an error by default, with many ways to get around them according to your needs.
For future visitors to this post, here is a reprex with the full output from as.POSIXct()/as.POSIXlt() vs clock. The relevant clock function is date_time_parse().
library(clock)
x <- c("1/04/2012 02:58", "2/04/2012 02:58")
zone <- "America/Mexico_City"
format <- "%d/%m/%Y %H:%M"
# Nonexistent time - returns NA
as.POSIXct(x, tz = zone, format = format)
#> [1] NA "2012-04-02 02:58:00 CDT"
# Nonexistent time - can't determine zone
as.POSIXlt(x, tz = zone, format = format)
#> [1] "2012-04-01 02:58:00" "2012-04-02 02:58:00 CDT"
# Errors on nonexistent time so you don't have surprising results
date_time_parse(x, zone = zone, format = format)
#> Error: Nonexistent time due to daylight saving time at location 1.
#> ℹ Resolve nonexistent time issues by specifying the `nonexistent` argument.
# Next valid time
date_time_parse(x, zone = zone, format = format, nonexistent = "roll-forward")
#> [1] "2012-04-01 03:00:00 CDT" "2012-04-02 02:58:00 CDT"
# Previous valid time
date_time_parse(x, zone = zone, format = format, nonexistent = "roll-backward")
#> [1] "2012-04-01 01:59:59 CST" "2012-04-02 02:58:00 CDT"
# Shift forward by the size of the gap (1 hour)
date_time_parse(x, zone = zone, format = format, nonexistent = "shift-forward")
#> [1] "2012-04-01 03:58:00 CDT" "2012-04-02 02:58:00 CDT"
# NA on nonexistent times
date_time_parse(x, zone = zone, format = format, nonexistent = "NA")
#> [1] NA "2012-04-02 02:58:00 CDT"
I would like to use R for time series analysis. I want to make a time-series model and use functions from the packages timeDate and forecast.
I have intraday data in the CET time zone (15 minutes data, 4 data points per hour). On March 31st daylight savings time is implemented and I am missing 4 data points of the 96 that I usually have. On October 28th I have 4 data points too many as time is switched back.
For my time series model I always need 96 data points, as otherwise the intraday seasonality gets messed up.
Do you have any experience with this? Do you know an R function or a package that would help to automate such data handling - something elegant?
Thank you!
I had a similar problem with hydrological data from a sensor. My timestamps were in UTC+1 (CET) and did not switch to daylight saving time (UTC+2, CEST). As I didn't want my data to be one hour off (which would be the case if UTC were used) I took the %z conversion specification of strptime. In ?strptime you'll find:
%z Signed offset in hours and minutes from UTC, so -0800 is 8 hours
behind UTC.
For example: In 2012, the switch from Standard Time to DST occurred on 2012-03-25, so there is no 02:00 on this day. If you try to convert "2012-03-25 02:00:00" to a POSIXct object,
> as.POSIXct("2012-03-25 02:00:00", tz="Europe/Vienna")
[1] "2012-03-25 CET"
you don't get an error or a warning, you just get the date without the time (this behavior is documented).
Using format = "%z" gives the desired result:
> as.POSIXct("2012-03-25 02:00:00 +0100", format="%F %T %z", tz="Europe/Vienna")
[1] "2012-03-25 03:00:00 CEST"
In order to facilitate this import, I wrote a small function with appropriate default values:
as.POSIXct.no.dst <- function (x, tz = "", format="%Y-%m-%d %H:%M", offset="+0100", ...)
{
x <- paste(x, offset)
format <- paste(format, "%z")
as.POSIXct(x, tz, format=format, ...)
}
> as.POSIXct.no.dst(c("2012-03-25 00:00", "2012-03-25 01:00", "2012-03-25 02:00", "2012-03-25 03:00"))
[1] "2012-03-25 00:00:00 CET" "2012-03-25 01:00:00 CET" "2012-03-25 03:00:00 CEST"
[4] "2012-03-25 04:00:00 CEST"
If you don't want daylight saving time, convert to a timezone that doesn't have it (e.g. GMT, UTC).
times <- .POSIXct(times, tz="GMT")
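To illustrate why a DST-free zone keeps the 96 points per day, here is a small sketch that builds one day of 15-minute timestamps in UTC. The date is just an illustrative spring-forward day, and `day_start` and `grid` are made-up names.

```r
# One day of 15-minute timestamps on a DST-free clock (UTC)
day_start <- as.POSIXct("2013-03-31 00:00:00", tz = "UTC")
grid <- seq(day_start, by = "15 min", length.out = 96)

length(grid)
#[1] 96
format(grid[c(1, 96)], "%Y-%m-%d %H:%M")
#[1] "2013-03-31 00:00" "2013-03-31 23:45"

# Every step is exactly 15 minutes; there is no gap and no repeated hour
unique(as.numeric(diff(grid), units = "mins"))
#[1] 15
```

If the raw data were recorded in local time, they would still have to be mapped onto such a grid first; this only shows that the grid itself has no missing or doubled hour.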
Here is one way to get the daylight saving time offset, e.g. for Central Daylight Time:
> Sys.time()
"2015-08-20 07:10:38 CDT" # I am at America/Chicago daylight time
> as.POSIXct(as.character(Sys.time()), tz="America/Chicago")
"2015-08-20 07:13:12 CDT"
> as.POSIXct(as.character(Sys.time()), tz="UTC") - as.POSIXct(as.character(Sys.time()), tz="America/Chicago")
Time difference of -5 hours
> as.integer(as.POSIXct(as.character(Sys.time()), tz="UTC") - as.POSIXct(as.character(Sys.time()), tz="America/Chicago"))
-5
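Calling Sys.time() twice in one expression can give two slightly different instants, so a possibly more robust variant is to compute the offset from a single fixed timestamp. This is a sketch; `stamp`, `t_utc`, `t_chicago` and `offset_hours` are illustrative names.

```r
# One fixed clock reading, interpreted in two different zones
stamp <- "2015-08-20 07:10:38"
t_utc     <- as.POSIXct(stamp, tz = "UTC")
t_chicago <- as.POSIXct(stamp, tz = "America/Chicago")

# The difference between the two instants is the UTC offset of the local zone
offset_hours <- as.numeric(difftime(t_utc, t_chicago, units = "hours"))
offset_hours
#[1] -5

# isdst is 1 when daylight saving time is in effect for that instant
as.POSIXlt(t_chicago)$isdst
#[1] 1
```

For a date in January, the same calculation should give -6, since Central Standard Time is then in effect.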
Some inspiration was from
Converting time zones in R: tips, tricks and pitfalls
I have a very specific problem. I have been trying to convert a date time character into a date time format in R. Example: "2017-05-21 00:00:00".
Whenever I try to convert it using strptime and as.POSIXct to a date time format it gives me "2017-05-21".
Thanks for any help
As @ngm says, this is only a formatting choice on the part of R. You can check to make sure it's actually midnight. Datetimes are stored as seconds past the epoch, and can actually be used in arithmetic.
t1 <- as.POSIXct("2017-05-21 00:00:00")
t1
# [1] "2017-05-21 EDT"
as.integer(t1)
# [1] 1495339200
So your time is 1,495,339,200 seconds after the epoch. Now we can look at midnight plus one second.
t2 <- as.POSIXct("2017-05-21 00:00:01")
t2
# [1] "2017-05-21 00:00:01 EDT"
as.integer(t2)
# [1] 1495339201
Which is one second higher than t1. So t1 is, in fact, midnight.
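If you want the time printed even at midnight, one option is to pass an explicit format string to format(). A minimal sketch follows; it uses tz = "UTC" only so the output does not depend on the local time zone.

```r
t1 <- as.POSIXct("2017-05-21 00:00:00", tz = "UTC")

# An explicit format string always includes the time component
format(t1, "%Y-%m-%d %H:%M:%S")
#[1] "2017-05-21 00:00:00"
```

Note that format() returns a character string, so keep the POSIXct object around for any further arithmetic.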
I am trying to convert UTC time to local standard time. I have found many functions which convert to Local Daylight Time, but I was not successful in getting the standard time. Right now, I have the following code which converts to local daylight time at my specific timezone:
pb.date <- as.POSIXct(date,tz="UTC")
format(pb.date, tz="timeZone",usetz=TRUE)
I would appreciate any help.
First, POSIXct date-times are always UTC internally. The print.POSIXt and format.POSIXt methods will appropriately make the TZ shift on output from their internal representations:
pb.date <- as.POSIXct(Sys.Date())
Sys.Date()
#[1] "2015-07-09"
So that was midnight of the current date in Greenwich:
format(pb.date, tz="America/Los_Angeles",usetz=TRUE)
#[1] "2015-07-08 17:00:00 PDT"
When it's midnight in Greenwich, it's 5PM Daylight Time in the previous day on the Left Coast of the US. You need to use the correct character values for your TZ (and your OS) both of which at the moment are unspecified.
The US Pacific timezone is 8 hours behind GMT (in winter months) so you can use a timezone that is Standard/Daylight-agnostic:
> format(pb.date,usetz=TRUE, tz="Etc/GMT+8")
[1] "2015-07-08 16:00:00 GMT+8"
(Notice the reversal of + with "behind" and - with "ahead".)
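To make the comparison reproducible, here is the same idea as a sketch with a fixed instant instead of Sys.Date(). The time zone abbreviation that gets printed can vary with the system's time zone database, so only the clock times are shown.

```r
# Midnight UTC on a fixed date, rather than the current date
pb.date <- as.POSIXct("2015-07-09 00:00:00", tz = "UTC")

# A DST-observing zone: July falls in daylight time (UTC-7)
format(pb.date, "%Y-%m-%d %H:%M:%S", tz = "America/Los_Angeles")
#[1] "2015-07-08 17:00:00"

# A fixed-offset zone: always 8 hours behind UTC, i.e. Pacific Standard Time
format(pb.date, "%Y-%m-%d %H:%M:%S", tz = "Etc/GMT+8")
#[1] "2015-07-08 16:00:00"
```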
I know this question has an accepted answer, but in case anyone comes along and this can help. I needed a function to convert UTC times to MTN time (Server is in UTC, we operate in MTN).
Not sure why, but I needed to force it to UTC/GMT first and then convert it to MTN. However, it does work:
mtn_ts = function(utcTime){
library(lubridate)
toTz = "US/Mountain"
utcTime = force_tz(utcTime,tzone= 'GMT')
dt = as.POSIXct(format(utcTime,tz = toTz,origin ='GMT', usetz=TRUE))
dt = force_tz(dt,tzone= toTz)
return(dt)
}
mtn_ts(as.POSIXct("2021-09-27 14:48:51.000000000"))
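For comparison, a base R sketch of a similar conversion is below. It assumes the input really is a UTC clock time and uses "America/Denver" as the Mountain zone, which is my choice of name rather than something from the function above. Changing the "tzone" attribute only changes how the instant is displayed, not the instant itself.

```r
# Parse the clock time as UTC up front, so no forcing is needed later
utc_time <- as.POSIXct("2021-09-27 14:48:51", tz = "UTC")

# Same instant, displayed in Mountain time
mtn_time <- utc_time
attr(mtn_time, "tzone") <- "America/Denver"

format(mtn_time, "%Y-%m-%d %H:%M:%S")
#[1] "2021-09-27 08:48:51"

# The underlying number of seconds since the epoch is unchanged
as.numeric(mtn_time) == as.numeric(utc_time)
#[1] TRUE
```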