I have a POSIXct variable with the value "2012-04-15 16:49:36 CEST". Formatting it with "%Y %W %w" returns the year, the week of the year and the weekday as decimal numbers, here 2012 15 0. For those less familiar with these conversion specifications:
%Y: Year with century.
%W: Week of the year as decimal number
(00–53) using Monday as the first day of week (and typically with the
first Monday of the year as day 1 of week 1). The UK convention.
%w: Weekday as decimal number (0–6, Sunday is 0).
Then I try to convert the values back to a POSIXct variable and something unexpected happens: when reading the values, the function seems to interpret a wrong date (2012-04-08). The surprise is that the same procedure works as expected in a second example using Sys.time(). Can someone explain why it does not work in the first example?
(TS <- structure(1334501376, class = c("POSIXct", "POSIXt")))
(TS_YWw <- format(TS,format="%Y %W %w"))
as.POSIXct(TS_YWw,format="%Y %W %w")
(TS <- Sys.time())
(TS_YWw <- format(TS,format="%Y %W %w"))
as.POSIXct(TS_YWw,format="%Y %W %w")
Output
> (TS <- structure(1334501376, class = c("POSIXct", "POSIXt")))
[1] "2012-04-15 16:49:36 CEST"
> (TS_YWw <- format(TS,format="%Y %W %w"))
[1] "2012 15 0"
> as.POSIXct(TS_YWw,format="%Y %W %w")
[1] "2012-04-08 CEST"
>
> (TS <- Sys.time())
[1] "2013-05-16 15:27:44 CEST"
> (TS_YWw <- format(TS,format="%Y %W %w"))
[1] "2013 19 4"
> as.POSIXct(TS_YWw,format="%Y %W %w")
[1] "2013-05-16 CEST"
By the way, I ran the code on a Windows XP 32bit machine with R 2.15.3. Thank you all!
This looks like a bug. Below, I create a sequence of the days in 2012 (dtimes), convert each to a string and back again using the '%Y %W %w' format, and compare the two series. The head output shows the first datetimes that were not preserved in the conversion; there is an obvious weekly pattern (each one is a Sunday that comes back seven days early). Note also that as.POSIXct('2012 0 0', format='%Y %W %w') returns NA.
dtimes <- seq(as.POSIXct('2012-1-1'), as.POSIXct('2013-1-1'), by=as.difftime(1, units='days'))
convert.YWw <- function(dtime) {
fmt <- "%Y %W %w"
string <- format(dtime, format=fmt)
as.POSIXct(string, format=fmt)
}
converted <- lapply(dtimes, convert.YWw)
preserved <- dtimes == converted
dtimes.and.converted <- mapply(function(d, c) c(dtime=d, convert=c), dtimes, converted, SIMPLIFY=FALSE)
head(dtimes.and.converted[! preserved])
# [[1]]
# NULL
#
# [[2]]
# dtime convert
# "2012-01-08 EST" "2012-01-01 EST"
#
# [[3]]
# dtime convert
# "2012-01-15 EST" "2012-01-08 EST"
#
# [[4]]
# dtime convert
# "2012-01-22 EST" "2012-01-15 EST"
#
# [[5]]
# dtime convert
# "2012-01-29 EST" "2012-01-22 EST"
#
# [[6]]
# dtime convert
# "2012-02-05 EST" "2012-01-29 EST"
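Whether or not the round trip behaves in a given R version, the "%Y %W %w" triple can be turned back into a date with plain arithmetic, which sidesteps strptime()'s week handling altogether. The following is a minimal sketch, not part of the original answer: the helper name date_from_YWw is made up for illustration, and it assumes the %W convention quoted in the question (week 1 starts on the first Monday of the year, weekday 0 is Sunday).

```r
# Hypothetical helper: rebuild a Date from the fields produced by
# format(x, "%Y %W %w"), without asking strptime() to parse week numbers.
date_from_YWw <- function(year, week, wday) {
  jan1 <- as.Date(sprintf("%04d-01-01", as.integer(year)))
  jan1_wday <- as.POSIXlt(jan1)$wday           # 0 = Sunday ... 6 = Saturday
  first_monday <- jan1 + (1 - jan1_wday) %% 7  # day 1 of week 1 under %W
  days_into_week <- (wday + 6) %% 7            # Monday = 0 ... Sunday = 6
  first_monday + 7 * (week - 1) + days_into_week
}

date_from_YWw(2012, 15, 0)  # the Sunday from the question: 2012-04-15
date_from_YWw(2013, 19, 4)  # the Sys.time() example: 2013-05-16
```

The point the sketch makes explicit is that under %W a Sunday (%w = 0) is the last day of its week, not the first, which matches the seven-day shift seen in the output above.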
I have date and time information in the following format:
z <- 20201019083000
I want to convert it into a readable date and time format such as follows:
"2020-10-19 20:20"
So far I have tried this but cannot get the correct answer.
#in local
as.POSIXct(z, origin = "1904-01-01")
"642048-10-22 14:43:20 KST"
#in UTC
as.POSIXct(z, origin = "1960-01-01", tz = "GMT")
"642104-10-23 05:43:20 GMT"
#in
as.POSIXct(as.character(z), format = "%H%M%S")
"2021-07-13 20:20:10 KST"
Any better way to do it?
library(lubridate)
ymd_hms("20201019083000")
# [1] "2020-10-19 08:30:00 UTC"
# or, format the output:
format(ymd_hms("20201019083000"), "%Y-%m-%d %H:%M")
# "2020-10-19 08:30"
You can use as.POSIXct or strptime with the format %Y%m%d%H%M%S:
as.POSIXct(as.character(z), format="%Y%m%d%H%M%S")
#[1] "2020-10-19 08:30:00 CEST"
strptime(z, "%Y%m%d%H%M%S")
#[1] "2020-10-19 08:30:00 CEST"
The format you tried, "%H%M%S", does not include the year (%Y), month (%m) and day (%d).
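Two details are easy to trip over here: a large number can be turned into text in scientific notation, and the parsed time depends on the session's time zone. A minimal sketch, assuming the value really is a 14-digit yyyymmddHHMMSS stamp and that UTC is an acceptable zone (the question does not say which zone applies):

```r
z <- 20201019083000

# sprintf("%.0f") writes all digits out, so the format string always matches.
s <- sprintf("%.0f", z)

# Fixing tz makes the result independent of the machine the code runs on.
dt <- as.POSIXct(s, format = "%Y%m%d%H%M%S", tz = "UTC")

format(dt, "%Y-%m-%d %H:%M")  # "2020-10-19 08:30"
```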
I have a date that I convert to a numeric value and want to convert back to a date afterwards.
Converting date to numeric:
date1 = as.POSIXct('2017-12-30 15:00:00')
date1_num = as.numeric(date1)
# 1514646000
Reconverting numeric to date:
as.Date(date1_num, origin = '1/1/1970')
# "4146960-12-12"
What am I missing with the reconversion? I'd expect the last command to return my original date1.
Because the numeric value was created from an object with a time component (it holds seconds since the epoch, not days), convert it back the same way: first to POSIXct, then wrap the result in as.Date:
as.Date(as.POSIXct(date1_num, origin = '1970-01-01'))
#[1] "2017-12-30"
You could use anytime() and anydate() from the anytime package:
R> pt <- anytime("2017-12-30 15:00:00")
R> pt
[1] "2017-12-30 15:00:00 CST"
R>
R> anydate(pt)
[1] "2017-12-30"
R>
R> as.numeric(pt)
[1] 1514667600
R>
R> anydate(as.numeric(pt))
[1] "2017-12-30"
R>
POSIXct counts the number of seconds since the Unix Epoch, while Date counts the number of days. So you can recover the date by dividing by (60*60*24) (let's ignore leap seconds), or convert back to POSIXct instead.
as.Date(as.numeric(date1)/(60*60*24), origin="1970-01-01")
[1] "2017-12-30"
as.POSIXct(as.numeric(date1),origin="1970-01-01")
[1] "2017-12-30 15:00:00 GMT"
Using lubridate :
lubridate::as_datetime(1514646000)
[1] "2017-12-30 15:00:00 UTC"
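The unit mismatch, and the effect of the time zone on which calendar day comes out, can be seen side by side. This is a sketch rather than a recommendation: it assumes the number is seconds since 1970-01-01 UTC, and "Asia/Seoul" is just an example zone that happens to push this instant over midnight.

```r
secs <- 1514646000                 # seconds since 1970-01-01 00:00:00 UTC

# Date counts days, so the seconds must be scaled down first.
whole_days <- secs %/% 86400       # 17530
day_from_days <- as.Date(whole_days, origin = "1970-01-01")

# Or rebuild the instant and let as.Date() pick the calendar day.
dt <- as.POSIXct(secs, origin = "1970-01-01", tz = "UTC")
format(dt, "%Y-%m-%d %H:%M:%S")    # "2017-12-30 15:00:00"

as.Date(dt, tz = "UTC")            # 2017-12-30
as.Date(dt, tz = "Asia/Seoul")     # 2017-12-31 (15:00 UTC is already midnight there)
```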
I have dates encoded in a weekly time format (European convention >> 01 through 52/53, e.g. "2016-48") and would like to standardize them to a POSIX date:
require(magrittr)
(x <- as.POSIXct("2016-12-01") %>% format("%Y-%V"))
# [1] "2016-48"
as.POSIXct(x, format = "%Y-%V")
# [1] "2016-01-11 CET"
I expected the last statement to return "2016-12-01" again. What am I missing here?
Edit
Thanks to Dirk, I was able to piece it together:
y <- sprintf("%s-1", x)
While I still don't get why this doesn't work
(as.POSIXct(y, format = "%Y-%V-%u"))
# [1] "2016-01-11 CET"
this does
(as.POSIXct(y, format = "%Y-%U-%u"))
# [1] "2016-11-28 CET"
Edit 2
Oh my, I think using %V is a very bad idea in general:
as.POSIXct("2016-01-01") %>% format("%Y-%V")
# [1] "2016-53"
Should this be considered to be on a "serious bug" level that requires further action?!
Sticking to either %U or %W seems to be the right way to go
as.POSIXct("2016-01-01") %>% format("%Y-%U")
# [1] "2016-00"
Edit 3
Nope, not quite finished/still puzzled: the approach doesn't work for the very first week
(x <- as.POSIXct("2016-01-01") %>% format("%Y-%W"))
# [1] "2016-00"
as.POSIXct(sprintf("%s-1", x), format = "%Y-%W-%u")
# [1] NA
It does for week 01 as defined in the underlying convention when using %U or %W (so "week 2", actually)
as.POSIXct("2016-01-1", format = "%Y-%W-%u")
# [1] "2016-01-04 CET"
As I have to deal a lot with reporting by ISO weeks, I've created the ISOweek package some years ago.
The package includes the function ISOweek2date() which returns the date of a given weekdate (year, week of the year, day of week according to ISO 8601). It's the inverse function to date2ISOweek().
With ISOweek, your examples become:
library(ISOweek)
# define dates to convert
dates <- as.Date(c("2016-12-01", "2016-01-01"))
# convert to full ISO 8601 week-based date yyyy-Www-d
(week_dates <- date2ISOweek(dates))
[1] "2016-W48-4" "2015-W53-5"
# convert back to class Date
ISOweek2date(week_dates)
[1] "2016-12-01" "2016-01-01"
Note that ISOweek2date() requires a full ISO week-based date in the format yyyy-Www-d, including the day of the week (1 to 7, Monday to Sunday).
So, if you only have year and ISO week number you have to create a character string with a day of the week specified.
A typical phrase in many reports is, e.g., "reporting week 31 ending 2017-08-06":
yr <- 2017
wk <- 31
ISOweek2date(sprintf("%4i-W%02i-%1i", yr, wk, 7))
[1] "2017-08-06"
Addendum
Please, see this answer for another use case and more background information on the ISOweek package.
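As background to the question: the strptime help page describes %V as accepted but ignored on input, which would explain why parsing with "%Y-%V" does not bring back the original week. If adding a package is not an option, the same ISO 8601 conversion can be sketched in base R. The helper name iso_week_to_date is made up for illustration, it is not the ISOweek implementation, and it does no validation of the week number.

```r
# Hypothetical helper: date for an ISO 8601 year/week/weekday
# (weekday 1 = Monday ... 7 = Sunday), using only base R.
iso_week_to_date <- function(year, week, weekday = 1) {
  jan4 <- as.Date(sprintf("%04d-01-04", as.integer(year)))
  # 4 January always lies in ISO week 1; step back to that week's Monday.
  week1_monday <- jan4 - (as.POSIXlt(jan4)$wday + 6) %% 7
  week1_monday + 7 * (week - 1) + (weekday - 1)
}

iso_week_to_date(2016, 48, 4)  # 2016-12-01
iso_week_to_date(2015, 53, 5)  # 2016-01-01
iso_week_to_date(2017, 31, 7)  # 2017-08-06
```

The three calls reproduce the dates from the ISOweek examples above.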
I have a data frame with dates in this format: 2014-06-10 06:12:35 BRT. I would like to compare dates to see whether they belong to the same social day (from 3:00 am on one day to 3:00 am on the next). But when I try to select only the day with
format(as.Date(df$x,format="%Y-%m-%dT%H:%M:%SZ"), "%d"),
the result is sometimes one day too high; for example, for
2014-06-13 22:54:36 BRT it shows 14.
And if I try to extract the hour with
format(as.Date(df$x,format="%Y-%m-%dT%H:%M:%SZ"), "%H")
it always shows 00.
How should I work with dates in R?
In R, times use POSIXct and POSIXlt classes and dates use the Date class.
Dates are stored as the number of days since January 1st, 1970 and times are stored as the number of seconds since January 1st, 1970.
So, for example:
d <- as.Date("1971-01-01")
unclass(d) # one year after 1970-01-01
# [1] 365
pct <- Sys.time() # in POSIXct
unclass(pct) # number of seconds since 1970-01-01
# [1] 1450276559
plt <- as.POSIXlt(pct)
up <- unclass(plt) # up is now a list containing the components of time
names(up)
# [1] "sec" "min" "hour" "mday" "mon" "year" "wday" "yday" "isdst" "zone"
# [11] "gmtoff"
up$hour
# [1] 9
To perform operations on dates and times:
plt - as.POSIXlt(d)
# Time difference of 16420.61 days
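Subtraction picks its own units (days in the output above), which can be inconvenient when the result feeds into further arithmetic. A small sketch using difftime() with explicit units; the two timestamps are the ones mentioned in the question, and UTC is assumed only to keep the example reproducible:

```r
t1 <- as.POSIXct("2014-06-10 06:12:35", tz = "UTC")
t2 <- as.POSIXct("2014-06-13 22:54:36", tz = "UTC")

# Ask for the unit explicitly instead of relying on the automatic choice.
elapsed_secs  <- as.numeric(difftime(t2, t1, units = "secs"))   # 319321
elapsed_hours <- as.numeric(difftime(t2, t1, units = "hours"))  # about 88.7
```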
And to process dates, you can use strptime() (borrowing these examples from the manual page):
strptime("20/2/06 11:16:16.683", "%d/%m/%y %H:%M:%OS")
# [1] "2006-02-20 11:16:16 EST"
# And in vectorized form:
dates <- c("1jan1960", "2jan1960", "31mar1960", "30jul1960")
strptime(dates, "%d%b%Y")
# [1] "1960-01-01 EST" "1960-01-02 EST" "1960-03-31 EST" "1960-07-30 EDT"
To process your specific date[s]:
dateString <- "2014-06-10 06:12:35"
d <- strptime(dateString, "%Y-%m-%d %H:%M:%S")
NOTE: I'm not finding the "BRT" timezone in the R list of zones, so my time above is set to the EDT timezone:
"BRT" %in% OlsonNames()
# [1] FALSE
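For the "social day" part of the question, one possible approach is to shift each time back by three hours and take the calendar date in the data's own zone. This is a sketch under assumptions: the zone name "America/Sao_Paulo" is my guess at what BRT means here, and the sample times are invented apart from the first one. It also shows the likely cause of the "+1": as.Date() on a POSIXct value has historically used UTC unless told otherwise, and 22:54 at UTC-3 is already the next day in UTC. The hour is always 00 because a Date carries no time of day.

```r
tz_local <- "America/Sao_Paulo"   # assumed Olson name for the BRT data

x <- as.POSIXct(c("2014-06-13 22:54:36",
                  "2014-06-14 02:10:00",
                  "2014-06-14 03:00:00"), tz = tz_local)

# Calendar day in UTC versus in the local zone for the first value.
format(as.Date(x[1], tz = "UTC"))     # "2014-06-14"
format(as.Date(x[1], tz = tz_local))  # "2014-06-13"

# Social day: 03:00 to 03:00, so move every time back by three hours.
social_day <- as.Date(x - 3 * 3600, tz = tz_local)
format(social_day)  # "2014-06-13" "2014-06-13" "2014-06-14"

# Hours come from the POSIXct value itself, not from a Date.
format(x, "%H")     # "22" "02" "03"
```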
I'm probably doing something stupid and not seeing it, but:
> strptime("201101","%Y%m")
[1] NA
From help strptime:
%Y Year with century
%m Month as decimal number (01–12)
Just paste a day field (say, "01") that you ignore:
R> shortdate <- "201101"
R> as.Date(paste(shortdate, "01", sep=""), "%Y%m%d")
[1] "2011-01-01"
R>
I prefer as.Date() for dates and strptime() for date-times (strptime() returns a POSIXlt object, which as.POSIXct() converts to POSIXct).
You can then convert the parsed Date object into a POSIXlt object to retrieve year and month:
R> mydt <- as.Date(paste(shortdate, "01", sep=""), "%Y%m%d")
R> myp <- as.POSIXlt(mydt)
R> c(myp$year, myp$mon)
[1] 111 0
R>
This is standard POSIX behaviour with years as "year - 1900" and months as zero-indexed.
Edit seven years later: For completeness, and as someone just upvoted this, the functions in my anytime package can help:
R> anytime::anydate("201101") ## returns a Date
[1] "2011-01-01"
R> anytime::anytime("201101") ## returns a Datetime
[1] "2011-01-01 CST"
R>
They use a different parser (from Boost Date_Time), which is more generous and imputes the missing day (or day/hour/minute/second in the second case).
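To get the calendar year and month back out of the padded date, either undo the POSIXlt offsets by hand or let format() do it. A short base R sketch (variable names are illustrative only):

```r
shortdate <- "201101"

# Append a day so that every field as.Date() needs is present.
d <- as.Date(paste0(shortdate, "01"), format = "%Y%m%d")

# Option 1: POSIXlt fields, undoing the "year - 1900" and zero-based month.
lt <- as.POSIXlt(d)
year_lt  <- lt$year + 1900   # 2011
month_lt <- lt$mon + 1       # 1

# Option 2: format() already returns calendar values.
year_fmt  <- as.integer(format(d, "%Y"))  # 2011
month_fmt <- as.integer(format(d, "%m"))  # 1
```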