I am playing around with datetime stuff in R and cannot figure out how to alter the time origin to accept older dates. For example:
vals <- as.character(60:70)
as.POSIXct(vals, origin="1900-01-01", format = "%y")
# [1] "2060-07-25 EDT" "2061-07-25 EDT" "2062-07-25 EDT" "2063-07-25 EDT"
# [5] "2064-07-25 EDT" "2065-07-25 EDT" "2066-07-25 EDT" "2067-07-25 EDT"
# [9] "2068-07-25 EDT" "1969-07-25 EDT" "1970-07-25 EDT"
Is it possible to adjust the origin such that as.POSIXct will return 1960 for an input of "60"? What is the best way to handle an ambiguous century?
You can't make as.POSIXct return 1960 for an input of "60". See ?strptime:
‘%y’ Year without century (00-99). On input, values 00 to 68 are
prefixed by 20 and 69 to 99 by 19 - that is the behaviour
specified by the 2004 and 2008 POSIX standards, but they do
also say ‘it is expected that in a future version the default
century inferred from a 2-digit year will change’.
You need to prepend the century to the string and use the "%Y" format if you want different behavior with as.POSIXct.
vals <- as.character(60:70)
as.POSIXct(paste0("19",vals), format = "%Y")
If some of the two-digit dates are after 2000, you can use ifelse or something similar to prepend a different century.
newvals <- paste0(ifelse(as.integer(vals) < 20, "20", "19"), vals)  # compare as numbers, not as strings
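The century rule above can be wrapped in a small function. This is only a sketch: `expand_year` and its `pivot` argument are hypothetical names, not part of base R, and the cutoff of 20 is simply the one used in the answer above. Pick whatever pivot suits your data.

```r
# Hypothetical helper: expand two-digit years using a caller-chosen pivot.
# Years below `pivot` are read as 20xx, everything else as 19xx.
expand_year <- function(yy, pivot = 20L) {
  yy <- as.integer(yy)                      # accepts "60" as well as 60
  ifelse(yy < pivot, 2000L + yy, 1900L + yy)
}

vals <- as.character(60:70)
full_years <- expand_year(vals)

# Build unambiguous dates by supplying the month and day explicitly
as.Date(paste0(full_years, "-01-01"))
```

Supplying "-01-01" explicitly avoids the current month and day being filled in, which is what happens when only "%Y" is parsed.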
Assuming that you might want some years greater than 2000, prepending 19 to the vector might not be desirable.
In this case subtracting 100 years might be better.
library(lubridate)
vals <- as.character(60:70)
vals <- as.POSIXct(vals, origin="1900-01-01", format = "%y")
vals[year(vals)>2059] <- vals[year(vals)>2059] - years(100)
vals
[1] "1960-07-25 CDT" "1961-07-25 CDT" "1962-07-25 CDT"
[4] "1963-07-25 CDT" "1964-07-25 CDT" "1965-07-25 CDT"
[7] "1966-07-25 CDT" "1967-07-25 CDT" "1968-07-25 CDT"
[10] "1969-07-25 CDT" "1970-07-25 CDT"
I have date-time stamps stored as a character vector:
"2018-12-13 11:00:01 EST" "2018-10-23 22:00:01 EDT" "2018-11-03 14:15:00 EDT" "2018-10-04 19:30:00 EDT" "2018-11-10 17:15:31 EST" "2018-10-05 13:30:00 EDT"
How can I strip the time from this character vector?
PS: Can someone please help? I have tried using strptime, but I am getting NA values as a result.
It's a bit unclear whether you want the date or time but if you want the date then as.Date ignores any junk after the date so:
x <- c("2018-12-13 11:00:01 EST", "2018-10-23 22:00:01 EDT")
as.Date(x)
## [1] "2018-12-13" "2018-10-23"
would be sufficient to get a Date vector from the input vector x. No packages are used.
If you want the time then:
read.table(text = x, as.is = TRUE)[[2]]
## [1] "11:00:01" "22:00:01"
If you want a data frame with each part in a separate column then:
read.table(text = x, as.is = TRUE, col.names = c("date", "time", "tz"))
## date time tz
## 1 2018-12-13 11:00:01 EST
## 2 2018-10-23 22:00:01 EDT
I think the OP wants to extract the time from the date-time variable (going by the title of the question).
x <- "2018-12-13 11:00:01 EST"
format(strptime(x, "%Y-%m-%d %H:%M:%S"), "%H:%M:%S")  # format() re-renders the parsed time
[1] "11:00:01"
Another option:
library(lubridate)
format(ymd_hms(x, tz = "EST"), "%H:%M:%S")
[1] "11:00:01"
The package lubridate makes everything like this easy:
library(lubridate)
x <- "2018-12-13 11:00:01 EST"
as_date(ymd_hms(x))
You can use the as.Date function and specify the format
> as.Date("2018-12-13 11:00:01 EST", format="%Y-%m-%d")
[1] "2018-12-13"
If all values are in a vector:
x = c("2018-12-13 11:00:01 EST", "2018-10-23 22:00:01 EDT",
"2018-11-03 14:15:00 EDT", "2018-10-04 19:30:00 EDT",
"2018-11-10 17:15:31 EST", "2018-10-05 13:30:00 EDT")
> as.Date(x, format="%Y-%m-%d")
[1] "2018-12-13" "2018-10-23" "2018-11-03" "2018-10-04" "2018-11-10"
[6] "2018-10-05"
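Pulling the answers above together, here is a minimal base R sketch that parses each string once and then derives both the date and the time from it. It assumes the trailing zone abbreviation can be ignored (strptime stops reading once the format is satisfied) and uses tz = "UTC" purely as a neutral zone for parsing. That choice is an assumption, not something stated in the question.

```r
x <- c("2018-12-13 11:00:01 EST", "2018-10-23 22:00:01 EDT")

# Parse once; characters after the seconds (" EST", " EDT") are not consumed
# by the format and are ignored. The clock time is kept as written.
parsed <- strptime(x, "%Y-%m-%d %H:%M:%S", tz = "UTC")

dates <- as.Date(parsed)             # Date vector
times <- format(parsed, "%H:%M:%S")  # character vector of clock times

dates
times
```

Because the abbreviation is discarded, the results are the wall-clock date and time exactly as they appear in the strings, with no conversion between zones.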
I have some data in year-month form that I want to format for graphing in ggplot.
date <- c("2016-03", "2016-04", "2016-05", "2016-06", "2016-07", "2016-08",
"2016-09", "2016-10", "2016-11", "2016-12")
I was using parsedate::parse_date, but since updating R it is no longer functioning.
I have looked at "Format Date (Year-Month) in R".
as.yearmon works fine, but it doesn't convert to POSIXct, which I need for ggplot. Other functions such as as.POSIXct and strptime give NAs or errors.
Note: I don't mind if the first of the month gets added to the "year-mo" format.
That's a FAQ: a date is comprised of day, month and year. You are missing one part. So by adding a day, say, '-01', you can impose the missing string structure and parse. Or you can use a more tolerant parser:
R> library(anytime)
R> anydate("2017-06")
[1] "2017-06-01"
R>
Which works for your data too:
R> date
[1] "2016-03" "2016-04" "2016-05" "2016-06"
[5] "2016-07" "2016-08" "2016-09" "2016-10"
[9] "2016-11" "2016-12"
R> anydate(date)
[1] "2016-03-01" "2016-04-01" "2016-05-01"
[4] "2016-06-01" "2016-07-01" "2016-08-01"
[7] "2016-09-01" "2016-10-01" "2016-11-01"
[10] "2016-12-01"
R>
Lastly, your request for POSIXct type is still short an hour, minute and second. But by the same principle:
R> anytime(date)
[1] "2016-03-01 CST" "2016-04-01 CDT"
[3] "2016-05-01 CDT" "2016-06-01 CDT"
[5] "2016-07-01 CDT" "2016-08-01 CDT"
[7] "2016-09-01 CDT" "2016-10-01 CDT"
[9] "2016-11-01 CDT" "2016-12-01 CST"
R>
These two functions return proper Date and POSIXct types, respectively.
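The first suggestion in this answer, adding a day such as "-01" and then parsing, needs no package at all. A minimal sketch in base R, where tz = "UTC" is an assumption chosen only to make the result independent of the local time zone:

```r
date <- c("2016-03", "2016-04", "2016-05", "2016-06", "2016-07", "2016-08",
          "2016-09", "2016-10", "2016-11", "2016-12")

# Append the first of the month so the string is a complete date
full <- paste0(date, "-01")

month_start_date <- as.Date(full)                # Date class
month_start_ct   <- as.POSIXct(full, tz = "UTC") # POSIXct class, midnight UTC

month_start_date
month_start_ct
```

Either vector can be mapped to an axis in ggplot; which one is appropriate depends on whether a date scale or a date-time scale is wanted.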
I would like to convert the following dates
dates <-c(1149318000L, 1151910000L, 1154588400L, 1157266800L, 1159858800L, 1162540800L)
into date and time format
I don't know the origin of the date but I know that
1146685218 = 2006/05/03 07:00:00
** Update 1 **
I have sorted the unformatted dates and replaced the sample above with a friendly sequence, but I have real dates. I was thinking of using the above key as the origin, but it does not seem to work.
let
seconds_in_days <- 3600*24
(dates[2]-dates[1])/seconds_in_days
## [1] 30
If you know
1146685218 = 2006/05/03 07:00:00
then just make that the origin
dates <- c(1149318000L, 1151910000L, 1154588400L, 1157266800L, 1159858800L, 1162540800L)
orig.int <- 1146685218
orig.date <- as.POSIXct("2006/05/03 07:00:00", format="%Y/%m/%d %H:%M:%S")
as.POSIXct(dates-orig.int, origin=orig.date)
# [1] "2006-06-02 18:19:42 EDT" "2006-07-02 18:19:42 EDT" "2006-08-02 18:19:42 EDT"
# [4] "2006-09-02 18:19:42 EDT" "2006-10-02 18:19:42 EDT" "2006-11-02 18:19:42 EST"
This works assuming your "date" values are the number of seconds since a particular date/time, which is how POSIXct stores its date/time values.
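The same idea can be written as a small conversion function, which makes it easy to check that the known key maps back to the known time. This is a sketch only: `to_time` is a hypothetical name, and the reference time is assumed to be in UTC because the question does not say which zone it is in.

```r
dates <- c(1149318000L, 1151910000L, 1154588400L,
           1157266800L, 1159858800L, 1162540800L)

# Known reference pair: this integer corresponds to this clock time
# (assumption: the clock time is expressed in UTC)
orig.int  <- 1146685218
orig.date <- as.POSIXct("2006-05-03 07:00:00", tz = "UTC")

# Hypothetical helper: offset from the reference, in seconds
to_time <- function(v) orig.date + (v - orig.int)

to_time(orig.int)  # sanity check: should give back the reference time
to_time(dates)
```

If the sanity check returns the reference time but the other values look implausible, the counter is probably not measured in seconds.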
When I put a single date to be parsed, it parses accurately
> ymd("20011001")
[1] "2001-10-01 UTC"
But when I try to create a vector of dates they all come out one day off:
> b=c(ymd("20111001"),ymd("20101001"),ymd("20091001"),ymd("20081001"),ymd("20071001"),ymd("20061001"),ymd("20051001"),ymd("20041001"),ymd("20031001"),ymd("20021001"),ymd("20011001"))
> b
[1] "2011-09-30 19:00:00 CDT" "2010-09-30 19:00:00 CDT" "2009-09-30 19:00:00 CDT"
[4] "2008-09-30 19:00:00 CDT" "2007-09-30 19:00:00 CDT" "2006-09-30 19:00:00 CDT"
[7] "2005-09-30 19:00:00 CDT" "2004-09-30 19:00:00 CDT" "2003-09-30 19:00:00 CDT"
[10] "2002-09-30 19:00:00 CDT" "2001-09-30 19:00:00 CDT"
How can I fix this? Many thanks.
I don't claim to understand exactly what's going on here, but the proximal problem is that c() strips attributes, so using c() on a POSIX[c?]t vector strips the time zone attribute, and the values then print in the time zone specified by your locale (even if you set the time zone to agree with the one specified by your locale). On my system:
library(lubridate)
(y1 <- ymd("20011001"))
## [1] "2001-10-01 UTC"
(y2 <- ymd("20011002"))
c(y1,y2)
## now in EDT (and a day earlier/4 hours before UTC):
## [1] "2001-09-30 20:00:00 EDT" "2001-10-01 20:00:00 EDT"
(y12 <- ymd(c("20011001","20011002")))
## [1] "2001-10-01 UTC" "2001-10-02 UTC"
c(y12)
## back in EDT
## [1] "2001-09-30 20:00:00 EDT" "2001-10-01 20:00:00 EDT"
You can set the time zone explicitly ...
y3 <- ymd("20011001",tz="EDT")
## [1] "2001-10-01 EDT"
But c() is still problematic.
(y3c <- c(y3))
## [1] "2001-09-30 20:00:00 EDT"
So two solutions are:
(1) convert a character vector in one call, rather than combining the objects after converting them one by one; or
(2) restore the tzone attribute after combining.
For example:
attr(y3c,"tzone") <- attr(y3,"tzone")
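Both workarounds can be tried side by side. The sketch below swaps lubridate's ymd() for base as.POSIXct() with tz = "UTC" so that it runs without extra packages; that substitution is an assumption made for illustration, and the same two steps apply to objects created with ymd().

```r
y1 <- as.POSIXct("2001-10-01", tz = "UTC")
y2 <- as.POSIXct("2001-10-02", tz = "UTC")

# Workaround 1: convert the whole character vector in a single call
y12 <- as.POSIXct(c("2001-10-01", "2001-10-02"), tz = "UTC")

# Workaround 2: combine first, then put the time zone attribute back
yc <- c(y1, y2)
attr(yc, "tzone") <- attr(y1, "tzone")

# Both print in UTC and refer to the same instants
format(y12, "%Y-%m-%d %H:%M:%S")
format(yc,  "%Y-%m-%d %H:%M:%S")
```

Setting the attribute explicitly is harmless when c() has already kept it, so the second workaround is safe to apply unconditionally.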
@Joran points out that this is almost certainly a general property of applying c() to POSIX[c?]t objects, not specifically lubridate-related. I hope someone will chime in and explain whether this is a well-known design decision/infelicity/misfeature.
Update: there is some discussion of this on R-help in 2012, and Brian Ripley comments:
But in any case, the documentation (?c.POSIXct) is clear:
Using ‘c’ on ‘"POSIXlt"’ objects converts them to the current time
zone, and on ‘"POSIXct"’ objects drops any ‘"tzone"’ attributes
(even if they are all marked with the same time zone).
So the recommended way is to add a "tzone" attribute if you know what
you want it to be. POSIXct objects are absolute times: the timezone
merely affects how they are converted (including to character for
printing).
It might be nice if lubridate added a method to do this ...
I'm currently playing around a lot with dates and times for a package I'm building.
Stumbling across this post reminded me again that it's generally not a bad idea to check out if something can be done with basic R features before turning to contrib packages.
Thus, is it possible to round a date of class POSIXct with base R functionality?
I checked
methods(round)
which "only" gave me
[1] round.Date round.timeDate*
Non-visible functions are asterisked
This is what I'd like to do (Pseudo Code)
x <- as.POSIXct(Sys.time())
[1] "2012-07-04 10:33:55 CEST"
round(x, atom="minute")
[1] "2012-07-04 10:34:00 CEST"
round(x, atom="hour")
[1] "2012-07-04 11:00:00 CEST"
round(x, atom="day")
[1] "2012-07-04 CEST"
I know this can be done with timeDate, lubridate etc., but I'd like to keep package dependencies down. So before going ahead and checking out the source code of the respective packages, I thought I'd ask if someone has already done something like this.
base has round.POSIXt to do this. Not sure why it doesn't come up with methods.
x <- as.POSIXct(Sys.time())
x
[1] "2012-07-04 10:01:08 BST"
round(x,"mins")
[1] "2012-07-04 10:01:00 BST"
round(x,"hours")
[1] "2012-07-04 10:00:00 BST"
round(x,"days")
[1] "2012-07-04"
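One detail worth knowing when using this in a package: round() and trunc() on a date-time return a POSIXlt object, so wrap the result in as.POSIXct() if the POSIXct class has to be kept. A minimal sketch, using a fixed time in UTC (an assumption, chosen so the result does not depend on the local zone) that matches the value in the question:

```r
x <- as.POSIXct("2012-07-04 10:33:55", tz = "UTC")

# round() goes to the nearest unit; convert back to POSIXct afterwards
r_min  <- as.POSIXct(round(x, "mins"))
r_hour <- as.POSIXct(round(x, "hours"))
r_day  <- as.POSIXct(round(x, "days"))

# trunc() always rounds down to the start of the unit
t_hour <- as.POSIXct(trunc(x, "hours"))

format(c(r_min, r_hour, r_day, t_hour), "%Y-%m-%d %H:%M:%S", tz = "UTC")
```

The units are spelled "mins", "hours" and "days", which stands in for the `atom` argument in the question's pseudocode.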
On this theme with lubridate, also look into the ceiling_date() and floor_date() functions:
x <- as.POSIXct("2009-08-03 12:01:59.23")
ceiling_date(x, "second")
# "2009-08-03 12:02:00 CDT"
ceiling_date(x, "hour")
# "2009-08-03 13:00:00 CDT"
ceiling_date(x, "day")
# "2009-08-04 CDT"
ceiling_date(x, "week")
# "2009-08-09 CDT"
ceiling_date(x, "month")
# "2009-09-01 CDT"
If you don't want to call external libraries and want to keep POSIXct, as I do, here is one idea (inspired by this question): use strptime and paste in a fake month and day. It should be possible to do this more straightforwardly, as noted in this comment:
"For strptime the input string need not specify the date completely:
it is assumed that unspecified seconds, minutes or hours are zero, and
an unspecified year, month or day is the current one."
Thus it seems that you have to use strftime to output a truncated string, paste on the missing part, and convert back to POSIXct.
This is how an updated answer could look:
x <- as.POSIXct(Sys.time())
x
[1] "2018-12-27 10:58:51 CET"
round(x,"mins")
[1] "2018-12-27 10:59:00 CET"
round(x,"hours")
[1] "2018-12-27 11:00:00 CET"
round(x,"days")
[1] "2018-12-27 CET"
as.POSIXct(paste0(strftime(x,format="%Y-%m"),"-01")) #trunc by month
[1] "2018-12-01 CET"
as.POSIXct(paste0(strftime(x,format="%Y"),"-01-01")) #trunc by year
[1] "2018-01-01 CET"
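An alternative to going through strings is to edit the broken-down time directly. The sketch below is one possible approach, not the method used in the answer above: `floor_to` is a hypothetical helper that resets the relevant POSIXlt fields and converts back to POSIXct, and the UTC time zone in the example is an assumption.

```r
# Hypothetical helper: truncate a POSIXct to the start of its month or year
floor_to <- function(x, unit = c("month", "year")) {
  unit <- match.arg(unit)
  lt <- as.POSIXlt(x)               # broken-down time, keeps the time zone of x
  lt$mday <- 1L                     # first day of the month
  lt$hour <- 0L
  lt$min  <- 0L
  lt$sec  <- 0
  if (unit == "year") lt$mon <- 0L  # months are 0-based in POSIXlt (0 = January)
  lt$isdst <- -1L                   # let R work out daylight saving for the new date
  as.POSIXct(lt)
}

x <- as.POSIXct("2018-12-27 10:58:51", tz = "UTC")
floor_to(x, "month")
floor_to(x, "year")
```

Unlike the strftime route, this keeps the time zone attached to x instead of re-parsing the string in the session's local zone.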