format year-month to POSIXct [duplicate] - r

I have some data in year-month form that I want to format for graphing in ggplot.
date <- c("2016-03", "2016-04", "2016-05", "2016-06", "2016-07", "2016-08",
"2016-09", "2016-10", "2016-11", "2016-12")
I was using parsedate::parse_date, but since updating R it is no longer working.
I have looked at "Format Date (Year-Month) in R": as.yearmon works fine, but it doesn't produce the POSIXct that I need for ggplot. Other approaches such as as.POSIXct and strptime give NAs or errors.
Note: I don't mind if the first of the month gets added to the "year-mo" format.

That's a FAQ: a date is comprised of day, month and year. You are missing one part. So by adding a day, say, '-01', you can impose the missing string structure and parse. Or you can use a more tolerant parser:
R> library(anytime)
R> anydate("2017-06")
[1] "2017-06-01"
R>
Which works for your data too:
R> date
[1] "2016-03" "2016-04" "2016-05" "2016-06"
[5] "2016-07" "2016-08" "2016-09" "2016-10"
[9] "2016-11" "2016-12"
R> anydate(date)
[1] "2016-03-01" "2016-04-01" "2016-05-01"
[4] "2016-06-01" "2016-07-01" "2016-08-01"
[7] "2016-09-01" "2016-10-01" "2016-11-01"
[10] "2016-12-01"
R>
Lastly, your request for POSIXct type is still short an hour, minute and second. But by the same principle:
R> anytime(date)
[1] "2016-03-01 CST" "2016-04-01 CDT"
[3] "2016-05-01 CDT" "2016-06-01 CDT"
[5] "2016-07-01 CDT" "2016-08-01 CDT"
[7] "2016-09-01 CDT" "2016-10-01 CDT"
[9] "2016-11-01 CDT" "2016-12-01 CST"
R>
These two functions return proper Date and POSIXct types, respectively.
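The first suggestion above (append a day so the string becomes a complete date) can also be done in base R without any extra package. The following is a minimal sketch, assuming UTC is an acceptable time zone for the plot axis; the names full, as_date and as_ct are illustrative only.

```r
date <- c("2016-03", "2016-04", "2016-05", "2016-06")

# Append the first of the month so every string is a complete date
full <- paste0(date, "-01")

# Date class: often sufficient for a date axis
as_date <- as.Date(full, format = "%Y-%m-%d")

# POSIXct class: midnight on the first of each month
# (tz = "UTC" is an assumption; use your own zone if it matters)
as_ct <- as.POSIXct(full, format = "%Y-%m-%d", tz = "UTC")

class(as_date)
class(as_ct)
```

Fixing the time zone explicitly avoids the mixed CST/CDT labels seen in the output above, which come from the local zone of the machine.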

Related

Using anytime() function in R [duplicate]

I am trying to convert a string of date and time in R using the anytime() function. The string has the format 'day-month-year hour:minute:second'. It seems that the anytime() function does not work in a few cases, as shown in the example below.
library(anytime)
x1<-"03-01-2019 01:00:00"
x2<-"23-01-2019 17:00:00"
anytime(x1)
[1] "2019-03-01 01:00:00 CET"
anytime(x2)
[1] NA
I am trying to figure out how to get rid of this problem. Thanks for your help :)
You can use addFormats() to add a specific format to the stack known and used by anytime() (and/or anydate()):
> library(anytime)
> addFormats("%d-%m-%Y %H:%M:%S")
> anytime(c("03-01-2019 01:00:00", "23-01-2019 17:00:00"))
[1] "2019-01-03 01:00:00 CST" "2019-01-23 17:00:00 CST"
>
As explained a few times before on this site, the xx-yy notation is ambiguous and interpreted differently in different parts of the world.
So anytime is guided by the separator in use: / is more common in North America, so we use "mm/dd/yyyy" there. A hyphen, on the other hand, is more common in Europe, so "dd-mm-yyyy" starts that way. You can use getFormats() to see the formats known to anytime(). In a fresh session:
> head(getFormats(), 12) ## abbreviated for display here
[1] "%Y-%m-%d %H:%M:%S%f" "%Y-%m-%e %H:%M:%S%f" "%Y-%m-%d %H%M%S%f"
[4] "%Y-%m-%e %H%M%S%f" "%Y/%m/%d %H:%M:%S%f" "%Y/%m/%e %H:%M:%S%f"
[7] "%Y%m%d %H%M%S%f" "%Y%m%d %H:%M:%S%f" "%m/%d/%Y %H:%M:%S%f"
[10] "%m/%e/%Y %H:%M:%S%f" "%m-%d-%Y %H:%M:%S%f" "%m-%e-%Y %H:%M:%S%f"
>
You can use it without the head() to see all.
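When the layout of the input is known in advance, the ambiguity can also be sidestepped by stating the format explicitly in base R instead of relying on a format stack. A minimal sketch, assuming the strings are day-month-year and that UTC is acceptable; parsed is an illustrative name.

```r
x <- c("03-01-2019 01:00:00", "23-01-2019 17:00:00")

# State the layout explicitly: day, month, four-digit year, then the time
# (tz = "UTC" is an assumption made for reproducibility)
parsed <- as.POSIXct(x, format = "%d-%m-%Y %H:%M:%S", tz = "UTC")

format(parsed, "%Y-%m-%d %H:%M:%S")
```

With an explicit format, both strings are read as January dates, so the second one no longer yields NA.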
An alternative approach could be to use the parsedate package:
library(parsedate)
> parsedate::parse_date(x1)
[1] "2019-03-01 01:00:00 UTC"
> parsedate::parse_date(x2)
[1] "2019-01-23 17:00:00 UTC"

Converting long integer into date and time in r [duplicate]

I have date and time information in the following format:
z <- 20201019083000
I want to convert it into a readable date and time format such as follows:
"2020-10-19 20:20"
So far I have tried this but cannot get the correct answer.
#in local
as.POSIXct(z, origin = "1904-01-01")
"642048-10-22 14:43:20 KST"
#in UTC
as.POSIXct(z, origin = "1960-01-01", tz = "GMT")
"642104-10-23 05:43:20 GMT"
#in
as.POSIXct(as.character(z), format = "%H%M%S")
"2021-07-13 20:20:10 KST"
Any better way to do it?
library(lubridate)
ymd_hms("20201019083000")
# [1] "2020-10-19 08:30:00 UTC"
# or, format the output:
format(ymd_hms("20201019083000"), "%Y-%m-%d %H:%M")
# "2020-10-19 08:30"
You can use as.POSIXct or strptime with the format %Y%m%d%H%M%S:
as.POSIXct(as.character(z), format="%Y%m%d%H%M%S")
#[1] "2020-10-19 08:30:00 CEST"
strptime(z, "%Y%m%d%H%M%S")
#[1] "2020-10-19 08:30:00 CEST"
The format you tried, "%H%M%S", does not include the year %Y, month %m and day %d.
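The origin-based attempts in the question fail for a different reason: a numeric passed to as.POSIXct is read as a number of seconds since the origin, not as a string of digits. A minimal sketch of the full conversion, assuming UTC; sprintf("%.0f", ...) is used here to make sure the number is never rendered in scientific notation, and z_chr, stamp and pretty are illustrative names.

```r
z <- 20201019083000

# Turn the number into its plain digit string (no scientific notation)
z_chr <- sprintf("%.0f", z)

# Parse the digits as year, month, day, hour, minute, second
# (tz = "UTC" is an assumption)
stamp <- as.POSIXct(z_chr, format = "%Y%m%d%H%M%S", tz = "UTC")

# Format for display without the seconds
pretty <- format(stamp, "%Y-%m-%d %H:%M")
pretty
```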

How to handle an ambiguous century in datetime objects?

I am playing around with datetime stuff in R and cannot figure out how to alter the time origin to accept older dates. For example:
vals <- as.character(60:70)
as.POSIXct(vals, origin="1900-01-01", format = "%y")
# [1] "2060-07-25 EDT" "2061-07-25 EDT" "2062-07-25 EDT" "2063-07-25 EDT"
# [5] "2064-07-25 EDT" "2065-07-25 EDT" "2066-07-25 EDT" "2067-07-25 EDT"
# [9] "2068-07-25 EDT" "1969-07-25 EDT" "1970-07-25 EDT"
Is it possible to adjust the origin such that as.POSIXct will return 1960 for an input of "60"? What is the best way to handle an ambiguous century?
You can't make as.POSIXct return 1960 for an input of "60". See ?strptime:
‘%y’ Year without century (00-99). On input, values 00 to 68 are
prefixed by 20 and 69 to 99 by 19 - that is the behaviour
specified by the 2004 and 2008 POSIX standards, but they do
also say ‘it is expected that in a future version the default
century inferred from a 2-digit year will change’.
You need to prepend the century to the string and use the "%Y" format if you want different behavior with as.POSIXct.
vals <- as.character(60:70)
as.POSIXct(paste0("19",vals), format = "%Y")
If some of the two-digit dates are after 2000, you can use ifelse or something similar to prepend a different century.
newvals <- paste0(ifelse(vals < "20", "20", "19"), vals)
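The same idea can be wrapped in a small helper with an adjustable cut-off. This is only a sketch: expand_year and its pivot argument are hypothetical names, and the default pivot of 20 simply mirrors the ifelse above.

```r
# Hypothetical helper: two-digit years below `pivot` become 20xx,
# all others become 19xx
expand_year <- function(yy, pivot = 20L) {
  yy <- as.integer(yy)
  ifelse(yy < pivot, 2000L + yy, 1900L + yy)
}

vals <- as.character(c(60:70, 5))
years <- expand_year(vals)
years

# Build full dates from the expanded years (1 January is an arbitrary choice)
as.Date(paste0(years, "-01-01"))
```

Comparing integers rather than strings avoids surprises if some inputs are not zero-padded.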
Assuming that you might want some years greater than 2000, prepending 19 to the vector might not be desirable.
In this case subtracting 100 years might be better.
library(lubridate)
vals <- as.character(60:70)
vals <- as.POSIXct(vals, origin="1900-01-01", format = "%y")
vals[year(vals)>2059] <- vals[year(vals)>2059] - years(100)
vals
[1] "1960-07-25 CDT" "1961-07-25 CDT" "1962-07-25 CDT"
[4] "1963-07-25 CDT" "1964-07-25 CDT" "1965-07-25 CDT"
[7] "1966-07-25 CDT" "1967-07-25 CDT" "1968-07-25 CDT"
[10] "1969-07-25 CDT" "1970-07-25 CDT"

Convert Integers to Time and Make Calculations in R

I’m having difficulties with a date time problem. My data frame looks like this and I want to find the duration that each person watches TV.
Start.Time <- c(193221,201231,152324,182243,123432,192245)
End.Time <- c(202013,211232,154521,183422,133121,201513)
cbind(Start.Time,End.Time)
I have tried different methods to convert them so that I can do calculations, but none of them gave usable results.
as.POSIXct(Start.Time , origin="2015-11-01")
My results are completely wrong:
[1] "2015-11-03 05:40:21 GMT" "2015-11-03 07:53:51 GMT"
[3] "2015-11-02 18:18:44 GMT" "2015-11-03 02:37:23 GMT"
[5] "2015-11-02 10:17:12 GMT" "2015-11-03 05:24:05 GMT"
For example I want 193221 to become 19:32:21 HH:MM:SS
Is there a package out there that easily does the conversion? And if it's possible, I don't want the date displayed, just the time.
You can convert your numbers to actual time stamps (in POSIXct format) like this:
Start.Time <- c(193221,201231,152324,182243,123432,192245)
Start.POSIX <- as.POSIXct(as.character(Start.Time), format = "%H%M%S")
Start.POSIX
## [1] "2015-12-19 19:32:21 CET" "2015-12-19 20:12:31 CET" "2015-12-19 15:23:24 CET"
## [4] "2015-12-19 18:22:43 CET" "2015-12-19 12:34:32 CET" "2015-12-19 19:22:45 CET"
As you can see, as.POSIXct assumes the times belong to the current date. POSIXct always denotes a specific moment in time and thus contains not only a time but also a date. You can now easily do calculations with these:
End.Time <- c(202013,211232,154521,183422,133121,201513)
End.POSIX <- as.POSIXct(as.character(End.Time), format = "%H%M%S")
End.POSIX - Start.POSIX
## Time differences in mins
## [1] 47.86667 60.01667 21.95000 11.65000 56.81667 52.46667
When you print the POSIXct objects (as I did above with Start.POSIX), they are actually converted to character strings and these are printed. You can see this because there are " around the dates. You can control the format that is used when printing, and thus you can print only the times as follows:
format(Start.POSIX, "%H:%M:%S")
## [1] "19:32:21" "20:12:31" "15:23:24" "18:22:43" "12:34:32" "19:22:45"
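If only the durations are needed, the date can be left out entirely by splitting each HHMMSS number into hours, minutes and seconds with integer arithmetic. A minimal sketch, assuming every end time falls on the same day as its start time; hhmmss_to_seconds and duration_mins are illustrative names.

```r
Start.Time <- c(193221, 201231, 152324, 182243, 123432, 192245)
End.Time   <- c(202013, 211232, 154521, 183422, 133121, 201513)

# Split an HHMMSS number into its parts and return seconds since midnight
hhmmss_to_seconds <- function(t) {
  hours   <- t %/% 10000
  minutes <- (t %/% 100) %% 100
  seconds <- t %% 100
  hours * 3600 + minutes * 60 + seconds
}

# Duration in minutes (assumes End.Time is later on the same day)
duration_mins <- (hhmmss_to_seconds(End.Time) - hhmmss_to_seconds(Start.Time)) / 60
round(duration_mins, 5)
```

Because no date or time zone is involved, the result does not depend on the day the code is run.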

Round a POSIX date (POSIXct) with base R functionality

I'm currently playing around a lot with dates and times for a package I'm building.
Stumbling across this post reminded me again that it's generally not a bad idea to check out if something can be done with basic R features before turning to contrib packages.
Thus, is it possible to round a date of class POSIXct with base R functionality?
I checked
methods(round)
which "only" gave me
[1] round.Date round.timeDate*
Non-visible functions are asterisked
This is what I'd like to do (Pseudo Code)
x <- as.POSIXct(Sys.time())
[1] "2012-07-04 10:33:55 CEST"
round(x, atom="minute")
[1] "2012-07-04 10:34:00 CEST"
round(x, atom="hour")
[1] "2012-07-04 11:00:00 CEST"
round(x, atom="day")
[1] "2012-07-04 CEST"
I know this can be done with timeDate, lubridate etc., but I'd like to keep package dependencies down. So before going ahead and checking out the source code of the respective packages, I thought I'd ask if someone has already done something like this.
base has round.POSIXt to do this. Not sure why it doesn't come up with methods.
x <- as.POSIXct(Sys.time())
x
[1] "2012-07-04 10:01:08 BST"
round(x,"mins")
[1] "2012-07-04 10:01:00 BST"
round(x,"hours")
[1] "2012-07-04 10:00:00 BST"
round(x,"days")
[1] "2012-07-04"
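Note that round() and trunc() on a POSIXct return a POSIXlt object, so wrap the result in as.POSIXct() if the class must stay POSIXct. A minimal sketch with a fixed timestamp, assuming UTC so that the output is reproducible; the variable names are illustrative.

```r
x <- as.POSIXct("2012-07-04 10:33:55", tz = "UTC")

# round() goes to the nearest unit; convert back to POSIXct afterwards
nearest_min  <- as.POSIXct(round(x, "mins"))
nearest_hour <- as.POSIXct(round(x, "hours"))

# trunc() always goes down to the start of the unit
start_of_hour <- as.POSIXct(trunc(x, "hours"))

format(nearest_min, "%Y-%m-%d %H:%M:%S")
format(nearest_hour, "%Y-%m-%d %H:%M:%S")
format(start_of_hour, "%Y-%m-%d %H:%M:%S")
```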
On this theme with lubridate, also look into the ceiling_date() and floor_date() functions:
library(lubridate)
x <- as.POSIXct("2009-08-03 12:01:59.23")
ceiling_date(x, "second")
# "2009-08-03 12:02:00 CDT"
ceiling_date(x, "hour")
# "2009-08-03 13:00:00 CDT"
ceiling_date(x, "day")
# "2009-08-04 CDT"
ceiling_date(x, "week")
# "2009-08-09 CDT"
ceiling_date(x, "month")
# "2009-09-01 CDT"
If you don't want to call external libraries and want to keep POSIXct, as I do, here is one idea (inspired by this question): use strptime and paste in a fake month and day. It should be possible to do this in a more straightforward way, as said in this comment:
"For strptime the input string need not specify the date completely:
it is assumed that unspecified seconds, minutes or hours are zero, and
an unspecified year, month or day is the current one."
Thus it seems that you have to use strftime to output a truncated string, paste in the missing part and convert it to POSIXct again.
This is how an updated answer could look:
x <- as.POSIXct(Sys.time())
x
[1] "2018-12-27 10:58:51 CET"
round(x,"mins")
[1] "2018-12-27 10:59:00 CET"
round(x,"hours")
[1] "2018-12-27 11:00:00 CET"
round(x,"days")
[1] "2018-12-27 CET"
as.POSIXct(paste0(strftime(x,format="%Y-%m"),"-01")) #trunc by month
[1] "2018-12-01 CET"
as.POSIXct(paste0(strftime(x,format="%Y"),"-01-01")) #trunc by year
[1] "2018-01-01 CET"
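The same truncation can be sketched without going through strings, by resetting the fields of a POSIXlt object and converting back. This is only an illustration, assuming UTC (so no daylight-saving edge cases arise); lt, month_start and year_start are illustrative names.

```r
x <- as.POSIXct("2018-12-27 10:58:51", tz = "UTC")

# POSIXlt exposes the individual calendar fields
lt <- as.POSIXlt(x)

# Reset the time of day and the day of the month
lt$sec  <- 0
lt$min  <- 0L
lt$hour <- 0L
lt$mday <- 1L
month_start <- as.POSIXct(lt)   # truncated to the month

# Additionally reset the month (0 = January) to truncate to the year
lt$mon <- 0L
year_start <- as.POSIXct(lt)

format(month_start, "%Y-%m-%d %H:%M:%S")
format(year_start, "%Y-%m-%d %H:%M:%S")
```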
