Using anytime() function in R [duplicate] - r

This question already has an answer here:
Issue with the format in anytime package
(1 answer)
Closed yesterday.
I am trying to convert a string of date and time in R using anytime() function. The string have the format 'date-month-year hour-minute-second'. It seems that the anytime() function does not work for a few cases, as shown in the example below.
library(lubridate)
x1<-"03-01-2019 01:00:00"
x2<-"23-01-2019 17:00:00"
anytime(x1)
[1] "2019-03-01 01:00:00 CET"
anytime(x2)
[1] NA
I am trying to figure out how to get rid of this problem. Thanks for your help :)

You can use addFormats() to add a specific format to the stack known and used by anytime() (and/or anydate())
> library(anytime)
> addFormats("%d-%m-%Y %H:%M:%S")
> anytime(c("03-01-2019 01:00:00", "23-01-2019 17:00:00"))
[1] "2019-01-03 01:00:00 CST" "2019-01-23 17:00:00 CST"
>
As explained a few times before on this site, the xx-yy notation is ambiguous and interpreted differently in different parts of the world.
So anytime is guided by use as the separator: / is more common in North America so we use "mm/dd/yyy". On the other hand a hyphen is more common in Europe so the "dd-mm-yyyy" starts that way. You can use getFormats() to see the formats in anytime(). In a fresh session:
> head(getFormats(), 12) ## abbreviated for display here
[1] "%Y-%m-%d %H:%M:%S%f" "%Y-%m-%e %H:%M:%S%f" "%Y-%m-%d %H%M%S%f"
[4] "%Y-%m-%e %H%M%S%f" "%Y/%m/%d %H:%M:%S%f" "%Y/%m/%e %H:%M:%S%f"
[7] "%Y%m%d %H%M%S%f" "%Y%m%d %H:%M:%S%f" "%m/%d/%Y %H:%M:%S%f"
[10] "%m/%e/%Y %H:%M:%S%f" "%m-%d-%Y %H:%M:%S%f" "%m-%e-%Y %H:%M:%S%f"
>
You can use it without the head() to see all.

An alternative approach could be using parsedate package:
library(parsedate)
> parsedate::parse_date(x1)
[1] "2019-03-01 01:00:00 UTC"
> parsedate::parse_date(x2)
[1] "2019-01-23 17:00:00 UTC"

Related

Convert system time to New York time zone

Is there a way to convert Sys.time to New york time zone in R?. I tried with following, but did not work
as.POSIXct(Sys.time(), tz="EDT")
Can anyone help?
When you call Sys.time(), the time zone is already included in the string:
Sys.time()
#> [1] "2020-09-18 08:56:56 BST"
If you want to convert that time to a different timezone, you can set its "tzone" attribute:
`attr<-`(as.POSIXct(Sys.time()), "tzone", "EST")
#> [1] "2020-09-18 02:57:23 EST"
My system doesn't recognise "EDT" as a named time zone. I have to do:
`attr<-`(as.POSIXct(Sys.time()), "tzone", "America/New_York")
#> [1] "2020-09-18 03:59:55 EDT"
You can display time in any time-zone using format :
time <- Sys.time()
format(time, tz="America/New_York",usetz=TRUE)
#[1] "2020-09-18 03:59:39 EDT"
#this works too
format(time, tz="US/Eastern",usetz=TRUE)
#[1] "2020-09-18 03:59:39 EDT"

Convert Integers to Time and Make Calculations in R

I’m having difficulties with a date time problem. My data frame looks like this and I want to find the duration that each person watches TV.
Start.Time <- c(193221,201231,152324,182243,123432,192245)
End.Time <- c(202013,211232,154521,183422,133121,201513)
cbind(Start.Time,End.Time)
I have tried different methods to convert them in order to be able to make calculation but I didn’t produce any significant results.
as.POSIXct(Start.Time , origin="2015-11-01")
My results are completely wrong
[1] "2015-11-03 05:40:21 GMT" "2015-11-03 07:53:51 GMT"
[3] "2015-11-02 18:18:44 GMT" "2015-11-03 02:37:23 GMT"
[5] "2015-11-02 10:17:12 GMT" "2015-11-03 05:24:05 GMT"
For example I want 193221 to become 19:32:21 HH:MM:SS
Is there a package out there that easily does the conversion? and if its possible i don't want the date displayed, just the time.
You can convert your numbers to actual time stamps (in POSIXct format) like this:
Start.Time <- c(193221,201231,152324,182243,123432,192245)
Start.POSIX <- as.POSIXct(as.character(Start.Time), format = "%H%M%S")
Start.POSIX
## [1] "2015-12-19 19:32:21 CET" "2015-12-19 20:12:31 CET" "2015-12-19 15:23:24 CET"
## [4] "2015-12-19 18:22:43 CET" "2015-12-19 12:34:32 CET" "2015-12-19 19:22:45 CET"
As you can see, as.POSIXct assumes the times to belong to the current date. POSIXct alway denotes a specific moment in time and thus contains not only a time but also a date. You can now easily do calculations with these:
End.Time <- c(202013,211232,154521,183422,133121,201513)
End.POSIX <- as.POSIXct(as.character(End.Time), format = "%H%M%S")
End.POSIX - Start.POSIX
## Time differences in mins
## [1] 47.86667 60.01667 21.95000 11.65000 56.81667 52.46667
When you print the POSIXct objects (as I did above with Start.POSIX) they are acutally converted to characters and these are printed. You can see this, because there are " around the dates. You can control the format that is used when printing and thus, you could print the times only as follows:
format(Start.POSIX, "%H:%M:%S")
## [1] "19:32:21" "20:12:31" "15:23:24" "18:22:43" "12:34:32" "19:22:45"

Date conversion without specifying the format

I do not understand how the "ymd" function from the library "lubridate" works in R. I am trying to build a feature which converts the date correctly without having to specify the format. I am checking for the minimum number of NA's occurring as a result of dmy(), mdy() and ymd() functions.
So ymd() is giving NA sometimes and sometimes not for the same Date value. Are there any other functions or packages in R, which will help me get over this problem.
> data$DTTM[1:5]
[1] "4-Sep-06" "27-Oct-06" "8-Jan-07" "28-Jan-07" "5-Jan-07"
> ymd(data$DTTM[1])
[1] NA
Warning message:
All formats failed to parse. No formats found.
> ymd(data$DTTM[2])
[1] "2027-10-06 UTC"
> ymd(data$DTTM[3])
[1] NA
Warning message:
All formats failed to parse. No formats found.
> ymd(data$DTTM[4])
[1] "2028-01-07 UTC"
> ymd(data$DTTM[5])
[1] NA
Warning message:
All formats failed to parse. No formats found.
>
> ymd(data$DTTM[1:5])
[1] "2004-09-06 UTC" "2027-10-06 UTC" "2008-01-07 UTC" "2028-01-07 UTC"
[5] "2005-01-07 UTC"
Thanks
#user1317221_G has already pointed out that you dates are in day-month-year format, which suggests that you should use dmy instead of ymd. Furthermore, because your month is in %b format ("Abbreviated month name in the current locale"; see ?strptime), your problem may have something to do with your locale. The month names you have seem to be English, which may differ from how they are spelled in the locale you are currently using.
Let's see what happens when I try dmy on the dates in my locale:
date_english <- c("4-Sep-06", "27-Oct-06", "8-Jan-07", "28-Jan-07", "5-Jan-07")
dmy(date_english)
# [1] "2006-09-04 UTC" NA "2007-01-08 UTC" "2007-01-28 UTC" "2007-01-05 UTC"
# Warning message:
# 1 failed to parse.
"27-Oct-06" failed to parse. Let's check my time locale:
Sys.getlocale("LC_TIME")
# [1] "Norwegian (Bokmål)_Norway.1252"
dmy does not recognize "oct" as a valid %b month in my locale.
One way to deal with this issue would be to change "oct" to the corresponding Norwegian abbreviation, "okt":
date_nor <- c("4-Sep-06", "27-Okt-06", "8-Jan-07", "28-Jan-07", "5-Jan-07" )
dmy(date_nor)
# [1] "2006-09-04 UTC" "2006-10-27 UTC" "2007-01-08 UTC" "2007-01-28 UTC" "2007-01-05 UTC"
Another possibility is to use the original dates (i.e. in their original 'locale'), and set the locale argument in dmy. Exactly how this is done is platform dependent (see ?locales. Here is how I would do it in Windows:
dmy(date_english, locale = "English")
[1] "2006-09-04 UTC" "2006-10-27 UTC" "2007-01-08 UTC" "2007-01-28 UTC" "2007-01-05 UTC"
Using the guess_formats function in the lubridate package would be the closest to what you are after.
library(lubridate)
x <- c("4-Sep-06", "27-Oct-06","8-Jan-07" ,"28-Jan-07","5-Jan-2007")
format <- guess_formats(x, c("mdY", "BdY", "Bdy", "bdY", "bdy", "mdy", "dby"))
strptime(x, format)
HTH
from the documentation on ymd on page 70
As long as the order of formats is
correct, these functions will parse dates correctly even when the input vectors contain differently
formatted dates
ymd() expects year-month-day, you have day-month-year
x <- c("2009-01-01", "2009-01-02", "2009-01-03")
ymd(x)
maybe you need something like
y <- c("4-Sep-06", "27-Oct-06", "8-Jan-07", "28-Jan-07", "5-Jan-07" )
as.POSIXct(y, format = "%d-%b-%y")
PS the reason I think you get NAs for some is that you only have a single digit for year and ymd doesn't know what to do with that, but it works when you have two digits for year e.g. "27-Oct-06" "28-Jan-07" but fails for "5-Jan-07" etc

Round a POSIX date (POSIXct) with base R functionality

I'm currently playing around a lot with dates and times for a package I'm building.
Stumbling across this post reminded me again that it's generally not a bad idea to check out if something can be done with basic R features before turning to contrib packages.
Thus, is it possible to round a date of class POSIXct with base R functionality?
I checked
methods(round)
which "only" gave me
[1] round.Date round.timeDate*
Non-visible functions are asterisked
This is what I'd like to do (Pseudo Code)
x <- as.POSIXct(Sys.time())
[1] "2012-07-04 10:33:55 CEST"
round(x, atom="minute")
[1] "2012-07-04 10:34:00 CEST"
round(x, atom="hour")
[1] "2012-07-04 11:00:00 CEST"
round(x, atom="day")
[1] "2012-07-04 CEST"
I know this can be done with timeDate, lubridate etc., but I'd like to keep package dependencies down. So before going ahead and checking out the source code of the respective packages, I thought I'd ask if someone has already done something like this.
base has round.POSIXt to do this. Not sure why it doesn't come up with methods.
x <- as.POSIXct(Sys.time())
x
[1] "2012-07-04 10:01:08 BST"
round(x,"mins")
[1] "2012-07-04 10:01:00 BST"
round(x,"hours")
[1] "2012-07-04 10:00:00 BST"
round(x,"days")
[1] "2012-07-04"
On this theme with lubridate, also look into the ceiling_date() and floor_date() functions:
x <- as.POSIXct("2009-08-03 12:01:59.23")
ceiling_date(x, "second")
# "2009-08-03 12:02:00 CDT"
ceiling_date(x, "hour")
# "2009-08-03 13:00:00 CDT"
ceiling_date(x, "day")
# "2009-08-04 CDT"
ceiling_date(x, "week")
# "2009-08-09 CDT"
ceiling_date(x, "month")
# "2009-09-01 CDT"
If you don't want to call external libraries and want to keep POSIXct as I do this is one idea (inspired by this question): use strptime and paste a fake month and day. It should be possible to do it more straight forward, as said in this comment
"For strptime the input string need not specify the date completely:
it is assumed that unspecified seconds, minutes or hours are zero, and
an unspecified year, month or day is the current one."
Thus it seems that you have to use strftime to output a truncated string, paste the missing part and convert again in POSIXct.
This is how an update answer could look:
x <- as.POSIXct(Sys.time())
x
[1] "2018-12-27 10:58:51 CET"
round(x,"mins")
[1] "2018-12-27 10:59:00 CET"
round(x,"hours")
[1] "2018-12-27 11:00:00 CET"
round(x,"days")
[1] "2018-12-27 CET"
as.POSIXct(paste0(strftime(x,format="%Y-%m"),"-01")) #trunc by month
[1] "2018-12-01 CET"
as.POSIXct(paste0(strftime(x,format="%Y"),"-01-01")) #trunc by year
[1] "2018-01-01 CET"

R converting POSIXct dates with BST/GMT tags using as.Date()

I need to use as.Date on the index of a zoo object. Some of the dates are in BST and so when converting I lose a day on (only) these entries. I don't care about one hour's difference or even the time part of the date at all, I just want to make sure that the dates displayed stay the same. I'm guessing this is not very hard but I can't manage it. Can somebody help please?
class(xtsRet)
#[1] "xts" "zoo"
index(xtsRet)
#[1] "2007-07-31 BST" "2007-08-31 BST" "2007-09-30 BST" "2007-10-31 GMT"
class(index(xtsRet))
#[1] "POSIXt" "POSIXct"
index(xtsRet) <- as.Date(index(xtsRet))
index(xtsRet)
#[1] "2007-07-30" "2007-08-30" "2007-09-29" "2007-10-31"
Minimally reproducible example (not requiring zoo package):
my_date <- as.POSIXct("2007-04-01") # Users in non-UK timezone will need to
# do as.POSIXct("2007-04-01", "Europe/London")
my_date
#[1] "2017-04-01 BST"
as.Date(my_date)
#[1] "2017-03-31"
Suppose we have this sample data:
library(zoo)
x <- as.POSIXct("2000-01-01", tz = "GMT")
Then see if any of these are what you want:
# use current time zone
as.Date(as.character(x, tz = ""))
# use GMT
as.Date(as.character(x, tz = "GMT"))
# set entire session to GMT
Sys.setenv(TZ = "GMT")
as.Date(x)
Also try "BST" in place of "GMT" and note the article on dates and times in R News 4/1 .
You can offset the POSIX objects so its not based around midnight. 1 hour (3600 secs) should be sufficient:
d <- as.POSIXct(c("2007-07-31","2007-08-31","2007-09-30","2007-10-31"))
d
[1] "2007-07-31 BST" "2007-08-31 BST" "2007-09-30 BST" "2007-10-31 GMT"
as.Date(d)
[1] "2007-07-30" "2007-08-30" "2007-09-29" "2007-10-31"
as.Date(d+3600)
[1] "2007-07-31" "2007-08-31" "2007-09-30" "2007-10-31"
I would suggest using as.POSIXlt to convert to a date object, wrapped in as.Date:
d <- as.POSIXct(c("2007-07-31","2007-08-31","2007-09-30","2007-10-31"))
d
[1] "2007-07-31 BST" "2007-08-31 BST" "2007-09-30 BST" "2007-10-31 GMT"
as.Date(as.POSIXlt(d))
[1] "2007-07-31" "2007-08-31" "2007-09-30" "2007-10-31"
Achieves the same thing as the +3600 above, but slightly less of a hack

Resources