ymd with vector of dates - r

A simple question, I think. I have some dates, d:
d <- as.POSIXct(c("2014-01-01 00:00:00 BST", "2014-01-01 00:30:00 BST"))
> class(d)
[1] "POSIXct" "POSIXt"
If I try and extract just the date part with lubridate, it works fine with a single value but not the whole vector, i.e.:
> ymd(d[1])
[1] "2014-01-01 UTC"
> ymd(d)
[1] NA NA
Warning message:
All formats failed to parse. No formats found.
For the record, this works:
> as.Date(d, format="%F")
[1] "2014-01-01" "2014-01-01"
What's going on here?

Your issue is that your vector is not just year, month, day (ymd), but also hour, minute, second (hms). Consider using this instead:
ymd_hms(d)
If you want just the date (as a character string), you can use:
strftime(ymd_hms(d),'%Y-%m-%d')
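Since d is already POSIXct, the date part can also be taken without re-parsing it. A minimal base R sketch, assuming the timestamps are meant to be read in UTC (the tz value is an assumption, not something stated in the question):

```r
# Assumption: the timestamps are interpreted in UTC; substitute your own zone.
d <- as.POSIXct(c("2014-01-01 00:00:00", "2014-01-01 00:30:00"), tz = "UTC")

# as.Date() on a POSIXct vector keeps only the calendar date in the given zone.
dates <- as.Date(d, tz = "UTC")

print(dates)
print(class(dates))  # a Date vector, not character
```

Unlike strftime(), this returns Date objects, so the result can still be used in date arithmetic.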

Related

Format date in R with lubridate

My input data, formatted as character, looks like this
"2020-07-10T00:00:00"
I tried
library(lubridate)
mdy_hms("2020-07-10T00:00:00", format='%Y-%m-%dT%H:%M:%S', tz=Sys.timezone())
But I get
[1] NA NA
Warning message:
All formats failed to parse. No formats found.
I tried the more flexible approach, parse_date_time(), but without luck:
parse_date_time("2020-07-10T00:00:00", '%Y-%m-%dT%H:%M:%S', tz=Sys.timezone())
How can I convert this date "2020-07-10T00:00:00" to a date R recognizes? Note: I am not interested in the time really, only the date!
Why not just
as.Date("2020-07-10T00:00:00")
# [1] "2020-07-10"
Fun fact:
as.Date("2020-07-101sddT00:1sdafsdfsdf0:00sdfzsdfsdfsdf")
# [1] "2020-07-10"
Assuming that the 07 is the month of July, and the 10 is the 10th:
x <- "2020-07-10T00:00:00"
ymd_hms(x, tz = Sys.timezone())
> [1] "2020-07-10 AEST"
If it's in format year-day-month, swap the ymd for ydm.
Hope this helps!
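If you would rather not rely on as.Date() ignoring whatever follows the date, the format can be spelled out with the literal T included. A small base R sketch, assuming the timestamp is in UTC (the question does not say which zone applies):

```r
x <- "2020-07-10T00:00:00"

# Assumption: the timestamp is in UTC. The "T" in the format is matched literally.
stamp <- as.POSIXct(x, format = "%Y-%m-%dT%H:%M:%S", tz = "UTC")
day   <- as.Date(stamp, tz = "UTC")

# Input that does not match the format yields NA rather than a partial parse.
bad <- as.POSIXct("2020-07-10 junk", format = "%Y-%m-%dT%H:%M:%S", tz = "UTC")

print(day)
print(is.na(bad))
```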

How to convert "date" in "chr" format to "date" format? [duplicate]

Can anyone tell me why R gives the results below?
> as.POSIXct("2013-01-01 08:00")
[1] "2013-01-01 08:00:00 HKT"
> as.Date(as.POSIXct("2013-01-01 08:00"))
[1] "2013-01-01"
> as.POSIXct("2013-01-01 07:00")
[1] "2013-01-01 07:00:00 HKT"
> as.Date(as.POSIXct("2013-01-01 07:00"))
[1] "2012-12-31"
Shouldn't the result be 2013-01-01 after converting POSIXct to Date for 2013-01-01 07:00? Is there any way to change the cutoff from 08:00 to 00:00?
Update #1
I found that the following fixes my problem, though less neatly:
> as.Date(as.character(as.POSIXct("2013-01-01 07:00")))
[1] "2013-01-01"
The problem here is timezones - you can see you're in "HKT". Try:
as.Date(as.POSIXct("2013-01-01 07:00", 'GMT'))
[1] "2013-01-01"
From ?as.Date():
["POSIXct" is] converted to days by ignoring the time after midnight
in the representation of the time in specified timezone, default UTC
Use the time zone parameter of as.Date:
as.Date(as.POSIXct("2013-01-01 07:00",tz="Hongkong"))
#[1] "2012-12-31"
as.Date(as.POSIXct("2013-01-01 07:00",tz="Hongkong"),tz="Hongkong")
#[1] "2013-01-01"
In fact, I recommend always using the tz parameter when using date-time conversion functions. There are other nasty surprises, e.g. with daylight saving time.
As documented and explained above, this happens because as.Date.POSIXct converts using UTC by default, and the UTC time corresponding to your POSIXct value can fall before midnight (your third example, 07:00 HKT) or after it. To see the math for yourself, inspect as.Date.POSIXct at the console. The math under the default tz="UTC" is clear. In the non-default case, R essentially calls as.Date.POSIXlt, and the date-travel does not occur. In fact, if you had started with the lt object you would not have had this problem:
> as.Date(as.POSIXlt("2013-01-01 07:00", tz = "Hongkong"))
[1] "2013-01-01"
The easiest workaround is to call as.Date with tz="" (the current time zone), which forces the less surprising as.Date.POSIXlt algorithm:
> as.Date(as.POSIXct("2013-01-01 07:00"), tz = "")
[1] "2013-01-01"

Error while converting to Date format in R

It should be an easy issue, but I got stuck on it. I have a data.frame with dates and values:
class(var_data)
[1] "tbl_df" "tbl" "data.frame"
var_data
# A tibble: 42 x 2
  date                Tourists
  <dttm>                 <dbl>
1 2006-03-01 00:00:00   55280.
2 2006-06-01 00:00:00   84392.
3 2006-09-01 00:00:00  132714.
Then I want to copy some dates and values into another data.frame:
var_list_DB$var_last[ii] <- var_data[last,"Tourists"]
var_list_DB$var_date_start[ii] <- var_data[1,"date"]
var_list_DB$var_date_last[ii] <- var_data[last,"date"]
But instead of dates I get numbers:
var_date_start var_date_last var_val_last
951868800 1496275200 10044.3162
And when I try to convert to date format, I get an error:
as.Date(var_data[last,"date"], format = "%m/%d/%Y")
Error in as.Date.default(x, ...) :
do not know how to convert 'x' to class “Date”
I recently updated to version 3.5.0; maybe that is the issue.
Convert with as.character before passing to the date function, and convert var_data to a plain data.frame first, as in these two examples using as.Date and as.POSIXct:
var_data<-data.frame(var_data)
as.Date(as.character(var_data[,"date"]))
[1] "2006-03-01" "2006-06-01" "2006-09-01"
as.POSIXct(as.character(var_data[,"date"]))
[1] "2006-03-01 CET" "2006-06-01 CEST" "2006-09-01 CEST"
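The numbers in the question look like POSIXct values whose class was dropped on assignment, which leaves seconds since 1970-01-01 UTC. A sketch of turning them back into dates, assuming they should be read in UTC:

```r
# The two values printed in the question.
secs <- c(951868800, 1496275200)

# Assumption: the values are seconds since the epoch and should be read in UTC.
stamps <- as.POSIXct(secs, origin = "1970-01-01", tz = "UTC")
dates  <- as.Date(stamps, tz = "UTC")

print(stamps)
print(dates)
```

The as.Date error itself most likely comes from var_data[last, "date"] being a one-cell tibble rather than a vector; extracting the column first, e.g. var_data[["date"]][last], should give a POSIXct value that as.Date can handle.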

Date conversion without specifying the format

I do not understand how the ymd function from the lubridate package works in R. I am trying to build a feature that converts dates correctly without having to specify the format, by choosing whichever of dmy(), mdy() and ymd() produces the fewest NAs.
However, ymd() sometimes returns NA and sometimes does not for the same date value. Are there other functions or packages in R that would help me get around this problem?
> data$DTTM[1:5]
[1] "4-Sep-06" "27-Oct-06" "8-Jan-07" "28-Jan-07" "5-Jan-07"
> ymd(data$DTTM[1])
[1] NA
Warning message:
All formats failed to parse. No formats found.
> ymd(data$DTTM[2])
[1] "2027-10-06 UTC"
> ymd(data$DTTM[3])
[1] NA
Warning message:
All formats failed to parse. No formats found.
> ymd(data$DTTM[4])
[1] "2028-01-07 UTC"
> ymd(data$DTTM[5])
[1] NA
Warning message:
All formats failed to parse. No formats found.
>
> ymd(data$DTTM[1:5])
[1] "2004-09-06 UTC" "2027-10-06 UTC" "2008-01-07 UTC" "2028-01-07 UTC"
[5] "2005-01-07 UTC"
Thanks
@user1317221_G has already pointed out that your dates are in day-month-year format, which suggests that you should use dmy instead of ymd. Furthermore, because your month is in %b format ("Abbreviated month name in the current locale"; see ?strptime), your problem may have something to do with your locale. The month names you have seem to be English, which may differ from how they are spelled in the locale you are currently using.
Let's see what happens when I try dmy on the dates in my locale:
date_english <- c("4-Sep-06", "27-Oct-06", "8-Jan-07", "28-Jan-07", "5-Jan-07")
dmy(date_english)
# [1] "2006-09-04 UTC" NA "2007-01-08 UTC" "2007-01-28 UTC" "2007-01-05 UTC"
# Warning message:
# 1 failed to parse.
"27-Oct-06" failed to parse. Let's check my time locale:
Sys.getlocale("LC_TIME")
# [1] "Norwegian (Bokmål)_Norway.1252"
dmy does not recognize "oct" as a valid %b month in my locale.
One way to deal with this issue would be to change "oct" to the corresponding Norwegian abbreviation, "okt":
date_nor <- c("4-Sep-06", "27-Okt-06", "8-Jan-07", "28-Jan-07", "5-Jan-07" )
dmy(date_nor)
# [1] "2006-09-04 UTC" "2006-10-27 UTC" "2007-01-08 UTC" "2007-01-28 UTC" "2007-01-05 UTC"
Another possibility is to use the original dates (i.e. in their original 'locale') and set the locale argument in dmy. Exactly how this is done is platform dependent (see ?locales). Here is how I would do it on Windows:
dmy(date_english, locale = "English")
# [1] "2006-09-04 UTC" "2006-10-27 UTC" "2007-01-08 UTC" "2007-01-28 UTC" "2007-01-05 UTC"
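The same locale dependence applies to base R, since %b is read according to LC_TIME. A sketch that switches LC_TIME to "C" for the duration of the parse, on the assumption that the "C" locale's English month abbreviations match the data:

```r
date_english <- c("4-Sep-06", "27-Oct-06", "8-Jan-07", "28-Jan-07", "5-Jan-07")

# Remember the current setting so it can be restored afterwards.
old_locale <- Sys.getlocale("LC_TIME")

# Assumption: the "C" locale (English month abbreviations) matches the data.
invisible(Sys.setlocale("LC_TIME", "C"))
parsed <- as.Date(date_english, format = "%d-%b-%y")
invisible(Sys.setlocale("LC_TIME", old_locale))

print(parsed)
```

Restoring the previous locale keeps the change from leaking into the rest of the session.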
Using the guess_formats function in the lubridate package would be the closest to what you are after.
library(lubridate)
x <- c("4-Sep-06", "27-Oct-06","8-Jan-07" ,"28-Jan-07","5-Jan-2007")
format <- guess_formats(x, c("mdY", "BdY", "Bdy", "bdY", "bdy", "mdy", "dby"))
strptime(x, format)
HTH
From the lubridate documentation for ymd (page 70):
As long as the order of formats is correct, these functions will parse dates correctly even when the input vectors contain differently formatted dates.
ymd() expects year-month-day, you have day-month-year
x <- c("2009-01-01", "2009-01-02", "2009-01-03")
ymd(x)
maybe you need something like
y <- c("4-Sep-06", "27-Oct-06", "8-Jan-07", "28-Jan-07", "5-Jan-07" )
as.POSIXct(y, format = "%d-%b-%y")
PS: I think the reason you get NAs for some values is that the first field, which ymd reads as the year, has only a single digit, and ymd doesn't know what to do with that. It works when the first field has two digits, e.g. "27-Oct-06" and "28-Jan-07", but fails for "5-Jan-07" etc.
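The idea from the question, trying several formats and keeping the one with the fewest NAs, can be sketched in base R. The helper pick_format and the candidate list are hypothetical names chosen for illustration, and LC_TIME is set to "C" on the assumption that the month names are English:

```r
# Assumption: month abbreviations in the data are English.
invisible(Sys.setlocale("LC_TIME", "C"))

# Hypothetical helper: return the candidate format that leaves the fewest NAs.
pick_format <- function(x, formats) {
  na_counts <- vapply(
    formats,
    function(f) sum(is.na(as.Date(x, format = f))),
    integer(1)
  )
  formats[[which.min(na_counts)]]
}

x <- c("4-Sep-06", "27-Oct-06", "8-Jan-07", "28-Jan-07", "5-Jan-07")
candidates <- c("%Y-%m-%d", "%m/%d/%Y", "%d-%b-%y")  # illustrative list only

best   <- pick_format(x, candidates)
parsed <- as.Date(x, format = best)

print(best)
print(parsed)
```

A low NA count is only a heuristic: a wrong format can still parse every value, as ymd() did with the full vector in the question, so the result is worth checking against a few known dates.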
