I am having trouble formatting dates and cannot find a solution. In the code below, the second date is not converted to the format I want.
date1
#[1] "01. Nov 11"
ndate1 <- as.Date(date1, "%d. %B %y")
ndate1
#[1] "2011-11-01"
date2
#[1] "26-May-13"
ndate2 <- as.Date(date2, "%d-%B-%y")
ndate2
#[1] NA
You can determine the complete or abbreviated month names in your locale using the example on the ?Constants page:
format(ISOdate(2000, 1:12, 1), "%b")
Per ?strptime, on input both "%B" and "%b" match either the abbreviated or the complete month name.
This is most probably due to an incompatibility with the locale settings. If the output of Sys.getlocale("LC_TIME") does not correspond to an English setting, like "en_US.UTF-8" or "en_GB.UTF-8", the abbreviation "May" (which, coincidentally, is not even an abbreviation in this case) is not recognized in most (all?) other settings. In contrast, "Nov" is a valid abbreviation for the month of November in several languages. This might explain why the first case with date1 does not cause trouble.
We could try this:
Sys.setlocale("LC_TIME", "en_US.UTF-8")
date2 <- "26-May-13"
ndate2 <- as.Date(date2, "%d-%b-%y")
ndate2
#[1] "2013-05-26"
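If changing the locale for the whole session is undesirable, the switch can be confined to a single call. The following is a minimal sketch, not part of the original answer: parse_english_date is a hypothetical helper name, and it assumes that the "C" locale (which uses English month names) is acceptable for parsing.

```r
# Hypothetical helper: parse dates with English month names regardless of
# the session locale, then restore the previous LC_TIME setting on exit.
parse_english_date <- function(x, format) {
  old <- Sys.getlocale("LC_TIME")
  on.exit(Sys.setlocale("LC_TIME", old), add = TRUE)
  Sys.setlocale("LC_TIME", "C")  # "C" uses English month names
  as.Date(x, format = format)
}

ndate1 <- parse_english_date("01. Nov 11", "%d. %b %y")
ndate2 <- parse_english_date("26-May-13", "%d-%b-%y")
format(c(ndate1, ndate2))
# [1] "2011-11-01" "2013-05-26"
```

Because the old setting is restored by on.exit(), code that runs afterwards still sees the user's own locale.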
I was looking to convert the following character string in a date format:
string <- '22APR2020'
date <- as.Date(string, format='%d%b%y')
date
>>> NA
Unfortunately, the result is NA, and I need to convert the string to a date in order to calculate the time between two dates.
Thank you
You could use the anytime package:
library(anytime)
string <- '22APR2020'
anytime::anydate(string)
[1] "2020-04-22"
When in doubt always use lubridate:
string <- '22APR2020'
library(lubridate)
dmy(string)
[1] "2020-04-22"
Here dmy reflects the order in which the day, month and year appear in the string.
The problem is your locale setting. It probably is set to a language where the fourth month is not abbreviated as "APR".
Sys.setlocale("LC_TIME", "French")
string <- '22APR2020'
as.Date(string, format='%d%b%Y')
#[1] NA
Sys.setlocale("LC_TIME", "German")
as.Date(string, format='%d%b%Y')
#[1] "2020-04-22"
Also, note that a capital Y is used in the format string. This is important: %y matches only a two-digit year (without the century), so here it would read just the first two digits, "20". That gives the same result, by chance, for 2020, but the wrong result for 2021.
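The difference between the two year specifiers can be checked directly. This is a small sketch that assumes the "C" locale is set first so that "APR" is recognised; the variable names are illustrative only.

```r
# Assumption: use the "C" locale so English month abbreviations are matched.
invisible(Sys.setlocale("LC_TIME", "C"))

string <- "22APR2021"
with_Y <- as.Date(string, format = "%d%b%Y")  # reads all four year digits
with_y <- as.Date(string, format = "%d%b%y")  # reads only "20"; the rest is ignored

format(with_Y)
# [1] "2021-04-22"
format(with_y)
# [1] "2020-04-22"
```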
I have dates in this format:
Apr-12,
Dec-12,
30-Jul-14,
Mar-16,
29-Feb-16,
May-17,
20-Nov-14,
R is treating it as a factor variable. I want R to treat it as a date, and wherever the day is missing, it should be set to the 1st.
Thank you in advance!
I think we need to parse them separately because the format is not consistent. We first parse the ones that have day, month and year components. The ones that return NA are then parsed again after prepending "01-" to them.
new_x <- as.Date(x, "%d-%b-%y")
new_x[is.na(new_x)] <- as.Date(paste0("01-", x[is.na(new_x)]), "%d-%b-%y")
new_x
#[1] "2012-04-01" "2012-12-01" "2014-07-30" "2016-03-01" "2016-02-29" "2017-05-01"
#[7] "2014-11-20"
Read more about formats at ?strptime.
data
x <-factor(c("Apr-12", "Dec-12", "30-Jul-14", "Mar-16", "29-Feb-16",
"May-17","20-Nov-14"))
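Conditionally prepend "01-" when the first three characters are in the built-in vector month.abb (dtvec is assumed to be a character vector of the dates, e.g. as.character(x), because ifelse() does not preserve factor labels):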
as.Date( ifelse( substr(dtvec,1,3) %in% month.abb, paste0("01-",dtvec), dtvec) ,"%d-%b-%y")
[1] "2012-04-01" "2012-12-01" "2014-07-30" "2016-03-01" "2016-02-29" "2017-05-01" "2014-11-20"
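A variant of the same idea tests for a leading day number instead of a month name. This is a sketch under two assumptions: the month abbreviations are English (hence the "C" locale), and the names chr, has_day and parsed are illustrative, not from the answers above.

```r
# Assumption: English month abbreviations, so parse in the "C" locale.
invisible(Sys.setlocale("LC_TIME", "C"))

x <- factor(c("Apr-12", "Dec-12", "30-Jul-14", "Mar-16", "29-Feb-16",
              "May-17", "20-Nov-14"))

chr <- as.character(x)                    # work on labels, not factor codes
has_day <- grepl("^[0-9]{1,2}-", chr)     # TRUE when the string starts with a day
chr[!has_day] <- paste0("01-", chr[!has_day])

parsed <- as.Date(chr, format = "%d-%b-%y")
format(parsed)
# [1] "2012-04-01" "2012-12-01" "2014-07-30" "2016-03-01" "2016-02-29"
# [6] "2017-05-01" "2014-11-20"
```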
I have dates encoded in a weekly time format (European convention >> 01 through 52/53, e.g. "2016-48") and would like to standardize them to a POSIX date:
require(magrittr)
(x <- as.POSIXct("2016-12-01") %>% format("%Y-%V"))
# [1] "2016-48"
as.POSIXct(x, format = "%Y-%V")
# [1] "2016-01-11 CET"
I expected the last statement to return "2016-12-01" again. What am I missing here?
Edit
Thanks to Dirk, I was able to piece it together:
y <- sprintf("%s-1", x)
While I still don't get why this doesn't work
(as.POSIXct(y, format = "%Y-%V-%u"))
# [1] "2016-01-11 CET"
this does
(as.POSIXct(y, format = "%Y-%U-%u"))
# [1] "2016-11-28 CET"
Edit 2
Oh my, I think using %V is a very bad idea in general:
as.POSIXct("2016-01-01") %>% format("%Y-%V")
# [1] "2016-53"
Should this be considered to be on a "serious bug" level that requires further action?!
Sticking to either %U or %W seems to be the right way to go
as.POSIXct("2016-01-01") %>% format("%Y-%U")
# [1] "2016-00"
Edit 3
Nope, not quite finished/still puzzled: the approach doesn't work for the very first week
(x <- as.POSIXct("2016-01-01") %>% format("%Y-%W"))
# [1] "2016-00"
as.POSIXct(sprintf("%s-1", x), format = "%Y-%W-%u")
# [1] NA
It does for week 01 as defined in the underlying convention when using %U or %W (so "week 2", actually)
as.POSIXct("2016-01-1", format = "%Y-%W-%u")
# [1] "2016-01-04 CET"
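For reference, the "2016-53" result is consistent with ISO 8601, in which week numbers belong to a week-based year that can differ from the calendar year. The sketch below pairs %V with %G (the week-based year, documented in ?strptime for output) instead of %Y; the variable name iso_labels is illustrative.

```r
# %G is the ISO 8601 week-based year that goes with the ISO week number %V.
dates <- as.Date(c("2016-12-01", "2016-01-01"))
iso_labels <- format(dates, "%G-%V")
iso_labels
# [1] "2016-48" "2015-53"
```

January 1st, 2016 was a Friday, so it falls in the last ISO week of 2015 rather than in a week of 2016.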
As I have to deal a lot with reporting by ISO weeks, I created the ISOweek package some years ago.
The package includes the function ISOweek2date() which returns the date of a given weekdate (year, week of the year, day of week according to ISO 8601). It's the inverse function to date2ISOweek().
With ISOweek, your examples become:
library(ISOweek)
# define dates to convert
dates <- as.Date(c("2016-12-01", "2016-01-01"))
# convert to full ISO 8601 week-based date yyyy-Www-d
(week_dates <- date2ISOweek(dates))
[1] "2016-W48-4" "2015-W53-5"
# convert back to class Date
ISOweek2date(week_dates)
[1] "2016-12-01" "2016-01-01"
Note that ISOweek2date() requires a full ISO week-based date in the format yyyy-Www-d including the day of the week (1 to 7, Monday to Sunday).
So, if you only have year and ISO week number you have to create a character string with a day of the week specified.
A typical phrase in many reports is, e.g., "reporting week 31 ending 2017-08-06":
yr <- 2017
wk <- 31
ISOweek2date(sprintf("%4i-W%02i-%1i", yr, wk, 7))
[1] "2017-08-06"
Addendum
Please, see this answer for another use case and more background information on the ISOweek package.
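Where installing a package is not an option, the same conversion can be approximated in base R. The sketch below is not the ISOweek implementation; it relies on the ISO 8601 rule that week 1 is the week containing January 4th, and iso_week_start is a hypothetical helper name.

```r
# Hypothetical helper: Monday of a given ISO 8601 week.
# ISO week 1 is the week (Monday to Sunday) that contains January 4th.
iso_week_start <- function(year, week) {
  jan4 <- as.Date(sprintf("%d-01-04", year))
  wday <- as.integer(format(jan4, "%u"))   # 1 = Monday ... 7 = Sunday
  week1_monday <- jan4 - (wday - 1)
  week1_monday + 7 * (week - 1)
}

format(iso_week_start(2016, 48))       # Monday of 2016-W48
# [1] "2016-11-28"
format(iso_week_start(2017, 31) + 6)   # Sunday ending 2017-W31
# [1] "2017-08-06"
```

The helper does not check that the requested week exists in the given year, so a week number such as 53 should be validated separately.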
Thanks to comments below, I realized I should use "%b" for "FEB" (originally I used "%m"; thanks for the reference to ?strptime). But my problem still stands.
When I do
as.Date("13-FEB-15", "%d-%b-%y")
# [1] NA
I know this will work:
as.Date("13-02-2015", "%d-%m-%Y")
# [1] "2015-02-13"
But is there a way to avoid converting FEB to 02 and 15 to 2015 in order to get my expected result? Thanks!
A general and useful diagnostic
Try this. What do you get?
format(strptime(Sys.Date(), format="%Y-%m-%d"), "%y-%b-%d")
I got
[1] "16- 7月-22"
Haha, the middle part is Chinese. So what is going wrong? Nothing. The issue is that %b is sensitive to your current locale. When you read ?strptime, pay special attention to which formats are sensitive to the current locale.
My locale is:
Sys.getlocale("LC_TIME")
#[1] "zh_CN.UTF-8"
Yep, that is the China region.
Locales make a difference in Date-Time format. On my machine:
as.Date("16-JUL-22", "%y-%b-%d")
# NA
as.Date("16- 7月-22", "%y-%b-%d")
#[1] "2016-07-22"
Now let's reset time locale:
Sys.setlocale("LC_TIME", "C")
as.Date("16-JUL-22", "%y-%b-%d")
#[1] "2016-07-22"
Wow, it works! Read ?locales for more, and you will understand what locale = "C" means.
Solution for you
Sys.setlocale("LC_TIME", "C")
as.Date("13-FEB-15", format = "%d-%b-%y")
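If the locale should not be touched at all, one alternative is to look the month up in the built-in constant month.abb, which always holds the English abbreviations, and parse a purely numeric string. This is a sketch only: parse_dmy_abbr is a hypothetical helper, and it assumes input of the form dd-MON-yy separated by hyphens.

```r
# Hypothetical helper: locale-independent parsing of "dd-MON-yy" strings.
parse_dmy_abbr <- function(x) {
  parts <- strsplit(x, "-", fixed = TRUE)
  day <- vapply(parts, `[`, character(1), 1)
  mon <- vapply(parts, `[`, character(1), 2)
  yr  <- vapply(parts, `[`, character(1), 3)
  # month.abb is always English; compare case-insensitively.
  mon_num <- match(toupper(mon), toupper(month.abb))
  # Only numeric fields remain, so no locale-sensitive specifier is needed.
  as.Date(sprintf("%s-%02d-%s", day, mon_num, yr), format = "%d-%m-%y")
}

res <- parse_dmy_abbr(c("13-FEB-15", "26-May-13"))
format(res)
# [1] "2015-02-13" "2013-05-26"
```

An unrecognised month abbreviation yields NA for that element rather than an error.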
Using lubridate:
library(lubridate)
date1 = "2014-12-11 00:00:00"
date2 = "14-DEC-11"
ymd_hms(date1) == ymd(date2,tz = "UTC")
These equal each other and should be able to be joined.
Apologies for the simple question, but I can't find help for this type of date.
April 5th, 2012 is saved as numeric as "20120405"
How can I convert a vector of such values into usable dates?
You just need the as.Date function:
R> x = "20120405"
R> as.Date(x, "%Y%m%d")
[1] "2012-04-05"
Look at the help file: ?as.Date, but essentially
%Y means year in the form 2012, use %y for 12.
%m is the month.
%d the day.
If your date had separators, say, 2012-04-05, then use something like: %Y-%m-%d. Alternatively, you can use:
R> strptime(x, "%Y%m%d")
[1] "2012-04-05"
In particular, you can pass vectors of dates to these functions, so:
R> y = c("20120405", "20121212")
R> as.Date(y, "%Y%m%d")
[1] "2012-04-05" "2012-12-12"
Like this:
(foo <- as.Date("20120405", "%Y%m%d"))
# "2012-04-05"
and maybe you want to format to get the month printed out
format(foo, "%Y %b %d")
# "2012 Apr 05"
You could take a look at this page
With strptime you can convert it to POSIXlt class and with as.Date you can convert it to a Date class using format "%Y%m%d":
strptime( "20120405",format="%Y%m%d")
[1] "2012-04-05"
as.Date( "20120405",format="%Y%m%d")
[1] "2012-04-05"
Edit:
It is not really clear whether you have the character string "20120405" or the numeric value 20120405. In the latter case you have to convert to character first with as.character(20120405).
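For the numeric case, the conversion can be sketched as follows. The variable names are illustrative; the second call shows that a value that is not a valid calendar date becomes NA rather than raising an error.

```r
# Numeric yyyymmdd values: convert to character before parsing.
x_num <- c(20120405, 20121212)
dates <- as.Date(as.character(x_num), format = "%Y%m%d")
format(dates)
# [1] "2012-04-05" "2012-12-12"

# An impossible date (month 13) is returned as NA.
bad <- as.Date("20121340", format = "%Y%m%d")
is.na(bad)
# [1] TRUE
```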
You could also use the lubridate package:
library(lubridate)
ymd("20120405")