I am trying to convert the following character string into a date:
string <- '22APR2020'
date <- as.Date(string, format='%d%b%y')
date
>>> NA
Unfortunately, the result is NA, and I need to convert it to a date in order to be able to calculate the time between two dates.
Thank you
You could use the anytime package:
library(anytime)
string <- '22APR2020'
anytime::anydate(string)
[1] "2020-04-22"
When in doubt always use lubridate:
string <- '22APR2020'
library(lubridate)
dmy(string)
[1] "2020-04-22"
Here dmy gives the order in which the day, month and year appear in the string.
The problem is your locale setting. It probably is set to a language where the fourth month is not abbreviated as "APR".
Sys.setlocale("LC_TIME", "French")
string <- '22APR2020'
as.Date(string, format='%d%b%Y')
#[1] NA
Sys.setlocale("LC_TIME", "German")
as.Date(string, format='%d%b%Y')
#[1] "2020-04-22"
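Also, note that a capital Y is used in the format string. It's important: %y reads only a two-digit year (year without century), whereas %Y reads the full four-digit year. With %y only the first two digits of "2020" are read, which by chance gives the same result for 2020, but it would give the wrong result (2020 again) for 2021.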
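If changing the locale is not an option, one alternative is to look the month up yourself instead of relying on %b. This is only a sketch in base R: parse_dmy_abb is a made-up helper name, and it assumes every string is exactly two digits for the day, a three-letter English month abbreviation and a four-digit year.

```r
# Hypothetical helper: parse strings like "22APR2020" without relying on LC_TIME.
parse_dmy_abb <- function(x) {
  day  <- as.integer(substr(x, 1, 2))
  # month.abb is R's built-in vector of English abbreviations ("Jan", "Feb", ...)
  mon  <- match(tolower(substr(x, 3, 5)), tolower(month.abb))
  year <- as.integer(substr(x, 6, 9))
  # Unknown month abbreviations end up as NA instead of raising an error
  as.Date(sprintf("%04d-%02d-%02d", year, mon, day), format = "%Y-%m-%d")
}

parse_dmy_abb(c("22APR2020", "01may2021"))
# [1] "2020-04-22" "2021-05-01"
```

Because the month is matched against month.abb and the rebuilt string is purely numeric, the result should not depend on the LC_TIME setting.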
Related
So I have this long dataset, where one column holds a date stored as character in the format YYYYMMM, with the month abbreviated. So for example 1995MAR, 1995APR and so on. How can I transform that to date format?
I tried as.Date, but it obviously hasn't worked, and lubridate::ymd hasn't worked either.
Using parse_date_time from lubridate
date <- "1995MAR"
library(lubridate)
parse_date_time(date, orders = "Yb")
Output:
[1] "1995-03-01 UTC"
Alternatively using zoo
library(zoo)
as.Date(as.yearmon(date, '%Y%b'))
Output:
[1] "1995-03-01"
str(as.Date(as.yearmon(date, '%Y%b')))
Date[1:1], format: "1995-03-01"
In Base R, add a day number to parse:
date <- "1995MAR"
as.Date(paste(date, "01"), format = "%Y%b %d")
#[1] "1995-03-01"
I have a vector of date strings in the form month_name-2_digit_year i.e.
a = rbind("April-21", "March-21", "February-21", "January-21")
I'm trying to convert that vector into a vector of date objects. I'm aware this question is very similar to this: Convert non-standard date format to date in R posted some years ago, but unfortunately, it has not answered my question.
I have tried the following as.Date() calls to do this, but it just returns a vector of NA. I.e.
b = as.Date(a, format = "%B-%y")
b = as.Date(a, format = "%B%y")
b = as.Date(a, "%B-%y")
b = as.Date(a, "%B%y")
I also attempted to do it using the convertToDate function from the openxlsx package:
b = convertToDate(a, format = "%B-%y")
I have also tried all the above but using a single character string rather than a vector, but that produced the same issue.
I'm a little lost as to why this isn't working, as this format has worked in reverse earlier in my script (that is, I had a date object already in dd-mm-yyyy format and converted it to month_name-yy using %B-%y). Is there another way to go from string to date when the string is in a non-standard date format (anything other than dd-mm-yyyy, or mm-dd-yy if you're in the US)?
For the record, my R locales are all UK and English.
Thanks in advance.
A Date must have all three of day, month and year. Convert to yearmon class which requires only month and year and then to Date as in (1) and (2) below or add the day as in (3).
(1) and (3) give first of month and (2) gives the end of the month.
(3) uses only functions from base R.
Also consider not converting to Date at all but just use yearmon objects instead since they directly represent a year and month which is what the input represents.
library(zoo)
# test input
a <- c("April-21", "March-21", "February-21", "January-21")
# 1
as.Date(as.yearmon(a, "%B-%y"))
## [1] "2021-04-01" "2021-03-01" "2021-02-01" "2021-01-01"
# 2
as.Date(as.yearmon(a, "%B-%y"), frac = 1)
## [1] "2021-04-30" "2021-03-31" "2021-02-28" "2021-01-31"
# 3
as.Date(paste(1, a), "%d %B-%y")
## [1] "2021-04-01" "2021-03-01" "2021-02-01" "2021-01-01"
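For what it's worth, the same first-of-month and end-of-month results can also be sketched in base R without depending on the locale's month names. This is an illustration rather than part of the answer above: it matches against R's built-in month.name vector and assumes that every two-digit year belongs to the 2000s.

```r
# Base R sketch: month-name strings such as "April-21" to first/last day of month.
a <- c("April-21", "March-21", "February-21", "January-21")

parts <- strsplit(a, "-", fixed = TRUE)
# month.name is R's built-in vector of English month names
mon <- match(vapply(parts, `[`, "", 1), month.name)
# Assumption: every two-digit year belongs to the 2000s
yr <- 2000L + as.integer(vapply(parts, `[`, "", 2))

first_of_month <- as.Date(sprintf("%04d-%02d-01", yr, mon), format = "%Y-%m-%d")

# Last day = first day of the following month minus one day
# (December rolls over into January of the next year)
next_first <- as.Date(
  sprintf("%04d-%02d-01", yr + (mon == 12L), mon %% 12L + 1L),
  format = "%Y-%m-%d"
)
end_of_month <- next_first - 1

first_of_month
# [1] "2021-04-01" "2021-03-01" "2021-02-01" "2021-01-01"
end_of_month
# [1] "2021-04-30" "2021-03-31" "2021-02-28" "2021-01-31"
```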
In addition to zoo, which @G. Grothendieck mentioned, you can also use clock or lubridate.
clock supports a variable precision calendar type called year_month_day. In this case you'd want "month" precision, then you can set the day to whatever you'd like and convert back to Date.
library(clock)
x <- c("April-21", "March-21", "February-21", "January-21")
ymd <- year_month_day_parse(x, format = "%B-%y", precision = "month")
ymd
#> <year_month_day<month>[4]>
#> [1] "2021-04" "2021-03" "2021-02" "2021-01"
# First of month
as.Date(set_day(ymd, 1))
#> [1] "2021-04-01" "2021-03-01" "2021-02-01" "2021-01-01"
# End of month
as.Date(set_day(ymd, "last"))
#> [1] "2021-04-30" "2021-03-31" "2021-02-28" "2021-01-31"
The simplest solution may be to use lubridate::my(), which parses strings in the order of "month then year". That assumes that you want the first day of the month, which may or may not be correct for you.
library(lubridate)
x <- c("April-21", "March-21", "February-21", "January-21")
# Assumes first of month
my(x)
#> [1] "2021-04-01" "2021-03-01" "2021-02-01" "2021-01-01"
Dates formatted as 4/29/2016 are parsed correctly, but dates formatted as 6242016 and 2042016 are not parsed.
Does R think that some of the dates without the slash have day first instead of month?
I've tried including dmy in lubridate and it still doesn't work.
I've tried looking at Sys.getlocale("LC_TIME") and it gives me "English_United States.1252".
library(dplyr)
library(lubridate)
demo$date <- as.character(demo$date)
demo <- demo %>%
  mutate(date = parse_date_time(date, "mdy"))
You can wrangle your dates all into the same format using stringr. Then convert to numeric and use lubridate to parse.
library(stringr)
library(lubridate)
dates <- c("6242016", "2042016", "4/29/2016")
dates <- str_remove_all(dates, "/")
dates <- as.numeric(dates)
lubridate::mdy(dates)
# [1] "2016-06-24" "2016-02-04" "2016-04-29"
as.Date(sprintf("%08d",
                as.numeric(gsub("/", "", c("6242016", "2042016", "4/29/2016")))),
        format = "%m%d%Y")
# [1] "2016-06-24" "2016-02-04" "2016-04-29"
This
as.Date("2042016", "%m%d%Y")
returns NA as opposed to
as.Date("02042016", "%m%d%Y")
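This is because, without separators, the month has to be represented by two digits (01-12): %m reads the first two characters, so "2042016" is read as month 20, which is invalid.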
Try adding a leading zero for months in the range [1,9].
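A rough sketch of that idea in base R, handling the slash-separated and digit-only strings separately. parse_mixed_mdy is a made-up name, and the padding step assumes that only the month can be missing its leading zero (so a string like "242016" would still be ambiguous).

```r
# Hypothetical helper: parse month-day-year strings that may or may not use "/".
parse_mixed_mdy <- function(x) {
  out <- rep(as.Date(NA), length(x))
  has_sep <- grepl("/", x, fixed = TRUE)

  # With separators, single-digit months and days parse fine
  out[has_sep] <- as.Date(x[has_sep], format = "%m/%d/%Y")

  # Without separators, left-pad to 8 characters so %m sees two digits.
  # Assumption: only the month can be missing its leading zero.
  digits <- x[!has_sep]
  padded <- paste0(strrep("0", pmax(0, 8 - nchar(digits))), digits)
  out[!has_sep] <- as.Date(padded, format = "%m%d%Y")

  out
}

parse_mixed_mdy(c("6242016", "2042016", "4/29/2016"))
# [1] "2016-06-24" "2016-02-04" "2016-04-29"
```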
If I convert the date 10.10.61 (DD.MM.YY) with as.Date(date, format="%d.%m.%y") for some reason it converts it into 2061-10-10.
Is there an elegant way to correct for this or do I have to do it manually by slicing the string and adding "19" in front?
I've also tried the zoo package which brings up the same (wrong) result.
x = format(as.Date("10.10.61", "%d.%m.%y"), "19%y-%m-%d")
x = as.Date(x)
x
class(x)
Note that a single sub will slice the string and prepend the year with 19 so it is not so onerous:
as.Date(sub("(..)$", "19\\1", date), "%d.%m.%Y")
## [1] "1961-10-10"
chron: Alternatively, the chron package defaults to a cutoff of 30 so it will use 1961 by default:
library(chron)
as.Date(dates(date, format = "d.m.y"))
## [1] "1961-10-10"
In chron the year expansion rule is defined by the "chron.year.expand" option which by default is set to the year.expand function and that function's default cut.off is 30. See this SO post for more info:
Add correct century to dates with year provided as "Year without century", %y
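For background, ?strptime documents that on input %y maps 00 to 68 to 20xx and 69 to 99 to 19xx, which is why 61 comes out as 2061. If you would rather pick the cutoff yourself in base R, something like the sketch below might do. expand_two_digit_year is a made-up name, the cutoff of 30 just borrows the chron default mentioned above, and the rule "below the cutoff means 20xx, otherwise 19xx" is my assumption about how you would want it applied. It expects strings that are exactly dd.mm.yy.

```r
# Hypothetical helper: choose the century of a dd.mm.yy string via a cutoff.
expand_two_digit_year <- function(x, cutoff = 30L) {
  yy <- as.integer(substr(x, 7, 8))
  # Assumption: two-digit years below the cutoff are 20xx, all others 19xx
  century <- ifelse(yy < cutoff, "20", "19")
  as.Date(paste0(substr(x, 1, 6), century, substr(x, 7, 8)), format = "%d.%m.%Y")
}

expand_two_digit_year(c("10.10.61", "01.02.15"))
# [1] "1961-10-10" "2015-02-01"
```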
If all your dates are in the 1900s, then you can use this solution:
library(magrittr)
your_date = '10.10.61'
as.Date(your_date,format="%d.%m.%y") %>% format("19%y%m%d") %>% as.Date("%Y%m%d")
I can't figure out how to turn Sys.Date() into a number in the format YYYYDDD, where DDD is the day of the year, i.e. Jan 1 would be 2016001 and Dec 31 would be 2016366 (2016 is a leap year).
Date <- Sys.Date() ## The Variable Date is created as 2016-01-01
SomeFunction(Date) ## Returns 2016001
You can just use the format function as follows:
format(Date, '%Y%j')
which gives:
[1] "2016161" "2016162" "2016163"
If you want to format it in other ways, see ?strptime for all the possible options.
Alternatively, you could use the year and yday functions from the data.table or lubridate packages. Note that yday is not zero-padded, so pad it to three digits with sprintf rather than simply pasting the two pieces together with paste0:
library(data.table) # or: library(lubridate)
sprintf("%d%03d", year(Date), yday(Date))
which will give you the same result.
The values that are returned by both options are of class character. Wrap the above solutions in as.numeric() to get real numbers.
Used data:
> Date <- Sys.Date() + 1:3
> Date
[1] "2016-06-09" "2016-06-10" "2016-06-11"
> class(Date)
[1] "Date"
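To make the zero-padding and the numeric conversion concrete, here is a small base R sketch wrapping the format approach. to_year_doy is just an illustrative name, and the fixed dates are chosen to show the padding (2016 is a leap year, so Dec 31 is day 366).

```r
# Hypothetical helper: Date -> number of the form YYYYDDD (DDD = day of year).
to_year_doy <- function(d) {
  # %j is the zero-padded day of the year (001-366)
  as.numeric(format(d, "%Y%j"))
}

to_year_doy(as.Date(c("2016-01-01", "2016-06-09", "2016-12-31")))
# [1] 2016001 2016161 2016366
```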
Here's one option with lubridate:
library(lubridate)
x <- Sys.Date()
#[1] "2016-06-08"
sprintf("%d%03d", year(x), yday(x)) # pads the day of year to three digits
#[1] "2016160"
This should work for creating a new column with the specified date format (%j is the zero-padded day of the year, whereas %d would give the day of the month):
Date <- Sys.Date()
df$Month_Yr <- format(as.Date(df$Date), "%Y%j")
But, especially when working with larger data sets, it is easier to do the following:
library(data.table)
setDT(df)[, NewDate := format(as.Date(Date), "%Y%j")]
Hope this helps. You may have to tinker if you only want one value and are not working with a data set.