Turning an actual date into an integer value in R

Using R, I want to be able to turn a date in format "2014-12-31" to an integer of 20141231, for the purpose of creating a serial number.
My aim is simply to be able to complete the REVERSE of this user question (Turning an integer string date into an actual date).
Appreciate any help provided.

Simply format the Date in that way, then call as.integer on the result.
R> as.integer(format(Sys.Date(), "%Y%m%d"))
[1] 20151008
If your date isn't a Date, and is a character string: either convert it to a Date first, or remove the "-" before calling as.integer.
R> dateString <- "2014-12-31"
R> as.integer(format(as.Date(dateString), "%Y%m%d"))
[1] 20141231
R> as.integer(gsub("-", "", dateString))
[1] 20141231

Or if you already have character dates:
a <- "2015-03-15"
as.integer(gsub(pattern = "-", replacement="", x = a))
#[1] 20150315
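For the serial-number use case, the two directions of the conversion can be wrapped into a pair of small helpers. This is only a minimal sketch: the names date_to_int and int_to_date are made up for illustration and do not come from the answers above.

```r
# Hypothetical helper names, chosen for this example only.
# Date (or "YYYY-MM-DD" string) -> integer such as 20141231
date_to_int <- function(d) as.integer(format(as.Date(d), "%Y%m%d"))
# integer such as 20141231 -> Date
int_to_date <- function(i) as.Date(as.character(i), format = "%Y%m%d")

dates <- c("2014-12-31", "2015-03-15")
serials <- date_to_int(dates)     # works on whole vectors
serials
restored <- int_to_date(serials)  # round trip back to Date
restored
```

Both helpers are vectorized because format(), as.integer() and as.Date() are, so no loop is needed for a column of dates.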

Related

What does calling as.numeric() do to a lubridate Date object?

I am working with an external package that's converting columns of a dataframe with the lubridate date type Date into numeric type. (Confirmed by running as.numeric() on the columns).
I'm wondering if there's a way to convert it back?
For example, if I have the date "01-01-2021" then running as.numeric on it returns -719143. How can I turn that back into "01-01-2021"?
Note that Date class is part of base R, not lubridate.
You probably assumed that the data was year/month/day by mistake. Using base R to eliminate lubridate as a problem we can replicate the question's result like this:
as.numeric(as.Date("01-01-2021", "%Y-%m-%d"))
## [1] -719143
Had we used day/month/year we would have gotten:
as.numeric(as.Date("01-01-2021", "%d-%m-%Y"))
## [1] 18628
or using lubridate
library(lubridate)
as.numeric(dmy("01-01-2021"))
## [1] 18628
It would be best to fix the mistake that resulted in -719143, but if you don't control that and are faced with an input of -719143, you can reverse the jumbled parse as shown below. Note that the faulty parse consumed only the first two digits of the year ("20", read as the day) and discarded the rest, so the final two digits of the year cannot be recovered from the number alone: the result is 2020-01-01, not 2021-01-01.
# input x is numeric; result is Date class
fixup <- function(x) as.Date(format(.Date(x), "%y-%m-%d"), "%d-%m-%y")
fixup(-719143)
## [1] "2020-01-01"
Note that we can't tell from the question whether 01-01-2021 is supposed to represent day-month-year or month-day-year, so we assumed the first; if it represents the second, it should be clear at this point how to proceed.
EDIT #2: It looks like the original data is being parsed as Jan 20, year 1, which might happen if the year-month-day columns were jumbled while being parsed:
as.numeric(as.Date("01-01-2021", format = "%Y-%m-%d", origin = "1970-01-01"))
[1] -719143
as.numeric(as.Date("0001-01-20", origin = "1970-01-01"))
[1] -719143
Is there a way to share an example of the raw data as you have it? e.g. dput(MY_DATA[1:10, DATE_COL])
EDIT: -719143 is about 1970 years of days, which can't be a coincidence, given that many date/time formats use 1970 as a baseline. I wonder if 01-01-2021 is being interpreted as the arithmetic expression 1 - 1 - 2021, which equals -2021, so that we're looking at perhaps 2021 seconds/days/[?] before year zero, which would be about 1970 years before the epoch...
-719143/(365)
[1] -1970.255
For instance, we can get something close with:
as.numeric(as.Date("0000-01-01", origin = "1970-01-01"))
[1] -719528
Original answer:
R treats a string describing a date as text:
x <- "01-01-2021"
class(x)
[1] "character"
We can convert it to a Date data type using these two equivalent commands:
base_dt <- as.Date(x, "%m-%d-%Y") # base R version
lubridt <- lubridate::mdy(x) # convenience lubridate function
identical(base_dt, lubridt)
[1] TRUE
Under the hood, a Date object in R is a numeric value with a flag telling R it's a date:
> typeof(lubridt) # What general type of data is it?
[1] "double" # --> numeric, stored as a double
> as.numeric(lubridt)
[1] 18628
> class(lubridt) # Does it have any special class attributes?
[1] "Date" # --> yes, it's a Date
> dput(lubridt) # How would we construct it from scratch?
structure(18628, class = "Date") # --> by giving 18628 a Date attribute
In R, a Date is encoded as the number of days since 1970 began:
> as.Date("1970-01-1") + as.numeric(lubridt)
[1] "2021-01-01"
We could convert it back to the original text using:
format(base_dt, "%m-%d-%Y")
[1] "01-01-2021"
identical(x, format(base_dt, "%m-%d-%Y"))
[1] TRUE
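To address the "convert it back" part of the question directly: as long as the number is a correct day count, as.Date() with an origin restores the Date. A short sketch, assuming the column held correctly parsed dates before the external package converted it to numeric:

```r
d <- as.Date("2021-01-01")
n <- as.numeric(d)                         # days since 1970-01-01
n
back <- as.Date(n, origin = "1970-01-01")  # numeric -> Date again
back
format(back, "%d-%m-%Y")                   # back to the original text layout
```

If the number came from a mis-parsed string, as with -719143 above, this step faithfully returns the wrong date, so the parsing format has to be fixed first.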

Converting non-standard date format strings ("April-20") to date objects R

I have a vector of date strings in the form month_name-2_digit_year i.e.
a = rbind("April-21", "March-21", "February-21", "January-21")
I'm trying to convert that vector into a vector of date objects. I'm aware this question is very similar to this: Convert non-standard date format to date in R posted some years ago, but unfortunately, it has not answered my question.
I have tried the following as.Date() calls to do this, but it just returns a vector of NA. I.e.
b = as.Date(a, format = "%B-%y")
b = as.Date(a, format = "%B%y")
b = as.Date(a, "%B-%y")
b = as.Date(a, "%B%y")
I also attempted to do it using the convertToDate function from the openxlsx package:
b = convertToDate(a, format = "%B-%y")
I have also tried all the above but using a single character string rather than a vector, but that produced the same issue.
I'm a little lost as to why this isn't working, as this format has worked in reverse earlier in my script (that is, I had a date object already in dd-mm-yyyy format and converted it to month_name-yy using %B-%y). Is there another way to go from string to date when the string is a non-standard (anything other than dd-mm-yyyy or mm-dd-yy if you're in the US) date format?
For the record, my R locales are all UK and English.
Thanks in advance.
A Date must have all three of day, month and year. Convert to yearmon class which requires only month and year and then to Date as in (1) and (2) below or add the day as in (3).
(1) and (3) give first of month and (2) gives the end of the month.
(3) uses only functions from base R.
Also consider not converting to Date at all but just use yearmon objects instead since they directly represent a year and month which is what the input represents.
library(zoo)
# test input
a <- c("April-21", "March-21", "February-21", "January-21")
# 1
as.Date(as.yearmon(a, "%B-%y"))
## [1] "2021-04-01" "2021-03-01" "2021-02-01" "2021-01-01"
# 2
as.Date(as.yearmon(a, "%B-%y"), frac = 1)
## [1] "2021-04-30" "2021-03-31" "2021-02-28" "2021-01-31"
# 3
as.Date(paste(1, a), "%d %B-%y")
## [1] "2021-04-01" "2021-03-01" "2021-02-01" "2021-01-01"
In addition to zoo, which @G. Grothendieck mentioned, you can also use clock or lubridate.
clock supports a variable precision calendar type called year_month_day. In this case you'd want "month" precision, then you can set the day to whatever you'd like and convert back to Date.
library(clock)
x <- c("April-21", "March-21", "February-21", "January-21")
ymd <- year_month_day_parse(x, format = "%B-%y", precision = "month")
ymd
#> <year_month_day<month>[4]>
#> [1] "2021-04" "2021-03" "2021-02" "2021-01"
# First of month
as.Date(set_day(ymd, 1))
#> [1] "2021-04-01" "2021-03-01" "2021-02-01" "2021-01-01"
# End of month
as.Date(set_day(ymd, "last"))
#> [1] "2021-04-30" "2021-03-31" "2021-02-28" "2021-01-31"
The simplest solution may be to use lubridate::my(), which parses strings in the order of "month then year". That assumes that you want the first day of the month, which may or may not be correct for you.
library(lubridate)
x <- c("April-21", "March-21", "February-21", "January-21")
# Assumes first of month
my(x)
#> [1] "2021-04-01" "2021-03-01" "2021-02-01" "2021-01-01"
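All of the approaches above rely on %B (or an equivalent) matching the month names, which depends on the session locale. As a hedged alternative for scripts that may run under a non-English locale, the sketch below looks the month up in base R's built-in month.name constant instead. It assumes every two-digit year belongs to the 2000s, which is an assumption of this example and not something stated in the question.

```r
a <- c("April-21", "March-21", "February-21", "January-21")

# Split "April-21" into the month name and the two-digit year
parts <- strsplit(a, "-", fixed = TRUE)
mon_name <- vapply(parts, `[`, character(1), 1)
yy       <- vapply(parts, `[`, character(1), 2)

mon  <- match(mon_name, month.name)  # English month name -> 1..12
year <- 2000L + as.integer(yy)       # ASSUMPTION: all years are 20xx

# Build ISO strings for the first of each month and parse them
first_of_month <- as.Date(sprintf("%04d-%02d-01", year, mon))
first_of_month
```

A misspelled month name yields NA from match(), so it is worth checking anyNA(mon) before building the dates.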

Return date automatically in R [duplicate]

I have a date in R, e.g.:
dt = as.Date('2010/03/17')
I would like to subtract 2 years from this date, without worrying about leap years and such issues, getting as.Date('2008-03-17').
How would I do that?
With lubridate
library(lubridate)
ymd("2010/03/17") - years(2)
The easiest thing to do is to convert it into POSIXlt and subtract 2 from the years slot.
> d <- as.POSIXlt(as.Date('2010/03/17'))
> d$year <- d$year-2
> as.Date(d)
[1] "2008-03-17"
See this related question: How to subtract days in R?.
You could use seq:
R> dt = as.Date('2010/03/17')
R> seq(dt, length=2, by="-2 years")[2]
[1] "2008-03-17"
If leap days are to be taken into account then I'd recommend using this lubridate function to subtract months, as other methods will return either March 1st or NA:
> library(lubridate)
> dt %m-% months(12*2)
[1] "2008-03-17"
# Try with leap day
> leapdt <- as.Date('2016/02/29')
> leapdt %m-% months(12*2)
[1] "2014-02-28"
Same answer as the one by rcs, but with the possibility of applying it to a vector (in reply to MichaelChirico; I can't comment because I don't have enough rep):
R> unlist(lapply(c("2015-12-01", "2016-12-01"),
function(x) { return(as.character(seq(as.Date(x), length=2, by="-1 years")[2])) }))
[1] "2014-12-01" "2015-12-01"
This way seems to do the job as well
dt = as.Date("2010/03/17")
dt-365*2
[1] "2008-03-17"
as.Date("2008/02/29")-365*2
## [1] "2006-03-01"
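library(stringr)
# Split today's date into year, month and day components
cur_date <- str_split(as.character(Sys.Date()), pattern = "-")
cur_yr <- cur_date[[1]][1]
cur_month <- cur_date[[1]][2]
cur_day <- cur_date[[1]][3]
# Subtract two years, then reassemble as a "YYYY-MM-DD" string
new_year <- as.integer(cur_yr) - 2
new_date <- paste(new_year, cur_month, cur_day, sep="-")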
Using base R, you can do the following without installing any package.
1) Convert your character string to a Date, specifying the input format in the second argument so that R interprets the string correctly.
dt = as.Date('2010/03/17',"%Y/%m/%d")
NOTE: If you now look at your environment tab, you will see dt as a variable with the value "2010-03-17" (year-month-day separated by "-", not by "/").
2) Specify how many years to subtract.
years_substract=2
3) Use paste() combined with format() to keep the month and day and subtract 2 from the year of the original date. format() extracts the part of the date named by its second argument.
dt_substract_2years<-
as.Date(paste(as.numeric(format(dt,"%Y"))-years_substract,format(dt,"%m"),format(dt,"%d"),sep = "-"))
NOTE1: We used paste() to concatenate the date components with "-" as the separator (sep = "-"), since that is the separator in R's default date format.
NOTE2: We also used as.numeric() to convert the year from character to numeric.
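The component-wise approach can be made vectorized and safe for 29 February by clamping the day explicitly. The following is a sketch only; subtract_years is a hypothetical name, and moving 29 February to 28 February in non-leap years is a design choice of this example (it mirrors what the lubridate %m-% answer above does).

```r
# Hypothetical helper: subtract n whole years from a Date vector.
subtract_years <- function(d, n) {
  d <- as.Date(d)
  y   <- as.integer(format(d, "%Y")) - as.integer(n)
  m   <- as.integer(format(d, "%m"))
  day <- as.integer(format(d, "%d"))
  # Gregorian leap-year rule for the target year
  is_leap <- (y %% 4 == 0 & y %% 100 != 0) | (y %% 400 == 0)
  # Clamp Feb 29 to Feb 28 when the target year has no leap day
  day <- ifelse(m == 2L & day == 29L & !is_leap, 28L, day)
  as.Date(sprintf("%04d-%02d-%02d", y, m, day))
}

res_plain <- subtract_years(as.Date("2010-03-17"), 2)
res_leap  <- subtract_years(as.Date(c("2016-02-29", "2012-02-29")), 2)
res_plain
res_leap
```

Here the first leap-day input lands in 2014, which has no 29 February and is clamped, while the second lands in 2010 and is clamped as well; an input whose target year is a leap year keeps its 29th.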

R Date time conversion - lapply and unlist is converting to days after the epoch

I am converting strings for dates that look like 12/8/12, probably in a rather roundabout way, by 1) adding leading zeros (zero padding) to the day and month numbers and then 2) converting to a date object using the as.Date function in a custom function that I wrote.
date_conversion <- function(date_string){
split <- unlist(strsplit(date_string, "/"))
split[1] = str_pad(split[1], 2, pad = "0")
split[2] = str_pad(split[1], 2, pad = "0")
t = as.Date(paste(split, collapse = "/"),format='%m/%d/%y')
return (t)
}
What's very confusing is that I can get back the result that I think I want when I call the function on one input:
> date_conversion(test[1])
[1] "2012-12-12"
> typeof(date_conversion(test[1]))
[1] "double"
Then when I use lapply, it returns this string representation:
> lapply(test, date_conversion)
[[1]]
[1] "2012-12-12"
[[2]]
[1] "2015-12-12"
[[3]]
[1] "2015-09-09"
Then when I call unlist, the dates are changed into days since the epoch:
unlist(lapply(test, date_conversion))
[1] 15686 16781 16687
I guess the underlying representation of the date is days since the epoch (hence typeof returns double), but why would the list value show the dates formatted in a more human-readable form, and then calling unlist make them go back to this days-since-the-epoch form?
Also, is there an elegant way for me to do a conversion like this? Maybe I shouldn't be using lapply?
There is another solution with the lubridate package:
test <- c("12/8/12", "1/1/13", "2/4/13")
dates <- lubridate::mdy(test)
##[1] "2012-12-08" "2013-01-01" "2013-02-04"
class(dates)
##[1] "Date"
str(dates)
##Date[1:3], format: "2012-12-08" "2013-01-01" "2013-02-04"
The lubridate function mdy() converts the strings in your object test into month (m), days (d) and years (y).
If your object test is a list, the same approach that aichao described is possible with lubridate::mdy(unlist(test)).
You do not need your date_conversion function since as.Date will work with the format of your date strings, and it is vectorized. So, if test is a vector of your dates as strings:
test <- c("12/8/12", "1/1/13", "2/4/13")
result <- as.Date(test, format="%m/%d/%y")
print(result)
##[1] "2012-12-08" "2013-01-01" "2013-02-04"
str(result)
##Date[1:3], format: "2012-12-08" "2013-01-01" "2013-02-04"
class(result)
##[1] "Date"
If test is a list, then unlist first (i.e., as.Date(unlist(test), format="%m/%d/%y")).
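On the "why" part of the question: each list element printed as a date because it still carried the Date class, whereas unlist() drops attributes such as the class and returns the bare day counts. A small sketch of that behaviour, together with two ways to keep or restore the class (the variable names are illustrative only):

```r
test <- c("12/8/12", "1/1/13", "2/4/13")
date_list <- lapply(test, as.Date, format = "%m/%d/%y")

flat <- unlist(date_list)  # class attribute is dropped
class(flat)                # plain numbers: days since 1970-01-01

# Option 1: combine with c(), which has a method for Date objects
dates <- do.call(c, date_list)
class(dates)

# Option 2: restore the class from the day counts
restored <- as.Date(flat, origin = "1970-01-01")
restored
```

In practice the vectorized as.Date(test, format = "%m/%d/%y") call above avoids the list entirely, so these workarounds matter mainly when the list comes from somewhere else.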

Convert integer to class Date

I have an integer which I want to convert to class Date. I assume I first need to convert it to a string, but how?
My attempt:
v <- 20081101
date <- as.Date(v, format("%Y%m%d"))
Error in charToDate(x) : character string is not in a standard
unambiguous format
Using paste() works, but is that really the correct way to do the conversion?
date <- as.Date(paste(v), format("%Y%m%d"))
date
[1] "2008-11-01"
class(date)
# [1] "Date"
as.character() would be the general way rather than use paste() for its side effect
> v <- 20081101
> date <- as.Date(as.character(v), format = "%Y%m%d")
> date
[1] "2008-11-01"
(I presume this is a simple example and something like this:
v <- "20081101"
isn't possible?)
Another way to get the same result:
date <- strptime(v,format="%Y%m%d")
You can use ymd from lubridate
lubridate::ymd(v)
#[1] "2008-11-01"
Or anytime::anydate
anytime::anydate(v)
#[1] "2008-11-01"
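As a hedged aside, the integer can also be taken apart arithmetically instead of being converted to a string first, and an impossible value such as month 13 comes back as NA rather than as an error when an explicit format is given. The variable names below are made up for this sketch.

```r
v <- c(20081101L, 20141231L)

# Peel the components off with integer division and modulo
year  <- v %/% 10000L
month <- (v %/% 100L) %% 100L
day   <- v %% 100L
dates <- as.Date(sprintf("%04d-%02d-%02d", year, month, day))
dates

# An invalid date (month 13) parses to NA when a format is supplied
bad <- as.Date(as.character(20081301L), format = "%Y%m%d")
is.na(bad)
```

Checking for NA after the conversion is a cheap way to catch malformed integers in a larger column.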
