In my data set the date values look like '10.01.2012' and I want to convert them to the Date "10/01/2012", but it's not working. I've looked at many examples here, but none have worked for me. Can someone help me, please?
One issue here is that we actually do NOT know what 10.01.2012 is. Is it Jan 10 or Oct 1? Both parse:
R> as.Date("10.01.2012", "%d.%m.%Y")
[1] "2012-01-10"
R> as.Date("10.01.2012", "%m.%d.%Y")
[1] "2012-10-01"
R>
But you do, presumably, so pick either %d.%m.%Y or %m.%d.%Y as needed. But there is a reason we all like ISO 8601 formats ...
Use my old friend gsub() if all you need is the same string with slashes (the result is still character, not a Date):
x = "10.01.2012"
gsub('\\.', '/', x)
[1] "10/01/2012"
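The two answers above can be combined. A Date object prints in ISO form by default, so getting "10/01/2012" from a real Date means parsing first and formatting afterwards. A minimal sketch, assuming the input is day.month.year (swap %d and %m if it is not); the variable names are only illustrative:

```r
x <- "10.01.2012"

# Parse the text into a Date; the format describes the INPUT layout.
d <- as.Date(x, format = "%d.%m.%Y")

# Format the Date back to text; this format describes the OUTPUT layout.
out <- format(d, "%d/%m/%Y")

print(class(d))  # "Date"
print(out)       # "10/01/2012"
```

Note that out is a character string again; keep d around if you need date arithmetic or sorting.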
I am working with an external package that's converting columns of a dataframe with the lubridate date type Date into numeric type. (Confirmed by running as.numeric() on the columns).
I'm wondering if there's a way to convert it back?
For example, if I have the date "01-01-2021" then running as.numeric on it returns -719143. How can I turn that back into "01-01-2021"?
Note that Date class is part of base R, not lubridate.
You probably assumed that the data was year/month/day by mistake. Using base R to eliminate lubridate as a problem we can replicate the question's result like this:
as.numeric(as.Date("01-01-2021", "%Y-%m-%d"))
## [1] -719143
Had we used day/month/year we would have gotten:
as.numeric(as.Date("01-01-2021", "%d-%m-%Y"))
## [1] 18628
or using lubridate
library(lubridate)
as.numeric(dmy("01-01-2021"))
## [1] 18628
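It would be best to fix the mistake that produced -719143, but if you don't control that and are faced with an input of -719143, keep in mind that the faulty parse read the string as year 01, month 01, day 20 and dropped the trailing 21, so what can be rebuilt from the number alone is as.Date("2020-01-01"):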
# input x is numeric; result is Date class
fixup <- function(x) as.Date(format(.Date(x), "%y-%m-%d"), "%d-%m-%y")
fixup(-719143)
## [1] "2020-01-01"
Note that we can't tell from the question whether 01-01-2021 is supposed to represent day-month-year or month-day-year, so we assumed the first; if it is meant to represent the second, it should be obvious at this point how to proceed.
EDIT #2: It looks like the original data is being parsed as Jan 20, year 1, which might happen if the year-month-day columns were jumbled while being parsed:
as.numeric(as.Date("01-01-2021", format = "%Y-%m-%d", origin = "1970-01-01"))
[1] -719143
as.numeric(as.Date("0001-01-20", origin = "1970-01-01"))
[1] -719143
Is there a way to share an example of the raw data as you have it? e.g. dput(MY_DATA[1:10, DATE_COL])
EDIT: -719143 is about 1970 years of days, which can't be a coincidence, given that many date/time formats use 1970 as a baseline. I wonder if 01-01-2021 is being interpreted as the arithmetic expression 1 - 1 - 2021, which equals -2021, so that we're looking at perhaps 2021 seconds/days/[?] before year zero, which would be about 1970 years before the epoch...
-719143/(365)
[1] -1970.255
For instance, we can get something close with:
as.numeric(as.Date("0000-01-01", origin = "1970-01-01"))
[1] -719528
Original answer:
R treats a string describing a date as text:
x <- "01-01-2021"
class(x)
[1] "character"
We can convert it to a Date data type using these two equivalent commands:
base_dt <- as.Date(x, "%m-%d-%Y") # base R version
lubridt <- lubridate::mdy(x) # convenience lubridate function
identical(base_dt, lubridt)
[1] TRUE
Under the hood, a Date object in R is a numeric value with a flag telling R it's a date:
> typeof(lubridt) # What general type of data is it?
[1] "double" # --> numeric, stored as a double
> as.numeric(lubridt)
[1] 18628
> class(lubridt) # Does it have any special class attributes?
[1] "Date" # --> yes, it's a Date
> dput(lubridt) # How would we construct it from scratch?
structure(18628, class = "Date") # --> by giving 18628 a Date attribute
In R, a Date is encoded as the number of days since 1970 began:
> as.Date("1970-01-1") + as.numeric(lubridt)
[1] "2021-01-01"
We could convert it back to the original text using:
format(base_dt, "%m-%d-%Y")
[1] "01-01-2021"
identical(x, format(base_dt, "%m-%d-%Y"))
[1] TRUE
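For the case in the question, where a package has already turned a Date column into plain numbers, the day count can be turned back into a Date by supplying the origin. A small sketch, assuming the numbers really are days since 1970-01-01 (which is what as.numeric() on a Date gives); the data frame and column names are hypothetical:

```r
# Hypothetical data frame whose Date column was flattened to numeric.
df <- data.frame(id = 1:2, when = c(18628, 18629))

# Re-attach the Date class by telling R which day counts as zero.
df$when <- as.Date(df$when, origin = "1970-01-01")

print(df$when)
print(class(df$when))

# Going back to numeric returns the original day counts.
roundtrip <- as.numeric(df$when)
```

This only helps when the numbers came from a correctly parsed Date. A value such as -719143 converts back to a date in year 1, which signals that the parse went wrong earlier.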
Thanks to comments below, I realized I should use "%b" for "FEB" (originally I used "%m"; thanks for the reference to ?strptime). But my problem still stands.
When I do
as.Date("13-FEB-15", "%d-%b-%y")
# [1] NA
I know this will work:
as.Date("13-02-2015", "%d-%m-%Y")
# [1] "2015-02-13"
But is there a way to avoid converting FEB to 02 and 15 to 2015 in order to get my expected result? Thanks!
A general and useful diagnostic
Try this and what do you get?
format(strptime(Sys.Date(), format="%Y-%m-%d"), "%y-%b-%d")
I got
[1] "16- 7月-22"
Haha, the middle one is Chinese. So what is going wrong? Nothing. The issue is that %b is sensitive to your current locale. When you read ?strptime, pay special attention to which format specifiers depend on the current locale.
My locale is:
Sys.getlocale("LC_TIME")
#[1] "zh_CN.UTF-8"
Yep, that is the China region.
Locales make a difference in Date-Time format. On my machine:
as.Date("16-JUL-22", "%y-%b-%d")
# NA
as.Date("16- 7月-22", "%y-%b-%d")
#[1] "2016-07-22"
Now let's reset time locale:
Sys.setlocale("LC_TIME", "C")
as.Date("16-JUL-22", "%y-%b-%d")
#[1] "2016-07-22"
Wow, it works! Read ?locales for more, and you will understand what locale = "C" means.
Solution for you
Sys.setlocale("LC_TIME", "C")
as.Date("13-FEB-15", format = "%d-%b-%y")
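Changing LC_TIME affects the whole session, so it can be worth restoring the previous value afterwards. A minimal sketch of one way to do that; parse_in_c_locale is a hypothetical helper name, not part of base R:

```r
# Hypothetical helper: parse dates with English month names regardless of
# the session locale, then put the original LC_TIME back.
parse_in_c_locale <- function(x, format) {
  old <- Sys.getlocale("LC_TIME")
  # Restore the previous time locale when the function exits,
  # even if as.Date() throws an error.
  on.exit(Sys.setlocale("LC_TIME", old), add = TRUE)
  Sys.setlocale("LC_TIME", "C")
  as.Date(x, format = format)
}

res <- parse_in_c_locale("13-FEB-15", "%d-%b-%y")
print(res)
```

The same save-and-restore pattern appears in the commented lines of the ?as.Date examples.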
Using lubridate:
library(lubridate)
date1 = "2014-12-11 00:00:00"
date2 = "14-DEC-11"
ymd_hms(date1) == ymd(date2,tz = "UTC")
The two values compare equal, so the columns should be joinable once both are parsed.
I would like to convert a YYYYMM-integer to a Date without converting it to character (=is there a mdy-function like in SAS?). I would like to replace this code:
dateint<-201511
datestr<-paste(toString(dateint,length=8),'01')
date<-as.Date(datestr,'%Y%m%d')
print(date)
class(date)
with a working version of this. If possible the resulting class should be a date too:
year<-dateint %% 100
month<-floor(dateint/100)
date2<-ISOdate(year,month,1) # I can't make this work ..
print(date2)
class(date2)
Thanks & kind regards
The package lubridate has a function ymd, which accepts numeric input:
> library(lubridate)
> ymd(20151101)
[1] "2015-11-01 UTC"
You do, however, need to append the day at the end (e.g. 201511 becomes 20151101).
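A base R route that stays numeric is also possible. In the attempt in the question the two arithmetic steps appear to be swapped: dateint %% 100 gives the month and integer division by 100 gives the year. A sketch under that reading, using ISOdate as the asker intended and wrapping it in as.Date so the result has class Date:

```r
dateint <- 201511

year  <- dateint %/% 100   # integer division: 2015
month <- dateint %% 100    # remainder: 11

# ISOdate() returns a POSIXct at 12:00 GMT; as.Date() drops the time part.
date2 <- as.Date(ISOdate(year, month, 1))

print(date2)
print(class(date2))
```

With lubridate, the same arithmetic idea would be ymd(dateint * 100 + 1), which appends day 01 numerically.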
I have a numeric vector as follows
aa <- c(1022011, 2022011, 13022011, 23022011) # just a sample; the real vector is very long
Values are written in such a way that first value is day then month and then year.
What I am doing right now is
as.Date(as.character(aa), "%d%m%Y")
but it causes problems (returns NA) for single-digit day numbers (i.e. 1022011, 2022011).
so basically
as.Date("1022011", "%d%m%Y") does not work
but
as.Date("01022011", "%d%m%Y") (pasting '0' ahead of the number) works.
I want to avoid pasting '0' in such cases. Is there any other (direct) alternative to convert numeric values to dates at once?
It could be rearranged using sub in which case a plain as.Date with no format works:
x <- c(1022011, 11022011) # test data
pat <- "^(..?)(..)(....)$"
as.Date(sub(pat, "\\3-\\2-\\1", x))
giving:
[1] "2011-02-01" "2011-02-11"
Depending on your platform, you could use sprintf to add a leading zero. This seems to work on Mac but not on Windows 7, given the discussion with the OP.
aa <- c(1022011, 2022011, 13022011, 23022011)
as.Date(sprintf("%08s", aa), format = "%d%m%Y")
[1] "2011-02-01" "2011-02-02" "2011-02-13" "2011-02-23"
UPDATE
@CathyG kindly mentioned that sprintf("%08i",aa) works on Windows 7.
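The platform difference comes from zero-padding a string conversion (%s), which is not guaranteed to behave the same everywhere. Zero-padding an integer conversion is well defined, so a variant that should not depend on the platform is sketched below. It assumes every value is a whole number that fits in an integer, which holds for ddmmyyyy values:

```r
aa <- c(1022011, 2022011, 13022011, 23022011)

# Pad to 8 digits with leading zeros using an integer conversion.
padded <- sprintf("%08d", as.integer(aa))

dates <- as.Date(padded, format = "%d%m%Y")

print(padded)
print(dates)
```

Unlike the sub() approach, this keeps an explicit format string, so a malformed value turns into NA rather than being silently rearranged.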
You can use dmy in lubridate:
library(lubridate)
aa <- c(1022011, 2022011, 13022011, 23022011)
> dmy(aa)
[1] "2011-02-01 UTC" "2011-02-02 UTC" "2011-02-13 UTC" "2011-02-23 UTC"
and if you don't want the timezone just wrap it in as.Date:
> as.Date(dmy(aa))
[1] "2011-02-01" "2011-02-02" "2011-02-13" "2011-02-23"
Thank you @Ben Bolker,
> as.Date(mdy(aa))
[1] "2011-01-02" "2011-02-02" "2012-01-02" "2011-01-02"
I know you don't want to add a "0" but still, in base R, this works :
as.Date(sapply(aa,function(x){ifelse(nchar(x)==8,x,paste("0",x,sep=""))}),format = "%d%m%Y")
I am trying to convert the string "2013-JAN-14" into a Date as follows:
sdate1 <- "2013-JAN-14"
ddate1 <- as.Date(sdate1,format="%Y-%b-%d")
ddate1
but I get :
[1] NA
What am I doing wrong? Should I install a package for this purpose (I tried installing chron)?
Works for me. The reason it doesn't for you probably has to do with your system locale.
?as.Date has the following to say:
## This will give NA(s) in some locales; setting the C locale
## as in the commented lines will overcome this on most systems.
## lct <- Sys.getlocale("LC_TIME"); Sys.setlocale("LC_TIME", "C")
x <- c("1jan1960", "2jan1960", "31mar1960", "30jul1960")
z <- as.Date(x, "%d%b%Y")
## Sys.setlocale("LC_TIME", lct)
Worth a try.
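This can also happen when your date is of class factor and the format you pass does not match the text of its levels. The format argument describes how the input string is laid out, not how you want the result to look. One way to keep the two apart is to parse into POSIXt with strptime first and then convert to Date.
Wrong way: converting the factor with a format that describes the desired output rather than the input: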
a<-as.factor("24/06/2018")
b<-as.Date(a,format="%Y-%m-%d")
You will get as an output:
a
[1] 24/06/2018
Levels: 24/06/2018
class(a)
[1] "factor"
b
[1] NA
Right way, converting factor into POSIXt and then into date
a<-as.factor("24/06/2018")
abis<-strptime(a,format="%d/%m/%Y") #defining what is the original format of your date
b<-as.Date(abis) #converting POSIXlt to Date; no format is needed here, and a Date prints as %Y-%m-%d by default
You will get as an output:
abis
[1] "2018-06-24 AEST"
class(abis)
[1] "POSIXlt" "POSIXt"
b
[1] "2018-06-24"
class(b)
[1] "Date"
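For comparison, the intermediate strptime step is not strictly required. as.Date has a method for factors that converts the levels to character first, so passing the format that matches the input works directly. A short sketch using the same example value:

```r
a <- as.factor("24/06/2018")

# The format describes the input text (day/month/year).
b_direct <- as.Date(a, format = "%d/%m/%Y")

# Equivalent, with the factor-to-character step made explicit.
b_explicit <- as.Date(as.character(a), format = "%d/%m/%Y")

print(b_direct)
print(class(b_direct))
```

Both give the same Date as the two-step route above; the NA in the "wrong way" comes from the mismatched format, not from the factor class as such.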
My solution below might not work for every problem that results in as.Date() returning NAs, but it does work for some, namely, when the Date variable is read in as a factor.
Simply read in the .csv with stringsAsFactors=FALSE
data <- read.csv("data.csv", stringsAsFactors = FALSE)
data$date <- as.Date(data$date)
After trying (and failing) to solve the NA problem with my system locale, this solution worked for me.
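A self-contained version of this approach is sketched below. The CSV content is made up for illustration and supplied through the text argument of read.csv so that no file is needed. Because the sample dates are not in ISO form, the format is passed explicitly; the call in the answer above, without a format, relies on the dates already looking like 2018-06-24 or 2018/06/24.

```r
# Made-up CSV content standing in for data.csv.
csv_text <- "id,date\n1,24/06/2018\n2,01/07/2018"

data <- read.csv(text = csv_text, stringsAsFactors = FALSE)
print(class(data$date))  # character, not factor

# Day/month/year input needs an explicit format.
data$date <- as.Date(data$date, format = "%d/%m/%Y")

print(data$date)
print(class(data$date))
```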