Getting NA when using as.Date() - r

Thanks to comments below, I realized I should use "%b" for "FEB" (originally I used "%m"; thanks for the reference to ?strptime). But my problem still stands.
When I do
as.Date("13-FEB-15", "%d-%b-%y")
# [1] NA
I know this will work:
as.Date("13-02-2015", "%d-%m-%Y")
# [1] "2015-02-13"
But is there a way to avoid converting FEB to 02 and 15 to 2015 in order to get my expected result? Thanks!

A general and useful diagnostic
Try this and what do you get?
format(strptime(Sys.Date(), format="%Y-%m-%d"), "%y-%b-%d")
I got
[1] "16- 7月-22"
Haha, the middle one is Chinese. So what is going wrong? Nothing is wrong. The issue is that %b is sensitive to your current locale. When you read ?strptime, pay special attention to which conversion specifications are sensitive to your current locale.
My locale is:
Sys.getlocale("LC_TIME")
#[1] "zh_CN.UTF-8"
Yep, that is the locale for the China region.
Locales make a difference in how date-times are formatted and parsed. On my machine:
as.Date("16-JUL-22", "%y-%b-%d")
# NA
as.Date("16- 7月-22", "%y-%b-%d")
#[1] "2016-07-22"
Now let's reset time locale:
Sys.setlocale("LC_TIME", "C")
as.Date("16-JUL-22", "%y-%b-%d")
#[1] "2016-07-22"
Wow, it works! Read ?locales for more, and you will understand what locale = "C" means.
Solution for you
Sys.setlocale("LC_TIME", "C")
as.Date("13-FEB-15", format = "%d-%b-%y")
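If you would rather not change the locale for the whole session, the same idea can be wrapped so the previous setting is restored afterwards. This is only a minimal sketch; as_date_c_locale is a hypothetical helper name, not something from base R:

```r
# Hypothetical helper (not part of base R): parse with English month
# names by switching LC_TIME to "C" only for the duration of the call.
as_date_c_locale <- function(x, format) {
  old <- Sys.getlocale("LC_TIME")
  # Restore the caller's locale even if parsing fails
  on.exit(invisible(Sys.setlocale("LC_TIME", old)), add = TRUE)
  invisible(Sys.setlocale("LC_TIME", "C"))
  as.Date(x, format = format)
}

as_date_c_locale("13-FEB-15", "%d-%b-%y")
# [1] "2015-02-13"
```

The on.exit() call is there so that your original LC_TIME comes back no matter how the function exits.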

Using lubridate:
library(lubridate)
date1 = "2014-12-11 00:00:00"
date2 = "14-DEC-11"
ymd_hms(date1) == ymd(date2,tz = "UTC")
These equal each other and should be able to be joined.

Related

Format date in R with lubridate

My input data, formatted as character, looks like this
"2020-07-10T00:00:00"
I tried
library(lubridate)
mdy_hms("2020-07-10T00:00:00", format='%Y-%m-%dT%H:%M:%S', tz=Sys.timezone())
But I get
[1] NA NA
Warning message:
All formats failed to parse. No formats found.
I tried the more flexible approach, parse_date_time(), but without luck
parse_date_time("2020-07-10T00:00:00", '%Y-%m-%dT%H:%M:%S', tz=Sys.timezone())
How can I convert this date "2020-07-10T00:00:00" to a date R recognizes? Note: I am not interested in the time really, only the date!
Why not just
as.Date("2020-07-10T00:00:00")
# [1] "2020-07-10"
Fun fact:
as.Date("2020-07-101sddT00:1sdafsdfsdf0:00sdfzsdfsdfsdf")
# [1] "2020-07-10"
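The reason this works is that as.Date() on a character string tries "%Y-%m-%d" first and ignores anything after the part that matched. If you want garbage like the string above to be rejected, one option is to parse the full timestamp first. A small sketch; strict_iso_date is a hypothetical name used only for illustration:

```r
x <- "2020-07-10T00:00:00"

# Same as the default behaviour, but with the format spelled out;
# everything after the day is ignored
as.Date(x, format = "%Y-%m-%d")
# [1] "2020-07-10"

# Hypothetical stricter variant: the whole timestamp has to match
strict_iso_date <- function(s) {
  as.Date(strptime(s, format = "%Y-%m-%dT%H:%M:%S", tz = "UTC"))
}

strict_iso_date(x)
# [1] "2020-07-10"
strict_iso_date("2020-07-101sddT00:1sdafsdfsdf0:00sdfzsdfsdfsdf")
# [1] NA
```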
Assuming that the 07 is the month of July, and the 10 is the 10th:
x <- "2020-07-10T00:00:00"
ymd_hms(x, tz = Sys.timezone())
> [1] "2020-07-10 AEST"
If it's in format year-day-month, swap the ymd for ydm.
Hope this helps!

R How to convert date format

I was looking to convert the following character string in a date format:
string <- '22APR2020'
date <- as.Date(string, format='%d%b%y')
date
>>> NA
Unfortunately, the result is NA, and I need to convert it to a date in order to be able to calculate the time between two dates.
Thank you
You could use the anytime package:
library(anytime)
string <- '22APR2020'
anytime::anydate(string)
[1] "2020-04-22"
When in doubt always use lubridate:
string <- '22APR2020'
library(lubridate)
dmy(string)
[1] "2020-04-22"
Here, dmy is the order in which day, month and year appear in the string.
The problem is your locale setting. It probably is set to a language where the fourth month is not abbreviated as "APR".
Sys.setlocale("LC_TIME", "French")
string <- '22APR2020'
as.Date(string, format='%d%b%Y')
#[1] NA
Sys.setlocale("LC_TIME", "German")
as.Date(string, format='%d%b%Y')
#[1] "2020-04-22"
Also, note that a capital Y is used in the format string. It's important: %y matches only a two-digit year (without the century), so with '22APR2020' it reads just the first "20" and ignores the rest. That gives the right result for 2020 by chance, but the wrong result for 2021.
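To see the difference between %y and %Y for yourself, here is a short sketch. It assumes the "C" time locale so that "APR" is recognised, and it uses a made-up second date ('22APR2021') purely for illustration. It also shows the date difference the question was after:

```r
# Assumption: switch LC_TIME to "C" for the demo, then restore it
old <- Sys.getlocale("LC_TIME")
invisible(Sys.setlocale("LC_TIME", "C"))

d_wrong <- as.Date("22APR2021", format = "%d%b%y")  # %y reads only "20"
d_right <- as.Date("22APR2021", format = "%d%b%Y")  # %Y reads "2021"
d_start <- as.Date("22APR2020", format = "%d%b%Y")

invisible(Sys.setlocale("LC_TIME", old))

d_wrong
# [1] "2020-04-22"
d_right
# [1] "2021-04-22"

# Time between two dates, in days
as.numeric(d_right - d_start)
# [1] 365
```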

Convert Integer to Date in R

In my data set the date values look like '10.01.2012' and I want to convert them to the form "10/01/2012", but it's not working. I've looked at many examples here but they haven't worked for me. Can someone help me please!
One issue here is that we actually do NOT know what 10.01.2012 is: is it Jan 10, or Oct 1?
R> as.Date("10.01.2012", "%d.%m.%Y")
[1] "2012-01-10"
R> as.Date("10.01.2012", "%m.%d.%Y")
[1] "2012-10-01"
R>
But you do, presumably, so pick either %d.%m.%Y or %m.%d.%Y as needed. But there is a reason we all like ISO 8601 formats ...
My old friend: use gsub(). Note that this returns a character string, not a Date.
x = "10.01.2012"
gsub('\\.', '/', x)
[1] "10/01/2012"
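If you need a real Date object and only want the slashes for display, you can parse first and format afterwards. A minimal sketch, assuming the input is day.month.year:

```r
x <- "10.01.2012"

# Assumption: the string is day.month.year (use "%m.%d.%Y" otherwise)
d <- as.Date(x, format = "%d.%m.%Y")
class(d)
# [1] "Date"

# format() turns the Date back into text with the separators you want
format(d, "%d/%m/%Y")
# [1] "10/01/2012"
```

This way d can still be used in date arithmetic, and the slash version is only produced when you print or export.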

R - format of a date

I am facing an issue with date formatting and cannot find a solution. Here is the code; the second date is not converted the way I want.
date1
#[1] "01. Nov 11"
ndate1 <- as.Date(date1, "%d. %B %y")
ndate1
#[1] "2011-11-01"
date2
#[1] "26-May-13"
ndate2 <- as.Date(date2, "%d-%B-%y")
ndate2
#[1] NA
You can determine the complete or abbreviated month names in your locale using the example on the ?Constants page:
format(ISOdate(2000, 1:12, 1), "%b")
Per ?strptime, on input both "%B" and "%b" match either the abbreviated or the complete month name.
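A quick way to check whether your current locale uses English month abbreviations is to compare that output with the built-in month.abb constant. Just a sketch; the variable name abbr_here is mine:

```r
# Month abbreviations as the current LC_TIME locale writes them
abbr_here <- format(ISOdate(2000, 1:12, 1), "%b")

# TRUE if they are the English ones ("Jan", "Feb", ...)
identical(abbr_here, month.abb)

# English abbreviations your locale would not recognise
setdiff(month.abb, abbr_here)
```

If setdiff() lists "May" (or anything else that appears in your data), that would explain the NA.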
This is most probably due to an incompatibility with the locale settings. If the output of Sys.getlocale("LC_TIME") does not correspond to an English setting, like "en_US.UTF-8" or "en_GB.UTF-8", then "May" (which, coincidentally, is not even an abbreviation in this case) is probably not recognized, since most (all?) other languages spell that month differently. In contrast, "Nov" is a valid abbreviation for the month of November in several languages. This might explain why the first case with date1 does not cause trouble.
We could try this:
Sys.setlocale("LC_TIME", "en_US.UTF-8")
date2 <- "26-May-13"
ndate2 <- as.Date(date2, "%d-%b-%y")
ndate2
#[1] "2013-05-26"

as.Date returning NA while converting from 'ddmmmyyyy'

I am trying to convert the string "2013-JAN-14" into a Date as follows:
sdate1 <- "2013-JAN-14"
ddate1 <- as.Date(sdate1,format="%Y-%b-%d")
ddate1
but I get:
[1] NA
What am I doing wrong? Should I install a package for this purpose? (I tried installing chron.)
Works for me. The reason it doesn't for you probably has to do with your system locale.
?as.Date has the following to say:
## This will give NA(s) in some locales; setting the C locale
## as in the commented lines will overcome this on most systems.
## lct <- Sys.getlocale("LC_TIME"); Sys.setlocale("LC_TIME", "C")
x <- c("1jan1960", "2jan1960", "31mar1960", "30jul1960")
z <- as.Date(x, "%d%b%Y")
## Sys.setlocale("LC_TIME", lct)
Worth a try.
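If you cannot or do not want to touch the locale at all, you could also translate the month yourself with the built-in month.abb constant, which is always English. A rough sketch, assuming the input always looks like "YYYY-MON-DD"; parse_ymd_abbr is a hypothetical helper name:

```r
# Hypothetical helper: parse "YYYY-MON-DD" without relying on LC_TIME
parse_ymd_abbr <- function(x) {
  parts <- strsplit(x, "-", fixed = TRUE)
  year <- vapply(parts, `[`, character(1), 1)
  mon  <- vapply(parts, `[`, character(1), 2)
  day  <- vapply(parts, `[`, character(1), 3)
  # Look up the month number in the English abbreviations, ignoring case
  mon_num <- match(toupper(mon), toupper(month.abb))
  as.Date(sprintf("%s-%02d-%s", year, mon_num, day), format = "%Y-%m-%d")
}

parse_ymd_abbr("2013-JAN-14")
# [1] "2013-01-14"
```

An unknown month name ends up as NA rather than an error.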
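This can also happen when your dates are stored as a factor and you give as.Date() a format that describes the output you want rather than the input you have. The format argument has to describe how the string is currently written; otherwise as.Date() doesn't know which part of your string corresponds to what. One way to keep the two steps apart is to convert into POSIXt first.
Wrong way: direct conversion from factor to date with a format that does not match the input: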
a<-as.factor("24/06/2018")
b<-as.Date(a,format="%Y-%m-%d")
You will get as an output:
a
[1] 24/06/2018
Levels: 24/06/2018
class(a)
[1] "factor"
b
[1] NA
Right way: converting the factor into POSIXt and then into a date
a <- as.factor("24/06/2018")
abis <- strptime(a, format="%d/%m/%Y") # format describes how the input is written
b <- as.Date(abis) # a Date always prints as %Y-%m-%d, so no format is needed here
You will get as an output:
abis
[1] "2018-06-24 AEST"
class(abis)
[1] "POSIXlt" "POSIXt"
b
[1] "2018-06-24"
class(b)
[1] "Date"
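For comparison, here is a sketch suggesting that the factor itself is not the obstacle: as.Date() can convert a factor directly as long as the format describes the input text. The as.character() variant is only there to make the conversion explicit:

```r
a <- as.factor("24/06/2018")

# Format matches the input (day/month/year), so this parses
b1 <- as.Date(a, format = "%d/%m/%Y")

# Same thing with the factor-to-character step written out
b2 <- as.Date(as.character(a), format = "%d/%m/%Y")

b1
# [1] "2018-06-24"
identical(b1, b2)
# [1] TRUE
```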
My solution below might not work for every problem that results in as.Date() returning NA's, but it does work for some, namely, when the Date variable is read in in factor format.
Simply read in the .csv with stringsAsFactors=FALSE (this is the default since R 4.0.0)
data <- read.csv("data.csv", stringsAsFactors = FALSE)
data$date <- as.Date(data$date)
After trying (and failing) to solve the NA problem with my system locale, this solution worked for me.
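Here is a self-contained version of the same steps that you can run without a file. It is only a sketch: the inline csv_text and its two columns are made up for illustration and stand in for data.csv:

```r
# Made-up CSV content standing in for data.csv
csv_text <- "date,value\n2018-06-24,1\n2018-06-25,2"

data <- read.csv(text = csv_text, stringsAsFactors = FALSE)
class(data$date)
# [1] "character"

# Spelling out the format makes the expected input explicit
data$date <- as.Date(data$date, format = "%Y-%m-%d")
class(data$date)
# [1] "Date"
```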
