Format date in R with lubridate - r

My input data, formatted as character, looks like this
"2020-07-10T00:00:00"
I tried
library(lubridate)
mdy_hms("2020-07-10T00:00:00", format='%Y-%m-%dT%H:%M:%S', tz=Sys.timezone())
But I get
[1] NA NA
Warning message:
All formats failed to parse. No formats found.
I tried the more flexible approach parse_date_time(), but without luck
parse_date_time("2020-07-10T00:00:00", '%Y-%m-%dT%H:%M:%S', tz=Sys.timezone())
How can I convert this date "2020-07-10T00:00:00" to a date R recognizes? Note: I am not interested in the time really, only the date!

Why not just
as.Date("2020-07-10T00:00:00")
# [1] "2020-07-10"
Fun fact:
as.Date("2020-07-101sddT00:1sdafsdfsdf0:00sdfzsdfsdfsdf")
# [1] "2020-07-10"
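The "fun fact" works because, without an explicit format, as.Date() tries "%Y-%m-%d" and then "%Y/%m/%d", and the underlying strptime() ignores anything left over after the format has been matched. If silently accepting garbage is a concern, a stricter check can be sketched as below. The helper name strict_iso_date and the exact pattern are assumptions made for illustration, not part of base R.

```r
# Hypothetical helper: accept only "YYYY-MM-DD" or "YYYY-MM-DDTHH:MM:SS",
# and return NA for anything with extra characters.
strict_iso_date <- function(x) {
  pattern <- "^[0-9]{4}-[0-9]{2}-[0-9]{2}(T[0-9]{2}:[0-9]{2}:[0-9]{2})?$"
  ok <- grepl(pattern, x)
  out <- rep(as.Date(NA), length(x))
  # Only the first 10 characters (the date part) are parsed.
  out[ok] <- as.Date(substr(x[ok], 1, 10), format = "%Y-%m-%d")
  out
}

good <- strict_iso_date("2020-07-10T00:00:00")
bad  <- strict_iso_date("2020-07-101sddT00:1sdafsdfsdf0:00sdfzsdfsdfsdf")
```

Here good holds the date and bad is NA, whereas plain as.Date() would return a date for both inputs.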

Assuming that the 07 is the month of July, and the 10 is the 10th:
x <- "2020-07-10T00:00:00"
ymd_hms(x, tz = Sys.timezone())
> [1] "2020-07-10 AEST"
If it's in format year-day-month, swap the ymd for ydm.
Hope this helps!
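Since only the date is wanted, the parsed date-time can be reduced to a Date afterwards. The sketch below assumes lubridate is installed and uses a fixed "UTC" time zone for reproducibility; substitute your own zone if it matters. It also hints at why the original call printed two NAs: mdy_hms() expects month-day-year order and has no format argument, so the format string was most likely treated as a second input to parse.

```r
library(lubridate)

x <- "2020-07-10T00:00:00"

# Parse in year-month-day order, then drop the time component.
dt <- ymd_hms(x, tz = "UTC")
d  <- as_date(dt)

class(d)
format(d)
```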

Related

strptime outputting NA when I try to convert from "%Y-%m" format

Strptime outputs NA when I set format to "%Y-%m"
I have tried adding the day as a test and it worked, but whenever I use "%Y-%m" or "%m" I get NA
print(strptime("2007-07", format = "%Y-%m"))
[1] NA
print(strptime("07", format = "%m"))
[1] NA
print(strptime("2007", format = "%Y"))
[1] "2007-07-30 EDT"
Use the zoo package. Its yearmon class is made for dates that have only a year and a month.
library(zoo)
ym <- as.yearmon("2007-07", format = "%Y-%m")
You can then manipulate the yearmon object, or convert it to a Date (the day defaults to the first of the month):
as.Date(ym)
[1] "2007-07-01"

Function on Date in R gives results as "NA"

My date value is in this format, stored in the Assigned date column:
02:27:16 05-Mar-2019, Tue
I am converting it with
srdetails1$`Assigned On GMT` <- as.POSIXct(srdetails1$`Assigned On GMT`, tz="", format = "%H:%M:%S %m/%d/%Y")
srdetails1$`Assigned On GMT`
but the value shows up as
43497.067407407405
instead of a date, and any function I use on this column, e.g. day(ymd_hms()), gives me "NA".
How do I resolve this? Any help appreciated.
When I trim the value to only m/d/y (without the time) it works properly.
Your format mask does not match the timestamp which you are trying to use with as.POSIXct. Consider the following version:
x <- "02:27:16 05-Mar-2019"
as.POSIXct(x, tz="", format = "%H:%M:%S %d-%b-%Y")
[1] "2019-03-05 02:27:16 CET"
We can use anytime, after registering the format:
library(anytime)
x <- "02:27:16 05-Mar-2019"
addFormats("%H:%M:%S %d-%b-%Y")
anytime(x)
#[1] "2019-03-05 02:27:16 EST"

Fixing date format in R

I have three data tables in R. Each one has a date column. The tables are vix_data,gold_ohlc_data,btc_ohlc_data. They are formatted as follows:
head(vix_data$Date)
[1] 1/2/04 1/5/04 1/6/04 1/7/04 1/8/04 1/9/04
3435 Levels: 1/10/05 1/10/06 1/10/07 1/10/08 1/10/11 ... 9/9/16
head(gold_ohlc_data$date)
[1] 8/23/17 8/22/17 8/21/17 8/18/17 8/17/17 8/16/17
2519 Levels: 1/10/08 1/10/11 1/10/12 1/10/13 1/10/14 ... 9/9/16
head(btc_ohlc_data$Date)
[1] "2017-08-23" "2017-08-22" "2017-08-21" "2017-08-20" "2017-08-19"
[6] "2017-08-18"
How can I change the date column in the vix_data and gold_ohlc_data tables to match the btc_ohlc_data format? I have tried several methods, for example using as.Date to transform each column, but this usually messes up the values and inserts a lot of NAs.
An option is to use functions from the package lubridate. You need to know which part is the day and which is the month to select the right function, such as dmy or mdy.
# Load package
library(lubridate)
# Create example string
date1 <- c("1/2/04", "1/5/04", "1/6/04", "1/7/04", "1/8/04", "1/9/04")
date2 <- c("8/23/17", "8/22/17", "8/21/17", "8/18/17", "8/17/17", "8/16/17")
# Convert to date class
dmy(date1)
# [1] "2004-02-01" "2004-05-01" "2004-06-01" "2004-07-01" "2004-08-01" "2004-09-01"
mdy(date1)
# [1] "2004-01-02" "2004-01-05" "2004-01-06" "2004-01-07" "2004-01-08" "2004-01-09"
mdy(date2)
# [1] "2017-08-23" "2017-08-22" "2017-08-21" "2017-08-18" "2017-08-17" "2017-08-16"
Look into the package lubridate. Your values are in month/day/year order (for example 8/23/17), so lubridate::mdy() should handle this just fine.
It looks like your data are read in as factors, so first you'll have to change them to characters. Then after that you can convert it to a date and specify the input format where %m represents the numerical month, %d represents the day, and %y represents the 2-digit year.
x <- c('1/2/04', '1/5/04', '1/6/04', '1/7/04', '1/8/04', '1/9/04')
y <- as.Date(x, format = "%m/%d/%y")
y
[1] "2004-01-02" "2004-01-05" "2004-01-06" "2004-01-07" "2004-01-08"
[6] "2004-01-09"
Are you sure you're specifying as.Date correctly? For example, do you have %y, instead of %Y?
I did the following and it worked:
> vix <- c("1/2/04", "1/5/04", "1/6/04", "1/7/04", "1/8/04", "1/9/04")
> vix<- as.factor(vix)
> vix
[1] 1/2/04 1/5/04 1/6/04 1/7/04 1/8/04 1/9/04
Levels: 1/2/04 1/5/04 1/6/04 1/7/04 1/8/04 1/9/04
> as.Date(vix, "%m/%d/%y")
[1] "2004-01-02" "2004-01-05" "2004-01-06" "2004-01-07" "2004-01-08" "2004-01-09"

Change from date time to numeric AND back to date time in R

Seeing that others can't reproduce this, any speculation about system settings that might cause what I'm seeing would be appreciated. This is on a work PC configured by IT, but I will compare with my personal install this evening and then update the question.
Using base R, I'm trying to read in date and time, convert to numeric, and then convert back to date time. The problem I'm running into is a + 5 hour shift that gets introduced, I think due to timezone defaults.
From a previous question, an example of date time to numeric was provided:
Change from date and hour format to numeric format
> x <- as.POSIXct("9/27/2011 3:33:00 PM", format="%m/%d/%Y %H:%M:%S %p")
> x
[1] "2011-09-27 03:33:00 EDT"
> y <- as.numeric(x)
[1] 1317108780
*Typo in above code fixed
When I try to bring this back to date time, I get:
> z <- as.POSIXct(y, origin="1970-01-01")
> z
[1] "2011-09-27 08:33:00 EDT"
I tried some variants, including specifying time zones explicitly, but am consistently getting this shift.
I think it is just a problem of specifying time zones:
x <- as.POSIXct("9/27/2011 15:33:00", format="%m/%d/%Y %H:%M:%S")
> as.POSIXct(as.numeric(x), origin="1970-01-01",tz="EST") # as.numeric(x)=1317130380
[1] "2011-09-27 08:33:00 EST"
but:
x <- as.POSIXct("9/27/2011 15:33:00", format="%m/%d/%Y %H:%M:%S",tz="EST")
> as.POSIXct(as.numeric(x), origin="1970-01-01",tz="EST") # as.numeric(x)=1317155580
[1] "2011-09-27 15:33:00 EST"
Remark: I simplified 03:33:00 PM to 15:33:00
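The round trip can also be sketched with the original 12-hour string. Two assumptions are made here: the time zone is fixed to "UTC" in both directions so the system default never enters the picture, and the hour is read with %I, since the R documentation says %p is used together with %I and not with %H. The AM/PM indicator depends on the locale, so an English locale is assumed.

```r
# Parse with an explicit time zone and a 12-hour clock.
x <- as.POSIXct("9/27/2011 3:33:00 PM",
                format = "%m/%d/%Y %I:%M:%S %p", tz = "UTC")

# Seconds since 1970-01-01 00:00:00 UTC.
y <- as.numeric(x)

# Convert back using the same time zone.
z <- as.POSIXct(y, origin = "1970-01-01", tz = "UTC")

format(z, "%Y-%m-%d %H:%M:%S")
```

As long as the same tz is passed on both conversions, z should display the same wall-clock time as x.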

as.Date returning NA while converting from 'ddmmmyyyy'

I am trying to convert the string "2013-JAN-14" into a Date as follows:
sdate1 <- "2013-JAN-14"
ddate1 <- as.Date(sdate1,format="%Y-%b-%d")
ddate1
but I get :
[1] NA
What am I doing wrong? Should I install a package for this purpose (I tried installing chron)?
Works for me. The reason it doesn't work for you probably has to do with your system locale.
?as.Date has the following to say:
## This will give NA(s) in some locales; setting the C locale
## as in the commented lines will overcome this on most systems.
## lct <- Sys.getlocale("LC_TIME"); Sys.setlocale("LC_TIME", "C")
x <- c("1jan1960", "2jan1960", "31mar1960", "30jul1960")
z <- as.Date(x, "%d%b%Y")
## Sys.setlocale("LC_TIME", lct)
Worth a try.
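The locale switch from the help page can be wrapped in a small function so the original setting is always restored. This is only a sketch: the function name parse_date_c_locale is made up, and it assumes the "C" locale is available for LC_TIME, which the help page says holds on most systems.

```r
# Hypothetical helper: parse dates with English month names regardless of
# the session's LC_TIME setting, then restore that setting.
parse_date_c_locale <- function(x, format) {
  old <- Sys.getlocale("LC_TIME")
  on.exit(Sys.setlocale("LC_TIME", old), add = TRUE)
  Sys.setlocale("LC_TIME", "C")
  as.Date(x, format = format)
}

ddate1 <- parse_date_c_locale("2013-JAN-14", "%Y-%b-%d")
```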
This can also happen when the format you pass does not describe the input string, which is easy to run into with dates that were read in as factors. The format argument of as.Date describes how the input is written, not how you want the result to look.
Wrong way: the format does not match the input:
a<-as.factor("24/06/2018")
b<-as.Date(a,format="%Y-%m-%d")
You will get as an output:
a
[1] 24/06/2018
Levels: 24/06/2018
class(a)
[1] "factor"
b
[1] NA
Right way: describe the format the input is actually in, here by converting the factor into POSIXt with strptime and then into a date:
a<-as.factor("24/06/2018")
abis<-strptime(a,format="%d/%m/%Y") #defining what is the original format of your date
b<-as.Date(abis) #a Date always prints as yyyy-mm-dd, so no format is needed here
You will get as an output:
abis
[1] "2018-06-24 AEST"
class(abis)
[1] "POSIXlt" "POSIXt"
b
[1] "2018-06-24"
class(b)
[1] "Date"
My solution below might not work for every problem that results in as.Date() returning NA's, but it does work for some, namely, when the Date variable is read in in factor format.
Simply read in the .csv with stringsAsFactors=FALSE (this has been the default since R 4.0.0):
data <- read.csv("data.csv", stringsAsFactors = FALSE)
data$date <- as.Date(data$date)
After trying (and failing) to solve the NA problem with my system locale, this solution worked for me.
