Transforming data.frame factor into date [duplicate] - r

I imported a CSV file with dates from a SQL query, but the dates are really date-time values and R doesn't seem to recognize them as dates:
> mydate
[1] 1/15/2006 0:00:00
2373 Levels: 1/1/2006 0:00:00 1/1/2007 0:00:00 1/1/2008 0:00:00 ... 9/9/2012 0:00:00
> class(mydate)
[1] "factor"
> as.Date(mydate)
Error in charToDate(x) :
character string is not in a standard unambiguous format
How do I convert mydate to date format? (I don't need to include the time portion.)

You were close. format= needs to be added to the as.Date call:
mydate <- factor("1/15/2006 0:00:00")
as.Date(mydate, format = "%m/%d/%Y")
## [1] "2006-01-15"

You can try lubridate package which makes life much easier
library(lubridate)
mdy_hms(mydate)
The above will change the date format to POSIXct
A sample working example:
> data <- "1/15/2006 01:15:00"
> library(lubridate)
> mydate <- mdy_hms(data)
> mydate
[1] "2006-01-15 01:15:00 UTC"
> class(mydate)
[1] "POSIXct" "POSIXt"
For case with factor use as.character
data <- factor("1/15/2006 01:15:00")
library(lubridate)
mydate <- mdy_hms(as.character(data))

Take a look at the formats in ?strptime
R> foo <- factor("1/15/2006 0:00:00")
R> foo <- as.Date(foo, format = "%m/%d/%Y %H:%M:%S")
R> foo
[1] "2006-01-15"
R> class(foo)
[1] "Date"
Note that this will work even if foo starts out as a character. It will also work if using other date formats (as.POSIXlt, as.POSIXct).

Related

Converting characters like "01APR2020" to date class R

I have the following column in my dataframe
> df$dates
[1] "01APR2020" "01JUN2020" "01MAR2020" "01MAY2020" "02APR2020" "02JUN2020"
[7] "02MAR2020"
I would like to format this to an object of Date class, so I want my output to look like this
> df$dates
[1] "01-04" "01-06" "01-03" "01-05" "02-04" "02-06"
[7] "02-03"
And I would like to order them from the oldest to the newest.
Edit:
For example I tried this but it doesn't work:
> format(as.Date("01APR2020", "%d%b%Y"), "%d-%m")
[1] NA
Thanks!
Just use the anydate() function from the anytime package
R> anydate(c("01APR2020", "01JUN2020", "01MAR2020"))
[1] "2020-04-01" "2020-06-01" "2020-03-01"
R>
It's idea is to not require a format for a variety of common and sensible date (or datetime) inputs. Once they are parsed, putting out day and months is easy too:
R> format(anydate(c("01APR2020", "01JUN2020", "01MAR2020")), "%d-%m")
[1] "01-04" "01-06" "01-03"
R>
We can use as.Date with format
df$dates <- format(as.Date(df$dates, "%d%b%Y"), "%d-%m")
df$dates
#[1] "01-04" "01-06" "01-03" "01-05" "02-04" "02-06" "02-03"
Or using lubridate
library(lubridate)
df$dates <- format(dmy(df$dates), "%d-%m")
NOTE: Both the solutions work on R 4.0
data
df <- data.frame(dates = c("01APR2020" ,"01JUN2020", "01MAR2020",
"01MAY2020", "02APR2020" ,"02JUN2020" , "02MAR2020"))

Reconvert numeric date to POSIXct R

I have a date that I convert to a numeric value and want to convert back to a date afterwards.
Converting date to numeric:
date1 = as.POSIXct('2017-12-30 15:00:00')
date1_num = as.numeric(date1)
# 1514646000
Reconverting numeric to date:
as.Date(date1_num, origin = '1/1/1970')
# "4146960-12-12"
What am I missing with the reconversion? I'd expect the last command to return my original date1.
As the numeric vector is created from an object with time component, reconversion can also be in the same way i.e. first to POSIXct and then wrap with as.Date
as.Date(as.POSIXct(date1_num, origin = '1970-01-01'))
#[1] "2017-12-30"
You could use anytime() and anydate() from the anytime package:
R> pt <- anytime("2017-12-30 15:00:00")
R> pt
[1] "2017-12-30 15:00:00 CST"
R>
R> anydate(pt)
[1] "2017-12-30"
R>
R> as.numeric(pt)
[1] 1514667600
R>
R> anydate(as.numeric(pt))
[1] "2017-12-30"
R>
POSIXct counts the number of seconds since the Unix Epoch, while Date counts the number of days. So you can recover the date by dividing by (60*60*24) (let's ignore leap seconds), or convert back to POSIXct instead.
as.Date(as.numeric(date1)/(60*60*24), origin="1970-01-01")
[1] "2017-12-30"
as.POSIXct(as.numeric(date1),origin="1970-01-01")
[1] "2017-12-30 15:00:00 GMT"
Using lubridate :
lubridate::as_datetime(1514646000)
[1] "2017-12-30 15:00:00 UTC"

Convert dates to text in R

I have the following dataset with dates (YYYY-MM-DD):
> dates
[1] "20180412" "20180424" "20180506" "20180518" "20180530" "20180611" "20180623" "20180705" "20180717" "20180729"
I want to convert them in:
DD-MMM-YYYY but with the month being text. For example 20180412 should become 12Apr2018
Any suggestion on how to proceed?
M
You can try something like this :
# print today's date
today <- Sys.Date()
format(today, format="%B %d %Y") "June 20 2007"
where The following symbols can be used with the format( ) function to print dates 1
You need to first parse the text strings as Date objects, and then format these Date objects to your liking to have the different text output:
R> library(anytime) ## one easy way to parse dates and times
R> dates <- anydate(c("20180412", "20180424", "20180506", "20180518", "20180530",
+ "20180611", "20180623", "20180705", "20180717", "20180729"))
R> dates
[1] "2018-04-12" "2018-04-24" "2018-05-06" "2018-05-18" "2018-05-30"
[6] "2018-06-11" "2018-06-23" "2018-07-05" "2018-07-17" "2018-07-29"
R>
R> txtdates <- format(dates, "%d%b%Y")
R> txtdates
[1] "12Apr2018" "24Apr2018" "06May2018" "18May2018" "30May2018"
[6] "11Jun2018" "23Jun2018" "05Jul2018" "17Jul2018" "29Jul2018"
R>
You could use the as.Date() and format() functions:
dts <- c("20180412", "20180424", "20180506", "20180518", "20180530",
"20180611", "20180623")
format(as.Date(dts, format = "%Y%m%d"), "%d%b%Y")
More information here
Simply use as.POSIXct and as.format:
dates <- c("20180412", "20180424", "20180506")
format(as.POSIXct(dates, format="%Y%m%d"),format="%d%b%y")
Output:
[1] "12Apr18" "24Apr18" "06May18"

How do I convert a factor into date format?

I imported a CSV file with dates from a SQL query, but the dates are really date-time values and R doesn't seem to recognize them as dates:
> mydate
[1] 1/15/2006 0:00:00
2373 Levels: 1/1/2006 0:00:00 1/1/2007 0:00:00 1/1/2008 0:00:00 ... 9/9/2012 0:00:00
> class(mydate)
[1] "factor"
> as.Date(mydate)
Error in charToDate(x) :
character string is not in a standard unambiguous format
How do I convert mydate to date format? (I don't need to include the time portion.)
You were close. format= needs to be added to the as.Date call:
mydate <- factor("1/15/2006 0:00:00")
as.Date(mydate, format = "%m/%d/%Y")
## [1] "2006-01-15"
You can try lubridate package which makes life much easier
library(lubridate)
mdy_hms(mydate)
The above will change the date format to POSIXct
A sample working example:
> data <- "1/15/2006 01:15:00"
> library(lubridate)
> mydate <- mdy_hms(data)
> mydate
[1] "2006-01-15 01:15:00 UTC"
> class(mydate)
[1] "POSIXct" "POSIXt"
For case with factor use as.character
data <- factor("1/15/2006 01:15:00")
library(lubridate)
mydate <- mdy_hms(as.character(data))
Take a look at the formats in ?strptime
R> foo <- factor("1/15/2006 0:00:00")
R> foo <- as.Date(foo, format = "%m/%d/%Y %H:%M:%S")
R> foo
[1] "2006-01-15"
R> class(foo)
[1] "Date"
Note that this will work even if foo starts out as a character. It will also work if using other date formats (as.POSIXlt, as.POSIXct).

Parse timestamp with a.m./p.m

I have a file that formats time stamps like 25/03/2011 9:15:00 p.m.
How can I parse this text to a Date-Time class with either strptime or as.POSIXct?
Here is what almost works:
> as.POSIXct("25/03/2011 9:15:00", format="%d/%m/%Y %I:%M:%S", tz="UTC")
[1] "2011-03-25 09:15:00 UTC"
Here is what is not working, but I'd like to have working:
> as.POSIXct("25/03/2011 9:15:00 p.m.", format="%d/%m/%Y %I:%M:%S %p", tz="UTC")
[1] NA
I'm using R version 2.13.2 (2011-09-30) on MS Windows. My working locale is "C":
Sys.setlocale("LC_TIME", "C")
It appears the AM/PM indicator can't include punctuation. Try it after removing the punctuation:
td <- "25/03/2011 9:15:00 p.m."
tdClean <- gsub("(.)\\.?[Mm]\\.?","\\1m",td)
as.POSIXct(tdClean, format="%d/%m/%Y %I:%M:%S %p", tz="UTC")
# [1] "2011-03-25 21:15:00 UTC"
Just came across this, as another option you can use stringr package.
library(stringr)
data$date2 <- str_sub(data$date, end = -4)
# this removes the punctuation but holds onto the A/P values
data$date2 <- str_c(data$date2, 'm')
# adds the required m

Resources