Formatting Dates in R with times in the same column [duplicate]

I have a data frame with a character column of date-times.
When I use as.Date, most of my strings are parsed correctly, except for a few instances. The example below will hopefully show you what is going on.
# my attempt to parse the string to Date -- uses the stringr package
prods.all$Date2 <- as.Date(str_sub(prods.all$Date, 1,
str_locate(prods.all$Date, " ")[1]-1),
"%m/%d/%Y")
# grab two rows to highlight my issue
temp <- prods.all[c(1925:1926), c(1,8)]
temp
# Date Date2
# 1925 10/9/2009 0:00:00 2009-10-09
# 1926 10/15/2009 0:00:00 0200-10-15
As you can see, the year of some of the dates is inaccurate. The pattern seems to occur when the day is double digit.
Any help you can provide will be greatly appreciated.

The easiest way is to use lubridate:
library(lubridate)
prods.all$Date2 <- as.Date(mdy_hms(prods.all$Date))
mdy_hms parses the full date-time string from the original Date column and returns objects of class POSIXct; as.Date then drops the time component. It will work with either factors or characters.

You may be overcomplicating things. Is there any reason you need the stringr package? You can use as.Date and its format argument to specify the input format of your string.
df <- data.frame(Date = c("10/9/2009 0:00:00", "10/15/2009 0:00:00"))
as.Date(df$Date, format = "%m/%d/%Y %H:%M:%S")
# [1] "2009-10-09" "2009-10-15"
Note the Details section of ?as.Date:
Character strings are processed as far as necessary for the format specified: any trailing characters are ignored
Thus, this also works:
as.Date(df$Date, format = "%m/%d/%Y")
# [1] "2009-10-09" "2009-10-15"
All the conversion specifications that can be used to specify the input format are found in the Details section in ?strptime. Make sure that the order of the conversion specification as well as any separators correspond exactly with the format of your input string.
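As a small illustration of that last point (a sketch with a made-up string, not the questioner's data), a format whose separators or field order do not match the input generally yields NA rather than an error:

```r
# Made-up example string; not taken from the original data.
x <- "10/15/2009"

ok        <- as.Date(x, format = "%m/%d/%Y")  # separators and order match
wrong_sep <- as.Date(x, format = "%m-%d-%Y")  # "-" does not match "/"
wrong_ord <- as.Date(x, format = "%d/%m/%Y")  # 15 is not a valid month

print(ok)
print(is.na(c(wrong_sep, wrong_ord)))
```

Checking for unexpected NA values after conversion is a cheap way to catch a format string that does not fit the data.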
More generally and if you need the time component as well, use as.POSIXct or strptime:
as.POSIXct(df$Date, format = "%m/%d/%Y %H:%M:%S")
strptime(df$Date, "%m/%d/%Y %H:%M:%S")
I'm guessing at what your actual data might look like from the partial results you give.
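For what it is worth, the 0200 year in the question can be reproduced without stringr. The sketch below assumes the cause is the [1] in the original code, which picks a single space position and reuses it for every row; the base R functions regexpr and substr stand in here for str_locate and str_sub.

```r
x <- c("10/9/2009 0:00:00", "10/15/2009 0:00:00")

# Position of the first space in each string (one value per element).
space_pos <- as.integer(regexpr(" ", x, fixed = TRUE))

# Reusing only the first position cuts every string after 9 characters,
# so the second date loses the last digit of its year.
cut_same <- substr(x, 1, space_pos[1] - 1)
bad <- as.Date(cut_same, format = "%m/%d/%Y")

# Using each string's own position keeps the full year.
cut_each <- substr(x, 1, space_pos - 1)
good <- as.Date(cut_each, format = "%m/%d/%Y")

print(cut_same)
print(bad)
print(good)
```

Since as.Date already ignores trailing characters, the substring step is not needed at all; the sketch only shows where the truncated year comes from.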

If you don't know the format you could use anytime::anydate, which tries to match to common formats:
library(anytime)
date <- c("01/01/2000 0:00:00", "Jan 1, 2000 0:00:00", "2000-Jan-01 0:00:00")
anydate(date)
# [1] "2000-01-01" "2000-01-01" "2000-01-01"

If your dates look like '04/24/2017 05:35:00', you can first replace the slashes with hyphens:
library(lubridate)
prods.all$Date2 <- gsub("/", "-", prods.all$Date2)
Then parse the date-times:
parse_date_time(prods.all$Date2, orders = "mdy hms")

Related

as.Date keeps returning NA - Month-Year format

I imported csv data saved from excel and I am using a Mac if that matters.
To simplify things, I have been working with one particular entry from my data "Jan-12". This is of type character and in the form Month-Year.
This is what I have tried:
as.Date("Jan-12", format = "%b-%y")
I keep getting NA. I have browsed through other answers but haven't been able to figure out what's happening.
as.Date() help page says:
If the date string does not specify the date completely, the returned
answer may be system-specific.
If you really need to use as.Date() you can prepend a fixed day (say the 1st) to the date and convert:
mydate <- "Jan-12"
as.Date(paste0("01-", mydate), format = "%d-%b-%y")
# [1] "2012-01-01"
Or you can use lubridate::fast_strptime():
library(lubridate)
fast_strptime("Jan-12", "%b-%y")
# [1] "2012-01-01 UTC"
Here is another option using zoo, then you can use as.Date.
library(zoo)
as.Date(as.yearmon("Jan-12", "%b-%y"))
# [1] "2012-01-01"
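The same prepend-a-day idea extends to a whole column. This is a sketch only: the vector below is invented, and because %b matches month abbreviations in the current locale, it switches LC_TIME to "C" first, which assumes the data uses English abbreviations.

```r
# %b is locale-dependent; force English month abbreviations for this session.
invisible(Sys.setlocale("LC_TIME", "C"))

# Invented Month-Year strings standing in for the imported column.
my <- c("Jan-12", "Feb-12", "Dec-13")

# Add a fixed day so each string describes a complete date.
first_of_month <- as.Date(paste0("01-", my), format = "%d-%b-%y")
print(first_of_month)
```

If the conversion still returns NA, the locale is a likely suspect, since month names from an English Excel export will not match a non-English LC_TIME setting.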

Converting time/date format character string to Date in R [duplicate]

A client sent me an Excel file with dates formatted as e.g. 3/15/2012 for March 15. I saved this as a .csv file and then used
camm$Date <- as.Date(camm$Date, "%m/%d/%y")
but this gave me values starting in the year 2020!
I tried to reformat the dates in the original csv file so that they were e.g. 03/14/2013 but was unable to do so.
Any help appreciated
Use capital Y in as.Date call instead. This should do the trick:
> as.Date("3/15/2012", "%m/%d/%Y")
[1] "2012-03-15"
As the examples in the help file show, use %Y when the year is given in full (four digits) and %y when it is given as two digits. For example:
> dates <- c("02/27/92", "02/27/92", "01/14/92", "02/28/92", "02/01/92")
> as.Date(dates, "%m/%d/%y")
[1] "1992-02-27" "1992-02-27" "1992-01-14" "1992-02-28" "1992-02-01"
In your example the year is written as 2012, so %Y is the right specifier; in the second example (taken from the as.Date help file) the year is written as 92, so %y is correct. See ?as.Date for further details.
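To see where 2020 comes from: %y reads at most two digits, so it takes the "20" of "2012" as the year and the remaining "12" is ignored as trailing characters. A short check:

```r
d <- "3/15/2012"

two_digit  <- as.Date(d, format = "%m/%d/%y")  # reads "20", interpreted as 2020
four_digit <- as.Date(d, format = "%m/%d/%Y")  # reads the full "2012"

print(two_digit)
print(four_digit)
```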
You might also give a try to the lubridate package if you do not want to deal with the hieroglyphics :)
> library(lubridate)
> parse_date_time('3/15/2012', 'mdy')
1 parsed with %m/%d/%Y
[1] "2012-03-15 UTC"
PS: of course I do not encourage anyone to add extra dependencies; this answer was just posted here as an alternative (and quick to remember) solution.
To complete the picture, you might also try the recently introduced (2016-09) package anytime which takes advantage of the Boost C++ libraries:
anytime::anytime("3/15/2012")
#[1] "2012-03-15 CET"
We can use mdy from lubridate
lubridate::mdy('3/15/2012')
#[1] "2012-03-15"
Or parse_date from readr which uses same format as as.Date
readr::parse_date('3/15/2012', '%m/%d/%Y')
#[1] "2012-03-15"

R - Help in Converting factor to date (%m/%d/%Y %H:%M)

I am importing a data frame into R, but R is not recognizing the date columns as dates.
> mydata[1,1]
[1] 1/1/2003 0:00
216332 Levels: 1/1/2003 0:00 1/1/2003 0:15 1/1/2003 0:30 ... 9/9/2007 9:55
I tried:
> as.Date(mydata[1,1], format = "%m/%d/%Y %H:%M")
[1] "2003-01-01"
But then I miss the time.
If I do
> strptime(mydata[2,1], format = "%m/%d/%Y %H:%M")
[1] "2003-01-01 00:15:00 EST"
I get what I need. However it does not work when I assign this result to my variable
> mydata[,1] <- strptime(mydata[,1], format = "%m/%d/%Y %H:%M")
Warning message:
In `[<-.data.frame`(`*tmp*`, , 1, value = list(sec = c(0, 0, 0, :
provided 11 variables to replace 1 variables
My question is similar to the question at Set time value into data frame cell.
Although it is well explained, after spending some time reading and trying I could not figure it out on my own.
The levels mean you have a factor. You need to convert to character with as.character():
dt <- as.POSIXct(as.character(mydata[, 1]), format = "%m/%d/%Y %H:%M")
The first item, with time 0:00, will not show the time when printed, but the others will. The error occurs because a POSIXlt object is internally a list of 11 components, so the data frame assignment sees 11 variables instead of one. Generally it is better to use as.POSIXct than strptime, because strptime returns a POSIXlt object and those are a bit of a mess to work with:
d <- factor("1/1/2003 0:01")
as.POSIXct(as.character(d), format = "%m/%d/%Y %H:%M")
# [1] "2003-01-01 00:01:00 PST"
If you are using read.table, read.csv or similar functions to read in the data then you could look at this solution for a way to specify which columns will be dates and have them automatically converted as they are read in. This will do the conversion on the character strings without any conversion to factor (which may be part of your problem).
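Returning to the POSIXlt versus POSIXct point, here is a sketch of the difference using a two-row stand-in for mydata. The column name when and tz = "UTC" are choices made for this example, not details from the question.

```r
# Two-row stand-in for the imported data; the date column arrives as a factor.
mydata <- data.frame(when = factor(c("1/1/2003 0:00", "1/1/2003 0:15")))

fmt <- "%m/%d/%Y %H:%M"
txt <- as.character(mydata$when)

# strptime() gives POSIXlt, which is a list of components underneath.
lt <- strptime(txt, format = fmt, tz = "UTC")
print(is.list(unclass(lt)))

# as.POSIXct() gives one number per row, so it fits a data frame column.
mydata$when <- as.POSIXct(txt, format = fmt, tz = "UTC")
print(class(mydata$when))
print(format(mydata$when, "%Y-%m-%d %H:%M"))
```

Passing an explicit tz keeps the result independent of the machine's local time zone, which is why the printed zone in the answers above differs between posters.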
When dealing with dates, I find lubridate can be very helpful:
library(lubridate)
mydata[, 1] <- mdy_hm(mydata[, 1])
If you don't want to deal with levels, try this.
First convert your data to character:
data <- as.character(mydata[1, 1])
Then parse it with the input format and print it in the output format you need, for example:
date <- format(as.POSIXct(data, format = "%m/%d/%Y %H:%M", tz = "EST"), "%Y-%m-%d %H")
