I have a data frame with a character column of date-times.
When I use as.Date, most of my strings are parsed correctly, except for a few instances. The example below will hopefully show you what is going on.
# my attempt to parse the string to Date -- uses the stringr package
prods.all$Date2 <- as.Date(str_sub(prods.all$Date, 1,
str_locate(prods.all$Date, " ")[1]-1),
"%m/%d/%Y")
# grab two rows to highlight my issue
temp <- prods.all[c(1925:1926), c(1,8)]
temp
# Date Date2
# 1925 10/9/2009 0:00:00 2009-10-09
# 1926 10/15/2009 0:00:00 0200-10-15
As you can see, the year of some of the dates is inaccurate. The pattern seems to occur when the day is double digit.
Any help you can provide will be greatly appreciated.
The easiest way is to use lubridate:
library(lubridate)
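prods.all$Date2 <- mdy_hms(prods.all$Date)
Because the strings include a time, mdy_hms() is the matching parser. It returns objects of class POSIXct and will work with either factors or characters. Wrap the result in as.Date() if you only need the date.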
You may be overcomplicating things. Is there any reason you need the stringr package? You can use as.Date and its format argument to specify the input format of your string.
df <- data.frame(Date = c("10/9/2009 0:00:00", "10/15/2009 0:00:00"))
as.Date(df$Date, format = "%m/%d/%Y %H:%M:%S")
# [1] "2009-10-09" "2009-10-15"
Note the Details section of ?as.Date:
Character strings are processed as far as necessary for the format specified: any trailing characters are ignored
Thus, this also works:
as.Date(df$Date, format = "%m/%d/%Y")
# [1] "2009-10-09" "2009-10-15"
All the conversion specifications that can be used to specify the input format are found in the Details section in ?strptime. Make sure that the order of the conversion specification as well as any separators correspond exactly with the format of your input string.
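As for why the original attempt produced years like 0200: judging from the code in the question, a likely explanation is that `str_locate(prods.all$Date, " ")[1]` keeps only the position of the space in the first row, so every string is cut at that same position. The sketch below reproduces the effect in base R, with `regexpr()` and `substr()` standing in for `str_locate()` and `str_sub()`; the vector `x` is made-up sample data, not the asker's data frame.

```r
# Made-up sample data in the same layout as the question
x <- c("10/9/2009 0:00:00", "10/15/2009 0:00:00")

# Position of the first space in each string (one value per element)
space_pos <- regexpr(" ", x, fixed = TRUE)

# Buggy: [1] keeps only the first string's position, so every string
# is cut at the same place and longer dates lose their last digit
cut_buggy <- substr(x, 1, space_pos[1] - 1)

# Fixed: use the whole vector of positions, one cut point per string
cut_fixed <- substr(x, 1, space_pos - 1)

print(cut_buggy)
print(cut_fixed)
print(as.Date(cut_fixed, format = "%m/%d/%Y"))
```

Since as.Date ignores trailing characters anyway, the cutting step can simply be dropped, as shown above.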
More generally and if you need the time component as well, use as.POSIXct or strptime:
as.POSIXct(df$Date, format = "%m/%d/%Y %H:%M:%S")
strptime(df$Date, "%m/%d/%Y %H:%M:%S")
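A small sketch of how the two calls differ, assuming UTC as the time zone and using a made-up second value with a non-midnight time. The second positional argument of as.POSIXct is tz, so the format is passed by name here.

```r
# Sample strings (the second time is made up to show a non-midnight value)
x <- c("10/9/2009 0:00:00", "10/15/2009 13:30:00")
fmt <- "%m/%d/%Y %H:%M:%S"

# as.POSIXct(): name `format`, because the second positional argument is `tz`
ct <- as.POSIXct(x, format = fmt, tz = "UTC")

# strptime(): `format` is the second argument; the result is POSIXlt
lt <- strptime(x, format = fmt, tz = "UTC")

print(class(ct))   # "POSIXct" "POSIXt"
print(class(lt))   # "POSIXlt" "POSIXt"
print(lt$hour)     # 0 13

# Drop the time again if only the date is needed
print(as.Date(ct))
```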
I'm guessing at what your actual data might look like from the partial results you give.
If you don't know the format you could use anytime::anydate, which tries to match to common formats:
library(anytime)
date <- c("01/01/2000 0:00:00", "Jan 1, 2000 0:00:00", "2000-Jan-01 0:00:00")
anydate(date)
# [1] "2000-01-01" "2000-01-01" "2000-01-01"
library(lubridate)
If your date format is like '04/24/2017 05:35:00', first replace the slashes with hyphens:
prods.all$Date2 <- gsub("/", "-", prods.all$Date2)
Then parse the date-time:
parse_date_time(prods.all$Date2, orders="mdy hms")
I imported CSV data saved from Excel, and I am using a Mac, if that matters.
To simplify things, I have been working with one particular entry from my data "Jan-12". This is of type character and in the form Month-Year.
This is what I have tried:
as.Date("Jan-12", format = "%b-%y")
I keep getting NA. I have browsed through other answers but haven't been able to figure out what's happening.
The as.Date() help page says:
If the date string does not specify the date completely, the returned
answer may be system-specific.
If you really need to use as.Date(), you can prepend a fixed day (say, the 1st) to the string and convert:
mydate <- "Jan-12"
as.Date(paste0("01-", mydate), format = "%d-%b-%y")
# [1] "2012-01-01"
Or you can use lubridate::fast_strptime():
library(lubridate)
fast_strptime("Jan-12", "%b-%y")
# [1] "2012-01-01 UTC"
Here is another option using zoo, then you can use as.Date.
library(zoo)
as.Date(as.yearmon("Jan-12", "%b-%y"))
# [1] "2012-01-01"
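The prepend-a-day approach with as.Date() carries over to a whole column, since paste0() and as.Date() are both vectorized. A minimal sketch, assuming English month abbreviations (%b is matched in the current locale, so the example switches LC_TIME to "C"); the vector `x` is hypothetical sample data.

```r
# Hypothetical month-year strings, as they might come out of the CSV
x <- c("Jan-12", "Feb-12", "Mar-13")

# Assumption: English month abbreviations; %b depends on the locale
invisible(Sys.setlocale("LC_TIME", "C"))

# Prepend a fixed day so every string describes a complete date
d <- as.Date(paste0("01-", x), format = "%d-%b-%y")
print(d)

# Convert back to the original layout to check the round trip
print(format(d, "%b-%y"))
```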
A client sent me an Excel file with dates formatted as e.g. 3/15/2012 for March 15. I saved this as a .csv file and then used
camm$Date <- as.Date(camm$Date, "%m/%d/%y")
but this gave me values starting in the year 2020!
I tried to reformat the dates in the original csv file so that they were e.g. 03/14/2013 but was unable to do so.
Any help appreciated
Use capital Y in as.Date call instead. This should do the trick:
> as.Date("3/15/2012", "%m/%d/%Y")
[1] "2012-03-15"
As the examples in the help file show, use %Y when the year is given in full (four digits) and %y when it has only two digits. For example:
> dates <- c("02/27/92", "02/27/92", "01/14/92", "02/28/92", "02/01/92")
> as.Date(dates, "%m/%d/%y")
[1] "1992-02-27" "1992-02-27" "1992-01-14" "1992-02-28" "1992-02-01"
In your example the year is written as 2012, so %Y is the right choice; in the other example (taken from the as.Date help file) the year is written as 92, so %y is correct. See ?as.Date for further details.
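To see where the year 2020 comes from, here is a short side-by-side sketch using the date string from the question. With %y only the first two digits of the year are read, and the remaining characters are ignored.

```r
x <- "3/15/2012"

# %y reads only two digits ("20" -> 2020); the trailing "12" is ignored
wrong <- as.Date(x, format = "%m/%d/%y")

# %Y reads the full four-digit year
right <- as.Date(x, format = "%m/%d/%Y")

print(wrong)
print(right)
```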
You might also give a try to the lubridate package if you do not want to deal with the hieroglyphics :)
> library(lubridate)
> parse_date_time('3/15/2012', 'mdy')
1 parsed with %m/%d/%Y
[1] "2012-03-15 UTC"
PS: of course I do not encourage anyone to use any extra dependencies; this answer was just posted here as an alternative (and quick to remember) solution.
To complete the picture, you might also try the recently introduced (2016-09) package anytime which takes advantage of the Boost C++ libraries:
anytime::anytime("3/15/2012")
#[1] "2012-03-15 CET"
We can use mdy from lubridate:
lubridate::mdy('3/15/2012')
#[1] "2012-03-15"
Or parse_date from readr, which uses the same format specification as as.Date:
readr::parse_date('3/15/2012', '%m/%d/%Y')
#[1] "2012-03-15"
I read a file with the function
site_wind <- read.delim(import,header=F,sep="\t",skip=nline,quote="\"")
In the first column I have dates and times in the form:
01/05/2011 0:10 where "day-month-year hour:min"
I want to convert site_wind$V1 to class POSIXct and POSIXlt, but when I run:
as.POSIXct(site_wind$V1,"%d-%m-%Y %H:%M",TZ="GMT")
I get:
"0026-01-20 GMT"
I have tried some alternatives, but I don't know how to solve this problem.
You need literal / as the delimiter in the dates. In the format = "%d-%m-%Y %H:%M" part you are using literal - as the date separator, which doesn't match the date example you showed. I think you want
as.POSIXct(as.character(site_wind$V1), format = "%d/%m/%Y %H:%M", tz="GMT")
Note that the argument is tz, not TZ; R was silently ignoring TZ in your original call.
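A quick check on a single value in the layout from the question, as a sketch rather than a drop-in for the real column: the format is passed by name with / separators, and the POSIXlt version is derived from the POSIXct result.

```r
# One value in the layout described in the question: day/month/year hour:min
x <- "01/05/2011 0:10"

# `format` is named and uses "/" to match the input; the time zone argument is `tz`
ct <- as.POSIXct(x, format = "%d/%m/%Y %H:%M", tz = "GMT")

# Same instant stored as POSIXlt, which exposes the individual fields
lt <- as.POSIXlt(ct)

print(ct)
print(lt$mon + 1)  # month number; POSIXlt months count from 0
```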