A client sent me an Excel file with dates formatted as e.g 3/15/2012 for March 15. I saved this as a .csv file and then used
camm$Date <- as.Date(camm$Date, "%m/%d/%y")
but this gave me values starting in the year 2020!
I tried to reformat the dates in the original csv file so that they were e.g. 03/14/2013 but was unable to do so.
Any help appreciated
Use capital Y in as.Date call instead. This should do the trick:
> as.Date("3/15/2012", "%m/%d/%Y")
[1] "2012-03-15"
From the help file's examples you can realize when year is full specified you should use %Y otherwise %y for example:
> dates <- c("02/27/92", "02/27/92", "01/14/92", "02/28/92", "02/01/92")
> as.Date(dates, "%m/%d/%y")
[1] "1992-02-27" "1992-02-27" "1992-01-14" "1992-02-28" "1992-02-01"
You can see that in your example the Year format is 2012 then you should use %Y, and in the other example (taken from the as.Date help file) Year format is 92 then using %y is the correct way to go. See as.Date for further details.
You might also give a try to the lubridate package if you do not want to deal with the hieroglyphics :)
> library(lubridate)
> parse_date_time('3/15/2012', 'mdy')
1 parsed with %m/%d/%Y
[1] "2012-03-15 UTC"
PS.: of course I do not encourage anyone to use any extra dependencies, this answer was just posted here as an alternative (and quick to remeber) solution
To complete the picture, you might also try the recently introduced (2016-09) package anytime which takes advantage of the Boost C++ libraries:
anytime::anytime("3/15/2012")
#[1] "2012-03-15 CET"
We can use mdy from lubridate
lubridate::mdy('3/15/2012')
#[1] "2012-03-15"
Or parse_date from readr which uses same format as as.Date
readr::parse_date('3/15/2012', '%m/%d/%Y')
#[1] "2012-03-15"
Related
I imported csv data saved from excel and I am using a Mac if that matters.
To simplify things, I have been working with one particular entry from my data "Jan-12". This is of type character and in the form Month-Year.
This is what I have tried:
as.Date("Jan-12", format = "%b-%y")
I keep getting NA. I have browsed through other answers but haven't been able to figure out whats happening.
as.Date() help page says:
If the date string does not specify the date completely, the returned
answer may be system-specific.
If you really need to use as.Date() you can append some fixed day (say 1st) to date and convert
mydate <- "Jan-12"
as.Date(paste0("01-", mydate), format= "%d-%b-%y")
"2012-01-01"
Or you can use lubridate::fast_strptime():
library(lubridate)
fast_strptime("Jan-12", "%b-%y")
"2012-01-01 UTC"
Here is another option using zoo, then you can use as.Date.
library(zoo)
as.Date(as.yearmon("Jan-12", "%b-%y"))
# [1] "2012-01-01"
I have a data frame with a character column of date-times.
When I use as.Date, most of my strings are parsed correctly, except for a few instances. The example below will hopefully show you what is going on.
# my attempt to parse the string to Date -- uses the stringr package
prods.all$Date2 <- as.Date(str_sub(prods.all$Date, 1,
str_locate(prods.all$Date, " ")[1]-1),
"%m/%d/%Y")
# grab two rows to highlight my issue
temp <- prods.all[c(1925:1926), c(1,8)]
temp
# Date Date2
# 1925 10/9/2009 0:00:00 2009-10-09
# 1926 10/15/2009 0:00:00 0200-10-15
As you can see, the year of some of the dates is inaccurate. The pattern seems to occur when the day is double digit.
Any help you can provide will be greatly appreciated.
The easiest way is to use lubridate:
library(lubridate)
prods.all$Date2 <- mdy(prods.all$Date2)
This function automatically returns objects of class POSIXct and will work with either factors or characters.
You may be overcomplicating things, is there any reason you need the stringr package? You can use as.Date and its format argument to specify the input format of your string.
df <- data.frame(Date = c("10/9/2009 0:00:00", "10/15/2009 0:00:00"))
as.Date(df$Date, format = "%m/%d/%Y %H:%M:%S")
# [1] "2009-10-09" "2009-10-15"
Note the Details section of ?as.Date:
Character strings are processed as far as necessary for the format specified: any trailing characters are ignored
Thus, this also works:
as.Date(df$Date, format = "%m/%d/%Y")
# [1] "2009-10-09" "2009-10-15"
All the conversion specifications that can be used to specify the input format are found in the Details section in ?strptime. Make sure that the order of the conversion specification as well as any separators correspond exactly with the format of your input string.
More generally and if you need the time component as well, use as.POSIXct or strptime:
as.POSIXct(df$Date, "%m/%d/%Y %H:%M:%S")
strptime(df$Date, "%m/%d/%Y %H:%M:%S")
I'm guessing at what your actual data might look at from the partial results you give.
If you don't know the format you could use anytime::anydate, which tries to match to common formats:
library(anytime)
date <- c("01/01/2000 0:00:00", "Jan 1, 2000 0:00:00", "2000-Jan-01 0:00:00")
anydate(date)
[1] "2000-01-01" "2000-01-01" "2000-01-01"
library(lubridate)
if your date format is like this '04/24/2017 05:35:00'then change it like below
prods.all$Date2<-gsub("/","-",prods.all$Date2)
then change the date format
parse_date_time(prods.all$Date2, orders="mdy hms")
This should be simple but I can't figure it out. How should I go about formatting dates that are '20150703' into '07-03-2015'? Thanks
You may use format after converting to 'Date' class
format(as.Date(dates, '%Y%m%d'), '%m-%d-%Y')
#[1] "07-03-2015"
data
dates <- '20150703'
Also take a look at lubridate package here which makes it easier to work with dates.
ymd("20150703")
gives
[1] "2015-07-03 UTC"
A client sent me an Excel file with dates formatted as e.g 3/15/2012 for March 15. I saved this as a .csv file and then used
camm$Date <- as.Date(camm$Date, "%m/%d/%y")
but this gave me values starting in the year 2020!
I tried to reformat the dates in the original csv file so that they were e.g. 03/14/2013 but was unable to do so.
Any help appreciated
Use capital Y in as.Date call instead. This should do the trick:
> as.Date("3/15/2012", "%m/%d/%Y")
[1] "2012-03-15"
From the help file's examples you can realize when year is full specified you should use %Y otherwise %y for example:
> dates <- c("02/27/92", "02/27/92", "01/14/92", "02/28/92", "02/01/92")
> as.Date(dates, "%m/%d/%y")
[1] "1992-02-27" "1992-02-27" "1992-01-14" "1992-02-28" "1992-02-01"
You can see that in your example the Year format is 2012 then you should use %Y, and in the other example (taken from the as.Date help file) Year format is 92 then using %y is the correct way to go. See as.Date for further details.
You might also give a try to the lubridate package if you do not want to deal with the hieroglyphics :)
> library(lubridate)
> parse_date_time('3/15/2012', 'mdy')
1 parsed with %m/%d/%Y
[1] "2012-03-15 UTC"
PS.: of course I do not encourage anyone to use any extra dependencies, this answer was just posted here as an alternative (and quick to remeber) solution
To complete the picture, you might also try the recently introduced (2016-09) package anytime which takes advantage of the Boost C++ libraries:
anytime::anytime("3/15/2012")
#[1] "2012-03-15 CET"
We can use mdy from lubridate
lubridate::mdy('3/15/2012')
#[1] "2012-03-15"
Or parse_date from readr which uses same format as as.Date
readr::parse_date('3/15/2012', '%m/%d/%Y')
#[1] "2012-03-15"
I have a simple question regarding R's lubridate package. I've a series of timestamps in seconds since epoch. I want to convert this to YYYY-MM-DD-HH format. In base R, I can do something like this to first convert it to a date format
> x = as.POSIXct(1356129107,origin = "1970-01-01",tz = "GMT")
> x
[1] "2012-12-21 22:31:47 GMT"
Note the above just converts it to a date format, not the YYYY-MM-DD-HH format. How would I do this in lubridate? How would I do it using base R?
Thanks much in advance
lubridate has an as_datetime() that happens to have UNIX epoch time as the default origin time to make this really simple:
> as_datetime(1356129107)
[1] "2012-12-21 22:31:47 UTC"
more details can be found here: https://rdrr.io/cran/lubridate/man/as_date.html
Dirk is correct. However, if you are intent on using lubridate functions:
paste( year(dt), month(dt), mday(dt), hour(dt) sep="-")
If on the other hand you want to handle the POSIXct objects the way they were supposed to be used then this should satisfy:
format(x, format="%Y-%m-%d-%H")
I use the lubridate solution provided by #leerssej
But in case anyone prefers #IRTFM's solution in base R, but also wants minutes and seconds, here's an example of how to do that:
as.POSIXct("2019-03-15 16:17:42" , format="%Y-%m-%d %H:%M:%OS")