I need to convert a string of dates that is in multiple formats to valid dates.
e.g.
dates <- c("01-01-2017","02-01-2017","12-01-2016","20160901","20161001", "20161101")
> as.Date(dates, format=c("%m-%d-%Y","%Y%m%d"))
[1] "2017-01-01" NA "2016-12-01" "2016-09-01" NA "2016-11-01"
two dates show as NA
This is pretty much what I wrote the anytime package for:
R> dates <- c("01-01-2017","02-01-2017","12-01-2016","20160901","20161001",
+ "20161101")
R> library(anytime)
R> anydate(dates)
[1] "2017-01-01" "2017-02-01" "2016-12-01" "2016-09-01"
[5] "2016-10-01" "2016-11-01"
R>
Parse any sane input reliably and without explicit format or origin or other line noise.
That being said, not starting ISO style with the year is asking for trouble: 02-03-2017 could be February 3 or March 2. I follow the North American convention (month first), which I too consider somewhat broken but which is so darn prevalent. Do yourself a favour and try to limit inputs to ISO dates, or at least ISO order YYYYMMDD.
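The NAs in the question come from as.Date() recycling the format vector element by element, so the second date is tried against "%Y%m%d" only. If adding a package is not an option, a minimal base R sketch is to try each format in turn and keep the first successful parse. The helper name parse_mixed_dates is hypothetical, and the order of the formats matters when a string could match more than one.

```r
# Hypothetical helper (not part of base R or anytime):
# try each format in order and fill in only the entries that are still NA.
parse_mixed_dates <- function(x, formats) {
  out <- as.Date(rep(NA_character_, length(x)))
  for (fmt in formats) {
    todo <- is.na(out)                         # entries not parsed yet
    out[todo] <- as.Date(x[todo], format = fmt)
  }
  out
}

dates <- c("01-01-2017", "02-01-2017", "12-01-2016",
           "20160901", "20161001", "20161101")
parsed <- parse_mixed_dates(dates, c("%m-%d-%Y", "%Y%m%d"))
print(parsed)
```

Entries that match none of the formats stay NA, which makes leftovers easy to spot with is.na(parsed).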
I tried library(anytime), but it did not work for my large data set.
Then I found this sequence useful:
df$Date2 <- format(as.Date(df$Date, format="%m/%d/%Y"), "%d/%m/%y")
df$Date2 <- as.Date(df$Date2,"%d/%m/%y")
It worked for me for "8/10/2005" as well as "08/13/05" in the same column.
I'm having trouble formatting a list of dates in R. The conventional methods of formatting in R such as as.Date or as.POSIXct don't seem to be working.
I have dates in the format: 1012015
using
as.POSIXct(as.character(data$Start_Date), format = "%m%d%Y")
does not give me an error, but my date returns
"0015-10-12" because the month is not a two digit number.
Is there a way to change this into the correct date format?
The lubridate package can help with this:
lubridate::mdy(1012015)
[1] "2015-01-01"
The format looks ambiguous but the OP gave two hints:
He is using format = "%m%d%Y" in his own attempt, and
he argues the issue is because the month is not a two digit number
This uses only base R. The %08d specifies a number to be formatted into 8 characters with 0 fill giving in this case "01012015".
as.POSIXct(sprintf("%08d", 1012015), format = "%m%d%Y")
## [1] "2015-01-01 EST"
Note that if you don't have any hours/minutes/seconds it would be less error prone to use "Date" class since then the possibility of subtle time zone errors is eliminated.
as.Date(sprintf("%08d", 1012015), format = "%m%d%Y")
## [1] "2015-01-01"
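The same zero-padding idea carries over to a whole column, since sprintf() is vectorised. The sketch below uses made-up values and assumes, as in the question, that only the leading zero of the month has been dropped (the day is always two digits).

```r
# Illustrative values only: month, day, year with the month's leading zero lost
raw <- c(1012015, 12312015, 7042016)

# Pad to 8 characters with leading zeros, then parse as month-day-year
padded <- sprintf("%08d", raw)
parsed <- as.Date(padded, format = "%m%d%Y")

print(padded)
print(parsed)
```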
This is the information contained within my dataframe:
## minuteofday: factor w/ 89501 levels "2013-06-01 08:07:00",...
## dDdt: num 7.8564 2.318 ...
## minutes: POSIXlt, format: NA NA NA
I need to convert the minute of day column to a date/time format:
minuteave$minutes <- as.POSIXlt(as.character(minuteave$minuteofday), format="%m/%d/%Y %H:%M:%S")
I've tried as.POSIXlt, as.POSIXct and as.Date. None of which worked. Does anyone have ANY thoughts.
The goal is to plot minutes vs. dDdt, but it won't let me plot in the specified time period that I want to as a factor. I have no idea what to try next...
You need to insert an as.character() before parsing as a Datetime or Date.
A factor will always come back first as a number corresponding to its level.
You can save the conversion from factor to character by telling read.csv() etc. not to store strings as factors: stringsAsFactors=FALSE. You can also set that as a global option.
Once you have it as character, make sure you match the format string to your data:
R> as.POSIXct("2013-06-01 08:07:00", format="%Y-%m-%d %H:%M:%S")
[1] "2013-06-01 08:07:00 CDT"
R>
Note the %Y-%m-%d I used, as opposed to your %m/%d/%Y.
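To make the factor point concrete, here is a small sketch with a two-level factor standing in for the minuteofday column. The values are made up, and tz = "UTC" is an assumption so that the result does not depend on the local time zone.

```r
# Stand-in for the factor column from the question (illustrative values)
f <- factor(c("2013-06-01 08:07:00", "2013-06-01 08:08:00"))

codes  <- as.integer(f)     # the underlying level codes, not the timestamps
labels <- as.character(f)   # the timestamps as text

# Parse the labels with a format string that matches the data
parsed <- as.POSIXct(labels, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

print(codes)
print(parsed)
```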
Edit on 3 Jan 2016: This is now much easier thanks to the anytime package which automagically converts from many types, including factor, and does so without requiring a format string.
R> as.factor("2013-06-01 08:07:00")
[1] 2013-06-01 08:07:00
Levels: 2013-06-01 08:07:00
R>
R> library(anytime)
R> anytime(as.factor("2013-06-01 08:07:00"))
[1] "2013-06-01 08:07:00 CDT"
R>
R> class(anytime(as.factor("2013-06-01 08:07:00")))
[1] "POSIXct" "POSIXt"
R>
As you can see we just feed the factor variable into anytime() and out comes the desired POSIXct type.
Try this
library(lubridate)
minuteave$minutes <- ymd_hms(as.character(minuteave$minuteofday))
this will return minuteave$minutes as a POSIXct object.
Hope this helps you.
I have a data.frame (CSV originally) in R with dates in the following 3 formats:
2011-06-02T17:16:05Z
2012-06-02T17:16:05-07:00
6/2/11 17:16:05
which is year-month-day-time. I don't quite know what the -07:00 is, as it seems to be the same for all timestamps (except for some where it is -08:00), but I guess it's some type of time zone offset.
I am not quite sure what format these are (does anyone know?), but I need to convert it to this format:
6/2/11 17:16:05
which is month/day/year followed by the time
I would like to do this in such a way that all the dates in the CSV (in one and the same row) are converted to this format. How can I accomplish this in R?
The full dataset can be found here.
Here's another attempt, assuming your data is text to start with:
test <- c("2011-06-02T17:16:05Z","2012-06-02T17:16:05-07:00")
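# The trailing "Z" / "-07:00" is not part of the format and is ignored, so the clock time is kept as written
format(as.POSIXct(test,format="%Y-%m-%dT%H:%M:%S"),"%m/%d/%y %H:%M:%S")
[1] "06/02/11 17:16:05" "06/02/12 17:16:05"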
You can try the following, where myDates would be the column of dates
format(strptime(myDates, format="%Y-%m-%dT%H:%M:%S"), format= "%m/%d/%Y %H:%M:%S")
[1] "06/02/2011 17:16:05" "06/02/2012 17:16:05"
or with 2-digit year
# Note the lower-case %y in the output format
format(strptime(myDates, format="%Y-%m-%dT%H:%M:%S"), format= "%m/%d/%y %H:%M:%S")
[1] "06/02/11 17:16:05" "06/02/12 17:16:05"
As for the Z, that indicates UTC/GMT (think: London in winter).
The -07:00 indicates 7 hours behind GMT (think: Colorado / MST etc.)
Please see here for more reference
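If the offsets should be honoured rather than dropped, one possible base R sketch is to normalise them into the -hhmm form that strptime's %z accepts and parse everything to UTC. The regular expressions and the choice of UTC as the output zone are my assumptions, not something the question asks for.

```r
stamps <- c("2011-06-02T17:16:05Z", "2012-06-02T17:16:05-07:00")

# Normalise the offsets: "Z" becomes "+0000", "-07:00" becomes "-0700"
norm <- sub("Z$", "+0000", stamps)
norm <- sub("([+-][0-9]{2}):([0-9]{2})$", "\\1\\2", norm)

# %z reads the signed offset; the result is stored as UTC
utc <- as.POSIXct(norm, format = "%Y-%m-%dT%H:%M:%S%z", tz = "UTC")

# Render in the target layout (still in UTC)
out <- format(utc, "%m/%d/%y %H:%M:%S")
print(out)
```

With this approach the second timestamp moves to 00:16:05 on the following day, because 17:16:05 at seven hours behind UTC is 00:16:05 UTC. Whether that or the unshifted clock time is wanted depends on what the data are used for.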
The title has it: how do you convert a POSIX date to day-of-year?
An alternative is to format the "POSIXt" object using strftime():
R> today <- Sys.time()
R> today
[1] "2012-10-19 19:12:04 BST"
R> doy <- strftime(today, format = "%j")
R> doy
[1] "293"
R> as.numeric(doy)
[1] 293
which is preferable to remembering that the day of the year is zero-based in the POSIX standard.
As ?POSIXlt reveals, a $yday suffix to a POSIXlt date (or even a vector of such) will convert to day of year. Beware that POSIX counts Jan 1 as day 0, so you might want to add 1 to the result.
It took me embarrassingly long to find this, so I thought I'd ask and answer my own question.
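A short sketch of the zero-based behaviour, using a fixed date so the numbers are reproducible. The date and tz = "UTC" are arbitrary choices for illustration.

```r
d <- as.POSIXlt("2012-10-19", tz = "UTC")

zero_based <- d$yday                          # Jan 1 counts as day 0
one_based  <- d$yday + 1                      # conventional day of year
from_j     <- as.integer(strftime(d, "%j"))   # %j is already one-based

print(c(zero_based, one_based, from_j))
```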
Alternatively, the excellent lubridate package provides the yday function, which is just a wrapper for the above method. It conveniently defines similar functions for other units (month, year, hour, ...).
library(lubridate)
today <- Sys.time()
yday(today)
I realize it isn't quite what the poster was looking for, but I needed to convert POSIX date-times into a fractional day of the year for time series analysis and ended up doing this:
today <- Sys.time()
doy2015f <- difftime(today, as.POSIXct("2015-01-01 00:00", tz="GMT"), units='days')
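The same idea can be wrapped so that the start of the year is derived from the timestamp itself instead of being hard-coded. This is only a sketch: the helper name frac_doy is hypothetical, it works in UTC throughout, and like the line above it is zero-based (midnight on Jan 1 gives 0).

```r
# Hypothetical helper: fractional days elapsed since the start of x's year (UTC)
frac_doy <- function(x) {
  year_start <- as.POSIXct(
    paste0(format(x, "%Y", tz = "UTC"), "-01-01 00:00:00"),
    tz = "UTC"
  )
  as.numeric(difftime(x, year_start, units = "days"))
}

ts <- as.POSIXct("2015-01-02 12:00:00", tz = "UTC")
frac <- frac_doy(ts)   # one and a half days after the start of 2015
print(frac)
```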
The data.table package also provides a yday() function.
require(data.table)
today <- Sys.time()
yday(today)
This is how I do it (the missing year defaults to the current one, and $yday is zero-based):
as.POSIXlt(c("15.4", "10.5", "15.5", "10.6"), format = "%d.%m")$yday
# [1] 104 129 134 160