I am new to R and I am trying to change the format of the date columns in a data frame. My date column is in the format Mar 13 2007 01:05:123AM. The values differ only in the day; the time part is always the same. So I was thinking of changing it to the format Mar 13 2007.
I tried the following code:
df <- read.csv("mydata.csv")
df$collectdate <- format(as.Date(df$collectdate,"%b %d %Y"))
but it gives an error saying "character string is not in a standard unambiguous format". What can I try next?
You could try:
date <- "Mar 13 2007 01:05:123AM"
gsub("(.*)(?=\\s\\d{2}:).*", "\\1", date, perl=TRUE)
#[1] "Mar 13 2007"
as.Date with an explicit format also works on this string without errors, because trailing characters beyond the format are ignored:
format(as.Date(date,"%b %d %Y"), "%b %d %Y")
#[1] "Mar 13 2007"
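Applied to a whole column, the same idea might look like the sketch below. The data frame and its two values are made up to stand in for the CSV; only the column name collectdate comes from the question. It assumes an English locale, since %b matches month abbreviations in the current locale.

```r
# Hypothetical sample data standing in for read.csv("mydata.csv")
df <- data.frame(
  collectdate = c("Mar 13 2007 01:05:123AM", "Mar 14 2007 01:05:123AM"),
  stringsAsFactors = FALSE
)

# Keep only the leading "Mon DD YYYY" part of each string
date_part <- sub("^([[:alpha:]]{3} [0-9]{1,2} [0-9]{4}).*$", "\\1", df$collectdate)

# Parse to Date, then format back to the display form
parsed <- as.Date(date_part, format = "%b %d %Y")
df$collectdate <- format(parsed, "%b %d %Y")
df$collectdate
```

Keeping the parsed Date values around (rather than only the formatted strings) is usually more convenient if the column will be sorted or plotted later.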
I received an Excel spreadsheet with a date column stored as text in "MMM DD YYYY" form.
Unfortunately, this needs to be loaded into R. Can anyone help convert this to an R date?
For example, the Excel string Jan 05 2004 should become the R date 2004-01-05.
Thanks
We can use as.Date with the format argument
df1$Date <- as.Date(df1$Date, "%b %d %Y")
df1$Date
#[1] "2004-01-05" "2004-01-06"
Or with lubridate
library(lubridate)
mdy(df1$Date)
Or automatically pick the format with anydate
library(anytime)
anydate(df1$Date)
data
df1 <- data.frame(Date = c("Jan 05 2004", "Jan 06 2004"), stringsAsFactors = FALSE)
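When real spreadsheet data is involved, some cells may not match the expected layout. As a small sketch (the malformed third value is made up), as.Date returns NA for strings that do not fit the supplied format, so failures can be listed before the column is overwritten:

```r
# The two sample values plus one deliberately malformed entry (made up)
raw <- c("Jan 05 2004", "Jan 06 2004", "not a date")

parsed <- as.Date(raw, format = "%b %d %Y")

# Strings that failed to parse come back as NA
failed <- raw[is.na(parsed)]
failed
```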
I have a date in the character format "2017-03" and I want to convert it to "March 2017" for display in ggplot in R. But when I try to convert it using as.Date("2017-03","%Y-%m") it gives NA.
Consider using the zoo::as.yearmon function:
library(zoo)
#Sample data
v <- c("2014-05", "2017-03")
as.yearmon(v, "%Y-%m")
#[1] "May 2014" "Mar 2017"
#If you want the full month name, format the yearmon object:
format(as.yearmon(v, "%Y-%m"), "%B %Y")
#[1] "May 2014" "March 2017"
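If adding a package is not an option, a base R sketch is also possible. as.Date returns NA here because the string has no day, so one workaround is to append an arbitrary day (the first of the month is assumed below) before parsing:

```r
v <- c("2014-05", "2017-03")

# as.Date needs a day, so append the first of the month
d <- as.Date(paste0(v, "-01"), format = "%Y-%m-%d")

# Full month name and year, e.g. for axis labels (month names follow the locale)
labels <- format(d, "%B %Y")
labels
```

For plotting, it may be simpler to keep d as a Date and let ggplot2 format the axis, for example with scale_x_date(date_labels = "%B %Y").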
Dates can be converted in both directions.
The conversion you mentioned, quoting MKR's answer, uses the zoo package:
library(zoo)
date <- "2017-03"
as.yearmon(date, "%Y-%m")
#[1] "Mar 2017"
format(as.yearmon(date, "%Y-%m"), "%B %Y")
#[1] "March 2017"
If you want to parse "March 1 2017" or similar formats back to a standard date such as 2017-03-01:
Use the readr package, which provides parse_date() along with the locale() and date_names_langs() helpers used below
library(readr)
DATE <- "March 1 2017"
parse_date(DATE, "%B %d %Y")
#[1] "2017-03-01"
Or, if you are parsing dates written in another language:
foreign_date <- "1 janvier 2018"
parse_date(foreign_date, "%d %B %Y", locale = locale("fr"))
#[1] "2018-01-01"
With locale = locale("language") you can parse dates that use non-English month names into standard dates. To list the available language codes:
date_names_langs()
Format codes:
- Year: %Y (4 digits), %y (2 digits; 00-69 -> 2000-2069, 70-99 -> 1970-1999)
- Month: %m (2 digits), %b (abbreviated name: Jan), %B (full name: January)
- Day: %d (2 digits)
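The same codes also work for output with base R's format(). The short sketch below uses an arbitrary date to show what each code produces; it is only an illustration of the codes, not of the parsing functions above.

```r
d <- as.Date("2017-03-01")

format(d, "%Y")  # four-digit year
format(d, "%y")  # two-digit year
format(d, "%m")  # two-digit month
format(d, "%d")  # two-digit day
format(d, "%b")  # abbreviated month name (locale-dependent)
format(d, "%B")  # full month name (locale-dependent)
```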
Currently, my dataset has a time variable (factor) in the following format:
weekday month day hour min seconds +0000 year
I don't know what the "+0000" field is but all observations have this. For example:
"Tues Feb 02 11:05:21 +0000 2018"
"Mon Jun 12 06:21:50 +0000 2017"
"Wed Aug 01 11:24:08 +0000 2018"
I want to convert these values to POSIXlt or POSIXct objects (year-month-day hour:min:sec) and make them numeric. Currently, using as.numeric(as.character(time-variable)) outputs incorrect values.
Thank you for the great responses! I really appreciate it.
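I'm not sure how to reproduce the factor-to-character step, but starting from a character string this code should work:
parts <- unlist(strsplit(as.character("Tues Feb 02 11:05:21 +0000 2018"), " "))
# parts: weekday, month, day, time, offset, year; the offset is +0000, so parse as UTC
strptime(paste(parts[6], parts[2], parts[3], parts[4]), format = '%Y %b %d %H:%M:%S', tz = 'UTC')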
PS: More on date formats and conversion: https://www.stat.berkeley.edu/~s133/dates.html
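On the factor step itself, here is a minimal sketch with a made-up two-element factor. Calling as.numeric() directly on a factor returns its internal level codes, so the values have to go through as.character() and a date parser first. The format string assumes the three-letter weekday form and an English locale.

```r
# Hypothetical factor column using the three-letter weekday form
time_variable <- factor(c("Mon Jun 12 06:21:50 +0000 2017",
                          "Wed Aug 01 11:24:08 +0000 2018"))

# as.numeric() on a factor returns the internal level codes, not times
codes <- as.numeric(time_variable)

# Convert to character first, then parse, then take seconds since 1970-01-01 UTC
parsed <- as.POSIXct(as.character(time_variable),
                     format = "%a %b %d %H:%M:%S %z %Y", tz = "UTC")
secs <- as.numeric(parsed)
secs
```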
For this problem you can get by without using lubridate. First, to extract individual dates we can use regmatches and gregexpr:
date_char <- 'Tue Feb 02 11:05:21 +0000 2018 Mon Jun 12 06:21:50 +0000 2017'
ptrn <- '([[:alpha:]]{3} [[:alpha:]]{3} [[:digit:]]{2} [[:digit:]]{2}\\:[[:digit:]]{2}\\:[[:digit:]]{2} \\+[[:digit:]]{4} [[:digit:]]{4})'
date_vec <- unlist( regmatches(date_char, gregexpr(ptrn, date_char)))
> date_vec
[1] "Tue Feb 02 11:05:21 +0000 2018" "Mon Jun 12 06:21:50 +0000 2017"
You can learn more about regular expressions here.
In the above example, the +0000 field is the UTC offset in hours and minutes (+hhmm); for example, it would be -0500 for the EST time zone. To convert to an R date-time object:
> as.POSIXct(date_vec, format = '%a %b %d %H:%M:%S %z %Y', tz = 'UTC')
[1] "2018-02-02 11:05:21 UTC" "2017-06-12 06:21:50 UTC"
which is the desired output. The formats can be found here or you can use lubridate::guess_formats(). If you don't specify the tz, you'll get the output in your system's time zone (e.g. for me that would be EST). Since the offset is specified in the format, R correctly carries out the conversion.
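To see the offset handling in action, here is a small sketch with a made-up timestamp that is five hours behind UTC. Because %z is part of the format, the parsed instant is shifted accordingly when displayed in UTC.

```r
# Same wall-clock time as the first example, but with a -0500 offset (made up)
x <- "Tue Feb 02 11:05:21 -0500 2018"

utc <- as.POSIXct(x, format = "%a %b %d %H:%M:%S %z %Y", tz = "UTC")

# 11:05:21 at -0500 corresponds to 16:05:21 in UTC
format(utc, "%Y-%m-%d %H:%M:%S", tz = "UTC")
```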
To get numeric values, the following works:
> as.numeric(as.POSIXct(date_vec, format = '%a %b %d %H:%M:%S %z %Y', tz = 'UTC'))
[1] 1517569521 1497248510
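These numbers are seconds since 1970-01-01 00:00:00 UTC, so they can be turned back into date-times when needed. A short sketch of the round trip:

```r
secs <- c(1517569521, 1497248510)

# Rebuild POSIXct values from seconds since the Unix epoch
back <- as.POSIXct(secs, origin = "1970-01-01", tz = "UTC")

format(back, "%Y-%m-%d %H:%M:%S", tz = "UTC")
# [1] "2018-02-02 11:05:21" "2017-06-12 06:21:50"
```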
Note: this is based on uniform string structure. In the OP there was Tues instead of Tue which wouldn't work. The above example is based on the three-letter abbreviation which is the standard reporting format.
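If the data really does contain non-standard weekday spellings such as Tues, one possible workaround is to drop the weekday before parsing, since it is not needed to identify the instant. This is a sketch that assumes every string starts with a single weekday word followed by a space:

```r
x <- c("Tues Feb 02 11:05:21 +0000 2018", "Mon Jun 12 06:21:50 +0000 2017")

# Drop the leading weekday word, whatever its length
no_weekday <- sub("^[[:alpha:]]+ ", "", x)

# Parse the rest; note the format no longer contains %a
parsed <- as.POSIXct(no_weekday, format = "%b %d %H:%M:%S %z %Y", tz = "UTC")
as.numeric(parsed)
```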
If however, your data is a mix of different formats, you'd have to extract individual time strings (customized regexes, of course), then use lubridate::guess_formats() to get the formats and then use those to carry out the conversion.
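As a base R alternative to guessing formats, a list of candidate formats can be tried in turn, filling in only the values that are still NA. This is a sketch rather than a replacement for lubridate: the second input string and the list of candidate formats are assumptions made for illustration.

```r
# Made-up mix of two layouts
x <- c("Tue Feb 02 11:05:21 +0000 2018", "2017-06-12 06:21:50")

# Candidate formats, tried in order (assumed; adjust to the real data)
formats <- c("%a %b %d %H:%M:%S %z %Y", "%Y-%m-%d %H:%M:%S")

# Start with the first format, then retry the failures with the others
parsed <- as.POSIXct(x, format = formats[1], tz = "UTC")
for (fmt in formats[-1]) {
  todo <- is.na(parsed)
  parsed[todo] <- as.POSIXct(x[todo], format = fmt, tz = "UTC")
}
parsed
```

Strings without an explicit offset are interpreted in the tz given, so the choice of "UTC" here is itself an assumption about the data.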
Hope this is helpful!!
In a data.frame, I have a date time stamp in the form:
head(x$time)
[1] "Thu Oct 11 22:18:02 2012" "Thu Oct 11 22:50:15 2012" "Thu Oct 11 22:54:17 2012"
[4] "Thu Oct 11 22:43:13 2012" "Thu Oct 11 22:41:18 2012" "Thu Oct 11 22:15:19 2012"
Every time I try to convert it with as.Date, lubridate, or zoo, I get NAs or errors.
What is the right way to convert this time to a readable form?
I've tried:
Time<-strptime(x$time,format="&m/%d/%Y %H:$M")
x$minute<-parse_date_time(x$time)
x$minute<-mdy(x$time)
x$minute<-as.Date(x$time,"%m/%d/%Y %H:%M:%S")
x$minute<-as.time(x$time)
x$minute<-as.POSIXct(x$time,format="%H:%M")
x$minute<-minute(x$time)
What you really want is strptime(). Try something like:
strptime(x$time, "%a %b %d %H:%M:%S %Y")
As an example of the interesting things you can do with strptime(), consider the following:
thedate <- "I came to your house at 11:45 on January 21, 2012."
strptime(thedate, "I came to your house at %H:%M on %B %d, %Y.")
# [1] "2012-01-21 11:45:00"
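For the data in the question, a sketch of pulling out the minute might look like this. The two strings are copied from the sample output; tz = "UTC" is an assumption, since the strings carry no time zone.

```r
time_chr <- c("Thu Oct 11 22:18:02 2012", "Thu Oct 11 22:50:15 2012")

# strptime() returns POSIXlt, which stores each component as a list field
parsed <- strptime(time_chr, "%a %b %d %H:%M:%S %Y", tz = "UTC")
minutes <- parsed$min
minutes

# POSIXct is the compact form usually stored in a data frame column
as_ct <- as.POSIXct(parsed)
```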
Another option is to use lubridate::parse_date_time():
library(lubridate)
parse_date_time(x$time, "%a %b %d %H:%M:%S %Y")
Or more simply:
parse_date_time(x$time, "abdHMSY")
From the docs:
It differs from base::strptime() in two respects. First, it allows specification of the order in which the formats occur without the need to include separators and % prefix. Such a formatting argument is referred to as "order". Second, it allows the user to specify several format-orders to handle heterogeneous date-time character representations.
The docs contain all the formats (the "abdHMSY" etc.) recognized by lubridate.
I was wondering if there is a way for R to turn this format into any date object. The format is 'month [space] day'. For example: Jan 1 or Jul 29 or Jul 30. I just want those examples to be read as a date object so I can manipulate them.
Yes, use as.Date, but you also have to specify a year:
x <- c("Jan 1", "Jul 29", "Jul 30")
as.Date(paste("2012", x), format="%Y %b %d")
[1] "2012-01-01" "2012-07-29" "2012-07-30"
See ?as.Date for more help on Date objects, and ?strptime for help on the formatting codes.
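Once the values are Date objects, ordinary date arithmetic works. A brief sketch of a few manipulations, reusing the year 2012 from the answer (the year is still an arbitrary choice, and it matters for Feb 29, which only parses in a leap year):

```r
x <- c("Jan 1", "Jul 29", "Jul 30")
d <- as.Date(paste("2012", x), format = "%Y %b %d")

gaps <- diff(d)            # days between consecutive dates
next_week <- d + 7         # shift every date by one week
format(d, "%b %d")         # back to month-day text (day is zero-padded)
```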