Converting character to date in R

I am running into some date issues when working with Dates in R.
Here's my situation.
I'm working on a dataset with a date column (ProjectDate) that has the following values:
class(Dataset$ProjectDate)
"character"
head(Dataset$ProjectDate)
"End July 2014" "End August 2014" "End September 2014" "End October 2014"
I would like to convert it to a month-year format ("%B %Y").
How can I do that?
Thanks

Think of this as a two-step process. First, remove the "End " prefix from ProjectDate using sub().
Then apply as.yearmon() from the zoo package to convert the result to a month-year class.
library(zoo)
as.yearmon(sub("^End ", "", df$ProjectDate), "%b %Y")
#[1] "Aug 2014" "Sep 2014"

Try the following.
First, the data.
x <- scan(what = character(),
text = '"End July 2014" "End August 2014"
"End September 2014" "End October 2014"')
Now the conversion to dates. Note that your dates do not have a day, so I replace "End" by day "1".
as.Date(sub("^[[:alpha:]]+", "1", x), "%d %B %Y")
#[1] "2014-07-01" "2014-08-01" "2014-09-01" "2014-10-01"
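If the end goal is a month-year label for display, the Date can be formatted back to text. The following is a minimal base-R sketch, assuming the month names are in English; the variable names are illustrative, and the locale call is there only so that %B reads and prints English month names regardless of system settings.

```r
# Assumption: month names in the data are English. Forcing the "C" locale
# makes %B parse and print English month names on any system.
invisible(Sys.setlocale("LC_TIME", "C"))

x <- c("End July 2014", "End August 2014",
       "End September 2014", "End October 2014")

# Replace the "End " prefix with day 1 so each string is a complete date.
d <- as.Date(sub("^End ", "1 ", x), format = "%d %B %Y")

# Format the Date back to a month-year label for display.
label <- format(d, "%B %Y")
label
#[1] "July 2014"      "August 2014"    "September 2014" "October 2014"
```

Keeping d as a Date and formatting only at display time preserves chronological sorting, which a plain character label would lose.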

Related

Conditionally formatting a column with mutate and regex in R

I'm brand new to R and programming in general. I have a column containing a list of dates. Some are in the "01 January 2020" format, and some have only month and year (i.e. "January 2020"). I want to mutate them into a new field where I add a 01 in front of all the dates that are in the month-year format, and then I will use lubridate to process it into dates.
This is what I've tried. I'm trying to extract the first character of the Date column. If it is an upper case letter, then I will prepend "01" to it. I am using the tidyverse package, including dplyr.
df %>% mutate(new_date = ifelse(str_sub(Date, start = 1, end = 1)== "[:upper:]"), paste('01', Date, sep = ' '), new_date = Date)
I'm getting the error message "no is missing", but I thought I had included new_date = Date to keep the current formatting.
Thank you for your help!
This can be done in many ways.
Base R using a lookahead and a backreference (the replacement "\\101 " is the backreference \\1 followed by the literal text "01 "):
sub("(^)(?=[A-Za-z]+)", "\\101 ", date, perl = TRUE)
[1] "01 January 2020" "01 January 2020" "12 February 1999" "01 March 2033"
base R using only backreference:
sub("(^[A-Za-z]+)", "01 \\1", date, perl = TRUE)
dplyr and stringr using the same logic:
library(dplyr)
library(stringr)
data.frame(date) %>%
mutate(date = str_replace(date, "(^)(?=[A-Za-z]+)", "\\101 "))
If you do insist on using ifelse:
library(dplyr)
library(stringr)
data.frame(date) %>%
mutate(date = ifelse(str_detect(date, "^[:upper:]"),
sub("^", "01 ", date),
date))
Data:
date <- c("01 January 2020","January 2020", "12 February 1999", "March 2033")
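As for the error in the question: the parenthesis after "[:upper:]" closes the ifelse() call right after the test, so its yes and no arguments are never supplied. In addition, == compares against the literal string "[:upper:]" instead of matching a pattern. A minimal base-R sketch of the intended logic, using grepl() for the pattern test (the variable names needs_day and new_date are illustrative):

```r
date <- c("01 January 2020", "January 2020", "12 February 1999", "March 2033")

# grepl() performs the pattern match; `==` would only compare literal strings.
needs_day <- grepl("^[[:upper:]]", date)

# All three ifelse() arguments (test, yes, no) sit inside one call.
new_date <- ifelse(needs_day, paste("01", date), date)
new_date
#[1] "01 January 2020"  "01 January 2020"  "12 February 1999" "01 March 2033"
```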
Here is a non-regex option where we parse the strings into date-times with parsedate::parse_date() and then format the result:
library(parsedate)
format(parse_date(date), '%d %B %Y')
[1] "01 January 2020" "01 January 2020" "12 February 1999" "01 March 2033"
data
date <- c("01 January 2020","January 2020", "12 February 1999", "March 2033")

Convert "2014-05" into date format as "May 2014" for display in ggplot in R

I have a date in the character format "2017-03" and I want to convert it to "March 2017" for display in ggplot in R. But when I try to convert it using as.Date("2017-03", "%Y-%m"), it returns NA.
Consider using the zoo::as.yearmon() function:
library(zoo)
#Sample data
v <- c("2014-05", "2017-03")
as.yearmon(v, "%Y-%m")
#[1] "May 2014" "Mar 2017"
# For the full month name, format the yearmon object:
format(as.yearmon(v, "%Y-%m"), "%B %Y")
#[1] "May 2014" "March 2017"
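The NA in the question arises because a Date needs a day, and "2017-03" does not contain one. A base-R sketch without zoo, assuming it is acceptable to pin every value to the first of the month:

```r
# Force English month names so the output does not depend on the system locale.
invisible(Sys.setlocale("LC_TIME", "C"))

v <- c("2014-05", "2017-03")

# Append a day so the string matches the default "%Y-%m-%d" format.
d <- as.Date(paste0(v, "-01"))

label <- format(d, "%B %Y")
label
#[1] "May 2014"   "March 2017"
```

For a ggplot axis it is usually easier to keep d as a Date and apply the "%B %Y" format only to the labels, so that the axis still sorts chronologically.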
Parsing dates back and forth can be done as follows.
The conversion you asked about is covered by MKR's answer, using the zoo package:
library(zoo)
date <- "2017-03"
as.yearmon(date, "%Y-%m")
#[1] "Mar 2017"
format(as.yearmon(date, "%Y-%m"), "%B %Y")
#[1] "March 2017"
If you want to parse "March 1 2017" or other similar formats back to a standard date such as 2017-03-01:
Use parse_date() from the readr package, which takes a format string and an optional locale
library(readr)
DATE <- "March 1 2017"
parse_date(DATE, "%B %d %Y")
#[1] "2017-03-01"
Or, if you are parsing dates written in another language:
foreign_date <- "1 janvier 2018"
parse_date(foreign_date, "%d %B %Y", locale = locale("fr"))
#[1] "2018-01-01"
With locale = locale("language") you can parse dates containing non-English month names into standard dates. Use this to list the supported language codes:
date_names_langs()
Format codes:
- Year: %Y (4 digits), %y (2 digits; 00-69 -> 2000-2069, 70-99 -> 1970-1999)
- Month: %m (2 digits), %b (abbreviation: Jan), %B (full name: January)
- Day: %d (2 digits)
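The same codes can be tried out with base R's format() on a single Date. This is only a sketch for illustration; note that for parsing, base R's strptime() documents a slightly different two-digit year pivot (00-68 -> 20xx, 69-99 -> 19xx) than the readr rule listed above.

```r
# Force English month names so %b and %B do not depend on the system locale.
invisible(Sys.setlocale("LC_TIME", "C"))

d <- as.Date("2017-03-01")
codes <- c("%Y", "%y", "%m", "%b", "%B", "%d")

# Apply each format code to the same date; the result is named by code.
out <- vapply(codes, function(f) format(d, f), character(1))
out
# %Y -> "2017", %y -> "17", %m -> "03", %b -> "Mar", %B -> "March", %d -> "01"
```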

Convert Excel numeric to date

I have a vector of numeric Excel dates, i.e.
date <- c(42963,42994,42903,42933,42964)
The output I am expecting when using the excel_numeric_to_date function from the janitor package and the as.yearmon function from the zoo package is:
as.yearmon(excel_numeric_to_date(date))
[1] "Aug 2016" "Sep 2016" "Jun 2017" "Jul 2017" "Aug 2017"
However, the conversion of the first two elements of the date vector is incorrect. The actual result is:
as.yearmon(excel_numeric_to_date(date))
[1] "Aug 2017" "Sep 2017" "Jun 2017" "Jul 2017" "Aug 2017"
I have tried the different options (modern and mac pre-2011) for the date_system argument of excel_numeric_to_date, but that does not help either.
The Excel version is 2010.
You can simply use as.Date and specify the origin, i.e.
as.Date(date, origin="1899-12-30")
#[1] "2017-08-16" "2017-09-16" "2017-06-17" "2017-07-17" "2017-08-17"
#or format it to your liking,
format(as.Date(date, origin="1899-12-30"), '%b %Y')
#[1] "Aug 2017" "Sep 2017" "Jun 2017" "Jul 2017" "Aug 2017"
This link gives quite a bit of information on this matter.
If you want to convert dates from Excel, you can use as.Date() with a specific origin. According to the documentation, "1900-01-01" is used as day 1 in Excel on Windows, but "this is complicated by Excel incorrectly treating 1900 as a leap year". So "1899-12-30" should be used for dates post-1901:
date <- c(42963,42994,42903,42933,42964)
This is the result of as.Date():
as.Date(date, origin = "1899-12-30")
[1] "2017-08-16" "2017-09-16" "2017-06-17" "2017-07-17" "2017-08-17"
You can then use zoo::as.yearmon() to get the expected outcome:
zoo::as.yearmon(as.Date(date, origin = "1899-12-30"))
[1] "Aug 2017" "Sep 2017" "Jun 2017" "Jul 2017" "Aug 2017"
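The choice of origin can be checked with a little arithmetic in base R. The sketch below converts the serial numbers from the question and then looks at serials 60 and 61, on the assumption (from the Windows Excel convention quoted above) that serial 60 is Excel's nonexistent 29 February 1900 and serial 61 is 1 March 1900:

```r
serial <- c(42963, 42994, 42903, 42933, 42964)

# Windows Excel serial numbers counted from the "1899-12-30" origin.
d <- as.Date(serial, origin = "1899-12-30")
format(d)
#[1] "2017-08-16" "2017-09-16" "2017-06-17" "2017-07-17" "2017-08-17"

# Serial 61 lands on 1 March 1900, so everything after Excel's phantom
# leap day lines up with the real calendar.
after_leap <- as.Date(61, origin = "1899-12-30")
format(after_leap)
#[1] "1900-03-01"

# Serial 60 maps to 28 February 1900 in R, because R has no 1900-02-29.
leap <- as.Date(60, origin = "1899-12-30")
format(leap)
#[1] "1900-02-28"
```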
Type excel_numeric_to_date to look at the function's code and you'll see it's a wrapper for the line of code used by the other answers to this question: as.Date(date_num, origin = "1899-12-30").
So that's not the issue.
The underlying matter here is confusion about date formatting. You say you expect your first number 42963 to become "Aug 2016", and your last number 42964 to become "Aug 2017". The latter is just one more than the former, which shows up in the conversion - they should be a day apart, not a year apart as you are expecting:
> excel_numeric_to_date(c(42963, 42964))
[1] "2017-08-16" "2017-08-17" # as expected, they are one day apart
Perhaps the day and year fields are switched upstream in your data at the point where these get mapped to integer dates, and it was hard to tell here because of the values chosen.

Convert characters to month and year (date) in R

How do I convert the string "Apr-16" to "April of 2016"?
I tried as.Date(x, format ..), but that isn't helping.
What options do I have here?
x <- "Apr-16"
x <- as.Date(paste0("1-",x),format="%d-%b-%y")
format(x,"%B of %Y")
[1] "April of 2016"
Another option is regex, applied to the original string (note that this particular substitution only works for "Apr"):
sub("-", "il of 20", x)
#[1] "April of 2016"
Or using yearmon from zoo
library(zoo)
format(as.Date(as.yearmon(x, "%b-%y")), "%B of %Y")
#[1] "April of 2016"
data
x <- "Apr-16"
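For a vector with several different months, a lookup against R's built-in month.abb and month.name constants avoids both the month-specific regex and any dependence on the system locale. This is a sketch under two assumptions that are not in the original question: the abbreviations are the English three-letter ones, and every two-digit year belongs to the 2000s.

```r
x <- c("Apr-16", "Dec-12")

# Split each string into its month abbreviation and two-digit year.
parts <- strsplit(x, "-", fixed = TRUE)
mon <- vapply(parts, `[`, character(1), 1)
yr  <- vapply(parts, `[`, character(1), 2)

# Look up the full month name; the "20" prefix assumes years 2000-2099.
out <- paste0(month.name[match(mon, month.abb)], " of 20", yr)
out
#[1] "April of 2016"    "December of 2012"
```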

Convert numeric data such as "715" into Date "July-2015" in R

I would like to ask a question about converting numeric data into Date format.
I would like to convert numeric data like:
time1<-c(715, 1212, 0416)
to
July-2015, Dec-2012, Apr-2016
I have tried this code, but it is not working.
time2<-as.Date(as.character(time1), format="%m%y")
Does anyone have any ideas on how to solve this issue?
Part of the issue is that "July 2015", "December 2012", and "April 2016" are not dates since the specific day is missing. Another approach is to convert to zoo::yearmon. Here, the numeric input needs to be converted to a string with leading zero so that the month is from 01 to 12:
library(zoo)
ym <- as.yearmon(sprintf("%04d",time1),format="%m%y")
ym
##[1] "Jul 2015" "Dec 2012" "Apr 2016"
The result is of class yearmon, which can then be coerced to Date:
class(ym)
##[1] "yearmon"
d <- as.Date(ym)
d
##[1] "2015-07-01" "2012-12-01" "2016-04-01"
class(d)
##[1] "Date"
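The same idea also works in base R alone if a day is added to the padded string. A minimal sketch, assuming the first of the month is an acceptable placeholder day:

```r
# Force English month abbreviations so the output is locale-independent.
invisible(Sys.setlocale("LC_TIME", "C"))

time1 <- c(715, 1212, 0416)  # the literal 0416 is read as the number 416

# Restore the leading zero so the month is always two digits.
s <- sprintf("%04d", time1)

# Prepend day "01" to get a complete date in day-month-year order.
d <- as.Date(paste0("01", s), format = "%d%m%y")

out <- format(d, "%b-%Y")
out
#[1] "Jul-2015" "Dec-2012" "Apr-2016"
```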
Try lubridate::parse_date_time():
library(lubridate)
time2 <- parse_date_time(time1, orders = "my")
format.Date(time2, "%b-%Y")
[1] "juil.-2015" "déc.-2012" "avril-2016" # my locale lang is French
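The French month names above come from the session's LC_TIME setting. If English abbreviations are wanted regardless of the system language, one option is to switch the time locale temporarily and restore it afterwards. A sketch, using the "C" locale as an assumed stand-in for English:

```r
d <- as.Date(c("2015-07-01", "2012-12-01", "2016-04-01"))

# Remember the current time locale, switch to "C", format, then restore.
old <- Sys.getlocale("LC_TIME")
invisible(Sys.setlocale("LC_TIME", "C"))
out <- format(d, "%b-%Y")
invisible(Sys.setlocale("LC_TIME", old))

out
#[1] "Jul-2015" "Dec-2012" "Apr-2016"
```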
