How to show year only where necessary on date axis - r

I'm writing code in R that uses ggplot2 to generate several bar graphs based on test data from several trainings, spanning (at present) about a year, like the following:
Currently all the dates are formatted "mon DD YYYY" ("%b %d %Y" as a Date format):
c("Oct 05 2020", "Nov 02 2020", "Nov 30 2020", "Jan 11 2021", "Feb 22 2021",
"Mar 08 2021", "Mar 29 2021", "Apr 12 2021", "May 03 2021", "May 17 2021")
But I'd like to only display the year on the first date, and any subsequent dates that are the first date in a year:
c("Oct 05 2020", "Nov 02", "Nov 30", "Jan 11 2021", "Feb 22", "Mar 08", "Mar 29",
"Apr 12", "May 03", "May 17")
Is there a way to do this, either via some kind of filtering of the Date column or something in ggplot?

We could convert the character vector ('v1') to Date class ('v2'), extract the year component, and flag repeated years with duplicated(). That logical index ('i1') then selects the elements of 'v1' to overwrite with the dates from 'v2' formatted without the year:
v2 <- as.Date(v1, '%b %d %Y')
i1 <- duplicated(format(v2, '%Y'))
v1[i1] <- format(v2[i1], '%b %d')
Output:
v1
[1] "Oct 05 2020" "Nov 02" "Nov 30"
[4] "Jan 11 2021" "Feb 22" "Mar 08" "Mar 29"
[8] "Apr 12" "May 03" "May 17"
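For the ggplot side of the question, the same duplicated() idea can be wrapped in a small labelling helper. This is only a sketch: label_year_once is a hypothetical name, it assumes the dates arrive as a Date vector in plotting order, and the month abbreviations follow the current locale.

```r
# Hypothetical helper: show the year only on the first date of each year.
# Assumes `dates` is a Date vector already sorted in plotting order.
label_year_once <- function(dates) {
  # TRUE for the first occurrence of each year, FALSE for repeats
  first_of_year <- !duplicated(format(dates, "%Y"))
  ifelse(first_of_year,
         format(dates, "%b %d %Y"),  # keep the year
         format(dates, "%b %d"))     # drop the year
}

d <- as.Date(c("2020-10-05", "2020-11-02", "2021-01-11", "2021-02-22"))
labels <- label_year_once(d)
print(labels)
```

A function like this could probably be supplied as the labels argument of a ggplot2 axis scale, or used to build the label column before plotting, but that part is untested here.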
Data:
v1 <- c("Oct 05 2020", "Nov 02 2020", "Nov 30 2020", "Jan 11 2021", "Feb 22 2021",
"Mar 08 2021", "Mar 29 2021", "Apr 12 2021", "May 03 2021", "May 17 2021")

Related

as.Date(date, format='%d %B %Y') turns '%d %B %Y' format column to NAs

Given a sample csv file (to download from here), I read it with code below:
df <- read.csv(file = './report1.csv', header = TRUE)
head(df, 5)
Out:
As you can see, the date column is in the format %d %B %Y (based on the reference table from this link) but its data type is factor.
So I tried to use as.Date(date, format='%d %B %Y') to convert it to Date class.
library(dplyr)  # provides %>% and mutate()
library(lubridate)
df %>%
mutate(date=as.Date(date, format = '%d %B %Y'))
# mutate(date=dmy(date))
But as you can see, all date values in that column become NA. Meanwhile, mutate(date=dmy(date)) works well.
Could someone explain what is wrong with the as.Date(...) code? Thanks.
Data:
dput(head(df, 5))
Out:
structure(list(date = structure(c(31L, 43L, 55L, 67L, 79L), .Label = c("1 April 2020",
"1 August 2020", "1 December 2020", "1 February 2021", "1 January 2021",
"1 July 2020", "1 June 2020", "1 March 2021", "1 May 2020", "1 November 2020",
"1 October 2020", "1 September 2020", "10 April 2020", "10 August 2020",
"10 December 2020", "10 February 2021", "10 January 2021", "10 July 2020",
"10 June 2020", "10 May 2020", "10 November 2020", "10 October 2020",
"10 September 2020", "11 April 2020", "11 August 2020", "11 December 2020",
"11 February 2021", "11 January 2021", "11 July 2020", "11 June 2020",
"11 March 2020", "11 May 2020", "11 November 2020", "11 October 2020",
"11 September 2020", "12 April 2020", "12 August 2020", "12 December 2020",
"12 February 2021", "12 January 2021", "12 July 2020", "12 June 2020",
"12 March 2020", "12 May 2020", "12 November 2020", "12 October 2020",
"12 September 2020", "13 April 2020", "13 August 2020", "13 December 2020",
"13 February 2021", "13 January 2021", "13 July 2020", "13 June 2020",
"13 March 2020", "13 May 2020", "13 November 2020", "13 October 2020",
"13 September 2020", "14 April 2020", "14 August 2020", "14 December 2020",
"14 February 2021", "14 January 2021", "14 July 2020", "14 June 2020",
"14 March 2020", "14 May 2020", "14 November 2020", "14 October 2020",
"14 September 2020", "15 April 2020", "15 August 2020", "15 December 2020",
"15 February 2021", "15 January 2021", "15 July 2020", "15 June 2020",
"15 March 2020", "15 May 2020", "15 November 2020", "15 October 2020",
"15 September 2020", "16 April 2020", "16 August 2020", "16 December 2020",
"16 February 2021", "16 January 2021", "16 July 2020", "16 June 2020",
"16 March 2020", "16 May 2020", "16 November 2020", "16 October 2020",
"16 September 2020", "17 April 2020", "17 August 2020", "17 December 2020",
"17 February 2021", "17 January 2021", "17 July 2020", "17 June 2020",
"17 March 2020", "17 May 2020", "17 November 2020", "17 October 2020",
"17 September 2020", "18 April 2020", "18 August 2020", "18 December 2020",
"18 February 2021", "18 January 2021", "18 July 2020", "18 June 2020",
"18 March 2020", "18 May 2020", "18 November 2020", "18 October 2020",
"18 September 2020", "19 April 2020", "19 August 2020", "19 December 2020",
"19 February 2021", "19 January 2021", "19 July 2020", "19 June 2020",
"19 March 2020", "19 May 2020", "19 November 2020", "19 October 2020",
"19 September 2020", "2 April 2020", "2 August 2020", "2 December 2020",
"2 February 2021", "2 January 2021", "2 July 2020", "2 June 2020",
"2 March 2021", "2 May 2020", "2 November 2020", "2 October 2020",
"2 September 2020", "20 April 2020", "20 August 2020", "20 December 2020",
"20 February 2021", "20 January 2021", "20 July 2020", "20 June 2020",
"20 March 2020", "20 May 2020", "20 November 2020", "20 October 2020",
"20 September 2020", "21 April 2020", "21 August 2020", "21 December 2020",
"21 February 2021", "21 January 2021", "21 July 2020", "21 June 2020",
"21 March 2020", "21 May 2020", "21 November 2020", "21 October 2020",
"21 September 2020", "22 April 2020", "22 August 2020", "22 December 2020",
"22 February 2021", "22 January 2021", "22 July 2020", "22 June 2020",
"22 March 2020", "22 May 2020", "22 November 2020", "22 October 2020",
"22 September 2020", "23 April 2020", "23 August 2020", "23 December 2020",
"23 February 2021", "23 January 2021", "23 July 2020", "23 June 2020",
"23 March 2020", "23 May 2020", "23 November 2020", "23 October 2020",
"23 September 2020", "24 April 2020", "24 August 2020", "24 December 2020",
"24 February 2021", "24 January 2021", "24 July 2020", "24 June 2020",
"24 March 2020", "24 May 2020", "24 November 2020", "24 October 2020",
"24 September 2020", "25 April 2020", "25 August 2020", "25 December 2020",
"25 February 2021", "25 January 2021", "25 July 2020", "25 June 2020",
"25 March 2020", "25 May 2020", "25 November 2020", "25 October 2020",
"25 September 2020", "26 April 2020", "26 August 2020", "26 December 2020",
"26 February 2021", "26 January 2021", "26 July 2020", "26 June 2020",
"26 March 2020", "26 May 2020", "26 November 2020", "26 October 2020",
"26 September 2020", "27 April 2020", "27 August 2020", "27 December 2020",
"27 February 2021", "27 January 2021", "27 July 2020", "27 June 2020",
"27 March 2020", "27 May 2020", "27 November 2020", "27 October 2020",
"27 September 2020", "28 April 2020", "28 August 2020", "28 December 2020",
"28 February 2021", "28 January 2021", "28 July 2020", "28 June 2020",
"28 March 2020", "28 May 2020", "28 November 2020", "28 October 2020",
"28 September 2020", "29 April 2020", "29 August 2020", "29 December 2020",
"29 January 2021", "29 July 2020", "29 June 2020", "29 March 2020",
"29 May 2020", "29 November 2020", "29 October 2020", "29 September 2020",
"3 April 2020", "3 August 2020", "3 December 2020", "3 February 2021",
"3 January 2021", "3 July 2020", "3 June 2020", "3 March 2021",
"3 May 2020", "3 November 2020", "3 October 2020", "3 September 2020",
"30 April 2020", "30 August 2020", "30 December 2020", "30 January 2021",
"30 July 2020", "30 June 2020", "30 March 2020", "30 May 2020",
"30 November 2020", "30 October 2020", "30 September 2020", "31 August 2020",
"31 December 2020", "31 January 2021", "31 July 2020", "31 March 2020",
"31 May 2020", "31 October 2020", "4 April 2020", "4 August 2020",
"4 December 2020", "4 February 2021", "4 January 2021", "4 July 2020",
"4 June 2020", "4 March 2021", "4 May 2020", "4 November 2020",
"4 October 2020", "4 September 2020", "5 April 2020", "5 August 2020",
"5 December 2020", "5 February 2021", "5 January 2021", "5 July 2020",
"5 June 2020", "5 March 2021", "5 May 2020", "5 November 2020",
"5 October 2020", "5 September 2020", "6 April 2020", "6 August 2020",
"6 December 2020", "6 February 2021", "6 January 2021", "6 July 2020",
"6 June 2020", "6 March 2021", "6 May 2020", "6 November 2020",
"6 October 2020", "6 September 2020", "7 April 2020", "7 August 2020",
"7 December 2020", "7 February 2021", "7 January 2021", "7 July 2020",
"7 June 2020", "7 March 2021", "7 May 2020", "7 November 2020",
"7 October 2020", "7 September 2020", "8 April 2020", "8 August 2020",
"8 December 2020", "8 February 2021", "8 January 2021", "8 July 2020",
"8 June 2020", "8 May 2020", "8 November 2020", "8 October 2020",
"8 September 2020", "9 April 2020", "9 August 2020", "9 December 2020",
"9 February 2021", "9 January 2021", "9 July 2020", "9 June 2020",
"9 May 2020", "9 November 2020", "9 October 2020", "9 September 2020"
), class = "factor"), immobility = c(0.407428566, 1.553824969,
3.033190503, 4.950250769, 7.40514975), cases = c(4.753590191,
5.241747015, 5.690359454, 5.924255797, 6.150602768), icu = c(4.407936176,
4.543724109, 4.72143235, 5.005429015, 5.280731862), deaths = c(0.922324625,
1.315709119, 1.69429665, 2.057554016, 2.403912035)), row.names = c(NA,
5L), class = "data.frame")
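One thing that may be worth checking is the LC_TIME locale, because %B matches month names in the current locale, and English names such as "March" will not parse if R expects another language. The snippet below is a diagnostic sketch under that assumption, not a confirmed explanation of this case; the small factor x is made up for illustration.

```r
# Month names that %B expects under the current LC_TIME locale
expected_months <- format(as.Date(sprintf("2020-%02d-01", 1:12)), "%B")
print(Sys.getlocale("LC_TIME"))
print(expected_months)

# Illustrative factor (not the real data); convert to character explicitly
x <- factor(c("11 March 2020", "12 March 2020"))
parsed <- as.Date(as.character(x), format = "%d %B %Y")

# NA values here would suggest the locale's month names are not English
print(parsed)
```

If the printed month names are not English, that would be consistent with as.Date() returning NA while dmy() still succeeds.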

as.Date() not working with %B on Mac maybe? [duplicate]

This question already has answers here:
Converting year and month ("yyyy-mm" format) to a date?
I don't know what I'm doing wrong with my date data. A sample of the Date column in the housing_data data frame is as follows:
"February 2012" "March 2012" "April 2012" "May 2013" "July 2015" "March 2016"
"May 2016" "April 2017" "July 2017" "October 2017" "December 2017" "February 2018"
I run housing_data$Date <- base::as.Date(housing_data$Date, "%B %Y") and I get nothing but NA back as if I have the formatting incorrect. What am I missing?
I would suggest adding a day value to your strings, because as.Date() cannot build a date from a month and year alone and returns NA when the day is missing:
#Data
x1 <- c("February 2012", "March 2012", "April 2012", "May 2013", "July 2015")
Code:
#For date
as.Date(paste('01',x1),'%d %B %Y')
Output:
[1] "2012-02-01" "2012-03-01" "2012-04-01" "2013-05-01" "2015-07-01"
Or you can try:
#Format date
format(as.Date(paste('01',x1),'%d %B %Y'),'%B %Y')
Output:
[1] "February 2012" "March 2012" "April 2012" "May 2013" "July 2015"

How to transform character to Date if the date is in a foreign language? #Danish

I have scraped data from a Danish newspaper and the dates are like this:
"08. Maj 2012"
This is of class character and I want to convert it to class Date.
I tried as.Date(dates, "%d. %b %Y")
and I got:
Error in as.Date.default(allarticles.dr, "%d. %b %Y") : do not know
how to convert 'allarticles.dr' to class “Date”
What can I do? I need to convert character to Date, but the strings are not recognized in the normal way.
I also tried
Sys.setlocale("LC_TIME", "da_DK.UTF-8")
as.Date(dates, "%d. %b %Y")
and I am getting a lot of NAs.
When applying dput, these are a sample of the values that become NA:
"10. Feb. 2018", "13. Feb. 2018", "18. Feb. 2018", "21. Feb. 2018",
"27. Feb. 2018", "01. Mar. 2018", "01. Mar. 2018", "09. Mar. 2018",
"14. Mar. 2018", "24. Mar. 2018", "26. Mar. 2018", "07. Apr. 2018",
"12. Apr. 2018", "15. Apr. 2018", "28. Apr. 2018", "04. Jun. 2018",
"05. Jun. 2018", "05. Jun. 2018", "12. Jun. 2018", "14. Jun. 2018",
"16. Jun. 2018", "17. Jun. 2018", "19. Jun. 2018", "21. Jun. 2018",
"29. Jun. 2018", "12. Jul. 2018", "13. Jul. 2018", "15. Jul. 2018",
"22. Jul. 2018", "07. Aug. 2018", "08. Aug. 2018", "20. Aug. 2018",
"21. Aug. 2018", "25. Aug. 2018", "28. Aug. 2018", "31. Aug. 2018",
"31. Aug. 2018", "02. Sep. 2018", "02. Sep. 2018", "06. Sep. 2018",
"20. Sep. 2018", "27. Sep. 2018", "01. Okt. 2018", "06. Okt. 2018",
"09. Okt. 2018", "11. Okt. 2018", "13. Okt. 2018", "13. Okt. 2018",
"13. Okt. 2018", "13. Okt. 2018", "15. Okt. 2018", "17. Okt. 2018",
"18. Okt. 2018", "18. Okt. 2018", "18. Okt. 2018", "20. Okt. 2018",
"22. Okt. 2018", "23. Okt. 2018", "24. Okt. 2018", "27. Okt. 2018",
"27. Okt. 2018", "27. Okt. 2018", "27. Okt. 2018", "29. Okt. 2018",
"08. Nov. 2018", "08. Nov. 2018", "08. Nov. 2018", "08. Nov. 2018",
"13. Nov. 2018", "15. Nov. 2018", "16. Nov. 2018", "27. Nov. 2018",
"27. Nov. 2018", "28. Nov. 2018", "29. Nov. 2018", "02. Dec. 2018",
"05. Dec. 2018", "05. Dec. 2018", "05. Dec. 2018", "06. Dec. 2018",
"07. Dec. 2018", "08. Dec. 2018", "12. Dec. 2018", "13. Dec. 2018",
"19. Dec. 2018", "20. Dec. 2018", "01. Jan. 2019", "06. Jan. 2019",
"04. Feb. 2019", "06. Feb. 2019", "07. Feb. 2019", "18. Feb. 2019",
"21. Feb. 2019", "07. Mar. 2019", "21. Mar. 2019", "27. Mar. 2019",
"28. Mar. 2019"
Assuming Windows, set it to Danish, perform the operations and then set it back.
Sys.setlocale("LC_TIME", "Danish")
date <- c("08. Maj 2012", "09. Okt 2012")
fmt <- "%d. %b %Y"
as.Date(date, fmt)
## [1] "2012-05-08" "2012-10-09"
Sys.setlocale("LC_TIME")
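If the Danish locale is not available, or the abbreviations sometimes carry a trailing dot ("Feb.") and sometimes not ("Maj"), a locale-independent lookup is one alternative. The following is a rough sketch: parse_danish_date and da_months are hypothetical names, and the list of Danish abbreviations is an assumption based on the samples in the question.

```r
# Assumed Danish month abbreviations in calendar order
da_months <- c("jan", "feb", "mar", "apr", "maj", "jun",
               "jul", "aug", "sep", "okt", "nov", "dec")

# Hypothetical helper: parse strings like "08. Maj 2012" or "10. Feb. 2018"
parse_danish_date <- function(x) {
  # groups: day, month word (optional trailing dot is skipped), 4-digit year
  pattern <- "^([0-9]{1,2})\\. *([[:alpha:]]+)\\.? *([0-9]{4})$"
  parts <- regmatches(x, regexec(pattern, x))
  iso <- vapply(parts, function(p) {
    if (length(p) != 4) return(NA_character_)          # no match
    # compare only the first three letters, case-insensitively
    month <- match(tolower(substr(p[3], 1, 3)), da_months)
    if (is.na(month)) return(NA_character_)            # unknown month
    sprintf("%s-%02d-%02d", p[4], month, as.integer(p[2]))
  }, character(1))
  as.Date(iso, format = "%Y-%m-%d")
}

danish <- parse_danish_date(c("08. Maj 2012", "09. Okt 2012", "10. Feb. 2018"))
print(danish)
```

Because the month is looked up in a fixed table, the result does not depend on Sys.setlocale(); strings that do not fit the pattern come back as NA.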

Parsing complicated date column

I have a date column that contains dates public opinion polls occur.
These polls occasionally run over several days (usually but not always continuously), the polls sometimes start in one month and finish in the next, and the year has occasionally been entered as YY and other times as YYYY.
Where there is a date range it's normally separated using a -, but sometimes – has been used, and there are sometimes spaces between the dates in a range.
I need to clean this into a consistent date format with a start_date and end_date column. Where polls occur on a single day, the end_date column should be NA or filled with the start date (if you have a solution that does either, I can always work from there to do the reverse if needed). Where there are non-continuous date ranges, use the earliest date and the latest date; the intermediate stops and restarts can be discarded.
Because the formatting is so annoyingly inconsistent I've provided the full data, since any solution would need to work on all dates in the data set (or work on some of them and not break the others, so we could solve the issue iteratively).
dates <- c("12-15 Feb 2019", "6–11 Feb 2019", "7–10 Feb 2019", "23–30 Jan 2019",
"24–27 Jan 2019", "9–13 Jan 2019", "13-16 Dec 2018", "13–15 Dec 2018",
"6–9 Dec 2018", "29 Nov – 2 Dec 2018", "23–25 Nov 2018", "15-18 Nov 2018",
"15–17 Nov 2018", "8–11 Nov 2018", "1–4 Nov 2018", "25–28 Oct 2018",
"19–21 Oct 2018", "10–13 Oct 2018", "10–13 Oct 2018", "5–7 Oct 2018",
"22–24 Sep 2018", "20–23 Sep 2018", "12–15 Sep 2018", "8–10 Sep 2018",
"6–9 Sep 2018", "25–26 Aug 2018", "24–26 Aug 2018", "24–25 Aug 2018",
"15-18 Aug 2018", "12-Aug-18", "06-Aug-18", "29-Jul-18", "17-Jul-18",
"16-Jul-18", "03-Jul-18", "02-Jul-18", "21–24 Jun 2018", "14–17 Jun 2018",
"14–17 Jun 2018", "02-Jun-18", "31 May – 3 Jun 2018", "24–27 May 2018",
"17–20 May 2018", "10–13 May 2018", "10–13 May 2018", "10–12 May 2018",
"3–6 May 2018", "30-Apr-18", "19–22 Apr 2018", "22-Apr-18", "5–8 Apr 2018",
"5–8 Apr 2018", "3–5 Apr 2018", "24 Mar – 1 Apr 2018", "28-Mar-18",
"22–25 Mar 2018", "22–25 Mar 2018", "17–25 Mar 2018", "8–11 Mar 2018",
"3–11 Mar 2018", "1–4 Mar 2018", "22–25 Feb 2018", "24-Feb-18",
"15–18 Feb 2018", "8–11 Feb 2018", "1–3 Feb 2018", "26–28 Jan 2018",
"25-Jan-18", "11–15 Jan 2018", "19-Dec-17", "14–17 Dec 2017",
"12-Dec-17", "7–10 Dec 2017", "05-Dec-17", "30 Nov ? 3 Dec 2017",
"29-Nov-17", "28-Nov-17", "23–27 Nov 2017", "21-Nov-17", "14-Nov-17",
"14-Nov-17", "13-Nov-17", "30-Oct-17", "26–29 Oct 2017", "24-Oct-17",
"12–15 Oct 2017", "04-Oct-17", "01-Oct-17", "26-Sep-17", "21–24 Sep 2017",
"19-Sep-17", "14–18 Sep 2017", "12-Sep-17", "6–9 Sep 2017", "05-Sep-17",
"31 Aug – 4 Sep 2017", "28 Aug – 2 Sep 2017", "29-Aug-17", "23-Aug-17",
"22-Aug-17", "17–21 Aug 2017", "17–20 Aug 2017", "15-Aug-17",
"08-Aug-17", "3–6 Aug 2017", "01-Aug-17", "25-Jul-17", "20–24 Jul 2017",
"20–23 Jul 2017", "19-Jul-17", "18-Jul-17", "6–11 Jul 2017",
"6–9 Jul 2017", "29-Jun-17", "22–27 Jun 2017", "15–18 Jun 2017",
"14-Jun-17", "26–29 May 2017", "23-May-17", "12–15 May 2017",
"11-May-17", "10–11 May 2017", "26–30 Apr 2017", "20–23 Apr 2017",
"13–16 Apr 2017", "6–9 Apr 2017", "1–4 Apr 2017", "30 Mar – 2 Apr 2017",
"24–27 Mar 2017", "22–25 Mar 2017", "17–20 Mar 2017", "16–19 Mar 2017",
"10–13 Mar 2017", "3–6 Mar 2017", "23–26 Feb 2017", "16–19 Feb 2017",
"9–12 Feb 2017", "2–5 Feb 2017", "20–23 Jan 2017", "13–16 Jan 2017",
"12-Jan-17", "9–12 Dec 2016", "1–4 Dec 2016", "25–28 Nov 2016",
"24–26 Nov 2016", "17–20 Nov 2016", "11–14 Nov 2016", "3–6 Nov 2016",
"20–23 Oct 2016", "14–17 Oct 2016", "7–10 Oct 2016", "6–9 Oct 2016",
"22–25 Sep 2016", "9–12 Sep 2016", "8–11 Sep 2016", "26–29 Aug 2016",
"25–28 Aug 2016", "19–22 Aug 2016", "12–15 Aug 2016", "5–8 Aug 2016",
"27 Jul – 1 Aug 2016", "20–24 Jul 2016", "13–17 Jul 2016", "6–10 Jul 2016",
"30 Jun – 3 Jul 2016", "28 Jun – 1 Jul 2016", "30-Jun-16", "27–30 Jun 2016",
"28–29 Jun 2016", "26–29 Jun 2016", "28 Jun – 1 Jul 2016", "30-Jun-16",
"27–30 Jun 2016", "28–29 Jun 2016", "26–29 Jun 2016", "23–26 Jun 2016",
"23–26 Jun 2016", "23-Jun-16", "20–22 Jun 2016", "16–19 Jun 2016",
"16–19 Jun 2016", "16-Jun-16", "14–16 Jun 2016", "9–12 Jun 2016",
"09-Jun-16", "2–5 Jun 2016", "2–5 Jun 2016", "02-Jun-16", "31 May – 2 Jun 2016",
"26–29 May 2016", "21–22,\n 28–29 May 2016",
"26-May-16", "19–22 May 2016", "19–22 May 2016", "19-May-16",
"17–19 May 2016", "14–15 May 2016", "12–15 May 2016", "6–8 May 2016",
"5–8 May 2016", "5–8 May 2016", "5–7 May 2016", "4–6 May 2016",
"05-May-16", "27 Apr – 1 May 2016", "23–24, 30 Apr – 1 May 2016",
"20–24 Apr 2016", "14–17 Apr 2016", "13–17 Apr 2016", "9–10,\n 16–17 Apr 2016",
"14–16 Apr 2016", "14-Apr-16", "6–10 Apr 2016", "31 Mar – 3 Apr 2016",
"26–27 Mar, 2–3 Apr 2016", "21-Mar-16", "17–20 Mar 2016", "16–20 Mar 2016",
"12–13,\n 19–20 Mar 2016", "10–12 Mar 2016",
"3–6 Mar 2016", "2–6 Mar 2016", "27–28 Feb, 5–6 Mar 2016", "24–28 Feb 2016",
"18–21 Feb 2016", "17–21 Feb 2016", "13–14, 20–21 Feb 2016",
"11–13 Feb 2016", "11-Feb-16", "3–7 Feb 2016", "30–31 Jan,\n 6–7 Feb 2016",
"28–31 Jan 2016", "16–17, 23–24 Jan 2016", "21-Jan-16", "15–18 Jan 2016",
"2–3, 9–10 Jan 2016", "15-Dec-15", "5–6, 12–13 Dec 2015", "08-Dec-15",
"4–6 Dec 2015", "01-Dec-15", "21–22, 28–29 Nov 2015", "26-Nov-15",
"24-Nov-15", "19–22 Nov 2015", "7–8, 14–15 Nov 2015", "12–14 Nov 2015",
"10-Nov-15", "6–8 Nov 2015", "03-Nov-15", "24–25 Oct,\n 1 Nov 2015",
"27-Oct-15", "23–25 Oct 2015", "22-Oct-15", "20-Oct-15", "10–11, 17–18 Oct 2015",
"15–17 Oct 2015", "13-Oct-15", "9–11 Oct 2015", "26–27 Sep, 1–5 Oct 2015",
"1–4 Oct 2015", "24–28 Sep 2015", "17–21 Sep 2015", "19–20 Sep 2015",
"17–20 Sep 2015", "15–16 Sep 2015", "15-Sep-15", "12–13 Sep 2015",
"5–6 Sep 2015", "4–6 Sep 2015", "26–30 Aug 2015", "27-Aug-15",
"22–23 Aug 2015", "20–23 Aug 2015", "13–15 Aug 2015", "11–14 Aug 2015",
"8–9 Aug 2015", "8–9 Aug 2015", "4–7 Aug 2015", "06-Aug-15",
"28–31 Jul 2015", "30-Jul-15", "25–26 Jul 2015", "16–19 Jul 2015",
"14–17 Jul 2015", "11–12 Jul 2015", "4–5 Jul 2015", "2–4 Jul 2015",
"27–28 Jun 2015", "16-Jun-15", "16-Jun-15", "13–14 Jun 2015",
"11–13 Jun 2015", "11–13 Jun 2015", "02-Jun-15", "02-Jun-15",
"23–24, 30–31 May 2015", "26-May-15", "18-May-15", "17-May-15",
"17-May-15", "13-May-15", "7–10 May 2015", "04-May-15", "04-May-15",
"28-Apr-15", "21-Apr-15", "11–12,\n 18–19 Apr 2015",
"14-Apr-15", "10–12 Apr 2015", "9–11 Apr 2015", "28–29 Mar, 3–6 Apr 2015",
"29-Mar-15", "20–22 Mar 2015", "14–15, 21–22 Mar 2015", "17-Mar-15",
"10-Mar-15", "7–8 Mar 2015", "28 Feb–1, 7–8 Mar 2015", "26–28 Feb 2015",
"20–22 Feb 2015", "20–22 Feb 2015", "6–8 Feb 2015", "31 Jan–1, 7–8 Feb 2015",
"05-Feb-15", "4–5 Feb 2015", "28–30 Jan 2015", "27-Jan-15", "r27 Jan 2015",
"20-Jan-15", "13-Jan-15", "12-Jan-15", "23–27 Dec 2014", "16-Dec-14",
"12–15 Dec 2014", "6–7, 13–14 Dec 2014", "4–6 Dec 2014", "2–4 Dec 2014",
"02-Dec-14", "29–30 Nov 2014", "22–23, 29–30 Nov 2014", "25-Nov-14",
"21-Nov-14", "18-Nov-14", "17-Nov-14", "17-Nov-14", "11-Nov-14",
"04-Nov-14", "04-Nov-14", "25–26 Oct,\n 1–2 Nov 2014",
"30 Oct–1 Nov 2014", "28-Oct-14", "23-Oct-14", "21-Oct-14", "21-Oct-14",
"20-Oct-14", "14-Oct-14", "07-Oct-14", "4–5 Oct 2014", "4–5 Oct 2014",
"23-Sep-14", "13–14,\n 20–21 Sep 2014",
"18-Sep-14", "30–31 Aug, 6–7 Sep 2014", "5–7 Sep 2014", "22–24 Aug 2014",
"16–17, 23–24 Aug 2014", "19-Aug-14", "9–10 Aug 2014", "8–10 Aug 2014",
"25–27 Jul 2014", "11–13 Jul 2014", "01-Jul-14", "30-Jun-14",
"27–29 Jun 2014", "13–15 Jun 2014", "30 May–1 Jun 2014", "27-May-14",
"20-May-14", "17–18 May 2014", "16–18 May 2014", "15–17 May 2014",
"04-May-14", "2–4 May 2014", "30-Apr-14", "22-Apr-14", "15-Apr-14",
"13-Apr-14", "08-Apr-14", "07-Apr-14", "4–6 Apr 2014", "25-Mar-14",
"25-Mar-14", "21–23 Mar 2014", "18-Mar-14", "13–15 Mar 2014",
"7–9 Mar 2014", "05-Mar-14", "23-Feb-14", "21–23 Feb 2014", "15-Feb-14",
"7–9 Feb 2014", "28-Jan-14", "23-Jan-14", "17–20 Jan 2014", "13-Jan-14",
"16-Dec-13", "15-Dec-13", "6–8 Dec 2013", "28 Nov–2 Dec 2013",
"30 Nov–1 Dec 2013", "22–24 Nov 2013", "21–23 Nov 2013", "8–10 Nov 2013",
"25–27 Oct 2013", "19–20 Oct 2013", "21–22 Sep 2013", "19–22 Sep 2013",
"12–15 Sep 2013", "4–6 Sep 2013", "05-Sep-13", "3–5 Sep 2013",
"4–6 Sep 2013", "05-Sep-13", "4–5 Sep 2013", "3–5 Sep 2013",
"04-Sep-13", "2–4 Sep 2013", "1–4 Sep 2013", "03-Sep-13", "30 Aug–1 Sep 2013",
"30 Aug–1 Sep 2013", "29 Aug–1 Sep 2013", "28–29 Aug 2013", "28–29 Aug 2013",
"26-Aug-13", "21–25 Aug 2013", "23–25 Aug 2013", "23–25 Aug 2013",
"18–22 Aug 2013", "16–18 Aug 2013", "16–18 Aug 2013", "16–18 Aug 2013",
"14–18 Aug 2013", "14–15 Aug 2013", "12–13 Aug 2013", "9–12 Aug 2013",
"9–11 Aug 2013", "9–11 Aug 2013", "10-Aug-13", "7–9 Aug 2013",
"6–8 Aug 2013", "04-Aug-13", "2–4 Aug 2013", "2–4 Aug 2013",
"1–4 Aug 2013", "26–28 Jul 2013", "25–28 Jul 2013", "23–25 Jul 2013",
"18–22 Jul 2013", "19–21 Jul 2013", "19–21 Jul 2013", "18-Jul-13",
"12–14 Jul 2013", "11–14 Jul 2013", "11–13 Jul 2013", "5–8 Jul 2013",
"5–7 Jul 2013", "5–7 Jul 2013", "4–7 Jul 2013", "28–30 Jun 2013",
"28–30 Jun 2013", "27–30 Jun 2013", "27–28 Jun 2013", "27-Jun-13",
"21–23 Jun 2013", "21–23 Jun 2013", "20–23 Jun 2013", "14–16 Jun 2013",
"13–16 Jun 2013", "13–15 Jun 2013", "11–13 Jun 2013", "7–10 Jun 2013",
"6–10 Jun 2013", "31 May–2 Jun 2013", "31 May–2 Jun 2013", "30 May–2 Jun 2013",
"24–26 May 2013", "23–26 May 2013", "17–19 May 2013", "17–19 May 2013",
"16–19 May 2013", "16–18 May 2013", "15–16 May 2013", "10–12 May 2013",
"9–12 May 2013", "3–5 May 2013", "3–5 May 2013", "2–5 May 2013",
"02-May-13", "26–28 Apr 2013", "25–28 Apr 2013", "18–22 Apr 2013",
"18–22 Apr 2013", "19–21 Apr 2013", "11–14 Apr 2013", "11–14 Apr 2013",
"11–13 Apr 2013", "9–11 Apr 2013", "02-May-13", "5–7 Apr 2013",
"4–7 Apr 2013", "4–7 Apr 2013", "29 Mar–1 Apr 2013", "28 Mar–1 Apr 2013",
"22–24 Mar 2013", "21–24 Mar 2013", "22–23 Mar 2013", "21–24 Mar 2013",
"22–25 Mar 2013", "14–17 Mar 2013", "14–17 Mar 2013", "14–16 Mar 2013",
"7–10 Mar 2013", "7–10 Mar 2013", "8–10 Mar 2013", "5–7 Mar 2013",
"28 Feb–3 Mar 2013", "28 Feb–3 Mar 2013", "21–24 Feb 2013", "16–17/23–24 Feb 2013",
"22–24 Feb 2013", "14–17 Feb 2013", "14–16 Feb 2013", "7–10 Feb 2013",
"9–10 Feb 2013", "1–4 Feb 2013", "2–3 Feb 2013", "1–3 Feb 2013",
"1–3 Feb 2013", "23–28 Jan 2013", "19–20/26–27 Jan 2013", "16–20 Jan 2013",
"9–13 Jan 2013", "11–13 Jan 2013", "5–6/12–13 Jan 2013", "12–16 Dec 2012",
"8–9/15–16 Dec 2012", "13–15 Dec 2012", "5–9 Dec 2012", "7–9 Dec 2012",
"28 Nov–2 Dec 2012", "24–25 Nov/1–2 Dec 2012", "29–30 Nov 2012",
"27–29 Nov 2012", "23–25 Nov 2012", "21–25 Nov 2012", "14–18 Nov 2012",
"10–11/17–18 Nov 2012", "15–17 Nov 2012", "9–11 Nov 2012", "7–11 Nov 2012",
"2–6 Nov 2012", "2–4 Nov 2012", "27–28 Oct/3–4 Nov 2012", "26–28 Oct 2012",
"25–28 Oct 2012", "13–14/20–21 Oct 2012", "17–21 Oct 2012", "18–20 Oct 2012",
"10–14 Oct 2012", "5–7 Oct 2012", "3–7 Oct 2012", "29–30 Sep/6–7 Oct 2012",
"26–30 Sep 2012", "22–23 Sep 2012", "19–23 Sep 2012", "17–20 Sep 2012",
"14–16 Sep 2012", "12–16 Sep 2012", "8–9/15–16 Sep 2012", "13–15 Sep 2012",
"29 Aug–2 Sep 2012", "31 Aug–2 Sep 2012", "1–2 Sep 2012", "22–26 Aug 2012",
"23–25 Aug 2012", "15–19 Aug 2012", "17–19 Aug 2012", "11–12/18–19 Aug 2012",
"8–12 Aug 2012", "3–5 Aug 2012", "1–5 Aug 2012", "28–29 Jul/4–5 Aug 2012",
"25–29 Jul 2012", "26–28 Jul 2012", "20–22 Jul 2012", "18–22 Jul 2012",
"14–15/21–22 Jul 2012", "11–15 Jul 2012", "6–8 Jul 2012", "4–8 Jul 2012",
"30 Jun–1/7–8 Jul 2012", "27 Jun–1 Jul 2012", "22–24 Jun 2012",
"20–24 Jun 2012", "16–17/23–24 Jun 2012", "13–17 Jun 2012", "15–17 Jun 2012",
"6–11 Jun 2012", "9–10 Jun 2012", "7–10 Jun 2012", "2–3 Jun 2012",
"31 May–2 Jun 2012", "30 May–3 Jun 2012", "26–27 May 2012", "23–27 May 2012",
"25–27 May 2012", "16–20 May 2012", "19–20 May 2012", "12–13 May 2012",
"11–13 May 2012", "9–13 May 2012", "9–10 May 2012", "9–10 May 2012",
"5–6 May 2012", "2–6 May 2012", "27–29 Apr 2012", "27–29 Apr 2012",
"25–29 Apr 2012", "21–22 Apr 2012", "18–22 Apr 2012", "17–19 Apr 2012",
"13–15 Apr 2012", "11–15 Apr 2012", "7–8/14–15 Apr 2012", "4–9 Apr 2012",
"31 Mar–1 Apr 2012", "28 Mar–1 Apr 2012", "29–31 Mar 2012", "21–25 Mar 2012",
"24–25 Mar 2012", "23–25 Mar 2012", "14–18 Mar 2012", "10–11/17–18 Mar 2012",
"9–11 Mar 2012", "7–11 Mar 2012", "3–4 Mar 2012", "29 Feb–4 Mar 2012",
"25–26 Feb 2012", "23–26 Feb 2012", "22–26 Feb 2012", "23–24 Feb 2012",
"22–23 Feb 2012", "15–19 Feb 2012", "11–12/18–19 Feb 2012", "10–12 Feb 2012",
"8–10 Feb 2012", "7–8 Feb 2012", "4–5 Feb 2012", "1–5 Feb 2012",
"2–4 Feb 2012", "28–29 Jan 2012", "27–29 Jan 2012", "25–29 Jan 2012",
"27–28 Jan 2012", "18–22 Jan 2012", "14–15/21–22 Jan 2012", "17–18 Jan 2012",
"11–15 Jan 2012", "7–8 Jan 2012", "14–18 Dec 2011", "10–11/17–18 Dec 2011",
"7–11 Dec 2011", "8–10 Dec 2011", "2–4 Dec 2011", "30 Nov–4 Dec 2011",
"26–27 Nov/3–4 Dec 2011", "23–27 Nov 2011", "19–20 Nov 2011",
"18–20 Nov 2011", "16–20 Nov 2011", "9–13 Nov 2011", "5–6/12–13 Nov 2011",
"10–12 Nov 2011", "3–6 Nov 2011", "2–6 Nov 2011", "2–3 Nov 2011",
"26–30 Oct 2011", "29–30 Oct 2011", "25–26 Oct 2011", "22–23 Oct 2011",
"21–23 Oct 2011", "19–23 Oct 2011", "15–16Oct 2011", "14–16 Oct 2011",
"12–16 Oct 2011", "13–15 Oct 2011", "8–9 Oct 2011", "7–9 Oct 2011",
"4–9 Oct 2011", "27 Sep–2 Oct 2011", "24–25 Sep/1–2 Oct 2011",
"20–25 Sep 2011", "16–18 Sep 2011", "13–18 Sep 2011", "10–11/17–18 Sep 2011",
"7–11 Sep 2011", "8–10 Sep 2011", "2–4 Sep 2011", "31 Aug–4 Sep 2011",
"27–28 Aug/3–4 Sep 2011", "24–28 Aug 2011", "19–21 Aug 2011",
"17–21 Aug 2011", "13–14/20–21 Aug 2011", "10–14 Aug 2011", "11–13 Aug 2011",
"9–10 Aug 2011", "5–7 Aug 2011", "3–7 Aug 2011", "30–31 Jul/6–7 Aug 2011",
"c. 3 Aug 2011", "27–31 Jul 2011", "22–24 Jul 2011", "20–24 Jul 2011",
"16–17/23–24 Jul 2011", "13–17 Jul 2011", "14–16 Jul 2011", "13–14 Jul 2011",
"9–10 Jul 2011", "8–10 Jul 2011", "6–10 Jul 2011", "29 Jun–3 Jul 2011",
"25–26 Jun/1–2 Jul 2011", "24–26 Jun 2011", "22–26 Jun 2011",
"11–12/18–19 Jun 2011", "15–19 Jun 2011", "14–16 Jun 2011", "8–13 Jun 2011",
"10–12 Jun 2011", "4–5 Jun 2011", "1–5 Jun 2011", "31 May–2 Jun 2011",
"25–29 May 2011", "27–29 May 2011", "21–22/28–29 May 2011", "18–22 May 2011",
"14–15 May 2011", "13–15 May 2011", "11–15 May 2011", "12–14 May 2011",
"7–8 May 2011", "4–8 May 2011", "3–4 May 2011", "29 Apr–1 May 2011",
"28 Apr–1 May 2011", "23–24/30 Apr–1 May 2011", "20–26 Apr 2011",
"13–17 Apr 2011", "9–10/16–17 Apr 2011", "14–16 Apr 2011", "6–10 Apr 2011",
"2–3 Apr 2011", "1–3 Apr 2011", "30 Mar–3 Apr 2011", "26–27 Mar 2011",
"23–27 Mar 2011", "22–24 Mar 2011", "19–20 Mar 2011", "18–20 Mar 2011",
"16–20 Mar 2011", "16–17 Mar 2011", "12–13 Mar 2011", "9–13 Mar 2011",
"10–12 Mar 2011", "8–10 Mar 2011", "5–6 Mar 2011", "4–6 Mar 2011",
"2–6 Mar 2011", "26–27 Feb 2011", "22–27 Feb 2011", "21–23 Feb 2011",
"18–20 Feb 2011", "15–20 Feb 2011", "12–13/19–20 Feb 2011", "8–13 Feb 2011",
"10–12 Feb 2011", "4–6 Feb 2011", "1–6 Feb 2011", "29–30 Jan/5–6 Feb 2011",
"1–3 Feb 2011", "25–30 Jan 2011", "18–23 Jan 2011", "15–16/22–23 Jan 2011",
"11–16 Jan 2011", "8–9 Jan 2011", "14–19 Dec 2010", "11–12 Dec 2010",
"8–12 Dec 2010", "7–12 Dec 2010", "4–5 Dec 2010", "3–5 Dec 2010",
"30 Nov–5 Dec 2010", "23–28 Nov 2010", "20–21/27–28 Nov 2010",
"19–21 Nov 2010", "16–21 Nov 2010", "18–20 Nov 2010", "9–14 Nov 2010",
"6–7/13–14 Nov 2010", "5–7 Nov 2010", "2–7 Nov 2010", "26–31 Oct 2010",
"23–24/30–31 Oct 2010", "22–24 Oct 2010", "19–24 Oct 2010", "21–23 Oct 2010",
"12–17 Oct 2010", "9–10/16–17 Oct 2010", "8–10 Oct 2010", "5–10 Oct 2010",
"2–3 Oct 2010", "30 Sep–1 Oct 2010", "21–26 Sep 2010", "18–19 Sep 2010",
"14–19 Sep 2010", "15–16 Sep 2010", "10–12 Sep 2010", "7–12 Sep 2010",
"31 Aug–5 Sep 2010", "28–29 Aug/4–5 Sep 2010", "24–29 Aug 2010",
"25–26 Aug 2010", "c. 21 Aug 2010", "17–19 Aug 2010", "13–19 Aug 2010"
)
Some particularly odd dates to look out for in the dataset
"30 Nov ? 3 Dec 2017"
"21–22,\n 28–29 May 2016"
"24–25 Oct,\n 1 Nov 2015"
"29–30 Sep/6–7 Oct 2012"
This was an interesting problem! But I think it can be solved with regex.
How about this:
library(tidyverse)
tibble(dates = dates) %>%
mutate(end_year = str_extract(dates, "[0-9]*$"),
end_year = ifelse(str_length(end_year) == 2, paste0("20", end_year), end_year),
month_one = str_extract(dates, "[A-Z][a-z][a-z]"),
month_two = str_sub(str_extract(dates, "[A-Z][a-z][a-z].*[A-Z][a-z][a-z]"), start = -3),
month_two = if_else(is.na(month_two), month_one, month_two),
day_one = str_extract(dates, "[0-9]+"),
dates_without_day_one = gsub("^[0-9]+", "", dates),
day_two = str_extract(dates_without_day_one, "[0-9]+"),
day_two = str_squish(gsub("[-–]", "", day_two)),
day_three_four = str_extract(dates, "/.+[-–] *[0-9]+"),
day_three = str_extract(day_three_four, "/ *[0-9]+"),
day_three = str_squish(gsub("/", "", day_three)),
day_four = str_extract(day_three_four, "[-–] *[0-9]+"),
day_four = str_squish(gsub("[-–]", "", day_four))
) %>%
# dates that are only a single day:
mutate(day_two = if_else(is.na(day_two), day_one, day_two)) %>%
# dates that actually have four days:
mutate(day_one = ifelse(is.na(day_three),
day_one,
round((as.numeric(day_one) + as.numeric(day_two)) / 2)),
day_two = ifelse(is.na(day_three),
day_two,
round((as.numeric(day_three) + as.numeric(day_four)) / 2))) %>%
select(-day_three_four, -dates_without_day_one) %>%
mutate(start_date = as.Date(paste(end_year,month_one, day_one, sep = "-"), format = "%Y-%b-%d"),
end_date = as.Date(paste(end_year,month_two, day_two, sep = "-"), format = "%Y-%b-%d")) %>%
select(dates, start_date, end_date, everything())
Delivers:
# A tibble: 838 x 10
dates start_date end_date end_year month_one month_two day_one day_two day_three day_four
<chr> <date> <date> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 12-15 Feb 2019 2019-02-12 2019-02-15 2019 Feb Feb 12 15 NA NA
2 6–11 Feb 2019 2019-02-06 2019-02-11 2019 Feb Feb 6 11 NA NA
3 7–10 Feb 2019 2019-02-07 2019-02-10 2019 Feb Feb 7 10 NA NA
4 23–30 Jan 2019 2019-01-23 2019-01-30 2019 Jan Jan 23 30 NA NA
5 24–27 Jan 2019 2019-01-24 2019-01-27 2019 Jan Jan 24 27 NA NA
6 9–13 Jan 2019 2019-01-09 2019-01-13 2019 Jan Jan 9 13 NA NA
7 13-16 Dec 2018 2018-12-13 2018-12-16 2018 Dec Dec 13 16 NA NA
8 13–15 Dec 2018 2018-12-13 2018-12-15 2018 Dec Dec 13 15 NA NA
9 6–9 Dec 2018 2018-12-06 2018-12-09 2018 Dec Dec 6 9 NA NA
10 29 Nov – 2 Dec 2018 2018-11-29 2018-12-02 2018 Nov Dec 29 2 NA NA
I was faced with the same issue, having to deal with documents from all around the world. The best answer is to enforce the date format upstream, when creating the input form.
This answer will not solve your exact problem of date ranges, but you could tweak the solution I've brought here to deal with it. I used regular expression patterns. I paste here the patterns I used in a similar solution in Python.
# 2019/02/20 or 2019-02-20
(?:^|[\s\/\.:])+(\d{4})[\/\-\.\s](\d{2})[\/\-\.\s](\d{2})(?:$|[\s\/\.\-])+
# 02/20/2019 or 20/02/2019
(?:^|[\/\s\.:])+(\d{2})[\/\-\.\s](\d{2})[\/\-\.\s](\d{4})(?:$|[\/\s\.\-])+
# 20 Feb 2019 or 20-Feb-2019
(?:^|[\s\.:])+(\d{2})[\/\-\.\s]?([a-zA-Z]{2,3})[\/\-\.\s]?(\d{4})(?:$|[\s\.\-])+
# 2019 Feb 20
(?:^|[\s\.:])+(\d{4})[\/\-\.\s]?([a-zA-Z]{2,3})[\/\-\.\s]?(\d{2})(?:$|[\s\.\-])+
# February 20th, 2019
(?:^|[\s\.:])+([a-zA-Z]{3,15})\s(\d{1,2})\s?[a-zA-Z]{2},\s?(\d{2,4})(?:$|[\s\.\-])+
# Feb 20 2019 or February 20 2019
(?:^|[\s\.:])*([a-zA-Z]{3,15})[ _\-\/\\\.]?(\d{1,2})[ _\-\/\\\.](\d{2,4})(?:$|[\s\.\-])+
#20-FEB-2019
(?:^|[\s\.\-:])+(\d{1,2})[ _\-\/\\\.]([a-zA-Z]{3,15})[ _\-\/\\\.](\d{2,4})(?:$|[\s\.\-])+
#2019.Feb.20
(?:^|[\s\.\-:])+(\d{4})[ _\-\/\\\.]([a-zA-Z]{3,15})[ _\-\/\\\.](\d{1,2})(?:$|[\s\.\-])+
# 20 Feb. 2019
(?:^|[\s\.\-:])+(\d{2})[\/\-\.\s]([a-zA-Z]{3,15})[\/\-\.\s]{1,2}(\d{4})(?:$|[\s\.\-])+
Each of these regular expression patterns contains three capture groups, which you can extract and check manually when you parse ambiguous dates (e.g. 02/01/2019).
Before implementing the solution, I recommend proceeding as follows:
1. make a list of all the dates to parse, one per line
2. paste them all into a regex tester (e.g. an online regex tester)
3. try (and if necessary modify) each of the patterns above
4. verify that groups 1 to 3 are captured properly
5. check off the dates that are properly caught until none is left in the list and no conflicts occur
6. implement a date parsing function in R
7. secretly wish there were an international standard for dates
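As a rough illustration of the last implementation step, here is how one of these patterns might be applied in R. It is a sketch only: the pattern is a simplified adaptation of the "20 Feb 2019 or 20-Feb-2019" entry, extract_dmy is a hypothetical name, and English month abbreviations are assumed.

```r
# Simplified adaptation of the "20 Feb 2019 or 20-Feb-2019" pattern
pat <- "(?:^|[\\s.:])(\\d{1,2})[/.\\s-]?([A-Za-z]{3})[/.\\s-]?(\\d{4})(?:$|[\\s.-])"

# Hypothetical helper: pull day, month and year out of the three groups
extract_dmy <- function(x) {
  hits <- regmatches(x, regexec(pat, x, perl = TRUE))
  iso <- vapply(hits, function(h) {
    if (length(h) != 4) return(NA_character_)        # pattern did not match
    # month.abb holds English abbreviations regardless of locale
    month <- match(tolower(h[3]), tolower(month.abb))
    if (is.na(month)) return(NA_character_)          # not a month name
    sprintf("%s-%02d-%02d", h[4], month, as.integer(h[2]))
  }, character(1))
  as.Date(iso, format = "%Y-%m-%d")
}

res <- extract_dmy(c("20 Feb 2019", "20-Feb-2019", "02/01/2019"))
print(res)
```

The third input does not fit this pattern and comes back as NA, so it would be handed to one of the other patterns in turn.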

Converting date formats from one class to another

I have the following dates.
c("Aug 08, 2017", "Aug 09, 2017", "Aug 11, 2017", "Aug 11, 2017",
"Aug 10, 2017", "Sep 22, 2017", "Aug 11, 2017", "Aug 11, 2017",
"Aug 11, 2017", "Aug 14, 2017", "Aug 16, 2017", "Aug 16, 2017",
"Aug 18, 2017", "Aug 18, 2017", "Aug 18, 2017", "Sep 20, 2017",
"Aug 22, 2017", "Sep 20, 2017", "Sep 14, 2017", "Sep 25, 2017"
)
I am trying to convert them into the following format:
structure(c(17386, 17387, 17388, 17389, 17392, 17393, 17394,
17395, 17396, 17399, 17400, 17401, 17402, 17403, 17406, 17407,
17408, 17409, 17410, 17414), class = "Date")
Which looks like;
[1] "2017-08-08" "2017-08-09" "2017-08-10" "2017-08-11" "2017-08-14" "2017-08-15" "2017-08-16" "2017-08-17"
[9] "2017-08-18" "2017-08-21" "2017-08-22" "2017-08-23" "2017-08-24" "2017-08-25" "2017-08-28" "2017-08-29"
[17] "2017-08-30" "2017-08-31" "2017-09-01" "2017-09-05"
How can I convert characters to date format?
EDIT:
I run the following;
> C=c("Aug 08, 2017", "Aug 09, 2017", "Aug 11, 2017", "Aug 11, 2017",
+ "Aug 10, 2017", "Sep 22, 2017", "Aug 11, 2017", "Aug 11, 2017",
+ "Aug 11, 2017", "Aug 14, 2017", "Aug 16, 2017", "Aug 16, 2017",
+ "Aug 18, 2017", "Aug 18, 2017", "Aug 18, 2017", "Sep 20, 2017",
+ "Aug 22, 2017", "Sep 20, 2017", "Sep 14, 2017", "Sep 25, 2017"
+ )
> as.numeric(as.Date(C,format='%B %d, %Y'))
[1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
> as.Date(C,format='%B %d, %Y')
[1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
EDIT:
The following also does not work;
date <- gsub(",", "", date)
date <- gsub(" ", "-", date)
date
as.numeric(as.Date(date, format='%b %d, %Y'))
EDIT:
The following seems to work;
mdy(C)
Giving;
[1] "2017-08-08" "2017-08-09" "2017-08-11" "2017-08-11" "2017-08-10" "2017-09-22" "2017-08-11" "2017-08-11"
[9] "2017-08-11" "2017-08-14" "2017-08-16" "2017-08-16" "2017-08-18" "2017-08-18" "2017-08-18" "2017-09-20"
[17] "2017-08-22" "2017-09-20" "2017-09-14" "2017-09-25"
You need to specify the format in as.Date:
as.numeric(as.Date(C,format='%B %d, %Y'))
[1] 17386 17387 17389 17389 17388 17431 17389 17389 17389 17392 17394 17394 17396 17396 17396 17429 17400 17429 17423 17434
as.Date(C,format='%B %d, %Y')
[1] "2017-08-08" "2017-08-09" "2017-08-11" "2017-08-11" "2017-08-10" "2017-09-22" "2017-08-11" "2017-08-11" "2017-08-11" "2017-08-14" "2017-08-16"
[12] "2017-08-16" "2017-08-18" "2017-08-18" "2017-08-18" "2017-09-20" "2017-08-22" "2017-09-20" "2017-09-14" "2017-09-25"
Data input:
C=c("Aug 08, 2017", "Aug 09, 2017", "Aug 11, 2017", "Aug 11, 2017",
"Aug 10, 2017", "Sep 22, 2017", "Aug 11, 2017", "Aug 11, 2017",
"Aug 11, 2017", "Aug 14, 2017", "Aug 16, 2017", "Aug 16, 2017",
"Aug 18, 2017", "Aug 18, 2017", "Aug 18, 2017", "Sep 20, 2017",
"Aug 22, 2017", "Sep 20, 2017", "Sep 14, 2017", "Sep 25, 2017"
)
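The same call returning NA for the asker but dates for the answerer suggests a locale difference, since %b and %B match month names in the current LC_TIME locale. One way to make the parsing independent of the session locale is to switch to the "C" locale temporarily. This is a sketch; parse_english_dates is a hypothetical name and the input is a shortened sample.

```r
# Hypothetical helper: parse English month names regardless of session locale
parse_english_dates <- function(x, format) {
  old <- Sys.getlocale("LC_TIME")
  # restore the previous locale even if parsing fails
  on.exit(Sys.setlocale("LC_TIME", old), add = TRUE)
  Sys.setlocale("LC_TIME", "C")  # the "C" locale uses English month names
  as.Date(x, format = format)
}

sample_dates <- c("Aug 08, 2017", "Sep 22, 2017")
converted <- parse_english_dates(sample_dates, "%b %d, %Y")
print(converted)
print(as.numeric(converted))  # days since 1970-01-01
```

lubridate's mdy(), which the asker found to work, is another option.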

Resources