as.Date() not working with %B on Mac maybe? [duplicate] - r

I don't know what I'm doing wrong with my date data. A sample of the Date column in the housing_data data frame is as follows:
"February 2012" "March 2012" "April 2012" "May 2013" "July 2015" "March 2016"
"May 2016" "April 2017" "July 2017" "October 2017" "December 2017" "February 2018"
I run housing_data$Date <- base::as.Date(housing_data$Date, "%B %Y") and I get nothing but NA back as if I have the formatting incorrect. What am I missing?

I would suggest adding a day value to your string in this way:
#Data
x1 <- c("February 2012", "March 2012", "April 2012", "May 2013", "July 2015")
Code:
#For date
as.Date(paste('01',x1),'%d %B %Y')
Output:
[1] "2012-02-01" "2012-03-01" "2012-04-01" "2013-05-01" "2015-07-01"
Or you can try:
#Format date
format(as.Date(paste('01',x1),'%d %B %Y'),'%B %Y')
Output:
[1] "February 2012" "March 2012" "April 2012" "May 2013" "July 2015"
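The reason a day is needed is noted in ?strptime: when a month is specified, the day of that month must be specified as well, otherwise the result is NA. Month names for %B are also matched in the current LC_TIME locale, which may matter on a machine with non-English settings. A minimal sketch, assuming English month names (the "C" locale is set explicitly here) and using illustrative variable names:

```r
# %B matches month names in the current LC_TIME locale; "C" gives English names.
Sys.setlocale("LC_TIME", "C")

x1 <- c("February 2012", "March 2012", "April 2012")

# Month and year alone do not form a complete date, so parsing yields NA.
without_day <- as.Date(x1, format = "%B %Y")

# Prepending a day of month makes each string a complete date.
with_day <- as.Date(paste("01", x1), format = "%d %B %Y")

print(is.na(without_day))
print(format(with_day, "%Y-%m-%d"))
```

If the strings parse with the day added but the locale left untouched, the locale was not the problem; if they still return NA, checking Sys.getlocale("LC_TIME") is a reasonable next step.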

Related

How to show year only where necessary on date axis

I'm writing code in R that uses ggplot2 to generate several bar graphs based on test data from several trainings, spanning (at present) about a year.
Currently all the dates are formatted "mon DD YYYY" ("%b %d %Y" as a Date format):
c("Oct 05 2020", "Nov 02 2020", "Nov 30 2020", "Jan 11 2021", "Feb 22 2021",
"Mar 08 2021", "Mar 29 2021", "Apr 12 2021", "May 03 2021", "May 17 2021")
But I'd like to only display the year on the first date, and any subsequent dates that are the first date in a year:
c("Oct 05 2020", "Nov 02", "Nov 30", "Jan 11 2021", "Feb 22", "Mar 08", "Mar 29",
"Apr 12", "May 03", "May 17")
Is there a way to do this, either via some kind of filtering of the Date column or something in ggplot?
We could convert the character vector ('v1') to Date class ('v2'), extract the year component, and build a logical index ('i1') with duplicated() that marks every date whose year has already appeared. That index is then used to replace those elements of the original vector with the dates from 'v2' formatted without the year:
v2 <- as.Date(v1, '%b %d %Y')
i1 <- duplicated(format(v2, '%Y'))
v1[i1] <- format(v2[i1], '%b %d')
-output
v1
[1] "Oct 05 2020" "Nov 02" "Nov 30"
[4] "Jan 11 2021" "Feb 22" "Mar 08" "Mar 29"
[8] "Apr 12" "May 03" "May 17"
data
v1 <- c("Oct 05 2020", "Nov 02 2020", "Nov 30 2020", "Jan 11 2021", "Feb 22 2021",
"Mar 08 2021", "Mar 29 2021", "Apr 12 2021", "May 03 2021", "May 17 2021")
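The same steps can be wrapped in a small helper so the original vector is left untouched. This is only a sketch: label_year_once is a hypothetical name, it assumes the dates are already in chronological order, and it assumes an English LC_TIME locale so that %b matches "Oct", "Nov" and so on.

```r
# Hypothetical helper: keep the year only on the first date of each year.
label_year_once <- function(x, in_format = "%b %d %Y") {
  dates <- as.Date(x, format = in_format)
  # TRUE for every date whose year has already appeared earlier in the vector
  repeated_year <- duplicated(format(dates, "%Y"))
  ifelse(repeated_year, format(dates, "%b %d"), format(dates, in_format))
}

v1 <- c("Nov 30 2020", "Jan 11 2021", "Feb 22 2021")
labels <- label_year_once(v1)
print(labels)
```

In a plot, a function like this could probably be passed as the labels argument of scale_x_discrete(), assuming the axis breaks are these same strings in date order.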

For loop help (R)

I would like to have a for loop for this:
months = c("January", "February", "March", "April", "May", "June", "July", "August", "September", "October", "November", "December")
years = c(2018, 2019)
input = 17
for (i in 1:input) {
output[i] = paste(months[i], years[i], sep = " ")????
NEED HELP HERE. rep() ???
}
And I would like the output to be a vector that consists of 17 months:
Output = c("January 2018", "February 2018", "March 2018", "April 2018", ... , "May 2019")
Thank you very much for your help.
Another option will be this:
> c(outer(month.name, 2018:2019, paste))[1:17]
[1] "January 2018" "February 2018" "March 2018"
[4] "April 2018" "May 2018" "June 2018"
[7] "July 2018" "August 2018" "September 2018"
[10] "October 2018" "November 2018" "December 2018"
[13] "January 2019" "February 2019" "March 2019"
[16] "April 2019" "May 2019"
R already supplies a vector of month names, month.name. Because paste is vectorized and recycles its arguments, no for loop is needed, and since its default separator is " ", the code can simply be:
output <- paste( month.name, rep( years, each=12) )[1:17]
# test result ----
> output
[1] "January 2018" "February 2018" "March 2018" "April 2018" "May 2018" "June 2018"
[7] "July 2018" "August 2018" "September 2018" "October 2018" "November 2018" "December 2018"
[13] "January 2019" "February 2019" "March 2019" "April 2019" "May 2019"
The other way to do it would be with format applied to seq.Date results:
output <- format( seq( as.Date('2018-01-01'), as.Date('2019-05-01'), by="month") ,
"%B %Y" ) # argument to the format parameter for output
#---------------------
> output
[1] "January 2018" "February 2018" "March 2018" "April 2018" "May 2018" "June 2018"
[7] "July 2018" "August 2018" "September 2018" "October 2018" "November 2018" "December 2018"
[13] "January 2019" "February 2019" "March 2019" "April 2019" "May 2019"
See ?seq.Date and ?format.Date
Or, reusing the months vector defined in the question:
c(paste(months,"2018"),paste(months,"2019"))[1:17]
## [1] "January 2018" "February 2018" "March 2018" "April 2018" "May 2018" "June 2018"
## [7] "July 2018" "August 2018" "September 2018" "October 2018" "November 2018" "December 2018"
## [13] "January 2019" "February 2019" "March 2019" "April 2019" "May 2019"
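For completeness, the explicit loop the question asked about can also be written, although the vectorized versions above are shorter. A minimal sketch, using month.name in place of the hand-typed months vector and integer arithmetic to pick the month and year for each position:

```r
years <- c(2018, 2019)
input <- 17

# Preallocate the result instead of growing it inside the loop.
output <- character(input)

for (i in seq_len(input)) {
  month_idx <- (i - 1) %% 12 + 1   # cycles 1..12
  year_idx  <- (i - 1) %/% 12 + 1  # advances by one every 12 months
  output[i] <- paste(month.name[month_idx], years[year_idx])
}

print(output)
```

Note that this only works while input does not exceed 12 * length(years); beyond that, years[year_idx] would be NA.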

Sequence of Months in Words r?

Let's say I want to have a vector that contains each month in order starting with March 2012. I want that month and the next 33.
How would I do this?
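With base R (note that length.out = 33 returns 33 months in total; use length.out = 34 to get the starting month plus the next 33):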
dates <- seq.Date(from = as.Date("2012-03-01"), length.out = 33, by = "month")
format(dates, "%B %Y")
#> [1] "March 2012" "April 2012" "May 2012" "June 2012"
#> [5] "July 2012" "August 2012" "September 2012" "October 2012"
#> [9] "November 2012" "December 2012" "January 2013" "February 2013"
#> [13] "March 2013" "April 2013" "May 2013" "June 2013"
#> [17] "July 2013" "August 2013" "September 2013" "October 2013"
#> [21] "November 2013" "December 2013" "January 2014" "February 2014"
#> [25] "March 2014" "April 2014" "May 2014" "June 2014"
#> [29] "July 2014" "August 2014" "September 2014" "October 2014"
#> [33] "November 2014"
Created on 2019-01-27 by the reprex package (v0.2.1)
I found that you can also do this with the lubridate package:
library(lubridate)
df <- ymd("2012-03-01") + months(0:33)
df
This gives:
[1] "2012-03-01" "2012-04-01" "2012-05-01" "2012-06-01" "2012-07-01" "2012-08-01" "2012-09-01"
[8] "2012-10-01" "2012-11-01" "2012-12-01" "2013-01-01" "2013-02-01" "2013-03-01" "2013-04-01"
[15] "2013-05-01" "2013-06-01" "2013-07-01" "2013-08-01" "2013-09-01" "2013-10-01" "2013-11-01"
[22] "2013-12-01" "2014-01-01" "2014-02-01" "2014-03-01" "2014-04-01" "2014-05-01" "2014-06-01"
[29] "2014-07-01" "2014-08-01" "2014-09-01" "2014-10-01" "2014-11-01" "2014-12-01"

Using '\\s' vs ' ' in regular expressions, is '\\s' looking for space and backslash simultaneously?

Say I use the string below:
x <- "County January 2016 February 2016 March 2016 April 2016 May 2016 June 2016 July 2016 August 2016 September 2016 October 2016 November 2016 December 2016\r"
What I gather from running the two commands below is that '\s' will seek out a ' ' (space) as well as a '\' backslash.
strsplit(x, "(?<=2016) +", perl=T)
[[1]]
[1] "County January 2016"
[2] "February 2016"
[3] "March 2016"
[4] "April 2016"
[5] "May 2016"
[6] "June 2016"
[7] "July 2016"
[8] "August 2016"
[9] "September 2016"
[10] "October 2016"
[11] "November 2016"
[12] "December 2016\r"
strsplit(x, "(?<=2016)\\s+", perl=T)
[[1]]
[1] "County January 2016"
[2] "February 2016"
[3] "March 2016"
[4] "April 2016"
[5] "May 2016"
[6] "June 2016"
[7] "July 2016"
[8] "August 2016"
[9] "September 2016"
[10] "October 2016"
[11] "November 2016"
[12] "December 2016"
I gather this because the second command also splits on the '\' following the last 2016. Am I understanding this correctly? I appreciate the help, thanks.
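A quick check suggests the answer is no. In an R string literal, "\r" is a single carriage-return character rather than a backslash followed by r, and \s is a character class covering whitespace, which includes the carriage return. The pattern " +" only matches literal spaces, which is why the first command leaves the "\r" in place. A minimal sketch, with illustrative variable names:

```r
tail_piece <- "December 2016\r"

# "\r" counts as one character: 13 visible characters plus the carriage return.
n_chars <- nchar(tail_piece)

# The string contains no literal backslash at all.
has_backslash <- grepl("\\", tail_piece, fixed = TRUE)

# A literal space does not match a carriage return ...
space_matches_cr <- grepl(" ", "\r", fixed = TRUE)

# ... but the whitespace class \s does.
class_matches_cr <- grepl("\\s", "\r", perl = TRUE)

print(n_chars)
print(has_backslash)
print(space_matches_cr)
print(class_matches_cr)
```

The doubled backslash in "\\s" is only R's string escaping: the regex engine receives the two characters \s.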

Convert Date String in Format "Mon Day, Year Time am/pm" to POSIXlt format in R?

I'm trying to convert a string to a date time. I can't even get R to recognize and format the month though.
Here's a sample of the date formatting:
> for(i in sample_times){print(i)}
[1] "Aug 17, 2015 08:00 am"
[1] "Aug 15, 2015 12:06 am"
[1] "Jun 19, 2013 05:54 pm"
[1] "Aug 23, 2015 07:16 pm"
[1] "Aug 13, 2015 11:04 pm"
[1] "May 28, 2015 04:00 pm"
[1] "Jul 27, 2014 11:00 pm"
[1] "Aug 18, 2015 02:46 pm"
[1] "Aug 19, 2015 05:06 am"
[1] "Aug 25, 2015 07:27 pm"
[1] "Mar 12, 2014 10:14 am"
[1] "Aug 15, 2015 01:08 pm"
[1] "Aug 14, 2015 12:05 am"
[1] "Aug 18, 2015 11:07 am"
[1] "Aug 22, 2015 10:07 am"
[1] "Aug 24, 2015 12:38 pm"
[1] "Aug 21, 2015 08:17 pm"
[1] "Aug 23, 2015 08:34 pm"
[1] "Jun 11, 2015 09:44 pm"
[1] "Aug 23, 2015 01:54 pm"
Just calling strptime fails. Returns NAs.
Here's an example of what I tried, just to get the month out:
strptime(x = tolower(substr(sample_times, 0, 3)), format = "%b")
This returns NA for every value, even though tolower(substr(sample_times, 0, 3)) returns lower case month abbreviations
> tolower(substr(sample_times, 0, 3))
[1] "aug" "aug" "jun" "aug" "aug" "may" "jul" "aug" "aug" "aug" "mar" "aug" "aug" "aug" "aug" "aug" "aug" "aug" "jun" "aug"
How do I convert my strings of dates to POSIXlt format? Sorry if this is a newb question: just getting started with time formatting in R.
You will want to use the format "%b %d, %Y %I:%M %p" in as.POSIXlt()
%b for the abbreviated month (in the current locale)
%d for the day
%Y for the full four-digit year
%I for the hour on the 12-hour clock (01-12), used together with %p
%M for the minutes
%p for am/pm
So for a POSIXlt date-time, you can do
(x <- as.POSIXlt("Aug 19, 2015 07:09 am", format = "%b %d, %Y %I:%M %p"))
# [1] "2015-08-19 07:09:00 PDT"
As mentioned in the comments, you must create the complete POSIX object before you can extract its parts (formally, at least).
months(x)
# [1] "August"
months(x, abbreviate = TRUE)
# [1] "Aug"
Additionally, in order to use strptime(), you must plan on creating the entire date-time object as you cannot just create a month and have it classed as such. See help(strptime) and help(as.POSIXlt) for more.
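The same format string applies to a whole character vector at once. A minimal sketch using three of the sample strings from the question, assuming an English LC_TIME locale and passing tz = "UTC" only so the result does not depend on the local time zone; parsed is an illustrative name:

```r
sample_times <- c("Aug 17, 2015 08:00 am",
                  "Jun 19, 2013 05:54 pm",
                  "Aug 15, 2015 12:06 am")

# Parse the complete date-time first; the parts can be extracted afterwards.
parsed <- as.POSIXlt(sample_times, format = "%b %d, %Y %I:%M %p", tz = "UTC")

print(format(parsed, "%Y-%m-%d %H:%M"))

# Components of the parsed object
print(parsed$hour)                       # hour on the 24-hour clock
print(months(parsed, abbreviate = TRUE)) # abbreviated month names
```

If any element comes back as NA, the string did not match the format exactly (for example a missing comma) or the month abbreviations are not in the current locale.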
