Let's say I have this example df dataset
Order.Date
2011-10-20
2011-12-25
2012-04-15
2012-08-23
2013-09-25
I want to extract the month and the year into separate columns, so that the result looks like this:
Order.Date Month Year
2011-10-20 October 2011
2011-12-25 December 2011
2012-04-15 April 2012
2012-08-23 August 2012
2013-09-25 September 2013
Is there a solution? Anything works, lubridate or otherwise.
lubridate's month() and year() functions will work.
library(dplyr)  # provides %>% and mutate()
as.data.frame(Order.Date) %>%
  mutate(Month = lubridate::month(Order.Date, label = FALSE),
         Year = lubridate::year(Order.Date))
Order.Date Month Year
1 2011-10-20 10 2011
2 2011-12-25 12 2011
3 2012-04-15 4 2012
4 2012-08-23 8 2012
5 2013-09-25 9 2013
If you want the month formatted as Jan, use month.abb; for January, use month.name.
as.data.frame(Order.Date) %>%
  # index month.abb with the numeric month (1-12) rather than a factor label
  mutate(Month = month.abb[lubridate::month(Order.Date)],
         Year = lubridate::year(Order.Date))
Order.Date Month Year
1 2011-10-20 Oct 2011
2 2011-12-25 Dec 2011
3 2012-04-15 Apr 2012
4 2012-08-23 Aug 2012
5 2013-09-25 Sep 2013
as.data.frame(Order.Date) %>%
  # index month.name with the numeric month (1-12) rather than a factor label
  mutate(Month = month.name[lubridate::month(Order.Date)],
         Year = lubridate::year(Order.Date))
Order.Date Month Year
1 2011-10-20 October 2011
2 2011-12-25 December 2011
3 2012-04-15 April 2012
4 2012-08-23 August 2012
5 2013-09-25 September 2013
You can use format with %B for the month name and %Y for the year; months() is an alternative for the month name.
format(as.Date("2011-10-20"), "%B")
#months(as.Date("2011-10-20")) #Alternative
#[1] "October"
format(as.Date("2011-10-20"), "%Y")
#[1] "2011"
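To apply the same idea to the whole column from the question, a minimal base R sketch might look like the following. It assumes the data frame is called df, as in the question, and it looks the month up in the built-in month.name constant instead of using %B, so the names do not depend on the session locale.

```r
# Example data from the question; the name `df` follows the question text.
df <- data.frame(Order.Date = as.Date(c("2011-10-20", "2011-12-25", "2012-04-15",
                                        "2012-08-23", "2013-09-25")))

# format(x, "%m") returns the zero-padded month number as text; convert it to
# an integer and use it to index the built-in English month names.
df$Month <- month.name[as.integer(format(df$Order.Date, "%m"))]
df$Year  <- as.integer(format(df$Order.Date, "%Y"))
df
```

Swap month.name for month.abb if the abbreviated form is preferred.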
Related
I have monthly data and want to convert the period column to Date in R.
In addition, the rows of the data frame are not in chronological order.
df <- data.frame (period = c("March 2019", "February 2019", "January 2019", "May 2019","April 2019","August 2019","June 2019","July 2019","November 2019","September 2019","October 2019","December 2019"),sales = rnorm(12))
period sales
1 March 2019 1.841711557
2 February 2019 0.403043685
3 January 2019 0.524417978
4 May 2019 0.236378511
5 April 2019 -0.099441313
6 August 2019 0.001731664
7 June 2019 0.792067260
8 July 2019 -0.352379347
9 November 2019 1.174681909
10 September 2019 0.075480279
11 October 2019 -0.258695621
12 December 2019 -1.775315927
Paste a day number (1) in front of period, convert the result with as.Date using the matching format, then order the rows by the new dates.
transform(dat, period=as.Date(paste(1, period), '%d %B %Y')) |>
  {\(.) .[order(.$period), ]}()
# period sales
# 1 2019-01-01 0.25542882
# 5 2019-02-01 0.11748736
# 10 2019-03-01 0.98889173
# 6 2019-04-01 0.47499708
# 2 2019-05-01 0.46229282
# 8 2019-06-01 0.90403139
# 12 2019-07-01 0.08243756
# 7 2019-08-01 0.56033275
# 4 2019-09-01 0.97822643
# 9 2019-10-01 0.13871017
# 11 2019-11-01 0.94666823
# 3 2019-12-01 0.94001452
Data:
set.seed(42)
dat <- data.frame(period=sample(paste(month.name, 2019)),
sales=runif(12))
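as.Date parses month names according to the current locale, so the approach above can return NA in a non-English session. One possible workaround, sketched here on a small made-up three-row data frame, is to look the month up in month.name instead:

```r
# Illustrative data only; the values are not from the question.
df <- data.frame(period = c("March 2019", "January 2019", "February 2019"),
                 sales  = c(3, 1, 2),
                 stringsAsFactors = FALSE)

# Split "March 2019" into its month name and year.
parts <- strsplit(df$period, " ", fixed = TRUE)
mon   <- match(vapply(parts, `[`, character(1), 1), month.name)  # 1..12
yr    <- as.integer(vapply(parts, `[`, character(1), 2))

# Build first-of-month dates and sort chronologically.
df$period <- as.Date(sprintf("%d-%02d-01", yr, mon))
df <- df[order(df$period), ]
df
```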
I have a list of dates for company fiscal years. I would like to convert all dates that lie between 1st Jan - 31st May into a new variable where it says that it belongs to the prior year. I also have dates that range between 1st June - 31st Dec and I want those years to stay the same.
Example of what we want:
date year
2010-05-31 2009
2015-03-31 2014
2007-04-30 2006
2011-08-31 2011
2002-11-30 2002
Your help is much appreciated! Thank you!
You can do this in base R:
> df <- data.frame(date = as.Date(c("2010-05-31", "2015-03-31", "2007-04-30", "2011-08-31", "2002-11-30")))
> df$year <- as.numeric(format(df$date, "%Y")) - (as.numeric(format(df$date, "%m")) < 6)
> df
date year
1 2010-05-31 2009
2 2015-03-31 2014
3 2007-04-30 2006
4 2011-08-31 2011
5 2002-11-30 2002
Final year is the year minus 1 if month is before June.
Using dplyr and lubridate:
library(dplyr)
library(lubridate)
df %>% mutate(year = year(date) - as.integer(month(date) <= 5))
# date year
#1 2010-05-31 2009
#2 2015-03-31 2014
#3 2007-04-30 2006
#4 2011-08-31 2011
#5 2002-11-30 2002
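If the fiscal year might start in a month other than June, the same rule can be wrapped in a small helper. This is only a sketch: the names fiscal_year and start_month are made up for illustration and are not part of any package.

```r
# Hypothetical helper: `fiscal_year` and `start_month` are illustrative names.
fiscal_year <- function(date, start_month = 6) {
  yr  <- as.integer(format(date, "%Y"))
  mon <- as.integer(format(date, "%m"))
  # Months before the start of the fiscal year belong to the previous year.
  yr - as.integer(mon < start_month)
}

dates <- as.Date(c("2010-05-31", "2015-03-31", "2007-04-30",
                   "2011-08-31", "2002-11-30"))
fiscal_year(dates)
```

With the default start_month = 6 this gives the same years as the two answers above.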
I have a simple df with a column of dates in yearmon class:
df <- structure(list(year_mon = structure(c(2015.58333333333, 2015.66666666667,
2015.75, 2015.83333333333, 2015.91666666667, 2016, 2016.08333333333,
2016.16666666667, 2016.25, 2016.33333333333), class = "yearmon")), class = c("tbl_df",
"tbl", "data.frame"), row.names = c(NA, -10L))
I'd like a simple way, preferably using base R, lubridate or xts / zoo to calculate the first and last days of each month.
I've seen other packages that do this, but I'd like to stick with the aforementioned if possible.
We can use
library(dplyr)
library(lubridate)
library(zoo)
df %>%
  # day() returns the day of the month as a number, so these are day numbers, not dates
  mutate(firstday = day(year_mon), last = day(as.Date(year_mon, frac = 1)))
Using base R, you could convert the yearmon object to date using as.Date which would give you the first day of the month. For the last day, we could increment the date by a month (1/12) and subtract 1 day from it.
df$first_day <- as.Date(df$year_mon)
df$last_day <- as.Date(df$year_mon + 1/12) - 1
df
# year_mon first_day last_day
# <S3: yearmon> <date> <date>
# 1 Aug 2015 2015-08-01 2015-08-31
# 2 Sep 2015 2015-09-01 2015-09-30
# 3 Oct 2015 2015-10-01 2015-10-31
# 4 Nov 2015 2015-11-01 2015-11-30
# 5 Dec 2015 2015-12-01 2015-12-31
# 6 Jan 2016 2016-01-01 2016-01-31
# 7 Feb 2016 2016-02-01 2016-02-29
# 8 Mar 2016 2016-03-01 2016-03-31
# 9 Apr 2016 2016-04-01 2016-04-30
#10 May 2016 2016-05-01 2016-05-31
Use as.Date.yearmon from zoo as shown. frac specifies the fractional amount through the month to use so that 0 is beginning of the month and 1 is the end.
The default value of frac is 0.
You must already be using zoo if you are using yearmon (since that is where the yearmon methods are defined) so this does not involve using any additional packages beyond what you are already using.
If you are using dplyr, optionally replace transform with mutate.
transform(df, first = as.Date(year_mon), last = as.Date(year_mon, frac = 1))
gives:
year_mon first last
1 Aug 2015 2015-08-01 2015-08-31
2 Sep 2015 2015-09-01 2015-09-30
3 Oct 2015 2015-10-01 2015-10-31
4 Nov 2015 2015-11-01 2015-11-30
5 Dec 2015 2015-12-01 2015-12-31
6 Jan 2016 2016-01-01 2016-01-31
7 Feb 2016 2016-02-01 2016-02-29
8 Mar 2016 2016-03-01 2016-03-31
9 Apr 2016 2016-04-01 2016-04-30
10 May 2016 2016-05-01 2016-05-31
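For reference, zoo stores a yearmon value as year + (month - 1)/12, which is why the numbers in the question's structure() call look like 2015.58333333333. A rough base R sketch of the same first/last-day calculation, working directly on those numeric values and not using zoo at all, could be:

```r
# A few of the numeric values from the question's yearmon column.
ym <- c(2015.58333333333, 2015.66666666667, 2015.91666666667,
        2016, 2016.08333333333)

yr  <- floor(ym + 0.001)          # small offset guards against rounding error
mon <- round(12 * (ym - yr)) + 1  # turn the fraction back into a month 1..12

# First day of each month.
first <- as.Date(sprintf("%04d-%02d-01", as.integer(yr), as.integer(mon)))

# Last day = first day of the following month minus one day.
next_yr  <- yr + (mon == 12)
next_mon <- mon %% 12 + 1
last <- as.Date(sprintf("%04d-%02d-01", as.integer(next_yr), as.integer(next_mon))) - 1

data.frame(first, last)
```

In practice as.Date(year_mon, frac = 1) from zoo is shorter; the sketch is only meant to show what happens underneath.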
I have a df with dates formatted in the following way.
Date Year
<chr> <dbl>
Sunday, Jul 27 2008
Tuesday, Jul 29 2008
Wednesday, July 31 (1) 2008
Wednesday, July 31 (2) 2008
Is there a simple way to achieve the following format of columns and values? I'd also like to remove the (1) and (2) notations on the two July 31 dates.
Date Year Month Day Day_of_Week
2008-07-27 2008 07 27 Sunday
With base R, you can do:
dat <- data.frame(
Date = c("Sunday, Jul 27" ,"Tuesday, Jul 29", "Wednesday, July 31", "Wednesday, July 31"),
Year = rep(2008, 4),
stringsAsFactors = FALSE
)
dts <- as.POSIXlt(paste(dat$Year, dat$Date), format = "%Y %A, %B %d")
POSIXlt stores the date/time as a list of components (year, month, day of month and so on). To see them, try unclass(dts[1]).
From here it can be rather academic:
dat$Month = 1 + dts$mon # months are 0-based in POSIXlt
dat$Day = dts$mday
dat$Day_of_Week = weekdays(dts)
dat
# Date Year Month Day Day_of_Week
# 1 Sunday, Jul 27 2008 7 27 Sunday
# 2 Tuesday, Jul 29 2008 7 29 Tuesday
# 3 Wednesday, July 31 2008 7 31 Thursday
# 4 Wednesday, July 31 2008 7 31 Thursday
library(dplyr)
library(lubridate)
# tibble() replaces the deprecated data_frame()
dat = tibble(date = c('Sunday, Jul 27', 'Tuesday, Jul 29', 'Wednesday, July 31 (1)',
                      'Wednesday, July 31 (2)'), year = rep(2008, 4))
dat %>%
mutate(date = gsub("\\s*\\([^\\)]+\\)","",as.character(date)),
date = parse_date_time(date,'A, b! d ')) -> dat1
year(dat1$date) <- dat1$year
# A tibble: 4 × 2
date year
<dttm> <dbl>
1 2008-07-27 2008
2 2008-07-29 2008
3 2008-07-31 2008
4 2008-07-31 2008
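Neither answer shows the full set of columns asked for, and both rely on the locale for month and weekday names. A locale-independent base R sketch that builds all five columns might look like this; the object names (raw, clean, out and so on) are just illustrative.

```r
raw  <- c("Sunday, Jul 27", "Tuesday, Jul 29",
          "Wednesday, July 31 (1)", "Wednesday, July 31 (2)")
year <- rep(2008, 4)

clean <- sub(" *\\([0-9]+\\)$", "", raw)  # drop the trailing "(1)" / "(2)"
md    <- sub("^[^,]+, *", "", clean)      # drop the weekday: "Jul 27", "July 31"

parts <- strsplit(md, " +")
# The first three letters of the month are enough to match month.abb.
mon <- match(substr(vapply(parts, `[`, character(1), 1), 1, 3), month.abb)
day <- as.integer(vapply(parts, `[`, character(1), 2))

date <- as.Date(sprintf("%d-%02d-%02d", as.integer(year), mon, day))

# Recompute the weekday from the date itself; POSIXlt's wday is 0 for Sunday.
day_names <- c("Sunday", "Monday", "Tuesday", "Wednesday",
               "Thursday", "Friday", "Saturday")
out <- data.frame(Date = date, Year = year, Month = sprintf("%02d", mon),
                  Day = day, Day_of_Week = day_names[as.POSIXlt(date)$wday + 1],
                  stringsAsFactors = FALSE)
out
```

Note that the weekday is derived from the parsed date, so July 31, 2008 comes out as Thursday even though the input text says Wednesday.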
I have a dataset with dates in following format:
Initial:
Jan-2015 Apr-2013 Jun-2014 Jan-2015 Jan-2016 Jan-2015 Jan-2016 Jan-2015 Apr-2012 Nov-2012 Jun-2013 Sep-2013
Final:
Feb-2014 Jan-2013 Sep-2014 Apr-2013 Sep-2014 Mar-2013 Aug-2012 Apr-2012 Oct-2012 Oct-2013 Jun-2014 Oct-2013
I would like to perform these steps in R:
1. Create dummy variables for month and year.
2. Subtract one set of dates from the other to find the duration (final - initial) in months.
You could use as.yearmon from the zoo package for this.
library(zoo)
12 * (as.yearmon("Jan-2015", "%b-%Y") - as.yearmon("Feb-2014", "%b-%Y"))
# result
# [1] 11
To expand on @neilfws's answer, you can use the month and year functions from the lubridate package to create dummy variables holding the month and year in your data frame.
Here is the code:
library(lubridate)
library(zoo)
df <- data.frame(Initial = c("Jan-2015", "Apr-2013", "Jun-2014", "Jan-2015", "Jan-2016", "Jan-2015",
"Jan-2016", "Jan-2015", "Apr-2012", "Nov-2012", "Jun-2013", "Sep-2013"),
Final = c("Feb-2014", "Jan-2013", "Sep-2014", "Apr-2013", "Sep-2014", "Mar-2013",
"Aug-2012", "Apr-2012", "Oct-2012", "Oct-2013", "Jun-2014", "Oct-2013"))
df$Initial <- as.character(df$Initial)
df$Final <- as.character(df$Final)
df$Initial <- as.yearmon(df$Initial, "%b-%Y")
df$Final <- as.yearmon(df$Final, "%b-%Y")
df$month_initial <- month(df$Initial)
df$year_initial <- year(df$Initial)
df$month_final <- month(df$Final)
df$year_final <- year(df$Final)
df$Difference <- 12*(df$Initial-df$Final)
And here is the final data.frame:
> head(df)
Initial Final month_initial year_initial month_final year_final Difference
1 Jan 2015 Feb 2014 1 2015 2 2014 11
2 Apr 2013 Jan 2013 4 2013 1 2013 3
3 Jun 2014 Sep 2014 6 2014 9 2014 -3
4 Jan 2015 Apr 2013 1 2015 4 2013 21
5 Jan 2016 Sep 2014 1 2016 9 2014 16
6 Jan 2015 Mar 2013 1 2015 3 2013 22
Hope this helps!
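If you would rather avoid extra packages, the month difference can also be computed in base R by turning each "Mon-YYYY" string into a running month count. This is a sketch only; month_index is a made-up helper name, and it assumes English three-letter month abbreviations as in the question.

```r
# Hypothetical helper: converts "Jan-2015"-style strings to 12 * year + month.
month_index <- function(x) {
  mon <- match(substr(x, 1, 3), month.abb)  # "Jan" -> 1, ..., "Dec" -> 12
  yr  <- as.integer(substr(x, 5, 8))
  12L * yr + mon
}

initial <- c("Jan-2015", "Apr-2013", "Jun-2014")
final   <- c("Feb-2014", "Jan-2013", "Sep-2014")

# Same sign convention as the answers above (initial minus final);
# swap the operands to get final minus initial instead.
month_index(initial) - month_index(final)
```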