I have a simple df with a column of dates in yearmon class:
df <- structure(list(year_mon = structure(c(2015.58333333333, 2015.66666666667,
2015.75, 2015.83333333333, 2015.91666666667, 2016, 2016.08333333333,
2016.16666666667, 2016.25, 2016.33333333333), class = "yearmon")), class = c("tbl_df",
"tbl", "data.frame"), row.names = c(NA, -10L))
I'd like a simple way, preferably using base R, lubridate or xts / zoo to calculate the first and last days of each month.
I've seen other packages that do this, but I'd like to stick with the aforementioned if possible.
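We can use day() from lubridate, which returns the day-of-month numbers (the first is always 1; as.Date(year_mon, frac = 1) is the last date of the month):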
library(dplyr)
library(lubridate)
library(zoo)
df %>%
mutate(firstday = day(year_mon), last = day(as.Date(year_mon, frac = 1)))
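If full dates rather than day numbers are wanted, a lubridate-only sketch might look like the following. It assumes the column has already been converted to Date; the vector d is made up for illustration and is not the question's data.

```r
library(lubridate)

# Hypothetical example dates, not taken from the question's data
d <- as.Date(c("2015-08-15", "2016-02-10"))

# First day: round down to the start of the month
first_day <- floor_date(d, "month")

# Last day: step one month forward from the first day, then back one day
last_day <- first_day %m+% months(1) - days(1)

data.frame(d, first_day, last_day)
```

Going through the first of the month avoids end-of-month edge cases such as adding a month to January 31.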
Using base R, you could convert the yearmon object to date using as.Date which would give you the first day of the month. For the last day, we could increment the date by a month (1/12) and subtract 1 day from it.
df$first_day <- as.Date(df$year_mon)
df$last_day <- as.Date(df$year_mon + 1/12) - 1
df
# year_mon first_day last_day
# <S3: yearmon> <date> <date>
# 1 Aug 2015 2015-08-01 2015-08-31
# 2 Sep 2015 2015-09-01 2015-09-30
# 3 Oct 2015 2015-10-01 2015-10-31
# 4 Nov 2015 2015-11-01 2015-11-30
# 5 Dec 2015 2015-12-01 2015-12-31
# 6 Jan 2016 2016-01-01 2016-01-31
# 7 Feb 2016 2016-02-01 2016-02-29
# 8 Mar 2016 2016-03-01 2016-03-31
# 9 Apr 2016 2016-04-01 2016-04-30
#10 May 2016 2016-05-01 2016-05-31
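If the dates are already of class Date rather than yearmon, the same idea can be sketched without any package. The helper names first_of_month and last_of_month are hypothetical, chosen only for this illustration.

```r
# Hypothetical helpers, base R only
first_of_month <- function(d) as.Date(format(d, "%Y-%m-01"))

last_of_month <- function(d) {
  first <- first_of_month(d)
  # first + 31 always falls in the following month, so its
  # first-of-month minus one day is the last day of this month
  first_of_month(first + 31) - 1
}

d <- as.Date(c("2015-08-15", "2016-02-10", "2016-12-31"))
first_of_month(d)
# [1] "2015-08-01" "2016-02-01" "2016-12-01"
last_of_month(d)
# [1] "2015-08-31" "2016-02-29" "2016-12-31"
```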
Use as.Date.yearmon from zoo as shown. frac specifies the fractional amount through the month to use so that 0 is beginning of the month and 1 is the end.
The default value of frac is 0.
You must already be using zoo if you are using yearmon (since that is where the yearmon methods are defined) so this does not involve using any additional packages beyond what you are already using.
If you are using dplyr, optionally replace transform with mutate.
transform(df, first = as.Date(year_mon), last = as.Date(year_mon, frac = 1))
gives:
year_mon first last
1 Aug 2015 2015-08-01 2015-08-31
2 Sep 2015 2015-09-01 2015-09-30
3 Oct 2015 2015-10-01 2015-10-31
4 Nov 2015 2015-11-01 2015-11-30
5 Dec 2015 2015-12-01 2015-12-31
6 Jan 2016 2016-01-01 2016-01-31
7 Feb 2016 2016-02-01 2016-02-29
8 Mar 2016 2016-03-01 2016-03-31
9 Apr 2016 2016-04-01 2016-04-30
10 May 2016 2016-05-01 2016-05-31
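As a small follow-up, the two frac values can also be combined to count the days in each month. This is only a sketch; the two February values are made up for illustration and are not from the question.

```r
library(zoo)

# Illustrative yearmon values (a common year and a leap year)
ym <- as.yearmon(c("2015-02", "2016-02"), "%Y-%m")

first <- as.Date(ym)            # frac = 0 (default): first day of the month
last  <- as.Date(ym, frac = 1)  # frac = 1: last day of the month

# Number of days in each month
n_days <- as.numeric(last - first) + 1
n_days
# [1] 28 29
```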
I have monthly data and want to convert the period column to Date in R.
In addition, the rows of the data frame are not ordered by time.
df <- data.frame (period = c("March 2019", "February 2019", "January 2019", "May 2019","April 2019","August 2019","June 2019","July 2019","November 2019","September 2019","October 2019","December 2019"),sales = rnorm(12))
period sales
1 March 2019 1.841711557
2 February 2019 0.403043685
3 January 2019 0.524417978
4 May 2019 0.236378511
5 April 2019 -0.099441313
6 August 2019 0.001731664
7 June 2019 0.792067260
8 July 2019 -0.352379347
9 November 2019 1.174681909
10 September 2019 0.075480279
11 October 2019 -0.258695621
12 December 2019 -1.775315927
Use as.Date with an appropriate format on period with a 1 pasted in front (as the day), then order the rows.
transform(dat, period=as.Date(paste(1, period), '%d %b %Y')) |>
{\(.) .[order(.$period), ]}()
# period sales
# 1 2019-01-01 0.25542882
# 5 2019-02-01 0.11748736
# 10 2019-03-01 0.98889173
# 6 2019-04-01 0.47499708
# 2 2019-05-01 0.46229282
# 8 2019-06-01 0.90403139
# 12 2019-07-01 0.08243756
# 7 2019-08-01 0.56033275
# 4 2019-09-01 0.97822643
# 9 2019-10-01 0.13871017
# 11 2019-11-01 0.94666823
# 3 2019-12-01 0.94001452
Data:
set.seed(42)
dat <- data.frame(period=sample(paste(month.name, 2019)),
sales=runif(12))
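Note that parsing month names with as.Date depends on the session locale. Where that is a concern, a locale-independent sketch could look up the month in base R's month.name constant instead. The three-element period vector is just an illustration.

```r
# Illustrative input in the same "Month Year" shape as the question
period <- c("March 2019", "February 2019", "January 2019")

# Split "March 2019" into its month name and year
parts <- strsplit(period, " ", fixed = TRUE)
mon <- match(vapply(parts, `[`, character(1), 1), month.name)
yr  <- as.integer(vapply(parts, `[`, character(1), 2))

# Build first-of-month dates from the numeric pieces, then sort
dates <- as.Date(sprintf("%04d-%02d-01", yr, mon))
sorted <- dates[order(dates)]
sorted
# [1] "2019-01-01" "2019-02-01" "2019-03-01"
```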
I have dates in Year-Week format. Is it possible to convert them to the first day of the week, i.e. 201553 is 2015-12-28 and 201601 is 2016-01-04?
I found here how to do it; however, it does not work correctly on my dates. Could you help me do it without the ISOweek package?
date<-c(201553L, 201601L, 201602L, 201603L, 201604L, 201605L, 201606L,
201607L, 201608L, 201609L)
as.POSIXct(paste(date, "0"),format="%Y%u %w")
Here's a way,
date<-data.frame(first = c(201553L, 201601L, 201602L, 201603L, 201604L, 201605L, 201606L,
201607L, 201608L, 201609L))
First, separate the week and year from the integer:
library(stringr)
library(dplyr)
date = date %>% mutate(week = str_sub(date$first,5,6))
date = date %>% mutate(year = str_sub(date$first,1,4))
Then use the aweek package to find the date:
library(aweek)
date = date %>% mutate(actual_date = get_date(week = date$week, year = date$year))
first week year actual_date
1 201553 53 2015 2015-12-28
2 201601 01 2016 2016-01-04
3 201602 02 2016 2016-01-11
4 201603 03 2016 2016-01-18
5 201604 04 2016 2016-01-25
6 201605 05 2016 2016-02-01
7 201606 06 2016 2016-02-08
8 201607 07 2016 2016-02-15
9 201608 08 2016 2016-02-22
10 201609 09 2016 2016-02-29
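For reference, the same result can be sketched in base R alone, assuming the weeks follow ISO 8601 (weeks start on Monday, and January 4 always lies in week 1). The variable names are chosen for this illustration only.

```r
yw <- c(201553L, 201601L, 201609L)

year <- yw %/% 100L   # first four digits
week <- yw %% 100L    # last two digits

# January 4 is always in ISO week 1
jan4 <- as.Date(sprintf("%d-01-04", year))

# ISO weekday of January 4: Monday = 1, ..., Sunday = 7
iso_wday <- (as.POSIXlt(jan4)$wday + 6L) %% 7L + 1L

# Monday of week 1, then move forward whole weeks
week1_monday <- jan4 - (iso_wday - 1L)
week_start <- week1_monday + 7L * (week - 1L)
week_start
# [1] "2015-12-28" "2016-01-04" "2016-02-29"
```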
I have a dataset with dates in the following format:
Initial:
Jan-2015 Apr-2013 Jun-2014 Jan-2015 Jan-2016 Jan-2015 Jan-2016 Jan-2015 Apr-2012 Nov-2012 Jun-2013 Sep-2013
Final:
Feb-2014 Jan-2013 Sep-2014 Apr-2013 Sep-2014 Mar-2013 Aug-2012 Apr-2012 Oct-2012 Oct-2013 Jun-2014 Oct-2013
I would like to perform these steps:
Create dummy variables for month and year
Subtract one set of dates from the other to find the duration (final - initial) in months
How can I do these in R?
You could use as.yearmon from the zoo package for this.
library(zoo)
12 * (as.yearmon("Jan-2015", "%b-%Y") - as.yearmon("Feb-2014", "%b-%Y"))
# result
# [1] 11
To expand on @neilfws's answer, you can use the month and year functions from the lubridate package to create your dummy variables with the month and year in your data frame.
Here is the code:
library(lubridate)
library(zoo)
df <- data.frame(Initial = c("Jan-2015", "Apr-2013", "Jun-2014", "Jan-2015", "Jan-2016", "Jan-2015",
"Jan-2016", "Jan-2015", "Apr-2012", "Nov-2012", "Jun-2013", "Sep-2013"),
Final = c("Feb-2014", "Jan-2013", "Sep-2014", "Apr-2013", "Sep-2014", "Mar-2013",
"Aug-2012", "Apr-2012", "Oct-2012", "Oct-2013", "Jun-2014", "Oct-2013"))
df$Initial <- as.character(df$Initial)
df$Final <- as.character(df$Final)
df$Initial <- as.yearmon(df$Initial, "%b-%Y")
df$Final <- as.yearmon(df$Final, "%b-%Y")
df$month_initial <- month(df$Initial)
df$year_initial <- year(df$Initial)
df$month_final <- month(df$Final)
df$year_final <- year(df$Final)
df$Difference <- 12*(df$Initial-df$Final)
And here is the final data.frame:
> head(df)
   Initial    Final month_initial year_initial month_final year_final Difference
1 Jan 2015 Feb 2014             1         2015           2       2014         11
2 Apr 2013 Jan 2013             4         2013           1       2013          3
3 Jun 2014 Sep 2014             6         2014           9       2014         -3
4 Jan 2015 Apr 2013             1         2015           4       2013         21
5 Jan 2016 Sep 2014             1         2016           9       2014         16
6 Jan 2015 Mar 2013             1         2015           3       2013         22
Hope this helps!
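If whole-number month differences are needed without zoo, one possible sketch is to turn each "Mon-YYYY" string into a running month count using integer arithmetic. The helper name month_index is hypothetical, and the lookup uses base R's English month.abb, so it assumes English abbreviations in the data.

```r
initial <- c("Jan-2015", "Apr-2013", "Jun-2014")
final   <- c("Feb-2014", "Jan-2013", "Sep-2014")

# Hypothetical helper: months elapsed since year 0 for a "Mon-YYYY" string
month_index <- function(x) {
  mon <- match(substr(x, 1, 3), month.abb)  # "Jan" -> 1, ..., "Dec" -> 12
  yr  <- as.integer(substr(x, 5, 8))
  12L * yr + mon
}

# Same sign convention as the answer above (Initial - Final)
month_diff <- month_index(initial) - month_index(final)
month_diff
# [1] 11  3 -3
```

Because everything stays an integer, there is no floating-point rounding to worry about; swap the operands to get final - initial.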
diff(seq(as.Date("2016-12-21"), as.Date("2017-04-05"), by="month"))
Time differences in days
[1] 31 31 28
The above code generates the number of days in the months Dec, Jan and Feb.
However, my requirement is as follows:
#Results that I need
#monthly days from date 2016-12-21 to 2017-04-05
11, 31, 28, 31, 5
#i.e 11 days of Dec, 31 of Jan, 28 of Feb, 31 of Mar and 5 days of Apr.
I even tried days_in_month from lubridate but was not able to achieve the result:
library(lubridate)
days_in_month(c(as.Date("2016-12-21"), as.Date("2017-04-05")))
Dec Apr
31 30
Try this:
x = rle(format(seq(as.Date("2016-12-21"), as.Date("2017-04-05"), by=1), '%b'))
setNames(x$lengths, x$values)
# Dec Jan Feb Mar Apr
# 11 31 28 31 5
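The same rle idea can be wrapped in a small function. In this sketch the helper name days_per_month is hypothetical, and the key is "%Y-%m" instead of "%b" so that the labels do not depend on the locale and stay distinct across years.

```r
# Hypothetical helper: days per calendar month between two dates (inclusive)
days_per_month <- function(from, to) {
  days <- seq(as.Date(from), as.Date(to), by = "day")
  # Consecutive days share the same year-month key, so run lengths
  # are the number of days falling in each month
  runs <- rle(format(days, "%Y-%m"))
  setNames(runs$lengths, runs$values)
}

res <- days_per_month("2016-12-21", "2017-04-05")
res
# 2016-12 2017-01 2017-02 2017-03 2017-04
#      11      31      28      31       5
```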
Although we have seen a clever replacement of table by rle and a pure table solution, I want to add two approaches using grouping. All approaches have in common that they create a sequence of days between the two given dates and aggregate by month but in different ways.
aggregate()
This one uses base R:
# create sequence of days
days <- seq(as.Date("2016-12-21"), as.Date("2017-04-05"), by = 1)
# aggregate by month
aggregate(days, list(month = format(days, "%b")), length)
# month x
#1 Apr 5
#2 Dez 11
#3 Feb 28
#4 Jan 31
#5 Mrz 31
Unfortunately, the months are ordered alphabetically, as happened with the simple table() approach. (The month abbreviations Dez and Mrz come from my German locale.) In these situations, I do prefer the ISO8601 way of unambiguously naming the months:
aggregate(days, list(month = format(days, "%Y-%m")), length)
# month x
#1 2016-12 11
#2 2017-01 31
#3 2017-02 28
#4 2017-03 31
#5 2017-04 5
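If month names are preferred but the order should still follow the data, one possible base R sketch is to make the grouping variable a factor whose levels are in order of first appearance. Here the names come from month.abb rather than format(days, "%b"), so the labels are English regardless of locale.

```r
days <- seq(as.Date("2016-12-21"), as.Date("2017-04-05"), by = 1)

# Locale-independent month abbreviations
mon <- month.abb[as.integer(format(days, "%m"))]

# Levels in order of first appearance keep the chronological order
counts <- table(factor(mon, levels = unique(mon)))
counts
# Dec Jan Feb Mar Apr
#  11  31  28  31   5
```

This only works as long as the range covers less than a year; for longer ranges the same month name from different years would be merged.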
data.table
Now that I've got used to the data.table syntax, this is my preferred approach:
library(data.table)
data.table(days)[, .N, .(month = format(days, "%b"))]
# month N
#1: Dez 11
#2: Jan 31
#3: Feb 28
#4: Mrz 31
#5: Apr 5
The order of months is kept as they have appeared in the input vector.
I have the data below with date and count. Please help me transform it so that the months are columns and each row holds the data of one year.
Date count
=================
2011-01-01 10578
2011-02-01 9330
2011-03-01 10686
2011-04-01 10260
2011-05-01 10032
2011-06-01 9762
2011-07-01 10308
2011-08-01 9966
2011-09-01 10146
2011-10-01 10218
2011-11-01 8826
2011-12-01 9504
to
JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC
------------------------------------------------------------------------------
2011 10578 9330 10686 10260 10032 9762 10308 9966 10146 10218 8826 9504
2012 ....
This is a perfect task for ts in base R. Suppose your data.frame is x, then using ts will produce the output you want.
> ts(x$count, start=c(2011,1), end=c(2011,12), frequency=12)
Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
2011 10578 9330 10686 10260 10032 9762 10308 9966 10146 10218 8826 9504
If your data is in x try something like this:
library(reshape2)
res <- dcast(transform(x, month = format(Date, format="%m"),
year = format(Date, "%Y")),
year ~ month, value.var="count")
rownames(res) <- res$year
res <- res[,-1]
names(res) <- toupper(month.abb[as.numeric(names(res))])
res
This assumes that x$Date is already a date. If not, you will need to first convert it to a date:
x$Date <- as.Date(x$Date)
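For completeness, a package-free sketch of the same reshaping can be built on tapply. It rebuilds the 2011 data from the question as x and assumes one value per year and month, so sum simply picks up that single value.

```r
# The 2011 data from the question
x <- data.frame(
  Date  = seq(as.Date("2011-01-01"), by = "month", length.out = 12),
  count = c(10578, 9330, 10686, 10260, 10032, 9762,
            10308, 9966, 10146, 10218, 8826, 9504)
)

yr  <- format(x$Date, "%Y")
# Factor with all 12 levels so the columns come out in calendar order
mon <- factor(toupper(month.abb[as.integer(format(x$Date, "%m"))]),
              levels = toupper(month.abb))

# Rows are years, columns are months
wide <- tapply(x$count, list(yr, mon), sum)
wide
```

With more years in x, each year becomes its own row; year and month combinations without data show up as NA.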