R Convert number to month - r

I'm trying to build a time series. My data frame has each month listed as a number. When I use as.Date() I get NA. How do I convert a number to its respective month, as a date?
Example

Base R has built-in month constants, month.name and month.abb. Make sure your numbers are actually numeric by using as.numeric(), and then you can just use month.name[1], which outputs "January".
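A minimal sketch of that lookup, assuming the month numbers arrive as character strings. The vector m and the year 2020 are made up for illustration; month.name and month.abb are base R constants.

```r
# Hypothetical input: month numbers stored as text
m <- c("1", "3", "12")
idx <- as.numeric(m)

month.name[idx]  # "January" "March" "December"
month.abb[idx]   # "Jan" "Mar" "Dec"

# A Date needs a year and a day as well (2020 and day 1 are assumptions)
d <- as.Date(sprintf("2020-%02d-01", as.integer(idx)))
format(d)        # "2020-01-01" "2020-03-01" "2020-12-01"
```

A bare month number has no year or day, so as.Date() cannot parse it on its own, which would explain the NA in the question.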

Below we assume that the month numbers given are the number of months relative to a base of the first month, so, for example, month 13 would represent 12 months after month 1. We also assume that the months are unique, since that is the case in the question and since it is stated there that the data represents a time series.
1) Let base be a yearmon class object identifying the base year/month, and assume months is a vector of month numbers such that 1 is the base, 2 is one month later and so on. Since the yearmon class represents a year and month as year + 0 for Jan, year + 1/12 for Feb, ..., year + 11/12 for Dec, the code below gives a Date vector. Alternately, use ym directly, since it already models a year and month.
library(zoo)
# inputs
base <- as.yearmon("2020-01")
months <- 1:9
ym <- base + (months-1)/12
as.Date(ym)
## [1] "2020-01-01" "2020-02-01" "2020-03-01" "2020-04-01" "2020-05-01"
## [6] "2020-06-01" "2020-07-01" "2020-08-01" "2020-09-01"
For example, if we have this data.frame we can convert that to a zoo series or a ts series like this using base from above:
library(zoo)
DF <- data.frame(month = 1:9, value = 11:19) # input
z <- with(DF, zoo(value, base + (month-1)/12)) # zoo series
tt <- as.ts(z) # ts series
2) Alternately, if it were known that the series is consecutive months starting in January 2020 then we could ignore the month column and do this (where DF and base were shown above):
library(zoo)
zz <- zooreg(DF$value, base, freq = 12) # zooreg series
as.ts(zz) # ts series
3) This would also work to create a ts series if we can make the same assumptions as in (2). This uses only base R.
ts(DF$value, start = 2020, freq = 12)
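For comparison, the relative-month idea from (1) can also be sketched without zoo. This assumes the base month is January 2020, as above; base_date and month_no are illustrative names, not part of the answer.

```r
base_date <- as.Date("2020-01-01")  # assumed first month of the series
month_no <- c(1, 2, 13)             # 13 means 12 months after month 1

# One Date per month from the base onward, then pick the requested offsets
all_months <- seq(base_date, by = "month", length.out = max(month_no))
dates <- all_months[month_no]
format(dates)  # "2020-01-01" "2020-02-01" "2021-01-01"
```

Because the base date falls on the first of the month, stepping by "month" never runs into a day that does not exist in the target month.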

Related

How to use cut function on dates

I have the following two dates:
dates <- c("2019-02-01", "2019-06-30")
I want to create the following bins from above two dates:
2019-05-30, 2019-04-30, 2019-03-31, 2019-02-28
I used cut function along with seq,
dt <- as.Date(dates)
cut(seq(dt[1], dt[2], by = "month"), "month")
but this does not produce correct results.
Could you please shed some light on the use of cut function on dates?
We assume that what is wanted is all end of months between but not including the 2 dates in dates. In the question dates[1] is the beginning of the month and dates[2] is the end of the month but we do not assume that although if we did it might be simplified. We have produced descending series below but usually in R one uses ascending.
The first approach below uses a monthly sequence and cut and the second approach below uses a daily sequence.
No packages are used.
1) We define a first of the month function, fom, which given a Date or character date gives the Date of the first of the month using cut. Then we calculate monthly dates between the first of the months of the two dates, convert those to end of the month and then remove any dates that are not strictly between the dates in dates.
fom <- function(x) as.Date(cut(as.Date(x), "month"))
s <- seq(fom(dates[2]), fom(dates[1]), "-1 month")
ss <- fom(fom(s) + 32) - 1
ss[ss > dates[1] & ss < dates[2]]
## [1] "2019-05-31" "2019-04-30" "2019-03-31" "2019-02-28"
2) Another approach is to compute a daily sequence between the two elements of dates after converting to Date class and then only keep those for which the next day has a different month and is between the dates in dates. This does not use cut.
dt <- as.Date(dates)
s <- seq(dt[2], dt[1], "-1 day")
s[as.POSIXlt(s)$mon != as.POSIXlt(s+1)$mon & s > dt[1] & s < dt[2]]
## [1] "2019-05-31" "2019-04-30" "2019-03-31" "2019-02-28"
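On cut itself: applied to Date values with "month" breaks, it returns a factor whose levels are labelled by the first day of each month, which is why fom in (1) wraps it in as.Date. A small sketch with made-up dates:

```r
x <- as.Date(c("2019-02-15", "2019-02-28", "2019-04-03"))  # illustrative dates
f <- cut(x, "month")

class(f)                  # "factor", not "Date"
as.Date(as.character(f))  # "2019-02-01" "2019-02-01" "2019-04-01"

# Months with no observations still get a level
"2019-03-01" %in% levels(f)  # TRUE
```

So cut bins dates by the start of the period; getting end-of-month dates requires an extra step such as the one in (1).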
There is no need for cut here:
library(lubridate)
dates <- c("2019-02-01", "2019-06-30")
seq(min(ymd(dates)), max(ymd(dates)), by = "months") - 1
#> [1] "2019-01-31" "2019-02-28" "2019-03-31" "2019-04-30" "2019-05-31"
Created on 2021-11-25 by the reprex package (v2.0.1)

Converting character variable into date variable with only years in R

I have a character variable with n observations per year, like this:
Years <- c("2010","2010","2011", "2011", "2012", "2012")
I would like R to read these characters as actual years, therefore I tried with
dates <- as.Date(Years, format = "%Y")
But the output obtained is this:
[1] "2010-09-24" "2010-09-24" "2011-09-24" "2011-09-24" "2012-09-24" "2012-09-24"
while I would like to keep them only with the year.
Is it possible to obtain a Date variable with only the years?
Thanks
Date class objects require a year, month and day. If you only have a year then you either have to add a month and day or not use Date class.
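The output in the question arises because, when the format lacks a month and day, as.Date fills them in from the current date. Two simple ways around that are sketched below; using January 1 as the anchor is an arbitrary convention, and year_int and jan1 are illustrative names.

```r
Years <- c("2010", "2010", "2011", "2011", "2012", "2012")

# Option 1: keep the year as a plain integer (no Date class at all)
year_int <- as.integer(Years)

# Option 2: anchor every year to January 1 so a full Date exists
jan1 <- as.Date(paste0(Years, "-01-01"))
format(jan1, "%Y")  # "2010" "2010" "2011" "2011" "2012" "2012"
```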
Also a time series should have one point for each index value. Aggregate the values corresponding to each year using mean, tail1 <- function(x) tail(x, 1) or another aggregation function so that there is only one point per year.
xts does not support just a year as the index but zoo does and if the series is regularly spaced a ts series could be used as well.
library(zoo)
# note that we are assuming a numeric year
DF <- data.frame(year = c(2010, 2010, 2011, 2012, 2012), value = 1:5)
z <- read.zoo(DF, aggregate = mean)
tt <- as.ts(z)
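The same aggregate-then-index idea can be sketched in base R alone, assuming the years are consecutive so that a regular ts applies. DF is repeated from above; yearly and tt_base are illustrative names.

```r
DF <- data.frame(year = c(2010, 2010, 2011, 2012, 2012), value = 1:5)

# One value per year: the mean of each year's observations
yearly <- tapply(DF$value, DF$year, mean)  # 1.5, 3.0, 4.5

# Annual ts starting at the first year (assumes no missing years)
tt_base <- ts(as.numeric(yearly), start = min(DF$year), frequency = 1)
```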
Another possibility is to use yearmon or yearqtr class. This has both a year and month or a year and a quarter but you don't need a day and internally January or Q1 is stored as a number equal to the year.
library(xts)
zm <- read.zoo(DF, FUN = as.yearmon, aggregate = mean)
xm <- as.xts(zm)
zq <- read.zoo(DF, FUN = as.yearqtr, aggregate = mean)
xq <- as.xts(zq)

What are the R equivalents of the Stata functions qofd(), mofd() and wofd()?

What are the R equivalents of the Stata functions qofd(), mofd() and wofd()?
I am not looking for any R function that converts dates to strings (for instance, converting 10/13/2006 to 2006q4 using qofd()).
I want functions that convert a date into a float format which can be used directly (without conversion to other formats) in regression, and can show for example as 2006q4 when we look at the data.
Date
If d is of Date class then as.numeric(d) gives the number of days since the UNIX Epoch (which is January 1, 1970). If a Date class variable is used in a regression that is the numeric vector used as shown in this example.
y <- (1:10)^2
x <- as.Date("2000-01-01") + 0:9
xx <- as.numeric(x)
identical(unname(coef(lm(y ~ x))), unname(coef(lm(y ~ xx))))
## [1] TRUE
yearmon and yearqtr
The zoo package has yearmon and yearqtr classes that display as shown below but are represented internally as year + fraction. For yearmon the fraction is 0 for Jan, 1/12 for Feb, ..., 11/12 for Dec. For yearqtr the fraction is 0 for Q1, 1/4 for Q2, 2/4 for Q3 and 3/4 for Q4.
Here is how objects of these classes are rendered by default. format can be used to get other formats. See ?yearmon in the zoo package.
library(zoo)
as.yearmon("2000-01")
## [1] "Jan 2000"
as.yearqtr("2000-1")
## [1] "2000 Q1"
Here we show that regressing on a yearmon variable is the same as regressing on its numeric representation. A similar example could be given for yearqtr. y is from above.
ym <- as.yearmon(2000) + 0:9/12
num <- as.numeric(ym)
identical(unname(coef(lm(y ~ ym))), unname(coef(lm(y ~ num))))
## [1] TRUE
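The year + fraction representation described above can be reproduced in base R, which may help show what the numeric values look like. This is only a sketch of the idea, not how zoo implements it; the dates and variable names are made up.

```r
d <- as.Date(c("2006-10-13", "2007-02-01"))  # illustrative dates
lt <- as.POSIXlt(d)
year <- lt$year + 1900                       # POSIXlt stores years since 1900

mon_num <- year + lt$mon / 12                # month as year + 0/12 ... 11/12
qtr_num <- year + (lt$mon %/% 3) / 4         # 2006.75 2007.00

# A display label in the Stata-like style mentioned in the question
qtr_lab <- paste0(year, "q", lt$mon %/% 3 + 1)  # "2006q4" "2007q1"
```

The numeric vectors can go straight into a regression, while the label is only for display.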
weeks
The single line nextfri function defined in this zoo vignette:
https://cran.r-project.org/web/packages/zoo/vignettes/zoo-quickref.pdf
can be used to standardize dates to Fridays only. Replace 5 in that formula with another number between 0 and 6 to get that day of the week.
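library(zoo)
# nextfri as defined in the zoo quick reference vignette linked above
nextfri <- function(x) 7 * ceiling(as.numeric(x - 5 + 4) / 7) + as.Date(5 - 4)
yy <- (1:365)^2
ww <- nextfri(as.Date("2019-01-01") + 0:364)
# regress yy on next Friday
lm(yy ~ ww)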
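The same standardization can be sketched in base R with the weekday number from as.POSIXlt (0 = Sunday, ..., 6 = Saturday). The helper next_wday is a hypothetical name and not part of zoo.

```r
# Move each date forward to the next given weekday (5 = Friday);
# dates already on that weekday are left unchanged
next_wday <- function(x, wday = 5) {
  x <- as.Date(x)
  x + (wday - as.POSIXlt(x)$wday) %% 7
}

next_wday("2019-01-01")  # a Tuesday, so this gives "2019-01-04"
next_wday("2019-01-04")  # already a Friday: "2019-01-04"
```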

Adding quarters to R date

I have a R time series data, where I am calculating the means for all values up to a particular date, and storing this means in the date + 4 quarters. The dates are all month ends. To achieve this, I am looking to increment 4 quarters to a date. My question is how can I add 4 quarters to an R date data-type. An illustration:
a <- as.Date("2006-01-01")
b <- as.Date("2011-01-01")
date_range <- quarter(seq.Date(a, b, by = "quarter"), with_year = TRUE)
> date_range[1] + 1
[1] 2007.1
> date_range[1] + quarter(1)
[1] 2007.1
> date_range[1] + 0.25
[1] 2006.35
One possible way I am thinking is to get year-quarter dates, and then adding 4 to it. But wasn't sure what is the best way to do this?
The problem is that quarters have different lengths. Q1 is shortest because it includes February (though it ties with Q2 in leap years). Things like this make "adding a quarter to a date" poorly defined. Even adding months to a date can be tricky at the ends of months: what is 1 month after January 31?
Beginnings of months are more straightforward, and I would recommend you use the 1st day of quarters rather than the last (if you must use a specific date). lubridate provides functions like floor_date() and ceiling_date() to which you can pass unit = "quarter" and they will return the first day of the current or subsequent quarter, respectively. You can also always add months(3) to a day at the beginning of a month, though of course if your intention is to add 4 quarters you may as well just add 1 year.
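Both points can be seen in base R with seq. This is only an illustration using dates chosen for the purpose; q_starts is an illustrative name.

```r
# Month-end pitfall: January 31 plus one month overflows into March
rolled <- seq(as.Date("2006-01-31"), by = "month", length.out = 2)
format(rolled)    # "2006-01-31" "2006-03-03"

# First-of-quarter dates step cleanly; the 5th element is 4 quarters later
q_starts <- seq(as.Date("2006-01-01"), by = "quarter", length.out = 5)
format(q_starts)
# "2006-01-01" "2006-04-01" "2006-07-01" "2006-10-01" "2007-01-01"
```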
Just add 12 months or a year instead?
Or if it must be quarters, define yourself a function, like so:
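# months() here is lubridate's period constructor, so library(lubridate) is needed;
# note that this definition masks base::quarters()
quarters <- function(x) {
  months(3 * x)
}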
and then use it to add to the date sequence:
date_range <- seq.Date(a, b, by = "quarter")
date_range + quarters(4)
Lubridate has a function for quarters already included. This is a much better solution than creating your own function.
https://www.rdocumentation.org/packages/lubridate/versions/1.7.4/topics/quarter
Old answer, but for those arriving here: lubridate has an operator, %m+%, that adds months and preserves month ends.
a <- as.Date("2006-01-01")
Add future months worth of dates:
The original poster wanted 4 quarters in future so that will be 12 months.
future_date <- a %m+% months(12)
future_date
[1] "2007-01-01"
You could also do years as the period:
future_date <- a %m+% years(1)
Remove months from date:
Subtract dates with %m-%
If you wanted a date 3 months ago from 1/1/2006:
past_date <- a %m-% months(3)
past_date
[1] "2005-10-01"
Example with dates not at end of months:
%m-% (like %m+%) preserves the day of the month when that day exists in the target month:
as.Date("2022-10-10") %m-% months(3)
[1] "2022-07-10"
For more, see documentation on "Add and subtract months to a date without exceeding the last day of the new month"
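To make the "without exceeding the last day of the new month" behaviour concrete, here is a rough base R sketch of the same clamping idea. The helper add_months_clamped is hypothetical; it is not lubridate's implementation and only mimics the behaviour for Date input.

```r
# Add n months to Date(s); if the day does not exist in the target month,
# fall back to that month's last day
add_months_clamped <- function(d, n) {
  lt <- as.POSIXlt(as.Date(d))
  day <- lt$mday
  lt$mday[] <- 1L                  # go to the first of the month
  lt$mon <- lt$mon + n             # shift months; as.Date() normalizes overflow
  first <- as.Date(lt)             # first day of the target month
  nxt <- as.POSIXlt(first)
  nxt$mon <- nxt$mon + 1L
  last_day <- as.integer(format(as.Date(nxt) - 1, "%d"))
  first + pmin(day, last_day) - 1
}

add_months_clamped("2006-01-31", 1)   # "2006-02-28"
add_months_clamped("2022-10-10", -3)  # "2022-07-10"
```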
Note that other answers that use Date class will give irregularly spaced series and so are unsuitable for time series analysis.
To do this in such a way that time series analyses can be performed and noting the zoo tag on the question, the yearmon class represents year/month as year + fraction where fraction is 0 for Jan, 1/12 for Feb, 2/12 for Mar, ..., 11/12 for Dec. Thus adding 4 quarters is just a matter of adding 1. (Adding x quarters is done by adding x/4.)
library(zoo)
ym <- yearmon(2006) + 0:11/12 # months in 2006
ym + 1 # one year later
Also this converts yearmon objects to end-of-month Date and in the second line Date to yearmon. Using frac = 0 or omitting frac in the first line would convert to beginning of month dates.
d <- as.Date(ym, frac = 1) # d is Date vector of end-of-months
as.yearmon(d) # convert Date vector to yearmon
If your input dates represent quarters then there is also the yearqtr class which represents a year/quarter as year + fraction where fraction is 0, 1/4, 2/4, 3/4 for the 4 quarters of a year. Adding 4 quarters is done by adding 1 (or to add x quarters add x/4).
yq <- as.yearqtr(2006) + 0:3/4 # all quarters in 2006
yq + 1 # one year later
Conversions work similarly to yearmon:
d <- as.Date(yq, frac = 1) # d is Date vector of end-of-quarters
as.yearqtr(d) # convert Date vector to yearqtr

R: Creating two date variables from a complete date

I have date recorded as: Month/Day/Year or MM/DD/YYYY
I would like to write code that creates two new variables from that information.
I would like a year variable alone
I would like to create a quarter variable
The Quarter Variables would not be influenced by year. I would want this variable to apply to all years.
Quarter 1 would be January 1 - March 31
Quarter 2 would be April 1 - June 30
Quarter 3 would be July 1 - September 30
Quarter 4 would be October 1 - December 31
Any assistance would be greatly appreciated. I cannot seem to get the nuance of how to do these functions in R.
Thanks,
Jared
Assuming that the date variable is of class POSIX** you could do:
#example date
date <- as.POSIXlt( "05/12/2015", format='%m/%d/%Y')
To return the year from a date, data.table already provides a function, year:
library(data.table)
> year(date)
[1] 2015
As for the quarter it can easily be created from the function below (uses data.table::month that returns the number of a month):
quarter <- function(x) {
rep(c('quarter 1','quarter 2','quarter 3','quarter 4'), each=3)[month(x)]
}
> quarter(date)
[1] "quarter 2"
Using only the base packages:
Try parsing your dates with the strptime function, so that all dates are in the Year-Month-Day format. This format constrains each element of the date to have the same character length and the same position. Look at the strptime documentation for the appropriate formatting argument.
date.vec<-c("1/1/1999","2/2/1999") # dates must be quoted strings
fmt.date.vec<-strptime(date.vec, "%m/%d/%Y")
With the dates in this format it is easy to extract the year, month, and day using the substring function
Year<-substring(fmt.date.vec,1,4)
Month<-substring(fmt.date.vec,6,7)
Day<-substring(fmt.date.vec,9,10)
With this information you can now generate your Quarter vector any number of ways. For example if a data.frame "df" has a Month column:
df$Quarter<-"Quarter_1"
df[df$Month %in% c("04","05","06"),]$Quarter<-"Quarter_2"
df[df$Month %in% c("07","08","09"),]$Quarter<-"Quarter_3"
df[df$Month %in% c("10","11","12"),]$Quarter<-"Quarter_4"
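A shorter base R route, sketched under the assumption that the dates are character strings in MM/DD/YYYY form: parse them with as.Date, then use format for the year and the base function quarters for the quarter. The vector x and the names yr and qtr are illustrative.

```r
x <- c("05/12/2015", "11/30/2016")          # illustrative input
d <- as.Date(x, format = "%m/%d/%Y")

yr <- as.integer(format(d, "%Y"))           # 2015 2016
qtr <- quarters(d)                          # "Q2" "Q4"

data.frame(date = d, Year = yr, Quarter = qtr)
```

quarters labels depend only on the month, so the same quarter definition applies to every year.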
