Conversion of NA yearqtr to string in zoo: possible bug? - r

In zoo an NA yearqtr is converted to the string "NA QNA" (which is not NA). For example
library(zoo)
qq <- as.yearqtr(c('2015 Q1', NA))
is.na(as.character(qq)) == is.na(qq) # returns TRUE FALSE
In contrast with base date we have:
dd <- as.Date(c('2015-1-1', NA))
is.na(as.character(dd)) == is.na(dd) # returns TRUE TRUE
My impression is that the date behavior is the expected behavior. Should I report this to zoo? (And if so, what is the best way to do so? Email maintainer?)

Thanks for pointing out this bug. And yes, the simplest way to report such problems is by e-mail to the maintainer (= me).
I've just fixed the problem in the development version of zoo (1.8-0 to be) on R-Forge. After running install.packages("zoo", repos="http://R-Forge.R-project.org") you should get the expected behavior:
library("zoo")
qq <- as.yearqtr(c("2015 Q1", NA))
as.character(qq)
## [1] "2015 Q1" NA
is.na(as.character(qq)) == is.na(qq)
## [1] TRUE TRUE
A new CRAN release is planned in the next days or next week.
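For zoo versions that still return "NA QNA", one possible stopgap is to restore the missing values after conversion. The sketch below is only an illustration: as_character_keep_na is a hypothetical helper name, not part of zoo, and the zoo part runs only if the package is installed.

```r
# Hypothetical helper (not part of zoo): convert to character while
# keeping missing values as NA_character_.
as_character_keep_na <- function(x) {
  out <- as.character(x)
  out[is.na(x)] <- NA_character_
  out
}

# Base Date already behaves this way, so the helper changes nothing here.
dd <- as.Date(c("2015-01-01", NA))
as_character_keep_na(dd)

# With zoo installed, the same helper can be applied to a yearqtr vector.
if (requireNamespace("zoo", quietly = TRUE)) {
  qq <- zoo::as.yearqtr(c("2015 Q1", NA))
  print(is.na(as_character_keep_na(qq)) == is.na(qq))
}
```

Once the fixed zoo release is installed, the helper is no longer needed.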

Related

How to make never ending quarters in r studio [duplicate]

I want to generate a sequence of dates with one quarter interval, with a starting date and ending date. I have below code :
> seq(as.Date('1980-12-31'), as.Date('1985-06-30'), by = 'quarter')
[1] "1980-12-31" "1981-03-31" "1981-07-01" "1981-10-01" "1981-12-31"
[6] "1982-03-31" "1982-07-01" "1982-10-01" "1982-12-31" "1983-03-31"
[11] "1983-07-01" "1983-10-01" "1983-12-31" "1984-03-31" "1984-07-01"
[16] "1984-10-01" "1984-12-31" "1985-03-31"
As you can see, this does not generate the sequence I expect. I don't understand where the date "1981-07-01" comes from; I would expect "1981-06-30".
Is there any way to generate such sequence correctly with quarter interval?
Thanks for your time.
The from and to dates in the question are both end-of-quarter dates so we assume that that is the general case you are interested in.
1) Create a sequence of yearqtr objects yq and then convert them to Date class. frac = 1 tells as.Date to use the last day of the quarter. Alternatively, just use yq, since that directly models years with quarters.
library(zoo)
from <- as.Date('1980-12-31')
to <- as.Date('1985-06-30')
yq <- seq(as.yearqtr(from), as.yearqtr(to), by = 1/4)
as.Date(yq, frac = 1)
giving:
[1] "1980-12-31" "1981-03-31" "1981-06-30" "1981-09-30" "1981-12-31"
[6] "1982-03-31" "1982-06-30" "1982-09-30" "1982-12-31" "1983-03-31"
[11] "1983-06-30" "1983-09-30" "1983-12-31" "1984-03-31" "1984-06-30"
[16] "1984-09-30" "1984-12-31" "1985-03-31" "1985-06-30"
2) Or, without any packages, add 1 to from and to so that they fall on the first day of the next month, create the sequence (seq has no trouble with first-of-month sequences), and then subtract 1 from the generated sequence, giving the same result as above.
seq(from + 1, to + 1, by = "quarter") - 1
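As a rough illustration of why the original call produced "1981-07-01": month-based steps keep the day of the month (31 here), and a step that lands on a nonexistent date such as "June 31" is normalised to the next valid day. The sketch below reproduces that overflow by hand in base R and then checks the first-of-month workaround; the variable names overflowed and quarter_ends are only for illustration.

```r
from <- as.Date("1980-12-31")
to   <- as.Date("1985-06-30")

# Step six months forward while keeping day-of-month 31.
# The nominal result "1981-06-31" does not exist, so it rolls over.
lt <- as.POSIXlt(from)
lt$mon <- lt$mon + 6L
overflowed <- as.Date(lt)
overflowed

# Shifting to the first of the following month avoids the overflow,
# and subtracting one day afterwards returns to the quarter ends.
quarter_ends <- seq(from + 1, to + 1, by = "quarter") - 1
quarter_ends
```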
Using the clock package and R >= 4.1:
library(clock)
seq(year_quarter_day(1980, 4), year_quarter_day(1985, 2), by = 1) |>
set_day("last") |>
as_date()
# [1] "1980-12-31" "1981-03-31" "1981-06-30" "1981-09-30" "1981-12-31" "1982-03-31" "1982-06-30" "1982-09-30" "1982-12-31"
# [10] "1983-03-31" "1983-06-30" "1983-09-30" "1983-12-31" "1984-03-31" "1984-06-30" "1984-09-30" "1984-12-31" "1985-03-31"
# [19] "1985-06-30"
Note that this includes the final quarter. I don't know if that was your intent.
Different definition of "quarter". A quarter might well be (although it is not in R) 365/4 days. Look at the output of:
as.Date('1980-12-31')+(365/4)*(0:12)
#[1] "1980-12-31" "1981-04-01" "1981-07-01" "1981-09-30" "1981-12-31" "1982-04-01" "1982-07-01" "1982-09-30"
#[9] "1982-12-31" "1983-04-01" "1983-07-01" "1983-09-30" "1983-12-31"
In order to avoid the days of the month from surprising you, you need to use a starting day of the month between 1 and 28, at least in non-leap years.
seq(as.Date('1981-01-01'), as.Date('1985-06-30'), by = 'quarter')
[1] "1981-01-01" "1981-04-01" "1981-07-01" "1981-10-01" "1982-01-01" "1982-04-01" "1982-07-01" "1982-10-01"
[9] "1983-01-01" "1983-04-01" "1983-07-01" "1983-10-01" "1984-01-01" "1984-04-01" "1984-07-01" "1984-10-01"
[17] "1985-01-01" "1985-04-01"

Is there a function in R to get closest next quarter end date for a given date?

I am trying to find out closest next quarter end date for a given date in R.
For example, if the input is "2022-02-23", the output should be "2022-03-31"
and if the input is "2022-03-07", the output should be "2022-06-30".
If the input is "2021-12-15", the output should be "2022-03-31".
Is there any function in R for this?
lubridate::quarter with argument type = "date_last" will get you most of the way there. From the comments, it looks like you want to jump to the following quarter if the date is in the last month of a quarter; we can achieve this by adding a month to each date before passing to quarter. We can add months safely using the %m+% operator.
library(lubridate)
dates_in <- ymd(c("2022-02-23", "2022-03-07", "2021-12-15"))
dates_out <- quarter(dates_in %m+% months(1), type = "date_last")
dates_out
# "2022-03-31" "2022-06-30" "2022-03-31"
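Here is a function built on lubridate's quarter function. Note that it returns the last day of the quarter that contains the date, so for a date in the last month of a quarter (such as "2021-12-15") it does not jump to the following quarter as the question asks.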
last_day_in_quarter <- function(d){
require(lubridate)
last_month_in_quarter <- ymd(paste(year(d),quarter(d)*3,1))
return(last_month_in_quarter %m+% months(1) - 1)
}
last_day_in_quarter(ymd("2021-12-15")) #"2021-12-31"
last_day_in_quarter(ymd("2022-02-15")) #"2022-03-31"
last_day_in_quarter(ymd("2021-05-15")) #"2021-06-30"
last_day_in_quarter(ymd("2021-07-15")) #"2021-09-30"
I think these kinds of problems become immensely easier to understand if you work with a true year-quarter-day type. There is one of these in the clock package (I am the author).
library(clock)
x <- date_parse(c("2022-02-23", "2022-03-07", "2021-12-15"))
x
#> [1] "2022-02-23" "2022-03-07" "2021-12-15"
# What quarter are we in now?
yqd <- as_year_quarter_day(x)
yqd
#> <year_quarter_day<January><day>[3]>
#> [1] "2022-Q1-54" "2022-Q1-66" "2021-Q4-76"
# Is the current month the same as the end-of-quarter month?
# (if so, we are going to shift forward by 1 quarter).
shift <- get_month(x) == get_month(as.Date(set_day(yqd, "last")))
shift
#> [1] FALSE TRUE TRUE
# Shift by 1 quarter where applicable
yqd[shift] <- yqd[shift] + duration_quarters(1)
yqd
#> <year_quarter_day<January><day>[3]>
#> [1] "2022-Q1-54" "2022-Q2-66" "2022-Q1-76"
# Set day to end of quarter
yqd <- set_day(yqd, "last")
yqd
#> <year_quarter_day<January><day>[3]>
#> [1] "2022-Q1-90" "2022-Q2-91" "2022-Q1-90"
# Now convert back to Date
as.Date(yqd)
#> [1] "2022-03-31" "2022-06-30" "2022-03-31"
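For comparison, the same rule (jump to the following quarter when the date falls in the last month of a quarter) can be sketched in base R with plain year/month arithmetic. This is a minimal sketch under that reading of the question; next_quarter_end is a hypothetical name, not a function from any package.

```r
# Hypothetical helper: quarter end reached after moving one month ahead.
next_quarter_end <- function(d) {
  lt <- as.POSIXlt(as.Date(d))
  # Move one month ahead using year/month arithmetic only,
  # so the day of the month never causes an overflow.
  m0 <- lt$mon + 1L                     # 0-based month index, may be 12
  year <- lt$year + 1900L + m0 %/% 12L
  m0 <- m0 %% 12L
  # 0-based index of the month after the quarter ends (3, 6, 9 or 12).
  after <- (m0 %/% 3L + 1L) * 3L
  year <- year + after %/% 12L
  after <- after %% 12L
  # First day of the month after the quarter, minus one day.
  as.Date(sprintf("%d-%02d-01", year, after + 1L)) - 1
}

next_quarter_end(c("2022-02-23", "2022-03-07", "2021-12-15"))
```

For the three dates from the question this gives "2022-03-31", "2022-06-30" and "2022-03-31".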

How to read quarterly data with R?

I'm trying to use Bayesian VAR, but I can't even get my data into shape. I get them from https://sdw.ecb.europa.eu/ but since a lot of them are quarterly data I have trouble merging my variables, because I'm unable to convert, for example, "2020-Q1" from character to Date with as.Date.
I used the sub function to get 2020-1, for example, and then tried as.Date(, format="%Y-%q") but it doesn't work, so I'm stuck.
textData <- "yearQuarter,Amount
2019-Q1,1000
2019-Q2,2000
2019-Q3,3000"
df <- read.csv(text=textData,header = TRUE,stringsAsFactors = FALSE)
as.Date(df$yearQuarter,format="%Y-%q")
...which produces:
> as.Date(df$yearQuarter,format="%Y-%q")
[1] NA NA NA
Thank you for your help !
library(lubridate)
d = yq("2020-Q1")
d
# [1] "2020-01-01"
year(d)
# [1] 2020
quarter(d)
# [1] 1
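The NA results in the question arise because %q is not a conversion specification that base as.Date understands. As a package-free sketch, the year and quarter can be split out of the string and mapped to the first day of each quarter; the variable names below are only illustrative. If zoo is installed, its as.yearqtr accepts %q in a format string, which is shown in the guarded block at the end.

```r
x <- c("2019-Q1", "2019-Q2", "2019-Q3")

# Split "YYYY-Qn" into its year and quarter parts.
year <- as.integer(sub("-Q[1-4]$", "", x))
qtr  <- as.integer(sub("^[0-9]{4}-Q", "", x))

# First month of each quarter: 1, 4, 7 or 10.
first_month <- 3L * (qtr - 1L) + 1L
quarter_start <- as.Date(sprintf("%d-%02d-01", year, first_month))
quarter_start

# Alternative, only if zoo is available: keep the values as yearqtr.
if (requireNamespace("zoo", quietly = TRUE)) {
  print(zoo::as.yearqtr(x, format = "%Y-Q%q"))
}
```

Either representation gives a common key that quarterly series can be merged on.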

BUG in the r xts package's to.period function?

I inherited some R code that analyses simulation results. At one point, that code calls the xts package's to.monthly function with indexAt = 'yearmon' to summarize some values in a zoo.
That code normally runs without issue. Recently, however, when analysing simulations over much older data, the call to to.monthly generated some disturbing Warning messages like this:
Warning in zoo(xx, order.by = index(x), ...) :
some methods for “zoo” objects do not work if the index entries in ‘order.by’ are not unique
I culled my data down to the minimum size that still exhibits this Warning. Start with this R code:
library(xts)
z = structure(c(-1062503.35419463, -1080996.55425821, -1099783.92018741,
-1122831.06978888, -1138804.79976585, -1158620.33101501, -1163717.44859603,
-1183250.17288897, -1212428.97863421, -1234981.23171341, -1253605.89670471,
-1269885.84780747, -1272023.98376509, -1284471.17954946, -1313114.61914572,
-1334861.551294, -1349971.87378146, -1360596.77251109, -1363047.71977556,
-1383840.30131117, -1407963.97518998, -1427010.7195352, -1451908.36211767,
-1464563.94519573, -1470017.67402451, -1503642.02732151, -1529231.67395429,
-1560593.79655716, -1582052.24505653, -1595391.99583389), index = structure(c(1111985820,
1112072340, 1112158740, 1112245140, 1112331540, 1112392740, 1112587140,
1112673540, 1112759880, 1112846340, 1112932200, 1112993940, 1113191940,
1113278340, 1113364560, 1113451080, 1113537540, 1113598740, 1113796560,
1113883140, 1113969540, 1114055940, 1114142220, 1114203540, 1114401480,
1114487940, 1114574280, 1114660740, 1114747080, 1114808340), class = c("POSIXct",
"POSIXt")), class = "zoo")
class(z)
head(z)
tail(z)
Then execute this call to to.monthly:
to.monthly(z, indexAt = 'yearmon', name = "Monthly")
On my machine that generates this output:
Warning in zoo(xx, order.by = index(x), ...) :
some methods for “zoo” objects do not work if the index entries in ‘order.by’ are not unique
Warning in zoo(xx, order.by = index(x), ...) :
some methods for “zoo” objects do not work if the index entries in ‘order.by’ are not unique
Monthly.Open Monthly.High Monthly.Low Monthly.Close
Apr 2005 -1062503 -1062503 -1138805 -1138805
Apr 2005 -1158620 -1158620 -1595392 -1595392
Note the Warning messages, followed by the result of to.monthly, which is a zoo that has the duplicate position of "Apr 2005".
I spent some time executing the code in to.monthly line by line, and determined that the bug actually happens inside to.monthly's call to to.period.
In particular, I found that the xx local variable inside to.period is initially calculated correctly, but after the line
indexClass(xx) <- indexAt
is executed, the positions of xx become non-unique.
That behavior sure looks like a bug in the xts package's to.period function to me.
I would love to hear from someone who knows how to.monthly/to.period/yearmon really work: either confirm that this is a bug, or explain why it is not and suggest a workaround.
I found this possibly related report on the xts github page (which I do not fully understand).
Concerning my machine:
> sessionInfo()
R version 3.4.1 (2017-06-30)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 17134)
...
other attached packages:
...
xts_0.10-0
zoo_1.8-0
When I startup Rgui, I see this Warning message about xts:
Warning: package ‘xts’ was built under R version 3.4.2
This looks like a bug, unrelated to #158. The problem is that the index of z is POSIXct in your local timezone. You aggregate to monthly, which doesn't have a timezone (so xts sets the timezone attribute to "UTC").
But the change in timezone occurs on the POSIXct index, which changes the local time before the index is converted to "yearmon". So, depending on your local timezone's offset from UTC, this may convert the first (last) observation in a month into the last (first) observation of the prior (next) month.
To illustrate:
Sys.setenv(TZ = "America/Chicago")
debugonce(xts:::`indexClass<-.xts`)
to.monthly(z, indexAt="yearmon", name="monthly")
# <snip>
# Browse[2]>
# debug: attr(attr(x, "index"), "tzone") <- "UTC"
# Browse[2]> print(x) # When timezone is "America/Chicago"
# monthly.Open monthly.High monthly.Low monthly.Close
# 2005-03-31 22:59:00 -1062503 -1062503 -1138805 -1138805
# 2005-04-29 15:59:00 -1158620 -1158620 -1595392 -1595392
# Browse[2]>
# debug: attr(attr(x, "index"), "tclass") <- value
# Browse[2]> print(x) # When timezone is "UTC"
# monthly.Open monthly.High monthly.Low monthly.Close
# 2005-04-01 04:59:00 -1062503 -1062503 -1138805 -1138805
# 2005-04-29 20:59:00 -1158620 -1158620 -1595392 -1595392
# Warning message:
# timezone of object (UTC) is different than current timezone ().
You can see that the call to attr(attr(x, "index"), "tzone") <- "UTC" pushed the last observation in March into the first day of April (note that the debugger lists the next call it will evaluate above my calls to print(x)).
Thanks for narrowing it down to the indexClass<- call. That made it a lot easier for me to debug!
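The month shift described above can be reproduced in base R without xts. The sketch below takes the fifth index value from the question's data (the last observation of the first row in the output) and formats it in both time zones; it assumes the "America/Chicago" time zone database entry is available on the machine.

```r
# Fifth index value from the question's data, as seconds since the epoch.
last_march_obs <- as.POSIXct(1112331540, origin = "1970-01-01", tz = "UTC")

# The same instant falls in different calendar months depending on
# the time zone used to interpret it.
month_chicago <- format(last_march_obs, "%Y-%m", tz = "America/Chicago")
month_utc     <- format(last_march_obs, "%Y-%m", tz = "UTC")

month_chicago
month_utc
```

In Chicago local time the observation belongs to March 2005, while in UTC it already belongs to April 2005, which is how two rows end up labelled "Apr 2005".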

Date sequence in R spanning B.C.E. to A.D

I would like to generate a sequence of dates from 10,000 B.C.E. to the present. This is easy for 0 C.E. (or A.D.):
ADtoNow <- seq.Date(from = as.Date("0/1/1"), to = Sys.Date(), by = "day")
But I am stumped as to how to generate dates before 0 AD. Obviously, I could do years before present but it would be nice to be able to graph something as BCE and AD.
To expand on Ricardo's suggestion, here is some testing of how things work. Or don't work for that matter.
I will repeat Joshua's warning taken from ?as.Date for future searchers in big bold letters:
"Note: Years before 1CE (aka 1AD) will probably not be handled correctly."
as.integer(as.Date("0/1/1"))
[1] -719528
as.integer(seq(as.Date("0/1/1"),length=2,by="-10000 years"))
[1] -719528 -4371953
seq(as.Date(-4371953,origin="1970-01-01"),Sys.Date(),by="1000 years")
# nonsense
[1] "0000-01-01" "'000-01-01" "(000-01-01" ")000-01-01" "*000-01-01"
[6] "+000-01-01" ",000-01-01" "-000-01-01" ".000-01-01" "/000-01-01"
[11] "0000-01-01" "1000-01-01" "2000-01-01"
> as.integer(seq(as.Date(-4371953,origin="1970-01-01"),Sys.Date(),by="1000 years"))
# also possibly nonsense
[1] -4371953 -4006710 -3641468 -3276225 -2910983 -2545740 -2180498 -1815255
[9] -1450013 -1084770 -719528 -354285 10957
Though this does seem to work for graphing somewhat:
yrs1000 <- seq(as.Date(-4371953,origin="1970-01-01"),Sys.Date(),by="1000 years")
plot(yrs1000,rep(1,length(yrs1000)),axes=FALSE,ann=FALSE)
box()
axis(2)
axis(1,at=yrs1000,labels=c(paste(seq(10000,1000,by=-1000),"BC",sep=""),"0AD","1000AD","2000AD"))
title(xlab="Year",ylab="Value")
Quite some time has gone by since this question was asked. With that time came a new R package, gregorian, which can handle BCE time values in the as_gregorian method.
Here's an example of piecewise constructing a list of dates that range from -10000 BCE to the current year.
library(lubridate)
library(gregorian)
# Container for the dates
dates <- c()
starting_year <- year(now())
# Add the CE dates to the list
for (year in starting_year:0){
date <- sprintf("%s-%s-%s", year, "1", "1")
dates <- c(dates, gregorian::as_gregorian(date))
}
starting_year <- "-10000"
# Add the BCE dates to the list
for (year in starting_year:0){
start_date <- gregorian::as_gregorian("-10000-1-1")
date <- sprintf("%s-%s-%s", year, "1", "1")
dates <- c(dates, gregorian::as_gregorian(date))
}
How you use the list is up to you, just know that the relevant properties of the date objects are year and bce. For example, you can loop over list of dates, parse the year, and determine if it's BCE or not.
> gregorian_date <- gregorian::as_gregorian("-10000-1-1")
> gregorian_date$bce
[1] TRUE
> gregorian_date$year
[1] 10001
Notes on 0AD
The gregorian package assumes that when you mean Year 0, you're really talking about year 1 (shown below). I personally think an exception should be thrown, but that's the mapping users need to keep in mind.
> gregorian::as_gregorian("0-1-1")
[1] "Monday January 1, 1 CE"
This is also the case with BCE
> gregorian::as_gregorian("-0-1-1")
[1] "Saturday January 1, 1 BCE"
As @JoshuaUlrich commented, the short answer is no.
However, you can splice out the year into a separate column and then convert to integer. Would this work for you?
The package lubridate seems to handle "negative" years ok, although it does create a year 0, which from the above comments seems to be inaccurate. Try:
library(lubridate)
start <- -10000
stop <- 2013
myrange <- NULL
for (x in start:stop) {
myrange <- c(myrange,ymd(paste0(x,'-01-01')))
}
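If only the year matters (for example, for axis labels), a package-free sketch is to keep years as signed integers and build the labels separately. The code below assumes astronomical year numbering, in which year 0 corresponds to 1 BCE and year -1 to 2 BCE; label_year is a hypothetical helper name.

```r
# Signed integer years under astronomical numbering (0 means 1 BCE).
years <- seq.int(-10000L, 2000L, by = 1000L)

# Hypothetical helper: turn a signed year into a BCE/CE label.
label_year <- function(y) {
  ifelse(y <= 0L, paste(1L - y, "BCE"), paste(y, "CE"))
}

labels <- label_year(years)
labels
```

The integer vector can then be used directly as the x variable in a plot, with labels supplied to axis().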
