I am getting wrong value of quarters - r

For the code below I am getting the wrong quarters. Please help me with this issue.
qy= cut.POSIXt(as.POSIXct(c("2015-09-01 IST","2016-08-1 IST")), breaks="quarter", labels=FALSE,include.lowest=T)
qy
# [1] 1 5

cut.POSIXt (with labels=FALSE) gives you quarters relative to the quarter of the earliest date, min(x). Numbering starts at 1 for that quarter, and every other date gets 1 plus the number of quarters between its quarter and the first one. So when you give dates in Q3 of two consecutive years, the first is 1, and the second is 4 quarters later, i.e. 5.
If you're trying to get the quarter within the year for each date use quarters or lubridate::quarter:
quarters(as.POSIXct(c("2015-09-01 IST","2016-08-1 IST")))
[1] "Q3" "Q3"
lubridate::quarter(as.POSIXct(c("2015-09-01 IST","2016-08-1 IST")))
[1] 3 3
Note that quarters comes out as a string starting with "Q", whereas lubridate::quarter comes out as an integer.
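To see where the 1 and 5 come from, here is a minimal base R sketch. It is not what cut.POSIXt does internally, just the same arithmetic spelled out; the variable names are made up for illustration.

```r
x <- as.Date(c("2015-09-01", "2016-08-01"))
lt <- as.POSIXlt(x)

# Quarter within the year: months 0-2 -> 1, 3-5 -> 2, 6-8 -> 3, 9-11 -> 4
quarter_in_year <- lt$mon %/% 3 + 1

# Running quarter count across years, renumbered so the earliest date is 1
quarter_index <- (lt$year + 1900) * 4 + quarter_in_year
relative_quarter <- quarter_index - min(quarter_index) + 1

quarter_in_year   # 3 3
relative_quarter  # 1 5
```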

Related

How to subtract a number of weeks from a yearweek/weeknumber in R?

I have a couple of week numbers of interest. Let's take '202124' (this week) as an example. How can I subtract x weeks from this week number?
Let's say I want to know the week number of 2 weeks prior. Ideally I would like to do 202124 - 2, which would give me 202122. This is fine for most of the year; however, 202101 - 2 will give 202099, which is obviously not a valid week number. This would happen on a large scale, so a more elegant solution is required. How could I go about this?
Convert the year-week values to dates, subtract the offset in days, and format the output.
x <- c('202124', '202101')
format(as.Date(paste0(x, 1), '%Y%W%u') - 14, '%Y%V')
#[1] "202122" "202052"
To convert a year-week value to a date we also need a day of the week; I have used 1, the first day of the week.
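If the week numbers are ISO 8601 weeks, a variant that stays in ISO terms on both input and output can be sketched as below. It assumes ISO weeks (the question does not say which convention is used), and shift_yearweek is a hypothetical helper name. It relies on the fact that 4 January always falls in ISO week 1, and on format() supporting %G and %V on output.

```r
# Hypothetical helper: shift "YYYYWW" ISO year-week strings by a number of weeks
shift_yearweek <- function(yw, weeks) {
  year <- as.integer(substr(yw, 1, 4))
  week <- as.integer(substr(yw, 5, 6))

  # 4 January is always in ISO week 1; step back to that week's Monday
  jan4 <- as.Date(sprintf("%d-01-04", year))
  week1_monday <- jan4 - (as.integer(format(jan4, "%u")) - 1)

  # Monday of the requested week, shifted by whole weeks
  monday <- week1_monday + 7 * (week - 1)
  format(monday + 7 * weeks, "%G%V")
}

shift_yearweek(c("202124", "202101"), -2)
# [1] "202122" "202052"
```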

How to calculate the difference in months between two dates in TOSCA?

Date 1: 10/25/2020
Date 2: 01/25/2021
Difference = 3
What is the formula to find the difference between two dates in months in TOSCA?
WEEKDAY :
To find the day of the week of a given date
{CALC[WEEKDAY(DATE(2018,3,18))]}
Expected result: For the above-given date, the result will be 1 (Sunday)
Sunday= 1, Monday=2, Tuesday=3,
Wednesday=4, Thursday=5, Friday=6 Saturday=7
DATEDIF :
To find the difference between two given dates.
{CALC[DATEDIF(DATE(2018,2,10),DATE(2018,2,21), """d""")]}
Expected result: Above expression will give you the results as 11 (Difference between the given dates is 11 days.)
IF :
Given two dates we find which date is bigger amongst the two dates
{CALC[IF(DATE(2018,1,3)>DATE(2018,3,24),"""True""","""False""")]}
Expected results: This expression gives you the result as false as the first given date is smaller than the second one.
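None of the expressions above returns a month count directly. The arithmetic for whole months between two dates is sketched below in R, not in TOSCA syntax, so treat it only as an illustration of the calculation; months_between is a made-up helper name.

```r
# Hypothetical helper: completed months from `from` to `to`
months_between <- function(from, to) {
  a <- as.POSIXlt(from)
  b <- as.POSIXlt(to)
  n <- 12 * (b$year - a$year) + (b$mon - a$mon)
  # Drop one month if the day of month has not been reached yet
  n - (b$mday < a$mday)
}

months_between(as.Date("2020-10-25"), as.Date("2021-01-25"))
# [1] 3
```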

Replacement of missing day and month in dates using R

This question is about how to replace missing days and months in a data frame using R. Considering the data frame below, 99 denotes missing day or month and NA represents dates that are completely unknown.
df<-data.frame("id"=c(1,2,3,4,5),
"date" = c("99/10/2014","99/99/2011","23/02/2016","NA",
"99/04/2009"))
I am trying to replace the missing days and months based on the following criteria:
For dates with missing day but known month and year, the replacement date would be a random selection from the middle of the interval (first day to the last day of that month). Example, for id 1, the replacement date would be sampled from the middle of 01/10/2014 to 31/10/2014. For id 5, this would be the middle of 01/04/2009 to 30/04/2009. Of note is the varying number of days for different months, e.g. 31 days for October and 30 days for April.
As in the case of id 2, where both day and month are missing, the replacement date is a random selection from the middle of the interval (first day to last day of the year), e.g 01/01/2011 to 31/12/2011.
Please note: complete dates (e.g. the case of id 3) and NAs are not to be replaced.
I have tried by making use of the seq function together with the as.POSIXct and as.Date functions to obtain the sequence of dates from which the replacement dates are to be sampled. The difficulty I am experiencing is how to automate the R code to obtain the date intervals (it varies across distinct id) and how to make a random draw from the middle of the intervals.
The expected output would have the date of id 1, 2 and 5 replaced but those of id 3 and 4 remain unchanged. Any help on this is greatly appreciated.
This isn't the prettiest, but it seems to work and adapts to differing month and year lengths:
set.seed(999)
df$dateorig <- df$date
seld <- grepl("^99/", df$date)
selm <- grepl("^../99", df$date)
md <- seld & (!selm)
mm <- seld & selm
df$date <- as.Date(gsub("99","01",as.character(df$date)), format="%d/%m/%Y")
monrng <- sapply(df$date[md], function(x) seq(x, length.out=2, by="month")[2]) - as.numeric(df$date[md])
df$date[md] <- df$date[md] + sapply(monrng, sample, 1)
yrrng <- sapply(df$date[mm], function(x) seq(x, length.out=2, by="12 months")[2]) - as.numeric(df$date[mm])
df$date[mm] <- df$date[mm] + sapply(yrrng, sample, 1)
#df
# id date dateorig
#1 1 2014-10-14 99/10/2014
#2 2 2011-02-05 99/99/2011
#3 3 2016-02-23 23/02/2016
#4 4 <NA> NA
#5 5 2009-04-19 99/04/2009
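A more explicit variant of the same idea, as a rough sketch: work out the first and last day of the unknown interval for each row and draw one day from it. It assumes a uniform draw over the whole interval, both endpoints included (the question's "middle of the interval" could be read differently), and impute_date is a hypothetical helper name.

```r
# Hypothetical helper: fill in a "dd/mm/yyyy" string where 99 marks a missing part
impute_date <- function(s) {
  if (is.na(s) || s == "NA") return(as.Date(NA))        # unknown dates stay NA
  p <- as.integer(strsplit(s, "/", fixed = TRUE)[[1]])  # day, month, year
  day <- p[1]; month <- p[2]; year <- p[3]
  if (day != 99 && month != 99) return(as.Date(s, format = "%d/%m/%Y"))

  if (month == 99) {
    # Month unknown: the interval is the whole year
    start <- as.Date(sprintf("%d-01-01", year))
    end   <- as.Date(sprintf("%d-12-31", year))
  } else {
    # Only the day unknown: the interval is that month, whatever its length
    start <- as.Date(sprintf("%d-%02d-01", year, month))
    end   <- seq(start, by = "month", length.out = 2)[2] - 1
  }
  start + sample.int(as.integer(end - start) + 1L, 1L) - 1L
}

df <- data.frame(id = 1:5,
                 date = c("99/10/2014", "99/99/2011", "23/02/2016", "NA",
                          "99/04/2009"))
set.seed(999)
df$date_filled <- do.call(c, lapply(as.character(df$date), impute_date))
df
```

Because each draw is confined to start..end, a filled-in date cannot spill over into the following month or year.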

Adding quarters to R date

I have R time series data where I calculate the mean of all values up to a particular date and store that mean at the date + 4 quarters. The dates are all month ends. To achieve this, I need to add 4 quarters to a date. My question is how to add 4 quarters to an R Date. An illustration:
a <- as.Date("2006-01-01")
b <- as.Date("2011-01-01")
date_range <- quarter(seq.Date(a, b, by = "quarter"), with_year = TRUE)
> date_range[1] + 1
[1] 2007.1
> date_range[1] + quarter(1)
[1] 2007.1
> date_range[1] + 0.25
[1] 2006.35
One possible way I am thinking of is to get year-quarter dates and then add 4 to them, but I wasn't sure what the best way to do this is.
The problem is that quarters have different lengths. Q1 is shortest because it includes February (though it ties with Q2 in leap years). Things like this make "adding a quarter to a date" poorly defined. Even adding months to a date can be tricky at the ends of months: what is 1 month after January 31?
Beginnings of months are more straightforward, and I would recommend you use the 1st day of quarters rather than the last (if you must use a specific date). lubridate provides functions like floor_date() and ceiling_date() to which you can pass unit = "quarter" and they will return the first day of the current or subsequent quarter, respectively. You can also always add months(3) to a day at the beginning of a month, though of course if your intention is to add 4 quarters you may as well just add 1 year.
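A short sketch of those lubridate calls, using an arbitrary example date that is not from the question:

```r
library(lubridate)

d <- as.Date("2006-05-17")                    # arbitrary date in Q2 2006

q_start <- floor_date(d, unit = "quarter")    # 2006-04-01
q_next  <- ceiling_date(d, unit = "quarter")  # 2006-07-01

q_start + months(3)   # 2006-07-01, start of the next quarter
q_start + years(1)    # 2007-04-01, i.e. four quarters later
```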
Just add 12 months or a year instead?
Or if it must be quarters, define yourself a function, like so:
quarters <- function(x) {
months(3*x)
}
and then use it to add to the date sequence:
date_range <- seq.Date(a, b, by = "quarter")
date_range + quarters(4)
Lubridate has a function for quarters already included. This is a much better solution than creating your own function.
https://www.rdocumentation.org/packages/lubridate/versions/1.7.4/topics/quarter
This is an old question, but for those arriving here: lubridate has an operator, %m+%, that adds months without running past the last day of the resulting month.
a <- as.Date("2006-01-01")
Add future months worth of dates:
The original poster wanted 4 quarters in future so that will be 12 months.
future_date <- a %m+% months(12)
future_date
[1] "2007-01-01"
You could also do years as the period:
future_date <- a %m+% years(1)
Remove months from date:
Subtract dates with %m-%
If you wanted a date 3 months ago from 1/1/2006:
past_date <- a %m-% months(3)
past_date
[1] "2005-10-01"
Example with dates not at end of months:
%m-% keeps the day of the month whenever that day exists in the target month:
as.Date("2022-10-10") %m-% months(3)
[1] "2022-07-10"
For more, see documentation on "Add and subtract months to a date without exceeding the last day of the new month"
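The difference from plain + only shows up at month ends. A small sketch, with dates chosen just for illustration:

```r
library(lubridate)

jan31 <- as.Date("2006-01-31")

jan31 + months(1)      # NA, because 31 February does not exist
jan31 %m+% months(1)   # 2006-02-28, rolled back to the last valid day

# The roll-back only clamps; it does not snap every result to a month end
as.Date("2006-02-28") %m+% months(1)   # 2006-03-28
```

So if the series must stay on month ends after the shift, %m+% alone is not enough when the starting month is shorter than the target month.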
Note that other answers that use Date class will give irregularly spaced series and so are unsuitable for time series analysis.
To do this in such a way that time series analyses can be performed and noting the zoo tag on the question, the yearmon class represents year/month as year + fraction where fraction is 0 for Jan, 1/12 for Feb, 2/12 for Mar, ..., 11/12 for Dec. Thus adding 4 quarters is just a matter of adding 1. (Adding x quarters is done by adding x/4.)
library(zoo)
ym <- yearmon(2006) + 0:11/12 # months in 2006
ym + 1 # one year later
The first line below converts yearmon objects to end-of-month Dates, and the second converts Dates back to yearmon. Using frac = 0 or omitting frac in the first line would convert to beginning-of-month dates.
d <- as.Date(ym, frac = 1) # d is Date vector of end-of-months
as.yearmon(d) # convert Date vector to yearmon
If your input dates represent quarters then there is also the yearqtr class which represents a year/quarter as year + fraction where fraction is 0, 1/4, 2/4, 3/4 for the 4 quarters of a year. Adding 4 quarters is done by adding 1 (or to add x quarters add x/4).
yq <- as.yearqtr(2006) + 0:3/4 # all quarters in 2006
yq + 1 # one year later
Conversions work similarly to yearmon:
d <- as.Date(yq, frac = 1) # d is Date vector of end-of-quarters
as.yearqtr(d) # convert Date vector to yearqtr
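Applied to the question's setup (month-end Dates shifted by 4 quarters), the round trip might look like the sketch below. The input dates are made up for illustration.

```r
library(zoo)

month_ends <- as.Date(c("2006-01-31", "2007-02-28"))

# Date -> yearmon, add 1 (= 4 quarters), then back to an end-of-month Date
shifted <- as.Date(as.yearmon(month_ends) + 1, frac = 1)
shifted
# [1] "2007-01-31" "2008-02-29"
```

Note that the second result lands on 29 February because 2008 is a leap year, so month ends stay month ends.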

calculating ages in R by subtracting two dates columns

I have 2 columns with ~ 2000 rows of dates in them. One is a variable with a visit date (df$visitdate), and the other is a birth date of the individual (df$birthday).
Wondering if there is any simple way to subtract the visit date - birth date to create the variable "age at the time of the visit", accounting for leap years, etc.
I tried to use the following code (from an answer in a similar question) but it didn't work in my case.
find number of seconds in one year:
seconds_in_a_year <- as.integer((seconds(ymd("2010-01-01")) - seconds(ymd("2009-01-01"))))
now obtain number of seconds between the 2 dates you desire
seconds_between_dates <- as.integer(seconds(date1) - seconds(date2))
your final answer for number of years in floating points will be
years_between_dates <- seconds_between_dates / seconds_in_a_year
When I tried to apply this to my data frame (note: using variables rather than specific dates, so this may be the cause) I got the following:
seconds_in_a_year <- as.integer((seconds(ymd(df$visitdate)) - seconds(ymd(df$birthday))))
Warning message:
NAs introduced by coercion
Following the code along I got a final output of:
years_between_dates
[1] 1.157407e-05 [2] 1.157407e-05
Any help is greatly appreciated!
Subtracting from a Date object another Date object gives you the time difference in days, e.g.
> dates = as.Date(c("2007-03-01", "2004-05-23"))
>
> dates[1] - dates[2]
Time difference of 1012 days
So, assuming 365 days in a year
> age_time_visit = as.numeric(dates[1] - dates[2]) / 365
> age_time_visit
[1] 2.772603
There are various answers for this scattered around the internet.
I think the one I've typically used was inspired by Professor Ripley:
http://r.789695.n4.nabble.com/Calculate-difference-between-dates-in-years-td835196.html
age_years <- function(first, second)
{
lt <- data.frame(first, second)
age <- as.numeric(format(lt[,2],format="%Y")) - as.numeric(format(lt[,1],format="%Y"))
first <- as.Date(paste(format(lt[,2],format="%Y"),"-",format(lt[,1],format="%m-%d"),sep=""))
age[which(first > lt[,2])] <- age[which(first > lt[,2])] - 1
age
}
There's another approach at https://gist.github.com/mmparker/7254445
Or, if you just want the raw decimal value of years, you can get the number of days and divide by 365.2425.
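For whole columns of dates, one more option is lubridate's interval arithmetic. This is only a sketch: the column names follow the question, and the example dates are invented.

```r
library(lubridate)

df <- data.frame(
  birthday  = as.Date(c("1990-06-15", "1990-06-15")),
  visitdate = as.Date(c("2015-06-14", "2015-06-15"))
)

span <- interval(df$birthday, df$visitdate)
df$age_years   <- span %/% years(1)          # completed years (age at visit)
df$age_decimal <- time_length(span, "years") # fractional years

df$age_years
# [1] 24 25
```

Both calls are vectorized, so no loop or Vectorize() wrapper is needed for ~2000 rows.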
Here is an approach that accounts for leap years (don't know if this has been done before, but suspect it has...).
get.age <- function(from, to) {
require(lubridate) # for leap_year(...)
n <- as.integer(to-from)
n.l <- sum(leap_year(seq(from,to,by=1)))
n.l/366 + (n+1-n.l)/365
}
get.age(as.Date("2009-01-01"),as.Date("2012-12-31"))
# [1] 4
get.age(as.Date("2012-01-01"),as.Date("2012-01-31")) # 2012 was a leap year
# [1] 0.08469945
get.age(as.Date("2011-01-01"),as.Date("2011-01-31")) # 2011 was not
# [1] 0.08493151
So the basic idea is to create a vector with one element for every day between from and to (inclusive), then for each day account for whether that day is part of a leap year or not. Then we add up the leap-year days and the non-leap-year days separately and calculate the number of years as:
leap-year-days/366 + non-leap-year-days/365
This works for single dates (vectors of length 1). To enable this for columns of dates, as you asked, we use Vectorize(...).
vget.age <- Vectorize(get.age) # vectorized version
And then a demo:
# example data set
set.seed(1) # for reproducible example
today <- as.Date("2015-09-09")
df <- data.frame(birth.date=today-sample(1000:10000,2000)) # 2000 birthdays
result <- vget.age(df$birth.date,today) # how old are they?
head(result)
# [1] 9.282192 11.909589 16.854795 25.115068 7.706849 24.865753
