Determine week number from date over several years - r

I'm looking for a way to determine the week number (week beginning on Monday) over several years. That means I don't want to have 0-53 but if, let's say I have 2 years of dates, I want them to be numbered with 0-106 in R.
I tried strftime(Datum, format ="%W") but then I only get the annual week number and not as a whole.

Given that you did not provide any data, I took the liberty of creating some:
#create data
Datum<-c("2013-03-01", "2014-06-02", "2013-06-01")
# format data to year-month-day with strptime
Datum<-strptime(Datum, "%Y-%m-%d")
You now need to identify the origin year. As I'm sure you are aware not all years have the same number of weeks 52.29 in a leap year vs. 52.4 in a standard calendar year but as this is unlikely to be a consideration for only 2 years we can use the number of weeks returned through the strftime function.
origin.year=as.numeric(min(substring(Datum,1,4)))
# number of weeks in first year (offset for second year)
n.weeks<-52
Now we can create a vector containing the number of weeks to offset each week in Datum (X).
X<-as.numeric(substring(Datum,1,4)!=origin.year)*n.weeks
We can then simply add this vector to the number of weeks returned by strftime when it is applied to Datum
week.vec<-as.numeric(strftime(Datum, "%W"))+X
This will work for 2 years, but if you have more years than this, you will need to modify the offsets to account for this.

Related

How to subtract a number of weeks from a yearweek/weeknumber in R?

I have a couples of weeknumbers of interest. Lets take '202124' (this week) as an example. How can I subtract x weeks from this week number?
Lets say I want to know the week number of 2 weeks prior, ideally I would like to do 202124 - 2 which would give me 202122. This is fine for most of the year however 202101 - 2 will give 202099 which is obviously not a valid week number. This would happen on a large scale so a more elegant solution is required. How could I go about this?
convert the year week values to dates subtract in days and format the output.
x <- c('202124', '202101')
format(as.Date(paste0(x, 1), '%Y%W%u') - 14, '%Y%V')
#[1] "202122" "202052"
To convert year week value to date we also need day of the week, I have used it as 1st day of the week.

How to declare a time series of weekly data?

I have a question regarding declaring my dataset as a time series. I have weekly historical data of demand of a certain product. Every week is labeled as 201401, 201402, up to the current month 201937.
The problem arises once I declare the set as ts <- ts(data, start=2014, frequency=52). Because every year consists of 365.25 days, in my set I have a week 201553. So from week 201601, every week is basically a week later, which causes problem when finding seasonality patterns.
Do I have to delete week 201553 or what is the appropriate continuation?
If you set frequency=365.25/7, things should work out fine.
References
https://otexts.com/fpp2/ts-objects.html
https://www.rdocumentation.org/packages/stats/versions/3.6.1/topics/ts (interestingly, the trick with frequency=52.17857 is not mentioned here)
http://manishbarnwal.com/blog/2017/05/03/time_series_and_forecasting_using_R/

Is IDL able to add / subtract from date?

As you can see the question above, I was wondering if IDL is able to add or subtract days / months / years to a given date.
For example:
given_date = anytim('01-jan-2000')
print, given_date
1-Jan-2000 00:00:00.000
When I would add 2 weeks to the given_date, then this date should appear:
15-Jan-2000 00:00:00.000
I was already looking for a solution for this problem, but I unfortunately couldn't find any solution.
Note:
I am using a normal calendar date, not the julian date.
Are you only concerned with dates after 1582? Is accuracy to the second important?
The ANYTIM routine is not part of the IDL distribution. Possibly there are third party routines to handle time increments, but I don't know of any builtin to the IDL library.
By default, which you are using, ANYTIM returns seconds from Jan 1, 1979. So to add/subtract some number of days, weeks, or years, you could calculate the number of seconds in the time interval. Of course, this does not take into account leap seconds/years (but leap years are fairly easy to take into account, leap seconds requires a database of when they were added). And adding months is going to require determining which month so to determine the number of days in it.
IDL can convert to and from Julian dates using JULDAY and CALDAT.
You may also read and write Julian dates (which are doubles or long integers) to and from strings using the format keyword to PRINT, STRING, and READS.
You'll want to use the (C()) calendar date format code.
format='(c(cdi0,"-",cMoa,"-"cyi04," ",cHi02,":",cmi02,":",csf06.3))'
date = julday(1, 1, 2000)
print, date, format=format
; 1-Jan-2000 00:00:00.000
date = date + 14
print, date, format=format
; 15-Jan-2000 00:00:00.000

Post-Process a Stata %tw date in R

The %tw format in Stata has the form: 1960w1 which has no equivalent in R.
Therefore %tw dates must be post-processed.
Importing a .dta file into R, the date is an integer like 1304 (instead of 1985w5) or 1426 (instead of 1987w23). If it was a simple time series you could set a starting date as follows:
ts(df, start= c(1985,5), frequency=52)
Another possibility would be:
as.Date(Camp$date, format= "%Yw%W" , origin = "1985w5")
But if each row is not a single date, then you must convert it.
The package ISOweek is based on ISO-8601 with the form "1985-W05" and does not process the Stata %tw.
The Lubridate package does not work with this format. The week() returns the number of complete seven day periods that have occurred between the date and January 1st, plus one. week function
In Stata week 1 of any year starts on 1 January, whatever day of the week that is. Stata Documentation on Dates
In the format %W of Date in R the week starts as Monday as first day of the week.
From strptime %V is
the Week of the year as decimal number (00--53) as defined in ISO
8601. If the week (starting on Monday) containing 1 January has four or more days in the new year, then it is considered week 1. Otherwise,
it is the last week of the previous year, and the next week is week 1.
(Accepted but ignored on input.) Strptime
Larmarange noted on Github that Haven doesn't interpret dates properly:
months, week, quarter and halfyear are specific format from Stata,
respectively %tm, %tw, %tq and %th. I'm not sure that there are
corresponding formats available in R. So far they are imported as
integers.
Is there a way to convert Stata %tw to a date format R understands?
Here is an Stata file with dates
This won't be an answer in terms of R code, but it is commentary on Stata weeks that can't be fitted into a comment.
Strictly, dates in Stata are not defined by the display formats that make them intelligible to people. A date in Stata is always a numeric variable or scalar or macro defined with origin the first instance in 1960. Thus it is at best a shorthand to talk about %tw dates, etc. We can use display to see the effects of different date display formats:
. di %td 0
01jan1960
. di %tw 0
1960w1
. di %tq 0
1960q1
. di %td 42
12feb1960
. di %tw 42
1960w43
. di %tq 42
1970q3
A subtle point made explicit above is that changing the display format will not change what is stored, i.e. the numeric value.
Otherwise put, dates in Stata are not distinct data types; they are just integers made intelligible as dates by a pertinent display format.
The question presupposes that it was correct to describe some weekly dates in terms of Stata weeks. This seems unlikely, as I know no instance in which a body outside StataCorp uses the week rules of Stata, not only that week 1 always starts on 1 January, but also that week 52 always includes either 8 or 9 days and hence that there is never a week 53 in a calendar year.
So, you need to go upstream and find out what the data should have been. Failing some explanation, my best advice is to map the 52 weeks of each year to the days that start them, namely days 1(7)358 of each calendar year.
Stata weeks won't map one-to-one to any other scheme for defining weeks.
More in this article on Stata weeks
It's not completely clear what the question is but the year and week corresponding to 1304 are:
wk <- 1304
1960 + wk %/% 52
## [1] 1985
wk %% 52 + 1
## [1] 5
so assuming that the first week of the year is week 1 and starts on Jan 1st, the beginning of the above week is this date:
as.Date(paste(1960 + wk %/% 52, 1, 1, sep = "-")) + 7 * (wk %% 52)
## [1] "1985-01-29"

convert difftime time to years, months and days

How can I accurately convert the products (units is in days) of the difftime below to years, months and days?
difftime(Sys.time(),"1931-04-10")
difftime(Sys.time(),"2012-04-10")
This does years and days but how could I include months?
yd.conv<-function(days, print=TRUE){
x<-days*0.00273790700698851
x2<-floor(x)
x3<-x-x2
x4<-floor(x3*365.25)
if (print) cat(x2,"years &",x4,"days\n")
invisible(c(x2, x4))
}
yd.conv(difftime(Sys.time(),"1931-04-10"))
yd.conv(difftime(Sys.time(),"2012-04-10"))
I'm not sure how to even define months either. Would 4 weeks be considered a month or the passing of the same month day. So for the later definition of a month if the initial date was 2012-01-10 and the current 2012-05-31 then we'd have 0 years, 5 months and 21 days. This works well but what if the original date was on the 31st of the month and the end date was on feb 28 would this be considered a month?
As I wrote this question the question itself evolved so I'd better clarify:
What would be the best (most logical approach) to defining months and then how to find diff time in years, months and days?
If you're doing something like
difftime(Sys.time(), someDate)
It comes as implied that you must know what someDate is. In that case, you can convert this to a POSIXct class object that gives you the ability to extract temporal information directly (package chron offers more methods, too). For instance
as.POSIXct(c(difftime(Sys.time(), someDate, units = "sec")), origin = someDate)
This will return your desired date object. If you have a timezone tz to feed into difftime, you can also pass that directly to the tz parameter in as.POSIXct.
Now that you have your date object, you can run things like months(.) and if you have chron you can do years(.) and days(.) (returns ordered factor).
From here, you could do more simple math on the difference of years, months, and days separately (converting to appropriate numeric representations). Of course, convert someDate to POSIXct will be required.
EDIT: On second thought, months(.) returns a character representation of the month, so that may not be efficient. At least, it'll require a little processing (not too difficult) to give a numeric representation.
R has not implemented these features out of ignorance. difftime objects are transitive. A 700 day difference on any arbitrary start-date can yield a differing number of years depending on whether there was a leap year or not. Similarly for months, they take between 28-31 days.
For research purposes, we use these units a lot (months and years) and pragmatically, we define a year as 365.25 days and a month as 365.25/12 = 30.4375 days.
To do arithmetic on a given difftime, you must convert this value to numeric using as.numeric(difftime.obj) which is, in default, days so R stops spouting off the units.
You can not simply convert a difftime to month, since the definition of months depends on the absolute time at which the difftime has started.
You'll need to know the start date or the end date to accurately tell the number of months.
You could then, e.g., calculate the number of months in the first year of your timespan, the number of month in the last your of the timespan, and add the number of years between times 12.
Hmm. I think the most sensible would be to look at the various units themselves. So compare the day of the month first, then compare the month of the year, then compare the year. At each point, you can introduce a carry to avoid negative values.
In other words, don't work with the product of difftime, but recode your own difftime.

Resources