How to convert time series dates into data frame dates - r

I have a time series of weekly data, beginning on Jan. 1, 2016. I've tried using the method in this question, but I'm getting dates from 1970.
This is what I'm doing below:
# Creating this df of dates used later on
index.date <- data.frame(start.date=seq(from=as.Date("01/01/2016",format="%m/%d/%Y"),
to=as.Date("10/30/2021",format="%m/%d/%Y"),
by='week'))
# Create a ts, specifying start date and frequency=52 for weekly
weekly.ts <- ts(rnorm(305,0,1),start=min(index.date$start.date), frequency = 52)
# Look at the min and max dates in the ts
as.Date(as.numeric(time(min(weekly.ts))))
[1] "1970-01-02"
as.Date(as.numeric(time(max(weekly.ts))))
[1] "1970-01-02"
I plan to place the ts into a df with dates shown in a date format with the following:
# Place ts dates and values into a df
output.df <-data.frame(date=as.Date(as.numeric(time(weekly.ts))),
y=as.matrix(weekly.ts))
Is this a matter of me specifying the dates incorrectly in the ts, or am I converting them incorrectly with as.Date(as.numeric(time(weekly.ts)))? I would expect the min date to be Jan. 1, 2016, and the maximum Oct. 29, 2021 (as it is for index.date).

ts objects do not understand the Date class, but you can encode the dates as numbers and then decode them back. Assuming that you want a series with frequency 52, the first week in 2016 will be represented by 2016, the second by 2016+1/52, ..., and the last by 2016+51/52.
For example,
tt <- ts(rnorm(305), start = 2016, freq = 52)
Now decode the dates.
toDate <- function(tt) {
yr <- as.integer(time(tt))
week <- as.integer(cycle(tt)) # first week of year is 1, etc.
as.Date(ISOdate(yr, 1, 1)) + 7 * (week - 1)
}
data.frame(dates = toDate(tt), series = c(tt))
We can also convert from Date class to year/week number
# input is a Date class object
to_yw <- function(date) {
yr <- as.numeric(format(date, "%Y"))
yday <- as.POSIXlt(date)$yday # jan 1st is 0
week <- pmin(floor(yday / 7), 51) + 1 # 1st week of yr is 1
yw <- yr + (week - 1) / 52
list(yw = yw, year = yr, yday = yday, week = week)
}
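If the goal is only the data frame from the question, a simpler sketch is to keep the calendar dates outside the ts and rebuild them with seq(). This assumes the series really has one observation every 7 days starting on 2016-01-01; the names start.date and n.obs are made up for the illustration. Note that this differs from toDate above, which restarts the week count on January 1 of each year. The 1970 dates in the question appear to come from calling time() on min(weekly.ts), which is a single number rather than a series.

```r
# Assumption: one observation every 7 days, starting 2016-01-01
start.date <- as.Date("2016-01-01")
n.obs <- 305

set.seed(1)
weekly.ts <- ts(rnorm(n.obs), start = 2016, frequency = 52)

# Rebuild the calendar dates alongside the series instead of decoding time()
dates <- seq(start.date, by = "week", length.out = length(weekly.ts))
output.df <- data.frame(date = dates, y = as.numeric(weekly.ts))

range(output.df$date)
# [1] "2016-01-01" "2021-10-29"
```

The last date agrees with max(index.date$start.date) from the question, since both are built from the same weekly sequence.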

Try this
weekly.ts <- ts(rnorm(305,0,1),
start=min(index.date$start.date),
end=max(index.date$start.date), frequency=2)
# look at plot to see if it works
plot(stl(weekly.ts, s.window=2))
# get time
head(as.POSIXlt.Date(time(weekly.ts)))
[1] "2016-01-01 UTC" "2016-01-01 UTC" "2016-01-02 UTC" "2016-01-02 UTC"
[5] "2016-01-03 UTC" "2016-01-03 UTC"
tail(as.POSIXlt.Date(time(weekly.ts)))
[1] "2021-10-26 UTC" "2021-10-27 UTC" "2021-10-27 UTC" "2021-10-28 UTC"
[5] "2021-10-28 UTC" "2021-10-29 UTC"
You get each date twice because of frequency=2, which is required by decompose or stl for meaningful data.

Related

Selecting and grouping similar dates from vectors of dates

I have three vectors of dates in POSIX format that correspond with data collection times from three large datasets. Each of these vectors has a different length, and they contain similar (but not identical) dates.
I would like to:
group these dates into specified time ranges, e.g. group dates from each vector that fall in a 30-day window and
reduce the number of date groupings to reflect the dataset with the smallest number of collection times, e.g. if "dataset A" has three sampling dates and "dataset B" has five sampling dates then there would only be three groupings of dates (unless the two extra dates in "dataset B" fall within 30 days of the dates in "dataset A").
An example with three vectors of dates in POSIX format (I want to group similar dates between the vectors, allowing a time window of 30 days):
A.dates = as.POSIXlt(c("1998-07-24 08:00","1999-07-24 08:00","2000-07-24 08:00"),
tz = "America/Los_Angeles")
B.dates = as.POSIXlt(c("1998-07-25 08:00","1999-07-25 08:00","2000-07-25 08:00"),
tz = "America/Los_Angeles")
C.dates = as.POSIXlt(c("1998-07-26 08:00","1999-07-26 08:00","2000-07-26 08:00","2000-08-29"),
tz = "America/Los_Angeles")
Specifying a time window of 30 days, there would be three date groupings (the sampling dates from July 1998, 1999, and 2000). The C.dates vector has a fourth collection date of August 29th, 2000 which would be excluded from the groupings because:
it is not within 30 days of the July dates in the other vectors and
there are no dates in the other two vectors that fall within 30 days of August 29th, 2000.
You could loop over each element of each vector and create sequences ± 15 days
L <- list(A.dates, B.dates, C.dates)
tmp <- lapply(L, function(x) lapply(x, function(x)
do.call(seq, c(as.list(as.Date(x) + c(-15, 15)), "day"))))
and take the union of them within each list element.
tmp <- lapply(tmp, function(x) as.Date(Reduce(union, x), origin="1970-01-01"))
Then simply find the intersection
i <- Reduce(function(...) as.Date(intersect(...), origin="1970-01-01"), tmp)
and select the dates accordingly.
tmp <- lapply(L, function(x) x[as.Date(x) %in% i])
tmp
# [[1]]
# [1] "1998-07-24 08:00:00 PDT" "1999-07-24 08:00:00 PDT"
# [3] "2000-07-24 08:00:00 PDT"
#
# [[2]]
# [1] "1998-07-25 08:00:00 PDT" "1999-07-25 08:00:00 PDT"
# [3] "2000-07-25 08:00:00 PDT"
#
# [[3]]
# [1] "1998-07-26 PDT" "1999-07-26 PDT" "2000-07-26 PDT"
To sort them by year according to your comment, we first unlist them. Unfortunately this converts the dates into numerics (i.e. seconds since January 1, 1970), so we need to convert them back.
tmp <- as.POSIXlt(unlist(lapply(tmp, as.POSIXct)), origin="1970-01-01",
tz="America/Los_Angeles")
Finally we split the result by the first four characters, which give the year (we could also do split(tmp, strftime(tmp, "%Y")) though).
res <- split(tmp, substr(tmp, 1, 4))
res
# $`1998`
# [1] "1998-07-24 08:00:00 PDT" "1998-07-25 08:00:00 PDT"
# [3] "1998-07-26 00:00:00 PDT"
#
# $`1999`
# [1] "1999-07-24 08:00:00 PDT" "1999-07-25 08:00:00 PDT"
# [3] "1999-07-26 00:00:00 PDT"
#
# $`2000`
# [1] "2000-07-24 08:00:00 PDT" "2000-07-25 08:00:00 PDT"
# [3] "2000-07-26 00:00:00 PDT"
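An alternative sketch compares the dates directly instead of expanding each one into a daily sequence. It assumes that "within 30 days" means an absolute difference of at most 30 days to at least one date in every other vector. The helper name keep_matched and its window argument are hypothetical, and the times are written out in full in UTC to keep the example simple.

```r
A <- as.POSIXct(c("1998-07-24 08:00", "1999-07-24 08:00", "2000-07-24 08:00"),
                tz = "UTC")
B <- as.POSIXct(c("1998-07-25 08:00", "1999-07-25 08:00", "2000-07-25 08:00"),
                tz = "UTC")
C <- as.POSIXct(c("1998-07-26 08:00", "1999-07-26 08:00", "2000-07-26 08:00",
                  "2000-08-29 08:00"), tz = "UTC")
L <- list(A, B, C)

# Keep a date only if every other vector has a date within `window` days of it
keep_matched <- function(dates, window = 30) {
  lapply(seq_along(dates), function(i) {
    x <- dates[[i]]
    others <- dates[-i]
    ok <- vapply(seq_along(x), function(j) {
      all(vapply(others, function(o) {
        any(abs(as.numeric(difftime(o, x[j], units = "days"))) <= window)
      }, logical(1)))
    }, logical(1))
    x[ok]
  })
}

matched <- keep_matched(L)
lengths(matched)
# [1] 3 3 3
```

The August 29th, 2000 entry is dropped because it is more than 30 days from every date in the other two vectors.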

Convert date to day-of-week in R

I have a date in this format in my data frame:
"02-July-2015"
And I need to convert it to the day of the year (i.e. 183). Something like:
df$day_of_week <- weekdays(as.Date(df$date_column))
But this doesn't understand the format of the dates.
You could use lubridate to convert to day of week or day of year.
library(lubridate)
# "02-July-2015" is Thursday
date_string <- "02-July-2015"
dt <- dmy(date_string)
dt
## [1] "2015-07-02 UTC"
### Day of week : (1-7, Sunday is 1)
wday(dt)
## [1] 5
### Day of year (1-366; for 2015, only 365)
yday(dt)
## [1] 183
### Or a little shorter to do the same thing for Day of year
yday(dmy("02-July-2015"))
## [1] 183
day = as.POSIXct("02-July-2015",format="%d-%b-%Y")
# see ?strptime for more information on date-time conversions
# Day of year as decimal number (001–366).
format(day,format="%j")
[1] "183"
#Weekday as a decimal number (1–7, Monday is 1).
format(day,format="%u")
[1] "4"
This is what anotherFishGuy suggested, plus converting the values with as.numeric so they can be passed to a classifier.
# day <- Sys.time()
as.num.format <- function(day, ...){
as.numeric(format(day, ...))
}
doy <- as.num.format(day,format="%j") # day of year
dow <- as.num.format(day,format="%u") # day of week, Monday is 1
hour <- as.num.format(day, "%H")
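For a whole data frame column, the same conversions can be done in base R. This is a minimal sketch, assuming a hypothetical data frame df with a character column date_column, and an English locale so that %B can parse "July":

```r
# Hypothetical example data in the "02-July-2015" format from the question
df <- data.frame(date_column = c("02-July-2015", "31-December-2015"),
                 stringsAsFactors = FALSE)

# Parse day-monthname-year strings into Date objects
d <- as.Date(df$date_column, format = "%d-%B-%Y")

df$day_of_year <- as.numeric(format(d, "%j"))  # 1-366
df$day_of_week <- as.numeric(format(d, "%u"))  # 1-7, Monday is 1
df
```

Passing the format explicitly is what the weekdays(as.Date(...)) attempt in the question was missing; without it, as.Date only tries the default formats.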

R - convert POSIXct to fraction of julian day

How can a date/time object in R be transformed into the fraction of a julian day?
For example, how can I turn this date:
date <- as.POSIXct('2006-12-12 12:00:00',tz='GMT')
into a number like this
> fjday
[1] 345.5
where the julian day is the number of days elapsed since January 1st. The fraction 0.5 means that it's 12 pm, i.e. half of the day has passed.
This is just an example, but my real data covers all the 365 days of year 2006.
Since all your dates are from the same year (2006) this should be pretty easy:
julian(date, origin = as.POSIXct('2006-01-01', tz = 'GMT'))
If you or another reader happen to expand your dataset to other years, then you can set the origin for the beginning of each year as follows:
sapply(date, function(x) julian(x, origin = as.POSIXct(paste0(format(x, "%Y"),'-01-01'), tz = 'GMT')))
Have a look at the difftime function:
> unclass(difftime('2006-12-12 12:00:00', '2006-01-01 00:00:00', tz="GMT", units = "days"))
[1] 345.5
attr(,"units")
[1] "days"
A function to convert POSIXct to julian day, an extension of the answer above. Source it before use.
julian_conv <- function(x) {
if (is.na(x)) { # Because julian() cannot accept NA values
return(NA)
}
else {
j <-julian(x, origin = as.POSIXlt(paste0(format(x, "%Y"),'-01-01')))
temp <- unclass(j) # To unclass the object julian day to extract julian day
return(temp[1] + 1) # Because Julian day 1 is 1 e.g., 2016-01-01
}
}
Example:
date <- as.POSIXct('2006-12-12 12:00:00')
julian_conv(date)
#[1] 346.5
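The per-element approaches above can also be written in vectorised form. The following is a sketch under two assumptions: the function name frac_yday is made up, and all times are interpreted in the time zone given by tz. It returns elapsed days since midnight on January 1 of each element's own year, so January 1 at 00:00 gives 0, and NA inputs stay NA.

```r
frac_yday <- function(x, tz = "GMT") {
  # Midnight on January 1 of each element's own year
  origin <- as.POSIXct(paste0(format(x, "%Y", tz = tz), "-01-01"),
                       format = "%Y-%m-%d", tz = tz)
  as.numeric(difftime(x, origin, units = "days"))
}

date <- as.POSIXct(c("2006-12-12 12:00:00", "2007-01-02 06:00:00", NA),
                   tz = "GMT")
frac_yday(date)
# [1] 345.50   1.25     NA
```

Add 1 to the result if January 1 should count as day 1 rather than day 0.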

How to calculate date based on week number in R

I was wondering if there is a way to get the begin of the week date based on a week number in R? For example, if I enter week number = 10, it should give me 9th March, 2014.
I know how to get the reverse (aka..given a date, get the week number by using as.POSIX functions).
Thanks!
Prakhar
You can try this:
first.day <- as.numeric(format(as.Date("2014-01-01"), "%w"))
week <- 10
as.Date("2014-01-01") + week * 7 - first.day
# [1] "2014-03-09"
This assumes weeks start on Sundays. First, find which day of the week Jan 1 falls on (0 for Sunday); then add 7 * the week number to Jan 1 and subtract that day-of-week value.
Note this is slightly different to what you get if you use %W when doing the reverse, as from that perspective the first day of the week seems to be Monday:
format(seq(as.Date("2014-03-08"), by="1 day", len=5), "%W %A %m-%d")
# [1] "09 Saturday 03-08" "09 Sunday 03-09" "10 Monday 03-10" "10 Tuesday 03-11"
# [5] "10 Wednesday 03-12"
but you can adjust the above code easily if you prefer the Monday centric view.
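The same idea can be wrapped in a function for any year. This is a sketch that assumes weeks are numbered as in %U (week 1 starts on the first Sunday of the year); the name week_start is hypothetical. Unlike the one-liner above, it also stays consistent with %U in years where January 1 is itself a Sunday.

```r
week_start <- function(year, week) {
  jan1 <- as.Date(sprintf("%d-01-01", as.integer(year)))
  wd <- as.numeric(format(jan1, "%w"))    # 0 = Sunday, ..., 6 = Saturday
  first.sunday <- jan1 + (7 - wd) %% 7    # first Sunday on or after Jan 1
  first.sunday + 7 * (week - 1)
}

week_start(2014, 10)
# [1] "2014-03-09"

# Round trip: the resulting date reports the same %U week number
format(week_start(2014, 10), "%U")
# [1] "10"
```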
You may try the ISOweek2date function in package ISOweek.
Create a function which takes year, week, weekday as arguments and returns date(s):
library(ISOweek)
date_in_week <- function(year, week, weekday){
w <- paste0(year, "-W", sprintf("%02d", week), "-", weekday)
ISOweek2date(w)
}
date_in_week(year = 2014, week = 10, weekday = 1)
# [1] "2014-03-03"
This date corresponds to the ISO8601 calendar (see %V in ?strptime). I assume you are using the US convention (see %U in ?strptime). Then some tweaking is needed to convert between the ISO8601 and US conventions:
date_in_week(year = 2014, week = 10 + 1, weekday = 1) - 1
# [1] "2014-03-09"
You can enter several weekdays, e.g.
date_in_week(year = 2014, week = 10 + 1, weekday = 1:3) - 1
# [1] "2014-03-09" "2014-03-10" "2014-03-11"
You can also use strptime to easily get dates from weeks starting on Mondays:
first_date_of_week <- function(year, week){
strptime(paste(year, week, 1), format = "%Y %W %u")
}
You can accomplish this using the package lubridate
library(lubridate)
start = ymd("2014-01-01")
#[1] "2014-01-01 UTC"
end = start+10*weeks()
end = end-(wday(end)-1)*days()
#[1] "2014-03-09 UTC"

How to convert number to Julian date in r?

day <- c(seq(1, 10592, by = 1))
How can I change 'day' into Julian date format (from 1st January 1982 to 31st December 2010)?
Thanks in advance.
Try help.search("Julian") -- there is a function julian.
So given your date sequence (and replace the length=... with by="1 day" for all dates)
R> seq(as.Date("1982-01-01"), as.Date("2010-12-31"), length=5)
[1] "1982-01-01" "1989-04-01" "1996-07-01" "2003-10-01" "2010-12-31"
R>
you compute Julian dates just by calling the function:
R> julian(seq(as.Date("1982-01-01"), as.Date("2010-12-31"), length=5))
[1] 4383.00 7030.75 9678.50 12326.25 14974.00
attr(,"origin")
[1] "1970-01-01"
R>
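If the aim is to map the day counter from the question onto calendar dates, a minimal sketch is to add the offsets to the start date. This assumes that day 1 means 1982-01-01:

```r
day <- seq(1, 10592, by = 1)

# Assumption: day 1 corresponds to 1982-01-01
dates <- as.Date("1982-01-01") + (day - 1)

range(dates)
# [1] "1982-01-01" "2010-12-31"

# Days since 1970-01-01, as returned by julian()
head(julian(dates))
```

The range confirms that 10592 days cover exactly 1st January 1982 to 31st December 2010.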

Resources