tmerge to split data into 5 yearly intervals - r

I have a data set with columns: ID, follow-up (days), event, variable 1, variable 2, ...
I'm trying to use tmerge to split it into 5-year intervals with tstart and tstop, using follow-up (days) as the time scale. How do I go about this?
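
One possible sketch, not a definitive answer: the survival package's survSplit() function (a swapped-in alternative to tmerge itself) cuts each subject's follow-up at fixed time points and produces one row per interval. The toy data, the column names (ID, followup, event, var1) and the 365.25-days-per-year conversion are all assumptions made for illustration.

```r
library(survival)

# Hypothetical toy data; the column names are assumptions
df <- data.frame(
  ID       = 1:3,
  followup = c(1000, 2500, 4000),  # follow-up time in days
  event    = c(1, 0, 1),
  var1     = c(5.1, 6.3, 4.8)
)

# Cut points every 5 years, in days (assumes 365.25 days per year
# and that at least one subject is followed for more than 5 years)
cuts <- seq(5 * 365.25, max(df$followup), by = 5 * 365.25)

# One row per subject per 5-year interval; survSplit adds a tstart column,
# and the event indicator is set only on the interval where the event occurs
long <- survSplit(Surv(followup, event) ~ ., data = df,
                  cut = cuts, episode = "period")
long
```

In the result the original follow-up column acts as the interval end, so it can be renamed to tstop if that name is preferred.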

Related

Creating an interval object in R using the lubridate package [duplicate]

Hi, I have Uber data about pickups in NYC.
I'm trying to add a column to the raw data that indicates, for each row, which time interval it belongs to (each interval is represented by the single timepoint at its start).
I want to:
Create a vector containing all relevant timepoints (i.e. every 15 minutes).
Use the int_diff function from the lubridate package on this vector to create an interval object.
Run a loop over all the time points in the raw data and, for each data point, indicate which interval it belongs to.
I tried looking for explanations of how to use the int_diff function, but I don't understand what my vector should look like or how the int_diff syntax works.
Thanks for the help :)
Is this what you have in mind?
library(lubridate)
start <- mdy_hm('4/11/2014 0:00') # start of the period
end <- mdy_hm('5/12/2015 0:00') # end
time_seq <- seq(from = start, to = end, by = '15 mins') # sequence by 15 minutes
times <- mdy_hm(c('4/11/2014 0:12', '4/11/2014 1:24')) # times to find intervals for
dat <- data.frame(times)
dat$intervals <- cut(times, breaks = time_seq) # assign each time to an interval
# turn this into a set of columns, one per interval, with a 1 marking the interval each observation falls into
intervals_cols <- model.matrix(~ 0 + intervals, dat)
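
If only the start of each 15-minute interval is needed as a new column, a base-R alternative (not the approach used above, and not int_diff) is findInterval(), which returns the index of the breakpoint at or before each time. The example times and the UTC time zone below are assumptions.

```r
# Hypothetical pickup times, assumed to be in UTC
times <- as.POSIXct(c("2014-04-11 00:12", "2014-04-11 01:24"), tz = "UTC")

# All interval start points for one day, every 15 minutes
breaks <- seq(as.POSIXct("2014-04-11 00:00", tz = "UTC"),
              as.POSIXct("2014-04-12 00:00", tz = "UTC"),
              by = "15 min")

# Index of the last breakpoint that is <= each time
idx <- findInterval(as.numeric(times), as.numeric(breaks))

# Each row is labelled with the timepoint that starts its interval
dat <- data.frame(times, interval_start = breaks[idx])
dat
```

This is vectorised, so no explicit loop over the rows is required.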

weekly time series in r, arima

I have a data frame with the following column names.
"week" "demand" "product-id"
The problem is to convert it into a time series object.
week is a number like 3,4,5,6,7,8,9 etc., and demand is in units and product-id is unique.
I want to convert the week column into time series, so as to prepare for modeling.
I want to predict weeks 10 and 11 demand by using an ARIMA model. How do I do that?
myTS <- ts(mydataframe[-1], frequency = 52)
will convert your demand and product-id columns to a time series with 52 observations per year. For more elaborate time series, check the xts package. Also compare this post on weekly data with ts.
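
For the forecasting part, a minimal sketch with base R's arima() and predict() might look like the following. The demand values are made up, the data are assumed to cover a single product, and the order c(0, 1, 1) is an arbitrary illustration, not a recommendation; with so few observations any ARIMA fit is rough at best.

```r
# Hypothetical demand for a single product, weeks 3 to 9
mydataframe <- data.frame(
  week   = 3:9,
  demand = c(12, 15, 14, 18, 17, 21, 20)
)

# Index the series by week number (one observation per time unit)
demand_ts <- ts(mydataframe$demand, start = min(mydataframe$week))

# Fit an ARIMA(0,1,1) model; the order is an assumption for illustration
fit <- arima(demand_ts, order = c(0, 1, 1), method = "ML")

# Forecast the next two weeks (weeks 10 and 11)
fc <- predict(fit, n.ahead = 2)
fc$pred  # point forecasts
fc$se    # standard errors of the forecasts
```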

Handling SPELL data with exact dates (feature request?)

I'm learning TraMineR and have used different types of longitudinal data. My original data is SPELL data with id, start time, end time and status, where the start and end times are exact dates, so my subsequences have varying lengths.
With seqformat() I can chop the data (automatically) into 1-year pieces and convert it into STS format, e.g. where the first variable is the first date, the second variable is the first date + 1 year, and so on.
What I would like to do is adjust the conversion so that I could use half year or one month time periods.
Here I have converted the dates into years with decimals with decimal.date():
  id    start      end status
1  1 1965.138 1965.974      1
2  1 1968.714 1987.237      1
3  1 1985.667 2003.933      2
4  1 1988.499 1988.665      1
5  1 1996.652 1996.878      1
The sequence object that is created automatically has the data in one year subsequences:
$ y1960.16803278689
$ y1961.16803278689
$ y1962.16803278689
$ y1963.16803278689
So for data with dates, I would also like the option to use subsequence lengths shorter than 1 year. I understand that the opposite is possible with seqgranularity().
Alternatively I'm interested to know if there's some way in R outside TraMineR to handle the SPELL data to create certain length subsequences.
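
Outside TraMineR, one rough way to do this in base R is to lay a regular grid over the time axis and record which spell covers each grid point. The sketch below uses the decimal-year spells shown above for a single id with a half-year step; the variable names, the rule that later rows overwrite earlier ones where spells overlap, and the use of NA for uncovered periods are all assumptions.

```r
# Spells for one id, copied from the table above
spells <- data.frame(
  id     = 1,
  start  = c(1965.138, 1968.714, 1985.667, 1988.499, 1996.652),
  end    = c(1965.974, 1987.237, 2003.933, 1988.665, 1996.878),
  status = c(1, 1, 2, 1, 1)
)

step <- 0.5  # half-year periods; use 1/12 for months

# Regular grid of period start points covering all spells
grid <- seq(floor(min(spells$start)), ceiling(max(spells$end)), by = step)

# State at each grid point: NA when no spell covers it;
# where spells overlap, the later row wins (an assumption)
states <- rep(NA_real_, length(grid))
for (i in seq_len(nrow(spells))) {
  covered <- grid >= spells$start[i] & grid < spells$end[i]
  states[covered] <- spells$status[i]
}
names(states) <- paste0("y", grid)
head(states, 12)
```

Spells shorter than the step can fall between grid points and disappear, so the step should be chosen with the shortest spell in mind.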

R - Daily data and Time Series by year and week

Hi, I am new to R and to time series forecasting.
I have sample data of daily sales for the past 3 years, and I would like to use this data set to produce a plot that shows seasonality and patterns.
My daily data format looks like this:
Date, Sales
2010-01-01, 5
2010-01-03, 3
2010-01-04, 2
..
2011-12-01, 4
..
2014-11-01, 1
What I want to see is similar to the plot below, but by week and year using the ts function. Also, because of leap years, some years have 53 weeks and some have 52; any idea how to take this into account when plotting?
Working with the ts function is not easy for me, so it would be great if someone could help with this.
You should start by creating a ts object. Check ?ts for the syntax, but assuming your data above were stored in 'data', it's basically
tsData <- ts(data$Sales, start=c(2010,1), frequency=365)
where start is c(year, position within the year), here day 1 of 2010, and frequency is the number of samples per year. This assumes one row per day, so fill in any missing dates first. Then you can use plot.ts() to plot the entire time series
plot.ts(tsData)
To extract seasonal patterns or trends, you can use the decompose() function.
decompose(tsData)
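
As a rough end-to-end sketch of the steps above: the data below are simulated, and treating missing days as zero sales is an assumption that may not suit every data set. Leap days are ignored by frequency = 365, so the seasonal alignment drifts slightly over the years.

```r
set.seed(1)

# Simulated daily sales for 2010-2012 with a few days missing
dates <- seq(as.Date("2010-01-01"), as.Date("2012-12-31"), by = "day")
sales <- data.frame(Date = dates, Sales = rpois(length(dates), lambda = 5))
sales <- sales[-c(2, 50, 400), ]  # drop some rows to mimic gaps

# Rebuild the full calendar and fill the gaps (assumption: no row = no sales)
filled <- merge(data.frame(Date = dates), sales, all.x = TRUE)
filled$Sales[is.na(filled$Sales)] <- 0

# Daily series with a yearly cycle, then split into trend/seasonal/random
tsData <- ts(filled$Sales, start = c(2010, 1), frequency = 365)
parts <- decompose(tsData)
# plot(parts) draws the observed, trend, seasonal and random components
```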
Here is a sample:
x <- 1:10
y <- 11:20
plot(x, y)
lines(x,y)
Apply the same idea to your sales and date columns.
You can replace x with date and y with sales. If you still have issues, please post them.

How do I add periods to time series in R after aggregation

I have a two variable dataframe (df) in R of daily sales for a ten year period from 2004-07-09 through 2014-12-31. Not every single date is represented in the ten year period, but pretty much most days Monday through Friday.
My objective is to aggregate sales by quarter, convert to a time series object, and run a seasonal decomposition and other time series forecasting.
I am having trouble with the conversion, as ultimately I receive an error:
time series has no or less than 2 periods
Here's the structure of my code.
# create a time series object
library(xts)
x <- xts(df$amount, df$date)
# create a time series object aggregated by quarter
q.x <- apply.quarterly(x, sum)
When I try to run
fit <- stl(q.x, s.window = "periodic")
I get the error message
series is not periodic or has less than two periods
When I try to run
q.x.components <- decompose(q.x)
# or
decompose(x)
I get the error message
time series has no or less than 2 periods
So, how do I take my original dataframe, with a date variable and an amount variable (sales), aggregate that quarterly as a time series object, and then run a time series analysis?
I think I was able to answer my own question. I did this. Can anyone confirm if this structure makes sense?
library(lubridate)
# add a new variable indicating the calendar year.quarter (i.e. 2004.3) of each observation
df$year.quarter <- quarter(df$date, with_year = TRUE)
library(plyr)
# summarize gift amount by year.quarter
new.data <- ddply(df, .(year.quarter), summarize,
                  sum = round(sum(amount), 2))
# convert the new data to a quarterly time series object beginning
# in July 2004 (2004, Q3) and ending in December 2014 (2014, Q4)
nd.ts <- ts(new.data$sum, start = c(2004,3), end = c(2014,4), frequency = 4)
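
The same structure can be sketched in base R only, which also shows that stl() runs once the series is a ts object with frequency = 4. The data below are simulated, and the approach assumes every quarter in the range has at least one observation; otherwise the positions in the ts would no longer line up with calendar quarters.

```r
set.seed(42)

# Simulated daily amounts over the same date range as in the question
df <- data.frame(date = seq(as.Date("2004-07-09"), as.Date("2014-12-31"), by = "day"))
df$amount <- runif(nrow(df), min = 10, max = 100)

# Label each row with its calendar year and quarter, e.g. "2004 Q3"
df$yq <- paste(format(df$date, "%Y"), quarters(df$date))

# Sum by quarter; the labels sort chronologically
q_sums <- tapply(df$amount, df$yq, sum)

# Quarterly ts starting in 2004 Q3, then a seasonal decomposition
q_ts <- ts(as.numeric(q_sums), start = c(2004, 3), frequency = 4)
fit <- stl(q_ts, s.window = "periodic")
```

The earlier errors most likely came from passing an xts object, which carries no ts-style frequency, to stl() and decompose().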
