I have a time series to which I want to fit a structural model (trend, seasonal and cycle) using KFAS. However, the seasonality starts at a certain point in time. Say the time series runs monthly from January 2000 through August 2022, but seasonality only starts in 2011. Is there a way to capture such behavior in the series without splitting the data at that point?
I have already tried splitting the time series, but I would like a unified model. I am using KFAS in R for the estimation, though I have also used autostm for automatic structural models. Even though they achieve an appropriate fit (even for the whole time series), I think it can be improved with this idea. I thought I could use a regressor on the seasonality, but I couldn't find out how.
Are you using SSModel with a formula input? You could try constructing a seasonality term from your data and adding it to the right-hand side of ~ in the formula.
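One way to sketch this idea (hypothetical data; the dummy-regressor approach is my suggestion, not something KFAS does automatically) is to build monthly seasonal dummies that are identically zero before 2011, so the seasonal effect only switches on from that point:

```r
library(KFAS)

# Hypothetical monthly series, Jan 2000 - Aug 2022 (272 observations)
set.seed(1)
y <- ts(rnorm(272), start = c(2000, 1), frequency = 12)

# Monthly dummy regressors, zeroed out before 2011 so the
# seasonal effect only applies from 2011 onward
m <- cycle(y)
seas <- model.matrix(~ 0 + factor(m))[, -12]  # 11 monthly dummies
seas[time(y) < 2011, ] <- 0

# Local linear trend plus the switched-on seasonal regressors
model <- SSModel(y ~ SSMtrend(degree = 2, Q = list(NA, NA)) + seas, H = NA)
fit <- fitSSM(model, inits = rep(0, 3), method = "BFGS")
```

This keeps a single unified model over the whole sample while confining the seasonal regressors to the post-2011 period.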
I would like to create a forecasting model with time series in R. I have a target time series 'Sales' that I would like to forecast. I also have several time series that represent, for example, GDP or advertising spend. Unfortunately I have a lot of candidate time series and I don't know how to identify the most significant ones. Ideally I would find the most important ones before building the model.
I have already worked with classification problems, where I have always used the Pearson correlation coefficient. That is not possible with time series, right? How can I determine the correlation between time series and use it to find suitable series that describe my target time series?
I tried the corr.test() function in R, but I don't think that's right.
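(One common screening tool, not mentioned in the post, is the sample cross-correlation function ccf() computed on stationary, e.g. differenced, series; correlating trending series directly tends to produce spurious correlations. A minimal sketch with synthetic data:)

```r
# Synthetic predictor x and target sales driven partly by x
set.seed(1)
x <- as.numeric(arima.sim(list(ar = 0.6), n = 120))
sales <- 0.8 * x + rnorm(120)

# Cross-correlation of the differenced series at various leads/lags;
# differencing first reduces the risk of spurious correlation
r <- ccf(diff(x), diff(sales), plot = FALSE)
r$acf[r$lag == 0]  # contemporaneous correlation
```

Predictors with large cross-correlations (possibly at a lead, which is useful for forecasting) are natural candidates to keep.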
Hi Stack Overflow community.
I have 5 years of weekly price data for more than 15K products (5 × 15K × 52 records). Each product is a univariate time series. The objective is to forecast the price of each product.
I am familiar with univariate time series analysis, in which I can visualize each series, plot its ACF and PACF, and forecast it. But that approach is not feasible here: with 15K different time series I cannot visualize each one, inspect its ACF and PACF, forecast each product separately, and make tweaks/decisions on each.
I am looking for some recommendations and directions to solve this multi-series forecasting problem using R (preferable). Any help and support will be appreciated.
Thanks in advance.
I would suggest you use auto.arima from the forecast package.
This way you don't have to search for the right ARIMA model.
auto.arima: Returns best ARIMA model according to either AIC, AICc or BIC value. The function conducts a search over possible models within the order constraints provided.
library(forecast)
fit <- auto.arima(WWWusage)
plot(forecast(fit, h = 20))
Instead of WWWusage you could put one of your time series, to fit an ARIMA model.
With forecast you then perform the forecast - in this case 20 time steps ahead (h=20).
auto.arima basically chooses the ARIMA parameters for you (according to AIC - Akaike information criterion).
You would have to try whether it is too computationally expensive for you. But in general it is not that uncommon to forecast that many time series.
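Looping auto.arima over many series can be sketched like this (hypothetical product data; with 15K series the loop is embarrassingly parallel, see for example parallel::mclapply):

```r
library(forecast)

# Hypothetical list of weekly price series, one per product
set.seed(1)
series_list <- list(
  prod_a = ts(cumsum(rnorm(260)), frequency = 52),
  prod_b = ts(cumsum(rnorm(260)), frequency = 52)
)

# Fit and forecast each series independently
fcasts <- lapply(series_list, function(y) forecast(auto.arima(y), h = 12))
fcasts$prod_a$mean  # 12-step-ahead point forecasts for one product
```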
Another thing to keep in mind: it may well be that there is some cross-correlation between the time series. From a forecasting-accuracy standpoint it could therefore make sense not to treat this as a purely univariate problem.
The setting sounds quite similar to the M5 forecasting competition that was recently held on Kaggle. The goal there was to produce point forecasts of the unit sales of various products sold in the USA by Walmart.
So there were a lot of sales time series to forecast. In that case the winner did not use a univariate forecast. Here is a link to a description of the winning solution. Since the setting seems so similar to yours, it probably makes sense to read a little in the Kaggle forum of this challenge - there may even be useful notebooks (code examples) available.
I have time series data ranging from 0 to 30 million; it is basically weekly web traffic data. I am building a forecasting model with this data and want to understand how to deal with this range of values. I tried a Box-Cox transformation with a Prophet model, but I am not sure which metrics I could use to evaluate the model's performance. The data also has a lot of 0's, which I can't remove from the dataset. Is there a better way to deal with the 0's than the Box-Cox transformation? I had issues with the inverse transformation, so I added a small value (0.1) to the data to avoid negative values.
If your series has a lot of periodic zero values, Croston's method is one way to go. It is basically a forecasting strategy for products with intermittent demand. You can also try exponential smoothing and traditional ARIMA/SARIMA models, and clip the negative values in the forecast (depending on your use case).
You can find Croston's method in the forecast package.
Also refer to these links:
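A minimal sketch with croston() from the forecast package, on a made-up intermittent-demand series:

```r
library(forecast)

# Hypothetical intermittent series: mostly zeros with occasional demand
y <- ts(c(0, 0, 3, 0, 0, 5, 0, 2, 0, 0, 4, 0, 6, 0, 0, 1))

fit <- croston(y, h = 4)  # 4-step-ahead forecast
fit$mean                  # point forecasts (never negative by construction)
```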
https://stats.stackexchange.com/questions/8779/analysis-of-time-series-with-many-zero-values/8782
https://stats.stackexchange.com/questions/373689/forecasting-intermittent-demand-with-zeroes-in-times-series
https://robjhyndman.com/papers/foresight.pdf
One of the main advantages of the TBATS model is that it can detect and work with multiple seasonality - e.g. nested seasonality.
I have a time series which has two nested seasonal cycles - an intra-year weekly cycle (Week 48, Week 49 etc), and an intra-week daily cycle (Sunday, Monday, etc).
I have used the tbats function to forecast this series, and viewed the output. Initially, I left the seasonal periods as "NULL" (meaning auto-detect), and this generated a simple straight line, having captured none of the seasonality in the model. However, I changed the time periods to 365, and this then generated a series of forecasts which did correspond reasonably well to the intra-year weekly variation, but with little daily variation. I think this means that while the function is not (in this case) able to detect the seasonality in the model, once "told" the seasonality, it does a good job of forecasting with it. I would therefore like to "tell" the model that the seasonality is nested, but I don't know how/if you can do this. I have tried:
x<-tbats(y,seasonal.periods={365,7})
but this doesn't work, and generates an error.
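(One likely cause of the error: `{365, 7}` is not how R builds a vector; seasonal.periods expects a numeric vector made with c(). A sketch on a synthetic daily series with weekly and yearly cycles, using msts to attach both periods:)

```r
library(forecast)

# Synthetic daily series with weekly (7) and yearly (365) seasonality
set.seed(1)
t <- 1:730
y <- msts(10 + 2 * sin(2 * pi * t / 7) + 5 * sin(2 * pi * t / 365) + rnorm(730),
          seasonal.periods = c(7, 365))

# seasonal.periods takes a numeric vector built with c(), not braces
x <- tbats(y, seasonal.periods = c(7, 365))
```

(For daily data spanning leap years, 365.25 is sometimes used for the annual period.)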
So I have a time series which I cannot share with you all, but I have a few questions about the proper procedure for fitting the correct ARIMA model to my data.
I have successfully written a loop to determine what degree of differencing is needed (the parameter d in I(d)).
Question:
To determine p and q, I am looking at ACF and PACF plots of my data. However, I am wondering if I should be using a deseasonalized transformation of my time series (trend plus random error, but no seasonality component which could be added back later) or my original time series. I obtained the deseasonal data using the decompose function in R (is stl() significantly better?).
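For reference, the two decomposition routines mentioned can be compared side by side; a sketch on the built-in monthly co2 series (standing in for the unshared data):

```r
# decompose(): moving-average trend estimate, fixed seasonal pattern
dec <- decompose(co2)
deseas_ma <- co2 - dec$seasonal

# stl(): loess-based; s.window = "periodic" forces a fixed seasonal,
# while a numeric s.window would let the seasonal pattern evolve
fit <- stl(co2, s.window = "periodic")
deseas_stl <- co2 - fit$time.series[, "seasonal"]
```

The main practical advantage of stl() is that numeric s.window option: if the seasonal pattern changes over time, stl() can track it while decompose() cannot.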
With the original time series, my ACF plot looks like this:
There is some definite seasonality at play here from the ACF plot. Does that mean I need to identify nonzero seasonal parameters in my final model if I need to use this data? How do I choose seasonal P and Q in this case?
With the deseasonalized data, here are what the plots look like:
I am not sure how to interpret the deseasonalized PACF/ACF plots, other than that the spike at lag 6 on the ACF plot indicates p might be 6?
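(As a general rule of thumb for reading these plots: a PACF that cuts off after lag p suggests an AR(p) term, and an ACF that cuts off after lag q suggests an MA(q) term, so a spike at lag 6 in the ACF would point at q rather than p. A sketch of plotting both side by side, on the built-in co2 series standing in for the unshared data:)

```r
# Deseasonalize and difference, then read q from the ACF and p from the PACF
y <- diff(co2 - decompose(co2)$seasonal)

op <- par(mfrow = c(1, 2))
acf(y, main = "ACF (suggests MA order q)")
pacf(y, main = "PACF (suggests AR order p)")
par(op)
```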
I just learned ARIMA this summer and would appreciate help from anyone who knows the subject well on how to choose the optimal parameters based on what I've shown. Looking forward to a good discourse :)