I am fairly new to this field and would like some help and advice. Any help would be much appreciated!
I am currently doing a forecasting project with time series data. However, the data does not contain any weekend or holiday observations. My goal is to predict the value on a specific future date. For example, given data from 2000 to the present, I would like to predict the value for 2023-05-01. I tried creating some plots and using the zoo package, but I am unsure how to approach this unevenly spaced data. Can someone give me some ideas about which models I should try? By the way, I am using R for this project. Thank you all so much!
I agree with #jhoward that this is missing data rather than unevenly spaced data (such as timestamped data), so you can interpolate the missing values. This may help as an overview of the possible techniques: 4-techniques-to-handle-missing-values-in-time-series-data
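As a minimal sketch of the interpolation idea, the weekday-only observations can be placed on a full daily calendar and the gaps filled linearly. The dates and values below are made up for illustration, and linear interpolation with base R's approx() is only one of several possible choices (zoo::na.approx is a common alternative).

```r
# Hypothetical weekday-only series (illustrative values, not real data).
all_days <- seq(as.Date("2023-04-03"), as.Date("2023-04-14"), by = "day")
is_weekday <- !(format(all_days, "%u") %in% c("6", "7"))  # %u: 6 = Saturday, 7 = Sunday
obs <- data.frame(date = all_days[is_weekday],
                  value = seq_len(sum(is_weekday)) * 10)

# Put the observations on a complete daily calendar; weekends become NA.
full <- merge(data.frame(date = all_days), obs, by = "date", all.x = TRUE)

# Fill the gaps by linear interpolation between the neighbouring observations.
full$value_filled <- approx(x = as.numeric(obs$date), y = obs$value,
                            xout = as.numeric(full$date))$y
```

Whether filling weekends is appropriate depends on the data: if the process simply does not run on weekends (for example, a market that is closed), treating the series as a regular business-day series may be the better assumption.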
Hey, I am sorry if this question is too easy for this forum! My task was to decompose a time series into its different components and then plot it in R.
However, the correct solution included the part that I uploaded as a picture. We are trying to identify the monthly data, right? But why are we using the cbind and data.frame functions? Thank you very much in advance.
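The picture is not reproduced here, so the following is only a guess at the kind of pattern such a solution often uses: decompose() returns the components as separate series, cbind() lines them up side by side on a common time index, and data.frame() turns the result into a table that is easy to inspect or export. The built-in AirPassengers series is used purely as a stand-in for the task's data.

```r
# Stand-in monthly series; the actual data from the task is not available here.
dec <- decompose(AirPassengers)

# cbind() aligns the components column by column on the shared time index.
components <- cbind(observed = dec$x,
                    trend    = dec$trend,
                    seasonal = dec$seasonal,
                    random   = dec$random)

# data.frame() gives an ordinary table, here with the time index as a column.
components_df <- data.frame(time = as.numeric(time(components)),
                            as.data.frame(components))
```

Calling plot(components) then draws one panel per component, and head(components_df) shows the same values as a table.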
Basically, my task for the next 3 months is to forecast bed demand and a couple of other variables in a hospital's emergency department. The data consists of 5 years' worth of daily observations of these variables. The data is complete, with no missing values.
The goal is to improve the prediction accuracy of the current tool, which is an Excel workbook.
I have not taken any time series or optimization courses in college so far, so imagine my horror when I realised I had no clue how to approach this project and that I would be working entirely alone. I was told that no one in the department has any experience with this and that no one would be able to help me.
I'm using RStudio, but I'm not very proficient since it was self-taught.
From trying out the questions asked on here as well as YouTube tutorials to learn the appropriate syntax and functions, what I have managed to find out is:
1) My data is a time series and I should apply forecasting models to predict future values based on the historical data I have.
2) Daily observations over a long period have weekly and annual seasonality, so I should define the data as a multi-seasonal time series.
I first tried defining my data with ts(), then msts(). One of the answers here mentioned that zoo() would be more appropriate for daily observations, so I tried that too. The forecasting models I've tried are snaive, ets, auto.arima and TBATS.
I would like to present plots of the values/forecasts by day of the week rather than for all 365 days of the year, which is the only output I have been able to plot. I tried using frequency = 365 and 7, with start = c(2014, 1) and end = c(2018, 365), but I haven't had any luck.
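One way to get a day-of-the-week view, sketched here with simulated data because the hospital data is not available, is to derive the weekday from the date and summarise per weekday. The weekly pattern, the noise level and the variable names are all assumptions made for illustration.

```r
set.seed(1)

# Five years of daily dates, matching the 2014 to 2018 range mentioned above.
dates <- seq(as.Date("2014-01-01"), as.Date("2018-12-31"), by = "day")

# Day of the week as an ordered factor (%u gives 1 = Monday ... 7 = Sunday).
dow <- factor(format(dates, "%u"), levels = as.character(1:7),
              labels = c("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"))

# Simulated demand with an assumed weekly pattern plus noise.
weekly_effect <- c(5, 3, 2, 2, 4, -6, -8)
demand <- 40 + weekly_effect[as.integer(dow)] + rnorm(length(dates), sd = 3)

# Average demand per day of the week.
dow_mean <- tapply(demand, dow, mean)
```

From here, barplot(dow_mean) shows the averages and boxplot(demand ~ dow) shows the full spread per weekday. The same grouping could be applied to forecast values once they are paired with their dates.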
I would really appreciate any advice and help I could get from anyone. Thank you!
Without looking at your data, have you tried getting started with some basic ARIMA modeling and seeing what results you get from that? Depending on your data, it’s a fairly friendly way to get started with time series forecasting. I was forecasting by the hour, but the frequency can be adjusted to whatever interval you need to forecast at. As you mentioned, you are looking to change the frequency. Sometimes it’s easier to see a pattern at larger time intervals, so you can aggregate your data to a larger interval.
For example, this converts daily observations to monthly totals.
library(xts)
# Daily date index; it must have one entry per row of beds
dates <- seq(as.Date('2012-01-01'), as.Date('2019-03-31'), by='days')
beds$date.formatted <- dates
# Build an xts object from the daily counts, indexed by date
beds.xts <- xts(x=beds$neds.count, order.by=beds$date.formatted)
# Sum the daily values within each calendar month
end.month <- endpoints(beds.xts, 'months')
beds.month <- period.apply(beds.xts, end.month, sum)
beds.monthly.df <- data.frame(date=index(beds.month), coredata(beds.month))
colnames(beds.monthly.df) <- c('Date','Sessions')
# Convert to a monthly ts object for plotting and modeling
beds.monthly <- ts(beds.monthly.df$Sessions, start=c(2012,1), end=c(2019,3), frequency=12)
plot(beds.monthly)
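For the basic ARIMA starting point mentioned above, a minimal sketch with base R's arima() could look like the following. The data is simulated, and the model orders are picked only for illustration; they are assumptions, not a recommendation for the hospital data.

```r
set.seed(123)

# Simulated daily demand with an assumed weekly pattern (60 weeks of data).
n <- 7 * 60
weekly_pattern <- c(5, 3, 2, 2, 4, -6, -8)
demand <- 40 + rep(weekly_pattern, length.out = n) + rnorm(n, sd = 3)

# frequency = 7 tells R that one seasonal cycle is one week.
y <- ts(demand, frequency = 7)

# Seasonal ARIMA(1,0,0)(0,1,1)[7]; the orders are illustrative only.
fit <- arima(y, order = c(1, 0, 0),
             seasonal = list(order = c(0, 1, 1), period = 7),
             method = "ML")

# Forecast the next two weeks; pred holds point forecasts, se their standard errors.
fc <- predict(fit, n.ahead = 14)
```

In practice the orders would be chosen from the data, for example by inspecting acf() and pacf() plots or by using auto.arima from the forecast package, which the question already mentions.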
I’m not sure whether that answers your question, but since you mentioned you are self-taught and starting out, I can share a script to help you get started with an example. It goes through the whole process: checking that your data has been read in as a time series, what time series data is, how to check for non-stationarity and seasonal trends, plots that are useful for this, modeling, prediction, plotting actual vs. predicted values, accuracy, and further issues with the data that could be hindering your model. The video tutorial series is scripted in Python, but you can follow the end-to-end process of forecasting with ARIMA using the equivalent R script for this tutorial: https://code.datasciencedojo.com/rebeccam/tutorials/blob/master/Time%20Series/r_time_series_example.R
https://tutorials.datasciencedojo.com/time-series-python-reading-data/
I'm still a novice in R. I have read quite a few posts and discussions on how to filter out frequency domains in a time series, but none of them quite matched my problem.
I would like to ask for your suggestions about the following:
I calculated the wavelet coherence for two annually measured time series and took a look at the wavelet coherence PSD graph:
The purple line (i.e. the 8-year period) represents the border below which I would like to filter out the frequency domain, not in the PSD but in the original input data.
I thought about using the butter function from the signal package, but it was overcomplicated for my purposes.
So I approached the problem with the bwfilter function of the mFilter package, to pass through the data above the 8-year period, which corresponds to 2.37E-7 Hz.
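library(mFilter)
# Read the semicolon-separated input file
name="dta OAK.resid Tair "
adat=read.table(file=paste(name,".csv", sep=""), sep=";", header=TRUE)
dta=adat$ya
# Apply the Butterworth filter from mFilter
highpass <- bwfilter(dta, freq=8,drift=FALSE)
plot(highpass)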
However, the results do not seem to be correct: the filter seems to remove too much from the data, and the trend follows the original time series too closely.
Do you have any idea what may have gone wrong? Could it be the measurement unit?
Any help is appreciated and if any additional details are needed I am happy to provide them!
Thank you!
The data can be found here
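On the question of units, the arithmetic for an 8-year period can be written out as a quick check. This is only a sketch of the conversions, assuming a year of 365.25 days and one sample per year; which of these conventions a given filter function expects for its cut-off argument should be checked in that function's documentation.

```r
period_years <- 8

# Cut-off in hertz (cycles per second), assuming 365.25 days per year.
seconds_per_year <- 365.25 * 24 * 3600
freq_hz <- 1 / (period_years * seconds_per_year)   # about 3.96e-09 Hz

# Cut-off in cycles per sample; with annual data one sample equals one year.
freq_cycles_per_sample <- 1 / period_years         # 0.125

# Cut-off as a fraction of the Nyquist frequency (0.5 cycles per sample).
nyquist <- 0.5
freq_normalised <- freq_cycles_per_sample / nyquist  # 0.25
```

For annually sampled data the value in hertz is rarely the one a filter needs; a period in samples (8) or a normalised frequency (0.125 or 0.25, depending on the convention) is more typical.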
I know similar questions have been discussed before, but my question concerns a new scenario, and the existing methods I tried do not work for me.
I use RStudio 0.97.551 under Windows XP.
I used the xts package to draw multiple (usually two) time series in one plot, and it works fine. However, it seems that xts() and xtsExtra() do not support the normal plot functions in R, so I cannot use the usual commands to add an additional y-axis.
My time series data is finance data with irregular time intervals, which is why I feel that using a dedicated time series package to plot the data would be better.
Here's the sample code; I use the Canada data set from the vars package so that it is reproducible:
library(timeSeries)
library(timeDate)
library(xts)
library(xtsExtra)
library(vars)
data(Canada)
plot.xts(Canada,screens=1)
So I would really like to know of any workable way to add an additional y-axis to an xts-based time series plot. I hope my question is not too basic...
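For comparison, this is a minimal sketch of the usual base graphics approach to a second y-axis, which is not the xts plotting method itself. It uses made-up irregular timestamps and values; with an xts object, the times and values could be extracted with index() and coredata() first. The plot is written to a temporary PDF so the sketch also runs non-interactively; in an interactive session the pdf() and dev.off() lines can be dropped.

```r
set.seed(42)

# Made-up irregularly spaced timestamps and two series on different scales.
times  <- as.POSIXct("2013-01-02 09:00:00", tz = "UTC") + sort(sample(0:86400, 50))
price  <- 100 + cumsum(rnorm(50))
volume <- round(runif(50, 1000, 5000))

out_file <- tempfile(fileext = ".pdf")
pdf(out_file)

# Leave extra room in the right margin for the second axis.
par(mar = c(5, 4, 4, 4) + 0.1)
plot(times, price, type = "l", xlab = "Time", ylab = "Price")

# Overlay the second series on the same plot region without redrawing axes.
par(new = TRUE)
plot(times, volume, type = "l", col = "red", axes = FALSE, xlab = "", ylab = "")

# Draw and label the right-hand axis for the second series.
axis(side = 4, col.axis = "red")
mtext("Volume", side = 4, line = 3, col = "red")

dev.off()
```

The key steps are par(new = TRUE), which keeps the first plot on the device, and axis(side = 4), which draws the additional axis on the right.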