Determining ARIMA frequency of non-stationary time series - r

I am trying to use ARIMA to forecast chemical concentrations in water tanks. I have a large dataset of around a million observations, taken two minutes apart. When I use auto.arima in R, I get a forecast that looks like this:
Forecast
As you can see, it flattens out, which makes longer-horizon forecasts quite useless.
As far as I can tell from my reading, the frequency of the time series is what I need to address in the model, but I cannot find anything that explains how. Frequency in this sense is not the two minutes between observations; it is something along the lines of "twelve observations per year" for monthly data, where the seasons have an effect on the data.
Here is a plot of the data, if it helps
Plot
and on a smaller scale:
Smaller scale plot

Have a look at this question and answer on Stats Stack Exchange. It is pretty much the same question, and the answer there basically covers it.
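A minimal sketch of what "frequency" means in practice, assuming the tank data has a daily cycle (an assumption, since the plots are not available here). With readings two minutes apart, one day is 720 observations, so that is the value passed as frequency. The data is simulated and all variable names are illustrative.

```r
# Simulated stand-in for the tank data: 5 days of 2-minute readings
# with an assumed daily cycle (all values here are made up).
set.seed(1)
obs_per_day <- 24 * 60 / 2          # 720 two-minute intervals per day
n <- 5 * obs_per_day
t <- seq_len(n)
x <- 10 + 2 * sin(2 * pi * t / obs_per_day) + rnorm(n, sd = 0.3)

# frequency = number of observations per seasonal cycle
y <- ts(x, frequency = obs_per_day)

# A seasonal decomposition needs the frequency to be set (greater than 1)
fit <- stl(y, s.window = "periodic")
seasonal_range <- diff(range(fit$time.series[, "seasonal"]))
```

If the real cycle is weekly or something else, only obs_per_day changes. Seasonal periods this long can be impractical for a seasonal ARIMA fit, so this only illustrates how the frequency is defined.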

Related

time series with multiple observations per unit of time

I have a dataset of the daily spreads of 500 stocks. My eventual goal is to build a model using extreme value theory. As one of the first steps, however, I want to check my data for volatility clustering and leptokurtosis. So I first want R to treat my data as a time series, and I want to plot it. However, I can only find examples of time series with a single observation per unit of time. Is it possible for R to treat my type of dataset as a time series? And what is the best way to plot it?
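One possible approach, sketched under the assumption that the spreads sit in a matrix with one row per day and one column per stock: base R's ts() accepts a matrix and treats each column as a separate series. The data and the excess_kurtosis helper below are hypothetical.

```r
# Hypothetical data: 250 daily spreads for 3 stocks (the real set has 500).
set.seed(42)
spreads <- matrix(rt(250 * 3, df = 4), ncol = 3,
                  dimnames = list(NULL, c("stockA", "stockB", "stockC")))

# A matrix becomes a multivariate time series: one column per stock
spreads_ts <- ts(spreads)

# Sample excess kurtosis per stock; values above 0 suggest fat tails
excess_kurtosis <- function(x) {
  z <- x - mean(x)
  mean(z^4) / mean(z^2)^2 - 3
}
kurt <- apply(spreads_ts, 2, excess_kurtosis)

# plot(spreads_ts, plot.type = "single", col = 1:3)  # all stocks in one panel
```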

How to calculate area under the curve (AUC) in several data series?

I have the data of blood parameters from around 400 patients and from each patient I collected the parameter on 30 consecutive days. So each patient has around 30 values.
It looks like this:
So from these data I want to calculate the area under the curve for each patient.
As far as I can see, the "pROC" package could probably help with this. But what is the fastest way to calculate the AUC for each patient? I want to avoid calculating it manually for each patient.
Can anyone help?
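A minimal sketch, assuming "area under the curve" here means the area under each patient's value-over-time curve. That is a different quantity from the ROC AUC that pROC computes for a classifier. The data frame layout, the column names and the trapezoid_auc helper are all hypothetical.

```r
# Hypothetical long-format data: one row per patient per day
blood <- data.frame(
  patient = rep(c("p1", "p2"), each = 30),
  day     = rep(1:30, times = 2),
  value   = c(rep(2, 30), 1:30)
)

# Trapezoidal rule: sum of interval width times mean of the two endpoints
trapezoid_auc <- function(x, y) {
  ord <- order(x)
  x <- x[ord]
  y <- y[ord]
  sum(diff(x) * (head(y, -1) + tail(y, -1)) / 2)
}

# Apply to every patient at once
auc_by_patient <- sapply(split(blood, blood$patient),
                         function(d) trapezoid_auc(d$day, d$value))
```

The result is a named vector with one area per patient, so nothing has to be computed by hand.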

Predicting multivariate time series with RNN

I have been experimenting with a R package called RNN.
The following is the code site:
https://github.com/bquast/rnn
It has a very nice example for financial time series prediction.
I have read the code, and I understand that it uses the sequence of the time series to predict the instrument's next-day value in advance.
The following is an example of run with 10 hidden nodes and 200 epochs
RNN financial time series prediction
What I would expect as a result is that the algorithm succeeds, at least in part, in predicting the value of the instrument in advance.
From what I can see, it apparently only approximates the value of the time series on the current day and gives no prediction for the next day.
Is my expectation wrong?
This code is very simple, how would you improve it?
y <- X[,1:input$training_amount+input$prediction_gap,as.numeric(input$target)]
matrix(y, ncol=input$training_amount)
y.train moves all the data forward by a day, so that is what is being trained on: next-day data for the currency pair you care about. With ncol = training_amount, when there are too many columns (they now number training_amount + prediction_gap), the first data points fall off; hence all the data gets moved forward by the prediction_gap.
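The index shift can be seen in isolation with a small sketch. The vector x and the two scalars are hypothetical stand-ins for the app's inputs, not the package's own code.

```r
# Hypothetical series of 10 daily values
x <- seq(10, 100, by = 10)
training_amount <- 8
prediction_gap  <- 1

# ":" binds tighter than "+", so this is (1:8) + 1, i.e. indices 2..9
target_idx <- 1:training_amount + prediction_gap

inputs  <- x[1:training_amount]   # days 1..8
targets <- x[target_idx]          # days 2..9: each input's next-day value
```

Each target sits one step ahead of its input, which is what makes the network learn a next-day value rather than the current one.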

Generate artificial sales data series in R with arima model & add noise

I'm extremely new to R and pretty new to time series analysis. I have been looking for the answer online but I can't seem to find it.
I need to generate an artificial time series that would represent sales data of a product X, in a few variations: an independent series (I can do that with Excel), and an AR(1) series with rho = 0.8 and an added noise level.
For the AR(1), I know I can use
ar.sim<-arima.sim(model=list(ar=c(.8)), n=52)
ar.sim
this generates 52 data points along the lines of
[1] -2.080871400 -1.327156528 -0.672868162 0.521280151 -0.200243250
[6] 0.808095642 2.208014641 4.266957623 1.682261358 0.796715498
Question 1: How do I make it start from a certain level, e.g. from 300 products sold? I thought of n.start, but that seems to be something else. Do I just add 300 every time? That seems wrong.
Question 2: How can I add noise to this series?
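One possible approach to both questions, as a sketch: arima.sim() generates a series that fluctuates around zero, so adding a constant level is a reasonable way to centre it at 300 (n.start only sets the length of the burn-in period). The innovation and noise standard deviations below are arbitrary assumptions.

```r
set.seed(123)
n     <- 52
level <- 300   # assumed average weekly sales

# arima.sim() returns a zero-mean series, so the level is added as a constant
ar_part <- arima.sim(model = list(ar = 0.8), n = n, sd = 10)
sales   <- level + ar_part

# Extra noise on top, independent of the AR(1) dynamics
noisy_sales <- sales + rnorm(n, sd = 5)
```

The sd argument to arima.sim() scales the AR(1) innovations, while the rnorm() term adds noise that does not carry over from one period to the next.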

How to compare two forecast graphs for two different time series in R?

I want to compare the forecast graphs for two different time series. I have 5 years of monthly rainfall data for two different cities. For each city I have plotted the 5 years of observations together with a forecast for 2 more years, using the forecast package. Now I want to compare these two graphs and their 2-year forecasts (perhaps in terms of error).
Can anyone help me with this?
You could start with something like this:
f1 <- forecast(series1, h=24)
f2 <- forecast(series2, h=24)
accuracy(f1)
accuracy(f2)
That will give you a lot of error measures on the historical data. Unless you have the actual data for the future periods, you can't do much more than that.
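If out-of-sample errors are wanted anyway, one option is to hold back the last part of each series and forecast it. The sketch below uses base R's arima() and predict() rather than the forecast package, on simulated rainfall. The model order, the 12-month holdout and the holdout_rmse helper are assumptions made for illustration.

```r
# Hypothetical monthly rainfall for two cities, 5 years each
set.seed(7)
month <- rep(1:12, times = 5)
city1 <- ts(80 + 30 * sin(2 * pi * month / 12) + rnorm(60, sd = 8),
            start = c(2015, 1), frequency = 12)
city2 <- ts(50 + 10 * sin(2 * pi * month / 12) + rnorm(60, sd = 15),
            start = c(2015, 1), frequency = 12)

# Fit on all but the last h months, forecast h months, return the RMSE
holdout_rmse <- function(series, h = 12) {
  n     <- length(series)
  train <- ts(series[1:(n - h)], start = start(series),
              frequency = frequency(series))
  test  <- series[(n - h + 1):n]
  fit   <- arima(train, order = c(1, 0, 0),
                 seasonal = list(order = c(0, 1, 0), period = 12),
                 method = "ML")
  pred  <- predict(fit, n.ahead = h)$pred
  sqrt(mean((test - pred)^2))
}

rmse1 <- holdout_rmse(city1)
rmse2 <- holdout_rmse(city2)
```

The two RMSE values are in the units of the data, so they are only directly comparable if both cities are measured on the same scale.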
