We have hourly time series data with two columns: a timestamp and an error rate. We used an H2O deep learning model to learn from and predict the future error rate, but it appears to require at least two features (besides the timestamp) to build a model.
Is there any way H2O can learn from this type of data (time, value) with only one feature, and predict the value for a given future time?
Not in the current release of H2O, but ARIMA models are in development. You can follow the progress here.
Interesting question.
I have read about declaring additional variables that represent previous values of the time series, similar to the autoregressive approach used in ARIMA models. I'm not sure whether this is a valid way to do it, so please correct me if I am wrong.
Consequently, you could try extending your dataset to something like this:
t  value(t)  value(t-1)  value(t-2)  value(t-3)  ...
1  10        NA          NA          NA          ...
2  14        10          NA          NA          ...
3  27        14          10          NA          ...
...
After this, value(t) is your response (output neuron) and the others are your predictor variables, each referring to an input neuron.
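A minimal sketch in R of how such a lagged frame could be built (the function name and lag count are just for illustration):

# Build a data frame with the series and its first n_lags lagged copies.
make_lags <- function(x, n_lags) {
  out <- data.frame(t = seq_along(x), value = x)
  for (k in seq_len(n_lags)) {
    out[[paste0("value_lag", k)]] <- c(rep(NA, k), head(x, -k))
  }
  out
}
make_lags(c(10, 14, 27, 31, 25), 3)  # reproduces the table above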
I have tried many of the default methods in H2O with time series data. If you treat the system as a state machine where the state variables are a series of lagged prior states, it's possible, but not entirely effective, as the prior states don't maintain their causal order. One way to alleviate this is to assign weights to each lagged state based on how much time has passed, similar to how an EMA gives precedence to more recent data.
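A rough illustration of that weighting idea in R (the decay rate and lag count are arbitrary placeholder choices):

# Exponentially decaying weights over n_lags lagged features,
# so more recent lags count more, EMA-style (decay is arbitrary).
n_lags <- 5
decay <- 0.8
w <- decay ^ (seq_len(n_lags) - 1)  # 1.00, 0.80, 0.64, ...
lagged <- matrix(rnorm(100 * n_lags), ncol = n_lags)  # placeholder lag matrix
weighted <- sweep(lagged, 2, w, `*`)  # scale each lag column by its weight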
If you are looking to see how easy or effective DL/ML can be for a nonlinear time series model, I would start with an easy problem to validate that the DL approach gives any improvement over a simple one-period ARIMA/GARCH-type process.
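For the baseline side of that comparison, a minimal sketch with the forecast package (the series here is synthetic):

library(forecast)
y <- ts(cumsum(rnorm(200)))       # synthetic stand-in for your series
fit <- auto.arima(y)              # simple linear benchmark
baseline <- forecast(fit, h = 1)  # one-period-ahead forecast
accuracy(baseline)                # errors to beat with the DL model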
I have used this technique with varying success. What I have had success with is taking well-known nonlinear time series models and improving their predictive qualities with additional factors, using the handcrafted nonlinear model as an input to the DL method. It seems that certain qualities of the parameter space that I haven't manually worked out are able to supplement a decent foundation.
The real question at that point is that you have now introduced immense complexity that isn't entirely understood. Is that complexity warranted in the combined pipeline when the nonlinear model already encapsulates about 95% of the information between the two stages?
I am approaching a problem that Keras must offer an excellent solution for, but I am having trouble developing an approach (because I am such a neophyte at anything deep learning). I have sales data. It contains 11106 distinct customers, each with its own time series of purchases, of varying length (anywhere from 1 to 15 periods).
I want to develop a single model to predict each customer's purchase amount for the next period. I like the idea of an LSTM, but clearly I cannot make one for each customer; even if I tried, there would not be enough data for an LSTM in any case, since the longest individual time series has only 15 periods.
I have used types of Markov chains, clustering, and regression in the past to model this kind of data. I am asking the question here, though, about what type of model in Keras is suited to this type of prediction. A complication is that all customers can be clustered by their overall patterns. Some belong together based on similarity; others do not; e.g., some customers spend with patterns like $100-$100-$100, others like $100-$100-$1000-$10000, and so on.
Can anyone point me to a type of sequential model supported by Keras that might handle this well? Thank you.
I am trying to achieve this in R. I haven't been able to build a model that gives me more than about 0.3 accuracy.
I don't think the main difficulty comes from which model to use so much as from how to frame the problem.
As you mention, "WHO" is spending the money seems as relevant as their past transactions in predicting how much they will likely spend.
But you cannot train 10k+ models either, one for each customer.
Instead, I would suggest clustering your customer base and fitting one model per cluster, using all the time series of the customers in that cluster combined to train the same model.
This would allow each model to learn the spending pattern of that particular group.
For that you can use an LSTM or another RNN model; a rough sketch follows.
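Here is what that could look like in R with the keras package; the toy data, clustering features, window length, and network size are all placeholder choices, not a definitive design:

library(keras)

# Toy data mirroring the question: some customers spend around $100,
# others around $1000 (fixed length 6 here; real lengths vary 1-15).
set.seed(1)
scale <- sample(c(100, 1000), 200, replace = TRUE)
series <- lapply(scale, function(m) abs(rnorm(6, mean = m, sd = m / 10)))

# Cluster customers by simple summary features of their series
# (placeholder features; anything capturing the spending pattern works).
feats <- t(sapply(series, function(s) c(mean(s), sd(s))))
clusters <- kmeans(feats, centers = 2)$cluster

# One small LSTM per cluster: predict the last value from the first 5.
for (k in unique(clusters)) {
  mat <- do.call(rbind, series[clusters == k])
  x <- array(mat[, 1:5], dim = c(nrow(mat), 5, 1))  # samples x steps x features
  y <- mat[, 6]
  model <- keras_model_sequential() %>%
    layer_lstm(units = 8, input_shape = c(5, 1)) %>%
    layer_dense(units = 1)
  model %>% compile(loss = "mse", optimizer = "adam")
  model %>% fit(x, y, epochs = 10, verbose = 0)
}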
Hi, here's my suggestion; I will edit it later to provide more information.
Since it's a sequence problem, you should use RNN-based models: LSTMs or GRUs.
I have time series data ranging from 0 to 30 million; it is basically weekly web traffic data. I am building a forecasting model with it and want to understand how to deal with this range of values. I tried a Box-Cox transformation with a Prophet model, but I am not sure which metrics to use to evaluate the model's performance. The data has a lot of 0's, which I can't remove from the dataset. Is there a better way to deal with the 0's than the Box-Cox transformation? I had issues with the inverse transformation, but I added a small value (0.1) to the data to avoid negative values.
If your series has a lot of periodic zero values, Croston's method is one way to go. It is basically a forecasting strategy for products with intermittent demand. You can also try exponential smoothing and traditional ARIMA/SARIMA models and clip the negative values in the forecast (depending on your use case).
You can find Croston's method in the forecast package; a minimal example follows.
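A small sketch (the series below is a toy stand-in for your data):

library(forecast)
y <- ts(c(0, 0, 5, 0, 3, 0, 0, 8, 0, 2, 0, 0, 6, 0, 1))  # intermittent demand
fit <- croston(y, h = 4)  # forecast 4 periods ahead
fit$mean                  # point forecasts, never negative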
Also refer to these links:
https://stats.stackexchange.com/questions/8779/analysis-of-time-series-with-many-zero-values/8782
https://stats.stackexchange.com/questions/373689/forecasting-intermittent-demand-with-zeroes-in-times-series
https://robjhyndman.com/papers/foresight.pdf
I am trying to predict the Bitcoin price at t+5, i.e. 5 minutes ahead, using 11 technical indicators up to time t which can all be calculated from the open, high, low, close and volume values from the Bitcoin time series (see my full data set here). As far as I know, it is not necessary to manipulate the data frame when using algorithms like regression trees, support vector machines or artificial neural networks, but when using ensemble methods like random forests (RF) and Boosting, I heard that it is necessary to re-arrange the data frame in some way, because ensemble methods draw repeated RANDOM samples from the training data, in which case the sequence of the Bitcoin time series will be ruined. So, is there a way to re-arrange the data frame in some way such that the time series will still be in chronological order every time repeated samples are drawn from the training data?
I was provided with an explanation of how to construct the data frame here and possibly here, too, but unfortunately, I didn't really understand these explanations, because I didn't see a visual example of the to-be-constructed data frame and because I wasn't able to identify the relevant line of code. So, if someone could show me how to re-arrange the data frame using an example data frame, I would be very thankful. As an example data frame, you might consider using the built-in airquality data frame in R (I think it contains time series data), the data I provided above, or any other data frame you think is best.
Many thanks!
There is no problem with resampling for ML algorithms. To capture (auto)correlation, just add columns with lagged values of the time series. E.g., in the case of a univariate time series x[t], where t is time in minutes, you add x[t - 1], x[t - 2], ..., x[t - n] columns with lagged values. The more lags you add, the more history will be accounted for during model training.
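A quick illustration in R using the built-in embed() function (the number of lags is arbitrary):

# embed() turns a vector into a matrix of current and lagged values:
# column 1 is x[t], column 2 is x[t-1], and so on.
x <- c(10, 14, 27, 31, 25, 40)
m <- embed(x, 3)  # x[t], x[t-1], x[t-2]
colnames(m) <- c("x_t", "x_t_1", "x_t_2")
m
# Each row is now a self-contained observation, so the random resampling
# done by random forests or boosting no longer breaks the time order.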
You can find a very basic working example here: Prediction using neural networks
More advanced stuff with Keras is here: Time series prediction using RNN
However, just for your information, here is a special message from Mr Chollet and Mr Allaire in the above-mentioned article:
NOTE: Markets and machine learning
Some readers are bound to want to take the techniques we’ve introduced
here and try them on the problem of forecasting the future price of
securities on the stock market (or currency exchange rates, and so
on). Markets have very different statistical characteristics than
natural phenomena such as weather patterns. Trying to use machine
learning to beat markets, when you only have access to publicly
available data, is a difficult endeavor, and you’re likely to waste
your time and resources with nothing to show for it.
Always remember that when it comes to markets, past performance is not
a good predictor of future returns – looking in the rear-view mirror
is a bad way to drive. Machine learning, on the other hand, is
applicable to datasets where the past is a good predictor of the
future.
I have data in the following format:
I want to use a neural network to forecast the next value of series A. I have already obtained the lag for series A, which comes out to be 37 based on the AIC criterion. This was done using the code
library(forecast)
# nnetar fits a neural network autoregression; the non-seasonal lag
# order p is selected automatically (here it came out as 37 by AIC).
model <- nnetar(data[, 2], P = 1, repeats = 20, lambda = NULL)
I want to use the same lag for both series A and B to predict the next values of series A. The problem I am facing is that nnetar builds a model on only one column, and I want to use both columns for forecasting while keeping the lag structure intact. Is there any way I could do that?
I think it is hardly possible to use NNAR natively, as it has a fixed NN structure apart from the number of nodes in the hidden layer.
(WARNING: slightly crazy idea)
However, as far as I understand nnetar, it uses something like a windowing approach. Therefore, it might be possible to interleave your data so that A's and B's alternate. This can be done using melt(data[,2:3]) from reshape. Afterwards you can predict A's and B's simultaneously; you need to adapt all the other parameters accordingly. A rough sketch follows.
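Purely as an illustration, with made-up data (note that melt() stacks the two columns end to end, while a row-wise transpose gives strict alternation):

# Strictly alternate A and B values: A1, B1, A2, B2, ...
# (column 2 is series A, column 3 is series B, as in the question)
data <- data.frame(t = 1:5, A = c(10, 12, 11, 15, 14), B = c(3, 4, 2, 5, 6))
interleaved <- as.vector(t(as.matrix(data[, 2:3])))
interleaved  # 10 3 12 4 11 2 15 5 14 6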
However, I would switch to a recurrent neural network library like RSNNS.
If I have 2 lists of time intervals:
List1:
1. 2010-06-06 to 2010-12-12
2. 2010-05-04 to 2010-11-02
3. 2010-02-04 to 2010-10-08
4. 2010-04-01 to 2010-08-02
5. 2010-01-03 to 2010-02-02
and
List2:
1. 2010-06-08 to 2010-12-14
2. 2010-04-04 to 2010-10-10
3. 2010-02-02 to 2010-12-16
What would be the best way to calculate some sort of correlation or similarity factor between the two lists?
Thanks!
Is that the extent of the data or just a sample to give an idea of the structure you have?
Just a few ideas about how to look at this... My apologies if any of it is redundant with what you have already tried on this set.
Two basic ideas come to mind for comparing intervals like this: absolute or relative. A relative comparison would ignore absolute time for the interval data and look for repeating structures or signatures that occur in both groups, but not necessarily at the same time. The absolute version would consider simultaneous events to be relevant: it doesn't matter if something happens every week if the occurrences are separated by a year. You can maybe make this distinction by knowing something about the origin of the data.
If this is the grand total of the data available for your decision about associations, it will come down to some assumptions about what constitutes "correlation". For instance, if you have a specific model for what is going on (e.g., a time-to-start, time-to-stop/failure model), you could evaluate the likelihood of observing one sequence given the other. However, without more example data it seems unlikely you'd be able to reach any firm conclusions.
The first intervals in the two groups are nearly identical, so they will contribute strongly to any correlation measure I can think of for the two groups. Under a random model for this set, I would expect many models to flag these two observations as "unlikely" just because of that.
One way to assess "similarity" would be to ask what portion of the time axis is covered (possibly generalized to multiple coverage) and compare the two groups on that basis.
Another possibility is to define a function that adds one for each interval that covers any particular day within the overall span of these events. That way you have a function over time with a rudimentary description of multiple events covering the same date. Calculating a correlation between the two groups might give you hints of structural similarity, but again, you would need more groups of data to reach any conclusions. A small sketch of this idea follows.
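Here is what that daily coverage idea could look like in R, using the dates from the question (with so few intervals, the resulting correlation is only illustrative):

# Count how many intervals in a list cover each day of the year.
coverage <- function(starts, ends, days) {
  s <- as.Date(starts); e <- as.Date(ends)
  sapply(seq_along(days), function(i) sum(s <= days[i] & days[i] <= e))
}
days <- seq(as.Date("2010-01-01"), as.Date("2010-12-31"), by = "day")
c1 <- coverage(c("2010-06-06", "2010-05-04", "2010-02-04", "2010-04-01", "2010-01-03"),
               c("2010-12-12", "2010-11-02", "2010-10-08", "2010-08-02", "2010-02-02"),
               days)
c2 <- coverage(c("2010-06-08", "2010-04-04", "2010-02-02"),
               c("2010-12-14", "2010-10-10", "2010-12-16"),
               days)
cor(c1, c2)  # crude measure of structural similarity between the groups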
Ok that was a little rambling. Good luck with your project!
You may try cross-correlation.
However, you should be aware that you have vector data (start, length), and the algorithms assume a functional dependency between the series. Whether that holds depends on the semantics of your data, which is not clear from the question.
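A minimal sketch, assuming you first turn each list into a daily coverage series (as in the earlier answer) and then apply R's built-in ccf():

# Cross-correlate two daily coverage series; peaks away from lag 0
# would suggest one list's intervals lead or lag the other's.
set.seed(1)
c1 <- rbinom(365, 3, 0.4)  # placeholder coverage counts
c2 <- rbinom(365, 3, 0.4)
ccf(c1, c2, lag.max = 30)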
HTH!
A more useful link for your current problem here.