I need to compute a characteristic number (ideally in the 0-1 range) for the "evenness" of event starts.
I have various Timeslots, each consisting of a start date and a duration. I also have a fixed Interval in which these Timeslots can occur, for example a week from 10.07.2018 to 17.07.2018.
Given some events in this week, I want to determine whether they are evenly distributed. For me, evenly distributed means that the distance between consecutive events is the same.
So 3 events on 11.07, 13.07 and 15.07 should be considered better than 3 events that all occur on 11.07.
Is there a mathematical algorithm or formula for this kind of problem?
Thanks in advance
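One possible measure, given here only as a minimal sketch: cut the fixed Interval at every event start and take the normalized Shannon entropy of the resulting segment lengths. The function name `evenness` and the choice to count the gaps to the interval boundaries as segments are assumptions, not part of the question. The score is 1 when all segments are equal and drops towards 0 as events clump together.

```python
import math

def evenness(event_starts, interval_start, interval_end):
    """Normalized entropy (0..1) of the segments the event starts cut the interval into.

    Times are plain numbers (e.g. days since some origin). The gaps between the
    interval boundaries and the first/last event are counted as segments too,
    which is an assumption of this sketch.
    """
    total = interval_end - interval_start
    if total <= 0:
        raise ValueError("interval must have positive length")
    points = sorted(event_starts)
    if not points:
        raise ValueError("at least one event is required")
    if points[0] < interval_start or points[-1] > interval_end:
        raise ValueError("events must lie inside the interval")

    bounds = [interval_start] + points + [interval_end]
    segments = [b - a for a, b in zip(bounds, bounds[1:])]

    # Shannon entropy of the segment proportions; empty segments contribute 0.
    entropy = -sum((s / total) * math.log(s / total) for s in segments if s > 0)
    return entropy / math.log(len(segments))

# Day-of-month numbers for the week 10.07 - 17.07 from the question.
spread = evenness([11, 13, 15], 10, 17)
clumped = evenness([11, 11, 11], 10, 17)
print(spread, clumped)
```

With the dates from the question, the spread-out events score close to 1 and the three events on 11.07 score much lower. Durations are ignored here; only the start times enter the score.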
Related
I received readings of multiple filters for a time series x. Those filters are differences of EMAs and of EMAs on EMAs. My filters have been calculated as
F1 = EMA(x, period.short) - EMA(x, period.long) and
F2 = ZMA(x, period.short) - ZMA(x, period.long).
ZMA is a delagged EMA which is calculated as EMA(EMA(x, period), period). Unfortunately the original values of the time series x have been lost, so I wonder whether it is possible to reconstruct the original time series. I suppose that an exact reconstruction won't be possible; however, to get a rough idea, I'd already be happy with one possible time series history that is consistent with the filter values at the same time points.
Therefore I wonder how such a reconstruction could be done efficiently and how it could be implemented (if possible in R). Maybe there are already packages available for this purpose?
I'd be happy about any advice. Many thanks in advance!
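Because the EMA recursion is linear, F1 can be inverted step by step once a starting level is assumed. The following is a minimal sketch in Python rather than R, under assumptions not stated in the question: the smoothing factor is alpha = 2 / (period + 1), both EMAs are seeded with the first observation, and the starting value `x0` is a guess supplied by the caller. All function names are hypothetical. Since F1 is zero at the first point under this seeding, the level of the series cannot be recovered from F1 alone, so every choice of `x0` gives one possible history.

```python
def ema(values, period):
    """EMA seeded with the first value; alpha = 2 / (period + 1) is an assumption."""
    alpha = 2.0 / (period + 1)
    out = [values[0]]
    for x in values[1:]:
        out.append(alpha * x + (1 - alpha) * out[-1])
    return out

def ema_difference(values, short, long):
    """F1 = EMA(x, short) - EMA(x, long)."""
    return [s - l for s, l in zip(ema(values, short), ema(values, long))]

def reconstruct_from_f1(f1, short, long, x0):
    """Invert F1 one step at a time, given an assumed starting value x0.

    From F1_t = (a_s - a_l) * x_t + (1 - a_s) * ES_{t-1} - (1 - a_l) * EL_{t-1}
    we can solve for x_t when the previous EMA states are known.
    """
    if short == long:
        raise ValueError("short and long periods must differ")
    a_s = 2.0 / (short + 1)
    a_l = 2.0 / (long + 1)
    es = el = x0          # both EMAs start at the assumed first value
    xs = [x0]
    for f in f1[1:]:
        x = (f - (1 - a_s) * es + (1 - a_l) * el) / (a_s - a_l)
        es = a_s * x + (1 - a_s) * es
        el = a_l * x + (1 - a_l) * el
        xs.append(x)
    return xs

# Round trip on made-up data: filter a series, then rebuild it from F1.
original = [10.0, 11.0, 12.5, 12.0, 13.0, 15.0, 14.0]
f1 = ema_difference(original, short=3, long=8)
rebuilt = reconstruct_from_f1(f1, short=3, long=8, x0=original[0])
print(rebuilt)
```

With real filter readings, rounding errors in F1 accumulate over time in the rebuilt series, and a different seeding convention in the original calculation would change the result. F2 could be used as a consistency check on the chosen `x0`, which is not shown here.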
I'm quite new to time series and I am wondering what the best way is to identify the starting date of a period with low values of a variable. In this example, as a first step I want to i) identify whether there is such a period of, say, at least 5 values that are similarly low and ii) find the starting date of this period.
So in this example (https://i.stack.imgur.com/IxLQg.png) for the first 3 individuals (c15793, c15798 and c3556) I want to figure out that there is such a period and that the starting date is on the 20th of May for c15798 and on the 22nd of May for the other two. But c5157 should be identified as not having such a period.
I have no clue on how I could identify such a period and I was hoping someone would have an idea and point me to a method or a point where I could start. Everything I can think of would require some sort of threshold (e.g. the difference between consecutive measurements) which I don't know how to choose. So if anyone has a more elegant idea or a good idea on how to set a threshold, I would be more than happy to learn about it.
Thanks so much in advance!
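The threshold-based idea mentioned above can be sketched as a simple run detector. This is only an illustration: the function name `first_low_run` is hypothetical, the threshold still has to be chosen by the caller, and the sample values are made up rather than read from the plot.

```python
def first_low_run(values, threshold, min_length=5):
    """Return the index where the first run of at least `min_length`
    consecutive values <= threshold starts, or None if there is no such run."""
    run_start = None
    for i, value in enumerate(values):
        if value <= threshold:
            if run_start is None:
                run_start = i          # a new candidate run begins here
            if i - run_start + 1 >= min_length:
                return run_start       # run is long enough
        else:
            run_start = None           # run broken, start over
    return None

# Made-up daily readings: a low period starts at index 3.
readings = [9, 8, 9, 2, 1, 2, 1, 2, 8]
print(first_low_run(readings, threshold=3))
```

The returned index can be mapped back to the date column to get the starting date. A series with no qualifying run yields `None`, which corresponds to the case of c5157.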
I am having great difficulty with this topic and I could use assistance from some experts.
I have a standard time series, available either as an XTS object or as a data frame with numeric columns. The headers are typical: DATE, OPEN, HIGH, LOW, CLOSE, SMA20, SMA50, CROSSOVER.
SMA20 refers to the 20-period simple moving average. CROSSOVER indicates whether the 20-period SMA is above or below the SMA50: if it is positive, SMA20 is above; if it is negative, SMA20 is below.
Here is the problem I am trying to solve: how do I create a separate column to track profits or losses? I want to create a take-profit function at 100 pips and a stop-loss function at 100 pips away from the entry point. The entry point will always be the open of the next day after the crossover happens.
What I am thinking so far is to use the open price and then look at the high and the low for the day: if the difference between the high and the open is greater than 100 pips, the trade gets closed; if it is less, the trade stays open. I have no idea how to begin coding this.
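The take-profit and stop-loss check described above could be sketched as follows, in Python rather than R. The function name, the dictionary layout of the bars and the `pip_size` of 0.0001 are assumptions for illustration. When both levels are touched within the same daily bar, the order is unknown from OHLC data, so this sketch conservatively counts the stop loss first.

```python
def trade_result_pips(bars, entry_index, direction,
                      target_pips=100, stop_pips=100, pip_size=0.0001):
    """Walk forward from the entry bar and return +target_pips, -stop_pips,
    or None if neither level is reached by the end of the data.

    bars:       list of dicts with 'open', 'high', 'low' (hypothetical layout)
    direction:  +1 for a long trade, -1 for a short trade
    """
    entry = bars[entry_index]["open"]
    take_profit = entry + direction * target_pips * pip_size
    stop_loss = entry - direction * stop_pips * pip_size

    for bar in bars[entry_index:]:
        if direction > 0:
            hit_stop = bar["low"] <= stop_loss
            hit_target = bar["high"] >= take_profit
        else:
            hit_stop = bar["high"] >= stop_loss
            hit_target = bar["low"] <= take_profit
        if hit_stop:          # assumed to trigger first if both are hit
            return -stop_pips
        if hit_target:
            return target_pips
    return None               # trade still open

# Made-up bars: entry at the open of the first bar.
bars = [
    {"open": 1.1000, "high": 1.1050, "low": 1.0950},
    {"open": 1.1040, "high": 1.1120, "low": 1.1010},
]
print(trade_result_pips(bars, entry_index=0, direction=+1))
```

Calling this once per crossover, with `entry_index` set to the bar after the crossover and `direction` taken from the sign of CROSSOVER, would fill a profit/loss column.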
I am trying to cluster customers' consumption behavior using time series techniques. Customers buy tokens and use them whenever they want (a maximum of 4 tokens per day).
A sample of the customer journey time series (x = days after first order, y = number of tokens consumed per day) looks like the image below.
I tried clustering with derived variables (median delay between two events, standard deviation of the delays, total number of tokens, time between first and last consumption, mean number of tokens consumed per consumption event, ...). I used K-means and this gave me some good results, but it wasn't enough to spot all patterns in the data. I looked at some papers about the use of dynamic time warping in such cases, but I have never used such algorithms.
Are there any materials (demos) on using such algorithms to cluster time series like these?
Yes.
There are many techniques that can be useful here.
The obvious approach from the literature would be hierarchical agglomerative clustering (HAC) with dynamic time warping (DTW) as the distance measure.
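As a rough illustration of that combination, here is a minimal pure-Python sketch: a textbook DTW distance with absolute difference as the local cost, followed by average-linkage agglomerative clustering on the pairwise DTW distances. The function names, the choice of average linkage and the sample series are assumptions; for real data a dedicated library would be faster.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance with |x - y| as local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # step in a only
                                 cost[i][j - 1],      # step in b only
                                 cost[i - 1][j - 1])  # step in both
    return cost[n][m]

def hac_average_linkage(series, n_clusters):
    """Merge the two closest clusters (average DTW distance between members)
    until only `n_clusters` remain. Returns lists of series indices."""
    dist = [[dtw_distance(a, b) for b in series] for a in series]
    clusters = [[i] for i in range(len(series))]
    while len(clusters) > n_clusters:
        best = None
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                total = sum(dist[i][j] for i in clusters[p] for j in clusters[q])
                avg = total / (len(clusters[p]) * len(clusters[q]))
                if best is None or avg < best[0]:
                    best = (avg, p, q)
        _, p, q = best
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return clusters

# Made-up token consumption per day for four customers.
journeys = [
    [0, 1, 4, 1, 0, 0],   # one burst, early
    [0, 0, 1, 4, 1, 0],   # same burst, shifted by a day
    [2, 2, 2, 2, 2, 2],   # steady consumer
    [2, 2, 2, 2, 2, 1],   # steady consumer
]
print(hac_average_linkage(journeys, n_clusters=2))
```

DTW treats the two shifted bursts as identical in shape, which is what lets them end up in the same cluster even though a day-by-day comparison would see them as different.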
I have several queue lengths stored as gauges in Graphite and I need to form a query URL which checks for a certain trend. Specifically, I need to know if a queue is:
...growing at a rate of more than 15% over three consecutive 1 hour periods.
I have tried various approaches including:
summarize(derivative(queue.length),"3h","avg")
...and various options with a timeStack(queue.length,"1h",3) series, but I just can't work it out.
Any suggestions appreciated.
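If the check cannot be expressed in a single query, one fallback is to fetch hourly values and test the condition client-side. The following minimal sketch assumes one reading of the requirement, namely that the queue length rises by more than 15% in each of the last three 1-hour periods. The function name and the sample values are hypothetical, and fetching the data from Graphite is not shown.

```python
def grows_each_period(samples, min_growth=0.15, periods=3):
    """True if each of the last `periods` steps between consecutive samples
    shows relative growth strictly greater than `min_growth`.

    `samples` are queue lengths taken one hour apart, oldest first.
    A non-positive previous value makes relative growth undefined and is
    treated as 'condition not met' in this sketch.
    """
    if len(samples) < periods + 1:
        return False
    recent = samples[-(periods + 1):]
    for previous, current in zip(recent, recent[1:]):
        if previous <= 0:
            return False
        if (current - previous) / previous <= min_growth:
            return False
    return True

# Made-up hourly queue lengths, oldest first.
print(grows_each_period([100, 120, 140, 165]))
print(grows_each_period([100, 120, 130, 165]))
```

The first sample series grows by more than 15% in every hour, while the second has one hour with smaller growth and therefore does not qualify.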