I have to draw the average of the data points from the last 10 hours. I get a data point every 5 minutes, so essentially I have to draw the average of the last 12*10 = 120 data points.
Suppose I have "delay" as a data point; at every point it makes more sense to draw the average delay over the last 10 hours instead of plotting the current delay.
I tried the Average(), sum() and summarize() functions, but as far as I can tell they do not achieve this.
Any help on this?
Can you take advantage of the movingAverage function within Graphite?
An example of a 10-hour moving average would be the following:
&target=movingAverage(datapoint.name.delay,'10hour')
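If you instead want fixed, non-overlapping 10-hour buckets rather than a rolling window, summarize with an average should also work (metric path assumed to match your setup):
&target=summarize(datapoint.name.delay,'10hour','avg')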
I want to know when a time series stabilizes after its peak.
This time series shows a peak, and then it goes down (see images below (1); (2)).
I would like to calculate the moment at which this time-series stabilizes (becomes flatter) after its peak.
In an ideal world, the data would go down to 0 and stay there. But as you see, it neither reaches 0 nor stays 100% stable.
I thought of various possibilities:
-Calculate a tangent point to detect the change in slope. But the data has many small ups and downs, even after smoothing.
-Calculate the average of the tail (the end of the series; e.g. if time = 8000, take the mean of the last 2000 values), then build an interval (a margin of +/- around this value), then find the time at which the first value falls inside this interval (see the rough sketch after this list).
-Calculate pronounced changes in the trend or slope.
*Maybe you have a better idea I did not consider. Feel free to share it if you've already dealt with this in the past.
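For the second idea, I imagine something roughly like this untested Python sketch (the 2000-value tail and the 5% margin are just guesses):

import numpy as np

def stabilization_time(t, y, tail=2000, margin=0.05):
    peak = np.argmax(y)                     # the peak is easy: the max value
    t_after, y_after = t[peak:], y[peak:]   # ignore everything before the peak
    level = y_after[-tail:].mean()          # mean of the tail of the series
    band = margin * (y[peak] - level)       # +/- margin around that level
    inside = np.abs(y_after - level) <= band
    return t_after[np.argmax(inside)]       # time of the first value inside the band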
I need to know the time at which this stabilization happened.
Ideally, you could mute/ignore all values before the peak value, but without deleting these rows (time should stay). Calculating the peak is easy (max value).
I also standardized the data so it starts at y=0 (I have various time series, I make them all start at y=0 to compare them later*).
I do not know how to provide the data because it is about 8k values.
I would really appreciate your help.
Thank you very much.
Visual example of the data
I used a drone to create a DOF of a small area. During the flight, it takes a photo every 20-ish seconds (roughly every 40 meters of flight). I have created a CSV file, which I converted to a point shapefile. In total I flew 10 so-called "missions" with the drone, each with 100-200 points, which appear as squares on the map. What I want now is to create a polygon shapefile from the point shapefile.
Because those points sometimes overlap, I cannot use the "Aggregate Points" task, as it is only distance-based. I want to make the polygons automatically, using some kind of script. What could help is the fact that the maximum time between two points (i.e. photos taken) within a mission is 10-20 seconds, so if the time gap is over 3 minutes, it's another "mission". Can you help with a script that would quickly and automatically create as many polygons as there are missions?
Okay, I think I understand what you are trying to accomplish. Since no one replied I am going to give it a quick shot, so you have something to try.
I think the best strategy would be to:
Clustering algorithm: Try running a clustering algorithm such as DBSCAN on the timestamp dimension to classify the points into time-based groups instead of distance-based ones (since, as you said, distance-based separation is not enough to properly identify and separate the points). Afterwards, every point should be classified into a group, with a group id column. The maximum-distance parameter of the algorithm should be around 20 seconds, or even a minute (since you said the missions are separated by at least about 3 minutes).
Feature-based points to polygon: Then you run a generic Polygon_from_points(...)-style function that turns the clustered points into polygon shapes based on a specific discriminating feature (which in your case is the group id).
How does this work? This properly separates the groups first (time-based), and then you should be able to find a generic points-to-polygon tool that works off a feature (ArcGIS should have some).
I don't have an example dataset to test against, but based on what you described I think it would work; a rough sketch of the clustering step is below. Hope it helps.
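Something like this rough, untested Python sketch, where the column names ("timestamp", "lon", "lat"), the 60-second eps and the convex-hull step are all assumptions you would need to adapt:

import pandas as pd
from sklearn.cluster import DBSCAN
from shapely.geometry import MultiPoint

# Load the photo points; "timestamp", "lon", "lat" are assumed column names.
points = pd.read_csv("photo_points.csv", parse_dates=["timestamp"])
seconds = points["timestamp"].astype("int64") // 10**9   # epoch seconds

# Cluster on time only: points closer than 60 s in time end up in the same mission.
labels = DBSCAN(eps=60, min_samples=3).fit_predict(seconds.to_numpy().reshape(-1, 1))
points["mission_id"] = labels                             # -1 would mean noise

# One polygon per mission, here simply the convex hull of its points.
polygons = {
    mid: MultiPoint(list(zip(grp["lon"], grp["lat"]))).convex_hull
    for mid, grp in points.groupby("mission_id")
    if mid != -1
}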
I have lat/lng data from multirotor UAV flights. There are a lot of data points (~13k per flight) and I wish to extract line segments from the data; they give me flight speed and direction. I know that most of the flights are guided missions, meaning a waypoint is given to fly to. However, the exact waypoints are unknown to me.
Here is a graph of a single flight lat/lng shifted to near (0,0) so they are visible on the same time-series graph.
I attempted to generate similar data, but there are several constraints and it may take more time to solve than working on the segmenting.
The graphs start and end nearly always at the same point.
Horizontal lines mean the UAV is stationary. These segments are expected.
The beginning and end are always stationary, for takeoff and landing.
There is some noise in the lines due to GPS accuracy, though seemingly not that much.
A lot of data points.
The number of segments is unknown.
I could estimate the noise, given the segments, by least-squares fitting a line to each one. Currently I'm thinking of sampling the data (to decimate it a little) and constructing lines, merging lines whose angle differs by less than x (dependent on the noise), and finding the intersection points of the lines that are left.
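A rough, untested sketch of what I mean (the decimation step and the angle threshold are just placeholders):

import numpy as np

def segment_track(latlng, step=50, angle_thresh_deg=5.0):
    pts = np.asarray(latlng)[::step]                    # decimate the ~13k points
    d = np.diff(pts, axis=0)                            # chords between kept points
    headings = np.degrees(np.arctan2(d[:, 1], d[:, 0]))
    breaks = [0]
    for i in range(1, len(headings)):
        # keep a corner only where the heading changes by more than the threshold
        turn = abs((headings[i] - headings[i - 1] + 180) % 360 - 180)
        if turn > angle_thresh_deg:
            breaks.append(i)
    breaks.append(len(pts) - 1)
    # each (start, end) index pair is one candidate straight segment
    return [(breaks[k], breaks[k + 1]) for k in range(len(breaks) - 1)]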
Another thought is to look at this problem in the frequency domain. The corners should be quite high frequency. Maybe I could make a custom filter kernel that would let me use a window function and win in efficiency.
EDIT: Rewrote the question for more clarity and less rambling.
I'm using Graphite and Grafana and I'm trying to plot a series against a time shifted version of itself for comparison.
(I.e. is the current value similar to this time last week?)
What I'd like to do is plot:
the 5 minute moving average of the series
a band consisting of the 5 minute moving average of the series timeshifted by 7 days, bounded above and below by the standard deviation of itself
That way I can see if the current moving average falls within a band limited by the standard deviation of the moving average from a week ago.
I have managed to produce a band based on the timeshifted moving average, but only by offsetting either side by a constant amount. I can't work out any way of offsetting by the standard deviation (or indeed by any dynamic value).
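For reference, this is roughly what I have at the moment (the metric name and the ±10 constant are placeholders):

&target=movingAverage(my.metric, '5min')
&target=offset(timeShift(movingAverage(my.metric, '5min'), '7d'), 10)
&target=offset(timeShift(movingAverage(my.metric, '5min'), '7d'), -10)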
I've copied a screenshot of the sort of thing I'm trying to achieve. The yellow line is the current moving average, the green area is bounded by the historical moving average offset either side by the standard deviation.
Is this possible at all in Grafana using Graphite as the backend?
I'm not quite on the latest version, but can easily upgrade (and will do so shortly anyway).
Incidentally, I'm not a statistician; if what I'm doing actually makes no sense mathematically, I'd love to know! ;-) My overall goal is to explore better alternatives to static thresholds for highlighting anomalous or problematic server performance metrics - e.g. CPU load, disk IOPS, etc.
I'm building an app (in Qt) that includes a few dynamic graphs (meaning they refresh to new values rapidly), which get their values from a background thread.
I want the first graph, whose details are important, to refresh at one speed (100 Hz) and 4 other graphs to refresh at a lower speed (10 Hz).
The problem is that when I refresh them all at the same rate (100 Hz), the app can't handle it and the computer freezes, but when the refresh rates are different, the first signal gets artifacts on it (compared to, for example, running them all at 10 Hz).
The artifacts are in the form of waves (instead of a straight line, for example, I get a "snake").
Any suggestions regarding why it has artifacts (rendering limits I guess) and what can be done about it?
I'm writing this as an answer even if this doesn't quite answer your question, because this is too long for a comment.
When the goal is to draw smooth moving graphics, the basic unit of time is the frame. At a 60 Hz drawing rate, a frame is 16.67 ms. The drawing rate needs to match the monitor's refresh rate; drawing faster than the monitor is totally unnecessary.
When drawing graphs, the movement speed of the graph must be kept constant. If you wonder why, walk fast for 1 second, then slow for 1 second, then fast for 1 second, and so on. That doesn't look smooth.
Let's say the data sample rate is 60 Hz and each sample is represented as one pixel. In each frame all new samples (in this case 1 sample) are drawn and the graph moves one pixel. The movement speed is one pixel per frame, in every frame. The speed is constant, and the graph looks very smooth.
But if the data sample rate is 100 Hz, then during one second 2 pixels are drawn in 40 frames and 1 pixel is drawn in 20 frames. Now the graph movement speed is not constant anymore; it varies like this: 2, 2, 1, 2, 2, 1, ... pixels per frame. That looks bad. You might think that the frame time is so small (16.67 ms) that you can't see this kind of small variation, but it is very clearly visible. Even a single frame at a varying speed can be seen.
So how is data with a 100 Hz sample rate drawn smoothly? By keeping the speed constant, which in this case would be 1.67 (100/60) pixels per frame. That of course requires subpixel drawing. So in every frame the graph moves by 1.67 pixels. If some samples are missing at the time of drawing, they are simply not drawn. In practice, that will happen quite often; for example, USB data acquisition cards can deliver data samples in bursts.
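For example, here is a tiny sketch (plain Python, not Qt) of the bookkeeping that keeping the speed constant implies, assuming 100 Hz data on a 60 Hz display:

sample_rate = 100.0                       # Hz, incoming data
frame_rate = 60.0                         # Hz, monitor refresh
px_per_frame = sample_rate / frame_rate   # constant scroll speed, ~1.67 px/frame

scroll = 0.0
drawn = 0
for frame in range(6):
    scroll += px_per_frame                # the graph always advances 1.67 px
    new_samples = int(scroll) - drawn     # whole samples that become visible
    drawn += new_samples
    print(f"frame {frame}: scroll {scroll:.2f} px, draw {new_samples} new samples")

The scroll amount is the same every frame; only the number of newly drawn samples alternates between 1 and 2.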
What if the graph drawing is so slow that it cannot be done at 60 Hz? Then the next best option is to draw at 30 Hz. Then you are drawing one frame for every 2 images the monitor draws. The 3rd best option is 20 Hz (one frame for every 3 images the monitor draws), then 15 Hz (one frame for every 4 images) and so on. Drawing at 30 Hz does not look as smooth as drawing at 60 Hz, but the speed can still be kept constant and it looks better than drawing faster with varying speed.
In your case, the drawing rate of 20 Hz would probably be quite good. In each frame there would be 5 new data samples (if you can get the samples at a constant 100 Hz).