Graphite movingAverage for only existing datapoints - graphite

Let's say I have the following metric:
http://i.imgur.com/Ssn3FVM.png
I want to calculate the moving average for last reported 5 values.
Unfortunately, the movingAverage(series,5) function is taking into account the Null, or None values, and not producing movingAverage for the visible data:
http://i.imgur.com/bC8Ibhw.png
I want to see the moving average for ONLY the X last existing (that is, visible) data points (therefore, not Null, or None -- on the graph above, those would be the visible 100's and 0's).
Can this be achieved in Graphite? If so, what function could be used?

Related

Turning a band of Sentinel 2 image into an array

I am new to Google Earth Engine and have started playing with mathematically combining different bands to define new index. The problem I am having is the visualisation of the new index - I need to define the max and min parameter when adding it to the map, and I am having troubles understanding what these two end points should be. So here come my two questions:
Is it possible to get the matrix of my image in terms of pixel values? Then I could easily see from what values they range and hence could define min and max!
What values are taken in different bands? Is it from 0 to 1 and measures intensity at given wavelength, or is it something else?
Any help would be much appreciated, many thanks in advance!
Is it possible to get the matrix of my image in terms of pixel values? Then I could easily see from what values they range and hence could define min and max!
If this is what you want to do, there's a built in way to do it. Go to the layer list, click on the gear for the layer, and in the “Range” section, pick one of the “Stretch:” options from the menu, then click “Apply”. You can choose a range in standard deviations, or 100% (min and max).
You can then use the “Import” button to save these parameters as a value you can use in your script.
(All of this applies to the region of the image that's currently visible on screen — not the entire image.)
What values are taken in different bands? Is it from 0 to 1 and measures intensity at given wavelength, or is it something else?
This is entirely up to the individual dataset you are using; Earth Engine only knows about numbers stored in bands and not units of measure or spectra. There may be sufficient information in the dataset's description in the data catalog, or you may need to consult the original provider's documentation.

How would I fit a polynomial to a given set of points, that falls inside a given range? (ideally done manually without computer)

Let's say I've fitted a polynomial to a set of 7 data points using gaussian elimination to bring a matrix to row echelon and then reduced row echelon form. I've done this all by hand, and when graphed, the polynomial goes through each point. Success! BUT, the polynomial goes too far up in between a couple of these points. Ideally, the polynomial doesn't go above the highest data point, or below the lowest data point. I don't care what it does outside of the domain of the data points. Right now it goes far above the highest data point, so it is effectively useless for my case.
Is there any way I can redo these calculations, but in a way that ensures the polynomial falls within a given range (inside the domain of the data points)? After calculating the polynomial, I can restrict domain so that it doesn't extrapolate outside of the given data, but I CAN'T restrict range because it will make the function discontinuous.
Ideally I can do all of this by hand, without a computer, but I'm open to other options.
Thanks!

Detect peaks at beginning and end of x-axis

I've been working on detecting peaks within a data set of thousands of y~x relationships. Thanks to this post, I've been using loess and rollapply to detect peaks by comparing the local maximum to the smooth. Since, I've been working to optimise the span and w thresholds for loess and rollapply functions, respectively.
However, I have realised that several of my relationships have a peak at the beginning or the end on the x-axis, which are of my interest. But these peaks are not being identified. For now, I've tried to add fake variables outside of my x variable range to imitate a peak. For example, if my x values range from -50 to 160, I created x values of -100 and 210 and assigned a 0 y value to them.
This helped me to identify some of the relationships that have a peak at the beginning or the end. As you can see here:
However, for some it does not work.
Despite the fact that I feel uncomfortable adding 'fake' values to the relationship, the smoothing shifts the location of the peak frequently and more importantly, I cannot find a solution that allows to detect these beginning or end peaks. Does anyone know how to work out a solution that works in R?

Accurately measuring relative distance between a set of fiducials (Augmented reality application)

Let's say I have a set of 5 markers. I am trying to find the relative distances between each marker using an augmented reality framework such as ARToolkit. In my camera feed thee first 20 frames show me the first 2 markers only so I can work out the transformation between the 2 markers. The second 20 frames show me the 2nd and 3rd markers only and so on. The last 20 frames show me the 5th and 1st markers. I want to build up a 3D map of the marker positions of all 5 markers.
My question is, knowing that there will be inaccuracies with the distances due to low quality of the video feed, how do I minimise the inaccuracies given all the information I have gathered?
My naive approach would be to use the first marker as a base point, from the first 20 frames take the mean of the transformations and place the 2nd marker and so forth for the 3rd and 4th. For the 5th marker place it inbetween the 4th and 1st by placing it in the middle of the mean of the transformations between the 5th and 1st and the 4th and 5th. This approach I feel has a bias towards the first marker placement though and doesn't take into account the camera seeing more than 2 markers per frame.
Ultimately I want my system to be able to work out the map of x number of markers. In any given frame up to x markers can appear and there are non-systemic errors due to the image quality.
Any help regarding the correct approach to this problem would be greatly appreciated.
Edit:
More information regarding the problem:
Lets say the realworld map is as follows:
Lets say I get 100 readings for each of the transformations between the points as represented by the arrows in the image. The real values are written above the arrows.
The values I obtain have some error (assumed to follow a gaussian distribution about the actual value). For instance one of the readings obtained for marker 1 to 2 could be x:9.8 y:0.09. Given I have all these readings how do I estimate the map. The result should ideally be as close to the real values as possible.
My naive approach has the following problem. If the average of the transforms from 1 to 2 is slightly off the placement of 3 can be off even though the reading of 2 to 3 is very accurate. This problem is shown below:
The greens are the actual values, the blacks are the calculated values. The average transform of 1 to 2 is x:10 y:2.
You can use a least-squares method, to find the transformation that gives the best fit to all your data. If all you want is the distance between the markers, this is just the average of the distances measured.
Assuming that your marker positions are fixed (e.g., to a fixed rigid body), and you want their relative position, then you can simply record their positions and average them. If there is a potential for confusing one marker with another, you can track them from frame to frame, and use the continuity of each marker location between its two periods to confirm its identity.
If you expect your rigid body to be moving (or if the body is not rigid, and so forth), then your problem is significantly harder. Two markers at a time is not sufficient to fix the position of a rigid body (which requires three). However, note that, at each transition, you have the location of the old marker, the new marker, and the continuous marker, at almost the same time. If you already have an expected location on the body for each of your markers, this should provide a good estimate of a rigid pose every 20 frames.
In general, if your body is moving, best performance will require some kind of model for its dynamics, which should be used to track its pose over time. Given a dynamic model, you can use a Kalman filter to do the tracking; Kalman filters are well-adapted to integrating the kind of data you describe.
By including the locations of your markers as part of the Kalman state vector, you may be able to be able to deduce their relative locations from purely sensor data (which appears to be your goal), rather than requiring this information a priori. If you want to be able to handle an arbitrary number of markers efficiently, you may need to come up with some clever mutation of the usual methods; your problem seems designed to avoid solution by conventional decomposition methods such as sequential Kalman filtering.
Edit, as per the comments below:
If your markers yield a full 3D pose (instead of just a 3D position), the additional data will make it easier to maintain accurate information about the object you are tracking. However, the recommendations above still apply:
If the labeled body is fixed, use a least-squares fit of all relevant frame data.
If the labeled body is moving, model its dynamics and use a Kalman filter.
New points that come to mind:
Trying to manage a chain of relative transformations may not be the best way to approach the problem; as you note, it is prone to accumulated error. However, it is not necessarily a bad way, either, as long as you can implement the necessary math in that framework.
In particular, a least-squares fit should work perfectly well with a chain or ring of relative poses.
In any case, for either a least-squares fit or for Kalman filter tracking, a good estimate of the uncertainty of your measurements will improve performance.

Minimising interpolation error between two data sets

In the top of the diagrams below we can see some value (y-axis) changing over time (x-axis).
As this happens we are sampling the value at different and unpredictable times, also we are alternating the sampling between two data sets, indicated by red and blue.
When computing the value at any time, we expect that both red and blue data sets will return similar values. However as shown in the three smaller boxes this is not the case. Viewed over time the values from each data set (red and blue) will appear to diverge and then converge about the original value.
Initially I used linear interpolation to obtain a value, next I tried using Catmull-Rom interpolation. The former results in a values come close together and then drift apart between each data point; the latter results in values which remain closer, but where the average error is greater.
Can anyone suggest another strategy or interpolation method which will provide greater smoothing (perhaps by using a greater number of sample points from each data set)?
I believe what you ask is a question that does not have a straight answer without further knowledge on the underlying sampled process. By its nature, the value of the function between samples can be merely anything, so I think there is no way to assure the convergence of the interpolations of two sample arrays.
That said, if you have a prior knowledge of the underlying process, then you can choose among several interpolation methods to minimize the errors. For example, if you measure the drag force as a function of the wing velocity, you know the relation is square (a*V^2). Then you can choose polynomial fitting of the 2nd order and have pretty good match between the interpolations of the two serieses.
Try B-splines: Catmull-Rom interpolates (goes through the data points), B-spline does smoothing.
For example, for uniformly-spaced data (not your case)
Bspline(t) = (data(t-1) + 4*data(t) + data(t+1)) / 6
Of course the interpolated red / blue curves depend on the spacing of the red / blue data points,
so cannot match perfectly.
I'd like to quote Introduction to Catmull-Rom Splines to suggest not using Catmull-Rom for this interpolation task.
One of the features of the Catmull-Rom
spline is that the specified curve
will pass through all of the control
points - this is not true of all types
of splines.
By definition your red interpolated curve will pass through all red data points and your blue interpolated curve will pass through all blue points. Therefore you won't get a best fit for both data sets.
You might change your boundary conditions and use data points from both data sets for a piecewise approximation as shown in these slides.
I agree with ysap that this question cannot be answered as you may be expecting. There may be better interpolation methods, depending on your model dynamics - as with ysap, I recommend methods that utilize the underlying dynamics, if known.
Regarding the red/blue samples, I think you have made a good observation about sampled and interpolated data sets and I would challenge your original expectation that:
When computing the value at any time, we expect that both red and blue data sets will return similar values.
I do not expect this. If you assume that you cannot perfectly interpolate - and particularly if the interpolation error is large compared to the errors in samples - then you are certain to have a continuous error function that exhibits largest errors longest (time) from your sample points. Therefore two data sets that have differing sample points should exhibit the behaviour you see because points that are far (in time) from red sample points may be near (in time) to blue sample points and vice versa - if staggered as your points are, this is sure to be true. Thus I would expect what you show, that:
Viewed over time the values from each data set (red and blue) will appear to diverge and then converge about the original value.
(If you do not have information about underlying dynamics (except frequency content), then Giacomo's points on sampling are key - however, you need not interpolate if looking at info below Nyquist.)
When sampling the original continuous function, the sampling frequency should comply to the Nyquist-Shannon sampling theorem, otherwise the sampling process introduces an error (also known as aliasing). The error, being different in the two datasets, results in a different value when you interpolate.
Therefore, you need to know the highest frequency B of the original function and then collect samples with a frequency at least 2B. If your function has very high frequencies and you cannot sample that fast, you should at least try to filter them away before sampling.

Resources