Fitting an L shape (corner) to points to remove outliers - math

I am trying to extract the length and width from a set of lidar sensor points (pink), as shown in the image below. The points circled in blue and white are actually noise, which I wish to eliminate. [The orange box is the length and width I have currently calculated from the points. As seen, the calculated width is about 1/3 wider than it is supposed to be, due to the noisy points in blue and white.]
I've read some approaches to do corner/rectangle fitting, then discarding x% of the poorest fitting points. This would help me to get rid of the circled points. But so far, after many searches I still cannot find any concrete implementations on how to do this fitting.
Can anyone provide any suggestions on how I can go about doing this?

Related

Area of n overlapping circle [duplicate]

I recently came across a problem where I had four circles (midpoints and radius) and had to calculate the area of the union of these circles.
Example image:
For two circles it's quite easy,
I can just calculate the fraction of each circle's area that is not within the triangles and then calculate the area of the triangles.
But is there a clever algorithm I can use when there are more than two circles?
Find all circle intersections on the outer perimeter (e.g. B,D,F,H on the following diagram). Connect them together with the centres of the corresponding circles to form a polygon. The area of the union of the circles is the area of the polygon + the area of the circle slices defined by consecutive intersection points and the circle center in between them. You'll need to also account for any holes.
I'm sure there is a clever algorithm, but here's a dumb one to save having to look for it;
put a bounding box around the circles;
generate random points within the bounding box;
figure out whether the random point is inside one of the circles;
compute the area by some simple addition and division (proportion_of_points_inside*area_of_bounding_box).
Sure it's dumb, but:
you can get as accurate an answer as you want, just generate more points;
it will work for any shapes for which you can calculate the inside/outside distinction;
it will parallelise beautifully so you can use all your cores.
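The "dumb" steps above can be sketched roughly as follows. This is only a minimal illustration: the function name `monteCarloUnionArea`, the `{x, y, r}` circle representation and the optional `rand` hook are assumptions made for the example, not part of the answer.

```javascript
// Monte Carlo estimate of the area of a union of circles.
// circles: array of {x, y, r} (assumed representation); samples: number of random points.
// rand is a hypothetical hook so a seeded generator can be swapped in.
function monteCarloUnionArea(circles, samples, rand = Math.random) {
  // Step 1: bounding box around all the circles
  let minX = Infinity, minY = Infinity, maxX = -Infinity, maxY = -Infinity;
  for (const c of circles) {
    minX = Math.min(minX, c.x - c.r);
    minY = Math.min(minY, c.y - c.r);
    maxX = Math.max(maxX, c.x + c.r);
    maxY = Math.max(maxY, c.y + c.r);
  }
  const boxArea = (maxX - minX) * (maxY - minY);

  // Steps 2 and 3: generate random points and test them against every circle
  let inside = 0;
  for (let i = 0; i < samples; i++) {
    const px = minX + rand() * (maxX - minX);
    const py = minY + rand() * (maxY - minY);
    if (circles.some(c => (px - c.x) ** 2 + (py - c.y) ** 2 <= c.r * c.r)) {
      inside++;
    }
  }

  // Step 4: proportion of points inside * area of bounding box
  return (inside / samples) * boxArea;
}
```

The error shrinks roughly with the square root of the number of samples, so each extra digit of accuracy costs about 100 times more points.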
Ants Aasma's answer gave the basic idea, but I wanted to make it a little more concrete. Take a look at the five circles below and the way they've been decomposed.
The blue dots are circle centers.
The red dots are circle boundary intersections.
The red dots with white interior are circle boundary intersections that are not contained in any other circles.
Identifying these 3 types of dots is easy. Now construct a graph data structure where the nodes are the blue dots and the red dots with white interior. For every circle, put an edge between the circle middle (blue dot) and each of its intersections (red dots with white interior) on its boundary.
This decomposes the circle union into a set of polygons (shaded blue) and circular pie pieces (shaded green) that are pairwise disjoint and cover the original union (that is, a partition). Since each piece here is something that's easy to compute the area of, you can compute the area of the union by summing the pieces' areas.
For a different solution from the previous one you could produce an estimation with an arbitrary precision using a quadtree.
This also works for any shape union if you can tell if a square is inside or outside or intersects the shape.
Each cell has one of the states : empty , full , partial
The algorithm consists in "drawing" the circles in the quadtree starting with a low resolution ( 4 cells for instance marked as empty). Each cell is either :
inside at least one circle, then mark the cell as full,
outside all circles, mark the cell as empty,
else mark the cell as partial.
When it's done, you can compute an estimate of the area: the full cells give the lower bound, the full plus partial cells (that is, everything except the empty cells) give the upper bound, and the partial cells give the maximum area error.
If the error is too big for you, you refine the partial cells until you get the right precision.
I think this will be easier to implement than the geometric method which may require to handle a lot of special cases.
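A rough sketch of the quadtree idea, under some simplifying assumptions: circles are `{x, y, r}` objects, the tree is never stored (cells are visited recursively and only the areas are accumulated), and a cell counts as full only when it lies inside a single circle. The name `quadtreeAreaBounds` is made up for the example.

```javascript
// Lower and upper bounds on the area of a union of circles by recursive subdivision.
function quadtreeAreaBounds(circles, maxDepth) {
  // Bounding square around all the circles
  let minX = Infinity, minY = Infinity, maxX = -Infinity, maxY = -Infinity;
  for (const c of circles) {
    minX = Math.min(minX, c.x - c.r);
    minY = Math.min(minY, c.y - c.r);
    maxX = Math.max(maxX, c.x + c.r);
    maxY = Math.max(maxY, c.y + c.r);
  }
  const size = Math.max(maxX - minX, maxY - minY);
  let full = 0;    // area of cells known to be inside a circle
  let partial = 0; // area of undecided cells at the deepest level

  // (x, y) is the lower-left corner of the cell, s its side length
  function visit(x, y, s, depth) {
    let touches = false;
    for (const c of circles) {
      // Farthest corner of the cell from the centre: the cell is full if it is inside
      const fx = Math.max(Math.abs(c.x - x), Math.abs(c.x - (x + s)));
      const fy = Math.max(Math.abs(c.y - y), Math.abs(c.y - (y + s)));
      if (fx * fx + fy * fy <= c.r * c.r) { full += s * s; return; }
      // Nearest point of the cell to the centre: the cell overlaps if it is inside
      const nx = Math.max(x, Math.min(c.x, x + s));
      const ny = Math.max(y, Math.min(c.y, y + s));
      if ((c.x - nx) ** 2 + (c.y - ny) ** 2 < c.r * c.r) touches = true;
    }
    if (!touches) return;                                   // empty cell
    if (depth === maxDepth) { partial += s * s; return; }   // stop refining
    const h = s / 2;                                        // refine the partial cell
    visit(x, y, h, depth + 1);
    visit(x + h, y, h, depth + 1);
    visit(x, y + h, h, depth + 1);
    visit(x + h, y + h, h, depth + 1);
  }

  visit(minX, minY, size, 0);
  return { lower: full, upper: full + partial };
}
```

Increasing `maxDepth` by one roughly halves the gap between the two bounds, since only the cells along the boundary stay partial.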
I love the approach to the case of 2 intersecting circles -- here's how i'd use a slight variation of the same approach for the more complex example.
It might give better insight into generalising the algorithm for larger numbers of semi-overlapping circles.
The difference here is that I start by linking the centres (so there's an edge between the centres of the circles, rather than between the places where the circles intersect). I think this lets it generalise better.
(in practice, maybe the monte-carlo method is worthwhile)
(source: secretGeek.net)
If you want a discrete (as opposed to a continuous) answer, you could do something similar to a pixel painting algorithm.
Draw the circles on a grid, and then color each cell of the grid if it's mostly contained within a circle (i.e., at least 50% of its area is inside one of the circles). Do this for the entire grid (where the grid spans all of the area covered by the circles), then count the number of colored cells in the grid.
Hmm, very interesting problem. My approach would probably be something along the lines of the following:
Work out a way of calculating the area of intersection between an arbitrary number of circles, i.e. if I have 3 circles, I need to be able to work out what the intersection between those circles is. The "Monte-Carlo" method would be a good way of approximating this (http://local.wasp.uwa.edu.au/~pbourke/geometry/circlearea/).
Eliminate any circles that are contained entirely in another, larger circle (compare the radii with the distance between the centres of the two circles). I don't think this step is mandatory.
Choose 2 circles (call them A and B) and work out the total area using this formula:
(this is true for any shape, be it circle or otherwise)
area(A∪B) = area(A) + area(B) - area(A∩B)
Where A ∪ B means A union B and A ∩ B means A intersect B (you can work this out from the first step).
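For two circles the intersection term has a closed form (the sum of two circular segments), so the formula above can be evaluated exactly. A small sketch, assuming circles are given as `{x, y, r}` objects; the function names are invented for the example:

```javascript
// Exact area of the intersection of two circles (a "lens").
function intersectionArea(a, b) {
  const d = Math.hypot(b.x - a.x, b.y - a.y);
  if (d >= a.r + b.r) return 0;                      // disjoint or touching
  if (d <= Math.abs(a.r - b.r)) {                    // one circle inside the other
    return Math.PI * Math.min(a.r, b.r) ** 2;
  }
  // Half-angles subtended by the common chord at each centre
  const alpha = Math.acos((d * d + a.r * a.r - b.r * b.r) / (2 * d * a.r));
  const beta = Math.acos((d * d + b.r * b.r - a.r * a.r) / (2 * d * b.r));
  // Each circular segment has area r^2 * (theta - sin(2 * theta) / 2)
  return a.r * a.r * (alpha - Math.sin(2 * alpha) / 2) +
         b.r * b.r * (beta - Math.sin(2 * beta) / 2);
}

// area(A ∪ B) = area(A) + area(B) - area(A ∩ B)
function unionAreaOfTwo(a, b) {
  return Math.PI * a.r * a.r + Math.PI * b.r * b.r - intersectionArea(a, b);
}
```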
Now keep on adding circles and keep on working out the area added as a sum / subtraction of areas of circles and areas of intersections between circles. For example for 3 circles (call the extra circle C) we work out the area using this formula:
(This is the same as above where A has been replaced with A∪B)
area((A∪B)∪C) = area(A∪B) + area(C) - area((A∪B)∩C)
Where area(A∪B) we just worked out, and area((A∪B)∩C) can be found:
area((A∪B)∩C) = area((A∩C)∪(B∩C)) = area(A∩C) + area(B∩C) - area((A∩C)∩(B∩C)) = area(A∩C) + area(B∩C) - area(A∩B∩C)
Where again you can find area(A∩B∩C) from above.
The tricky bit is the last step - the more circles get added the more complex it becomes. I believe there is an expansion for working out the area of an intersection with a finite union, or alternatively you may be able to recursively work it out.
Also, with regard to using Monte-Carlo to approximate the area of intersection, I believe it's possible to reduce the intersection of an arbitrary number of circles to the intersection of 4 of those circles, which can be calculated exactly (no idea how to do this, however).
There is probably a better way of doing this btw - the complexity increases significantly (possibly exponentially, but I'm not sure) for each extra circle added.
There are efficient solutions to this problem using what are known as power diagrams. This is really heavy math though and not something that I would want to tackle offhand. For an "easy" solution, look up line-sweep algorithms. The basic principle here is that you divide the figure up into strips, where calculating the area in each strip is relatively easy.
So, on the figure containing all of the circles with nothing rubbed out, draw a horizontal line at each position which is either the top of a circle, the bottom of a circle or the intersection of 2 circles. Notice that inside these strips, all of the areas you need to calculate look the same: a "trapezium" with two sides replaced by circular segments. So if you can work out how to calculate such a shape, you just do it for all the individual shapes and add them together. The complexity of this naive approach is O(N^3), where N is the number of circles in the figure. With some clever data structure use, you could improve this line-sweep method to O(N^2 * log(N)), but unless you really need to, it's probably not worth the trouble.
The pixel-painting approach (as suggested by @Loadmaster) is superior to the mathematical solution in a variety of ways:
Implementation is much simpler. The above problem can be solved in less than 100 lines of code, as this JSFiddle solution demonstrates (mostly because it’s conceptually much simpler, and has no edge cases or exceptions to deal with).
It adapts easily to more general problems. It works with any shape, regardless of morphology, as long as it’s renderable with 2D drawing libraries (i.e., “all of them!”) — circles, ellipses, splines, polygons, you name it. Heck, even bitmap images.
The complexity of the pixel-painting solution is ~O[n], as compared to ~O[n*n] for the mathematical solution. This means it will perform better as the number of shapes increases.
And speaking of performance, you’ll often get hardware acceleration for free, as most modern 2D libraries (like HTML5’s canvas, I believe) will offload rendering work to graphics accelerators.
The one downside to pixel-painting is the finite accuracy of the solution. But that is tunable by simply rendering to larger or smaller canvases as the situation demands. Note, too, that anti-aliasing in the 2D rendering code (often turned on by default) will yield better-than-pixel-level accuracy. So, for example, rendering a 100x100 figure into a canvas of the same dimensions should, I think, yield accuracy on the order of 1 / (100 x 100 x 255) = .000039% ... which is probably “good enough” for all but the most demanding problems.
<p>Area computation of arbitrary figures as done thru pixel-painting, in which a complex shape is drawn into an HTML5 canvas and the area determined by comparing the number of white pixels found in the resulting bitmap. See javascript source for details.</p>
<canvas id="canvas" width="80" height="100"></canvas>
<p>Area = <span id="result"></span></p>
// Get HTML canvas element (and context) to draw into
var canvas = document.getElementById('canvas');
var ctx = canvas.getContext('2d');
// Lil' circle drawing utility
function circle(x,y,r) {
  ctx.beginPath();
  ctx.arc(x, y, r, 0, Math.PI*2);
  ctx.fill();
}
// Clear canvas (to black)
ctx.fillStyle = 'black';
ctx.fillRect(0, 0, canvas.width, canvas.height);
// Fill shape (in white)
ctx.fillStyle = 'white';
circle(40, 50, 40);
circle(40, 10, 10);
circle(25, 15, 12);
circle(35, 90, 10);
// Get bitmap data
var id = ctx.getImageData(0, 0, canvas.width, canvas.height);
var pixels = id.data; // Flat array of RGBA bytes
// Determine area by counting the white pixels
for (var i = 0, area = 0; i < pixels.length; i += 4) {
  area += pixels[i]; // Red channel (same as green and blue channels)
}
// Normalize by the max white value of 255
area /= 255;
// Output result
document.getElementById('result').innerHTML = area.toFixed(2);
I have been working on a problem of simulating overlapping star fields, attempting to estimate the true star counts from the actual disk areas in dense fields, where the larger bright stars can mask fainter ones. I too had hoped to be able to do this by rigorous formal analysis, but was unable to find an algorithm for the task. I solved it by generating the star fields on a blue background as green disks, whose diameter was determined by a probability algorithm. A simple routine can pair them to see if there's an overlap (turning the star pair yellow); then a pixel count of the colours generates the observed area to compare to the theoretical area. This then generates a probability curve for the true counts. Brute force maybe, but it seems to work OK.
(source: 2from.com)
Here's an algorithm that should be easy to implement in practice, and could be adjusted to produce arbitrarily small error:
1. Approximate each circle by a regular polygon centered at the same point
2. Calculate the polygon which is the union of the approximated circles
3. Calculate the area of the merged polygon
Steps 2 and 3 can be carried out using standard, easy-to-find algorithms from computational geometry.
Obviously, the more sides you use for each approximating polygon, the closer to exact your answer would be. You could approximate using inscribed and circumscribed polygons to get bounds on the exact answer.
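As a partial illustration of steps 1 and 3 and of the inscribed/circumscribed bounds, here is a sketch for a single circle. The polygon union of step 2 is deliberately left out, since it needs a polygon-clipping routine; the function names are assumptions made for the example.

```javascript
// Vertices of a regular polygon centred at (cx, cy), listed anticlockwise.
function regularPolygon(cx, cy, radius, sides) {
  const vertices = [];
  for (let k = 0; k < sides; k++) {
    const t = (2 * Math.PI * k) / sides;
    vertices.push({ x: cx + radius * Math.cos(t), y: cy + radius * Math.sin(t) });
  }
  return vertices;
}

// Shoelace formula for the area of a simple polygon.
function shoelaceArea(vertices) {
  let sum = 0;
  for (let i = 0; i < vertices.length; i++) {
    const p = vertices[i];
    const q = vertices[(i + 1) % vertices.length];
    sum += p.x * q.y - q.x * p.y;
  }
  return Math.abs(sum) / 2;
}

// Bounds on the area of a circle of radius r from polygons with the given number of sides.
function circleAreaBounds(r, sides) {
  // Inscribed polygon: vertices on the circle
  const lower = shoelaceArea(regularPolygon(0, 0, r, sides));
  // Circumscribed polygon: edges tangent to the circle, so vertices sit at r / cos(pi / sides)
  const upper = shoelaceArea(regularPolygon(0, 0, r / Math.cos(Math.PI / sides), sides));
  return { lower, upper };
}
```

With 64 sides the two bounds for a unit circle already differ by less than 0.01.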
I found this link which may be useful. There does not seem to be a definitive answer though.
Google answers. Another reference for three circles is Haruki's theorem. There is a paper there as well.
Depending on what problem you are trying to solve, it could be sufficient to get an upper and lower bound. An upper bound is easy: just the sum of the areas of all the circles. For a lower bound you can pick a single radius such that none of the circles overlap. To better that, find the largest radius (up to the actual radius) for each circle so that it doesn't overlap. It should also be pretty trivial to remove any completely overlapped circles (circle B lies entirely inside circle A when |P_a - P_b| + r_b <= r_a, where P_a is the center of circle A, P_b is the center of circle B, and r_a and r_b are their radii), and this betters both the upper and lower bound. You could also get a better upper bound if you use your pair formula on arbitrary pairs instead of just the sum of all the circles. There might be a good way to pick the "best" pairs (the pairs that result in the minimal total area).
Given an upper and lower bound you might be able to better tune a Monte-carlo approach, but nothing specific comes to mind. Another option (again depending on your application) is to rasterize the circles and count pixels. It is basically the Monte-carlo approach with a fixed distribution.
I've got a way to get an approximate answer if you know that all your circles are going to be within a particular region, i.e. each point in circle is inside a box whose dimensions you know. This assumption would be valid, for example, if all the circles are in an image of known size. If you can make this assumption, divide the region which contains your image into 'pixels'. For each pixel, compute whether it is inside at least one of the circles. If it is, increment a running total by one. Once you are done, you know how many pixels are inside at least one circle, and you also know the area of each pixel, so you can calculate the total area of all the overlapping circles.
By increasing the 'resolution' of your region (the number of pixels), you can improve your approximation.
Additionally, if the size of the region containing your circles is bounded, and you keep the resolution (number of pixels) constant, the algorithm runs in O(n) time (n is the number of circles). This is because for each pixel, you have to check whether it is inside each one of your n circles, and the total number of pixels is bounded.
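A minimal sketch of this pixel-counting idea, assuming circles are `{x, y, r}` objects and the known region is passed as a hypothetical `box` of the form `{x, y, width, height}`. Each pixel is tested at its centre, which is one possible reading of "inside at least one of the circles".

```javascript
// Approximate the area of a union of circles by counting grid cells.
// resolution = number of pixels along each side of the box.
function pixelUnionArea(circles, box, resolution) {
  const pixelWidth = box.width / resolution;
  const pixelHeight = box.height / resolution;
  let count = 0;
  for (let iy = 0; iy < resolution; iy++) {
    for (let ix = 0; ix < resolution; ix++) {
      // Centre of the current pixel
      const px = box.x + (ix + 0.5) * pixelWidth;
      const py = box.y + (iy + 0.5) * pixelHeight;
      // O(n) check against the n circles for each pixel
      if (circles.some(c => (px - c.x) ** 2 + (py - c.y) ** 2 <= c.r * c.r)) {
        count++;
      }
    }
  }
  // Number of covered pixels times the area of one pixel
  return count * pixelWidth * pixelHeight;
}
```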
This can be solved using Green's Theorem, with a complexity of n^2log(n).
If you're not familiar with the Green's Theorem and want to know more, here is the video and notes from Khan Academy. But for the sake of our problem, I think my description will be enough.
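Green's theorem states that $\oint_C (L\,dx + M\,dy) = \iint_R \left(\frac{\partial M}{\partial x} - \frac{\partial L}{\partial y}\right) dx\,dy$, where C is the closed, anticlockwise boundary of the region R.
If I put L and M such that $\frac{\partial M}{\partial x} - \frac{\partial L}{\partial y} = 1$,
then the RHS is simply the area of the region R and can be obtained by solving the closed integral on the LHS, and this is exactly what we're going to do.
So integrating along the path in the anticlockwise direction gives us the area of the region, and integrating in the clockwise direction gives us the negative of the area. So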
AreaOfUnion = (Integration along red arcs in anticlockwise direction + Integration along blue arcs in clockwise direction)
But the cool trick is that if, for each circle, we integrate along the arcs which are not inside any other circle, we get our required area, i.e. we get the integration in an anticlockwise direction along all red arcs and the integration in the clockwise direction along all blue arcs. JOB DONE!!!
Even the case where a circle doesn't intersect with any other is taken care of.
Here is the GitHub link to my C++ Code
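The following is not the linked C++ code, just an independent sketch of the same idea. It assumes the choice L = -y/2 and M = x/2 (one of many pairs that satisfy the condition above) and circles given as `{x, y, r}` objects; the function names are invented for the example. Every circle is traversed with increasing angle, which is anticlockwise on the outer boundary and clockwise around any hole.

```javascript
// Integral of (x dy - y dx) / 2 along the arc of circle c from angle t0 to t1.
function arcIntegral(c, t0, t1) {
  return 0.5 * (c.r * c.r * (t1 - t0) +
                c.r * c.x * (Math.sin(t1) - Math.sin(t0)) -
                c.r * c.y * (Math.cos(t1) - Math.cos(t0)));
}

function unionAreaGreen(circles) {
  const TWO_PI = 2 * Math.PI;
  let area = 0;
  for (let i = 0; i < circles.length; i++) {
    const a = circles[i];
    const covered = [];      // angular intervals of a that lie inside another circle
    let swallowed = false;   // true if a lies entirely inside another circle
    for (let j = 0; j < circles.length && !swallowed; j++) {
      if (j === i) continue;
      const b = circles[j];
      const dx = b.x - a.x, dy = b.y - a.y;
      const d = Math.hypot(dx, dy);
      if (d >= a.r + b.r) continue;                 // no overlap
      if (d <= b.r - a.r) {                         // a inside b
        // For identical circles keep only the one with the lowest index
        if (b.r > a.r || j < i) swallowed = true;
        continue;
      }
      if (d <= a.r - b.r) continue;                 // b inside a: hides nothing of a's boundary
      const phi = Math.atan2(dy, dx);               // direction towards b
      const cosAlpha = (a.r * a.r + d * d - b.r * b.r) / (2 * a.r * d);
      const alpha = Math.acos(Math.max(-1, Math.min(1, cosAlpha)));
      const start = (((phi - alpha) % TWO_PI) + TWO_PI) % TWO_PI;
      const end = start + 2 * alpha;
      if (end > TWO_PI) {                           // interval wraps past 2*pi: split it
        covered.push([start, TWO_PI], [0, end - TWO_PI]);
      } else {
        covered.push([start, end]);
      }
    }
    if (swallowed) continue;
    covered.sort((p, q) => p[0] - q[0]);
    let cursor = 0;                                 // integrate over the uncovered gaps
    for (const [s, e] of covered) {
      if (s > cursor) area += arcIntegral(a, cursor, s);
      cursor = Math.max(cursor, e);
    }
    if (cursor < TWO_PI) area += arcIntegral(a, cursor, TWO_PI);
  }
  return area;
}
```

Sorting the covered intervals for each of the n circles is what gives the n^2 log(n) behaviour mentioned above.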

Weights & Biases - How can I interpret the graphs when training BERT

Can someone help me to understand the amazing graphs generated by Weights & Biases tools when you are training a BERT model?
How can I interpret the above image? I don't know what the grey dispersion means, nor whether the concentration in the blue region is good or bad.
Thanks in advance.
So those charts show the histograms of the gradients, per time step.
Take the leftmost chart, layer.10 weights. In the very first slice at Step 0, the grey shading tells you that the gradients for that layer had values between ~ -40 and +40. The blue parts, however, tell you that most of those gradients were between -2 and +2 (roughly).
So the shading represents the count of gradients in that particular histogram bin, for that particular time step.
Now interpreting gradients can be tricky sometimes, but generally I find these plots useful to check that your gradients haven't exploded (big values on the y-axis) or collapsed (concentrated blue around 0 with little to no deviation). For example, if you try to train with a very high learning rate you should see the values on the y-axis go into the 100s or 1000s, indicating that your gradients are huge.
One final tip would be to focus more on the gradients from the weights as opposed to the biases as this can be more informative about what your model is doing.

Resize image using real world measurements

I'm working on a floor design app where the user can import a floor texture and the app will place the texture on to a room image.
I've managed to transform the perspective of the floor image so that it matches the room image - thanks to this answer, but I'm now stuck on scaling the floor image to match the room image dimensions.
I know the real dimensions of the wooden floor (177mm x 1220mm per plank), I know the height of an object in the room image (height of white tile near sink is 240mm) and I know the distance between the camera and the white tile (roughly 2500mm). The room image size is 2592x1936, the floor image size is 1430x1220.
The room image was taken with an iPad Air camera, for which I can't seem to find any info regarding the focal length and sensor size; the nearest I could find was a 3.3mm focal length with a 3.6mm sensor height (this may be where I'm going wrong).
I tried using this equation:
distance to object (mm) = (focal length (mm) x real height (mm) x image height (px)) / (object height (px) x sensor height (mm))
The numbers I plugged in to the equation,
2662 = (3.3 x 240 x 1936) / (160 x 3.6)
I then tried to work out the object height for a wooden plank in the floor image,
(3.3 x 1220 x 1936) / (2662 x 3.6) = 813 px
I then divided the image height by the object height to get a ratio = 2.38.
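The arithmetic above can be reproduced with a few lines of code. This only restates the asker's pinhole-camera calculation with the asker's numbers (including the uncertain 3.3mm focal length and 3.6mm sensor height); the function names are made up for the example.

```javascript
// Pinhole-camera relation, solved for the distance to the object (mm).
function distanceToObjectMm(focalMm, realHeightMm, imageHeightPx, objectHeightPx, sensorHeightMm) {
  return (focalMm * realHeightMm * imageHeightPx) / (objectHeightPx * sensorHeightMm);
}

// The same relation, solved for the height of the object in the image (px).
function objectHeightPx(focalMm, realHeightMm, imageHeightPx, distanceMm, sensorHeightMm) {
  return (focalMm * realHeightMm * imageHeightPx) / (distanceMm * sensorHeightMm);
}

// Tile: 240mm tall in reality, 160px tall in the 1936px-high room image
const tileDistance = distanceToObjectMm(3.3, 240, 1936, 160, 3.6);
// Plank: 1220mm long, placed at the same distance as the tile
const plankPx = objectHeightPx(3.3, 1220, 1936, tileDistance, 3.6);
// Ratio of image height to plank height in pixels
const ratio = 1936 / plankPx;

console.log(tileDistance, plankPx, ratio);
```

Note that the focal length, sensor height and image height cancel out when the second formula is fed the distance from the first, so the plank height is simply 160 x 1220 / 240 px; the result only depends on the plank lying at the same distance and orientation as the tile.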
This image is with a 2.38 ratio applied to the floor image which isn't quite right.
I know I'm going wrong somewhere or going the complete wrong way about it, hope somebody can point me in the right direction.
Thanks
I'd extend the lines of the tile till they touch the edge where the back wall meets the floor. Using this technique you can transfer a length from the wall plane to an equal length in the floor plane. So at that point, all you have to do is match lengths along a single line, namely the lengths between planks and the lengths between your transferred points. But you have to do this in a projectively consistent fashion. The most versatile tool for projective measurements is the cross ratio. An application very similar to what you have here is described in How to calculate true lengths from perspective projection on Math SE. If your vanishing point on that line where the walls meet is indeed at infinity (which appears to be approximately the case in your setup), you can get by with some simpler computations, but unless you can guarantee that this will always be the case, I'd not rely on that.
The above will help you adjust the scale in one direction only. The direction perpendicular to that is still open, though. In your example that would be the depth direction, the direction away from the camera. Do you have any reference points for that direction? It looks to me as though you might be able to use one complete tile on the left wall, before the window starts. But depending on how the corner between the two walls is tiled, that might be slightly off.
To illustrate these ideas, look at the picture above. Since the red lines appear almost horizontal, seeing the effects of perspective there is pretty hard. Therefore I'll do the other direction. Suppose the tile in the corner is indeed the same visible size as all the other tiles on the wall. So you know the real world distance between A1 and B1. You project along the blue vertical lines (vertical in the real world, not necessarily in the image) down to A2 and B2, which is where the left wall plane meets the floor plane.
Why do they meet there? Well, the line A1,A2 is where the left wall meets the back wall. The line A2,A3 is where the back wall meets the floor. Both of these plane intersections are actually visible at least in part, which made drawing the lines possible. So at A2 all three planes meet, and connecting that to the far point F gives the third edge, where the left wall meets the floor.
Since the segments A1,B1 and A2,B2 are just vertically transported versions of one another in the real world, they have equal length. That transportation was along the left wall in the vertical direction. Now transport them again, this time in the floor plane and in the left-right direction. You do so using the red lines, which are either parallel or meet at a point (which is pretty far away in this example). These red lines A2,A3 and B2,B3 are parallel in the real world, and their distance is still the edge length of that tile.
Now start measuring something, e.g. distance between C and D. To do that, compute the cross ratio (F,A3;B3,C) which expresses the distance from A3 to C, expressed in multiples of the distance from A3 to B3, and using F as the point at infinity. Do the same for D, and then the difference will be the length from C to D, expressed in multiples of the distance from A3 to B3. So the distance between C and D is 4.42 tile edge lengths in this example. Scale your image to fit this figure.
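A small sketch of that cross-ratio measurement. It assumes the collinear points F, A3, B3, C and D have already been reduced to signed 1D coordinates along their common image line (for example the distance in pixels from some fixed point on the line), and it uses the convention (a,b;c,d) = ((a-c)(b-d)) / ((a-d)(b-c)). The function names are made up for the example.

```javascript
// Cross ratio (a, b; c, d) of four collinear points given by 1D coordinates.
function crossRatio(a, b, c, d) {
  return ((a - c) * (b - d)) / ((a - d) * (b - c));
}

// Real-world distance from a3 to p, in multiples of the real distance a3..b3,
// where f is the image of the point at infinity (the vanishing point) on that line.
function lengthInUnits(f, a3, b3, p) {
  return crossRatio(f, a3, b3, p);
}

// Real-world distance between two points c and d, in the same units.
function distanceInUnits(f, a3, b3, c, d) {
  return lengthInUnits(f, a3, b3, d) - lengthInUnits(f, a3, b3, c);
}
```

If F really is at infinity, the factor (F - B3) / (F - C) tends to 1 and the value reduces to the plain ratio (C - A3) / (B3 - A3), which is the simpler computation mentioned earlier.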

Calculate a dynamic iteration value when zooming into a Mandelbrot

I'm trying to figure out how to automatically adjust the maximum iteration value when moving around in the Mandelbrot fractal.
All examples I've found use a constant of 1000 or less, but that's not enough when zooming into the fractal set.
Is there a way to determine the number of max_iterations based on for example where you are in the Mandelbrot space (x_start,x_end,y_start,y_end)?
One method I tried was to repetitively pre-process a small area in the region of the Mset boundary with increasing iterations until the percentage change in status from one repetition to the next was small. The problem was, that would vary in different places on the current map, since the "depth" varies across it. How to find the right place to do it? By logging the "deepest" boundary area during the previous generation (that will still be within the next zoom area).
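A loose sketch of that first method, with some liberties: it samples a coarse grid over the whole view instead of a hand-picked boundary area, and it computes each sample's escape time once up to a cap rather than re-iterating on every repetition. The names, the option defaults and the `{xStart, xEnd, yStart, yEnd}` view object are assumptions made for the example.

```javascript
// Number of iterations before z = z^2 + c escapes, or cap if it never does.
function escapeTime(cx, cy, cap) {
  let x = 0, y = 0;
  for (let i = 0; i < cap; i++) {
    const x2 = x * x, y2 = y * y;
    if (x2 + y2 > 4) return i;
    y = 2 * x * y + cy;
    x = x2 - y2 + cx;
  }
  return cap;
}

// Double the iteration limit until doubling it again would change the
// inside/outside status of fewer than `tolerance` of the sampled points.
function chooseMaxIterations(view, { samples = 24, start = 64, cap = 4096, tolerance = 0.01 } = {}) {
  const times = [];
  for (let iy = 0; iy < samples; iy++) {
    for (let ix = 0; ix < samples; ix++) {
      const cx = view.xStart + ((ix + 0.5) * (view.xEnd - view.xStart)) / samples;
      const cy = view.yStart + ((iy + 0.5) * (view.yEnd - view.yStart)) / samples;
      times.push(escapeTime(cx, cy, cap));
    }
  }
  for (let limit = start; limit < cap; limit *= 2) {
    // Points that look "inside" at this limit but escape before the doubled limit
    const upper = Math.min(2 * limit, cap);
    const changed = times.filter(t => t >= limit && t < upper).length;
    if (changed / times.length < tolerance) return limit;
  }
  return cap;
}
```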
But my best strategy was to avoid iterating wherever possible:
Away from the boundary of the Mset, areas of equal depth can be "contoured" and then filled with that depth. It was not an easy algorithm. Basically I followed a raster scan, but when I detected a boundary of iteration change (examining all the neighbours to ensure I wasn't close to the edge of the Mset), I would switch to a curve-stitching method to iterate around a contour back to where it started (obviously not recalculating spots I already did), and then make a second pass filling in the raster lines within the contour with the iteration level. It was fraught with leaks but eventually I cracked it.
Within the Mset, I followed the same approach, because the very last thing you want to do is to plough across vast areas and hit the iteration limit.
The difficult area is close to the boundary, where the iteration results can't be related to smooth contours with the neighbours. The contour stitching method won't work here, since there is only ever 1 pixel of a particular depth.
Using the contour method also will have faults to the lower or Mset sides of this region, but since this area looks chaotic until you zoom deeper, I lived with that.
So having said all that, I simply set the iteration depth as high as I can tolerate, but perhaps you can combine my first paragraph with the area-filling techniques.
BTW colouring the region adjacent to the Mset looks terrible when an animated smooth playback of the zoom is attempted. For that reason I coloured this area in a grey scale, by comparing with neighbours. If there was too much difference, I coloured to 0x808080 at first, then adapted that depending on the predominance of the neighbours' depth. All requiring fine tuning!

How to deal with arbitrary size for Laplacian Pyramid?

Recently I had much fun with the Laplacian Pyramid algorithm (http://persci.mit.edu/pub_pdfs/pyramid83.pdf). But one big problem is that the original paper is limited to (2^m+1) x (2^n+1) images. My question is: What is the best way to deal with arbitrary w x h instead? I can think of a couple of options:
Up sample the input to the next (2^m+1) x (2^n+1) up front
Pad even lines. How exactly? Wouldn't it shift the signal?
Shift even lines by half a sample? Wouldn't it lose half a sample?
Does anybody have experience with this? What is the most practical and efficient approach? Also any pointers to papers dealing with this would be very welcome.
One approach is to create an image with a width and height equal to the next (2^m+1) x (2^n+1) size, but instead of up-sampling the image to fill the expanded dimensions, just place it in the top-left corner and fill the empty space to the right and below with a constant value (the average value for the image is a good choice for this). Then encode in the normal way, storing the original image dimensions along with the pyramid. When decoding, decode and then crop to the original size.
This won't introduce any visual artifacts or degradation because you aren't stretching or offsetting the image in any way.
Because the empty space to the right and below the original image is a constant value, the high-pass bands at each level in the image pyramid will be all zero in this area. So if you are using a compression scheme like run length encoding to store each level, this will be automatically taken care of and these areas will be compressed to almost nothing. If not, then you can simply store the top-left (potentially non-zero) area of each level and then fill out the rest with zeros when decoding.
You could find the min and max x and y bounding rectangle of the non-zero values for each level and store this along with the level, cropped to include only non-zero values. The decoder could also be optimized so that areas of the image that are going to be cropped away are not actually decoded in the first place, by only processing the top-left of each level.
Here's an illustration of the technique:
Instead of just filling the lower-right area with a flat color, you could fill it with horizontally and vertically mirrored copies of the image to the right and below, and a copy mirrored in both directions to the bottom-right, like this:
This will avoid the discontinuities of the first technique, although there will be a discontinuity in dx (e.g. if the value was gradually increasing from left to right it will suddenly be decreasing). Choosing a mirror that keeps dx constant and ddx zero will avoid this second-order discontinuity by linearly extrapolating the values.
Another technique, which is similar to what some JPEG encoders do to pad out an image to a whole number of MCU blocks, is to take the last pixel value of each row and repeat it, and likewise for columns, with the bottom-right-most pixel of the image used to fill the bottom-right area:
This last technique could easily be modified to extrapolate the gradient of values or even the gradient of gradients instead of just repeating the same value for the remainder of the row or column.
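The three padding variants described above (constant fill, mirrored copies, repeated edge pixels) can be sketched as below. This is only an illustration on a grayscale image stored as an array of rows of numbers; the function names and the `mode` strings are assumptions made for the example, and the pyramid encoding itself is not shown.

```javascript
// Smallest size of the form 2^k + 1 that is >= n.
function nextPyramidSize(n) {
  let s = 1;
  while (s + 1 < n) s *= 2;
  return s + 1;
}

// Pad an image (array of rows) to (2^m+1) x (2^n+1), keeping it in the top-left corner.
// mode: 'mean' (constant fill), 'mirror' (mirrored copies) or 'replicate' (repeat edge pixels).
function padForPyramid(image, mode = 'replicate') {
  const h = image.length, w = image[0].length;
  const H = nextPyramidSize(h), W = nextPyramidSize(w);
  let mean = 0;
  for (const row of image) for (const v of row) mean += v;
  mean /= w * h;

  const padded = [];
  for (let y = 0; y < H; y++) {
    const row = new Array(W);
    for (let x = 0; x < W; x++) {
      if (x < w && y < h) {
        row[x] = image[y][x];                                   // original pixel
      } else if (mode === 'mean') {
        row[x] = mean;                                          // flat fill
      } else if (mode === 'mirror') {
        row[x] = image[y < h ? y : 2 * h - 1 - y][x < w ? x : 2 * w - 1 - x];
      } else {
        row[x] = image[Math.min(y, h - 1)][Math.min(x, w - 1)]; // repeat last pixel
      }
    }
    padded.push(row);
  }
  // Keep the original dimensions so the decoder can crop afterwards
  return { padded, width: w, height: h };
}

// Crop a decoded image back to the stored original size.
function cropToOriginal(padded, width, height) {
  return padded.slice(0, height).map(row => row.slice(0, width));
}
```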

Resources