Resize image using real world measurements - math

I'm working on a floor design app where the user can import a floor texture and the app will place the texture on to a room image.
I've managed to transform the perspective of the floor image so that it matches the room image - thanks to this answer, but I'm now stuck on scaling the floor image to match the room image dimensions.
I know the real dimensions of the wooden floor (177mm x 1220mm per plank), I know the height of an object in the room image (height of white tile near sink is 240mm) and I know the distance between the camera and the white tile (roughly 2500mm). The room image size is 2592x1936, the floor image size is 1430x1220.
The room image was taken with an iPad Air camera, for which I can't seem to find any info on focal length and sensor size; the nearest I could find was a 3.3mm focal length with a 3.6mm sensor height (this may be where I'm going wrong).
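I tried using this equation:
distance to object (mm) = (focal length (mm) x real object height (mm) x image height (px)) / (object height (px) x sensor height (mm))
The numbers I plugged in to the equation,
2662 = (3.3 x 240 x 1936) / (160 x 3.6)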
I then tried to work out the object height for a wooden plank in the floor image,
(3.3 x 1220 x 1936) / (2662 x 3.6) = 813 px
I then divided the image height by the object height to get a ratio = 2.38.
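For reference, the arithmetic above can be reproduced with a short Python sketch of the pinhole relation. The function names are made up for illustration, the 3.3mm focal length and 3.6mm sensor height are the unconfirmed figures mentioned earlier, and 160 is the tile height in pixels as plugged in above.

```python
def distance_to_object_mm(focal_mm, real_height_mm, image_height_px,
                          object_px, sensor_height_mm):
    """Pinhole estimate of the camera-to-object distance in mm."""
    return (focal_mm * real_height_mm * image_height_px) / (
        object_px * sensor_height_mm)


def projected_height_px(focal_mm, real_height_mm, image_height_px,
                        distance_mm, sensor_height_mm):
    """Pixel height of an object of known real height at a given distance."""
    return (focal_mm * real_height_mm * image_height_px) / (
        distance_mm * sensor_height_mm)


# Assumed camera figures (not confirmed for the iPad Air).
FOCAL_MM = 3.3
SENSOR_HEIGHT_MM = 3.6
IMAGE_HEIGHT_PX = 1936

# White tile: 240 mm tall in reality, 160 px tall in the room image.
tile_distance = distance_to_object_mm(
    FOCAL_MM, 240, IMAGE_HEIGHT_PX, 160, SENSOR_HEIGHT_MM)

# Plank: 1220 mm long, projected at the distance found for the tile.
plank_px = projected_height_px(
    FOCAL_MM, 1220, IMAGE_HEIGHT_PX, tile_distance, SENSOR_HEIGHT_MM)

ratio = IMAGE_HEIGHT_PX / plank_px
print(tile_distance, plank_px, ratio)
```

Note that this treats the plank as if it were standing upright at the same distance as the tile, which is one place the approach may break down for a floor seen in perspective.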
This image is with a 2.38 ratio applied to the floor image which isn't quite right.
I know I'm going wrong somewhere or going the complete wrong way about it, hope somebody can point me in the right direction.
Thanks

I'd extend the lines of the tile till they touch the edge where the back wall meets the floor. Using this technique you can transfer a length from the wall plane to an equal length in the floor plane. So at that point, all you have to do is match lengths along a single line, namely the lengths between planks and the lengths between your transferred points. But you have to do this in a projectively consistent fashion. The most versatile tool for projective measurements is the cross ratio. An application very similar to what you have here is described in How to calculate true lengths from perspective projection on Math SE. If your vanishing point on that line where the walls meet is indeed at infinity (which appears to be approximately the case in your setup), you can get by with some simpler computations, but unless you can guarantee that this will always be the case, I'd not rely on that.
The above will help you adjust the scale in one direction only. The direction perpendicular to that is still open, though. In your example that would be the depth direction, the direction away from the camera. Do you have any reference points for that direction? It looks to me as though you might be able to use one complete tile on the left wall, before the window starts. But depending on how the corner between the two walls is tiled, that might be slightly off.
To illustrate these ideas, look at the picture above. Since the red lines appear almost horizontal, seeing the effects of perspective there is pretty hard. Therefore I'll do the other direction. Suppose the tile in the corner is indeed the same visible size as all the other tiles on the wall. So you know the real world distance between A1 and B1. You project along the blue vertical lines (vertical in the real world, not necessarily in the image) down to A2 and B2, which lie where the left wall plane meets the floor plane.
Why do they meet there? Well, the line A1,A2 is where the left wall meets the back wall. The line A2,A3 is where the back wall meets the floor. Both of these plane intersections are actually visible at least in part, which made drawing the lines possible. So at A2 all three planes meet, and connecting that to the far point F gives the third edge, where the left wall meets the floor.
Since the segments A1,B1 and A2,B2 are just vertically transported versions of one another in the real world, they have equal length. That transportation was along the left wall in the vertical direction. Now transport them again, this time in the floor plane and in the left-right direction. You do so using the red lines, which are either parallel or meet at a point (which is pretty far away in this example). These red lines A2,A3 and B2,B3 are parallel in the real world, and their distance is still the edge length of that tile.
Now start measuring something, e.g. distance between C and D. To do that, compute the cross ratio (F,A3;B3,C) which expresses the distance from A3 to C, expressed in multiples of the distance from A3 to B3, and using F as the point at infinity. Do the same for D, and then the difference will be the length from C to D, expressed in multiples of the distance from A3 to B3. So the distance between C and D is 4.42 tile edge lengths in this example. Scale your image to fit this figure.
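A minimal Python sketch of that measurement follows. It assumes the cross-ratio convention (a, b; c, d) = ((a - c)(b - d)) / ((a - d)(b - c)), under which (F, A3; B3, C) equals the real-world distance from A3 to C in units of A3 to B3 when F is the image of the point at infinity. The function names are illustrative, and all points are assumed to be pixel coordinates lying on one image line.

```python
import math


def cross_ratio(a, b, c, d):
    """Cross ratio (a, b; c, d) of four collinear points given as 1D positions."""
    return ((a - c) * (b - d)) / ((a - d) * (b - c))


def position_along_line(point, origin, direction):
    """Signed 1D position of a 2D point along the line origin + t * direction."""
    dx, dy = direction
    norm = math.hypot(dx, dy)
    return ((point[0] - origin[0]) * dx + (point[1] - origin[1]) * dy) / norm


def length_in_units(f, a3, b3, p):
    """Real-world distance A3 -> P, in multiples of the distance A3 -> B3.

    f is the vanishing point of the line; a3, b3 and p are image points on it.
    """
    direction = (f[0] - a3[0], f[1] - a3[1])
    t_f, t_a, t_b, t_p = (position_along_line(q, a3, direction)
                          for q in (f, a3, b3, p))
    return cross_ratio(t_f, t_a, t_b, t_p)


def segment_length_in_units(f, a3, b3, c, d):
    """Real-world distance C -> D, in multiples of the distance A3 -> B3."""
    return length_in_units(f, a3, b3, d) - length_in_units(f, a3, b3, c)
```

Calling `segment_length_in_units(F, A3, B3, C, D)` with the pixel positions read off the image gives the figure to scale the floor texture by.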

Related

How to find the sides of a rectangle if you know the sides of a quadrilateral inside the rectangle?

I'm working on an application that uses an accelerometer to measure the sides of a room. I know the measurements will not be exact, but that's fine.
In reality I would like the program to be able to calculate the sides of any room shape not only rectangles and squares (and more than 4 corners), but I'm starting with something more simple (rectangle shaped rooms).
My problem is not with the accelerometer but more with the math aspect of the code. Because I measured the room by placing the phone on a wall and then going to the connected wall, I will get the measurements of a quadrilateral inside the rectangle. From there, if it's possible, I will get the measurements of the sides of the rectangle, but I don't really know how.
What I've tried so far:
Divided the quadrilateral inside the rectangle in half, to make 2 triangles. Then I calculated the diagonal using the Pythagorean theorem. Then I used the law of cosines to calculate one of the angles, and did the same again to find another. Then I found the 3rd angle from the other 2 angles (c = 180 - a - b). I did this for both triangles.
I don't know if this is the right approach and if I have missed something simple, or if I simply don't have enough information to solve for the sides of the rectangle. I have looked into some geometry and trigonometry math online and haven't found anything that gives me a solution. But like I said, maybe I missed something simple.
Any push in the right direction would be helpful.
The rectangle and the quadrilateral
The problem lacks a unique solution. Imagine placing a pair of calipers around the quadrilateral. You'll be able to rotate the calipers around it, and at each angle the calipers will be able to close to a different width. Each of those widths is a different possible room dimension.
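The calipers argument can be checked numerically with a small Python sketch. The quadrilateral below is hypothetical and the function name is made up; the point is only that each tilt angle yields a different enclosing rectangle.

```python
import math


def bounding_rectangle(points, angle_deg):
    """Width and height of the rectangle, tilted by angle_deg, that tightly
    encloses the given 2D points."""
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    # Coordinates of each point along the rectangle's two side directions.
    u = [x * cos_a + y * sin_a for x, y in points]
    v = [-x * sin_a + y * cos_a for x, y in points]
    return max(u) - min(u), max(v) - min(v)


# Hypothetical measured quadrilateral (one corner per wall).
quad = [(0.0, 1.0), (2.0, 0.0), (4.0, 1.5), (1.0, 3.0)]

for angle in (0, 15, 30, 45):
    width, height = bounding_rectangle(quad, angle)
    print(f"{angle:2d} deg: {width:.2f} x {height:.2f}")
```

Every line printed is a room that is consistent with the same four measured points, so some extra piece of information (for example one wall direction) is needed to pin the room down.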
You'll also never get an accurate position measurement using the inertial sensors in a phone to begin with. The accels and gyros aren't even close to accurate enough. GPS is, but only outdoors away from structures that cause multipathing artifacts. Quick and sloppy with a tape measure will win every time.

Weird phenomenon with three.js plane

This is the first question I've ever asked on here! Apologies in advance if I've done it wrong somehow.
I have written a program which stacks up spheres in three.js.
Each sphere starts with randomly generated (within certain bounds) x and z co-ordinates, and a y co-ordinate high above the ground plane. I cast rays from each of the sphere's vertices to see how far down it can fall before it intersects with an existing mesh.
For each sphere, I test it in 80 different random xz positions, see where it can fall the furthest, and then 'drop' it into that position.
This is intended to create bubble towers like this one:
However, I have noticed that when I make the bubble radius very small and the base dimensions of the tower large, this happens:
If I turn the recursions down from 80, this effect is less apparent. For some reason, three.js seems to think that the spheres can fall further at the corners of the base square. The origin is exactly at the center of the base square - perhaps this is relevant.
When I console log all the fall-distances I'm receiving from the raycaster, they are indeed larger the further away you get from the center of the square... but only at the 11th or 12th decimal place.
This is not so much a problem I am trying to solve (I could just round fall distances to the nearest 10th decimal place before I pick the largest one), but something I am very curious about. Does anyone know why this is happening? Has anybody come across something similar to this before?
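The rounding workaround mentioned above would look roughly like this. It is sketched in Python rather than the JavaScript of the actual program, and `best_drop` and the `(x, z, fall_distance)` tuple layout are made up for illustration.

```python
def best_drop(candidates, decimals=10):
    """Pick the candidate with the largest fall distance after rounding.

    candidates is a list of (x, z, fall_distance) tuples. Differences beyond
    the given number of decimal places are ignored; when several candidates
    tie after rounding, the one tested first wins.
    """
    return max(candidates, key=lambda c: round(c[2], decimals))


# Two positions that differ only around the 12th decimal place.
samples = [
    (0.0, 0.0, 5.0),
    (40.0, 40.0, 5.000000000003),
    (1.0, 1.0, 4.0),
]
print(best_drop(samples))
```

Without the rounding, the second sample would win purely because of the tiny difference in its fall distance.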
EDIT:
I edited my code to shift everything so that the origin is no longer at the center of the base square:
So am I correct in thinking... this phenomenon is something to do with distance from the origin, rather than anything relating to the surface onto which the balls are falling?
Indeed, the pattern you are seeing arises precisely because the corners and edges of the bottom of your tower are furthest from the origin from which you are dropping the balls. You are creating a right triangle (see image below) in which the vertical leg is the line from the drop origin down to the point directly below it on the mesh floor (at a right angle to the floor, hence the name right triangle). The hypotenuse is always the longest side of a right triangle, and the further out your rays are cast from the point just below the origin, the longer the hypotenuse will be, and the more your algorithm will favor that longer distance (no matter how fractional the difference).
Increasing the size of the tower base would exaggerate this effect, as the hypotenuse measurements can now grow even larger. Reducing the size of the balls also favors the pattern you are seeing: each ball takes up less space, so the distant spots at the corners don't fill in as quickly as they would with larger balls, and more balls congregate at the edges before filling in the rest of the space.
Moving your dropping origin to one side or another creates longer distances (hypotenuses) to the opposite sides and corners, so that the balls will fill in those distant locations first.
The reason you see less of an effect when you reduce the sample size from 80 to say, 20, is that there are simply fewer chances to detect these more distant locations to which the balls could fall (an odds game).
A right triangle:
A back-of-the-napkin sketch:

determine rectangle rotation point

I would like to know how to compute rotation components of a rectangle in space according to four given points in a projection plane.
Hard to depict in a single sentence, thus I explain my needs.
I have a 3D world viewed from a static camera (located in <0,0,0>).
I have a known rectangular shape (a picture, actually) that I want to place in that space.
I can only define points (up to four) in a spherical/rectangular frame of reference (camera looking at <0°,0°> (sph) or <0,0,1000> (rect)).
I consider the given polygon to be my rectangle shape rotated by (rX,rY,rZ). 3 points are supposed to be enough; 4 points may be over-constrained. I'm not sure for now.
I want to determine rX, rY and rZ, the rectangle rotation about its center.
--- My first attempt at solving this constraint problem was to fix the first point: given spherical coordinates, I "project" this point onto a camera-facing plane at z=1000. Quite easy, this gives me a point.
Then, the second point is considered to be on the <0,0,0>- segment, which allows infinitely many solutions; but I fix this by knowing the width (w) and height (h) of my rectangle: I then get two solutions for my second point; one is "in front" of the first point, and the other is "far away"... I now have an edge of my rectangle. Two, in fact.
And from there, I don't know what to do. If in the end I have my four points, I don't have a clue about how to calculate the rotation equivalency...
It's hard to be lost in Mathematics...
To get an idea of the goal of all this: I make photospheres and I want to "insert" images into them. For instance, I have a TV screen in my photo, and I want to place a picture in the screen. I know my screen size (or I can guess it), I know the size of the image I want to place in it (actually, it has the same aspect ratio), and I know the four screen corner positions in my space (spherical or Euclidean). My software allows me to place an image in the scene and to rotate it as I want. I can zoom it (to give the feeling of depth)... So I can do all this manually, but it is a long trial-and-error process and never exact. I would like to be able to type in the screen corner positions and get the final image position and rotation attributes in one click...
The question in pictures:
Images presenting steps of the problem
Note that on the page, I present actual images from my app. I mean I had to manually rotate and scale the picture to get it to fit the screen; it is not photoshopped. The parameters found are:
Scale: 0.86362
rX = 18.9375
rY = -12.5875
rZ = -0.105881
center position: <-9.55, 18.76, 1000>
Note: Rotation is not enough to set the picture up: we need scale and translation. I assume the scale can be found once a first edge is fixed (the first two points give two solutions as initial constraints, and because I then know the edge length and the picture width and height, I can deduce the scale). But the software is kind and allows me to modify the picture width and height, so the constraint is just to be sure the four points describe a rectangle in space, which is simple to check with vectors. Here, the problem seems to be placing the fourth point as a valid rectangle corner, and then deducing the rotation from that rectangle. As for translation, it is the center (where the diagonals cross) of the points once fixed.
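The vector check mentioned above could look like the following Python sketch. It assumes the four corners are given as 3D points in order around the shape; the function names and the tolerance are illustrative only.

```python
def sub(p, q):
    """Component-wise difference p - q of two 3D points."""
    return tuple(a - b for a, b in zip(p, q))


def dot(u, v):
    """Dot product of two 3D vectors."""
    return sum(a * b for a, b in zip(u, v))


def is_rectangle(p0, p1, p2, p3, tol=1e-6):
    """True if p0, p1, p2, p3 (in order around the shape) form a rectangle."""
    e01 = sub(p1, p0)
    e03 = sub(p3, p0)
    e32 = sub(p2, p3)
    # Parallelogram test: the edge p3 -> p2 must equal the edge p0 -> p1.
    is_parallelogram = all(abs(a - b) <= tol for a, b in zip(e01, e32))
    # Right angle at p0 (scaled so the test does not depend on edge length).
    scale = max(dot(e01, e01), dot(e03, e03), 1.0)
    has_right_angle = abs(dot(e01, e03)) <= tol * scale
    return is_parallelogram and has_right_angle


def center(p0, p1, p2, p3):
    """Crossing point of the diagonals (midpoint of p0 and p2 for a rectangle)."""
    return tuple((a + b) / 2 for a, b in zip(p0, p2))
```

A parallelogram with one right angle is necessarily a planar rectangle, so no separate coplanarity test is needed in this sketch.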

Calculating 2D angles for 3D objects in perspective

Imagine a photo, with the face of a building marked out.
It's given that the face of the building is a rectangle, with 90 degree corners. However, because it's a photo, perspective will be involved and the parallel edges of the face will converge on the horizon.
With such a rectangle, how do you calculate the angle in 2D of the vectors of the edges of a face that is at right angles to it?
In the image below, the blue is the face marked on the photo, and I'm wondering how to calculate the 2D vector of the red lines of the other face:
example http://img689.imageshack.us/img689/2060/leslievillestarbuckscor.jpg
So if you ignore the picture for a moment, and concentrate on the lines, is there enough information in one of the face outlines - the interior angles and such - to know the path of the face on the other side of the corner? What would the formula be?
We know that both are rectangles - that is that each corner is a right angle - and that they are at right angles to each other. So how do you determine the vector of the second face using only knowledge of the position of the first?
It's quite easy, you should use basic 2 point perspective rules.
First of all you need 2 vanishing points, one to the left and one to the right of your object. They'll both stay on the same horizon line.
alt text http://img62.imageshack.us/img62/9669/perspectiveh.png
After placing the horizon (which sets the eye height) and the vanishing points (their positions change the field of view), you can easily calculate where your lines go (of course, you need to be able to calculate the line through two points; I think you can do that).
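As a rough Python sketch of those two building blocks, the line through two points and the intersection of two lines can both be written in homogeneous coordinates. The corner coordinates and the position of the second vanishing point below are invented for illustration; the second vanishing point still has to be chosen on the horizon, as described above.

```python
def line_through(p, q):
    """Homogeneous line (a, b, c), with a*x + b*y + c = 0, through points p and q."""
    (x1, y1), (x2, y2) = p, q
    return (y1 - y2, x2 - x1, x1 * y2 - x2 * y1)


def intersection(l1, l2):
    """Intersection point of two homogeneous lines, or None if they are parallel."""
    a1, b1, c1 = l1
    a2, b2, c2 = l2
    w = a1 * b2 - a2 * b1
    if abs(w) < 1e-12:
        return None
    return ((b1 * c2 - b2 * c1) / w, (c1 * a2 - c2 * a1) / w)


# Hypothetical blue face: the shared vertical edge is at x = 300.
top_near, bottom_near = (300.0, 100.0), (300.0, 400.0)
top_far, bottom_far = (100.0, 180.0), (100.0, 320.0)

# The top and bottom edges of the blue face meet at the first vanishing point.
vp_left = intersection(line_through(top_near, top_far),
                       line_through(bottom_near, bottom_far))

# Assumption: the horizon is horizontal and the second vanishing point
# has been placed on it by hand.
vp_right = (900.0, vp_left[1])

# The red edges of the other face run from the shared corners towards vp_right.
red_top = line_through(top_near, vp_right)
red_bottom = line_through(bottom_near, vp_right)
print(vp_left, red_top, red_bottom)
```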
Honestly, what I'd do is a Hough Transform on the image and determine a way to identify the red lines from the image. To find the red lines, I'd find any lines in the transform that touch your blue ones. The good thing about the transform is that you get angle information for free.
Since you know that you're looking at lines, you could also do a Radon Transform and look for peaks at particular angles; it's essentially the same thing.
Matlab has some nice functionality for this kind of work.

Normal Vector of Three Points

Hey math geeks, I've got a problem that's been stumping me for a while now. It's for a personal project.
I've got three dots: red, green, and blue. They're positioned on a cardboard slip such that the red dot is in the lower left (0,0), the blue dot is in the lower right (1,0), and the green dot is in the upper left. Imagine stepping back and taking a picture of the card from an angle. If you were to find the center of each dot in the picture (let's say the units are pixels), how would you find the normal vector of the card's face in the picture (relative to the camera)?
Now a few things I've picked up about this problem:
The dots (in "real life") are always at a right angle. In the picture, they're only at a right angle if the camera has been rotated around the red dot along an "axis" (axis being the line created by the red and blue or red and green dots).
There are dots on only one side of the card. Thus, you know you'll never be looking at the back of it.
The distance of the card to the camera is irrelevant. If I knew the depth of each point, this would be a whole lot easier (just a simple cross product, no?).
The rotation of the card is irrelevant to what I'm looking for. In the tinkering that I've been doing to try to figure this one out, the rotation can be found with the help of the normal vector in the end. Whether or not the rotation is a part of (or product of) finding the normal vector is unknown to me.
Hope there's someone out there that's either done this or is a math genius. I've got two of my friends here helping me on it and we've--so far--been unsuccessful.
i worked it out in my old version of MathCAD:
Edit: Wording wrong in screenshot of MathCAD: "Known: g and b are perpendicular to each other"
In MathCAD i forgot the final step of doing the cross-product, which i'll copy-paste here from my earlier answer:
Now we've solved for the X-Y-Z of the translated g and b points, your original question wanted the normal of the plane.
If we cross g x b, we'll get the vector normal to both:
        | u1 u2 u3 |
g x b = | g1 g2 g3 |
        | b1 b2 b3 |
      = (g2b3 - b2g3)u1 + (b1g3 - b3g1)u2 + (g1b2 - b1g2)u3
All the values are known, plug them in (i won't write out the version with g3 and b3 substituted in, since it's just too long and ugly to be helpful).
But in practical terms, i think you'll have to solve it numerically, adjusting gz and bz so as to best fit the conditions:
g · b = 0
and
|g| = |b|
Since the pixels are not algebraically perfect.
Example
Using a picture of the Apollo 13 astronauts rigging one of the command module's square Lithium Hydroxide canisters to work in the LEM, i located the corners:
Using them as my basis for an X-Y plane:
i recorded the pixel locations using Photoshop, with positive X to the right, and positive Y down (to keep the right-hand rule of Z going "into" the picture):
g = (79.5, -48.5, gz)
b = (-110.8, -62.8, bz)
Punching the two starting formulas into Excel, and using the analysis toolpack to "minimize" the error by adjusting gz and bz, it came up with two Z values:
g = (79.5, -48.5, 102.5)
b = (-110.8, -62.8, 56.2)
Which then lets me calculate other interesting values.
The length of g and b in pixels:
|g| = 138.5
|b| = 139.2
The normal vector:
g x b = (3710, -15827, -10366)
The unit normal (length 1):
uN = (0.1925, -0.8209, -0.5377)
Scaling normal to same length (in pixels) as g and b (138.9):
Normal = (26.7, -114.0, -74.7)
Now that i have the normal that is the same length as g and b, i plotted them on the same picture:
i think you're going to have a new problem: distortion introduced by the camera lens. The three dots are not perfectly projected onto the 2-dimensional photographic plane. There's a spherical distortion that makes straight lines no longer straight, makes equal lengths no longer equal, and makes the normals slightly off of normal.
Microsoft research has an algorithm to figure out how to correct for the camera's distortion:
A Flexible New Technique for Camera Calibration
But it's beyond me:
We propose a flexible new technique to easily calibrate a camera. It is well suited for use without specialized knowledge of 3D geometry or computer vision. The technique only requires the camera to observe a planar pattern shown at a few (at least two) different orientations. Either the camera or the planar pattern can be freely moved. The motion need not be known. Radial lens distortion is modeled. The proposed procedure consists of a closed-form solution, followed by a nonlinear refinement based on the maximum likelihood criterion. Both computer simulation and real data have been used to test the proposed technique, and very good results have been obtained. Compared with classical techniques which use expensive equipments such as two or three orthogonal planes, the proposed technique is easy to use and flexible. It advances 3D computer vision one step from laboratory environments to real world use.
They have a sample image, where you can see the distortion:
(source: microsoft.com)
Note
you don't know if you're seeing the "top" of the cardboard, or the "bottom", so the normal could be mirrored vertically (i.e. z = -z)
Update
Guy found an error in the derived algebraic formulas. Fixing it leads to formulas that, i don't think, have a simple closed form. This isn't too bad, since it can't be solved exactly anyway, only numerically.
Here's a screenshot from Excel where i start with the two knowns rules:
g · b = 0
and
|g| = |b|
Writing the 2nd one as a difference (an "error" amount), you can then add both up and use that value as a number to have excel's solver minimize:
This means you'll have to write your own numeric iterative solver. i'm staring over at my Numerical Methods for Engineers textbook from university; i know it contains algorithms to solve recursive equations with no simple closed form.
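One possible shape for such a solver, as a Python sketch rather than the Excel setup used above: substitute bz = p / gz from the first rule into the second and bisect on gz. The function name is made up, the sketch assumes gz > 0 and that the pixel vectors are not already perpendicular, and the mirrored solution with both signs flipped is equally valid.

```python
import math


def solve_depths(gx, gy, bx, by):
    """Find gz and bz such that g . b = 0 and |g| = |b|, by bisection on gz > 0."""
    p = -(gx * bx + gy * by)                   # g . b = 0   =>  gz * bz = p
    q = bx * bx + by * by - gx * gx - gy * gy  # |g| = |b|   =>  gz^2 - bz^2 = q
    if p == 0:
        raise ValueError("this sketch assumes gx*bx + gy*by is non-zero")

    def error(gz):
        bz = p / gz
        return gz * gz - bz * bz - q           # increases monotonically with gz

    # error -> -infinity as gz -> 0+, and error(hi) > 0 for this choice of hi.
    lo, hi = 0.0, math.sqrt(abs(p) + abs(q)) + 1.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if error(mid) < 0:
            lo = mid
        else:
            hi = mid
    gz = (lo + hi) / 2
    return gz, p / gz


# Pixel coordinates from the example above.
gz, bz = solve_depths(79.5, -48.5, -110.8, -62.8)
print(gz, bz)
```

For the example pixels this lands close to, but not exactly on, the values the Excel solver reported, which is expected since the solver only minimized the error approximately.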
From the sounds of it, you have three points p1, p2, and p3 defining a plane, and you want to find the normal vector to the plane.
Representing the points as vectors from the origin, an equation for a normal vector would be
n = (p2 - p1)x(p3 - p1)
(where x is the cross-product of the two vectors)
If you want the vector to point outwards from the front of the card, then ala the right-hand rule, set
p1 = red (lower-left) dot
p2 = blue (lower-right) dot
p3 = green (upper-left) dot
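In Python this could be sketched as follows. Note that it needs full 3D coordinates for the three dots, which the question says are not directly available from the picture; the function name is illustrative.

```python
import math


def plane_normal(p1, p2, p3):
    """Unit normal n = (p2 - p1) x (p3 - p1) of the plane through three 3D points."""
    ux, uy, uz = (p2[i] - p1[i] for i in range(3))
    vx, vy, vz = (p3[i] - p1[i] for i in range(3))
    nx = uy * vz - uz * vy
    ny = uz * vx - ux * vz
    nz = ux * vy - uy * vx
    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    if length == 0:
        raise ValueError("the three points are collinear")
    return (nx / length, ny / length, nz / length)


red, blue, green = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)
print(plane_normal(red, blue, green))
```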
@Ian Boyd: I liked your explanation, only I got stuck on step 2, when you said to solve for bz. You still had bz in your answer, and I don't think you should have bz in your answer...
bz should be +/- sqrt( gx^2 + gy^2 + gz^2 - bx^2 - by^2 )
After I did this myself, I found it very difficult to substitute bz into the first equation when you solved for gz, because when substituting bz, you would now get:
gz = -(gx*bx + gy*by) / sqrt( gx^2 + gy^2 + gz^2 - bx^2 - by^2 )
The part that makes this difficult is that there is a gz inside the square root, so you have to separate it, combine the gz terms, and solve for gz. I did that, but I don't think the way I solved it was correct, because when I wrote my program to calculate gz for me, I used your gx and gy values to see if my answer matched up with yours, and it did not.
So I was wondering if you could help me out, because I really need to get this to work for one of my projects. Thanks!
Just thinking on my feet here.
Your effective inputs are the apparent ratio RB/RG [+], the apparent angle BRG, and the angle that (say) RB makes with your screen coordinate y-axis (did I miss anything?). You need as output the components of the normalized normal (heh!) vector, which I believe has only two independent values (though you are left with a front-back ambiguity if the card is see-through).[++]
So I'm guessing that this is possible...
From here on I work on the assumption that the apparent angle of RB is always 0, and we can rotate the final solution around the z-axis later.
Start with the card positioned parallel to the viewing plane and oriented in the "natural" way (i.e. your upper vs. lower and left vs. right assignments are respected). We can reach all the interesting positions of the card by rotating by \theta around the initial x-axis (for -\pi/2 < \theta < \pi/2), then rotating by \phi around the initial y-axis (for -\pi/2 < \phi < \pi/2). Note that we have preserved the apparent direction of the RB vector.
Next step: compute the apparent ratio and apparent angle after the rotations in terms of \theta and \phi, and invert the result.[+++]
The normal will be R_y(\phi)R_x(\theta)(0, 0, 1) for R_i the primitive rotation matrix around axis i.
[+] The absolute lengths don't count, because that just tells you the distance to card.
[++] One more assumption: that the distance from the card to the view plane is much larger than the size of the card.
[+++] Here the projection you use from three-d space to the viewing plane matters. This is the hard part, but not something we can do for you unless you say what projection you are using. If you are using a real camera, then this is a perspective projection and is covered in essentially any book on 3D graphics.
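The normal formula R_y(\phi)R_x(\theta)(0, 0, 1) can be evaluated with a few lines of Python. This sketch assumes the usual right-handed rotation matrices; a different sign convention for the axes or angles would flip some signs in the result.

```python
import math


def rotate_x(v, theta):
    """Rotate the 3D vector v by theta radians around the x-axis."""
    x, y, z = v
    c, s = math.cos(theta), math.sin(theta)
    return (x, y * c - z * s, y * s + z * c)


def rotate_y(v, phi):
    """Rotate the 3D vector v by phi radians around the y-axis."""
    x, y, z = v
    c, s = math.cos(phi), math.sin(phi)
    return (x * c + z * s, y, -x * s + z * c)


def card_normal(theta, phi):
    """Normal of the card after rotating by theta about x, then phi about y."""
    return rotate_y(rotate_x((0.0, 0.0, 1.0), theta), phi)


print(card_normal(math.radians(20), math.radians(-35)))
```

Under these conventions the result simplifies to (cos \theta sin \phi, -sin \theta, cos \theta cos \phi), so only the two angles need to be recovered from the picture.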
Right, the normal vector does not change with distance, but the projection of the cardboard onto a picture does. (Simple example: if you have a small cardboard, nothing changes. If you have a cardboard 1 mile wide and 1 mile high and you rotate it so that one side is nearer and the other side farther away, the near side is magnified and the far side shortened in the picture. You can see immediately that a rectangle does not remain a rectangle, but becomes a trapezoid.)
The most accurate way, for small angles and with the camera centered on the middle, is to measure the ratio of the width/height between the "normal" image and the angled image along the middle lines (because they are not warped).
We define x as left to right, y as down to up, z as from far to near.
Then
x = arcsin(measuredWidth/normWidth) red-blue
y = arcsin(measuredHeight/normHeight) red-green
z = sqrt(1.0-x^2-y^2)
I will calculate tomorrow a more exact solution, but I'm too tired now...
You could use u,v,n co-ordinates. Set your viewpoint to the position of the "eye" or "camera", then translate your x,y,z co-ordinates to u,v,n. From there you can determine the normals, as well as perspective and visible surfaces if you want (u',v',n'). Also, bear in mind that 2D = 3D with z=0. Finally, make sure you use homogeneous co-ordinates.
