I'm trying to derive the matrix of a rigid transform to map between two coordinate spaces. I have the origin and the axis directions of the target coordinate space in terms of the known coordinate space; does anyone know how I can solve for the 4x4 rigid transformation matrix given these?
So, in other words, I have two coordinate spaces, A and B, and I know
Point3D originOfBInA;
Vector3D xAxisOfBInA; // Unit vector
Vector3D yAxisOfBInA; // Unit vector
Vector3D zAxisOfBInA; // Unit vector
And I'm trying to find the 4x4 matrix
Matrix4x4 AtoB;
First construct the 4x4 matrix for the change of basis (call it M) by using your unit vectors (Ax, Ay, Az) and the origin (T) as column vectors:
M =
[Ax Ay Az T] <-- 3x4
[0 0 0 1]
To transform the coordinates of a point p (specified with respect to frame A) to q (with respect to frame B), just multiply by the inverse of M:
q = M^-1 * p
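Because the axes are orthonormal, the inverse of M can be written down directly instead of running a general 4x4 inversion: the rotation block is transposed and the translation becomes -R^T * T. Here is a minimal sketch of that idea. Vec3, Mat4, makeAtoB and transformPoint are illustrative stand-ins (the question's Point3D, Vector3D and Matrix4x4 types are not shown), and the matrix is assumed to be row-major and applied to column vectors.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };

// Row-major storage, meant for column vectors: q = m * p.
struct Mat4 { double m[4][4]; };

double dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Inverse of M = [Ax Ay Az T; 0 0 0 1] for orthonormal axes:
// the rotation block is transposed and the translation becomes -R^T * T.
Mat4 makeAtoB(const Vec3& origin, const Vec3& ax, const Vec3& ay, const Vec3& az) {
    // Precondition (assumption): the axes are mutually perpendicular.
    assert(std::fabs(dot(ax, ay)) < 1e-9 && std::fabs(dot(ay, az)) < 1e-9 &&
           std::fabs(dot(az, ax)) < 1e-9);
    Mat4 r = {{
        {ax.x, ax.y, ax.z, -dot(ax, origin)},
        {ay.x, ay.y, ay.z, -dot(ay, origin)},
        {az.x, az.y, az.z, -dot(az, origin)},
        {0.0, 0.0, 0.0, 1.0},
    }};
    return r;
}

// Transforms a point (implicit w = 1) from frame A to frame B.
Vec3 transformPoint(const Mat4& t, const Vec3& p) {
    return {
        t.m[0][0] * p.x + t.m[0][1] * p.y + t.m[0][2] * p.z + t.m[0][3],
        t.m[1][0] * p.x + t.m[1][1] * p.y + t.m[1][2] * p.z + t.m[1][3],
        t.m[2][0] * p.x + t.m[2][1] * p.y + t.m[2][2] * p.z + t.m[2][3],
    };
}
```

As a sanity check, the origin of B (given in A) should map to (0, 0, 0), and the origin plus B's x axis should map to (1, 0, 0).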
I'm studying linear algebra and I discovered that linear transformations are often used in video games.
I tried to find the matrix associated with the transformation that translates a point (x y z) by a vector, and I came to the conclusion that this transformation is not linear because, given v1, v2 ∊ V and a translation vector p:
T(v1 + v2) = v1 + v2 + p ≠ T(v1) + T(v2) = v1 + v2 + 2p
I searched online and found that 3D coordinates (x y z) are converted into a vector (x y z 1) but, given v1 and v2 ∊ V:
v1 + v2 = (x1 + x2, y1 + y2, z1 + z2, 2)
so V is not even a vector space.
My question is: why do I get these wrong results?
Thanks for all.
In the vector with homogeneous coordinate format, (x y z 1),
(x/1 y/1 z/1) are 3D cartesian coordinates, and 1 is the scaling factor.
We divide the first three values by the scaling factor to get the vector in cartesian coordinate format, (x y z). Homogeneous coordinates can be useful for efficient arbitrary-precision arithmetic with rational coordinates, and for elegant algebra with nice properties such as the linearity we get here.
When we want to translate a point by adding a vector, that's natural with cartesian coordinates. With all point/vector coordinates in cartesian format,
addition of a vector, v, to a point, p1, is a translation, T1, to some point, p2, such that
T1(p1) = p1 + v = p2
You are correct that something like T1 isn't linear.
When we want to translate a point by multiplying a matrix, it's more natural to think in homogeneous coordinates. This matrix, A, would be the identity matrix, but its last column is the vector, v, in homogeneous form. With points in homogeneous format, we can represent the transform, T2, as
T2(p1) = A * p1 = p2
With T2, we do have a linear transform.
Your results are not wrong: The translation of a point by a vector is not a linear transformation.
Translating a point by a vector is an affine transformation and it's done in an affine space. An affine space can be loosely defined as a set of points together with a vector space, where you can add a vector to a point and get another point as a result. Adding points to points is not allowed.
One way to construct an affine space is by taking a projective space whose elements are represented with homogeneous coordinates. These concepts come from the beautiful field of projective geometry, but a full explanation does not fit in a Stack Overflow post.
A more direct way to construct an affine space is by taking a vector space and adding one extra bit of information: take vectors of the form (x y z 1) as the points, and vectors of the form (x y z 0) as the vectors. Note that the points do not form a vector space, but the vectors do, and that if you add a vector to a point, the result is another point.
With this representation of points and vectors, translation of a point p by a vector can be written as a matrix multiplication T*p. The matrix T for translating by vector (x y z 0) is:
1 0 0 x
0 1 0 y
0 0 1 z
0 0 0 1
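Note that, acting on points alone, this is still not a linear transform, because the points (x y z 1) do not form a vector space; the matrix is linear only as a map on the full 4D vector space.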
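To make the point/vector distinction concrete, here is a small sketch that applies the translation matrix above to a point (w = 1) and to a vector (w = 0). The names Vec4, Mat4, translation, mul and toCartesian are made up for illustration; the matrix is stored row-major and applied to column vectors.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using Vec4 = std::array<double, 4>;
struct Mat4 { double m[4][4]; };  // row-major, for column vectors

// Translation matrix: identity with (x y z) in the last column.
Mat4 translation(double x, double y, double z) {
    Mat4 t = {{
        {1.0, 0.0, 0.0, x},
        {0.0, 1.0, 0.0, y},
        {0.0, 0.0, 1.0, z},
        {0.0, 0.0, 0.0, 1.0},
    }};
    return t;
}

// Plain matrix-vector product in 4D.
Vec4 mul(const Mat4& a, const Vec4& v) {
    Vec4 r{};  // value-initialised to zeros
    for (std::size_t i = 0; i < 4; ++i)
        for (std::size_t j = 0; j < 4; ++j)
            r[i] += a.m[i][j] * v[j];
    return r;
}

// Divides by the scaling factor w to recover Cartesian (x y z).
std::array<double, 3> toCartesian(const Vec4& h) {
    assert(h[3] != 0.0);  // w = 0 is a direction, not a point
    return {h[0] / h[3], h[1] / h[3], h[2] / h[3]};
}
```

With a translation by (1, 2, 3), the point (4 5 6 1) becomes (5 7 9 1), while the vector (4 5 6 0) is left unchanged. That is the behaviour you want: directions are not affected by translation.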
I have a position in space called X1. X1 has a velocity called V1. I need to construct an orthogonal plane perpendicular to the velocity vector. The origin of the plane is X1.
I need to turn the two edges from the plane into two vectors, E1 and E2. The edges connect at the origin. So the three vectors form an axis.
I'm using the GLM library for the vector mathematics.
One way to create a frame from a vector is to use Householder transformations. This may seem complicated, but the code is quite short, at least as efficient as using cross products, and less prone to rounding error. Moreover, exactly the same idea works in any number of dimensions.
The idea is, given a vector v, find a Householder transformation that maps v to a multiple of (1,0,0), and then apply the inverse of this to (0,1,0) and (0,0,1) to get the other frame vectors. Since a Householder transformation is its own inverse, and since they are simple to apply, the resulting code is fairly efficient. Below is C code that I use:
#include <math.h> // hypot, fabs

// v: input vector of 3 doubles, must be non-zero
// f: output, 9 doubles; frame vector i is stored in f[3*i+0 .. 3*i+2]
static void make_frame( const double* v, double* f)
{
    double lv = hypot( hypot( v[0], v[1]), v[2]); // length of v
    double s = v[0] > 0.0 ? -1.0 : 1.0;
    double h[3] = { v[0] - s*lv, v[1], v[2]}; // householder vector for Q
    double a = 1.0/(lv*(lv + fabs( v[0]))); // == 2/(h'*h)
    double b;
    // first frame vector is v normalised
    b = 1.0/lv;
    f[3*0+0] = b*v[0]; f[3*0+1] = b*v[1]; f[3*0+2] = b*v[2];
    // compute other frame vectors by applying Q to (0,1,0) and (0,0,1)
    b = -v[1]*a;
    f[3*1+0] = b*h[0]; f[3*1+1] = 1.0 + b*h[1]; f[3*1+2] = b*h[2];
    b = -v[2]*a;
    f[3*2+0] = h[0]*b; f[3*2+1] = b*h[1]; f[3*2+2] = 1.0 + b*h[2];
}
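For comparison, the cross-product approach mentioned above might look roughly like the sketch below. This is a different technique from the Householder code, not a rewrite of it. Vec3, makeFrameCross and the helper functions are illustrative names. The helper axis is chosen as the coordinate axis least aligned with v, so the first cross product cannot come out close to zero.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };

double dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

Vec3 cross(const Vec3& a, const Vec3& b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

Vec3 normalize(const Vec3& v) {
    double len = std::sqrt(dot(v, v));
    assert(len > 0.0);  // a zero vector has no direction
    return {v.x / len, v.y / len, v.z / len};
}

// Fills e1 and e2 with unit vectors spanning the plane perpendicular to v.
// (normalize(v), e1, e2) is a right-handed orthonormal frame.
void makeFrameCross(const Vec3& v, Vec3& e1, Vec3& e2) {
    Vec3 n = normalize(v);
    // Helper axis: the coordinate axis least aligned with n.
    double ax = std::fabs(n.x), ay = std::fabs(n.y), az = std::fabs(n.z);
    Vec3 helper = {0.0, 0.0, 1.0};
    if (ax <= ay && ax <= az) helper = {1.0, 0.0, 0.0};
    else if (ay <= az) helper = {0.0, 1.0, 0.0};
    e1 = normalize(cross(n, helper));
    e2 = cross(n, e1);  // already unit length: n and e1 are orthonormal
}
```

For the question as asked, v would be the velocity V1 and the resulting e1, e2 would be the edge vectors E1 and E2 attached at X1.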
In general you can define a plane in 3D using four numbers, e.g., Ax+By+Cz=D. You can think of the triple of numbers (A,B,C) as a vector that sticks out perpendicularly to the plane (called the normal vector).
The normal vector n = (A,B,C) only defines the orientation of the plane, so depending on the choice of the constant D you get planes at different distance from the origin.
If I understand your question correctly, the plane you're looking for has normal vector (A,B,C) = V1 and the constant D is obtained using the dot product: D = (A,B,C) . X1, i.e., D = A*X1.x + B*X1.y + C*X1.z.
Note you can also obtain the same result using the geometric equation of a plane n . ((x,y,z) - p0) = 0, where p0 is some point on the plane, which in your case is V1 . ( (x,y,z) - X1) = 0.
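In code, the plane equation above comes down to one dot product. A minimal sketch, with hypothetical names (Vec3, Plane, planeThrough, evaluate) and plain structs instead of GLM types:

```cpp
#include <cassert>

struct Vec3 { double x, y, z; };
struct Plane { double a, b, c, d; };  // a*x + b*y + c*z = d

double dot(const Vec3& u, const Vec3& v) {
    return u.x * v.x + u.y * v.y + u.z * v.z;
}

// Plane through `point` with normal vector `normal` (here X1 and V1).
Plane planeThrough(const Vec3& point, const Vec3& normal) {
    assert(dot(normal, normal) > 0.0);  // a zero velocity defines no plane
    return {normal.x, normal.y, normal.z, dot(normal, point)};
}

// Zero on the plane, positive on the side the normal points to.
double evaluate(const Plane& p, const Vec3& q) {
    return p.a * q.x + p.b * q.y + p.c * q.z - p.d;
}
```

For example, with X1 = (1, 2, 3) and V1 = (0, 0, 2) this gives D = 6, and any point with z = 3 evaluates to zero.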
So I want to compute the volume of spheres (unit balls), cuboids (cubes) that are transformed using arbitrary transformation matrices.
E.g.: I have a sphere with a radius of 1 at the center of my 3D space, and a transformation to apply to that sphere. What would the volume (or radius) be after that? How can I extract that information from the transformation matrix? I know that translation and rotation matrices won't affect it, but scaling matrices will.
Thanks in Advance!
The transformation's determinant specifies how the volume of any object changes: the volume is multiplied by the absolute value of the determinant. If the determinant is 1 or -1, the volume is preserved. If it is negative, there is a mirroring included, which results in a reversed face order.
Only use the linear part of the matrix to calculate the determinant (disregarding translations and perspective transformations).
Another measure that you might be interested in is the matrix's eigenvalues and eigenvectors (or singular values). They describe the shape and orientation of the ellipsoid that the matrix transforms a unit sphere into.
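A short sketch of the determinant approach, assuming a 4x4 affine matrix whose upper-left 3x3 block is the linear part. Mat4, linearDeterminant and transformedVolume are illustrative names. Since the determinant does not change under transposition, the same code works for row-major and column-major storage.

```cpp
#include <cassert>
#include <cmath>

struct Mat4 { double m[4][4]; };

// Determinant of the upper-left 3x3 block (the linear part); the translation
// entries and the homogeneous row/column are ignored.
double linearDeterminant(const Mat4& t) {
    const double (*m)[4] = t.m;
    return m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
         - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
         + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]);
}

// Volume after an affine transform: the original volume times |det|.
double transformedVolume(double volume, const Mat4& t) {
    assert(volume >= 0.0);
    return volume * std::fabs(linearDeterminant(t));
}
```

For a unit ball (volume 4/3 * pi) scaled by (2, 3, 4), the result is 24 times the original volume, whatever translation is included. A pure shear or a mirroring leaves the volume unchanged.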
If the transformation is a rigid motion, then the volume is unaffected by the transform. If the transform is the composition of a rigid motion and a uniform scaling, then the volume is multiplied by the cube of the scale factor.
That is, if S is a homogeneous scaling transform of the form:
    [ s 0 0 0 ]
S = [ 0 s 0 0 ]
    [ 0 0 s 0 ]
    [ 0 0 0 1 ]
and R is a rigid homogeneous transform, then a body of volume V transformed by S.R or R.S has volume V*s^3.
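Other than these two special cases (rigid motion and uniform scaled motion), a general affine transform multiplies the volume by the absolute value of the determinant of its upper-left 3x3 block (which is zero in trivial cases such as a transform that maps all points to a plane). For transforms that are not affine, such as perspective projections, there is no simple formula for the transformed volumes.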
OpenGL transform matrix is stored like this:
double m[16]; // it is 4x4 matrix stored as 1 dimensional array for speed
m[0]=xx; m[4]=yx; m[ 8]=zx; m[12]=x0;
m[1]=xy; m[5]=yy; m[ 9]=zy; m[13]=y0;
m[2]=xz; m[6]=yz; m[10]=zz; m[14]=z0;
m[3]= 0; m[7]= 0; m[11]= 0; m[15]= 1;
where:
X(xx,xy,xz) is the X axis vector in GCS (global coordinate system)
Y(yx,yy,yz) is the Y axis vector in GCS
Z(zx,zy,zz) is the Z axis vector in GCS
P(x0,y0,z0) is the origin of the represented coordinate system in GCS
(the axis vectors have unit length only if the transform contains no scale)
When the transform keeps the axes perpendicular (rotation, translation, scale, ...),
the shape is 'preserved':
a cube becomes a cuboid and a sphere becomes an ellipsoid.
The length of each axis vector is then the scale along that axis,
so if the volume of the object before the transformation is V0,
the volume after the transform is:
V1=V0*|X|*|Y|*|Z|
When non-orthogonal transforms are applied (skew, projections, ...):
Skew is non-orthogonal but still linear.
The volume then scales by the absolute determinant of the axes, V1=V0*|X.(Y x Z)|,
which reduces to the formula above when the axes are perpendicular,
but the shape is broken,
because non-orthogonal transforms change
not only the size but the shape itself!
Non-linear transforms like projections change the volume non-uniformly;
in this case you have to compute the volume the hard way:
transform the object's surface and compute the volume via integration.
How to determine orthogonality of a transform?
Easy: the transform is orthogonal if the X, Y, Z vectors are perpendicular
to each other, so:
XY = X x Y; XY /= |XY|; Z0=Z/|Z|;
XZ = X x Z; XZ /= |XZ|; Y0=Y/|Y|;
YZ = Y x Z; YZ /= |YZ|; X0=X/|X|;
if ((XY!=Z0)&&(XY!=-Z0)) non orthogonal;
if ((XZ!=Y0)&&(XZ!=-Y0)) non orthogonal;
if ((YZ!=X0)&&(YZ!=-X0)) non orthogonal;
Do not forget to do the comparisons with some tolerance
to avoid floating-point problems.
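An equivalent check that may be easier to write with a tolerance uses dot products instead of comparing normalised cross products. Perpendicular axes have a dot product of zero. The sketch below reads the axes from the column-major m[16] layout shown earlier. The function names and the default eps are assumptions for illustration, not part of OpenGL.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };

double dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

Vec3 cross(const Vec3& a, const Vec3& b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

// Reads axis i (0 = X, 1 = Y, 2 = Z) from a column-major OpenGL matrix.
Vec3 axis(const double m[16], int i) {
    return {m[4 * i + 0], m[4 * i + 1], m[4 * i + 2]};
}

// True if the three axis vectors are mutually perpendicular within eps.
// Each dot product is compared against eps times the axis lengths, so eps
// bounds the cosine of the angle between two axes regardless of scale.
bool axesPerpendicular(const double m[16], double eps = 1e-9) {
    assert(eps >= 0.0);
    Vec3 x = axis(m, 0), y = axis(m, 1), z = axis(m, 2);
    double lx = std::sqrt(dot(x, x));
    double ly = std::sqrt(dot(y, y));
    double lz = std::sqrt(dot(z, z));
    if (lx == 0.0 || ly == 0.0 || lz == 0.0) return false;  // degenerate axis
    return std::fabs(dot(x, y)) <= eps * lx * ly &&
           std::fabs(dot(y, z)) <= eps * ly * lz &&
           std::fabs(dot(z, x)) <= eps * lz * lx;
}

// Volume scale factor |X . (Y x Z)|; equals |X|*|Y|*|Z| when the axes are
// perpendicular.
double volumeScale(const double m[16]) {
    return std::fabs(dot(axis(m, 0), cross(axis(m, 1), axis(m, 2))));
}
```

A matrix scaled by (2, 3, 4) passes the check with a volume scale of 24, while a skew with Y = (1, 1, 0) fails it even though its volume scale is 1.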
How to detect non-linearity of a transform?
Create a set of points evenly placed inside some cube (same distance between them)
and transform them.
If the point distances are not the same along the new axes,
but change with distance from the origin, then the transform is non-linear.
Let's say I have two points in 3D space (a and b) and a fixed axis/unit vector called n.
I want to create a rotation matrix that minimizes the Euclidean distance between point a (unrotated) and the rotated point b.
E.g:
Q := matrix_from_axis_and_angle (n, alpha);
find the unknown alpha that minimizes sqrt(|a - b*Q|)
Btw - If a solution/algorithm can be easier expressed with unit-quaternions go ahead and use them. I just used matrices to formulate my question because they're more widely used.
Oh - I know there are some degenerate cases (a or b lying exactly in line with n, etc.). These can be ignored. I'm just looking for the case where a single solution can be calculated.
sounds fairly easy. Assume unit vector n implies rotation around a line parallel to n through point x0. If x0 != the origin, translate the coordinate system by -x0 to get points a' and b' relative to new coordinate system origin 0, and use those 2 points instead of a and b.
1) calculate vector ry = n x a
2) calculate unit vector uy = unit vector in direction ry
3) calculate unit vector ux = uy x n
You now have a triplet of mutually perpendicular unit vectors ux, uy, and n, which form a right-handed coordinate system. It can be shown that:
a = dot(a,n) * n + dot(a,ux) * ux
This is because unit vector uy is parallel to ry which is perpendicular to both a and n. (from step 1)
4) Calculate components of b along unit vectors ux, uy. a's components are (ax,0) where ax = dot(a,ux). b's components are (bx,by) where bx = dot(b,ux), by = dot(b,uy). Because of the right-handed coordinate system, ax is always positive so you don't actually need to calculate it.
5) Calculate theta = atan2(by, bx).
Your rotation matrix is the one which rotates by angle -theta relative to coordinate system (ux,uy,n) around the n-axis.
This yields degenerate answers if a is parallel to n (steps 1 and 2) or if b is parallel to n (steps 4, 5).
I think you can rephrase the question to:
what is the distance from a point to a 2d circle in 3d space.
the answer can be found here
so the steps needed are as follows:
rotating the point b around a vector n gives you a 2d circle in 3d space
using the above, find the distance to that circle (and the point on the circle)
the point on the circle is the rotated point b you are looking for.
deduce the rotated angle
...or something ;^)
The distance will be minimized when the vector from a to the line along n lines up with the vector from b to the line along n.
Project a and b into the plane perpendicular to n and solve the problem in 2 dimensions. The rotation you get there is the rotation you need to minimize the distance.
Let P be the plane that is perpendicular to n.
We can find the projection of a into the P-plane, (and similarly for b):
a' = a - (dot(a,n)) n
b' = b - (dot(b,n)) n
where dot(a,n) is the dot-product of a and n
a' and b' lie in the P-plane.
We've now reduced the problem to 2 dimensions. Yay!
The angle (of rotation) between a' and b' equals the angle (of rotation) needed to swing b around the n-axis so as to be closest to a. (Think about the shadows b would cast on the P-plane).
The angle between a' and b' is easy to find:
dot(a',b') = |a'| * |b'| * cos(theta)
Solve for theta.
Now you can find the rotation matrix given theta and n here:
http://en.wikipedia.org/wiki/Rotation_matrix
Jason S rightly points out that once you know theta, you must still decide to rotate b clockwise or counterclockwise about the n-axis.
The quantity dot((a x b),n) will be positive if (a x b) points to the same side as n, and negative if (a x b) points to the opposite side. (It is zero only if a' and b' are parallel, that is, theta is 0 or 180 degrees, or in the degenerate case where a or b is collinear with n.)
If (a x b) lies in the same direction as n, then b has to be rotated clockwise by the angle theta about the n-axis.
If (a x b) lies in the opposite direction, then b has to be rotated clockwise by the angle -theta about the n-axis.
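The angle and its sign can also be obtained in one step with atan2, which avoids the separate clockwise/counterclockwise decision. The following is a sketch under stated assumptions: n is a unit vector through the origin, positive angles are counterclockwise about n by the right-hand rule, and the rotation is applied with Rodrigues' formula rather than an explicit matrix. The function names are illustrative.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };

double dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

Vec3 cross(const Vec3& a, const Vec3& b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

// Component of v perpendicular to the unit axis n (projection onto plane P).
Vec3 projectToPlane(const Vec3& v, const Vec3& n) {
    double d = dot(v, n);
    return {v.x - d * n.x, v.y - d * n.y, v.z - d * n.z};
}

// Signed angle in radians (counterclockwise about n) that rotates b as
// close as possible to a.
double bestAngle(const Vec3& a, const Vec3& b, const Vec3& n) {
    assert(std::fabs(dot(n, n) - 1.0) < 1e-9);  // n must be a unit vector
    Vec3 ap = projectToPlane(a, n);
    Vec3 bp = projectToPlane(b, n);
    // The first argument is |a'||b'| sin(theta), the second |a'||b'| cos(theta).
    return std::atan2(dot(cross(bp, ap), n), dot(bp, ap));
}

// Rodrigues' rotation formula: rotates v about the unit axis n by angle.
Vec3 rotateAboutAxis(const Vec3& v, const Vec3& n, double angle) {
    double c = std::cos(angle), s = std::sin(angle);
    double d = dot(n, v) * (1.0 - c);
    Vec3 w = cross(n, v);
    return {v.x * c + w.x * s + n.x * d,
            v.y * c + w.y * s + n.y * d,
            v.z * c + w.z * s + n.z * d};
}
```

For n = (0, 0, 1), a = (0, 2, 5) and b = (3, 0, -1), the angle comes out as 90 degrees and the rotated b is (0, 3, -1), whose shadow on the P-plane lines up with that of a.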
I have an input 3D vector, along with the pitch and yaw of the camera. Can anyone describe or provide a link to a resource that will help me understand and implement the required transformation and matrix mapping?
The world-to-camera transformation matrix is the inverse of the camera-to-world matrix. The camera-to-world matrix is the combination of a translation to the camera's position and a rotation to the camera's orientation. Thus, if M is the 3x3 rotation matrix corresponding to the camera's orientation and t is the camera's position, then the 4x4 camera-to-world matrix is:
M00 M01 M02 tx
M10 M11 M12 ty
M20 M21 M22 tz
0 0 0 1
Note that I've assumed that vectors are column vectors which are multiplied on the right to perform transformations. If you use the opposite convention, make sure to transpose the matrix.
To find M, you can use one of the formulas listed on Wikipedia, depending on your particular convention for roll, pitch, and yaw. Keep in mind that those formulas use the convention that vectors are row vectors which are multiplied on the left.
Instead of computing the camera-to-world matrix and inverting it, a more efficient (and numerically stable) alternative is to calculate the world-to-camera matrix directly. Because M is a rotation, its inverse is its transpose, so the world-to-camera matrix has M^T (M transposed) as its 3x3 rotation block and -M^T t as its translation column. Note that simply negating the position and the roll, pitch, and yaw angles is not enough in general, because the inverse also reverses the order in which the rotations and the translation are applied.
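As a rough sketch of the direct route for a camera with only yaw and pitch: the code below builds the camera-to-world rotation M and then applies p_c = M^T * (p_w - t). The angle convention is an assumption and has to be adapted to your engine: yaw is taken about +Y, pitch about +X, M = Ry(yaw) * Rx(pitch), angles in radians, column vectors, camera looking down its local -Z axis. Mat3, cameraRotation and worldToCamera are illustrative names.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };
struct Mat3 { double m[3][3]; };  // row-major, for column vectors

// Camera-to-world rotation M = Ry(yaw) * Rx(pitch).
// Assumed convention: yaw about +Y, pitch about +X, angles in radians.
Mat3 cameraRotation(double yaw, double pitch) {
    assert(std::isfinite(yaw) && std::isfinite(pitch));
    double cy = std::cos(yaw), sy = std::sin(yaw);
    double cp = std::cos(pitch), sp = std::sin(pitch);
    Mat3 r = {{
        { cy, sy * sp, sy * cp},
        {0.0,      cp,     -sp},
        {-sy, cy * sp, cy * cp},
    }};
    return r;
}

// World-to-camera mapping p_c = M^T * (p_w - t), i.e. the inverse of
// p_w = M * p_c + t, without inverting a 4x4 matrix.
Vec3 worldToCamera(const Mat3& M, const Vec3& t, const Vec3& pw) {
    Vec3 d = {pw.x - t.x, pw.y - t.y, pw.z - t.z};
    // Multiplying by the transpose: the columns of M are used as rows.
    return {M.m[0][0] * d.x + M.m[1][0] * d.y + M.m[2][0] * d.z,
            M.m[0][1] * d.x + M.m[1][1] * d.y + M.m[2][1] * d.z,
            M.m[0][2] * d.x + M.m[1][2] * d.y + M.m[2][2] * d.z};
}
```

With a yaw of 90 degrees under this convention the camera looks along world -X, so a world point 5 units in that direction from the camera ends up at (0, 0, -5) in camera space.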
If we have a structure like this to describe a 4x4 matrix:
template <typename Type>
class Matrix4x4
{
public:
    union
    {
        // individual elements, one row per axis
        struct
        {
            Type Xx, Xy, Xz, Xw;
            Type Yx, Yy, Yz, Yw;
            Type Zx, Zy, Zz, Zw;
            Type Wx, Wy, Wz, Ww;
        };
        // the same storage viewed as camera axes plus position
        struct
        {
            Vector3<Type> Right;
            Type XW;
            Vector3<Type> Up;
            Type YW;
            Vector3<Type> Look;
            Type ZW;
            Vector3<Type> Pos;
            Type WW;
        };
        Type asDoubleArray[4][4];
        Type asArray[16];
    };
};
If all you have is Euler angles, that is, angles representing the yaw, pitch, and roll, and a point in 3D space for the position, you can calculate the Right, Up, and Look vectors. Note that Right, Up, and Look are just the X, Y, Z vectors, but since this is a camera, I find it easier to name them so. The simplest way to apply your rotations to the camera matrix is to build a series of rotation matrices and multiply the camera matrix by each rotation matrix.
A good reference for that is here: http://www.euclideanspace.com
Once you have applied all the needed rotations, you can set the vector Pos to the camera's position in the world space.
Lastly, before you apply the camera's transformation, you need to take the inverse of the camera's matrix. This is what you are going to multiply your modelview matrix by before you start drawing polygons. For the matrix class above, the inverse is calculated like this:
template <typename Type>
Matrix4x4<Type> Matrix4x4<Type>::OrthoNormalInverse(void)
{
    Matrix4x4<Type> OrthInv;
    OrthInv = Transpose();
    OrthInv.Xw = 0;
    OrthInv.Yw = 0;
    OrthInv.Zw = 0;
    OrthInv.Wx = -(Right*Pos);
    OrthInv.Wy = -(Up*Pos);
    OrthInv.Wz = -(Look*Pos);
    return OrthInv;
}
So finally, with all our matrix construction out of the way, you would be doing something like this:
Matrix4x4<float> cameraMatrix, rollRotation, pitchRotation, yawRotation;
Vector4<float> cameraPosition;
cameraMatrix = cameraMatrix * rollRotation * pitchRotation * yawRotation;
Matrix4x4<float> invCameraMat;
invCameraMat = cameraMatrix.OrthoNormalInverse();
glMultMatrixf(invCameraMat.asArray);
Hope this helps.
What you are describing is called 'Perspective Projection' and there are reams of resources on the web that explain the matrix math and give the code necessary to do this. You could start with the Wikipedia page.