OpenGL Math - Projecting Screen space to World space coords

Time for a little bit of math for the end of the day..
I need to project the 4 corner points of the window:
<0,0> <1024,768>
into world space coordinates so that they form a quadrilateral that will later be used for terrain culling, without gluUnProject.
For testing only, I use the mouse coordinates and try to project them onto the world coordinates.

RESOLVED
Here's how to do it exactly, step by step.
1) Obtain your mouse coordinates within the client area.
2) Get your Projection matrix and View matrix (these are enough if no Model matrix is required).
3) Multiply Projection * View.
4) Invert the result of the multiplication.
5) Construct a vector4 consisting of:
   x = mouseposition.x within the range of the window width, transformed to a value between -1 and 1
   y = mouseposition.y within the range of the window height, transformed to a value between -1 and 1 (remember to invert mouseposition.y if needed)
   z = the depth value (this can be obtained with glReadPixels and remapped from [0,1] to [-1,1]), or a value you pick manually between -1 (zNear) and 1 (zFar)
   w = 1.0
6) Multiply the vector by the inverse matrix created before.
7) Divide the resulting vector by its w component after the matrix multiplication (perspective division).
POINT mousePos;
GetCursorPos(&mousePos);
ScreenToClient( this->GetWindowHWND(), &mousePos );
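// View * Projection order, which matches the vector-on-the-left multiplication (vIn * matInverse) used below
CMatrix4x4 matProjection = m_pCamera->getViewMatrix() * m_pCamera->getProjectionMatrix() ;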
CMatrix4x4 matInverse = matProjection.inverse();
float in[4];
float winZ = 1.0;
in[0]=(2.0f*((float)(mousePos.x-0)/(this->GetResolution().x-0)))-1.0f;
in[1]=1.0f-(2.0f*((float)(mousePos.y-0)/(this->GetResolution().y-0)));
in[2]=2.0* winZ -1.0;
in[3]=1.0;
CVector4 vIn = CVector4(in[0],in[1],in[2],in[3]);
pos = vIn * matInverse;
pos.w = 1.0 / pos.w;
pos.x *= pos.w;
pos.y *= pos.w;
pos.z *= pos.w;
sprintf(strTitle,"%f %f %f / %f,%f,%f ",m_pCamera->m_vPosition.x,m_pCamera->m_vPosition.y,m_pCamera->m_vPosition.z,pos.x,pos.y,pos.z);
SetWindowText(this->GetWindowHWND(),strTitle);
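The pixel-to-NDC step on its own can be sketched as below. This is a minimal illustration, assuming the mouse origin is the top-left corner of the client area; the names Ndc2 and mouseToNdc are hypothetical and not part of the code above.

```cpp
#include <cassert>

// Normalised device coordinates in the x/y plane, each in [-1, 1].
struct Ndc2 { float x, y; };

// Maps a client-area pixel (origin top-left, y pointing down) to NDC
// (origin at the centre of the window, y pointing up).
Ndc2 mouseToNdc(int mouseX, int mouseY, int width, int height) {
    assert(width > 0 && height > 0);
    Ndc2 ndc;
    ndc.x = 2.0f * static_cast<float>(mouseX) / static_cast<float>(width) - 1.0f;
    ndc.y = 1.0f - 2.0f * static_cast<float>(mouseY) / static_cast<float>(height);
    return ndc;
}
```

For a 1024x768 window, the pixel (0, 0) maps to (-1, 1) and the centre pixel (512, 384) maps to (0, 0).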

I had to make some adjustments to the answers provided here. But here's the code I ended up with (Note I'm using GLM, that could affect multiplication order). nearResult is the projected point on the near plane and farResult is the projected point on the far plane. I want to perform a ray cast to see what my mouse is hovering over so I convert them to a direction vector which will then originate from my camera's position.
vec3 getRayFromScreenSpace(const vec2 & pos)
{
mat4 invMat= inverse(m_glData.getPerspective()*m_glData.getView());
vec4 near = vec4((pos.x - Constants::m_halfScreenWidth) / Constants::m_halfScreenWidth, -1*(pos.y - Constants::m_halfScreenHeight) / Constants::m_halfScreenHeight, -1, 1.0);
vec4 far = vec4((pos.x - Constants::m_halfScreenWidth) / Constants::m_halfScreenWidth, -1*(pos.y - Constants::m_halfScreenHeight) / Constants::m_halfScreenHeight, 1, 1.0);
vec4 nearResult = invMat*near;
vec4 farResult = invMat*far;
nearResult /= nearResult.w;
farResult /= farResult.w;
vec3 dir = vec3(farResult - nearResult );
return normalize(dir);
}

Multiply all your matrices. Then invert the result. Points after projection are always in the range -1 to 1, so the four screen corner points are -1,-1; -1,1; 1,-1; 1,1. But you still need to choose the z value. If you are in OpenGL, z is between -1 and 1. For DirectX, the range is 0 to 1. Finally, take your points and transform them with the inverted matrix.
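As a rough sketch of the corner points described here, the helper below lists the eight frustum corners in normalised device coordinates for a chosen depth range. The names Ndc3 and frustumCornersNdc are made up for this illustration; each corner would still have to be transformed by the inverted matrix and divided by w.

```cpp
#include <array>
#include <cassert>

struct Ndc3 { float x, y, z; };

// Returns the four screen corners at the near depth followed by the same
// four corners at the far depth. Pass (-1, 1) for OpenGL, (0, 1) for DirectX.
std::array<Ndc3, 8> frustumCornersNdc(float zNearNdc, float zFarNdc) {
    assert(zNearNdc < zFarNdc);
    const float xy[4][2] = {{-1.0f, -1.0f}, {-1.0f, 1.0f}, {1.0f, -1.0f}, {1.0f, 1.0f}};
    std::array<Ndc3, 8> corners{};
    for (int i = 0; i < 4; ++i) {
        corners[i]     = {xy[i][0], xy[i][1], zNearNdc};
        corners[i + 4] = {xy[i][0], xy[i][1], zFarNdc};
    }
    return corners;
}
```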

If you have access to the glu libraries, use gluUnProject(winX, winY, winZ, model, projection, viewport, &objX, &objY, &objZ);
winX and winY will be the corners of your screen in pixels. winZ is a number in [0,1] which will specify where between zNear and zFar (clipping planes) the points should fall. objX-Z will hold the results. The middle variables are the relevant matrices. They can be queried if needed.
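For readers without GLU, the following is a self-contained sketch of the same kind of computation, not the GLU source. It assumes column vectors (clip = M * v), matrices stored as m[row][col], a viewport whose origin is at (0, 0), and a combined projection * view matrix passed in by the caller. All names (Mat4, perspective, invert, unproject) are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <utility>

using Mat4 = std::array<std::array<double, 4>, 4>;  // m[row][col]
using Vec4 = std::array<double, 4>;

// OpenGL-style perspective matrix, used here only to have something to invert.
Mat4 perspective(double fovyRadians, double aspect, double zNear, double zFar) {
    assert(zNear > 0.0 && zFar > zNear && aspect > 0.0);
    const double f = 1.0 / std::tan(fovyRadians / 2.0);
    Mat4 m{};  // all zeros
    m[0][0] = f / aspect;
    m[1][1] = f;
    m[2][2] = (zFar + zNear) / (zNear - zFar);
    m[2][3] = 2.0 * zFar * zNear / (zNear - zFar);
    m[3][2] = -1.0;
    return m;
}

// Gauss-Jordan inverse with partial pivoting; returns false if singular.
bool invert(Mat4 a, Mat4& out) {
    Mat4 inv{};
    for (int i = 0; i < 4; ++i) inv[i][i] = 1.0;
    for (int col = 0; col < 4; ++col) {
        int pivot = col;
        for (int r = col + 1; r < 4; ++r)
            if (std::fabs(a[r][col]) > std::fabs(a[pivot][col])) pivot = r;
        if (std::fabs(a[pivot][col]) < 1e-12) return false;
        std::swap(a[col], a[pivot]);
        std::swap(inv[col], inv[pivot]);
        const double d = a[col][col];
        for (int c = 0; c < 4; ++c) { a[col][c] /= d; inv[col][c] /= d; }
        for (int r = 0; r < 4; ++r) {
            if (r == col) continue;
            const double k = a[r][col];
            for (int c = 0; c < 4; ++c) {
                a[r][c] -= k * a[col][c];
                inv[r][c] -= k * inv[col][c];
            }
        }
    }
    out = inv;
    return true;
}

Vec4 mul(const Mat4& m, const Vec4& v) {
    Vec4 r{};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j) r[i] += m[i][j] * v[j];
    return r;
}

// winX/winY in pixels with the origin at the bottom-left, winZ in [0, 1].
bool unproject(double winX, double winY, double winZ, const Mat4& projView,
               double vpWidth, double vpHeight, Vec4& world) {
    assert(vpWidth > 0.0 && vpHeight > 0.0);
    Mat4 inv;
    if (!invert(projView, inv)) return false;
    const Vec4 ndc = {2.0 * winX / vpWidth - 1.0, 2.0 * winY / vpHeight - 1.0,
                      2.0 * winZ - 1.0, 1.0};
    world = mul(inv, ndc);
    if (world[3] == 0.0) return false;
    for (int i = 0; i < 3; ++i) world[i] /= world[3];  // perspective divide
    world[3] = 1.0;
    return true;
}
```

With an identity view matrix, unprojecting the centre of the viewport at winZ = 0 lands on the near plane and winZ = 1 lands on the far plane, which is a quick sanity check for the matrix conventions in use.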

Related

How to calculate ray in real-world coordinate system from image using projection matrix?

Given n images and a projection matrix for each image, how can i calculate the ray (line) emitted by each pixel of the images, which is intersecting one of the three planes of the real-world coordinate system? The object captured by the camera is at the same position, just the camera's position is different for each image. That's why there is a separate projection matrix for each image.
As far as my research suggests, this is the inverse of the 3D to 2D projection. Since information is lost when projecting to 2D, it's only possible to calculate the ray (line) in the real-world coordinate system, which is fine.
An example projection matrix P, which is calculated from the given K, R and t components according to K*[R t]:
K =
3310.400000 0.000000 316.730000
0.000000 3325.500000 200.550000
0.000000 0.000000 1.000000
R =
-0.14396457836077139000 0.96965263281337499000 0.19760617153779569000
-0.90366580603479685000 -0.04743335255026152200 -0.42560419233334673000
-0.40331536459778505000 -0.23984130575212276000 0.88306936201487163000
t =
-0.010415508744
-0.0294278883669
0.673097816109
P =
-604.322 3133.973 933.850 178.711
-3086.026 -205.840 -1238.247 37.127
-0.403 -0.240 0.883 0.673
I am using the "DinoSparseRing" data set available at http://vision.middlebury.edu/mview/data
for (int i = 0; i < 16; i++) {
RealMatrix rotationMatrix = MatrixUtils.createRealMatrix(rotationMatrices[i]);
RealVector translationVector = MatrixUtils.createRealVector(translationMatrices[i]);
// construct projection matrix according to K*[R t]
RealMatrix projMatrix = getP(kalibrationMatrices[i], rotationMatrices[i], translationMatrices[i]);
// getM returns the first 3x3 block of the 3x4 projection matrix
RealMatrix projMInverse = MatrixUtils.inverse(getM(projMatrix));
// compute camera center
RealVector c = rotationMatrix.transpose().scalarMultiply(-1.f).operate(translationVector);
// compute all unprojected points and direction vector per project point
for (int m = 0; m < image_m_num_pixel; m++) {
for (int n = 0; n < image_n_num_pixel; n++) {
double[] projectedPoint = new double[]{
n,
m,
1};
// undo perspective divide
projectedPoint[0] *= projectedPoint[2];
projectedPoint[1] *= projectedPoint[2];
// undo projection by multiplying by inverse:
RealVector projectedPointVector = MatrixUtils.createRealVector(projectedPoint);
RealVector unprojectedPointVector = projMInverse.operate(projectedPointVector);
// compute direction vector
RealVector directionVector = unprojectedPointVector.subtract(c);
// normalize direction vector
double dist = Math.sqrt((directionVector.getEntry(0) * directionVector.getEntry(0))
+ (directionVector.getEntry(1) * directionVector.getEntry(1))
+ (directionVector.getEntry(2) * directionVector.getEntry(2)));
directionVector.setEntry(0, directionVector.getEntry(0) * (1.0 / dist));
directionVector.setEntry(1, directionVector.getEntry(1) * (1.0 / dist));
directionVector.setEntry(2, directionVector.getEntry(2) * (1.0 / dist));
}
}
}
The following 2 plots show the outer rays for each image (16 images in total). The blue end is the camera point and the cyan box is a bounding box containing the object captured by the camera. One can clearly see the rays projecting back to the object in the world coordinate system.
To define the ray you need a start point (which is the camera/eye position) and a direction vector, which can be calculated using any point on the ray.
For a given pixel in the image, you have a projected X and Y (zeroed at the center of the image) but no Z depth value. However the real-world co-ordinates corresponding to all possible depth values for that pixel will all lie on the ray you are trying to calculate, so you can just choose any arbitrary non-zero Z value, since any point on the ray will do.
float projectedX = (x - imageCenterX) / (imageWidth * 0.5f);
float projectedY = (y - imageCenterY) / (imageHeight * 0.5f);
float projectedZ = 1.0f; // any arbitrary value
Now that you have a 3D projected co-ordinate you can undo the projection by applying the perspective divide in reverse by multiplying X and Y by Z, then multiplying the result by the inverse projection matrix to get the unprojected point.
// undo perspective divide (redundant if projectedZ = 1, shown for completeness)
projectedX *= projectedZ;
projectedY *= projectedZ;
Vector3 projectedPoint = new Vector3(projectedX, projectedY, projectedZ);
// undo projection by multiplying by inverse:
Matrix invProjectionMat = projectionMat.inverse();
Vector3 unprojectedPoint = invProjectionMat.multiply(projectedPoint);
Subtract the camera position from the unprojected point to get the direction vector from the camera to the point, and then normalize it. (This step assumes that the projection matrix defines both the camera position and orientation, if the position is stored separately then you don't need to do the subtraction)
Vector3 directionVector = unprojectedPoint.subtract(cameraPosition);
directionVector.normalize();
The ray is defined by the camera position and the normalized direction vector. You can then intersect it with any of the X, Y, Z planes.
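The final intersection step can be sketched as follows, assuming the planes of interest are axis-aligned (x = offset, y = offset or z = offset). Vec3 and intersectAxisPlane are hypothetical names introduced only for this example.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };

// Intersects the ray origin + t * dir (t >= 0) with an axis-aligned plane.
// axis: 0 means the plane x = offset, 1 means y = offset, 2 means z = offset.
// Returns false when the ray is parallel to the plane or the plane is behind the origin.
bool intersectAxisPlane(const Vec3& origin, const Vec3& dir, int axis,
                        double offset, Vec3& hit) {
    assert(axis >= 0 && axis <= 2);
    const double o[3] = {origin.x, origin.y, origin.z};
    const double d[3] = {dir.x, dir.y, dir.z};
    if (std::fabs(d[axis]) < 1e-12) return false;   // parallel to the plane
    const double t = (offset - o[axis]) / d[axis];  // ray parameter at the plane
    if (t < 0.0) return false;                      // plane lies behind the origin
    hit = {o[0] + t * d[0], o[1] + t * d[1], o[2] + t * d[2]};
    return true;
}
```

A camera at (1, 2, 5) looking along (0, 0, -1) hits the plane z = 0 at (1, 2, 0), while the same ray never reaches the plane x = 0 because it runs parallel to it.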

Calculate Angle from Two Points and a Direction Vector

I have two vectors in a game. One vector is the player's position, the other is an object's position. I also have a vector that specifies the direction the player is facing. The direction vector has no z part. It is a point with a magnitude of 1, placed somewhere around the origin.
I want to calculate the angle between the direction the soldier is currently facing and the object, so I can correctly pan some audio (stereo only).
The diagram below describes my problem. I want to calculate the angle between the two dashed lines. One dashed line connects the player and the object, and the other is a line representing the direction the player is facing from the point the player is at.
At the moment, I am doing this (assume player, object and direction are all vectors with 3 components: x, y and z):
Vector3d v1 = direction;
Vector3d v2 = object - player;
v1.normalise();
v2.normalise();
float angle = acos(dotProduct(v1, v2));
But it seems to give me incorrect results. Any advice?
Test of code:
Vector3d soldier = Vector3d(1.f, 1.f, 0.f);
Vector3d object = Vector3d(1.f, -1.f, 0.f);
Vector3d dir = Vector3d(1.f, 0.f, 0.f);
Vector3d v1 = dir;
Vector3d v2 = object - soldier;
long steps = 360;
for (long step = 0; step < steps; step++) {
float rad = (float)step * (M_PI / 180.f);
v1.x = cosf(rad);
v1.y = sinf(rad);
v1.normalise();
float dx = dotProduct(v2, v1);
float dy = dotProduct(v2, soldier);
float vangle = atan2(dx, dy);
}
You should always use atan2 when computing angular deltas, and then normalize.
The reason is that acos, for example, is a function with domain -1...1; even after normalizing the vectors, if the absolute value of the input ends up slightly bigger than 1 (because of floating-point approximations), the function will fail, even though it's clear that in such a case you would have wanted an angle of 0 or PI. Also, acos cannot measure the full range -PI..PI, so you'd need explicit sign tests to find the correct quadrant.
By contrast, the only singularity of atan2 is at (0, 0) (where of course it doesn't make sense to compute an angle) and its codomain is the full circle -PI...PI.
Here is an example in C++
// Absolute angle 1
double a1 = atan2(object.y - player.y, object.x - player.x);
// Absolute angle 2
double a2 = atan2(direction.y, direction.x);
// Relative angle
double rel_angle = a1 - a2;
// Normalize to -PI .. +PI
rel_angle -= floor((rel_angle + PI)/(2*PI)) * (2*PI);
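A self-contained version of the 2D case might look like the sketch below. The function names wrapToPi and relativeAngle2D are hypothetical; with standard math axes, a positive result means the object lies counter-clockwise from the facing direction.

```cpp
#include <cassert>
#include <cmath>

const double kPi = 3.14159265358979323846;

// Wraps any angle in radians into the range [-PI, PI).
double wrapToPi(double a) {
    return a - std::floor((a + kPi) / (2.0 * kPi)) * (2.0 * kPi);
}

// Signed angle from the facing direction to the player->object direction.
double relativeAngle2D(double playerX, double playerY,
                       double objectX, double objectY,
                       double dirX, double dirY) {
    assert(dirX != 0.0 || dirY != 0.0);  // atan2(0, 0) carries no direction
    const double toObject = std::atan2(objectY - playerY, objectX - playerX);
    const double facing = std::atan2(dirY, dirX);
    return wrapToPi(toObject - facing);
}
```

With the values from the question's test (soldier at (1, 1), object at (1, -1), facing (1, 0)) this gives -PI/2, i.e. the object is a quarter turn clockwise from the facing direction.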
In the case of a general 3d orientation you need two orthogonal directions, e.g. the vector of where the nose is pointing to and the vector to where your right ear is.
In that case the formulas are just slightly more complex, but simpler if you have the dot product handy:
// I'm assuming that '*' is defined as the dot product
// between two vectors: x1*x2 + y1*y2 + z1*z2
double dx = (object - player) * nose_direction;
double dy = (object - player) * right_ear_direction;
double angle = atan2(dy, dx); // 0 when the object is straight ahead; already in -PI ... PI range
In 3D space, you also need to compute the axis:
Vector3d axis = normalise(crossProduct(normalise(v1), normalise(v2)));

Radius of projected Sphere

I want to refine a previous question:
How do I project a sphere onto the screen?
(2) gives a simple solution:
approximate radius on screen[CLIP SPACE] = world radius * cot(fov / 2) / Z
with:
fov = field of view angle
Z = z distance from camera to sphere
result is in clipspace, multiply by viewport size to get size in pixels
Now my problem is that i don't have the FOV. Only the view and projection matrices are known. (And the viewport size if that does help)
Does anyone know how to extract the FOV from the projection matrix?
Update:
This approximation works better in my case:
float angularRadius = glm::atan(radius / distance); // angle subtended by the sphere's radius
float screenRadius = angularRadius * glm::max(viewPort.width, viewPort.height) / glm::radians(fov);
I'm a bit late to this party. But I came across this thread when I was looking into the same problem. I spent a day looking into this and worked through some excellent articles I found here:
http://www.antongerdelan.net/opengl/virtualcamera.html
I ended up starting with the projection matrix and working backwards. I got the same formula you mention in your post above. ( where cot(x) = 1/tan(x) )
radius_pixels = (radius_worldspace / {tan(fovy/2) * D}) * (screen_height_pixels / 2)
(where D is the distance from camera to the target's bounding sphere)
I'm using this approach to determine the radius of an imaginary trackball that I use to rotate my object.
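The radius_pixels formula above translates directly into a small function. This is only a sketch: projectedRadiusPixels is a hypothetical name, the field of view is assumed to be the vertical one in radians, and the result is the same approximation as the formula, not an exact silhouette size.

```cpp
#include <cassert>
#include <cmath>

// radius_pixels = (radius_worldspace / (tan(fovy / 2) * D)) * (screen_height_pixels / 2)
double projectedRadiusPixels(double radiusWorld, double fovyRadians,
                             double distance, double screenHeightPixels) {
    assert(distance > 0.0 && fovyRadians > 0.0);
    return radiusWorld / (std::tan(fovyRadians / 2.0) * distance)
           * (screenHeightPixels / 2.0);
}
```

For a 90 degree vertical field of view, a unit sphere 10 units away on a 600 pixel high screen comes out at roughly 30 pixels.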
Btw Florian, you can extract the fovy from the Projection matrix as follows:
If you take the Sy component from the Projection matrix as shown here:
Sx 0 0 0
0 Sy 0 0
0 0 Sz Pz
0 0 -1 0
where Sy = near / range
and where range = tan(fovy/2) x near
(you can find these definitions at the page I linked above)
if you substitute range in the Sy eqn above you get:
Sy = 1 / tan(fovy/2) = cot(fovy/2)
rearranging:
tan(fovy/2) = 1 / Sy
taking arctan (the inverse of tan) of both sides we get:
fovy/2 = arctan(1/Sy)
so,
fovy = 2 x arctan(1/Sy)
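As a sketch of this derivation in code: Sy is the second diagonal element of the projection matrix, which sits at index 5 when the 16 values are stored the way glGetDoublev returns them. The function names below are hypothetical, and a symmetric perspective projection is assumed.

```cpp
#include <cassert>
#include <cmath>

// fovy = 2 * arctan(1 / Sy), valid for a symmetric perspective projection.
double fovyFromSy(double sy) {
    assert(sy > 0.0);
    return 2.0 * std::atan(1.0 / sy);
}

// Convenience wrapper for a 16-element OpenGL-style matrix array,
// where the Sy term is stored at index 5.
double fovyFromGlMatrix(const double m[16]) {
    return fovyFromSy(m[5]);
}
```

Because the radius formula only needs tan(fovy/2) = 1 / Sy, the angle itself never has to be recovered if the projected radius is all that is wanted.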
Not sure if you still care (it's been a while!) but maybe this will help someone else.
Update: see below.
Since you have the view and projection matrices, here's one way to do it, though it's probably not the shortest:
1) transform the sphere's center into view space using the view matrix: call the result point C
2) transform a point on the surface of the sphere, e.g. C+(r, 0, 0) in world coordinates where r is the sphere's world radius, into view space; call the result point S
3) compute rv = distance from C to S (in view space)
4) let point S1 in view coordinates be C + (rv, 0, 0), i.e. another point on the surface of the sphere in view space, for which the line C -> S1 is perpendicular to the "look" vector
5) project C and S1 into screen coords using the projection matrix as Cs and S1s
6) compute screen radius = distance between Cs and S1s
But yeah, like Brandorf said, if you can preserve the camera variables, like FOVy, it would be a lot easier. :-)
Update:
Here's a more efficient variant on the above: make an inverse of the projection matrix. Use it to transform the viewport edges back into view space. Then you won't have to project every box into screen coordinates.
Even better, do the same with the view matrix and transform the camera frustum back into world space. That would be more efficient for comparing many boxes against; but harder to figure out the math.
The answer posted at your link radiusClipSpace = radius * cot(fov / 2) / Z, where fov is the angle of the field of view, and Z is the z-distance to the sphere, definitely works. However, keep in mind that radiusClipSpace must be multiplied by the viewport's width to get a pixel measure. The value measured in radiusClipSpace will be a value between 0 and 1 if the object fits on the screen.
An alternative solution may be to use the solid angle of the sphere. The solid angle subtended by a sphere in a sky is basically the area it covers when projected to the unit sphere.
The formulae are given at this link but roughly what I'm doing is:
if( (!radius && !distance) || fabsf(radius) > fabsf(distance) )
; // NAN conditions. do something special.
theta = asinf( radius/distance ) ;
sphereSolidAngle = ( 1 - cosf( theta ) ) ; // not multiplying by 2PI since below ratio used only
frustumSolidAngle = ( 1 - cosf( fovy / 2 ) ) / M_PI ; // I cheated here. I assumed
// the solid angle of a frustum is (conical), then divided by PI
// to turn it into a square (area unit square=area unit circle/PI)
numPxCovered = 768.f*768.f * sphereSolidAngle / frustumSolidAngle ; // 768x768 screen
radiusEstimate = sqrtf( numPxCovered/M_PI ) ; // area=pi*r*r
This works out to roughly the same numbers as radius * cot(fov / 2) / Z. If you only want an estimate of the area covered by the sphere's projection in px, this may be an easy way to go.
I'm not sure if a better estimate of the solid angle of the frustum could be found easily. This method involves more computation than radius * cot(fov / 2) / Z.
The FOV is not directly stored in the projection matrix, but rather used when you call gluPerspective to build the resulting matrix.
The best approach would be to simply keep all of your camera variables in their own class, such as a frustum class, whose member variables are used when you call gluPerspective or similar.
It may be possible to get the FOVy back out of the matrix, but the math required eludes me.

Perturb vector by some angle

I have a unit vector in 3D space whose direction I wish to perturb by some angle within the range 0 to theta, with the position of the vector remaining the same. What is a way I can accomplish this?
Thanks.
EDIT: After thinking about the way I posed the question, it seems to be a bit too general. I'll attempt to make it more specific: Assume that the vector originates from the surface of an object (i.e. sphere, circle, box, line, cylinder, cone). If there are different methods to finding the new direction for each of those objects, then providing help for the sphere one is fine.
EDIT 2: I was going to type this in a comment but it was too much.
So I have orig_vector, which I wish to perturb in some direction by an angle between 0 and theta. The theta can be thought of as forming a cone around my vector (with theta being the angle between the center and one side of the cone) and I wish to generate a new vector within that cone. I can generate a point lying on the plane that is tangent to my vector and thus create a unit vector in the direction of the point; call it rand_vector. At this point, orig_vector and rand_vector are two unit vectors perpendicular to each other.
I generate my first angle, angle1 between 0 and 2pi and I rotate rand_vector around orig_vector by angle1, forming rand_vector2. I looked up a resource online and it said that the second angle, angle2 should be between 0 and sin(theta) (where theta is the original "cone" angle). Then I rotate rand_vector2 by acos(angle2) around the vector defined by the cross product between rand_vector2 and orig_vector.
When I do this, I don't obtain the desired results. That is, when theta=0, I still get perturbed vectors, and I expect to get orig_vector. If anyone can explain the reason for the angles and why they are the way they are, I would greatly appreciate it.
EDIT 3: This is the final edit, I promise =). So I fixed my bug and everything that I described above works (it was an implementation bug, not a theory bug). However, my question about the angles still stands, i.e. why is angle2 = sin(theta)*rand() and why is perturbed_vector = rand_vector2.Rotate(rand_vector2.Cross(orig_vector), acos(angle2))? Thanks so much!
Here's the algorithm that I've used for this kind of problem before. It was described in Ray Tracing News.
1) Make a third vector perpendicular to the other two to build an orthogonal basis:
cross_vector = unit( cross( orig_vector, rand_vector ) )
2) Pick two uniform random numbers in [0,1]:
s = rand( 0, 1 )
r = rand( 0, 1 )
3) Let h be the cosine of the cone's angle:
h = cos( theta )
4) Modify uniform sampling on a sphere to pick a random vector in the cone around +Z:
phi = 2 * pi * s
z = h + ( 1 - h ) * r
sinT = sqrt( 1 - z * z )
x = cos( phi ) * sinT
y = sin( phi ) * sinT
5) Change of basis to reorient it around the original angle:
perturbed = rand_vector * x + cross_vector * y + orig_vector * z
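The five steps can be put together roughly as follows. This is a sketch, not the Ray Tracing News code: the perpendicular vector is built from a helper axis (an assumption of this example, since the steps above take rand_vector as given), and Vec3, dot, cross, normalized and perturbInCone are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <random>

struct Vec3 { double x, y, z; };

const double kPi = 3.14159265358979323846;

double dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

Vec3 cross(const Vec3& a, const Vec3& b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

Vec3 normalized(const Vec3& v) {
    const double len = std::sqrt(dot(v, v));
    assert(len > 0.0);
    return {v.x / len, v.y / len, v.z / len};
}

// Returns a unit vector uniformly distributed inside the cone of half-angle
// theta (radians) around origVector.
Vec3 perturbInCone(const Vec3& origVector, double theta, std::mt19937& rng) {
    assert(theta >= 0.0 && theta <= kPi);
    const Vec3 w = normalized(origVector);
    // Assumption: pick a helper axis that is not nearly parallel to w.
    const Vec3 helper = std::fabs(w.x) < 0.9 ? Vec3{1.0, 0.0, 0.0} : Vec3{0.0, 1.0, 0.0};
    const Vec3 u = normalized(cross(w, helper));  // plays the role of rand_vector
    const Vec3 v = cross(w, u);                   // plays the role of cross_vector

    std::uniform_real_distribution<double> uni(0.0, 1.0);
    const double s = uni(rng);
    const double r = uni(rng);
    const double h = std::cos(theta);
    const double phi = 2.0 * kPi * s;
    const double z = h + (1.0 - h) * r;                      // cosine of the deviation
    const double sinT = std::sqrt(std::max(0.0, 1.0 - z * z));
    const double x = std::cos(phi) * sinT;
    const double y = std::sin(phi) * sinT;
    // Change of basis back to the frame around origVector.
    return {u.x * x + v.x * y + w.x * z,
            u.y * x + v.y * y + w.y * z,
            u.z * x + v.z * y + w.z * z};
}
```

With theta = 0 the cosine h is 1, so z is always 1 and the function returns the normalized original vector, which matches the behaviour the question expects.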
If you have another vector to represent an axis of rotation, there are libraries that will take the axis and the angle and give you a rotation matrix, which can then be multiplied by your starting vector to get the result you want.
However, the axis of rotation should be at right angles to your starting vector, to get the amount of rotation you expect. If the axis of rotation does not lie in the plane perpendicular to your vector, the result will be somewhat different than theta.
That being said, if you already have a vector at right angles to the one you want to perturb, and you're not picky about the direction of the perturbation, you can just as easily take a linear combination of your starting vector with the perpendicular one, adjust for magnitude as needed.
I.e., if P and Q are vectors having identical magnitude, and are perpendicular, and you want to rotate P in the direction of Q, then the vector R given by R = [Pcos(theta)+Qsin(theta)] will satisfy the constraints you've given. If P and Q have differing magnitude, then there will be some scaling involved.
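The linear combination R = P cos(theta) + Q sin(theta) is short enough to show in full. The sketch below assumes P and Q are perpendicular and of equal magnitude, as stated above; Vec3 and rotateToward are hypothetical names.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };

double dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Rotates p by theta radians towards q, within the plane spanned by p and q.
Vec3 rotateToward(const Vec3& p, const Vec3& q, double theta) {
    assert(std::fabs(dot(p, q)) < 1e-9);              // p and q must be perpendicular
    assert(std::fabs(dot(p, p) - dot(q, q)) < 1e-9);  // and of equal magnitude
    const double c = std::cos(theta);
    const double s = std::sin(theta);
    return {p.x * c + q.x * s, p.y * c + q.y * s, p.z * c + q.z * s};
}
```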
You may be interested in 3D-coordinate transformations to change your vector angle.
I don't know how many directions you want to change your angle in, but transforming your Cartesian coordinates to spherical coordinates should allow you to make your angle change as you like.
Actually, it is very easy to do that. All you have to do is multiply your vector by the correct rotation matrix. The resulting vector will be your rotated vector. Now, how do you obtain such rotation matrix? That depends on the 3d framework/engine you are using. Any 3d framework must provide functions for obtaining rotation matrices, normally as static methods of the Matrix class.
Good luck.
As said in other comments, you can rotate your vector using a rotation matrix.
The rotation is described by two angles. You can pick them with a random number generator, but just drawing both from a uniform ("flat") generator is not correct. To ensure that the resulting directions are distributed uniformly, you have to draw one angle φ from a uniform generator and the other one from a generator that is uniform in cosθ; this ensures that the solid angle element dcos(θ)dφ is sampled correctly (φ and θ defined as usual for spherical coordinates).
Example: picking a random direction with no restriction on range, where random() generates uniformly in [0,1]
angle1 = acos(2*random() - 1)
angle2 = 2*pi*random()
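A minimal sketch of sampling uniform in cosθ, covering the whole sphere: cosθ is drawn uniformly from [-1, 1] rather than θ itself. Vec3 and randomUnitDirection are hypothetical names, and std::mt19937 stands in for whatever generator the application uses.

```cpp
#include <cassert>
#include <cmath>
#include <random>

struct Vec3 { double x, y, z; };

const double kPi = 3.14159265358979323846;

// Returns a direction uniformly distributed over the unit sphere.
Vec3 randomUnitDirection(std::mt19937& rng) {
    std::uniform_real_distribution<double> uni(0.0, 1.0);
    const double cosTheta = 1.0 - 2.0 * uni(rng);  // uniform in cos(theta)
    assert(cosTheta >= -1.0 && cosTheta <= 1.0);
    const double phi = 2.0 * kPi * uni(rng);       // uniform azimuth
    const double sinTheta = std::sqrt(1.0 - cosTheta * cosTheta);
    return {sinTheta * std::cos(phi), sinTheta * std::sin(phi), cosTheta};
}
```

Drawing θ itself uniformly would cluster the samples around the poles, which is the error the paragraph above warns about.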
My code in unity - tested and working:
/*
* Perturbs the given vector 'direction' by rotating it away from the base direction
* by no more than 'angle' degrees. Used to add errors to the algorithms that play as a player.
*
*/
Vector3 perturbDirection( Vector3 direction, float angle ) {
// division by zero protection
if( Mathf.Approximately( direction.z, 0f )) {
direction.z = 0.0001f;
}
// 1 get some orthogonal vector to direction ( solve direction and orthogonal dot product = 0, assume x = 1, y = 1, then z = as below ))
Vector3 orthogonal = new Vector3( 1f, 1f, - ( direction.x + direction.y ) / direction.z );
// 2 get random vector from circle on flat orthogonal to direction vector. get full range to assume all cone space randomization (-180, 180 )
float orthoAngle = UnityEngine.Random.Range( -180f, 180f );
Quaternion rotateTowardsDirection = Quaternion.AngleAxis( orthoAngle, direction );
Vector3 randomOrtho = rotateTowardsDirection * orthogonal;
// 3 rotate direction towards random orthogonal vector by vector from our available range
float perturbAngle = UnityEngine.Random.Range( 0f, angle ); // range from (0, angle), full cone cover guarantees previous (-180,180) range
Quaternion rotateDirection = Quaternion.AngleAxis( perturbAngle, randomOrtho );
Vector3 perturbedDirection = rotateDirection * direction;
return perturbedDirection;
}

Implementing Ray Picking

I have a renderer using DirectX and OpenGL, and a 3D scene. The viewport and the window are of the same dimensions.
How do I implement picking given mouse coordinates x and y in a platform independent way?
If you can, do the picking on the CPU by calculating a ray from the eye through the mouse pointer and intersect it with your models.
If this isn't an option I would go with some type of ID rendering. Assign each object you want to pick a unique color, render the objects with these colors and finally read out the color from the framebuffer under the mouse pointer.
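The ID rendering idea needs a reversible mapping between object IDs and colors. A minimal sketch, assuming an 8-bit-per-channel RGB framebuffer and at most 2^24 pickable objects; Rgb8, idToColor and colorToId are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

struct Rgb8 { std::uint8_t r, g, b; };

// Packs a 24-bit object ID into an RGB color (low byte in the red channel).
Rgb8 idToColor(std::uint32_t id) {
    assert(id <= 0xFFFFFFu);  // only 24 bits fit into three 8-bit channels
    return {static_cast<std::uint8_t>(id & 0xFFu),
            static_cast<std::uint8_t>((id >> 8) & 0xFFu),
            static_cast<std::uint8_t>((id >> 16) & 0xFFu)};
}

// Recovers the object ID from the color read back under the mouse pointer.
std::uint32_t colorToId(const Rgb8& c) {
    return static_cast<std::uint32_t>(c.r)
         | (static_cast<std::uint32_t>(c.g) << 8)
         | (static_cast<std::uint32_t>(c.b) << 16);
}
```

For the round trip to hold, the ID pass has to write the colors unmodified, so lighting, blending, dithering and anti-aliasing would need to be switched off while it is rendered.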
EDIT: If the question is how to construct the ray from the mouse coordinates, you need the following: a projection matrix P and the camera transform C. If the coordinates of the mouse pointer are (x, y) and the size of the viewport is (width, height), one position in clip space along the ray is:
mouse_clip = [
float(x) * 2 / float(width) - 1,
1 - float(y) * 2 / float(height),
0,
1]
(Notice that I flipped the y-axis since often the origin of the mouse coordinates are in the upper left corner)
The following is also true:
mouse_clip = P * C * mouse_worldspace
Which gives:
mouse_worldspace = inverse(C) * inverse(P) * mouse_clip
After dividing mouse_worldspace by its w component (the perspective divide), we now have:
p = C.position(); //origin of camera in worldspace
n = normalize(mouse_worldspace - p); //unit vector from p through mouse pos in worldspace
Here's the viewing frustum:
First you need to determine where on the nearplane the mouse click happened:
rescale the window coordinates (0..640,0..480) to [-1,1], with (-1,-1) at the bottom-left corner and (1,1) at the top-right.
'undo' the projection by multiplying the scaled coordinates by what I call the 'unview' matrix: unview = (P * M).inverse() = M.inverse() * P.inverse(), where M is the ModelView matrix and P is the projection matrix.
Then determine where the camera is in worldspace, and draw a ray starting at the camera and passing through the point you found on the nearplane.
The camera is at M.inverse().col(4), i.e. the final column of the inverse ModelView matrix.
Final pseudocode:
normalised_x = 2 * mouse_x / win_width - 1
normalised_y = 1 - 2 * mouse_y / win_height
// note the y pos is inverted, so +y is at the top of the screen
unviewMat = (projectionMat * modelViewMat).inverse()
near_point = unviewMat * Vec(normalised_x, normalised_y, 0, 1)
near_point = near_point / near_point.w // perspective divide
camera_pos = ray_origin = modelViewMat.inverse().col(4)
ray_dir = near_point - camera_pos
Well, it's pretty simple; the theory behind this is always the same:
1) Unproject your 2D coordinate into 3D space twice (each API has its own function, but you can implement your own if you want): once at Min Z and once at Max Z.
2) With these two values, calculate the vector that goes from Min Z and points to Max Z.
3) With the vector and a point, calculate the ray that goes from Min Z to Max Z.
4) Now you have a ray, with this you can do a ray-triangle/ray-plane/ray-something intersection and get your result...
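Steps 2 and 3 can be sketched as below, assuming the two unprojected points are already available from whichever unproject routine the API provides. Vec3, Ray, makeRay and pointAt are hypothetical names.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };
struct Ray { Vec3 origin; Vec3 dir; };

// Builds a ray that starts at the Min Z point and points towards the Max Z point.
// Returns false if the two points coincide, since no direction is defined then.
bool makeRay(const Vec3& minZPoint, const Vec3& maxZPoint, Ray& ray) {
    const Vec3 d = {maxZPoint.x - minZPoint.x,
                    maxZPoint.y - minZPoint.y,
                    maxZPoint.z - minZPoint.z};
    const double len = std::sqrt(d.x * d.x + d.y * d.y + d.z * d.z);
    if (len < 1e-12) return false;
    ray.origin = minZPoint;
    ray.dir = {d.x / len, d.y / len, d.z / len};  // unit direction
    return true;
}

// Point on the ray at distance t from its origin.
Vec3 pointAt(const Ray& ray, double t) {
    assert(t >= 0.0);
    return {ray.origin.x + t * ray.dir.x,
            ray.origin.y + t * ray.dir.y,
            ray.origin.z + t * ray.dir.z};
}
```

The parameter names deliberately avoid near and far, because the Windows headers define both as macros.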
I have little DirectX experience, but I'm sure it's similar to OpenGL. What you want is the gluUnProject call.
Assuming you have a valid Z buffer you can query the contents of the Z buffer at a mouse position with:
// obtain the viewport, modelview matrix and projection matrix
// you may keep the viewport and projection matrices throughout the program if you don't change them
GLint viewport[4];
GLdouble modelview[16];
GLdouble projection[16];
glGetIntegerv(GL_VIEWPORT, viewport);
glGetDoublev(GL_MODELVIEW_MATRIX, modelview);
glGetDoublev(GL_PROJECTION_MATRIX, projection);
// obtain the Z position (not world coordinates but in range 0 - 1)
GLfloat z_cursor;
glReadPixels(x_cursor, y_cursor, 1, 1, GL_DEPTH_COMPONENT, GL_FLOAT, &z_cursor);
// obtain the world coordinates
GLdouble x, y, z;
gluUnProject(x_cursor, y_cursor, z_cursor, modelview, projection, viewport, &x, &y, &z);
If you don't want to use GLU, you can also implement gluUnProject yourself; its functionality is relatively simple and is described at opengl.org.
Ok, this topic is old but it was the best I found on the topic, and it helped me a bit, so I'll post here for those who are following ;-)
This is the way I got it to work without having to compute the inverse of Projection matrix:
void Application::leftButtonPress(u32 x, u32 y){
GL::Viewport vp = GL::getViewport(); // just a call to glGet GL_VIEWPORT
vec3f p = vec3f::from(
((float)(vp.width - x) / (float)vp.width),
((float)y / (float)vp.height),
1.);
// alternatively vec3f p = vec3f::from(
// ((float)x / (float)vp.width),
// ((float)(vp.height - y) / (float)vp.height),
// 1.);
p *= vec3f::from(APP_FRUSTUM_WIDTH, APP_FRUSTUM_HEIGHT, 1.);
p += vec3f::from(APP_FRUSTUM_LEFT, APP_FRUSTUM_BOTTOM, 0.);
// now p elements are in (-1, 1)
vec3f near = p * vec3f::from(APP_FRUSTUM_NEAR);
vec3f far = p * vec3f::from(APP_FRUSTUM_FAR);
// ray in world coordinates
Ray ray = { _camera->getPos(), -(_camera->getBasis() * (far - near).normalize()) };
_ray->set(ray.origin, ray.dir, 10000.); // this is a debugging vertex array to see the Ray on screen
Node* node = _scene->collide(ray, Transform());
cout << "node is : " << node << endl;
}
This assumes a perspective projection, but the question never arises for the orthographic one in the first place.
I've got the same situation with ordinary ray picking, but something is wrong. I've performed the unproject operation the proper way, but it just doesn't work. I think I've made some mistake, but I can't figure out where. My matrix multiplication, inverse and vector-by-matrix multiplications all seem to work fine; I've tested them.
In my code I'm reacting to WM_LBUTTONDOWN, so lParam returns the [Y][X] coordinates as 2 words in a dword. I extract them, then convert them to normalized space; I've checked that this part also works fine. When I click the lower left corner, I get values close to -1 -1, and good values for the 3 other corners. I'm then using the line_points.vtx array for debugging, and it's not even close to reality.
unsigned int x_coord=lParam&0x0000ffff; //X RAW COORD
unsigned int y_coord=client_area.bottom-(lParam>>16); //Y RAW COORD
double xn=((double)x_coord/client_area.right)*2-1; //X [-1 +1]
double yn=1-((double)y_coord/client_area.bottom)*2;//Y [-1 +1]
_declspec(align(16))gl_vec4 pt_eye(xn,yn,0.0,1.0);
gl_mat4 view_matrix_inversed;
gl_mat4 projection_matrix_inversed;
cam.matrixProjection.inverse(&projection_matrix_inversed);
cam.matrixView.inverse(&view_matrix_inversed);
gl_mat4::vec4_multiply_by_matrix4(&pt_eye,&projection_matrix_inversed);
gl_mat4::vec4_multiply_by_matrix4(&pt_eye,&view_matrix_inversed);
line_points.vtx[line_points.count*4]=pt_eye.x-cam.pos.x;
line_points.vtx[line_points.count*4+1]=pt_eye.y-cam.pos.y;
line_points.vtx[line_points.count*4+2]=pt_eye.z-cam.pos.z;
line_points.vtx[line_points.count*4+3]=1.0;

Resources