Finding a pixel location in a field of view - math

I have a camera with a defined field of view and a point that I would like to label in the image. I have the lat and lon of both points and I know the angle between them; however, my equation for finding the pixel location is off. Attached is an image to help my explanation:
I can solve for the vector from the camera to the center of view and the vector from the camera to the point, as well as the full angle of the field of view and the angle between the center of view and the point.
Here is what I'm currently using: [the angle between the vectors (the blue angle) / the angle of the field of view (the green angle)] * 1024 (screen width)
With numbers: (14.182353/65) * 1024 = 223.426620 and on the image the pixel value should be 328...
Another way I tried it was using a bearing equation: [(the bearing from the camera to the point - the bearing of the left side of the field of view) / field of view] * 1024
With numbers: ((97.014993-83.500000)/65) * 1024 = 212.913132 and the answer should be 328...
Can anyone think of a more accurate solution?

Try 512(1-tan(blue)/tan(green/2)), where blue is positive to the left.
If blue is to the right, you can treat it as a negative number, to get 512(1+tan(blue)/tan(green/2)).
Explanation:
Let C be the camera, d be the dot labeled 328, E be the center of the field of view, and L be the left end point of the field of view, so that you want to find dL. Then (for blue going left):
dL+dE = EL = 512
tan(green/2)=EL/CE
tan(blue)=dE/CE
Then tan(blue)/tan(green/2) = dE/EL = (512-dL)/512, and you can solve for dL.
Going right would be similar (or you can work with negative distances, and everything works out fine).
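The formula above can be sketched in Python as follows. This is only an illustration: the function name and parameters are hypothetical, angles are assumed to be in degrees, and blue is assumed to be measured from the center of view, positive to the left.

```python
import math

def pixel_from_angle(blue_deg, fov_deg, screen_width=1024):
    """Distance in pixels from the left edge of the image.

    blue_deg: angle between the center of view and the point
              (positive to the left, negative to the right).
    fov_deg:  full horizontal field of view (the green angle).
    """
    half_width = screen_width / 2.0
    ratio = math.tan(math.radians(blue_deg)) / math.tan(math.radians(fov_deg / 2.0))
    return half_width * (1.0 - ratio)
```

A point at the center of view (blue = 0) lands on the middle column, and a point at either edge of the field of view lands on the first or last column, which a purely linear angle-to-pixel mapping only gets right at those three positions.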


How to fix zoom towards mouse routine?

I'm trying to learn how to zoom towards mouse using Orthographic projection and so far I've got this:
def dolly(self, wheel, direction, x, y, acceleration_enabled):
    v = vec4(*[float(v) for v in glGetIntegerv(GL_VIEWPORT)])
    w, h = v[2], v[3]
    f = self.update_zoom(direction, acceleration_enabled) # [0.1, 4]
    aspect = w/h
    x,y = x-w/2, y-h/2
    K1 = f*10
    K0 = K1*aspect
    self.left = K0*(-2*x/w-1)
    self.right = K0*(-2*x/w+1)
    self.bottom = K1*(2*y/h-1)
    self.top = K1*(2*y/h+1)
x/y: mouse screen coordinates
w/h: window width/height
f: factor which goes from 0.1 to 4 when scrolling down/up
left/right/bottom/top: values used to compute the new orthographic projection
The results I'm getting are really strange but I don't know which part of the formulas I've messed up.
Could you please spot which part of my maths is wrong, or just post clear pseudocode I can try? For the record, I've read and tested quite a lot of versions out there on the internet, but I haven't yet found any place where this subject is explained properly.
PS: You don't need to post any SO links related to this subject, as I've read all of them already :)
I'm going to answer this in a general way, based on the following set of assumptions:
1. You use a matrix P for the (ortho) projection, describing the actual mapping of your eye space view volume onto the standard view volume [-1,1]^3 that OpenGL will clip against (see also assumption 2), and a matrix V for the view transformation, that is, the position and orientation of the "camera" (if there is such a thing, especially in ortho projections), which basically establishes an eye space relative to which your view volume is defined.
2. I will ignore the homogeneous clip space, as you work with completely affine ortho projections only. That means NDC coordinates and clip space will be identical, and no tricks are applied to any w coordinate.
3. I assume default GL conventions for eye space and projection matrices, notably that the eye space origin is the camera location and the camera look-at direction is -z.
4. The viewport fills the window completely.
5. Window space follows the default OpenGL convention, where the origin is at the bottom left.
6. Mouse coordinates are in some window-specific coordinate frame where the origin is at the top left, and the mouse is at integer pixel coordinates.
7. I assume that the view volume defined by P is symmetrical: right = -left and top = -bottom. It is also supposed to stay symmetrical after the zoom operation; therefore, to compensate for any movement, the view matrix V must be adjusted, too.
What you want to get is a zoom such that the object point under the mouse cursor does not move, so that it becomes the center of the scale operation. The mouse cursor itself is only 2D, and a whole straight line in 3D space will be mapped to the same pixel location. However, in an ortho projection, that line will be orthogonal to the image plane, so we don't need to bother much with the third dimension.
So what we want is to scale the current situation with P_old (defined by the ortho parameters l_old, r_old, b_old, t_old, n_old and f_old) and V_old (defined by "camera" position c_old and orientation o_old) by a zoom factor s at mouse position (x,y) (in the space from assumption 6).
We can see a few things directly:
the near and far plane of the projection should be unaffected by the operation, so n_new = n_old and f_new = f_old.
the actual camera orientation (or lookat direction) should also be unaffected: o_new = o_old
If we zoom in by a factor of s, the actual view volume must be scaled by 1/s, since when we zoom in, a smaller part of the complete world is mapped onto the screen than before (and appears bigger). So we can simply scale the frustum parameters we had:
l_new = l_old / s, r_new = r_old / s, b_new = b_old / s, t_new = t_old / s
If we only replace P_old by P_new, we get the zoom, but the world point under the mouse cursor will move (unless the mouse is exactly in the center of the view). So we have to compensate for that by modifying the camera position.
Let's first put the mouse coords (x,y) into OpenGL window space (assumptions 5 and 6):
x_win = x + 0.5
y_win = height - 0.5 - y
Note that besides mirroring y, I also shift the coordinates by half a pixel. That's because in OpenGL window space, pixel centers are at half-integer coordinates, while I assume that your integer mouse coordinates represent the center of the pixel you click on (this will not make a big difference visually, but still).
Now let's further put the coords into Normalized Device Space (relying on assumption 4 here):
x_ndc = 2.0 * x_win / width - 1
y_ndc = 2.0 * y_win / height - 1
By assumption 2, clip and NDC coordinates will be identical, and we can call the vector v our NDC/clip space mouse coordinates: v = (x_ndc, y_ndc, 0, 1)^T
We can now state our "point under mouse must not move" condition:
inverse(V_old) * inverse(P_old) * v = inverse(V_new) * inverse(P_new) * v
But let's just go into eye space and let's look at what happened:
Let a = inverse(P_old) * v be the eye space location of the point under the mouse cursor before we scaled.
Let b = inverse(P_new) * v be the eye space location of the point under the mouse cursor after we scaled.
Since we assumed a symmetrical view volume (assumption 7), we already know that for the x and y coordinates, b = (1/s) * a holds. If that assumption does not hold, you need to do the actual calculation for b too, which isn't hard either.
So, we can set up an 2D eye space offset vector d which describes how our point of interest was moved by the scale:
d = b - a = (1 / s) *a - a = a (1/s - 1)
To compensate for that movement, we have to move our camera inversely, so by -d.
If you keep the camera position separate, as I did in assumption 1, you simply need to update the camera position c accordingly. You just have to take care of the fact that c is the world space position, while d is an eye space offset:
c_new = c_old - inverse(V_old) * (d_x, d_y, 0, 0)^T
Note that if you do not keep the camera position as a separate variable, but keep the view matrix directly, you can simply pre-multiply a translation. Moving the camera by -d shifts every point in eye space by +d, so: V_new = translate(d_x, d_y, 0) * V_old
Update
What I wrote so far is correct, but I took a shortcut which is numerically a very bad idea when working with finite-precision data types. The error in the camera position accumulates very fast if one zooms out a lot. After @BPL implemented this, this is what he got:
The main issue seems to be that I directly calculated the offset vector d in eye space, which does not take the current view matrix V_old (and its small errors) into account. So a more stable approach is to calculate all of this directly in world space:
a = inverse(P_old * V_old) * v
b = inverse(P_new * V_old) * v
d = b - a
c_new = c_old - d
(doing so makes assumption 7 not needed anymore as a by product, so it directly works in the general case of arbitrary ortho matrices).
Using this approach, the zoom operation worked as expected:
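The world-space steps can be sketched in Python as follows. This is a simplified illustration, not the code from the thread: it assumes an unrotated 2D camera, so that inverse(P * V) reduces to "camera center plus NDC times half extent", and all function and parameter names are hypothetical.

```python
def mouse_to_ndc(mouse_x, mouse_y, width, height):
    """Mouse coords (origin top left) -> NDC in [-1, 1], y pointing up."""
    x_win = mouse_x + 0.5              # pixel centers sit at half-integers
    y_win = height - 0.5 - mouse_y     # flip y into GL window space
    return (2.0 * x_win / width - 1.0, 2.0 * y_win / height - 1.0)

def ndc_to_world(cam, half_w, half_h, ndc):
    """Stand-in for inverse(P * V) * v with a symmetric, unrotated ortho view."""
    return (cam[0] + ndc[0] * half_w, cam[1] + ndc[1] * half_h)

def zoom_at_mouse(cam, half_w, half_h, s, mouse_x, mouse_y, width, height):
    """Zoom by factor s while keeping the world point under the mouse fixed."""
    v = mouse_to_ndc(mouse_x, mouse_y, width, height)
    a = ndc_to_world(cam, half_w, half_h, v)           # before the zoom
    new_half_w, new_half_h = half_w / s, half_h / s    # view volume scales by 1/s
    b = ndc_to_world(cam, new_half_w, new_half_h, v)   # after the zoom, old camera
    d = (b[0] - a[0], b[1] - a[1])
    new_cam = (cam[0] - d[0], cam[1] - d[1])           # c_new = c_old - d
    return new_cam, new_half_w, new_half_h
```

Calling ndc_to_world with the returned camera and half extents at the same mouse position should give back the point a, which is the "point under mouse must not move" condition.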

formula for game xy to image (pixel) xy

It's probably quite simple, but I can't find what I need on search engines... (it's like they used to know better what I was looking for)
I need to convert in-game coordinates to "coordinates" on an image, so I can add, say, a pixel on the image to represent the location of the in-game coordinates.
The image is a map, the size is 2384x2044 (width x height).
The in-game 0,0 = the middle of the in-game map, this would also be the middle of the image.
So it's easy to find the xy to print a pixel on the middle of image:
2384 : 2 = 1192 and 2044 : 2 = 1022, so the xy for 0,0 in-game on the image is 1192,1022.
Now, for example, if I move up and slightly to the left in-game, the coordinates become: -141.56,1108.11 - How can I calculate the correct xy for the image?
image: http://i.imgur.com/yfiwfO7.png?1
To recap, you want to scale game coordinates of -3000 to +3000 in both axes and offset them to centre them on your image; in that case the computations you want are
pixel_x = 1192 + (game_x * 1192 / 3000.0)
pixel_y = 1022 - (game_y * 1022 / 3000.0)
Note the minus on the y line to invert the direction of the offset. Your game coordinates are floating point so I've made the 3000s floating point by adding a .0 - you didn't say what language you were using so this may or may not be the correct syntax.
However, you probably ought to avoid putting constants into this in case you ever want to change the size of the playfield or the image. It would be better to:
set up constants in your program for the playfield dimensions
set up constants or global variables for the size of the image: you can read this from the image as you load it
pre-compute the values 1192 / 3000.0 and 1022 / 3000.0 (but using your image constants) to save one floating point operation for each scale. This is probably not worth it nowadays as a speed optimisation, though, and you might sacrifice a tiny bit of floating point accuracy at the end of the mantissa, but that won't matter here.
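A minimal Python sketch of the conversion with the constants pulled out as parameters. The function and parameter names are hypothetical; the default image size and the -3000 to +3000 playfield range are the ones used above.

```python
def game_to_pixel(game_x, game_y, image_w=2384, image_h=2044, half_extent=3000.0):
    """Map game coordinates in [-half_extent, +half_extent] to image pixels.

    Game (0, 0) maps to the image center; game y grows upward while
    image y grows downward, hence the minus sign on the y line.
    """
    center_x = image_w / 2.0
    center_y = image_h / 2.0
    pixel_x = center_x + game_x * center_x / half_extent
    pixel_y = center_y - game_y * center_y / half_extent
    return pixel_x, pixel_y
```

For the example position (-141.56, 1108.11) this gives roughly (1135.75, 644.50), that is, slightly left of and well above the image center.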

3D perspective 'grab' panning with DirectX

I am implementing a pan tool in our software's 3D view which is supposed to work much like the grab tool of, say, Photoshop or Acrobat Reader. That is, the point the user grabs onto with the mouse (clicks and holds, then moves the mouse) stays under the mouse cursor as the mouse moves.
This is a common paradigm and one that's been asked about on SO before, the best answer being to this question about the technique in OpenGL. There is another that also has some hints, and I have been reading this very informative CodeProject article. (It doesn't explain many of its code examples' variables etc, but from reading the text I think I understand the technique.) But, I have some implementation issues because my 3D environment's navigation is set up quite differently to those articles, and I am seeking some guidance.
My technique - and this might be fundamentally flawed, so please say so - is:
The scene 'camera' is stored as two D3DXVECTOR3 points: the eye position and a look point. The view matrix is constructed using D3DXMatrixLookAtLH like so:
const D3DXVECTOR3 oUpVector(0.0f, 1.0f, 0.0f); // Keep up "up", always.
D3DXMatrixLookAtLH(&m_oViewMatrix, &m_oEyePos, &m_oLook, &oUpVector);
When the mouse button is pressed, shoot a ray through that pixel and find: the coordinate (in unprojected scene / world space) of the pixel that was clicked on; the intersection of that ray with the near plane; and the distance between the near-plane point and object, which is the length between those two points. Store this and the mouse position, and the original navigation (eye and look).
// Get the clicked-on point in unprojected (normal) world space
D3DXVECTOR3 o3DPos;
if (Get3DPositionAtMouse(roMousePos, o3DPos)) { // fails if nothing under the mouse
    // Mouse location when panning started
    m_oPanMouseStartPos = roMousePos;
    // Intersection at near plane (z = 0) of the ray from camera to clicked spot
    D3DXVECTOR3 oRayVector;
    CalculateRayFromPixel(m_oPanMouseStartPos, m_oPanPlaneZ0StartPos, oRayVector);
    // Store original eye and look points
    m_oPanOriginalEyePos = m_oEyePos;
    m_oPanOriginalLook = m_oLook;
    // Store the distance between near plane and the object, and the object position
    m_dPanPlaneZ0ObjectDist = fabs(D3DXVec3Length(&(o3DPos - m_oPanPlaneZ0StartPos)));
    m_oPanOriginalObjectPos = o3DPos;
}
Get3DPositionAtMouse is a known-ok method which picks a 3D coordinate under the mouse. CalculateRayFromPixel is a known-ok method which takes in a screen-space mouse coordinate and casts a ray, and fills the other two parameters with the ray intersection at the near plane (Z = 0) and the normalised ray vector.
When the mouse moves, cast another ray at the new position, but using the old (original) view matrix. (Thanks to Nico below for pointing this out.) Calculate where the object should be by extending the ray from the near plane by the distance between the object and the near plane (this way, the original object and new object points should lie in a plane parallel to the near plane). Move the eye and look coordinates by this much. Eye and Look are set from their original values (from when panning started), offset by the difference between the original and new mouse positions. This is to reduce any precision loss from incrementing or decrementing by granular (integer) pixel movements as the mouse moves, i.e. it calculates the whole difference in navigation every time.
// Set navigation back to original (as it was when started panning) and cast a ray for the mouse
m_oEyePos = m_oPanOriginalEyePos;
m_oLook = m_oPanOriginalLook;
UpdateView();
D3DXVECTOR3 oRayVector;
D3DXVECTOR3 oNewPlaneZPos;
CalculateRayFromPixel(roMousePos, oNewPlaneZPos, oRayVector);
// Now intersect that ray (ray through the mouse pixel, using the original navigation)
// to hit the plane the object is in. Function uses a "line", so start at near plane
// and the line is of the length of the far plane away
D3DXVECTOR3 oNew3DPos;
D3DXPlaneIntersectLine(&oNew3DPos, &m_oPanObjectPlane, &oNewPlaneZPos, &(oRayVector * GetScene().GetFarPlane()));
// The eye/look difference /should/ be as simple as:
// const D3DXVECTOR3 oDiff = (m_oPanOriginalObjectPos - oNew3DPos);
// But that lags and is slow, ie the objects trail behind. I don't know why. What does
// work is to scale the from-to difference by the distance from the camera relative to
// the whole scene distance
const double dDist = D3DXVec3Length(&(oNew3DPos - m_oPanOriginalEyePos));
const double dTotalDist = GetScene().GetFarPlane() - GetScene().GetNearPlane();
const D3DXVECTOR3 oDiff = (m_oPanOriginalObjectPos - oNew3DPos) * (1.0 + (dDist / dTotalDist));
// Adjust the eye and look points by the same amount, so orthogonally changed
m_oEyePos = m_oPanOriginalEyePos + oDiff;
m_oLook = m_oPanOriginalLook + oDiff;
Diagram
This diagram is my working sketch for implementing this:
and hopefully explains the above much more simply than the text. You can see a moving point, and where the camera has to move to keep that point at the same relative position. The clicked-on point (the ray from the camera to the object) is just to the right of the straight-ahead ray representing the center pixel.
The problem
But, as you've probably guessed, this doesn't work as I hope. What I wanted to see was the clicked-on object moving with the mouse cursor. What I actually see is that the object moves in the direction of the mouse, but not enough, ie it does not keep the clicked-on point under the cursor. Secondly, the movement flickers and jumps around, jittering by up to twenty or thirty pixels sometimes, then flickers back. If I replace oDiff with something constant this doesn't occur.
Any ideas, or code samples showing how to implement this with DirectX (D3DX, DX matrix order, etc) will be gratefully read.
Edit
Commenter Nico below pointed out that when calculating the new position using the mouse cursor's moved position, I needed to use the original view matrix. Doing so helps a lot, and the objects stay near the mouse position. However, it's still not exact. What I've noticed is that at the center of the screen, it is exact; as the mouse moves further from the center, it gets out by more and more. This seemed to change based on how far away the object was, too. By pure 'I have no idea what I'm doing' guesswork, I scaled this by a factor of the near/far plane and how far away the object was, and this brings it very close to the mouse cursor, but still a few pixels away (1 to, say, 30 at the extreme edge of the screen, which is enough to make it feel wrong.)
Here's how I solve this problem.
float fieldOfView = 45.0f;
float halfFOV = (fieldOfView / 2.0f) * (DEGREES_TO_RADIANS);
float distanceToObject = // compute the world space distance from the camera to the object you want to pan
float projectionToWorldScale = distanceToObject * tan( halfFOV );
Vector mouseDeltaInScreenSpace = // the delta mouse in pixels that we want to pan
Vector mouseDeltaInProjectionSpace = Vector( mouseDeltaInScreenSpace.x * 2 / windowPixelSizeX, mouseDeltaInScreenSpace.y * 2 / windowPixelSizeY ); // ( the "*2" is because the projection space is from -1 to 1)
// go from normalized device coordinate space to world space (at origin)
Vector cameraDelta = -mouseDeltaInProjectionSpace * projectionToWorldScale;
// now translate your camera by "cameraDelta".
Note that this works for a field of view aspect ratio of 1. I think you would have to break up the "scale" into separate x and y components if the vertical field of view were different from the horizontal field of view.
Also, you mentioned a "look at" vector. I'm not sure how my math would need to change for that, since my camera is always looking straight down the z-axis.
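One way the scale might be split into separate x and y components is sketched below in Python. This is an assumption-laden illustration rather than the answer's own code: it assumes the field of view is the vertical one, that the horizontal extent is the vertical extent times the window aspect ratio, and that the camera looks straight down the z-axis. All names are hypothetical.

```python
import math

def pan_camera_delta(mouse_dx, mouse_dy, window_w, window_h,
                     distance_to_object, fov_y_deg=45.0):
    """Camera translation (x, y) that follows a mouse drag given in pixels."""
    # Half extents of the visible area at the object's distance.
    half_h = distance_to_object * math.tan(math.radians(fov_y_deg) / 2.0)
    half_w = half_h * (window_w / window_h)
    # Pixel delta -> projection space delta (the space spans -1..1, hence * 2).
    ndc_dx = mouse_dx * 2.0 / window_w
    ndc_dy = mouse_dy * 2.0 / window_h
    # The camera moves opposite to the drag. Depending on whether mouse y
    # grows downward, the sign of the y component may need flipping.
    return (-ndc_dx * half_w, -ndc_dy * half_h)
```

With a square window this reduces to the single projectionToWorldScale factor used above.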
One problem is your calculation of the new 3d position. I am not sure if this is the root cause, but you might try it. If it doesn't help, just post a comment.
The problem is that your offset vector is not parallel to the znear plane. This is because the two rays are not parallel. Therefore, if they have the same length behind znear, the distances of their end points to the znear plane cannot be equal.
You can calculate the offset vector with the theorem of intersecting lines. If zNearA and zNearB are the intersection points of the znear plane with ray A and ray B respectively, then the theorem states:
Length(original_position - cam_position) / Length(offset_vector) = Length(zNearA - cam_position) / Length(zNearB - zNearA)
And therefore
offset_vector = Length(original_position - cam_position) / Length(zNearA - cam_position) * (zNearB - zNearA)
Then you can be sure to move on a line that is parallel to the znear plane.
Just try it out and see if it helps.
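The offset formula above can be sketched in Python with plain tuples. The helper names are hypothetical, and the points are assumed to be (x, y, z) tuples in world space.

```python
import math

def sub(a, b):
    """Component-wise a - b for 3D tuples."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def length(v):
    """Euclidean length of a 3D tuple."""
    return math.sqrt(v[0] * v[0] + v[1] * v[1] + v[2] * v[2])

def pan_offset(cam_position, original_position, z_near_a, z_near_b):
    """Offset of the grabbed point, parallel to the znear plane.

    z_near_a / z_near_b: intersections of ray A (original mouse position)
    and ray B (current mouse position) with the znear plane.
    """
    scale = (length(sub(original_position, cam_position))
             / length(sub(z_near_a, cam_position)))
    delta = sub(z_near_b, z_near_a)
    return (scale * delta[0], scale * delta[1], scale * delta[2])
```

Because the result is a scaled copy of zNearB - zNearA, it lies in a plane parallel to the znear plane by construction.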

Rotating a Rectangle around a Grid To Calculate Players View

I have a player who can rotate and move around a 2D Cartesian grid, I need to calculate where to draw the enemies on screen.
The player should have a certain viewpoint which is the size of the screen in front of the direction the player is facing. (and a little behind)
I've tried tons of ways to implement this, messing with bipolar coordinates and trig, but I haven't been able to solve the problem of calculating where on the screen the enemies should be drawn.
The problem is best represented in the form of a graph, with green being the viewpoint, which is a rectangle that can rotate and move around the grid, and dots representing the player and enemies.
So I need to work out the positions of the enemies on screen relative to the player's rotation and position.
If you're going for a Doom-like perspective, you should imagine the viewing area as a trapezoid (a view frustum), rather than a rectangle. Imagine that behind your character is a camera man with its own position and angle.
The enemy's screen position is related to the angle between the camera and the enemy.
//indicates where on the screen an enemy should be drawn.
//-1 represents the leftmost part of the screen,
//and 1 is the rightmost.
//Anything larger or smaller is off the edge of the screen and should not be drawn.
float calculateXPosition(camera, enemy){
    //the camera man can see anything 30 degrees to the left or right of its line of sight.
    //This number is arbitrary; adjust to your own tastes.
    frustumWidth = 60;
    //the angle between the enemy and the camera, in relation to the x axis.
    //atan2 returns radians, so convert to degrees to match frustumWidth.
    angle = atan2(enemy.y - camera.y, enemy.x - camera.x) * 180 / PI;
    //the angle of the enemy, in relation to the camera's line of sight (camera.angle is assumed to be in degrees).
    //If the enemy is on-camera, its absolute value should be less than frustumWidth/2.
    objectiveAngle = camera.angle - angle;
    //scale down from [-frustumWidth/2, frustumWidth/2] to [-1, 1]
    return objectiveAngle / (frustumWidth / 2);
}
These diagrams visualize what the variables I'm using here represent:
Once you have an "X position" in the range of [-1, 1], it should be easy enough to convert that into pixel coordinates. For example, if your screen is 500 pixels wide, you can do something like ((calculateXPosition(camera, enemy) + 1) / 2) * 500;
Edit:
You can do something similar to find the y-coordinate of a point, based on the point's height and distance from the camera.
(I'm not sure how you should define the height of the enemy and camera - any number should be fine as long as they somewhat match the scale set by the x and y dimensions of the cartesian grid.)
//this gives you a number between -1 and 1, just as calculateXPosition does.
//-1 is the bottom of the screen, 1 is the top.
float getYPosition(pointHeight, cameraHeight, distanceFromCamera){
    frustumWidth = 60;
    relativeHeight = pointHeight - cameraHeight;
    //atan2 returns radians, so convert to degrees to match frustumWidth.
    angle = atan2(relativeHeight, distanceFromCamera) * 180 / PI;
    return angle / (frustumWidth / 2);
}
You can call the method twice to determine the y position of both the top and the bottom of the enemy:
distanceFromCamera = sqrt((enemy.x - camera.x)^2 + (enemy.y - camera.y)^2);
topBoundary = convertToPixels(getYPosition(enemy.height, camera.height, distanceFromCamera));
bottomBoundary = convertToPixels(getYPosition(0, camera.height, distanceFromCamera));
That should give you enough information to properly scale and position the enemy's sprite.
(aside: the frustumWidths in the two methods don't need to be the same - in fact, they should be different if the screen you are drawing to is not square. The ratio of the x frustum to the y frustum should be equal to the ratio of the width to the height of the screen.)
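A runnable Python sketch of the x-position calculation is given below. It follows the same angle-based mapping as the pseudocode, and adds two things that are assumptions on my part: the camera angle is taken to be in degrees, measured counter-clockwise from the x axis, and the relative angle is wrapped into [-180, 180) so that a camera facing 350 degrees still sees an enemy at -10 degrees. The function names are hypothetical.

```python
import math

def calculate_x_position(cam_x, cam_y, cam_angle_deg, enemy_x, enemy_y,
                         frustum_width_deg=60.0):
    """Return -1 for the left screen edge, 1 for the right edge.

    Values outside [-1, 1] are off screen and should not be drawn.
    """
    # Direction from camera to enemy, relative to the x axis, in degrees.
    angle = math.degrees(math.atan2(enemy_y - cam_y, enemy_x - cam_x))
    # Angle relative to the camera's line of sight, wrapped to [-180, 180).
    relative = (cam_angle_deg - angle + 180.0) % 360.0 - 180.0
    return relative / (frustum_width_deg / 2.0)

def to_pixels(x_position, screen_width):
    """Convert a position in [-1, 1] to a pixel column."""
    return (x_position + 1.0) / 2.0 * screen_width
```

An enemy straight ahead maps to the middle of the screen, for example column 250 on a 500 pixel wide screen.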

In OpenGL, How can I determine the bounds of the view at a given depth

I'm playing around with OpenGL and I've got a question that I haven't been able to find an answer to, or at least haven't found the right way to ask search engines. I have a pretty simple setup: an 800x600 viewport and a projection matrix with a 45 degree field of view and near and far planes of 1.0 and 200.0. For the sake of discussion, the modelview matrix is the identity matrix.
What I'm trying to determine is the bounds of the view at a given depth. For example, (0,0,0) is the center of the screen. And I'm looking in the -Z direction.
I want to know, if I draw geometry on a plane 100 units into the screen (0,0,-100), what are the bounds of the view? How far in the x and y direction can I draw in this plane and the geometry still be visible.
More generically, Given a plane parallel to the near and far plane (and between them), what are the visible bounds of that plane?
Also, if what I'm trying to determine has a common name or is a common operation, what's it called? That way I can track down more reading material
Your view angle is 45 degrees, and you have a plane at a distance of a away from the camera, with an unknown height h. The whole thing looks like this:
Note that the angle here is half of your field of view.
Dusting off the highschool maths books, we get:
tan(angle) = h/a
Rearrange for h and substitute the half field of view:
h = tan(FieldOfView / 2) * a;
This is how much your plane extends upwards along the Y axis.
Since screens aren't square, the width of your plane is different to the height. More exactly, the width is the aspect ratio times the height. I.e. w = h * aspectRatio
I hope this answers your question.
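A small Python sketch of this calculation for the setup in the question. It assumes the 45 degree field of view is the vertical one (as with gluPerspective's fovy); the function name is hypothetical.

```python
import math

def view_bounds_at_depth(fov_y_deg, aspect_ratio, depth):
    """Half width and half height of the visible area at a given depth.

    Visible x runs from -half_w to +half_w and visible y from
    -half_h to +half_h on the plane `depth` units in front of the camera.
    """
    half_h = math.tan(math.radians(fov_y_deg) / 2.0) * depth
    half_w = half_h * aspect_ratio
    return half_w, half_h

half_w, half_h = view_bounds_at_depth(45.0, 800.0 / 600.0, 100.0)
```

For the 800x600 viewport and a plane 100 units away, this gives roughly 41.42 units up and down from the center and 55.23 units left and right.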
