Calculation of viewport coordinates - vector

I read an article about normalized device coordinates (on the German DGL wiki) and the following example is provided:
"Let's consider that we had a viewport with dimensions of 1024 pixels (width) by 768 pixels (height). A point P with absolute, non-normalized coordinates P(350/210) would be P(-0,32/-0,59) in normalized coordinates. These coordinates can now be projected onto a viewport (800x600) just by multiplying the normalized device coordinates (similar to vector scaling) by the size of the viewport. In this case the result would be P(273/164)."
Somehow I can't understand how one gets to the results provided (I mean 273/164 and -0,32/-0,59). Could somebody explain to me how to calculate the coordinates?
P.S. : This is the article - https://wiki.delphigl.com/index.php/Normalisierte_Ger%C3%A4tekoordinate
Thank you!

That article is definitely lacking description. I can get you part of the way there; maybe someone with more math can help finish.
According to this answer, the formula to convert non-normalized coords to normalized coords is:
Nx = (Cx / Sx) * 2 - 1
Ny = 1 - (Cy / Sy) * 2
(where Cx/y = Coordinate X/Y; Sx/y = Screen X/Y; and Nx/y = Normalized X/Y).
Plugging the example's numbers in:
Nx = (350/1024) * 2 - 1 = -0.31640625
Ny = 1 - (210/768) * 2 = 0.453125
...or (-0.32, 0.45).
Reversing this to get the new coords:
Cx = (1 + -0.31640625) / 2 * 800 = 273.4375
Cy = (1 - 0.453125) / 2 * 600 = 164.0625
Note that the normalized Y value (0.45) doesn't match the article's (-0,59), even though the final pixel coordinates (273/164) do. This is probably because my calculation doesn't account for the aspect ratio, and it should, since these screens have a 0.75 aspect ratio while NDC's is 1. This SO answer may help too.
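The two conversions above can be sketched in Python as follows. This is only an illustration of the formulas as written in this answer; the function names `to_ndc` and `from_ndc` are made up for the example, and the pixel origin is assumed to be the top-left corner with y pointing down.

```python
def to_ndc(cx, cy, sx, sy):
    """Pixel coordinates (origin top-left, y down) -> NDC in [-1, 1] (y up)."""
    nx = (cx / sx) * 2 - 1
    ny = 1 - (cy / sy) * 2
    return nx, ny


def from_ndc(nx, ny, sx, sy):
    """NDC in [-1, 1] -> pixel coordinates on a viewport of size sx by sy."""
    cx = (1 + nx) / 2 * sx
    cy = (1 - ny) / 2 * sy
    return cx, cy


# The example from the question: a point on 1024x768 moved to 800x600.
nx, ny = to_ndc(350, 210, 1024, 768)
px, py = from_ndc(nx, ny, 800, 600)
print(nx, ny)  # -0.31640625 0.453125
print(px, py)  # 273.4375 164.0625
```

Because both viewports share the same aspect ratio, the round trip is equivalent to scaling the pixel coordinates by 800/1024.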

Related

Screen coordinates to isometric coordinates

I'm struggling at converting mouse/screen coordinates to isometric tile index. I have tried about every formula I could find here or on internet but none of them seems to work or I am missing something.
Here is a picture, origin is in the top left corner and dimensions of one tile are 128x64px.
I would appreciate any help, thanks.
Basically, you need to apply a rotation matrix with a few other bits. Here's some sample code written in AWK which should be easy to port to any other language:
BEGIN {
    # BEGIN (instead of END) lets the script run without reading any input
    PI = atan2(0, -1);
    x = 878.0;
    y = 158.0;
    # Translate one origin to the other
    x1 = x - 128*5;
    # Stretch the height so that it's the same as the width in the isometric
    # This makes the rotation easier
    # Invert the sign because y is upwards in math but downwards in graphics
    y1 = y * -2;
    # Apply a counter-clockwise rotation of 45 degrees
    xr = cos(PI/4)*x1 - sin(PI/4)*y1;
    yr = sin(PI/4)*x1 + cos(PI/4)*y1;
    # The side of each isometric tile (which is now a square after the stretch)
    diag = 64 * sqrt(2);
    # Calculate which tile the coordinate belongs to
    x2 = int(xr / diag);
    # Don't forget to invert the sign again
    y2 = int(yr * -1 / diag);
    # See the final result
    print x2, y2;
}
I tested it with a few different coordinates and the results seem correct.
I tried the solution by acfrancis and found that the function has its limits when it comes to negative indices. In case someone else runs into this issue:
Reason for the issue: negative values like -0.1 are cast to 0 instead of -1.
It's the classic "there is only one zero" problem for arrays.
To solve it: before casting the x2, y2 values to int:
check if xr/diag < 0 and, if true, result = result - 1
(respectively for y2: yr * -1 / diag < 0 then result = result -1)
you then cast the result values to int like before.
Hope it helps.
Addition:
The translation of the origin by 128*5 seems too specific to a certain case, so I guess it should be removed in order to generalize the function.

Kinect intrinsic parameters from field of view

Microsoft state that the field of view angles for the Kinect are 43 degrees vertical and 57 degrees horizontal (stated here). Given these, can we calculate the intrinsic parameters, i.e. focal point and centre of projection? I assume the centre of projection can be given as (0,0,0)?
Thanks
EDIT: some more information on what I'm trying to do
I have a dataset of images recorded with a Kinect, and I am trying to convert pixel positions (x_screen, y_screen and z_world (in mm)) to real world coordinates.
If I know the camera is placed at point (x',y',z') in the real world coordinate system, is it sufficient to find the real world coordinates by doing the following:
x_world = (x_screen - c_x) * z_world / f_x
y_world = (y_screen - c_y) * z_world / f_y
where c_x = x' and c_y = y' and f_x, f_y is the focal length? And also how can I find the focal length given just knowledge of the field of view?
Thanks
If you equate the world origin (0,0,0) with the camera focus (center of projection as you call it) and you assume the camera is pointing along the positive z-axis, then the situation looks like this in the plane x=0:
Here the axes are z (horizontal) and y (vertical). The subscript v is for "viewport" or screen, and w is for world.
If I get your meaning correctly, you know h, the screen height in pixels. Also, zw, yv and xv. You want to know yw and xw. Note this calculation has (0,0) in the center of the viewport. Adjust appropriately for the usual screen coordinate system with (0,0) in the upper left corner. Apply a little trig:
tan(43/2) = (h/2) / f = h / (2f), so f = h / ( 2 tan(43/2) )
and similar triangles
yw / zw = yv / f also xw / zw = xv / f
Solve:
yw = zw * yv / f and xw = zw * xv / f
Note this assumes the "focal length" of the camera is the same in the x-direction. It doesn't have to be. For best accuracy in xw, you should recalculate with f = w / ( 2 tan(57/2) ) where w is the screen width. This is because f isn't a true focal length. It's just a constant of conversion. If the pixels of the camera are square and the optics have no aberrations, these two f calculations will give the same result.
NB: In a deleted (improper) article the OP seemed to say that it isn't zw that's known but the length D of the hypotenuse: origin to (xw,yw,zw). In this case just note zw = D * f / sqrt(xv² + yv² + f²) (assuming camera pixels are square; some scaling is necessary if not). Then you can proceed as above.
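As a rough Python sketch of the calculation above: the conversion constant f is derived per axis from the field of view, and the pixel is shifted so that (0,0) is the image centre. The 640x480 image size is an assumption for illustration only, as are the function names. The sign of y is left as in screen coordinates (downwards).

```python
import math


def focal_length_px(size_px, fov_deg):
    """f = size / (2 tan(fov / 2)), in pixels, for one image axis."""
    return size_px / (2.0 * math.tan(math.radians(fov_deg) / 2.0))


def pixel_to_world(x_screen, y_screen, z_world, width, height,
                   fov_h_deg=57.0, fov_v_deg=43.0):
    """Back-project a pixel with known depth zw; the camera sits at the origin."""
    fx = focal_length_px(width, fov_h_deg)
    fy = focal_length_px(height, fov_v_deg)
    # Move (0, 0) from the upper-left corner to the centre of the viewport.
    xv = x_screen - width / 2.0
    yv = y_screen - height / 2.0
    # Similar triangles: xw / zw = xv / f and yw / zw = yv / f.
    return z_world * xv / fx, z_world * yv / fy, z_world


def depth_from_range(distance, xv, yv, f):
    """zw when only the hypotenuse D (origin to point) is known; square pixels assumed."""
    return distance * f / math.sqrt(xv * xv + yv * yv + f * f)


# Hypothetical 640x480 frame, point 2000 mm in front of the camera.
print(pixel_to_world(320, 240, 2000.0, 640, 480))  # image centre -> (0.0, 0.0, 2000.0)
print(pixel_to_world(640, 240, 2000.0, 640, 480))  # right edge of the image
```

A pixel on the right edge comes out at xw = zw * tan(57°/2), which is a quick sanity check on the trigonometry.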
I cannot add a comment since my reputation here is too low.
But let me point out that the camera angle of the Kinect generally isn't the same as in a normal photo camera, due to the video stream format and its sensor chip. Therefore the 57 degrees and 43 degrees mentioned by the SDK might refer to a different angular resolution for height and width.
It sends a bitmap of 320x240 pixels, and those pixels relate to
Horizontal FOV: 58.5° (as distributed over 320 pixels horizontally)
Vertical FOV: 45.6° (as distributed over 240 pixels vertically).
Z is known and your angle is known, so I suppose the law of sines can get you the proper locations: https://en.wikipedia.org/wiki/Law_of_sines

How can I calculate the distance between two points in Cartesian space while respecting Asteroids style wrap around?

I have two points (x1, y1) and (x2, y2) which represent the location of two entities in my space. I calculate the Euclidean distance between them using Pythagoras' theorem and everything is wonderful. However, if my space becomes finite, I want to define a new shortest distance between the points that "wraps around" the seams of the map. For example, if I have point A as (10, 10) and point B as (90, 10), and my map is 100 units wide, I'd like to calculate the distance between A and B as 20 (out the right edge of the map and back into the left edge), instead of 80, which is the normal Euclidean distance.
I think my issue is that I'm using a coordinate system that isn't quite right for what I'm trying to do, and that really my flat square map is more of a seamless doughnut shape. Any suggestions for how to implement a system of this nature and convert back and forth from Cartesian coordinates would be appreciated too!
Toroidal plane? Okay, I'll bite.
var raw_dx = Math.abs(x2 - x1);
var raw_dy = Math.abs(y2 - y1);
var dx = (raw_dx < (xmax / 2)) ? raw_dx : xmax - raw_dx;
var dy = (raw_dy < (ymax / 2)) ? raw_dy : ymax - raw_dy;
var l2dist = Math.sqrt((dx * dx) + (dy * dy));
There's a correspondence here between the rollover behavior of your x and y coordinates and the rollover behavior of signed integers represented using the base's complement representation in the method of complements.
If your coordinate bounds map exactly to the bounds of a binary integer type supported by your language, you can take advantage of the two's complement representation used by nearly all current machines by simply performing the subtraction directly, ignoring overflow and reinterpreting the result as a signed value of the same size as the original coordinate. In the general case, you're not going to be that lucky, so the above dance with abs, compare and subtract is required.
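To make the correspondence concrete, here is a small Python sketch. `wrapped_delta` is the general case written with a modulo instead of abs/compare, and `wrapped_delta_u8` imitates the two's complement shortcut for a map that is exactly 256 units wide. Python integers do not overflow, so the 8-bit wrap is emulated with a mask. All function names are invented for this example.

```python
import math


def wrapped_delta(a, b, size):
    """Signed shortest step from a to b on a ring of the given size."""
    half = size / 2
    return (b - a + half) % size - half


def toroidal_distance(p, q, width, height):
    """Euclidean distance between p and q on a map that wraps on both axes."""
    dx = wrapped_delta(p[0], q[0], width)
    dy = wrapped_delta(p[1], q[1], height)
    return math.hypot(dx, dy)


def wrapped_delta_u8(a, b):
    """Same idea for a 256-wide map: subtract, keep 8 bits, read as signed."""
    d = (b - a) & 0xFF                  # what an unsigned 8-bit subtraction would leave
    return d - 256 if d >= 128 else d   # reinterpret the bit pattern as a signed byte


print(toroidal_distance((10, 10), (90, 10), 100, 100))  # 20.0, the example from the question
print(wrapped_delta_u8(10, 250))                        # -16: shorter to go left across the seam
```

The signed result also tells you which direction the short way round is, which the abs-based version discards.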

Radius of projected Sphere

I want to refine a previous question:
How do I project a sphere onto the screen?
(2) gives a simple solution:
approximate radius on screen[CLIP SPACE] = world radius * cot(fov / 2) / Z
with:
fov = field of view angle
Z = z distance from camera to sphere
result is in clipspace, multiply by viewport size to get size in pixels
Now my problem is that I don't have the FOV. Only the view and projection matrices are known (and the viewport size, if that helps).
Does anyone know how to extract the FOV from the projection matrix?
Update:
This approximation works better in my case:
float angularRadius = glm::atan(radius / distance); // angle subtended by the sphere's radius
float radiusPixels = angularRadius * glm::max(viewPort.width, viewPort.height) / glm::radians(fov);
I'm a bit late to this party, but I came across this thread when I was looking into the same problem. I spent a day looking into this and worked through some excellent articles I found here:
http://www.antongerdelan.net/opengl/virtualcamera.html
I ended up starting with the projection matrix and working backwards. I got the same formula you mention in your post above. ( where cot(x) = 1/tan(x) )
radius_pixels = (radius_worldspace / {tan(fovy/2) * D}) * (screen_height_pixels / 2)
(where D is the distance from camera to the target's bounding sphere)
I'm using this approach to determine the radius of an imaginary trackball that I use to rotate my object.
Btw Florian, you can extract the fovy from the Projection matrix as follows:
If you take the Sy component from the Projection matrix as shown here:
Sx 0 0 0
0 Sy 0 0
0 0 Sz Pz
0 0 -1 0
where Sy = near / range
and where range = tan(fovy/2) x near
(you can find these definitions at the page I linked above)
if you substitute range in the Sy eqn above you get:
Sy = 1 / tan(fovy/2) = cot(fovy/2)
rearranging:
tan(fovy/2) = 1 / Sy
taking arctan (the inverse of tan) of both sides we get:
fovy/2 = arctan(1/Sy)
so,
fovy = 2 x arctan(1/Sy)
Not sure if you still care (it's been a while!) but maybe this will help someone else.
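The derivation above translates to a few lines of Python. This sketch assumes the projection matrix is stored as a list of rows in the layout shown above (so Sy is `proj[1][1]`) and that it was built the way gluPerspective builds it. The helper `perspective` exists only to produce a matrix to test against, and all names are chosen for this example.

```python
import math


def perspective(fovy, aspect, near, far):
    """gluPerspective-style matrix as a list of rows (fovy in radians)."""
    f = 1.0 / math.tan(fovy / 2.0)  # this is Sy = cot(fovy / 2)
    return [
        [f / aspect, 0.0, 0.0, 0.0],
        [0.0, f, 0.0, 0.0],
        [0.0, 0.0, (far + near) / (near - far), 2.0 * far * near / (near - far)],
        [0.0, 0.0, -1.0, 0.0],
    ]


def fovy_from_projection(proj):
    """fovy = 2 * arctan(1 / Sy)."""
    return 2.0 * math.atan(1.0 / proj[1][1])


def sphere_radius_pixels(radius_world, distance, fovy, screen_height_px):
    """radius_pixels = radius / (tan(fovy / 2) * D) * (screen_height / 2)."""
    return radius_world / (math.tan(fovy / 2.0) * distance) * (screen_height_px / 2.0)


proj = perspective(math.radians(60.0), 4.0 / 3.0, 0.1, 100.0)
fovy = fovy_from_projection(proj)
print(math.degrees(fovy))                          # approximately 60
print(sphere_radius_pixels(1.0, 10.0, fovy, 600))  # radius in pixels on a 600 px high screen
```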
Update: see below.
Since you have the view and projection matrices, here's one way to do it, though it's probably not the shortest:
transform the sphere's center into view space using the view matrix: call the result point C
transform a point on the surface of the sphere, e.g. C+(r, 0, 0) in world coordinates where r is the sphere's world radius, into view space; call the result point S
compute rv = distance from C to S (in view space)
let point S1 in view coordinates be C + (rv, 0, 0) - i.e. another point on the surface of the sphere in view space, for which the line C -> S1 is perpendicular to the "look" vector
project C and S1 into screen coords using the projection matrix as Cs and S1s
compute screen radius = distance between Cs and S1s
But yeah, like Brandorf said, if you can preserve the camera variables, like FOVy, it would be a lot easier. :-)
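The matrix-only steps listed in this answer can be sketched in plain Python as below. Matrices are assumed to be lists of rows acting on column vectors, with an OpenGL-style projection (camera looking down -z). The result is left in normalized device coordinates rather than pixels, and every name here is made up for the example.

```python
import math


def mat_vec(m, v):
    """Multiply a 4x4 matrix (list of rows) by a 4-component vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]


def projected_radius_ndc(center_world, radius_world, view, proj):
    cx, cy, cz = center_world
    # Steps 1-2: centre C and a surface point S, both taken into view space.
    c = mat_vec(view, [cx, cy, cz, 1.0])
    s = mat_vec(view, [cx + radius_world, cy, cz, 1.0])
    # Step 3: radius in view space (differs from world radius if the view scales).
    rv = math.dist(c[:3], s[:3])
    # Step 4: a surface point offset perpendicular to the look vector.
    s1 = [c[0] + rv, c[1], c[2], 1.0]

    def to_ndc(p):
        clip = mat_vec(proj, p)
        return clip[0] / clip[3], clip[1] / clip[3]  # perspective divide

    # Steps 5-6: project both points and measure the distance between them.
    return math.dist(to_ndc(c), to_ndc(s1))


# Identity view, 90 degree fovy, aspect 1: Sx = Sy = 1.
identity = [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]
near, far = 0.1, 100.0
proj = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, (far + near) / (near - far), 2.0 * far * near / (near - far)],
    [0.0, 0.0, -1.0, 0.0],
]
r_ndc = projected_radius_ndc((0.0, 0.0, -10.0), 1.0, identity, proj)
print(r_ndc)              # close to 0.1, matching radius * cot(fov / 2) / Z
print(r_ndc * 800 / 2.0)  # pixels on an 800 px wide viewport (NDC x spans -1..1)
```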
Update:
Here's a more efficient variant on the above: make an inverse of the projection matrix. Use it to transform the viewport edges back into view space. Then you won't have to project every box into screen coordinates.
Even better, do the same with the view matrix and transform the camera frustum back into world space. That would be more efficient for comparing many boxes against; but harder to figure out the math.
The answer posted at your link radiusClipSpace = radius * cot(fov / 2) / Z, where fov is the angle of the field of view, and Z is the z-distance to the sphere, definitely works. However, keep in mind that radiusClipSpace must be multiplied by the viewport's width to get a pixel measure. The value measured in radiusClipSpace will be a value between 0 and 1 if the object fits on the screen.
An alternative solution may be to use the solid angle of the sphere. The solid angle subtended by a sphere in a sky is basically the area it covers when projected to the unit sphere.
The formulae are given at this link but roughly what I'm doing is:
if( (!radius && !distance) || fabsf(radius) > fabsf(distance) )
; // NAN conditions. do something special.
theta = asinf( radius/distance ) ; // half-angle of the cone subtended by the sphere
sphereSolidAngle = ( 1 - cosf( theta ) ) ; // not multiplying by 2PI since below ratio used only
frustumSolidAngle = ( 1 - cosf( fovy / 2 ) ) / M_PI ; // I cheated here. I assumed
// the solid angle of a frustum is (conical), then divided by PI
// to turn it into a square (area unit square=area unit circle/PI)
numPxCovered = 768.f*768.f * sphereSolidAngle / frustumSolidAngle ; // 768x768 screen
radiusEstimate = sqrtf( numPxCovered/M_PI ) ; // area=pi*r*r
This works out to roughly the same numbers as radius * cot(fov / 2) / Z. If you only want an estimate of the area covered by the sphere's projection in px, this may be an easy way to go.
I'm not sure if a better estimate of the solid angle of the frustum could be found easily. This method involves more computation than radius * cot(fov / 2) / Z.
The FOV is not directly stored in the projection matrix, but rather used when you call gluPerspective to build the resulting matrix.
The best approach would be to simply keep all of your camera variables in their own class, such as a frustum class, whose member variables are used when you call gluPerspective or similar.
It may be possible to get the FOVy back out of the matrix, but the math required eludes me.

I've got my 2D/3D conversion working perfectly, how to do perspective

Although the context of this question is about making a 2D/3D game, the problem I have boils down to some math.
Although it's a 2.5D world, let's pretend it's just 2D for this question.
// xa: x-accent, the x coordinate of the projection
// mapP: a coordinate on a map which need to be projected
// _Dist_ values are constants for the projection, choosing them correctly will result in i.e. an isometric projection
xa = mapP.x * xDistX + mapP.y * xDistY;
ya = mapP.x * yDistX + mapP.y * yDistY;
xDistX and yDistX determine the angle of the x-axis, and xDistY and yDistY determine the angle of the y-axis on the projection (and also the size of the grid, but let's assume this is 1 pixel for simplicity).
x-axis-angle = atan(yDistX/xDistX)
y-axis-angle = atan(yDistY/xDistY)
a "normal" coordinate system like this
--------------- x
|
|
|
|
|
y
has values like this:
xDistX = 1;
yDistX = 0;
xDistY = 0;
yDistY = 1;
So every step in the x direction moves the projection 1 pixel to the right and 0 pixels down. Every step in the y direction moves the projection 0 pixels to the right and 1 pixel down.
When choosing the correct xDistX, yDistX, xDistY, yDistY, you can project any trimetric or dimetric system (which is why I chose this).
So far so good; when this is drawn everything turns out okay. If "my system" and mindset are clear, let's move on to perspective.
I wanted to add some perspective to this grid, so I added some extras like this:
camera = new MapPoint(60, 60);
dx = mapP.x - camera.x; // delta x
dy = mapP.y - camera.y; // delta y
dist = Math.sqrt(dx * dx + dy * dy); // dist is the distance to the camera, Pythagoras etc.. all objects must be in front of the camera
fac = 1 - dist / 100; // this formula determines the amount of perspective
xa = fac * (mapP.x * xDistX + mapP.y * xDistY) ;
ya = fac * (mapP.x * yDistX + mapP.y * yDistY );
Now the really hard part... what if you have a point (xa,ya) on the projection and want to calculate the original point (x,y)?
For the first case (without perspective) I did find the inverse function, but how can this be done for the formula with the perspective? My math skills are not quite up to the challenge of solving this.
(I vaguely remember from a long time ago that Mathematica could create inverse functions for some special cases... could it solve this problem? Could someone maybe try?)
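For reference, the non-perspective case mentioned above is a 2x2 linear map, so its inverse follows from the determinant. The Python sketch below is one way to write it, not necessarily the inverse the asker derived. The function names are invented, and the constants in the usage lines are arbitrary values picked to look roughly isometric.

```python
def project(x, y, xDistX, xDistY, yDistX, yDistY):
    """Forward mapping from the question (no perspective)."""
    xa = x * xDistX + y * xDistY
    ya = x * yDistX + y * yDistY
    return xa, ya


def unproject(xa, ya, xDistX, xDistY, yDistX, yDistY):
    """Inverse of project(): solve the 2x2 linear system for (x, y)."""
    det = xDistX * yDistY - xDistY * yDistX
    if det == 0:
        raise ValueError("the two projected axes are parallel; no inverse exists")
    x = (yDistY * xa - xDistY * ya) / det
    y = (xDistX * ya - yDistX * xa) / det
    return x, y


consts = (2.0, -2.0, 1.0, 1.0)  # arbitrary: xDistX, xDistY, yDistX, yDistY
xa, ya = project(3.0, 4.0, *consts)
print(xa, ya)                      # -2.0 7.0
print(unproject(xa, ya, *consts))  # (3.0, 4.0)
```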
The function you've defined doesn't have an inverse. Just as an example, as user207422 already pointed out, anything that's 100 units away from the camera will get mapped to (xa,ya)=(0,0), so the inverse isn't uniquely defined.
More importantly, that's not how you calculate perspective. Generally the perspective scaling factor is defined to be viewdist/zdist where zdist is the perpendicular distance from the camera to the object and viewdist is a constant which is the distance from the camera to the hypothetical screen onto which everything is being projected. (See the diagram here, but feel free to ignore everything else on that page.) The scaling factor you're using in your example doesn't have the same behaviour.
Here's a stab at trying to convert your code into a correct perspective calculation (note I'm not simplifying to 2D; perspective is about projecting three dimensions to two, trying to simplify the problem to 2D is kind of pointless):
camera = new MapPoint(60, 60, 10);
camera_z = camera.x*zDistX + camera.y*zDistY + camera.z*zDistZ;
// viewdist is the distance from the viewer's eye to the screen in
// "world units". You'll have to fiddle with this, probably.
viewdist = 10.0;
xa = mapP.x*xDistX + mapP.y*xDistY + mapP.z*xDistZ;
ya = mapP.x*yDistX + mapP.y*yDistY + mapP.z*yDistZ;
za = mapP.x*zDistX + mapP.y*zDistY + mapP.z*zDistZ;
zdist = camera_z - za;
scaling_factor = viewdist / zdist;
xa *= scaling_factor;
ya *= scaling_factor;
You're only going to return xa and ya from this function; za is just for the perspective calculation. I'm assuming the "za-direction" points out of the screen, so if the pre-projection x-axis points towards the viewer then zDistX should be positive and vice-versa, and similarly for zDistY. For a trimetric projection you would probably have xDistZ==0, yDistZ<0, and zDistZ==0. This would make the pre-projection z-axis point straight up post-projection.
Now the bad news: this function doesn't have an inverse either. Any point (xa,ya) is the image of an infinite number of points (x,y,z). But! If you assume that z=0, then you can solve for x and y, which is possibly good enough.
To do that you'll have to do some linear algebra. Compute camera_x and camera_y similarly to camera_z. Those are the post-transformation coordinates of the camera. The point on the screen has post-transformation coordinates (xa,ya,camera_z-viewdist). Draw a line through those two points, and calculate where it intersects the plane spanned by the vectors (xDistX, yDistX, zDistX) and (xDistY, yDistY, zDistY). In other words, you need to solve the equations:
x*xDistX + y*xDistY == s*camera_x + (1-s)*xa
x*yDistX + y*yDistY == s*camera_y + (1-s)*ya
x*zDistX + y*zDistY == s*camera_z + (1-s)*(camera_z - viewdist)
It's not pretty, but it will work.
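The three equations are linear in the unknowns x, y and s, so they can be solved with Cramer's rule. The Python sketch below does only that. The function names, the packing of the constants into a `dist` tuple, and the numbers in the example call are all assumptions for illustration, not values from the question.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def unproject_to_ground(xa, ya, camera, viewdist, dist):
    """Solve the three equations above for (x, y, s), assuming z = 0.

    camera: (camera_x, camera_y, camera_z), post-transformation.
    dist:   ((xDistX, xDistY), (yDistX, yDistY), (zDistX, zDistY)).
    """
    cam_x, cam_y, cam_z = camera
    (xDistX, xDistY), (yDistX, yDistY), (zDistX, zDistY) = dist
    # Move every term with an unknown to the left-hand side:
    #   x*xDistX + y*xDistY + s*(xa - camera_x) = xa, and likewise for y and z.
    a = [[xDistX, xDistY, xa - cam_x],
         [yDistX, yDistY, ya - cam_y],
         [zDistX, zDistY, -viewdist]]
    b = [xa, ya, cam_z - viewdist]
    d = det3(a)
    if abs(d) < 1e-12:
        raise ValueError("the line of sight is parallel to the ground plane")

    def with_column(col):
        # Copy of a with one column replaced by the right-hand side.
        return [[b[r] if c == col else a[r][c] for c in range(3)] for r in range(3)]

    return tuple(det3(with_column(col)) / d for col in range(3))


# Hypothetical constants, only to exercise the solver.
dist = ((1.0, -1.0), (0.5, 0.5), (0.5, 0.5))
camera = (0.0, 60.0, 70.0)
x, y, s = unproject_to_ground(5.0, 20.0, camera, 10.0, dist)
print(x, y, s)
```

Substituting the returned x, y and s back into the three equations is a cheap way to confirm the solve before trusting it in a game loop.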
I think that with your post I can solve the problem. Still, to clarify some points:
Solving the problem in 2D is useless indeed, but this was only done to make the problem easier to grasp (for me and for the readers here). My program actually gives a perfect 3D projection (I checked it against 3D images rendered with Blender). I did leave something out about the inverse function though. The inverse function is only for coordinates between 0..camera.x * 0.5 and 0..camera.y * 0.5, so in my example between 0 and 30. But even then I have doubts about my function.
In my projection the z-axis is always straight up, so to calculate the height of an object I only used the viewing angle. But since you can't actually fly or jump into the sky, everything has only a 2D point. This also means that when you try to solve for x and y, z really is 0.
I know not every function has an inverse, and some functions do, but only for a particular domain. My basic thought in all this was: if I can draw a grid using a function, every point on that grid maps to exactly one map point. I can read the x and y coordinates, so if I just had the correct function I would be able to calculate the inverse.
But there is no better replacement than some good solid math, and I'm very glad you took the time to give a very helpful response :).

Resources