Normalized Device Coordinate Metal coming from OpenGL - math

Alright, so I know there are a lot of questions referring to normalized device coordinates here on SO, but none of them address my particular issue.
So, everything I draw is specified in 2D screen coordinates, where the top left is (0,0) and the bottom right is (screenWidth, screenHeight). Then, in my vertex shader, I do this calculation to get NDC (basically, I'm rendering UI elements):
float ndcX = (screenX - ScreenHalfWidth) / ScreenHalfWidth;
float ndcY = 1.0 - (screenY / ScreenHalfHeight);
where screenX/screenY are pixel coordinates, for example (600, 700), and ScreenHalfWidth/ScreenHalfHeight are half of the screen width/height.
And the final position that I return from the vertex shader for the rasterization state is:
gl_Position = vec4(ndcX, ndcY, Depth, 1.0);
This works perfectly fine in OpenGL ES.
Now the problem is that when I try it just like this in Metal 2, it doesn't work.
I know Metal's NDC are 2x2x1 and OpenGL's NDC are 2x2x2, but I thought depth didn't play an important part in this equation, since I am passing it in myself per vertex.
I tried this link and this SO question but was confused. The links weren't that helpful, because I am trying to avoid matrix calculations in the vertex shader while I am rendering everything in 2D.
So my questions: What is the formula to transform pixel coordinates to NDC in Metal? Is it possible without using an orthographic projection matrix? Why doesn't my equation work for Metal?

It is of course possible without a projection matrix. Matrices are just a useful convenience for applying transformations. But it's important to understand how they work when situations like this arise, since using a general orthographic projection matrix would perform unnecessary operations to arrive at the same results.
Here are the formulae I might use to do this:
float xScale = 2.0f / drawableSize.x;
float yScale = -2.0f / drawableSize.y;
float xBias = -1.0f;
float yBias = 1.0f;
float clipX = position.x * xScale + xBias;
float clipY = position.y * yScale + yBias;
Where drawableSize is the dimension (in pixels) of the renderbuffer, which can be passed in a buffer to the vertex shader. You can also precompute the scale factors and pass those in instead of the screen dimensions, to save some computation on the GPU.
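As a quick sanity check of the scale-and-bias arithmetic above, here is a minimal CPU-side sketch in Python (not shader code). The function name pixel_to_ndc and the 1200x800 drawable size are made up for illustration; only the formula itself comes from the answer.

```python
def pixel_to_ndc(x, y, drawable_width, drawable_height):
    """Map pixel coordinates (origin top-left, y down) to NDC x/y in [-1, 1]."""
    x_scale = 2.0 / drawable_width
    y_scale = -2.0 / drawable_height  # negative: pixel y grows down, NDC y grows up
    x_bias = -1.0
    y_bias = 1.0
    return (x * x_scale + x_bias, y * y_scale + y_bias)


if __name__ == "__main__":
    w, h = 1200, 800  # hypothetical drawable size in pixels
    for px, py in [(0, 0), (w / 2, h / 2), (w, h)]:
        print((px, py), "->", pixel_to_ndc(px, py, w, h))
```

The top-left pixel should land at the upper-left NDC corner, the centre at the origin, and the bottom-right pixel at the lower-right corner. Algebraically this is the same mapping as the formula in the question, so a difference between APIs is more likely to come from the values fed in than from the formula.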

Related

Calculating camera matrix, given position, angle and FOV

I've recently been venturing into conversion of 3D points in space to a 2D pixel position on a screen, and almost every single answer I've found has been something like "do X with your world-to-camera matrix, and multiply by your viewport height to get it in pixels".
Now, that's all fine and good, but oftentimes these questions were about programming for video game engines, where a function to get a camera's view matrix is often built into a library and called on-command. But in my case, I can't do that - I need to know how to, given an FOV (say, 78 degrees) and a position and angle (of the format pitch = x, yaw = y, roll = z) it's facing, calculate the view matrix of a virtual camera.
Does anybody know what I need to do? I'm working with Lua (with built-in userdata for things like 3D vectors, angles, and 4x4 matrices exposed via the C interface), if that helps.
I am using gluPerspective
where:
fovw,fovh // are FOV in width and height of screen angles [rad]
zn,zf // are znear,zfar distances from focal point of camera
When using FOVy notation from OpenGL then:
aspect = width/height
fovh = FOVy
fovw = FOVx = FOVy*aspect (a small-angle approximation; exactly, tan(FOVx/2) = aspect*tan(FOVy/2))
so just feed your 4x4 matrix with the values in order defined by notations you use (column or row major order).
I get the feeling you are doing a software renderer on your own, so do not forget to do the perspective divide! Also take a look at the matrix link above and at:
3D graphic pipeline
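For reference, here is a minimal Python sketch of the perspective matrix that gluPerspective is documented to build, followed by the perspective divide. The function names are mine, the angle is taken in radians here (gluPerspective itself takes degrees), and the matrix is stored as rows and applied to column vectors. It covers only the projection, not the view matrix built from position and angles.

```python
import math


def perspective(fovy, aspect, zn, zf):
    """gluPerspective-style matrix (rows, for column vectors); fovy in radians."""
    f = 1.0 / math.tan(fovy / 2.0)
    return [
        [f / aspect, 0.0, 0.0, 0.0],
        [0.0, f, 0.0, 0.0],
        [0.0, 0.0, (zf + zn) / (zn - zf), 2.0 * zf * zn / (zn - zf)],
        [0.0, 0.0, -1.0, 0.0],
    ]


def project(m, point):
    """Transform a camera-space point and apply the perspective divide."""
    x, y, z = point
    v = (x, y, z, 1.0)
    clip = [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]
    w = clip[3]
    return (clip[0] / w, clip[1] / w, clip[2] / w)


if __name__ == "__main__":
    m = perspective(math.radians(78.0), 16.0 / 9.0, 0.1, 100.0)
    print(project(m, (0.0, 0.0, -0.1)))    # a point on the near plane
    print(project(m, (0.0, 0.0, -100.0)))  # a point on the far plane
```

With this convention the camera looks down the negative z axis, and points on the near and far planes end up at NDC z of -1 and +1 respectively.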

Converting XYZ to XY (world coords to screen coords)

Is there a way to convert that data:
Object position which is a 3D point (X, Y, Z),
Camera position which is a 3D point (X, Y, Z),
Camera yaw, pitch, roll (-180:180, -90:90, 0)
Field of view (-45°:45°)
Screen width & height
into the 2D point on the screen (X, Y)?
I'm looking for proper math calculations according to this exact set of data.
It's difficult, but it's possible to do it for yourself.
There are lots of libraries that do this for you, but it is more satisfying if you do it yourself:
This problem is possible and I have written my own 3D engine to do this for objects in javascript using the HTML5 Canvas. You can see my code here and solve a 3D maze game I wrote here to try and understand what I will talk about below...
The basic idea is to work in steps. To start, forget about the camera angle (yaw, pitch and roll), as these come later, and just imagine you are looking down the y axis. Then calculate, using trig, the yaw and pitch angles from the camera to your object coordinate. By this I mean: imagine you are looking through a letterbox. The yaw angle is the angle left or right of the centre line to your coordinate (so it can be positive or negative), and the pitch angle is the angle up or down from it. Taking these angles, you can map them to the x and y 2D coordinate system.
The calculations for the angles are:
yaw = atan((coord.x - cam.x) / (coord.y - cam.y))
pitch = atan((coord.z - cam.z) / (coord.y - cam.y))
with coord.x, coord.y and coord.z being the coordinates of the object and the same for the cam (cam.x, cam.y and cam.z). These calculations also assume that you are using a Cartesian coordinate system with the axes being: z up, y forward and x right.
From here, the next step is to map these angles in the 3D world to a coordinate which you can use in a 2D graphical representation.
To map these angles onto your screen, you need to scale them up as distances from the mid line. This means multiplying them by your screen width / fov (or height / fov for the vertical angle). Finally, these distances will be positive or negative (as each is an angle from the mid line), so to actually draw on a canvas you need to add half of the screen width (or height).
So this would mean your canvas coordinate would be:
x = width / 2 + (yaw * (width / fov))
y = height / 2 + (pitch * (height / fov))
where width and height are the dimensions of your screen, fov is the camera's fov (in the same angular units as the angles, i.e. radians if atan returns radians), and yaw and pitch are the respective angles of the object from the camera.
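The two steps above (angles from the view axis, then scaling to the canvas) can be sketched in Python as follows. This is only an illustration under the stated assumptions: the camera looks straight down +y with no rotation, fov is in radians, and the names project_point, horizontal and vertical are mine. Points at or behind the camera are rejected, which the description above does not cover.

```python
import math


def project_point(coord, cam, width, height, fov):
    """Project a 3D point (x right, y forward, z up) to canvas coordinates.

    Returns None when the point is not in front of the camera.
    """
    dx = coord[0] - cam[0]
    dy = coord[1] - cam[1]
    dz = coord[2] - cam[2]
    if dy <= 0:
        return None  # at or behind the camera: the angle mapping is undefined
    horizontal = math.atan(dx / dy)  # angle left/right of the view axis
    vertical = math.atan(dz / dy)    # angle up/down from the view axis
    x = width / 2 + horizontal * (width / fov)
    y = height / 2 + vertical * (height / fov)
    return (x, y)


if __name__ == "__main__":
    cam = (0.0, 0.0, 0.0)
    print(project_point((0.0, 5.0, 0.0), cam, 800, 600, math.pi / 2))
    print(project_point((5.0, 5.0, 0.0), cam, 800, 600, math.pi / 2))
```

A point straight ahead lands in the centre of the canvas. With a 90 degree fov, a point 45 degrees to the right lands on the right edge under this mapping. On a canvas whose y axis points down you may want to subtract the vertical term instead of adding it.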
You have now achieved the first big step, which is mapping a 3D coordinate down to 2D. If you have managed to get this all working, I would suggest trying multiple points and connecting them to form shapes. Also try moving your camera's position to see how the perspective changes; you will soon see how realistic it already looks.
In addition, if this worked fine for you, you can move on to having the camera be able to not only change its position in the 3D world but also change its perspective as in yaw, pitch and roll angles. I will not go into this entirely now, but the basic idea is to use 3D world transformation matrices. You can read up about them here but they do get quite complicated, however I can give you the calculations if you get this far.
It might help to read (old style) OpenGL specs:
https://www.khronos.org/registry/OpenGL/specs/gl/glspec14.pdf
See section 2.10
Also:
https://www.khronos.org/opengl/wiki/Vertex_Transformation
Might help with more concrete examples.
Also, for "proper math" look up 4x4 matrices, projections, and homogeneous coordinates.
https://en.wikipedia.org/wiki/Homogeneous_coordinates

A* orientation discretization

I have a space with obstacles I wish to find a path through. What I can do is discretize the space into a grid and use A* (or D* or whatever) to find a path through it. I now wish to add orientation to the algorithm, so the node location becomes a 3D vector (x, y, phi). You can go from one node to another only if they belong to an arc (both positions are on a circle and are oriented along the tangent lines). How do I discretize the space so that the angles don't explode, that is, so that the set of angles reachable by traversing the graph stays finite?
Thanks.
As I understand it, your challenge is not to discretize the coordinates, but to discretize the headings. I had to do the same thing in a grid world that allowed movement in eight directions, i.e. horizontal, vertical and diagonal. Your discretized space should match the problem domain. For your consideration:
4-directions: use a square grid with movement across edges
8-directions: use a square grid with movement across edges and vertices
6-directions: use a hexagonal grid with movement across edges
12-directions: use a hexagonal grid with movement across edges and points
... and so on.
To actually get the discretized headings, I declared an enum called Direction:
public enum Direction {
    North,
    NorthEast,
    East,
    SouthEast,
    South,
    SouthWest,
    West,
    NorthWest;
    //additional code below...
}
You can look up the correct heading by first computing the XY-offset between the current position and goal position:
int dx = currentPosition.x - goalPosition.x;
int dy = currentPosition.y - goalPosition.y;
These were passed to the getInstance(int,int) method (below) to obtain the correct Direction:
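public static Direction getInstance(int dx, int dy) {
    int count = Direction.values().length;
    double rad = Math.atan2(dy, dx); // heading in radians
    // Convert to degrees; +450 adds a quarter turn (plus a full turn to keep the value positive)
    double degree = rad * (180 / Math.PI) + 450;
    // Round to the nearest of the evenly spaced headings and wrap around
    return getInstance(((int) Math.round(((degree % 360) / (360 / count)))) % count);
}
public static Direction getInstance(int i) {
    return Direction.values()[i % Direction.values().length];
}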
In effect, these methods compute the heading in degrees and round to the nearest Direction. You can then implement a method that moves/turns the agent in the Direction heading, e.g. agent.turnToward(Direction d) or agent.move(Direction d).
Additional Resources:
Hexagon grids: http://www.redblobgames.com/grids/hexagons/#distances
Representing grids with graphs: http://www.redblobgames.com/pathfinding/grids/algorithms.html
Pathfinding with A*: http://theory.stanford.edu/~amitp/GameProgramming/
Angles can be prevented from blowing up by ensuring that phi is considered to be modulo 2pi, that is, phi = phi + 2pi*k for any integer value of k.
In C like syntax, you might end up updating phi with fmod.
phi = fmod(phi + deltaphi, 2*pi)
Where deltaphi is the change in angle you're introducing (in radians).
The most common way to do this is to constrain the values of the angle phi to take on one of n discrete angles which also has the advantage of avoiding precision/rounding issues. Given that you know phi can only take on one of n values, you can treat it like an integer, and map the value to a real when necessary.
i = (i + deltai) % n
phi = (2*i*pi)/n
where a step of deltai corresponds to a change in angle of (2*deltai*pi)/n radians.
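The integer representation above can be sketched in a few lines of Python. The function names and the choice of n = 8 in the example are illustrative assumptions, not part of the answer.

```python
import math


def step_heading(i, delta_i, n):
    """Advance a discrete heading index by delta_i steps, wrapping to 0..n-1."""
    return (i + delta_i) % n


def heading_angle(i, n):
    """Map a discrete heading index back to an angle in radians."""
    return 2.0 * math.pi * i / n


if __name__ == "__main__":
    n = 8  # hypothetical number of discrete headings
    i = step_heading(7, 2, n)
    print(i, heading_angle(i, n))
    print(step_heading(0, -1, n))
```

Because the heading is stored as an integer, any sequence of turns visits at most n distinct angles, which is what keeps the search graph finite. One detail to watch when porting this: Python's % always returns a non-negative result for positive n, whereas C's % and fmod can return negative values for negative inputs, so a C version may need to add n (or 2*pi) before wrapping.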
However, finding a good discretization is only part of the solution - it defines a representation of your configuration space, but as you've pointed out, you also need to consider what a valid transition is.
The simplest approach to integrate angles into planning is to require rotations and translations to be distinct (at any time you can do one or the other, but not both), or to be composable (always translate, and then on arriving instantaneously rotate).
Moving forward and/or backward while you're turning is more complex, and tends to not work particularly well with discrete lattices: it tends to require some model of the vehicle you're working with. The most common are the simple nonholonomic models: the forward-only car (the Dubins car) or the car with forward/reverse (the Reeds-Shepp car). Given your reference to tangents to circles, I'm guessing this is what you're after. Dubins-Curves, or similar libraries, can be used to build libraries of possible paths that could be combined with an A* (or D*) planner.
Differentially Constrained Mobile Robot Motion Planning in State Lattices by Mihail Pivtoraiko, Ross A. Knepper and Alonzo Kelly has some striking images of what's possible.

How do I calculate pixel shader depth to render a circle drawn on a point sprite as a sphere that will intersect with other objects?

I am writing a shader to render spheres on point sprites, by drawing shaded circles, and need to write a depth component as well as colour in order that spheres near each other will intersect correctly.
I am using code similar to that written by Johna Holwerda:
void PS_ShowDepth(VS_OUTPUT input, out float4 color : COLOR0, out float depth : DEPTH)
{
    float dist = length(input.uv - float2(0.5f, 0.5f)); // distance from the center of the point sprite
    float alpha = saturate(sign(0.5f - dist)); // 1 inside the circle, 0 outside
    float sphereDepth = cos(dist * 3.14159) * sphereThickness * particleSize; // how thick the sphere should be; sphereThickness is a variable
    depth = saturate(sphereDepth + input.color.w); // input.color.w represents the depth value of the pixel on the point sprite
    color = float4(depth.xxx, alpha); // or anything else you might need in future passes
}
The video at that link gives a good idea of the effect I'm after: those spheres drawn on point sprites intersect correctly. I've added images below to illustrate too.
I can calculate the depth of the point sprite itself fine. However, I am not sure how to calculate the thickness of the sphere at a pixel in order to add it to the sprite's depth, to give a final depth value. (The above code uses a variable rather than calculating it.)
I've been working on this on and off for several weeks but haven't figured it out - I'm sure it's simple, but it's something my brain hasn't twigged.
Direct3D 9's point sprite sizes are calculated in pixels, and my sprites have several sizes - both by falloff due to distance (I implemented the same algorithm the old fixed-function pipeline used for point size computations in my vertex shader) and also due to what the sprite represents.
How do I go from the data I have in a pixel shader (sprite location, sprite depth, original world-space radius, radius in pixels onscreen, normalised distance of the pixel in question from the centre of the sprite) to a depth value? A partial solution simply of sprite size to sphere thickness in depth coordinates would be fine - that can be scaled by the normalised distance from the centre to get the thickness of the sphere at a pixel.
I am using Direct3D 9 and HLSL with shader model 3 as the upper SM limit.
In pictures
To demonstrate the technique, and the point at which I'm having trouble:
Start with two point sprites, and in the pixel shader draw a circle on each, using clip to remove fragments outside the circle's boundary:
One will render above the other, since after all they are flat surfaces.
Now, make the shader more advanced, and draw the circle as though it was a sphere, with lighting. Note that even though the flat sprites look 3D, they still draw with one fully in front of the other since it's an illusion: they are still flat.
(The above is easy; it's the final step I am having trouble with and am asking how to achieve.)
Now, instead of the pixel shader writing only colour values, it should write the depth as well:
void SpherePS (...any parameters...
out float4 oBackBuffer : COLOR0,
out float oDepth : DEPTH0 <- now also writing depth
)
{
Note that now the spheres intersect when the distance between them is smaller than the sum of their radii:
How do I calculate the correct depth value in order to achieve this final step?
Edit / Notes
Several people have commented that a real sphere will distort due to perspective, which may be especially visible at the edges of the screen, and so I should use a different technique. First, thanks for pointing that out, it's not necessarily obvious and is good for future readers! Second, my aim is not to render a perspective-correct sphere, but to render millions of data points fast, and visually I think a sphere-like object looks nicer than a flat sprite, and shows the spatial position better too. Slight distortion or lack of distortion does not matter. If you watch the demo video, you can see how it is a useful visual tool. I don't want to render actual sphere meshes because of the large number of triangles compared to a simple hardware-generated point sprite. I really do want to use the technique of point sprites, and I simply want to extend the extant demo technique in order to calculate the correct depth value, which in the demo was passed in as a variable with no source for how it was derived.
I came up with a solution yesterday, which works well and produces the desired result of a sphere drawn on the sprite, with a correct depth value which intersects with other objects and spheres in the scene. It may be less efficient than it needs to be (it calculates and projects two vertices per sprite, for example) and is probably not fully correct mathematically (it takes shortcuts), but it produces visually good results.
The technique
In order to write out the depth of the 'sphere', you need to calculate the radius of the sphere in depth coordinates - i.e., how thick half the sphere is. This amount can then be scaled as you write out each pixel on the sphere by how far from the centre of the sphere you are.
To calculate the radius in depth coordinates:
Vertex shader: in unprojected scene coordinates, cast a ray from the eye through the sphere centre (that is, the vertex that represents the point sprite) and extend it by the radius of the sphere. This gives you a point lying on the surface of the sphere. Project both the sprite vertex and your new sphere surface vertex, and calculate depth (z/w) for each. The difference is the depth value you need.
Pixel shader: to draw a circle you already calculate a normalised distance from the centre of the sprite, using clip to not draw pixels outside the circle. Since it's normalised (0-1), multiply this by the sphere depth (which is the depth value of the radius, i.e. the pixel at the centre of the sphere) and add it to the depth of the flat sprite itself. This gives a depth that is thickest at the sphere centre and falls to 0 at the edge, following the surface of the sphere. (Depending on how accurate you need it, use a cosine to get a curved thickness. I found linear gave perfectly fine-looking results.)
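To compare the linear and cosine thickness curves mentioned above with the exact half-thickness of a sphere, here is a small Python sketch. It assumes d is the normalised distance from the sprite centre, 0 at the centre and 1 at the edge (the shader's own fCircleDist may be measured the other way round), and all function names are illustrative. The returned factor would scale the per-sprite radius in depth coordinates.

```python
import math


def linear_profile(d):
    """Linear falloff: 1 at the centre, 0 at the edge."""
    return 1.0 - d


def cosine_profile(d):
    """Cosine falloff: 1 at the centre, 0 at the edge."""
    return math.cos(d * math.pi / 2.0)


def sphere_profile(d):
    """Exact half-thickness of a unit sphere at distance d from its axis."""
    return math.sqrt(max(0.0, 1.0 - d * d))


def depth_offset(d, radius_depth, profile=linear_profile):
    """Thickness to combine with the flat sprite depth at this pixel."""
    return profile(d) * radius_depth


if __name__ == "__main__":
    for d in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(d, linear_profile(d), cosine_profile(d), sphere_profile(d))
```

All three curves agree at the centre and at the edge, and differ only in between, which matches the observation that the linear version already looks fine. The cosine used in the question's code, cos(dist * 3.14159) with dist running from 0 to 0.5, is the same curve as cosine_profile here.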
Code
This is not full code since my effects are for my company, but the code here is rewritten from my actual effect file omitting unnecessary / proprietary stuff, and should be complete enough to demonstrate the technique.
Vertex shader
void SphereVS(float4 vPos, // Input vertex
    float fPointRadius, // Radius of circle / sphere in world coords
    out float fDXScale, // Result of DirectX algorithm to scale the sprite size
    out float fDepth, // Flat sprite depth
    out float4 oPos : POSITION0, // Projected sprite position
    out float fDiameter : PSIZE, // Sprite size in pixels (DX point sprites are sized in px)
    out float fSphereRadiusDepth : TEXCOORDn // Radius of the sphere in depth coords
)
{
...
// Normal projection
oPos = mul(vPos, g_mWorldViewProj);
// DX depth (of the flat billboarded point sprite)
fDepth = oPos.z / oPos.w;
// Also scale the sprite size - DX specifies a point sprite's size in pixels.
// One (old) algorithm is in http://msdn.microsoft.com/en-us/library/windows/desktop/bb147281(v=vs.85).aspx
fDXScale = ...;
fDiameter = fDXScale * fPointRadius;
// Finally, the key: what's the depth coord to use for the thickness of the sphere?
fSphereRadiusDepth = CalculateSphereDepth(vPos, fPointRadius, fDepth, fDXScale);
...
}
All standard stuff, but I include it to show how it's used.
The key method and the answer to the question is:
float CalculateSphereDepth(float4 vPos, float fPointRadius, float fSphereCenterDepth, float fDXScale) {
// Calculate sphere depth. Do this by calculating a point on the
// far side of the sphere, ie cast a ray from the eye, through the
// point sprite vertex (the sphere center) and extend it by the radius
// of the sphere
// The difference in depths between the sphere center and the sphere
// edge is then used to write out sphere 'depth' on the sprite.
float4 vRayDir = vPos - g_vecEyePos;
float fLength = length(vRayDir);
vRayDir = normalize(vRayDir);
fLength = fLength + fPointRadius; // Distance from eye through sphere center to edge of sphere
float4 oSphereEdgePos = g_vecEyePos + (fLength * vRayDir); // Point on the edge of the sphere
oSphereEdgePos.w = 1.0;
oSphereEdgePos = mul(oSphereEdgePos, g_mWorldViewProj); // Project it
// DX depth calculation of the projected sphere-edge point
const float fSphereEdgeDepth = oSphereEdgePos.z / oSphereEdgePos.w;
float fSphereRadiusDepth = fSphereCenterDepth - fSphereEdgeDepth; // Difference between center and edge of sphere
fSphereRadiusDepth *= fDXScale; // Account for sphere scaling
return fSphereRadiusDepth;
}
Pixel shader
void SpherePS(
...
float fSpriteDepth : TEXCOORD0,
float fSphereRadiusDepth : TEXCOORD1,
out float4 oFragment : COLOR0,
out float fSphereDepth : DEPTH0
)
{
float fCircleDist = ...; // See example code in the question
// 0-1 value from the center of the sprite, use clip to form the sprite into a circle
clip(fCircleDist);
fSphereDepth = fSpriteDepth + (fCircleDist * fSphereRadiusDepth);
// And calculate a pixel color
oFragment = ...; // Add lighting etc here
}
This code omits lighting etc. To calculate how far the pixel is from the centre of the sprite (to get fCircleDist) see the example code in the question (calculates 'float dist = ...') which already drew a circle.
The end result is...
Result
Voila, point sprites drawing spheres.
Notes
The scaling algorithm for the sprites may require the depth to be scaled, too. I am not sure that line is correct.
It is not fully mathematically correct (it takes shortcuts), but as you can see the result is visually correct.
When using millions of sprites, I still get a good rendering speed (<10ms per frame for 3 million sprites, on a VMWare Fusion emulated Direct3D device)
The first big mistake is that a real 3d sphere will not project to a circle under perspective 3d projection.
This is very non intuitive, but look at some pictures, especially with a large field of view and off center spheres.
Second, I would recommend against using point sprites in the beginning, it might make things harder than necessary, especially considering the first point. Just draw a generous bounding quad around your sphere and go from there.
In your shader you should have the screen space position as an input. From that, the view transform, and your projection matrix you can get to a line in eye space. You need to intersect this line with the sphere in eye space (raytracing), get the eye space intersection point, and transform that back to screen space. Then output 1/w as depth. I am not doing the math for you here because I am a bit drunk and lazy and I don't think that's what you really want to do anyway. It's a great exercise in linear algebra though, so maybe you should try it. :)
The effect you are probably trying to do is called Depth Sprites and is usually used only with an orthographic projection and with the depth of a sprite stored in a texture. Just store the depth along with your color for example in the alpha channel and just output
eye.z+(storeddepth-.5)*depthofsprite.
A sphere will not project to a circle in the general case. Here is the solution.
This technique is called spherical billboards. An in-depth description can be found in this paper:
Spherical Billboards and their Application to Rendering Explosions
You draw point sprites as quads and then sample a depth texture in order to find the distance between per-pixel Z-value and your current Z-coordinate. The distance between the sampled Z-value and current Z affects the opacity of the pixel to make it look like a sphere while intersecting underlying geometry. Authors of the paper suggest the following code to compute opacity:
float Opacity(float3 P, float3 Q, float r, float2 scr)
{
float alpha = 0;
float d = length(P.xy - Q.xy);
if(d < r) {
float w = sqrt(r*r - d*d);
float F = P.z - w;
float B = P.z + w;
float Zs = tex2D(Depth, scr);
float ds = min(Zs, B) - max(f, F);
alpha = 1 - exp(-tau * (1-d/r) * ds);
}
return alpha;
}
This will prevent sharp intersections of your billboards with the scene geometry.
If the point-sprite pipeline is difficult to control (I can speak only for OpenGL, not DirectX), it is better to use GPU-accelerated billboarding: you supply 4 equal 3D vertices that match the center of the particle. Then you move them to the appropriate billboard corners in a vertex shader, i.e.:
if ( idx == 0 ) ParticlePos += (-X - Y);
if ( idx == 1 ) ParticlePos += (+X - Y);
if ( idx == 2 ) ParticlePos += (+X + Y);
if ( idx == 3 ) ParticlePos += (-X + Y);
This is more oriented to the modern GPU pipeline and of course will work with any nondegenerate perspective projection.

Calculating modelview matrix for 2D camera using Eigen

I'm trying to calculate the modelview matrix of my 2D camera but I can't get the formula right. I use the Affine3f transform class so the matrix is compatible with OpenGL. This is the closest I got by trial and error. This code rotates and scales the camera OK, but if I apply translation and rotation at the same time the camera movement gets messed up: the camera moves in a rotated fashion, which is not what I want. (This is probably because I first apply the rotation matrix and then the translation.)
Eigen::Affine3f modelview;
modelview.setIdentity();
modelview.translate(Eigen::Vector3f(camera_offset_x, camera_offset_y, 0.0f));
modelview.scale(Eigen::Vector3f(camera_zoom_x, camera_zoom_y, 0.0f));
modelview.rotate(Eigen::AngleAxisf(camera_angle, Eigen::Vector3f::UnitZ()));
modelview.translate(Eigen::Vector3f(camera_x, camera_y, 0.0f));
[loadmatrix_to_gl]
What I want is for the camera to rotate and scale around the offset position in screen space {(0,0) is the middle of the screen in this case} and then be positioned along the global xy-axes in world space {(0,0) is also initially at the middle of the screen} at the final position. How would I do this?
Note that I have set up also an orthographic projection matrix, which may affect this problem.
If you want a 2D image, rendered in the XY plane with OpenGL, to (1) rotate counter-clockwise by a around point P, (2) scale by S, and then (3) translate so that pixels at C (in the newly scaled and rotated image) are at the origin, you would use this transformation:
translate by -P (this moves the pixels at P to the origin)
rotate by a
translate by P (this moves the origin back to where it was)
scale by S (if you did this earlier, your rotation would be messed up)
translate by -C
If the 2D image were being rendered at the origin, you'd also need to end by translating by some value along the negative z axis to be able to see it.
Normally, you'd just do this with OpenGL basics (glTranslatef, glScalef, glRotatef, etc.). And you would do them in the reverse order that I've listed them. Since you want to use glLoadMatrix, you'd do things in the order I described with Eigen. It's important to remember that OpenGL is expecting a Column Major matrix (but that seems to be the default for Eigen; so that's probably not a problem).
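To make the ordering concrete, here is a small Python sketch that composes the five steps listed above as 3x3 homogeneous matrices, so that a point goes through them in the listed order. It uses plain lists rather than Eigen or OpenGL calls, assumes column vectors and a uniform scale S, and every name in it is illustrative.

```python
import math


def matmul(a, b):
    """Multiply two 3x3 matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def translate(tx, ty):
    return [[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]]


def rotate(angle):
    c, s = math.cos(angle), math.sin(angle)  # counter-clockwise, radians
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def scale(sx, sy):
    return [[sx, 0.0, 0.0], [0.0, sy, 0.0], [0.0, 0.0, 1.0]]


def camera_matrix(p, angle, s, c):
    """Rotate about p, scale by s, then move the point c to the origin."""
    steps = [translate(-p[0], -p[1]), rotate(angle), translate(p[0], p[1]),
             scale(s, s), translate(-c[0], -c[1])]
    m = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    for step in steps:
        m = matmul(step, m)  # each later step is applied after the earlier ones
    return m


def apply(m, pt):
    x, y = pt
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])


if __name__ == "__main__":
    m = camera_matrix((2.0, 3.0), math.pi / 2, 2.0, (1.0, 1.0))
    print(apply(m, (2.0, 3.0)))  # the pivot: only scaled and shifted
    print(apply(m, (3.0, 3.0)))  # a neighbour of the pivot: also rotated
```

The pivot P itself is unaffected by the rotation, so it ends up at S*P - C, which is a handy check when porting the same sequence to a matrix library whose translate/rotate calls multiply on the other side.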
JCooper did great explaining the steps to construct the initial matrix.
However, I eventually solved the problem a bit differently. There were a few additional things and steps that were not obvious to me at the time. See the comments on JCooper's answer. The first is to realize that all matrix operations are relative.
Thus, if you want to position or move the camera along the absolute xy-axes, you must first decompose the matrix to extract its absolute position with unchanged axes. Then you translate the matrix by the difference between the old and new positions.
Here is a way to do this with Eigen:
First compute the scalar determinant D of the Affine2f matrix cmat. With Eigen this is done with D = cmat.linear().determinant();. Next compute the 'reverse' matrix matrev of the current rotation+scale matrix RS using D: matrev = (RS.array() / (1.0f / D)).matrix(); where RS is cmat.matrix().topLeftCorner(2,2).
The absolute camera position P is then given by P = matrev * -C, where C is cmat.matrix().col(2).head<2>().
Now we can reposition the camera anywhere along the absolute axes while keeping the rotation+scaling the same: V = RS * (T - P), where RS is the same as before, T is the new position vector and P is the decomposed position vector.
cmat is then simply translated by V to move the camera: cmat.pretranslate(V)

Resources