How to unproject a point on screen to object space coordinates in Vulkan?

I need to be able to unproject a screen pixel into object space using Vulkan, but somewhere my math is going wrong.
Here is the shader as it stands today for reference:
void main()
{
//the depth of this pixel is between 0 and 1
vec4 obj_space = vec4( float(gl_FragCoord.x)/ubo.screen_width, float(gl_FragCoord.y)/ubo.screen_height, gl_FragCoord.z, 1.0f);
//this puts us in normalized device coordinates [-1,1 ] range
obj_space.xy = ( obj_space.xy * 2.0f ) -1.0f;
//these two lines will put us in object space coordinates
//mvp_inverse is derived from this in the c++ side:
//glm::inverse(app.three_d_camera->get_projection_matrix() * app.three_d_camera->view_matrix * model);
obj_space = ubo.mvp_inverse * obj_space;
obj_space.xyz /= obj_space.w;
//the resulting position here is wrong
out_color = obj_space;
}
When I output the position as a color, the colors are off. I know I can simply pass the object space position from the vertex shader to the fragment shader, but I'd like to understand why my math is not working. It will help me understand Vulkan and maybe learn a little math myself.
Thanks!

I'm not entirely sure what your problem is, but let's go over potential problems.
Remember, Vulkan clip space is:
positive y = down,
positive x = right,
positive z = out,
centered at the middle of the screen.
Additionally, although OpenGL's GLSL docs say the origin of gl_FragCoord is the bottom left corner, in Vulkan its origin is the top left corner.
in this step:
obj_space.xy = ( obj_space.xy * 2.0f ) -1.0f;
obj_space is now:
left x : -1.0
right x : 1.0
top y = -1.0
bottom y = 1.0
out z = 1.0
back z = 0
I'm almost entirely sure you don't mean your object space to have Y be negative at the top. The reasoning for y increasing from top to bottom is for images and textures, which on the CPU are ordered the same way, and now are ordered like that in Vulkan.
Some other notes:
You claim your inverse is derived from glm::inverse here:
glm::inverse(app.three_d_camera->get_projection_matrix() * app.three_d_camera->view_matrix * model);
But GLM uses OpenGL notation for matrix dimensions and handedness, and unless you force it to the correct coordinate system, it is going to assume right handed positive Y up, z negative out. You'll need to include the following #defines before it works correctly (or physically change your calculations to accommodate this).
#define GLM_FORCE_DEPTH_ZERO_TO_ONE
#define GLM_FORCE_LEFT_HANDED
Additionally you'll need to modify your matrices to account for the negative Y direction. Here is an example of how I've handled this in the past (modifying the perspective matrix directly):
ubo.model = glm::translate(glm::mat4(1.0f), glm::vec3(pos_x,pos_y,pos_z));
ubo.model *= glm::rotate(glm::mat4(1.0f), time * glm::radians(0.0f), glm::vec3(0.0f, 0.0f, 1.0f));
ubo.view = glm::lookAt(glm::vec3(0.0f, 0.0f, -10.0f), glm::vec3(0.0f, 0.0f, 0.0f), glm::vec3(0.0f, 1.0f, 0.0f));
ubo.proj = glm::perspective(glm::radians(45.0f), swapChainExtent.width / (float) swapChainExtent.height, 0.1f, 100.0f);
ubo.proj[1][1] *= -1; // flip Y so the projected y axis matches Vulkan's
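To make the conventions above concrete, here is a minimal C++ sketch of a round trip between view space and framebuffer coordinates. It is not the asker's mvp_inverse approach: it assumes a plain right-handed perspective camera looking down -z, with the Y flip and a [0, 1] depth range, and inverts that projection in closed form. The names Camera, project and unproject are hypothetical, and getting from view space to object space would still need the inverse of view * model.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

// Hypothetical camera description (not from the original post).
struct Camera {
    float fovy;    // vertical field of view, radians
    float aspect;  // width / height
    float z_near;
    float z_far;
};

// View space (right-handed, camera looks down -z) -> framebuffer coordinates,
// using Vulkan conventions: y points down, depth is in [0, 1].
Vec3 project(const Camera& c, Vec3 v, float width, float height) {
    const float f = 1.0f / std::tan(c.fovy * 0.5f);
    const float a = c.z_far / (c.z_near - c.z_far);
    const float b = c.z_near * c.z_far / (c.z_near - c.z_far);
    const float w = -v.z;  // clip-space w
    assert(w > 0.0f);      // the point must be in front of the camera
    const float ndc_x = (f / c.aspect) * v.x / w;
    const float ndc_y = -f * v.y / w;  // same effect as proj[1][1] *= -1
    const float depth = (a * v.z + b) / w;
    return { (ndc_x * 0.5f + 0.5f) * width, (ndc_y * 0.5f + 0.5f) * height, depth };
}

// Framebuffer coordinates (like gl_FragCoord.xyz) -> view space.
Vec3 unproject(const Camera& c, Vec3 frag, float width, float height) {
    const float f = 1.0f / std::tan(c.fovy * 0.5f);
    const float a = c.z_far / (c.z_near - c.z_far);
    const float b = c.z_near * c.z_far / (c.z_near - c.z_far);
    const float ndc_x = frag.x / width * 2.0f - 1.0f;   // [0, width]  -> [-1, 1]
    const float ndc_y = frag.y / height * 2.0f - 1.0f;  // [0, height] -> [-1, 1], y down
    const float ndc_z = frag.z;                         // already [0, 1], no remap
    assert(a + ndc_z != 0.0f);
    const float view_z = -b / (a + ndc_z);  // solve depth = (a*z + b) / (-z) for z
    const float w = -view_z;
    return { ndc_x * w * c.aspect / f, -ndc_y * w / f, view_z };
}
```

Projecting a point and feeding the result back through unproject should return the original point up to floating-point error, which is a cheap way to check that the handedness, Y direction and depth range on the C++ side agree with what the shader assumes.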

Related

How to color a point cloud from image pixels?

I am using a Google Tango tablet to acquire point cloud data and RGB camera images. I want to create a 3D scan of the room. For that I need to map 2D image pixels to point cloud points. I will be doing this with a lot of point clouds and corresponding images, so I need to write a script that takes two inputs, (1) a point cloud and (2) an image taken from the same point in the same direction, and outputs a colored point cloud. How should I approach this, and which platforms will be simplest to use?
Here is the math to map a 3D point v to 2D pixel space in the camera image (assuming that v already incorporates the extrinsic camera position and orientation, see note at bottom*):
// Project to tangent space.
vec2 imageCoords = v.xy/v.z;
// Apply radial distortion.
float r2 = dot(imageCoords, imageCoords);
float r4 = r2*r2;
float r6 = r2*r4;
imageCoords *= 1.0 + k1*r2 + k2*r4 + k3*r6;
// Map to pixel space.
vec3 pixelCoords = cameraTransform*vec3(imageCoords, 1);
Where cameraTransform is the 3x3 matrix:
[ fx 0 cx ]
[ 0 fy cy ]
[ 0 0 1 ]
with fx, fy, cx, cy, k1, k2, k3 from TangoCameraIntrinsics.
pixelCoords is declared vec3 but is actually 2D in homogeneous coordinates. The third coordinate is always 1 and so can be ignored for practical purposes.
Note that if you want texture coordinates instead of pixel coordinates, that is just another linear transform that can be premultiplied onto cameraTransform ahead of time (as is any top-to-bottom vs. bottom-to-top scanline addressing).
As for what "platform" (which I loosely interpreted as "language") is simplest, the native API seems to be the most straightforward way to get your hands on camera pixels, though it appears people have also succeeded with Unity and Java.
* Points delivered by TangoXYZij already incorporate the depth camera extrinsic transform. Technically, because the current developer tablet shares the same hardware between depth and color image acquisition, you won't be able to get a color image that exactly matches unless both your device and your scene are stationary. Fortunately in practice, most applications can probably assume that neither the camera pose nor the scene changes enough in one frame time to significantly affect color lookup.
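For readers working outside a shader, the same projection can be sketched in plain C++. This is only an illustration of the math above: the Intrinsics struct and the projectToPixel function are hypothetical names, and the field values are expected to come from TangoCameraIntrinsics.

```cpp
#include <cassert>
#include <cmath>

struct Vec2 { float x, y; };
struct Vec3 { float x, y, z; };

// Field names mirror the symbols used above; the struct itself is hypothetical.
struct Intrinsics {
    float fx, fy;      // focal lengths in pixels
    float cx, cy;      // principal point in pixels
    float k1, k2, k3;  // radial distortion coefficients
};

// Map a 3D point in the camera frame to pixel coordinates in the color image.
Vec2 projectToPixel(const Intrinsics& in, Vec3 v) {
    assert(v.z != 0.0f);  // points on the camera plane cannot be projected
    // Project to tangent space.
    float u = v.x / v.z;
    float w = v.y / v.z;
    // Apply radial distortion.
    const float r2 = u * u + w * w;
    const float r4 = r2 * r2;
    const float r6 = r2 * r4;
    const float scale = 1.0f + in.k1 * r2 + in.k2 * r4 + in.k3 * r6;
    u *= scale;
    w *= scale;
    // Map to pixel space (the 3x3 cameraTransform written out by hand).
    return { in.fx * u + in.cx, in.fy * w + in.cy };
}
```

To color a point, round the returned coordinates to the nearest pixel, check that they fall inside the image bounds, and read the RGB value at that location.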
This answer is not original, it is simply meant as a convenience for Unity users who would like the correct answer, as provided by #rhashimoto, worked out for them. My contribution (hopefully) is providing code that reduces the normal 16 multiplies and 12 adds (given Unity only does 4x4 matrices) to 2 multiplies and 2 adds by dropping out all of the zero results. I ran a little under a million points through the test, checking each time that my calculations agreed with the basic matrix calculations - defined as the absolute difference between the two results being less than machine epsilon - I'm as comfortable with this as I can be knowing that #rhashimoto may show up and poke a giant hole in it :-)
If you want to switch back and forth, remember this is C#, so the USEMATRIXMATH define must appear at the beginning of the file.
Given there's only one Tango device right now, and I'm assuming the intrinsics are constant across all of the devices, I just dumped them in as constants, such that
fx = 1042.73999023438
fy = 1042.96997070313
cx = 637.273986816406
cy = 352.928985595703
k1 = 0.228532999753952
k2 = -0.663019001483917
k3 = 0.642908990383148
Yes they can be dumped in as constants, which would make things more readable, and C# is probably smart enough to optimize it out - however, I spent too much of my life in Agner Fog's stuff, and will always be paranoid.
The commented out code at the bottom is for testing the difference, should you desire. You'll have to uncomment some other stuff, and comment out the returns if you want to test the results.
My thanks again to #rhashimoto, this is far far better than what I had
I have stayed true to his logic, remember these are pixel coordinates, not UV coordinates - he is correct that you can premultiply the transform to get normalized UV values, but since he schooled me on this once already, I will stick with exactly the math he presented before I fiddle with too much :-)
static public Vector2 PictureUV(Vector3 tangoDepthPoint)
{
Vector2 imageCoords = new Vector2(tangoDepthPoint.x / tangoDepthPoint.z, tangoDepthPoint.y / tangoDepthPoint.z);
float r2 = Vector2.Dot(imageCoords, imageCoords);
float r4 = r2*r2;
float r6 = r2*r4;
imageCoords *= 1.0f + 0.228532999753952f*r2 + -0.663019001483917f*r4 + 0.642908990383148f*r6;
Vector3 ic3 = new Vector3(imageCoords.x,imageCoords.y,1);
#if USEMATRIXMATH
Matrix4x4 cameraTransform = new Matrix4x4();
cameraTransform.SetRow(0,new Vector4(1042.73999023438f,0,637.273986816406f,0));
cameraTransform.SetRow(1, new Vector4(0, 1042.96997070313f, 352.928985595703f, 0));
cameraTransform.SetRow(2, new Vector4(0, 0, 1, 0));
cameraTransform.SetRow(3, new Vector4(0, 0, 0, 1));
Vector3 pixelCoords = cameraTransform * ic3;
return new Vector2(pixelCoords.x, pixelCoords.y);
#else
//float v1 = 1042.73999023438f * imageCoords.x + 637.273986816406f;
//float v2 = 1042.96997070313f * imageCoords.y + 352.928985595703f;
//float v3 = 1;
return new Vector2(1042.73999023438f * imageCoords.x + 637.273986816406f, 1042.96997070313f * imageCoords.y + 352.928985595703f);
#endif
//float dx = Math.Abs(v1 - pixelCoords.x);
//float dy = Math.Abs(v2 - pixelCoords.y);
//float dz = Math.Abs(v3 - pixelCoords.z);
//if (dx > float.Epsilon || dy > float.Epsilon || dz > float.Epsilon)
// UnityEngine.Debug.Log("Well, that didn't work");
//return new Vector2(v1, v2);
}
As one final note, do note the code he provided is GLSL - if you're just using this for pretty pictures, use it - this is for those that actually need to perform additional processing.

Calculating if or not a 3D eyepoint is behind a 2D plane or upwards

The setup
Draw XY-coordinate axes on a piece of paper. Write a word on it along the X-axis, so that the word's center point is at the origin (half on the positive side of X/Y, the other half on the negative side of X/Y).
Now, if you flip the paper upside down you'll notice that the word is mirrored in relation to both X- and Y-axis. If you look from behind the paper, it's mirrored in relation to Y-axis. If you look at it from behind and upside down, it's mirrored in relation to X-axis.
OK, I have points in a 2D plane (vertices) that are created in a similar way at the origin, and I need to apply exactly the same rule to them. To make things interesting:
The 2D plane is actually 3D, each point (vertex) being (x, y, 0). Initially the vertices are positioned at the origin and their normal is Pn(0,0,1). => Correctly seen when looked at from point Pn towards the origin.
The vertex-plane has its own rotation matrix [Rp] and position P(x,y,z) in the 3D world. The rotation is applied before positioning.
The 3D world is "right handed". The viewer would be looking towards the origin from some distance along the positive Z-axis, but the world is also oriented by rotation matrix [Rw]. [Rw] * (0,0,1) would point directly to the viewer's eye.
From those I need to calculate when the vertex-plane should be mirrored and by which axis. The mirroring itself can be done before applying [Rp] and P by:
Vertices vertices = Get2DPlanePoints();
int MirrorX = 1; // -1 to mirror, 1 NOT to mirror
int MirrorY = 1; // -1 to mirror, 1 NOT to mirror
Matrix WorldRotation = GetWorldRotationMatrix();
MirrorX = GetMirrorXFactor(WorldRotation);
MirrorY = GetMirrorYFactor(WorldRotation);
foreach(Vertex v in vertices)
{
v.X = v.X * MirrorX * MirrorY;
v.Y = v.Y * MirrorY;
}
// Apply rotation...
// Add position...
The question
So I need GetMirrorXFactor() & ..YFactor() -functions that return -1 if the viewer's eyepoint is at greater "X/Y"-angle than +-90 degrees in relation to the vertex-plane's normal after the rotation and world orientation. I have already solved this, but I'm looking for more "elegant" mathematics. I know that rotation matrices somehow contain info about how much is rotated by which axis and I believe that can be utilized here.
My Solution for MirrorX:
// Matrix multiplications. Vectors are vertical matrices here.
Pnr = [Rp] * Pn // Rotated vertices's normal
Pur = [Rp] * (0,1,0) // Rotated vertices's "up-vector"
Wnr = [Rw] * (0,0,1) // Rotated eye-vector with world's orientation
// = vector pointing directly at the viewer's eye
// Use rotated up-vector as the normal of a new plane and project viewer's
// eye on it. dot = dot product between vectors.
Wnrx = Wnr - (Wnr dot Pur) * Pur // "X-projected" eye.
// Calculate angle between eye's X-component and plane's rotated normal.
// ||V|| = V's norm.
angle = arccos( (Wnrx dot Pnr) / ( ||Wnrx|| * ||Pnr|| ) )
if (angle > PI / 2)
MirrorX = -1; // DO mirror
else
MirrorX = 1; // DON'T mirror
Solution for mirrorY can be done in similar way using viewer's up and vertex-plane's right -vectors.
Better solution?
if (([Rp]*(1,0,0)) dot ([Rw]*(1,0,0))) < 0
MirrorX = -1; // DO mirror
else
MirrorX = 1; // DON'T mirror
if (([Rp]*(0,1,0)) dot ([Rw]*(0,1,0))) < 0
MirrorY = -1; // DO mirror
else
MirrorY = 1; // DON'T mirror
Explaining in more detail is difficult without diagrams, but if you have trouble with this solution we can work through some cases.
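The dot-product test above can be sketched in C++ as follows. It relies on the fact that multiplying a matrix by (1,0,0) or (0,1,0) simply picks out its first or second column, so no full matrix-vector product is needed. The Mat3 layout (row-major std::array) and the function names are assumptions made for this illustration.

```cpp
#include <array>
#include <cassert>

using Vec3 = std::array<float, 3>;
using Mat3 = std::array<Vec3, 3>;  // row-major: m[row][col]

// R * e_c for a unit axis e_c is just column c of R.
Vec3 column(const Mat3& m, int c) {
    assert(c >= 0 && c < 3);
    return { m[0][c], m[1][c], m[2][c] };
}

float dot(const Vec3& a, const Vec3& b) {
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

// -1 to mirror, 1 not to mirror: compares the rotated X axes of plane and world.
int mirrorXFactor(const Mat3& rp, const Mat3& rw) {
    return dot(column(rp, 0), column(rw, 0)) < 0.0f ? -1 : 1;
}

// -1 to mirror, 1 not to mirror: compares the rotated Y axes of plane and world.
int mirrorYFactor(const Mat3& rp, const Mat3& rw) {
    return dot(column(rp, 1), column(rw, 1)) < 0.0f ? -1 : 1;
}
```

For example, with an unrotated world and a plane rotated 180 degrees about the Y axis (the paper seen from behind), the X axes point in opposite directions while the Y axes still agree, so only the X factor becomes -1.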

Rotating a D3DXVECTOR3 around a specific point

This is probably a pretty simple thing but my knowledge of DirectX is just not up to par with what I'm trying to achieve.
For the moment I am trying to create a vehicle that moves around on terrain. I am attempting to make the vehicle recognize the terrain by creating a square (4 D3DXVECTOR3 points) around the vehicle whose points each detect the height of the terrain and adjust the vehicle accordingly.
The vehicle is a simple object derived from Microsoft sample code. It has a world matrix, coordinates, rotations etc.
What I am trying to achieve is to make these points move along with the vehicle, turning when it does so they can detect the difference in height. This requires me to update the points each time the vehicle moves but I cannot for the life of me figure out how to get them to rotate properly.
So, in summary, I am looking for a simple way to rotate a vector about an origin (my vehicle's coordinates).
These points are situated near the vehicle wheels, so if it worked they would stay there regardless of the vehicle's y-axis rotation.
Here's what I've tried:
D3DXVECTOR3 vec;
D3DXVec3TransformCoord(&vectorToHoldTransformation,&SquareTopLeftPoint,&matRotationY);
SquareTopLeftPoint = vec;
This resulted in the point spinning madly out of control and leaving the map.
xRot = VehicleCoordinateX + cos(RotationY) * (SquareTopleftX - VehicleCoordinateX) - sin(RotationY) * (SquareTopleftZ - VehicleCoordinateZ);
yRot = VehicleCoordinateZ + sin(RotationY) * (SquareTopleftX - VehicleCoordinateX) + cos(RotationY) * (SquareTopleftZ - VehicleCoordinateZ);
BoxPoint refers to the vector I am attempting to rotate.
Vehicle is of course the origin of rotation
RotationY is the amount it has rotated.
This is the code for 1 of the 4 vectors in this square, but I assume once I get one right the rest are just copy-paste.
No matter what I try, the point either does not move or spirals out of control until it leaves the map altogether.
Here is a snippet of my object class
class Something
{
public:
float x, y, z;
float speed;
float rx, ry, rz;
float sx, sy, sz;
float width;
float length;
float frameTime;
D3DXVECTOR3 initVecDir;
D3DXVECTOR3 currentVecDir;
D3DXMATRIX matAllRotations;
D3DXMATRIX matRotateX;
D3DXMATRIX matRotateY;
D3DXMATRIX matRotateZ;
D3DXMATRIX matTranslate;
D3DXMATRIX matWorld;
D3DXMATRIX matView;
D3DXMATRIX matProjection;
D3DXMATRIX matWorldViewProjection;
//these points represent a box that is used for collision with terrain.
D3DXVECTOR3 frontLeftBoxPoint;
D3DXVECTOR3 frontRightBoxPoint;
D3DXVECTOR3 backLeftBoxPoint;
D3DXVECTOR3 backRightBoxPoint;
};
I was thinking it might be possible to do this using D3DXVec3TransformCoord
D3DXMatrixTranslation(&matTranslate, origin.x,0,origin.z);
D3DXMatrixRotationY(&matRotateY, ry);
D3DXMatrixTranslation(&matTranslate2,width,0,-length);
matAllRotations = matTranslate * matRotateY * matTranslate2;
D3DXVECTOR3 newCoords;
D3DXVECTOR3 oldCoords = D3DXVECTOR3(x,y,z);
D3DXVec3TransformCoord(&newCoords, &oldCoords, &matAllRotations);
Turns out that what I needed to do was
Translate by -origin.
rotate
Translate by origin.
What I was doing was
Move to origin
Rotate
Translate by length/width
Thought it was the same.
D3DXMATRIX matTranslate2;
D3DXMatrixTranslation(&matTranslate,-origin.x,0,-origin.z);
D3DXMatrixRotationY(&matRotateY,ry);
D3DXMatrixTranslation(&matTranslate2,origin.x,0,origin.z);
//D3DXMatrixRotationAxis(&matRotateAxis,&origin,ry);
D3DXMATRIX matAll = matTranslate * matRotateY * matTranslate2;
D3DXVECTOR4 newCoords;
D3DXVECTOR4 oldCoords = D3DXVECTOR4(x,y,z,1);
D3DXVec4Transform(&newCoords,&oldCoords,&matAll);
//D3DXVec4TransformCoord(&newCoords, &oldCoords, &matAll);
return newCoords;
Without knowing more about your code I can't say what it does exactly, however one 'easy' way to think about this problem if you know the angle of the heading of your vehicle in world coordinates is to represent your points in a manner such that the center of the vehicle is at the origin, use a simple rotation matrix to rotate it around the vehicle according to the heading, and then add your vehicle's center to the resulting coordinates.
x = vehicle_center_x + cos(heading) * corner_x - sin(heading) * corner_y
y = vehicle_center_y + sin(heading) * corner_x + cos(heading) * corner_y
Keep in mind that corner_x and corner_y are expressed in coordinates relative to the vehicle -- NOT relative to the world.
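As a rough C++ sketch of this idea, the function below rotates a vehicle-relative corner with the standard counter-clockwise rotation (x' = x cos - y sin, y' = x sin + y cos) and then adds the vehicle center. Point2 and cornerToWorld are hypothetical names. For a vehicle driving on terrain the second coordinate would be the world z value, and depending on the handedness of your setup the sign of heading may need to be negated.

```cpp
#include <cassert>
#include <cmath>

struct Point2 { float x, y; };

// Rotate a corner expressed relative to the vehicle by `heading` (radians),
// then translate it to the vehicle's position in the world.
Point2 cornerToWorld(Point2 center, Point2 corner, float heading) {
    assert(std::isfinite(heading));
    const float c = std::cos(heading);
    const float s = std::sin(heading);
    return { center.x + c * corner.x - s * corner.y,
             center.y + s * corner.x + c * corner.y };
}
```

Because the corner is always taken from its fixed vehicle-relative value and never from the previously rotated result, the rotation does not accumulate from frame to frame. Re-rotating an already rotated point every frame is a common way to get points that spiral away.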

How do I take a 2D point, and project it into a 3D Vector by a perspective camera

I have a 2D point (x, y) and I want to project it to a vector so that I can perform a ray trace to check whether the user clicked on a 3D object. I have written all the other code, but when I got back to my function that gets the vector from the xy coords of the mouse, I realized I was not accounting for field of view. I don't want to guess what the factor would be, as 'voodoo' fixes are not a good idea for a library. Any math-magicians wanna help? :-)
Here's my current code, which needs the FOV of the camera applied:
sf::Vector3<float> Camera::Get3DVector(int Posx, int Posy, sf::Vector2<int> ScreenSize){
//not using a "wide lens", and will maintain the aspect ratio of the viewport
int window_x = Posx - ScreenSize.x/2;
int window_y = (ScreenSize.y - Posy) - ScreenSize.y/2;
float Ray_x = float(window_x)/float(ScreenSize.x/2);
float Ray_y = float(window_y)/float(ScreenSize.y/2);
sf::Vector3<float> Vector(Ray_x,Ray_y, -_zNear);
// to global cords
return MultiplyByMatrix((Vector/LengthOfVector(Vector)), _XMatrix, _YMatrix, _ZMatrix);
}
You're not too far off. One thing is to make sure your mouse is in -1 to 1 space (not 0 to 1).
Then you create 2 vectors:
Vector3 orig = Vector3(mouse.X,mouse.Y,0.0f);
Vector3 far = Vector3(mouse.X,mouse.Y,1.0f);
You also need to use the inverse of your perspective transform (or view-projection if you want world space):
Matrix ivp = Matrix::Invert(Projection);
Then you do:
Vector3 rayorigin = Vector3::TransformCoordinate(orig,ivp);
Vector3 rayfar = Vector3::TransformCoordinate(far,ivp);
If you want a ray, you also need direction, which is simply:
Vector3 raydir = Normalize(rayfar-rayorigin);
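If you would rather see where the field of view enters, the sketch below builds the camera-space ray direction in closed form instead of inverting the projection matrix. For a standard symmetric perspective projection the two approaches should give the same direction. It assumes a camera looking down -z, screen y growing downward, and a vertical field of view in radians; pickRayDirection is a hypothetical name.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

// Camera-space direction of the ray through pixel (px, py).
Vec3 pickRayDirection(int px, int py, int width, int height, float fovy) {
    assert(width > 0 && height > 0);
    // Pixel -> [-1, 1], flipping y so that up is positive.
    const float ndc_x = 2.0f * float(px) / float(width) - 1.0f;
    const float ndc_y = 1.0f - 2.0f * float(py) / float(height);
    // Half-height of the view plane at distance 1 from the eye.
    const float t = std::tan(fovy * 0.5f);
    const float aspect = float(width) / float(height);
    Vec3 d{ ndc_x * aspect * t, ndc_y * t, -1.0f };
    const float len = std::sqrt(d.x * d.x + d.y * d.y + d.z * d.z);
    return { d.x / len, d.y / len, d.z / len };
}
```

The result is in camera space, so it still has to be rotated by the camera's orientation to get a world-space direction. The ray origin is the camera position.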

OpenGL FPS Camera movement relative to lookAt target

I have a camera in OpenGL. I had no problem with it until adding an FPS controller. The basic FPS behavior is OK: the camera moves forward, backward, left and right, and rotates towards the direction supplied by mouse input. The problems begin when the camera moves to the sides or the back of the target position. In such a case the camera's local forward, backward, left and right directions aren't updated based on its current forward look but remain the same as if it was right in front of the target. Example:
If the target object position is at (0,0,0) and the camera position is at (-50,0,0) (to the left of the target) and the camera is looking at the target, then to move it back and forth I have to use the keys for left and right movement, while the backward/forward keys move the camera sideways.
Here is the code I use to calculate camera position, rotation and LookAt matrix:
void LookAtTarget(const vec3 &eye,const vec3 &center,const vec3 &up)
{
this->_eye = eye;
this->_center = center;
this->_up = up;
this->_direction =normalize((center - eye));
_viewMatrix=lookAt( eye, center , up);
_transform.SetModel(_viewMatrix );
UpdateViewFrustum();
}
void SetPosition(const vec3 &position){
this->_eye=position;
this->_center=position + _direction;
LookAtTarget(_eye,_center,_up);
}
void SetRotation(float rz , float ry ,float rx){
_rotationMatrix=mat4(1);
vec3 direction(0.0f, 0.0f, -1.0f);
vec3 up(0.0f, 1.0f, 0.0f);
_rotationMatrix=eulerAngleYXZ(ry,rx,rz);
vec4 rotatedDir= _rotationMatrix * vec4(direction,1) ;
this->_center = this->_eye + vec3(rotatedDir);
this->_up =vec3( _rotationMatrix * vec4(up,1));
LookAtTarget(_eye, _center, up);
}
Then in the render loop I set camera's transformations:
while(true)
{
display();
fps->print(GetElapsedTime());
if(glfwGetKey(GLFW_KEY_ESC) || !glfwGetWindowParam(GLFW_OPENED)){
break;
}
calculateCameraMovement();
moveCamera();
view->GetScene()->GetCurrentCam()->SetRotation(0,-camYRot,-camXRot);
view->GetScene()->GetCurrentCam()->SetPosition(camXPos,camYPos,camZPos);
}
lookAt() method comes from GLM math lib.
I am pretty sure I have to multiply some of the vectors (eye, center etc.) with the rotation matrix but I am not sure which ones. I tried to multiply _viewMatrix by the _rotationMatrix but it creates a mess. The code for FPS camera position and rotation calculation is taken from here. But for the actual rendering I use the programmable pipeline.
Update:
I solved the issue by adding a separate method which doesn't calculate camera matrix using lookAt but rather using the usual and basic approach:
void FpsMove(GLfloat x, GLfloat y , GLfloat z,float pitch,float yaw){
_viewMatrix =rotate(mat4(1.0f), pitch, vec3(1, 0, 0));
_viewMatrix=rotate(_viewMatrix, yaw, vec3(0, 1, 0));
_viewMatrix= translate(_viewMatrix, vec3(-x, -y, -z));
_transform.SetModel( _viewMatrix );
}
It solved the problem but I still want to know how to make it work with lookAt() methods I presented here.
You need to change the forward direction of the camera, which is presumably fixed to (0,0,-1). You can do this by rotating the directions about the y axis by camYRot (as computed in the lookat function) so that forwards is in the same direction that the camera is pointing (in the plane made by the z and x axes).
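A minimal C++ sketch of that idea follows. It assumes yaw is a right-handed rotation about +Y applied to a base forward of (0,0,-1) and a base right of (1,0,0); the function names are hypothetical, and the sign of the angle may need flipping to match how camYRot is defined in your code.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

// (0,0,-1) rotated about +Y by yaw radians.
Vec3 forwardFromYaw(float yaw) {
    return { -std::sin(yaw), 0.0f, -std::cos(yaw) };
}

// (1,0,0) rotated about +Y by yaw radians.
Vec3 rightFromYaw(float yaw) {
    return { std::cos(yaw), 0.0f, -std::sin(yaw) };
}

// Move the eye along the camera's own forward/right axes instead of the world axes.
Vec3 moveCamera(Vec3 eye, float yaw, float forwardAmount, float strafeAmount) {
    assert(std::isfinite(yaw));
    const Vec3 f = forwardFromYaw(yaw);
    const Vec3 r = rightFromYaw(yaw);
    return { eye.x + f.x * forwardAmount + r.x * strafeAmount,
             eye.y,
             eye.z + f.z * forwardAmount + r.z * strafeAmount };
}
```

The new eye position can then be passed to SetPosition, so that lookAt receives an eye and center that already reflect the camera's current heading.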
