Say I have object A and object B, and I control A's movement and rotation.
I'm trying to link A and B together in such a way that when A moves, the relative distance between A and B stays the same, and when A rotates by an angle, B rotates by the same angle, but with A's position as its rotation pivot.
You can imagine it like you (A) being connected to your phone (B) with a selfie stick.
I've been trying to figure out which matrix operations are required to make this possible.
I've been told this looks like a parent-child relation between two objects, so I can take the model matrix of the parent (A in this case) and apply a local transformation matrix to get the model matrix of the child (B in this case).
In practice, this only fixed the relative position between the two objects; the rotation didn't work.
If you want to rotate around a pivot you have to:
Translate the object so that the pivot point is moved to (0, 0).
Rotate the object.
Move the object so that the pivot point moves back to its original position.
glm::vec3 pivot(object_A_x, object_A_y, object_A_z);
glm::vec3 rotation_axis(0.0f, 0.0f, 1.0f);
float angle_rad = glm::radians(rotation_angle_degree);
glm::mat4 translate_to_origin = glm::translate(glm::mat4(1.0f), -pivot); // pivot -> origin
glm::mat4 rotate = glm::rotate(glm::mat4(1.0f), angle_rad, rotation_axis);
glm::mat4 translate_back = glm::translate(glm::mat4(1.0f), pivot); // origin -> pivot
// Matrices apply right to left: translate to origin, rotate, translate back.
glm::mat4 rotate_around_pivot = translate_back * rotate * translate_to_origin;
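The same composition can be checked numerically without GLM. Below is a minimal sketch in plain Python, restricted to a 2D rotation about the z axis. The helper names `rotate_around_pivot` and `child_position` are made up for illustration and are not part of any library. It also covers the selfie-stick case from the question, under the assumption that B is stored as a fixed offset in A's local frame.

```python
import math

def rotate_around_pivot(point, pivot, angle_rad):
    """Rotate a 2D point about a pivot: translate to origin, rotate, translate back."""
    px, py = point[0] - pivot[0], point[1] - pivot[1]   # pivot -> origin
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    rx, ry = c * px - s * py, s * px + c * py           # rotate about the origin
    return (rx + pivot[0], ry + pivot[1])               # origin -> pivot

def child_position(parent_pos, parent_angle_rad, local_offset):
    """World position of child B, given A's position/rotation and B's offset in A's frame."""
    unrotated = (parent_pos[0] + local_offset[0], parent_pos[1] + local_offset[1])
    return rotate_around_pivot(unrotated, parent_pos, parent_angle_rad)

# A sits at (1, 0) and is rotated by 90 degrees; B is one unit along A's local x axis.
a_pos = (1.0, 0.0)
b_pos = child_position(a_pos, math.radians(90.0), (1.0, 0.0))
```

Because B is always recomputed from A's current position and angle, the distance between the two stays constant however A moves or turns.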
I'm trying to learn how to zoom towards mouse using Orthographic projection and so far I've got this:
def dolly(self, wheel, direction, x, y, acceleration_enabled):
    v = vec4(*[float(v) for v in glGetIntegerv(GL_VIEWPORT)])
    w, h = v[2], v[3]
    f = self.update_zoom(direction, acceleration_enabled) # [0.1, 4]
    aspect = w/h
    x,y = x-w/2, y-h/2
    K1 = f*10
    K0 = K1*aspect
    self.left = K0*(-2*x/w-1)
    self.right = K0*(-2*x/w+1)
    self.bottom = K1*(2*y/h-1)
    self.top = K1*(2*y/h+1)
x/y: mouse screen coordinates
w/h: window width/height
f: factor which goes from 0.1 to 4 when scrolling down/up
left/right/bottom/top: values used to compute the new orthographic projection
The results I'm getting are really strange but I don't know which part of the formulas I've messed up.
Could you please spot which part of my maths is wrong, or just post clear pseudocode I can try? Just for the record, I've read and tested quite a lot of versions out there on the internet but haven't yet found any place where this subject is explained properly.
P.S. You don't need to post any SO link related to this subject, as I've read all of them already :)
I'm going to answer this in a general way, based on the following set of assumptions:
1. You use a matrix P for the (ortho) projection, describing the actual mapping of your eye-space view volume onto the standard view volume [-1,1]^3 that OpenGL will clip against (see also assumption 2), and a matrix V for the view transformation, that is, the position and orientation of the "camera" (if there is such a thing, especially in ortho projections), which basically establishes the eye space relative to which your view volume is defined.
2. I will ignore the homogeneous clip space: as you work with completely affine ortho projections only, NDC and clip space coordinates will be identical, and no tricks are applied to the w coordinate.
3. I assume default GL conventions for eye space and projection matrices; notably, the eye space origin is the camera location and the camera look-at direction is -z.
4. The viewport fills the window completely.
5. Window space follows the default OpenGL convention, where the origin is at the bottom left.
6. Mouse coordinates are in some window-specific coordinate frame where the origin is at the top left, and the mouse is at integer pixel coordinates.
7. I assume that the view volume defined by P is symmetrical: right = -left and top = -bottom, and it is also supposed to stay symmetrical after the zoom operation; therefore, to compensate for any movement, the view matrix V must be adjusted, too.
What you want is a zoom such that the object point under the mouse cursor does not move, so that it becomes the center of the scale operation. The mouse cursor itself is only 2D, and a whole straight line in 3D space will be mapped to the same pixel location. However, in an ortho projection, that line will be orthogonal to the image plane, so we don't need to bother much with the third dimension.
So what we want is to scale the current situation with P_old (defined by the ortho parameters l_old, r_old, b_old, t_old, n_old and f_old) and V_old (defined by "camera" position c_old and orientation o_old) by a zoom factor s at mouse position (x,y) (in the space from assumption 6).
We can see a few things directly:
the near and far plane of the projection should be unaffected by the operation, so n_new = n_old and f_new = f_old.
the actual camera orientation (or lookat direction) should also be unaffected: o_new = o_old
If we zoom in by a factor of s, the actual view volume must be scaled by 1/s, since when we zoom in, a smaller part of the complete world is mapped onto the screen than before (and appears bigger). So we can simply scale the frustum parameters we had:
l_new = l_old / s, r_new = r_old / s, b_new = b_old / s, t_new = t_old / s
If we only replace P_old by P_new, we get the zoom, but the world point under the mouse cursor will move (unless the mouse is exactly in the center of the view). So we have to compensate for that by modifying the camera position.
Let's first put the mouse coords (x,y) into OpenGL window space (assumptions 5 and 6):
x_win = x + 0.5
y_win = height - 0.5 - y
Note that besides mirroring y, I also shift the coordinates by half a pixel. That's because in OpenGL window space, pixel centers are at half-integer coordinates, while I assume that your integer mouse coordinates represent the center of the pixel you click on (this will not make a big difference visually, but still).
Now let's further put the coords into Normalized Device Space (relying on assumption 4 here):
x_ndc = 2.0 * x_win / width - 1
y_ndc = 2.0 * y_win / height - 1
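The two conversion steps above can be sketched as one small Python function. The name `mouse_to_ndc` is an assumption made for this illustration; the formulas are the ones given above, relying on the viewport filling the whole window.

```python
def mouse_to_ndc(x, y, width, height):
    """Convert integer mouse coordinates (origin top-left) to NDC in [-1, 1]."""
    x_win = x + 0.5                 # pixel center in OpenGL window space
    y_win = height - 0.5 - y        # mirror y: window space has its origin bottom-left
    x_ndc = 2.0 * x_win / width - 1.0
    y_ndc = 2.0 * y_win / height - 1.0
    return (x_ndc, y_ndc)

# In an 800x600 window, the top-left and bottom-right pixels map to
# opposite corners of NDC space, half a pixel inside the border.
top_left = mouse_to_ndc(0, 0, 800, 600)
bottom_right = mouse_to_ndc(799, 599, 800, 600)
```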
By assumption 2, clip and NDC coordinates will be identical, and we can call the vector v our NDC-space mouse coordinates: v = (x_ndc, y_ndc, 0, 1)^T
We can now state our "point under mouse must not move" condition:
inverse(V_old) * inverse(P_old) * v = inverse(V_new) * inverse(P_new) * v
But let's just go into eye space and let's look at what happened:
Let a = inverse(P_old) * v be the eye space location of the point under the mouse cursor before we scaled.
Let b = inverse(P_new) * v be the eye space location of the point under the mouse cursor after we scaled.
Since we assumed a symmetrical view volume (assumption 7), we already know that for the x and y coordinates, b = (1/s) * a holds. (If that assumption does not hold, you need to do the actual calculation for b too, which isn't hard either.)
So, we can set up a 2D eye space offset vector d which describes how our point of interest was moved by the scale:
d = b - a = (1 / s) *a - a = a (1/s - 1)
To compensate for that movement, we have to move our camera inversely, so by -d.
If you keep the camera position separate as I did in assumption 1, you simply need to update the camera position c accordingly. You just have to take care of the fact that c is the world space position, while d is an eye space offset:
c_new = c_old - inverse(V_old) * (d_x, d_y, 0, 0)^T
Note that if you do not keep the camera position as a separate variable, but keep the view matrix directly, you can simply pre-multiply the translation. Moving the camera by -d shifts every point in eye space by +d, so: V_new = translate(d_x, d_y, 0) * V_old
What I wrote so far is correct, but I took a shortcut which is numerically a very bad idea when working with finite-precision data types. The error in the camera position accumulates very fast if one zooms out a lot. So after @BPL implemented this, this is what he got:
The main issue seems to be that I calculated the offset vector d directly in eye space, which does not take the current view matrix V_old (and its small errors) into account. So a more stable approach is to calculate all of this directly in world space:
a = inverse(P_old * V_old) * v
b = inverse(P_new * V_old) * v
d = b - a
c_new = c_old - d
(As a by-product, doing so makes assumption 7 unnecessary, so it directly works in the general case of arbitrary ortho matrices.)
Using this approach, the zoom operation worked as expected:
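The world-space variant can be sketched in a few lines of Python for the simplest case. This is only an illustration under extra assumptions that are not in the answer: the camera is not rotated, so V is a pure translation by the camera center, and the ortho volume is described by a center and half-extents. Under those assumptions, inverse(P * V) * v reduces to `center + ndc * half_extent`. The function names are hypothetical.

```python
def ndc_to_world(center, half_extent, ndc):
    """World position seen at an NDC point (inverse(P * V) * v for an unrotated 2D camera)."""
    return (center[0] + ndc[0] * half_extent[0],
            center[1] + ndc[1] * half_extent[1])

def zoom_at(center, half_extent, ndc, s):
    """Zoom by factor s while keeping the world point under `ndc` fixed."""
    new_half = (half_extent[0] / s, half_extent[1] / s)   # view volume scales by 1/s
    a = ndc_to_world(center, half_extent, ndc)            # point under the mouse before
    b = ndc_to_world(center, new_half, ndc)               # point under the mouse after, camera unmoved
    d = (b[0] - a[0], b[1] - a[1])
    new_center = (center[0] - d[0], center[1] - d[1])     # c_new = c_old - d
    return new_center, new_half

center, half = (10.0, -4.0), (8.0, 6.0)
mouse_ndc = (0.5, -0.25)
before = ndc_to_world(center, half, mouse_ndc)
center, half = zoom_at(center, half, mouse_ndc, 2.0)
after = ndc_to_world(center, half, mouse_ndc)
```

Here `before` and `after` are the same world point, which is exactly the "point under mouse must not move" condition. With a rotated camera the same steps apply, but `ndc_to_world` would have to use the full inverse of P * V.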
Alright, so I know there are a lot of questions referring to normalized device coordinates here on SO, but none of them address my particular issue.
So, everything I draw is specified in 2D screen coordinates, where top-left is (0,0) and bottom-right is (screenWidth, screenHeight). Then, in my vertex shader, I do this calculation to get NDC (basically, I'm rendering UI elements):
float ndcX = (screenX - ScreenHalfWidth) / ScreenHalfWidth;
float ndcY = 1.0 - (screenY / ScreenHalfHeight);
where screenX/screenY are pixel coordinates, for example (600, 700), and ScreenHalfWidth/ScreenHalfHeight are half of the screen width/height.
And the final position that I return from the vertex shader for the rasterization stage is:
gl_Position = vec4(ndcX, ndcY, Depth, 1.0);
Which works perfectly fine in OpenGL ES.
Now the problem is that when I try it just like this in Metal 2, it doesn't work.
I know Metal's NDC is 2x2x1 and OpenGL's NDC is 2x2x2, but I thought depth didn't play an important part in this equation, since I am passing it in myself per vertex.
I tried this link and this SO question but was confused, and the links weren't that helpful, since I am trying to avoid matrix calculations in the vertex shader because I am rendering everything in 2D for now.
So my questions... What is the formula to transform pixel coordinates to NDC in Metal? Is it possible without using an orthographic projection matrix? Why doesn't my equation work for Metal?
It is of course possible without a projection matrix. Matrices are just a useful convenience for applying transformations. But it's important to understand how they work when situations like this arise, since using a general orthographic projection matrix would perform unnecessary operations to arrive at the same results.
Here are the formulae I might use to do this:
float xScale = 2.0f / drawableSize.x;
float yScale = -2.0f / drawableSize.y;
float xBias = -1.0f;
float yBias = 1.0f;
float clipX = position.x * xScale + xBias;
float clipY = position.y * yScale + yBias;
Where drawableSize is the dimension (in pixels) of the renderbuffer, which can be passed in a buffer to the vertex shader. You can also precompute the scale factors and pass those in instead of the screen dimensions, to save some computation on the GPU.
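To sanity-check the scale and bias values on the CPU, the same arithmetic can be written as a short Python sketch. The function name `pixel_to_clip` is an assumption for illustration; the formulas are the ones above, with pixel coordinates whose origin is at the top left.

```python
def pixel_to_clip(position, drawable_size):
    """Map pixel coordinates (origin top-left, y down) to clip-space x/y in [-1, 1]."""
    x_scale = 2.0 / drawable_size[0]
    y_scale = -2.0 / drawable_size[1]   # negative: pixel y grows downward, clip y grows upward
    x_bias, y_bias = -1.0, 1.0
    return (position[0] * x_scale + x_bias,
            position[1] * y_scale + y_bias)

size = (1200.0, 800.0)
top_left = pixel_to_clip((0.0, 0.0), size)
center = pixel_to_clip((600.0, 400.0), size)
bottom_right = pixel_to_clip((1200.0, 800.0), size)
```

The top-left pixel corner lands at (-1, 1), the center at (0, 0) and the bottom-right corner at (1, -1).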
So I have a bit of a math problem. Here are the pieces.
Rot = Rotation (degrees). This is the rotation of the "player". This is also the yaw.
Vel.X = This is the left/rightward movement that would be happening if it weren't rotated
Vel.Z = Same as last except its up/down movement
Result.X = This is the actual movement that should be happening along the x axis considering rotation
Result.Z = Same as last
Basically, the scenario is that a player is standing on a platform with rotation "Rot". When directional keys are pressed, velocity is added accordingly to the "Vel" value. However, if the rotation isn't 0 this won't produce the right result, because once the player rotates, moving left becomes relative.
Could you please tell me a formula that would find the proper x and z movement that would result in the player moving around relative to its rotation?
This problem is probably the most basic rotation question in game programming.
Using your Vel.X and Vel.Z values, you have what you might think of as the vector you wish to rotate in the x/z plane (instead of x/y, but it's the same idea). Whether velocity or position, the approach is the same. With a simple Google search we find that for 2D vector rotation, the formula is (if your cos/sin take radians, convert Rot from degrees first):
Result.X = Vel.X * cos(Rot) - Vel.Z * sin(Rot);
Result.Z = Vel.X * sin(Rot) + Vel.Z * cos(Rot);
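As a quick check of the formula, here is a minimal Python sketch. The function name `rotate_velocity` is hypothetical, and the degree-to-radian conversion is included because Rot is given in degrees while `math.cos`/`math.sin` expect radians. Whether a positive angle turns left or right depends on your coordinate system's handedness, so the sign of the angle may need flipping in your engine.

```python
import math

def rotate_velocity(vel_x, vel_z, rot_degrees):
    """Rotate the (x, z) velocity vector by the player's yaw, given in degrees."""
    rot = math.radians(rot_degrees)
    result_x = vel_x * math.cos(rot) - vel_z * math.sin(rot)
    result_z = vel_x * math.sin(rot) + vel_z * math.cos(rot)
    return (result_x, result_z)

unrotated = rotate_velocity(1.0, 0.0, 0.0)    # yaw 0 leaves the velocity unchanged
quarter = rotate_velocity(1.0, 0.0, 90.0)     # yaw 90 turns +x movement into +z movement
```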
I used the Qt equivalent of gluLookAt to set my view matrix, and I've been moving it by translating it around the scene. Now I want to move the camera close to an object.
I know the position of the object, both in object coordinates and in any other coordinate space (I have the model matrix for that object), but how do I get the position of the camera?
To animate the camera to get closer and closer to the object I suppose I should take two points:
The point where the object is
The point where the camera is
and then do something like
QVector3D direction_to_get_closer = point_where_object_is - point_where_camera_is
How do I get the point where the camera is? Or, alternatively if this is not needed, how do I get the vector to the direction the camera has to follow (no rotations, I just need translations, this is going to simplify things) to reach the object?
gluLookAt(eye, target, headUp) takes three parameters: the position of the camera/eye, the position of the object you want to look at, and a unit vector to control the roll/head-up direction.
To zoom closer, you can move the eye/camera position by some fraction of your vector direction_to_get_closer. For instance,
point_where_camera_is += 0.1f * direction_to_get_closer; // move 10% closer
It's more useful to move by a constant amount instead of 10% of the current distance (otherwise you will move very fast when the distance is great, and then increasingly slower). Therefore, you should use the normalized direction:
QVector3D unitDir = direction_to_get_closer.normalized();
point_where_camera_is += 0.1f * unitDir; // move 0.1 units in direction
The camera transform will break if point_where_camera_is becomes equal to point_where_object_is.
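One way to avoid that degenerate case is to stop a small distance short of the object. The following Python sketch shows the idea with plain tuples instead of QVector3D; the function `step_towards` and its `min_distance` parameter are assumptions made for this example, not part of Qt.

```python
import math

def step_towards(camera, target, step, min_distance=0.5):
    """Move the camera `step` units towards the target, never closer than `min_distance`."""
    direction = [t - c for c, t in zip(camera, target)]
    distance = math.sqrt(sum(d * d for d in direction))
    if distance <= min_distance:
        return tuple(camera)                       # already as close as allowed
    travel = min(step, distance - min_distance)    # clamp the last step
    return tuple(c + travel * d / distance for c, d in zip(camera, direction))

camera = (0.0, 0.0, 10.0)
target = (0.0, 0.0, 0.0)
one_step = step_towards(camera, target, 0.1)
for _ in range(1000):                              # keep zooming; the camera stops at min_distance
    camera = step_towards(camera, target, 0.1)
```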
A better way, if you don't need to further translate/rotate the new "zoomed" point_where_camera_is, is to interpolate between two positions:
float t = 0.0f; // user input value between 0 and 1 (0% to 100% of the line camToObj)
QVector3D point_on_line_cam_obj = (1-t) * point_where_camera_is + t * point_where_object_is;
This way, you can stop the user from zooming into the object by limiting t; also, you can go back to the start position with t=0.
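The interpolation with a limit on t can be sketched in Python as follows. The convention assumed here is that t = 0 is the camera's start position and t = 1 is the object's position; the function name `camera_on_line` and the `t_max` cap are hypothetical.

```python
def camera_on_line(start, target, t, t_max=0.9):
    """Linear interpolation from the start position (t=0) towards the target (t=1), with t clamped."""
    t = max(0.0, min(t, t_max))     # never reach the object itself
    return tuple((1.0 - t) * s + t * o for s, o in zip(start, target))

start = (0.0, 0.0, 10.0)
target = (0.0, 0.0, 0.0)
at_start = camera_on_line(start, target, 0.0)   # back at the start position
halfway = camera_on_line(start, target, 0.5)
clamped = camera_on_line(start, target, 2.0)    # out-of-range input is limited to t_max
```

Because the result is always computed from the fixed start position, repeated zooming does not accumulate rounding errors the way incremental steps can.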
I'm trying to calculate the modelview matrix of my 2D camera, but I can't get the formula right. I use the Affine3f transform class so the matrix is compatible with OpenGL. This is the closest I got by trial and error. This code rotates and scales the camera OK, but if I apply translation and rotation at the same time, the camera movement gets messed up: the camera moves in a rotated fashion, which is not what I want. (This is probably due to the fact that I apply the rotation matrix first and then the translation.)
Eigen::Affine3f modelview = Eigen::Affine3f::Identity(); // a default-constructed Affine3f is uninitialized
modelview.translate(Eigen::Vector3f(camera_offset_x, camera_offset_y, 0.0f));
modelview.scale(Eigen::Vector3f(camera_zoom_x, camera_zoom_y, 0.0f));
modelview.rotate(Eigen::AngleAxisf(camera_angle, Eigen::Vector3f::UnitZ()));
modelview.translate(Eigen::Vector3f(camera_x, camera_y, 0.0f));
What I want is for the camera to rotate and scale around the offset position in screen space {(0,0) is the middle of the screen in this case} and then be positioned along the global xy-axes in world space {(0,0) is also initially at the middle of the screen} at the final position. How would I do this?
Note that I have also set up an orthographic projection matrix, which may affect this problem.
If you want a 2D image, rendered in the XY plane with OpenGL, to (1) rotate counter-clockwise by a around point P, (2) scale by S, and then (3) translate so that pixels at C (in the newly scaled and rotated image) are at the origin, you would use this transformation:
translate by -P (this moves the pixels at P to the origin)
rotate by a
translate by P (this moves the origin back to where it was)
scale by S (if you did this earlier, your rotation would be messed up)
translate by -C
If the 2D image were being rendered at the origin, you'd also need to end by translating by some value along the negative z axis to be able to see it.
Normally, you'd just do this with OpenGL basics (glTranslatef, glScalef, glRotatef, etc.). And you would do them in the reverse order that I've listed them. Since you want to use glLoadMatrix, you'd do things in the order I described with Eigen. It's important to remember that OpenGL is expecting a Column Major matrix (but that seems to be the default for Eigen; so that's probably not a problem).
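To see how the ordering plays out with explicit matrices, here is a small Python sketch using 3x3 homogeneous matrices and column vectors. It is not Eigen code, and all helper names are invented for the example. With column vectors, each later step is multiplied on the left of the accumulated matrix.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def translate(tx, ty):
    return [[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]]

def rotate(angle_rad):
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def scale(sx, sy):
    return [[sx, 0.0, 0.0], [0.0, sy, 0.0], [0.0, 0.0, 1.0]]

def camera_matrix(p, angle_rad, s, c):
    """Compose: translate by -P, rotate by a, translate by P, scale by S, translate by -C."""
    steps = [translate(-p[0], -p[1]), rotate(angle_rad), translate(p[0], p[1]),
             scale(s[0], s[1]), translate(-c[0], -c[1])]
    m = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    for step in steps:
        m = matmul(step, m)         # later steps go on the left
    return m

def apply(m, point):
    x, y = point
    return (m[0][0] * x + m[0][1] * y + m[0][2], m[1][0] * x + m[1][1] * y + m[1][2])

m = camera_matrix(p=(2.0, 3.0), angle_rad=math.radians(90.0), s=(2.0, 2.0), c=(1.0, 1.0))
pivot_image = apply(m, (2.0, 3.0))   # the pivot is unaffected by the rotation: S*P - C
other_image = apply(m, (3.0, 3.0))
```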
JCooper did great explaining the steps to construct the initial matrix.
However, I eventually solved the problem a bit differently. There were a few additional steps that were not obvious to me at the time; see the comments on JCooper's answer. The first is to realize that all matrix operations are relative.
Thus if you want to position or move the camera with absolute xy-axes, you must first decompose the matrix to extract its absolute position with unchanged axes. Then you translate the matrix by the difference of the old and new position.
Here is a way to do this with Eigen:
First, compute the scalar determinant D of the linear part of the Affine2f matrix cmat. With Eigen this is done with D = cmat.linear().determinant();. Next, compute the 'reverse' matrix matrev of the current rotation+scale matrix RS using D: matrev = (RS.array() / (1.0f / D)).matrix(); where RS is cmat.matrix().topLeftCorner(2,2).
The absolute camera position P is then given by P = matrev * -C, where C is cmat.matrix().col(2).head<2>().
Now we can reposition the camera anywhere along the absolute axes while keeping the rotation+scaling the same: V = RS * (T - P), where RS is the same as before, T is the new position vector and P is the decomposed position vector.
cmat is then simply translated by V to move the camera: cmat.pretranslate(V).