What's a simple way of warping an image with a given set of points? - math

I'd like to implement image morphing, for which I need to be able to deform the image with a given set of points and their destination positions (where they will be "dragged"). I am looking for a simple and easy solution that gets the job done; it doesn't have to look great or be extremely fast.
This is an example of what I need:
Let's say I have an image and a set of only one deforming point [0.5,0.5] which will have its destination at [0.6,0.5] (or we can say its movement vector is [0.1,0.0]). This means I want to move the very center pixel of the image by 0.1 to the right. Neighboring pixels in some given radius r need to of course be "dragged along" a little with this pixel.
My idea was to do it like this:
1. I'll make a function mapping the source image positions to destination positions depending on the deformation point set provided.
2. I will then have to find the inverse function of this function, because I have to perform the transformation by going through destination pixels and seeing "where the point had to come from to come to this position".
My function from step 1 looked like this:
p2 = p1 + ( 1 / ( (distance(p1,p0) / r)^2 + 1 ) ) * s
where
p0 ([x,y] vector) is the deformation point position.
p1 ([x,y] vector) is any given point in the source image.
p2 ([x,y] vector) is the position, to where p1 will be moved.
s ([x,y] vector) is movement vector of deformation point and says in which direction and how far p0 will be dragged.
r (scalar) is the radius, just some number.
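As a minimal sketch of the forward function above for a single deformation point, the following may help to see how the weight falls off with distance. The radius value 0.2 and the function name forward_map are assumptions made for illustration only.

```python
import math

def forward_map(p1, p0, s, r):
    """Move p1 according to one deformation point p0 with shift s and radius r."""
    dist = math.hypot(p1[0] - p0[0], p1[1] - p0[1])
    # Weight is 1 at p0 and falls towards 0 far away from it.
    weight = 1.0 / ((dist / r) ** 2 + 1.0)
    return (p1[0] + weight * s[0], p1[1] + weight * s[1])

# The deformation point itself moves by the full vector s.
center = forward_map((0.5, 0.5), (0.5, 0.5), (0.1, 0.0), 0.2)
# A point exactly r away is dragged along by about half of s.
edge = forward_map((0.7, 0.5), (0.5, 0.5), (0.1, 0.0), 0.2)
print(center, edge)
```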
I have a problem with step 2. The calculation of the inverse function seems a little too complex to me, and so I wonder:
If there is an easy solution for finding the inverse function, or
if there is a better function for which finding the inverse function is simple, or
if there is an entirely different way of doing all this that is simple?

Here's the solution in Python - I did what Yves Daoust recommended and simply used the forward function as the inverse function (switching the source and destination). I also altered the function slightly; changing the exponents and other values produces different results. Here's the code:
from PIL import Image
import math

def vector_length(vector):
    return math.sqrt(vector[0] ** 2 + vector[1] ** 2)

def points_distance(point1, point2):
    return vector_length((point1[0] - point2[0], point1[1] - point2[1]))

def clamp(value, minimum, maximum):
    return max(min(value, maximum), minimum)

## Warps an image according to given points and shift vectors.
#
# @param image input image
# @param points list of (x, y, dx, dy) tuples
# @return warped image
def warp(image, points):
    result = Image.new("RGB", image.size, "black")
    image_pixels = image.load()
    result_pixels = result.load()
    # Walk over destination pixels and look up where each one came from.
    for y in range(image.size[1]):
        for x in range(image.size[0]):
            offset = [0, 0]
            for point in points:
                # Distances are measured to the dragged (destination) position.
                point_position = (point[0] + point[2], point[1] + point[3])
                shift_vector = (point[2], point[3])
                helper = 1.0 / (3 * (points_distance((x, y), point_position) / vector_length(shift_vector)) ** 4 + 1)
                offset[0] -= helper * shift_vector[0]
                offset[1] -= helper * shift_vector[1]
            coords = (clamp(x + int(offset[0]), 0, image.size[0] - 1),
                      clamp(y + int(offset[1]), 0, image.size[1] - 1))
            result_pixels[x, y] = image_pixels[coords[0], coords[1]]
    return result

image = Image.open("test.png")
image = warp(image, [(210, 296, 100, 0), (101, 97, -30, -10), (77, 473, 50, -100)])
image.save("output.png", "PNG")

You don't need to construct the direct function and invert it. Directly compute the inverse function, by swapping the roles of the source and destination points.
You need some form of bivariate interpolation; have a look at radial basis function interpolation. It requires solving a linear system of equations.
Inverse distance weighting (similar to your proposal) is the easiest to implement but I am afraid it will give disappointing results.
https://en.wikipedia.org/wiki/Multivariate_interpolation#Irregular_grid_.28scattered_data.29
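A rough sketch of the inverse distance weighting idea, applied directly to the inverse mapping (destination pixel to source position), could look like this. The function name backward_map, the power parameter and its default of 2 are assumptions for illustration; the (x, y, dx, dy) tuples follow the convention used in the question's warp() function.

```python
import math

def backward_map(x, y, controls, power=2.0):
    """Return the source position to sample for destination pixel (x, y).

    controls: list of (px, py, dx, dy) tuples (point and its shift vector).
    """
    if not controls:
        return (float(x), float(y))
    weighted_dx = weighted_dy = total_weight = 0.0
    for px, py, dx, dy in controls:
        # Distance is measured to where the control point ends up.
        dist = math.hypot(x - (px + dx), y - (py + dy))
        if dist == 0.0:
            # Exactly on a dragged point: it came from (px, py).
            return (float(x - dx), float(y - dy))
        weight = 1.0 / dist ** power
        weighted_dx += weight * dx
        weighted_dy += weight * dy
        total_weight += weight
    # Subtract the weighted average shift to step back to the source.
    return (x - weighted_dx / total_weight, y - weighted_dy / total_weight)

print(backward_map(310, 296, [(210, 296, 100, 0), (101, 97, -30, -10)]))
```

Note that with plain inverse distance weighting the shift never fades out: far from every control point it tends towards an average of all shift vectors, which is one reason the results can be disappointing.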

Related

3D Projection Modification - Encode Z/W into Z

This is a little tricky to explain, so bear with me. I'm attempting to design a 2D projection matrix that takes 2D pixel coordinates along with a custom world-space depth value, and converts them to clip space.
The idea is that it would allow drawing elements based on screen coordinates, but at specific depths, so that these elements would interact on the depth buffer with normal 3D elements. However, I want x and y coordinates to remain the same scale at every depth. I only want depth to influence the depth buffer, and not coordinates or scale.
After the vertex shader, the GPU sets depth_buffer=z/w. However, it also computes x/w and y/w, which creates the depth scaling I want to avoid. This means I must make sure my final clip-space w coordinate ends up being 1.0 to avoid that. I think I could also opt to scale x and y by w, to cancel out the divide, but I would rather do the former, if possible.
This is the process that my 3D projection matrix uses to convert depth into clip space (d = depth, n = near distance, f = far distance)
z = f/(f-n) * d + f/(f-n) * -n;
w = d;
This is how I would like to setup my 2D projection matrix. Compared to the 3D version, it would divide both attributes by the input depth. This would simulate having z/w encoded into just the z value.
z = ( f/(f-n) * d + f/(f-n) * -n ) / d;
w = d / d;
I think this turns into something like..
r = f/(f-n); // for less crazy math
z = r + ( r * -n ) / d;
w = 1.0;
However, I can't seem to wrap my math around the values that I would need to plug into my matrix to get this result. It looks like I would need to set my matrix up to perform a division by depth. Is that even possible? Can anyone help me figure out the values I need to plug into my matrix at m[2][2] and m[3][2] (m._33 and m._43) to make something like this happen?
Note my 3D projection matrix uses the following properties to generate the final z value:
m._33 = f / (f-n); // depth scale
m._43 = -(f / (f-n)) * n; // depth offset
Edit: After thinking about this a little more, I realized that the rate of change of the depth buffer is not linear, and I'm pretty sure a matrix can only perform linear change when its input is linear. If that is the case, then what I'm trying to do wouldn't be possible. However, I'm still open to any ideas that are in the same ball park, if anyone has one. I know that I can get what I want by simply doing pos.z /= pos.w; pos.w = 1; in the vertex shader, but I was really hoping to make it all happen in the projection matrix, if possible.
In case anyone is attempting to do this, it cannot be done. Without black magic, there is apparently no way to divide by an input value with a matrix, unless of course the divisor is a constant, in which case you can swap in a scalar 1/x. I resorted to performing the operation in the shader in the end.
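To illustrate the conclusion, here is a small numeric sketch in Python rather than shader code. It mirrors the z and w formulas from the question and does the division by hand, as in pos.z /= pos.w; pos.w = 1. The function names and the near/far values are assumptions chosen for the example.

```python
def clip_depth_3d(d, n, f):
    """z and w as produced by the question's 3D projection matrix."""
    r = f / (f - n)
    return (r * d - r * n, d)

def clip_depth_2d(d, n, f):
    """Same depth-buffer value, but with w forced to 1 by dividing manually."""
    z, w = clip_depth_3d(d, n, f)
    return (z / w, 1.0)

n, f = 1.0, 100.0
for d in (1.0, 10.0, 100.0):
    z3, w3 = clip_depth_3d(d, n, f)
    z2, w2 = clip_depth_2d(d, n, f)
    # After the GPU's divide by w, both variants give the same depth.
    print(d, z3 / w3, z2 / w2)
```

The manual version computes r - r*n/d, which is not a linear function of d, so no constant matrix entries can produce it.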

3D bounding box for an item with three axis rotations

I'm trying to find the <x,y,z> size of what would end up being a bounding box for a rotated shape in all three axis rotations. Though to help keep it simple, the example demonstrated below has only the x axis rotated.
vector Size = <10,1,0.5>; vector Deg = <22.5,0,0>
if(Deg.x > 0 && Deg.y == 0 && Deg.z == 0){
Y1 = Cos(Deg.x) * Size.y + Sin(Deg.x) * Size.z;
Z1 = Cos(Deg.x) * Size.z + Sin(Deg.x) * Size.y;}
Below are for the y and z rotations, that is if you decided to change the degrees to say <0,22.5,0> and <0,0,22.5>.
if(Deg.y > 0 && Deg.x == 0 && Deg.z == 0){
X2 = Cos(Deg.y) * Size.x + Sin(Deg.y) * Size.z;
Z2 = Cos(Deg.y) * Size.z + Sin(Deg.y) * Size.x;}
if(Deg.z > 0 && Deg.x == 0 && Deg.y == 0){
X3 = Cos(Deg.z) * Size.x + Sin(Deg.z) * Size.y;
Y3 = Cos(Deg.z) * Size.y + Sin(Deg.z) * Size.x;}
The part where I get lost is where to go from here when there is rotation about two or three axes, such as <22.5,22.5,0> or <22.5,22.5,22.5>.
Is there a website with a tutorial or example equations I could review, or are there any hints or ideas of what I could do to figure this out?
EDIT:
I do want to add that the comment from Nico helped, in that what I'm asking about is called: Axis-Aligned Bounding Box or AABB for short.
As for JohanC comment about the 22.5, yes the Deg = Degree. Also yes you'll have to turn the Degree into Radians, but I put it as Degree in the example for the Sin and Cos input to keep it simple.
If you're wondering how this question might be useful: as an example, you'd need to know the AABB to keep the item in question flush with the surface it's sitting on if you were to, say, resize the item while it had <x,y,z> rotations that weren't all exactly zero.
I found my own solution after a lot of testing and debugging. I'll show and explain a small portion of the script I created since functions and options from coding language to language may vary.
A few details about the example script below. Deg = Degrees. Though yes what JohanC mentioned about radians is correct, since even the language I'm using I have to convert degrees to radians with a function. Not every language or calculator is like that, yet for intents and purposes to make it easier to read, I did take out the excess while keeping the meat of the code/equation intact so to speak. Also in this example, I'm rotating it in the Z axis to demonstrate and as a starting point.
vector Size = <10,1,0.5>; //Insert your own coding language function
//for splitting the size in half with positive and negative halves.
vector min = <-5,-0.5,-0.25>; vector max = <5,0.5,0.25>;
//You'll need at least 8 points in total to do this correctly.
//I already have 8 points added down below, labeled p1 - p8
vector p1 = <max.x,max.y,max.z>; vector p2 = <max.x,min.y,max.z>;
vector p3 = <max.x,max.y,min.z>; vector p4 = <max.x,min.y,min.z>;
vector p5 = <min.x,max.y,max.z>; vector p6 = <min.x,min.y,max.z>;
vector p7 = <min.x,max.y,min.z>; vector p8 = <min.x,min.y,min.z>;
vector Deg = <0,0,22.5>;
//This will give you an idea of how to setup the x,y,z for each point
//equation to make / change to fit the rotations. Plus to make it compact
//I suggest that you use list, a while loop, and etc.
x1 = p1.x * llCos(Deg.z) - p1.y * llSin(Deg.z);
y1 = p1.y * llCos(Deg.z) + p1.x * llSin(Deg.z);
What I left out of the example above is several lists, true/false checks, a while loop, and a few other things. It's kept simple to show the less complicated portion, and because not every language has the same functions available.
Once you have the location data of all 8 points, you can put each point's x, y, z information into its own list. Then run the equation (edited as needed for each rotation). After that, get the max and min of the rotated output along each axis. Then subtract the minimum from the maximum (the minimum will always be negative). That gives you the <x,y,z> size of your bounding box.
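The 8-point method described above can be sketched in Python as follows. The rotation order (X, then Y, then Z) and the function names are assumptions; the original script's order is not stated, and a different order gives a different box for combined rotations.

```python
import math
from itertools import product

def rotate(point, deg):
    """Rotate a point about X, then Y, then Z (angles in degrees)."""
    x, y, z = point
    ax, ay, az = (math.radians(a) for a in deg)
    y, z = y * math.cos(ax) - z * math.sin(ax), y * math.sin(ax) + z * math.cos(ax)
    x, z = x * math.cos(ay) + z * math.sin(ay), -x * math.sin(ay) + z * math.cos(ay)
    x, y = x * math.cos(az) - y * math.sin(az), x * math.sin(az) + y * math.cos(az)
    return (x, y, z)

def aabb_size(size, deg):
    """Size of the axis-aligned bounding box of a rotated, centered box."""
    half = [s / 2.0 for s in size]
    # The 8 corners: every combination of +/- half extents.
    corners = [rotate((sx * half[0], sy * half[1], sz * half[2]), deg)
               for sx, sy, sz in product((-1, 1), repeat=3)]
    # Maximum minus minimum along each axis.
    return tuple(max(c[i] for c in corners) - min(c[i] for c in corners)
                 for i in range(3))

print(aabb_size((10, 1, 0.5), (0, 0, 22.5)))
print(aabb_size((10, 1, 0.5), (22.5, 22.5, 22.5)))
```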
I do want to add that the link from Nico helped slightly, though the only flaw in it is "if" the language you're using allows a matrix or "if" you're able to create a multidimensional array. If you can manage that, then use the link Nico gave to see if it helps.
Some of JohanC's tips/hints helped as well. Speaking of which, the part about "using the output of the vector from one rotation to the next" works well if and only if you're using the 8 points method. If you instead try it with the bounding box size from one rotation to the next, the first rotation to the second will work fine because it's only moving in 2D at that point, but by the time you try to make the third rotation with the second rotation's bounding box it won't work, because it'll be going from 2D to 3D.
Note - If you think you have a better way to do the equation or simpler way to explain it, feel free to add in your own answer.

Calculate the 3rd point of an equilateral triangle from two points at any angle, pointing the "correct" way for a Koch Snowflake

Perhaps the question title needs some work.
For context this is for the purpose of a Koch Snowflake (using C-like math syntax in a formula node in LabVIEW), thus why the triangle must be the correct way. (As given 2 points an equilateral triangle may be in one of two directions.)
To briefly go over the algorithm: I have an array of 4 predefined coordinates initially forming a triangle, the first "generation" of the fractal. To generate the next iteration, one must for each line (pair of coordinates) get the 1/3rd and 2/3rd midpoints to be the base of a new triangle on that face, and then calculate the position of the 3rd point of the new triangle (the subject of this question). Do this for all current sides, concatenating the resulting arrays into a new array that forms the next generation of the snowflake.
The array of coordinates is in a clockwise order, e.g. each vertex travelling clockwise around the shape corresponds to the next item in the array, something like this for the 2nd generation:
This means that when going to add a triangle to a face, e.g. between, in that image, the vertices labelled 0 and 1, you first get the midpoints which I'll call "c" and "d", you can just rotate "d" anti-clockwise around "c" by 60 degrees to find where the new triangle top point will be (labelled e).
I believe this should hold (i.e. rotating the later point 60 degrees anticlockwise around the earlier one) anywhere around the snowflake; however, currently my maths only seems to work in the case where the initial triangle has a vertical side: [(0,0), (0,1)]. Otherwise the triangle goes off in some other direction.
I believe I have correctly constructed my loops such that the triangle generating VI (virtual instrument, effectively a "function" in written languages) will work on each line segment sequentially, but my actual calculation isn't working and I am at a loss as to how to get it in the right direction. Below is my current maths for calculating the triangle points from a single line segment, where a and b are the original vertices of the segment, c and d form new triangle base that are in-line with the original line, and e is the part that sticks out. I don't want to call it "top" as for a triangle formed from a segment going from upper-right to lower-left, the "top" will stick down.
cx = ax + (bx - ax)/3;
dx = ax + 2*(bx - ax)/3;
cy = ay + (by - ay)/3;
dy = ay + 2*(by - ay)/3;
dX = dx - cx;
dY = dy - cy;
ex = (cos(1.0471975512) * dX + sin(1.0471975512) * dY) + cx;
ey = (sin(1.0471975512) * dX + cos(1.0471975512) * dY) + cy;
note 1.0471975512 is just 60 degrees in radians.
Currently for generation 2 it makes this: (note the seemingly separated triangle to the left is formed by the 2 triangles on the top and bottom having their e vertices meet in the middle and is not actually an independent triangle.)
I suspect I need slightly different equations depending on whether ax or bx is larger etc., perhaps something to do with how the periodicity of sin/cos may need to be accounted for (something about quadrants in spherical coordinates?), as it looks like the misplaced triangles are at 60 degrees, just that the angle is between the wrong lines. However, this is a guess and I'm just not able to imagine how to do this programmatically, let alone on paper.
Thankfully the maths formula node allows for if and else statements which would allow for this to be implemented if it's the case but as said I am not awfully familiar with adjusting for what I'll naively call the "quadrants thing", and am unsure how to know which quadrant one is in for each case.
This was a long and rambling question which inevitably tempts nonsense so if you've any clarifying questions please comment and I'll try to fix anything/everything.
Answering my own question thanks to @JohanC. Unsurprisingly, this was a case of making many tiny adjustments and giving up just before getting it right.
The correct formula was this:
ex = (cos(1.0471975512) * dX + sin(1.0471975512) * dY) + cx;
ey = (-sin(1.0471975512) * dX + cos(1.0471975512) * dY) + cy;
just adding a minus to the second sine function. Note that if one were travelling anticlockwise then one would want to rotate points clockwise, so you instead have the 1st sine function negated and the second one positive.
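As a quick check of the corrected formula, here is a Python sketch (not LabVIEW formula-node code) that computes c, e and d for one segment. The function name koch_segment is an assumption. Which side counts as outward depends on the vertex order and on whether the y axis points up or down, so treat the sign choice as specific to the setup in the question.

```python
import math

ANGLE = math.radians(60)

def koch_segment(a, b):
    """Return (c, e, d) for the segment a -> b, using the answer's formula."""
    ax, ay = a
    bx, by = b
    # Points at 1/3 and 2/3 along the segment form the new triangle base.
    cx, cy = ax + (bx - ax) / 3, ay + (by - ay) / 3
    dx, dy = ax + 2 * (bx - ax) / 3, ay + 2 * (by - ay) / 3
    # Vector from c to d, then rotated by 60 degrees around c.
    vx, vy = dx - cx, dy - cy
    ex = math.cos(ANGLE) * vx + math.sin(ANGLE) * vy + cx
    ey = -math.sin(ANGLE) * vx + math.cos(ANGLE) * vy + cy
    return (cx, cy), (ex, ey), (dx, dy)

c, e, d = koch_segment((0.0, 0.0), (3.0, 0.0))
print(c, e, d)
```

Because the rotation preserves length, e ends up at the same distance from c as d is, which is what makes the new triangle equilateral.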

Vectors and Scalars in games

How can we use vectors and scalars in games? What is the benefit of doing so?
Could someone please explain precisely the difference between a scalar and a vector in the field of games? No matter how many times I try, I can't quite understand it; maybe I need examples.
A scalar is just another word for a number. The distinction is that a scalar is a number that is part of a vector. There is nothing special about a scalar.
A vector is a set of numbers (one or more) that define something, in the most common case you have 2 numbers representing a 2D vector or 3 numbers representing a 3D vector. The abstract notion for a vector is simply an arrow.
Take a piece of graph paper. Select any point on that paper and call it the origin. Its coordinate will be x = 0, y = 0. Now draw a straight line from that point in any direction and any length. To describe that arrow you need to define the vector. Count how far across the page the end of the arrow is from the start (origin) and that is the x component. Then how far up the page and that is the y component.
You have just created a vector that has two numbers (x,y) that completely describe the arrow on the paper. This type of vector always starts at zero. You can also describe the vector by its direction (ie north, east, south...) and length.
In 3D you need 3 numbers to describe any arrow. (x,y,z)
Vectors are very handy. You can use a vector to describe how fast something is moving and in what direction. The vector represents a little arrow that starts where the object is now and ends where the object will be in the next time unit.
Thus an object at coordinate x,y has a vector velocity(0.2,0.3). To calculate the position of the object in the next time unit just add the vector to the coordinate
newXPos = currentXPos + velocityVectorX
newYPos = currentYPos + velocityVectorY
If you want to slow the speed by half you can multiply the vector by 0.5
velocityVectorX = velocityVectorX * 0.5
velocityVectorY = velocityVectorY * 0.5
You do the same to increase the speed.
velocityVectorX = velocityVectorX * 2
velocityVectorY = velocityVectorY * 2
You may have an object in 3D space that has many forces acting on it. There is gravity, a vector (arrow) pointing down (G). The force of the air resistance points up (R). The current velocity is another arrow pointing in the direction it is traveling (V). You can have as many as you like (need) to describe all the forces that push and pull at the object. When you need to calculate the position of the object for the next instant in time (say one second), you just add all the force vectors together to get the total as a vector and add that to the object's position
Object.x = Object.x + G.x + R.x + V.x;
Object.y = Object.y + G.y + R.y + V.y;
Object.z = Object.z + G.z + R.z + V.z;
If you just want the velocity
V.x = V.x + G.x + R.x;
V.y = V.y + G.y + R.y;
V.z = V.z + G.z + R.z;
That is the new velocity in one second.
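The updates above can be written once for whole vectors instead of per component. This is only a sketch using plain tuples; the helper names add and scale and the sample numbers are assumptions for illustration.

```python
def add(a, b):
    """Component-wise sum of two vectors of equal length."""
    return tuple(p + q for p, q in zip(a, b))

def scale(v, k):
    """Multiply every component of vector v by the scalar k."""
    return tuple(c * k for c in v)

position = (10.0, 5.0, 0.0)
velocity = (0.2, 0.3, 0.0)
gravity = (0.0, -0.1, 0.0)

# One time step: gravity changes the velocity, velocity changes the position.
velocity = add(velocity, gravity)
position = add(position, velocity)

# Halving the speed keeps the direction and only shortens the arrow.
slower = scale(velocity, 0.5)
print(position, velocity, slower)
```

Here 0.5 is the scalar and velocity is the vector: the scalar changes how long the arrow is, not where it points.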
There are many things that can be done with a vector. A vector can be used to point away from a surface in 3D; this vector is called a surface normal. Then you create a vector from a point on that surface pointing to a light. The cosine of the angle between the two vectors is how much light the surface will reflect.
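The lighting idea can be sketched with a dot product: for two unit-length vectors, the dot product equals the cosine of the angle between them. The function names are assumptions, and clamping negative values to zero (light behind the surface) is an added assumption, not something stated above.

```python
import math

def normalize(v):
    """Scale a vector to length 1, keeping its direction."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def dot(a, b):
    """Dot product of two vectors of equal length."""
    return sum(p * q for p, q in zip(a, b))

def diffuse(normal, surface_point, light_position):
    """Cosine of the angle between the normal and the direction to the light."""
    to_light = normalize(tuple(l - p for l, p in zip(light_position, surface_point)))
    # Assumption: a light behind the surface contributes nothing.
    return max(0.0, dot(normalize(normal), to_light))

print(diffuse((0, 1, 0), (0, 0, 0), (0, 5, 0)))   # light straight above
print(diffuse((0, 1, 0), (0, 0, 0), (5, 5, 0)))   # light at 45 degrees
```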
You can use a vector to represent the three direction in space an object has. Say a box, there is a 3D vector pointing along the width, another along the height and the last along the depth. The length of each vector represents the length of each side. You can make another vector to represent how far the corner of the box is from the origin (any known point) In 3D these 4 vectors are used to represent the object and is called a transformation matrix (just another type of vector made up of vectors)
The list of things vectors can do is endless.
The basics are just like numbers: you can add, subtract, multiply and divide a vector.
Then there are a host of special functions for vectors, normalize, transform, dot product and cross product to name but a few. For these things people normally use a library that does all this for you. My view is that if you really want to learn about vectors and how they are used write your own vector library at some point until then use a library.
Hope that cleared the mud a little bit for you, it is always difficult to describe something you have used for a long time to someone that is new to it so feel free to ask questions in the comments if you need.

I've got my 2D/3D conversion working perfectly, how to do perspective

Although the context of this question is making a 2D/3D game, the problem I have boils down to some math.
Although it's a 2.5D world, let's pretend it's just 2D for this question.
// xa: x-accent, the x coordinate of the projection
// mapP: a coordinate on a map which need to be projected
// _Dist_ values are constants for the projection, choosing them correctly will result in i.e. an isometric projection
xa = mapP.x * xDistX + mapP.y * xDistY;
ya = mapP.x * yDistX + mapP.y * yDistY;
xDistX and yDistX determine the angle of the x-axis, and xDistY and yDistY determine the angle of the y-axis on the projection (and also the size of the grid, but let's assume this is 1 pixel for simplicity).
x-axis-angle = atan(yDistX/xDistX)
y-axis-angle = atan(yDistY/xDistY)
a "normal" coordinate system like this
--------------- x
|
|
|
|
|
y
has values like this:
xDistX = 1;
yDistX = 0;
xDistY = 0;
yDistY = 1;
So every step in the x direction moves the projection 1 pixel to the right and 0 pixels down. Every step in the y direction moves the projection 0 pixels to the right and 1 pixel down.
When choosing the correct xDistX, yDistX, xDistY, yDistY, you can project any trimetric or dimetric system (which is why i chose this).
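Without perspective this projection is a 2x2 linear map, so its inverse can be written down directly as long as the determinant is non-zero. The following Python sketch uses snake_case versions of the question's constants; the function names and the sample values are assumptions.

```python
def project(x, y, x_dist_x, x_dist_y, y_dist_x, y_dist_y):
    """Map point (x, y) to projected point (xa, ya)."""
    return (x * x_dist_x + y * x_dist_y, x * y_dist_x + y * y_dist_y)

def unproject(xa, ya, x_dist_x, x_dist_y, y_dist_x, y_dist_y):
    """Inverse of project(): recover (x, y) from (xa, ya)."""
    det = x_dist_x * y_dist_y - x_dist_y * y_dist_x
    if det == 0:
        raise ValueError("axes are parallel, the projection has no inverse")
    x = (xa * y_dist_y - ya * x_dist_y) / det
    y = (ya * x_dist_x - xa * y_dist_x) / det
    return (x, y)

# Example constants giving slanted axes (values chosen arbitrarily).
dist = (2, -2, 1, 1)
xa, ya = project(3, 4, *dist)
print((xa, ya), unproject(xa, ya, *dist))
```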
So far so good, when this is drawn everything turns out okay. If "my system" and mindset are clear, lets move on to perspective.
I wanted to add some perspective to this grid, so I added some extras like this:
camera = new MapPoint(60, 60);
dx = mapP.x - camera.x; // delta x
dy = mapP.y - camera.y; // delta y
dist = Math.sqrt(dx * dx + dy * dy); // dist is the distance to the camera, Pythagoras etc.. all objects must be in front of the camera
fac = 1 - dist / 100; // this formula determines the amount of perspective
xa = fac * (mapP.x * xDistX + mapP.y * xDistY) ;
ya = fac * (mapP.x * yDistX + mapP.y * yDistY );
Now the real hard part... what if you got a (xa,ya) point on the projection and want to calculate the original point (x,y).
For the first case (without perspective) I did find the inverse function, but how can this be done for the formula with perspective? My math skills are not quite up to the challenge of solving this.
( I vaguely remember from a long time ago mathematica could create inverse function for some special cases... could it solve this problem? Could someone maybe try?)
The function you've defined doesn't have an inverse. Just as an example, as user207422 already pointed out anything that's 100 units away from the camera will get mapped to (xa,ya)=(0,0), so the inverse isn't uniquely defined.
More importantly, that's not how you calculate perspective. Generally the perspective scaling factor is defined to be viewdist/zdist where zdist is the perpendicular distance from the camera to the object and viewdist is a constant which is the distance from the camera to the hypothetical screen onto which everything is being projected. (See the diagram here, but feel free to ignore everything else on that page.) The scaling factor you're using in your example doesn't have the same behaviour.
Here's a stab at trying to convert your code into a correct perspective calculation (note I'm not simplifying to 2D; perspective is about projecting three dimensions to two, trying to simplify the problem to 2D is kind of pointless):
camera = new MapPoint(60, 60, 10);
camera_z = camera.x*zDistX + camera.y*zDistY + camera.z*zDistZ;
// viewdist is the distance from the viewer's eye to the screen in
// "world units". You'll have to fiddle with this, probably.
viewdist = 10.0;
xa = mapP.x*xDistX + mapP.y*xDistY + mapP.z*xDistZ;
ya = mapP.x*yDistX + mapP.y*yDistY + mapP.z*yDistZ;
za = mapP.x*zDistX + mapP.y*zDistY + mapP.z*zDistZ;
zdist = camera_z - za;
scaling_factor = viewdist / zdist;
xa *= scaling_factor;
ya *= scaling_factor;
You're only going to return xa and ya from this function; za is just for the perspective calculation. I'm assuming that the "za-direction" points out of the screen, so if the pre-projection x-axis points towards the viewer then zDistX should be positive and vice versa, and similarly for zDistY. For a trimetric projection you would probably have xDistZ==0, yDistZ<0, and zDistZ==0. This would make the pre-projection z-axis point straight up post-projection.
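For reference, here is the same calculation as a small self-contained Python sketch. The axes argument (three rows holding the xDist*, yDist* and zDist* constants), the function names and the check for points behind the camera are assumptions added for illustration.

```python
def transform(p, axes):
    """Apply the three rows of projection constants to a 3D point."""
    return tuple(sum(a * c for a, c in zip(row, p)) for row in axes)

def perspective_project(map_p, camera, axes, viewdist=10.0):
    """Return (xa, ya) for map_p using the viewdist / zdist scaling factor."""
    xa, ya, za = transform(map_p, axes)
    camera_z = transform(camera, axes)[2]
    zdist = camera_z - za
    if zdist <= 0:
        raise ValueError("point is not in front of the camera")
    scaling_factor = viewdist / zdist
    return (xa * scaling_factor, ya * scaling_factor)

# Identity axes keep the numbers easy to follow.
identity = ((1, 0, 0), (0, 1, 0), (0, 0, 1))
print(perspective_project((1, 2, 0), (0, 0, 10), identity))    # on the screen plane
print(perspective_project((1, 2, -10), (0, 0, 10), identity))  # twice as far away
```

A point twice as far from the camera comes out half as large, which is the behaviour the 1 - dist / 100 factor from the question does not have.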
Now the bad news: this function doesn't have an inverse either. Any point (xa,ya) is the image of an infinite number of points (x,y,z). But! If you assume that z=0, then you can solve for x and y, which is possibly good enough.
To do that you'll have to do some linear algebra. Compute camera_x and camera_y similarly to camera_z. Those are the post-transformation coordinates of the camera. The point on the screen has post-transformation coordinates (xa,ya,camera_z-viewdist). Draw a line through those two points, and calculate where it intersects the plane spanned by the vectors (xDistX, yDistX, zDistX) and (xDistY, yDistY, zDistY). In other words, you need to solve the following equations for x, y and s:
x*xDistX + y*xDistY == s*camera_x + (1-s)*xa
x*yDistX + y*yDistY == s*camera_y + (1-s)*ya
x*zDistX + y*zDistY == s*camera_z + (1-s)*(camera_z - viewdist)
It's not pretty, but it will work.
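The three equations are linear in the unknowns x, y and s, so they can be solved with any 3x3 solver. Below is a sketch using Cramer's rule in plain Python. The function names, the argument layout and the sample numbers are assumptions; the code solves the equations exactly as written above and nothing more.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def solve3(m, rhs):
    """Solve m * unknowns = rhs with Cramer's rule."""
    d = det3(m)
    if d == 0:
        raise ValueError("system has no unique solution")
    result = []
    for col in range(3):
        replaced = [list(row) for row in m]
        for row in range(3):
            replaced[row][col] = rhs[row]
        result.append(det3(replaced) / d)
    return tuple(result)

def unproject_ground(xa, ya, axes, camera_t, viewdist):
    """Solve the three equations for (x, y, s), assuming z = 0.

    axes: ((xDistX, xDistY), (yDistX, yDistY), (zDistX, zDistY))
    camera_t: (camera_x, camera_y, camera_z) after transformation
    """
    cam_x, cam_y, cam_z = camera_t
    # Move every s term to the left-hand side of each equation.
    m = [
        (axes[0][0], axes[0][1], -(cam_x - xa)),
        (axes[1][0], axes[1][1], -(cam_y - ya)),
        (axes[2][0], axes[2][1], -viewdist),
    ]
    rhs = (xa, ya, cam_z - viewdist)
    return solve3(m, rhs)

axes = ((1, 0), (0, 1), (0.5, 0.5))
x, y, s = unproject_ground(1.0, 2.0, axes, (3.0, 4.0, 20.0), 10.0)
print(x, y, s)
```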
I think that with your post I can solve the problem. Still, to clarify some points:
Solving the problem in 2D is indeed useless, but I only did that to make the problem easier to grasp (for me and for the readers here). My program actually gives a perfect 3D projection (I checked it against 3D images rendered with Blender). I did leave something out about the inverse function, though: it is only needed for coordinates between 0 and camera.x * 0.5 and between 0 and camera.y * 0.5, so in my example between 0 and 30. But even then I have doubts about my function.
In my projection the z-axis is always straight up, so to calculate the height of an object I only used the viewing angle. But since you can't actually fly or jump into the sky, everything has only a 2D point. This also means that when you try to solve for x and y, z really is 0.
I know not every function has an inverse, and some functions do, but only for a particular domain. My basic thought in all this was: if I can draw a grid using a function, every point on that grid maps to exactly one map point. I can read off the x and y coordinates, so if I just had the correct function I would be able to calculate the inverse.
But there is no better replacement than some good solid math, and I'm very glad you took the time to give a very helpful response :).
