What is the simplest way to un-warp a photo made using a fisheye or wide-angle lens? I'm looking for a pixel projection formula with only a few parameters. The camera and lens parameters will not be known, so the user has to adjust the parameters visually. Thanks.
There is a good paper here that provides some decent-looking mathematical models for lens distortion. It's a starting point, at least. SDX2000 was kind of on the right track with the grid, I think. I think the most common way to approach the problem is to map the image to a grid and then allow warping parameters to be applied to produce pincushion and barrel distortion. See the lens distortion filters in Lightroom or Photoshop as an example.
There is an excellent discussion from ImageMagick. They give the equation that they use.
Note that this does not correct distortion in the same way as Photoshop CS6 (i.e. you cannot take coefficients from the Adobe lens profiles and simply chuck them in).
The paper that Kamil points to seems like an excellent in-depth look.
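To make the "few parameters, tuned by eye" idea concrete, here is a minimal sketch of a radial un-warp in the spirit of ImageMagick's barrel-distortion model, r_src = r_dst * (a*r^3 + b*r^2 + c*r + d). The parameter names, the radius normalization and the use of OpenCV's remap are my own assumptions for illustration, not anything prescribed by the sources above; the user tweaks a, b and c until straight lines look straight, and d is derived so the scale at the image center is preserved.

```python
import numpy as np
import cv2  # only used for the resampling step

def undistort_barrel(img, a, b, c):
    """Radial un-warp: for each destination pixel, sample the source at
    r_src = r_dst * (a*r^3 + b*r^2 + c*r + d), with d = 1 - a - b - c."""
    h, w = img.shape[:2]
    cx, cy = w / 2.0, h / 2.0
    norm = min(cx, cy)                       # r = 1 at the nearest image edge (assumed)
    d = 1.0 - a - b - c
    xs, ys = np.meshgrid(np.arange(w), np.arange(h))
    dx, dy = (xs - cx) / norm, (ys - cy) / norm
    r = np.sqrt(dx * dx + dy * dy)
    scale = a * r**3 + b * r**2 + c * r + d  # r_src / r_dst
    map_x = (cx + dx * scale * norm).astype(np.float32)
    map_y = (cy + dy * scale * norm).astype(np.float32)
    return cv2.remap(img, map_x, map_y, cv2.INTER_LINEAR)
```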
I would assume you could use the lens equation to do it.
1/f = 1/object_distance + 1/image_distance
Where f is the focal length (the user input). The ratio of image distance and object distance could be used to resize the image appropriately, using the magnification equation. To get what you really want, then, you need to restructure the equation:
1/object_distance = 1/f - 1/image_distance
And then use the magnification equation to use the object height to resize:
-image_distance/object_distance = image_height/object_height
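As a quick numeric illustration of the rearranged equations (the numbers are made up, not taken from the question):

```python
f = 50.0                 # focal length in mm (the user input; value is illustrative)
image_distance = 55.0    # mm
object_distance = 1.0 / (1.0 / f - 1.0 / image_distance)   # = 550.0 mm
magnification = -image_distance / object_distance          # = -0.1
```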
The catch, as you may have noticed, is that you need to know the distance each pixel is away from the camera. Otherwise, it simply doesn't work. You could ask the user for that information, but that seems unlikely, and painful. I don't know of any other way to do it-- lens distortion is a 3D effect, and you're given 2D information. At best you can attempt to correct it two-dimensionally, but this will be difficult, and won't work properly.
If it's possible, you should ask the user to take a photograph of a reference image (a chessboard, for example) using the same camera, and then use this information to analyze the lens characteristics. This information can then be used to un-warp the other photographs taken with the same camera.
For implementation you could use neural networks/genetic algorithms.
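For the chessboard idea above, the most common route is a standard camera-calibration routine rather than anything learned. A minimal sketch assuming OpenCV's Python bindings (file names and the pattern size are illustrative):

```python
import glob
import cv2
import numpy as np

pattern = (9, 6)  # inner-corner count of the printed chessboard; adjust to your pattern
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2)

obj_points, img_points = [], []
for path in glob.glob("calib_*.jpg"):       # photos of the chessboard from the same camera
    gray = cv2.cvtColor(cv2.imread(path), cv2.COLOR_BGR2GRAY)
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if found:
        obj_points.append(objp)
        img_points.append(corners)

# Estimate the intrinsics K and the distortion coefficients
ret, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None)

# Un-warp any other photo taken with the same camera/lens
img = cv2.imread("photo.jpg")
cv2.imwrite("photo_undistorted.png", cv2.undistort(img, K, dist))
```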
I have successfully calculated Rotation, Translation with the intrinsic camera matrix of two cameras.
I also got rectified images from the left and right cameras. Now, I wonder how to calculate the 3D coordinate of a point, just one point in an image. Here, please see the green points. I have had a look at the equation, but it requires the baseline, which I don't know how to calculate. Could you show me the process of calculating the 3D coordinate of the green point with the given information (R, T, and the intrinsic matrix)?
FYI
1. I also have a Fundamental matrix and Essential matrix, just in case we need them.
2. Original image size is 960 x 720. Rectified ones are 925 x 669
3. The green point from the left image: (562, 185), from the right image: (542, 185)
The term "baseline" usually just means translation. Since you already have your rotation, translation and intrinsics matrices (let's not them R, T and K). you can triangulate and don't need either the Fundamental or Essential matrices (they could be used to extract R, T etc but you already have them). You don't really need your images to be rectified either, since it doesn't change the triangulation process that much. There are many ways to triangulate, each with their pros and cons, and many libraries that implement them. So, all I can do here is give you and overview of the problem and potential solutions, as well as pointers to resources that you can either use as their are or as a source of inspiration to write your own code.
Formalization and solution outlines. Let's formalize what we are after here. You have a 3D point X, with two observations x_1 and x_2, respectively in the left and right images. If you backproject them, you obtain two rays:
ray_1 = K^{-1} x_1
ray_2 = R * K^{-1} x_2 + T //I'm assuming that [R|T] is the pose of the second camera expressed in the frame of the first camera
Ideally, you'd want those two rays to meet at the point X. Since in practice we always have some noise (discretization noise, rounding errors and so on), the two rays won't meet exactly at X, so the best answer is a point Q such that
Q=argmin_X {d(X,ray_1)^2+d(X,ray_2)^2}
where d(.) denotes the Euclidean distance between a line and a point. You can solve this problem as a regular least-squares problem, or you can take the geometric approach (called the midpoint method) of considering the line segment l that is perpendicular to both ray_1 and ray_2, and taking its midpoint as your solution. Another quick and dirty way is to use the DLT: basically, you rewrite the constraints (i.e. X should be as close as possible to both rays) as a linear system AX=0 and solve it with the SVD.
Usually, the geometric (midpoint) method is the least precise. The DLT-based one, while not the most stable numerically, usually produces acceptable results.
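A minimal sketch of the DLT idea (build the linear system and solve AX = 0 with the SVD), assuming numpy and the convention x1 ~ P1 X, x2 ~ P2 X with P1 = K [I|0] and P2 = K [R|T]; if your R and T express the pose of the second camera, as in the rays above, invert them first:

```python
import numpy as np

def triangulate_dlt(P1, P2, x1, x2):
    """DLT triangulation of a single point seen in two views.
    P1, P2: 3x4 projection matrices; x1, x2: (u, v) pixel coordinates."""
    A = np.stack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)   # least-squares solution of A X = 0
    X = Vt[-1]
    return X[:3] / X[3]           # de-homogenize
```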
Resources that present an in-depth formalization
Hartley and Zisserman's book, of course! Chapter 12. A simple DLT-based method, which is the one used in OpenCV (both in the calibration and sfm modules), is explained on page 312. It is very easy to implement; it shouldn't take more than 10 minutes in any language.
Szeliski's book. It has an interesting discussion on triangulation in the chapter on SFM, but it is not as straightforward or in-depth as Hartley and Zisserman's.
Code. You can use the triangulation methods from opencv, either from the calib3d module, or from the contribs/sfm module. Both use the DLT, but the code from the SFM module is more easily understandable (the calib3d code has a lot of old-school C code which is not very pleasant to read). There is also another lib, called openGV, which has a few interesting methods for triangulation.
cv::triangulatePoints
cv::sfm::triangulatePoints
OpenGV
The openGV git repo doesn't seem very active, and I'm not a big fan of the library's design, but if I remember correctly (feel free to tell me otherwise) it offers methods other than the DLT for triangulation.
Naturally, those are all written in C++, but if you use other languages, finding wrappers or similar libraries won't be difficult (with Python you still have the OpenCV wrappers, MATLAB has a bundle module, etc.).
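For completeness, a hedged example of calling OpenCV's triangulation from Python with the green point from the question. K, R and T are assumed to already be numpy arrays, and you should use the intrinsics that match the images (original or rectified) the pixel coordinates come from:

```python
import numpy as np
import cv2

# Projection matrices under the convention x2 ~ K (R X + T); if your R, T are
# the pose of the second camera instead, use R.T and -R.T @ T here.
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([R, T.reshape(3, 1)])

pts1 = np.array([[562.0], [185.0]])   # green point in the left image (2 x N)
pts2 = np.array([[542.0], [185.0]])   # green point in the right image (2 x N)

X_h = cv2.triangulatePoints(P1, P2, pts1, pts2)   # 4 x N homogeneous coordinates
X = (X_h[:3] / X_h[3]).ravel()                    # 3D point in the first camera's frame
```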
I draw a vector geometry with some calibration points around it.
I print this geometry and then I physically scan the printed calibration points (I can't scan the geometry, I can only scan the calibration points).
When I acquire these points, they aren't in their original positions anymore because of print error or bad printer calibration.
The question is:
Is there any algorithm that helps me adapt the original geometry based on the newly scanned points?
In practice I need to warp the geometry in order to obtain the real geometry printed on the paper with the same print error that I have on the calibration points.
The distortion is given by the physical distortion of the material (not paper but cloth) during the print process. I can't know how much the material will distort during the print.
Yes, there are algorithms to help you with that. In general you need to learn/find the transformation between the two images that you have.
Typical geometric transformations are affine transformations (shift, scale, rotation, shear, reflections), which need at least three control points, or piecewise local linear / local weighted mean transformations, which need at least 4-6 control points. The more control points you have, the better, in general.
Given a set of control points in one image and the corresponding set of control points in the other image, there are algorithms for finding the optimal transformation between them if you specify a class (affine or piecewise local linear). See for example fitgeotrans in Matlab. I don't know exactly how it solves the problem, but I guess by some kind of optimization. It should be easy to find implementations for other programming languages (Python, C, Java).
What remains is finding the correspondence between the control points in the two images. For a few images you may be able to do that by hand, but in the general case you might want to automate this as well. General image registration algorithms like imregister should do well for your images. They give you a good initial estimate of the transformation (which may already be sufficient), so that identifying the corresponding point pairs becomes trivial (always take the nearest) and allows refining.
So I advise you to first just try to register the images (grayscale data) with the identity transformation as the starting value. Then identify corresponding point pairs and refine the transformation using either an affine or a piecewise/local transformation. Then apply the transformation to the geometry to get the printed geometry. Depending on your choice of programming language you will find many implementations that do the job.
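If you don't have MATLAB, the affine case is only a few lines as a least-squares fit from matched control points. A sketch assuming numpy (the function names are mine); for local, non-rigid distortions you would move to the piecewise transformations mentioned above:

```python
import numpy as np

def fit_affine(src, dst):
    """Fit a 2D affine transform mapping src -> dst.
    src, dst: (N, 2) arrays of corresponding control points, N >= 3."""
    n = src.shape[0]
    A = np.zeros((2 * n, 6))
    A[0::2, 0:2] = src   # rows for the x equations: a*x + b*y + tx
    A[0::2, 2] = 1
    A[1::2, 3:5] = src   # rows for the y equations: c*x + d*y + ty
    A[1::2, 5] = 1
    b = dst.reshape(-1)
    params, *_ = np.linalg.lstsq(A, b, rcond=None)
    return params.reshape(2, 3)          # 2x3 affine matrix [[a, b, tx], [c, d, ty]]

def apply_affine(M, pts):
    """Apply a 2x3 affine matrix to an (N, 2) array of points."""
    return pts @ M[:, :2].T + M[:, 2]
```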
How can I convert a 3D vector into a 2D vector that can be drawn on screen?
I have the camera position, the position of the 3D point, the vertical and horizontal rotation of the camera, the screen resolution and the field of view.
I heard about the world to screen function but do not know how to use it. Is there a way to do it just using maths?
Thanks in advance.
Mate, your question does not have a simple answer. I would explain the maths to you if I remembered it, but it's been some time since I studied linear algebra.
First, the methods you would call to do that already use the math you are looking for, and they are probably optimized!
But, regarding your questions about the maths behind it, here is a link:
https://en.wikipedia.org/wiki/3D_projection
This "3d vector to 2D vector screen point" is called projection, and you are probably referring to perspective projection. It is explained in the link.
If you are interested in learning it, you should first study some linear algebra, and then start studying computer graphics. (One good book: http://www.amazon.com/gp/product/0321399528/ref=as_li_qf_sp_asin_tl?ie=UTF8&camp=1789&creative=9325&creativeASIN=0321399528&linkCode=as2&tag=casueffe06-20)
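If you want the maths directly rather than an engine call, here is a minimal perspective-projection sketch. The rotation order, the handedness and the vertical-FOV convention are assumptions; adapt them to your engine:

```python
import numpy as np

def world_to_screen(point, cam_pos, yaw, pitch, fov_deg, width, height):
    """Project a 3D world point to 2D screen coordinates.
    Assumes yaw rotates about the Y (up) axis, pitch about the X axis,
    the camera looks along +Z in its own space, and fov_deg is the vertical FOV."""
    # Move the point into camera space: translate, then undo the camera rotation
    p = np.asarray(point, float) - np.asarray(cam_pos, float)
    cy, sy = np.cos(-yaw), np.sin(-yaw)
    cp, sp = np.cos(-pitch), np.sin(-pitch)
    p = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]]) @ p   # inverse yaw
    p = np.array([[1, 0, 0], [0, cp, -sp], [0, sp, cp]]) @ p   # inverse pitch
    if p[2] <= 0:
        return None  # behind the camera, nothing to draw
    f = (height / 2) / np.tan(np.radians(fov_deg) / 2)  # focal length in pixels
    screen_x = width / 2 + f * p[0] / p[2]
    screen_y = height / 2 - f * p[1] / p[2]              # screen Y grows downward
    return screen_x, screen_y
```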
I have a 2d world made of tiles. Tiles are either passable, non-passable or have some sort of movement penalty.
All entities and tiles have their own hitboxes and sizes for collision detection.
Each tile has dimensions of 16x16 px.
Most examples I've read seem to suggest moving from the center of one tile to the center of another tile. As we can see from the picture below, the red path hardly looks optimal, nor does it take entity size into account. Also, the pathfinding nodes are placed in a 2D array, with only 8 possible directions from each node.
But wouldn't actually shortest path be something like this?
How should I implement pathfinding?
Should tiles be split into smaller nodes for pathfinding, or is there some other way to get more accurate routes? Even if I split each tile into 10x10 pathfinding nodes, it still wouldn't find the shortest line between 2 points.
Should there be more than 8 directions and if so, how should that be implemented?
For example, if my world were 50x50 tiles big, what should the pathfinding map look like and how should it be generated?
It depends on your definition of "shortest path" and what you plan to do with it.
In your example, it appears that you consider a valid move to be from the center of one tile to the center of any other tile in unobstructed view. How you'd validate moves to partially obstructed tiles is not clear. This differs from the geometrically shortest path, which would hug the wall, and from a realistic shortest path, which would use a unit's width and turn radius to avoid walls and sudden changes in direction.
A common approach is to use A* as usual, and then post-process the path in a number of ways to optimize and smooth it. This works both for grid based worlds like yours, and for more general navmeshes.
Gamasutra had a nice overview of this called Toward More Realistic Pathfinding, with a variety of ideas and techniques from smoothing zigzags and adding curves, to optimizing paths for units with acceleration and direction.
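As a concrete example of the post-processing step, here is a sketch of greedy line-of-sight smoothing of an A* path. The line_of_sight check is assumed to be supplied by you, e.g. a Bresenham walk over your tile grid that also accounts for entity width:

```python
def smooth_path(path, line_of_sight):
    """Greedy smoothing of a grid path, a common A* post-process.
    path: list of (x, y) tile coordinates returned by A*.
    line_of_sight(a, b): True if the straight segment between tile centers
    a and b is unobstructed."""
    if len(path) < 3:
        return path
    smoothed = [path[0]]
    anchor = path[0]
    for i in range(2, len(path)):
        # When the anchor loses sight of path[i], keep path[i-1] as a waypoint
        if not line_of_sight(anchor, path[i]):
            smoothed.append(path[i - 1])
            anchor = path[i - 1]
    smoothed.append(path[-1])
    return smoothed
```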
I had almost the same problem, and I coded pre-computation software that finds paths from every tile to every tile, with some optimizations.
You can find source code here : https://github.com/FurkanGozukara/pathfinding-2d-tile-map
The development video is here : https://www.youtube.com/watch?v=jRTA0iLjv6M
I came up with my own algorithm and implementation, so it is probably not the best nor the most optimized one. Still, it is already implemented in my free browser-based game MonsterMMORPG and it works great: https://www.monstermmorpg.com/
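Independently of that particular repository, the precomputation idea can be sketched as a BFS from every tile that records the next step toward each goal (4-directional movement and uniform cost assumed; memory is O(n^2) in the number of walkable tiles, so this only pays off for small maps with many path queries):

```python
from collections import deque

def precompute_next_hops(walkable, width, height):
    """All-pairs shortest paths on a tile grid via BFS from every goal tile.
    walkable(x, y): True if the tile can be entered.
    Returns next_hop[start][goal] -> the first tile to step to from start."""
    tiles = [(x, y) for y in range(height) for x in range(width) if walkable(x, y)]
    next_hop = {t: {} for t in tiles}
    for goal in tiles:
        # BFS outward from the goal; the tile we came from is the next step toward it
        dist = {goal: 0}
        q = deque([goal])
        while q:
            cx, cy = q.popleft()
            for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                if (0 <= nx < width and 0 <= ny < height
                        and walkable(nx, ny) and (nx, ny) not in dist):
                    dist[(nx, ny)] = dist[(cx, cy)] + 1
                    next_hop[(nx, ny)][goal] = (cx, cy)
                    q.append((nx, ny))
    return next_hop
```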
I'm trying to implement my own physics for an app I'm making in C++ with openFrameworks. I'm currently using Box2D, but I don't need collision detection, so I want something much lighter.
I have a world with gravity and a dynamic object with movement constrained by a prismatic joint of an arbitrary length at an arbitrary angle, attached to a static object. Friction is simulated using the joint motor.
I've looked at
Resources for 2d game physics
But everything here seems to focus on building complete physics engines which I don't need to do. Could anyone point me in the right direction for the maths on this?
You just need to separate the force of gravity into two components: one along the prismatic joint axis, and everything else (see free-body diagrams).
This is easily achieved with the vector dot product between the gravity vector and axis vector. If you first scale the axis vector to length 1, the result of the dot product will be the force along the axis.
To translate the force into acceleration, you just need to divide by the mass of the moving object.
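A small sketch of that decomposition, assuming numpy and ignoring the joint-motor friction mentioned above; the idea carries over directly to C++:

```python
import numpy as np

def acceleration_along_joint(gravity_force, axis, mass):
    """Acceleration of a body constrained to slide along `axis` under gravity.
    gravity_force: the weight vector (mass * g); axis: joint direction vector."""
    axis = np.asarray(axis, float)
    axis = axis / np.linalg.norm(axis)        # scale the axis to length 1
    f_along = np.dot(gravity_force, axis)     # dot product = force component on the axis
    return (f_along / mass) * axis            # a = F / m, directed along the axis

# Example: a 2 kg body on a joint tilted 30 degrees above horizontal
mass = 2.0
weight = np.array([0.0, -9.81]) * mass
axis = np.array([np.cos(np.radians(30)), np.sin(np.radians(30))])
print(acceleration_along_joint(weight, axis, mass))  # ~4.9 m/s^2 down the slope
```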
If Box2D has what you want, I'd recommend you reconsider your "lighter" requirement. Unless you can quantify the harm caused by using a library with a few more bytes, I'd say that the benefit will outweigh the cost of you writing it for yourself.
If you have a good understanding of the physics, and want to learn how to do it, by all means go ahead. If not, use what someone more knowledgeable than you has provided and forget the size of the library.