I read an image-based 3D-reconstruction paper that contains the following paragraph:
Volumetric representations, which have been extensively adopted in early deep learning-based 3D reconstruction techniques, allow the parametrization of 3D shapes using regular voxel grids. As such, 2D convolutions used in image analysis can be easily extended to 3D. They are, however, very expensive in terms of memory requirements, and only a few techniques can achieve sub-voxel accuracy.
So what does "sub-voxel" mean?
Of these six techniques (trigonometry, complex numbers, vectors, matrices, quaternions and multivectors), which rotation technique is used most in 3D graphics? I read about them in the book Rotation Transforms for Computer Graphics.
Thanks.
Matrices are mostly used at the lowest level, since graphics hardware is optimized for matrix-vector multiplication. Quaternions are also used a lot because they offer several advantages over matrices: easy interpolation and averaging, a singularity-free representation, simple re-normalization, faster rotation concatenation, and so on. Quaternions are frequently converted to matrices and back. Geometric algebra multivectors (the relevant ones are usually called rotors) are isomorphic to quaternions, so in practice they behave the same; they just belong to a different algebra that subsumes quaternions. They are still new to graphics people and so not very popular yet, although the number of GA users is growing quickly. Complex numbers are used in 2D geometry, of course, and in 3D non-Euclidean geometry (e.g., hyperbolic geometry), but not much in 3D Euclidean geometry. Trigonometry is used a lot in 3D graphics, not so much for rotations themselves, but it is essential for many other things, such as rendering and illumination. In summary, they are all used. You definitely need to know matrices, vectors and trigonometry to succeed.
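To make the "quaternions are converted to matrices" remark a bit more concrete, here is a minimal Python sketch of that conversion. The function names and the (w, x, y, z) component order are my own assumptions for illustration; real libraries differ in ordering and conventions.

```python
import math

def quat_to_matrix(q):
    """Convert a quaternion (w, x, y, z) to a 3x3 rotation matrix (list of rows)."""
    w, x, y, z = q
    # Re-normalize first so small numerical drift does not distort the rotation.
    n = math.sqrt(w * w + x * x + y * y + z * z)
    w, x, y, z = w / n, x / n, y / n, z / n
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]

def rotate(m, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

# Example: a 90 degree rotation about the z axis maps the x axis onto the y axis.
half = math.radians(90) / 2
q_z90 = (math.cos(half), 0.0, 0.0, math.sin(half))
rotated = rotate(quat_to_matrix(q_z90), [1.0, 0.0, 0.0])
print(rotated)  # close to [0, 1, 0], up to floating-point error
```

The re-normalization step is the "simple re-normalization" advantage mentioned above: one division fixes drift, whereas a drifting matrix needs re-orthogonalization.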
I have successfully calculated the rotation and translation, along with the intrinsic camera matrices, of two cameras.
I also got rectified images from the left and right cameras. Now I wonder how to calculate the 3D coordinates of a point, just one point in an image. Here, please see the green points. I had a look at the equation, but it requires the baseline, which I don't know how to calculate. Could you show me the process of calculating the 3D coordinates of the green point with the given information (R, T, and intrinsic matrix)?
FYI
1. I also have a Fundamental matrix and Essential matrix, just in case we need them.
2. Original image size is 960 x 720. Rectified ones are 925 x 669
3. The green point from the left image: (562, 185), from the right image: (542, 185)
The term "baseline" usually just refers to the translation between the two cameras (its length is the distance between the camera centers). Since you already have your rotation, translation and intrinsics matrices (let's denote them R, T and K), you can triangulate and don't need either the Fundamental or Essential matrix (they could be used to extract R, T, etc., but you already have them). You don't really need your images to be rectified either, since it doesn't change the triangulation process that much. There are many ways to triangulate, each with its pros and cons, and many libraries that implement them. So all I can do here is give you an overview of the problem and potential solutions, as well as pointers to resources that you can either use as they are or as a source of inspiration to write your own code.
Formalization and solution outlines. Let's formalize what we are after here. You have a 3d point X, with two observations x_1 and x_2 respectively in the left and right images. If you backproject them, you obtain two rays:
ray_1 = K^{-1} x_1
ray_2 = R*K^{-1} x_2 + T //I'm assuming that [R|T] is the pose of the second camera expressed in the reference frame of the first camera
Ideally, you'd want those two rays to meet at point X. Since in practice we always have some noise (discretization noise, rounding errors and so on), the two rays won't meet at X, so the best answer would be a point Q such that
Q=argmin_X {d(X,ray_1)^2+d(X,ray_2)^2}
where d(.) denotes the Euclidean distance between a line and a point. You can solve this problem as a regular least squares problem, or you can just take the geometric approach (called midpoint) of considering the line segment l that is perpendicular to both ray_1 and ray_2, and take its middle as your solution. Another quick and dirty way is to use the DLT. Basically, you rewrite the constraints (i.e. X should be as close as possible to both rays) as a linear system AX=0 and solve it with SVD.
Usually, the geometric (midpoint) method is the least precise. The DLT-based one, while not the most stable numerically, usually produces acceptable results.
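As a rough sketch of the midpoint approach described above (not the DLT, and not what OpenCV does), the following pure-Python code back-projects two pixels into rays and returns the middle of the shortest segment between them. The function names, the skew-free pinhole intrinsics and all numeric values are assumptions made up for the example, not the asker's calibration.

```python
def backproject(u, v, fx, fy, cx, cy):
    """Ray direction K^{-1} x for pixel (u, v), assuming a skew-free pinhole K."""
    return [(u - cx) / fx, (v - cy) / fy, 1.0]

def mat_vec(R, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(R[i][j] * v[j] for j in range(3)) for i in range(3)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def midpoint_triangulate(o1, d1, o2, d2):
    """Midpoint of the shortest segment between lines o1 + s*d1 and o2 + t*d2."""
    w0 = [p - q for p, q in zip(o1, o2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w0), dot(d2, w0)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are (nearly) parallel")
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    p1 = [o + s * k for o, k in zip(o1, d1)]  # closest point on ray 1
    p2 = [o + t * k for o, k in zip(o2, d2)]  # closest point on ray 2
    return [(x + y) / 2.0 for x, y in zip(p1, p2)]

# Hypothetical setup: identical intrinsics, second camera shifted 1 unit along x.
fx = fy = 800.0
cx, cy = 480.0, 360.0
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
T = [1.0, 0.0, 0.0]  # [R|T] = pose of camera 2 in the frame of camera 1

d1 = backproject(580.0, 400.0, fx, fy, cx, cy)
d2 = mat_vec(R, backproject(380.0, 400.0, fx, fy, cx, cy))
X = midpoint_triangulate([0.0, 0.0, 0.0], d1, T, d2)
print(X)  # close to [0.5, 0.2, 4.0] for these made-up observations
```

With noisy observations the two closest points no longer coincide, and the distance between them gives a quick sanity check on the match.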
Resources that present an in-depth formalization
Hartley-Zisserman's book, of course! Chapter 12. A simple DLT-based method, which is the one used in opencv (both in the calibration and sfm modules), is explained on page 312. It is very easy to implement; it shouldn't take more than 10 minutes in any language.
Szeliski's book. It has an interesting discussion on triangulation in the chapter on SFM, but it is not as straightforward or in-depth as Hartley-Zisserman's.
Code. You can use the triangulation methods from opencv, either from the calib3d module, or from the contribs/sfm module. Both use the DLT, but the code from the SFM module is more easily understandable (the calib3d code has a lot of old-school C code which is not very pleasant to read). There is also another lib, called openGV, which has a few interesting methods for triangulation.
cv::triangulatePoints
cv::sfm::triangulatePoints
OpenGV
The openGV git repo doesn't seem very active, and I'm not a big fan of the design of the library, but if I remember correctly (feel free to tell me otherwise) it offers methods other than the DLT for triangulation.
Naturally, those are all written in C++, but if you use other languages, finding wrappers or similar libraries won't be difficult (with python you still have opencv wrappers, and MATLAB has a bundle module, etc.).
I draw a vector geometry with some calibration points around it.
I print this geometry and then physically scan the printed calibration points (I can't scan the geometry, only the calibration points).
When I acquire these points, they are no longer in their original positions because of print errors or bad print calibration.
The question is:
Is there any algorithm that helps me adapt the original geometry based on the newly scanned points?
In practice I need to warp the geometry in order to obtain the real geometry printed on the paper with the same print error that I have on the calibration points.
The distortion is given by the physical distortion of the material (not paper but cloth) during the print process. I can't know how much the material will distort during the print.
Yes, there are algorithms to help you with that. In general you need to learn/find the transformation between the two images that you have.
Typical geometrical transformations are affine transformations (shift, scale, rotation, shear, reflections) which need at least three control points or piecewise local linear/ local weighted mean which need at least 4-6 control points. The more control points you have, the better in general.
Given a set of control points in one image and the corresponding set of control points in the other image, there are algorithms that find the optimal transformation between them once you specify a class (affine or piecewise local linear). See for example fitgeotrans in Matlab. I don't know exactly how it solves the problem, but I guess by some kind of optimization. It should be easy to find available implementations for other programming languages (Python, C, Java).
What remains is finding the correspondence between the control points in the two images. For a few images you may be able to do that by hand, but in the general case you might want to automate this as well. General image registration algorithms like imregister should do well for your images. They give you a good initial estimate of the transformation (which may already be sufficient), so that identifying the corresponding point pairs becomes trivial (always take the nearest) and the transformation can then be refined.
So I advise you to first just try to register the images (gray scale data) with an identity transformation as the starting value. Then identify corresponding point pairs and refine the transformation using either an affine or a piecewise/local transformation. Then apply the transformation to the geometry to get the printed geometry. Depending on your choice of programming language you will find many implementations that do the job.
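For the affine case only, the fitting step can be sketched in a few lines of pure Python as an ordinary least-squares problem. This is a minimal illustration under my own assumptions (function names, made-up control points); it is not how fitgeotrans is implemented, and cloth distortion may well need the piecewise/local variants instead.

```python
def solve3(A, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, 3):
            f = M[r][col] / M[col][col]
            for c in range(col, 4):
                M[r][c] -= f * M[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        x[r] = (M[r][3] - sum(M[r][c] * x[c] for c in range(r + 1, 3))) / M[r][r]
    return x

def fit_affine(src, dst):
    """Least-squares affine map from src to dst points (needs >= 3 non-collinear pairs)."""
    rows = [[x, y, 1.0] for x, y in src]
    AtA = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    Atx = [sum(r[i] * d[0] for r, d in zip(rows, dst)) for i in range(3)]
    Aty = [sum(r[i] * d[1] for r, d in zip(rows, dst)) for i in range(3)]
    # One coefficient row per output coordinate: x' = a*x + b*y + c, y' = d*x + e*y + f.
    return solve3(AtA, Atx), solve3(AtA, Aty)

def apply_affine(params, pt):
    (a, b, c), (d, e, f) = params
    x, y = pt
    return (a * x + b * y + c, d * x + e * y + f)

# Hypothetical calibration points: designed positions and where the scan found them.
designed = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
scanned = [(0.5, -0.3), (10.7, -0.4), (0.6, 9.5), (10.8, 9.4)]

params = fit_affine(designed, scanned)
warped = apply_affine(params, (5.0, 5.0))  # warp any point of the original geometry
print(warped)
```

Once the parameters are fitted, every vertex of the vector geometry is passed through `apply_affine` to get its predicted printed position.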
What are some path finding algorithms used in games of all types? (Of all types where characters move, anyway) Is Dijkstra's ever used? I'm not really looking to code anything; just doing some research, though if you paste pseudocode or something, that would be fine (I can understand Java and C++).
I know A* is like THE algorithm to use in 2D games. That's great and all, but what about 2D games that are not grid-based? Things like Age of Empires, or Link's Awakening. There aren't distinct square spaces to navigate to, so what do they do?
What do 3D games do? I've read this thingy http://www.ai-blog.net/archives/000152.html, which I hear is a great authority on the subject, but it doesn't really explain HOW, once the meshes are set, the path finding is done. IF A* is what they use, then how is something like that done in a 3D environment? And how exactly do the splines work for rounding corners?
Dijkstra's algorithm calculates the shortest path to all nodes in a graph that are reachable from the starting position. For your average modern game, that would be both unnecessary and incredibly expensive.
You make a distinction between 2D and 3D, but it's worth noting that for any graph-based algorithm, the number of dimensions of your search space doesn't make a difference. The web page you linked to discusses waypoint graphs and navigation meshes; both are graph-based and could in principle work in any number of dimensions. Although there are no "distinct square spaces to move to", there are discrete "slots" in the space that the AI can move to and which have been carefully laid out by the game designers.
Concluding, A* is actually THE algorithm to use in 3D games just as much as in 2D games. Let's see how A* works:
1. At the start, you know the coordinates of your current position and your target position. You make an optimistic estimate of the distance to your destination, for example the length of the straight line between the start position and the target.
2. Consider the adjacent nodes in the graph. If one of them is your target (or contains it, in case of a navigation mesh), you're done.
3. For each adjacent node (in the case of a navigation mesh, this could be the geometric center of the polygon or some other kind of midpoint), estimate the associated cost of traveling along there as the sum of two measures: the length of the path you'd have traveled so far, and another optimistic estimate of the distance that would still have to be covered.
4. Sort your options from the previous step by their estimated cost together with all options that you've considered before, and pick the option with the lowest estimated cost. Repeat from step 2.
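The steps above can be sketched roughly as follows in Python. The graph layout, node names and the `a_star` signature are hypothetical; the straight-line distance is used as the optimistic estimate, and a priority queue replaces the explicit sorting. Note that this sketch uses the common variant that stops when the target is taken off the queue rather than when it first shows up as a neighbor.

```python
import heapq
import math

def a_star(neighbors, positions, start, goal):
    """Return the cheapest path from start to goal as a list of nodes, or None."""
    def estimate(node):
        # Optimistic estimate: straight-line distance, valid in any dimension.
        return math.dist(positions[node], positions[goal])

    open_heap = [(estimate(start), 0.0, start)]  # (estimated total, cost so far, node)
    best_cost = {start: 0.0}
    parent = {start: None}
    while open_heap:
        _, cost, node = heapq.heappop(open_heap)  # option with lowest estimated cost
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        if cost > best_cost[node]:
            continue  # stale queue entry, a cheaper route to this node was found
        for nxt in neighbors[node]:
            new_cost = cost + math.dist(positions[node], positions[nxt])
            if new_cost < best_cost.get(nxt, math.inf):
                best_cost[nxt] = new_cost
                parent[nxt] = node
                heapq.heappush(open_heap, (new_cost + estimate(nxt), new_cost, nxt))
    return None

# A tiny waypoint graph with 3D coordinates (made up for the example).
positions = {"A": (0, 0, 0), "B": (1, 0, 0), "C": (1, 1, 0), "D": (0, 1, 0), "E": (2, 1, 1)}
neighbors = {"A": ["B", "D"], "B": ["A", "C"], "C": ["B", "E"], "D": ["A", "E"], "E": ["C", "D"]}

path = a_star(neighbors, positions, "A", "E")
print(path)  # ['A', 'D', 'E']
```

Nothing in the function cares whether the coordinates are 2D or 3D; only the tuples in `positions` change.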
There are some details I haven't discussed here, but this should be enough to see how A* is basically independent of the number of dimensions of your space. You should also be able to see why this works for continuous spaces.
There are some closely related algorithms that deal with certain problems in the standard A* search. For example recursive best-first search (RBFS) and simplified memory-bounded A* (SMA*) require less memory, while learning real-time A* (LRTA*) allows the agent to move before a full path has been computed. I don't know whether these algorithms are actually used in current games.
As for the rounding of corners, this can be done either with distance lines (where corners are replaced by circular arcs), or with any kind of spline function for full-path smoothing.
In addition, algorithms are possible that rely on a gradient over the search space (where each point in space is associated with a value), rather than a graph. These are probably not applied in most games because they take more time and memory, but might be interesting to know about anyway. Examples include various hill-climbing algorithms (which are real-time by default) and potential field methods.
Methods to procedurally obtain a graph from a continuous space exist as well, for example cell decomposition, Voronoi skeletonization and probabilistic roadmap skeletonization. The first would produce something compatible with a navigation mesh (though it might be hard to make it as efficient as a hand-crafted navigation mesh), while the latter two produce results that will be more like waypoint graphs. All of these, as well as potential field methods and A* search, are relevant to robotics.
Sources:
Artificial Intelligence: A Modern Approach, 2nd edition
Introduction to The Design and Analysis of Algorithms, 2nd edition
I am an IT student and I have to make a project in VB6. I was thinking of making a 3D software renderer, but I don't really know where to start. I found a few tutorials, but I want something that goes in depth with the maths and algorithms. I would like something that shows how to do 3D transformations, camera, lights, shading ...
The programming language used does not matter; I just need some resources that show me exactly how to do this.
So I just want to know where to find some resources, or you can show me some source code and tell me where to start from.
Or if any of you have a better idea for a VB6 project.
Thanks.
I disagree with the previous posts, a 3D renderer is actually pretty simple. A high-quality 3D renderer is hard however.
Get a bunch of 3D data, triangles are simplest.
Learn about homogeneous coordinates and the great 4x4 matrix for transforms.
Define a camera by a position and a rotation (expressed in the 4x4 matrix).
Transform your 3D geometry by this camera.
Perform the perspective divide and scale to your window. This converts your 3D data to 2D.
Render the data as 2D.
Now you're going to lose out on a depth buffer, so stick to wireframes in the beginning. :-)
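The transform, perspective divide and window scaling steps listed above could look roughly like this in Python (the same arithmetic ports directly to VB6). Everything here is a simplified sketch under my own assumptions: the camera only has a position and a yaw angle, it looks down +z, and `focal` is a made-up focal length in pixels.

```python
import math

def mat_mul(a, b):
    """Multiply two 4x4 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def transform(m, v):
    """Multiply a 4x4 matrix by a homogeneous 4-vector."""
    return [sum(m[i][j] * v[j] for j in range(4)) for i in range(4)]

def view_matrix(cam_pos, yaw):
    """World-to-camera matrix: undo the camera translation, then undo its yaw about Y."""
    c, s = math.cos(yaw), math.sin(yaw)
    inv_rot = [[c, 0.0, -s, 0.0], [0.0, 1.0, 0.0, 0.0], [s, 0.0, c, 0.0], [0.0, 0.0, 0.0, 1.0]]
    inv_trans = [[1.0, 0.0, 0.0, -cam_pos[0]],
                 [0.0, 1.0, 0.0, -cam_pos[1]],
                 [0.0, 0.0, 1.0, -cam_pos[2]],
                 [0.0, 0.0, 0.0, 1.0]]
    return mat_mul(inv_rot, inv_trans)

def project(point, view, focal, width, height):
    """Return window coordinates of a 3D point, or None if it is behind the camera."""
    x, y, z, _ = transform(view, [point[0], point[1], point[2], 1.0])
    if z <= 0:
        return None
    # Perspective divide, then shift to the window center (window y grows downwards).
    return (width / 2 + focal * x / z, height / 2 - focal * y / z)

view = view_matrix((0.0, 0.0, -5.0), 0.0)
screen = project((1.0, 1.0, 0.0), view, 500.0, 640, 480)
print(screen)  # (420.0, 140.0)
```

Projecting the three corners of each triangle this way and joining them with 2D lines gives the wireframe view.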
Don't listen to these nay-sayers, go out and have some fun!
Many years ago I made a shaded triangle renderer that used library calls to draw the triangles. It's a rather naive approach but you would be able to achieve the same result using VB6. I got all the maths & techniques from "Computer Graphics principles and practice" by Foley et al. Some parts are out of date now but I think you'd find it very helpful for this project and it can be bought 2nd hand at reasonable prices from Amazon for example.
One simple approach could be:
Read model file as triangles
Transform each triangle using matrices to account for camera position
Project triangle points onto 2D
Draw 2D triangle (probably using GDI)
This covers wireframe viewing. To extend this to hidden surface removal you need to work out which triangles are in front. Two possible ways:
Z-order sorting the triangles and drawing the ones furthest from the camera first. This is simple but inefficient if there are a lot of triangles, and it can give overlapping triangle effects when the order is not quite correct. You also have to decide how to sort the triangles, e.g. by centroid, by extents...
Using a software depth buffer. This will give better results but is more work to implement. You will have to write your own triangle drawing code, so you cannot rely on GDI. See Bresenham's line algorithm and related algorithms for doing filled triangles for how to do this.
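The first option (z-order sorting by centroid) is small enough to sketch directly. This is an illustration in Python with hypothetical names; it assumes the triangles are already in camera space with larger z meaning further away.

```python
def centroid_depth(tri):
    """Average camera-space z of a triangle given as three (x, y, z) vertices."""
    return sum(vertex[2] for vertex in tri) / 3.0

def painter_order(triangles):
    """Return the triangles sorted back to front, so nearer ones are drawn last."""
    return sorted(triangles, key=centroid_depth, reverse=True)

near = ((0, 0, 2), (1, 0, 2), (0, 1, 2))
far = ((0, 0, 10), (1, 0, 10), (0, 1, 10))
middle = ((0, 0, 5), (1, 0, 6), (0, 1, 7))

ordered = painter_order([near, far, middle])
print([centroid_depth(t) for t in ordered])  # [10.0, 6.0, 2.0]
```

Sorting by centroid is exactly where the "order is not quite correct" cases come from: two triangles that interpenetrate or overlap cyclically have no correct draw order at all.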
After this you'd also need some kind of shading based on lighting. The calculations are covered in Computer Graphics principles and practice. For simple shading you can stick with drawing triangles using GDI, but if you want to do Gouraud or Phong shading, the colour values vary across a triangle. One way around this is to sub-divide the triangle into smaller triangles, but this is inefficient and won't give very nice looking results. It would be better to draw the triangles yourself, as required above for the software depth buffer.
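For the simple (flat) shading case, one common choice is a Lambert diffuse term: the brightness of a triangle is the cosine of the angle between its normal and the direction to the light. The sketch below is my own minimal Python version with hypothetical names; it assumes counter-clockwise vertex order defines the front face.

```python
import math

def sub(a, b):
    return [x - y for x, y in zip(a, b)]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def normalize(v):
    length = math.sqrt(sum(x * x for x in v))
    return [x / length for x in v]

def flat_shade(tri, to_light):
    """Lambert intensity in [0, 1] for a triangle, given the direction towards the light."""
    normal = normalize(cross(sub(tri[1], tri[0]), sub(tri[2], tri[0])))
    light = normalize(to_light)
    # Faces turned away from the light get a negative cosine, clamped to 0.
    return max(0.0, sum(n * l for n, l in zip(normal, light)))

tri = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))  # normal points along +z
head_on = flat_shade(tri, (0.0, 0.0, 1.0))
grazing = flat_shade(tri, (0.0, 1.0, 1.0))
behind = flat_shade(tri, (0.0, 0.0, -1.0))
print(head_on, grazing, behind)
```

Multiplying the triangle's base colour by this intensity gives one colour per triangle, which is all that a GDI-drawn triangle can show; Gouraud shading would evaluate the same term per vertex and interpolate.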
A good extension would be to support primitives other than triangles. Basic approach would be to split primitives into triangles as you read them.
Good luck - could be an interesting project.
VB6 is not the best suited language to do maths and 3D graphics, and given that you have no previous knowledge about the subject either, I would recommend you to choose something different (and easier).
As it's Visual Basic, you could try something more form-oriented, that is the original intent of the language.
There is the 3D engine list, which lists three engines in pure Basic (an oxymoron) with source code, and one of them is in Visual Basic (Dex3D):
DeX3D is an open source 3D engine coded entirely in Visual Basic from Jerry Chen ( -onlyuser#hotmail.com ).
Gouraud shading
Transparency
Fogging
Omni and spot lights
Hierarchical meshes
Support for 3D Studio files
Particle systems
Bezier curve segments
2.5 D text
Visual Basic source
More information, screenshots and the source can be found on the Dex3D Homepage. (<= Dead Link)
EGL25 by Erkan Sanli is a fast open source VB 6 renderer that can render, rotate, animate, etc. complex solid shapes made of thousands of polygons. Just Windows API calls – no DirectX, no OpenGL.
VBMigration.com chose EGL25 as a high-quality open-source VB6 project to demonstrate their VB6 to VB.Net upgrade tool.
A 3D software renderer as a whole project is fairly complex if you've never done it before. I would suggest something smaller - like just doing the 3D portion and using lines to do the rendering OR just write a shaded triangle renderer (which is the underpinnings of 3D renderers anyway).
Something a little simpler rather than trying to write a full-blown 3D software renderer on the first go - especially in VB.
A software renderer is a very difficult project, and VB6 is not well suited to it at all (for a task like this, C++ is the way to go). Anyway, I can suggest some great books I used:
Shaders: http://wiki.gamedev.net/index.php/D3DBook:Introduction_%28Volume%29
Math: 3D Math Primer for Graphics and Game Development
There are two other books. Even though they are for VB.NET, you can find some useful code in them:
.NET Game Programming with DirectX 9.0
Beginning .NET Game Programming in VB .NET
I think you can go two ways. Either go the DirectX way and use DirectX 8, which has VB 5-6 support; I found a page: http://www.gamedev.net/reference/articles/article1308.asp
Or you can always write an engine from the ground up, but for that you will need some basic linear algebra, as Frank Krueger suggests.