WebGL2 -- Transform feedback vs. dumping to CPU: which is faster for repositioning 1,000,000 vertex positions?

For a physics particle-system simulation, I want to optimize a WebGL2 program. Is it faster to adjust vertex positions using transform feedback, with the shader reading a 3D texture and setting each position from a pixel of the texture (for example, from "color.r"), or alternatively to dump the entire 3D texture back to the CPU, extract position values for all the vertices from the texture, and resubmit the new vertex array to the GPU for processing on the subsequent draw cycle?
Being a beginner, I'm clueless as to what would be faster. I need to use a texture because my position calculation requires knowing the positions of 26 neighbor particles relative to the vertex being calculated.
I have no code to show. I'm hoping for guidance for an approach before I write code for either approach.
My intuition says that staying on the GPU, rather than pumping 1,000,000 vertices' (minimum) worth of data back and forth each draw cycle, would be faster, but this is newbie intuition and I'd prefer guidance from someone confident in their knowledge.
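For reference, the GPU-resident approach described above would look something like the sketch below (uniform, varying, and texture names are all made up for illustration): the vertex shader fetches each particle's position from the 3D texture by index and writes the updated position to a transform-feedback buffer, so nothing is read back to the CPU.

const updateVS = `#version 300 es
precision highp float;
precision highp sampler3D;
uniform sampler3D uPositions;  // positions packed into texel colors
uniform ivec3 uTexSize;        // e.g. 100 x 100 x 100 for 1,000,000 particles
out vec3 vNewPosition;         // captured via transform feedback

void main() {
  // Recover this vertex's texel coordinate from its index.
  int i = gl_VertexID;
  ivec3 tc = ivec3(i % uTexSize.x,
                   (i / uTexSize.x) % uTexSize.y,
                   i / (uTexSize.x * uTexSize.y));
  vec3 p = texelFetch(uPositions, tc, 0).rgb;  // position = texel color
  // ...texelFetch the 26 neighbors with +-1 offsets and do physics here...
  vNewPosition = p;
}`;

// Declare the captured varying before linking:
function linkWithFeedback(gl: WebGL2RenderingContext, prog: WebGLProgram) {
  gl.transformFeedbackVaryings(prog, ["vNewPosition"], gl.SEPARATE_ATTRIBS);
  gl.linkProgram(prog);
}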

Related

2d game collision detection using displacement vectors

I am trying to implement a collision detection algorithm for my game that uses 2D coordinates (x, y) and quads (rectangles). I am terrible at maths, and before making this post I wandered through solutions on Stack Overflow, which left me even more confused, as they were stacked with comments saying this doesn't work for this case, there's a better algorithm than this one, etc...
I did manage to implement a simple AABB collision detection and resolution algorithm at first, but later realized that it doesn't handle cases where an object's speed is high enough for it to phase through other objects in a single step.
My current thought process is to grab the object's old position vertices (oldTL, oldTR, oldBL, oldBR) and new position vertices (newTL, newTR, newBL, newBR), create 4 line segments (one joining the old and new position of each vertex), and find out whether they intersect any edges of any other objects.
I'm very lost and would appreciate any help or feedback I could get...
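For what it's worth, the swept test described above boils down to a segment-segment intersection check. A minimal sketch (type and function names are made up):

type Vec2 = { x: number; y: number };

// True if segment p1->p2 intersects segment p3->p4.
function segmentsIntersect(p1: Vec2, p2: Vec2, p3: Vec2, p4: Vec2): boolean {
  // d is the cross product of the two direction vectors.
  const d = (p2.x - p1.x) * (p4.y - p3.y) - (p2.y - p1.y) * (p4.x - p3.x);
  if (d === 0) return false; // parallel (collinear overlap not handled here)
  const t = ((p3.x - p1.x) * (p4.y - p3.y) - (p3.y - p1.y) * (p4.x - p3.x)) / d;
  const u = ((p3.x - p1.x) * (p2.y - p1.y) - (p3.y - p1.y) * (p2.x - p1.x)) / d;
  return t >= 0 && t <= 1 && u >= 0 && u <= 1;
}

Each of the four old->new corner segments would be tested against every edge of every nearby quad; the smallest t among the hits gives the first impact.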

Triangulating a huge set of points

When triangulating a set of points where the number of points is huge (10 million), you need to triangulate one chunk at a time after subdividing the problem using a quad-tree or an oct-tree.
So far so good, we are now looking for a smart approach to fill the small straight gaps between each mesh. Do you know a good one?
Thanks.
Rather than finish by welding together the separate parts of the mesh, why not start by decomposing the point set into overlapping chunks? This way your problem becomes one of removing unwanted edges rather than finding missing ones, at the expense of duplicating the computation of the mesh along the borders. This might be easier, though I suspect its computational complexity is no different.
I believe that most standard approaches to triangulation cannot be expected to produce the same mesh across the boundary of two overlapping chunks. However, I also believe (without proof) that the computation of the mesh across the boundary between (the interiors of) neighbouring chunks is increasingly likely to produce the same triangulation as the depth of the overlap increases.
Think of an existing triangulation of a set of points, and add a new point outside the hull of the existing points. Triangulating the extended set of points will require only local (in some vague sense) adjustment of the existing mesh, in most cases. Similarly, deleting a point at the edge of an existing mesh will rarely affect the triangulation at the centre of the mesh.
If this ad-hoc approach doesn't appeal to you, use your favourite search engine and look for "parallel Delaunay triangulation".
If the mesh is connected using linear elements (straight sides), the only way you can have gaps is if endpoints on adjacent edges aren't coincident.
You can check within some tolerance sphere whether two points should be made one, but the tolerance has to be smaller than the shortest edge in your mesh or you'll collapse elements.
The smartest thing I can think of is to parallelize the job. Break the mesh into one chunk per thread/process and do the tolerance check on each one.
It might be a good map-reduce job. Or perhaps GPUs and CUDA would be a good way to go.
When you calculate the distance between two points, you can forgo the expensive square root and just compare the square of the distance against the square of the tolerance.
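As a sketch of that tolerance check with the squared distance (a naive O(n^2) version; the per-chunk parallelization above or a spatial hash would make it practical, and all names are made up):

type Point3 = [number, number, number];

function weld(points: Point3[], tol: number): Point3[] {
  const tol2 = tol * tol; // compare squared distances, no sqrt needed
  const out: Point3[] = [];
  for (const p of points) {
    const match = out.find(q => {
      const dx = p[0] - q[0], dy = p[1] - q[1], dz = p[2] - q[2];
      return dx * dx + dy * dy + dz * dz <= tol2;
    });
    if (!match) out.push(p); // keep p only if no kept point is within tol
  }
  return out;
}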

Vector Shape Difference & intersection

Let me explain my problem:
I have a black vector shape (let's say it's a series of joined, straight lines for now, but it'd be nice if I could also support quadratic curves).
I also have a rectangle of a predefined width and height. I'm going to place it on top of the black shape, and then take the union of the two.
My first issue is that I don't know how to quickly extract vector unions, but I think there is a well-defined formula I can figure out for myself.
My second, and trickier, issue is how to efficiently find the position the rectangle needs to be in (i.e., what translation and rotation the transformation matrices need) in order to maximize the black remaining after the union (see figure, below).
The red outlined shape below is ~33% black; the green is something like 85%; and there are positions for this shape & rectangle wherein either could have 100% coverage.
Obviously, I can brute-force this by trying every translation and rotation value for every point where at least part of the rectangle is touching the black shape, then keep track of the one with the most black coverage. The problem is, I can only try a finite number of positions (and may therefore miss the maximum). Apart from that, it feels very inefficient!
Can you think of a more efficient way of tackling this problem?
Something from my Uni days tells me that a Fourier transform might improve the efficiency here, but I can't figure out how I'd do that with a vector shape!
Three ideas that promise to be faster and/or more precise than a brute-force search:
Suppose you have a 3d physics engine. Define a "cone-shaped" surface where the apex is at say (0,0,-1), the black polygon boundary on the z=0 plane with its centroid at the origin, and the cone surface is formed by connecting the apex with semi-infinite rays through the polygon boundary. Think of a party hat turned upside down and crumpled to the shape of the black polygon. Now constrain the rectangle to be parallel to the z=0 plane and initially so high above the cone (large z value) that it's easy to find a place where it's definitely "inside". Then let the rectangle fall downward under gravity, twisting about z and translating in x-y only as it touches the cone, staying inside all the way down until it settles and can't move any farther. The collision detection and force resolution of the physics engine takes care of the complexities. When it settles, it will be in a position of maximal coverage of the black polygon in a local sense. (If it settles with z<0, then coverage is 100%.) For the convex case it's probably a global maximum. To probabilistically improve the result for non-convex cases (like your example), you'd randomize the starting position, dropping the polygon many times, taking the best result. Note you don't really need a full blown physics engine (though they certainly exist in open source). It's enough to use collision resolution to tell you how to rotate and translate the rectangle in a pseudo-physical way as it twists and slides uniformly down the z axis as far as possible.
Different physics model. Suppose the black area is an attractive field generator in 2D, following the usual inverse-square rule like gravity and magnetism. Now let the rectangle drift in a damping medium, responding to this field. It ought to settle with a maximal area overlapping the black area. There are problems with "nulls", like at the center of a donut, but I don't think these can ever be stable equilibria. Can they? The simulation could easily be done by modeling both shapes as particle swarms. Or, since the rectangle is a simple shape and you are a physicist, you could come up with a closed form for the integral of attractive force between a point and the rectangle. This way only the black shape needs representation as particles. Come to think of it, if you can come up with a closed form for torque and linear attraction due to two triangles, then you can decompose both shapes with a (e.g. Delaunay) triangulation and get a precise answer. Unfortunately this discussion implies it can't be done analytically, so particle clouds may be the final solution. The good news is that modern processors, particularly GPUs, do very large particle computations with amazing speed. Edit: I implemented this quick and dirty. It works great for convex shapes, but concavities create stable points that aren't what you want.
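To make the particle-swarm variant concrete, here is a rough sketch (not the quick-and-dirty implementation mentioned above; all names are made up) of accumulating inverse-square attraction from the black shape's particles into a net force and torque on the rectangle:

type P = { x: number; y: number };

function forceAndTorque(rectPts: P[], blackPts: P[], center: P) {
  let fx = 0, fy = 0, torque = 0;
  for (const r of rectPts) {
    for (const b of blackPts) {
      const dx = b.x - r.x, dy = b.y - r.y;
      const d2 = dx * dx + dy * dy + 1e-6;  // softened to avoid blow-up
      const inv = 1 / (d2 * Math.sqrt(d2)); // unit direction times 1/d^2
      fx += dx * inv;
      fy += dy * inv;
      // Torque about the rectangle's center contributed by this pair.
      torque += (r.x - center.x) * dy * inv - (r.y - center.y) * dx * inv;
    }
  }
  return { fx, fy, torque };
}

Each damped step would then translate the rectangle by a small multiple of (fx, fy) and rotate it by a small multiple of the torque until it settles.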
This problem is related to robot path planning; looking at that literature may turn up some ideas. In RPP you have obstacles and a robot, and want to find a path the robot can travel while avoiding and/or sliding along them. If the robot is asymmetric and can rotate, then 2D planning is done in a 3D (toroidal) configuration space (C-space), where one dimension is rotation (so it closes on itself). The idea is to "grow" the obstacles in C-space while shrinking the robot to a point. Growing the obstacles is achieved by computing Minkowski differences. If you decompose all polygons into convex shapes, then there is a simple "edge merge" algorithm for computing the MD. When the C-space representation is complete, any 1D path that does not pierce the "grown" obstacles corresponds to a continuous translation/rotation of the robot in world space that avoids the original obstacles. For your problem the white area is the obstacle and the rectangle is the robot. You're looking for any open point at all; this would correspond to 100% coverage. For the less-than-100% case, the C-space would have to be a function on 3D that reflects how "bad" the intersection of the robot with the obstacle is, rather than just a binary value. You're looking for the least bad point. C-space representation is an open research topic. An octree might work here.
Lots of details to think through in both cases, and they may not pan out at all, but at least these are frameworks to think more about the problem. The physics idea is a bit like using simulated spring systems to do graph layout, which has been very successful.
I don't believe it is possible to find the precise maximum for this problem, so you will need to make do with an approximation.
You could potentially render the vector image into a bitmap and use Haar features for this - they provide a very quick O(1) way of calculating the average colour of a rectangular region.
You'd still need to perform this multiple times for different rotations and positions, but it would bring the algorithmic complexity down from a naive O(n^5) to O(n^3), which may be acceptably fast (with n here being the size of each degree of freedom you are scanning).
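A sketch of the summed-area table behind that O(1) rectangle average (function names are made up; rotation would still need the shape re-rendered, or one table per sampled angle):

function buildSAT(img: Float32Array, w: number, h: number): Float32Array {
  const sat = new Float32Array((w + 1) * (h + 1)); // one-pixel zero border
  for (let y = 1; y <= h; y++) {
    for (let x = 1; x <= w; x++) {
      sat[y * (w + 1) + x] =
        img[(y - 1) * w + (x - 1)] +
        sat[(y - 1) * (w + 1) + x] +
        sat[y * (w + 1) + (x - 1)] -
        sat[(y - 1) * (w + 1) + (x - 1)];
    }
  }
  return sat;
}

// Sum over the half-open rectangle [x0, x1) x [y0, y1) in four lookups.
function rectSum(sat: Float32Array, w: number,
                 x0: number, y0: number, x1: number, y1: number): number {
  const W = w + 1;
  return sat[y1 * W + x1] - sat[y0 * W + x1]
       - sat[y1 * W + x0] + sat[y0 * W + x0];
}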
Have you thought about keeping track of the remaining white space inside the blocks, with something like if whitespace !== 0?

Rendering a massive amount of data

I have a 3D floating-point matrix; in the worst-case scenario its size could be 200000x1000000x100. I want to visualize this matrix using Qt/OpenGL.
Since the number of elements is extremely high, I want to render them in a way that, when the camera is far away from the matrix, shows just a number of interesting points that give an approximation of what the matrix looks like. As the camera gets closer, I want more detail, and hence more elements to be calculated.
I would like to know if there are techniques that deal with this kind of visualization.
The general idea is called level-of-detail rendering and is a whole science in itself.
For your domain I would recommend two steps:
1) Reduce the number of cells by averaging them (arithmetic mean) in cubes of different sizes, and cache those cubes (on disk as well as in RAM). "Different" here means that you have the same data in multiple cube sizes, e.g. coarse-grained cubes of 10000x10000x10000 cells and finer cubes of 100x100x100 cells, resulting in multiple levels of detail. You have to organize these in a hierarchical structure (the larger ones containing multiple smaller ones), and for this I would recommend an octree:
http://en.wikipedia.org/wiki/Octree
2) The second step is to actually render parts of this octree:
To do this, use the distance from your camera point to the sub-cubes: go through the cubes and, based on this distance function and heuristically chosen or guessed threshold values, decide either to enter a sub-cube or to render the larger cube.
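A minimal sketch of this distance-based descent (names and the threshold heuristic are made up):

interface OctreeNode {
  center: [number, number, number];
  halfSize: number;              // half the cube's edge length
  children: OctreeNode[] | null; // null for leaf cubes
}

function renderLOD(node: OctreeNode, cam: [number, number, number],
                   detail: number, drawCube: (n: OctreeNode) => void): void {
  const dx = node.center[0] - cam[0];
  const dy = node.center[1] - cam[1];
  const dz = node.center[2] - cam[2];
  const dist = Math.sqrt(dx * dx + dy * dy + dz * dz);
  // Far away (or a leaf): draw the averaged coarse cube. Near: descend.
  if (node.children === null || dist > detail * node.halfSize) {
    drawCube(node);
  } else {
    for (const c of node.children) renderLOD(c, cam, detail, drawCube);
  }
}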
(2) can be further optimized, though this is optional: to speed up the rendering, organize the to-be-rendered cubes into layers. The direction of the layers (whether in x, y, or z slices) depends on your camera viewpoint, to which they should be near-perpendicular. Then render each slice into a texture, and voila, you only have to render a single textured quad per slice; 1000 quads are no problem to render.
Qt has some ways of rendering a huge number of elements efficiently. Check the examples/demos that are part of Qt.

2D orbital physics

I'm working on a 2D physics engine for a game. I have gravity and masses working, using a simple iterative approach (that I know I'll have to upgrade eventually); I can push the masses around manually and watch them move and it all works as I'd expect.
Right now I'm trying to set up the game world in advance with a satellite in a simple circular orbit around a planet. To do this I need to calculate the initial velocity vector of the satellite given the mass of the planet and the desired distance out; this should be trivial, but I cannot for the life of me get it working right.
Standard physics textbooks tell me that the orbital velocity of an object in circular orbit around a mass M is:
v = sqrt( G * M / r )
However, after applying the appropriate vector, the satellite isn't going anything like fast enough and falls inward in a sharply elliptical orbit. Random tinkering shows that it's off by about a factor of 3 in one case.
My gravity simulation code is using the traditional:
F = G M m / r^2
G is set to 1 in my universe.
Can someone confirm to me that these equations do still hold in 2D space? I can't see any reason why not, but at this point I really want to know whether the problem is in my code or my assumptions...
Update: My physics engine works as follows:
for each time step of length t:
    reset cumulative forces on each object to 0.
    for each unique pair of objects:
        calculate force between them due to gravity.
        accumulate force to the two objects.
    for each object:
        calculate velocity change dV for this timestep using Ft / m.
        v = v + dV.
        calculate position change dS using v * t.
        s = s + dS.
(Using vectors where appropriate, of course.)
Right now I'm doing one physics tick every frame, which is happening about 500-700 times per second. I'm aware that this will accumulate errors very quickly, but it should at least get me started.
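For reference, a sketch of the loop above plus the circular-orbit launch condition, with G = 1 and made-up names (velocity is updated before position, matching the update order described):

type Body = { m: number; x: number; y: number; vx: number; vy: number };

const G = 1;

function step(bodies: Body[], t: number): void {
  const fx = new Array<number>(bodies.length).fill(0);
  const fy = new Array<number>(bodies.length).fill(0);
  // Accumulate gravity over each unique pair: F = G M m / r^2.
  for (let i = 0; i < bodies.length; i++) {
    for (let j = i + 1; j < bodies.length; j++) {
      const dx = bodies[j].x - bodies[i].x, dy = bodies[j].y - bodies[i].y;
      const r2 = dx * dx + dy * dy, r = Math.sqrt(r2);
      const f = (G * bodies[i].m * bodies[j].m) / r2;
      fx[i] += f * dx / r; fy[i] += f * dy / r;
      fx[j] -= f * dx / r; fy[j] -= f * dy / r;
    }
  }
  for (const [i, b] of bodies.entries()) {
    b.vx += (fx[i] / b.m) * t; // dV = F t / m
    b.vy += (fy[i] / b.m) * t;
    b.x += b.vx * t;           // dS = v t
    b.y += b.vy * t;
  }
}

// Launch a satellite at center-to-center distance r, moving tangentially:
const planet: Body = { m: 1e6, x: 0, y: 0, vx: 0, vy: 0 };
const r = 100;
const sat: Body = { m: 1, x: r, y: 0, vx: 0, vy: Math.sqrt(G * planet.m / r) };

If the orbit is still far from circular with a setup like this, the suspects raised in the answers below (r measured to the planet's surface, or too large a time step) are worth checking first.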
(BTW, I was unable to find an off-the-shelf physics engine that handles orbital mechanics --- most 2D physics engines like Chipmunk and Box2D are focused on rigid bodies instead. Can anyone suggest one I could look at?)
You need to make sure that your iterative time step delta t is small enough. You will definitely have to tinker with the constants in order to get the behaviour you expect. Iterative simulation, in your case and most cases, is a form of numerical integration, where errors build up fast and unpredictably.
Yes, these equations hold in 2D space, because your 2D space is just a 2D representation of a 3D world. (A "real" 2D universe would have different equations, but that's not relevant here.)
A long shot: Are you perhaps using distance to the surface of the planet as r?
If that isn't it, try cutting your time step in half; if that makes a big difference, keep reducing it until the behavior stops changing.
If that makes no difference, try setting the initial velocity to zero, then watch it fall for a few iterations and measure its acceleration to see if it's GM/r^2. If the answer still isn't clear, post the results and we'll try to figure it out.
