Problem with huge objects in a quad tree - 2d

Let's say I have circular objects. Each object has a diameter of 64 pixels.
The cells of my quad tree are, let's say, 96x96 pixels.
Everything will be fine and working well when I check collision from the cell a circle is residing in + all its neighbor cells.
BUT what if I have one circle that has a diameter of 512 pixels? It would cover many cells and thus this would be a problem when checking only the neighbor cells. But I can't re-size my quad-tree-grid every time a much larger object is inserted into the tree...

Instead of putting objects into a single cell, put them in all cells they collide with. That way you can just test each cell individually. Use pointers to the objects so you don't create copies. Also, you only need to do this with leaf nodes, so there is no need to combine data contained in higher nodes with lower ones.
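A minimal sketch of that idea, assuming the leaf cells form a uniform grid of 96x96 pixels as in the question and using the circle's bounding box as a conservative overlap test. The names `cells_overlapped` and `insert`, and the dict-based grid, are made up for illustration:

```python
import math

CELL_SIZE = 96  # cell edge length in pixels, taken from the question


def cells_overlapped(cx, cy, radius, cell_size=CELL_SIZE):
    """Return the (column, row) index of every cell touched by the circle's bounding box."""
    min_col = math.floor((cx - radius) / cell_size)
    max_col = math.floor((cx + radius) / cell_size)
    min_row = math.floor((cy - radius) / cell_size)
    max_row = math.floor((cy + radius) / cell_size)
    return [(col, row)
            for row in range(min_row, max_row + 1)
            for col in range(min_col, max_col + 1)]


def insert(grid, obj_id, cx, cy, radius):
    """Store a reference to the object in every overlapped cell (no copies of the object)."""
    for cell in cells_overlapped(cx, cy, radius):
        grid.setdefault(cell, []).append(obj_id)
```

With these numbers a 64-pixel circle in the middle of a cell lands in a single cell, while a 512-pixel circle lands in a few dozen cells. The bounding box slightly over-reports the corner cells, which costs a few extra narrow-phase tests but never misses a collision.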

This is an interesting problem. Maybe you can extend the node or the cell with tree-height information? If you have an object bigger than the smallest cell, nest it at the matching tree height. That's what map applications like Google Maps or Bing Maps do.
Here is a link to a similar solution: http://www.gamedev.net/topic/588426-2d-quadtree-collision---variety-in-size. I was confusing the screen with the quadtree. You can check collisions with a simple recursion.

Oversearching
During the search, and starting with the largest objects first...
Test Object.Position.X against QuadTreeNode.Centre.X, and also
test Object.Position.Y against QuadTreeNode.Centre.Y;
... Then take the Absolute value of each difference, and treat the object as ALSO lying within the child node on the far side of that dividing line whenever the absolute value is NOT more than the radius of the object...
... that is, when some portion of the object intrudes into that quad : )
The same can be done with AABB (Axis Aligned Bounding Boxes)
The only real caveat here is that VERY large objects that cover most of the screen, will force a search of the entire tree. In these cases, a different approach may be called for.
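The centre test above can be sketched roughly as follows. This is only an illustration: `quadrants_touched` is a made-up helper, and the quadrants are labelled by sign pairs (-1 for the low side, +1 for the high side of the node centre) purely for convenience.

```python
def quadrants_touched(obj_x, obj_y, radius, centre_x, centre_y):
    """Return the child quadrants, as (x_side, y_side) sign pairs, that the object may intrude into."""

    def sides(delta):
        # Closer to the dividing line than the radius: the object reaches both sides.
        if abs(delta) <= radius:
            return (-1, 1)
        # Otherwise it lies wholly on one side of the line.
        return (1,) if delta > 0 else (-1,)

    return [(sx, sy)
            for sx in sides(obj_x - centre_x)
            for sy in sides(obj_y - centre_y)]
```

A search then recurses into every quadrant in the returned list instead of just one. An object that covers the node centre returns all four quadrants, which is the "search the entire tree" case mentioned above.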
Of course, this only takes care of the object that everything else is being tested against. To ensure that all the other large objects in the world are properly identified, you will need to alter your quadtree slightly...
Use Multiple Appearances
In this variation on the QuadTree we ONLY place objects in the leaf nodes of the QuadTree, as pointers. Larger objects may appear in multiple leaf nodes.
Since some objects have multiple appearances in the tree, we need a way to avoid them once they've already been tested against.
So...
A simple Boolean WasHit flag can avoid testing the same object multiple times in a hit-test pass... and a 'cleanup' can be run on all 'hit' objects so that they are ready for the next test.
Whilst this makes sense, it is wasteful when performing all-vs-all hit-tests.
So... Getting a little cleverer, we can avoid having any cleanup at all by using a pointer 'ptrLastObjectTestedAgainst' inside each object in the scene. This avoids re-testing the same object during this run (the pointer is set on the first encounter).
It does not require resetting when testing a new object against the scene, because the new object has a different pointer value than the last one, whereas a simple Bool flag would have to be cleared every time.
I've used the latter approach in scenes with vastly different object sizes and it worked well.
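Here is a rough sketch of the pointer trick, with Python object identity standing in for the pointer comparison. `SceneObject`, `last_tested_against` and `candidates` are illustrative names, and the leaves are plain lists of object references:

```python
class SceneObject:
    def __init__(self, name):
        self.name = name
        self.last_tested_against = None  # plays the role of ptrLastObjectTestedAgainst


def candidates(query, leaves):
    """Yield each object stored in the given leaves once, even if it appears in several."""
    for leaf in leaves:
        for obj in leaf:
            # Skip the query itself and anything already stamped during this query.
            if obj is query or obj.last_tested_against is query:
                continue
            obj.last_tested_against = query  # stamp on first encounter
            yield obj
```

One thing to watch: running the same query object through the scene twice in a row would find everything already stamped, so in that case a per-pass counter would be a safer marker than the object itself.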
Elastic QuadTrees
I've also used an 'elastic' QuadTree. Basically, you set a limit on how many items can IDEALLY fit in each QuadTreeNode - But, unlike a standard QuadTree, you allow the code to override this limit in specific cases.
The overriding rule here is that an object may NOT be placed into a Node that cannot hold it ENTIRELY... with the top node catching any objects that are larger than the screen.
Thus, small objects will continue to 'fall through' to form a regular QuadTree but large objects will not always fall all the way through to the leaf node - but will instead expand the node that last fitted them.
Think of the non-leaf nodes as 'sieving' the objects as they fall down the tree
This turns out to be a very efficient choice for many scenarios : )
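A minimal sketch of such an elastic node, under a few assumptions of mine: objects are axis-aligned boxes given as (x, y, width, height), nodes are squares, and `capacity` and `min_size` are illustrative parameters.

```python
class Node:
    def __init__(self, x, y, size, capacity=4, min_size=16):
        self.x, self.y, self.size = x, y, size  # top-left corner and edge length
        self.capacity = capacity                # IDEAL item count, may be exceeded
        self.min_size = min_size                # nodes smaller than this never split
        self.items = []
        self.children = None

    def contains(self, box):
        bx, by, bw, bh = box
        return (self.x <= bx and self.y <= by and
                bx + bw <= self.x + self.size and
                by + bh <= self.y + self.size)

    def insert(self, box):
        if self.children is not None:
            for child in self.children:
                if child.contains(box):  # fits ENTIRELY: keep falling through
                    child.insert(box)
                    return
            self.items.append(box)       # no child can hold it: expand this node
            return
        self.items.append(box)
        if len(self.items) > self.capacity and self.size / 2 >= self.min_size:
            self.split()

    def split(self):
        half = self.size / 2
        self.children = [Node(self.x + dx, self.y + dy, half,
                              self.capacity, self.min_size)
                         for dy in (0, half) for dx in (0, half)]
        pending, self.items = self.items, []
        for box in pending:              # re-sieve: small boxes drop, large ones stay
            self.insert(box)
```

Because an object is only ever stored in one node, there is no duplicate handling to do; the price is that a query has to test the items of every node on the way down, not just those in the leaves.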
Conclusion
Remember that these standard algorithms are useful general tools, but they are not a substitute for thinking about your specific problem. Do not fall into the trap of using a specific algorithm or library 'just because it is well known' ... your application is unique, and it may benefit from a slightly different approach.
Therefore, don't just learn to apply algorithms ... learn from those algorithms, and apply the principles themselves in novel and fitting ways. These are NOT the only tools, nor are they necessarily the best fit for your application.
Hope some of those ideas helped.

Related

Using two or more index buffers when creating custom geometry with Qt 3D? [duplicate]

I have some vertex data. Positions, normals, texture coordinates. I probably loaded it from a .obj file or some other format. Maybe I'm drawing a cube. But each piece of vertex data has its own index. Can I render this mesh data using OpenGL/Direct3D?
In the most general sense, no. OpenGL and Direct3D only allow one index per vertex; the index fetches from each stream of vertex data. Therefore, every unique combination of components must have its own separate index.
So if you have a cube, where each face has its own normal, you will need to replicate the position and normal data a lot. You will need 24 positions and 24 normals, even though the cube will only have 8 unique positions and 6 unique normals.
Your best bet is to simply accept that your data will be larger. A great many model formats will use multiple indices; you will need to fixup this vertex data before you can render with it. Many mesh loading tools, such as Open Asset Importer, will perform this fixup for you.
It should also be noted that most meshes are not cubes. Most meshes are smooth across the vast majority of vertices, only occasionally having different normals/texture coordinates/etc. So while this often comes up for simple geometric shapes, real models rarely have substantial amounts of vertex duplication.
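The fixup step mentioned above can be sketched in a few lines. This is an illustration rather than what any particular loader does: `unify_indices` is a made-up name, and each face corner is assumed to arrive as a (position index, normal index) pair, as it would from an .obj-style file.

```python
def unify_indices(positions, normals, corners):
    """Collapse per-attribute indices into a single index per unique combination.

    corners: list of (position_index, normal_index) pairs, one per face corner.
    Returns (out_positions, out_normals, indices) ready for single-index rendering.
    """
    remap = {}  # (position_index, normal_index) -> new vertex index
    out_positions, out_normals, indices = [], [], []
    for pair in corners:
        if pair not in remap:
            # First time this combination is seen: emit a new vertex.
            remap[pair] = len(out_positions)
            out_positions.append(positions[pair[0]])
            out_normals.append(normals[pair[1]])
        indices.append(remap[pair])
    return out_positions, out_normals, indices
```

For a cube with 8 positions and 6 face normals this yields the 24 vertices mentioned above, since each position is used with three different normals. On a smooth mesh most corners share the same combination, so far fewer vertices get duplicated.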
GL 3.x and D3D10
For D3D10/OpenGL 3.x-class hardware, it is possible to avoid performing fixup and use multiple indexed attributes directly. However, be advised that this will likely decrease rendering performance.
The following discussion will use the OpenGL terminology, but Direct3D v10 and above has equivalent functionality.
The idea is to manually access the different vertex attributes from the vertex shader. Instead of sending the vertex attributes directly, the attributes that are passed are actually the indices for that particular vertex. The vertex shader then uses the indices to access the actual attribute through one or more buffer textures.
Attributes can be stored in multiple buffer textures or all within one. If the latter is used, then the shader will need an offset to add to each index in order to find the corresponding attribute's start index in the buffer.
Regular vertex attributes can be compressed in many ways. Buffer textures have fewer means of compression, allowing only a relatively limited number of vertex formats (via the image formats they support).
Please note again that any of these techniques may decrease overall vertex processing performance. Therefore, it should only be used in the most memory-limited of circumstances, after all other options for compression or optimization have been exhausted.
OpenGL ES 3.0 provides buffer textures as well. Higher OpenGL versions allow you to read buffer objects more directly via SSBOs rather than buffer textures, which might have better performance characteristics.
I found a way that allows you to reduce this sort of repetition that runs a bit contrary to some of the statements made in the other answer (but doesn't specifically fit the question asked here). It does however address my question which was thought to be a repeat of this question.
I just learned about interpolation qualifiers, specifically "flat". It's my understanding that putting the flat qualifier on your vertex shader output causes only the provoking vertex to pass its values to the fragment shader.
This means for the situation described in this quote:
So if you have a cube, where each face has its own normal, you will need to replicate the position and normal data a lot. You will need 24 positions and 24 normals, even though the cube will only have 8 unique positions and 6 unique normals.
You can have 8 vertices, 6 of which carry the unique normals while the normal values of the other 2 are disregarded, so long as you carefully order your primitives' indices such that the "provoking vertex" contains the normal data you want to apply to the entire face.
EDIT: My understanding of how it works:

Required Data for IFC

I'm working on a project where I need to generate an IFC file, and am given not much more information than geometry (I have access to the density and heat-conductivity of materials, and basic labeling for Objects).
So far I could only find what IFC can store, never what IFC needs to store.
What do I need to include in an IFC file so it is properly functional?
What does an IFC file need besides basic geometry?
Disclaimer: I have not read (or bought) the standard. My knowledge primarily stems from working with IFC files, trying different things, and reading the buildingSMART documentation. So I can't give you a hard guarantee, but I am rather confident my information is correct/usable.
As an alternative to buying the official standards file, you could look into the official documentation by buildingSMART. (Also have a look here for more general information and availability of other/more modern releases).
Now assuming you are familiar with the basic STEP file layout (header and data segment), let's jump to what an IFC file absolutely has to include to be considered correct (as far as I understand the documentation; there might be parsers/loaders which can load incorrect/incomplete files, but we aren't aiming for them). Also note I am building this example for IFC 4.0. This should be correct for the current IFC 4.1 standard, but probably not for the older IFC2X3 standard (there have been some relaxations in IFC4 from IFC2X3). Also I am skipping on names and descriptions - you can set those fields for testing to recognize your structures in a viewer (it's easier than comparing GUIDs).
IfcProject
The root of all elements is the IfcProject. It also contains most basic properties and definitions for all other elements. The attributes required per documentation on this entity are only the unique id. But for a working example you usually also need a minimal unit assignment and representation context.
#20= IFCPROJECT('344O7vICcwH8qAEnwJDjSU',$,$,$,$,$,$,(#19),#13);
In the unit assignment you define the required units, from geometric units to monetary, thermal, etc. The minimum is length, area and angle to meaningfully define geometric items. So for our example we include only those: metre as length, square metre as area and radian as angle. If you need foot, inch or degree, you can define those as conversion-based units (IfcConversionBasedUnit) on top of the SI units.
#10= IFCSIUNIT(*,.LENGTHUNIT.,$,.METRE.);
#11= IFCSIUNIT(*,.AREAUNIT.,$,.SQUARE_METRE.);
#12= IFCSIUNIT(*,.PLANEANGLEUNIT.,$,.RADIAN.);
#13= IFCUNITASSIGNMENT((#10,#11,#12));
The representation context defines the basic coordinate system for a given class of representations (=geometric/parametric descriptions). So the simple case would be a 3-dimensional right-handed system at point zero. IFC works with the z-axis pointing up - this might be important if you are working with models/files originating from 3D/OpenGL applications, which usually assume the y-axis pointing upwards. You also need a precision value - I am using 1.0e-5 here, but you might want to test whether you can go with less or need more. The precision is usually applied when comparing points/edges when combining geometry (during constructive solid geometry steps). If you have errors, try a different precision value.
The second attribute of the representation context is the context type. This is a string identifying on which representations this context should be applied. The documentation states that values are based on "implementers agreement" - which means AFAIK "look what the others are using". From my experience using "Model" works for 3D geometry. Using "Plan" for 2D plans and sketches should work, too.
#14= IFCDIRECTION((1.,0.,0.));
#15= IFCDIRECTION((0.,0.,1.));
#16= IFCCARTESIANPOINT((0.,0.,0.));
#17= IFCAXIS2PLACEMENT3D(#16,#15,#14);
#18= IFCDIRECTION((0.,1.));
#19= IFCGEOMETRICREPRESENTATIONCONTEXT($,'Model',3,1.0E-5,#17,#18);
Spatial container for elements
Elements can't be added to the IfcProject directly - they need to be placed into a spatial element which is contained in the project. There are three possible choices: IfcSite, IfcBuilding and IfcSpatialZone (see section Spatial Decomposition on the IfcProject page). The IfcSpatialZone is defined as non-hierarchical spatial element - its usage is slightly different from the other two (elements are added using a different relation).
A single site is sufficient as spatial container. Adding all elements to it might be semantically vague (mostly fences are directly added to it; other elements are usually inside a building) but not incorrect. (IFC does not care if you have electrical appliances in your garden.) As nearly all attributes of IfcSite are optional, we can skip those. But beware: if you give your site a representation (=some geometric shape), you will need to include a placement for it. The site is aggregated into the project to relate the two.
#30= IFCSITE('20FpTZCqJy2vhVJYtjuIce',$,$,$,$,$,$,$,.ELEMENT.,$,$,$,$,$);
#31= IFCRELAGGREGATES('0Du7$nzQXCktKlPUTLFSAT',$,$,$,#20,(#30));
Elements
Actually, that is all that is needed as an absolute minimum structure. Now you can add your elements - entities of some type derived from IfcProduct. As all those elements have some sort of meaning attached to them, you either need to select the types that most closely match the objects you have, or you might want to use IfcBuildingElementProxy, which is the most "meaningless" (or better: carries no specialized semantic meaning) object type. The following code places one proxy without geometry. Out of convenience, the placement reuses the axis placement (#17) that already defines the coordinate system of the representation context, as it doesn't transform or move anything. Your geometry would be added through a product definition shape, which has shape representations and finally some geometry items. The buildingSMART documentation has a few examples with assigned geometry.
#40= IFCLOCALPLACEMENT($,#17);
#41= IFCBUILDINGELEMENTPROXY('3W29Drc$H6CxK3FGIxjJNl',$,$,$,$,#40,$,$,.NOTDEFINED.);
#42= IFCRELCONTAINEDINSPATIALSTRUCTURE('04ldtj6cp2dME6CiP80Bzh',$,$,$,(#41),#30);
Conclusion
So there isn't much needed as bare minimum to add elements:
a project
basic unit definitions
one spatial container
The complete example file would be:
ISO-10303-21;
HEADER;
FILE_DESCRIPTION(('IFC4'),'2;1');
FILE_NAME('example.ifc','2018-08-8',(''),(''),'','','');
FILE_SCHEMA(('IFC4'));
ENDSEC;
DATA;
#10= IFCSIUNIT(*,.LENGTHUNIT.,$,.METRE.);
#11= IFCSIUNIT(*,.AREAUNIT.,$,.SQUARE_METRE.);
#12= IFCSIUNIT(*,.PLANEANGLEUNIT.,$,.RADIAN.);
#13= IFCUNITASSIGNMENT((#10,#11,#12));
#14= IFCDIRECTION((1.,0.,0.));
#15= IFCDIRECTION((0.,0.,1.));
#16= IFCCARTESIANPOINT((0.,0.,0.));
#17= IFCAXIS2PLACEMENT3D(#16,#15,#14);
#18= IFCDIRECTION((0.,1.));
#19= IFCGEOMETRICREPRESENTATIONCONTEXT($,'Model',3,1.0E-5,#17,#18);
#20= IFCPROJECT('344O7vICcwH8qAEnwJDjSU',$,$,$,$,$,$,(#19),#13);
#30= IFCSITE('20FpTZCqJy2vhVJYtjuIce',$,$,$,$,$,$,$,.ELEMENT.,$,$,$,$,$);
#31= IFCRELAGGREGATES('0Du7$nzQXCktKlPUTLFSAT',$,$,$,#20,(#30));
#40= IFCLOCALPLACEMENT($,#17);
#41= IFCBUILDINGELEMENTPROXY('3W29Drc$H6CxK3FGIxjJNl',$,$,$,$,#40,$,$,.NOTDEFINED.);
#42= IFCRELCONTAINEDINSPATIALSTRUCTURE('04ldtj6cp2dME6CiP80Bzh',$,$,$,(#41),#30);
ENDSEC;
END-ISO-10303-21;
Note that loading this file in a viewer doesn't show anything, because it doesn't contain any geometry. Also please note that I have not yet verified that it is error free - I currently don't have my IFC tools at hand. (If you would like to verify your files, have a look at stepcode, which can check whether your files are syntactically correct - it won't check semantic meaning or enforcement of the concepts mentioned in the buildingSMART documentation.)
Also good to know: the order of instance ids (like #20) can be freely arranged - you can reference elements that you add later in the file, and the ids only need to be unique within this one file. This means the lines of the example file can be shuffled and it is still a valid file - parsers usually use a two-step approach to create an in-memory representation (1. parse into IFC classes, 2. resolve references).
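To illustrate the two-step idea, here is a deliberately naive sketch. It is not how any particular IFC toolkit works: it assumes one entity per line in the simple `#id= TYPE(args);` form used in the example file, and the function names are made up.

```python
import re

ENTITY = re.compile(r"#(\d+)\s*=\s*([A-Z0-9_]+)\s*\((.*)\);")
STRING = re.compile(r"'(?:[^']|'')*'")  # quoted STEP string, '' is an escaped quote
REFERENCE = re.compile(r"#(\d+)")


def parse_entities(data_lines):
    """Step 1: map each instance id to its (type name, raw argument string)."""
    entities = {}
    for line in data_lines:
        match = ENTITY.match(line.strip())
        if match:
            entities[int(match.group(1))] = (match.group(2), match.group(3))
    return entities


def unresolved_references(entities):
    """Step 2: list (id, referenced id) pairs whose target does not exist in the file."""
    missing = []
    for ident, (_, args) in sorted(entities.items()):
        args_without_strings = STRING.sub("''", args)  # ignore '#' inside strings
        for ref in REFERENCE.findall(args_without_strings):
            if int(ref) not in entities:
                missing.append((ident, int(ref)))
    return missing
```

Because all ids are collected before any reference is looked at, shuffling the DATA lines gives exactly the same result. It also makes a handy sanity check for generated files, since a dangling reference shows up as a non-empty list.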

vector<vector> as a quick-traversal 2d data structure

I'm currently considering the implementation of a 2D data structure to allow me to store and draw objects in correct Z-Order (GDI+, entities are drawn in call order). The requirements are loosely:
Ability to add new objects to the top of any depth index
Ability to remove arbitrary object
(Ability to move object to the top of new depth index, accomplished by 2 points above)
Fast in-order and reverse-order traversal
As the main requirement is speed of traversal across the full data, the first thing that came to mind was an array like structure, eg. vector. It also easily allows for pushing new objects (removing objects not so great..). This works perfectly fine for our requirements, as it just so happens that the bulk of drawable entities don't change, and the ones that do sit at the top end of the order.
However it got me thinking of the implications for more dynamic requirements:
A vector will resize itself as required -> as the 'depth' vectors would need to be maintained contiguously in memory (top-level vector enforces it), this could lead to some pretty expensive vector resizes. Worst case all vectors need to be moved to new memory location, average case requiring all vectors up the chain to be moved.
Vectors will often hold a buffer at the end for adding new objects -> traversal could still easily force a cache miss while jumping between 'depth' vectors, rendering the top-level vector's contiguous memory less beneficial
Could someone confirm that these observations are indeed correct, making a vector a mostly very expensive structure for storing larger dynamic data sets?
From my thoughts above, I end up deducing that while traversing the whole dataset, specifically jumping between different vectors in the top-level vector, you might as well use any other data structure with inferior traversal complexity, or similar random access complexity (linked_list; map). Traversal would effectively be the same, as we might as well assume the cache misses will happen anyway, and we save ourselves a lot of bother by not keeping the depth vectors contiguously in memory.
Would that indeed be a good solution? If I'm not mistaken, on a 1D problem space, this would come down to what's more important traversal or addition/removal, vector or linked-list. On a 2D space I'm not so sure it is so black and white.
I'm wondering what sort of application requires good traversal across a 2D space, without compromising data addition/removal, and what sort of data structures are used there.
P.S. I just noticed I'm completely ignoring space-complexity, so might as well keep on ignoring it (unless you feel like adding more insight :D)
Your first assumption is somewhat incorrect.
Instead of thinking of vectors as the blob of memory itself, think of it as a pointer to automatically managed blob of memory and some metadata to keep track of it. A vector itself is a fixed size, the memory it keeps track of isn't. (See this example, note that the size of the vector object is constant: https://ideone.com/3mwjRz)
A vector of vectors can be thought of as an array of pointers. Resizing what the pointers point to doesn't mean you need to resize the array that contains them. The promise of items being contiguous still holds: the parent array has all of the pointers adjacent to each other and each pointer points to a contiguous chunk of memory. However, it's not guaranteed that the end of arr[0][N-1] is adjacent to the beginning of arr[1][0]. (To this end, your second point is correct.)
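As a rough illustration of that layout, here is a sketch using Python lists as stand-ins for vector<vector> (an assumption on my part, since the question is about C++; a Python list is likewise a small handle to a separately allocated, growable block). `DepthLayers` and its methods are made-up names covering the requirements from the question:

```python
class DepthLayers:
    def __init__(self):
        self.layers = []  # layers[d] holds the objects at depth index d, bottom first

    def add_to_top(self, depth, obj):
        while len(self.layers) <= depth:
            self.layers.append([])
        self.layers[depth].append(obj)  # grows only this inner list

    def remove(self, obj):
        for layer in self.layers:
            if obj in layer:
                layer.remove(obj)       # linear in the size of one layer
                return True
        return False

    def move_to_top(self, obj, new_depth):
        if self.remove(obj):
            self.add_to_top(new_depth, obj)

    def back_to_front(self):
        for layer in self.layers:
            yield from layer

    def front_to_back(self):
        for layer in reversed(self.layers):
            yield from reversed(layer)
```

Growing one inner list never moves the others, which is the point made above; only the outer list of handles has to stay contiguous.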
I guess that a linked list would be more appropriate, as you will always be traversing the whole list (vectors are good for random access). Linked-list insertion and removal are very cheap, and the traversal isn't that different from a vector traversal. Maybe you should consider a doubly linked list, as you want to traverse it in both directions.

random memory access and bank conflict

These days I'm trying to program on a mobile GPU (Adreno).
The algorithm I use for image processing has 'randomness' in its memory access.
It refers to some pixels within a 'fixed' range for filtering,
BUT I can't know exactly which pixels will be referenced (it depends on the image).
As far as I understand, if multiple threads access the same local memory bank,
it causes a bank conflict, so my case should produce bank conflicts.
My question: can I eliminate bank conflicts with random memory access?
Or can I at least reduce them?
Assuming that the distances of your randomly accessed pixels are roughly normally distributed, you could think of tiling your image into subimages.
What I mean: instead of working with a (let's say) 1024x1024 image, you might have 4x4 subimages of size 256x256. Each of them is kept together in memory, so "near" pixel accesses stay within the same image object. Only the long-distance operations need to access different subimages.
A second option: instead of using CLImage objects, try to store your data in a plain array. The data in the array can be laid out in Z-order (Morton) curve order. This also keeps spatially close pixels closer together in memory (compared to row-major order).
But of course, this depends strongly on your image size.
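For reference, the Z-order index of a pixel is obtained by interleaving the bits of its x and y coordinates. A small sketch, assuming coordinates below 65536 (the function names are illustrative; on the GPU the same bit tricks would be written in the kernel language):

```python
def part1by1(n):
    """Spread the lower 16 bits of n so that a zero bit sits between each of them."""
    n &= 0x0000FFFF
    n = (n | (n << 8)) & 0x00FF00FF
    n = (n | (n << 4)) & 0x0F0F0F0F
    n = (n | (n << 2)) & 0x33333333
    n = (n | (n << 1)) & 0x55555555
    return n


def morton_index(x, y):
    """Array position of pixel (x, y): x bits on even positions, y bits on odd ones."""
    return part1by1(x) | (part1by1(y) << 1)
```

Every aligned 2x2, 4x4, 8x8, ... block of pixels ends up in one contiguous run of the array, which is what keeps neighbouring pixels close in memory.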
There are a variety of ways to deal with bank conflicts: changing the size of the elements you are working with, changing the stride between lines, and shifting the coordinates around to different memory addresses. It is never going to be as good as non-random, conflict-free access, though, so you will see significantly different compute times depending on the image.
See http://cuda-programming.blogspot.com/2013/02/bank-conflicts-in-shared-memory-in-cuda.html
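To make the stride point concrete, here is a toy model of the banking scheme described in the linked CUDA article: 32 banks, with consecutive 4-byte words going to consecutive banks. Both numbers are assumptions taken from that model, and Adreno hardware may be organised differently. The helper names are made up.

```python
from collections import Counter

NUM_BANKS = 32  # assumption borrowed from the CUDA model; check your GPU's documentation


def bank_of(word_index, num_banks=NUM_BANKS):
    return word_index % num_banks


def worst_conflict(word_indices, num_banks=NUM_BANKS):
    """Largest number of DISTINCT words in one access that fall into the same bank."""
    counts = Counter(bank_of(w, num_banks) for w in set(word_indices))
    return max(counts.values())


def column_access(row_stride, column=0, rows=32):
    """Word indices touched when `rows` threads each read the same column of a 2D tile."""
    return [r * row_stride + column for r in range(rows)]
```

In this model, reading a column of a tile with a row stride of 32 words puts all 32 accesses into one bank, while padding each row to 33 words spreads them over 32 different banks. With truly random access the same function can be used to estimate how bad a typical access pattern is for a given layout.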

Best solution for 2D occlusion culling

In my 2D game, I have static and dynamic objects. There can be multiple cameras. My problem: Determine objects that intersect with the current camera's view rectangle.
Currently, I simply iterate over all existing objects (not caring whether dynamic or static) and do an AABB check with the camera's view rect on them. This seems acceptable for very dynamic objects, but not for static objects, where there can be tens of thousands of them (static level geometry scattered over the whole scene).
I have looked into multiple data structures which could solve my problem:
Quadtree
This was the first thing I considered, however the problem is that it would force my scenes to be of fixed size. (Acceptable for static, but not for dynamic objects)
Dynamic AABB tree
Seems good, but the overhead for rebalancing it seems just too great for many dynamic objects.
Spatial hash
The main problem here for me was that if you zoom out with the camera a lot, a huge number of mostly non-existing spatial hash buckets had to be queried, causing low performance.
In general, my criteria for a good solution to this problem are:
Dynamic size: The solution must not cause the scene size to be limited, or require heavy recomputation for resizing
Good query performance (for the camera)
Good support of very dynamic objects: The computations needed to handle objects with constantly changing position should be cheap.
The maximum sane number of dynamic objects in my game at one time is probably around 5000. Consider that they all change their position every frame. Is there even a data structure which can be faster, considering the frequent insertions and deletions, than comparing the AABBs of the objects with the camera every frame?
Don't try to find the silver bullet. Just split your scene into dynamic and static parts and use different algorithms for them.
Quad trees are obviously suitable for static geometry with fixed bounds.
Spatial hashes are ideal for sets of objects with similar sizes (particle systems, for example).
AFAIK dynamic AABB trees are rarely used for occlusion culling; their main purpose is the broad phase of collision detection.
And as you noticed, brute-force culling is normal for dynamic objects if the number of them is not really big.
"static level geometry scattered over the whole scene"
If your scene is highly sparse, you can divide it into islands, i.e. create a list of scene parts with "good density".
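For the dynamic part, the brute-force check is about as simple as it gets. A sketch, assuming boxes are stored as (min_x, min_y, max_x, max_y) tuples (the function names are illustrative):

```python
def overlaps(a, b):
    """True when two (min_x, min_y, max_x, max_y) rectangles intersect or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def visible_dynamic(camera_rect, dynamic_boxes):
    """Indices of the dynamic objects whose AABB overlaps the camera's view rect."""
    return [i for i, box in enumerate(dynamic_boxes) if overlaps(camera_rect, box)]
```

That is four comparisons per object and no structure to update when objects move, which is why it is hard to beat for a few thousand objects that change position every frame. The static geometry goes into a structure that is built once, and the two result lists are simply concatenated per camera.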

Resources