Why would VkImageView format differ from the underlying VkImage format? - imageview

VkImageCreateInfo has the following member:
VkFormat format;
And VkImageViewCreateInfo has the same member.
What I don't understand why you would ever have a different format in the VkImageView from the VkImage needed to create it.
I understand some formats are compatible with one another, but I don't know why you would use one of the alternate formats

The canonical use case and primary original motivation (in D3D10, where this idea originated) is using a single image as either R8G8B8A8_UNORM or R8G8B8A8_SRGB -- either because it holds different content at different times, or because sometimes you want to operate in sRGB-space without linearization.
More generally, it's useful sometimes to have different "types" of content in an image object at different times -- this gives engines a limited form of memory aliasing, and was introduced to graphics APIs several years before full-featured memory aliasing was a thing.

Like a lot of Vulkan, the API is designed to expose what the hardware can do. Memory layout (image) and the interpretation of that memory as data (image view) are different concepts in the hardware, and so the API exposes that. The API exposes it simply because that's how the hardware works and Vulkan is designed to be a thin abstraction; just because the API can do it doesn't mean you need to use it ;)
As you say, in most cases it's not really that useful ...
I think there are some cases where it could be more efficient, for example getting a compute shader to generate integer data for some types of image processing can be more energy efficient than either float computation or manually normalizing integer data to create unorm data. Using aliasing you the compute shader can directly write e.g. uint8 integers and a fragment shader can read the same data as unorm8 data

Related

How are you supposed to update a texture per frame in Vulkan?

I'm trying to work with 2D in vulkan along with 3D. So right now testing out updating a texture for every frame as whatever 2D is going on. I've gotten something of a texture updater working, the problem is that it's very slow and probably not the way it's supposed to be done. Is there any better way of getting this done? The code is based on the https://vulkan-tutorial.com/ code.
https://vulkan-tutorial.com/code/26_depth_buffering.cpp
void UpdateTexture()
{
vkDeviceWaitIdle(device);
vkFreeMemory(device, textureImageMemory, nullptr);
VkBuffer stagingBuffer;
VkDeviceMemory stagingBufferMemory;
createBuffer(imageSize, VK_BUFFER_USAGE_TRANSFER_SRC_BIT, VK_MEMORY_PROPERTY_HOST_COHERENT_BIT, stagingBuffer, stagingBufferMemory);
void* data;
vkMapMemory(device, stagingBufferMemory, 0, imageSize, 0, &data);
memcpy(data, pixel2.data(), static_cast<size_t>(imageSize));
vkUnmapMemory(device, stagingBufferMemory);
createImage(texWidth, texHeight, VK_FORMAT_R8G8B8A8_SRGB, VK_IMAGE_TILING_OPTIMAL, VK_IMAGE_USAGE_TRANSFER_DST_BIT | VK_IMAGE_USAGE_SAMPLED_BIT, VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT, textureImage, textureImageMemory);
transitionImageLayout(textureImage, VK_FORMAT_R8G8B8A8_SRGB, VK_IMAGE_LAYOUT_UNDEFINED, VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL);
copyBufferToImage(stagingBuffer, textureImage, static_cast<uint32_t>(texWidth), static_cast<uint32_t>(texHeight));
transitionImageLayout(textureImage, VK_FORMAT_R8G8B8A8_SRGB, VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL, VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL);
vkDestroyBuffer(device, stagingBuffer, nullptr);
vkFreeMemory(device, stagingBufferMemory, nullptr);
createTextureImageView();
createDescriptorPool();
createDescriptorSets();
createCommandBuffers();
}
This code looks like a direct translation of some OpenGL code, and not particularly good/modern OpenGL code at that.
There's a lot wrong in this code, but most of it boils down to over-synchronization.
First, you should always view any call to vkDeviceWaitIdle as the wrong thing to do. The only exception would be when you are preparing to destroy the VkDevice itself. There is no other reason to do a full CPU/GPU sync like that.
Presumably, this synchronization exists so that you can be sure the GPU is finished using the image before modifying it. This is the wrong thing to do. You should instead employ multiple-buffering. That is, you should have two images that you use. One is currently being used in a rendering process, while the other is being transferred into.
Instead of doing a full device sync, you instead synchronize with the batch you sent two frames ago. That is, if you're wanting to transfer data for use by frame 10, then you must first do a fence-sync operation with the batch you sent in frame 8. Frame 9 is still being processed, but frame 8 is probably done by now. So the synchronization shouldn't hurt too much.
Second, never allocate memory in the middle of an operation like this. Memory gets allocated early in your application, and you leave it allocated until it's time to destroy your application. If you need a staging buffer, then keep it around and reuse it in subsequent frames. Make sure to allocate sufficient storage up-front.
Whatever your createBuffer call is doing, it seems very much like a bad idea. Vulkan is not OpenGL; Vulkan separated memory from buffers/textures that use it for a reason. Creating APIs that hide this separation basically throws all of that away.
Similarly, never unmap memory, unless you're about to destroy that memory object. There's no problem in Vulkan (or OpenGL) with leaving a piece of memory mapped indefinitely. Just map the entire memory's range and leave it mapped. Indeed, you could just pass the mapped pointer directly to your image loader, depending on how the memory get written by the image loading code (if it tries to read data from this pointer, they could be trouble).
Lastly, the commands doing the transfer need to be synchronized with the commands that consume the image. How this happens depends on which queues are being used to do the transfer.
And of course, if you want optimal performance, you may want to check to see if your implementation can read from linear images in your shader. If it can, then you may not need staging at all; you can just write the data directly to the memory in Vulkan's image format, and use it directly.
Employing all of the above is going to add a lot of complexity to your application. But that's how it's supposed to work.
A naive way consists in using the CPU to define the update depending on the time or data and then update the data for the shader, such as a MVP transformation matrix. But this is inefficient with lots of syncing and too low refresh rates, and also overloading the cpu in a loop.
So people recommend using many buffers sometimes mentioning old drivers. If someone can clarify it, that would be nice. I have a naive and probably wrong guess. If they know exactly the frame rate, then they can calculate the time for each frame and dispatch several frames in advance. But it confuses me because the frame rate is dynamic, especially for new screens with the FreeSync functionality that have dynamic refresh rates.
I have thought of a third possibility. One can use the clock directly in the shader. GL_EXT_shader_realtime_clock provides clockRealtimeEXT. It has no defined unit, and will wrap when exceeding the maximum value. But it is said "globally coherent by all invocations on the GPU". During initialization, you can measure its rate using a uniform buffer, and then assume the rate will be constant. And also manage the wrapping.
Then if you can write your shaders as a function of time, for example in a translation, that would be efficient. You just need the initial data. Remember that one must avoid if conditions in shaders.

Required Data for IFC

I'm working on a project where I need to generate an IFC file, and am given not much more information than geometry (I have access to the density and heat-conductivity of materials, and basic labeling for Objects).
So far I could only find what IFC can store, never what IFC needs to store.
What do I need to include in an IFC file so it is properly functional?
What does an IFC file need besides basic geometry?
Disclaimer: I have not read (or bought) the standard. My knowledge primarily stems from working with IFC files and trying different things. And reading the buildingSMART documentation. So I can't give you a hard guarantee, but I am rather confident my information is correct/usable.
As an alternative to buying the official standards file, you could look into the official documentation by buildingsmart. (Also have a look here for more general information and availability of other/more modern releases).
Now assuming you are familiar with the basic STEP file layout (header and data segment), let's jump to what an IFC file absolutely has to include to be considered correct (as far as I understand the documentation; there might be parsers/loaders which can load incorrect/incomplete files, but we aren't aiming for them). Also note I am building this example for IFC 4.0. This should be correct for the current IFC 4.1 standard, but probably not for the older IFC2X3 standard (there have been some relaxations in IFC4 from IFC2X3). Also I am skipping on names and descriptions - you can set those fields for testing to recognize your structures in a viewer (it's easier than comparing GUIDs).
IfcProject
The root of all elements is the IfcProject. It also contains most basic properties and definitions for all other elements. The attributes required per documentation on this entity are only the unique id. But for a working example you usually also need a minimal unit assignment and representation context.
#20= IFCPROJECT('344O7vICcwH8qAEnwJDjSU',$,$,$,$,$,$,(#19),#13);
In the unit assignment you define required units, starting from geometric units to monetary, thermal, etc. The minimum is length, area and angle to meaningfully define geometric items. So for our example we include only those: metre as length, square meter as area and radians as angle. If you need foot or inch or degree you can define those as derived units.
#10= IFCSIUNIT(*,.LENGTHUNIT.,$,.METRE.);
#11= IFCSIUNIT(*,.AREAUNIT.,$,.SQUARE_METRE.);
#12= IFCSIUNIT(*,.PLANEANGLEUNIT.,$,.RADIAN.);
#13= IFCUNITASSIGNMENT((#10,#11,#12));
The representation context defines for a given class of representations (=geometric/parametric descriptions) the basic coordinate system. So the simple case would be a 3-dimensional right handed system at point zero. IFC is working with the z-axis pointing up - this might be important if your are working with models/files originating from 3D/OpenGl applications which usually assume the y-axis pointing upwards. You also need a precision value - I am using 1.0e-5 here, but you might want to test out if you can go with less or need more. The precision is usually applied when comparing points/edges when combining geometry (during constructive solid geometry steps). If you have errors, try a different precision value.
The second attribute of the representation context is the context type. This is a string identifying on which representations this context should be applied. The documentation states that values are based on "implementers agreement" - which means AFAIK "look what the others are using". From my experience using "Model" works for 3D geometry. Using "Plan" for 2D plans and sketches should work, too.
#14= IFCDIRECTION((1.,0.,0.));
#15= IFCDIRECTION((0.,0.,1.));
#16= IFCCARTESIANPOINT((0.,0.,0.));
#17= IFCAXIS2PLACEMENT3D(#16,#15,#14);
#18= IFCDIRECTION((0.,1.));
#19= IFCGEOMETRICREPRESENTATIONCONTEXT($,'Model',3,1.0E-5,#17,#18);
Spatial container for elements
Elements can't be added to the IfcProject directly - they need to be placed into a spatial element which is contained in the project. There are three possible choices: IfcSite, IfcBuilding and IfcSpatialZone (see section Spatial Decomposition on the IfcProject page). The IfcSpatialZone is defined as non-hierarchical spatial element - its usage is slightly different from the other two (elements are added using a different relation).
A single site is sufficient as spatial container. Adding all elements to it might be sematically vague (mostly fences are directly added to it, other elements are usually inside a building) but not incorrect. (IFC does not care if you have electrical appliances in your garden). As nearly all attributes of IfcSite are optional we can skip on those. But beware: if you give your site a representation (=some geometric shape) you will need to include a placement for it. The site will be aggregated into the project to be related to it.
#30= IFCSITE('20FpTZCqJy2vhVJYtjuIce',$,$,$,$,$,$,$,.ELEMENT.,$,$,$,$,$);
#31= IFCRELAGGREGATES('0Du7$nzQXCktKlPUTLFSAT',$,$,$,#20,(#30));
Elements
Actually that is all that is needed as absolute minimum structure. Now you can add your elements - entities of some type derived from IfcProduct. As all those elements have some sort of meaning attached to it you either need to select those closely matching the objects you have, or you might want to use IfcBuildingElementProxy which is the most "meaningless" (or better: no specialized semantic meaning) object type. The following code places one proxy without geometry. The placement references the same coordinate system definition that is used to create the coordinate system out of convenience as it doesn't transform or move anything. Your geometry would be added through a product definition shape which has shape aspects and finally some geometry items. The building smart documentation has a few examples with assigned geometry.
#40= IFCLOCALPLACEMENT($,#17);
#41= IFCBUILDINGELEMENTPROXY('3W29Drc$H6CxK3FGIxjJNl',$,$,$,$,#40,$,$,.NOTDEFINED.);
#42= IFCRELCONTAINEDINSPATIALSTRUCTURE('04ldtj6cp2dME6CiP80Bzh',#12,$,$,(#41),#30);
Conclusion
So there isn't much needed as bare minimum to add elements:
a project
basic unit definitions
one spatial container
The complete example file would be:
ISO-10303-21;
HEADER;FILE_DESCRIPTION(('IFC4'),'2;1');
FILE_NAME('example.ifc','2018-08-8',(''),(''),'','','');
FILE_SCHEMA(('IFC4'));
ENDSEC;
DATA;
#10= IFCSIUNIT(*,.LENGTHUNIT.,$,.METRE.);
#11= IFCSIUNIT(*,.AREAUNIT.,$,.SQUARE_METRE.);
#12= IFCSIUNIT(*,.PLANEANGLEUNIT.,$,.RADIAN.);
#13= IFCUNITASSIGNMENT((#10,#11,#12));
#14= IFCDIRECTION((1.,0.,0.));
#15= IFCDIRECTION((0.,0.,1.));
#16= IFCCARTESIANPOINT((0.,0.,0.));
#17= IFCAXIS2PLACEMENT3D(#16,#15,#14);
#18= IFCDIRECTION((0.,1.));
#19= IFCGEOMETRICREPRESENTATIONCONTEXT($,'Model',3,1.0E-5,#17,#18);
#20= IFCPROJECT('344O7vICcwH8qAEnwJDjSU',$,$,$,$,$,$,(#19),#13);
#30= IFCSITE('20FpTZCqJy2vhVJYtjuIce',$,$,$,$,$,$,$,.ELEMENT.,$,$,$,$,$);
#31= IFCRELAGGREGATES('0Du7$nzQXCktKlPUTLFSAT',$,$,$,#20,(#30));
#40= IFCLOCALPLACEMENT($,#17);
#41= IFCBUILDINGELEMENTPROXY('3W29Drc$H6CxK3FGIxjJNl',$,$,$,$,#40,$,$,.NOTDEFINED.);
#42= IFCRELCONTAINEDINSPATIALSTRUCTURE('04ldtj6cp2dME6CiP80Bzh',$,$,$,(#41),#30);
ENDSEC;
END-ISO-10303-21;
Note that loading this one up doesn't show anything, because it doesn't contain any geometry. Also please note that I have not yet verified if it is error free - I currently don't have my IFC tools at hand (if you would like to verify your files have a look at stepcode which can check if your files are syntactically correct - it won't check semantic meaning or enforcement of the mentioned concepts in the building smart documentation.)
Also good to know is that the order of references/ids (like #20) can be freely arranged - you can reference elements that you add later in the file and the references only need to be unique to this one file. This means the lines of the example file can be shuffled and it is still a valid file - parsers usually use a two-step apporach to create an in-memory representation (1. parse into IFC classes, 2. resolve references).

What is the advantage of using a 1d image over a 1d buffer?

I understand that in 2d, images are cached in x and y directions.
But in 1d, why would you want to use an image? Is the memory used
for images faster than memory used for buffers?
1D Image stays the image, so it has all advantages that Image has against Buffer. That is:
Image IO operations are usually well-cached.
Samplers can be used, which gives benefits like computationally cheap interpolation, hardware-resolved out-ouf-bound access, etc.
Though, you should remember that Image has some constraints in comparison to regular Buffer:
Single Image can be used either for reading or for writing within one kernel.
You can't use vloadN / vstoreN operations, which can handle up to 16 values per call. Your best option is read_imageX & write_imageX functions, which can load / store up to 4 values per one call. That can be serious issue on GPU, with vector architecture.
If you are not using 4-component format, usually, you are loosing part of performance as many functions process samples from color planes simultaneously. So, payload is decreasing.
If we talk about GPU, different parts of hardware are involved into processing of Images & Buffers, so it's difficult to draw up, how one is better than another. Carefull benchmarking & algorithm optimizations are needed.

OpenCL performance: using arrays of primitives vs arrays of structures

I'm trying to use several arrays of doubles in a kernel that are all the same length. Instead of passing each double* in as a separate argument, I know I can define a structure in the .cl file that holds several doubles and then just pass into the kernel one pointer for an array of the structures instead.
Will the performance be different for the two ways? Please correct me if I am wrong, but I think passing individual double pointers means the access can be coalesced. Will accessing the structures also be coalesced?
As long as your structures don't contain any pointers what you say is absolutely possible. The primary impact is generally, as you've already considered, the effect this has on the coalescing of memory operations. How big an effect is down to your memory access pattern, the size of your struct and the device you're running on. More details would be needed to describe this more fully.
Saying that, one instance where I've used a struct in this way very successfully is where the element being read is the same for all work items in a work group. In this case there is no penalty on my hardware (nvidia GTX 570). Also it is worth remembering that in some cases the added latency introduced by the memory operation being serialised can be hidden. In the CUDA world this would be achieved by having a high occupancy for a problem with a high arithmetic intensity.
Finally it is worth pointing out that the semantic clarity of using a struct can have a benefit in and of itself. You'll have to consider this against any performance cost for your particular problem. My advice is to try it and see; it is very difficult to predict the impact of these issues ahead of time.
Theoretically it is the same performance. However if you access some of the members more often, using several segregatedvarrays will have much more performance, due to cpu cache locality. However most operations will be more difficult when you have several arrays.
The structures and the single elements will have the exact same performance.
Supose you have a big array of doubles, and the first work item uses 0, 100, 200, 300, ... and the next one uses 1, 101, 201, 301, ...
If you have a structure of 100 doubles, in memory the first structure will be first(0-99), then the second(100-199) and so on. The kernels will acess exactly the same memory and in the same places, the only difference is how you define the memory abstraction.
In a more generic case of having a structure of different element types (char, int, double, bool, ...) it may happen that the aligment is not as if it were a single array of data. But it will still be "semi-coalesced". I would even bet the performance is still the same.

I want to convert a sound from Mic to binary and match it from the database

I want to convert a sound from Mic to binary and match it from the database(a type of voice identification program but don't getting idea how to get sound from Mic directly so that i can convert it to binary?Also it is possible or not. Please guide me )
See this:
http://www.dotnetspider.com/resources/4967-How-record-voice-from-microphone.aspx
You're not going to be able to identify voices by doing a binary comparison on sound data. The binary of a particular sound will not be identical to an imitation of that sound unless it is literally the same file because of minor variations in just about everything. You'll need to do some signals processing to do a fuzzy comparison of the data. You can read about signal processing on wikipedia.
You will probably find it easier to use a third party library to process the sound for you. Something like this might be a good start.
You're looking at two very distinct problems here.
The first is pretty technical: Getting sound from the microphone into a digital waveform. How you do this exactly depends on the OS and API you're using (on Windows, you're probably looking at DirectX audio or, if available, ASIO). Typically, this is how you'd proceed:
Set up a recording buffer for the microphone, with suitable parameters (number of channels, physical input on the sound card, sample rate, bit depth, buffer size)
Start the recording. This usually involves pointing the sound library to a callback function to process the recorded buffer.
In the callback, read the buffer, convert it to a suitable format, and append it to the audio file of your choice. (You could also record to RAM only, but longer recordings may exceed available storage).
Store the recorded audio in a suitable database field (some kind of binary blob)
This is the easy part though; the harder part is matching a chunk of audio data against other chunks. A naïve approach would be to try and find exact matches, but that won't help you much, because the chance that you find one is practically zero - recording equipment, even the best, introduces a bit of random noise, and recording setups vary slightly whether you want to or not, so even if you'd have someone say something twice, perfectly identical, you'd still see differences in the recorded audio.
What you need to do, then, is find certain typical characteristics of the waveform. Things you could look for are:
Overall amplitude shape
Base frequencies
Selected harmonics (formants)
Extracting these is non-trivial and involves pretty severe math; and then you'll have to condense them into some sort of fingerprint, and find a way to compare them with some fuzziness (so that a near-match is good enough, rather than requiring exact matches). Finding the right parameters and comparison algorithms isn't easy, and it takes a lot of tweaking and testing; your best bet is to go find a library that does this for you.

Resources