I have the following OpenCL kernel code:
kernel void generateImage(global write_only image2d_t output_image)
{
    const int2 pos = {get_global_id(0), get_global_id(1)};
    write_imagef(output_image, (int2)(pos.x, pos.y), (float4)(1.0f, 0.0f, 0.0f, 0.0f));
}
How can I read the generated image on the CPU side in order to render it? I am using plain C. Also, a link to some nice tutorial would be great.
The clEnqueueReadImage() function is an image object's equivalent of a buffer object's clEnqueueReadBuffer() function, with similar semantics. The main difference is that (2D) images have a "pitch": the number of bytes by which you advance in memory when you move one pixel along the y axis. (This is not necessarily equal to width times bytes per pixel; it can be larger if your destination has special storage/alignment requirements.)
The alternative, much as is the case with buffer objects, is to memory-map the image using clEnqueueMapImage().
How you further process the image once your host program can access it depends on what you're trying to do and what platform you're developing for.
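If you just want to pull the whole image into host memory and render it from there, a minimal sketch in plain C might look like this (assuming the image was created with a CL_RGBA / CL_FLOAT format to match the float4 writes in your kernel; error handling is mostly omitted):
#include <CL/cl.h>
#include <stdlib.h>

/* Hypothetical helper: read the full RGBA float image back to host memory.
   'queue' and 'image' are the command queue and image your code already
   created; returns a malloc'd buffer of width * height * 4 floats. */
float *read_image_to_host(cl_command_queue queue, cl_mem image,
                          size_t width, size_t height)
{
    size_t origin[3] = {0, 0, 0};          /* start at the top-left pixel  */
    size_t region[3] = {width, height, 1}; /* whole image; depth must be 1 */
    float *pixels = malloc(width * height * 4 * sizeof(float));

    /* Blocking read; a row pitch of 0 means "tightly packed", i.e.
       width * 4 * sizeof(float) bytes per row in the destination. */
    cl_int err = clEnqueueReadImage(queue, image, CL_TRUE,
                                    origin, region, 0, 0,
                                    pixels, 0, NULL, NULL);
    if (err != CL_SUCCESS) { free(pixels); return NULL; }
    return pixels;
}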
I'm using a 2D texture array to store some data. As I often want to bind single layers of this 2D texture array, I create individual GL_TEXTURE_2D texture views for each layer:
for (int l = 0; l < m_layers; l++)
{
    QOpenGLTexture * view_texture = m_texture.createTextureView(QOpenGLTexture::Target::Target2D,
                                                                m_texture_format,
                                                                0, 0,
                                                                l, l);
    assert(view_texture != 0); // check the view before using it
    view_texture->setMinMagFilters(QOpenGLTexture::Filter::Linear, QOpenGLTexture::Filter::Linear);
    view_texture->setWrapMode(QOpenGLTexture::WrapMode::MirroredRepeat);
    m_texture_views.push_back(view_texture);
}
These 2D texture views work fine. However, if I want to retrieve the 2D texture data from the GPU side using such a texture view, it doesn't work.
In other words, the following copies no data (but raises no GL errors):
glGetTexImage(GL_TEXTURE_2D, 0, m_pixel_format, m_pixel_type, (GLvoid*) m_raw_data[layer]);
However, retrieving the entire GL_TEXTURE_2D_ARRAY does work:
glGetTexImage(GL_TEXTURE_2D_ARRAY, 0, m_pixel_format, m_pixel_type, (GLvoid*) data );
There would obviously be a performance loss if I need to copy across all layers of the 2D texture array when only data for a single layer has been modified.
Is there a way to copy GPU->CPU only a single layer of a GL_TEXTURE_2D_ARRAY? I know there is for the opposite direction (i.e. CPU->GPU), so I would be surprised if there wasn't.
Looks like you found a solution using glGetTextureSubImage() from OpenGL 4.5. There is also a simple solution that works with OpenGL 3.2 or higher.
You can set the texture layer as an FBO attachment, and then use glReadPixels():
GLuint fboId = 0;
glGenFramebuffers(1, &fboId);
glBindFramebuffer(GL_READ_FRAMEBUFFER, fboId);
glFramebufferTextureLayer(GL_READ_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
                          textureId, 0, layer);
glReadPixels(...);
glBindFramebuffer(GL_READ_FRAMEBUFFER, 0);
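With the formats from the question filled in, the elided call might look like the following (assuming m_raw_data[layer] is large enough for a full width x height layer):
glReadPixels(0, 0, width, height, m_pixel_format, m_pixel_type,
             (GLvoid*) m_raw_data[layer]);
glDeleteFramebuffers(1, &fboId); // once you no longer need the FBO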
What version of GL are you working with?
You are probably not going to like this, but... GL 4.5 introduces glGetTextureSubImage(...) to do precisely what you want. That is a pretty hefty version requirement for something so simple; it is also available in extension form (ARB_get_texture_sub_image), but that extension is relatively new as well.
There is no special hardware requirement for this functionality, but it requires a very recent driver.
I would not despair just yet, however.
You can copy the entire texture array into a PBO and then read a sub-rectangle of that PBO back using the buffer object API (e.g. glGetBufferSubData(...)), as sketched below. That requires extra memory on the GPU side, but will allow you to transfer a single slice of this 2D array.
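A sketch of that approach (assuming a GL_RGBA8 texture array, so each layer is width * height * 4 bytes; the names below are placeholders for your own handles):
/* Hypothetical helper: read one layer of a GL_RGBA8 2D texture array back
   into 'cpuBuffer' (which must hold width * height * 4 bytes). */
void readArrayLayer(GLuint tex, GLint layerCount, GLint layer,
                    GLsizei width, GLsizei height, void *cpuBuffer)
{
    GLsizeiptr layerBytes = (GLsizeiptr)width * height * 4;
    GLuint pbo;

    glGenBuffers(1, &pbo);
    glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo);
    glBufferData(GL_PIXEL_PACK_BUFFER, layerBytes * layerCount, NULL, GL_STREAM_READ);

    /* With a pack PBO bound, the last argument is an offset into the PBO. */
    glBindTexture(GL_TEXTURE_2D_ARRAY, tex);
    glGetTexImage(GL_TEXTURE_2D_ARRAY, 0, GL_RGBA, GL_UNSIGNED_BYTE, (GLvoid*)0);

    /* Copy back only the slice we care about. */
    glGetBufferSubData(GL_PIXEL_PACK_BUFFER, layer * layerBytes, layerBytes, cpuBuffer);

    glBindBuffer(GL_PIXEL_PACK_BUFFER, 0);
    glDeleteBuffers(1, &pbo);
}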
I am currently working on a Qt application to draw maps. I am trying to draw 400,000+ lines and it crashes after using ~2GB, even though I still have memory left on my machine. I am wondering if I am hitting some limit inside Qt that is causing the problem. Does anyone know if there is a limit to the number of things you can draw, or if you can change this limit?
If it is helpful, I am coding in C++ with a class that has a member function to draw the lines. The code is roughly as follows:
QPointF fromPoint;
QPointF toPoint;
fromPoint = foo(x);
toPoint = foo(y);
m_Painter.drawLine(fromPoint, toPoint); // m_Painter is a QPainter
Edit: Turns out the problem was somewhere else in the code; it had to do with the custom caching that was being done. I am still interested, though, in whether there is a limit to how many lines Qt can draw. Does anyone know?
QPainter executes its underlying graphics through QPaintEngine, which has several implementations (like qpaintengine_mac.cpp, qpaintengine_x11.cpp, or qpaintengine_preview.cpp).
Some devices are raster-based...and likely draw each line into an image buffer, throwing away the endpoints once the drawing is done. There should be no limit to the number of lines you can draw in that case.
If the target device is OpenGL, or a printer producing some kind of PostScript-like output, then the limitations of that particular paint engine may well be a factor. You'd have to look at the specific one.
For example: if you trace down the X11 implementation of drawLine you'll see it passes through to drawPolygon() down through strokePolygon_dev()...and bottoms out at a call to XDrawLines:
XDrawLines(dpy, hd, gc, pts, numberPoints, CoordModeOrigin);
So there you have another abstraction layer...and the question becomes whether the X11 display parameter is guaranteed to be raster. (My guess would be that it is.)
Anyway, the answer is: "unlimited if raster; possibly limited otherwise--but the limitations (if any) are probably coming from the underlying device for the paint engine, not from Qt."
Can I set up an OpenCL image so that coordinate access past the boundary of the image will return the mirror image?
For example, if the image is of dimensions width by height, then
read_floati(width, 0)
will return
read_floati(width-2, 0)
Yes. Read section 6.11.13 of the OpenCL specification. OpenCL images are read using (for example) the read_imagef function, and that function takes a sampler which can be set up for mirroring using CLK_ADDRESS_MIRRORED_REPEAT. Note that the mirrored (and repeat) addressing modes only work with normalized coordinates.
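A minimal kernel sketch of that setup (the kernel and argument names are just for illustration):
// Mirrored addressing requires normalized coordinates.
constant sampler_t mirrorSampler = CLK_NORMALIZED_COORDS_TRUE |
                                   CLK_ADDRESS_MIRRORED_REPEAT |
                                   CLK_FILTER_NEAREST;

kernel void readMirrored(read_only image2d_t input, write_only image2d_t output)
{
    int2 pos = {get_global_id(0), get_global_id(1)};
    int2 size = get_image_dim(input);
    // Normalized x coordinates above 1.0 fall outside the image; with
    // CLK_ADDRESS_MIRRORED_REPEAT they are reflected back inside, so this
    // writes a horizontally mirrored copy of the input.
    float2 coord = (float2)(1.0f + (pos.x + 0.5f) / size.x,
                            (pos.y + 0.5f) / size.y);
    write_imagef(output, pos, read_imagef(input, mirrorSampler, coord));
}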
A game uses software rendering to draw a full-screen paletted (8-bit) image in memory.
What's the fastest way to put that image on the screen, using Direct3D?
Currently I convert the paletted image to RGB in software, then put it on a D3DUSAGE_DYNAMIC texture (which is locked with D3DLOCK_DISCARD).
Is there a faster way? E.g. using shaders to perform palettization?
Related questions:
Fast paletted screen blit with OpenGL - same question with OpenGL
How do I improve Direct3D streaming texture performance? - similar question from SDL author
Create a D3DFMT_L8 texture containing the paletted image, and a 256x1 D3DFMT_X8R8G8B8 texture containing the palette.
HLSL shader code:
uniform sampler2D image;
uniform sampler1D palette;
float4 main(in float2 coord:TEXCOORD) : COLOR
{
    return tex1D(palette, tex2D(image, coord).r * (255./256) + (0.5/256));
}
Note that the luminance (palette index) is adjusted with a multiply-add operation. This is necessary, as palette index 255 is considered as white (maximum luminance), which becomes 1.0f when represented as a float. Reading the palette texture at that coordinate causes it to wrap around (as only the fractionary part is used) and read the first palette entry instead.
Compile it with:
fxc /Tps_2_0 PaletteShader.hlsl /FhPaletteShader.h
Use it like this:
// ... create and populate texture and paletteTexture objects ...
d3dDevice->CreatePixelShader((DWORD*)g_ps20_main, &shader);
// ...
d3dDevice->SetTexture(0, texture);        // paletted image -> sampler 0
d3dDevice->SetTexture(1, paletteTexture); // palette lookup -> sampler 1
d3dDevice->SetPixelShader(shader);
// ... draw texture to screen as textured quad as usual ...
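For completeness, creating the two textures from the first comment might look like this (a sketch; width and height are your frame's dimensions, and error handling is omitted):
IDirect3DTexture9 *texture = NULL, *paletteTexture = NULL;

// 8-bit palette indices, updated every frame: dynamic usage, default pool.
d3dDevice->CreateTexture(width, height, 1, D3DUSAGE_DYNAMIC, D3DFMT_L8,
                         D3DPOOL_DEFAULT, &texture, NULL);

// 256 palette entries in a single row.
d3dDevice->CreateTexture(256, 1, 1, 0, D3DFMT_X8R8G8B8,
                         D3DPOOL_MANAGED, &paletteTexture, NULL);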
You could write a simple pixel shader to handle the palettization. Create an L8 dynamic texture and copy your palettized image to it, and create a palette lookup texture (or an array of colors in constant memory). Then just render a full-screen quad with the palettized image set as a texture, using a pixel shader that performs the palette lookup from the lookup texture or constant buffer.
That said, performing the palette conversion on the CPU shouldn't be very expensive on a modern machine. Are you sure that is your performance bottleneck?
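For reference, the CPU path is just one table lookup per pixel, along these lines (a hedged sketch; 'palette' holds 256 precomputed 0xAARRGGBB entries):
#include <stdint.h>
#include <stddef.h>

/* Expand an 8-bit paletted frame to 32-bit pixels on the CPU. */
void expand_palette(const uint8_t *indices, const uint32_t *palette,
                    uint32_t *out, size_t pixel_count)
{
    for (size_t i = 0; i < pixel_count; i++)
        out[i] = palette[indices[i]];
}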
I am looking for a fairly simple image comparison method in AS3. I have taken an image from a web cam (with no subject) and passed it into BitmapData; then a second image is taken (this time with a subject) to compare against this data. From these two images I would like to create a mask from the pixels that match on both bitmaps. I have been scratching my head for a while and am not really making any progress. Could anyone point me in the right direction for a pixel comparison method, something like getPixel32()?
Cheers
Jono
Use compare() to create a difference between the two, and then use threshold() to extract the parts that interest you.
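For the compare step, a minimal sketch (background and current are placeholder names for your two captured BitmapData objects):
// compare() returns a new BitmapData holding the per-channel differences,
// or the int 0 if the two bitmaps are pixel-identical.
var result:Object = background.compare(current);
if (result is BitmapData)
{
    var dif:BitmapData = result as BitmapData; // feed this into threshold() below
}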
Edit: actually it is pretty straightforward. The trick is to apply the threshold multiple times, once per channel, using the mask parameter (otherwise the comparison makes little sense, since 0x010000 (which is almost black) is considered greater than 0x0000FF (which is anything but black)). Here's how:
var dif:BitmapData; // the difference image returned by BitmapData.compare()
var mask:BitmapData = new BitmapData(dif.width, dif.height, true, 0);
const threshold:uint = 0x20;
for (var i:int = 0; i < 3; i++)
    mask.threshold(dif, dif.rect, new Point(), ">", threshold << (i * 8), 0xFF000000, 0xFF << (i * 8));
This creates a transparent mask. The threshold is then applied to each of the three channels, setting the alpha channel to fully opaque wherever the channel's value exceeds the threshold value (you might want to decrease it).
You can isolate the foreground object ("the guy in front of the webcam") by copying the alpha channel from the mask to the current video image.
One of the problems here is that you want to find whether a pixel has ANY change to it, and if it does, to convert that pixel to another color (for masking). Unfortunately, a webcam's quality isn't great, so even if your scene does not change at all, the BitmapData coming from the webcam will change slightly. Therefore, when your subject steps into frame you will get pixel changes for the subject, but also noise in other areas due to lighting changes or camera quality. What you'll need to do is write a function that analyzes the result of BitmapData.compare() for change in areas larger than _____ to determine whether there is enough change to warrant an actual object being there. That will help remove noise and make your mask more accurate.