FFmpeg with x265 - pointers

I am currently trying to encode raw RGB24 images with x265. I already did this successfully with the x264 library, but a few things have changed in x265 as compared to x264.
Here is the problem in short: I want to convert my image from RGB24 to YUV 4:2:0 via FFmpeg's sws_scale function. The prototype of the function is:
int sws_scale(struct SwsContext *c, const uint8_t *const srcSlice[], const int srcStride[], int srcSliceY, int srcSliceH, uint8_t *const dst[], const int dstStride[]);
Assuming image contains my raw image, and srcstride and m_height the corresponding RGB stride and height of my image, I made the following call with x264:
sws_scale(convertCtx, &image, &srcstride, 0, m_height, pic_in.img.plane, pic_in.img.i_stride);
pic_in is of type x264_picture_t, which (abbreviated) looks as follows:
typedef struct
{
...
x264_image_t img;
} x264_picture_t;
with x264_image_t
typedef struct
{
...
int i_stride[4];
uint8_t *plane[4];
} x264_image_t;
Now, in x265 the structures have slightly changed to
typedef struct x265_picture
{
...
void* planes[3];
int stride[3];
} x265_picture;
And I am now not quite sure how to call the same function
sws_scale(convertCtx, &image, &srcstride, 0, m_height, ????, pic_in.stride);
I tried creating a temporary uint8_t* array and then copying the pointers back with a cast, but it doesn't seem to work:
pic.planes[i] = reinterpret_cast<void*>(tmp[i]);
Can someone help me out?
Thanks a lot :)
Edit
I figured it out now
outputSlice = sws_scale(convertCtx, &image, &srcstride, 0, m_height, reinterpret_cast<uint8_t**>(pic_in.planes), pic_in.stride);
This seems to do the trick :)
And by the way, for other people who are experimenting with x265: in x264 there was an x264_picture_alloc function which I didn't manage to find in x265. So here is a function which I used in my application and which does the trick.
void x265_picture_alloc_custom(x265_picture *pic, int csp, int width, int height, uint32_t depth) {
    x265_picture_init(&mParam, pic);  // mParam is the x265_param member of my encoder class
    pic->colorSpace = csp;
    pic->bitDepth = depth;
    pic->sliceType = X265_TYPE_AUTO;
    uint32_t pixelbytes = depth > 8 ? 2 : 1;
    uint32_t framesize = 0;
    for (int i = 0; i < x265_cli_csps[csp].planes; i++)
    {
        uint32_t w = width >> x265_cli_csps[csp].width[i];
        uint32_t h = height >> x265_cli_csps[csp].height[i];
        framesize += w * h * pixelbytes;
    }
    pic->planes[0] = new char[framesize];
    pic->planes[1] = (char*)(pic->planes[0]) + width * height * pixelbytes;
    pic->planes[2] = (char*)(pic->planes[1]) + ((width * height * pixelbytes) >> 2);
    // Note: x265 strides are in bytes, so for depth > 8 these values should be
    // multiplied by pixelbytes; the lines below assume 8-bit 4:2:0 input.
    pic->stride[0] = width;
    pic->stride[1] = pic->stride[2] = pic->stride[0] >> 1;
}

Have you tried
sws_scale(convertCtx, &image, &srcstride, 0, m_height, pic_in.planes, pic_in.stride);
What error do you get? Have you initialized the memory of the x265_picture?

Related

How to properly use QOpenGLBuffer.PixelPackBuffer with PyQt5

I am trying to read the color buffer content of the default framebuffer in PyQt5 using pixel buffer object given by the Qt OpenGL framework.
It looks like the read is unsuccessful, because the resulting image always contains all zeros. There are very few examples of pixel buffers with PyQt5, so I was mostly relying on this C++ tutorial explaining pixel buffers, specifically the section Example: Asynchronous Read-back.
My code goes something like this:
class GLCanvas(QtWidgets.QOpenGLWidget):
    # ...
    def screenDump(self):
        """
        Takes a screenshot and returns a pixmap.

        :returns: A pixmap with the rendered content.
        :rtype: QPixmap
        """
        self.makeCurrent()
        w = self.size().width()
        h = self.size().height()
        ppo = QtGui.QOpenGLBuffer(QtGui.QOpenGLBuffer.PixelPackBuffer)
        ppo.setUsagePattern(QtGui.QOpenGLBuffer.StaticRead)
        ppo.create()
        success = ppo.bind()
        if success:
            ppo.allocate(w * h * 4)
            # Render the stuff
            # ...
            # Read the color buffer.
            glReadBuffer(GL_FRONT)
            glReadPixels(0, 0, w, h, GL_RGBA, GL_UNSIGNED_BYTE, 0)
            # TRY1: Create an image with pixel buffer data - doesn't work, image contains all zeros.
            pixel_buffer_mapped = ppo.map(QtGui.QOpenGLBuffer.ReadOnly)
            image = QtGui.QImage(sip.voidptr(pixel_buffer_mapped), w, h, QtGui.QImage.Format_ARGB32)
            ppo.unmap()
            # TRY2: Create an image with pixel buffer data - doesn't work, image contains all zeros.
            # image = QtGui.QImage(w, h, QtGui.QImage.Format_ARGB32)
            # bits = image.constBits()
            # ppo.read(0, bits, w * h * 4)
            ppo.release()
        pixmap = QtGui.QPixmap.fromImage(image)
        return pixmap
Any help would be greatly appreciated.
I didn't have any success after a couple of days, so I decided to implement color buffer fetching with pixel buffer object in C++, and then use SWIG to pass the data to Python.
I'm posting relevant code, maybe it will help somebody.
CPP side
// renderer.cpp
class Renderer {
    // ...
    void resize(int width, int height) {
        // Set the viewport
        glViewport(0, 0, width, height);
        // Store width and height
        width_ = width;
        height_ = height;
        // ...
    }
    // -------------------------------------------------------------------------- //
    // Returns the color buffer data in GL_RGBA format.
    GLubyte* screenDumpCpp() {
        // Check if pixel buffer objects are available.
        if (!GLInfo::pixelBufferSupported()) {
            return 0;
        }
        // Get the color buffer size in bytes.
        int channels = 4;
        int data_size = width_ * height_ * channels;
        GLuint pbo_id;
        // Generate pixel buffer for reading.
        glGenBuffers(1, &pbo_id);
        glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo_id);
        glBufferData(GL_PIXEL_PACK_BUFFER, data_size, 0, GL_STREAM_READ);
        // Set the framebuffer to read from.
        glReadBuffer(GL_FRONT);
        // Read the framebuffer and store data in the pixel buffer.
        glReadPixels(0, 0, width_, height_, GL_RGBA, GL_UNSIGNED_BYTE, 0);
        // Map the pixel buffer and copy the data out: the mapped pointer
        // becomes invalid after glUnmapBuffer, so returning it directly
        // would hand the caller a dangling pointer.
        GLubyte* pixel_buffer = new GLubyte[data_size];
        GLubyte* mapped = (GLubyte*)glMapBuffer(GL_PIXEL_PACK_BUFFER, GL_READ_ONLY);
        memcpy(pixel_buffer, mapped, data_size);
        // Cleanup.
        glUnmapBuffer(GL_PIXEL_PACK_BUFFER);
        glBindBuffer(GL_PIXEL_PACK_BUFFER, 0);
        glDeleteBuffers(1, &pbo_id);
        return pixel_buffer;  // caller is responsible for delete[]
    }
    // Returns the color buffer data in RGBA format as a numpy array.
    PyObject* screenDump() {
        // Get screen dump.
        GLubyte* cpp_image = screenDumpCpp();
        int channels = 4;
        int image_size = width_ * height_ * channels;
        // Setup dimensions for numpy vector.
        PyObject* python_image = NULL;
        int ndim = 1;
        npy_intp dims[1] = {image_size};
        // Set up numpy vector.
        python_image = PyArray_SimpleNew(ndim, dims, NPY_UINT8);
        GLubyte* data = static_cast<GLubyte*>(PyArray_DATA(toPyArrayObject(python_image)));
        // Copy screen dump to python space.
        memcpy(data, cpp_image, image_size);
        delete[] cpp_image;
        // Return screen dump to python.
        return python_image;
    }
};
// glinfo.cpp
const GLint GLInfo::glVersionInt() { ... }
GLint GLInfo::GLV(int major, int minor) { ... }
bool GLInfo::pixelBufferSupported() {
    const GLint version = GLInfo::glVersionInt();
    bool supported = false;
    if (version >= GLInfo::GLV(1, 5) && version < GLInfo::GLV(3, 0)) {
        supported = true;
    }
    else if (version >= GLInfo::GLV(3, 0)) {
        GLint extensions_number;
        glGetIntegerv(GL_NUM_EXTENSIONS, &extensions_number);
        std::string pixel_buffer_extension("GL_ARB_pixel_buffer_object");
        while (extensions_number--) {
            const auto extension_name = reinterpret_cast<const char*>(glGetStringi(GL_EXTENSIONS, extensions_number));
            if (pixel_buffer_extension == extension_name) {
                supported = true;
                break;
            }
        }
    }
    return supported;
}
Python side
# ...
class MyCanvas(QOpenGLWidget):
    def __init__(self):
        super().__init__()
        # Get renderer from c++
        self._renderer = Renderer()

    def resizeGL(self, width, height):
        self._renderer.resize(width, height)

# ...
if __name__ == '__main__':
    # ...
    canvas = MyCanvas()
    canvas.show()
    width = canvas.width()
    height = canvas.height()
    data = canvas._renderer.screenDump()
    image = QtGui.QImage(data.data, width, height, QtGui.QImage.Format_RGBA8888)
    new_image = image.mirrored()
    pixmap = QtGui.QPixmap.fromImage(new_image)
    pixmap.save(path)
    sys.exit(app.exec_())

timerfd mysteriously set int to 0 when read()

I am doing a timerfd hello world on Ubuntu 14.04, but I ran into a strange situation: an int counter is reset to 0 after reading the timerfd, but a uint64_t counter is not.
int main(int argc, char **argv) {
    unsigned int heartbeat_interval = 1;
    struct itimerspec next_timer;
    struct timespec now;
    if (clock_gettime(CLOCK_REALTIME, &now) == -1)
        err_sys((WHERE + std::string("timer error")).c_str());
    next_timer.it_value.tv_sec = now.tv_sec;
    next_timer.it_value.tv_nsec = 0;
    next_timer.it_interval.tv_sec = heartbeat_interval;
    next_timer.it_interval.tv_nsec = 0;
    int timefd = timerfd_create(CLOCK_REALTIME, 0);
    if (timerfd_settime(timefd, TFD_TIMER_ABSTIME, &next_timer, NULL) == -1) {
        err_sys((WHERE).c_str());
    }
    uint64_t s;
    int exp;
    int count = 1;
    uint64_t count1 = 0;
    while (1) {
        s = read(timefd, &exp, sizeof(uint64_t));
        if (s != sizeof(uint64_t)) {
            err_sys((WHERE).c_str());
        }
    }
}
int exp;
^^^
s = read(timefd, &exp, sizeof(uint64_t));
^^^ ^^^^^^^^
Unless your int and uint64_t types are the same size, this is a very bad idea. What's most likely happening is that the 64 bits you're reading are overwriting exp and whatever else happens to be next to it on the stack.
Actually, even if they are the same size, it's a bad idea. What you should have is something like:
s = read(timefd, &exp, sizeof(exp));
That way, you're guaranteed to never overwrite the data and your next line would catch the problem for you:
if (s != sizeof(uint64_t)) {
It won't change the fact that a signed and an unsigned integral type are treated differently, but you can fix that just by using the right type (uint64_t) for exp.

Access vector type OpenCL

I have a variable within a kernel like:
int16 element;
I would like to know if there is a way to address the third int in element like element[2], so that it would be the same as writing element.s2.
So how can i do something like:
int16 element;
int vector[100] = rand() % 16;
for ( int i=0; i<100; i++ )
element[ vector[i] ]++;
The way I did it was:
int temp[16] = {0};
int16 element;
int vector[100] = rand() % 16;
for ( int i=0; i<100; i++ )
temp[ vector[i] ]++;
element = (int16)(temp[0],temp[1],temp[2],temp[3],temp[4],temp[5],temp[6],temp[7],temp[8],temp[9],temp[10],temp[11],temp[12],temp[13],temp[14],temp[15]);
I know this is terrible, but it works ;-)
Well, there is a still dirtier way :). I hope OpenCL provides a better way of traversing vector elements.
Here is my way of doing it.
union
{
    int elarray[16];
    int16 elvector;
} element;

// traverse the elements
for (i = 0; i < 16; i++)
    element.elarray[i] = temp[vector[i]]++;
By the way, the rand() function is not available in an OpenCL kernel; how did you make it work?
Using pointers is a very easy solution:
float4 f4 = (float4)(1.0f, 2.0f, 3.0f, 4.0f);
int gid = get_global_id(0);
float *p = (float *)&f4;
result[gid] = p[3];
AMD recommends getting vector components this way:
Put the array of masks into an OpenCL constant buffer:
cl_uint const_masks[4][4] =
{
    {0xffffffff, 0, 0, 0},
    {0, 0xffffffff, 0, 0},
    {0, 0, 0xffffffff, 0},
    {0, 0, 0, 0xffffffff},
};
Inside the kernel, write something like this:
uint getComponent(uint4 a, int index, __constant uint4 *const_masks)
{
    uint b;
    uint4 masked_a = a & const_masks[index];
    b = masked_a.s0 + masked_a.s1 + masked_a.s2 + masked_a.s3;
    return b;
}

__kernel void foo(…, __constant uint4 *const_masks, …)
{
    uint4 a = …;
    int index = …;
    uint b = getComponent(a, index, const_masks);
}
It is possible, but it is not as efficient as direct array access.
float index(float4 v, int i) {
    if (i == 0) return v.x;
    if (i == 1) return v.y;
    if (i == 2) return v.z;
    if (i == 3) return v.w;
    return 0.0f;  // unreachable for valid i
}
But of course, if you need component-wise access this way, then chances are that you're better off not using vectors.
I use this workaround, hoping that compilers are smart enough to see what I mean (I think that element access is a serious omission from the standard):
int16 vec;
// access the i-th element:
((int*)&vec)[i] = ...;
No, that's not possible, at least not dynamically at runtime. But you can use a "compile-time" index to access a component:
float4 v;
v.s0 == v.x; // is true
v.s01 == v.xy // also true
See http://www.khronos.org/registry/cl/specs/opencl-1.1.pdf Section 6.1.7

OpenCL enqueueNDRangeKernel causes Access Violation error

I am consistently getting an Access Violation error with all the kernels I am trying to build. Other kernels which I take from books seem to work fine.
https://github.com/ssarangi/VideoCL - This is where the code is.
Something seems to be missing in this. Could someone help me with this?
Thanks so much.
[James] - Thanks for the suggestion, and you are right. I am doing it on Win 7 with an AMD Redwood card. I have the Catalyst 11.7 drivers with AMD APP SDK 2.5. I am posting the code below.
#include <iostream>
#include "bmpfuncs.h"
#include "CLManager.h"

int main()
{
    float theta = 3.14159f / 6.0f;
    int W;
    int H;
    const char* inputFile = "input.bmp";
    const char* outputFile = "output.bmp";
    float* ip = readImage(inputFile, &W, &H);
    float* op = new float[W * H];
    // We assume that the input image is the array "ip"
    // and the angle of rotation is theta
    float cos_theta = cos(theta);
    float sin_theta = sin(theta);
    try
    {
        CLManager* clMgr = new CLManager();
        // Build the source
        unsigned int pgmID = clMgr->buildSource("rotation.cl");
        // Create the kernel
        cl::Kernel* kernel = clMgr->makeKernel(pgmID, "img_rotate");
        // Create the memory buffers
        cl::Buffer* clIp = clMgr->createBuffer(CL_MEM_READ_ONLY, W * H * sizeof(float));
        cl::Buffer* clOp = clMgr->createBuffer(CL_MEM_READ_WRITE, W * H * sizeof(float));
        // Get the command queue
        cl::CommandQueue* queue = clMgr->getCmdQueue();
        queue->enqueueWriteBuffer(*clIp, CL_TRUE, 0, W * H * sizeof(float), ip);
        // Set the arguments to the kernel
        kernel->setArg(0, clOp);
        kernel->setArg(1, clIp);
        kernel->setArg(2, W);
        kernel->setArg(3, H);
        kernel->setArg(4, sin_theta);
        kernel->setArg(5, cos_theta);
        // Run the kernel on a specific NDRange
        cl::NDRange globalws(W, H);
        queue->enqueueNDRangeKernel(*kernel, cl::NullRange, globalws, cl::NullRange);
        queue->enqueueReadBuffer(*clOp, CL_TRUE, 0, W * H * sizeof(float), op);
        storeImage(op, outputFile, H, W, inputFile);
    }
    catch (cl::Error error)
    {
        std::cout << error.what() << "(" << error.err() << ")" << std::endl;
    }
}
I am getting the error at the queue->enqueueNDRangeKernel line.
I have the queue and the kernel stored in a class.
CLManager::CLManager()
    : m_programIDs(-1)
{
    // Initialize the platform
    cl::Platform::get(&m_platforms);
    // Create a context
    cl_context_properties cps[3] = {
        CL_CONTEXT_PLATFORM,
        (cl_context_properties)(m_platforms[0])(),
        0
    };
    m_context = cl::Context(CL_DEVICE_TYPE_GPU, cps);
    // Get a list of devices on this platform
    m_devices = m_context.getInfo<CL_CONTEXT_DEVICES>();
    cl_int err;
    m_queue = new cl::CommandQueue(m_context, m_devices[0], 0, &err);
}

cl::Kernel* CLManager::makeKernel(unsigned int programID, std::string kernelName)
{
    cl::CommandQueue queue = cl::CommandQueue(m_context, m_devices[0]);
    cl::Kernel* kernel = new cl::Kernel(*(m_programs[programID]), kernelName.c_str());
    m_kernels.push_back(kernel);
    return kernel;
}
I checked your code. I'm on Linux though. At runtime I'm getting Error -38, which means CL_INVALID_MEM_OBJECT. So I went and checked your buffers.
cl::Buffer* clIp = clMgr->createBuffer(CL_MEM_READ_ONLY, W*H*sizeof(float));
cl::Buffer* clOp = clMgr->createBuffer(CL_MEM_READ_WRITE, W*H*sizeof(float));
Then you pass the buffers as a Pointer:
kernel->setArg(0, clOp);
kernel->setArg(1, clIp);
But setArg is expecting a value, so the buffer pointers should be dereferenced:
kernel->setArg(0, *clOp);
kernel->setArg(1, *clIp);
After those changes the cat rotates ;)

How can I use memfrob() to encrypt an entire file?

#include <Windows.h>

void memfrob(void *s, size_t n)
{
    char *p = (char *) s;
    while (n-- > 0)
        *p++ ^= 42;
}

int main()
{
    memfrob("C:\\Program Files\\***\***\\***\***\\***", 30344);
}
There's my code. If you can't tell, I'm not sure what I'm doing. I've Googled for about an hour and I haven't seen an example of how to use memfrob(), which is probably why I'm so lost. I'm trying to pass it the name of the file and then the size of the file in bytes, but my program just crashes.
Alright, this is what I have right now:
#include <Windows.h>
#include <stdio.h>

int count = 0;
FILE* pFile = 0;
long Size = 0;

void *memfrob(void *s, size_t n)
{
    char *p = (char *) s;
    while (n-- > 0)
        *p++ ^= 42;
    return s;
}

int main()
{
    fopen_s(&pFile, "C:\\Program Files\\CCP\\EVE\\lib\\corelib\\nasty.pyj", "r+");
    fseek(pFile, 0, SEEK_END);
    Size = ftell(pFile);
    char *buffer = (char*)malloc(Size);
    memset(buffer, 0, Size);
    fread(buffer, Size, 1, pFile);
    fclose(pFile);
    memfrob(buffer, Size);
    fopen_s(&pFile, "C:\\Program Files\\CCP\\EVE\\lib\\corelib\\nasty.pyj", "w+");
    fwrite(buffer, Size, 1, pFile);
    fclose(pFile);
}
In my debugger, it seems that fread is not writing anything to buffer, and my resulting file is just 2A over and over, which is 00 XORed with 42 (0x2A). So can I get another hint?
You need to pass memfrob a piece of memory containing the contents of the file, rather than the name of the file. It's crashing because you're passing in a buffer of read-only memory, and then trying to modify it.
Investigate the open and read I/O functions, or alternatively fopen and fread. Your mainline should look something like:
int main() {
// open file
// find size of file
// allocate buffer of that size
// read contents of file into the buffer
// close the file
// call memfrob on the buffer
// do what you want with the file
// free the buffer
}
Well, several things are wrong here.
The minor problem is that you're passing it the location of the file and not the file contents. Read up on how to do file I/O in C (this being a pretty good link).
The real problem is that you seem to think this is encryption. It doesn't really protect your file against anything but the most trivial snooping (such as someone casually opening your file).