Can you pass const unsigned int4* to a kernel?

I have instructions to use:
__kernel void myKernel(__global const unsigned int4* data
But I get CL_INVALID_PROGRAM_EXECUTABLE whenever I try to build it. However, both of these build without error:
__kernel void myKernel(__global const int4* data
__kernel void myKernel(__global const unsigned int* data

"unsigned int" is a valid type, but "unsigned int4" is not. I think what you're looking for is "uint4". See section 6.1.2 of the specification ("Built-in Vector Data Types").

Related

How can I access functions of this abstract class?

I'm currently working with a UI tool (Qt Creator 9.5.9) to create user interfaces. While experimenting with the tool I came across the following problem:
The following code is from a cpp file that is generated automatically when creating a new project.
At the top there are a few functions which I assume can be used to access and possibly change data points.
I want to use the function SetWriteDP() to write my data to the data points.
/**
// register ids
bool registerReadIds(const QList<unsigned int> &ids);
bool registerReadIds(const QUintSet &ids);
bool registerReadIds(const QUintSet &ids, void (*func)(void*, const QUintSet &));
bool registerWriteIds(const QList<unsigned int> &ids);
bool registerWriteIds(const QUintSet &ids);
// read data point values
unsigned int GetReadDP(const unsigned int &id) const;
int GetReadDPInt(const unsigned int &id) const;
float GetReadDPFloat(const unsigned int &id) const;
QString GetReadDPString(const unsigned int &id) const;
// write data point values
void SetWriteDP(const unsigned int &id, const unsigned int &value);
void SetWriteDP(const unsigned int &id, const int &value);
void SetWriteDP(const unsigned int &id, const float &value);
void SetWriteDP(const unsigned int &id, const QString &value);
// execute sql statement
QSqlQuery execSqlQuery(const QString &query, bool &success) const;
**/
#include "hmi_api.h"
#include "widget.h"
#include "ui_arbaseform.h"
#include <iostream>
HMI_API::HMI_API(QWidget *parent) :
AbstractAPI(parent), m_ui(NULL)
{
Widget *widget = dynamic_cast<Widget *>(parent);
if(!widget) return;
m_ui = widget->ui;
QUintSet readIdsToRegister, writeIdsToRegister;
writeIdsToRegister.insert(10001);
registerReadIds(readIdsToRegister);
registerWriteIds(writeIdsToRegister);
SetWriteDP(100001, 69);
}
I tried using the function in another cpp file in different ways:
HMI_API.SetWriteDP()
HMI_API.Abstract_API.SetWriteDP()
This resulted in this error: expected unqualified-id before . token
AbstractAPI::SetWriteDP()
which resulted in this error: cannot call member function 'void DPObject::SetWriteDP(const unsigned int&, const int&, unsigned int)' without object AbstractAPI::SetWriteDP();
Then I tried creating a DPObject, which resulted in this error: cannot declare variable 'test' to be of abstract type 'DPObject'
I'm really at my wits' end as to how to access this function. Can someone explain what happens after "HMI_API::HMI_API(QWidget *parent) :", why it is possible to use the function in that block, and how I can use this function myself?
I tried reading the documentation, but this function is not mentioned anywhere in it.
The function works in the code snippet I posted but not when I want to use it in another function. I know it has something to do with classes, but I don't understand how to work around it in this case.
Thanks in advance!
"how i can make it possible for me to use this function."
I might be wrong, but from my understanding of C++ you first have to create an object of the class. In this case that would be
HMI_API *uiName = new HMI_API(some_parent_obj);
where some_parent_obj is the QWidget you created earlier. Because uiName is a pointer, you then call the function using -> rather than a dot:
uiName->SetWriteDP(x, y);
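"Can someone maybe explain to me what happens after "HMI_API::HMI_API(QWidget *parent) :""
The part after "HMI_API::HMI_API(QWidget *parent) :" is the constructor's member initializer list. AbstractAPI(parent) calls the constructor of the base class AbstractAPI, which HMI_API inherits its functionality from, and m_ui(NULL) initializes the member variable m_ui to a null pointer (m_ui is a member, not a base class). You can learn more about inheritance here: https://www.learncpp.com/cpp-tutorial/basic-inheritance-in-c/
After that I'm not sure, but it looks like the constructor body just does some basic setup. The inherited functions can be called there without naming an object because the code runs inside a member function of the class.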
I found an answer to my problem, which in hindsight might have been obvious, but I don't know how I should have known.
I was able to use the functions by declaring a new member function like this:
void HMI_API::myFunction(int arg1){
// my code, using the functions I wanted to call
}
I really hope this helps someone who had similar trouble understanding this.

Where should I define a C function that will be called in C kernel code when using PyOpenCL

Since kernel code in PyOpenCL has to be written in C, I have written a few functions that need to be called inside the kernel code. Where should I store these functions, and how do I pass a global variable to such a function?
In PyOpenCL my kernel code looks like this:
program = cl.Program(context, """
__kernel void Kernel_OVERLAP_BETWEEN_N_IP_GPU(__constant int *FBNs_array,__local int *Binary_IP, __local int *cc,__global const int *olp)
{
function1(int *x, int *y,__global const int *olp);
}
""").build()
Where should I write and store function1? Should I define it in the kernel source itself, or in some other file and provide a path? If I need to define it somewhere else and provide a path, please give me some details; I am completely new to C.
Thanks
As in C: define it before the kernel, in the same program source.
program = cl.Program(context, """
void function1(int *x, int *y)
{
//function1 code
}
__kernel void kernel_name()
{
int x = 0, y = 0;
function1(&x, &y); // pass arguments here, not parameter declarations
}""").build()
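If the function also needs a __global buffer, pass the pointer through as an argument:
program = cl.Program(context, """
void function1(int *x, int *y,__global const int *cc)
{
*x=10; // write through the pointer so the caller sees the change
}
__kernel void kernel_name(__global const int *cc)
{
int x=1;
int y[1]={10};
function1(&x,y,cc); //now x=10
}""").build()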

Workgroup Bound Check not working

In my OpenCL kernel I'm checking whether the global ID is inside the global problem size, but it is not working.
If the global problem size is divisible by the work-group size, everything is fine. If not, the kernel does nothing at all.
__kernel void move_points(const unsigned int points,
const unsigned int floors,
const unsigned int gridWidth,
const unsigned int gridHeight,
__global const GraphData *graph,
__global const float *pin_x,
__global const float *pin_y,
__global const float *pin_z,
__global float *pout_x,
__global float *pout_y,
__global float *pout_z,
__global clrngMrg31k3pHostStream *streams)
{
int id = get_global_id(0);
if (id < points) {
// do kernel things...
}
}
Does anybody know where the problem is?
Thanks a lot. Robin.
If your global size is not divisible by your local size (work-group size), the kernel will not run at all. (This holds for OpenCL 1.x; OpenCL 2.0 added support for non-uniform work-groups.)
The enqueueNDRangeKernel() call will return CL_INVALID_WORK_GROUP_SIZE as an error, as specified in the documentation for clEnqueueNDRangeKernel.
If you really want to follow the CUDA model, where some work-items may go unused, keep the check in the kernel (as you already have) and use a bigger global size that is a multiple of your local size.
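The rounding described in this answer can be sketched on the host side. This is only a minimal sketch in plain Python: round_up_global_size is a hypothetical helper name, and the sizes 1000 and 64 are made-up example values, not taken from the question.

```python
def round_up_global_size(problem_size: int, local_size: int) -> int:
    """Return the smallest multiple of local_size that is >= problem_size.

    Hypothetical helper: the result is meant to be used as the global size
    when enqueueing the kernel, with local_size as the work-group size.
    """
    if problem_size < 0:
        raise ValueError("problem_size must be non-negative")
    if local_size <= 0:
        raise ValueError("local_size must be positive")
    # Integer ceiling division, then scale back up to a multiple.
    groups = (problem_size + local_size - 1) // local_size
    return groups * local_size


# Example values (assumptions for illustration only).
points = 1000
local_size = 64

global_size = round_up_global_size(points, local_size)  # 1024

# Work-items with id >= points are padding; the kernel's
# `if (id < points)` check makes them do nothing.
padding = global_size - points  # 24
```

The kernel still receives the real problem size (points) as an argument, so the bounds check inside the kernel is what keeps the padding work-items from touching memory out of range.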

OpenCL Local Memory Declaration

What is the difference between declaring local memory as follows:
__kernel void mmul(const int Ndim, const int Mdim, const int Pdim,
const __global int* A,
const __global int* B,
__global char* C,
__local int* restrict block_a,
__local int* restrict block_b)
and declaring local memory inside the kernel
#define a_size 1024
#define b_size 1024 * 1024
__kernel void mmul(const int Ndim, const int Mdim, const int Pdim,
const __global int* A,
const __global int* B,
__global char* C) {
__local int block_a[a_size];
__local int block_b[b_size];
...
}
In both cases, each thread will update a single cell in the shared A and B arrays.
I understand that it's not possible to have "variable" length arrays in the kernel (hence the #define at the top of the second kernel), but is there any other difference? Is there any difference with regards to when the memory is freed?
In both cases, local memory exists for the lifetime of the work-group. The only difference, as you have noted, is that passing the local memory pointer as an argument allows the size of the buffer to be specified dynamically, rather than being a compile-time constant. Different work-groups will always use different local memory allocations.
The second method is better if you want to port the code to CUDA, because __shared__ memory in CUDA (the equivalent of __local in OpenCL) cannot be declared as a kernel argument as in the first case.

Problem reinterpreting parameters in OpenCL 1.0

Is it possible to reinterpret parameters that have been passed into an OpenCL kernel? For example, if I have an array of integers being passed in, but I want to interpret the integer at index 16 as a float (don't ask why!), then I would have thought this would work:
__kernel void Test(__global float* im, __constant int* constArray)
{
float x = *( (__constant float*) &constArray[16] );
im[0] = x;
}
However, I get a CL_INVALID_COMMAND_QUEUE error when I next try to use the command queue, implying that the above code has performed an illegal operation.
Any suggestions as to what is wrong with the above, and/or how to achieve the reinterpretation?
I have now tried:
__kernel void Test(__global float* im, __constant int* constArray)
{
float x = as_float(0x3f800000);
im[0] = x;
}
and this does indeed give a 1.0f in im[0]. However,
__kernel void Test(__global float* im, __constant int* constArray)
{
float x = as_float(constArray[16]);
im[0] = x;
}
always results in zero in im[0] regardless of what is in constArray[16].
Regards,
Mark.
OpenCL includes the as_type() and as_typen() family of operators for reinterpreting a value as another type of the same size. If I am understanding the question, you should be able to do something like
__kernel void Test(__global float* im, __constant int* constArray)
{
float x = as_float(constArray[16]);
im[0] = x;
}
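To check on the host which float a given 32-bit integer pattern should produce, the same bit reinterpretation can be mimicked with Python's struct module. This is only a sketch for sanity-checking values such as the 0x3f800000 used in the question; int_bits_to_float and float_to_int_bits are hypothetical helper names, and the code does not run any OpenCL.

```python
import struct


def int_bits_to_float(bits: int) -> float:
    """Reinterpret the low 32 bits of an integer as an IEEE 754 float.

    Host-side analogue of as_float() applied to a 32-bit integer:
    the bits are kept as they are, only the type changes.
    """
    raw = struct.pack("<I", bits & 0xFFFFFFFF)  # 4 bytes, little-endian
    return struct.unpack("<f", raw)[0]


def float_to_int_bits(value: float) -> int:
    """Inverse direction: the 32-bit pattern of a single-precision float."""
    raw = struct.pack("<f", value)
    return struct.unpack("<I", raw)[0]


one = int_bits_to_float(0x3F800000)      # 1.0, as seen in the question
pattern = float_to_int_bits(1.0)         # 0x3F800000
```

A helper like float_to_int_bits can be used to prepare the integer that goes into the constant array on the host, so the value the kernel is expected to write into im[0] is known in advance.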

Resources