I'm currently writing a library using OpenCL and wondered:
Is it possible for an OpenCL platform handle and an OpenCL device handle to have the same numeric value?
More generally: Are handle collisions between OpenCL objects of different types possible?
Yes, it is possible; it is implementation-defined. Nothing in the spec requires handle values to be unique across object types, so they may collide.
Related
Is it possible to translate OpenCL-style SPIR-V to Vulkan-style SPIR-V?
I know that it is possible to use clspv to compile OpenCL C to Vulkan-style SPIR-V, but I haven't seen any indication that it also supports ingesting OpenCL-style SPIR-V.
Thank you for any suggestions if you know how to achieve this :)
I know that it is possible to use clspv to compile OpenCL C to
Vulkan-style SPIR-V, but I haven't seen any indication that it also
supports ingesting OpenCL-style SPIR-V.
clspv compiles to "OpenCL-style SPIR-V"; in other words, it uses the OpenCL execution model and the OpenCL memory model. The answer to your question is no (in general). The problem is that, for example, GLSL uses a logical memory model, in which pointers are abstract, so you cannot have pointers to pointers. OpenCL allows this because it uses a physical memory model. There are also other things in OpenCL that cannot be expressed in GLSL. You could try to write a translator, and it might work for some very simple code, but that's about it.
I am writing an N-body physics simulation. I would like to ask if there is an alternative to the OpenCL clGetGLContextInfoKHR() function. I need to find out at runtime which GPU is used for OpenGL rendering, so that I can use OpenCL for vertex manipulation on that same GPU (for performance reasons).
I have searched OpenCL.dll for the function clGetGLContextInfoKHR() using Dependency Walker, but it seems that the implementation installed on my computer does not support it, since this function is missing from the DLL. I have also tried glGetString(GL_RENDERER), but the name string it returns differs from the name string which clGetDeviceInfo(.., CL_DEVICE_NAME, ...) returns (not by much, but enough to make it difficult, for example, to distinguish two GPUs from the same manufacturer). Is there any other way besides manually choosing the correct OpenCL device?
Thanks for the help!
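One note on the missing export: clGetGLContextInfoKHR is an extension entry point, so it is normally absent from OpenCL.dll's export table regardless of support; it has to be fetched at runtime via clGetExtensionFunctionAddressForPlatform. An untested sketch for Windows/WGL, where `platform`, `glContext`, and `hdc` are assumed to have been obtained elsewhere (e.g. from clGetPlatformIDs, wglGetCurrentContext, and wglGetCurrentDC):

```c
#include <CL/cl.h>
#include <CL/cl_gl.h>

/* clGetGLContextInfoKHR is not exported by OpenCL.dll; query it per platform. */
clGetGLContextInfoKHR_fn getGLContextInfo =
    (clGetGLContextInfoKHR_fn)clGetExtensionFunctionAddressForPlatform(
        platform, "clGetGLContextInfoKHR");

if (getGLContextInfo) {
    cl_context_properties props[] = {
        CL_GL_CONTEXT_KHR,   (cl_context_properties)glContext, /* wglGetCurrentContext() */
        CL_WGL_HDC_KHR,      (cl_context_properties)hdc,       /* wglGetCurrentDC() */
        CL_CONTEXT_PLATFORM, (cl_context_properties)platform,
        0
    };
    cl_device_id device;
    /* Ask which OpenCL device is driving the current GL context. */
    getGLContextInfo(props, CL_CURRENT_DEVICE_FOR_GL_CONTEXT_KHR,
                     sizeof(device), &device, NULL);
}
```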
I am fairly certain that a warp is only defined in CUDA. But maybe I'm wrong. What is a warp in terms of OpenCL?
It's not the same as work group, is it?
Any relevant feedback is highly appreciated. Thanks!
It isn't defined in the OpenCL standard. A warp is a thread as executed by the hardware (CUDA threads are not really independent threads; they map onto a warp as separate SIMD lanes with some clever hardware/software mapping). In OpenCL terms, a warp is a collection of work-items, and there can be multiple warps in a work-group.
An OpenCL subgroup was designed to be compatible with a hardware thread, so a subgroup can represent a warp in an OpenCL kernel. However, it is entirely up to NVIDIA whether to implement subgroups, and an OpenCL subgroup cannot expose every feature NVIDIA exposes for warps: subgroups are part of a standard, while NVIDIA can do anything it likes on its own devices.
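For context, a minimal sketch of what subgroups look like in OpenCL C. This assumes a device exposing cl_khr_subgroups (or OpenCL 2.1+); on NVIDIA hardware a subgroup typically corresponds to a warp of 32 work-items. Untested sketch; for brevity the output indexing is per-work-group only:

```c
#pragma OPENCL EXTENSION cl_khr_subgroups : enable

__kernel void subgroup_sum(__global const float *in, __global float *out) {
    size_t i = get_global_id(0);
    // Reduce across the work-items of this subgroup (i.e. across the warp).
    float total = sub_group_reduce_add(in[i]);
    // The first work-item of each subgroup writes one partial sum.
    if (get_sub_group_local_id() == 0)
        out[get_sub_group_id()] = total;
}
```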
Pipes are one of OpenCL 2.0's new features, and the feature has been demonstrated in the AMD APP SDK's producer/consumer example. I've read some articles about pipe use cases, and they all follow the producer/consumer pattern.
My question is: the same functionality can be achieved by creating a global memory object and passing the pointer to two kernel functions, given that OpenCL 2.0 provides shared virtual memory. So what's the difference between a pipe object and a global memory object? Or was it invented just for optimization?
It is as useful as std::vector and std::queue.
One is useful to store data, while the other is useful to store packets.
Packets are indeed data, but it is much easier to handle them as small units rather than a big block.
Pipes in OpenCL let you consume these small packets in a kernel without dealing with the indexing + storing + pointers + for-loops hell that would result from implementing a pipe mechanism manually in the kernel.
Pipes are useful, for example, when each work-item can generate a variable number of outputs. Prior to OpenCL 2.0 this was difficult to handle.
Pipes may also reside in faster memory (this is vendor-specific); Altera, for instance, recommends using pipes to exchange data between kernels instead of going through global memory.
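A minimal producer/consumer pair in OpenCL 2.0 C, in the spirit of the AMD APP SDK sample, looks roughly like this (untested sketch; requires an OpenCL 2.0 device, with the pipe created on the host via clCreatePipe and passed to both kernels):

```c
// Producer: each work-item pushes one packet into the pipe.
__kernel void producer(__global const int *src, __write_only pipe int out) {
    int v = src[get_global_id(0)] * 2;
    // write_pipe returns 0 on success; here a full pipe simply drops the packet.
    write_pipe(out, &v);
}

// Consumer: each work-item pops one packet, if one is available.
__kernel void consumer(__read_only pipe int in, __global int *dst) {
    int v;
    if (read_pipe(in, &v) == 0)
        dst[get_global_id(0)] = v;
}
```

Note the contrast with a shared global buffer: the kernels never agree on indices or manage a cursor; the FIFO ordering and synchronization are handled by the pipe itself.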
Pipes are designed to transfer data from one kernel to another without having to store/load the data in global or host memory. This is essentially a FIFO on the FPGA device, so accessing the data is much faster than going through DDR or host memory; this is probably the reason to use an FPGA as an accelerator in the first place.
Sometimes DDR is still used to share data between kernels as well. One example is when a SIMD kernel wants to share data with a single-task kernel that depends on the input data sequence, since pipes written from a SIMD kernel are filled out of order.
Besides pipes, you can use Altera channels, which support more functionality, but they are not portable to other OpenCL devices.
Hope this can help. :)
As I understand it, OpenCL uses a modified C language (adding some keywords like __global) for defining kernel functions. I am now writing a front-end inside the F# language, which has a code quotation feature for metaprogramming (you can think of it as a kind of reflection). So I would like to know if there is a general binary intermediate representation for kernels, instead of a C source file.
I know that CUDA supports LLVM IR as a binary intermediate representation, so kernels can be created programmatically, and I want to do the same thing with OpenCL. But the documentation says that the binary format is not specified; each implementation can use its own. So is there any general-purpose IR that can be generated by a program and run on the NVIDIA, AMD, and Intel implementations of OpenCL?
Thanks.
No, not yet. Khronos is working on SPIR (the spec is still provisional), which would hopefully become this. As far as I can tell, none of the major implementations support it yet. Unless you want to bet your project on its success and possibly delay your project for a year or two, you should probably start with generating code in the C dialect.