I am trying to use recursion inside an OpenCL kernel. Compilation succeeds, but at run time I get a compilation error. Since Dynamic Parallelism is now supported by CUDA, I want to know: does OpenCL support Dynamic Parallelism or not?
Recursion is not supported by OpenCL. See point i in section 6.9 of the standard v1.2.
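Since recursion is ruled out, the usual workaround is to rewrite the recursive function as a loop with an explicit, fixed-size stack. The following is only a minimal sketch in plain C of that idea (the same pattern carries over to kernel code); the array-based tree layout, the function name tree_sum_iterative and the MAX_STACK bound are assumptions made up for this illustration, not something from the standard.

```c
#include <stddef.h>

#define MAX_STACK 64  /* assumed upper bound on the number of pending nodes */

/* Sums the values of a binary tree stored in arrays: left[i] and right[i]
   hold the child indices of node i, or -1 when there is no child.
   A recursive version would call itself for each child; here the pending
   nodes are kept on an explicit stack instead. */
static int tree_sum_iterative(const int *value, const int *left,
                              const int *right, int root)
{
    int stack[MAX_STACK];
    int top = 0;
    int sum = 0;

    if (root < 0) {
        return 0;
    }
    stack[top++] = root;

    while (top > 0) {
        int node = stack[--top];
        sum += value[node];

        /* Children are skipped if the stack is full, so MAX_STACK has to
           be chosen large enough for the deepest tree you expect. */
        if (right[node] >= 0 && top < MAX_STACK) {
            stack[top++] = right[node];
        }
        if (left[node] >= 0 && top < MAX_STACK) {
            stack[top++] = left[node];
        }
    }
    return sum;
}
```

For a four-node tree with values {1, 2, 3, 4}, where node 0 has children 1 and 2 and node 1 has a left child 3, the function returns 10.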
EDIT: The new Dynamic Parallelism capability of CUDA doesn't have anything to do with recursion (which CUDA has already supported for a while; see this question). This new capability allows threads running on the device to configure and launch new grids, which previously only the host could do. See this document for an overview.
SECOND EDIT: regarding the answer of @Michael: This is only the spec, you will have to wait for the implementation release. Besides, at that point in the future you will also have to make sure to have the proper hardware (even in CUDA, dynamic parallelism is supported only on devices of compute capability 3.5 and higher). So when you asked your question, and still today: NO OpenCL implementation supports dynamic parallelism.
Dynamic Parallelism is now supported in OpenCL 2.0.
Khronos Group announced it at SIGGRAPH 2013.
You can find the specifications here
Is it possible to run OpenCL on a system designed by a user on a SoC prototyping board? To be more specific, I have a ZedBoard (Xilinx Zynq) that has Dual ARM cores and a Programmable Logic (PL) Area. If I design a simple system of my own that has a video processing accelerator implemented in the logic area, an ARM core and an AXI interconnect, what do I have to do to provide OpenCL support for this simple system? (In this simple system, the ARM core could be the "Host" and the video processing accelerator could be the "device").
I am a student and I have only some basic knowledge about OpenCL. I have researched about my question and have only ended up confusing myself. What are the things that have to be done to provide OpenCL support for a SoC? I understand that this may be a big project, but I need a guideline where to start and how to proceed.
what do I have to do to provide OpenCL support for this simple system?
Implement an OpenCL platform that makes use of either your ARM CPU or the FPGA (or both). I'd say that is pretty much impossible for you; ARM would surely offer one for the CPU if it was easy (and they definitely have the financial means to employ capable engineers/computer scientists), and implementing accelerators on an FPGA requires in-depth knowledge of FPGA development, as well as compiler theory and experience in systems design. I don't want to sound mean, but you seem to have none of these three.
You asked where to get started; I recommend just writing a first accelerator that e.g. adds up a vector of numbers; as soon as you have that, you will have a clearer idea of your task.
If you want to have a look at a reference: The Ettus USRP E310 is a Zynq-based SDR device. Ettus has a technology called RFNoC, which allows users to write their own blocks to push data through. Notice that this took quite a few engineers and quite some time to get started. Notice further that it's much easier than implementing something that converts OpenCL to FPGA implementations.
If you have access to the Xilinx tools: Vivado HLS 2015.1 System Edition should compile OpenCL kernels. This will also be included in the SDAccel tool suite.
Source: UG973: Vivado Design Suite User Guide: Release Notes, Installation, and Licensing
An alternative might be switching to Altera. They provide some good examples for the Altera Cyclone V SoC, which is comparable to Xilinx Zynq devices (it also includes an ARM Cortex-A9):
Altera SDK for OpenCL
I am also a student, and my current project is going in a similar direction. I have successfully installed an OpenCL implementation called POCL on the ZedBoard, and it detects the ZedBoard's ARM CPU. To install POCL you need LLVM and a horde of other things as well, but the basic steps to get POCL up on the ZedBoard are given below:
Installing pocl:
http://www.hosseinabady.com/install-pocl-opencl
running example:
http://www.hosseinabady.com/embedded-system-by-examples/opencl_embedded_system/opencl-vector-addition
There are lots of dependencies, most of which can be resolved easily.
For LLVM, make sure you install version 3.4 for POCL 0.9.
Steps to install llvm
https://github.com/pacs-course/pacs/wiki/Instructions-to-install-clang-3.1-on-ubuntu-12.04.1-and-12.10
POCL 0.9 is working for me. As you do the installation you will face many other missing dependencies, such as hwloc, the Mesa libraries, the OpenGL/OpenCL headers, and ICD loaders. I hope you can resolve them, as the list is too long to put up on Stack Overflow.
Getting your FPGA detected as an OpenCL device is not going to be trivial. You can refer to this question I posted on GitHub:
https://github.com/pocl/pocl/issues/285
and also a research paper published by Hosseinabady, found via the publications link on the POCL website:
http://pocl.sourceforge.net/publications.html
hope this helps you
Try the ARM OpenCL SDK. The Zedboard has an ARM A9 CPU, which should have a NEON SIMD vector unit http://www.arm.com/products/processors/technologies/neon.php that can run OpenCL. See http://www.arm.com/products/multimedia/mali-technologies/opencl-for-neon.php.
The Zedboard isn't listed as an OpenCL conformant platform https://www.khronos.org/conformance/adopters/conformant-products#opencl.
So there is a chance the ARM driver will not work.
Good luck!
If still relevant, try this paper OpenCL on ZYNQ [PDF]
Also note that Zynq-7000 is listed on https://www.khronos.org/conformance/adopters/conformant-products#opencl ( OpenCL_1_0 ), hence the compatibility.
My application is OpenCL 1.1 compatible, and I want to check whether each device has drivers for that version. There are 2 ways to do this:
1. clGetDeviceInfo() -> CL_DEVICE_VERSION
2. clGetPlatformInfo() -> CL_PLATFORM_VERSION
I have the following doubts:
I do not understand why method 1 is provided, as method 2 seems the correct way to me?
Is it possible that the version given by the platform will not match the version given by a device from the same platform?
What is clGetDeviceInfo::CL_DRIVER_VERSION for?
Of all these options, which one should I use to check if a device can run my OpenCL 1.1 code?
There are some features in OpenCL that have hardware requirements. This means that even if a particular vendor's OpenCL implementation (the platform) supports an OpenCL version, the device might not. So, it is entirely possible for the versions returned from the CL_DEVICE_VERSION and CL_PLATFORM_VERSION queries to differ.
This will probably start to happen more frequently when OpenCL 2.0 implementations start appearing, as there is plenty of hardware on the market that doesn't have the necessary support for OpenCL 2.0 features. Imagine a system that has two devices from Vendor X: a new Device A that can run OpenCL 2.0, and a much older Device B that can't. In this instance, the platform version may be OpenCL 2.0, but the device version could be OpenCL 2.0 for Device A and OpenCL 1.2 for Device B.
The CL_DRIVER_VERSION is for getting a vendor-specific version number for the implementation. This number could use any version numbering system that the vendor uses to keep track of different software releases, and is completely independent from OpenCL version numbers (although some vendors may well include the OpenCL version here too).
So, in order to be sure that both the device and platform support your required OpenCL version, you should just need to check CL_DEVICE_VERSION.
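As a rough sketch of that check: the string returned for CL_DEVICE_VERSION has the form "OpenCL <major>.<minor> <vendor-specific information>", so it can be parsed with sscanf. The helper name opencl_version_at_least is made up for this example, and the code assumes you have already fetched the string with clGetDeviceInfo(); it does not call OpenCL itself.

```c
#include <stdio.h>

/* Returns 1 if a version string of the form
   "OpenCL <major>.<minor> <vendor-specific information>"
   reports at least the requested version, and 0 otherwise
   (including when the string cannot be parsed).
   The string is assumed to come from clGetDeviceInfo() with
   CL_DEVICE_VERSION; this helper only does the parsing. */
static int opencl_version_at_least(const char *version_string,
                                   int want_major, int want_minor)
{
    int major = 0;
    int minor = 0;

    if (version_string == NULL) {
        return 0;
    }
    if (sscanf(version_string, "OpenCL %d.%d", &major, &minor) != 2) {
        return 0;
    }
    if (major != want_major) {
        return major > want_major;
    }
    return minor >= want_minor;
}
```

For the OpenCL 1.1 case from the question you would call opencl_version_at_least(version, 1, 1) for each device and skip the devices for which it returns 0.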
I wonder how we can have OpenCL "seeing" my K20, Xeon, and Xeon Phi at the same time?
Especially I'm confused about the use of two libraries here (from NVidia and Intel).
How to do it, if possible at all?
The OpenCL Installable Client Driver (ICD) takes care of this for you. It is the same regardless of whose implementation you have installed, and exposes all implementations as separate OpenCL "Platforms".
When you call clGetPlatformIDs it will tell you how many platforms you have installed. There could be one for AMD, one for NVIDIA, and one for Intel, for example.
Then within each platform you call clGetDeviceIDs which will return the number of devices within that platform. On your NVIDIA platform you'll find your K20, and within your Intel platform you'll find your Xeon CPU and Xeon Phi co-processor.
If you build or download the clInfo utility you'll see a nice dump of all the installed platforms and devices and the capabilities of each.
The problem is solved.
Looking at the key directory:
/etc/OpenCL/vendors/*.icd
I noticed that for Nvidia the library in use was a link which was duplicated in different places, pointing to two different releases.
I just replaced the former one with the most recent one, the one I installed recently, and here we go.
OpenCL did not know which one to use, I guess.
It looks like the installation location changed between the two Nvidia versions.
I was supposed to have removed the old one before reinstalling, but that had not actually happened.
Thank you all for your help.
I've been doing some research in to OpenCL, and the possibility of using it on a project. The question I have is, is there a way to run OpenCL code on a CPU that is unsupported by the OpenCL SDKs in a C++ application. I know Java has Aparapi, however I'm wondering how to run OpenCL code in a C++ application without hardware that is supported by the SDKs. There is some code I would like to write in OpenCL kernels to take advantage of the OpenCL parallelism where available, however I'm unsure if I wouldn't be able to run it on older hardware (still X86, but not recent hardware). Could anyone explain to me how this can be done, or if it is even a problem at all to run OpenCL code on older systems?
Thanks,
Peter
I would say the best way to approach this is to check whether the system supports OpenCL via OpenCL API calls such as clGetPlatformIDs. If you find there is no OpenCL device, run the required code as a normal C/C++ function; otherwise run it as an OpenCL kernel. But for portability you need to write the program logic twice: once in a .cl file and once as a normal C/C++ method/function.
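A minimal sketch of that fallback in plain C, assuming the platform count has already been obtained from clGetPlatformIDs. The names vec_add_fn, vec_add_cpu and choose_vec_add are invented for this illustration, and the OpenCL-backed implementation is passed in as a function pointer so the sketch does not depend on the OpenCL headers.

```c
#include <stddef.h>

/* Common signature shared by the plain C version and the OpenCL-backed
   version of the same operation. */
typedef void (*vec_add_fn)(const float *a, const float *b,
                           float *out, size_t n);

/* Plain C fallback: the same logic as the kernel, written a second time. */
static void vec_add_cpu(const float *a, const float *b,
                        float *out, size_t n)
{
    size_t i;
    for (i = 0; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}

/* Picks the implementation to use. num_platforms is assumed to be the
   count reported by clGetPlatformIDs (0 if the call failed or nothing is
   installed); opencl_impl is a hypothetical wrapper that runs the kernel,
   or NULL if it could not be set up. */
static vec_add_fn choose_vec_add(unsigned int num_platforms,
                                 vec_add_fn opencl_impl)
{
    if (num_platforms > 0 && opencl_impl != NULL) {
        return opencl_impl;
    }
    return vec_add_cpu;
}
```

The caller makes this choice once at start-up and then calls the returned function everywhere, so the rest of the application does not need to know which path is in use.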
I have a system with an NVidia graphics card and I'm looking at using openCL to replace openMP for some small on CPU tasks (thanks to VS2010 making openMP useless)
Since I have NVidia's opencl SDK installed clGetPlatformIDs() only returns a single platform (NVidia's) and so only a single device (the GPU).
Do I need to also install Intel's openCL sdk to get access to the CPU platform?
Shouldn't the CPU platform always be available - I mean, how do you NOT have a cpu?
How do you manage to build against two openCL SDKs simultaneously?
You need to have an SDK which provides an interface to the CPU. NVIDIA's does not; AMD's and Intel's SDKs do. In my case the one from Intel is significantly (something like 10x) faster, which might be due to bad programming on my part, however.
You don't need the SDK for programs to run, just the runtime. In Linux, each vendor installs a file in /etc/OpenCL/vendors/*.icd, which contains path of the runtime library to use. That is scanned by the OpenCL runtime you link to (libOpenCL.so), which then calls each of the vendor's libs when querying for devices on that particular platform.
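To make the mechanism a bit more concrete, here is a small sketch in C that reads one such .icd file and returns the library name or path it contains. It assumes the file holds a single line of text, and read_icd_library is a made-up helper for inspection only; the real scanning is done inside libOpenCL.so.

```c
#include <stdio.h>
#include <string.h>

/* Reads the first line of an .icd file (for example one of the files under
   /etc/OpenCL/vendors/) into out, with trailing whitespace removed.
   Returns 0 on success and -1 if the file cannot be read or is empty. */
static int read_icd_library(const char *icd_path, char *out, size_t out_size)
{
    FILE *f;
    size_t len;

    if (icd_path == NULL || out == NULL || out_size == 0) {
        return -1;
    }
    f = fopen(icd_path, "r");
    if (f == NULL) {
        return -1;
    }
    if (fgets(out, (int)out_size, f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);

    /* Strip the newline and any other trailing whitespace. */
    len = strlen(out);
    while (len > 0 && (out[len - 1] == '\n' || out[len - 1] == '\r' ||
                       out[len - 1] == ' ' || out[len - 1] == '\t')) {
        out[--len] = '\0';
    }
    return len > 0 ? 0 : -1;
}
```

Printing the result for every file in the vendors directory is a quick way to see which runtime libraries the loader will try, and whether two entries point at different releases of the same vendor's library.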
In Linux, the GPU drivers install OpenCL runtime automatically, the Intel runtime is likely to be downloadable separately from the SDK, but is part of the SDK as well, of course.
Today I finally got around to trying to start doing OpenCL development and wow... it is not straightforward at all.
There's an AMD SDK, there's an Intel SDK, there's an NVIDIA SDK, each with their own properties (CPU only vs GPU only vs specific video card support only perhaps?)
There may be valid technical reasons for it having to be this way, but I really wish there was just one SDK, and that when programming perhaps you could specify GPU / CPU tasks, or that maybe it would use whatever resources made most sense / performed best or SOMETHING.
Time to dive in though I guess... trying to decide though if I go CPU or GPU. I have a pretty new $4000 Alienware laptop with SLI video cards, but then also an 8 core CPU so yeah... guess I'll have to try a couple of SDKs and see which performs best for my needs?
Not sure what end users of my applications would do though... it doesn't seem like they can flip a switch to make it run on CPU or GPU instead.
The OpenCL landscape really needs some help...