I have a system with an NVIDIA graphics card and I'm looking at using OpenCL to replace OpenMP for some small on-CPU tasks (thanks to VS2010 making OpenMP useless).
Since I only have NVIDIA's OpenCL SDK installed, clGetPlatformIDs() returns only a single platform (NVIDIA's) and so only a single device (the GPU).
Do I need to also install Intel's OpenCL SDK to get access to the CPU platform?
Shouldn't the CPU platform always be available? I mean, how do you NOT have a CPU?
How do you manage to build against two OpenCL SDKs simultaneously?
You need an SDK that provides an interface to the CPU. NVIDIA's does not; AMD's and Intel's do. In my case the one from Intel is significantly (something like 10x) faster, though that might be due to bad programming on my part.
You don't need the SDK for programs to run, just the runtime. On Linux, each vendor installs a file under /etc/OpenCL/vendors/*.icd that contains the path of the runtime library to use. That directory is scanned by the OpenCL library you link to (libOpenCL.so), which then calls into each vendor's library when querying for devices on that particular platform.
On Linux, the GPU drivers install their OpenCL runtime automatically. The Intel runtime is likely to be downloadable separately from the SDK, but it is of course part of the SDK as well.
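To see which vendor runtimes the loader would pick up, you can list the .icd files yourself. The following is only a minimal sketch in plain POSIX C: the function name list_icd_files is made up for illustration, and it assumes each .icd file names its runtime library on the first line. It does not call into OpenCL at all.

```c
#include <assert.h>
#include <glob.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: print every file matching `pattern` together with
 * the first line of that file (assumed to name the vendor runtime library).
 * Returns the number of files matched, 0 if none, -1 if glob() failed. */
int list_icd_files(const char *pattern)
{
    assert(pattern != NULL);

    glob_t matches;
    memset(&matches, 0, sizeof matches);

    int rc = glob(pattern, 0, NULL, &matches);
    int count = -1;

    if (rc == GLOB_NOMATCH) {
        count = 0;
    } else if (rc == 0) {
        count = (int)matches.gl_pathc;
        for (size_t i = 0; i < matches.gl_pathc; i++) {
            char line[512] = "";
            FILE *f = fopen(matches.gl_pathv[i], "r");
            if (f != NULL) {
                /* Keep only the first line and strip the line ending. */
                if (fgets(line, sizeof line, f) != NULL)
                    line[strcspn(line, "\r\n")] = '\0';
                fclose(f);
            }
            printf("%s -> %s\n", matches.gl_pathv[i],
                   line[0] ? line : "(unreadable or empty)");
        }
    }

    globfree(&matches);
    return count;
}
```

Calling list_icd_files("/etc/OpenCL/vendors/*.icd") from your own main() should print one line per installed vendor runtime. If a vendor you expect is missing from that output, its runtime is probably not installed or not registered.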
Today I finally got around to starting OpenCL development and wow... it is not straightforward at all.
There's an AMD SDK, there's an Intel SDK, there's an NVIDIA SDK, each with its own properties (CPU only vs GPU only vs specific video card support only, perhaps?)
There may be valid technical reasons for it having to be this way, but I really wish there was just one SDK, and that when programming you could perhaps specify GPU / CPU tasks, or that maybe it would use whatever resources made most sense / performed best, or SOMETHING.
Time to dive in though, I guess... I'm trying to decide whether to go CPU or GPU. I have a pretty new $4000 Alienware laptop with SLI video cards, but also an 8-core CPU, so yeah... I guess I'll have to try a couple of SDKs and see which performs best for my needs?
Not sure what end users of my applications would do though... it doesn't seem like they can flip a switch to make it run on the CPU or the GPU instead.
The OpenCL landscape really needs some help...
So, the Intel SDK works with Intel CPUs, GPUs, and the Xeon Phi.
The AMD SDK works with AMD GPUs and CPUs.
I would like to develop an application that targets an Intel CPU and an AMD GPU.
Can anyone suggest a development strategy to achieve this?
Thanks.
Edit: I would like to run both CPU and GPU kernels concurrently on the same system.
When you get the list of available platforms, in the Intel CPU / AMD GPU case you should see 2 platforms, each with its own ID.
Usually that's it: you create devices and so on, using the appropriate platform ID in each case.
If you are using Windows, it's not difficult to see in the debugger that different platforms correspond to different OpenCL libraries (just go deeper into the cl_platform_id structure). Both DLLs are loaded.
Put your OpenCL code (not necessarily the kernel) in a library and create and link the DLL files for the AMD and Intel (and NVIDIA) devices. Create a new program and dynamically load the library based on which platforms the user has installed.
Kind of a pain in the butt, but it works in LabVIEW so it should work in other languages.
If you are using Windows, you can use LoadLibrary and put the library in a folder that is in your PATH (Windows Environment Variable) or in the same folder as the .EXE.
I wonder how we can have OpenCL "seeing" my K20, Xeon, and Xeon Phi at the same time?
I'm especially confused about the use of two libraries here (from NVIDIA and Intel).
How to do it, if possible at all?
The OpenCL Installable Client Driver (ICD) takes care of this for you. It is the same regardless of whose implementation you have installed, and exposes all implementations as separate OpenCL "Platforms".
When you call clGetPlatformIDs it will tell you how many platforms you have installed. There could be one for AMD, one for NVIDIA, and one for Intel, for example.
Then within each platform you call clGetDeviceIDs which will return the number of devices within that platform. On your NVIDIA platform you'll find your K20, and within your Intel platform you'll find your Xeon CPU and Xeon Phi co-processor.
If you build or download the clInfo utility you'll see a nice dump of all the installed platforms and devices and the capabilities of each.
The problem is solved.
Looking at the key directory:
/etc/OpenCL/vendors/*.icd
I noticed that for NVIDIA the library in use was a link that was duplicated in different places, pointing to two different releases.
I just replaced the older one with the most recent one, the one I had installed recently, and here we go.
OpenCL did not know which one to use, I guess.
It looks like the installation location changed between the two NVIDIA versions.
I thought I had removed the old version before reinstalling, but that was not actually the case.
Thank you all for your help.
I want to use 2 OpenCL runtimes in one system together (in my case AMD and Nvidia, but the question is pretty generic).
I know that I can compile my program with any SDK. But when running the program, I need to provide libOpenCL.so. How can I provide the libs of both runtimes so that I see 3 devices (AMD CPU, AMD GPU, Nvidia GPU) in my OpenCL program?
I know that it must be possible somehow, but I haven't found a description of how to do it on Linux yet.
Thanks a lot,
Tomas
You're not thinking of it right. SDKs are not provided by the application, and are not needed for running a compiled program. OpenCL runtimes are provided by the client system, and they are what give your program platforms and devices to use through clGetPlatformIDs and clGetDeviceIDs.
If the user does not have an NVIDIA graphics card, you are simply not going to be able to use an NVIDIA platform and device on his system, because he doesn't have the NVIDIA OpenCL runtime or hardware.
All that the different OpenCL SDKs provide you with are vendor-specific extensions, which are then understood by the vendor runtime.
The Khronos OpenCL working group defined an ICD (installable client driver) layer that allows multiple vendor drivers to be installed on the system. The application accesses the vendor drivers through the ICD layer. For more details see cl_khr_icd.txt.
The Smith and Thomas answers are correct; this is just expanding on that information. When you enumerate the OpenCL platforms, you'll get one for each installed driver. Within each platform you enumerate the devices. The AMD and Intel drivers also expose CPU devices. So on a fully populated machine, you might see an AMD platform (with CPU and GPU devices), an NVIDIA platform (with a GPU device), and an Intel platform (with CPU and GPU devices). Your code creates a context on whichever devices you want to use, and one or more command queues to feed them work. You can keep them all busy working on things, but you can only share data buffers between devices from the same platform. To share data across platforms, it must pass through CPU memory in between.
Regarding running on multiple OpenCL devices at the same time: create a separate context for each device/vendor and run each one in a separate thread. For example, I have a GTX 590, which shows up as two GTX 590 devices. I also have an Intel i7 processor. I create three contexts, two for the 590 devices and one for the CPU, and run each context/device in its own thread using SDL_CreateThread (pthreads works as well). You have to weight the number of jobs for each device in proportion to its "speed" if you want good results, for example 45% for each GTX 590 and 10% for the CPU. The best weights depend on the application.
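The weighting step can be sketched roughly as follows. This is just one way to do it, not the code from the answer above: the function name split_jobs and the choice to hand rounding leftovers to the most heavily weighted device are assumptions made for illustration. It only computes how many jobs each device gets; creating the contexts and threads is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: split total_jobs across n devices in proportion to
 * weights[]. Each share is rounded down, and whatever is left over goes to
 * the device with the largest weight (the first one, if there is a tie). */
void split_jobs(size_t total_jobs, const double *weights, size_t n, size_t *out)
{
    assert(weights != NULL && out != NULL && n > 0);

    double sum = 0.0;
    size_t heaviest = 0;
    for (size_t i = 0; i < n; i++) {
        assert(weights[i] >= 0.0);
        sum += weights[i];
        if (weights[i] > weights[heaviest])
            heaviest = i;
    }
    assert(sum > 0.0);

    size_t assigned = 0;
    for (size_t i = 0; i < n; i++) {
        size_t share = (size_t)((double)total_jobs * weights[i] / sum);
        /* Guard against floating-point rounding handing out too much. */
        if (share > total_jobs - assigned)
            share = total_jobs - assigned;
        out[i] = share;
        assigned += share;
    }
    out[heaviest] += total_jobs - assigned;
}
```

With weights {45, 45, 10} and 100 jobs this gives 45, 45 and 10 jobs. The weights themselves still have to come from measuring your own application on each device.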
On my computer running Windows 7 I have three OpenCL SDKs, from these vendors:
Intel
NVIDIA
AMD.
I build my application with each of them.
As the output I have three different binaries.
For example: my_app_intel_x86, my_app_amd_x86, my_app_nvidia_x86
These binaries differ in the following ways:
They use different SDKs in the linking process
They try to find a different OpenCL platform name at run time
Can I use only one SDK and check the platform at run time?
SDKs give you debugging tools, a platform, and possibly extensions, but the OpenCL API remains the same regardless. You can link against any SDK you want, and it'll produce an executable compatible with any OpenCL runtime you can find. Remember, those are SDKs, meant for the developer. The end user will probably only have his (OpenCL-enabled) graphics driver, which doesn't care what SDK you used to build the software.
Ideally you should use a default platform for your program, but let the user override it (you can select various platforms at runtime!). You can also use heuristics to figure out which device is the fastest, e.g.:
iterate over each available platform
for each platform, iterate over each device
benchmark this device somehow in a relevant way
select the fastest one
Also, if you are using specific extensions, make sure to only accept devices which support them...
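The selection heuristic above could look something like this once you have the benchmark results. It is only a sketch: struct device_candidate, has_extension and pick_fastest are hypothetical names, and the code assumes you have already enumerated the devices, timed your workload on each, and copied each device's extension string (a space-separated list, as returned for CL_DEVICE_EXTENSIONS) into the table. No OpenCL calls are made here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record, filled in by the application after enumerating
 * platforms and devices and timing a representative workload on each. */
struct device_candidate {
    const char *name;        /* device name, for display only */
    const char *extensions;  /* space-separated extension list */
    double      seconds;     /* benchmark time, lower is better */
};

/* Return 1 if `wanted` appears as a whole token in the space-separated
 * list, so that "cl_khr_fp64" does not match "cl_khr_fp64_extra". */
static int has_extension(const char *list, const char *wanted)
{
    size_t len = strlen(wanted);
    if (len == 0)
        return 1;
    for (const char *p = list; (p = strstr(p, wanted)) != NULL; p += len) {
        int starts = (p == list) || (p[-1] == ' ');
        int ends = (p[len] == '\0') || (p[len] == ' ');
        if (starts && ends)
            return 1;
    }
    return 0;
}

/* Index of the fastest candidate that supports `required`
 * (pass NULL for no requirement), or -1 if no candidate qualifies. */
int pick_fastest(const struct device_candidate *c, size_t n,
                 const char *required)
{
    assert(c != NULL || n == 0);

    int best = -1;
    for (size_t i = 0; i < n; i++) {
        if (required != NULL && !has_extension(c[i].extensions, required))
            continue;
        if (best < 0 || c[i].seconds < c[best].seconds)
            best = (int)i;
    }
    return best;
}
```

A return value of -1 means no installed device has the extension you need, which is a good moment to fall back to another code path or tell the user. Letting the user override the result is still a good idea, since a short benchmark won't always reflect the real workload.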
Can I use only one SDK and check the platform at run time?
Yes, you absolutely can and should do that, but I am worried about what you mean by "check platform". As I stated above, the SDK has absolutely no influence on the platforms your built program can run on. I can build my code with the AMD SDK and run the executable on a system with an NVIDIA graphics card or an Intel processor just fine. The only difference is that the AMD-specific extensions provided by the SDK will only be recognized by an AMD driver. You don't even need the SDK installed to run the code, although you do need it to build it.
I have seen that AMD APP SDK samples work on a machine having only Intel CPU.
How can this happen? How does the compiler target a different machine architecture?
Do I not need Intel's set of compilers for running the code on the intel CPU?
I thought that to run an OpenCL application on specific hardware, I have to (re)compile it using the device vendor's own compiler.
Where is my understanding wrong?
Firstly, OpenCL is built to work on CPUs and GPUs. You can compile and run the same source code on either type of device. However, it's very likely that CPU code will be sub-optimal for a GPU and vice versa.
AMD H/W is 7% - 14% of total x86/x64 CPUs, so AMD must develop compilers for both AMD and Intel chips to be relevant. AMD has a history of developing compilers for both sets of chips. Conversely, Intel has developed compilers that either don't work on AMD chips or don't work that well. That's no surprise.
With OpenCL, the AMD APP SDK is the most flexible: it will work well on AMD and Intel CPUs and on AMD GPUs. Intel's OpenCL SDK doesn't even install on AMD x86 H/W.
If you compile an OpenCL program to binary, you can save and reuse it as long as it matches the OpenCL Platform and Device that created it. So, if you compile for one device and use on another you are very likely to get an error.
The power of OpenCL lies in abstracting the underlying hardware and offering massive, parallel, and heterogeneous computing power.
Some SDKs and platforms offer specific features to "optimize" the code. I honestly think that such features are just marketing, and they introduce boilerplate code that makes the application less portable.
There are also some pseudo-new technologies that are just wrappers around OpenCL or are very similar in concept, like Intel Quick Sync.
About Intel, I should say that at first they supported every Core i generation and even some Core 2 Duo chips, but the new SDK only supports the 3rd Core i generation. I honestly don't get their strategy. Intel is probably the last option if you want to adopt OpenCL and target the biggest possible audience, and their SDK doesn't seem to be very good either.
Stick with the standard and you will avoid both possible legal and performance issues and your code will also be more portable.
The bottom line is that the AMD SDK includes a compiler for targeting x86 CPUs for OpenCL. That means that even though you are running an Intel CPU the generated code will run on it. It's the same concept as compiling a C program to run on an x86 CPU: it works on Intel and AMD CPUs (or any that implement the x86 instruction set).
The vendor's compiler might have specific optimizations, like user827992 mentions, but in my experience the performance of AMD's CPU compiler isn't that bad when running on an Intel CPU. I haven't tried Intel's OpenCL implementation.
It is true that for some (maybe most in the future) hardware, only the vendor's compiler will support it. AMD's SDK won't build code that will run on an NVIDIA card, and vice-versa. CPUs happen to be a bit of a special case in that the basic instruction set is so widely deployed that the CPU compiler will work on most machines you're likely to come in contact with.