In OpenCL, why do I have one platform for each Intel device?

I'm getting started with OpenCL. As I understand it, a platform is a vendor-specific OpenCL implementation, and a device is a processing unit that a platform can use.
I wrote a simple C++ program that prints each platform's name and, for each of its devices, the device name. Its output is:
Platform 0: Intel(R) OpenCL HD Graphics
Device 0: Intel(R) Gen9 HD Graphics NEO
Platform 1: Intel(R) CPU Runtime for OpenCL(TM) Applications
Device 0: Intel(R) Core(TM) i5-6200U CPU @ 2.3GHz
My question is: shouldn't I expect the two devices to be under the same platform, given that I have a laptop and the GPU is integrated with the processor? Also, will this prevent me from assigning both the GPU and CPU devices to the same context (which I've read has some memory-sharing advantages)?

"shouldn't I expect the two devices to be under the same platform"
Only if the vendor provides a platform with drivers for both of those devices. I'm not sure whether Intel's "NEO" platform also includes a CPU driver, but I'm pretty sure the "CPU Runtime" platform only has a driver for the CPU, not the iGPU. You'll have to list the devices of each platform to find out.
"will this then forbid me for assigning both GPU and CPU devices to the same context"
You have to list the devices: if the NEO platform exposes both devices, you can use it for a shared context. But you can't put devices from different platforms in a single context.
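The same-platform rule can be checked before attempting to create a context. The following is a minimal sketch in plain C++, not real OpenCL calls: `DeviceInfo`, `groupByPlatform`, and `canShareContext` are hypothetical names, and the strings stand in for what `clGetPlatformInfo` and `clGetDeviceInfo` would report.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical record: in real code these values would come from
// clGetPlatformInfo(CL_PLATFORM_NAME) and clGetDeviceInfo(CL_DEVICE_NAME).
struct DeviceInfo {
    std::string platform;
    std::string name;
};

// Group device names by platform name. Each group is a set of devices
// that could be placed together in one context.
std::map<std::string, std::vector<std::string>>
groupByPlatform(const std::vector<DeviceInfo>& devices) {
    std::map<std::string, std::vector<std::string>> groups;
    for (const DeviceInfo& d : devices) {
        groups[d.platform].push_back(d.name);
    }
    return groups;
}

// True if the list is non-empty and every device belongs to the same
// platform, i.e. the list is a legal candidate for a single context.
bool canShareContext(const std::vector<DeviceInfo>& devices) {
    if (devices.empty()) return false;
    for (const DeviceInfo& d : devices) {
        if (d.platform != devices.front().platform) return false;
    }
    return true;
}
```

With the two platforms from the question, `canShareContext` returns false for the GPU and CPU together, which matches the answer: each device would need its own context.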

Related

What is host_selector in SYCL device selector?

I am a newbie to SYCL, OpenCL, and GPU programming. I read about the device selectors in SYCL and found the following four:
default_selector : Devices selected by heuristics of the system. If no OpenCL device is found, it defaults to the SYCL host device.
gpu_selector : Selects devices according to device type info::device::device_type::gpu from all the available OpenCL devices. If no OpenCL GPU device is found, the selector fails.
cpu_selector : Selects devices according to device type info::device::device_type::cpu from all the available devices and heuristics. If no OpenCL CPU device is found, the selector fails.
host_selector : Selects the SYCL host CPU device, which does not require an OpenCL runtime.
I ran computecpp_info to list the available devices:
$ /usr/local/computecpp/bin/computecpp_info
/usr/local/computecpp/bin/computecpp_info: /usr/local/cuda-8.0/lib64/libOpenCL.so.1: no version information available (required by /usr/local/computecpp/bin/computecpp_info)
/usr/local/computecpp/bin/computecpp_info: /usr/local/cuda-8.0/lib64/libOpenCL.so.1: no version information available (required by /usr/local/computecpp/bin/computecpp_info)
********************************************************************************
ComputeCpp Info (CE 0.7.0)
********************************************************************************
Toolchain information:
GLIBC version: 2.19
GLIBCXX: 20150426
This version of libstdc++ is supported.
********************************************************************************
Device Info:
Discovered 3 devices matching:
platform : <any>
device type : <any>
--------------------------------------------------------------------------------
Device 0:
Device is supported : NO - Device does not support SPIR
CL_DEVICE_NAME : GeForce GTX 750 Ti
CL_DEVICE_VENDOR : NVIDIA Corporation
CL_DRIVER_VERSION : 384.111
CL_DEVICE_TYPE : CL_DEVICE_TYPE_GPU
--------------------------------------------------------------------------------
Device 1:
Device is supported : UNTESTED - Device not tested on this OS
CL_DEVICE_NAME : Intel(R) HD Graphics
CL_DEVICE_VENDOR : Intel(R) Corporation
CL_DRIVER_VERSION : r5.0.63503
CL_DEVICE_TYPE : CL_DEVICE_TYPE_GPU
--------------------------------------------------------------------------------
Device 2:
Device is supported : YES - Tested internally by Codeplay Software Ltd.
CL_DEVICE_NAME : Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz
CL_DEVICE_VENDOR : Intel(R) Corporation
CL_DRIVER_VERSION : 1.2.0.475
CL_DEVICE_TYPE : CL_DEVICE_TYPE_CPU
If you encounter problems when using any of these OpenCL devices, please consult
this website for known issues:
https://computecpp.codeplay.com/releases/v0.7.0/platform-support-notes
So the GeForce GTX 750 Ti and Intel(R) HD Graphics are GPU devices, and the Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz is a CPU device. What about host devices here?
If I select host_selector, where would my SYCL code run?
In SYCL there is the notion of the host device and the OpenCL device. An OpenCL device is any OpenCL-enabled device, such as Intel GPUs, AMD GPUs, or FPGAs with OpenCL support.
The host device, on the other hand, is the device that operates the OpenCL devices. In essence it is your CPU: it controls all the attached OpenCL-enabled devices and does not use OpenCL itself. Some CPU vendors also provide an OpenCL driver, which lets you run OpenCL on your CPU as well. In that case the host device and the OpenCL device share the same hardware.
In your case, Intel provides an OpenCL implementation for CPUs as well as GPUs, so all your devices are OpenCL-enabled. The host device exists even if you have no OpenCL devices, so with host_selector your code runs on the CPU without going through an OpenCL runtime.
I would also like to point out that ComputeCpp contains experimental support for NVIDIA, so you might be able to run SYCL on that GPU, but with no guarantees.

Do any GPUs support fine grain system SVM?

OpenCL 2.0 introduced Shared Virtual Memory (SVM), allowing virtual memory addresses to be shared between hosts and devices.
There are a number of different SVM capabilities, see this extract from cl.h:
/* cl_device_svm_capabilities */
#define CL_DEVICE_SVM_COARSE_GRAIN_BUFFER (1 << 0)
#define CL_DEVICE_SVM_FINE_GRAIN_BUFFER (1 << 1)
#define CL_DEVICE_SVM_FINE_GRAIN_SYSTEM (1 << 2)
#define CL_DEVICE_SVM_ATOMICS (1 << 3)
According to this article from Intel, the CL_DEVICE_SVM_FINE_GRAIN_SYSTEM capability means that an OpenCL device can share the operating system's address space, without an SVM buffer having to be created for it.
Supporting fine-grained SVM with a CPU device should be relatively simple. My (6th gen, Skylake) system reports that it supports CL_DEVICE_SVM_FINE_GRAIN_SYSTEM using the Intel Experimental OpenCL 2.1 CPU Only Platform. However, neither the Skylake GPU nor the CPU supports CL_DEVICE_SVM_FINE_GRAIN_SYSTEM using the normal Intel(R) OpenCL platform.
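For reference, here is a small sketch of how the capability bitfield reported by a device can be decoded. It assumes the value has already been obtained (in real code, via clGetDeviceInfo with CL_DEVICE_SVM_CAPABILITIES); the constants below are local stand-ins that mirror the cl.h extract, and `describeSvm` and `hasFineGrainSystem` are hypothetical helper names.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Local stand-ins mirroring the cl.h bit values quoted above. Real code
// would include <CL/cl.h> and use the CL_DEVICE_SVM_* macros directly.
constexpr std::uint64_t kCoarseGrainBuffer = 1u << 0;
constexpr std::uint64_t kFineGrainBuffer   = 1u << 1;
constexpr std::uint64_t kFineGrainSystem   = 1u << 2;
constexpr std::uint64_t kSvmAtomics        = 1u << 3;

// Translate a capability bitfield into the names of the bits that are set.
std::vector<std::string> describeSvm(std::uint64_t caps) {
    std::vector<std::string> names;
    if (caps & kCoarseGrainBuffer) names.push_back("COARSE_GRAIN_BUFFER");
    if (caps & kFineGrainBuffer)   names.push_back("FINE_GRAIN_BUFFER");
    if (caps & kFineGrainSystem)   names.push_back("FINE_GRAIN_SYSTEM");
    if (caps & kSvmAtomics)        names.push_back("ATOMICS");
    return names;
}

// True if the device reports fine-grained system SVM.
bool hasFineGrainSystem(std::uint64_t caps) {
    return (caps & kFineGrainSystem) != 0;
}
```

A device that reports only the two buffer capabilities (value 3) would fail the `hasFineGrainSystem` check, which is the situation described for the normal Intel(R) OpenCL platform.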
I can imagine that it is very hard (if not impossible!) for a GPU on a graphics card to support fine grained SVM. However, it should be possible for a GPU on an APU, such as an Intel i7 or an AMD A10 to support it.
Do any GPUs support fine grained system Shared Virtual Memory?

Getting started with GPGPU?

I have two basic questions about getting started with GPGPU programming:
(1) If I do GPGPU on my Mac, will it affect the image on the monitor? How do I know the windowing system or other programs' output is not competing for the GPU?
(2) Is there a way to try out AMD GPU programming somewhere without buying a high-end graphics card? The rental cloud places I have seen all use Nvidia. My computation would be logical integer (bit-twiddling) compute-bound, and I have read that AMD GPU is better for these applications.
1) It won't affect the image on the monitor. To check whether another process is using the GPU, you'll need a Mac equivalent of something like AMD System Monitor (that application itself only works on Windows).
2) Any Radeon HD 4xxx or later supports OpenCL (earlier cards might as well, but I'm not sure). This means any new AMD card you can buy, including the cheapest ones, supports OpenCL.
The difference between the expensive cards and the cheap ones is the number of stream processors. For example:
Radeon HD 4350: 80 stream processors
Radeon R9 290: 2560 stream processors

How to combine CPU and GPU in JavaCL?

My laptop has:
- one CPU: Intel(R) Core(TM) i5-3210M CPU @ 2.50GHz
- one graphics card: Intel(R) HD Graphics 4000
- one Nvidia card (external card): GeForce GT 630M
But when I tried JavaCL.createBestContext(), it looks like it uses just one device, the Intel HD Graphics. So I tried to combine all three, the CPU and the two GPUs, using:
List<CLDevice> devices = new ArrayList<CLDevice>();
// try to list all platform and devices
for (CLPlatform platform : JavaCL.listPlatforms()) {
    // System.out.println(platform.getName());
    for (CLDevice device : platform.listAllDevices(true)) {
        System.out.println(device.getName().trim());
        devices.add(device);
    }
}
CLDevice device1 = (CLDevice)devices.get(0);
CLDevice device2 = (CLDevice)devices.get(1);
CLDevice device3 = (CLDevice)devices.get(2);
CLContext context = JavaCL.createContext(null, device1, device2, device3);
But I got an error when I tried to use all three at the same time. So how can I combine the CPU and GPUs in JavaCL? I ask because I read that OpenCL is a standard for parallel programming across CPUs and GPUs. If I'm missing something, please let me know. Any ideas or answers will be appreciated.
Thanks,
Duy.
Sadly, it's not that easy. When creating a single context across multiple devices, the devices all have to come from the same platform. Creating a context containing the Intel CPU and GPU should work, but the Nvidia GPU has to be in its own context (different platform, Nvidia not Intel).
Here's how I handle this scenario: I create a context for each device and a thread for each context. Each thread takes a portion of the data I'm working on and dispatches it to its assigned OpenCL device. This way, you can mix CPUs and GPUs from both AMD and Nvidia, and any other hardware that comes along.
It's important to do load balancing across the threads so that you don't have faster devices sitting idle waiting for a slower device to catch up.
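One simple way to approach the load balancing is a static split proportional to each device's measured speed. The sketch below is plain C++ and only an illustration: `splitWork` is a hypothetical helper, and the speed figures are assumed to come from something like a short calibration run on each device. It is not how any particular library does it, and a dynamic work queue would be an alternative.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Split `totalItems` work items across devices in proportion to each
// device's relative speed (assumed non-negative, e.g. items per second
// from a calibration run). Each share is rounded down; the leftover
// items go to the fastest device.
std::vector<std::size_t> splitWork(std::size_t totalItems,
                                   const std::vector<double>& speeds) {
    std::vector<std::size_t> shares(speeds.size(), 0);
    if (speeds.empty()) return shares;

    const double total = std::accumulate(speeds.begin(), speeds.end(), 0.0);
    if (total <= 0.0) {
        // No usable measurements: give everything to the first device.
        shares[0] = totalItems;
        return shares;
    }

    std::size_t assigned = 0;
    std::size_t fastest = 0;
    for (std::size_t i = 0; i < speeds.size(); ++i) {
        shares[i] = static_cast<std::size_t>(totalItems * (speeds[i] / total));
        assigned += shares[i];
        if (speeds[i] > speeds[fastest]) fastest = i;
    }
    shares[fastest] += totalItems - assigned;  // distribute rounding leftovers
    return shares;
}
```

Each per-device thread would then process its own share, so a device that is three times faster receives roughly three times the data.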

Does any OpenCL host have more than one platform?

The definition of a platform in Khronos' OpenCL 1.0 and 1.1 specification:
Platform: The host plus a collection of devices managed by the OpenCL framework that allow an application to share resources and execute kernels on devices in the platform.
The OpenCL function clGetPlatformIDs creates an array of platforms, implying that multiple platforms are possible. Is it safe to assume that a given OpenCL host has only one platform?
In other words, will I lose anything on any host by doing this:
cl_platform_id platform_id;
cl_uint num_platforms;
errcode = clGetPlatformIDs(1, &platform_id, &num_platforms);
I wouldn't rely on there being only one platform. When you have multiple OpenCL implementations on one system (which should be possible with the OpenCL ICD, although I'm not sure whether that is only planned or already finished), you should get multiple platforms, one for each OpenCL implementation. One example would be an NVIDIA implementation to run OpenCL on the GPU alongside an AMD implementation to run it on the CPU, so the scenario is not that far-fetched.
Edit: see http://developer.amd.com/support/KnowledgeBase/Lists/KnowledgeBase/DispForm.aspx?ID=71 for a (better) description of this.
To complement the answer of Tim Child with an example (ThinkPad X201 with both the AMD and Intel SDKs installed):
$ python /usr/share/doc/python-pyopencl/examples/benchmark-all.py
Execution time of test without OpenCL: 10.9563219547 s
===============================================================
Platform name: AMD Accelerated Parallel Processing
Platform profile: FULL_PROFILE
Platform vendor: Advanced Micro Devices, Inc.
Platform version: OpenCL 1.1 AMD-APP-SDK-v2.5 (684.213)
---------------------------------------------------------------
Device name: Intel(R) Core(TM) i5 CPU M 520 @ 2.40GHz
Device type: CPU
Device memory: 7799 MB
Device max clock speed: 2399 MHz
Device compute units: 2
Execution time of test: 0.00842799 s
Results OK
===============================================================
Platform name: Intel(R) OpenCL
Platform profile: FULL_PROFILE
Platform vendor: Intel(R) Corporation
Platform version: OpenCL 1.1 LINUX
---------------------------------------------------------------
Device name: Intel(R) Core(TM) i5 CPU M 520 @ 2.40GHz
Device type: CPU
Device memory: 7799 MB
Device max clock speed: 2400 MHz
Device compute units: 2
Execution time of test: 0.00260659 s
Results OK
Yes, there is one platform ID for each vendor's OpenCL installation. So if you install both AMD's and Intel's OpenCL SDKs, you will get one platform ID for each.
Even if you assume that a host has only one platform, you still have to find out that platform's ID before calling clGetPlatformInfo. So it's better to call clGetPlatformIDs, pick a default or user-supplied platform, and then call clGetPlatformInfo.
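The "default or user-supplied" choice could look roughly like the sketch below. It is plain C++ rather than OpenCL calls: `choosePlatform` is a hypothetical helper, and `platformNames` stands in for the names you would collect by calling clGetPlatformIDs (first with 0 entries to get the count, then again to fill the array) and clGetPlatformInfo with CL_PLATFORM_NAME for each ID.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Return the index of the first platform whose name contains `wanted`.
// An empty `wanted`, or no match, falls back to index 0 (the default).
// Precondition (assumed): platformNames is non-empty.
std::size_t choosePlatform(const std::vector<std::string>& platformNames,
                           const std::string& wanted) {
    if (!wanted.empty()) {
        for (std::size_t i = 0; i < platformNames.size(); ++i) {
            if (platformNames[i].find(wanted) != std::string::npos) {
                return i;
            }
        }
    }
    return 0;  // default platform
}
```

With the two platforms from the benchmark output above, asking for "Intel" selects the second platform, while an empty or unmatched request falls back to the first.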
