OpenCL performance under OS X Lion

I have a Bitcoin miner that uses an OpenCL kernel. The Windows and Linux implementations of the ATI SDK have comparable performance, while Apple's native OpenCL implementation performs much worse. I want to know whether I can profile the OpenCL kernel somehow to optimize it for Lion's OpenCL implementation.

Related

What is an example of a target platform in OpenCL?

What is an example of a target platform in OpenCL? Is it, for example, an OS like Windows, Android, or Mac, or is it the actual chips in a device?
An OpenCL platform is essentially an OpenCL implementation. It is not related to the operating system (if any). It is commonly related to the hardware chip, but not necessarily. For example, the AMD platform also supports Intel CPUs. Another example is the Beignet project, an open-source OpenCL platform which runs on Intel hardware, or Pocl, which runs on ARM and x86.
Some examples of existing OpenCL SDKs/platforms (I don't have their full names at hand so I'll just list the vendor/SDK name):
Proprietary:
Intel® SDK for OpenCL™ Applications
AMD APP SDK
NVIDIA OpenCL
ARM Mali OpenCL SDK
Apple OpenCL
Qualcomm Adreno SDK
Imagination Technologies OpenCL on PowerVR
There are also implementations by IBM, Samsung, Altera, Vivante, Xilinx, MediaTek, STMicroelectronics...
Open Source:
Beignet
Portable Computing Language
This list is not exhaustive.
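To see which of these platforms are actually installed on a given machine, you can ask the runtime to enumerate them. The following is only a rough sketch in Python, assuming the third-party pyopencl package and at least one OpenCL driver are installed; the helper names query_platforms and describe_platforms are made up for this example.

```python
def query_platforms():
    """Return [(platform_name, [device_name, ...]), ...] for this machine.

    Assumption: the third-party `pyopencl` package and at least one OpenCL
    driver are installed. The import is local so that the formatting helper
    below can be used without them.
    """
    import pyopencl as cl

    return [
        (platform.name, [device.name for device in platform.get_devices()])
        for platform in cl.get_platforms()
    ]


def describe_platforms(platforms):
    """Render (platform_name, device_names) pairs as indented text lines."""
    lines = []
    for platform_name, device_names in platforms:
        lines.append(platform_name)
        lines.extend("  " + name for name in device_names)
    return lines
```

Calling print("\n".join(describe_platforms(query_platforms()))) on a machine with several drivers should list each installed platform followed by the devices it exposes. The same physical chip may appear under more than one platform.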

Intel OpenCL Vs. Khronos OpenCL

What is the difference between the Intel, AMD and Khronos OpenCLs? I am totally new to OpenCL and want to start with it. I don't know which one is better to install on my operating system.
OpenCL is an "extension" to the C and C++ languages that enables parallelization of software on your compute devices: CPU, GPU, etc.
OpenCL is defined by a standard (created by the Khronos Group) and implemented by hardware vendors such as Intel, AMD, and nVidia. So each OpenCL implementation requires a vendor-specific OpenCL driver that enables the use of that vendor's hardware.
So to conclude, if you have an Intel-based system, use the Intel OpenCL, because only then will you be able to use all the compute devices in your machine. The same goes if you have an AMD system. Also, take note that there is no Khronos OpenCL implementation.
Of course you can have a system with OpenCL-enabled devices from multiple vendors (e.g. an Intel CPU+GPU and a discrete nVidia card). In this case the OpenCL runtime contains a generic layer (a dynamically loaded library). This layer is an interface that calls the implementation provided by each device driver, depending on the selected OpenCL platform.
OpenCL is a standard defined by Khronos. They distribute header files that you have to give to your compiler. They do not distribute binaries to link against. For that, you must get an ICD (Installable Client Driver); on Windows this takes the form of a DLL file. You will get it by installing one or more of...
Nvidia drivers (if you have an Nvidia GPU)
AMD drivers (if you have an AMD GPU or an AMD CPU)
Intel drivers (if you have an Intel CPU; some Intel CPUs also have built-in GPUs).
Do not worry about compiling against one vendor's files and the result not working on another's; OpenCL has been carefully designed to avoid this. Compile against whatever version you have, and it will work with any other version that is the same or newer, regardless of who made it.
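The "same or newer" rule can be checked at run time by parsing the platform's version string, which the OpenCL specification defines as starting with "OpenCL", a space, and then major.minor. This is a minimal sketch with hypothetical helper names; the sample strings in the comments are illustrative, not output from a real driver.

```python
import re


def parse_opencl_version(version_string):
    """Extract (major, minor) from a string such as 'OpenCL 1.2 <vendor info>'."""
    match = re.match(r"OpenCL (\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError("not an OpenCL version string: %r" % version_string)
    return int(match.group(1)), int(match.group(2))


def runtime_is_sufficient(platform_version, built_against):
    """True if the platform reports the same or a newer version than the
    (major, minor) tuple the program was built against."""
    return parse_opencl_version(platform_version) >= built_against
```

For example, a program built against the 1.1 headers would pass (1, 1) as built_against and compare it with the version string reported by each installed platform.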
Be aware that the AMD OpenCL driver will also operate as an OpenCL driver for Intel CPUs. If, for example, you have an AMD GPU and an Intel CPU, and have installed both the Intel and the AMD OpenCL drivers, the AMD driver will report that it can provide both a GPU device and a CPU device (your CPU), and the Intel driver will report a CPU device (also your CPU) and most likely a GPU device as well (the GPU on the Intel CPU die; on an i7-3770, for example, this is an HD 4000). If you blindly ask OpenCL for "all CPUs available", the AMD and Intel drivers will both offer you the same CPU, and your code will not run very well in that case.
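One way to avoid handing the same CPU to your program twice is to de-duplicate the devices collected from all platforms before using them. The sketch below works on plain (platform, type, name) tuples rather than real OpenCL handles, and it assumes that two platforms report the same physical chip under the same device name up to whitespace. That is a heuristic, not something OpenCL guarantees. The function name unique_devices and the platform labels are made up for the illustration.

```python
def unique_devices(devices, preferred_platforms=()):
    """Keep one entry per (device_type, device_name).

    devices: iterable of (platform_name, device_type, device_name) tuples.
    preferred_platforms: platform names in order of preference; when two
    platforms expose the same device, the more preferred one wins, and
    otherwise the first one seen is kept.
    """
    def rank(platform):
        if platform in preferred_platforms:
            return preferred_platforms.index(platform)
        return len(preferred_platforms)

    chosen = {}
    for platform, dev_type, name in devices:
        # Normalise whitespace so cosmetic differences do not hide duplicates.
        key = (dev_type, " ".join(name.split()))
        if key not in chosen or rank(platform) < rank(chosen[key][0]):
            chosen[key] = (platform, dev_type, name)
    return list(chosen.values())
```

With the AMD GPU plus Intel CPU setup described above, passing preferred_platforms=("Intel",) would keep the Intel platform's view of the CPU and drop the duplicate offered by the AMD platform, while both GPUs remain.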
On Windows it is expected that you will download the header files yourself, and then either create a library from the DLL (MSVC) or link directly against the DLL (the default behavior of MinGW and Clang).
On Linux, your package manager will likely have a library to link against; consult your distribution's documentation regarding this. On Ubuntu and Debian this command will work...
sudo apt-get install ocl-icd-opencl-dev
On Mac, there is nothing to install, and trying to install something will likely damage your system. Just install Xcode, and use the framework "OpenCL".
There are other platforms, for example Android. Some FPGA vendors offer OpenCL libraries. Consult your vendor's documentation.
Khronos defines the OpenCL standard. Each vendor or open-source project implements that standard.
Khronos also defines a set of conformance tests that must pass before a vendor can claim that its OpenCL implementation conforms to the standard.

How do I program an INTEL GPU

I am quite new to the world of GPU computing, so I would really like someone to explain the very basics to me. I have two Intel chipsets with the following GPUs:
GMA4500
HD graphics
I am interested in running algebraic and bitwise functions on huge data sets, like the transpose of an array or a bitwise shift of the lines of an array, on a GPU. The goal is of course to gain more performance.
My main question is: how can I program such GPUs? In the past I have used CUDA to program an nVIDIA video card. I understand from previous topics that I can't use CUDA on Intel GPUs. Thanks in advance!
Update 1
I found out that Intel supports OpenCL for HD graphics. More precisely the Intel SDK for OpenCL applications provides a comprehensive development environment for OpenCL application on Intel® platforms including compatible drivers, code samples, development tools, such as the code builder, optimization guide, and support for optimization tools.
The SDK supports OpenCL 1.2 on 3rd and 4th generation Intel® Core™ processors with Intel® HD Graphics and Intel® Iris™ Graphics Family, Intel® Atom™ Processors with Intel HD Graphics, Intel® Xeon® processors, and Intel® Xeon Phi™ coprocessors.
OpenCL is the standard, cross-vendor API for GPGPU programming, roughly analogous to nVidia's proprietary CUDA.

AMD APP OpenCL SDK on Intel

I have seen that AMD APP SDK samples work on a machine having only Intel CPU.
How can this happen? How does the compiler target a different machine architecture?
Do I not need Intel's set of compilers for running the code on the intel CPU?
I think that to run an OpenCL application on specific hardware, I have to (re)compile it using the device vendor's specific compiler.
Where is my understanding wrong?
Firstly, OpenCL is built to work on CPUs and GPUs. You can compile and run the same source code on either type of device. However, it's very likely that code tuned for a CPU will be sub-optimal on a GPU and vice versa.
AMD hardware makes up 7% - 14% of all x86/x64 CPUs, so AMD must develop compilers for both AMD and Intel chips to stay relevant. AMD has a history of developing compilers for both sets of chips. Conversely, Intel has developed compilers that either don't work on AMD chips or don't work that well. That's no surprise.
With OpenCL, the AMD APP SDK is the most flexible: it works well on AMD and Intel CPUs and on AMD GPUs. Intel's OpenCL SDK doesn't even install on AMD x86 hardware.
If you compile an OpenCL program to binary, you can save and reuse it as long as it matches the OpenCL platform and device that created it. So, if you compile for one device and use the binary on another, you are very likely to get an error.
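One way to respect this when caching compiled binaries is to derive the cache key from the platform and device that produced them, so a binary is never looked up for a different device. The following is only a sketch: the function name is hypothetical, and including the driver version and build options in the key is an extra precaution assumed here, not something stated above.

```python
import hashlib


def binary_cache_key(platform_name, device_name, driver_version,
                     source, build_options=""):
    """Return a hex digest identifying one compiled binary.

    Any change to the platform, device, driver version, build options, or
    kernel source produces a different key, so a stale or foreign binary
    is simply never found in the cache.
    """
    digest = hashlib.sha256()
    for part in (platform_name, device_name, driver_version,
                 build_options, source):
        data = part.encode("utf-8")
        # Length prefix keeps ("ab", "c") distinct from ("a", "bc").
        digest.update(len(data).to_bytes(8, "little"))
        digest.update(data)
    return digest.hexdigest()
```

The returned string could serve as a file name for the saved binary. On a cache miss, the program is rebuilt from source for the current device.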
The power of OpenCL lies in abstracting the underlying hardware and offering massively parallel, heterogeneous computing power.
Some SDKs and platforms offer specific features to "optimize" the code. I honestly think that such features are just marketing, and that they introduce boilerplate code that makes the application less portable.
There are also some pseudo-new technologies that are just wrappers around OpenCL or are very similar in concept, like Intel Quick Sync.
About Intel, I should say that at first they supported every Core i generation and even some Core 2 Duo (C2D) chips; now the new SDK only supports the 3rd Core i generation. I honestly don't get their strategy. Intel is probably the last option if you want to adopt OpenCL and target the largest possible audience, and their SDK doesn't seem to be very good either.
Stick with the standard and you will avoid both possible legal and performance issues and your code will also be more portable.
The bottom line is that the AMD SDK includes a compiler for targeting x86 CPUs for OpenCL. That means that even though you are running an Intel CPU the generated code will run on it. It's the same concept as compiling a C program to run on an x86 CPU: it works on Intel and AMD CPUs (or any that implement the x86 instruction set).
The vendor's compiler might have specific optimizations, like user827992 mentions, but in my experience the performance of AMD's CPU compiler isn't that bad when running on an Intel CPU. I haven't tried Intel's OpenCL implementation.
It is true that for some (maybe most in the future) hardware, only the vendor's compiler will support it. AMD's SDK won't build code that will run on an NVIDIA card, and vice-versa. CPUs happen to be a bit of a special case in that the basic instruction set is so widely deployed that the CPU compiler will work on most machines you're likely to come in contact with.

AMD CPU versus Intel CPU openCL

Some friends and I want to use OpenCL. We are looking to buy a new computer for this, and we are wondering which is better for OpenCL, AMD or Intel. The graphics card will be an Nvidia, and we have no choice there. We were initially leaning towards an Intel CPU, but after some research we found that AMD CPUs may be better with OpenCL. We didn't find any benchmarks that compare the two.
So here are our questions:
Is AMD better than Intel with OpenCL?
Does pairing an Nvidia card with an AMD CPU matter for OpenCL performance?
Thank you,
GrWEn
You shouldn't care as much about which CPU you use as about which GPU you use. You will need to choose between an AMD/ATI GPU and an nVidia GPU.
I would personally recommend an nVidia GPU as, in addition to OpenCL support, you can experiment with their more proprietary CUDA technology which offers a far richer development experience than OpenCL does today. While you're at it take a look at the new AMP technology that was just announced by Microsoft for C++ which aims to bring language extensions akin to nVidia's CUDA. nVidia also has offerings for the enterprise with their Tesla GPUs with several vendors offering GPU clusters and you can even get a GPU compute cluster on Amazon EC2 now which is all based on nVidia hardware.
You want to buy a new computer with your friends? What kind of project do you plan to do? The question about the hardware is answered with the needs you have. If you give some more information, we can provide better suggestions.
As written before, the CPU is not the important point as long as you do not want to buy a multiprocessor multicore system, such as one with 4 quad-core processors. The difference in performance comes mostly from the GPUs used, and there you can find cards for every need, from a cheap GPU to the nVidia Tesla cards.
It is definitely not a problem to run an nVidia board on an AMD system. I do it here. You can also use the OpenCL devices from the AMD multicore CPU and the nVidia GPU in parallel.
One thing to keep in mind: even if you plan to buy one powerful system to run your software (like a web server), every developer of OpenCL software needs a system for testing. So every developer needs at least a modern multi-core CPU with an OpenCL SDK. Where the OpenCL kernels are developed does not matter, since OpenCL is platform independent.
Both Intel and AMD have good OpenCL support for their CPUs, so currently it does not really matter which you choose. If you want to use the embedded GPU on AMD Fusion or Intel Sandy Bridge, then I suggest you go for Fusion, since Intel does not have an OpenCL driver for their GPUs (yet). Depending on what you are going to use OpenCL for, I could suggest a GPU: sometimes NVidia is faster, sometimes AMD.
AMP, CUDA, RenderScript and the many, many others all work nicely, but they don't work on all hardware as OpenCL does. CUDA certainly has advantages, but by the time you have learnt OpenCL, I can assure you the tools around OpenCL will have caught up.
The CPU has no influence on GPU OpenCL performance.
You might also want to try running the OpenCL kernels on the CPU. Check out the Intel OpenCL compiler beta. You can even run kernels on both the CPU and the GPU.

Resources