Can an OpenCL .cl file compiled to SPIR-V be cross-compiled to Metal? - opencl

For context, I am researching upgrades into our OpenCL code at work as we are quite far behind current spec. The team I work in develops on all three of the major OSes and as far as I can tell Apple doesn't support past OpenCL 1.2 on any of their machines instead prompting you to learn and use Metal performance shaders.
I would like to update our code base to C++ for OpenCL which is based on OpenCL 2.0 or 3.0 depending on which version you use. C++ for OpenCL is fine on Windows and Linux as they have GPU drivers which support up to and past the 2.0 requirement, but Mac might have a problem with it without official support in the drivers. So I had the idea that if we could cross compile the CL kernels to Metal for the Mac users they would be able to run a Metal version of the code locally.
Has anyone got any solutions to this problem.

Related

Use OpenCL on AMD APU but use discrete GPU for the X server

Is it possible to enable OpenCL on an A10-7800 without using it for the X server? I have a Linux box that I use for GPGPU programming. A discrete GEForce 740 card is used for both the X server and running OpenCL & Cuda programs I develop. I would also like the option of running OpenCL code on the APU's integrated GPU cores.
Everything I've read so far implies that if I want to use the APU for OpenCL, I have to install Catalyst and, AFAIK, that means using it for the X server. Is this true? Would there be an advantage to using the APU for my X server and using the GEForce solely for GPGPU code?
I had a similar goal, so I've built a system with AMD APU (4 regular cores + 6 GPUs) and Nvidia discrete graphics board. Sorry to say it wasn't easy to make it work, so I asked a question on the Ask Ubuntu forum, didn't get any answers, experimented a lot with hardware and software setup, and finally have posted my own answer to my question.
I'll describe my setup again here - who knows, what might happen with my auto-answered question on the Ask Ubuntu?
At first, I had to enable the integrated graphics hardware via a BIOS flag. This flag is called IGFX Multi-Monitor on my motherboard (ASUS A88X-PRO).
The second step was to find a right mix of a low-level graphics driver and high-level OpenCL implementation. The low-level driver for AMD processors is called AMD Catalyst and has a file name fglrx. I didn't install this driver from the Ubuntu software center - instead I used a version 15.302, directly downloaded from the AMD site. I had to install a significant number of prerequisites for this driver. The most important finding was that I had to skip running the aticonfig command after the fglrx installation - this command actually configures the X server to use this driver for graphics output, and I didn't want that.
Then I've installed the AMD SDK Ver 3.0 (release 130.136, earlier releases didn't work with my fglrx) - it's the OpenCL implementation from AMD. The clinfo command reports both CPUs and GPUs with correct number of cores now.
So, I have a hybrid AMD processor, supported by the OpenCL, with all the graphics output, supported by a discrete graphics card with Nvidia processor.
Good luck!
I maintain a Linux server (OpenSUSE, but the distribution shouldn't matter) containing both NVIDIA and (a discrete) AMD GPU. It's headless, so technically I do not know whether the X server will create additional problems, but I don't think so. You can always configure xorg.conf to use exactly the driver you want. Or for that matter: install Catalyst, but delete the X server driver file itself, which is not the same thing that you need for OpenCL.
There is one problem with a mixed-vendor system that I noticed, however: AMDs OpenCL driver (ICD) will go spelunking for a libGL.so library, I guess in order to do OpenCL/OpenGL-interop. If it finds any of the NVIDIA-supplied libGL.so's, it will get confused and hang - at least on my machine. I "solved" this by deleting all libGL.so's (I do not need it on a headless compute server), but that might not be an acceptable solution for you. Maybe you can arrange things such that the AMD-supplied libGL.so's take precedence, possibly by installing the AMD driver last.

OpenCL for custom systems on SoC prototyping board

Is it possible to run OpenCL on a system designed by a user on a SoC prototyping board? To be more specific, I have a ZedBoard (Xilinx Zynq) that has Dual ARM cores and a Programmable Logic (PL) Area. If I design a simple system of my own that has a video processing accelerator implemented in the logic area, an ARM core and an AXI interconnect, what do I have to do to provide OpenCL support for this simple system? (In this simple system, the ARM core could be the "Host" and the video processing accelerator could be the "device").
I am a student and I have only some basic knowledge about OpenCL. I have researched about my question and have only ended up confusing myself. What are the things that have to be done to provide OpenCL support for a SoC? I understand that this may be a big project, but I need a guideline where to start and how to proceed.
what do I have to do to provide OpenCL support for this simple system?
Implement a OpenCL platform that makes either use of your ARM CPU or the FPGA (or both). I'd say that is pretty much impossible for you; ARM would surely offer one for the CPU if it was easy (and they definitely have the financial means to employ capable engineers/computer scientists), and implementing accelerators on an FPGA requires in-depth knowledge of FPGA development, as well as compiler theory and experience in systems design. I don't want to sound mean, but you seem to have none of these three.
You asked where to get started; I recommend just writing a first accelerator that e.g. adds up a vector of numbers; as soon as you have that, you will have a clearer idea of your task.
If you want to have a look at a reference: The Ettus USRP E310 is a zynq-based SDR device. Ettus has a technology called RFNoC, which allows users to write their own blocks to push data through. Notice that this took quite a few engineers and quite some time to get started. Notice further that it's much easier than implementing something that converts OpenCL to FPGA implementations.
If you have access to the Xilinx tools: Vivado HLS 15.1 System Edition should compile OpenCL kernels. This will also be included in the SDAccel tool suite.
Source: UG973: Vivado Design Suite User Guide Release Notes, Installation,and Licensing
An alternative might be switching to Altera. They provide some good examples for the Altera Cyclone V SoC which is comparable to Xilinx Zynq devices (also includes ARM Cortex-A9) :
AlteraSDK for OpenCL
I am also a student and my current project is also going on a similar direction, i have successfully installed a version of opencl called POCL on the zedboard, it successfully detects the arm cpu of the zedboard. To install pocl, you need llvm and a horde of other things as well. but basic steps to get pocl up on the zedboard are given below:-
Installing pocl:
http://www.hosseinabady.com/install-pocl-opencl
running example:
http://www.hosseinabady.com/embedded-system-by-examples/opencl_embedded_system/opencl-vector-addition
Lots of dependency: can resolved easily
but LLVM make sure you install 3.4 version for pocl 0.9
Steps to install llvm
https://github.com/pacs-course/pacs/wiki/Instructions-to-install-clang-3.1-on-ubuntu-12.04.1-and-12.10
POCL 0.9 is successfully working for me, as you do the installation you will face many other missing dependencies like hwloc, mesa libraries, open gl/cl headers icd loaders i hope you can resolve them as its a very big list to put up in stack overflow.
In order to detect your fpga as an open cl device, thats not going to be a trivial thing to do, you can refer to this link question i posted on github
https://github.com/pocl/pocl/issues/285
and also a research paper published by hosseinbady found on the publications link on the pocl website
http://pocl.sourceforge.net/publications.html
hope this helps you
Try the ARM OpenCL SDK. The Zedboard has an ARM A9 CPU, this should have a NEON SIMD vector unit http://www.arm.com/products/processors/technologies/neon.php which can run OpenCL. See http://www.arm.com/products/multimedia/mali-technologies/opencl-for-neon.php.
The Zedboard isn't listed as an OpenCL conformant platform https://www.khronos.org/conformance/adopters/conformant-products#opencl.
So there is a chance the ARM driver will not work.
Good luck!
If still relevant, try this paper OpenCL on ZYNQ [PDF]
Also note that Zynq-7000 is listed on https://www.khronos.org/conformance/adopters/conformant-products#opencl ( OpenCL_1_0 ), hence the compatibility.

Is There A Way To Upgrade to OpenCL 2.0?

There is a feature in OpenCL 2.0 that I would like to use.
I dual boot Ubuntu and Mac OS (Graphics: GTX 670 + HD Graphics 4600). Is it possible to install OpenCL 2.0?
This may be a dumb question - from what I have read, it seems like 2.0-compatible drivers may not be written yet? And also possibly my hardware will not support the new spec?
Basically, when will OpenCL 2.0 be easy/available?
You mentioned you have an NVidia GTX 670; you should note that NVidia's drivers only support OpenCL 1.1, not 1.2.
NVidia have(*) refrained over the past several years from updating their drivers to support the newer OpenCL standard, even though the hardware obviously supports it and CUDA has all the relevant functionality... so don't expect 2.0 to happen so soon on your hardware.
(*) - Due to being Evilâ„¢.
There are no OpenCL 2.0 drivers yet. The specification just became final yesterday. I don't believe any public statements have been made yet about when drivers will become available, and for what hardware. You'll have to wait for whatever fun new feature you wanted. Better yet, let your favorite vendors know that you'd like them to support OpenCL 2.0!
There is some support for OpenCL in Clang 3.0 and from the LLVM organisation.
See the CLang 3.0 release notes
http://llvm.org/releases/3.0/docs/ClangReleaseNotes.html
Here's an LLVM presentation on OpenCL
http://llvm.org/devmtg/2009-10/OpenCLWithLLVM.pdf
Here's another Stackoveflow answer on Clang 3.0 for OpenCL
How to use clang to compile OpenCL to ptx code?
So there are some good folks working on an open source version of OpenCL that compiles to PTX for NVida cards. Not having used it and not being familiar with these efforts, I can't say if there are plans or when when they can get to the OpenCL 2.0 spec.

AMD APP OpenCL SDK on Intel

I have seen that AMD APP SDK samples work on a machine having only Intel CPU.
How can this happen? How does the compiler target a different machine architecture?
Do I not need Intel's set of compilers for running the code on the intel CPU?
I think if we have to run an OpenCL application on a specific hardware, I have to (re)compile it using device's vendor specifics compiler.
Where is my understanding wrong?
Firstly, OpenCL is built to work on CPU's and GPU's. You can compile and run the same source code on either type of device. However, its very likely that CPU code will be sub-optimal for a GPU and vice-versa.
AMD H/W is 7% - 14% of total x86/x64 CPU's. So AMD must develop compilers for both AMD and Intel chips to be relevant. AMD have history developing compilers for both sets of chips. Conversely, Intel have developed compilers that either don't work on AMD chips or don't work that well. That's no surprise.
With OpenCL, the AMD APP SDK is the most flexible it will work well on AMD and Intel CPU's and AMD GPUs. Intel's OpenCL SDK doesn't even install on AMD x86 H/W.
If you compile an OpenCL program to binary, you can save and reuse it as long as it matches the OpenCL Platform and Device that created it. So, if you compile for one device and use on another you are very likely to get an error.
The power of OpenCL is abstracting the underlaying hardware and offer massive, parallel and heterogeneous computing power.
Some SDKs and platforms offers some specific features to "optimize" the code, i honestly think that such features are just marketing and they introduce boilerplate code making the application less portable.
There are also some pseudo-new technologies that are just wrappers to OpenCL or they are really similar in the concept like the Intel quick sync.
About Intel i should say that at the first place they were supporting all the iCore generation and even some C2D, now the new SDK only support the 3rd iCore generation, i don't get their strategy honestly, probably Intel is the last option if you want to adopt OpenCL and targeting the biggest possible audience, also their SDK doesn't seems to be really good at all .
Stick with the standard and you will avoid both possible legal and performance issues and your code will also be more portable.
The bottom line is that the AMD SDK includes a compiler for targeting x86 CPUs for OpenCL. That means that even though you are running an Intel CPU the generated code will run on it. It's the same concept as compiling a C program to run on an x86 CPU: it works on Intel and AMD CPUs (or any that implement the x86 instruction set).
The vendor's compiler might have specific optimizations, like user827992 mentions, but in my experience the performance of AMD's CPU compiler isn't that bad when running on an Intel CPU. I haven't tried Intel's OpenCL implementation.
It is true that for some (maybe most in the future) hardware, only the vendor's compiler will support it. AMD's SDK won't build code that will run on an NVIDIA card, and vice-versa. CPUs happen to be a bit of a special case in that the basic instruction set is so widely deployed that the CPU compiler will work on most machines you're likely to come in contact with.

Need to install opencl for CPU and GPU platforms?

I have a system with an NVidia graphics card and I'm looking at using openCL to replace openMP for some small on CPU tasks (thanks to VS2010 making openMP useless)
Since I have NVidia's opencl SDK installed clGetPlatformIDs() only returns a single platform (NVidia's) and so only a single device (the GPU).
Do I need to also install Intel's openCL sdk to get access to the CPU platform?
Shouldn't the CPU platform always be available - I mean, how do you NOT have a cpu?
How do you manage to build against two openCL SDKs simultaneously?
You need to have a SDK which provides interface to CPU. nVidia does not, AMD and Intel's SDKs do; in my case the one from Intel is significantly (something like 10x) faster, which might due to bad programming on my part however.
You don't need the SDK for programs to run, just the runtime. In Linux, each vendor installs a file in /etc/OpenCL/vendors/*.icd, which contains path of the runtime library to use. That is scanned by the OpenCL runtime you link to (libOpenCL.so), which then calls each of the vendor's libs when querying for devices on that particular platform.
In Linux, the GPU drivers install OpenCL runtime automatically, the Intel runtime is likely to be downloadable separately from the SDK, but is part of the SDK as well, of course.
Today i finally got around to trying to start doing openCl development and wow... it is not straight forward at all.
There's an AMD sdk, there's an intel sdk, there's an nvidia sdk, each with their own properties (CPU only vs GPU only vs specific video card support only perhaps?)
There may be valid technical reasons for it having to be this way but i really wish there was just one sdk, and that when programming perhaps you could specify GPU / CPU tasks, or that maybe it would use whatever resources made most sense / preformed best or SOMETHING.
Time to dive in though I guess... trying to decide though if i go CPU or GPU. I have a pretty new 4000$ alienware laptop with SLI video cards, but then also an 8 core cpu so yeah... guess ill have to try a couple sdk's and see which preforms best for my needs?
Not sure what end users of my applications would do though... it doesnt seem like they can flip a switch to make it run on cpu or gpu instead.
The OpenCL landscape really needs some help...

Resources