OpenCL for custom systems on SoC prototyping board - opencl

Is it possible to run OpenCL on a system designed by a user on a SoC prototyping board? To be more specific, I have a ZedBoard (Xilinx Zynq) that has Dual ARM cores and a Programmable Logic (PL) Area. If I design a simple system of my own that has a video processing accelerator implemented in the logic area, an ARM core and an AXI interconnect, what do I have to do to provide OpenCL support for this simple system? (In this simple system, the ARM core could be the "Host" and the video processing accelerator could be the "device").
I am a student and I have only some basic knowledge about OpenCL. I have researched about my question and have only ended up confusing myself. What are the things that have to be done to provide OpenCL support for a SoC? I understand that this may be a big project, but I need a guideline where to start and how to proceed.

what do I have to do to provide OpenCL support for this simple system?
Implement a OpenCL platform that makes either use of your ARM CPU or the FPGA (or both). I'd say that is pretty much impossible for you; ARM would surely offer one for the CPU if it was easy (and they definitely have the financial means to employ capable engineers/computer scientists), and implementing accelerators on an FPGA requires in-depth knowledge of FPGA development, as well as compiler theory and experience in systems design. I don't want to sound mean, but you seem to have none of these three.
You asked where to get started; I recommend just writing a first accelerator that e.g. adds up a vector of numbers; as soon as you have that, you will have a clearer idea of your task.
If you want to have a look at a reference: The Ettus USRP E310 is a zynq-based SDR device. Ettus has a technology called RFNoC, which allows users to write their own blocks to push data through. Notice that this took quite a few engineers and quite some time to get started. Notice further that it's much easier than implementing something that converts OpenCL to FPGA implementations.

If you have access to the Xilinx tools: Vivado HLS 15.1 System Edition should compile OpenCL kernels. This will also be included in the SDAccel tool suite.
Source: UG973: Vivado Design Suite User Guide Release Notes, Installation,and Licensing
An alternative might be switching to Altera. They provide some good examples for the Altera Cyclone V SoC which is comparable to Xilinx Zynq devices (also includes ARM Cortex-A9) :
AlteraSDK for OpenCL

I am also a student and my current project is also going on a similar direction, i have successfully installed a version of opencl called POCL on the zedboard, it successfully detects the arm cpu of the zedboard. To install pocl, you need llvm and a horde of other things as well. but basic steps to get pocl up on the zedboard are given below:-
Installing pocl:
http://www.hosseinabady.com/install-pocl-opencl
running example:
http://www.hosseinabady.com/embedded-system-by-examples/opencl_embedded_system/opencl-vector-addition
Lots of dependency: can resolved easily
but LLVM make sure you install 3.4 version for pocl 0.9
Steps to install llvm
https://github.com/pacs-course/pacs/wiki/Instructions-to-install-clang-3.1-on-ubuntu-12.04.1-and-12.10
POCL 0.9 is successfully working for me, as you do the installation you will face many other missing dependencies like hwloc, mesa libraries, open gl/cl headers icd loaders i hope you can resolve them as its a very big list to put up in stack overflow.
In order to detect your fpga as an open cl device, thats not going to be a trivial thing to do, you can refer to this link question i posted on github
https://github.com/pocl/pocl/issues/285
and also a research paper published by hosseinbady found on the publications link on the pocl website
http://pocl.sourceforge.net/publications.html
hope this helps you

Try the ARM OpenCL SDK. The Zedboard has an ARM A9 CPU, this should have a NEON SIMD vector unit http://www.arm.com/products/processors/technologies/neon.php which can run OpenCL. See http://www.arm.com/products/multimedia/mali-technologies/opencl-for-neon.php.
The Zedboard isn't listed as an OpenCL conformant platform https://www.khronos.org/conformance/adopters/conformant-products#opencl.
So there is a chance the ARM driver will not work.
Good luck!

If still relevant, try this paper OpenCL on ZYNQ [PDF]
Also note that Zynq-7000 is listed on https://www.khronos.org/conformance/adopters/conformant-products#opencl ( OpenCL_1_0 ), hence the compatibility.

Related

OpenCl is a Library or is a Compiler?

I started to learn OpenCl.
I read these links:
https://en.wikipedia.org/wiki/OpenCL
https://github.com/KhronosGroup/OpenCL-Guide/blob/main/chapters/os_tooling.md
https://www.khronos.org/opencl/
but I did not understand well that OpenCl is a library by including header file in source code or it is a compiler by using OpenCl C Compiler?!
It is both a library and a compiler.
The OpenCL C/C++ bindings that you include as header files and that you link against are a library. These provide the necessary functions and commands in C/C++ to control the device (GPU).
For the device, you write OpenCL C code. This language is not C or C++, but rather based on C99 and has some specific limitations (for example only 1D arrays) as well as extensions (math and vector functionality).
The OpenCL compiler sits in between the C/C++ bindings and the OpenCL C part. Using the C/C++ bindings, you run the compiler at runtime of the executable with the command clBuildProgram (C bindings) or program.build(...) (C++). It then at runtime once compiles the OpenCL C code to the device-specific assembly, which is different for every vendor. With an Nvidia GPU, you can for example look at the Compiler PTX assembly for the device.
If you know OpenGL then OpenCL works on the same principle.
Disclaimer: Self-learned knowledge ahead. I wanted to learn OpenCL and found the OpenCL term as confusing as you do. So I did some painful research until I got my first OpenCL hello-world program working.
Overview
OpenCL is an open standard - i.e. just API specification which targets heterogeneous computing hardware in particular.
The standard comprises a set of documents available here. It is up to the manufacturers to implement the standard for their devices and make OpenCL available e.g. through GPU drivers to users. Perhaps in the form of a shared library.
It is also up to the manufacturers to provide tools for developer to make applications using OpenCL.
Here's where it gets complicated.
SDKs
Manufacturers provide SDKs - software packages that contain everything the said developer needs. (See the link above). But they are specific for each - e.g. NVIDIA SDK won't work without their gpu.
ICD Loader
Because of SDKs being tied to a signle vendor, the most portable(IMHO) solution is to use what is known as Khronos' ICD loader. It is kind of "meta-driver" that will, during run-time, search for other ICDs present in the system by AMD, Intel, NVIDIA, and others; then forward them calls from our application. So, as a developer, we can develop against this generic driver and use clGetPlatformIDs to fetch the available platforms and devices. It is availble as libOpenCL.so, at least on Linux, and we should link against it.
Counterpart for OpenGL's libOpenGL, well almost, because the vast majority of OpenGL(1.1+) is present in the form of extensions and must be loaded separately with e.g. GLAD. In that sense, GLAD is very similar to the ICD loader.
Again, it does not contain any actual "computing" code, only stub implementations of the API which forward everything to the chosen platform's ICD.
Headers
We are still missing the headers, thankfully Khronos organization releases C headers and also C++ bindings. But nothing is stopping you from writing them yourself based on the official API documents. It would just be really tedious and error-prone.
Here we can find yet another parallel with OpenGL because the headers are also just the consequence of the Standard and GLAD generates them directly from its XML version! How cool is that?!
Summary
To write a simple OpenCL application we need to:
Download an ICD from the device's manufactures - e.g. up-to-date GPU drivers is enough.
Download the headers and place them in some folder.
Download, build, and install an ICD loader. It will likely need the headers too.
Include the headers, use API in them, and link against the ICD loader.
For Debian, maybe Ubuntu and others there is a simpler approach:
Download the drivers... look for <vendor>-opencl-icd, the drivers on Linux are usually not as monolithic as on Windows and might span many packages.
Install ocl-icd-opencl-dev which contains C, C++ headers + the loader.
Use the headers and link the library.

Altera Arria V latest software for OpenCL

I recently bought a new Altera Arria V board 1. I am planning to use it to design a certain application using OpenCL.
Unfortunately, I didn't find so far the required software to get it work. I mean by that the Altera RTE for OpenCL and the required driver (aclsoc_drv.ko).
I would be grateful if you could help me how I can find the latest software!
Thank you all
You need to read the manual about how to install and what you need to install which you should be able to find here: Arria V SoC Development Kit and SoC Embedded Design Suite
OpenCL runtime environment and SDK you can find here: Altera SDK for OpenCL
Also Altera OpenCL guides will be helpful.
If I understand to your problem correctly, you are missing the required BSP and the base design to support OpenCL development for your Arria V board.
For this problem, please check with the board vendor (customer support) for the availability of the two items.
To be able to develop under Altera OpenCL framework, you need to load a basic design file on to the flash ROM / configuration device on the board. This is to provide some necessary IP support for the basic PCIe and memory access. It is usually provided with the BSP from the board vendor.
Along with the basic design file, drivers should be included as well for your host to recognise the OpenCL device.
Until you sort out the above missing parts, your OpenCL development environment should be up and running. Good luck!

How much memory does linux kernel and base services use?

I'm doing and embedded linux+qt project and I was wondering what was the base memory consuption of the linux kernel plus some basic services. Just enough to run some framebuffer based application.
I ended up in here: http://qt-project.org/doc/qt-4.8/requirements-embedded-linux.html but as I'm reading that seems like it's just the qt requirements without counting the linux overhead.
Can someone point me to a more detailed resource on the topic?
The numbers in the table you cited look reasonable.
The actual answer is "it depends". Yes, Virginia: it is possible to have a working OS and a Qt-based GUI in under 4MB.
The actual memory usage will vary wildly, depending on:
Which kernel you use
How you configure your kernel build
Which kernel drivers you load at runtime
What you start up during system init
Etc etc
Book recommendation:
Embedded Linux Primer, Christopher Hallinan
ALSO: here's a list of prebuilt-distros with GUIs that all run on Pentium IVs with 512MB RAM:
http://www.osnews.com/story/26087
I would suggest using Yocto for such builds, but you can also take a look at the upcoming "Boot to Qt" project which is basic a Qt 5 replacement for Qt embedded with Qt 4. I would not suggest looking into the link you pasted in your question.
You should definitely focus on Qt 5 for several reasons. The foremost is probably because you can get hardware acceleration and Qt got a lot of utilization for embedded, including decoupling the QtWidgets module, and so forth.
Here you can find the technology preview that the guy in Norway are working on. This is just for future reference:
http://blog.qt.digia.com/blog/2013/05/21/introducing-boot-to-qt-a-technology-preview/
I would start using the Yocto project for now. We have worked on a "meta-qt5" layer which is not perfect, but good enough. Yocto will also take care of the Linux with "minimal images", et cetera.
Not sure if you had seen the classic example a couple of years ago, but there was a "Qt boot" for an embedded Linux board which happened within a second. Here is the link to the reading material. Unfortunately, the original video does not seem to be available anymore.
http://www.embedded-bits.co.uk/2011/1-second-linux-boot-to-qt/

Need to install opencl for CPU and GPU platforms?

I have a system with an NVidia graphics card and I'm looking at using openCL to replace openMP for some small on CPU tasks (thanks to VS2010 making openMP useless)
Since I have NVidia's opencl SDK installed clGetPlatformIDs() only returns a single platform (NVidia's) and so only a single device (the GPU).
Do I need to also install Intel's openCL sdk to get access to the CPU platform?
Shouldn't the CPU platform always be available - I mean, how do you NOT have a cpu?
How do you manage to build against two openCL SDKs simultaneously?
You need to have a SDK which provides interface to CPU. nVidia does not, AMD and Intel's SDKs do; in my case the one from Intel is significantly (something like 10x) faster, which might due to bad programming on my part however.
You don't need the SDK for programs to run, just the runtime. In Linux, each vendor installs a file in /etc/OpenCL/vendors/*.icd, which contains path of the runtime library to use. That is scanned by the OpenCL runtime you link to (libOpenCL.so), which then calls each of the vendor's libs when querying for devices on that particular platform.
In Linux, the GPU drivers install OpenCL runtime automatically, the Intel runtime is likely to be downloadable separately from the SDK, but is part of the SDK as well, of course.
Today i finally got around to trying to start doing openCl development and wow... it is not straight forward at all.
There's an AMD sdk, there's an intel sdk, there's an nvidia sdk, each with their own properties (CPU only vs GPU only vs specific video card support only perhaps?)
There may be valid technical reasons for it having to be this way but i really wish there was just one sdk, and that when programming perhaps you could specify GPU / CPU tasks, or that maybe it would use whatever resources made most sense / preformed best or SOMETHING.
Time to dive in though I guess... trying to decide though if i go CPU or GPU. I have a pretty new 4000$ alienware laptop with SLI video cards, but then also an 8 core cpu so yeah... guess ill have to try a couple sdk's and see which preforms best for my needs?
Not sure what end users of my applications would do though... it doesnt seem like they can flip a switch to make it run on cpu or gpu instead.
The OpenCL landscape really needs some help...

Is the SPARC architecture still relevant as a JIT compiler target on high-end servers?

X86 and AMD64 are the most important architectures for many computing environments (desktop, servers, and supercomputers). Obviously a JIT compiler should support both of them to gain acceptance.
Until recently, the SPARC architecture was the logical next step for a compiler, specially on high-end servers markets. But now that Sun is dead, things are not clear.
Oracle doesn't seem to be really interested in it, and some big projects are dropping support for that architecture (Ubuntu for example). But on the other hand, the OpenSPARC initiative intended to open source recent processors is quite promising, meaning that a lot of manufacturers could implement and use SPARC for free in the near future.
So, is SPARC still a good choice as the next target architecture for a JIT compiler? Or is it better to choose another one (POWER, ARM, MIPS, ...)?
I don't know any more than you about SPARC's future. I hope it has one; it's been tragic how many good architectures have died out while x86 has kept going.
But i would suggest you look at ARM as a target. It isn't present in big server hardware, but it's huge in the mobile market, and powers all sorts of interesting little boxes, like my NAS, my ADSL router, and so on.
Your next target architecture should definitely be ARM - power consumption in large datacenters is a huge issue and the next big thing will be trying to reduce that by using low-power CPUs; see Facebook's first attempt on this.

Resources