OpenCl is a Library or is a Compiler? - opencl

I started to learn OpenCl.
I read these links:
https://en.wikipedia.org/wiki/OpenCL
https://github.com/KhronosGroup/OpenCL-Guide/blob/main/chapters/os_tooling.md
https://www.khronos.org/opencl/
but I did not understand well that OpenCl is a library by including header file in source code or it is a compiler by using OpenCl C Compiler?!

It is both a library and a compiler.
The OpenCL C/C++ bindings that you include as header files and that you link against are a library. These provide the necessary functions and commands in C/C++ to control the device (GPU).
For the device, you write OpenCL C code. This language is not C or C++, but rather based on C99 and has some specific limitations (for example only 1D arrays) as well as extensions (math and vector functionality).
The OpenCL compiler sits in between the C/C++ bindings and the OpenCL C part. Using the C/C++ bindings, you run the compiler at runtime of the executable with the command clBuildProgram (C bindings) or program.build(...) (C++). It then at runtime once compiles the OpenCL C code to the device-specific assembly, which is different for every vendor. With an Nvidia GPU, you can for example look at the Compiler PTX assembly for the device.

If you know OpenGL then OpenCL works on the same principle.
Disclaimer: Self-learned knowledge ahead. I wanted to learn OpenCL and found the OpenCL term as confusing as you do. So I did some painful research until I got my first OpenCL hello-world program working.
Overview
OpenCL is an open standard - i.e. just API specification which targets heterogeneous computing hardware in particular.
The standard comprises a set of documents available here. It is up to the manufacturers to implement the standard for their devices and make OpenCL available e.g. through GPU drivers to users. Perhaps in the form of a shared library.
It is also up to the manufacturers to provide tools for developer to make applications using OpenCL.
Here's where it gets complicated.
SDKs
Manufacturers provide SDKs - software packages that contain everything the said developer needs. (See the link above). But they are specific for each - e.g. NVIDIA SDK won't work without their gpu.
ICD Loader
Because of SDKs being tied to a signle vendor, the most portable(IMHO) solution is to use what is known as Khronos' ICD loader. It is kind of "meta-driver" that will, during run-time, search for other ICDs present in the system by AMD, Intel, NVIDIA, and others; then forward them calls from our application. So, as a developer, we can develop against this generic driver and use clGetPlatformIDs to fetch the available platforms and devices. It is availble as libOpenCL.so, at least on Linux, and we should link against it.
Counterpart for OpenGL's libOpenGL, well almost, because the vast majority of OpenGL(1.1+) is present in the form of extensions and must be loaded separately with e.g. GLAD. In that sense, GLAD is very similar to the ICD loader.
Again, it does not contain any actual "computing" code, only stub implementations of the API which forward everything to the chosen platform's ICD.
Headers
We are still missing the headers, thankfully Khronos organization releases C headers and also C++ bindings. But nothing is stopping you from writing them yourself based on the official API documents. It would just be really tedious and error-prone.
Here we can find yet another parallel with OpenGL because the headers are also just the consequence of the Standard and GLAD generates them directly from its XML version! How cool is that?!
Summary
To write a simple OpenCL application we need to:
Download an ICD from the device's manufactures - e.g. up-to-date GPU drivers is enough.
Download the headers and place them in some folder.
Download, build, and install an ICD loader. It will likely need the headers too.
Include the headers, use API in them, and link against the ICD loader.
For Debian, maybe Ubuntu and others there is a simpler approach:
Download the drivers... look for <vendor>-opencl-icd, the drivers on Linux are usually not as monolithic as on Windows and might span many packages.
Install ocl-icd-opencl-dev which contains C, C++ headers + the loader.
Use the headers and link the library.

Related

OpenCL for custom systems on SoC prototyping board

Is it possible to run OpenCL on a system designed by a user on a SoC prototyping board? To be more specific, I have a ZedBoard (Xilinx Zynq) that has Dual ARM cores and a Programmable Logic (PL) Area. If I design a simple system of my own that has a video processing accelerator implemented in the logic area, an ARM core and an AXI interconnect, what do I have to do to provide OpenCL support for this simple system? (In this simple system, the ARM core could be the "Host" and the video processing accelerator could be the "device").
I am a student and I have only some basic knowledge about OpenCL. I have researched about my question and have only ended up confusing myself. What are the things that have to be done to provide OpenCL support for a SoC? I understand that this may be a big project, but I need a guideline where to start and how to proceed.
what do I have to do to provide OpenCL support for this simple system?
Implement a OpenCL platform that makes either use of your ARM CPU or the FPGA (or both). I'd say that is pretty much impossible for you; ARM would surely offer one for the CPU if it was easy (and they definitely have the financial means to employ capable engineers/computer scientists), and implementing accelerators on an FPGA requires in-depth knowledge of FPGA development, as well as compiler theory and experience in systems design. I don't want to sound mean, but you seem to have none of these three.
You asked where to get started; I recommend just writing a first accelerator that e.g. adds up a vector of numbers; as soon as you have that, you will have a clearer idea of your task.
If you want to have a look at a reference: The Ettus USRP E310 is a zynq-based SDR device. Ettus has a technology called RFNoC, which allows users to write their own blocks to push data through. Notice that this took quite a few engineers and quite some time to get started. Notice further that it's much easier than implementing something that converts OpenCL to FPGA implementations.
If you have access to the Xilinx tools: Vivado HLS 15.1 System Edition should compile OpenCL kernels. This will also be included in the SDAccel tool suite.
Source: UG973: Vivado Design Suite User Guide Release Notes, Installation,and Licensing
An alternative might be switching to Altera. They provide some good examples for the Altera Cyclone V SoC which is comparable to Xilinx Zynq devices (also includes ARM Cortex-A9) :
AlteraSDK for OpenCL
I am also a student and my current project is also going on a similar direction, i have successfully installed a version of opencl called POCL on the zedboard, it successfully detects the arm cpu of the zedboard. To install pocl, you need llvm and a horde of other things as well. but basic steps to get pocl up on the zedboard are given below:-
Installing pocl:
http://www.hosseinabady.com/install-pocl-opencl
running example:
http://www.hosseinabady.com/embedded-system-by-examples/opencl_embedded_system/opencl-vector-addition
Lots of dependency: can resolved easily
but LLVM make sure you install 3.4 version for pocl 0.9
Steps to install llvm
https://github.com/pacs-course/pacs/wiki/Instructions-to-install-clang-3.1-on-ubuntu-12.04.1-and-12.10
POCL 0.9 is successfully working for me, as you do the installation you will face many other missing dependencies like hwloc, mesa libraries, open gl/cl headers icd loaders i hope you can resolve them as its a very big list to put up in stack overflow.
In order to detect your fpga as an open cl device, thats not going to be a trivial thing to do, you can refer to this link question i posted on github
https://github.com/pocl/pocl/issues/285
and also a research paper published by hosseinbady found on the publications link on the pocl website
http://pocl.sourceforge.net/publications.html
hope this helps you
Try the ARM OpenCL SDK. The Zedboard has an ARM A9 CPU, this should have a NEON SIMD vector unit http://www.arm.com/products/processors/technologies/neon.php which can run OpenCL. See http://www.arm.com/products/multimedia/mali-technologies/opencl-for-neon.php.
The Zedboard isn't listed as an OpenCL conformant platform https://www.khronos.org/conformance/adopters/conformant-products#opencl.
So there is a chance the ARM driver will not work.
Good luck!
If still relevant, try this paper OpenCL on ZYNQ [PDF]
Also note that Zynq-7000 is listed on https://www.khronos.org/conformance/adopters/conformant-products#opencl ( OpenCL_1_0 ), hence the compatibility.

Does Qt have general-purpose classes besides GUI-related classes?

I have recently regained some interest in learning Qt, but have the following doubt:
Does Qt have enough classes that are not GUI-related?
For example, Python is "batteries-included", .NET is definitely "batteries-included", and as far as I have seen, Android API also has a lot of classes to design and implement application/domain logic, not directly related to visual presentation.
The main reason I am asking is because I don't know C++ and don't plan to learn it deeply (too much time needed), so if I had to take third party C++ libraries all the time and struggle to use them inside Qt projects that would be a strong point against going ahead.
The intended use is mostly to create small desktop apps for personal use while gaining insight on software design good practices - a profession I am slowly migrating to.
I have already used some Python/Pygtk (without IDE) and WPF (in VStudio/ExpressionBlend). In both platforms, most of my work is related to scientific computations, image processing and interactive scientific visualization, and there are good libraries for that either in Python (Numpy, Scipy, Matplotlib, Pandas, PIL, cairo) and .NET(AForge, alglib, System.Media.Media3D). I wonder if the Qt ecosystem is so complete in that regard.
Qt isn't a language in itself, so you can't compare it to Python or .NET. With that being said, Qt does provide general-purpose classes like containers, a Unicode string class, character set encoders/decoders, multimedia, device and file I/O, etc. All these modules are fully documented.
There are also some external modules available for Qt, like Qwt which provides widgets for technical applications.
For other functionality where something Qt-specific isn't available, you can obviously use another appropriate library. Like OpenCV.
Oh, and you can use Qt in Python too, through PyQt.
As far as I know, Qt doesn't have image processing libraries. For that, you'll need to use something like OpenCV. Qt does have libraries for loading most common image types.
However, Qt does extend beyond just GUI classes.
There is a database module that's quite convenient. The concurrency/threading classes are nice. I've enjoyed making use of the Qt Networking classes. The FileIO classes are alright.
These classes/modules are all useful for making platform-independent code. Things like image processing are mostly algorithmic and tend to be platform-independent by nature. So I think they fall out of the scope of the Qt framework. It shouldn't be too difficult, however, to simply find a library that does what you need and link that in to your project.
A lot of the Qt Core services that heltonbiker and Nikos C. mentioned, can be thought of as extensions to C++, a little like std. Although I often prefer the Qt implementations myself. But Qt has gone much further with their libraries, with the I/O, and web services etc...
The QtXML library provides reading and writing of XML files. Traditionally we had always used xerces, but the Qt XML library is almost as simple as .NETs.
The QtNetwork library offers TCP/IP and other networks services
The QtMultimedia library performs playback and recording of audio
and video content to the use of available devices like cameras and
radios.
The QtSQL library interfaces with SQL databases.
And there is much more than that. Although these are probably services that are used to most. The other benefit is that for the most part the implementations are cross platform. So for example using the I/O services does not require you to write separate code for Linux and Windows. That is a general rule, and there are exceptions. But I am sure most people would agree that any of the services they offer are easy to use, and well documented.
Happy coding.
Qt provides ample abstraction besides UI - it comes with a set of functionality enhancing features that come with certain usage paradigms.
Container classes - shallow copy by value, copy on write
Implicit sharing for containers and certain data types
Event driven, signals and slots
A powerful and usable metasystem
Properties
Platform abstraction for a lot of functionality, from file access to network and multithreading
Cross platform atomics (not that important since C++11 atomics)
Settings API
Undo API
OpenGL abstraction (not necessarily UI, custom graphics)
Basic image formats and basic image manipulations
Qt Declarative, a.k.a QtQuick and QML markup (usable for all kind of structure markup BTW)
Dynamic plugin API
Platform abstraction and portability - same code, multiple platforms
High and low level multimedia - audio, video
Sensors and serial port
Unit test
XML, JSON, SQL
An outdated and hopefully soon updated OpenCL abstraction
Last but not least, a lot of 3rd party modules built around Qt fitting a wide range of applications
Honestly, all its missing is support for some more popular formats for file, media encoding/decoding and containers, some parallel and vector abstraction, USB, WIFI/NFC (in the works in an addon module) and it will be 100% versatile.
Note that you can also use Python with Qt, although I have no experience with that, Qt is a very versatile tool that allows for quick and easy application development - and since 5.1 supports pretty much the entire market, with the addition of Android and iOS to the list of supported platforms. It is very useful for creating custom use applications for creative or research purposes.
Although not perfect for every task, Qt is easily the "best of the bunch" of tools you can use in this regard. Unbeatable in terms of portability and very thorough, if not a little bloated for the set of functionality it provides. And finally, it is free, you can even develop commercial applications under LGPL as long as you link dynamically. All in all, it is well worth the investment to learn, the only downside is it lacks uniformity between the old C++ native APIs and the QML runtime, which is actively worked on and is based on JS, so the APIs are a bit different and some glue APIs are required to fuse C++ with JS and QML.
(just for the record, from the official site):
The Foundation: Qt Core Module
The Qt Core module forms the foundation of all Qt-based applications
with core non-graphical classes used by other modules.
Key Functions
File IO, event and object handling
Multi-threading and concurrency
Plugins, setting management
Signals and Slots inter-object communications mechanism
Benefits
Reduce development time and cost by leveraging a complete set of application building blocks
Develop portable code from the ground up with cross-platform functionality

Many OpenCL SDK's. Which of them i should choose?

In my computer with Windows 7 OS I have three versions of OpenCL SDKS's from this vendors:
Intel
NVIDIA
AMD.
I build my application with each of them.
As the output I have three different binaries.
For example: my_app_intel_x86, my_app_amd_x86, my_app_nvidia_x86
This binaries are different on this:
They use different SDK's in likange process
They try to find different OpenCL platform name in runtime
Can I use only one SDK and check platform on running time?
SDK's give debuggings tools, a platform, and possibly extensions, the OpenCL API remains the same regardless. You can link to any SDK you want, and it'll produce an executable compatible with any OpenCL runtimes you can find. Remember those are SDK's, meant for the developer - the end-user will probably only have his graphics driver (OpenCL-enabled) which doesn't care what SDK you used to build the software.
Ideally you should use a default platform for your program, but let the user override it (you can select various platforms at runtime!). You can also use heuristics to figure out which device is the fastest, e.g.:
iterate over each available platform
for each platform, iterate over each device
benchmark this device somehow in a relevant way
select the fastest one
Also, if you are using specific extensions, make sure to only accept devices which support them...
Can I use only one SDK and check platform on running time?
Yes, you absolutely can and should do that, but I am worried about what you mean by "check platform" - as I stated above, the SDK bears absolutely no influence on the platforms you can run your built program on. I can build my code with the AMD SDK, and run the executable on a system with an nVidia graphics card or an Intel processor just fine (the only difference is that I may not have access to AMD-specific extensions which will be provided by my SDK, but the extensions will be recognized by an AMD driver, so you don't even need the SDK installed to run the code - but you will to build it though).

Assembly language standard

Is there a standard that defines the syntax and semantics of assembly language? Similarly as language C has ISO standard and language C# has ECMA standard? Is there only one standard, or are there more of them?
I'm asking because I noticed that assembly language code looked different on Windows and Linux environment. I hoped that assembly language is not dependent on OS, that it's only language with some defined standard and via assembler (compiler of assembly language) is translated into machine instructions for particular processor.
thank you for answer
Yes, there is a standard.
People that built assemblers even up til the 1980s chose an incredible variety of syntax schemes.
The IEEE community reacted with a standard to try to avoid that problem:
694-1985 - IEEE Standard for Microprocessor Assembly Language
As with many things in the software world, it was and continues to be largely ignored.
The closest thing to a standard is that the vendor that created the processor/instruction set will have a document describing that language and often that vendor will provide some sort of an assembler (program). Some vendors are more detail and standard oriented than others so you get what you get. Then things like this intel/at&t happen to mess things up. Add to that gnu assembler loves to mess up the assembly language for the chips it supports as well so in general you have chaos.
If there were an assembly language whose use were comparable to C or C++ then you would expect an organization to try to come up with a standard. Part of the problem would still be that with things like the C language there is an interpretation before it hits the hardware, with assembler there is none to very little so a chip vendor is going to make whatever they want to make due to market factors and the standard would have to be dragged along to match the hardware, instead of the other way around where a standard drives the vendors.
The opencore processor might be one that could be standards driven since it is not vendor specific, perhaps it is already.
With assembly assume that each version of each assembler program/software/tool has its own syntax rules within the same instruction set as well as across different instruction sets. (which is actually what you get with C/C++ but that is another topic) either choose your favorite tool and only know it, or try to memorize all the variations across all the tools, or my preference is to try to avoid as many tool specific syntax and nuances, and try to find the middle ground that works or at least has a chance to work or port across tools.
No, there is no standard.
There are even two different types of syntax: the intel-syntax which is predominant on Windows plattforms and the AT&T-sytanx which is dominant in the *nix-world.
Regarding the differently looking code in the wikipedia: the windows example uses the Win32API and the linux example uses a system call of the 0x80 interrupt.
Assembly languages differ from processor to processor so no, there is no standard.
In general, the "standard" assembly language for a particular family of processor is whatever the processor designers say it is. For example, the "standard" syntax for x86 is whatever Intel says it is. However, that doesn't prevent other people from creating a variant of the assembly language that targets the processor with slightly different syntax or additional features (Nasm is one example).
Well, I'm not sure if you are asking about syntax for x86 processors (I suppose yes, because you're mentioning NASM).
But there are two common standards:
Intel syntax that was originally used for documentation of the x86 platform
AT&T syntax which is common in Linux/Unix worlds.
NASM you have mentioned prefers the Intel syntax.
You can find some examples of the syntax differences in this article: http://www.ibm.com/developerworks/linux/library/l-gas-nasm/index.html.
There's none because there are many different CPUs with different instructions and other peculiarities and it's entirely up to their designer what syntax to use and how to name things. And there's little need to standardize that because assembly code is inherently unportable and needs to be rewritten for a different CPU anyway.
Assembly language is not OS-specific per se, it's CPU-specific, but for an assembly routine to access things that appear standard to you (e.g. some subroutine to print text in the console) OS-specific code is needed. For MSDOS you'd use BIOS and DOS interrupt service routines (invokable on the x86 CPU through int 13h, int 10h, int 21h, int 33h, etc instructions), for Windows you'd use Windows' (available through int 2eh and sysenter/syscall instructions), for Linux you'd use Linux' (e.g. int 80h). All of them are implemented differently in different OSes and expect different number and kinds of parameters and in different places (registers or memory). You can't standardize this part. The only thing you can do about it is build a compatibility/abstraction layer on top of the OS functionality so it looks the same from your assembly routines' point of view.
Assembly syntax / language depends on CPU rather then OS. For the x86 CPU family there are however two syntax's AT&T (used by Unix like operating systems by default) and Intel (used by Windows and DOS etc.)
However the two assembly examples on the wiki are both doing different things. The windows example uses the WIN32 API and to show a message box, so all function arguments are pushed onto the stack in reversed order and then calls the function MessageBox() which on his turn creates the messagebox.
The linux example uses the write syscall to write a string to stdout. Here all 'arguments' are stored in the registers and then the int 0x80 creates an 'interrupt' now the OS is entering kernel land and the kernel prints the string to stdout.
The linux assemly could be rewritten like:
section .data
msg: db "Hello, world!", 10
.len: equ $ - msg
section .text
extern write
extern exit
global _start
_start:
push msg.len
push msg
push dword 1
call write
push dword 0
call exit
The above assembly must be linked against libc and then this will call write in libc which on his turn executes exactly the same code as the example on the wiki.
Another thing to note, is that Windows and Unix like operating system use different file formats in there libraries and applications.
Unix like systems use ELF http://en.wikipedia.org/wiki/Executable_and_Linkable_Format and windows uses PE http://en.wikipedia.org/wiki/Portable_Executable
This is why you see different sections in the assemblies on the wiki page.

Can C/C++ software be compiled into bytecode for later execution? (Architecture independent unix software.)

I would want to compile existing software into presentation that can later be run on different architectures (and OS).
For that I need a (byte)code that can be easily run/emulated on another arch/OS (LLVM IR? Some RISC assemby?)
Some random ideas:
Compiling into JVM bytecode and running with java. Too restricting? C-compilers available?
MS CIL. C-Compilers available?
LLVM? Can Intermediate representation be run later?
Compiling into RISC arch such as MMIX. What about system calls?
Then there is the system call mapping thing, but e.g. BSD have system call translation layers.
Are there any already working systems that compile C/C++ into something that can later be run with an interpreter on another architecture?
Edit
Could I compile existing unix software into not-so-lowlevel binary, which could be "emulated" more easily than running full x86 emulator? Something more like JVM than XEN HVM.
There are several C to JVM compilers listed on Wikipedia's JVM page. I've never tried any of them, but they sound like an interesting exercise to build.
Because of its close association with the Java language, the JVM performs the strict runtime checks mandated by the Java specification. That requires C to bytecode compilers to provide their own "lax machine abstraction", for instance producing compiled code that uses a Java array to represent main memory (so pointers can be compiled to integers), and linking the C library to a centralized Java class that emulates system calls. Most or all of the compilers listed below use a similar approach.
C compiled to LLVM bit code is not platform independent. Have a look at Google portable native client, they are trying to address that.
Adobe has alchemy which will let you compile C to flash.
There are C to Java or even JavaScript compilers. However, due to differences in memory management, they aren't very usable.
Web Assembly is trying to address that now by creating a standard bytecode format for the web, but unlike the JVM bytecode, Web Assembly is more low level, working at the abstraction level of C/C++, and not Java, so it's more like what's typically called an "assembly language", which is what C/C++ code is normally compiled to.
LLVM is not a good solution for this problem. As beautiful as LLVM IR is, it is by no means machine independent, nor was it intended to be. It is very easy, and indeed necessary in some languages, to generate target dependent LLVM IR: sizeof(void*), for example, will be 4 or 8 or whatever when compiled into IR.
LLVM also does nothing to provide OS independence.
One interesting possibility might be QEMU. You could compile a program for a particular architecture and then use QEMU user space emulation to run it on different architectures. Unfortunately, this might solve the target machine problem, but doesn't solve the OS problem: QEMU Linux user mode emulation only works on Linux systems.
JVM is probably your best bet for both target and OS independence if you want to distribute binaries.
As Ankur mentions, C++/CLI may be a solution. You can use Mono to run it on Linux, as long as it has no native bits. But unless you already have a code base you are trying to port at minimal cost, maybe using it would be counter productive. If it makes sense in your situation, you should go with Java or C#.
Most people who go with C++ do it for performance reasons, but unless you play with very low level stuff, you'll be done coding earlier in a higher level language. This in turn gives you the time to optimize so that by the time you would have been done in C++, you'll have an even faster version in whatever higher level language you choose to use.
The real problem is that C and C++ are not architecture independent languages. You can write things that are reasonably portable in them, but the compiler also hardcodes aspects of the machine via your code. Think about, for example, sizeof(long). Also, as Richard mentions, there's no OS independence. So unless the libraries you use happen to have the same conventions and exist on multiple platforms then it you wouldn't be able to run the application.
Your best bet would be to write your code in a more portable language, or provide binaries for the platforms you care about.

Resources