Convert OpenCL/CUDA to Metal - opencl

I'm about to convert some GPU kernels of my project from OpenCL/CUDA to Metal in order to run my application on Apple devices. Currently, my project is written completely in C/C++. After doing some research, I think I need to get my hands dirty with Swift or Objective-C. But to be honest, I'm not sure about this because using the Metal language for computation and deep learning is quite new.
I know there's a library called "CoreML", but my app requires some custom kernels. My question: what is the best way to deal with the low-level API of Apple devices in my situation?

The Metal Shading Language is a version of C++. I haven't had too much trouble with porting OpenCL or CUDA kernels to Metal.
Core ML only supports a limited set of layers. You can write your own custom layers, which involves writing a CPU version and optionally a GPU version (in the Metal language).
I wrote a blog post about this: http://machinethink.net/blog/coreml-custom-layers/

Related

OpenCL for custom systems on SoC prototyping board

Is it possible to run OpenCL on a system designed by a user on a SoC prototyping board? To be more specific, I have a ZedBoard (Xilinx Zynq) that has Dual ARM cores and a Programmable Logic (PL) Area. If I design a simple system of my own that has a video processing accelerator implemented in the logic area, an ARM core and an AXI interconnect, what do I have to do to provide OpenCL support for this simple system? (In this simple system, the ARM core could be the "Host" and the video processing accelerator could be the "device").
I am a student and I have only some basic knowledge of OpenCL. I have researched my question and have only ended up confusing myself. What has to be done to provide OpenCL support for an SoC? I understand that this may be a big project, but I need a guideline on where to start and how to proceed.
"What do I have to do to provide OpenCL support for this simple system?"
Implement an OpenCL platform that makes use of either your ARM CPU or the FPGA (or both). I'd say that is pretty much impossible for you; ARM would surely offer one for the CPU if it were easy (and they definitely have the financial means to employ capable engineers/computer scientists), and implementing accelerators on an FPGA requires in-depth knowledge of FPGA development, as well as compiler theory and experience in systems design. I don't want to sound mean, but you seem to have none of these three.
You asked where to get started; I recommend just writing a first accelerator that e.g. adds up a vector of numbers; as soon as you have that, you will have a clearer idea of your task.
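Before building that first accelerator, it can help to have a software reference to check the hardware result against. The following is only a host-side sketch in plain C++, not FPGA code; the function names `sum_reference` and `matches_reference` are made up for illustration, and it assumes the accelerator sums 32-bit signed integers.

```cpp
#include <cstdint>
#include <vector>

// Software "golden model": the sum the accelerator is expected to produce.
// A 64-bit accumulator avoids overflow when summing many 32-bit inputs.
std::int64_t sum_reference(const std::vector<std::int32_t>& values) {
    std::int64_t total = 0;
    for (std::int32_t v : values) {
        total += v;
    }
    return total;
}

// Compares a result read back from the (hypothetical) accelerator
// with the software model for the same input vector.
bool matches_reference(const std::vector<std::int32_t>& values,
                       std::int64_t accelerator_result) {
    return sum_reference(values) == accelerator_result;
}
```

How the result is read back from the programmable logic (for example over AXI) is deliberately left out, since that depends entirely on your design.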
If you want to have a look at a reference: The Ettus USRP E310 is a zynq-based SDR device. Ettus has a technology called RFNoC, which allows users to write their own blocks to push data through. Notice that this took quite a few engineers and quite some time to get started. Notice further that it's much easier than implementing something that converts OpenCL to FPGA implementations.
If you have access to the Xilinx tools: Vivado HLS 15.1 System Edition should compile OpenCL kernels. This will also be included in the SDAccel tool suite.
Source: UG973: Vivado Design Suite User Guide Release Notes, Installation, and Licensing
An alternative might be switching to Altera. They provide some good examples for the Altera Cyclone V SoC, which is comparable to Xilinx Zynq devices (it also includes an ARM Cortex-A9):
Altera SDK for OpenCL
I am also a student, and my current project is heading in a similar direction. I have successfully installed an OpenCL implementation called POCL on the ZedBoard, and it detects the board's ARM CPU. To install POCL you need LLVM and a host of other things, but the basic steps to get POCL up on the ZedBoard are given below:
Installing POCL:
http://www.hosseinabady.com/install-pocl-opencl
Running an example:
http://www.hosseinabady.com/embedded-system-by-examples/opencl_embedded_system/opencl-vector-addition
There are lots of dependencies, but most can be resolved easily.
For LLVM, make sure you install version 3.4 for POCL 0.9.
Steps to install LLVM:
https://github.com/pacs-course/pacs/wiki/Instructions-to-install-clang-3.1-on-ubuntu-12.04.1-and-12.10
POCL 0.9 is working for me. During the installation you will run into many other missing dependencies, such as hwloc, the Mesa libraries, the OpenGL/OpenCL headers and ICD loaders. I hope you can resolve them yourself, as the list is too long to post on Stack Overflow.
Getting your FPGA detected as an OpenCL device is not going to be trivial. You can refer to this question I posted on GitHub:
https://github.com/pocl/pocl/issues/285
and also to a research paper by Hosseinabady, found via the publications link on the POCL website:
http://pocl.sourceforge.net/publications.html
Hope this helps.
Try the ARM OpenCL SDK. The ZedBoard has an ARM A9 CPU, which should have a NEON SIMD vector unit (http://www.arm.com/products/processors/technologies/neon.php) that can run OpenCL. See http://www.arm.com/products/multimedia/mali-technologies/opencl-for-neon.php.
The Zedboard isn't listed as an OpenCL conformant platform https://www.khronos.org/conformance/adopters/conformant-products#opencl.
So there is a chance the ARM driver will not work.
Good luck!
If still relevant, try this paper OpenCL on ZYNQ [PDF]
Also note that Zynq-7000 is listed on https://www.khronos.org/conformance/adopters/conformant-products#opencl ( OpenCL_1_0 ), hence the compatibility.

Does Qt have general-purpose classes besides GUI-related classes?

I have recently regained some interest in learning Qt, but have the following doubt:
Does Qt have enough classes that are not GUI-related?
For example, Python is "batteries-included", .NET is definitely "batteries-included", and as far as I have seen, Android API also has a lot of classes to design and implement application/domain logic, not directly related to visual presentation.
The main reason I am asking is because I don't know C++ and don't plan to learn it deeply (too much time needed), so if I had to take third party C++ libraries all the time and struggle to use them inside Qt projects that would be a strong point against going ahead.
The intended use is mostly to create small desktop apps for personal use while gaining insight on software design good practices - a profession I am slowly migrating to.
I have already used some Python/PyGTK (without an IDE) and WPF (in Visual Studio/Expression Blend). On both platforms, most of my work is related to scientific computation, image processing and interactive scientific visualization, and there are good libraries for that both in Python (Numpy, Scipy, Matplotlib, Pandas, PIL, cairo) and in .NET (AForge, alglib, System.Media.Media3D). I wonder if the Qt ecosystem is as complete in that regard.
Qt isn't a language in itself, so you can't compare it to Python or .NET. With that being said, Qt does provide general-purpose classes like containers, a Unicode string class, character set encoders/decoders, multimedia, device and file I/O, etc. All these modules are fully documented.
There are also some external modules available for Qt, like Qwt which provides widgets for technical applications.
For other functionality where something Qt-specific isn't available, you can use another appropriate library, such as OpenCV.
Oh, and you can use Qt in Python too, through PyQt.
As far as I know, Qt doesn't have image processing libraries. For that, you'll need to use something like OpenCV. Qt does have libraries for loading most common image types.
However, Qt does extend beyond just GUI classes.
There is a database module that's quite convenient. The concurrency/threading classes are nice. I've enjoyed making use of the Qt Networking classes. The FileIO classes are alright.
These classes/modules are all useful for making platform-independent code. Things like image processing are mostly algorithmic and tend to be platform-independent by nature. So I think they fall out of the scope of the Qt framework. It shouldn't be too difficult, however, to simply find a library that does what you need and link that in to your project.
A lot of the Qt Core services that heltonbiker and Nikos C. mentioned, can be thought of as extensions to C++, a little like std. Although I often prefer the Qt implementations myself. But Qt has gone much further with their libraries, with the I/O, and web services etc...
The QtXML library provides reading and writing of XML files. Traditionally we had always used Xerces, but the Qt XML library is almost as simple as .NET's.
The QtNetwork library offers TCP/IP and other network services.
The QtMultimedia library handles playback and recording of audio and video content, and the use of available devices like cameras and radios.
The QtSQL library interfaces with SQL databases.
And there is much more than that, although these are probably the services that are used the most. The other benefit is that, for the most part, the implementations are cross-platform. So, for example, using the I/O services does not require you to write separate code for Linux and Windows. That is a general rule, and there are exceptions. But I am sure most people would agree that the services they offer are easy to use and well documented.
Happy coding.
Qt provides ample abstraction besides UI - it comes with a set of functionality enhancing features that come with certain usage paradigms.
Container classes - shallow copy by value, copy on write
Implicit sharing for containers and certain data types
Event driven, signals and slots
A powerful and usable metasystem
Properties
Platform abstraction for a lot of functionality, from file access to network and multithreading
Cross platform atomics (not that important since C++11 atomics)
Settings API
Undo API
OpenGL abstraction (not necessarily UI, custom graphics)
Basic image formats and basic image manipulations
Qt Declarative, a.k.a QtQuick and QML markup (usable for all kind of structure markup BTW)
Dynamic plugin API
Platform abstraction and portability - same code, multiple platforms
High and low level multimedia - audio, video
Sensors and serial port
Unit test
XML, JSON, SQL
An outdated and hopefully soon updated OpenCL abstraction
Last but not least, a lot of 3rd party modules built around Qt fitting a wide range of applications
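To make the "shallow copy by value, copy on write" items above concrete, here is a rough sketch of the idea in plain standard C++. It is not Qt's implementation: `SharedIntList` and all of its members are invented for illustration, and unlike Qt's containers this toy version makes no attempt at thread safety.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Plain-C++ illustration of implicit sharing; this is NOT Qt's implementation.
class SharedIntList {
public:
    SharedIntList() : data_(std::make_shared<std::vector<int>>()) {}

    // Copying the object copies only the pointer (a shallow copy).
    SharedIntList(const SharedIntList&) = default;
    SharedIntList& operator=(const SharedIntList&) = default;

    // Writes go through detach(), so other copies never see the change.
    void append(int value) {
        detach();
        data_->push_back(value);
    }

    int at(std::size_t i) const { return data_->at(i); }
    std::size_t size() const { return data_->size(); }

    // True while two objects still point at the same underlying buffer.
    bool sharesBufferWith(const SharedIntList& other) const {
        return data_ == other.data_;
    }

private:
    // Copy-on-write: make a private deep copy just before a write
    // if another object also references the buffer.
    void detach() {
        if (data_.use_count() > 1) {
            data_ = std::make_shared<std::vector<int>>(*data_);
        }
    }

    std::shared_ptr<std::vector<int>> data_;
};
```

The point of the pattern is that passing such an object by value stays cheap until somebody actually modifies their copy.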
Honestly, all it's missing is support for some more of the popular file, media encoding/decoding and container formats, some parallel and vector abstraction, USB, and WiFi/NFC (in the works in an add-on module), and it will be 100% versatile.
Note that you can also use Python with Qt, although I have no experience with that. Qt is a very versatile tool that allows for quick and easy application development, and since 5.1 it supports pretty much the entire market, with the addition of Android and iOS to the list of supported platforms. It is very useful for creating custom applications for creative or research purposes.
Although not perfect for every task, Qt is easily the "best of the bunch" of tools you can use in this regard. It is unbeatable in terms of portability and very thorough, if a little bloated for the set of functionality it provides. And finally, it is free: you can even develop commercial applications under the LGPL as long as you link dynamically. All in all, it is well worth the investment to learn. The only downside is the lack of uniformity between the old native C++ APIs and the QML runtime, which is actively worked on and is based on JS, so the APIs are a bit different and some glue APIs are required to fuse C++ with JS and QML.
(just for the record, from the official site):
The Foundation: Qt Core Module
The Qt Core module forms the foundation of all Qt-based applications
with core non-graphical classes used by other modules.
Key Functions
File IO, event and object handling
Multi-threading and concurrency
Plugins, setting management
Signals and Slots inter-object communications mechanism
Benefits
Reduce development time and cost by leveraging a complete set of application building blocks
Develop portable code from the ground up with cross-platform functionality
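For readers who have not met the signals and slots mechanism named in the quote above, this is roughly the idea, sketched in plain standard C++ without Qt. The `Signal` class and its `connect`/`emitSignal` members are invented for this example; real Qt relies on QObject and its meta-object system and does considerably more than this toy does.

```cpp
#include <functional>
#include <utility>
#include <vector>

// Toy signal: keeps a list of callbacks ("slots") and calls each one on emit.
template <typename... Args>
class Signal {
public:
    using Slot = std::function<void(Args...)>;

    // Registers a callback to be run whenever the signal is emitted.
    void connect(Slot slot) { slots_.push_back(std::move(slot)); }

    // Invokes every connected slot with the given arguments, in connection order.
    void emitSignal(Args... args) const {
        for (const Slot& slot : slots_) {
            slot(args...);
        }
    }

private:
    std::vector<Slot> slots_;
};
```

The sender only knows about the signal, not about who is listening, which is what keeps objects decoupled.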

Can C/C++ software be compiled into bytecode for later execution? (Architecture independent unix software.)

I want to compile existing software into a representation that can later be run on different architectures (and OSes).
For that I need a (byte)code that can be easily run/emulated on another arch/OS (LLVM IR? Some RISC assembly?)
Some random ideas:
Compiling into JVM bytecode and running with java. Too restricting? C-compilers available?
MS CIL. C-Compilers available?
LLVM? Can Intermediate representation be run later?
Compiling into RISC arch such as MMIX. What about system calls?
Then there is the system call mapping thing, but e.g. BSD have system call translation layers.
Are there any already working systems that compile C/C++ into something that can later be run with an interpreter on another architecture?
Edit
Could I compile existing unix software into not-so-lowlevel binary, which could be "emulated" more easily than running full x86 emulator? Something more like JVM than XEN HVM.
There are several C to JVM compilers listed on Wikipedia's JVM page. I've never tried any of them, but they sound like an interesting exercise to build.
Because of its close association with the Java language, the JVM performs the strict runtime checks mandated by the Java specification. That requires C to bytecode compilers to provide their own "lax machine abstraction", for instance producing compiled code that uses a Java array to represent main memory (so pointers can be compiled to integers), and linking the C library to a centralized Java class that emulates system calls. Most or all of the compilers listed below use a similar approach.
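The "array as main memory" trick described above can be illustrated with a small sketch. It is written in C++ rather than Java or JVM bytecode, and everything in it (`FlatMemory`, `allocate`, `store32`, `load32`, `translatedExample`) is hypothetical, not taken from any of those compilers. It also assumes a 4-byte `int` in the translated C code.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

// Hypothetical "lax machine": main memory is one array, and a C pointer
// is just an integer index into it. Address 0 plays the role of NULL.
class FlatMemory {
public:
    using Address = std::uint32_t;

    explicit FlatMemory(std::uint32_t bytes) : bytes_(bytes, 0), next_free_(4) {}

    // Bump allocator standing in for malloc(); returns 0 when memory is exhausted.
    Address allocate(std::uint32_t size) {
        if (next_free_ > bytes_.size() || size > bytes_.size() - next_free_) return 0;
        Address result = next_free_;
        next_free_ += size;
        return result;
    }

    // Counterpart of `*(int32_t*)address = value`, with a bounds check.
    void store32(Address address, std::int32_t value) {
        checkRange(address, 4);
        std::memcpy(&bytes_[address], &value, 4);
    }

    // Counterpart of `*(int32_t*)address`, with a bounds check.
    std::int32_t load32(Address address) const {
        checkRange(address, 4);
        std::int32_t value;
        std::memcpy(&value, &bytes_[address], 4);
        return value;
    }

private:
    void checkRange(Address address, std::size_t size) const {
        if (address > bytes_.size() || size > bytes_.size() - address) {
            throw std::out_of_range("access outside emulated memory");
        }
    }

    std::vector<std::uint8_t> bytes_;
    Address next_free_;
};

// What the C snippet
//     int *p = malloc(2 * sizeof(int)); p[0] = 40; p[1] = 2; return p[0] + p[1];
// could turn into: pointer arithmetic becomes plain integer arithmetic.
std::int32_t translatedExample(FlatMemory& memory) {
    FlatMemory::Address p = memory.allocate(8);
    memory.store32(p + 0, 40);
    memory.store32(p + 4, 2);
    return memory.load32(p + 0) + memory.load32(p + 4);
}
```

Every memory access goes through a checked array operation here, which hints at why this style of translation tends to cost performance.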
C compiled to LLVM bitcode is not platform independent. Have a look at Google's Portable Native Client; they are trying to address that.
Adobe has Alchemy, which will let you compile C to Flash.
There are C to Java or even JavaScript compilers. However, due to differences in memory management, they aren't very usable.
WebAssembly is trying to address that now by creating a standard bytecode format for the web. Unlike JVM bytecode, WebAssembly is more low level, working at the abstraction level of C/C++ rather than Java, so it is closer to what is typically called an "assembly language", which is what C/C++ code is normally compiled to.
LLVM is not a good solution for this problem. As beautiful as LLVM IR is, it is by no means machine independent, nor was it intended to be. It is very easy, and indeed necessary in some languages, to generate target dependent LLVM IR: sizeof(void*), for example, will be 4 or 8 or whatever when compiled into IR.
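A quick way to see this target dependence for yourself is to ask the compiler for the sizes it has baked in. A minimal sketch follows; the names `TargetLayout`, `currentTargetLayout` and `describeTargetLayout` are made up for illustration. On typical 64-bit Linux and macOS targets both values come out as 64 bits, on 64-bit Windows `long` stays at 32 bits, and on 32-bit targets both are usually 32.

```cpp
#include <climits>
#include <cstddef>
#include <string>

// Sizes that the compiler fixes for the target it is building for.
struct TargetLayout {
    std::size_t pointer_bits;
    std::size_t long_bits;
};

// Evaluated at compile time: these numbers are constants in the generated
// code (and in any IR produced from it), not something decided at run time.
constexpr TargetLayout currentTargetLayout() {
    return TargetLayout{sizeof(void*) * CHAR_BIT, sizeof(long) * CHAR_BIT};
}

// Returns a short description of the form "pointer=<bits> long=<bits>";
// the actual numbers depend on the target.
std::string describeTargetLayout() {
    const TargetLayout layout = currentTargetLayout();
    return "pointer=" + std::to_string(layout.pointer_bits) +
           " long=" + std::to_string(layout.long_bits);
}
```

Since the values are already folded into constants by the time IR is emitted, the same IR cannot simply be reused on a target with different sizes.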
LLVM also does nothing to provide OS independence.
One interesting possibility might be QEMU. You could compile a program for a particular architecture and then use QEMU user space emulation to run it on different architectures. Unfortunately, this might solve the target machine problem, but doesn't solve the OS problem: QEMU Linux user mode emulation only works on Linux systems.
JVM is probably your best bet for both target and OS independence if you want to distribute binaries.
As Ankur mentions, C++/CLI may be a solution. You can use Mono to run it on Linux, as long as it has no native bits. But unless you already have a code base you are trying to port at minimal cost, maybe using it would be counter productive. If it makes sense in your situation, you should go with Java or C#.
Most people who go with C++ do it for performance reasons, but unless you play with very low level stuff, you'll be done coding earlier in a higher level language. This in turn gives you the time to optimize so that by the time you would have been done in C++, you'll have an even faster version in whatever higher level language you choose to use.
The real problem is that C and C++ are not architecture independent languages. You can write things that are reasonably portable in them, but the compiler also hardcodes aspects of the machine via your code. Think about, for example, sizeof(long). Also, as Richard mentions, there's no OS independence. So unless the libraries you use happen to have the same conventions and exist on multiple platforms, you wouldn't be able to run the application.
Your best bet would be to write your code in a more portable language, or provide binaries for the platforms you care about.

Interested in Device Programming. Where to Start

All
I have a good command of C++, but I've never done any device programming. I have a basic understanding of digital logic design, but I am a complete noob in electronics. Lately I have become very interested in microcontroller programming.
Where to start?
I don't think one really needs to have a huge amount of knowledge of electronics to run a program on a microcontroller.
I am using Linux, and I've downloaded Keil. I have never tried to run it through Wine; I've run it on Windows. How the code works is not completely clear to me. I can follow the logic, as it's written in C, but it's still like a fog to me. I just need a quick kickstart.
SO is not the best site to ask this kind of question. There's really a large distinction between programming for a PC and programming for an embedded system, and other SE sites specialize in physical computing. I got this email from Robert Cartaino on Tuesday:
...Barring any last-minute interest from [chiphacker.com], we will be launching [electronics.stackexchange.com] either tomorrow [Wednesday 9/22] or Thursday.
So, go commit to electronics.stackexchange.com here, and browse chiphacker.com while you wait. Take a look at these questions on Chiphacker:
How to become an embedded software developer?
Steps to learning Arduino Programming
PIC Programming
What are the best beginner project[s] using an arduino
There are a few things you should consider when planning your entry path to embedded systems programming.
What do you want to do?
What do you know how to do?
How fast are you comfortable learning?
I've outlined a few options in the following paragraphs.
You tagged your question linux-device-driver, does this mean that you want to make a custom device to use in Linux? If you meant embedded-linux, then you're into a larger class of microcontrollers. I suggest that you look at the BeagleBoard, also look at this Chiphacker question for some other options. If you want to do embedded linux, and want to build your own board, you'll first need to build up some experience in simpler levels of embedded systems design.
You also tagged your question avr, which is a popular microcontroller class made by Atmel (check out the avrfreaks forum for more info). I started learning embedded systems on the ATmega324p; they really have great documentation, are easy to use, and there are more sites online for the avr than most any other processor.
If you want an easier learning curve, I suggest taking a look at the Arduino environment. It uses Wiring, which is very similar to C/C++, and the Arduino can be enhanced with 'shields', which are modules that can be plugged into the Arduino main board to add functionality. This is your Quick Kickstart.
A good learning path would be to get familiar with the Arduino, then build your own AVR board (possibly a Linux device, like a joystick), then work with an ARM-based development kit, and finally move on to building your own embedded Linux board. You can skip a few steps if you don't mind a steep learning curve, or stop at any point along the way if a given level's capabilities satisfy your needs. You don't necessarily need a "huge amount of knowledge on electronics to run a program on a microcontroller", it's true, but you should understand some basic things like voltage and current before you try to light an LED or connect two devices.
Finally, you said in your question that you've installed the Keil IDE. While this is a fine and rather popular IDE, I'd suggest that you learn using a gcc-based command line toolchain. There are a staggering number of ways in which things can go wrong when working with embedded systems, and an IDE adds a layer of magic on top of everything that happens. While this can be nice, I'm a strong advocate of minimizing the magic when trying to learn the system. You need to understand the low-level stuff when things don't work automagically. This advice doesn't apply when using the Arduino, which is designed to (and does) make all of the automagical stuff work well.
sparkfun.com has a lot of boards, from the Arduino family and others. I recommend the ARMmite Pro, and the LilyPad instead of the Arduino Pro because there is no soldering involved; for either you will need the correct USB-to-serial/power adapter. The mbed2 costs a little more, and the blue LEDs are brutal on the eyes, but it is easy to use. None of the above requires you to play in their sandbox: you can use the canned environment, but it is not required.
If it is Linux development you are after, I recommend hawkboard.org over beagleboard.org: making the BeagleBoard usable costs about twice as much as the board itself, whereas the Hawkboard is usable after buying only something to power it. But you can also just learn Linux drivers on your desktop or laptop, and don't necessarily need to mess with embedded hardware.
Emulators are a good start. QEMU is good stuff: it emulates a number of processors and is great for emulating virtual Linux systems, learning Linux driver development, etc. But giving you visibility into what the (virtual/emulated) processor is doing is not its goal, and I find it useful to have that visibility. gdb includes a few emulators as well. MAME is loaded with them, but like QEMU it is designed for fast emulation and not for education. VisualBoyAdvance is good. Emulation is never perfect, so eventually you will want to run on hardware, but emulators and compiler tools are free, and you can learn quite a bit before you have to buy hardware. There is a considerable amount you cannot learn from an emulator, though: loading your programs into flash/RAM, debugging using JTAG or other interfaces, I2C, SPI, etc.

Architectural decision: Qt or Eclipse Platform?

We are in the process of designing a tool to be used with an HDEM (High Definition Electron Microscope). We get stacks of 2D images from the HDEM, and the first step is 'detecting borders' on the sections. After detecting the edges of the 2D slices, the next step is to construct the 3D model from these 2D slices.
The 'border detecting' algorithm(s) is/are implemented by one of the professors, who has used and suggests using C (to gain high performance; it will probably be parallelised in the future).
We have to develop a comprehensive UI, 3D viewer, 2D editor, etc. and use this algorithm.
The application should support the usual features like project save/open, undo, redo, etc.
Our technology options are:
A) Build the entire platform from scratch using Qt.
B) Use the Eclipse Platform.
Our concerns are:
If we choose A), we can easily integrate the 'border detecting' algorithm(s) because the development environment is C/C++, but we have to implement the basic features from scratch.
If we choose B), we get the basic features from the Eclipse Platform, but integrating C libraries is going to be a tedious task.
Any suggestions on this?
I'd go for Qt any time:-)
If you need an IDE framework to build your project on, you might want to consider Qt together with Qt Creator. The latter is a really nice IDE to develop with and can be extended with custom plugins, pretty much like Eclipse.
If you need performance and a well-controlled process I'd suggest going the Qt way.
Qt has a well documented class library that should make implementation of basic features fairly easy and intuitive. It also has OpenGL support for 3D and good 2D editing capabilities.
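As an example of one such basic feature, the undo/redo requirement from the question usually comes down to a stack of reversible commands. Qt ships its own undo framework (QUndoStack and QUndoCommand), which is what you would normally use; the sketch below deliberately avoids Qt and uses invented names (`Command`, `UndoStack`) just to show the underlying idea in plain C++.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// One reversible edit: how to apply it and how to revert it.
struct Command {
    std::function<void()> redo;
    std::function<void()> undo;
};

class UndoStack {
public:
    // Executes the command and records it; previously undone commands are discarded.
    void push(Command command) {
        commands_.resize(index_);
        command.redo();
        commands_.push_back(std::move(command));
        ++index_;
    }

    bool canUndo() const { return index_ > 0; }
    bool canRedo() const { return index_ < commands_.size(); }

    // Reverts the most recently applied command, if there is one.
    void undo() {
        if (!canUndo()) return;
        --index_;
        commands_[index_].undo();
    }

    // Re-applies the most recently undone command, if there is one.
    void redo() {
        if (!canRedo()) return;
        commands_[index_].redo();
        ++index_;
    }

private:
    std::vector<Command> commands_;
    std::size_t index_ = 0;  // number of commands currently applied
};
```

Each editor action (moving a border point, deleting a slice, and so on) would become one command that captures whatever state it needs to revert itself.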
I've recently built a monitoring application with a custom UI and it was fairly easy once you get past the basic concepts behind the framework.

Resources