What are Intel execution modes?

I'm trying to study x86 internals. I have read about System Management Mode, Real Mode, Protected Mode and Long Mode, but I have heard there are a few more undocumented modes.
Can anyone point me to documentation that confirms this or explains them?

Are you asking only about operating modes, or about x86 modes in general? The Wikipedia article on this documents all of these modes. Your question is a bit broad.

Related

How to develop an OpenCL application targeting specifically Intel CoffeeLake-H GT2 (UHD Graphics 630) without this device?

I've been tasked to develop an OpenCL application for a specific platform, Intel CoffeeLake-H GT2 (UHD Graphics 630). There are two problems for me:
Although I have some OpenCL programming experience (not much, though), I don't know where to begin. I have no prior experience with targeting specific hardware.
The device itself has to be emulated somehow, because I don't have it at hand.
Of course, I tried googling for information today, but couldn't find anything that really helped, probably because of my lack of experience. So I'm stumped right now and asking for help.
It would be really great if I can be helped. Any help would be appreciated. Thanks in advance.
Small note: I'm working on this project under Ubuntu 18.04.

I'm not aware of any emulated environment, and anyway, ultimately nothing replaces access to the target hardware. I see a few workarounds:
Target a similar-enough device. Intel GPUs haven't changed that drastically, so especially if you have an older/lower-spec one around, whatever you end up with should run better on the newer GPU. You can also work with a GPU from another vendor if you have at least sporadic access to a system with an Intel GPU. You don't want to go for too long at a time without testing on your target hardware. (It's generally a good idea to test OpenCL code against different implementations while developing, as it's easy to accidentally rely on implementation-defined or undefined behaviour otherwise.)
Rent a relevant physical device. Places exist that allow you to rent laptops or desktop PCs for a short time period.
Remote access to a target device. Presumably whoever posed the requirement actually has such devices. Ask for remote access to one of them, via the magic of the internet. (RDP, VNC, SSH)
Rent similar hardware in a data centre. There are bare metal hosting companies that rent out physical servers built from commodity hardware. Find one that offers servers with a close enough match to the system you're targeting and rent one there.
As for the skill gap, well, you'll either have to bridge that one yourself by following enough documentation, tutorials, etc. or by finding (hiring…) someone who will give you some degree of hand-holding through the project.

OpenCL for custom systems on SoC prototyping board

Is it possible to run OpenCL on a system designed by a user on a SoC prototyping board? To be more specific, I have a ZedBoard (Xilinx Zynq) that has Dual ARM cores and a Programmable Logic (PL) Area. If I design a simple system of my own that has a video processing accelerator implemented in the logic area, an ARM core and an AXI interconnect, what do I have to do to provide OpenCL support for this simple system? (In this simple system, the ARM core could be the "Host" and the video processing accelerator could be the "device").
I am a student and have only basic knowledge of OpenCL. I have researched my question and have only ended up confusing myself. What has to be done to provide OpenCL support for a SoC? I understand that this may be a big project, but I need a guideline on where to start and how to proceed.

what do I have to do to provide OpenCL support for this simple system?
Implement an OpenCL platform that makes use of either your ARM CPU or the FPGA (or both). I'd say that is pretty much impossible for you; ARM would surely offer one for the CPU if it was easy (and they definitely have the financial means to employ capable engineers/computer scientists), and implementing accelerators on an FPGA requires in-depth knowledge of FPGA development, as well as compiler theory and experience in systems design. I don't want to sound mean, but you seem to have none of these three.
You asked where to get started; I recommend just writing a first accelerator that e.g. adds up a vector of numbers; as soon as you have that, you will have a clearer idea of your task.
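For a first accelerator like the vector sum suggested above, it can help to have a software reference model to compare the hardware results against. The sketch below is only an illustration, written in Python for brevity; the 32-bit wrap-around accumulator and the names `reference_vector_sum` and `MASK32` are assumptions made for this example, not something the answer prescribes.

```python
MASK32 = 0xFFFFFFFF  # assumed accumulator width: 32 bits (hypothetical choice)

def reference_vector_sum(values, mask=MASK32):
    """Sum unsigned integers the way a fixed-width hardware accumulator would."""
    acc = 0
    for v in values:
        # Wrap around on overflow instead of growing like a Python int
        acc = (acc + (v & mask)) & mask
    return acc

if __name__ == "__main__":
    print(reference_vector_sum([1, 2, 3, 4]))
    print(reference_vector_sum([0xFFFFFFFF, 2]))
```

Feeding the same input vectors to this model and to the accelerator gives a simple pass/fail check while the hardware design is still changing.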
If you want to have a look at a reference: the Ettus USRP E310 is a Zynq-based SDR device. Ettus has a technology called RFNoC, which allows users to write their own blocks to push data through. Notice that this took quite a few engineers and quite some time to get started. Notice further that it's much easier than implementing something that converts OpenCL to FPGA implementations.

If you have access to the Xilinx tools: Vivado HLS 2015.1 System Edition should compile OpenCL kernels. This will also be included in the SDAccel tool suite.
Source: UG973: Vivado Design Suite User Guide: Release Notes, Installation, and Licensing
An alternative might be switching to Altera. They provide some good examples for the Altera Cyclone V SoC, which is comparable to the Xilinx Zynq devices (it also includes an ARM Cortex-A9):
Altera SDK for OpenCL

I am also a student, and my current project is going in a similar direction. I have successfully installed an OpenCL implementation called POCL on the ZedBoard, and it detects the board's ARM CPU. To install POCL you need LLVM and a host of other things, but the basic steps to get POCL running on the ZedBoard are given below.
Installing POCL:
http://www.hosseinabady.com/install-pocl-opencl
Running an example:
http://www.hosseinabady.com/embedded-system-by-examples/opencl_embedded_system/opencl-vector-addition
There are lots of dependencies, most of which can be resolved easily. For LLVM, however, make sure you install version 3.4 for POCL 0.9.
Steps to install LLVM:
https://github.com/pacs-course/pacs/wiki/Instructions-to-install-clang-3.1-on-ubuntu-12.04.1-and-12.10
POCL 0.9 is working for me. During the installation you will run into many other missing dependencies, such as hwloc, the Mesa libraries, the OpenGL/OpenCL headers and an ICD loader. I hope you can resolve them yourself, as the list is too long to post on Stack Overflow.
Getting your FPGA detected as an OpenCL device is not going to be trivial. You can refer to this question I posted on GitHub:
https://github.com/pocl/pocl/issues/285
and also to a research paper by Hosseinabady, listed on the publications page of the POCL website:
http://pocl.sourceforge.net/publications.html
Hope this helps.

Try the ARM OpenCL SDK. The ZedBoard has an ARM Cortex-A9 CPU, which should have a NEON SIMD vector unit (http://www.arm.com/products/processors/technologies/neon.php) that can run OpenCL. See http://www.arm.com/products/multimedia/mali-technologies/opencl-for-neon.php.
The ZedBoard isn't listed as an OpenCL conformant platform at https://www.khronos.org/conformance/adopters/conformant-products#opencl,
so there is a chance the ARM driver will not work.
Good luck!

If still relevant, try this paper OpenCL on ZYNQ [PDF]
Also note that Zynq-7000 is listed on https://www.khronos.org/conformance/adopters/conformant-products#opencl ( OpenCL_1_0 ), hence the compatibility.

How much memory does linux kernel and base services use?

I'm working on an embedded Linux + Qt project and I was wondering what the base memory consumption of the Linux kernel plus some basic services is. Just enough to run a framebuffer-based application.
I ended up here: http://qt-project.org/doc/qt-4.8/requirements-embedded-linux.html but as far as I can tell, those are just the Qt requirements, without counting the Linux overhead.
Can someone point me to a more detailed resource on the topic?

The numbers in the table you cited look reasonable.
The actual answer is "it depends". Yes, Virginia: it is possible to have a working OS and a Qt-based GUI in under 4MB.
The actual memory usage will vary wildly, depending on:
Which kernel you use
How you configure your kernel build
Which kernel drivers you load at runtime
What you start up during system init
Etc etc
Book recommendation:
Embedded Linux Primer, Christopher Hallinan
ALSO: here's a list of prebuilt-distros with GUIs that all run on Pentium IVs with 512MB RAM:
http://www.osnews.com/story/26087
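Since the answer depends on the factors listed above, the most direct way to get a number is to boot the minimal image on the target and read `/proc/meminfo` before starting the application. The following is a minimal sketch, assuming a Linux target with Python available; the helper names `parse_meminfo` and `used_kb` are made up for this example, and subtracting `Buffers` and `Cached` is just one common way of estimating memory that is really in use.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> integer value."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value"
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])  # most values are in kB
    return fields

def used_kb(fields):
    """Estimate memory in use, excluding reclaimable buffers and page cache."""
    return (fields["MemTotal"] - fields["MemFree"]
            - fields.get("Buffers", 0) - fields.get("Cached", 0))

if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:
            info = parse_meminfo(f.read())
        print("used: %d kB of %d kB" % (used_kb(info), info["MemTotal"]))
    except OSError:
        print("/proc/meminfo is not available on this system")
```

Note that `MemTotal` already excludes memory the kernel reserves for itself at boot, so comparing it with the board's physical RAM size gives a rough idea of that part of the overhead as well.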

I would suggest using Yocto for such builds, but you can also take a look at the upcoming "Boot to Qt" project, which is basically a Qt 5 replacement for Qt 4's Qt for Embedded Linux. I would not suggest relying on the link you pasted in your question.
You should definitely focus on Qt 5, for several reasons. The foremost is probably that you can get hardware acceleration, and Qt 5 received a lot of optimization for embedded use, including decoupling the QtWidgets module, and so forth.
Here you can find the technology preview that the guys in Norway are working on. This is just for future reference:
http://blog.qt.digia.com/blog/2013/05/21/introducing-boot-to-qt-a-technology-preview/
I would start using the Yocto project for now. We have worked on a "meta-qt5" layer which is not perfect, but good enough. Yocto will also take care of the Linux with "minimal images", et cetera.
Not sure if you had seen the classic example a couple of years ago, but there was a "Qt boot" for an embedded Linux board which happened within a second. Here is the link to the reading material. Unfortunately, the original video does not seem to be available anymore.
http://www.embedded-bits.co.uk/2011/1-second-linux-boot-to-qt/

Dynamic parallelism is supported by OpenCL...?

I am trying to use recursion inside an OpenCL kernel. The host program compiles successfully, but at run time the kernel build fails with a compilation error. Since Dynamic Parallelism is now supported by CUDA, I want to know: does OpenCL support Dynamic Parallelism or not?

Recursion is not supported by OpenCL. See point i in section 6.9 of the standard v1.2.
EDIT: The new Dynamic Parallelism capability of CUDA doesn't have anything to do with recursion (recursion has already been supported by CUDA for a while; see this question). This new capability allows threads running on the device to configure and launch new grids, which was previously only done by the host. See this document for an overview.
SECOND EDIT: regarding the answer of @Michael: this is only the spec, so you will have to wait for an implementation to be released. Besides, at that point in the future you will also have to make sure you have the proper hardware (even in CUDA, dynamic parallelism is supported only on devices of compute capability 3.5 and higher). So when you asked your question, and still today: NO OpenCL implementation supports dynamic parallelism.
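Because OpenCL C kernels cannot recurse, a recursive algorithm is usually rewritten as a loop with an explicit stack. The sketch below shows that transformation in Python purely for readability; the flat-array tree layout and the names `tree_sum_iterative`, `left`, `right` and `MAX_STACK` are assumptions made for this example. In an actual kernel the stack would typically be a fixed-size private array.

```python
MAX_STACK = 32  # assumed bound; a kernel needs a stack size fixed at compile time

def tree_sum_iterative(values, left, right, root=0):
    """Sum a binary tree stored in flat arrays, without recursion.

    left[i] and right[i] hold child indices, or -1 for "no child".
    """
    stack = [0] * MAX_STACK  # preallocated, like a private array in a kernel
    top = 0
    stack[top] = root        # push the root
    top += 1
    total = 0
    while top > 0:
        top -= 1             # pop the next node to visit
        node = stack[top]
        total += values[node]
        for child in (left[node], right[node]):
            if child != -1:
                if top >= MAX_STACK:
                    raise OverflowError("explicit stack too small")
                stack[top] = child  # push instead of making a recursive call
                top += 1
    return total

if __name__ == "__main__":
    # Tree: node 0 has children 1 and 2; node 1 has children 3 and 4
    print(tree_sum_iterative([1, 2, 3, 4, 5],
                             [1, 3, -1, -1, -1],
                             [2, 4, -1, -1, -1]))
```

The stack bound has to be chosen for the deepest input you expect, which is the price of giving up the call stack.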

Dynamic Parallelism is now supported in OpenCL 2.
Khronos Group announced it at Siggraph 2013.
You can find the specifications here

Minimal FOSS RTOS with TCP/IP, SSL, USB and basic file-system support for ARM

Here's a candid admission first -- that I know zilch about RTOS or Embedded programming, so folks who know better may help me frame the query more appropriately.
What would be the minimal FOSS RTOS (or any OS, for that matter) with support for TCP/IP, SSL, USB and a basic file system for low-end ARM devices like the Cortex-M3?
We have not ruled out something like ARM9/ARM7TDMI, so an RTOS that has "optional" MMU support may be a major plus. At present we are dealing with a few uncertainties, such as the precise processor, MMU/no-MMU, and running completely headless (no display), but I wanted to start ramping up a little.
Would gladly answer counter questions to clarify the requirement.

I believe that eCos has support for all you need and is scalable.
Alternatively, you could build from a self-selected kit of parts: choose an independent RTOS, file system, USB stack, etc. from different sources, and integrate them yourself.
