MPI under the hood

I need to deliver a presentation on programming in MPI, and I want to add a segment on how MPI works under the hood. For example, what happens when I call MPI_Init?
Do you know of any good sources where I can learn these details?

The MPI spec describes the knobs, sliders, and displays on the outside of the "black box" of each API call.
The interior details of each black box are implementation dependent... and will also depend on the interconnect (e.g. TCP, IBV, DAPL, etc.), the OS (e.g. whether the implementation uses LSB or native libraries), and, to a lesser degree, on many other factors (e.g. message-size thresholds will trigger different code paths, and so on). Using strace and ltrace on the a.out may provide some insight into the actual goings-on inside the black box.
The best recommendation is to pick an open-source implementation and examine the code to determine the internal details.
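As a concrete starting point, here is a minimal MPI program (plain C API) that you can build against whichever implementation you pick and then run under strace or ltrace to see what MPI_Init actually does. The mpicxx/mpirun wrapper names below are the usual MPICH/Open MPI conventions; adjust them to your installation.

```cpp
// hello_mpi.cpp -- minimal program for peeking inside MPI_Init/MPI_Finalize.
// Typical build/run (wrapper names vary by implementation and site):
//   mpicxx hello_mpi.cpp -o hello_mpi
//   strace -ff -o trace mpirun -np 2 ./hello_mpi   (one trace.<pid> file per process)
#include <mpi.h>
#include <cstdio>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);               // runtime setup: launcher handshake,
                                          // connections, shared memory, etc.
    int rank = 0, size = 0;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    std::printf("rank %d of %d\n", rank, size);

    MPI_Finalize();                       // tears the runtime back down
    return 0;
}
```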

MPI is a specification, not a particular implementation. The observable behavior is given in the MPI spec. How it works under the hood depends on the particular implementation. If you'd like to see an example implementation, you might be interested in MPICH2 and browsing its source code.

Complement your study of the source code of an implementation of MPI with consideration of how you would implement MPI_Init on your platform of choice. MPI sits on top of already available O/S functionality. I don't mean to suggest that you can figure out how a particular version of MPI is implemented by this approach, but to suggest that you can learn better what is going on under the hood by tackling the problem from another angle.
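To make that angle concrete, here is a deliberately tiny, hypothetical sketch of the kind of work an init routine could do on a POSIX system. It is not how MPICH or Open MPI actually implement MPI_Init; the MY_RANK and MY_WORLD_SIZE environment variables are made-up names standing in for whatever a real launcher exports.

```cpp
// Hypothetical sketch only -- NOT the real MPI_Init of any implementation.
// Assumes a launcher (like mpirun) has started N copies of this process and
// exported made-up variables MY_RANK and MY_WORLD_SIZE into each environment.
#include <cstdio>
#include <cstdlib>

struct World {
    int rank = 0;
    int size = 1;
    // A real implementation would also hold sockets, shared-memory segments,
    // InfiniBand queue pairs, etc. connecting this process to its peers.
};

World toy_init()
{
    World w;
    // 1. Find out who we are from the environment the launcher set up.
    if (const char *r = std::getenv("MY_RANK"))       w.rank = std::atoi(r);
    if (const char *s = std::getenv("MY_WORLD_SIZE")) w.size = std::atoi(s);

    // 2. A real MPI_Init would now wire up communication channels to the
    //    other ranks and exchange addresses/protocol parameters.

    // 3. The result becomes MPI_COMM_WORLD as the application sees it.
    return w;
}

int main()
{
    World w = toy_init();
    std::printf("I think I am rank %d of %d\n", w.rank, w.size);
    return 0;
}
```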

MPI is only a spec. The MPI spec is implemented by various groups and organizations. You will want to pick one implementation, say MPICH, and find its design documentation. That will tell you how the MPI spec is implemented by that group.
If you just want to describe what happens when an application written with MPI is started, you can read about MPI and MPI programming. I highly recommend http://www.citutor.org

Related

What is the main idea of OpenFOAM?

I just want to get the main idea/principle of OpenFOAM and how you create a simulation; please let me know where I go wrong.
So basically you have an object that interacts with a gas or liquid and you want to simulate this, so you create a model of the object, mesh it, specify where the gas will flow in and out and which surfaces are walls, set the other relevant parameters, and then run the program (with an appropriate time step, etc.)?
OpenFOAM is an open-source C++ library that implements the finite volume method (FVM), which is widely used in CFD.
What you have described is a vague understanding of some of the applications of CFD. The things you specified might not always be the case (e.g. the fluid might not necessarily be a gas, and so on).
The main stages of a CFD problem are: geometry creation, mesh generation, preprocessing, solving, and postprocessing.
More stages might be added depending on the resolution and other specifics of the case.
Now OpenFOAM is an open-source (free for all) tool, written in C++, that helps solve CFD problems. If the problem is simple and routine, and you have access to a commercial solver such as ANSYS Fluent, you can use that, since it is easier and much less work when the problem is not specialized. However, if the problem is specialized and there are customized criteria, OpenFOAM is a nice tool.
Because it is written in C++ it is object oriented, and there are many different solvers already written and available to use, so you will not have to write all the schemes and everything on your own from scratch.
However, my main advice to you is to read more about CFD to get a clear understanding; there are dozens of good books available.

Qt Application Perimeter

I wonder what the bounds of Qt's perimeter are. I know, for example, that it defines its own types (such as qint32 or QString), and I know it cannot get system information such as CPU usage or memory usage.
My question is about the limits of Qt.
Is it correct that Qt can only interact with what is inside the project, but not with what is outside (I mean system-related things)?
You can get information about the operating system with the QSysInfo class, if that is what you are looking for. This is one example; I am sure there are other helper classes. I think you should use other libraries for information like CPU usage, etc.; see here and also this question.
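For example, a minimal sketch, assuming Qt 5.4 or later (where these static QSysInfo functions exist):

```cpp
#include <QSysInfo>
#include <QDebug>

int main()
{
    // Purely static queries: no QApplication or event loop is required.
    qDebug() << "OS:"       << QSysInfo::prettyProductName();
    qDebug() << "Kernel:"   << QSysInfo::kernelType() << QSysInfo::kernelVersion();
    qDebug() << "CPU arch:" << QSysInfo::currentCpuArchitecture();
    // CPU or memory *usage* is not covered by QSysInfo; for that you need
    // platform APIs (e.g. /proc on Linux) or a third-party library.
    return 0;
}
```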
Qt is nothing more and nothing less than a cross-platform C++ GUI framework. It doesn't really have a perimeter; it has certain cross-platform functionality implemented (widgets/frames/controls and a lot of other things). Within its own functionality it provides (as mentioned above) the QSysInfo class, but you are free to add any OS-dependent (if you target your application at a particular platform) or cross-platform solutions for whatever tasks you need - hardware info, OS monitoring, etc.

What exactly are MPI, MPICH, and Open MPI? What does "implementation" mean in this context?

My question might seem silly to those who have been in the field for a long time, but I appreciate your patience in elaborating it for me.
When they say MPICH is an "implementation" of MPI, what does that mean?
Is the following analogy true:
if we think of MPI as being like the standard for a language such as Fortran, then MPICH and Open MPI are like different Fortran compilers - Intel Fortran, Compaq Fortran, GNU Fortran, and so on?
MPI is a standard: it outlines a particular model for message passing in a distributed system. However, it only gives a series of requirements: it does not actually include any code, nor does it specify how exactly these requirements need to be fulfilled. For example, take a look at this excerpt from the official MPI 2.2 spec (as of today):
A valid MPI implementation guarantees certain general properties of point-to-point communication, which are described in this section.
Order: Messages are non-overtaking: If a sender sends two messages in succession to the same destination, and both match the same receive, then this operation cannot receive the second message if the first one is still pending.
It then goes on to explain the rationale behind this requirement and provide an example, but says nothing more about the requirement itself.
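To make the excerpt concrete, here is a small sketch of the non-overtaking guarantee using the MPI C API (assuming the program is launched with at least two processes):

```cpp
// Sketch of the "non-overtaking" ordering guarantee between two processes.
#include <mpi.h>
#include <cstdio>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        int first = 1, second = 2;
        // Two sends to the same destination with the same tag...
        MPI_Send(&first,  1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        MPI_Send(&second, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        int msg;
        // ...must be matched in order: this receive is guaranteed to get
        // `first`, never `second`, however the library moves the bytes.
        MPI_Recv(&msg, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        std::printf("first receive got %d\n", msg);   // always 1
        MPI_Recv(&msg, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        std::printf("second receive got %d\n", msg);  // always 2
    }

    MPI_Finalize();
    return 0;
}
```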
An MPI implementation is a library that fulfills every requirement - like the one above - in the MPI specification. However, the standard contains absolutely no requirements as to what language constructs, OS calls, 3rd party libraries, etc can/can't/should be used. Occasionally, it will give advice to implementors, like this:
Advice to implementors. The implementation may keep a reference count of active communications that use the datatype, in order to decide when to free it. Also, one may implement constructors of derived datatypes so that they keep pointers to their datatype arguments, rather than copying them. In this case, one needs to keep track of active datatype definition references in order to know when a datatype object can be freed. (End of advice to implementors.)
However, these are still vague, very language-agnostic, and only recommendations: an implementation can ignore every single one of them and still conform to the standard.
So yes, in essence it's similar to various implementations of a compiler. If a program takes valid source code for a language, and produces binary code that does everything that the language specification says it should do given the original source code, it's a conforming compiler for that language. Similarly, if you can use a library to pass messages in a way that doesn't break any rules of the MPI spec, then that's a valid MPI implementation.

Use cases for self-modifying code?

On a Von Neumann architecture, program and data are both stored in memory, so a program can modify itself. Is this useful for a programmer? Could you give some examples?
Metamorphism
One (questionable) use case that comes to my mind is metamorphic computer viruses. These are malicious pieces of software that conceal themselves from signature-based detection by rewriting their own machine code into a semantically equivalent representation that looks different.
Trampolining
Another (more complex, but also more common) use case is trampolining, a technique based on dynamic code generation to solve certain problems with nested function calls.
JIT compilation
The most common usage of dynamic code generation that I can think of is JIT (just-in-time) compilation. Modern languages like .NET or Java are not compiled into native machine code, but into some kind of intermediate language (called bytecode). This bytecode is then interpreted when the program is executed (by a virtual machine written for the target architecture). At the same time, a background process checks which parts of the code are executed very often. These parts then have a good chance of being dynamically compiled into native machine language for maximum performance. All this happens during the run time of the program!
Security implications
One thing to keep in mind is that the possibility to interpret data as code is useful for exploiting security holes in computer software, which is why the trend in modern hardware and operating systems is to enable and, if possible, even enforce the separation of code and data (also see NX bit and DEP).
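As a toy illustration of the dynamic code generation idea behind JIT compilation (and of why the code/data separation above matters), here is a minimal sketch that emits a few bytes of machine code at run time and calls them. It assumes x86-64, the System V calling convention, and POSIX mmap/mprotect (Linux-style); a real JIT is of course vastly more elaborate.

```cpp
// Minimal runtime code generation sketch -- assumes x86-64, System V ABI, POSIX.
#include <sys/mman.h>
#include <cstring>
#include <cstdio>

int main()
{
    // Machine code for: mov eax, edi ; add eax, 1 ; ret
    // i.e. a function equivalent to:  int f(int x) { return x + 1; }
    const unsigned char code[] = { 0x89, 0xF8,         // mov eax, edi
                                   0x83, 0xC0, 0x01,   // add eax, 1
                                   0xC3 };             // ret

    // Allocate a writable page, copy the generated code in, then flip the
    // page to read+execute -- respecting the W^X / DEP policy mentioned above.
    void *mem = mmap(nullptr, 4096, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return 1;
    std::memcpy(mem, code, sizeof(code));
    mprotect(mem, 4096, PROT_READ | PROT_EXEC);

    // Treat the freshly written bytes as a callable function.
    auto f = reinterpret_cast<int (*)(int)>(mem);
    std::printf("%d\n", f(41));   // prints 42

    munmap(mem, 4096);
    return 0;
}
```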
I can best answer this by referring you to an answer to a similar (exceptionally well written and answered) question, also on StackOverflow - Homoiconic and "unrestricted" self modifying code + Is lisp really self modifying?. The answer focuses on Lisp, a family of languages known for taking "code is data" to the next level, and explores the uses of that in AI.

Lowest-level language up to ASP.NET?

It's assembler, right? Can someone please point out the progression we've had in programming languages from assembler to the days of ASP.NET, namely the chronological order of languages?
Here's a wiki timeline of all programming languages.
I would include a table from the article, but the list is very robust and extensive.
And also, the lowest language you ever get to is assembly (aside from straight-up issuing machine instructions), regardless of what other language is built on top (including ASP.NET). Other languages are really just abstractions on top of assembly. In fact, ASP.NET gets compiled into IL (Intermediate Language) code, which then gets JITed into native machine code. Assembly is as close to the metal as you're going to get.
To be pedantic, "assembler" is not actually a language (any more than "compiler" is;-) -- rather, it's a program that takes a source file in "assembly language" and emits binary machine code. The binary machine code can be said to be lower-level than the assembly language, since the latter allows use of some symbols and often includes a macro processing ability as well.
"Below" binary machine code, there may be other levels, known as "microcode" (but there might not be -- the CPU might be implemented entirely in real hardware, without any microprogramming aspect). That might be relevant only if the system's architecture allowed programmers to alter the microcode, especially by adding to it, etc -- there have been machines that did that, but I don't believe any currently commercialized CPU does. So you probably don't have to care about that (and the by-now-esoteric distinctions between vertical and horizontal microcode, etc, etc;-).
Programming languages are just ways to assemble solutions to computing problems.
The argument is "assembled out of what?"
From that point of view, I'd suggest the following evolutionary curve:
Napier's Bones
Babbage's difference engine
Jacquard (card) looms
(Conceptual) Abstract Turing machines/Post Systems/Church's calculus
Relay Computers (Aiken?)
Vacuum tubes as switching elements (Eniac)
Transistor-based computers
Microprogrammed machines
Integrated Circuits
Large Scale Circuits
with "assembler" being the programming language used to
put together solutions consisting of instructions for
real machines starting with the vacuum tube systems.
(I'm not sure the relay machines actually had assemblers).
Programming langauges are just ways to put together high
level commands that reduce in effect to assembler instructions.
There are two different dimensions to consider here, what I'd call vertical growth (languages build up over time from one generation to the next) and horizontal growth (syntactic improvements and reduction in complexity).
A good explanation of vertical change is given here: http://web.sxu.edu/rogers/sys/generations.html
And a nice, yet incomplete, illustration of horizontal change is here: http://oreilly.com/news/graphics/prog_lang_poster.pdf

Resources