How to intercept __fork() in the GNU C standard library?

I looked at the source of the GNU C standard library and saw that the implementation of the system() function calls __fork(). I need to intercept that call with my own wrapper for __fork, using the LD_PRELOAD technique.
I think I know how to use LD_PRELOAD because:
If I call __fork() myself in my application, it is intercepted correctly. So that means, one is able to intercept __fork() in principle.
If I change __fork() to fork() in the standard library implementation of system() and recompile standard library and use that, then fork() is intercepted.
However, __fork() in standard library, is not intercepted - my wrapper is not called.

Firstly, __fork is a synonym for __libc_fork which does a fork system call directly. fork is a weak symbol that refers to the same thing. Overriding any of these functions will work if some other shared library is calling that particular function.
$ readelf -Wa /lib/x86_64-linux-gnu/ | grep b84c0
42: 00000000000b84c0 784 FUNC GLOBAL DEFAULT 13 __libc_fork@@GLIBC_PRIVATE
80: 00000000000b84c0 784 FUNC GLOBAL DEFAULT 13 __fork@@GLIBC_2.2.5
408: 00000000000b84c0 784 FUNC WEAK DEFAULT 13 fork@@GLIBC_2.2.5
However, within libc itself, the linker knows that __fork is located inside the same library and it opts not to go through the PLT to reach that function. It just emits a direct call instruction. This is a common optimization for GCC to do when a module calls a static function, or when a library calls one of its own functions. See below (the call would be to __fork@plt if it went through the PLT):
$ objdump -d /lib/x86_64-linux-gnu/ | grep __fork | head -n 4
6a074: e8 47 e4 04 00 callq b84c0 <__fork@@GLIBC_2.2.5>
00000000000b84c0 <__fork@@GLIBC_2.2.5>:
b84da: 74 5c je b8538 <__fork@@GLIBC_2.2.5+0x78>
b84e1: 74 ed je b84d0 <__fork@@GLIBC_2.2.5+0x10>
When you changed libc to use fork() internally, it was calling a weak symbol, and those may be overridden by the user. Thus the linker has no choice but to emit a call that goes through the PLT. This means that you can actually override it with LD_PRELOAD.
In general, it's not going to be easy to hijack fork like this. Functions can always invoke a fork system call directly and there is no way to intercept that. You might be interested in pthread_atfork if your code uses pthreads. This adds functions to the __fork_handlers array inside glibc. Unfortunately that array is marked as protected and the symbol cannot be accessed directly.
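For the case that does work (a caller that reaches fork through the PLT), a preloaded wrapper might look like the minimal sketch below. The counter fork_calls and the log message are hypothetical additions for illustration; dlsym with RTLD_NEXT is the usual way to reach the definition that would otherwise have been used. As explained above, this will not catch the direct __fork call inside libc's own system().

```c
#define _GNU_SOURCE           /* needed for RTLD_NEXT on glibc */
#include <dlfcn.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical counter, only here so the effect is observable. */
int fork_calls = 0;

/* Same name and signature as libc's fork(), so calls that go through
 * the PLT are bound to this definition first. */
pid_t fork(void)
{
    /* Look up the next definition of fork in search order (libc's). */
    pid_t (*real_fork)(void) = (pid_t (*)(void))dlsym(RTLD_NEXT, "fork");
    if (real_fork == NULL)
        return -1;

    fork_calls++;
    fprintf(stderr, "fork() intercepted (call %d)\n", fork_calls);
    return real_fork();
}
```

Assuming the file is saved as forkwrap.c (a made-up name), it could be built with something like gcc -shared -fPIC -o libforkwrap.so forkwrap.c -ldl and activated with LD_PRELOAD=./libforkwrap.so.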


gperftools failing to identify files

Is there a way to avoid Google Performance Tools listing files as "??:?", that is, failing to locate which file contains the function it is reporting on? How can I work out which library contains the function being called?
$ env LD_PRELOAD="/usr/lib/" \ python
$ google-pprof --text --files /usr/bin/python
Using local file /usr/bin/python.
Using local file
Removing _L_unlock_13 from all stack traces.
Total: 433 samples
362 83.6% 83.6% 362 83.6% dtrsm_ ??:?
58 13.4% 97.0% 58 13.4% dgemm_ ??:?
1 0.2% 97.2% 1 0.2% PyDict_GetItem /.../Objects/dictobject.c
1 0.2% 97.5% 1 0.2% PyParser_AddToken /.../Parser/parser.c
I am aiming to be able to profile the C code in a python package that has many compiled C extension modules. In the toy example above, what would I do to track down where "dtrsm_" is defined? If there are multiple loaded libraries that contain functions with that same name, is there any way to tell which version is being called?
C/C++ won't compile if the same pre-processed sourcefile (e.g. with #includes expanded) contains duplicate definitions for the same symbol. (Note that in the case of C++, symbols are mangled, according to compiler-specific schemes, to incorporate the argument signature so as to facilitate overloaded functions, which could not otherwise be differentiated.)
The linker is only concerned with unresolved symbols (so there ought to be nothing preventing multiple libraries concurrently calling their own respective internally-defined functions with coincident names). If a file invokes a declared but undefined function, and multiple available libraries implement that symbol, then the linker is free to choose (say by precedence in a search-path) which version gets substituted in. (Incidentally, this is the same mechanism by which profilers such as gperftools or hpctoolkit are able to inject themselves and alter the normal behaviour of another application.)
Since different libraries are mapped to separate pages of memory, it ought to be possible to identify (from memory addresses) which library contains the executing version of a function. Indeed, the GNU debugger can identify the library that code is contained by, even when it fails to name a function.
$ gdb python
(gdb) run -c "from numpy import *; linalg.inv(random.random((1000,1000)))"
(gdb) backtrace
#0 0x00007ffff5ba9df8 in dtrsm_ () from /usr/lib/
#3 0x00007ffff420df83 in ?? () from /.../numpy/linalg/
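The same address-to-library lookup can be done from inside a process. The sketch below is an illustration, not something gperftools does: the helper name library_defining is hypothetical, and it relies on the glibc calls dlsym and dladdr. It resolves a name in the global lookup scope and reports which loaded object the resulting address belongs to.

```c
#define _GNU_SOURCE           /* needed for dladdr and RTLD_DEFAULT on glibc */
#include <dlfcn.h>
#include <stddef.h>

/* Returns the path of the loaded object that defines `symbol` in the
 * global lookup scope, or NULL if the symbol is not found there. */
const char *library_defining(const char *symbol)
{
    /* Resolve the name the way the dynamic linker would by default. */
    void *addr = dlsym(RTLD_DEFAULT, symbol);
    if (addr == NULL)
        return NULL;

    /* Map the address back to the shared object that contains it. */
    Dl_info info;
    if (dladdr(addr, &info) == 0)
        return NULL;
    return info.dli_fname;
}
```

One caveat: libraries pulled in by Python extension modules may be loaded with local visibility, in which case the global lookup may not see the symbol, and calling dladdr on a sampled address would be the more reliable route.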
Linux (or rather the GNU C library) provides the "backtrace" call (for getting a list of pointers from the call stack), and the "backtrace_symbols" call for automatically converting each of those pointers to a descriptive string such as:
"/lib/x86_64-linux-gnu/ [0x7fc429929ec5]"
Gperftools can (judging from a query on the github mirror) call the generic "backtrace", but instead of "backtrace_symbols" it "forks out to pprof to do the actual symbolizing". This is a fairly-epic perl script, and looks likely where the "??" comes from.
Crucially, google-pprof is trying to report on the source-file (and line-number) which defines the function, not the binary-file containing the machine-code (that is typically quoted in stack traces). It invokes the "nm" utility. On my system it appears (by running "nm -l -D") that libblas, unlike libc and the python binary, has been stripped of such debugging symbols (presumably for optimisation), explaining the result.
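By contrast, the backtrace and backtrace_symbols pair described above reports the binary file for each frame, which is the information needed to tell libraries apart. A minimal sketch of using them directly follows; print_backtrace and MAX_FRAMES are hypothetical names chosen for this example.

```c
#include <execinfo.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_FRAMES 32   /* arbitrary limit chosen for this sketch */

/* Prints the current call stack to stderr; returns the frame count. */
int print_backtrace(void)
{
    void *frames[MAX_FRAMES];
    int count = backtrace(frames, MAX_FRAMES);

    /* One malloc'd array holds all the strings; free only the array. */
    char **symbols = backtrace_symbols(frames, count);
    if (symbols == NULL)
        return count;

    for (int i = 0; i < count; i++)
        fprintf(stderr, "%s\n", symbols[i]);

    free(symbols);
    return count;
}
```

Each printed line has the same shape as the descriptive string quoted earlier: the path of the containing binary followed by the return address.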
To answer the original question: the call-stack samples should definitively and explicitly specify which version is being called. These can probably be dumped using an option which was added in google-pprof several months ago, or (for time-intensive functions) can be roughly ascertained by manual resampling using gdb. (It's even conceivable that google-pprof can be adjusted to explicitly identify the binaries' paths in its output summaries.) Alternatively one can run "nm" (and grep) on the candidate binaries/libraries (of which a short-list can be obtained by running "strings" on the profiler's raw output, among other methods). If the source is accessible (to grep) or the libraries are popular (on the web) then of course (and per Mike Dunlavey) it may be easiest to just query for the function name. In theory the "??:?" may be addressed by carefully recompiling the offending objects.
Just Google the offending function names. The ones you show above are BLAS routines, which LAPACK is built on. dtrsm solves a triangular matrix equation. dgemm multiplies matrices.
What you need to know is 1) why they are being called, and 2) how big the matrices are.
To find out why they are being called, what I do is just examine individual stack samples.
The reason matrix size matters is that if the matrices are small, these routines can actually spend a relatively large fraction of their time just classifying their inputs, such as by calling a function LSAME.

Ada object declarations "Unsigned Not Declared in System"

In some code that I inherited, I get the compile error "Unsigned" not declared in "System".
I'm trying to compile this using GNAT, but ultimately the code must compile with the original tools, which I don't have ready access to. So I'd like to understand how to resolve this from within the development environment (including the project file), and not modify the existing code.
I checked the file, and Unsigned is not defined there. Am I referring to the wrong libraries? How would I resolve this with the self imposed constraint mentioned above (to compile in the original environment)?
unsigned is the name of a predefined type in C. If what you need is an Ada type that matches the C type, what you need is Interfaces.C.unsigned. An older Ada implementation (before Interfaces.C was introduced by the 1995 standard) might have defined System.Unsigned for this purpose.
It would help to know what Ada implementation the code was originally written for.
You should examine the code to see whether it uses that type to interface to C code. If not (i.e., if it's just being used as a general unsigned integer type), you might instead consider defining your own modular type.
If I understand correctly, you need the code to compile both in the original environment and with GNAT. That might be difficult. One approach would be to define a new package with two different versions, one for the original environment and one for GNAT (or, ideally, for any modern Ada implementation). For example:
-- version for original environment
with System;
package Foo is
   subtype Unsigned is System.Unsigned;
end Foo;
-- version for GNAT
with Interfaces.C;
package Foo is
   subtype Unsigned is Interfaces.C.Unsigned;
end Foo;
Picking a better name than Foo is left as an exercise, as is determining automatically which version to use.
You could rebuild the GNAT runtime system (RTS) with a slightly modified
There’s a Makefile.adalib in the system RTS (well, there is in GNAT GPL 2014) which lets you do this. It’s at the last directory indicated in the “Object Search Path” section of the output of gnatls -v.
The RTS source is similarly indicated in the “Source Search Path” section.
Create a directory say unsigned with subdirectories adainclude, adalib.
Copy the RTS source into unsigned/adainclude, and edit to include
type Unsigned is mod 2 ** 32;
(I’m guessing a bit, but this is probably what you want!)
Then, in unsigned/adalib,
make -f Makefile.adalib ADA_INCLUDE_PATH=../adainclude ROOT=/opt/gnat-gpl-2014
(ROOT is where you have the compiler installed; it will be different on your system, it’s one above the bin directory in which gnatls and friends are installed).
There will be several errors during this, all caused (when I tried it) by units that use System.Unsigned_Types;. Work round this by inserting this immediately after the package body in the .adb:
subtype Unsigned is System.Unsigned_Types.Unsigned;
The files I had to change were
It may be best at this stage to remove all the .ali and .a files from unsigned/adalib and repeat, to get a clean build.
Now, you should be able to use System.Unsigned by
gnatmake --RTS=/location/of/unsigned t.adb
In my case, t.adb contained
with System;
with Ada.Text_IO; use Ada.Text_IO;
procedure T is
begin
   Put_Line ("first: " & System.Unsigned'First'Img);
   Put_Line ("last: " & System.Unsigned'Last'Img);
   Put_Line ("42: " & System.Unsigned'Value ("42")'Img);
   Put_Line ("16#42#:" & System.Unsigned'Value ("16#42#")'Img);
end T;
and the output was
$ ./t
first: 0
last: 4294967295
42: 42
16#42#: 66

How can one really create a process using Unix.create_process in OCaml?

I have tried
let _ = Unix.create_process "ls" [||] Unix.stdin Unix.stdout Unix.stderr
in utop, it will crash the whole thing.
If I write that into a .ml and compile and run, it will crash the terminal and my ubuntu will throw a system error.
But why?
The right way to call it is:
let pid = Unix.create_process "ls" [|"ls"|] Unix.stdin Unix.stdout Unix.stderr
The first element of the array must be the "command" name.
On some systems /bin/ls is a link to some bigger executable that will look at argv.(0) to know how to behave (cf. BusyBox); so you really need to provide that info.
(You see that more often with /usr/bin/vi, which is now a symlink to vim on many systems.)
Unix.create_process actually calls fork and then does an execvpe, which itself calls the execv primitive (in the OCaml C implementation of the Unix module).
That function then calls cstringvect (a helper function in the C side of the module implementation), which translates the arg parameters into an array of C strings, with the last entry set to NULL. However, execve and the like expect by convention (see the execve(2) Linux man page) the first entry of that array to be the name of the program:
argv is an array of argument strings passed to the new program. By
convention, the first of these strings should contain the filename
associated with the file being executed.
That first entry (or rather, the copy it receives) can actually be changed by the program receiving these args, and is displayed by ls, top, etc.
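Seen from the C side, the convention looks like the sketch below. This is not the OCaml runtime's actual code; spawn_and_wait is a hypothetical helper that uses the standard fork, execvp and waitpid calls to show where argv[0] comes from.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs `prog` (searched in PATH) with the given NULL-terminated argv
 * and returns its exit status, or -1 on failure. */
int spawn_and_wait(const char *prog, char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: argv[0] is whatever the caller put first in the array. */
        execvp(prog, argv);
        _exit(127);              /* only reached if exec failed */
    }

    /* Parent: wait for the child and report how it exited. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling it with an array holding "ls" followed by NULL corresponds to the OCaml argument [|"ls"|]. The empty array [||] corresponds to an argv whose only entry is the terminating NULL, so the new program finds no name in argv[0].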

MPI - one function for MPI_Init and MPI_Init_thread

Is it possible to have one function to wrap both MPI_Init and MPI_Init_thread? The purpose of this is to have a cleaner API while maintaining backward compatibility. What happens to a call to MPI_Init_thread when it is not supported by the MPI run time? How do I keep my wrapper function working for MPI implementations when MPI_Init_thread is not supported?
MPI_INIT_THREAD is part of the MPI-2.0 specification, which was released 15 years ago. Virtually all existing MPI implementations are MPI-2 compliant except for some really archaic ones. You might not get the desired level of thread support, but the function should be there and you should still be able to call it instead of MPI_INIT.
Your best and most portable option is to have a configure-like mechanism probe for MPI_Init_thread in the MPI library, e.g. by trying to compile a very simple MPI program and seeing whether it fails with an unresolved symbol reference, or you can directly examine the export table of the MPI library with nm (for archives) or objdump (for shared ELF objects). Once you've determined that the MPI library has MPI_Init_thread, you can have a preprocessor symbol defined, e.g. CONFIG_HAS_INITTHREAD. Then have your wrapper look similar to this one:
int init_mpi(int *pargc, char ***pargv, int desired, int *provided) {
#ifdef CONFIG_HAS_INITTHREAD
    return MPI_Init_thread(pargc, pargv, desired, provided);
#else
    *provided = MPI_THREAD_SINGLE;
    return MPI_Init(pargc, pargv);
#endif
}
Of course, if the MPI library is missing MPI_INIT_THREAD, then MPI_THREAD_SINGLE and the other thread support level constants will also not be defined in mpi.h, so you might need to define them somewhere.

How to add a macro with clBuildProgram for OpenCL

In my kernel I have this defined.
#define ACTIVATION_FUNCTION(X) (1.7159f*tanh(2.0f/3.0f*X))
I would like to define it in the clBuildProgram call, so that I can alter the kernel at runtime. How can I do this?
You can use the -D argument to the OpenCL compiler, by passing it in the options parameter of the clBuildProgram function. Passing -D x=y, is equivalent to adding #define x y at the top of your kernel file. Similarly, passing -D x is equivalent to adding #define x (for any x and y, of course).
In your case, you probably want to pass something like this:
-D ACTIVATION_FUNCTION(X)=(1.7159f*tanh(2.0f/3.0f*X))
Which you can then change as you see fit, directly from your program, at runtime.
Note you could also open the kernel file and write the define directly into it, as an alternative solution, but this is probably the cleanest way. Just be careful with newlines, I'm not sure how well they are handled.
Ref. this documentation page on clBuildProgram, "Preprocessor Options" section.
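Building the option string at runtime could look like the following sketch. The helper make_activation_option is hypothetical, and rejecting whitespace in the expression is a conservative assumption of this example, since the options string holds several options and quoting support may vary between implementations. The clBuildProgram call is shown only in a comment, so the snippet compiles without the OpenCL headers.

```c
#include <stdio.h>
#include <string.h>

/* Writes "-D ACTIVATION_FUNCTION(X)=(<expr>)" into buf.
 * Returns the length written, or -1 if expr contains whitespace
 * or the result does not fit in buf. */
int make_activation_option(char *buf, size_t size, const char *expr)
{
    if (strpbrk(expr, " \t\r\n") != NULL)
        return -1;

    int n = snprintf(buf, size, "-D ACTIVATION_FUNCTION(X)=(%s)", expr);
    if (n < 0 || (size_t)n >= size)
        return -1;
    return n;
}

/* Intended use (requires the OpenCL headers, not compiled here):
 *   char options[256];
 *   if (make_activation_option(options, sizeof options,
 *                              "1.7159f*tanh(2.0f/3.0f*X)") > 0)
 *       err = clBuildProgram(program, 0, NULL, options, NULL, NULL);
 */
```

With this approach the #define would be removed from the kernel source itself, so that the definition supplied at build time is the only one.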
