Should -fsanitize=address go into CFLAGS or LDFLAGS? - address-sanitizer

I'm trying to use AddressSanitizer (-fsanitize=address) and I'm not sure whether the flag belongs in CFLAGS or LDFLAGS. It actually seems to work fine when added just to LDFLAGS, but I do not know if that is a coincidence or if it is supposed to be like that.
Is -fsanitize=address needed for the compilation itself, or does it suffice to provide the flag for the linking step?

AddressSanitizer instruments the code at compile time to insert additional checks, so the flag must be present at compilation time.
Providing the flag only on the link line results in the ASan runtime being linked into the process, but almost no checks actually being done. The exception is a small subset: the checks achievable by interposing operator new, operator delete, malloc, free, and other standard functions.
Example:
#include <stdlib.h>
#include <stdio.h>

void fn(int *ip)
{
    ip[0] = 1; // BUG: heap buffer overflow
}

int main()
{
    int *ip = malloc(1); // Allocation too small.
    printf("%d\n", ip[0]); // BUG: heap buffer overflow
    free(ip);
    free(ip); // BUG: double free
}
With no instrumentation, only the double-free is detected:
gcc -g -c t.c && gcc -fsanitize=address t.o && ./a.out
190
=================================================================
==55787==ERROR: AddressSanitizer: attempting double-free on 0x602000000010 in thread T0:
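With instrumentation, the out-of-bounds read in printf is detected as well; ASan stops at this first error by default. The write in fn is instrumented in the same way, although this program never calls fn.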
gcc -g -c -fsanitize=address t.c && gcc -fsanitize=address t.o && ./a.out
=================================================================
==58202==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x602000000010 at pc 0x564565639252 bp 0x7ffe36b0a560 sp 0x7ffe36b0a558
READ of size 4 at 0x602000000010 thread T0
#0 0x564565639251 in main /tmp/t.c:12

Related

Pass an R package on CRAN with issues on macOS due to OpenMP

I have an R package with Fortran and OpenMP that can't pass CRAN. I receive the following message:
Your package no longer installs on macOS with OpenMP issues.
My Makevars file is:
USE_FC_TO_LINK =
PKG_FFLAGS = $(SHLIB_OPENMP_FFLAGS)
PKG_LIBS = $(SHLIB_OPENMP_FFLAGS)
C_OBJS = init.o
FT_OBJS = e_bottomup.o e_topdown.o check_nt.o
all:
#$(MAKE) $(SHLIB)
#rm -f *.o
$(SHLIB): $(FT_OBJS) $(C_OBJS)
init.o: e_bottomup.o e_topdown.o check_nt.o
How to solve this issue? Thanks.
Edit 1:
I tried adding the -cpp flag:
USE_FC_TO_LINK =
PKG_FFLAGS = $(SHLIB_OPENMP_FFLAGS) -cpp
PKG_LIBS = $(SHLIB_OPENMP_FFLAGS)
so that I could add the condition #ifdef _OPENMP to the Fortran code before the !omp... directives.
But with R CMD check I got the message:
Non-portable flags in variable 'PKG_FFLAGS': -cpp
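The Makevars file is fine. The OpenMP-specific statements must be written as conditional-compilation lines starting with !$, including the USE OMP_LIB statement, so that a compiler without OpenMP support treats them as ordinary comments.
For instance, I created an R package with Fortran and OpenMP to test this (and play with it).
I included an R function that returns the maximum number of threads on each machine:
get_threads
The Fortran code is: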
SUBROUTINE checkntf (nt)
!$ USE OMP_LIB
IMPLICIT NONE
INTEGER nt
!$ nt = OMP_GET_MAX_THREADS()
RETURN
END
The package already installs on Windows, Ubuntu and macOS.
You can look at how the data.table package deals with this using #ifdef _OPENMP: https://github.com/Rdatatable/data.table/blob/master/src/myomp.h. It should be pretty similar in Fortran, I guess.
#ifdef _OPENMP
#include <omp.h>
#else
// for machines with compilers void of openmp support
#define omp_get_num_threads() 1
#define omp_get_thread_num() 0
#define omp_get_max_threads() 1
#define omp_get_thread_limit() 1
#define omp_get_num_procs() 1
#define omp_set_nested(a) // empty statement to remove the call
#define omp_get_wtime() 0
#endif

False negative with address sanitizer?

Consider the code below. No error is shown when I compile and run it with AddressSanitizer. But there should be an error, right, since it assigns to and reads from an out-of-bounds memory location? Why doesn't AddressSanitizer detect it?
#include <stdio.h>
int arr[30];
int main(){
    arr[40] = 34;
    printf("%d", arr[40]);
}
Thanks!
clang -fsanitize=address -fno-omit-frame-pointer test.c
./a.out
This is described by the following entry in the AddressSanitizer FAQ:
Q: Why didn't ASan report an obviously invalid memory access in my code?
A1: If your errors is too obvious, compiler might have already optimized it
out by the time Asan runs.
A2: Another, C-only option is accesses to global common symbols which are
not protected by Asan (you can use -fno-common to disable generation of
common symbols and hopefully detect more bugs).

Is it possible to uniquely identify dynamically imported functions by their name?

I used
readelf --dyn-sym my_elf_binary | grep FUNC | grep UND
to display the dynamically imported functions of my_elf_binary, from the dynamic symbol table in the .dynsym section to be precise. Example output would be:
[...]
3: 00000000 0 FUNC GLOBAL DEFAULT UND tcsetattr#GLIBC_2.0 (3)
4: 00000000 0 FUNC GLOBAL DEFAULT UND fileno#GLIBC_2.0 (3)
5: 00000000 0 FUNC GLOBAL DEFAULT UND isatty#GLIBC_2.0 (3)
6: 00000000 0 FUNC GLOBAL DEFAULT UND access#GLIBC_2.0 (3)
7: 00000000 0 FUNC GLOBAL DEFAULT UND open64#GLIBC_2.2 (4)
[...]
Is it safe to assume that the names associated with these symbols, e.g. tcsetattr or access, are always unique? Or is it possible, or reasonable*), to have a dynamic symbol table (filtered for FUNC and UND) which contains two entries with the same associated string?
The reason I am asking is that I am looking for a unique identifier for dynamically imported functions ...
*) Wouldn't the dynamic linker resolve all "UND FUNC symbols" with the same name to the same function anyway?
Yes, given a symbol name and the set of libraries an executable is linked against, you can uniquely identify the function. This behavior is required for linking and dynamic linking to work.
An illustrative example
Consider the following two files:
librarytest1.c:
#include <stdio.h>
int testfunction(void)
{
    printf("version 1");
    return 0;
}
and librarytest2.c:
#include <stdio.h>
int testfunction(void)
{
    printf("version 2");
    return 0;
}
Both compiled into shared libraries:
% gcc -fPIC -shared -Wl,-soname,liblibrarytest.so.1 -o liblibrarytest.so.1.0.0 librarytest1.c -lc
% gcc -fPIC -shared -Wl,-soname,liblibrarytest.so.2 -o liblibrarytest.so.2.0.0 librarytest2.c -lc
Note that we cannot put both functions by the same name into a single shared library:
% gcc -fPIC -shared -Wl,-soname,liblibrarytest.so.0 -o liblibrarytest.so.0.0.0 librarytest1.c librarytest2.c -lc
/tmp/cctbsBxm.o: In function `testfunction':
librarytest2.c:(.text+0x0): multiple definition of `testfunction'
/tmp/ccQoaDxD.o:librarytest1.c:(.text+0x0): first defined here
collect2: error: ld returned 1 exit status
This shows that symbol names are unique within a shared library, but do not have to be among a set of shared libraries.
% readelf --dyn-syms liblibrarytest.so.1.0.0 | grep testfunction
12: 00000000000006d0 28 FUNC GLOBAL DEFAULT 10 testfunction
% readelf --dyn-syms liblibrarytest.so.2.0.0 | grep testfunction
12: 00000000000006d0 28 FUNC GLOBAL DEFAULT 10 testfunction
Now let's link our shared libraries with an executable. Consider linktest.c:
int testfunction(void);
int main(void)
{
    testfunction();
    return 0;
}
We can compile and link this against either shared library:
% gcc -o linktest1 liblibrarytest.so.1.0.0 linktest.c
% gcc -o linktest2 liblibrarytest.so.2.0.0 linktest.c
And run each of them (note I'm setting the dynamic library path so the dynamic linker can find the libraries, which are not in a standard library path):
% LD_LIBRARY_PATH=. ./linktest1
version 1%
% LD_LIBRARY_PATH=. ./linktest2
version 2%
Now let's link our executable to both libraries. Each exports the same symbol testfunction, and each library has a different implementation of that function.
% gcc -o linktest0-1 liblibrarytest.so.1.0.0 liblibrarytest.so.2.0.0 linktest.c
% gcc -o linktest0-2 liblibrarytest.so.2.0.0 liblibrarytest.so.1.0.0 linktest.c
The only difference is the order in which the libraries are passed to the compiler.
% LD_LIBRARY_PATH=. ./linktest0-1
version 1%
% LD_LIBRARY_PATH=. ./linktest0-2
version 2%
Here is the corresponding ldd output:
% LD_LIBRARY_PATH=. ldd ./linktest0-1
linux-vdso.so.1 (0x00007ffe193de000)
liblibrarytest.so.1 => ./liblibrarytest.so.1 (0x00002b8bc4b0c000)
liblibrarytest.so.2 => ./liblibrarytest.so.2 (0x00002b8bc4d0e000)
libc.so.6 => /lib64/libc.so.6 (0x00002b8bc4f10000)
/lib64/ld-linux-x86-64.so.2 (0x00002b8bc48e8000)
% LD_LIBRARY_PATH=. ldd ./linktest0-2
linux-vdso.so.1 (0x00007ffc65df0000)
liblibrarytest.so.2 => ./liblibrarytest.so.2 (0x00002b46055c8000)
liblibrarytest.so.1 => ./liblibrarytest.so.1 (0x00002b46057ca000)
libc.so.6 => /lib64/libc.so.6 (0x00002b46059cc000)
/lib64/ld-linux-x86-64.so.2 (0x00002b46053a4000)
Here we can see that while symbols are not unique, the way the linker resolves them is defined (it appears that it always resolves to the first symbol it encounters). Note that this is a bit of a pathological case, as you normally wouldn't do this. In the cases where you would go in this direction, there are better ways of handling symbol naming so that symbols are unique when exported (symbol versioning, etc.).
In summary, yes, you can uniquely identify the function given its name. If there happen to be multiple symbols with that name, you identify the proper one using the order in which the libraries are resolved (from ldd or objdump, etc.). Yes, in this case you need a bit more information than just its name, but it is possible if you have the executable to inspect.
Note that in your case, the name of the first function import is not just tcsetattr but tcsetattr#GLIBC_2.0. The # is how the readelf program displays a versioned symbol import.
GLIBC_2.0 is a version tag that glibc uses to stay binary compatible with old binaries in the (unusual but possible) case that the binary interface to one of its functions needs to change. The original .o file produced by the compiler will just import tcsetattr with no version information, but during static linking the linker notices that the actual symbol exported by libc.so carries a GLIBC_2.0 tag, and so it creates a binary that insists on importing the particular tcsetattr symbol that has version GLIBC_2.0.
In the future there might be a libc.so that exports one tcsetattr#GLIBC_2.0 and a different tcsetattr#GLIBC_2.42, and the version tag will then be used to find which one a particular ELF object refers to.
It is possible that the same process may also use tcsetattr#GLIBC_2.42 at the same time, such as if it uses another dynamic library which was linked against a libc.so new enough to provide it. The version tags ensure that both the old binary and the new library get the function they expect from the C library.
Most libraries don't use this mechanism and instead just rename the entire library if they need to make breaking changes to their binary interfaces. For example, if you dump /usr/bin/pngtopnm you'll find that the symbols it imports from libnetpbm and libpng are not versioned. (Or at least that's what I see on my machine).
The cost of this is that you can't have a binary that links against one version of libpng and also links against another library that itself links against a different libpng version; the exported names from the two libpng's would clash.
In most cases this is manageable enough through careful packaging practice that maintaining the library source to produce useful version tags and stay backwards compatible is not worth the trouble.
But in the particular case of the C library and a few other vital system libraries, changing the name of the library would be so extremely painful that it makes sense for the maintainers to jump through some hoops in order to ensure it will never need to happen again.
Although in most cases every symbol is unique, there are a handful of exceptions. My favorite is the multiple identical symbol imports used by PAM (Pluggable Authentication Modules) and NSS (Name Service Switch). In both cases, all modules written for either interface use a standard interface with standard names. A common and frequently used example is what happens when you call gethostbyname: the NSS library will call the same function in multiple libraries to get an answer. A common configuration calls the same function in three libraries! I have seen the same function called in five different libraries from one function call, and that was not the limit, just what was useful. Special calls to the dynamic linker are needed to do this, and I have not familiarised myself with the mechanics, but there is nothing special about the linking of a library module that is loaded this way.

RcppZiggurat unable to compile example code

I am trying to employ the Ziggurat sampler in R, but what I actually want is to use it directly in my C++ code. I installed the GSL library, RcppGSL and RcppZiggurat, and using zrnorm() in R works just fine. So I thought I would try to compile the code sample provided in RcppZiggurat.pdf and go from there to implement the Ziggurat sampler directly in my C++ code. The following happens, though.
From the pdf file I thought I can simply utilize:
#include <Rcpp.h>
#include <Ziggurat.h>
static Ziggurat::Ziggurat::Ziggurat zigg;
// [[Rcpp::export]]
Rcpp::NumericVector zrnorm(int n) {
    Rcpp::NumericVector x(n);
    for (int i=0; i<n; i++) {
        x[i] = zigg.norm();
    }
    return x;
}
// [[Rcpp::export]]
void zsetseed(unsigned long int s) {
    zigg.setSeed(s);
    return;
}
Error:
official_zigg_code.cpp:2:10: fatal error: 'Ziggurat.h' file not found
#include <Ziggurat.h>
^
1 error generated.
make: *** [official_zigg_code.o] Error 1
clang++ -I/Library/Frameworks/R.framework/Resources/include -DNDEBUG -I/usr/local/include -I/usr/local/include/freetype2 -I/opt/X11/include -I"/Library/Frameworks/R.framework/Versions/3.1/Resources/library/Rcpp/include" -fPIC -Wall -mtune=core2 -g -O2 -c official_zigg_code.cpp -o official_zigg_code.o
Error in Rcpp::sourceCpp("official_zigg_code.cpp") :
Error 1 occurred building shared library.
I have absolutely no clue how to proceed from here. I desperately tried to find answers on stack exchange but nothing could help me to solve this. From what I understand the RcppZiggurat package actually uses the above function so how can I fail to compile it, when I am able to use zrnorm() directly?
The error is fairly obvious:
fatal error: 'Ziggurat.h' file not found
This means that you did not tell R / the compiler about RcppZiggurat.
The fix is easy. In the case of an Rcpp-driven compilation via sourceCpp(), add this one line
// [[Rcpp::depends(RcppZiggurat)]]
which does just that. All this is documented with Rcpp, and you are more or less expected to read at least some of its documentation.
If you want to build outside of Rcpp, you need to make sure the compiler finds the header file(s). One commonly uses the -I flag for that; this is typically discussed wherever compilers are introduced.

How to build the InterBase plugin for Qt with MinGW

I want to build the InterBase plugin for Qt using the MinGW toolchain.
According to the Qt documentation, it can be built only with MSVC, but I need MinGW. So I wrote this .cmd file:
set QTDIR=C:\Qt\4.8.0-minGW
set PATH=C:\Qt\4.8.0-minGW\bin
set PATH=%PATH%;C:\MinGW\bin
set QMAKESPEC=win32-g++
set INCLUDE=%INCLUDE%;c:\Program Files\Borland\InterBase\SDK\include
set LIB=%LIB%;c:\Program Files\Borland\InterBase\SDK\lib_ms
qmake -o Makefile ibase.pro
mingw32-make.exe
pause
and ran it from c:\Qt\4.8.0-minGW\src\plugins\sqldrivers\ibase\. The whole output is very long, but it contains many similar lines, so I'll show only one of them and the final lines:
tmp/obj/debug_shared/qsql_ibase.o: In function `ZN12QIBaseDriver24qHandleEventNotificationEPv':
C:\Qt\4.8.0-minGW\src\plugins\sqldrivers\ibase/../../../sql/drivers/ibase/qsql_ibase.cpp:1845: undefined reference to `isc_event_counts'
C:\Qt\4.8.0-minGW\src\plugins\sqldrivers\ibase/../../../sql/drivers/ibase/qsql_ibase.cpp:1864: undefined reference to `isc_que_events'
collect2: ld returned 1 exit status
mingw32-make.exe: *** [debug-all] Error 2
Could you tell me how I can achieve this? Thank you.
P.S. I googled a lot and saw this question - Compiling InterBase support in Qt - but it had no exact answer on what to do...
I've done it!
The problem was in the header file ibase.h from the InterBase SDK. It contained the following lines:
#if (defined(_MSC_VER) && defined(_WIN32)) || \
(defined(__BORLANDC__) && (defined(__WIN32__) || defined(__OS2__)))
...
#define ISC_EXPORT __stdcall
...
With MinGW this condition is false, so the macro ISC_EXPORT was not defined as __stdcall and all the function declarations were wrong. When I changed these lines in the following way:
#if defined(WIN32) || defined(_WIN32) || defined(__WIN32__)
...
#define ISC_EXPORT __stdcall
...
the plugin was built successfully.
