Is it possible to uniquely identify dynamically imported functions by their name? - dynamic-linking

I used
readelf --dyn-sym my_elf_binary | grep FUNC | grep UND
to display the dynamically imported functions of my_elf_binary, from the dynamic symbol table in the .dynsym section to be precise. Example output would be:
[...]
3: 00000000 0 FUNC GLOBAL DEFAULT UND tcsetattr#GLIBC_2.0 (3)
4: 00000000 0 FUNC GLOBAL DEFAULT UND fileno#GLIBC_2.0 (3)
5: 00000000 0 FUNC GLOBAL DEFAULT UND isatty#GLIBC_2.0 (3)
6: 00000000 0 FUNC GLOBAL DEFAULT UND access#GLIBC_2.0 (3)
7: 00000000 0 FUNC GLOBAL DEFAULT UND open64#GLIBC_2.2 (4)
[...]
Is it safe to assume that the names associated to these symbols, e.g. the tcsetattr or access, are always unique? Or is it possible, or reasonable*), to have a dynamic symbol table (filtered for FUNC and UND) which contains two entries with the same associated string?
The reason I am asking is that I am looking for a unique identifier for dynamically imported functions ...
*) Wouldn't the dynamic linker resolve all "UND FUNC symbols" with the same name to the same function anyway?

Yes, given a symbol name and the set of libraries an executable is linked against, you can uniquely identify the function. This behavior is required for linking and dynamic linking to work.
An illustrative example
Consider the following two files:
librarytest1.c:
#include <stdio.h>
int testfunction(void)
{
printf("version 1");
return 0;
}
and librarytest2.c:
#include <stdio.h>
int testfunction(void)
{
printf("version 2");
return 0;
}
Both compiled into shared libraries:
% gcc -fPIC -shared -Wl,-soname,liblibrarytest.so.1 -o liblibrarytest.so.1.0.0 librarytest1.c -lc
% gcc -fPIC -shared -Wl,-soname,liblibrarytest.so.2 -o liblibrarytest.so.2.0.0 librarytest2.c -lc
Note that we cannot put both functions by the same name into a single shared library:
% gcc -fPIC -shared -Wl,-soname,liblibrarytest.so.0 -o liblibrarytest.so.0.0.0 librarytest1.c librarytest2.c -lc
/tmp/cctbsBxm.o: In function `testfunction':
librarytest2.c:(.text+0x0): multiple definition of `testfunction'
/tmp/ccQoaDxD.o:librarytest1.c:(.text+0x0): first defined here
collect2: error: ld returned 1 exit status
This shows that symbol names are unique within a shared library, but do not have to be among a set of shared libraries.
% readelf --dyn-syms liblibrarytest.so.1.0.0 | grep testfunction
12: 00000000000006d0 28 FUNC GLOBAL DEFAULT 10 testfunction
% readelf --dyn-syms liblibrarytest.so.2.0.0 | grep testfunction
12: 00000000000006d0 28 FUNC GLOBAL DEFAULT 10 testfunction
Now lets link our shared libraries with an executable. Consider linktest.c:
int testfunction(void);
int main()
{
testfunction();
return 0;
}
We can compile and link this against either shared library:
% gcc -o linktest1 liblibrarytest.so.1.0.0 linktest.c
% gcc -o linktest2 liblibrarytest.so.2.0.0 linktest.c
And run each of them (note I'm setting the dynamic library path so the dynamic linker can find the libraries, which are not in a standard library path):
% LD_LIBRARY_PATH=. ./linktest1
version 1%
% LD_LIBRARY_PATH=. ./linktest2
version 2%
Now lets link our executable to both libraries. Each is exporting the same symbol testfunction and each library has a different implementation of that function.
% gcc -o linktest0-1 liblibrarytest.so.1.0.0 liblibrarytest.so.2.0.0 linktest.c
% gcc -o linktest0-2 liblibrarytest.so.2.0.0 liblibrarytest.so.1.0.0 linktest.c
The only difference is the order the libraries are referenced to the compiler.
% LD_LIBRARY_PATH=. ./linktest0-1
version 1%
% LD_LIBRARY_PATH=. ./linktest0-2
version 2%
Here are the corresponding ldd output:
% LD_LIBRARY_PATH=. ldd ./linktest0-1
linux-vdso.so.1 (0x00007ffe193de000)
liblibrarytest.so.1 => ./liblibrarytest.so.1 (0x00002b8bc4b0c000)
liblibrarytest.so.2 => ./liblibrarytest.so.2 (0x00002b8bc4d0e000)
libc.so.6 => /lib64/libc.so.6 (0x00002b8bc4f10000)
/lib64/ld-linux-x86-64.so.2 (0x00002b8bc48e8000)
% LD_LIBRARY_PATH=. ldd ./linktest0-2
linux-vdso.so.1 (0x00007ffc65df0000)
liblibrarytest.so.2 => ./liblibrarytest.so.2 (0x00002b46055c8000)
liblibrarytest.so.1 => ./liblibrarytest.so.1 (0x00002b46057ca000)
libc.so.6 => /lib64/libc.so.6 (0x00002b46059cc000)
/lib64/ld-linux-x86-64.so.2 (0x00002b46053a4000)
Here we can see that while symbols are not unique, the way the linker resolves them is defined (it appears that it always resolves the first symbol it encounters). Note that this is a bit of a pathological case as you normally wouldn't do this. In the cases where you would go this direction there are better ways of handling symbol naming so they would be unique when exported (symbol versioning, etc)
In summary, yes, you can uniquely identify the function given its name. If there happens to be multiple symbols by that name, you identify the proper one using the order the libraries are resolved in (from ldd or objdump, etc). Yes, in this case you need a bit more information that just its name, but it is possible if you have the executable to inspect.

Note that in your case, the name of the first function import is not just tcsetattr but tcsetattr#GLIBC_2.0. The # is how the readelf program displays a versioned symbol import.
GLIBC_2.0 is a version tag that glibc uses to stay binary compatible with old binaries in the (unusual but possible) case that the binary interface to one of its functions needs to change. The original .o file produced by the compiler will just import tcsetattr with no version information but during static linking, the linker has noticed that the actual symbol exported by lic.so carries a GLIBC_2.0 tag, and so it creates a binary that insists on importing the particular tcsetattr symbol that has version GLIBC_2.0.
In the future there might be a libc.so that exports one tcsetattr#GLIBC_2.0 and a different tcsetattr#GLIBC_2.42, and the version tag will then be used to find which one a partcular ELF object refers to.
It is possible that the same process may also use tcsetattr#GLIBC_2.42 at the same time, such as if it uses another dynamic library which was linked against a libc.so new enough to provide it. The version tags ensure that both the old binary and the new library get the function they expect from the C library.
Most libraries don't use this mechanism and instead just rename the entire library if they need to make breaking changes to their binary interfaces. For example, if you dump /usr/bin/pngtopnm you'll find that the symbols it imports from libnetpbm and libpng are not versioned. (Or at least that's what I see on my machine).
The cost of this is that you can't have a binary that links against one version of libpng and also links against another library that itself links against a different libpng version; the exported names from the two libpng's would clash.
In most cases this is manageable enough through careful packaging practice that maintaining the library source to produce useful version tags and stay backwards compatible is not worth the trouble.
But in the particular case of the C library and a few other vital system libraries, changing the name of the library would be so extremely painful that it makes sense for the maintainers to jump through some hoops in order to ensure it will never need to happen again.

Although in most cases every symbol is unique, there are a handful of exceptions. My favorite is multiple identical symbol import used by PAM (pluggable authentication modules) and NSS (Name Service Switch). In both cases all modules written for either interface use a standard interface with standard names. A common and frequently used example is what happens when you call get host by name. The nss library will call the same function in multiple libraries to get an answer. A common configuration calles the same function in three libraries! I have seen the same function called in five different libraries from one function call, and that was not the limit just what was useful. There is special calls to the dynamic linker need to do this and I have not familiarised myself with the mechanics of doing this, but there is nothing special about the linking of the library module that is so loaded.

Related

Frama-C aborted Invalid user input

I am very new to Frama-c and I got an issue when I am trying to open a C source file.
The error shows as
"fatal error: event.h: No such file or directory. Compilation terminated".
[kernel] Parsing FRAMAC_SHARE/libc/__fc_builtin_for_normalization.i (no preprocessing)
[kernel] Parsing WorkSpace/bipbuffer.c (with preprocessing)
[kernel] user error: failed to run: gcc -E -C -I. -dD -D__FRAMAC__ -nostdinc -D__FC_MACHDEP_X86_32 -I/usr/share/frama-c/libc -o '/tmp/bipbuffer.ce6d077.i' '/home/xxx/WorkSpace/bipbuffer.c' you may set the CPP environment variable to select the proper preprocessor command or use the option "-cpp-command".
[kernel] user error: stopping on file "/home/xxx/WorkSpace/bipbuffer.c" that has errors. Add'-kernel-msg-key pp' for preprocessing command.
So bascially I am trying to open a C source file but it returns an error like this. I aslo tried other very simple C files like hello world and other slicing functions, it works well.
I thought it was because I didn't have the dependencies of 'event.h' but it still return these errors after I installed the libevent dependencies. I am not sure if I need to manually set some path of the dependencies for frama-c
Here is part of the C file (Source link: https://memcached.org/) that I would like to open:
#include "stdio.h"
#include <stdlib.h>
/* for memcpy */
#include <string.h>
#include "bipbuffer.h"
static size_t bipbuf_sizeof(const unsigned int size)
{
return sizeof(bipbuf_t) + size;
}
int bipbuf_unused(const bipbuf_t* me)
{
if (1 == me->b_inuse)
/* distance between region B and region A */
return me->a_start - me->b_end;
else
return me->size - me->a_end;
}
......
Thanks,
Compilers and other tools working with C source code need to know where to find header files. There are some standard places where they look automatically, but Frama-C has fewer of those than (and different ones from) a normal compiler.
You need to find out where event.h is installed, then pass something like -cpp-extra-args "-I /path/to/directory/" to Frama-C. Pass the directory name only, not including the name event.h itself.
In addition to Isabelle Newbie's answer, I'd like to point out that the Chlorine version of Frama-C, whose beta has been recently announced, features a new option -json-compilation-database that attempts to read the arguments to be passed to the pre-processor from a compilation database.
Such database can be generated directly by cmake, but there are solutions for make-based project such as the one you refer to, in particular bear, which intercepts the commands launched by make to build the database.
Here's a detailed summary of how you could proceed, using the new -json-compilation-database option from Frama-C 17 Chlorine, plus an extra script list_files.py (which is not in the beta, but will be available in the final 17 release, and can be downloaded here):
Get the source files you want to analyze with Frama-C, run ./configure, and if possible try to disable optional dependencies from external libraries; for instance, some code bases include optional dependencies based on availability of libraries/system features, but have fallback options (resorting to standard C library or POSIX functions). The more you give Frama-C, the better the chances of analyzing it well, so if such external libraries are not essential, excluding them might help get a more "POSIXy" code, which should help. This is typically visible in config.h files, in macros commonly named HAVE_*.
Compile and install Build EAR or some equivalent tool to obtain a compile_commands.json file.
Run bear make (or cmake with flag CMAKE_EXPORT_COMPILE_COMMANDS) to get the compile_commands.json file.
Run the aforementioned list_files.py in the directory containing compile_commands.json to obtain the list of C sources used during compilation.
Run Frama-C (17 Chlorine or newer), giving it the list of sources found in the previous step, plus option -json-compilation-database . to parse the compile_commands.json and, hopefully, get the appropriate preprocessing flags.
Ideally, this should suffice, but in practice, this is rarely enough. In particular due to the presence of external libraries and non-C99, non-POSIX functions, the following steps are always needed.
6. Inclusion of external libraries
At this step, Frama-C will complain about the lack of event.h. You'll have to include the headers of this library yourself. Note: copying headers directly from your /usr/include is not likely to work, due to several architecture-specific definitions, especially files such as bits/*.h..
Instead, consider downloading the external libraries and preparing them (e.g. running ./configure at least). Then manually add the extra include directory via -cpp-extra-args="-I <path/to/your/sources/for/libevent.h>/include".
7. Inclusion of missing non-POSIX headers
Some other headers may be missing, in particular GNU- or BSD-specific sources (e.g. sysexits.h). Get these headers and add them when necessary. The error message in this case comes from the preprocessor (gcc) and is similar to this:
memcached.c:51:10: fatal error: sysexits.h: No such file or directory
#include <sysexits.h>
^~~~~~~~~~~~
compilation terminated.
8. Definition of missing non-POSIX types and constants
At this point, all necessary headers should be available, but parsing with Frama-C may still fail. This is due to usage of non-POSIX type definitions (e.g. caddr_t, struct ling), non-POSIX constants (e.g. MAXPATHLEN, SOCK_NONBLOCK, NI_MAXSERV). Error messages typically resemble the following:
[kernel] memcached.c:3261: Failure: Cannot resolve variable MAXPATHLEN
Constants are often easy to provide manually, by grepping what's available in your /usr/include.
Type definitions, on the other hand, may require some copy-pasting at the right places, especially if they depend on other types which are also missing. This step is hardly automatizable, but relatively straightforward once you get used to some specific error messages.
For instance, the following error message is related to a missing type definition (caddr_t):
[kernel] Parsing memcached.c (with preprocessing)
[kernel] memcached.c:1074:
syntax error:
Location: line 1074, between columns 38 and 47, before or at token: c
1072 *hdr++ = 0;
1073 *hdr++ = 0;
1074 assert((void *) hdr == (caddr_t)c->msglist[i].msg_iov[0].iov_base + UDP_HEADER_SIZE);
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1075 }
1076
Note that the token just before c is (caddr_t), which has never been defined (it is often defined as either void * or char *).
The following error message is related to an incomplete type, i.e., a struct used somewhere but never defined:
[kernel] memcached.c:5811: User Error:
variable `ling' has initializer but incomplete type
It means that variable ling's type, which is struct linger (non-POSIX), has never been defined. In this case, we can copy it from our /usr/include/bits/socket.h:
struct linger
{
int l_onoff; /* Nonzero to linger on close. */
int l_linger; /* Time to linger. */
};
Note: if there are POSIX constants/definitions missing from Frama-C's libc, consider notifying its developers, or proposing pull requests in Frama-C's Github.
9. Fixing incompatible and missing function prototypes
Parsing is likely to succeed after the previous step, but it may still fail due to incompatible function prototypes. For instance, you may get:
[kernel] User Error: Incompatible declaration for usleep:
different integer types int and unsigned int
First declaration was at assoc.c:238
Current declaration is at items.c:1573
This is the consequence of a warning emitted earlier:
[kernel:typing:implicit-function-declaration] slabs.c:1150: Warning:
Calling undeclared function usleep. Old style K&R code?
It means that function usleep is called, but it does not have a prototype, therefore Frama-C uses the pre-C99 convention of "implicit int": it generates such a prototype, but later in the code, an actual declaration of usleep is found, and its type is not int. Hence the error.
To prevent this, you need to ensure usleep's prototype is properly included. Since it is not POSIX.1-2008, you need to either define/undefine the appropriate macros (see unistd.h), or add your own prototype.
At the end, this should allow Frama-C to parse the files and build an AST.
However, there are several missing prototypes yet; we were just lucky that none conflicted with actual declarations. Ideally, you'll consider the parsing stage done when there are no more messages such as implicit-function-declaration and similar warnings.
Some of the missing prototypes in memcached, such as getsubopt, are POSIX and should be integrated into Frama-C's standard library. Others might make part of a small library of non-standard stubs, to be reused for other software.
Contributing with results for future reuse
Successful conclusion of the parsing stage for such open source libraries is enough to consider them for integration into this repository of open source case studies, so that future users can start their analyses without having to redo all of these steps. (The repository is oriented towards Eva, but not exclusively: parsing is useful for all of Frama-C plug-ins.)

Call Rmath via Ctypes from Ocaml on OS X

I want to use R's mathematical functions as provided in libRmath from Ocaml. I successfully installed the library via brew tap homebrew science && brew install --with-librmath-only r. I end up with a .dylib in /usr/local/lib and a .h in /usr/local/include. Following the Ocaml ctypes tutorial, i do this in utop
#require "ctypes.foreign";;
open Ctypes;;
open Foreign;;
let test_pow = foreign "pow_di" (float #-> int #-> returning float);;
this complains that it can't find the symbol. What am I doing wrong? Do I need to open the dynamic library first? Set some environment variables? After googling, I also did this:
nm -gU /usr/local/lib/libRmath.dylib
which gives a bunch of symbols all with a leading underscore including 00000000000013ff T _R_pow_di. In the header file, pow_di is defined via some #define directive from _R_pow_di. I did try variations of the name like "R_pow_di" etc.
Edit: I tried compiling a simple C program using Rmath using Xcode. After setting the include path manually to include /usr/local/include, Xcode can find the header file Rmath.h. However, inside the header file, there is an include of R_ext/Boolean.h which does not seem to exist. This error is flagged by Xcode and compilation stops.
Noob alert: this may be totally obvious to a C programmer...
In order to use external library you still need to link. There're at least two different ways, either link using compiler, or link even more dynamically using dlopen.
For the first method use the following command (as an initial approximation):
ocamlbuild -pkg ctypes.foreign -lflags -cclib,-lRmath yourapp.native
under premise that your code is put into yourapp.ml file.
The second method is to use ctypes interface to dlopen to open the library. Using the correct types and name for the C function call, this goes like this:
let library = Dl.dlopen ~filename:"libRmath.dylib" ~flags:[]
let test_pow = foreign ~from:library "R_pow_di" (double #-> int #-> returning double)

Clang + AVR Compilation error: '=w' in asm

I got this error below and I can not find a solution. Does anybody know how
to fix this error?
rafael#ubuntu:~/avr/projeto$ clang -fsyntax-only -Os -I /usr/lib/avr/include -D__AVR_ATmega328P__ -DARDUINO=100 -Wno-ignored-attributes -Wno-attributes serial_tree.c
In file included from serial_tree.c:3:
In file included from /usr/lib/avr/include/util/delay.h:43:
/usr/lib/avr/include/util/delay_basic.h:108:5: error: invalid output
constraint
'=w' in asm
: "=w" (__count)
^
1 error generated.
util/delay.h is a Header from avr-libc, and the feature which you are using (some of the delay stuff) is using inline asm to implement it. avr-libc is written to be used with avr-gcc and uses features from avr-gcc like machine dependent constrains in inline asm. llvm does not recognize constraint "w", thus you have to use a different approach like using avr-gcc or porting that code to llvm.
avr-gcc also implements built-in function __builtin_avr_delay_cycles to implement wasting a specified number of clock cycles. If llvm is properly mimicking avr-gcc, then you can use that function as a starting point.

blackfin gcc-toolchain linking error math function like atan2 :undefined reference to ... within watan2.o

I've a problem with the classic math function linking of my bare metal program with blackfin tool chain linker. I tried many things but I cannot see why the libm.a does't provide the definitions for the function it use. Do I need to add an extra library? if yes which one?
I've put my linker verbose lign with linked libraries and the example linking error I got.
Thanks,
William
bfin-elf-ld -v -o test_ad1836_driver -T coreb_test_ad1836_driver.lds --just-symbol ../../icc_core/icc queue.o ezkit_561.o heap_2.o port.o tasks.o test_ad1836_driver.o list.o croutine.o user_isr.o bfin_isr.o app_c.o context_sl_asm.o cycle_count.o CFFT_Rad4_NS_NBRev.o fir_decima.o fir_decima_spl.o math_tools.o -Ttext 0x3c00000 -L /opt/uClinux/bfin-elf/bfin-elf/lib -lbffastfp -lbfdsp -lg -lc -lm -Map=test_ad1836_driver.map
argv[0] = 'bfin-elf-ld'
bindir = '/opt/uClinux/bfin-elf/bin/'
tooldir = '/opt/uClinux/bfin-elf/bin/../bfin-elf/bin/'
linker = '/opt/uClinux/bfin-elf/bin/../bfin-elf/bin/ld.real'
elf2flt = '/opt/uClinux/bfin-elf/bin/../bfin-elf/bin/elf2flt'
nm = '/opt/uClinux/bfin-elf/bin/../bfin-elf/bin/nm'
objdump = '/opt/uClinux/bfin-elf/bin/bfin-elf-objdump'
objcopy = '/opt/uClinux/bfin-elf/bin/bfin-elf-objcopy'
ldscriptpath = '/opt/uClinux/bfin-elf/bin/../bfin-elf/bin/../lib'
Invoking: '/opt/uClinux/bfin-elf/bin/../bfin-elf/bin/ld.real' '-v' '-o' 'test_ad1836_driver' '-T' 'coreb_test_ad1836_driver.lds' '--just-symbol' '../../icc_core/icc' 'queue.o' 'ezkit_561.o' 'heap_2.o' 'port.o' 'tasks.o' 'test_ad1836_driver.o' 'list.o' 'croutine.o' 'user_isr.o' 'bfin_isr.o' 'app_c.o' 'context_sl_asm.o' 'cycle_count.o' 'CFFT_Rad4_NS_NBRev.o' 'fir_decima.o' 'fir_decima_spl.o' 'math_tools.o' '-Ttext' '0x3c00000' '-L' '/opt/uClinux/bfin-elf/bfin-elf/lib' '-lbffastfp' '-lbfdsp' '-lg' '-lc' '-lm' '-Map=test_ad1836_driver.map'
GNU ld version 2.17
/opt/uClinux/bfin-elf/bfin-elf/lib/libm.a(w_atan2.o): In function `atan2':
/usr/src/packages/BUILD/blackfin-toolchain-2010R1/gcc-4.3/newlib/libm/math/w_atan2.c:96: undefined reference to `__eqdf2'
/usr/src/packages/BUILD/blackfin-toolchain-2010R1/gcc-4.3/newlib/libm/math/w_atan2.c:96: relocation truncated to fit: R_BFIN_PCREL24 against undefined symbol `__eqdf2'
.....
/opt/uClinux/bfin-elf/bfin-elf/lib/libm.a(e_sqrt.o): In function `_ieee754_sqrt':
/usr/src/packages/BUILD/blackfin-toolchain-2010R1/gcc-4.3/newlib/libm/math/e_sqrt.c:110: undefined reference to `__muldf3'
/usr/src/packages/BUILD/blackfin-toolchain-2010R1/gcc-4.3/newlib/libm/math/e_sqrt.c:110: undefined reference to `__adddf3'
.....
/opt/uClinux/bfin-elf/bfin-elf/lib/libm.a(s_atan.o): In function `atan':
/usr/src/packages/BUILD/blackfin-toolchain-2010R1/gcc-4.3/newlib/libm/math/s_atan.c:169: undefined reference to `__muldf3'
/usr/src/packages/BUILD/blackfin-toolchain-2010R1/gcc-4.3/newlib/libm/math/s_atan.c:170: undefined reference to `__muldf3'
/usr/src/packages/BUILD/blackfin-toolchain-2010R1/gcc-4.3/newlib/libm/math/s_atan.c:172: undefined reference to `__muldf3'
Add -lgcc. You need the functions to compare, add and multiply C double type values, respectively, __eqdf2, __adddf3 and __muldf3.
Usually, I'd recommend using the compiler driver (gcc) instead of linking directly with ld, even for firmware/kernel type outputs, because the former will take care of the necessary startup files and compiler runtime libraries.
Hi I think I know the problem, blackfin is not really compatible with the maths std lib. It is why in the VDSP version the maths functions are re implemented. To solve my problem, I did convert the VDSP maths lib to gcc and it compile fine now.
Thanks
Actually I found a better answer,
blackfin does in fact support std math. I just had some library flag in the wrong order.
For the linker, use the following lib flag order and it should work:
/opt/uClinux/bfin-elf/bin/../bfin-elf/bin/ld.real' '-v' '-o' .... '-L' '/opt/uClinux/bfin-elf/lib/gcc/bfin-elf/4.3.5/' '-lgcc' '-L' '/opt/uClinux/bfin-elf/bfin-elf/lib' '-lbfdsp' '-lg' '-lm' '-lbffastfp' '-lc'

UNIX 'comm' utility allows for case insensitivity in BSD but not Linux (via -i flag). How can I get it in Linux?

I'm using the excellent UNIX 'comm' command line utility in an application which I developed on a BSD platform (OSX). When I deployed to my Linux production server I found out that sadly, Ubuntu Linux's 'comm' utility does not take the -i flag to indicate that the lines should be compared case-insensitive. Apparently the POSIX standard does not require the -i option.
So... I'm in a bind. I really need the -i option that works so well on BSD. I've gone so far to try to compile the BSD comm.c source code on the Linux box but I got:
http://svn.freebsd.org/viewvc/base/user/luigi/ipfw3-head/usr.bin/comm/comm.c?view=markup&pathrev=200559
me#host:~$ gcc comm.c
comm.c: In function ‘getline’:
comm.c:195: warning: assignment makes pointer from integer without a cast
comm.c: In function ‘wcsicoll’:
comm.c:264: warning: assignment makes pointer from integer without a cast
comm.c:270: warning: assignment makes pointer from integer without a cast
/tmp/ccrvPbfz.o: In function `getline':
comm.c:(.text+0x421): undefined reference to `reallocf'
/tmp/ccrvPbfz.o: In function `wcsicoll':
comm.c:(.text+0x691): undefined reference to `reallocf'
comm.c:(.text+0x6ef): undefined reference to `reallocf'
collect2: ld returned 1 exit status
Does anyone have any suggestions as to how to get a version of comm on Linux that supports 'comm -i'?
Thanks!
You can add the following in comm.c:
void *reallocf(void *ptr, size_t size)
{
void *ret = realloc(ptr, size);
if (ret == NULL) {
free(ptr);
}
return ret;
}
You should be able to compile it then. Make sure comm.c has #include <stdlib.h> in it (it probably does that already).
The reason your compilation fails is because BSD comm.c uses reallocf() which is not a standard C function. But it is easy to write.
#OP ,there's no need to go to such length as to do your own src code compilation . Here's an alternative suggestion. Since you want case insensitive, you can just convert the cases in both files to lower (or upper case) using another tool such as tr before you pass the files to comm.
tr '[A-Z]' '[a-z]' <file1 > temp1
tr '[A-Z]' '[a-z]' <file2 > temp2
comm temp1 temp2
You could try to cat both files and pipe them to uniq -c -i. It'll show all lines in both files, with the number of appearances in the first column. As long as the original files don't have repeated lines, all lines with the first column >1 are lines common to both files.
Hope it helps!
Does anyone have any suggestions as to how to get a version of comm on Linux that supports 'comm -i'?
Not quite that; but have you checked if your requirements could be satisfied by the join utility? This one does have the -i option on Linux...

Resources