Can we hide or obscure symbol names in the symbol table of an ELF executable object file?

According to the ELF specification, an ELF object file contains various sections, one of which is the symbol table section .symtab, which holds information about all symbols (files, functions, objects, etc.).
For each symbol in the symbol table, ELF records information such as its name, attribute flags, type, value, and binding.
The name of a file, function, or object (array, variable, string) exposes internal information about the code. Anyone can analyze an ELF file (using the strings, objdump, or readelf tools), see this information, and get an idea of internals that should be kept secret.
For readability and maintainability we write code that developers can understand, so we need to keep using proper file names, variable names, and so on. We cannot obscure them with code obfuscation, as that would make the code difficult to maintain.
Question (edited): Is there any way to hide or remove symbol "names" from the symbol table of an executable ELF file so that no one can gain insight into the code, while the executable remains operational?

Is there any way to hide or obscure symbol names in the symbol table of an ELF file so that no one can gain insight into how the code is developed (without code obfuscation)?
Depends on what kind of ELF file you are shipping to the end user.
If you are shipping a fully-linked ELF executable, running strip a.out will remove the symbol table completely (but not the dynamic symbol table, which must remain for obvious reasons).
If you are shipping an ELF shared library, you need to carefully control its exposed API using -fvisibility=hidden or a linker version script. If you do, strip will again remove everything except your public API.
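For instance, a minimal sketch of the version-script variant (foo.c, foo.map, and the exported function foo_public are all hypothetical names). The contents of foo.map:

{
  global: foo_public;
  local: *;
};

Then build and strip; .dynsym will expose only foo_public:

gcc -shared -fPIC foo.c -o libfoo.so -Wl,--version-script=foo.map
strip libfoo.so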
If you are shipping a relocatable ELF object (or an archive library), then you can't do anything about its symbol table (again for an obvious reason: the symbol table is needed to perform the final link).
Finally, your question appears to be predicated on a misconception:
We cannot obscure them with code obfuscation, as that would make the code difficult to maintain.
The usual way to apply code obfuscation is just before you produce the final shipping product (i.e., at exactly the point where you would run strip, or use any other method of hiding implementation details). Applying obfuscation at that point makes the result difficult to maintain in exactly the same way as any other method of hiding implementation details does.
Notably, you don't (usually) apply obfuscation to the code under development and maintenance (i.e. your development builds remain un-obfuscated).

Yes, it is possible. You can use strip to remove static symbols, and you can avoid exposing dynamic library symbols by loading the library yourself (with dlopen) instead of letting the OS do it automatically.
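For illustration, a minimal dlopen-based sketch in C (libfoo.so and the looked-up function foo_public are hypothetical names); note that dlsym still needs the name of any symbol you resolve this way, so only those names have to stay meaningful:

/* Load the library manually instead of linking against it, so the
 * executable carries no dynamic references to the library's symbols. */
#include <dlfcn.h>
#include <stdio.h>

int main(void)
{
    void *h = dlopen("./libfoo.so", RTLD_NOW | RTLD_LOCAL);
    if (!h) {
        fprintf(stderr, "%s\n", dlerror());
        return 1;
    }

    int (*fn)(void) = (int (*)(void))dlsym(h, "foo_public");
    if (fn)
        fn();

    dlclose(h);
    return 0;
}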

Related

How do I save a dynamically generated Lisp system in external files?

Basically, I want to be able to generate class definitions, compile the system, and save it for reuse. Would that involve a code walker, or is there a simpler option?
(save-lisp-and-die "isn't going to work for me")
Expanding to explain. I'm generating systems based on OpenAPI definitions, so a system roughly corresponds to an API client.
There will be dozens, if not hundreds, of these.
The idea is NOT to keep them all in the image, but to load them at run time as required.
I see two possible routes here, and to some extent I suspect they mainly differ in "the last mile" (as it were):
1. The route you seem to have settled on: run-time definition of classes and functions.
2. A route whereby you generate your function/class forms but, instead of going all the way to making them "live" in the image, emit the form(s) to a file.
I suspect that most of the generating code could be shared between the two, with the first route using a wrapping macro that effectively returns a PROGN, and the second calling a function that pretty-prints what the macro would have returned to a stream.
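For illustration, a rough sketch of that shared-generator idea (spec-class-name and spec-fn-name are assumed accessors over a parsed OpenAPI definition):

;; One generator feeding both routes.
(defun generate-client-forms (api-spec)
  "Return a list of definition forms built from API-SPEC."
  (list `(defclass ,(spec-class-name api-spec) () ())
        `(defun ,(spec-fn-name api-spec) (client)
           (declare (ignore client)))))

;; Route 1: make the definitions live in the running image.
(defun define-client-live (api-spec)
  (mapc #'eval (generate-client-forms api-spec)))

;; Route 2: pretty-print the same forms to a file for later
;; COMPILE-FILE and LOAD.
(defun write-client-file (api-spec path)
  (with-open-file (out path :direction :output :if-exists :supersede)
    (with-standard-io-syntax
      (let ((*print-case* :downcase)
            (*print-pretty* t))
        (dolist (form (generate-client-forms api-spec))
          (pprint form out)
          (terpri out))))))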
That said, building a tailored environment and saving it to a "core" file is a pretty good way of getting excellent startup times.

cuda global pointer allocation in different source file

I am facing a situation where I need some tables to be filled in one source file (for example fill.cu) and then used by kernels in different source files.
I tried declaring a pointer __device__ float *myTable; as extern in the fill.h header file, including that header in others.cpp, and defining the pointer in fill.cu, where I allocate and fill it.
This way, I got a linker error indicating that myTable had already been defined in fill.cpp.
After many unsuccessful tries, I decided to put all kernels that need this table in the same source file; that way everything worked fine, until I added a cudaMalloc in the main function before allocating my table in fill.cpp.
Then I noticed that the table values and the data allocated in main overlapped, and using the CUDA debugging tools of MS Visual Studio 2015, I found that the two allocated pointers were the same!
Please advise how to declare a global pointer in CUDA without such a conflict.
The traditional CUDA linkage model requires that all device symbols, textures, functions, etc. be defined and used within the scope of the same translation unit. It sounds like your code structure is violating this requirement.
You have two choices:
Keep the same code structure, but provide wrapper functions that your main code can call to perform operations on statically declared device variables, rather than directly manipulating device symbols with the CUDA API from other files.
Use separate compilation. Here, you define the device symbol you want to access in exactly one file and declare the same symbol as extern everywhere else you need to use it. You must explicitly pass several nvcc options to compile your device code and use a separate device-code linking stage (a sketch follows below).
Both approaches are well documented.
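For illustration, here is a minimal sketch of the second approach, reusing the file names from the question (consume.cu, main.cpp, and the kernel bodies are hypothetical):

// fill.h -- shared header: declaration only, no definition
#pragma once
extern __device__ float *myTable;  // defined exactly once, in fill.cu
void fillTable(int n);             // host-side wrapper

// fill.cu -- the single definition, plus allocation and fill code
#include "fill.h"
__device__ float *myTable;

__global__ void fillKernel(float *t, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) t[i] = (float)i;
}

void fillTable(int n)
{
    float *d = nullptr;
    cudaMalloc(&d, n * sizeof(float));
    // Publish the device pointer through the __device__ symbol so that
    // kernels in any translation unit can dereference myTable directly.
    cudaMemcpyToSymbol(myTable, &d, sizeof(d));
    fillKernel<<<(n + 255) / 256, 256>>>(d, n);
}

// consume.cu -- another translation unit using the same symbol
#include "fill.h"
__global__ void consumeKernel(float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = myTable[i] * 2.0f;
}

Compile with relocatable device code so the device linker can resolve myTable across translation units:

nvcc -rdc=true fill.cu consume.cu main.cpp -o app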

cmake: qt resources inside a module

I have this tree structure:
repository/modules/module1
repository/modules/module2
repository/modules/module..
repository/apps/application1
repository/apps/application2
repository/apps/application..
where the applications are using some modules.
Now I'd like to put some resources inside a module (like very colorful icons inside a widget used by several applications), but something goes wrong.
Inside the module's CMakeLists.txt, if I use only:
set(${MODULE_NAME}_RCS
    colors.qrc
)
...
qt4_add_resources(${MODULE_NAME}_RHEADERS ${${MODULE_NAME}_RCS})
no qrc_colors.cxx is created anywhere. So I tried to add:
ADD_EXECUTABLE(${MODULE_NAME}
    ${${MODULE_NAME}_RHEADERS}
)
but I get this weird error:
CMake Error at repo/modules/ColorModule/CMakeLists.txt:51 (ADD_EXECUTABLE):
add_executable cannot create target "ColorModule" because another
target with the same name already exists. The existing target is a static
library created in source directory
"repo/modules/ColorModule". See documentation for
policy CMP0002 for more details.
(I've changed the paths in the error, of course.)
So I don't know what to think, because I'm new to both CMake and Qt.
What can I try?
EDIT:
If I add ${MODULE_NAME}_RHEADERS and ${MODULE_NAME}_RCS to the add_library command, qrc_colors.cxx is created, BUT it ends up in repository/modules/module1/built and is not copied into the application's build directory...
There are at least two errors in your code.
1) It is usually not necessary to use ${MODULE_NAME} everywhere like that; plain MODULE_NAME would do in some places. The difference is a raw string versus a variable dereference, and it is usually recommended to avoid double dereferences (${${...}}) where possible.
2) More importantly, you seem to be using ${MODULE_NAME}, which is "ColorModule" according to the error output, as the name of more than one target. You should use an individual name for each binary; here the module is already a static library target, so an executable cannot reuse that name (see the sketch below).
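For illustration, a minimal sketch of attaching the resource to the module's existing library target rather than to a new executable (assuming ${MODULE_NAME}_SOURCES holds the module's sources):

set(${MODULE_NAME}_RCS
    colors.qrc
)
qt4_add_resources(${MODULE_NAME}_RHEADERS ${${MODULE_NAME}_RCS})

# Compile the generated qrc_colors.cxx into the existing static library
# instead of declaring a second target with the same name.
add_library(${MODULE_NAME} STATIC
    ${${MODULE_NAME}_SOURCES}
    ${${MODULE_NAME}_RHEADERS}
)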
Also, the focus on the resource file is a bit of a red herring here. There are several other issues with your project:
You name your cmake files CmakeLists.txt instead of CMakeLists.txt, which inherently causes issues on case-sensitive systems such as my Linux box.
You use Findfoo.cmake, and find_package(foo) for that matter, rather than the usual FindFoo.cmake convention alongside find_package(Foo).
Your FindFoo.cmake is quite odd and should probably be rewritten.
Most importantly, you should use config files rather than find modules.
Documentation and examples can be found at these places:
http://www.cmake.org/Wiki/CMake/Tutorials#CMake_Packages
https://projects.kde.org/projects/kde/kdeexamples/repository/revisions/master/show/buildsystem
When you would like to use a find module, you need to have it at hand already. It tells you what to look for, where things are, or whether they are missing where necessary. It is not something that you should write yourself. You should just reuse existing ones for projects that are not using cmake, which is why find modules are shipped separately.
It is a bit like putting the treasure map right next to the treasure. Do you see the irony? :) Once you have found the map, you would automatically have the treasure as well, i.e. you would not look for it anymore.

Closure: --namespace Foo does not include Foo.Bar, and related issues

I have a rather big library with a significant set of APIs that I need to expose. In fact, I'd like to expose the whole thing. There is a lot of namespacing going on, like:
FooLibrary.Bar
FooLibrary.Qux.Rumps
FooLibrary.Qux.Scrooge
..
Basically, what I would like to do is make sure that the user can access that whole namespace. I have had a whole bunch of trouble with this, and I'm totally new to closure, so I thought I'd ask for some input.
First, I need closurebuilder.py to send the full list of files to the closure compiler. This doesn't seem supported: --namespace Foo does not include Foo.Bar, and --input only allows a single file, not a directory. Nor can I simply send my list of files to the closure compiler directly, because my code also requires things like "goog.asserts", so I do need the resolver.
In fact, the only solution I can see is having a FooLibrary.ExposeAPI JS file that goog.require()s everything. Surely that can't be right?
This is my main issue.
However, later the closure compiler, with ADVANCED_OPTIMIZATIONS on, will optimize all these names away. I can fix that by adding "@export" all over the place, which I am not happy about, but it should work. I suppose it would also be valid to use an extern here. Or I could simply disable advanced optimizations.
What I can't do, apparently, is say "export FooLibrary.*". Wouldn't that make sense?
Finally, for working in source mode, I need to call goog.require() for every namespace I am using. This is merely an inconvenience, though I mention it because it is somewhat related to my trouble above. I would prefer to be able to do:
goog.requireRecursively('FooLibrary')
in order to pull in all the child namespaces as well, thus recreating with a single command the environment that I have when using the compiled version of my library.
I feel like I am possibly misunderstanding some things, or how Closure is supposed to be used. I'd be interested in looking at other Closure-based libraries to see how they solve this.
You are discovering that Closure-compiler is built more for the end consumer and not as much for the library author.
If you are exporting basically everything, then you would be better off with SIMPLE_OPTIMIZATIONS. I would still highly encourage you to maintain compatibility of your library with ADVANCED_OPTIMIZATIONS so that users can compile the library source with their project.
First, I need closurebuilder.py to send the full list of files to the closure compiler. ...
In fact, the only solution I can see is having a FooLibrary.ExposeAPI JS file that goog.require()s everything. Surely that can't be right?
You would need to specify --root for your source folder and specify the namespaces of the leaf nodes of your file dependency tree. You may have better luck with the now-deprecated calcdeps.py script; I still use it for some projects (a sketch of an invocation follows).
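For illustration, a possible calcdeps.py invocation (the paths and the exposeapi.js entry file are hypothetical); the entry file would goog.require() every leaf namespace:

python calcdeps.py \
    -i foolibrary/exposeapi.js \
    -p closure-library/ \
    -p foolibrary/ \
    -o compiled \
    -c compiler.jar > foolibrary.min.js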
What I can't do, apparently, is say "export FooLibrary.*". Wouldn't that make sense?
You can't do that because it only makes sense based on the final usage. You as the library writer wish to export everything, but perhaps a consumer of your library wishes to include the source (uncompiled) version and have more dead code elimination. Library authors are stuck in a kind of middle ground between SIMPLE and ADVANCED optimization levels.
What I have done for this case is maintain a separate exports file for my namespace that exports everything. When compiling a standalone version of my library for distribution, the exports file is included in the compilation. However, I can still include the library source (without the exports) in a project and get full dead-code elimination. The work/payoff balance of this, though, must be weighed against just using SIMPLE_OPTIMIZATIONS for the standalone library.
My GeolocationMarker library has an example of this strategy.
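A minimal sketch of such an exports file, reusing the FooLibrary names from the question (the doStuff method is hypothetical):

// exports.js -- included only when compiling the standalone library.
goog.require('FooLibrary.Bar');
goog.require('FooLibrary.Qux.Rumps');

// Preserve the public names through ADVANCED_OPTIMIZATIONS renaming.
goog.exportSymbol('FooLibrary.Bar', FooLibrary.Bar);
goog.exportSymbol('FooLibrary.Qux.Rumps', FooLibrary.Qux.Rumps);
goog.exportProperty(FooLibrary.Bar.prototype, 'doStuff',
                    FooLibrary.Bar.prototype.doStuff);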

Change in Binary without change in Source Code

I have the following requirement: to find out whether my binary has changed or not.
My source code is unchanged. When I recompile the binary (without changing the source code), I notice that the binary changes: not in size, but in contents.
On debugging a little, I found there is something called a "link time" inside the binary file. This is the actual timestamp of when the binary was linked. Since each compile gives a different timestamp, my binary contents are always different, even though they should be the same.
Can somebody suggest a way of finding out whether the binary has actually changed due to a change in the source code, and not due to anything else?
Thanks
Unlike on Windows (where every .obj file has a compile timestamp in its file header), UNIX object files, and in particular ELF files, do not encode any kind of timestamp.
However, if your source uses the __TIME__ and __DATE__ macros, then the object file produced by compilation will obviously change. Also, all kinds of information, including the compilation timestamp, could be recorded as part of the debug info if you are building -g binaries.
Finally, it's possible that the linker you are using does record the link timestamp (as a vendor extension).
Your first task should be to understand where the differences from one build to the next come from.
If from __DATE__ and __TIME__, eliminate them from your source.
If from debug info, compare the binaries after passing them through strip -g.
If from a vendor linker extension, see if there is a flag to disable such timestamps. If there isn't one, you'll have to write a tool that compares only the parts you are interested in. E.g., you could use readelf -x .text a.out, etc., to compare only the .text section (you'll also want to compare .data, .rodata, and likely many others); see the sketch below.
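For example, a comparison along those lines might look like this (a.out and b.out are the two builds being compared):

# Byte-compare after dropping debug info (and its timestamps):
strip -g a.out -o a.stripped
strip -g b.out -o b.stripped
cmp a.stripped b.stripped

# Or dump and compare individual sections:
readelf -x .text a.out > a.text
readelf -x .text b.out > b.text
diff a.text b.text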
