Is there a way to specify the base address of a shared library using dlopen()? - dynamic-linking

It seems that when we dlopen() some libraries, they will be loaded into some preferred (but not fixed) addresses. I've checked the source code of dlopen(), and a core function says
static __always_inline const char *
_dl_map_segments (struct link_map *l, int fd,
const ElfW(Ehdr) *header, int type,
const struct loadcmd loadcmds[], size_t nloadcmds,
const size_t maplength, bool has_holes,
struct link_map *loader)
{
const struct loadcmd *c = loadcmds;
if (__glibc_likely (type == ET_DYN))
{
/* This is a position-independent shared object. We can let the
kernel map it anywhere it likes, but we must have space for all
the segments in their specified positions relative to the first.
So we map the first segment without MAP_FIXED, but with its
extent increased to cover all the segments. Then we remove
access from excess portion, and there is known sufficient space
there to remap from the later segments.
As a refinement, sometimes we have an address that we would
prefer to map such objects at; but this is only a preference,
the OS can do whatever it likes. */
ElfW(Addr) mappref
= (ELF_PREFERRED_ADDRESS (loader, maplength,
c->mapstart & GLRO(dl_use_load_bias))
- MAP_BASE_ADDR (l));
/* Remember which part of the address space this object uses. */
l->l_map_start = (ElfW(Addr)) __mmap ((void *) mappref, maplength,
c->prot,
MAP_COPY|MAP_FILE,
fd, c->mapoff);
if (__glibc_unlikely ((void *) l->l_map_start == MAP_FAILED))
return DL_MAP_SEGMENTS_ERROR_MAP_SEGMENT;
...
}
The comment says you can specify a preferred address, but OS will determine whether to use it.
Question
Is there any way we can specify the base address for each dlopened module?
ELF_PREFERRED_ADDRESSS is set to 0 by default, but this macro seems to infer that the preferred addresses can be changed, say by an environment variable? But even there is one, I doubt that it can be changed for each dlopened library.
If I want to implement this feature myself, it seems that I need to wrap a new dlopen function and pass the preferred address to the above core function (and use MAP_FIXED maybe). Is it correct?
Thanks!

Is there any way we can specify the base address for each dlopened module?
No.
ELF_PREFERRED_ADDRESSS is set to 0 by default, but this macro seems to infer that the preferred addresses can be changed, say by an environment variable? But even there is one, I doubt that it can be changed for each dlopened library.
This code is compiled into the dynamic loader ld-linux.so and can not be changed after the compilation.
If I want to implement this feature myself, it seems that I need to wrap a new dlopen function and pass the preferred address to the above core function (and use MAP_FIXED maybe). Is it correct?
The function is private to ld-linux. You will not be able to wrap it, or call it from outside of ld-linux.
P.S. What you are likely looking for is the prelink command.

Related

glVertexAttribPointer last attribute value or pointer

The last attribute of glVertexAttribPointer is of type const GLvoid*. But is it really a pointer? It is actually an offset. If I put 0, it means an offset of 0 and not a null pointer to an offset. In my engine, I use this function:
void AbstractVertexData::vertexAttribPtr(int layout) const
{
glVertexAttribPointer(layout,
getShaderAttribs()[layout]->nbComponents,
static_cast<GLenum>(getShaderAttribs()[layout]->attribDataType),
getShaderAttribs()[layout]->shouldNormalize,
getVertexStride(layout),
reinterpret_cast<const void*>(getVertexAttribStart(layout)));
}
getVertexAttribStart returns an intptr_t. When I run drmemory, it says "uninitialized read" and I want to remove that warning. This warning comes from the reinterpret_cast. I can't static_cast to a const void* since my value isn't a pointer. What should I do to fix this warning?
Originally, back in OpenGL-1.1 when vertex arrays got introduces, functions like glVertexPointer, glTexCoordPointer and so on were accepting pointers into client address space. When shaders got introduced they came with arbitrary vertex attributes and the function glVertexAttribPointer follows the same semantics (this was in OpenGL-2.1).
The buffer objects API was then reusing existing functions, where you'd pass an integer for a pointer parameter.
OpenGL-3.3 core eventually made the use of buffer objects mandatory and ever since the glVertexAttribPointer functions being defines with a void* in their function signature are a sore spot; I've written in extent about it in https://stackoverflow.com/a/8284829/524368 (but make sure to read the rest of the answers as well).
Eventually new functions got introduced that allow for a more fine grained control over how vertex attributes are accessed, replacing glVertexAttribPointer, and those operate purely on offsets.

What's the Nicest Way to do This? (Qt and Enum style arguments)

In Qt, it is common to see something similar to the following:
QSettings obj3(QSettings::SystemScope, "MySoft", "Star Runner");
The important bit is the QSettings::SystemScope, which is an enum.
I want to have a settings provider (pay no attention to the previous example here, it has nothing to do with the following), with a get/set property.
Settings.set(Settings::refreshRate)
The refreshRate has to link to a key (string), and a default value (variant).
Should I make an enum and two dicts for the key and default values, or make a struct and a whole bunch of variables that encapsulate the settings I need? Should I try something else?
Thanks!
Edit!
This is what I did.
// Interface
class Settings {
public:
static QVariant get(Setting setting);
static void set(Setting setting, QVariant value);
const static Setting serverRefreshRate;
const static Setting serverReportTimeout;
};
// Implementation
const Setting Settings::serverRefreshRate = { "server/refreshRate", 10000 };
const Setting Settings::serverReportTimeout = { "server/reportTimeout", 1000 };
Well I guess since you're using enum which most likely will be easily castable to numbers from to 0 to N-1 I guess just storing variants and strings in two vectors or one vector of pairs would work just fine.
There's also another question though -- how to initialize all of that and how you will be adding new settings to it. I can suggest two methods - first one writing a bunch of function calls with arguments: enum, string, variant. Thus way though if programmer adds another value to enum he can forget to call initializing function. The other way is to create function (or maybe two) which will do switch on all enum values (without default case) and will return pair of string and variant. You can turn on the compiler warning about all enum values being processed in switch and thus way control if you forget to implement some of them in that function. And then initialize your structures using loop on all of enum values. These initializing functions should be called somewhere near the beginning of your program (before reading settings initially).
Well, that's my thoughts on it, you are free to try some different ways though.

Go Programming - bypassing access privileges using pointers

Let's say I have the following hierarchy for my project:
fragment/fragment.go
main.go
And in the fragment.go I have the following code, with one getter and no setter:
package fragment
type Fragment struct {
number int64 // private variable - lower case
}
func (f *Fragment) GetNumber() *int64 {
return &f.number
}
And in the main.go I create a Fragment and try to change Fragment.number without a setter:
package main
import (
"fmt"
"myproject/fragment"
)
func main() {
f := new(fragment.Fragment)
fmt.Println(*f.GetNumber()) // prints 0
//f.number = 8 // error - number is private
p := f.GetNumber()
*p = 4 // works. Now f.number is 4
fmt.Println(*f.GetNumber()) // prints 4
}
So by using the pointer, I changed the private variable outside of the fragment package. I understand that in for example C, pointers help to avoid copying large struct/arrays and they are supposed to enable you to change whatever they're pointing to. But I don't quite understand how they are supposed to work with private variables.
So my questions are:
Shouldn't the private variables stay private, no matter how they are accessed?
How is this compared to other languages such as C++/Java? Is it the case there too, that private variables can be changed using pointers outside of the class?
My Background: I know a bit C/C++, rather fluent in Python and new to Go. I learn programming as a hobby so don't know much about technical things happening behind the scenes.
You're not bypassing any access privilegies. If you acquire a *T from any imported package then you can always mutate *T, ie. the pointee at whole, as in an assignment. The imported package designer controls what you can get from the package, so the access control is not yours.
The restriction to what's said above is for structured types (structs), where the previous still holds, but the finer granularity of access control to a particular field is controlled by the field's name case even when referred to by a pointer to the whole structure. The field name must be uppercase to be visible outside its package.
Wrt C++: I believe you can achieve the same with one of the dozens C++ pointer types. Not sure which one, though.
Wrt Java: No, Java has no pointers. Not really comparable to pointers in Go (C, C++, ...).

Changing function reference in Mach-o binary

I need to change to reference of a function in a mach-o binary to a custom function defined in my own dylib. The process I am now following is,
Replacing references to older functions to the new one. e.g _fopen to _mopen using sed.
I open the mach-o binary in MachOView to find the address of the entities I want to change. I then manually change the information in the binary using a hex editor.
Is there a way I can automate this process i.e write a program to read the symbols, and dynamic loading info and then change them in the executable. I was looking at the mach-o header files at /usr/include/mach-o but am not entire sure how to use them to get this information. Do there exist any libraries present - C or python which help do the same?
interesting question, I am trying to do something similar to static lib; see if this helps
varrunr - you can easily achieve most if not all of the functionality using DYLD's interposition. You create your own library, and declare your interposing functions, like so
// This is the expected interpose structure
typedef struct interpose_s {
void *new_func;
void *orig_func;
} interpose_t;
static const interpose_t interposing_functions[] \
__attribute__ ((section("__DATA, __interpose"))) = {
{ (void *)my_open, (void *) open }
};
.. and you just implement your open. In the interposing functions all references to the original will work - which makes this ideal for wrappers. And, you can insert your dylib forcefully using DYLD_INSERT_LIBRARIES (same principle as LD_PRELOAD on Linux).

How do game trainers change an address in memory that's dynamic?

Lets assume I am a game and I have a global int* that contains my health. A game trainer's job is to modify this value to whatever in order to achieve god mode. I've looked up tutorials on game trainers to understand how they work, and the general idea is to use a memory scanner to try and find the address of a certain value. Then modify this address by injecting a dll or whatever.
But I made a simple program with a global int* and its address changes every time I run the app, so I don't get how game trainers can hard code these addresses? Or is my example wrong?
What am I missing?
The way this is usually done is by tracing the pointer chain from a static variable up to the heap address containing the variable in question. For example:
struct CharacterStats
{
int health;
// ...
}
class Character
{
public:
CharacterStats* stats;
// ...
void hit(int damage)
{
stats->health -= damage;
if (stats->health <= 0)
die();
}
}
class Game
{
public:
Character* main_character;
vector<Character*> enemies;
// ...
}
Game* game;
void main()
{
game = new Game();
game->main_character = new Character();
game->main_character->stats = new CharacterStats;
// ...
}
In this case, if you follow mikek3332002's advice and set a breakpoint inside the Character::hit() function and nop out the subtraction, it would cause all characters, including enemies, to be invulnerable. The solution is to find the address of the "game" variable (which should reside in the data segment or a function's stack), and follow all the pointers until you find the address of the health variable.
Some tools, e.g. Cheat Engine, have functionality to automate this, and attempt to find the pointer chain by themselves. You will probably have to resort to reverse-engineering for more complicated cases, though.
Discovery of the access pointers is quite cumbersome and static memory values are difficult to adapt to different compilers or game versions.
With API hooking of malloc(), free(), etc. there is a different method than following pointers. Discovery starts with recording all dynamic memory allocations and doing memory search in parallel. The found heap memory address is then reverse matched against the recorded memory allocations. You get to know the size of the object and the offset of your value within the object. You repeat this with backtracing and get the jump-back code address of a malloc() call or a C++ constructor. With that information you can track and modify all objects which get allocated from there. You dump the objects and compare them and find a lot more interesting values. E.g. the universal elite game trainer "ugtrain" does it like this on Linux. It uses LD_PRELOAD.
Adaption works by "objdump -D"-based disassembly and just searching for the library function call with the known memory size in it.
See: http://en.wikipedia.org/wiki/Trainer_%28games%29
Ugtrain source: https://github.com/sriemer/ugtrain
The malloc() hook looks like this:
static __thread bool no_hook = false;
void *malloc (size_t size)
{
void *mem_addr;
static void *(*orig_malloc)(size_t size) = NULL;
/* handle malloc() recursion correctly */
if (no_hook)
return orig_malloc(size);
/* get the libc malloc function */
no_hook = true;
if (!orig_malloc)
*(void **) (&orig_malloc) = dlsym(RTLD_NEXT, "malloc");
mem_addr = orig_malloc(size);
/* real magic -> backtrace and send out spied information */
postprocess_malloc(size, mem_addr);
no_hook = false;
return mem_addr;
}
But if the found memory address is located within the executable or a library in memory, then ASLR is likely the cause for the dynamic. On Linux, libraries are PIC (position-independent code) and with latest distributions all executables are PIE (position-independent executables).
EDIT: never mind it seems it was just good luck, however the last 3 numbers of the pointer seem to stay the same. Perhaps this is ASLR kicking in and changing the base image address or something?
aaahhhh my bad, i was using %d for printf to print the address and not %p. After using %p the address stayed the same
#include <stdio.h>
int *something = NULL;
int main()
{
something = new int;
*something = 5;
fprintf(stdout, "Address of something: %p\nValue of something: %d\nPointer Address of something: %p", &something, *something, something);
getchar();
return 0;
}
Example for a dynamicaly allocated varible
The value I want to find is the number of lives to stop my lives from being reduced to 0 and getting game over.
Play the Game and search for the location of the lifes variable this instance.
Once found use a disassembler/debugger to watch that location for changes.
Lose a life.
The debugger should have reported the address that the decrement occurred.
Replace that instruction with no-ops
Got this pattern from the program called tsearch
A few related websites found from researching this topic:
http://deviatedhacking.com/index.php?/topic/75-dynamic-memory-allocation/
http://www.edgeofnowhere.cc/viewforum.php?f=183
http://www.oldschoolhack.de/tutorials/Theories%20and%20methods%20of%20code-caves.htm
http://webcache.googleusercontent.com/search?q=cache:4wzMzFIZx54J:gamehacking.com/forums/tutorials-beginners/11597-c-making-game-trainer.html+reading+a+dynamic+memory+address+game+trainer&cd=2&hl=en&ct=clnk&gl=au&client=firefox-a (A google cache version)
http://www.codeproject.com/KB/cpp/codecave.aspx
The way things like Gameshark codes were figured out were by dumping the memory image of the application, then doing one thing, then looking to see what changed. There might be a few things changing, but there should be patterns to look for. E.g. dump memory, shoot, dump memory, shoot again, dump memory, reload. Then look for changes and get an idea for where/how ammo is stored. For health it'll be similar, but a lot more things will be changing (since you'll be moving at the very least). It'll be easiest though to do it when minimizing the "external effects," e.g. don't try to diff memory dumps during a firefight because a lot is happening, do your diffs while standing in lava, or falling off a building, or something of that nature.

Resources