This is what I see in the disassembly for the statement function(1,2,3);:
movl $0x3,0x8(%esp)
movl $0x2,0x4(%esp)
movl $0x1,(%esp)
call 0x4012d0 <_Z8functioniii>
It seems the return address is not pushed onto the stack at all, so how does ret work?
On an x86 processor (as for your assembly language example), the call instruction pushes the return address on the stack and transfers control to the function.
So on entry to a function, the stack pointer is pointing at a return address, ready for ret to pop it into the program counter (EIP / RIP).
Not all processor architectures put the return address on the stack; often there's a set of one or more registers designed to hold return addresses. On ARM processors, the BL instruction places the return address in a specific register (LR, or the 'link register') and transfers control to the function.
The ia64 processor does something similar, except that there are several possible registers (b0-b7) that can receive the return address and one will be specified in the instruction (with b0 being the default).
Ideally, the call instruction takes care of that. The address of the next instruction is pushed onto the stack. When the called function (subroutine) completes its work and reaches a return instruction, that address is popped off the stack and control transfers to it.
It depends on the ABI and the architecture, but if the return address does end up on the stack it's a side-effect of the call instruction that puts it there.
call pushes the current value of the RIP register (the return address) onto the stack and then jumps to the target.
ret pops the return address (the one call pushed) from the top of the stack (where the RSP register points) and writes it into the RIP register.
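To make the push/pop pairing concrete, here is a toy model in C. It is only a sketch: toy_cpu, toy_call and toy_ret are made-up names, the stack is a small array, and the instruction length passed to toy_call is an assumption supplied by the caller (a near call with a 32-bit displacement is 5 bytes long).

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of a CPU: an instruction pointer plus a downward-growing stack.
   All names here are hypothetical and exist only for illustration. */
#define TOY_STACK_WORDS 16u

typedef struct {
    uint32_t ip;                      /* instruction pointer (EIP/RIP) */
    uint32_t sp;                      /* index of the top-of-stack slot */
    uint32_t stack[TOY_STACK_WORDS];
} toy_cpu;

void toy_init(toy_cpu *cpu, uint32_t start_ip) {
    cpu->ip = start_ip;
    cpu->sp = TOY_STACK_WORDS;        /* empty stack: sp is one past the end */
}

/* call: push the address of the NEXT instruction, then jump to the target. */
void toy_call(toy_cpu *cpu, uint32_t insn_len, uint32_t target) {
    assert(cpu->sp > 0);              /* no free slot would be a stack overflow */
    cpu->stack[--cpu->sp] = cpu->ip + insn_len;
    cpu->ip = target;
}

/* ret: pop the saved address back into the instruction pointer. */
void toy_ret(toy_cpu *cpu) {
    assert(cpu->sp < TOY_STACK_WORDS); /* something must have been pushed */
    cpu->ip = cpu->stack[cpu->sp++];
}
```

Calling toy_call and then toy_ret leaves ip pointing just past the call and the stack exactly as it was before, which is why no separate push of the return address shows up in the disassembly.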
Example on a GNU/Linux box: function f calls function g, and let's look at the frame of g.
LOW ADDRESS
... <- RSP (stack pointer shows top of stack) register points at this address
g's local vars
f's base pointer (old RBP value) <- RBP (base pointer) register points at this address
f's ret address (old RIP value) (this is what the call (from f) pushed, and what the ret (from g) will pop)
args that f called g with and didn't fit in the registers (I think on Windows this is different)
...
HIGH ADDRESS
g will free the local vars (movq %rbp, %rsp)
g will pop the "old RBP" and store it in RBP register (pop %rbp)
g will ret, which will modify RIP with the value that is stored where RSP points at
Hope it helps
I'm reading a book called The Go Programming Language, and in the 2nd chapter about pointers the following is written
It is perfectly safe for a function to return the address of a local variable. For instance, in the
code below, the local variable v created by this particular call to f will remain in existence even
after the call has returned, and the pointer p will still refer to it:
var p = f()
func f() *int {
v := 1
return &v
}
I totally don't get this; a local variable is supposed to be destroyed after the function returns. Is it because v is maybe allocated on the heap? I know in C if you allocate space using malloc it won't be destroyed after the function returns, because it's on the heap.
Go isn't C. Despite similarities, it is much higher-level. It utilizes a complete runtime with a green thread scheduler and garbage-collecting memory manager. It will never collect memory so long as it has live references.
The Go compiler includes a stage called "escape analysis", where it tracks each variable to see if it "escapes" the function in which it is declared. Any value that can escape is allocated on the heap and managed by garbage collection; otherwise, it is (usually) allocated on the stack.
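Since the question compares this with malloc in C, here is a rough C analogue of what the compiler effectively arranges for f. It is only a sketch: f_escaping is a made-up name, and in Go the heap placement and the eventual freeing are automatic, whereas in C both are manual.

```c
#include <assert.h>
#include <stdlib.h>

/* Rough C analogue of the Go function f: the address of v outlives the call,
   so v cannot live in the function's stack frame and goes on the heap.
   f_escaping is a hypothetical name used only for this illustration. */
int *f_escaping(void) {
    int *v = malloc(sizeof *v);
    assert(v != NULL);   /* sketch only: real code should handle failure */
    *v = 1;
    return v;            /* still valid after the function returns */
}
```

The caller of f_escaping has to call free on the result; in Go the garbage collector reclaims v once nothing refers to it any more.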
You can find more information on the subject:
https://blog.golang.org/ismmkeynote
https://dave.cheney.net/2014/06/07/five-things-that-make-go-fast
https://dougrichardson.org/2016/01/23/go-memory-allocations.html
https://www.agardner.me/golang/garbage/collection/gc/escape/analysis/2015/10/18/go-escape-analysis.html
https://www.ardanlabs.com/blog/2017/05/language-mechanics-on-escape-analysis.html
Exactly how it sounds.
I load the OFFSET of a procedure into a register, then try to call that register:
MOV EBX, OFFSET MyProc
CALL EBX
At first I would assume that this will call the function, however when you call a procedure you don't type CALL OFFSET MyProc, you simply type CALL MyProc.
In C you can call a pointer to a function with the * operator: (*MyProc)();.
Which leads me to wonder if dereferencing the pointer to the function would call the procedure.
CALL [EBX]
However, if I dereference it, MASM tells me that I need to specify a size. The only sizes I am aware of that I could specify are DWORD PTR, WORD PTR, and BYTE PTR, and I don't think that a procedure has a particular size.
To sum it up, can you call a pointer to a procedure simply by directly supplying the pointer as an operand to the call instruction, or would you have to dereference the pointer in the call instruction?
Thanks
Why not CALL OFFSET MyProc - because that would be annoying to type every time, and the inconsistent syntax didn't bother MASM creators much (consider the mov eax,var1 vs mov eax,[ebx], both dereferencing memory).
The call [ebx] would fetch the value stored at the address in ebx and use that as the final target address, so in your case it would try to interpret the first instructions of the procedure as a target address and jump who-knows-where (probably causing an illegal access crash from the OS).
The size required in this case is not a classic integer size but a jump/call address size, such as NEAR PTR or FAR PTR, which determines how many bytes are read from memory. NEAR PTR is 32 bits wide in 32-bit mode and 16 bits in real mode (just the offset part). FAR PTR is 32 bits in real mode (16-bit offset + 16-bit segment) and 48 bits in 32-bit protected mode (32-bit offset + 16-bit segment, which works more like a selector or something; I never actually needed to fully understand this one, so consult your favourite x86 documentation/book for details).
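The difference between the two forms can be sketched in C, since the question already brings up C function pointers. The names below (my_proc, call_via_register, call_via_memory) are made up for the illustration, and the mapping to the assembly in the comments is an analogy rather than actual compiler output.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for MyProc. */
int my_proc(void) { return 42; }

typedef int (*proc_ptr)(void);

/* Like MOV EBX, OFFSET MyProc / CALL EBX:
   the register itself holds the target address. */
int call_via_register(proc_ptr target) {
    assert(target != NULL);
    return target();
}

/* Like CALL DWORD PTR [EBX]:
   EBX holds the address of a memory slot, and the slot holds the target. */
int call_via_memory(const proc_ptr *slot) {
    assert(slot != NULL && *slot != NULL);
    return (*slot)();
}
```

Passing the procedure's own address to call_via_memory would be the C counterpart of the mistake described above: the first bytes of the code would be read as if they were a pointer.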
I read that the SFP is used to restore EBP to its previous value. Why does EBP need to return to its initial value?
When a function is called, the code the compiler generates typically begins the function body by pushing the current EBP value onto the stack and setting EBP (the base pointer/frame pointer) to the current ESP (the stack pointer, which always points to the top of the stack). EBP is then used to access the function's local variables and arguments.
The value of EBP is restored when a function returns so that the calling function can again use it to reach its own locals and arguments.
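The save/restore pairing can be modelled in a few lines of C. This is a toy sketch, not real machine state: frame_cpu, frame_enter and frame_leave are invented names, the stack is an array indexed in words, and the return-address push done by call is left out to keep the focus on EBP.

```c
#include <assert.h>
#include <stdint.h>

#define FRAME_STACK_WORDS 32u

/* Hypothetical toy machine: two registers and a small stack. */
typedef struct {
    uint32_t esp;                     /* index of the top-of-stack slot */
    uint32_t ebp;                     /* frame pointer */
    uint32_t mem[FRAME_STACK_WORDS];
} frame_cpu;

void frame_push(frame_cpu *c, uint32_t v) {
    assert(c->esp > 0);
    c->mem[--c->esp] = v;
}

uint32_t frame_pop(frame_cpu *c) {
    assert(c->esp < FRAME_STACK_WORDS);
    return c->mem[c->esp++];
}

/* Prologue: push ebp; mov ebp, esp; sub esp, <locals> */
void frame_enter(frame_cpu *c, uint32_t local_words) {
    frame_push(c, c->ebp);            /* save the caller's frame pointer */
    c->ebp = c->esp;                  /* start a new frame */
    assert(c->esp >= local_words);
    c->esp -= local_words;            /* reserve room for locals */
}

/* Epilogue: mov esp, ebp; pop ebp */
void frame_leave(frame_cpu *c) {
    c->esp = c->ebp;                  /* discard the locals */
    c->ebp = frame_pop(c);            /* caller's frame pointer is back */
}
```

After a nested frame_enter and frame_leave pair, ebp holds the outer function's value again, so the outer function's EBP-relative accesses keep working.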
Here's a test procedure from a program I'm working on. I pass in some parameters via the stack, one of which is a pointer. When I try to change the value of the dereferenced pointer, the variable isn't updated.
_testProc proc
push bp ;Save base pointer to stack
mov bp, sp ;Set new base pointer
sub sp, 4 ;Allocate stack space for locals
pusha ;Save registers to stack
mov di, [bp + 08] ;Parm 3 - ptr to variable
mov word ptr [di], 10 ; <---- Doesn't work. di contains an address,
; but what it points at doesn't get updated
popa ;Restore registers from stack
mov sp, bp ;Remove local vars by restoring sp
pop bp ;Restore base pointer from stack
ret 6 ;Return and also clean up parms on stack
_testProc endp
The 8086 produces an address by combining the contents of a segment register and an index register; I show that as [SR,IR].
Your update via register di is updating a location defined by [DS,DI]; mov instructions without any special prefix default to using the DS register. If you got the address DI as an offset for some other segment (ES? SS?) then you are in effect combining the wrong registers to hit the address you desire.
Your mistake is in not being clear about what the conventions are for passing a "pointer" to your routine. What you've defined assumes a relative offset from DS.
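To see why the same offset lands somewhere else under a different segment register, here is a small C model of real-mode address formation (physical = segment * 16 + offset, wrapped to 20 bits). The function names and the segment values used with them are made up for illustration; the memory is just a 1 MiB byte array supplied by the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Real-mode 8086 address formation: segment * 16 + offset,
   wrapped to the 20-bit address bus. */
uint32_t phys_addr(uint16_t segment, uint16_t offset) {
    return (((uint32_t)segment << 4) + offset) & 0xFFFFFu;
}

/* Model of "mov word ptr seg:[off], value" on a 1 MiB byte array.
   mem must point to at least 0x100000 bytes (caller's responsibility). */
void store_word(uint8_t *mem, uint16_t segment, uint16_t offset, uint16_t value) {
    assert(mem != NULL);
    mem[phys_addr(segment, offset)] = (uint8_t)(value & 0xFFu);            /* low byte */
    mem[phys_addr(segment, (uint16_t)(offset + 1u))] = (uint8_t)(value >> 8); /* high byte */
}
```

With, say, SS = 0x2000 and DS = 0x1000, offset 0x0100 refers to physical 0x20100 through SS but 0x10100 through DS, so a store through the wrong segment leaves the intended variable untouched.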
The very best thing you can do is to abandon 16-bit segmented code as soon as you can! :)
Failing that, there's "far data" and a "far pointer" to point to it. Your "proc" doesn't say if it's near or far - I assume near (or Parm3 probably isn't where you think it is on the stack... since the far return address is 4 bytes). If the variable you intend to alter is on the stack, you're in for some more complication. mov word ptr ss:[di], 10 at least. If you need to handle either a local or static variable, I think you're going to need a far pointer (4 bytes, segment and offset) to find it.
What first came to my mind is that you say you're trying to change the value of a dereferenced pointer, but you don't "dereference" it (as I understand it). Try mov di, [di] after you get the value off the stack. Easy to try, anyway. :)
If all else fails, show us the calling code. (and get into 32-bit code as soon as you can!)
I have found these few lines of assembly in ollydbg:
MOV ECX,DWORD PTR DS:[xxxxxxxx] ; xxxxxxxx is an address
MOV EDX,DWORD PTR DS:[ECX]
MOV EAX,DWORD PTR DS:[EDX+116]
CALL EAX
Could someone step through and tell me what's happening here?
This is an invocation of a function pointer stored in a struct.
This first line obtains a pointer stored at address DS:xxxxxxxx. The square brackets indicate dereferencing of the address, much like * in C. The value from memory is about to be used as a pointer; it is placed into ecx register.
MOV ECX,DWORD PTR DS:[xxxxxxxx] ; xxxxxxxx is an address
The second line dereferences the pointer obtained above. That value from ecx is now used as the address, which is dereferenced. The value found in memory is another pointer. This second pointer is placed into the edx register.
MOV EDX,DWORD PTR DS:[ECX]
The third line again dereferences memory; this time, the access occurs to an address offset from the pointer obtained above by 0x116 bytes. This is not evenly divisible by four, so this function pointer does not appear to come from a C++ vtable. The value obtained from the memory is this time stored in register eax.
MOV EAX,DWORD PTR DS:[EDX+116]
Finally, the function pointed to by eax is executed. This simply invokes the function via a function pointer. The function appears to take zero arguments, but I have a question on revision of my answer: are there PUSH instructions which precede this snippet? Those would be the function arguments. The question marks indicate this function might return a value; we can't tell from our vantage.
CALL EAX
Overall, the code snippet looks like an invocation of an extension function from a plug-in library to OllyDbg. The OllyDbg ABI specifies various structs which contain some function pointers. There are also arrays of function pointers, but the double-indirection to get to the edx-held pointer (also the not-aligned-by-even-multiple offset) makes me think this is a struct and not an array of function pointers or a C++ class's vtable.
In other words, xxxxxxxx is a pointer to a pointer to a struct containing a function pointer.
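In C terms, that chain of loads might look like the sketch below. Everything in it is hypothetical: plugin_desc, handler_fn and sample_handler are invented for illustration, and the struct layout is not taken from any OllyDbg header. The parameter slot stands for the value that the first MOV loads into ECX.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*handler_fn)(void);

/* Hypothetical struct: some data followed by a function pointer. */
typedef struct {
    char name[16];
    handler_fn handler;
} plugin_desc;

/* Hypothetical function the pointer refers to. */
int sample_handler(void) { return 7; }

/* slot is the value loaded from xxxxxxxx into ECX. */
int invoke_handler(plugin_desc **slot) {
    assert(slot != NULL && *slot != NULL);
    plugin_desc *desc = *slot;        /* MOV EDX, [ECX]        */
    handler_fn fn = desc->handler;    /* MOV EAX, [EDX+offset] */
    assert(fn != NULL);
    return fn();                      /* CALL EAX              */
}
```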
In the OllyDbg source file PlugIn.h are some candidate struct definitions. Here's an example:
typedef struct t_sorted { // Descriptor of sorted table
char name[MAX_PATH]; // Name of table, as appears in error
int n; // Actual number of entries
int nmax; // Maximal number of entries
int selected; // Index of selected entry or -1
ulong seladdr; // Base address of selected entry
int itemsize; // Size of single entry
ulong version; // Unique version of table
void *data; // Entries, sorted by address
SORTFUNC *sortfunc; // Function which sorts data or NULL
DESTFUNC *destfunc; // Destructor function or NULL
int sort; // Sorting criterium (column)
int sorted; // Whether indexes are sorted
int *index; // Indexes, sorted by criterium
int suppresserr; // Suppress multiple overflow errors
} t_sorted;
Those examples are allowed to be NULL, and your asm snippet does not check for NULL pointer in the function pointer. Therefore, it would have to be DRAWFUNC from t_table or SPECFUNC of t_dump.
You could create a small project which includes the header file and uses printf() and offsetof() to determine whether either of those is at an offset of 0x116.
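A minimal sketch of that kind of probe is below. It uses a made-up struct (demo_table), not a real Plugin.h definition; for the real check you would include the header and name its structs and members instead. Offsets depend on the compiler, the target and the packing settings, so the probe would presumably need to be built with the same 32-bit settings as the plug-in interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

typedef void (*draw_fn)(void);

/* Hypothetical struct for illustration only; NOT from Plugin.h. */
typedef struct {
    char name[260];
    int n;
    draw_fn draw;
} demo_table;

/* Prints the byte offset of each member and returns the offset of draw. */
size_t report_offsets(void) {
    /* The first member of a struct always sits at offset 0. */
    assert(offsetof(demo_table, name) == 0);
    printf("name: 0x%zx\n", offsetof(demo_table, name));
    printf("n:    0x%zx\n", offsetof(demo_table, n));
    printf("draw: 0x%zx\n", offsetof(demo_table, draw));
    return offsetof(demo_table, draw);
}
```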
Otherwise, I imagine that the insides of OllyDbg are written in this same style. So there are likely to be private struct definitions (not published in the Plugin.h file) used for various purposes within OllyDbg.
I would like to add, I think it's a shame that OllyDbg sources are not available. I was under the impression that the statically-linked disassembler it contains was under some kind of ?GPL license, but I haven't had any luck getting the sources to OllyDbg.
Take the 32-bit number from the address xxxxxxxx and put it in the ECX register, then use this value as an address, read the value there, and put it in the EDX register. Finally, add 116 to this number and read the value at that address into EAX. Execution then starts at the address now held in EAX. When that code encounters a return opcode, execution will continue after the call instruction.
This is pretty basic assembly. It makes me wonder wtf you are doing with a debugger and when your assignment is due ;-)
It's been awhile since I did ASM (1997) and even then I was only doing i386 ASM so forgive me if my answer isn't all that helpful...
Unfortunately, these 4 lines of code don't tell me much. It's mostly just loading stuff into CPU registers and calling a function.
Specifically, it looks like a pointer is being loaded from that address into your ECX register. The value stored at the address held in ECX is then loaded into EDX (a memory read through the pointer, not a register-to-register copy). Then the value stored at the address in EDX plus an offset of 116 is loaded into the EAX register (your accumulator).
Then whatever function is located at the address now in EAX is executed.
I'm 99% sure it's a virtual method call, considering comments about compiler being MSVC.
MOV ECX,DWORD PTR DS:[xxxxxxxx]
Pointer to a class instance is loaded into ECX from a global variable. (NB: default __thiscall calling convention uses ECX to pass the instance pointer, aka the this pointer).
MOV EDX,DWORD PTR DS:[ECX]
vftable (virtual function table) pointer is usually the first item in the class layout. Here the pointer is loaded into EDX.
MOV EAX,DWORD PTR DS:[EDX+116]
A method pointer at offset 116 (0x74) in the table is loaded into EAX. Since each pointer is 4 bytes, this is the 30th virtual method of the class (116/4 + 1).
CALL EAX
The method is called.
In original C++ it would look something like this:
g_pObject1->method30();
To know more about MSVC's implementation of C++ classes, including virtual methods, see my article here.
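For readers who think better in C than in C++, the dispatch described above can be emulated by hand. This is only a sketch under stated assumptions: struct object, method30 and call_slot are invented names, the table size of 32 is arbitrary, and slot_for_offset assumes 4-byte pointers as on 32-bit x86.

```c
#include <assert.h>
#include <stddef.h>

#define SLOT_COUNT 32u   /* arbitrary table size for this illustration */

struct object;
typedef int (*method_fn)(struct object *self);

/* The table pointer comes first, mirroring the usual vftable layout. */
struct object {
    const method_fn *vftable;
    int value;
};

/* Hypothetical 30th virtual method. */
int method30(struct object *self) { return self->value + 30; }

/* Zero-based slot index for a byte offset, assuming 4-byte pointers. */
size_t slot_for_offset(size_t byte_offset) {
    assert(byte_offset % 4 == 0);
    return byte_offset / 4;
}

/* obj plays the role of ECX (the this pointer). */
int call_slot(struct object *obj, size_t slot) {
    assert(obj != NULL && slot < SLOT_COUNT && obj->vftable[slot] != NULL);
    return obj->vftable[slot](obj);   /* load table, load slot, call */
}
```

Offset 116 maps to zero-based slot 29, which is the 30th entry, matching the 116/4 + 1 arithmetic above.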