x86_64 On Unix
When I print %rdi 3 times the value is correct for the first print, then the second and third time printing the value is equal to each other but different from the first print ( set value )
Why is the value changing? how do I properly set/get/print the value from the %rdi register without this changing? also an example on how to pass %rdi to a function and maintaining the value on return would be Extremely helpful.
Looked extensively for solutions but i cant find any. Im building a compiler and my professor has given us close to zero documentation on x86_64, wouldn't tell me where to find proper documentation and I'm struggling to grasp what is happening here.
compile with gcc -o test test.s
.section .rodata
.int_wformat: .string "%d\n"
.int_rformat: .string "%d"
.str_wformat: .string "%s"
.nl_const: .string "\n"
.initial_prompt_const: .string "\n\n\tAssembly X86 Test Program:\n----------------------------------------------------\n"
.ext_const: .string "\n----------------------------------------------------\n\t\tProgram Complete\n\n"
.sec_div_const: .string "\n------------------\n"
/* Set and Scan */
.calla_readb_const0: .string "\ta = 10\n\tEnter b = "
/* Arithmetic prints */
.aplusb_const0: .string "\ta + b = "
.aminusb_const0: .string "\ta - b = "
.atimesb_const0: .string "\ta * b = "
.agtb_const0: .string "\tWrite(a>b) : "
.agtb_const1: .string "\ta = 10\n\tb = "
.comm _gp, 16, 4
.text
.globl main
.type main,#function
main: nop
/* Push high address (saved registers) */
pushq %rbp
/* Set the top of the stack ( low address ) to the top of the stack (high address) */
movq %rsp, %rbp
movl $.initial_prompt_const, %ebx
movl $4, (%rdi)
movl (%rdi), %esi
movl $0, %eax
movl $.int_wformat, %edi
call printf
movl (%rdi), %esi
movl $0, %eax
movl $.int_wformat, %edi
call printf
movl (%rdi), %esi
movl $0, %eax
movl $.int_wformat, %edi
call printf
/* Sets %rsp to %rbp, pops top of stack into %rbp */
leave
ret
I have never seen this assembly syntax.
#include "syscall.h"
#include "traps.h"
#define SYSCALL(name) \
.globl name; \
name: \
movl $SYS_ ## name, %eax; \
int $T_SYSCALL; \
ret
SYSCALL(fork)
SYSCALL(exit)
SYSCALL(wait)
SYSCALL(pipe)
SYSCALL(read)
SYSCALL(write)
SYSCALL(close)
SYSCALL(kill)
SYSCALL(exec)
SYSCALL(open)
SYSCALL(mknod)
SYSCALL(unlink)
SYSCALL(fstat)
SYSCALL(link)
SYSCALL(mkdir)
SYSCALL(chdir)
SYSCALL(dup)
SYSCALL(getpid)
SYSCALL(sbrk)
SYSCALL(sleep)
SYSCALL(uptime)
For assembly language file with extension .S, gcc will use a C preprocessor.
In C, \ at the end of line means that "connect the next line to this line".
For that reason, the macro becomes
#define SYSCALL(name) .globl name; name: movl $SYS_ ## name, %eax; int $T_SYSCALL; ret
## operator will concatenate the tokens in its left and right.
Therefore, for example, SYSCALL(fork) will be expanded to
.globl fork; fork: movl $SYS_fork, %eax; int $T_SYSCALL; ret
This means
make the identifire fork public
define a label fork (this will work as a function)
in this function
assign an immediate value SYS_fork to register %eax
generate an interrupt with code T_SYSCALL
return from this function
I am working on minix 3.1.7,
I want to use a macro "SAVE_PROCESS_CTX" defined in "/usr/src/kernel/arch/i386/sconst.h" in my assembly file in "/usr/src/kernel/test.s":
/usr/src/kernel/arch/i386/sconst.h:
#ifndef __SCONST_H__
#define __SCONST_H__
#include "kernel/const.h"
/* Miscellaneous constants used in assembler code. */
W = _WORD_SIZE /* Machine word size. */
/* Offsets in struct proc. They MUST match proc.h. */
P_STACKBASE = 0
GSREG = P_STACKBASE
FSREG = GSREG+2 /* 386 introduces FS and GS segments*/
ESREG = FSREG+2
DSREG = ESREG+2
DIREG = DSREG+2
SIREG = DIREG+W
BPREG = SIREG+W
STREG = BPREG+W /* hole for another SP*/
BXREG = STREG+W
DXREG = BXREG+W
CXREG = DXREG+W
AXREG = CXREG+W
RETADR = AXREG+W /* return address for save() call*/
PCREG = RETADR+W
CSREG = PCREG+W
PSWREG = CSREG+W
SPREG = PSWREG+W
SSREG = SPREG+W
P_STACKTOP = SSREG+W
FP_SAVE_AREA_P = P_STACKTOP
P_LDT_SEL = FP_SAVE_AREA_P + 532
P_CR3 = P_LDT_SEL+W
P_CR3_V = P_CR3+4
P_LDT = P_CR3_V+W
P_MISC_FLAGS = P_LDT + 50
Msize = 9 /* size of a message in 32-bit words*/
/*
* offset to current process pointer right after trap, we assume we always have
* error code on the stack
*/
#define CURR_PROC_PTR 20
/*
* tests whether the interrupt was triggered in kernel. If so, jump to the
* label. Displacement tell the macro ha far is the CS value saved by the trap
* from the current %esp. The kernel code segment selector has the lower 3 bits
* zeroed
*/
#define TEST_INT_IN_KERNEL(displ, label) \
cmpl $CS_SELECTOR, displ(%esp) ;\
je label ;
/*
* saves the basic interrupt context (no error code) to the process structure
*
* displ is the displacement of %esp from the original stack after trap
* pptr is the process structure pointer
* tmp is an available temporary register
*/
#define SAVE_TRAP_CTX(displ, pptr, tmp) \
movl (0 + displ)(%esp), tmp ;\
movl tmp, PCREG(pptr) ;\
movl (4 + displ)(%esp), tmp ;\
movl tmp, CSREG(pptr) ;\
movl (8 + displ)(%esp), tmp ;\
movl tmp, PSWREG(pptr) ;\
movl (12 + displ)(%esp), tmp ;\
movl tmp, SPREG(pptr) ;\
movl tmp, STREG(pptr) ;\
movl (16 + displ)(%esp), tmp ;\
movl tmp, SSREG(pptr) ;
#define SAVE_SEGS(pptr) \
mov %ds, %ss:DSREG(pptr) ;\
mov %es, %ss:ESREG(pptr) ;\
mov %fs, %ss:FSREG(pptr) ;\
mov %gs, %ss:GSREG(pptr) ;
#define RESTORE_SEGS(pptr) \
movw %ss:DSREG(pptr), %ds ;\
movw %ss:ESREG(pptr), %es ;\
movw %ss:FSREG(pptr), %fs ;\
movw %ss:GSREG(pptr), %gs ;
/*
* restore kernel segments, %ss is kernnel data segment, %cs is aready set and
* %fs, %gs are not used
*/
#define RESTORE_KERNEL_SEGS \
mov %ss, %si ;\
mov %si, %ds ;\
mov %si, %es ;\
movw $0, %si ;\
mov %si, %gs ;\
mov %si, %fs ;
#define SAVE_GP_REGS(pptr) \
mov %eax, %ss:AXREG(pptr) ;\
mov %ecx, %ss:CXREG(pptr) ;\
mov %edx, %ss:DXREG(pptr) ;\
mov %ebx, %ss:BXREG(pptr) ;\
mov %esi, %ss:SIREG(pptr) ;\
mov %edi, %ss:DIREG(pptr) ;
#define RESTORE_GP_REGS(pptr) \
movl %ss:AXREG(pptr), %eax ;\
movl %ss:CXREG(pptr), %ecx ;\
movl %ss:DXREG(pptr), %edx ;\
movl %ss:BXREG(pptr), %ebx ;\
movl %ss:SIREG(pptr), %esi ;\
movl %ss:DIREG(pptr), %edi ;
/*
* save the context of the interrupted process to the structure in the process
* table. It pushses the %ebp to stack to get a scratch register. After %esi is
* saved, we can use it to get the saved %ebp from stack and save it to the
* final location
*
* displ is the stack displacement. In case of an exception, there are two extra
* value on the stack - error code and the exception number
*/
#define SAVE_PROCESS_CTX_NON_LAZY(displ) \
\
cld /* set the direction flag to a known state */ ;\
\
push %ebp ;\
;\
movl (CURR_PROC_PTR + 4 + displ)(%esp), %ebp ;\
;\
/* save the segment registers */ \
SAVE_SEGS(%ebp) ;\
\
SAVE_GP_REGS(%ebp) ;\
pop %esi /* get the orig %ebp and save it */ ;\
mov %esi, %ss:BPREG(%ebp) ;\
\
RESTORE_KERNEL_SEGS ;\
SAVE_TRAP_CTX(displ, %ebp, %esi) ;
#define SAVE_PROCESS_CTX(displ) \
SAVE_PROCESS_CTX_NON_LAZY(displ) ;\
push %eax ;\
push %ebx ;\
push %ecx ;\
push %edx ;\
push %ebp ;\
call _save_fpu ;\
pop %ebp ;\
pop %edx ;\
pop %ecx ;\
pop %ebx ;\
pop %eax ;
/*
* clear the IF flag in eflags which are stored somewhere in memory, e.g. on
* stack. iret or popf will load the new value later
*/
#define CLEAR_IF(where) \
mov where, %eax ;\
andl $0xfffffdff, %eax ;\
mov %eax, where ;
#endif /* __SCONST_H__ */
in my file "/usr/src/kernel/test.s":
#include "arch/i386/sconst.h"
.sect .text
.align 16
save_context:
SAVE_PROCESS_CTX(0)
But I recieve error :syntax error at the line where i call macro SAVE_PROCESS_CTX(0)
What is wrong?
here's a file "mpx.S"that already exist in "usr/src/kernel/arch/i386" and uses the same macro "SAVE_PROCESS_CTX"and it compiles with no problem ,but when call the same macro "SAVE_PROCESS_CTX" i can't compile the file ,here's the file :
/*
* This file is part of the lowest layer of the MINIX kernel. (The other part
* is "proc.c".) The lowest layer does process switching and message handling.
* Furthermore it contains the assembler startup code for Minix and the 32-bit
* interrupt handlers. It cooperates with the code in "start.c" to set up a
* good environment for main().
*
* Kernel is entered either because of kernel-calls, ipc-calls, interrupts or
* exceptions. TSS is set so that the kernel stack is loaded. The user cotext is
* saved to the proc table and the handler of the event is called. Once the
* handler is done, switch_to_user() function is called to pick a new process,
* finish what needs to be done for the next process to run, sets its context
* and switch to userspace.
*
* For communication with the boot monitor at startup time some constant
* data are compiled into the beginning of the text segment. This facilitates
* reading the data at the start of the boot process, since only the first
* sector of the file needs to be read.
*
* Some data storage is also allocated at the end of this file. This data
* will be at the start of the data segment of the kernel and will be read
* and modified by the boot monitor before the kernel starts.
*/
#include "kernel/kernel.h" /* configures the kernel */
/* sections */
#include <machine/vm.h>
#ifdef __ACK__
.text
begtext:
#ifdef __ACK__
.rom
#else
.data
#endif
begrom:
.data
begdata:
.bss
begbss:
#endif
#include <minix/config.h>
#include <minix/const.h>
#include <minix/com.h>
#include <machine/interrupt.h>
#include "archconst.h"
#include "kernel/const.h"
#include "kernel/proc.h"
#include "sconst.h"
/* Selected 386 tss offsets. */
#define TSS3_S_SP0 4
/*
* Exported functions
* Note: in assembly language the .define statement applied to a function name
* is loosely equivalent to a prototype in C code -- it makes it possible to
* link to an entity declared in the assembly code but does not create
* the entity.
*/
.globl _restore_user_context
.globl _reload_cr3
.globl _divide_error
.globl _single_step_exception
.globl _nmi
.globl _breakpoint_exception
.globl _overflow
.globl _bounds_check
.globl _inval_opcode
.globl _copr_not_available
.globl _double_fault
.globl _copr_seg_overrun
.globl _inval_tss
.globl _segment_not_present
.globl _stack_exception
.globl _general_protection
.globl _page_fault
.globl _copr_error
.globl _alignment_check
.globl _machine_check
.globl _simd_exception
.globl _params_size
.globl _params_offset
.globl _mon_ds
.globl _switch_to_user
.globl _save_fpu
.globl _hwint00 /* handlers for hardware interrupts */
.globl _hwint01
.globl _hwint02
.globl _hwint03
.globl _hwint04
.globl _hwint05
.globl _hwint06
.globl _hwint07
.globl _hwint08
.globl _hwint09
.globl _hwint10
.globl _hwint11
.globl _hwint12
.globl _hwint13
.globl _hwint14
.globl _hwint15
/* Exported variables. */
.globl begbss
.globl begdata
.text
/*===========================================================================*/
/* MINIX */
/*===========================================================================*/
.globl MINIX
MINIX:
/* this is the entry point for the MINIX kernel */
jmp over_flags /* skip over the next few bytes */
.short CLICK_SHIFT /* for the monitor: memory granularity */
flags:
/* boot monitor flags:
* call in 386 mode, make bss, make stack,
* load high, don't patch, will return,
* uses generic INT, memory vector,
* new boot code return
*/
.short 0x01FD
nop /* extra byte to sync up disassembler */
over_flags:
/* Set up a C stack frame on the monitor stack. (The monitor sets cs and ds */
/* right. The ss descriptor still references the monitor data segment.) */
movzwl %sp, %esp /* monitor stack is a 16 bit stack */
push %ebp
mov %esp, %ebp
push %esi
push %edi
cmp $0, 4(%ebp) /* monitor return vector is */
je noret /* nonzero if return possible */
incl _mon_return
noret:
movl %esp, _mon_sp /* save stack pointer for later return */
/* Copy the monitor global descriptor table to the address space of kernel and */
/* switch over to it. Prot_init() can then update it with immediate effect. */
sgdt _gdt+GDT_SELECTOR /* get the monitor gdtr */
movl _gdt+GDT_SELECTOR+2, %esi /* absolute address of GDT */
mov $_gdt, %ebx /* address of kernel GDT */
mov $8*8, %ecx /* copying eight descriptors */
copygdt:
movb %es:(%esi), %al
movb %al, (%ebx)
inc %esi
inc %ebx
loop copygdt
movl _gdt+DS_SELECTOR+2, %eax /* base of kernel data */
and $0x00FFFFFF, %eax /* only 24 bits */
add $_gdt, %eax /* eax = vir2phys(gdt) */
movl %eax, _gdt+GDT_SELECTOR+2 /* set base of GDT */
lgdt _gdt+GDT_SELECTOR /* switch over to kernel GDT */
/* Locate boot parameters, set up kernel segment registers and stack. */
mov 8(%ebp), %ebx /* boot parameters offset */
mov 12(%ebp), %edx /* boot parameters length */
mov 16(%ebp), %eax /* address of a.out headers */
movl %eax, _aout
mov %ds, %ax /* kernel data */
mov %ax, %es
mov %ax, %fs
mov %ax, %gs
mov %ax, %ss
mov $_k_boot_stktop, %esp /* set sp to point to the top of kernel stack */
/* Save boot parameters into these global variables for i386 code */
movl %edx, _params_size
movl %ebx, _params_offset
movl $SS_SELECTOR, _mon_ds
/* Call C startup code to set up a proper environment to run main(). */
push %edx
push %ebx
push $SS_SELECTOR
push $DS_SELECTOR
push $CS_SELECTOR
call _cstart /* cstart(cs, ds, mds, parmoff, parmlen) */
add $5*4, %esp
/* Reload gdtr, idtr and the segment registers to global descriptor table set */
/* up by prot_init(). */
lgdt _gdt+GDT_SELECTOR
lidt _gdt+IDT_SELECTOR
ljmp $CS_SELECTOR, $csinit
csinit:
movw $DS_SELECTOR, %ax
mov %ax, %ds
mov %ax, %es
mov %ax, %fs
mov %ax, %gs
mov %ax, %ss
movw $TSS_SELECTOR, %ax /* no other TSS is used */
ltr %ax
push $0 /* set flags to known good state */
popf /* esp, clear nested task and int enable */
jmp _main /* main() */
/*===========================================================================*/
/* interrupt handlers */
/* interrupt handlers for 386 32-bit protected mode */
/*===========================================================================*/
#define PIC_IRQ_HANDLER(irq) \
push $irq ;\
call _irq_handle /* intr_handle(irq_handlers[irq]) */ ;\
add $4, %esp ;
/*===========================================================================*/
/* hwint00 - 07 */
/*===========================================================================*/
/* Note this is a macro, it just looks like a subroutine. */
#define hwint_master(irq) \
TEST_INT_IN_KERNEL(4, 0f) ;\
\
**SAVE_PROCESS_CTX(0)** ;\
push %ebp ;\
movl $0, %ebp /* for stack trace */ ;\
call _context_stop ;\
add $4, %esp ;\
PIC_IRQ_HANDLER(irq) ;\
movb $END_OF_INT, %al ;\
outb $INT_CTL /* reenable interrupts in master pic */ ;\
jmp _switch_to_user ;\
\
0: \
pusha ;\
call _context_stop_idle ;\
PIC_IRQ_HANDLER(irq) ;\
movb $END_OF_INT, %al ;\
outb $INT_CTL /* reenable interrupts in master pic */ ;\
CLEAR_IF(10*4(%esp)) ;\
popa ;\
iret ;
and this is the file "sconst.h" in"usr/src/kernel/arch/i386" which contains the macro definition:
#ifndef __SCONST_H__
#define __SCONST_H__
#include "kernel/const.h"
/* Miscellaneous constants used in assembler code. */
W = _WORD_SIZE /* Machine word size. */
/* Offsets in struct proc. They MUST match proc.h. */
P_STACKBASE = 0
GSREG = P_STACKBASE
FSREG = GSREG+2 /* 386 introduces FS and GS segments*/
ESREG = FSREG+2
DSREG = ESREG+2
DIREG = DSREG+2
SIREG = DIREG+W
BPREG = SIREG+W
STREG = BPREG+W /* hole for another SP*/
BXREG = STREG+W
DXREG = BXREG+W
CXREG = DXREG+W
AXREG = CXREG+W
RETADR = AXREG+W /* return address for save() call*/
PCREG = RETADR+W
CSREG = PCREG+W
PSWREG = CSREG+W
SPREG = PSWREG+W
SSREG = SPREG+W
P_STACKTOP = SSREG+W
FP_SAVE_AREA_P = P_STACKTOP
P_LDT_SEL = FP_SAVE_AREA_P + 532
P_CR3 = P_LDT_SEL+W
P_CR3_V = P_CR3+4
P_LDT = P_CR3_V+W
P_MISC_FLAGS = P_LDT + 50
Msize = 9 /* size of a message in 32-bit words*/
/*
* offset to current process pointer right after trap, we assume we always have
* error code on the stack
*/
#define CURR_PROC_PTR 20
/*
* tests whether the interrupt was triggered in kernel. If so, jump to the
* label. Displacement tell the macro ha far is the CS value saved by the trap
* from the current %esp. The kernel code segment selector has the lower 3 bits
* zeroed
*/
#define TEST_INT_IN_KERNEL(displ, label) \
cmpl $CS_SELECTOR, displ(%esp) ;\
je label ;
/*
* saves the basic interrupt context (no error code) to the process structure
*
* displ is the displacement of %esp from the original stack after trap
* pptr is the process structure pointer
* tmp is an available temporary register
*/
#define SAVE_TRAP_CTX(displ, pptr, tmp) \
movl (0 + displ)(%esp), tmp ;\
movl tmp, PCREG(pptr) ;\
movl (4 + displ)(%esp), tmp ;\
movl tmp, CSREG(pptr) ;\
movl (8 + displ)(%esp), tmp ;\
movl tmp, PSWREG(pptr) ;\
movl (12 + displ)(%esp), tmp ;\
movl tmp, SPREG(pptr) ;\
movl tmp, STREG(pptr) ;\
movl (16 + displ)(%esp), tmp ;\
movl tmp, SSREG(pptr) ;
#define SAVE_SEGS(pptr) \
mov %ds, %ss:DSREG(pptr) ;\
mov %es, %ss:ESREG(pptr) ;\
mov %fs, %ss:FSREG(pptr) ;\
mov %gs, %ss:GSREG(pptr) ;
#define RESTORE_SEGS(pptr) \
movw %ss:DSREG(pptr), %ds ;\
movw %ss:ESREG(pptr), %es ;\
movw %ss:FSREG(pptr), %fs ;\
movw %ss:GSREG(pptr), %gs ;
/*
* restore kernel segments, %ss is kernnel data segment, %cs is aready set and
* %fs, %gs are not used
*/
#define RESTORE_KERNEL_SEGS \
mov %ss, %si ;\
mov %si, %ds ;\
mov %si, %es ;\
movw $0, %si ;\
mov %si, %gs ;\
mov %si, %fs ;
#define SAVE_GP_REGS(pptr) \
mov %eax, %ss:AXREG(pptr) ;\
mov %ecx, %ss:CXREG(pptr) ;\
mov %edx, %ss:DXREG(pptr) ;\
mov %ebx, %ss:BXREG(pptr) ;\
mov %esi, %ss:SIREG(pptr) ;\
mov %edi, %ss:DIREG(pptr) ;
#define RESTORE_GP_REGS(pptr) \
movl %ss:AXREG(pptr), %eax ;\
movl %ss:CXREG(pptr), %ecx ;\
movl %ss:DXREG(pptr), %edx ;\
movl %ss:BXREG(pptr), %ebx ;\
movl %ss:SIREG(pptr), %esi ;\
movl %ss:DIREG(pptr), %edi ;
/*
* save the context of the interrupted process to the structure in the process
* table. It pushses the %ebp to stack to get a scratch register. After %esi is
* saved, we can use it to get the saved %ebp from stack and save it to the
* final location
*
* displ is the stack displacement. In case of an exception, there are two extra
* value on the stack - error code and the exception number
*/
#define SAVE_PROCESS_CTX_NON_LAZY(displ) \
\
cld /* set the direction flag to a known state */ ;\
\
push %ebp ;\
;\
movl (CURR_PROC_PTR + 4 + displ)(%esp), %ebp ;\
;\
/* save the segment registers */ \
SAVE_SEGS(%ebp) ;\
\
SAVE_GP_REGS(%ebp) ;\
pop %esi /* get the orig %ebp and save it */ ;\
mov %esi, %ss:BPREG(%ebp) ;\
\
RESTORE_KERNEL_SEGS ;\
SAVE_TRAP_CTX(displ, %ebp, %esi) ;
**#define SAVE_PROCESS_CTX(displ) \**
SAVE_PROCESS_CTX_NON_LAZY(displ) ;\
push %eax ;\
push %ebx ;\
push %ecx ;\
push %edx ;\
push %ebp ;\
call _save_fpu ;\
pop %ebp ;\
pop %edx ;\
pop %ecx ;\
pop %ebx ;\
pop %eax ;
I am trying to compile my 1st Assembler program for UNIX, but get a lot of errors. For example, this code (expected to read and write a number from keyboard) gives me "Segmentation fault" message:
.data
.code32
printf_format:
.string "%d\n"
scanf_format:
.string "%d"
number:
.space 4
.text
.globl main
main:
pushl $number
pushl $scanf_format
call scanf
addl $8, %esp
pushl number
pushl $printf_format
call printf
addl $8, %esp
movl $0, %eax
ret
Your example works fine, if you compile it with gcc -m32 example.s. If you use a GAS/LD-combination, you cannot terminate it with ret. Use instead:
pushl $0
call exit
As far as I understand, a tail recursive function calls itself at the very last step (like the return statement) however, the first instance of the function is not terminated until all the other instances are terminated as well, thus we get a stack overflow after a number of instances are reached. Given that the recursion is at the very last step, is there any way to terminate the previous instance during or right before the next instance? If the only purpose of an instance is to call the next one, there is no reason why it should reside in memory, right?
Yes, some compilers will optimize tail recursion so that it doesn't require extra stack space. For example, let's look at this example C function:
int braindeadrecursion(int n)
{
if (n == 0)
return 1;
return braindeadrecursion(n - 1);
}
This function is pretty easy; it just returns 1. Without optimizations, clang generated code on my machine that looks like this:
_braindeadrecursion:
00 pushq %rbp
01 movq %rsp,%rbp
04 subq $0x10,%rsp
08 movl %edi,0xf8(%rbp)
0b movl 0xf8(%rbp),%eax
0e cmpl $_braindeadrecursion,%eax
11 jne 0x0000001c
13 movl $0x00000001,0xfc(%rbp)
1a jmp 0x0000002e
1c movl 0xf8(%rbp),%eax
1f subl $0x01,%eax
22 movl %eax,%edi
24 callq _braindeadrecursion
29 movl %eax,%ecx
2b movl %ecx,0xfc(%rbp)
2e movl 0xfc(%rbp),%eax
31 addq $0x10,%rsp
35 popq %rbp
36 ret
As you can see, the recursive call is in there at 0x24. Now, let's try with higher optimization:
_braindeadrecursion:
00 pushq %rbp
01 movq %rsp,%rbp
04 movl $0x00000001,%eax
09 popq %rbp
0a ret
Now, look at that - no more recursion at all! This example is pretty simple, but the optimization can still occur on more complicated tail recursive cases.
Either:
- you can increase the stack size or
- do not use recursion, but instead use some sort of looping.