What does [BX] mean in this part of code? - pointers

I have this code, which is an alternative way to print a string of characters using the loop instruction.
data segment
mystr db "Hello World!"
ends
code segment
start:
mov ax, data
mov ds, ax
lea bx,mystr
mov cx,50
L1:
mov dl,[BX]
inc BX
cmp dl,'!'
je L2
mov ah,02
int 21h
loop L1
L2:
mov ax, 4c00h
int 21h
ends
end start
The lea instruction loads the address of mystr into the BX register. What does [BX] mean, and why does incrementing BX give us access to different parts of the string?

In Intel-style assembly code, square brackets ([..]) mean dereference -- access the memory pointed at by the thing in the brackets.
So [bx] means access the memory pointed at by the bx register, and mov dl, [bx] means load a byte from that address and put it into dl.
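As a rough C analogy (a sketch, not a translation of the DOS program): a pointer variable plays the role of bx, `*p` plays the role of [bx], and `p++` plays the role of inc bx. The helper name `copy_until_bang` and the output buffer are assumptions made for illustration; the buffer stands in for the int 21h character output.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: examine at most max bytes starting at src and copy
   them to out until a '!' is seen. out must have room for max + 1 bytes.
   Returns the number of bytes copied. */
size_t copy_until_bang(const char *src, size_t max, char *out)
{
    assert(src != NULL && out != NULL);

    const char *p = src;   /* like lea bx, mystr: p holds an address */
    size_t n = 0;

    while (max-- > 0) {    /* like loop L1 with the count in cx */
        char c = *p;       /* like mov dl, [bx]: read the byte p points at */
        p++;               /* like inc bx: p now points at the next byte */
        if (c == '!')      /* like cmp dl, '!' / je L2 */
            break;
        out[n++] = c;      /* stands in for printing dl via int 21h */
    }
    out[n] = '\0';
    return n;
}
```

Incrementing the pointer changes only the address it holds; the string itself never moves, which is why each dereference yields the next character.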

Related

TASM struct initialization and pointer math issues

I am attempting to write a simple DOS test program in assembly using TASM v4.1 that walks through a structure that contains four strings of equal length, but I've hit two issues.
ideal
model small
stack 1024
struc Strings_s
s1 db 32 dup (?)
s2 db 32 dup (?)
s3 db 32 dup (?)
s4 db 32 dup (?)
ends Strings_s
codeseg
start:
mov ax, @data
mov ds, ax ; Set %DS to point to the data segment
mov cx, 4 ; Load loop count
mov si, offset mystrings.s1 ; Load seg offset of first string
start_1:
push si
call putstr ; Print asciiz string
pop si
;add si, offset (Strings_s ptr ds:0).s1 ; ***BROKEN***
add si, offset (Strings_s ptr ds:0).s2 ; FIXED
loop start_1 ; Loop
fin:
mov ax, 4C00h ; [DOS] terminate program
int 21h ; ...
putstr_0:
mov bx, 07h
mov ah, 0Eh ; [BIOS] Display character
int 10h ; ...
putstr:
lodsb ; Get next char from %SI
test al, al ; End of string?
jne putstr_0 ; no, loop
return:
ret ; Return to caller
LF equ 10
CR equ 13
dataseg
mystrings Strings_s <"One string","Two strings","Three strings","Four strings">
end start
The first issue is that I need to terminate the strings I'm declaring in the struct, but adding ,CR,LF,0 is misinterpreted as additional struct members and TASM doesn't see \r\n\0 as escape sequences.
The second issue is that I'm trying to add the length of Strings_s.s1 without hard coding 32 into my code. I first tried using the sizestr directive on the struct member, but even with version t300 defined before the ideal directive, TASM considers it an undefined symbol. So then I tried the example I included using the offset and struct cast, but it ends up being encoded as add si,0.
Ideas?
EDIT: The second issue turned out to be a simple error. You need to offset to the second member of the struct. (code fixed)
EDIT2: The sizestr directive only works against text macros and is really just a simple strlen of the text after the equ directive, so it isn't what I thought it was. I also tried slipping in a Strings_len EQU $-Strings_s between the s1 and s2 members, but it incorrectly equated to 23, not 32.
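The fix in the first EDIT (use the offset of the second member as the stride) has a direct counterpart in C, sketched here only to illustrate the arithmetic. The struct mirrors Strings_s from the question; `STRING_STRIDE` and `nth_string` are hypothetical names, and nothing here says how TASM itself should be made to emit the value.

```c
#include <assert.h>
#include <stddef.h>

/* Same layout as Strings_s in the question: four 32-byte fields. */
struct Strings_s {
    char s1[32];
    char s2[32];
    char s3[32];
    char s4[32];
};

/* Distance from one string to the next, derived from the layout
   rather than hard-coded. offsetof(s1) is 0, which is why using the
   first member as the stride produced "add si,0". */
enum {
    STRING_STRIDE = offsetof(struct Strings_s, s2) - offsetof(struct Strings_s, s1)
};

/* Hypothetical helper: address of the n-th string (n = 0..3). */
const char *nth_string(const struct Strings_s *s, size_t n)
{
    assert(s != NULL && n < 4);
    return (const char *)s + n * STRING_STRIDE;
}
```

In C, initializing a 32-byte array from a shorter string literal zero-fills the remainder, so each field is already terminated; that is a property of C initializers and is not a claim about TASM.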

Recursion in Assembly x86: Fibonacci

I am trying to code a recursive fibonacci sequence in assembly, but it is not working for some reason.
It does not give an error, but the output number is always wrong.
section .bss
_bss_start:
; store max iterations
max_iterations resb 4
_bss_end:
section .text
global _start
; initialise the function variables
_start:
mov dword [max_iterations], 11
mov edx, 0
push 0
push 1
jmp fib
fib:
; clear registers
mov eax, 0
mov ebx, 0
mov ecx, 0
; handle fibonacci math
pop ebx
pop eax
add ecx, eax
add ecx, ebx
push eax
push ebx
push ecx
; increment counter / exit conditions
inc edx
cmp edx, [max_iterations]
je print
; recursive step
call fib
ret
print:
mov eax, 1
pop ebx
int 0x80
For instance, the above code prints a value of 79 rather than 144 (11th fibonacci number).
Alternatively, if I make
mov dword [max_iterations], 4
Then the above code prints 178 rather than 5 (5th fibonacci number).
Anyone have an idea?
As an approach, you should try to debug it with the smallest possible input, like 1 iteration. That will be most revealing, as you can watch it do the wrong thing in great detail without worrying about multiple levels of recursion. When that works, go to 2 iterations.
When you use complex addressing modes, it is harder to debug as we cannot see what the processor is doing.  So, when an instruction using a complex addressing mode doesn't work, and you want to debug it, then split that instruction into 2 instructions as follows:
mov dword [fibonacci_seq + edx + 4], ecx
---
lea esi, [fibonacci_seq + edx + 4]
mov [esi], ecx
With the alternate code sequence, you can observe the value of the addressing mode computation, which will provide you with additional debugging insight.
As another example:
cmp edx, [max_iterations]
---
mov edi, [max_iterations]
cmp edx, edi
Using the 2 instruction version, you will be able to see what value the processor is comparing edx with.
Or better, do that mov load once before the loop, so you're keeping the loop bound in a register all the time. That's what you should normally do when you have enough registers, only using memory when you run out.
You are jumping to fib with jmp from one place in the code and calling it from another. Your logic should still work, because you never return to the main code once you've reached the limit, but mixing main code with a function like this is really bad form. More on that below...
mov dword [fibonacci_seq + edx + 4], ecx
Is this working for you? You're only incrementing edx by 1.  Perhaps you wanted:
mov dword [fibonacci_seq + edx * 4], ecx
I would argue that your code is not really recursive.
call fib ; jumps to fib, pushes a return address
ret ; never, ever reached, so, pointless
---
jmp fib ; iterate w/o pushing unwanted return address onto the stack
The 1-instruction jmp will be superior to the call as a mechanism to iterate, in part because it doesn't push an unnecessary return address onto the stack.
When you debug with 2 iterations, you'll probably see that the unused return address pushed by the call messes up your "parameter" passing: the pops pick up the return address instead of the values you pushed.
To expand on the "recursion": when the iteration stops and control transfers to print, there will be some 11 (depending on iteration count) unused return addresses on the stack (modulo the interference by the pops and pushes).
The recursive call is only used for iteration, the recursion never unwinds.  Thus, I would argue it's not recursive (not even tail recursive) — it just erroneously pushes some unused return addresses onto the stack — that's not recursion.
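To make the distinction concrete, here is a small C sketch contrasting the two shapes. `fib_iter` is what the assembly effectively computes (a loop carrying two running values), while `fib_rec` is genuinely recursive because every call returns to its caller and the results are combined on the way back. Both function names are illustrative, and the indexing convention (fib(0) = 0, fib(1) = 1) is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Iterative form: two running values and a counter, no call stack growth. */
uint32_t fib_iter(unsigned n)
{
    assert(n <= 47);            /* fib(47) is the largest that fits in 32 bits */
    uint32_t a = 0, b = 1;      /* fib(0), fib(1) */
    while (n-- > 0) {
        uint32_t next = a + b;
        a = b;
        b = next;
    }
    return a;
}

/* Recursive form: each call returns, and the caller adds the results.
   Exponential time, so only suitable for small n. */
uint32_t fib_rec(unsigned n)
{
    assert(n <= 47);
    if (n < 2)
        return n;
    return fib_rec(n - 1) + fib_rec(n - 2);
}
```

In the iterative version nothing needs to be saved across steps, which is why a plain jmp (or a loop) is enough in assembly.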
This line is your main problem:
mov dword [fibonacci_seq + edx + 4], ecx
Because of the +4, you never write to the first entry of the "array". And because you only increment EDX by 1, each write to the array overwrites 3 bytes of the previous entry. Try this instead:
mov dword [fibonacci_seq + edx * 4], ecx
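The overlap is easy to see by reproducing both addressing forms with explicit byte offsets. This C sketch is illustrative only: `store_scaled` and `store_plus4` are hypothetical names, and the byte buffer stands in for fibonacci_seq.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Like [base + i*4]: each 4-byte value gets its own slot. */
void store_scaled(uint8_t *buf, const uint32_t *vals, size_t count)
{
    assert(buf != NULL && vals != NULL);
    for (size_t i = 0; i < count; i++)
        memcpy(buf + i * 4, &vals[i], 4);
}

/* Like [base + i + 4]: slot 0 is never written, and each write
   lands one byte after the previous one, clobbering 3 of its bytes.
   buf must hold at least count + 7 bytes. */
void store_plus4(uint8_t *buf, const uint32_t *vals, size_t count)
{
    assert(buf != NULL && vals != NULL);
    for (size_t i = 0; i < count; i++)
        memcpy(buf + i + 4, &vals[i], 4);
}
```

With the values 0x11111111, 0x22222222 and 0x33333333, the unscaled version leaves bytes 0 to 3 untouched and only a single byte of each earlier value survives.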
After a bit of redesign (I had not realised that the call instruction uses the stack in this way), here is the solution:
section .bss
_bss_start:
; store max iterations and current iteration
max_iterations resb 4
iteration resb 4
; store arguments
n_0 resb 4
n_1 resb 4
_bss_end:
section .text
global _start
; initialise the function variables
_start:
mov dword [max_iterations], 11
mov dword [iteration], 0
mov dword [n_0], 0
mov dword [n_1], 1
jmp fib
fib:
mov ecx, 0
mov edx, 0
mov eax, [n_0]
mov ebx, [n_1]
add ecx, eax
add ecx, ebx
mov edx, [n_1]
mov dword [n_0], edx
mov dword [n_1], ecx
mov edx, [iteration]
inc edx
mov dword [iteration], edx
cmp edx, [max_iterations]
je print
call fib
ret
print:
mov eax, 1
mov ebx, [n_1]
int 0x80

Nasm - access struct elements by value and by address

I started to code in NASM assembly lately and my problem is that I don't know how I access struct elements the right way. I already searched for solutions on this site and on google but everywhere I look people say different things. My program is crashing and I have the feeling the problem lies in accessing the structs.
When looking at the example code:
STRUC Test
.normalValue RESD 1
.address RESD 1
ENDSTRUC
TestStruct:
istruc Test
at Test.normalValue dd ffff0000h
at Test.address dd 01234567h
iend
;Example:
mov eax, TestStruct ; moves pointer to first element to eax
mov eax, [TestStruct] ; moves content of the dereferenced pointer to eax (same as mov eax, ffff0000h)
mov eax, TestStruct
add eax, 4
mov ebx, eax ; moves pointer to the second element (4 because RESD 1)
mov eax, [TestStruct+4] ; moves content of the dereferenced pointer to eax (same as mov eax, 01234567h)
mov ebx, [eax] ; moves content at the address 01234567h to ebx
Is that right?
Help is appreciated
I don't know if you have figured it out already, but here is your code with a few small modifications that make it work. All the instructions are correct except the last one, mov ebx, [eax], which fails as expected because it tries to read the content at address 0x1234567, resulting in SIGSEGV.
section .bss
struc Test
normalValue RESD 1
address RESD 1
endstruc
section .data
TestStruct:
istruc Test
at normalValue, dd 0xffff0000
at address, dd 0x01234567
iend
section .text
global _start
_start:
mov eax, TestStruct ; moves pointer to first element to eax
mov eax, [TestStruct] ; moves content of the dereferenced pointer to eax same as mov eax, ffff0000h
mov eax, TestStruct
add eax, 4
mov ebx, eax ; moves pointer to the second element 4 because RESD 1
mov eax, [TestStruct+4] ; moves content of the dereferenced pointer to eax same as mov eax, 01234567h
mov ebx, [eax] ; moves content at the address 01234567h to ebx
Compile, link and run step by step with nasm -f elf64 main.nasm -o main.o; ld main.o -o main; gdb main
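For readers more familiar with C, the same four kinds of access can be sketched as follows. This is an analogy under the assumption that both fields are 32-bit values laid out back to back, as in the struc above; the accessor function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct Test {
    uint32_t normalValue;
    uint32_t address;
};

static struct Test test_struct = { 0xffff0000u, 0x01234567u };

/* The second field sits 4 bytes in, matching TestStruct+4. */
static_assert(offsetof(struct Test, address) == 4, "unexpected padding");

/* Like mov eax, TestStruct: the address of the first element. */
const uint32_t *first_field_ptr(void)
{
    return &test_struct.normalValue;
}

/* Like mov eax, [TestStruct]: the value stored there. */
uint32_t first_field_value(void)
{
    return *first_field_ptr();
}

/* Like mov eax, TestStruct / add eax, 4: address of the second element. */
const uint32_t *second_field_ptr(void)
{
    const unsigned char *base = (const unsigned char *)&test_struct;
    return (const uint32_t *)(base + offsetof(struct Test, address));
}

/* Like mov eax, [TestStruct+4]: the value of the second element. */
uint32_t second_field_value(void)
{
    return *second_field_ptr();
}
```

The sketch deliberately stops before the last step: treating 0x01234567 as a pointer and dereferencing it is exactly the access that faults.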

Issue with strchr() function implementation

I've recently started looking into assembly code and I'm trying to recode some basic system functions to get a grip on it. I'm currently stuck on a segmentation fault at 0x0 in my strchr.
section .text
global strchr
strchr:
xor rax, rax
loop:
cmp BYTE [rdi + rax], 0
jz end
cmp sil, 0
jz end
cmp BYTE [rdi + rax], sil
jz good
inc rax
jmp loop
good:
mov rax, [rdi + rcx]
ret
end:
mov rax, 0
ret
I can't figure out how to debug it using GDB, and the documentation I've come across is pretty limited or hard to understand.
I'm using the following main in C to test
#include <stdio.h>
extern char *strchr(const char *s, int c);
int main () {
const char str[] = "random.string";
const char ch = '.';
char *ret;
ret = strchr(str, ch);
printf("%s\n", ret);
printf("String after |%c| is - |%s|\n", ch, ret);
return(0);
}
The Problem
The instruction immediately following the good label:
mov rax, [rdi + rcx]
should actually be:
lea rax, [rdi + rax]
You weren't using rcx at all, only rax. Also, what you need is the address of that position, not the value stored there (i.e. lea instead of mov).
Some Advice
Note that the typical idiom for comparing sil against zero is actually test sil, sil instead of cmp sil, 0. It would then be:
test sil, sil
jz end
However, if we look at the strchr(3) man page, we can find the following:
char *strchr(const char *s, int c);
The terminating null byte is considered part of the string, so that if c is specified as '\0', these functions return a pointer to the terminator.
So, if we want this strchr() implementation to behave as described in the man page, the following code must be removed:
cmp sil, 0
jz end
The typical zeroing idiom for the rax register is neither mov rax, 0 nor xor rax, rax, but rather xor eax, eax: it doesn't have to encode the immediate zero, and it saves one byte compared with xor rax, rax (writing eax zero-extends into rax).
With the correction and the advice above, the code would look like the following:
section .text
global strchr
strchr:
xor eax, eax
loop:
; Is matched? Checked first so that searching for '\0' finds the terminator
cmp BYTE [rdi + rax], sil
jz good
; Is end of string?
cmp BYTE [rdi + rax], 0
jz end
inc rax
jmp loop
good:
lea rax, [rdi + rax]
ret
end:
xor eax, eax
ret
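For comparison, the behaviour the man page describes can be written in a few lines of C. This is a reference sketch, not the libc implementation; the name `my_strchr` is chosen to avoid clashing with the real function. It tests for a match before testing for the terminator, which is what makes a search for '\0' return a pointer to the terminator rather than NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Return a pointer to the first occurrence of (char)c in s,
   or NULL if it does not occur. The terminator counts as part of s. */
char *my_strchr(const char *s, int c)
{
    assert(s != NULL);
    const char target = (char)c;

    for (;; s++) {
        if (*s == target)       /* match first, so '\0' can be found too */
            return (char *)s;   /* the address, not the byte stored there */
        if (*s == '\0')         /* reached the end without a match */
            return NULL;
    }
}
```

Because a failed search returns NULL, a caller such as the test main above should check the result before passing it to printf with %s.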

How to move the value in a larger-typed register into a smaller-typed destination?

For example, I want to move a DWORD value in a register into a memory location typed WORD, but am getting errors:
mov [arr + eax*TYPE arr], edx ; error: operands must be same size
The [] brackets dereference to an array element of type WORD.
I've tried doing this as well:
mov dx, edx ; error: operands must be same size.
mov [arr + eax*TYPE arr], dx
Also no luck trying to use PTR:
mov dx, WORD PTR edx ; error: invalid use of register
OR
mov WORD PTR [arr + eax*TYPE arr], edx ; error: invalid use of register
OR
mov [arr + eax*TYPE arr], WORD PTR edx ; error: invalid use of register
Solution? Thanks for any help!
The register DX is actually the low 16 bits of the 32-bit register EDX. You don't need to mov dx, edx because DX is already there. So you simply need to store DX in the word-sized variable:
mov [word_variable], dx
Of course, the upper 16 bits of EDX will be lost in such a transfer.
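The same truncation can be sketched in C, where converting a 32-bit value to a 16-bit type keeps only the low half, just as storing DX keeps only the low half of EDX. The function names and the bounds-checked `store_word` helper are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Low 16 bits of a 32-bit value: the part that DX holds for EDX. */
uint16_t low_word(uint32_t value)
{
    return (uint16_t)value;
}

/* Upper 16 bits: the part that is lost when only DX is stored. */
uint16_t high_word(uint32_t value)
{
    return (uint16_t)(value >> 16);
}

/* Store the low word of value into a 16-bit array element,
   analogous to mov [arr + eax*2], dx. */
void store_word(uint16_t *arr, size_t len, size_t index, uint32_t value)
{
    assert(arr != NULL && index < len);
    arr[index] = low_word(value);
}
```

If the upper half matters, it has to be checked or stored separately before the narrowing store.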
