The offending line:
8048f70: ff 24 85 00 a4 04 08 jmp *0x804a400(,%eax,4)
There is no instruction in the disassembled code at location 804a400 (my list ends at 804a247)
When I check to see what's at that memory location I get:
(gdb) x/c 0x804a40c
0x804a40c: -103 '\231'
(gdb) x/t 0x804a40c
0x804a40c: 10011001
(gdb) x/s 0x804a40c
0x804a40c: "\231\217\004\b\222\217\004\b\211\217\004\b\202\217\004\bw\217\004\b\002"
(gdb) x/3x 0x804a40c
0x804a40c: 0x99 0x8f 0x04
What exactly is this jmp statement trying to do?
That instruction is an indirect jump. This means that the memory address specified is not the jump target, but a pointer to the jump target.
First, the instruction loads the value at the memory address:
*0x804a400(,%eax,4)
which is more sensibly written as:
0x804a400 + %eax * 4 // %eax can be negative
It then sets %eip to the value it loaded, so execution continues at the address stored in that table entry, not at 0x804a400 itself.
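As a rough illustration, the lookup can be replayed in Python using the bytes gdb printed in the question. The names TABLE_BASE, DUMP_ADDR, DUMP and jump_target are made up for this sketch, and only the five entries visible in the x/s output are modelled.

```python
import struct

TABLE_BASE = 0x804A400   # displacement taken from the jmp instruction
DUMP_ADDR = 0x804A40C    # address that was passed to gdb's x/s in the question

# Bytes from the x/s output, with the escapes written as hex
# (\231 = 0x99, \217 = 0x8f, \004 = 0x04, \b = 0x08, 'w' = 0x77).
DUMP = bytes.fromhex("998f0408" "928f0408" "898f0408" "828f0408" "778f0408")


def jump_target(eax):
    """Return the address that jmp *0x804a400(,%eax,4) would load into %eip."""
    slot = (TABLE_BASE + eax * 4) & 0xFFFFFFFF      # address of the table entry
    offset = slot - DUMP_ADDR
    if not 0 <= offset <= len(DUMP) - 4:
        raise ValueError("entry at %#x is outside the dumped bytes" % slot)
    (target,) = struct.unpack_from("<I", DUMP, offset)  # little-endian pointer
    return target


for eax in range(3, 8):
    print(eax, hex(jump_target(eax)))
```

With %eax = 3 the entry at 0x804a40c is read, which gives 0x8048f99. All five entries point just past the jmp at 0x8048f70, which is what a compiled switch statement typically looks like.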
The best way to decipher these is to use the Intel Programmer's Reference manual. Table 2-2 in Volume 2A provides a breakdown of the ModR/M byte and, in this case, the SIB byte as well.
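For the instruction in the question, the field split can be sketched as follows. This is a minimal decoder for this one encoding only (opcode FF with a SIB byte and a 32-bit displacement); the function name decode_jmp_indirect is hypothetical.

```python
def decode_jmp_indirect(code):
    """Split 'ff 24 85 00 a4 04 08' into opcode, ModR/M, SIB and disp32 fields."""
    opcode, modrm, sib = code[0], code[1], code[2]
    return {
        "opcode": opcode,                 # 0xFF: group 5, the reg field picks the operation
        "mod": modrm >> 6,                # 0 = memory operand, no displacement by default
        "reg": (modrm >> 3) & 7,          # 4 = JMP r/m32 (near, absolute indirect)
        "rm": modrm & 7,                  # 4 = a SIB byte follows
        "scale": 1 << (sib >> 6),         # multiplier applied to the index register
        "index": (sib >> 3) & 7,          # 0 = %eax
        "base": sib & 7,                  # 5 with mod 0 = no base register, disp32 follows
        "disp32": int.from_bytes(code[3:7], "little"),
    }


fields = decode_jmp_indirect(bytes.fromhex("ff248500a40408"))
print(fields)
```

The result is scale 4, index 0 (%eax), no base register and displacement 0x804a400, which matches the operand *0x804a400(,%eax,4).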
How do you simulate the RTI (Real Time Interrupt) in the 68HC11 THRSim11 simulator (see http://www.hc11.demon.nl/thrsim11/thrsim11.htm)? The following program works on the 68HC11 module but not in THRSim11. It's a test program that reads from the analog-to-digital converter and sends the results to the serial port using the RTI. I tried the RTI interrupt vectors 00EB and FFF0. My chip is the 68HC711E9 with the following memory map.
I expected THRSim11 to simulate the interrupt vector. While the program sits in the "again BRA again" loop just after CLI (enable interrupts), it should keep running the routine that reads from the ADC and writes to the serial port. It works perfectly on my 68HC711E9 evaluation board with Buffalo.
REGBS EQU $1000 ;start of registers
BAUD EQU REGBS+$2B ;sci baud reg
SCCR1 EQU REGBS+$2C ;sci control1 reg
SCCR2 EQU REGBS+$2D ;sci control2 reg
SCSR EQU REGBS+$2E ;sci status reg
SCDR EQU REGBS+$2F ;sci data reg
TMSK2 EQU REGBS+$24 ;Timer Interrupt Mask Register 2
TFLG2 EQU REGBS+$25 ;Timer Interrupt Flag Register 2
ADR3 EQU $1033 ;ADC address 3
OPTION EQU $1039 ;ADC enable
SCS EQU $2E ;SCSR low bit
ADCTL EQU $1030 ;ADC setting
ADCT EQU $30 ;ADC setting low bit
PACTL EQU $1026 ;Pulse Accumulator control
***************************************************************
* Main program starts here *
***************************************************************
ORG $0110
* ORG $E000
start LDS #$01FF ;set stack pointer
JSR ONSCI ;initialize serial port
JSR t_init ;initialize timer
CLI ;enable interrupts
again BRA again
************************************************************
* t_init - Initialize the RTI timer
************************************************************
t_init LDAA #$01 ; set RTR1 and RTR0 to 0 and 1
STAA PACTL ;which leads to an RTI rate of 8.19 ms
LDAA #$40
STAA TFLG2 ;clears RTIF flag (write 1 in it!)
STAA TMSK2 ;sets RTII to allow interrupts
RTS
************************************************************
* ADC_SERIAL - timer overflow interrupt service routine
************************************************************
ADC_SERIAL
LDX #REGBS
LDAA #%00010010
STAA ADCTL
LDAB #6
ADF00 DECB
BNE ADF00
ldaa ADR3 ; read ADC value
ldab SCSR ; read first Status
staa SCDR ; save in TX Register
BUFFS BRCLR SCS,X #$80 BUFFS
CLRFLG LDAA #$40
STAA TFLG2 ;clear RTIF
RTI ;return from ISR
************************************************************
* ONSCI() - Initialize the SCI for 9600
* baud at 8 MHz
************************************************************
ONSCI LDAA #$30
STAA BAUD baud register
LDAA #$00
STAA SCCR1
LDAA #$0C
STAA SCCR2 enable
LDAA #%10011010 ; enable the ADC
STAA OPTION
RTS
* Interrupt Vectors for BUFFALO monitor
* ORG $FFF0 ;RTI vector for microcontroller
*
ORG $00EB ;Real Time Interrupt under Buffalo monitor
JMP ADC_SERIAL ;this instruction is executed every
* time there is a timer overflow
Presumably you mixed up "vector table" and "jump table". The HC11 expects an address at $FFF0, not an instruction.
In contrast, the Buffalo monitor expects an instruction at $00EB.
ORG $FFF0 ;RTI vector for microcontroller
FDB ADC_SERIAL
ORG $FFFE ;Reset vector for microcontroller
FDB start
As you will note, the same holds true for the reset vector at $FFFE.
With these changes it works for me. Be aware that the simulation is really slow*, depending on the number and kind of views opened.
Another side note: You send the single byte of the conversion result without further processing. The serial receiver view of the simulator will try to interpret this byte as an ASCII character and, only if that fails, show a decimal number in angle brackets. You might want to convert the conversion result into a human-readable value. The simplest solution may be a hex representation.
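The hex conversion itself is small. Here is a sketch of the logic in Python (not HC11 code; the name byte_to_hex_ascii is made up). On the HC11 the same idea would be two nibble extractions and two table lookups before writing to SCDR.

```python
HEX_DIGITS = "0123456789ABCDEF"


def byte_to_hex_ascii(value):
    """Turn one ADC result byte into the two ASCII characters to transmit."""
    high = (value >> 4) & 0x0F      # upper nibble selects the first digit
    low = value & 0x0F              # lower nibble selects the second digit
    return bytes([ord(HEX_DIGITS[high]), ord(HEX_DIGITS[low])])


print(byte_to_hex_ascii(0x99))
```

Every value from $00 to $FF then arrives as two printable characters, so the terminal never has to guess.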
EDIT:
*) The PC running a simulator needs to be many times faster than the original machine; how many times depends on how the simulation is implemented. In this case, they seem to have chosen a rather slow approach. The documentation has a few words on this. To gain some speed, close any view you don't need, and use the fastest PC you can get. To get a feel for it, think about how slow a simulation would be if it modeled the analog electronics of every semiconductor on the chip. And even that is just a model; the "real" world currently starts at quantum mechanics.
Without further measure, you cannot use Buffalo's jump table entries, because the Buffalo monitor is not included in the simulator.
If you want to use an unmodified version of your firmware, you will need to add at least the used parts of the Buffalo monitor. If you have the monitor as a file loadable by the simulator, you might want to load it before loading your application.
The least you could do is to provide the jump table yourself, placing the appropriate address of the jump in the vector:
ORG $FFF0 ;RTI vector for microcontroller
FDB $00EB
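To make the difference between the two tables concrete, here is a small Python model of what happens on an RTI with this setup. It is only a sketch: the handler address $0130 is hypothetical, and the "CPU" only knows how to fetch a vector and follow one JMP.

```python
JMP_EXT = 0x7E               # 68HC11 opcode for JMP with extended addressing
RTI_VECTOR = 0xFFF0          # vector table entry: holds an address
BUFFALO_RTI_SLOT = 0x00EB    # jump table entry: holds an instruction
ADC_SERIAL = 0x0130          # hypothetical address of the handler


def read_word(mem, addr):
    """Read a 16-bit big-endian word, as the HC11 does."""
    return (mem[addr] << 8) | mem[addr + 1]


def write_word(mem, addr, value):
    mem[addr] = (value >> 8) & 0xFF
    mem[addr + 1] = value & 0xFF


def build_memory():
    mem = bytearray(0x10000)
    write_word(mem, RTI_VECTOR, BUFFALO_RTI_SLOT)       # FDB $00EB
    mem[BUFFALO_RTI_SLOT] = JMP_EXT                     # JMP ADC_SERIAL
    write_word(mem, BUFFALO_RTI_SLOT + 1, ADC_SERIAL)
    return mem


def dispatch_rti(mem):
    """Follow the vector, then the JMP found there, and return the final PC."""
    pc = read_word(mem, RTI_VECTOR)     # hardware loads PC from the vector
    if mem[pc] == JMP_EXT:              # the jump table entry is executed as code
        pc = read_word(mem, pc + 1)
    return pc


print(hex(dispatch_rti(build_memory())))
```

If the JMP instruction were placed directly at $FFF0, the first two bytes ($7E and the high byte of the handler address) would be taken as the vector, and the CPU would start executing somewhere unintended.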
The "problem" with the ASCII interpretation becomes visible, if values of printable characters are sent. Put the slider in the first third, and you will see some letter or digit or punctuation. Slide it minimally up and down for other characters. Yes, terminals can be dumb, and this one is no exception. Actually it is a little bit smart and shows the printable characters instead of their ASCII value. Additionally it knows at least CR (carriage return, $0D, decimal 13) and LF (line feed, $0A, decimal 10). You might want to write a little test program that sends "Hello, world", CR, LF. Or another experiment that sends all values from $00 to $FF.
The meaning of a value always depends on its interpretation. This terminal interprets values as ASCII characters, if possible.
I wrote a simple Hello world program in NASM, to then look at using objdump -d out of curiosity. The program is as follows:
BITS 64
SECTION .text
GLOBAL _start
_start:
mov rax, 0x01
mov rdi, 0x00
mov rsi, hello_world
mov rdx, hello_world_len
syscall
mov rax, 0x3C
syscall
SECTION .data
hello_world: db "Hello, world!", 0x0A
hello_world_len: equ $-hello_world
When I inspected this program, I found that the actual implementation uses movabs with the hex value 0x402000 in place of a name. That makes sense, except that it surely means it knows 'Hello, world!' is going to be stored at 0x402000 every time the program is run, and there is no reference to 'Hello, world!' anywhere in the output of objdump -d hello_world (which I have provided below).
I tried rewriting the program; this time I replaced hello_world on line 8 with mov rsi, 0x402000, and the program still compiled and worked perfectly.
I thought maybe it was some encoding of the name, however changing the text 'hello_world' in SECTION .data did not change the outcome either.
I'm more confused than anything - How does it know the address at compile time, and how come it never changes, even on recompilation?
(OUTPUT OF objdump -d hello_world)
./hello_world: file format elf64-x86-64
Disassembly of section .text:
0000000000401000 <_start>:
401000: b8 01 00 00 00 mov $0x1,%eax
401005: bf 00 00 00 00 mov $0x0,%edi
40100a: 48 be 00 20 40 00 00 movabs $0x402000,%rsi
401011: 00 00 00
401014: ba 0e 00 00 00 mov $0xe,%edx
401019: 0f 05 syscall
40101b: b8 3c 00 00 00 mov $0x3c,%eax
401020: 0f 05 syscall
(as you can see, no 'Disassembly of section .data', which further confuses me)
The string is known at build time too. It exists statically in your executable. The toolchain (here, the assembler and linker) put it at that address in the first place, so of course it knows the address!
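You can check this against the bytes in the objdump listing. A small Python sketch (variable names are mine) that pulls the immediate out of the movabs encoding and compares the length constant with the string:

```python
# Bytes of the movabs instruction at 40100a, copied from the objdump listing.
MOVABS_RSI = bytes.fromhex("48be0020400000000000")

rex, opcode = MOVABS_RSI[0], MOVABS_RSI[1]           # REX.W prefix, then mov r64, imm64
imm64 = int.from_bytes(MOVABS_RSI[2:], "little")     # the 8-byte immediate

message = b"Hello, world!\n"                         # what db "Hello, world!", 0x0A emits

print(hex(imm64), len(message))
```

The immediate is 0x402000, and the string is 14 bytes long, which is the 0xe loaded into %edx. Both numbers were fixed when the file was assembled and linked; the label name itself never needs to appear in the code.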
(And in an ASLR or dylib environment this would still apply, because all addresses relative to the module would get shifted as needed and the compiler would put a relocation entry so the loader knows there is an address reference there to fix up, but they would still stay the same relative to each other.)
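A toy model of that shifting, assuming a made-up load bias; a real loader applies relocation entries rather than adding a constant to a dictionary, so treat this purely as an illustration of "same relative layout, different base":

```python
# Link-time addresses taken from the objdump output in the question.
LINK_TIME = {"_start": 0x401000, "hello_world": 0x402000}


def rebase(symbols, load_bias):
    """Shift every address by the same (hypothetical) load bias."""
    return {name: addr + load_bias for name, addr in symbols.items()}


def distance(symbols):
    """Distance from the code to the string, which rebasing must preserve."""
    return symbols["hello_world"] - symbols["_start"]


shifted = rebase(LINK_TIME, 0x555500000000)   # arbitrary example bias
print(hex(shifted["hello_world"]), hex(distance(shifted)))
```

The absolute addresses change with the bias, but the distance between _start and hello_world stays 0x1000.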
And this doesn't mean that every program ever existing will have unique memory locations, nor does it mean that all contents of a program have to idly sit around and use up all of your memory even if they are rarely needed, because this is virtual memory.
The address is only meaningful within your own process, and the memory page in question doesn't have to exist in memory physically, it can be paged in and out as needed, and it's the OS' memory manager's job to decide what to keep in physical memory at what times. Attempting to access an address belonging to a page that's not physically in memory will make it transparently get paged in by the kernel at that point in time. But with such a small program, most likely the whole program will be in memory from the start.
In user-mode code, you will generally never see physical memory addresses. This is entirely abstracted away by the kernel.
I need a helping hand to understand the following assembly instruction. It seems to me that I am calling an address at someUnknownValue += 20994A?
E8 32F6FFFF - call std::_Init_locks::operator=+20994A
Whatever you're using to obtain the disassembly is trying to be helpful, by giving the target of the call as an offset from some symbol that it knows about -- but given that the offset is so large, it's probably confused.
The actual target of the call can be calculated as follows:
E8 is a call with a relative offset.
In a 32-bit code segment, the offset is specified as a signed 32-bit value.
This value is in little-endian byte order.
The offset is measured from the address of the following instruction.
e.g.
<some address> E8 32 F6 FF FF call <somewhere>
<some address>+5 (next instruction)
The offset is 0xFFFFF632.
Interpreted as a signed 32-bit value, this is -0x9CE.
The call instruction is at <some address> and is 5 bytes long; the next instruction is at <some address> + 5.
So the target address of the call is <some address> + 5 - 0x9CE.
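The same calculation as a short Python sketch. The instruction address 0x401000 is only a placeholder, since the question does not show where the call sits; the function name call_target is mine.

```python
def call_target(call_addr, code):
    """Compute the destination of a 5-byte E8 rel32 call in 32-bit code."""
    if len(code) < 5 or code[0] != 0xE8:
        raise ValueError("not an E8 rel32 call")
    rel32 = int.from_bytes(code[1:5], "little", signed=True)  # signed, little-endian
    next_instr = call_addr + 5                                # offset is relative to this
    return (next_instr + rel32) & 0xFFFFFFFF


code = bytes.fromhex("E832F6FFFF")
print(hex(call_target(0x00401000, code)))
```

For a call located at 0x401000 this gives 0x400637, that is, 0x401005 - 0x9CE.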
If you are analyzing a PE file with a disassembler, the disassembler might have given you the wrong code. Many malware writers insert a stray E8 byte as an anti-disassembly technique. You can check whether the instructions just before the E8 are jumps whose target lies right after the E8.
I've written my very first LED on/off program for the MSP-EXP430F5529LP.
I wanted to analyze the program, but I ran into a problem at the first step.
I extracted the LED program from the board and got data I can't make sense of (3).
That's my first question: what file format is my memory dump file (3) in?
My second question: why doesn't CCS 6 display memory addresses properly?
I know the MSP430 is a 16-bit MCU, so every memory address should be 16 bits wide. But my assembly code (2), which is copied from the CCS6 Disassembly View, shows addresses in the format 01XXXX.
Relative data dereferences and execution-flow branches work fine, so why does CCS6 confuse me like this? I mean, why does CCS6 display memory addresses 24 bits wide?
If anyone knows of a TI document that explains this, please let me know. Please don't just mention the MSP430xxxx User's Guide.
Sorry for my English :(
1. C code
#include <msp430f5529.h>
volatile unsigned int i;
void main(void) {
WDTCTL = WDTPW | WDTHOLD;
P1DIR |= 0x01;
while(1){
P1OUT ^= 0x01;
for(i = 20000;i > 0; i--);
}
}
2. Assembly code
0100c2: 40B2 5A80 015C MOV.W #0x5a80,&Watchdog_Timer_WDTCTL
0100c8: D3D2 0204 BIS.B #1,&Port_A_PADIR
0100cc: E3D2 0202 XOR.B #1,&Port_A_PAOUT
0100d0: 40B2 4E20 2400 MOV.W #0x4e20,&i
0100d6: 3C02 JMP (0x00dc)
0100d8: 8392 2400 DEC.W &i
0100dc: 9382 2400 TST.W &i
0100e0: 27F5 JEQ (0x00cc)
0100e2: 3FFA JMP (0x00d8)
0100e4: 4303 NOP
0100e6: D032 0010 BIS.W #0x0010,SR
0100ea: 3FFD JMP (0x00e6)
0100ec: 431C MOV.W #1,R12
0100ee: 0110 RETA
0100f0: 4303 NOP
0100f2: 3FFF JMP (0x00f2)
3. Memory dump (MAIN)
:1044000031400044b113ec000c930224b1130000be
:104410000c43b113c200b113f00000000200000011
:10442000840001001a44000000240000ffffffff89
:10443000ffffffffffffffffffffffffffffffff8c
:10444000ffffffffffffffffffffffffffffffff7c
...
...
If one reads the User Guide (which is why they exist) then one is informed that the Program Counter is 20-bit. So, now you know why you see an address in the 20-bit range.
Link to the MSP430 User Guide: http://www.ti.com/lit/ug/slau208n/slau208n.pdf
The 20-bit PC (PC/R0) points to the next instruction to be executed.
Each instruction uses an even number of bytes (2, 4, 6, or 8 bytes),
and the PC is incremented accordingly. Instruction accesses are
performed on word boundaries, and the PC is aligned to even addresses.
Figure 6-3 shows the PC.
The above is an excerpt from the User Guide. I cannot emphasize this enough: you really need to read the User Guide. Attempting to program microcontrollers without doing so is perilous to your mental health.
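You can see the 20-bit PC at work in the listing from the question. The jump instructions there carry a signed 10-bit word offset that is added to the address of the next instruction. A Python sketch (function name mine) that recomputes the full targets, of which the Disassembly View prints only the low 16 bits:

```python
def msp430_jump_target(addr, word):
    """Return the full 20-bit target of an MSP430 conditional/unconditional jump."""
    if word >> 13 != 0b001:
        raise ValueError("not a jump-format instruction")
    offset = word & 0x03FF              # 10-bit signed offset, counted in words
    if offset & 0x0200:
        offset -= 0x0400                # sign-extend
    return (addr + 2 + 2 * offset) & 0xFFFFF   # relative to the next instruction


for addr, word in [(0x100D6, 0x3C02), (0x100E0, 0x27F5), (0x100E2, 0x3FFA)]:
    print(hex(addr), hex(msp430_jump_target(addr, word)))
```

The three jumps of the delay loop resolve to 0x100dc, 0x100cc and 0x100d8, so the "(0x00dc)" style operands in the listing are just these targets with the upper bits left off.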
The memory dump seems to be in the Intel HEX file format: https://en.wikipedia.org/wiki/Intel_HEX
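A quick way to convince yourself is to parse the first line of the dump as an Intel HEX record and see whether the checksum comes out right. A minimal Python sketch (the function name parse_ihex_record is mine, and only the basic record layout is handled):

```python
def parse_ihex_record(line):
    """Split one Intel HEX record into its fields and verify the checksum."""
    if not line.startswith(":"):
        raise ValueError("Intel HEX records start with ':'")
    raw = bytes.fromhex(line[1:])
    count = raw[0]                           # number of data bytes
    return {
        "count": count,
        "address": (raw[1] << 8) | raw[2],   # 16-bit load address, big-endian
        "type": raw[3],                      # 0 = data record
        "data": raw[4:4 + count],
        "checksum_ok": sum(raw) & 0xFF == 0, # all bytes incl. checksum sum to 0 mod 256
    }


record = parse_ihex_record(":1044000031400044b113ec000c930224b1130000be")
print(hex(record["address"]), record["data"].hex(), record["checksum_ok"])
```

The record decodes as 16 data bytes of type 0 at address 0x4400 with a valid checksum, which fits the format.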
Can somebody suggest a disassembler for Atmel AVR 8-bit microcontrollers? Are there open-source projects for this?
Thanks.
You can also use avr-objdump, a tool that is part of the avr-gcc toolset ( http://www.nongnu.org/avr-libc/ ). Example:
avr-objdump -s -m <avr architecture> -d program.hex > program.dump
where <avr architecture> is found on http://www.nongnu.org/avr-libc/user-manual/using_tools.html
[plug]IDA Pro supports AVR disassembly[/plug]
As for open source, the AVR GCC package includes a port of objdump, including disassembling functionality.
http://www.onlinedisassembler.com/odaweb/
Lots of platforms (AVR also) but Microchip (which you didn't need either) is missing.
Big plus is that it is web based.
Check out vAVRdisasm.
AVRDisassembler is an open source (MIT) AVR / Arduino disassembler written in .NET Core (which means it can run on Windows, Mac, Linux). Apart from writing the disassembly to stdout, it can also emit a JSON dump (for interoperability and analysis purposes).
Disclaimer: I am the author of said library.
I'm using avrdisas by Johannes Bauer. It works with dumped flash, rather than the .hex file or ELF.
Assembling the following:
.include "tn13def.inc"
ldi r16,1
out ddrb,r16 ; PB0 as output
sbiw r24,1 ; slight wait
brne PC-1
sbi pinb,pinb0 ; toggle
rjmp PC-3 ; forever
produces listing:
C:000000 e001 ldi r16,1
C:000001 bb07 out ddrb,r16 ; PB0 as output
C:000002 9701 sbiw r24,1 ; slight wait
C:000003 f7f1 brne PC-1
C:000004 9ab0 sbi pinb,pinb0 ; toggle
C:000005 cffc rjmp PC-3 ; forever
extracting the flash contents with:
$ avrdude -p t13 -P usb -c usbtiny -U flash:r:flash.bin:r
gives: e001 bb07 9701 f7f1 9ab0 cffc
disassembly:
$ ./avrdisas -a1 -o1 -s1 flash.bin
; Disassembly of flash.bin (avr-gcc style)
.text
main:
0: 01 e0 ldi r16, 0x01 ; 1
2: 07 bb out 0x17, r16 ; 23
; Referenced from offset 0x06 by brne
; Referenced from offset 0x0a by rjmp
Label1:
4: 01 97 sbiw r24, 0x01 ; 1
6: f1 f7 brne Label1
8: b0 9a sbi 0x16, 0 ; 0x01 = 1
a: fc cf rjmp Label1
and this works for me, even if the endian-ness does not match the listing and I would need to resolve 0x17 back to DDRB etc.
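The endianness difference is only a matter of presentation: flash.bin stores each 16-bit opcode low byte first, and the assembler listing prints whole words. A Python sketch (helper names are mine) that recovers the listing's words from the raw bytes and resolves the two branch targets, which are counted in words relative to the following instruction:

```python
import struct

# Raw contents of flash.bin, byte order as shown in the avrdisas output.
FLASH = bytes.fromhex("01e007bb0197f1f7b09afccf")


def flash_words(flash):
    """Read the image as little-endian 16-bit opcode words."""
    return list(struct.unpack("<%dH" % (len(flash) // 2), flash))


def rjmp_target(word_addr, opcode):
    """RJMP: 1100 kkkk kkkk kkkk, with k a signed 12-bit word offset."""
    k = opcode & 0x0FFF
    if k & 0x0800:
        k -= 0x1000
    return word_addr + 1 + k


def brne_target(word_addr, opcode):
    """BRNE: 1111 01kk kkkk k001, with k a signed 7-bit word offset."""
    k = (opcode >> 3) & 0x7F
    if k & 0x40:
        k -= 0x80
    return word_addr + 1 + k


words = flash_words(FLASH)
print([format(w, "04x") for w in words])
print(rjmp_target(5, words[5]), brne_target(3, words[3]))
```

Both branches resolve to word address 2, which is byte offset 4, the Label1 that avrdisas reports.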
As an open-source disassembler, I've tried Radare2, which is command-line oriented, but you can also use its GUI, called Cutter: https://rada.re/n/
Or you can just use the classical avr-objdump:
avr-objdump.exe -j .sec1 -d -m avr5 dumpfile.hex
Information source here
The question is really about disassembling the HEX file, and the other answers already mention quite a lot of tools for that. It's hard to add anything more.
But in case someone is looking for it: it is also possible to view the disassembly of C/C++ code while it runs in the IDE. With Atmel Studio and its integrated disassembly tool, it can be done in the following way:
Run project (it can be run in simulator without debugger hardware);
Pause or stop at breakpoint;
Press CTRL + ALT + D
This can be useful in order to verify that particular code fragments are compiled as needed because the optimization sometimes skips/mangles the sequence and leads to some unexpected behavior.