I'm trying to parse some .a2l and .hex files to extract variables and their values. So far I don't know how to find the values of the variables in the .hex file. Here is a link to download an example of these files.
To be more specific: how can I read the value at address 0x810600 in the .hex file?
/begin CHARACTERISTIC ASAM.C.DEPENDENT.REF_1.SWORD
"Dependent SWORD"
VALUE
0x810600
RL.FNC.SWORD.ROW_DIR
0
CM.IDENTICAL
-32268 32267
/begin DEPENDENT_CHARACTERISTIC
"X1 + 5"
ASAM.C.SCALAR.SBYTE.IDENTICAL
/end DEPENDENT_CHARACTERISTIC
DISPLAY_IDENTIFIER DI.ASAM.C.DEPENDENT.REF_1.SWORD
/end CHARACTERISTIC
In the same A2L file, find the RL.FNC.SWORD.ROW_DIR record layout; I guess it is a signed word (SWORD, 2 bytes) type.
I'm not sure whether it is an array or some special type, but since the CHARACTERISTIC is declared as VALUE I assume it is just a single variable (a scalar).
Next, find the CM.IDENTICAL item; as its name suggests, it is probably an identical compu_method. That means a hex value of 0 is displayed as 0, a hex value of 100 is displayed as 100, and so on: the internal value and the physical value are identical, with no special conversion applied.
Go to address 0x810600 in the HEX file and you will find the value there. Since the compu_method is identical, the value stored in the HEX file should be displayed as-is in measurement/calibration tools (INCA, Vision, CANape, ...).
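Under those assumptions (a 2-byte signed word with an identical conversion), turning the raw bytes into the displayed value is just an unpack. Here is a minimal Python sketch; the raw bytes are made up, and the byte order is an assumption since it depends on the ECU (reading the bytes out of the HEX file is shown a little further below):

import struct

# Hypothetical raw bytes read at 0x810600 (see below for getting them out of the HEX file).
raw = b"\x2a\x00"

# RL.FNC.SWORD.ROW_DIR -> signed 16-bit word; the byte order depends on the ECU.
(internal,) = struct.unpack("<h", raw)   # "<h" = little-endian, ">h" = big-endian

physical = internal   # CM.IDENTICAL: physical value == internal value
print(physical)       # 42 for this made-up example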
The HEX file is in Intel HEX format. This format maps each part of the file to a region of the device's address space. You can also use the following command if you are on Linux:
objdump -s file.hex
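If you would rather do it from a script, here is a minimal sketch using the third-party intelhex Python package (pip install intelhex); the file name is just a placeholder:

# Minimal sketch: pull the two bytes at 0x810600 out of an Intel HEX file.
from intelhex import IntelHex

ih = IntelHex("file.hex")                    # "file.hex" is a placeholder name
raw = bytes([ih[0x810600], ih[0x810601]])    # two consecutive bytes at that address
print(raw.hex())                             # decode them as in the sketch above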
I was reading an executable file (.exe) and I saw \x00#. I know that 0x00 is NULL, but what does the # represent in hexadecimal? I couldn't find any information about this.
Example
b'MZ\x90\x00\x03\x00\x00\x00\x04\x00\x00\x00\xff\xff\x00\x00\xb8\x00\x00\x00\x00\x00\x00\x00#\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xc0\x00\x00\x00\x0e\x1f\xba\x0e\x00\xb4\t\xcd!\xb8\x01L\xcd!This program cannot be run in DOS mode.\r\r\n'
It means nothing special; you are simply viewing raw binary through a representation (here, Python's repr of a bytes object) that shows printable ASCII bytes as characters and everything else as \xNN escapes. The # is simply the byte value 0x23 (35 decimal, the ASCII code for '#'); it is not a special marker.
I'm guessing this binary goo is from the PE Format for Windows executables.
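A quick way to convince yourself of what a printable character stands for in such a dump (this is Python, matching the b'...' notation in the example):

# In Python's repr of a bytes object, printable ASCII bytes are shown as characters.
print(ord("#"))             # 35, i.e. 0x23
print(hex(ord("#")))        # 0x23
print(bytes([0x00, 0x23]))  # b'\x00#'  -- exactly what appeared in the dump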
The source code of UNIX V6 is available, and there is a book on it by J. Lions. From the book I know that the "." symbol means the current location. I do not understand the following:
"*An assignment statement of the form
identifier = expression
associates a value and type with the identifier. In the example
. = 60^.
the operator ’^’ delivers the value of the first
operand and the type of the second operand
(in this case, “location”);*"
The statement can be found in the file low.s (line 0526). What does it mean? Does it actually change the PC register value and behave as a jump instruction? I know it is old code, but I want to understand it. Thank you.
In the 6th edition assembler, . is the location counter: an offset from the beginning of a segment (text, data, or bss). When the assembler starts processing a file, . in each segment is 0, and it is advanced either by assigning to . or by the presence of data or instruction statements. It is purely an assembly-time construct: assigning to . does not change the PC at run time and is not a jump; it only controls where the following statements are placed.
The statement . = 60^. means to take the value 60 (in octal), cast it to the type of the location counter (in this case, type data), and assign it to the location counter. You'll see several statements like this in low.s in the area where interrupt vectors are set up.
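As a rough illustration of that bookkeeping, here is a toy model in Python (not the real as(1); the emitted words are placeholders standing in for a vector entry):

# Toy model of the 6th-edition assembler's location counter ".".
class Segment:
    def __init__(self, name):
        self.name = name
        self.dot = 0          # location counter "." for this segment, in bytes
        self.contents = {}    # offset -> 16-bit word

    def set_dot(self, value):
        # ". = 60^."  -> take the value 60 (octal), keep the "type"
        # (segment) of the location counter, and assign it to ".".
        self.dot = value

    def emit(self, word):
        # A data or instruction statement advances "." by its size.
        self.contents[self.dot] = word
        self.dot += 2

data = Segment("data")
data.set_dot(0o60)       # ". = 60^."  -- 60 is octal, i.e. 48 decimal
data.emit(0o000137)      # placeholder words: the following statements land
data.emit(0o000340)      # at offsets 060, 062, ... of the data segment
print([oct(off) for off in sorted(data.contents)])   # ['0o60', '0o62']

Nothing here is executed at run time; the directive only decides where the following words end up in the assembled segment.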
When the link editor combines multiple object files together, their text, data, and bss sections are concatenated (except for COMMON data, which gets allocated just once) and any references (such as labels) to instructions or data will be relocated appropriately.
Building the Unix kernel requires an extra step to make sure data meant to be in low memory get loaded at the proper address. After low.s and the rest of the Unix kernel object files have been linked together, sysfix is run to make the data section have a load address of 0, and to relocate all data references appropriately. So that . = 60^. statement has effectively set the location counter to physical address 60.
Before reading the mailing list, I was confused by the lack of a "size" field in the symbol table of a Mach-O file. Then I found the answer in a source file posted in that e-mail, which notes:
//Mach-O symbol table does have size in it
//so need to scan ahead to find symbol with next highest address.
But when I parsed the symbol table out of a Mach-O file (I got it from the symtab_command and the nlist entries that follow it) and tried to calculate the size of a global symbol in the same way, I was confused again when I compared the result with the symbol table printed by dwarfdump (dwarfdump -ae). The end address of a symbol reported by dwarfdump is different from my program's output. Is there a problem with the symbol table I parsed, or is there some other way to work this out?
Some of the output from my program:
<start address> <section index> <method>
0x0006d030 1 ___arclite_objc_autoreleasePoolPop
0x0006d048 1 _patch_lazy_pointers
0x0006d1f0 1 ___arclite_objc_autoreleasePoolPush
The corresponding part of the output from dwarfdump:
0x0014a37b: [0x0006d030 - 0x0006d046) __arclite_objc_autoreleasePoolPop
0x0014a122: [0x0006d048 - 0x0006d1ee) patch_lazy_pointers
0x0014a3a0: [0x0006d1f0 - 0x0006d212) __arclite_objc_autoreleasePoolPush
So if I use the approach from "MachONormalizedFileToAtoms.cpp" to calculate the end address of a symbol (look ahead to find the symbol with the next highest address), the result will differ from dwarfdump's output. Does anyone know how dwarfdump calculates it?
Thank you!
From the answer by Nick Kledzik:
The compiler often aligns functions to start at an aligned address (e.g. a multiple of 8 or 16 bytes), so there are padding bytes (usually NOPs) after the end of one function and before the start of the next.
dwarfdump has access to the debug info which does have size info for functions. So dwarfdump can show the size of a function without the alignment padding at the end. Whereas the linker just looks at the next symbol address. There is not much point in the linker digging through the debug info to get a function’s true size, because when writing the output, the linker has to align the next function which would just add back the pad bytes.
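Using the addresses quoted above, the difference is easy to see; here is a small Python sketch (the names and addresses are taken straight from the question, nothing else is assumed):

# Next-symbol-address method (what the linker and my parser do) vs. the
# sizes dwarfdump gets from the debug info.
symbols = [
    ("___arclite_objc_autoreleasePoolPop",  0x0006d030),
    ("_patch_lazy_pointers",                0x0006d048),
    ("___arclite_objc_autoreleasePoolPush", 0x0006d1f0),
]
dwarf_end = {   # end addresses from the dwarfdump output above
    "___arclite_objc_autoreleasePoolPop":  0x0006d046,
    "_patch_lazy_pointers":                0x0006d1ee,
}

for (name, start), (_, next_start) in zip(symbols, symbols[1:]):
    next_symbol_size = next_start - start        # scan ahead to the next symbol
    dwarf_size = dwarf_end[name] - start         # true size from the debug info
    print(name, hex(next_symbol_size), hex(dwarf_size),
          "padding:", next_symbol_size - dwarf_size)
# Both functions end with 2 bytes of alignment padding (typically NOPs).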
I hope this can help others who have the same confusion.
Whenever we program a microcontroller, we convert the C file into a hex file and then burn that into the controller.
My question is: why a hex file only? Is that hex file a hexadecimal version of the binary executable?
If yes, then why don't we use a binary file instead?
If you are talking about an "Intel HEX" file, the reason is that it is ASCII, which makes it easy to examine and parse. True, it is inefficient in one way, but compared to a raw binary it might actually be smaller. With a raw binary you have at most one address associated with the image: the starting address, and even that is not embedded in the file. An Intel HEX file, or a Motorola S-record (a similar and often-used format), is basically lines of ASCII hex digits that encode a record type, a starting address, a length, the data, and a checksum; there are non-data records in there, but most of it will be data.

So if your program has a few bytes at address 0x1000 and a few bytes at 0x80000000, a .bin file would at its smallest be 0x80000000 - 0x1000 plus a few bytes, but would typically be 0x80000000 plus a few bytes (right, two gigabytes), whereas an Intel HEX or S-record file would be dozens of bytes in total. The Intel HEX and S-record formats also have built-in checksums to help protect against corrupt files; not perfect, of course, but better than nothing at all.
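For illustration, here is a minimal Python sketch of parsing a single Intel HEX data record and verifying its checksum (the record below is just an example line, and this is nowhere near a full loader):

# Parse one Intel HEX record: ":" count address type data checksum, all in ASCII hex.
record = ":10010000214601360121470136007EFE09D2190140"

assert record.startswith(":")
raw = bytes.fromhex(record[1:])

count    = raw[0]                             # number of data bytes
address  = int.from_bytes(raw[1:3], "big")    # 16-bit load offset
rectype  = raw[3]                             # 0x00 data, 0x01 EOF, 0x04 ext. linear address, ...
data     = raw[4:4 + count]
checksum = raw[4 + count]

# Two's-complement checksum: every byte of the record, including the
# checksum itself, must sum to 0 modulo 256.
assert sum(raw) % 256 == 0

print(f"type {rectype:#04x}: {count} bytes at offset {address:#06x}: {data.hex()}")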
Since then, ELF, COFF, and other formats have become popular. These are also based on blocks of data rather than a complete memory image; they are binary rather than ASCII formats, but they are not just a memory image either: chunks of data with an address, a type, and so on are provided.
Because Intel HEX and S-records are so simple to create and parse, they will continue to be used for a long time; it does not take a lot of resources in a bootloader, for example, to handle receiving an Intel HEX or S-record file. (The same is true of a raw binary, of course, but the binary has a lot of fill data in it, costing a lot of unnecessary transmission time.)
Every executable must have an ELF header?
Also, I would like to know why libraries and header properties are often associated with HEX values; what is this HEX related to? Why HEX and not just binary code or something else?
I'm referring to the HEX values that come up when using ldd and readelf, for example, two utilities often used under Linux.
This question is about a generic OS and is not targeting a specific one; the architecture is assumed to be x86 or ARM.
Every executable must have an ELF header
Yes, every ELF file begins with an ELF file header. If it doesn't, it's not a valid ELF file by definition.
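For example, the first four bytes of any valid ELF file are the magic number 0x7f 'E' 'L' 'F'; a quick check in Python (the path is only an example):

# Check the ELF magic at the start of a file; "/bin/ls" is just an example path.
with open("/bin/ls", "rb") as f:
    magic = f.read(4)
print(magic == b"\x7fELF")   # True for any valid ELF executable or shared library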
Why HEX and not just binary code or something else
You appear to be very confused about what HEX means. Any integer number can be written in many different representations. Decimal (base-10), octal (base-8), hex (base-16) are the most common ones, but base-20 is not unheard of. It's just a number, regardless of how you choose to represent it.
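A quick illustration of that point in Python (using a made-up value of the kind readelf or ldd would print):

# The same number in different representations; the value itself never changes.
addr = 4096
print(hex(addr))   # 0x1000           -- what readelf/ldd typically print
print(oct(addr))   # 0o10000
print(bin(addr))   # 0b1000000000000
print(addr == 0x1000 == 0o10000 == 0b1000000000000)   # True: it is just one number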