How to get the size of symbols in the symbol table of Mach-O file? - mach-o

Before reading the mailing list, I was confused by the lack of a "size" field in the symbol table of a Mach-O file. I then found the explanation in a source file posted in that e-mail, which notes:
//Mach-O symbol table does not have size in it
//so need to scan ahead to find symbol with next highest address.
But when I parsed the symbol table out of a Mach-O file (I got it from the symtab_command and the nlist entries that follow it) and tried to calculate the size of a global symbol the same way, I was confused again when I compared my result with the output of dwarfdump (dwarfdump -ae). The end address of each symbol reported by dwarfdump differs from the one my program computes. Is there a problem with the symbol table I parsed out? Or is there some other way to work out the size?
Some of the output from my program:
<start address> <section index> <method>
0x0006d030 1 ___arclite_objc_autoreleasePoolPop
0x0006d048 1 _patch_lazy_pointers
0x0006d1f0 1 ___arclite_objc_autoreleasePoolPush
The corresponding part of the output from dwarfdump:
0x0014a37b: [0x0006d030 - 0x0006d046) __arclite_objc_autoreleasePoolPop
0x0014a122: [0x0006d048 - 0x0006d1ee) patch_lazy_pointers
0x0014a3a0: [0x0006d1f0 - 0x0006d212) __arclite_objc_autoreleasePoolPush
So if I use the approach in "MachONormalizedFileToAtoms.cpp" to calculate the end address of a symbol (look ahead to find the symbol with the next highest address), the result differs from the output of dwarfdump. Does anyone know how dwarfdump calculates it?
Thank you!

From the answer by Nick Kledzik:
The compiler often aligns functions to start at an aligned address (e.g. 8 or 16 bytes). So there are padding bytes (usually NOPs) after the end of a function and before the start of the next function.
dwarfdump has access to the debug info which does have size info for functions. So dwarfdump can show the size of a function without the alignment padding at the end. Whereas the linker just looks at the next symbol address. There is not much point in the linker digging through the debug info to get a function’s true size, because when writing the output, the linker has to align the next function which would just add back the pad bytes.
I hope this helps others who have the same confusion.
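The explanation above can be checked with a little arithmetic on the numbers from the question. This is a minimal sketch in Python, not how the linker or dwarfdump is implemented: the helper name `sizes_by_next_symbol` is made up for illustration, and the `section_end` value is only a placeholder, since the real end of the section would have to come from the section header.

```python
def sizes_by_next_symbol(symbols, section_end):
    """Estimate each symbol's size as the gap to the next-highest address.

    symbols: list of (address, name) pairs that live in one section.
    section_end: end address of that section, used for the last symbol.
    """
    ordered = sorted(symbols)
    sizes = {}
    for index, (address, name) in enumerate(ordered):
        if index + 1 < len(ordered):
            next_address = ordered[index + 1][0]
        else:
            next_address = section_end
        sizes[name] = next_address - address
    return sizes


# Start addresses taken from the parsed nlist entries in the question.
symbols = [
    (0x0006D030, "___arclite_objc_autoreleasePoolPop"),
    (0x0006D048, "_patch_lazy_pointers"),
    (0x0006D1F0, "___arclite_objc_autoreleasePoolPush"),
]

# [start, end) ranges reported by dwarfdump, keyed here by the
# symbol-table spelling of each name (with the leading underscore).
dwarf_ranges = {
    "___arclite_objc_autoreleasePoolPop": (0x0006D030, 0x0006D046),
    "_patch_lazy_pointers": (0x0006D048, 0x0006D1EE),
}

# section_end is a placeholder value, not read from a real section header.
estimated = sizes_by_next_symbol(symbols, section_end=0x0006D212)

padding = {}
for name, (start, end) in dwarf_ranges.items():
    dwarf_size = end - start
    padding[name] = estimated[name] - dwarf_size
    print(f"{name}: symbol-table size {estimated[name]:#x}, "
          f"DWARF size {dwarf_size:#x}, padding {padding[name]} bytes")
```

For both functions the look-ahead size is two bytes larger than the DWARF size, which matches the idea that the gap is alignment padding before the next function.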

Related

Assembly code . = 60^. in low.s file of UNIX V6 source code

The source code of UNIX V6 is available and there is a book on it by J. Lions. From the book I know that the " . " symbol means the current location. I do not understand the following:
"An assignment statement of the form
identifier = expression
associates a value and type with the identifier. In the example
. = 60^.
the operator ’^’ delivers the value of the first operand and the type of the second operand (in this case, “location”);"
The statement can be found in file low.s (0526). What does it mean? Does it actually change the PC register value and behave like a jump instruction? I know it is old code, but I want to understand it. Thank you.
In the 6th edition assembler, . is the location counter, an offset from the beginning of a segment (text, data, or bss). When the assembler starts processing a file, . in each segment is 0, and is incremented either by assignment to . or by the presence of data or instruction statements.
The statement . = 60^. means to take the value 60 (in octal), cast it to the type of the location counter (in this case, type data), and assign it to the location counter. You'll see several statements like this in low.s in the area where interrupt vectors are set up.
When the link editor combines multiple object files together, their text, data, and bss sections are concatenated (except for COMMON data, which gets allocated just once) and any references (such as labels) to instructions or data will be relocated appropriately.
Building the Unix kernel requires an extra step to make sure data meant to be in low memory get loaded at the proper address. After low.s and the rest of the Unix kernel object files have been linked together, sysfix is run to make the data section have a load address of 0, and to relocate all data references appropriately. So that . = 60^. statement has effectively set the location counter to physical address 60.
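The effect of the '^' operator described above can be modelled in a few lines. This is only a toy sketch in Python, not the assembler's implementation: the `Operand` class, the `caret` function, and the type name "absolute" for the constant are illustrative assumptions, and the starting state (data segment, location counter at 0) is assumed as well.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operand:
    value: int  # numeric value of the expression
    type: str   # e.g. "absolute", "text", "data", "bss"


def caret(first, second):
    """Toy model of '^': value of the first operand, type of the second."""
    return Operand(first.value, second.type)


# Assumed state: assembling in the data segment with "." still at 0.
dot = Operand(0, "data")

# ". = 60^." where the constant 60 is read as octal, hence 0o60.
dot = caret(Operand(0o60, "absolute"), dot)

print(dot)  # Operand(value=48, type='data')
```

Octal 60 is decimal 48, so the location counter ends up 48 bytes into the data segment while keeping its "data" type.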

Is there a way to check where R is 'stuck' within a for loop? (R)

I am using system() to run several files iteratively through a program via CMD. It deposits each output into a sub-directory designated specifically and only for that input file, so the number of inputs is exactly equal to the number of output directories/outputs.
My code works for the first iteration, but I can see in the console that it won't move on to the second file after completing the first. The stop sign remains active so I know R is still 'running', but since the for loop environment is unique I can't really tell what it's stuck on. It just stays like this for hours. Therefore I'm not sure how to begin to diagnose the issue I'm having. Is there a way of tracing what happened after cancelling the code, for example?
If you're curious, the code looks like this. I don't know how to make it reproducible, so I just commented each line:
for (i in seq_along(flist)) {
  # flist is a character vector. Each element is both the name of the
  # input file and the name of the output directory.
  # Set the appropriate working directory.
  setwd(paste0(solutions_dir, "\\", flist[i]))
  # Run the program's exe file with the appropriate input/output locations.
  system(paste0(program_dir, "\\program.exe I=",
                file_dir, "\\", flist[i], " O=", solutions_dir, "\\", flist[i],
                "\\solv"))
}

Parse .a2l and .hex files

I'm trying to parse some .a2l and .hex files to extract variables and their values. So far I don't know how to find the values of the variables in the .hex file. Here is a link to download an example of these files.
To be more specific: how can I read the value at address 0x810600 in the .hex file?
/begin CHARACTERISTIC ASAM.C.DEPENDENT.REF_1.SWORD
"Dependent SWORD"
VALUE
0x810600
RL.FNC.SWORD.ROW_DIR
0
CM.IDENTICAL
-32268 32267
/begin DEPENDENT_CHARACTERISTIC
"X1 + 5"
ASAM.C.SCALAR.SBYTE.IDENTICAL
/end DEPENDENT_CHARACTERISTIC
DISPLAY_IDENTIFIER DI.ASAM.C.DEPENDENT.REF_1.SWORD
/end CHARACTERISTIC
In the same A2L, look up the RL.FNC.SWORD.ROW_DIR item; I guess it is some kind of signed word (2 bytes) type.
I'm not sure whether this is an array or some special type; I assume it is just a single variable (scalar).
Likewise, look up the CM.IDENTICAL item; as its name suggests, it is probably an identical compu_method. This means a HEX value of 0 is displayed as 0, a HEX value of 100 is displayed as 100, and so on: the internal value and the physical value are identical, with no special conversion, I guess.
Go to address 0x810600 in the HEX file and you will find the value there. Since the compu_method is identical, the value in the HEX file should be displayed unchanged in M/C software (INCA, Vision, CANape, ...), I guess.
The HEX file is in Intel HEX format. This format maps each part of the file to a part of the device's address space. You can also use the following command if you use Linux:
objdump -s file.hex
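Reading the value directly can also be sketched in a few lines of Python. This is a minimal sketch under several assumptions: the function names `load_intel_hex` and `read_sword` are made up, the three example records are synthetic (not taken from the linked files), and the byte order defaults to little-endian, whereas the real byte order has to be taken from the A2L file. It handles record types 00, 01, 02 and 04 and ignores address wrap-around within a segment.

```python
def load_intel_hex(lines):
    """Return {absolute_address: byte_value} from Intel HEX record lines."""
    memory = {}
    base = 0  # upper address bits, set by record types 02 and 04
    for line in lines:
        line = line.strip()
        if not line.startswith(":"):
            continue
        record = bytes.fromhex(line[1:])
        # All bytes of a record, checksum included, must sum to 0 mod 256.
        if sum(record) & 0xFF != 0:
            raise ValueError(f"bad checksum in record {line!r}")
        count = record[0]
        offset = int.from_bytes(record[1:3], "big")
        record_type = record[3]
        data = record[4:4 + count]
        if record_type == 0x00:      # data record
            for i, byte in enumerate(data):
                memory[base + offset + i] = byte
        elif record_type == 0x01:    # end of file
            break
        elif record_type == 0x02:    # extended segment address
            base = int.from_bytes(data, "big") << 4
        elif record_type == 0x04:    # extended linear address
            base = int.from_bytes(data, "big") << 16
    return memory


def read_sword(memory, address, byte_order="little"):
    """Read a signed 16-bit value; byte_order is an assumption to verify."""
    raw = bytes(memory[address + i] for i in range(2))
    return int.from_bytes(raw, byte_order, signed=True)


# Synthetic example: upper address 0x0081, then bytes FB FF at offset 0x0600.
example = [":02000004008179", ":02060000FBFFFE", ":00000001FF"]
memory = load_intel_hex(example)
print(read_sword(memory, 0x810600))  # -5
```

With a real file, the lines would come from `open("file.hex")` instead of the `example` list. Because CM.IDENTICAL applies no conversion, the raw signed value would be the displayed value.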

Are there any equivalent of C/C++ __FILE__ and __LINE__ macros in R?

I'm trying to get the equivalent of the __FILE__ or __LINE__ macros in C or C++ in R (or S+). Any ideas?
__FILE__ The presumed name of the current source file (a character string literal).
__LINE__ The presumed line number (within the current source file) of the current source line (an integer constant).
As for context - I have log messages being flushed to console from different sections of the code, and given that the messages themselves are built at run-time, it is often very difficult to find out where a log message is coming from (with the size of the R code growing to many thousand lines and running on a distributed grid). However, if I could dump __FILE__ and __LINE__ along with the log messages, it would be much easier to trace the logs...
Use the #line directive. The structure is #line nn "filename". See Duncan Murdoch's article on source references for more.

Set a breakpoint at a given line number in Adobe's FDB?

I'm learning the Flex command-line debugger, and I haven't been able to find information on this particular use case.
I'd like to add a breakpoint to a specific line in one of my class files. I can add breakpoints at the start of a function in a class, but I can't figure out how to set it at a specific line (e.g. line 117 in Foo.as)?
When I try to set one for a file on a given line, I get one at a different location:
(fdb) break Foo 111
Breakpoint 1 at 0x######: file Foo.as, line 115
I've verified the line # I'm specifying is valid, so I don't think the FDB is trying to compensate.
Am I doing something wrong? Is this possible in FDB?
Absolutely,
check out the help in fdb, it's fairly helpful :). Just type help, or type help followed by a command. help break gives the output below, with lots of nice ways to hook in there. The syntax you're using is just missing a colon between the class and the line number; I just tried it with an MXML file and it worked fine.
Set breakpoint at specified line or function.
Examples:
break 87
Sets a breakpoint at line 87 of the current file.
break myapp.mxml:56
Sets a breakpoint at line 56 of myapp.mxml.
break #3:29
Sets a breakpoint at line 29 of file #3.
break doThis
Sets a breakpoint at function doThis() in the current file.
break myapp.mxml:doThat
Sets a breakpoint at function doThat() in file myapp.mxml.
break #3:doOther
Sets a breakpoint at function doOther() in file #3.
break
Sets a breakpoint at the current execution address in the
current stack frame. This is useful for breaking on return
to a stack frame.
To see file names and numbers, do 'info sources' or 'info files'.
To see function names, do 'info functions'.
Abbreviated file names and function names are accepted if unambiguous.
If line number is specified, break at start of code for that line.
If function is specified, break at start of code for that function.
See 'commands' and 'condition' for further breakpoint control.
