What is the rule behind instruction count in Intel PIN?

What is the rule behind instruction count in Intel PIN? - intel-pin

I wanted to count instructions in simple recursive fibo function O(2^n). I succeded to do so with bubble sort and matrix multiplication, but in this case it seemed like instruction count ignored my fibo function. Here is the code used for instrumentation:
// Insert a call at the entry point of a routine to increment the call count
RTN_InsertCall(rtn, IPOINT_BEFORE, (AFUNPTR)docount, IARG_PTR, &(rc->_rtnCount), IARG_END);
// For each instruction of the routine
for (INS ins = RTN_InsHead(rtn); INS_Valid(ins); ins = INS_Next(ins))
{
// Insert a call to docount to increment the instruction counter for this rtn
INS_InsertCall(ins, IPOINT_BEFORE, (AFUNPTR)docount, IARG_PTR, &(rc->_icount), IARG_END);
}
I started to wonder what's the difference between this program and the previous ones and my first thought was: here I'm not using an array.
This is what I realised after some manual tests:
a = 5; // instruction ignored by PIN and
// pretty much everything not using array
fibo[1] = 1 // instruction counted properly
a = fibo[1] // instruction ignored by PIN
So it seems like only instructions counted are writes to the memory (that's what I assume). After I changed my fibo function to this it works:
long fibonacciNumber(int n, long *fiboNumbers)
{
if (n < 2) {
fiboNumbers[n] = n;
return n;
}
fiboNumbers[n] = fiboNumbers[n-1] + fiboNumbers[n-2];
return fibonacciNumber(n - 1, fiboNumbers) + fibonacciNumber(n - 2, fiboNumbers);
}
But I would like to count instructions also for programs that aren't written by me. Is there a way to count all type of instrunctions? Is there any particular reason why only this instructions are counted? Any help appreciated.
//Edit
I used disassembly option in Visual Studio to check how it looks and it still makes no sense for me. I can't find the reason why only assingment to array is interpreted by PIN as instruction.
instruction_comparison
This exceeded all my expectations, counted as 2 instructions:
even 2 instructions, not one

PIN, like other low-level profiling and analysis tools, measures individual instructions, low-level orders like "add these two registers" or "load a value from that memory address". The sequence of instructions which a program comprises are generally produced from a high-level language like C++ through a compiler. An individual line of C++ code might be transformed into exactly one instruction, but it's also common for a line to translate to several instructions or even to zero instructions; and the instructions for a line of code may be interleaved with those of other instructions.
Your compiler can output an assembly-language file for your source code, showing what instructions were produced for which lines of code. (For GCC and Clang, this is done with the -S flag.) Note that reading the assembly code output from a compiler is not the best way to learn assembly. Also, I would point you to godbolt.org, a very convenient tool for analyzing assembly output.

Related

Performing dead code elimination / slicing from original source code in Frama-C

EDIT: The original question had unnecessary details
I have a source file which I do value analysis in Frama-C, some of the code is highlighted as dead code in the normalized window, no the original source code.
Can I obtain a slice of the original code that removes the dead code?

Short answer: there's nothing in the current Frama-C version that will let you do that directly. Moreover, if your original code contains macros, Frama-C will not even see the real original code, as it relies on an external preprocessor (e.g. cpp) to do macro expansion.
Longer answer: Each statement in the normalized (aka CIL) Abstract Syntax Tree (AST, the internal representation of C code within Frama-C) contains information about the location (start point and end point) of the original statement where it stems from, and this information is also available in the original AST (aka Cabs). It might thus be possible for someone with a good knowledge of Frama-C's inner workings (e.g. a reader of the developer's manual), to build a correspondance between both, and to use that to detect dead statement in Cabs. Going even further, one could bypass Cabs, and identify zones in the original text of the program which are dead code. Note however that it would be a tedious and quite error prone (notably because a single original statement can be expanded in several normalized ones) task.

Given your clarifications, I stand by #Virgile's answer; but for people interested in performing some simplistic dead code elimination within Frama-C, the script below, gifted by a colleague who has no SO account, could be helpful.
(* remove_dead_code.ml *)
let main () =
!Db.Value.compute ();
Slicing.Api.Project.reset_slicing ();
let selection = ref Slicing.Api.Select.empty_selects in
let o = object (self)
inherit Visitor.frama_c_inplace
method !vstmt_aux stmt =
if Db.Value.is_reachable_stmt stmt then
selection :=
Slicing.Api.Select.select_stmt ~spare:true
!selection
stmt
(Extlib.the self#current_kf);
Cil.DoChildren
end in
Visitor.visitFramacFileSameGlobals o (Ast.get ());
Slicing.Api.Request.add_persistent_selection !selection;
Slicing.Api.Request.apply_all_internal ();
Slicing.Api.Slice.remove_uncalled ();
ignore (Slicing.Api.Project.extract "no-dead")
let () = Db.Main.extend main
Usage:
frama-c -load-script remove_dead_code.ml file.c -then-last -print -ocode output.c
Note that this script does not work in all cases and could have further improvements (e.g. to handle initializers), but for some quick-and-dirty hacking, it can still be helpful.

In GDB Python script, array indexing fails if frame’s language is Ada

I have a script to work out how much free stack space there is in each FreeRTOS task. GDB’s language is set to auto. The script works fine when the current language is c, but fails when the current language is ada.
I have, in the class Stacks,
tcb_t = gdb.lookup_type("TCB_t")
int_t = gdb.lookup_type("int")
used to:
find {Ada task control block}.Common.Thread,
thread = atcb["common"]["thread"]
convert to a pointer to the FreeRTOS task control block,
tcb = thread.cast(Stacks.tcb_t.pointer()).dereference()
find the logical top of the stack
stk = tcb["pxStack"].cast(Stacks.int_t.pointer())
Now I need to loop logically down the stack until I find an entry not equal to the initialised value,
free = 0
while stk[free] == 0xa5a5a5a5:
free = free + 1
which works fine if the current frame’s language is c, but if it’s ada I get
Python Exception <class 'gdb.error'> not an array or string:
Error occurred in Python command: not an array or string
I’ve traced this to the expression stk[free], which is being interpreted using the rules of the current language (in Ada, array indexing uses parentheses, so it would be stk(free), which is of course illegal since Python treats it as a function call).
I’ve worked round this by
def invoke(self, arg, from_tty):
gdb.execute("set language c")
...
gdb.execute("set language auto")
but it seems wrong not to set the language back to what it was originally.
So,
is there a way of detecting the current GDB language setting from Python?
is there an alternate way of indexing that doesn’t depend on the current GDB language setting?

What are kernel blocks in OpenCL?

In the article "How to set up Xcode to run OpenCL code, and how to verify the kernels before building" NeXTCoder referred to some code as the "Short Answer", i.e. https://developer.apple.com/library/mac/#documentation/Performance/Conceptual/OpenCL_MacProgGuide/XCodeHelloWorld/XCodeHelloWorld.html.
In that code the author says "Wrap your kernel code into a kernel block:" without explaining what is a "kernel block". (The OpenCL Programmer Guide for Mac OS X by Apple makes no mention of kernel block.)
The host program calls "square_kernel" but the sample kernel is called "square" and the sample kernel block is labelled "kernelName" (in italics). Can you please tell me how to put the 3 pieces together:kernel, kernel block & host program to run in Xcode 5.1? I only have one kernel. Thanks.

It's not really jargon. It's closure-like entity.
OpenCL C 2.0 adds support for the clang block syntax. You use the ^ operator to declare a Block variable and to indicate the beginning of a Block literal. The body of the Block itself is contained within {}, as shown in the example (as usual with C, ; indicates the end of the statement).The Block is able to make use of variables from the same scope in which it was defined.
Example:
int multiplier = 7;
int (^myBlock)(int) = ^(int num) {
return num * multiplier;
};
printf(“%d\n”, myBlock(3));
// prints 21
Source:
https://www.khronos.org/registry/cl/sdk/2.1/docs/man/xhtml/blocks.html

The term "kernel block" only seems to be a jargon to refer to the "part of the code that is the kernel". Particularly, the kernel block in this case is simply the function that is declared to be a kernel, by adding kernel before its declaration. Or, even simpler, and from the way how the term is used on this website, I would say that "kernel block" is the same as "kernel".
The kernelName (in italics) is a placeholder. The code there shows the general pattern of how to define any kernel:
It is prefixed with kernel
It returns void
It has a name ... the kernelName, which may for example be square
It has several input- and output parameters
The reason why the kernel is called square, but invoked with square_kernel seems to be some magic that is done by XCode: It seems to read the .cl file, and creates a .h file that contains additional declarations that are derived from the .cl file (as can be seen in this question, where a kernel called rebound is defined, and GCL generated a rebound_kernel declaration).

Matlab: Attempt to reference field of non-structure array

I am using the Kernel Density Estimator toolbox form http://www.ics.uci.edu/~ihler/code/kde.html . But I am getting the following error when I try to execute the demo files -
>> demo_kde_3
KDE Example #3 : Product sampling methods (single, anecdotal run)
Attempt to reference field of non-structure array.
Error in double (line 10)
if (npd.N > 0) d = 1; % return 1 if the density exists
Error in repmat (line 49)
nelems = prod(double(siz));
Error in kde (line 39)
if (size(ks,1) == 1) ks = repmat(ks,[size(points,1),1]); end;
Error in demo_kde_3 (line 8)
p = kde([.1,.45,.55,.8],.05); % create a mixture of 4 gaussians for
testing
Can anyone suggest what might be wrong? I am new to Matlab and having a hard time to figure out the problem.
Thank You,

Try changing your current directory away from the #kde folder; you may have to add the #kde folder to your path when you do this. For example run:
cd('c:\');
addpath('full\path\to\the\folder\#kde');
You may also need to add
addpath('full\path\to\the\folder\#kde\examples');
Then see if it works.
It looks like function repmat (a mathworks function) is picking up the #kde class's version of the double function, causing an error. Usually, only objects of the class #kde can invoke that functions which are in the #kde folder.
I rarely use the #folder form of class definitions, so I'm not completely sure of the semantics; I'm curious if this has any effect on the error.
In general, I would not recommend using the #folder class format for any development that you do. The mathworks overhauled their OO paradigm a few versions ago to a much more familiar (and useful) format. Use help classdef to see more. This #kde code seems to predate this upgrade.

MATLAB gives you the code line where the error occurs. As double and repmat belong to MATLAB, the bug probably is in kde.m line 39. Open that file in MATLAB debugger, set a breakpoint on that line (so the execution stops immediately before the execution of that specific line), and then when the code is stopped there, check the situation. Try the entire code line in console (copy-paste or type it, do not single-step, as causing an uncatched error while single-stepping ends the execution of code in debugger), it should give you an error (but doesn't stop execution). Then try pieces of the code of that code line, what works as it should and what not, eg. does the result of size(points, 1) make any sense.
However, debugging unfamiliar code is not an easy task, especially if you're a beginner in MATLAB. But if you learn and understand the essential datatypes of MATLAB (arrays, cell arrays and structs) and the different ways they can be addressed, and apply that knowledge to the situation on the line 39 of kde.m, hopefully you can fix the bug.

Repmat calls double and expects the built-in double to be called.
However I would guess that this is not part of that code:
if (npd.N > 0) d = 1; % return 1 if the density exists
So if all is correct this means that the buil-tin function double has been overloaded, and that this is the reason why the code crashes.
EDIT:
I see that #Pursuit has already addressed the issue but I will leave my answer in place as it describes the method of detection a bit more.

Global variable touched by a passed-in parameter becomes unusable

folks!
I pass a struct full of data to my kernel, and I run into the following difficulty using it (very stripped down):
[edit: mac osx / xcode 3.2 on mac book pro; this compile is obviously for cpu]
typedef struct
{
float xoom;
int sizex;
} varholder;
float zX, xd;
__kernel void Harlan( __global varholder * vh )
{
int X = get_global_id(0), Y = get_global_id(1);
zX = ( ( X - vh->sizex/2 ) / vh->xoom + vh->sizex/2 ); // (a)
xd = zX; // (b) BOOM!!
}
after executing line (a), the line marked (b), a simple assignment, gives "LLVM compiler failed to compile a function".
if, however, we do not execute line (a), then line (b) is fine.
So, through my fiddling around a LOT with this, it seems as if it is the assignment statement (a), which uses a passed-in parameter, that messes up the future access of the variable zX. However, of course I need to be able to use the results of calculations further down the line.
I have zX and xd declared at the file level because my helper functions need them.
Any thoughts?
Thanks!
David
p.s. I'm now registered so will be able to upvote and accept answers, which I am sadly unable to do for the last person who helped me (used same username to register, but can't seem to vote on the old post; sorry!).

No, say it ain't so!
I am sincerely hoping that this is not a "correct" answer to my own question. I found on another forum (though not the same question asked!) the following, and I am afraid that it refers to what I'm trying to do:
(quote)
You're doing something the standard prohibits. Section 6.5 says:
'All program scope variables must be declared in the __constant address space.'
In other words, program scope variables cannot be mutable.
(end quote)
... well, tcha!!!! What an astoundingly inconvenient restriction! I'm sure there's reasoning behind it.
[edit: Not At All inconvenient! it was in fact astonishingly easy to work around, given a fresh start the next morning. (And no alcohol.)]
You guys & dolls all knew this, right, and didn't have the heart to tell me?...

Develop Reference

r css asp.net wordpress firebase qt symfony nginx http apache-flex

What is the rule behind instruction count in Intel PIN? - intel-pin

Related

Performing dead code elimination / slicing from original source code in Frama-C

In GDB Python script, array indexing fails if frame’s language is Ada

What are kernel blocks in OpenCL?

Matlab: Attempt to reference field of non-structure array

Global variable touched by a passed-in parameter becomes unusable

Categories

Resources