I've been going over the book again and again and cannot understand why this is giving me "improper operand type". It should work!
This is inline assembly in Visual Studio.
void function(unsigned int* a) {
    unsigned int num;
    _asm {
        mov eax, a                  // This stores the address (start of the array) in eax
        mov num, dword ptr [eax*4]  // This is the line I am having issues with.
    }
}
On that last line, I am trying to load the 4-byte value that is in the array. But I get error C2415: improper operand type.
What am I doing wrong? How do I copy a 4-byte value from an array into a 32-bit register?
In Visual C++'s inline assembly, all variables are accessed as memory operands¹; in other words, wherever you write num you can think of the compiler replacing it with dword ptr [ebp - something].
Now, this means that in the last mov you are effectively trying to perform a memory-memory mov, which isn't provided on x86. Use a temporary register instead:
mov eax, dword ptr [a] ; load value of 'a' (which is an address) in eax
mov eax, dword ptr [eax] ; dereference address, and load contents in eax
mov dword ptr [num], eax ; store value in 'num'
Notice that I removed the *4, as it doesn't really make sense to multiply a pointer by four - maybe you meant to use a as a base plus some scaled index?
¹ Other compilers, such as gcc, provide means to control far more finely the interaction between inline assembly and compiler-generated code, which gives great flexibility and power but has quite a steep learning curve and requires great care to get everything right.
For example for:
type PERSONCV is
record
name: String ( 1..4 );
age: Integer;
cvtext: String ( 1..2000 );
end record;
N: constant := 40000;
persons : array ( 1..N ) of PERSONCV;
function jobcandidate return Boolean is
iscandidate: Boolean := False;
begin
for p of persons loop -- what code is generated for this?
if p.age >= 18 then
iscandidate := true;
exit;
end if;
end loop;
return iscandidate;
end;
In C/C++ the loop part would typically be:
PERSONCV * p; // address pointer
int k = 0;
while ( k < N )
{
p = &persons [ k ]; // pointer to k'th record
if ( p->age >= 18 )...
...
k++ ;
}
I have read that Ada uses value semantics for records.
Does the Ada loop above copy the k'th record to loop variable p?
e.g. like this in C/C++:
PERSONCV p; // object/variable
int k = 0;
while ( k < N )
{
memcpy ( &p, &persons [ k ], sizeof ( PERSONCV ) ); // copies k'th elem
if ( p.age >= 18 )...
...
k++ ;
}
Assuming you are using GNAT, there are two avenues of investigation.
The switch -gnatG will regenerate an Ada-like representation of what the front end of the compiler is going to pass to the back end (before any optimisations). In this case, I see
function resander__jobcandidate return boolean is
iscandidate : boolean := false;
L_1 : label
begin
L_1 : for C8b in 1 .. 40000 loop
p : resander__personcv renames resander__persons (C8b);
if p.age >= 18 then
iscandidate := true;
exit;
end if;
end loop L_1;
return iscandidate;
end resander__jobcandidate;
so the question is, how does renames get translated? Given that the record size is 2008 bytes, the chances of the compiler generating a copy are pretty much zero.
The second investigatory approach is to keep the assembly code that the compiler normally emits to the assembler and then deletes, using the switch -S. This confirms that the code generated is like your first C++ version (for macOS).
As an interesting sidelight, Ada 2012 allows an alternate implementation of jobcandidate:
function jobcandidate2 return Boolean is
(for some p of persons => p.age >= 18);
which generates identical code.
I suspect that what you have read about Ada is wrong, and probably worse, is encouraging you to think about Ada in the wrong way.
Ada's intent is to encourage thinking in the problem domain, i.e. to specify what should happen, rather than thinking in the solution domain, i.e. to implement the fine details of exactly how.
So here the intent is to loop over all Persons, exit returning True on meeting the first over 18, otherwise return False.
And that's it.
By and large Ada mandates nothing about the details of how it's done, provided those semantics are satisfied.
The intent, then, is that you simply expect the compiler to do the right thing.
Now an individual compiler may choose one implementation over another - or may switch between implementations according to optimisation heuristics, taking into account which CPU it's compiling for, as well as the size of the objects (will they fit into a register?) etc.
You could imagine a CPU with many registers, where a single cache line read makes the copy implementation faster than operating in place (especially if there are no modifications to write back to P's contents), or other target CPUs where the reverse was true. Why would you want to stop the compiler picking the better implementation?
A good example of this is Ada's approach to parameter passing to subprograms - name, value or reference semantics really don't apply - instead, you specify the parameter passing mode - in, out, or in out describing the information flow to (or from) the subprogram. Intuitive, provides semantics that can be more rigorously checked, and leaves the compiler free to pick the best (fastest, smallest, depending on your goal) implementation that correctly obeys those semantics.
Now it would be possible for a specific Ada compiler to make poor choices, and 30 years ago when computers were barely big enough to run an Ada compiler at all, you might well have found performance compromised for simplicity in early releases of a compiler.
But we have thirty more years of compiler development now, running on more powerful computers. So, today, I'd expect the compiler to normally make the best choice. And if you find a specific compiler missing out on performance optimisations, file an enhancement request. Ada compilers aren't perfect, just like any other compiler.
In this specific example, I'd normally expect P to be a cursor into the array, and operations to happen in-place, i.e. reference semantics. Or possibly a hybrid between forms, where one memory fetch into a register serves several operations, like a partial form of value semantics.
If your interest is academic, you can easily look at the assembly output from whatever compiler you're using and find out. Or write all three versions above and benchmark them.
Using a current compiler (GCC 7.0.0), I have copied your source to both an Ada program and a C++ program, using std::array<char, 4> etc. corresponding to String ( 1..4 ) etc. Switches were simply -O2 for C++, and -O2 -gnatp for Ada, so as to use comparable settings regarding checked access to array elements, etc.
These are the results for jobcandidate:
C++:
movl $_ZN15Loop_Over_Array7personsE+4, %eax
movl $_ZN15Loop_Over_Array7personsE+80320004, %edx
jmp .L3
.L8:
addq $2008, %rax
cmpq %rdx, %rax
je .L7
.L3:
cmpl $17, (%rax)
jle .L8
movl $1, %eax
ret
.L7:
xorl %eax, %eax
ret
Ada:
movl $1, %eax
jmp .L5
.L10:
addq $1, %rax
cmpq $40001, %rax
je .L9
.L5:
imulq $2008, %rax, %rdx
cmpl $17, loop_over_array__persons-2004(%rdx)
jle .L10
movl $1, %eax
ret
.L9:
xorl %eax, %eax
ret
One difference I see is in how each implementation uses %edx and %eax for going from one element of the array to the next and for testing whether the end has been reached. Ada seems to imulq the element size to compute the cursor, while C++ seems to addq it to the pointer.
I haven't measured performance.
Ex: Function implementation:
int facto(int x) {
    if (x == 1) {
        return 1;
    } else {
        return x * facto(x - 1);
    }
}
To put it more simply, let's trace the stack for facto(5):
                 returns
|2*(1)| ----> 2*(1)                  evaluates to 2
|3*(2)| ----> 3*(2) <______________| evaluates to 6
|4*(6)| ----> 4*(6) <______________| evaluates to 24
|5*(24)|----> 5*(24)<______________| evaluates to 120
--------                             finally back to main...
When a function returns in this reverse manner, how does it know what is behind it? The stack has activation records stored inside it, but how do they know about each other - which one is popped and which one is on top?
How does the stack keep track of all variables within the function being executed? Besides this, how does it keep track of what code is executed (the stack pointer)? When returning from a function call, the result of that function will be filled in a placeholder by using the stack pointer - but how does it know where to continue executing code? These are the basics of how the stack works, I know, but for recursion I don't understand how it exactly works.
When a function returns, its stack frame is discarded (i.e. the complete local state is popped off the stack).
The details depend on the processor architecture and language.
Check the C calling conventions for x86 processors: http://en.wikipedia.org/wiki/X86_calling_conventions, http://en.wikibooks.org/wiki/X86_Disassembly/Functions_and_Stack_Frames and search for "PC Assembly Language" by Paul A. Carter (it's a bit outdated, but it has a good explanation of C and Pascal calling conventions for the IA-32 architecture).
In C in x86 processors:
a. The calling function pushes the parameters of the called function onto the stack in reverse order, and then the call instruction pushes the return address.
push -6
push 2
call add # pushes the `end:` address and then jumps to `add:` (add(2, -6))
end:
# ...
b. Then the called function pushes the caller's frame base (the ebp register on IA-32), which the caller uses to reference its local variables.
add:
push ebp
c. The called function sets ebp to the current stack pointer (this ebp will be the reference to access the local variables and parameters of the current function instance).
add:
# ...
mov ebp, esp
d. The called function reserves space on the stack for its local (automatic) variables by subtracting the size of those variables from the stack pointer.
add:
# ...
sub esp, 4 # Reserves 4 bytes for a variable
e. At the end of the called function, it sets the stack pointer back to ebp (i.e. frees its local variables), restores the ebp register of the caller function, and returns to the return address (previously pushed by the call).
add:
# ...
mov esp, ebp # frees local variables
pop ebp # restores old ebp
ret # pops `end:` and jumps there
f. Finally, the caller adds to the stack pointer the space used by the parameters of the called function (i.e. frees the space used by the arguments).
# ...
end:
add esp, 8
Return values (unless they are bigger than the register) are returned in the eax register.
What's the name of the function that tells you how many bits are set in some variable? This surely already exists in Base or maybe some standard library.
To quote Keno Fischer...
Try count_ones. As you can see it uses the popcnt instruction:
julia> code_native(count_ones,(Int64,))
.section __TEXT,__text,regular,pure_instructions
Filename: int.jl
Source line: 192
push RBP
mov RBP, RSP
Source line: 192
popcnt RAX, RDI
pop RBP
ret
Is your question in any way related to the Hacker News buzz about Replacing a 32-bit loop count variable with 64-bit introduces crazy performance deviations?
This is probably a really stupid question, but how do you call a memory address in ASM? I am using the code call dword 557054 (557054 is where the code is located...) but I figure that it is calling 557054 + wherever the program got loaded into memory. I need this for my executable loader...
There are two ways to do this: you can use CALL or you can use JMP. The second is more flexible but requires you to do a little more work if you want some compatibility with C-style code.
Simple C-function call using CALL
push eax ; push args to stack
push ebx
call my_func ; my_func can be a c exported function or defined as a macro or asm function