Analyzing LLVM IR instruction - llvm-ir

I'm trying to analyze the following llvm-ir instruction:
%0 = call nonnull {}* inttoptr (i64 140019712366044 to {}* ({}*, i64)*)({}* inttoptr (i64 140019153768096 to {}*), i64 0)
I couldn't find any information on what is {}* in the docs. Is it a pointer to an empty struct ?
What does this instruction do ?

Is it a pointer to an empty struct ?
Yup. Somewhat like a void pointer. void* is not a valid type in LLVM IR; most frontends use i8* for this purpose.
What does this instruction do ?
It's an indirect call. The first {}* is the return type of the callee (and guaranteed to be a nonnull pointer at that!). The callee is the value i64 140019712366044 cast to a pointer to a function that returns {}* and takes two parameters, {}* and i64. For the first parameter, it passes i64 140019153768096 cast to {}*, and for the second parameter, 0.
Those constant numbers suggest that this is a JIT and those are pointers to things outside this LLVM IR module.

Related

Deallocating arrays defined from c_f_pointer

The following code compiles in both GNU gfortran and Intel ifort. But only the gfortran compiled version will run successfully.
program fort_tst
use iso_c_binding
INTEGER, POINTER :: a(:)
TYPE(C_PTR) :: ptr
INTEGER, POINTER :: b(:)
ALLOCATE(a(5))
ptr = c_loc(a)
CALL c_f_pointer(ptr,b,[5])
DEALLOCATE(b)
end program fort_tst
The error in the Intel compiled code is :
forrtl: severe (173): A pointer passed to DEALLOCATE points to an object that cannot be deallocated
Image PC Routine Line Source
fort_tst 000000000040C5A1 Unknown Unknown Unknown
fort_tst 0000000000403A17 Unknown Unknown Unknown
fort_tst 0000000000403812 Unknown Unknown Unknown
libc-2.17.so 00002AAAAB20F555 __libc_start_main Unknown Unknown
fort_tst 0000000000403729 Unknown Unknown Unknown
The gfortran code runs to completion. A quick valgrind check does not find any leaks.
Can someone confirm whether the code above is valid/legal code?
I am running
ifort (IFORT) 2021.2.0 20210228
and
GNU Fortran (GCC) 9.2.0
Copyright (C) 2019 Free Software Foundation, Inc.
UPDATE :
What is interesting is that gfortran does the right thing, (i.e. deallocates only allocated memory), even when the user tries to confound it with improper index remapping, or a bogus shape argument. So the internal array descriptor is being properly copied over with gfortran's c_f_pointer.
The error is issued because the compiler claims that the pointer that is being deallocated was not allocated by an ALLOCATE statement.
The rules are (F2018):
9.7.3.3 Deallocation of pointer targets
1 If a pointer appears in a DEALLOCATE statement, its association status shall be defined.
Deallocating a pointer that is disassociated or whose target was not
created by an ALLOCATE statement causes an error condition in the
DEALLOCATE statement. If a pointer is associated with an allocatable
entity, the pointer shall not be deallocated. A pointer shall not be
deallocated if its target or any subobject thereof is argument
associated with a dummy argument or construct associated with an
associate name.
Your pointer b was associated using the c_f_pointer subroutine. The error condition mentioned is the
forrtl: severe (173): A pointer passed to DEALLOCATE points to an object that cannot be deallocated
Now we have to be careful, the exact wording is
or whose target was not created by an ALLOCATE statement
The target arguably was created by an ALLOCATE statement, and then went through this indirect chain of association. I am not enough of a language lawyer to be sure whether the target still qualifies once it has passed through c_loc() and c_f_pointer().
Gfortran does not issue this error condition and then it works fine because at the end of the day, under the hood, what matters is that the address passed to the system free() function was allocated by the matching system malloc() function.
I think we can conclude that one of the compilers is wrong here, because the mention of the error condition is clear in the standard and either it should be issued or it should not. A third option, that gfortran just lets it work, should not happen. Either it is allowed, or an error condition shall be issued.
Re UPDATE: What gfortran does is really sending the address to free(). As long as the pointer is contiguous and starts at the first element, it will work in practice. The size is not necessary and is not passed to free(). The system allocator malloc()/free() stores the size of each allocated block in its own bookkeeping.
There are even worse abuse cases that can happen and will work just by chance due to this, even if completely illegal in Fortran.
See this:
use iso_c_binding
character, allocatable, target :: a
type(c_ptr) :: p
real, pointer :: b(:)
allocate(a)
p = c_loc(a)
call c_f_pointer(p, b, [1000])
deallocate(b)
end
gfortran is arguably missing a diagnostics opportunity when it comes to the DEALLOCATE statement. ifort is arguably too conservative when it comes to the DEALLOCATE statement.
The error message from ifort is an explicit design choice prohibiting the pointer from C_F_POINTER appearing in a DEALLOCATE statement:
Since the resulting data pointer fptr could point to a target that was not allocated with an ALLOCATE statement, fptr cannot be freed with a DEALLOCATE statement.
There seems little in Fortran 2018 explicitly to support that restriction (even in the case where the target was created by an ALLOCATE statement), and ifort itself isn't consistent in applying it:
use iso_c_binding
integer, pointer :: a, b
type(c_ptr) :: ptr
allocate(a)
ptr = c_loc(a)
call c_f_pointer(ptr,b)
deallocate(b)
end program
However, consider the case
use iso_c_binding
integer, pointer, dimension(:) :: a, b
type(c_ptr) :: ptr
allocate(a(5))
ptr = c_loc(a)
call c_f_pointer(ptr,b,[4])
deallocate(b)
end program
One would surely expect deallocation here to be problematic but this doesn't cause an error condition with gfortran: gfortran isn't carefully checking whether the target is deallocatable (note that it doesn't have to).
There is some subtlety in Fortran 2018's wording of C_F_POINTER (F2018 18.2.3.3)
If both X and FPTR are arrays, SHAPE shall specify a size that is less than or equal to the size of X, and FPTR becomes associated with the first PRODUCT (SHAPE) elements of X (this could be the entirety of X).
and in whether "the entirety" of a forms a valid thing to deallocate. Still, ifort's documentation is seemingly too strict, and gfortran's checking is not going to catch all invalid cases. There is a case for talking to the vendor of each compiler.
That said, the use of a C_F_POINTER's pointer in a DEALLOCATE statement clearly is more prone to error than "simpler" pointers, and these errors are not ones where we can rely on a compiler to point them out. Even with a conclusion of "clearly this is allowed" I personally would recommend avoiding this approach wherever that is possible without causing other problems.
Usage of c_f_pointer is pretty standard behavior in case a Fortran derived type is to be passed to a C++ class as an opaque pointer type, see e.g. the following interoperable class:
module mytype_m
use iso_c_binding
implicit none
private
type, public :: mytype
real, allocatable :: data(:)
contains
procedure :: destroy
procedure :: init
procedure :: printout
end type mytype
public :: mytype_print_c
public :: mytype_init_c
public :: mytype_destroy_c
contains
subroutine init(this,data)
class(mytype), intent(inout), target :: this
real, intent(in) :: data(:)
call destroy(this)
this%data = data
end subroutine init
elemental subroutine destroy(this)
class(mytype), intent(inout), target :: this
integer :: ierr
deallocate(this%data,stat=ierr)
end subroutine destroy
subroutine printout(this)
class(mytype), intent(inout), target :: this
integer :: ndata,i
ndata = merge(size(this%data),0,allocated(this%data))
write(*,1) ndata,(this%data(i),i=1,ndata)
1 format('mytype object has data(',i0,')',:,' = ',*(f3.1,:,', '))
end subroutine printout
subroutine mytype_print_c(this) bind(C,name='mytype_print_c')
type(c_ptr), intent(inout) :: this
type(mytype), pointer :: fortranclass
call c_f_pointer(this, fortranclass)
call fortranclass%printout()
end subroutine mytype_print_c
subroutine mytype_destroy_c(this) bind(C,name='mytype_destroy_c')
type(c_ptr), intent(inout) :: this
type(mytype), pointer :: fortranclass
call c_f_pointer(this, fortranclass)
if (associated(fortranclass)) then
call fortranclass%destroy()
deallocate(fortranclass)
end if
! Nullify C pointer
this = c_null_ptr
end subroutine mytype_destroy_c
subroutine mytype_init_c(this,ndata,data) bind(C,name='mytype_init_c')
type(c_ptr), intent(inout) :: this
integer(c_int), intent(in), value :: ndata
real(c_float), intent(in) :: data(ndata)
type(mytype), pointer :: fortranclass
integer :: ierr
! In case it was previously allocated
call c_f_pointer(this, fortranclass)
allocate(fortranclass,stat=ierr)
call fortranclass%init(data)
this = c_loc(fortranclass)
end subroutine mytype_init_c
end module mytype_m
that would be bound to an opaque pointer in c++:
#include <iostream>
#include <vector>
using namespace std;
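// Fortran interoperability: the Fortran routines take type(c_ptr) by
// reference, so the C++ side must pass the address of a pointer-sized handle.
typedef void* mytype;
extern "C" { void mytype_print_c(mytype *self);
void mytype_destroy_c(mytype *self);
void mytype_init_c(mytype *self, const int ndata, float *data); }
// Class definition
class mytype_cpp
{
public:
mytype_cpp(std::vector<float> data) { mytype_init_c(&handle,data.size(),data.data()); };
~mytype_cpp() { mytype_destroy_c(&handle); };
void printout() { mytype_print_c(&handle); };
private:
// Opaque handle to the Fortran object: set by mytype_init_c, reset by mytype_destroy_c
mytype handle = nullptr;
};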
int main()
{
// Print 8--size
std::vector<float> data {1.,2.,3.,4.,5.,6.,7.,8.};
mytype_cpp obj(data); obj.printout();
return 0;
}
which, with gfortran-10, returns
mytype object has data(8) = 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0
I don't have a chance to test with ifort, but it works seamlessly with gcc, how can this approach not be Fortran standard-compliant?
The posts above inspired the following solution. The idea is to create a type that wraps the actual data array. Then the c_loc/c_f_pointer sequence works fine with a pointer to a scalar object. The data array stored in the type can be safely allocated and deallocated, along with the wrapper object itself.
MODULE arraytype_m
TYPE, PUBLIC :: arraytype
INTEGER, ALLOCATABLE :: data(:)
END TYPE arraytype
END MODULE arraytype_m
PROGRAM fort_tst
USE iso_c_binding
USE arraytype_m
TYPE(arraytype), POINTER :: a, b
TYPE(C_PTR) :: ptr
ALLOCATE(a)
ALLOCATE(a%data(5))
!! Set to C-style pointer, and then copy back to Fortran pointer.
ptr = c_loc(a)
CALL c_f_pointer(ptr,b)
DEALLOCATE(b%data)
DEALLOCATE(b)
END PROGRAM fort_tst
This works with both Intel and gfortran, and is really a better solution than what I was trying to do.
Special thanks to @Federico for posting the C++/Fortran code that made this solution obvious.
Update : A complete code, which shows how the ptr above can be stored in C.
// C code
typedef void* arraytype;
void allocate_array(arraytype *ptr);
void deallocate_array(arraytype *ptr);
void do_something(arraytype *ptr);
int main()
{
arraytype ptr;
allocate_array(&ptr);
do_something(&ptr);
deallocate_array(&ptr);
return 0;
}
and the corresponding Fortran :
!! Fortran code
MODULE arraytype_mod
TYPE, PUBLIC :: arraytype
DOUBLE PRECISION, POINTER :: data(:)
END TYPE arraytype
END MODULE arraytype_mod
SUBROUTINE allocate_array(ptr) BIND(C,name='allocate_array')
USE iso_c_binding
USE arraytype_mod
TYPE(c_ptr) :: ptr
TYPE(arraytype), POINTER :: a
ALLOCATE(a)
ALLOCATE(a%data(5))
ptr = c_loc(a)
END
SUBROUTINE deallocate_array(ptr) BIND(C,name='deallocate_array')
USE iso_c_binding
USE arraytype_mod
TYPE(C_PTR) :: ptr
TYPE(arraytype), pointer :: a
CALL c_f_pointer(ptr,a)
DEALLOCATE(a%data)
DEALLOCATE(a)
END
SUBROUTINE do_something(ptr) BIND(C,name='do_something')
USE iso_c_binding
USE arraytype_mod
TYPE(c_ptr) :: ptr
TYPE(arraytype), POINTER :: a
CALL c_f_pointer(ptr,a)
a%data = 2.5
WRITE(6,*) a%data
END

Pointer methods on non pointer types

According to this response to this question
The rule about pointers vs. values for receivers is that value methods can be invoked on pointers and values, but pointer methods can only be invoked on pointers
But in fact I can execute pointer method on non pointer values:
package main
import "fmt"
type car struct {
wheels int
}
func (c *car) fourWheels() {
c.wheels = 4
}
func main() {
var c = car{}
fmt.Println("Wheels:", c.wheels)
c.fourWheels()
// Here i can execute pointer method on non pointer value
fmt.Println("Wheels:", c.wheels)
}
So, what is wrong here? Is this a new feature, or is the response to the question wrong?
You are calling a "pointer method" on a pointer value. In the expression:
c.fourWheels()
c is of type car (non-pointer); since the car.fourWheels() method has a pointer receiver and because the receiver value is a non-pointer and is addressable, it is a shorthand for:
(&c).fourWheels()
This is in Spec: Calls:
If x is addressable and &x's method set contains m, x.m() is shorthand for (&x).m().
The statement:
The rule about pointers vs. values for receivers is that value methods can be invoked on pointers and values, but pointer methods can only be invoked on pointers
Interpret it like this:
If you have a value method, you can always call it: if you have a value, it's ready to be the receiver; and if you have a pointer, you can always dereference it to obtain a value ready to be the receiver.
If you have a pointer method, you may not always be able to call it if you only have a value, as there are several expressions whose results are not addressable, so you cannot obtain a pointer to them to use as the receiver; examples are function return values and map indexing expressions. For details and examples, see How can I store reference to the result of an operation in Go?; and How to get the pointer of return value from function call? (Sure, you could always assign it to a local variable and take its address, but that's a copy and the pointer method could only modify this copy and not the original.)

Pointer to a struct (or lack thereof)

Let's say I have defined this struct:
type Vertex struct {
X, Y float64
}
now it's perfectly legal Go to use it like this:
func (v *Vertex) Abs() float64 {
return math.Sqrt(v.X*v.X + v.Y*v.Y)
}
func main() {
v := &Vertex{3, 4}
fmt.Println(v.Abs())
}
but it's also ok not to use a pointer:
func main() {
v := Vertex{3, 4}
fmt.Println(v.Abs())
}
The result in both cases is the same, but how are they different internally? Does the use of a pointer make the program run faster?
PS. I get it that the Abs() function needs a pointer as a receiver. That explains the reason why a pointer has been used later in the main function. But why doesn't the program spit out an error when I don't use a pointer and directly call Abs() on a struct instance?
why doesn't the program spit out an error when I don't use a pointer and directly call Abs() on a struct instance?
Because you can get the pointer to (address of) a struct instance.
As mentioned in "What do the terms pointer receiver and value receiver mean in Golang?"
Go will auto address and auto-dereference pointers (in most cases) so m := MyStruct{}; m.DoOtherStuff() still works since Go automatically does (&m).DoOtherStuff() for you.
As illustrated by "Don't Get Bitten by Pointer vs Non-Pointer Method Receivers in Golang" or "Go 101: Methods on Pointers vs. Values", using a pointer receiver (v *Vertex) is great to avoid copy, since Go passes everything by value.
The spec mentions (Method values):
As with method calls, a reference to a non-interface method with a pointer receiver using an addressable value will automatically take the address of that value: t.Mp is equivalent to (&t).Mp.

uintptr_t not converting the value back to pointer

I'm using Cython to wrap a C++ library, where I use a (uintptr_t)(void *) cast to pass pointers to Python callers and get them back as handles. In one such scenario, I pass a cast pointer as a Python integer to another Cython function. In the original function where the pointer is declared, the reverse cast to (Class *)(void *) successfully generates the original pointer value [verified within C++]. In the other Cython function, which uses the handle, the reverse cast gives some other pointer value, leading to a crash [verified within C++]. Does a change in the size of the object affect the reverse cast from uintptr_t to (Class *)(void *)? Or is there any other requirement on such casts and reverse casts?
class A:
    @property
    def cppobj(self):
        """
        - return pointer to C++ Object
        """
        cdef uintptr_t ptr = <uintptr_t><void *> self._obj
        # call to printptr C++ method
        # argument - <cpp.A *><void *> ptr
        # prints: 0x8805508
        return <uintptr_t><void *> self._obj
class B:
    def useA(self):
        # call to printptr C++ method
        # argument - <cpp.A *><void *> A.cppobj
        # prints: 0x880b718

C++11- Use nullptr all the time?

I'm just a little bit confused.
When should I use nullptr?
I've read on some sites that it should always be used, but I can't set nullptr for a non-pointer for example:
int myVar = nullptr; // Not a pointer, of course
Should I always use NULL for non-pointers and nullptr for pointers?
Thanks for any help! I'm very new to C++11 (and C++ overall).
Always use nullptr when initializing pointers to a null pointer value, that's what it is meant for according to draft n3485.
[lex.nullptr] paragraph 1
The pointer literal is the keyword nullptr. It is a prvalue of type
std::nullptr_t. [ Note: std::nullptr_t is a distinct type that is
neither a pointer type nor a pointer to member type; rather, a prvalue
of this type is a null pointer constant and can be converted to a
null pointer value or null member pointer value. [...] — end note ]
Now onto the use of NULL.
According to the same draft it shall be defined as follows.
[diff.null] paragraph 1
The macro NULL, [...] is an
implementation-defined C++ null pointer constant in this
International Standard.
and null pointer constant as follows.
[conv.ptr] paragraph 1
A null pointer constant is an integral constant expression [...]
prvalue of integer type that evaluates to zero or a prvalue of type
std::nullptr_t.
That is, it is implementation-defined whether NULL is defined as an integer prvalue evaluating to zero or as a prvalue of type std::nullptr_t. If the given implementation of the standard library chooses the former, then NULL can be assigned to integer types and it is guaranteed to set them to zero; if the latter is chosen, the compiler is allowed to issue an error and declare the program ill-formed.
In other words, although conditionally valid [read IB], initializing an integer using NULL is most probably a bad idea; just use 0 instead.
On the other hand, according to the above, NULL is guaranteed to initialize pointers to a null pointer value, much like nullptr does. But NULL is a macro, which comes with several caveats, whereas nullptr is a prvalue of a specific type, for which type checking and conversion rules apply. That's mostly why nullptr should be preferred.
Consider the two function overloads:
void foo(int);
void foo(int*);
In C++, people tend to use 0 as a null value. Other people use NULL (often just a fancy macro for 0).
If you call foo(0), overload resolution picks foo(int), because 0 is an int literal, even if a null pointer was intended. foo(NULL) either does the same or is rejected as ambiguous, depending on how the implementation defines NULL. foo(nullptr) clears up this problem and will always call foo(int*).

Resources