Intentional type mismatch in Fortran - pointers

I'd like to turn a legacy Fortran code into standard-compliant modern Fortran, so that I can turn on compiler warnings, interface checking, etc. At this stage I don't want to change the functionality, just keep it working as closely as possible to what it did before while keeping compilers happy.
My current problem is that the code in many places passes arrays of the wrong type, e.g. a real array to a subroutine that has an integer dummy argument. This is not a bug per se, since it is intentional and it works as intended (at least in common configurations). Now, how can I do the same while keeping the code compliant? Consider the following example:
program cast
implicit none
double precision :: a(10)
call fill_dble(a,10)
call print_dble(a,10)
call fill_int(a,10)
!call fill_int(cast_to_int(a),10)
call print_dble(a,10)
call print_int(a(1),10)
!call print_int(cast_to_int(a),10)
call print_dble(a(6),5)
contains
function cast_to_int(a) result(b)
use iso_c_binding
implicit none
double precision, target :: a(*)
integer, pointer :: b(:)
call c_f_pointer(c_loc(a(1)), b, [1])
end function
end program
subroutine fill_dble(b,n)
implicit none
integer :: n, i
double precision :: b(n)
do i = 1, n
b(i) = i
end do
end subroutine
subroutine print_dble(b,n)
implicit none
integer :: n
double precision :: b(n)
write(6,'(10es12.4)') b
end subroutine
subroutine fill_int(b,n)
implicit none
integer :: n, b(n), i
do i = 1, n
b(i) = i
end do
end subroutine
subroutine print_int(b,n)
implicit none
integer :: n, b(n)
write(6,'(10i4)') b
end subroutine
When I compile it and run it (gfortran 4.8 or ifort 18), I get, as expected:
1.0000E+00 2.0000E+00 3.0000E+00 4.0000E+00 5.0000E+00 6.0000E+00 7.0000E+00 8.0000E+00 9.0000E+00 1.0000E+01
4.2440-314 8.4880-314 1.2732-313 1.6976-313 2.1220-313 6.0000E+00 7.0000E+00 8.0000E+00 9.0000E+00 1.0000E+01
1 2 3 4 5 6 7 8 9 10
6.0000E+00 7.0000E+00 8.0000E+00 9.0000E+00 1.0000E+01
The first half of the real array is overwritten with integers (because integers are half the size), but when printed as integers the "right" values are there. But this is non-compliant code. When I try to fix it by activating the cast_to_int function (and disabling the calls without it), I do get something that compiles without warnings, and with gfortran I get the same result. With ifort, however, I get:
1.0000E+00 2.0000E+00 3.0000E+00 4.0000E+00 5.0000E+00 6.0000E+00 7.0000E+00 8.0000E+00 9.0000E+00 1.0000E+01
1.0000E+00 2.0000E+00 3.0000E+00 4.0000E+00 5.0000E+00 6.0000E+00 7.0000E+00 8.0000E+00 9.0000E+00 1.0000E+01
0******** 0 5 6 7 8 9 10
6.0000E+00 7.0000E+00 8.0000E+00 9.0000E+00 1.0000E+01
which I can't understand. Moreover, ifort with -O0 crashes (it doesn't with the other version of the code).
I know the code is still not quite correct, because the pointer returned by cast_to_int still has size 1, but I believe that is a separate problem.
What am I doing wrong, or how can I get ifort to do what I want?
EDIT: Following @VladimirF's reply, I added, after implicit none:
interface
subroutine fill_int(b,n)
!dec$ attributes no_arg_check :: b
integer :: n, b(n)
end subroutine
subroutine print_int(b,n)
!dec$ attributes no_arg_check :: b
integer :: n, b(n)
end subroutine
end interface
but compiling with warnings on still gives me an error:
$ ifort cast2.f90 -warn all
cast2.f90(17): error #6633: The type of the actual argument differs from the type of the dummy argument. [A]
call fill_int(a,10)
--------------^
cast2.f90(20): error #6633: The type of the actual argument differs from the type of the dummy argument. [A]
call print_int(a(1),10)
---------------^
compilation aborted for cast2.f90 (code 1)

Intel Fortran supports the !dec$ attributes no_arg_check directive. It instructs the compiler "that type and shape matching rules related to explicit interfaces are to be ignored".
"It can be applied to an individual dummy argument name or to the routine name, in which case the option is applied to all dummy arguments in that interface."
It should be applied to a module procedure (or an interface block), so you should move your functions and subroutines into a module.
Many other compilers have similar directives.
What is wrong with your code? As a rule of thumb, never use Fortran functions that return pointers. They are pure evil. Fortran pointers are completely different from C pointers.
When you do call fill_int(cast_to_int(a),10) what happens is that the expression cast_to_int(a) is evaluated and the result is an array. Now depending on the optimizations the compiler may choose to pass the address of the original pointer, but it may also create a copy of the result integer array and pass a copy to the subroutine.
Also, your array a does not have the target attribute, so the address used inside cast_to_int(a) is only valid inside the function and is not valid after it returns.
You should create the pointer b inside the main program and just pass b instead of a. It will work similarly to equivalence. Looking at the values stored as a different type will not be standard-conforming anyway. This form of type punning is not allowed.
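As a rough sketch of that suggestion, the alias can be set up once in the caller, where a carries the target attribute, and then passed on. The module name int_routines and the program name cast_in_caller are made up for this illustration, and the length computation assumes the storage of a divides evenly into default integers. As noted above, reading the same storage as two different types is still not standard-conforming; this only avoids the pointer-valued function.

```fortran
module int_routines
  implicit none
contains
  subroutine fill_int(b, n)
    integer, intent(in)  :: n
    integer, intent(out) :: b(n)
    integer :: i
    do i = 1, n
      b(i) = i
    end do
  end subroutine fill_int

  subroutine print_int(b, n)
    integer, intent(in) :: n
    integer, intent(in) :: b(n)
    write(*, '(10i4)') b
  end subroutine print_int
end module int_routines

program cast_in_caller
  use iso_c_binding, only: c_loc, c_f_pointer
  use int_routines
  implicit none
  double precision, target :: a(10)   ! target, so c_loc(a) stays valid here
  integer, pointer :: b(:)
  integer :: nint

  a = 0.0d0
  ! Number of default integers that fit in the storage of a
  ! (assumption: the sizes divide evenly, e.g. 64-bit reals, 32-bit integers).
  nint = size(a) * storage_size(a) / storage_size(0)

  ! b now aliases the storage of a; it lives as long as a does.
  call c_f_pointer(c_loc(a), b, [nint])

  call fill_int(b, 10)
  call print_int(b, 10)
end program cast_in_caller
```

Because b is an ordinary pointer variable in the same scope as a, no temporary function result is involved, so the compiler has no reason to pass a copy.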

I found a possible general solution that seems to work. The code I have to deal with looks something like this:
subroutine some_subroutine(a,b,c,d,...)
real a(*),b(*),c(*),d(*)
! many more declarations, including common blocks
!...
call other_subroutine(a,b(idx),c,...)
!...
end subroutine some_subroutine
! this typically in another file:
subroutine other_subroutine(x,y,z,...)
real x(*)
integer y(*)
logical z(*)
! other declarations and common blocks
! unreadable code with calls to other procedures
! not clear which arguments are input and output
end subroutine other_subroutine
I now modify it to be:
subroutine some_subroutine(a,b,c,d,...)
real a(*),b(*),c(*),d(*)
! many more declarations, including common blocks
call inner_sub(b,c)
contains
subroutine inner_sub(b,c)
use iso_c_binding
real, target :: b(*),c(*)
integer, pointer :: ib(:)
logical, pointer :: lc(:)
!...
call c_f_pointer(c_loc(b(idx)),ib,[1]) ! or use the actual length if I can figure it out
call c_f_pointer(c_loc(c(1)),lc,[1])
call other_subroutine(a,ib,lc,...)
nullify(ib,lc)
!...
end subroutine inner_sub
end subroutine some_subroutine
leaving other_subroutine untouched. If I put the target attribute directly on the dummy arguments of the outer routine, I have to add an explicit interface to anything calling it, so instead I wrap the inner code. By using contains I don't need to pass all variables, just those that will be "punned". The c_f_pointer call should be done right before the problematic call, since index variables (idx in the example) could be in common blocks and changed in other calls, for example.
Any pitfalls, apart from those already present in the original code?
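For reference, a stripped-down, compilable version of the wrapper pattern might look like the sketch below. The argument lists (idx and n passed explicitly), the loop body of other_subroutine, and the program name wrapper_demo are invented for the illustration, and it assumes default real and default integer occupy the same amount of storage.

```fortran
! Legacy routine, left untouched: expects an integer array as second argument.
subroutine other_subroutine(x, y, n)
  implicit none
  integer :: n, i
  real    :: x(*)
  integer :: y(*)
  do i = 1, n
    x(i) = real(i)
    y(i) = i
  end do
end subroutine other_subroutine

subroutine some_subroutine(a, b, idx, n)
  implicit none
  real    :: a(*), b(*)
  integer :: idx, n
  call inner_sub(b)
contains
  ! Only the "punned" array is passed; a, idx and n are host-associated.
  subroutine inner_sub(b)
    use iso_c_binding, only: c_loc, c_f_pointer
    real, target     :: b(*)
    integer, pointer :: ib(:)
    ! Build the alias right before the call, using the current value of idx.
    call c_f_pointer(c_loc(b(idx)), ib, [n])
    call other_subroutine(a, ib, n)
    nullify(ib)
  end subroutine inner_sub
end subroutine some_subroutine

program wrapper_demo
  implicit none
  real :: a(4), b(8)
  a = 0.0
  b = 0.0
  call some_subroutine(a, b, 3, 4)
  print *, a
  ! transfer reinterprets the bits of b as default integers for display.
  print *, transfer(b, [0])
end program wrapper_demo
```

The pointer ib is only used while inner_sub is active, which matters because a pointer associated with a target dummy argument should not be relied on after the procedure returns.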

Related

Is using a default method pointer valid Fortran? (IFort compiler bug)

I just want to know for sure if this is valid Fortran or if I've misunderstood some usage. Is the following code valid?
Module MathFxns
implicit none
Type A_T
procedure(DoStuff_F), nopass, pointer :: method => add
contains
End Type A_T
Abstract Interface
Function DoStuff_F(a, b) result(c)
integer, intent(in) :: a, b
integer :: c
End Function DoStuff_F
End Interface
contains
function add(a, b) result(c)
integer, intent(in) :: a, b
integer :: c
c = a + b
end function add
End Module MathFxns
program Main
use MathFxns
implicit none
type(A_T) :: math
print *, math%method(2, 5)
end program Main
I just had to track down a compiler bug, that was being caused by something I think is valid Fortran. I'd submit to the compiler team, but I don't have a way to replicate as it's buried pretty far down in the stack and down multiple libraries before it caused a compiler bug and it doesn't happen in every program that uses it.
Edit: I didn't mention it before because it is complicated to explain, but since there was some curiosity, I'll try.
The production code does work in some executables, but recently I implemented it in another project which caused a compiler bug. I'll try to make a pseudo code example to illustrate, but first is a description. In the production code I have a type that has a default procedure pointer to a function (just like above). An instance of that type is a field of an abstract type. That abstract type is extended by another abstract type, then in a subsequent library that type is extended by another abstract type, which is then extended by a concrete type in another library. Finally an executable makes use of that concrete type. The module that has an instance of the concrete type throws a compiler error.
In the production code, it is an ODE Solver, with functionality wrapped into an entity type that gets extended a few times before being implemented.
It took me 6 hours, but after commenting and uncommenting line after line, the cause of the error was shown to be the default procedure pointer in the type. Whether that is the actual error or not, I can't know, but removing the default pointer (and pointing in the construction subroutine) made the project work again.
!this is in the first static library project
Module Library1
implicit none
Type A_T
!more stuff
procedure(DoStuff_F), nopass, pointer :: method => add
contains
!more procedures
End Type A_T
Type, abstract :: B1_A
type(A_T) :: a
!more stuff and procedures
End Type B1_A
Type, extends(B1_A), abstract :: B2_A
!more stuff and procedures
End Type B2_A
Abstract Interface
Function DoStuff_F(a, b) result(c)
integer, intent(in) :: a, b
integer :: c
End Function DoStuff_F
End Interface
contains
function add(a, b) result(c)
integer, intent(in) :: a, b
integer :: c
c = a + b
end function add
End Module Library1
! this is in the second static library project
Module Library2
use Library1
implicit none
Type, extends(B2_A), abstract :: B3_A
!more stuff and procedures
End Type B3_A
End Module Library2
! this is in the third static library project
Module Library3
use Library2
implicit none
Type, extends(B3_A) :: C_T
!more stuff and procedures
End Type C_T
End Module Library3
!this is in a fourth executable project
program Main
use Library3
implicit none
type(C_T) :: math
print *, math%a%method(2, 5)
end program Main

Strange fortran pointers associations

It seems to me that, no matter how the following subroutine is called, it should never print true:
type :: node
class(node), pointer :: next => null()
end type node
...
subroutine add(nod)
class(node) :: nod
type(node), pointer :: new
allocate(new)
new%next=> nod%next
nod%next=> null()
new%next=> null()
nod%next=> new
print*, associated(new%next,target=new)
print*, associated(nod%next,target=new%next)
print*, associated(new%next)
end subroutine
However, using this subroutine in a larger code, I get true (in all the prints) at least one time.
This is not a minimal example, because when I construct one, I get false, of course. Actually, in the full code I get false many times before I see the true. So I am wondering: what happens in the full code to achieve this? In other words, how can I build a program that calls the add subroutine in such a way that the output is true? Could there be another unrelated bug, such as a memory issue in other parts of the code, that explains this behavior?
EDIT
I am not sure how, but this is related to an allocate(a,source=b) statement where a and b are objects like:
type :: atype
type(node),pointer :: head
end type
After I changed allocate(a,source=b) to
allocate(a)
allocate(a%head)
The problem disappears.
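The post does not establish why this change helped. One property of sourced allocation that may be relevant is that pointer components are copied shallowly: after allocate(a,source=b), a%head and b%head refer to the same node. The sketch below only demonstrates that property; declaring a and b as allocatable scalars and the program name shallow_copy are assumptions made for the illustration.

```fortran
program shallow_copy
  implicit none
  type :: node
    class(node), pointer :: next => null()
  end type node
  type :: atype
    type(node), pointer :: head
  end type atype

  type(atype), allocatable :: a, b   ! assumption: allocatable scalars

  allocate(b)
  allocate(b%head)

  ! Sourced allocation copies the value of b, including the pointer
  ! association of b%head; no new node is created for a%head.
  allocate(a, source=b)

  ! True: both heads are associated with the same node.
  print *, associated(a%head, b%head)

  deallocate(b%head)   ! a%head is now dangling and must not be used
  deallocate(a, b)
end program shallow_copy
```

With two lists sharing one head node, an insertion through one list is visible through the other, which is the kind of aliasing that could produce unexpected associated results.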

Pointer to derived type that contains allocatable array

Generally speaking, I want to give shorter names to allocatable variables in a derived type that is passed through subroutine arguments. Writing everything as 'derived%type_xx' is not so pleasant. Besides, I don't want to spend extra memory copying the values of the derived type into new variables. Furthermore, I know allocatable arrays are preferred over pointers for many reasons. I tried to define pointers to the allocatable variables, but failed. I tried this because I want to simplify my code, making it both more readable and shorter. Is there a way of achieving this goal? Thanks.
Here's the demonstration code:
Module module_type
IMPLICIT NONE
TYPE type_1
REAL,ALLOCATABLE :: longname_1(:), longname_2(:)
END TYPE
END MODULE
!------------------------------------------------------------------------------------------
SUBROUTINE TEST(input)
USE module_type
IMPLICIT NONE
TYPE(type_1) :: input
input%longname_1 = input%longname_1 + input%longname_2 ! Use one line to show what I mean
END SUBROUTINE
And here's what failed:
Module module_type
IMPLICIT NONE
TYPE type_1
REAL,ALLOCATABLE :: longname_1(:), longname_2(:)
END TYPE
END MODULE
!------------------------------------------------------------------------------------------
SUBROUTINE TEST(input)
USE module_type
IMPLICIT NONE
TYPE(type_1),TARGET :: input
REAL,POINTER :: a => input%longname_1 &
& b => input%longname_2
a = a + b ! much better for reading
END SUBROUTINE
It seems like a small issue, but I'd like to read my code without too much pain in the future. So what's the best option? Thanks a lot.
You can use the ASSOCIATE construct to associate a simple name with a more complex designator or expression.
You could also use the subobjects of the derived type as actual arguments to a procedure that carried out the operation.
Your pointer approach failed because you had a rank mismatch: you were attempting to associate scalar pointers with array targets. You may also have had problems if an explicit interface to your procedure was not available in the calling scope. An explicit interface is required for procedures with dummy arguments with the TARGET attribute.
Use of pointers for this sort of simple name aliasing may reduce the ability of the compiler to optimize the code. Something like ASSOCIATE should be preferred.
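The second suggestion, passing the components as actual arguments, could be sketched as follows. The helper name add_into and the driver program are made up for the illustration; the type and component names follow the question.

```fortran
module module_type
  implicit none
  type type_1
    real, allocatable :: longname_1(:), longname_2(:)
  end type type_1
contains
  subroutine test(input)
    type(type_1), intent(inout) :: input
    ! The long names appear once, at the call; no data is copied.
    call add_into(input%longname_1, input%longname_2)
  end subroutine test

  ! Inside the helper the arrays have short local names.
  subroutine add_into(a, b)
    real, intent(inout) :: a(:)
    real, intent(in)    :: b(:)
    a = a + b
  end subroutine add_into
end module module_type

program demo_subobjects
  use module_type
  implicit none
  type(type_1) :: x
  allocate(x%longname_1(3), x%longname_2(3))
  x%longname_1 = [1.0, 2.0, 3.0]
  x%longname_2 = [10.0, 20.0, 30.0]
  call test(x)
  print *, x%longname_1
end program demo_subobjects
```

Keeping both routines in a module gives the explicit interface that the assumed-shape dummies a(:) and b(:) require.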
Update: After @IanH made his comment, I have gone back to check: I was completely and utterly wrong about why your code failed. As he pointed out in his answer, the main issue is that pointer and target have to have the same rank, so you'd have to declare a and b as:
real, pointer :: a(:), b(:)
Secondly, before you can actually point these pointers to the targets, the targets have to be allocated. Here's an example that works:
program allocatable_target
implicit none
type :: my_type
integer, allocatable :: my_array(:)
end type my_type
type(my_type), target :: dummy
integer, pointer :: a(:)
allocate(dummy%my_array(10))
a => dummy%my_array
a = 10
print *, dummy%my_array
end program allocatable_target
If you have a Fortran 2003 compatible compiler, you can use associate -- which is specifically meant for this kind of issue. Here's an example:
program associate_example
implicit none
type :: my_type
integer, allocatable :: long_name_1(:), long_name_2(:)
end type my_type
type(my_type) :: input
integer :: i
allocate(input%long_name_1(100), input%long_name_2(100))
associate (a=>input%long_name_1, b=>input%long_name_2)
a = (/ (i, i = 1, 100, 1) /)
b = (/ (2*i+4, i = 1, 100, 1) /)
a = a + b
end associate
print *, input%long_name_1
end program associate_example
Inside the associate block, you can use a and b as a shortform for the declared longer named variables.
But other than that, I suggest you get an editor with proper code completion, then long variable names are not that much of an issue any more. At the moment I'm trying out Atom and am quite happy with it. But I have used vim with the proper expansions for a long time.

Segmentation faults in Fortran recursive tree implementation

I need to implement a tree structure in Fortran for a project, so I've read various guides online explaining how to do it. However, I keep getting errors or weird results.
Let's say I want to build a binary tree where each node stores an integer value. I also want to be able to insert new values into a tree and to print the nodes of the tree. So I wrote a type "tree" that contains an integer, two pointers towards the children sub-trees and a boolean which I set to .true. if there are no children sub-trees:
module class_tree
implicit none
type tree
logical :: isleaf
integer :: value
type (tree), pointer :: left,right
end type tree
interface new
module procedure newleaf
end interface
interface insert
module procedure inserttree
end interface
interface print
module procedure printtree
end interface
contains
subroutine newleaf(t,n)
implicit none
type (tree), intent (OUT) :: t
integer, intent (IN) :: n
t % isleaf = .true.
t % value = n
nullify (t % left)
nullify (t % right)
end subroutine newleaf
recursive subroutine inserttree(t,n)
implicit none
type (tree), intent (INOUT) :: t
integer, intent (IN) :: n
type (tree), target :: tleft,tright
if (t % isleaf) then
call newleaf(tleft,n)
call newleaf(tright,n)
t % isleaf = .false.
t % left => tleft
t % right => tright
else
call inserttree(t % left,n)
endif
end subroutine inserttree
recursive subroutine printtree(t)
implicit none
type (tree), intent (IN) :: t
if (t % isleaf) then
write(*,*) t % value
else
write(*,*) t % value
call printtree(t % left)
call printtree(t % right)
endif
end subroutine printtree
end module class_tree
The insertion is always done into the left sub-tree unless trying to insert into a leaf. In that case, the insertion is done into both sub-trees to make sure a node has always 0 or 2 children. The printing is done in prefix traversal.
Now if I try to run the following program:
program main
use class_tree
implicit none
type (tree) :: t
call new(t,0)
call insert(t,1)
call insert(t,2)
call print(t)
end program main
I get the desired output 0 1 2 2 1. But if I add "call insert(t,3)" after "call insert(t,2)" and run again, the output is 0 1 2 0 and then I get a segfault.
I tried to see whether the fault happened during insertion or printing so I tried to run:
program main
use class_tree
implicit none
type (tree) :: t
call new(t,0)
call insert(t,1)
call insert(t,2)
write(*,*) 'A'
call insert(t,3)
write(*,*) 'B'
call print(t)
end program main
It makes the segfault go away but I get a very weird output A B 0 1 2673568 6 1566250180.
When searching online for similar errors, I got results like here where it says it might be due to too many recursive calls. However, the call to insert(t,3) should only contain 3 recursive calls... I've also tried to compile using gfortran with -g -Wall -pedantic -fbounds-check and run with a debugger. It seems the fault happens at the "if (t % isleaf)" line in the printing subroutine, but I have no idea how to make sense of that.
Edit:
Following the comments, I have compiled with -g -fbacktrace -fcheck=all -Wall in gfortran and tried to check the state of the memory. I'm quite new to this so I'm not sure I'm using my debugger (gdb) correctly.
After the three insertions and before the call to print, it seems that everything went well: for example when I type p t % left % left % right % value in gdb I get the expected output (that is 3). If I just type p t, the output is (.FALSE.,0,x,y), where x and y are hexadecimal numbers (memory addresses, I guess). However, if I try p t % left, I get something like a "description" of the pointer:
PTR TO -> (Type tree
logical(kind=4) :: isleaf
integer(kind=4) :: value
which repeats itself a lot since each pointer points to a tree that contains two pointers. I would have expected an output similar to that of p t, but I have no idea whether that's normal.
I also tried to examine the memory: for example if I type x/4uw t % left, I get 4 words, the first 2 words seem to correspond to isleaf and value, the last 2 to memory addresses. By following the memory addresses like that, I managed to visit all the nodes and I didn't find anything wrong.
The segfault happens within the printing routine. If I type p t after the fault, it says I cannot access the 0x0 address. Does that mean my tree is somehow modified when I try to print it?
The reason for your problems is that variables that go out of scope are no longer valid. This is in contrast to languages like Python, where an object stays alive as long as references to it exist (reference counting).
In your particular case, this means that the calls to newleaf(tleft,n) and newleaf(tright,n) set the values of tleft and tright, respectively, but these are local variables of inserttree. They go out of scope when the routine returns, so t % left and t % right are left pointing at invalid memory.
A better approach is to allocate each leaf as it is needed (except the first one, since this is already allocated and will not go out of scope until the end of the program).
recursive subroutine inserttree(t,n)
implicit none
type (tree), intent (INOUT) :: t
integer, intent (IN) :: n
if (t % isleaf) then
allocate(t%left)
allocate(t%right)
call newleaf(t%left,n)
call newleaf(t%right,n)
t % isleaf = .false.
else
call inserttree(t % left,n)
endif
end subroutine inserttree
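To try the fix end to end, the corrected insertion routine can be dropped into a trimmed version of the module and driven with the three insertions from the question. This is only a sketch: the module name tree_sketch, the default initializers in the type, and the destroy routine (added so the allocated nodes are released again) are not part of the original code.

```fortran
module tree_sketch
  implicit none
  type tree
    logical :: isleaf = .true.
    integer :: value = 0
    type(tree), pointer :: left => null(), right => null()
  end type tree
contains
  subroutine newleaf(t, n)
    type(tree), intent(out) :: t
    integer, intent(in) :: n
    t%isleaf = .true.
    t%value = n
    nullify(t%left, t%right)
  end subroutine newleaf

  recursive subroutine inserttree(t, n)
    type(tree), intent(inout) :: t
    integer, intent(in) :: n
    if (t%isleaf) then
      ! Heap-allocated children outlive this call, unlike local variables.
      allocate(t%left, t%right)
      call newleaf(t%left, n)
      call newleaf(t%right, n)
      t%isleaf = .false.
    else
      call inserttree(t%left, n)
    end if
  end subroutine inserttree

  recursive subroutine printtree(t)
    type(tree), intent(in) :: t
    write(*,*) t%value
    if (.not. t%isleaf) then
      call printtree(t%left)
      call printtree(t%right)
    end if
  end subroutine printtree

  ! Hypothetical cleanup: free children bottom-up.
  recursive subroutine destroy(t)
    type(tree), intent(inout) :: t
    if (.not. t%isleaf) then
      call destroy(t%left)
      call destroy(t%right)
      deallocate(t%left, t%right)
      t%isleaf = .true.
    end if
  end subroutine destroy
end module tree_sketch

program main
  use tree_sketch
  implicit none
  type(tree) :: t
  call newleaf(t, 0)
  call inserttree(t, 1)
  call inserttree(t, 2)
  call inserttree(t, 3)
  call printtree(t)
  call destroy(t)
end program main
```

With the insertion rule described in the question, the prefix traversal after the third insertion visits the values 0 1 2 3 3 2 1, one per line.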

MPI subroutines in Fortran

I have looked through all the posts on this topic I could find but they do not seem to solve my problem. I am thankful for any input/help/idea. So here it is:
I have my main program (main.f90):
program inv_main
use mod_communication
implicit none
include 'mpif.h'
...
call MPI_INIT(ierr)
call MPI_COMM_RANK(MPI_COMM_WORLD,id,ierr)
call MPI_COMM_SIZE(MPI_COMM_WORLD,nproc,ierr)
...
call SENDRECEIVE(id, nproc, ierr, VVNP, VVN)
...
call MPI_FINALIZE(ierr)
end program inv_main
And here is the module that includes the subroutine (I am aware that allgather might be a better way to do the same but I could not figure it out yet for my 4D array):
Module mod_communication
implicit none
include 'mpif.h'
integer, dimension(MPI_STATUS_SIZE) :: STATUS ! MPI
CONTAINS
Subroutine SENDRECEIVE(id, nproc, ierr, INPUT, OUTPUT )
integer, intent (in) :: nproc, id, ierr
real (dp), intent(in) :: INPUT(n,m)
real (dp), intent(out) :: OUTPUT(n,m,nty,nty)
integer :: sndr
IF (id .eq. 0) THEN
OUTPUT(1:n,1:m,1,1)=INPUT
call MPI_RECV(INPUT,n*m,MPI_DOUBLE_PRECISION,MPI_ANY_SOURCE,MPI_ANY_TAG,MPI_COMM_WORLD,STATUS,ierr)
sndr=STATUS(MPI_SOURCE)
OUTPUT(1:n,1:m,int(sndr/nty)+1,sndr+1-nty*(int(sndr/nty))) = INPUT
END IF
IF (id .ne. 0) THEN
call MPI_SEND(INPUT,n*m,MPI_DOUBLE_PRECISION,0,id,MPI_COMM_WORLD,ierr)
ENDIF
call MPI_BARRIER(MPI_COMM_WORLD,ierr)
call MPI_BCAST(OUTPUT,n*m*nty*nty,MPI_DOUBLE_PRECISION,0,MPI_COMM_WORLD,ierr)
end Subroutine
end Module mod_communication
This is the error message I got when compiling:
use mod_communication
2
Error: Symbol 'mpi_displacement_current' at (1) conflicts with symbol from module 'mod_communication', use-associated at (2)
mpif-mpi-io.h:71.36:
Included at mpif-config.h:65:
Included at mpif-common.h:70:
Included at mpif.h:59:
Included at main.f90:27:
integer MPI_MAX_DATAREP_STRING
1
main.f90:21.6:
use mod_communication
2
Error: Symbol 'mpi_max_datarep_string' at (1) conflicts with symbol from module 'mod_communication', use-associated at (2)
mpif-mpi-io.h:73.32:
Included at mpif-config.h:65:
Included at mpif-common.h:70:
Included at mpif.h:59:
Included at main.f90:27:
parameter (MPI_FILE_NULL=0)
These are just the first two errors, it keeps going like that... And I cannot find my mistake. Also, I have to use "include 'mpif.h'" and not "use mpi" because of the machine I am ultimately going to run it on. If I compile it with use mpi however on my own computer it gives me a different error, which is the following:
mod_MPI.f90:93.41:
call MPI_BARRIER(MPI_COMM_WORLD,ierr)
1
Error: There is no specific subroutine for the generic 'mpi_barrier' at (1)
mod_MPI.f90:52.41:
Your main program probably gets (or rather tries to get) two copies of all the stuff in mpif.h. By include-ing it in the module you effectively make all its contents module things (variables, routines, parameters, what-nots). Then, in main you both use the module and, thereby, use-associate the module things, and try to include mpif.h and redeclare all those things again.
Do what @Jonathan Dursi suggests too.
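A minimal sketch of the single-include layout is shown below, assuming an MPI installation that provides mpif.h and an MPI compiler wrapper to build it. The routine report_rank is a placeholder invented for the illustration, not the SENDRECEIVE routine from the question.

```fortran
module mod_communication
  implicit none
  include 'mpif.h'          ! the only place mpif.h is included
contains
  ! Placeholder routine for the sketch.
  subroutine report_rank(id, nproc)
    integer, intent(out) :: id, nproc
    integer :: ierr
    call MPI_COMM_RANK(MPI_COMM_WORLD, id, ierr)
    call MPI_COMM_SIZE(MPI_COMM_WORLD, nproc, ierr)
  end subroutine report_rank
end module mod_communication

program inv_main
  use mod_communication     ! MPI_COMM_WORLD etc. arrive by use association
  implicit none             ! no second include 'mpif.h' here
  integer :: ierr, id, nproc
  call MPI_INIT(ierr)
  call report_rank(id, nproc)
  print *, 'rank', id, 'of', nproc
  call MPI_FINALIZE(ierr)
end program inv_main
```

The alternative layout, including mpif.h separately inside each subroutine and in the main program but not in the module's specification part, avoids the clash as well, at the cost of repeating the include line.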

Resources