MPI subroutines in Fortran - mpi

I have looked through all the posts on this topic I could find but they do not seem to solve my problem. I am thankful for any input/help/idea. So here it is:
I have my main program (main.f90):
program inv_main
use mod_communication
implicit none
include 'mpif.h'
...
call MPI_INIT(ierr)
call MPI_COMM_RANK(MPI_COMM_WORLD,id,ierr)
call MPI_COMM_SIZE(MPI_COMM_WORLD,nproc,ierr)
...
call SENDRECEIVE(id, nproc, ierr, VVNP, VVN)
...
call MPI_FINALIZE(ierr)
end program inv_main
And here is the module that includes the subroutine (I am aware that allgather might be a better way to do the same but I could not figure it out yet for my 4D array):
Module mod_communication
implicit none
include 'mpif.h'
integer, dimension(MPI_STATUS_SIZE) :: STATUS ! MPI
CONTAINS
Subroutine SENDRECEIVE(id, nproc, ierr, INPUT, OUTPUT )
integer, intent (in) :: nproc, id, ierr
real (dp), intent(in) :: INPUT(n,m)
real (dp), intent(out) :: OUTPUT(n,m,nty,nty)
integer :: sndr
IF (id .eq. 0) THEN
OUTPUT(1:n,1:m,1,1)=INPUT
call MPI_RECV(INPUT,n*m,MPI_DOUBLE_PRECISION,MPI_ANY_SOURCE,MPI_ANY_TAG,MPI_COMM_WORLD,STATUS,ierr)
sndr=STATUS(MPI_SOURCE)
OUTPUT(1:n,1:m,int(sndr/nty)+1,sndr+1-nty*(int(sndr/nty))) = INPUT
END IF
IF (id .ne. 0) THEN
call MPI_SEND(INPUT,n*m,MPI_DOUBLE_PRECISION,0,id,MPI_COMM_WORLD,ierr)
ENDIF
call MPI_BARRIER(MPI_COMM_WORLD,ierr)
call MPI_BCAST(OUTPUT,n*m*nty*nty,MPI_DOUBLE_PRECISION,0,MPI_COMM_WORLD,ierr)
end Subroutine
end Module mod_communication
This is the error message I got when compiling:
use mod_communication
2
Error: Symbol 'mpi_displacement_current' at (1) conflicts with symbol from module 'mod_communication', use-associated at (2)
mpif-mpi-io.h:71.36:
Included at mpif-config.h:65:
Included at mpif-common.h:70:
Included at mpif.h:59:
Included at main.f90:27:
integer MPI_MAX_DATAREP_STRING
1
main.f90:21.6:
use mod_communication
2
Error: Symbol 'mpi_max_datarep_string' at (1) conflicts with symbol from module 'mod_communication', use-associated at (2)
mpif-mpi-io.h:73.32:
Included at mpif-config.h:65:
Included at mpif-common.h:70:
Included at mpif.h:59:
Included at main.f90:27:
parameter (MPI_FILE_NULL=0)
These are just the first two errors; it keeps going like that, and I cannot find my mistake. Also, I have to use "include 'mpif.h'" and not "use mpi" because of the machine I am ultimately going to run it on. If I compile it with "use mpi" on my own computer, however, it gives me a different error:
mod_MPI.f90:93.41:
call MPI_BARRIER(MPI_COMM_WORLD,ierr)
1
Error: There is no specific subroutine for the generic 'mpi_barrier' at (1)
mod_MPI.f90:52.41:

Your main program probably gets (or rather tries to get) two copies of everything in mpif.h. By including it in the module you effectively turn all its contents into module entities (variables, routines, parameters, what-nots). Then, in main, you both use the module, which use-associates those entities, and include mpif.h again, which tries to redeclare all of them. Keep only one of the two includes.
Do what @Jonathan Dursi suggests too.
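A minimal sketch of that fix, assuming the include stays in the module and is dropped from the main program. The helper report_rank is hypothetical and only there to give the module some content; the asker's SENDRECEIVE would take its place.

```fortran
module mod_communication
  implicit none
  include 'mpif.h'            ! the only include of mpif.h in the program
contains
  ! Hypothetical helper, not part of the original code.
  subroutine report_rank(id, nproc)
    integer, intent(in) :: id, nproc
    print *, 'rank', id, 'of', nproc
  end subroutine report_rank
end module mod_communication

program inv_main
  use mod_communication       ! MPI_COMM_WORLD etc. arrive by use association
  implicit none               ! note: no second include 'mpif.h' here
  integer :: ierr, id, nproc

  call MPI_INIT(ierr)
  call MPI_COMM_RANK(MPI_COMM_WORLD, id, ierr)
  call MPI_COMM_SIZE(MPI_COMM_WORLD, nproc, ierr)
  call report_rank(id, nproc)
  call MPI_FINALIZE(ierr)
end program inv_main
```

The opposite arrangement (include in main, none at module level) should work too, as long as each routine that needs the MPI constants includes mpif.h in its own scope.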

Related

Intentional type mismatch in Fortran

I'd like to turn a legacy Fortran code into modern Fortran compliant code, so I can turn on compiler warnings, interface checking, etc. At this stage I don't want to change the functionality, just make it work as close as possible to what it was, and still keep compilers happy.
My current problem is that the code at many places passes arrays of the wrong types, e.g. a real array to a subroutine that has an integer dummy argument. This is not a bug per se in the code, since it is intentional and it works as intended (at least in common configurations). Now, how could I do the same and while keeping the code compliant? Consider the following example:
program cast
implicit none
double precision :: a(10)
call fill_dble(a,10)
call print_dble(a,10)
call fill_int(a,10)
!call fill_int(cast_to_int(a),10)
call print_dble(a,10)
call print_int(a(1),10)
!call print_int(cast_to_int(a),10)
call print_dble(a(6),5)
contains
function cast_to_int(a) result(b)
use iso_c_binding
implicit none
double precision, target :: a(*)
integer, pointer :: b(:)
call c_f_pointer(c_loc(a(1)), b, [1])
end function
end program
subroutine fill_dble(b,n)
implicit none
integer :: n, i
double precision :: b(n)
do i = 1, n
b(i) = i
end do
end subroutine
subroutine print_dble(b,n)
implicit none
integer :: n
double precision :: b(n)
write(6,'(10es12.4)') b
end subroutine
subroutine fill_int(b,n)
implicit none
integer :: n, b(n), i
do i = 1, n
b(i) = i
end do
end subroutine
subroutine print_int(b,n)
implicit none
integer :: n, b(n)
write(6,'(10i4)') b
end subroutine
When I compile it and run it (gfortran 4.8 or ifort 18), I get, as expected:
1.0000E+00 2.0000E+00 3.0000E+00 4.0000E+00 5.0000E+00 6.0000E+00 7.0000E+00 8.0000E+00 9.0000E+00 1.0000E+01
4.2440-314 8.4880-314 1.2732-313 1.6976-313 2.1220-313 6.0000E+00 7.0000E+00 8.0000E+00 9.0000E+00 1.0000E+01
1 2 3 4 5 6 7 8 9 10
6.0000E+00 7.0000E+00 8.0000E+00 9.0000E+00 1.0000E+01
The first half of the real array is corrupted with integers (because integers are half the size), but when printed as integers the "right" values are there. But this is non-compliant code. When I try to fix it by activating the cast_to_int function (and disabling the calls without it) I get indeed something that compiles without warning, and with gfortran I get the same result. With ifort, however, I get:
1.0000E+00 2.0000E+00 3.0000E+00 4.0000E+00 5.0000E+00 6.0000E+00 7.0000E+00 8.0000E+00 9.0000E+00 1.0000E+01
1.0000E+00 2.0000E+00 3.0000E+00 4.0000E+00 5.0000E+00 6.0000E+00 7.0000E+00 8.0000E+00 9.0000E+00 1.0000E+01
0******** 0 5 6 7 8 9 10
6.0000E+00 7.0000E+00 8.0000E+00 9.0000E+00 1.0000E+01
which I can't understand. Moreover, ifort with -O0 crashes (and it doesn't with the other version).
I know the code is still not quite correct, because the pointer returned by cast_to_int is still of size 1, but I believe that should be a different problem.
What am I doing wrong, or how can I get ifort to do what I want?
EDIT: Following @VladimirF's reply, I add, after implicit none:
interface
subroutine fill_int(b,n)
!dec$ attributes no_arg_check :: b
integer :: n, b(n)
end subroutine
subroutine print_int(b,n)
!dec$ attributes no_arg_check :: b
integer :: n, b(n)
end subroutine
end interface
but compiling with warnings on still gives me an error:
$ ifort cast2.f90 -warn all
cast2.f90(17): error #6633: The type of the actual argument differs from the type of the dummy argument. [A]
call fill_int(a,10)
--------------^
cast2.f90(20): error #6633: The type of the actual argument differs from the type of the dummy argument. [A]
call print_int(a(1),10)
---------------^
compilation aborted for cast2.f90 (code 1)
Intel Fortran supports the !dec$ attributes no_arg_check directive. It instructs the compiler "that type and shape matching rules related to explicit interfaces are to be ignored".
"It can be applied to an individual dummy argument name or to the routine name, in which case the option is applied to all dummy arguments in that interface."
It should be applied to a module procedure (or an interface block), so you should move your functions and subroutines into a module.
Many other compilers have similar directives.
What is wrong about your code? As a rule of thumb, do not ever use any Fortran functions that return pointers. They are pure evil. Fortran pointers are completely different from C pointers.
When you do call fill_int(cast_to_int(a),10) what happens is that the expression cast_to_int(a) is evaluated and the result is an array. Now depending on the optimizations the compiler may choose to pass the address of the original pointer, but it may also create a copy of the result integer array and pass a copy to the subroutine.
Also, your array a does not have the target attribute, so the address used inside cast_to_int(a) is only valid inside the function and is not valid after it returns.
You should create the pointer b inside the main program and just pass b instead of a. It will work similarly to equivalence. Looking at the values stored as a different type will not be standard-conforming anyway. This form of type punning is not allowed.
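A minimal sketch of that suggestion, under the assumption that the array can be given the target attribute in the caller. The program name and the variable ib are made up for illustration, and the result is still type punning, so it is no more standard-conforming than the original; it only keeps the pointer alive in the scope that owns the data.

```fortran
program cast_sketch
  use, intrinsic :: iso_c_binding, only: c_loc, c_f_pointer, c_double, c_int
  implicit none
  real(c_double), target  :: a(10)     ! target, so c_loc(a) stays valid here
  integer(c_int), pointer :: ib(:)     ! integer view of the same storage
  integer :: i, n_int

  a = 0.0_c_double

  ! Number of integers that fit into the storage of a
  ! (typically 20 when a double is twice the size of an int).
  n_int = size(a) * storage_size(a) / storage_size(1_c_int)
  call c_f_pointer(c_loc(a), ib, [n_int])

  ! Stand-in for "call fill_int(ib, 10)" from the question.
  do i = 1, 10
     ib(i) = i
  end do

  print '(10i4)', ib(1:10)             ! the integers just stored
  nullify(ib)
end program cast_sketch
```

Because ib is an ordinary pointer variable rather than a function result, the compiler has no expression to evaluate into a temporary copy before the call.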
I found a possible general solution that seems to work. The code I have to deal with looks something like this:
subroutine some_subroutine(a,b,c,d,...)
real a(*),b(*),c(*),d(*)
! many more declarations, including common blocks
!...
call other_subroutine(a,b(idx),c,...)
!...
end subroutine some_subroutine
! this typically in another file:
subroutine other_subroutine(x,y,z,...)
real x(*)
integer y(*)
logical z(*)
! other declarations and common blocks
! unreadable code with calls to other procedures
! not clear which arguments are input and output
end subroutine other_subroutine
I now modify it to be:
subroutine some_subroutine(a,b,c,d,...)
real a(*),b(*),c(*),d(*)
! many more declarations, including common blocks
call inner_sub(b,c)
contains
subroutine inner_sub(b,c)
use iso_c_binding
real, target :: b(*),c(*)
integer, pointer :: ib(:)
logical, pointer :: lc(:)
!...
call c_f_pointer(c_loc(b(idx)),ib,[1]) ! or use the actual length if I can figure it out
call c_f_pointer(c_loc(c(1)),lc,[1])
call other_subroutine(a,ib,lc,...)
nullify(ib,lc)
!...
end subroutine inner_sub
end subroutine some_subroutine
leaving other_subroutine untouched. If I use directly the target attribute on the outer routine, I have to add an explicit interface to anything calling it, so instead I wrap the inner code. By using contains I don't need to pass all variables, just those that will be "punned". The c_f_pointer call should be done right before the problematic call, since index variables (idx in the example) could be in common blocks and changed in other calls, for example.
Any pitfalls, apart from those already present in the original code?

Execute fortran code on startup?

Is it possible to have fortran execute code upon startup, without explicitly putting it into the main program?
Usecase
Consider e.g. a routine that reads data from a configuration file with keyword-value pairs, which is used in different modules.
For the sake of code locality it would be favorable for valid keywords, as well as error handling for invalid values, to be defined in the module that needs the data.
Right now the only pattern to implement such behaviour, that I can think of, would be writing a setup subroutine in said module, which is called by the main program.
This means that changing the logic of a module may require a change of the main program. This seems harder to maintain than e.g. in Python doing something like
# ------ ./project/module.py ------
from project.config import register_keyword
register_keyword("some_setting")
One way to do this would be to use derived type constructors:
module foo
implicit none
! only export the derived type, and not any of the
! helper procedures
private
public :: mytype
type :: mytype
! internals of type
end type
! Writing an interface named 'mytype' allows us to
! overload the type constructor
interface mytype
procedure :: new_mytype
end interface mytype
contains
type(mytype) function new_mytype(setting)
! Some generic setting type
type(setting_type), intent(in) :: setting
! do something with setting
...
end function new_mytype
end module foo
program bar
use foo
implicit none
type(mytype) :: thing
type(setting_type) :: setting
! calls 'foo::new_mytype'
! all implementation details hidden away
thing = mytype(setting)
end program bar
As far as I've seen, this is not possible according to the Fortran (2003) standard.
A "static" function call like your register_keyword would have to be done in the specification-part of a module definition. By general definition this specification-part could contain a stmt-function-stmt, but this is then explicitly forbidden in C1105: "... shall not contain stmt-function-stmt, [...]". So you are basically left with calling intrinsic functions only.
If you really do not want to edit your main program, but you are fine with an additional C++ intermediate file, then the following PoC works:
prog.f03:
program main
end program
my_module.f03:
module my_mod
use, intrinsic :: iso_c_binding
contains
function foo() bind(C, name="foo")
integer(c_int) :: foo
write (*,*) "hello, world!"
foo = 10
end function
end module
my_module.cc:
extern "C" int foo();
int a = foo();
Compile and link as follows:
g++ -Wall -c my_module.cc -o my_module.o
gfortran -Wall -o prog prog.f03 my_module.f03 my_module.o
There will be an output, despite the Fortran program being empty:
hello, world!
I still would not recommend doing this, since it is likely not guaranteed that the Fortran RT environment is ready when the C/C++ global/static function call happens.

I am writing to the file with MPI on Fortran and while checking what was written I don't get the expected results

Each process is building some array and is writing this array in the "correct" place, using mpi_file_write_at().
After writing to the file I read from the same place and it is not what I wrote. The code is attached. I am just a beginner in MPI, so sorry if the question is not clever.
program output
use mpi
implicit none
integer :: ierr,i,proc_num,file,intsize
integer :: status(mpi_status_size)
integer,parameter :: count=10
integer,dimension(count) :: buf
integer,dimension(3*count) :: arr
integer(kind=mpi_offset_kind) :: disp
call mpi_init(ierr)
call mpi_comm_rank(MPI_COMM_WORLD,proc_num,ierr)
do i=1,count
buf(i) = proc_num*count+i
enddo
call mpi_file_open(MPI_COMM_WORLD,'out.txt',mpi_mode_wronly+mpi_mode_create,mpi_info_null,file,ierr)
call mpi_type_size(mpi_integer,intsize,ierr)
disp = proc_num*count*intsize
call mpi_file_write_at(file,disp,buf,count,mpi_integer,status,ierr)
if (proc_num==0) then
call mpi_file_read_at(file,0,arr,3*count,mpi_integer,status,ierr)
write(*,*) arr
endif
call mpi_file_close(file,ierr)
call mpi_finalize(ierr)
end program output
Thank you!
You are using mpi_mode_wronly to open the file, which means "write only". Consequently, mpi_file_read_at() is likely to fail. You can check this by looking at the output parameter ierr.
Could you try the mpi_mode_rdwr flag? This should enable both read and write operations.
Moreover, MPI_File_write_at() is a noncollective operation, so process 0 can call mpi_file_read_at() before process 1 has exited MPI_File_write_at(). An mpi_barrier() can be added to prevent that. Take a look at http://www.mpi-forum.org/docs/mpi-2.2/mpi22-report/node305.htm , which features various examples using MPI_File_write_at(). It is likely that additional calls to MPI_File_sync() and MPI_File_set_view() are required as well.
Notice that the code you provided is equivalent to a call to the function MPI_Gather().
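The points above can be put together in a sketch like the following. It assumes the default file view is good enough (so no MPI_File_set_view), uses the sync / barrier / sync sequence to make the writes of other ranks visible, and sizes the read buffer from the number of ranks instead of the fixed 3*count. The names nvals, fh and nproc are choices made here, not taken from the question.

```fortran
program output_sketch
  use mpi
  implicit none
  integer, parameter :: nvals = 10
  integer :: ierr, i, proc_num, nproc, fh, intsize
  integer :: status(MPI_STATUS_SIZE)
  integer :: buf(nvals)
  integer, allocatable :: arr(:)
  integer(kind=MPI_OFFSET_KIND) :: disp

  call MPI_Init(ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, proc_num, ierr)
  call MPI_Comm_size(MPI_COMM_WORLD, nproc, ierr)
  buf = [(proc_num*nvals + i, i = 1, nvals)]

  ! Read-write mode, so the later read is permitted.
  call MPI_File_open(MPI_COMM_WORLD, 'out.txt', &
                     MPI_MODE_RDWR + MPI_MODE_CREATE, MPI_INFO_NULL, fh, ierr)
  if (ierr /= MPI_SUCCESS) call MPI_Abort(MPI_COMM_WORLD, 1, ierr)

  call MPI_Type_size(MPI_INTEGER, intsize, ierr)
  disp = int(proc_num, MPI_OFFSET_KIND) * nvals * intsize
  call MPI_File_write_at(fh, disp, buf, nvals, MPI_INTEGER, status, ierr)

  ! Flush, wait for every rank, then refresh before reading.
  call MPI_File_sync(fh, ierr)
  call MPI_Barrier(MPI_COMM_WORLD, ierr)
  call MPI_File_sync(fh, ierr)

  if (proc_num == 0) then
     allocate(arr(nproc*nvals))
     disp = 0      ! the offset argument must have kind MPI_OFFSET_KIND
     call MPI_File_read_at(fh, disp, arr, nproc*nvals, MPI_INTEGER, status, ierr)
     if (ierr == MPI_SUCCESS) write(*,*) arr
  end if

  call MPI_File_close(fh, ierr)
  call MPI_Finalize(ierr)
end program output_sketch
```

Note that the file holds raw binary integers despite the .txt name, so it will not be readable in a text editor.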

LOC() for user defined types gives different results depending on the context

I'm trying to debug some code in which members of a user defined object mysteriously change addresses, and while doing that I realized user defined objects do that as well. Here's a small example of querying object address from function that created it and then from its member function:
module foo_module
type foo_type
contains
procedure :: foo
end type foo_type
contains
subroutine foo(this)
class(foo_type) :: this
print *, 'Inside foo this is', loc(this)
end subroutine foo
end module foo_module
program trial
use foo_module
type(foo_type) :: object
print *, 'Object address', loc(object)
call object%foo()
end program trial
A sample output I get is:
Object address 4452052800
Inside foo this is 140734643354880
Why am I getting two different addresses for the same object? Am I doing something wrong? Or is there something with LOC that comes into play I don't understand?
I'm using ifort under osx.
LOC is an extension. Its behaviour is as specified by the compiler vendor.
What behaviour the vendor intended here isn't clear, but the difference that you are seeing is that in the main program you get the integer equivalent of the memory address of the non-polymorphic object (what you probably expect), while in the subroutine you get the integer equivalent of the memory address of the polymorphic descriptor for the object (maybe what you want, maybe not).
Using TRANSFER(C_LOC(xxx), 0_C_INTPTR_T) is a more portable way of getting the integer representation of the address of an object (where the C_* things are from the ISO_C_BINDING intrinsic module). C_LOC requires that its argument have the TARGET attribute and be non-polymorphic (use SELECT TYPE).
I'd recommend asking on the relevant Intel Forum if you want further clarification on the intended behaviour of the LOC extension.
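A sketch of the TRANSFER(C_LOC(...)) approach with SELECT TYPE, adapted from the example in the question. The integer component dummy is an addition made here so that the object is not empty; everything else follows the original names.

```fortran
module foo_module
  use, intrinsic :: iso_c_binding, only: c_loc, c_intptr_t
  implicit none
  type foo_type
     integer :: dummy = 0        ! added here so the type is not empty
  contains
     procedure :: foo
  end type foo_type
contains
  subroutine foo(this)
    class(foo_type), target :: this
    ! c_loc needs a non-polymorphic argument, so resolve the type first.
    select type (this)
    type is (foo_type)
       print *, 'Inside foo this is', transfer(c_loc(this), 0_c_intptr_t)
    end select
  end subroutine foo
end module foo_module

program trial
  use foo_module
  implicit none
  type(foo_type), target :: object
  print *, 'Object address', transfer(c_loc(object), 0_c_intptr_t)
  call object%foo()
end program trial
```

With the target attribute on both the actual and the dummy argument, the two printed values should refer to the same object.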
I reported the bug to the developers; our internal issue ID is DPD200253159. I found that the C_LOC function from ISO_C_BINDING works. For example:
subroutine foo(this)
use, intrinsic :: iso_c_binding
class(foo_type) :: this
print *, 'Inside foo this is', transfer(c_loc(this),0_C_INTPTR_T)
end subroutine foo

MPI_COMM_WORLD handle loses value in a subroutine

my program is as follows:
module x
use mpi !x includes mpi module
implicit none
...
contains
subroutine do_something_with_mpicommworld
!use mpi !uncommenting this makes a difference (**)
call MPI_...(MPI_COMM_WORLD,...,ierr)
end subroutine
...
end module x
program main
use mpi
use x
call MPI_INIT(...)
call do_something_with_mpicommworld
end program main
This program fails with the following error: MPI_Cart_create(199): Invalid communicator, unless the line marked with (**) is uncommented.
Now, maybe my knowledge of Fortran 90 is incomplete, but I thought that if a module has a use statement (see my module x), every global entity of the used module (here MPI_COMM_WORLD from module mpi) has the same value in any of the contained subroutines (do_something_with_mpicommworld), even when those subroutines do not use the module explicitly (i.e. when (**) is commented out). To put it simply: if one module uses another, the subroutines contained in the first have access to the globals of the second without a use statement of their own.
When I ran my programme, I saw a different behaviour. The sub contained in x was creating errors unless it had the 'use mpi' statement.
So what is the problem, do I have a wrong idea about Fortran 90, or is there something special about MPI module which induces such behaviour?
It's annoyingly hard to find exact details about what should and shouldn't happen in these cases, and my expectation was the same as yours: the 'use mpi' should work as above. So I tried the following:
module hellompi
use mpi
implicit none
contains
subroutine hello
integer :: ierr, nprocs, rank
call MPI_INIT(ierr)
call MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)
call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
print *, 'Hello world, from ', rank, ' of ', nprocs
print *, MPI_COMM_WORLD
call MPI_FINALIZE(ierr)
return
end subroutine hello
end module hellompi
and it works fine under both gfortran and ifort with OpenMPI. Adding a cart_create doesn't change anything.
What strikes me as weird with your case is that it isn't complaining that MPI_COMM_WORLD isn't defined -- so obviously some of the relevant information is being propagated to the subroutine. Can you post a simpler full example which still fails to work?
Thank you, Jonathan, for your answer. The problem was really, really simple: I had added the subroutine in question after the "end module" :-D. 'implicit none' did not apply to the now-external subroutine, and the compiler happily created a brand new, implicitly typed variable MPI_COMM_WORLD holding whatever value it thought suitable, following the standard implicit typing rules.
This is just a lesson to me to enforce 'implicit none' not only by keywords, but also via the compiler flag. Evil lurks after every end statement.
I'm sorry you went through the trouble of making the test example; I'd buy you a beer if I could :-)
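The pitfall can be reproduced without MPI. In this sketch the names consts, magic and outside are invented for illustration; magic plays the role of MPI_COMM_WORLD.

```fortran
module consts
  implicit none
  integer, parameter :: magic = 42
end module consts

! External subroutine: it sits after "end module", so neither the module's
! contents nor its "implicit none" apply here.
subroutine outside
  ! Legal only because this "magic" is a brand-new, implicitly typed local
  ! variable; it has nothing to do with the parameter in module consts.
  magic = 7
  print *, 'outside sees magic =', magic
end subroutine outside

program demo
  use consts
  implicit none
  print *, 'main sees magic =', magic
  call outside
end program demo
```

Compiled normally, this builds and the two lines report different values. With gfortran's -fimplicit-none option the compiler rejects the subroutine because magic has no declared type there; other compilers have comparable options, so check the documentation of the one in use.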

Resources