I wrote a recursive program in Fortran to calculate the combinations of npoints of ndim dimensions, as follows. I first wrote this program in MATLAB, where it ran perfectly. In Fortran, however, after the first iteration it assigns absurd values to the list of points, with no explanation. Could somebody give me a hand?
PROGRAM MAIN
IMPLICIT NONE
INTEGER :: ndim, k, npontos, contador,i,iterate, TEST
integer, dimension(:), allocatable :: pontos
print*, ' '
print*, 'npoints?'
read *, npontos
print*, 'ndim?'
read *, ndim
k=1
contador = 1
open(450,file= 'combination.out',form='formatted',status='unknown')
write(450,100) 'Comb ','stat ',(' pt ',i,' ',i=1,ndim)
write(450,120) ('XXXXXXXXXX ',i=1,ndim+1)
allocate(pontos(ndim))
do i=1,4
pontos(i)=i
end do
TEST = iterate(pontos, ndim, npontos,k,contador)
end program MAIN
recursive integer function iterate(pontos, ndim, npontos, k,contador)
implicit NONE
integer, intent(in) :: ndim, k, npontos
integer,dimension(:) :: pontos
integer contador,inic,i,j,m
if (k.eq.ndim) then
inic=pontos(ndim)
do i = pontos(ndim),npontos
pontos(k)= i
write(*,*) pontos(:)
contador=contador+1
end do
pontos(ndim)= inic + 1
else
inic = pontos (k)
do j = pontos(k),(npontos-ndim+k)
pontos(k)=j
pontos= iterate(pontos, ndim, npontos, k+1,contador)
end do
end if
pontos(k)=inic+1;
if (pontos(k).gt.(npontos-ndim+k+1)) then
do m =k+1,ndim
pontos(m)=pontos(m-1)+1
end do
end if
end function iterate
There are too many issues in that code... I stopped debugging it. This is what I got so far, it's too much for a comment.
This doesn't make sense:
pontos= iterate(pontos, ndim, npontos, k+1,contador)
You are changing pontos inside iterate, and never set a return value within the function.
You need to a) provide a result statement for recursive functions (and actually set it) or b) convert it to a subroutine. Since you are modifying at least one dummy argument, you should go with b).
Since you are using assumed-shape dummy arguments, you need to specify an interface to the function/subroutine, either explicitly or with a module.
Neither format 100 nor format 120 is specified in your code.
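To make option b) concrete, here is a minimal sketch of the recursion as a subroutine inside a module, so the assumed-shape array gets an explicit interface automatically. It is not a port of the original MATLAB code: the loop bounds are my own simplification, and the values npontos = 4 and ndim = 2 are hard-coded assumptions for illustration.

```fortran
module combinations
  implicit none
contains
  ! Fill positions k..ndim of pontos with every increasing choice of points
  ! and print each complete combination.
  recursive subroutine iterate(pontos, ndim, npontos, k, contador)
    integer, intent(inout) :: pontos(:)
    integer, intent(in)    :: ndim, npontos, k
    integer, intent(inout) :: contador
    integer :: first, j

    ! Position k must exceed position k-1 so each combination appears once.
    first = 1
    if (k > 1) first = pontos(k-1) + 1

    ! Stop early enough to leave room for the ndim-k positions that follow.
    do j = first, npontos - ndim + k
      pontos(k) = j
      if (k == ndim) then
        contador = contador + 1
        write(*,*) pontos
      else
        call iterate(pontos, ndim, npontos, k+1, contador)
      end if
    end do
  end subroutine iterate
end module combinations

program main
  use combinations
  implicit none
  integer, allocatable :: pontos(:)
  integer :: ndim, npontos, contador

  npontos = 4          ! assumed input, normally read from the user
  ndim    = 2          ! assumed input, normally read from the user
  allocate(pontos(ndim))
  contador = 0
  call iterate(pontos, ndim, npontos, 1, contador)
  write(*,*) 'combinations found:', contador   ! 6 for 4 points taken 2 at a time
end program main
```

Because the routine is a subroutine, there is no function result to forget, and the array is modified in place through the intent(inout) dummy argument.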
I've written a rudimentary algorithm in Fortran 95 to calculate the gradient of a function (an example of which is prescribed in the code) using central differences augmented with a procedure known as Richardson extrapolation.
function f(n,x)
! The scalar multivariable function to be differentiated
integer :: n
real(kind = kind(1d0)) :: x(n), f
f = x(1)**5.d0 + cos(x(2)) + log(x(3)) - sqrt(x(4))
end function f
!=====!
!=====!
!=====!
program gradient
!==============================================================================!
! Calculates the gradient of the scalar function f at x=0 using a finite !
! difference approximation, with a low order Richardson extrapolation. !
!==============================================================================!
parameter (n = 4, M = 25)
real(kind = kind(1d0)) :: x(n), xhup(n), xhdown(n), d(M), r(M), dfdxi, h0, h, gradf(n)
h0 = 1.d0
x = 3.d0
! Loop through each component of the vector x and calculate the appropriate
! derivative
do i = 1,n
! Reset step size
h = h0
! Carry out M successive central difference approximations of the derivative
do j = 1,M
xhup = x
xhdown = x
xhup(i) = xhup(i) + h
xhdown(i) = xhdown(i) - h
d(j) = ( f(n,xhup) - f(n,xhdown) ) / (2.d0*h)
h = h / 2.d0
end do
r = 0.d0
do k = 3,M
r(k) = ( 64.d0*d(k) - 20.d0*d(k-1) + d(k-2) ) / 45.d0
if ( abs(r(k) - r(k-1)) < 0.0001d0 ) then
dfdxi = r(k)
exit
end if
end do
gradf(i) = dfdxi
end do
! Print out the gradient
write(*,*) " "
write(*,*) " Grad(f(x)) = "
write(*,*) " "
do i = 1,n
write(*,*) gradf(i)
end do
end program gradient
In single precision it runs fine and gives me decent results. But when I try to change to double precision as shown in the code, I get an error when trying to compile claiming that the assignment statement
d(j) = ( f(n,xhup) - f(n,xhdown) ) / (2.d0*h)
is producing a type mismatch real(4)/real(8). I have tried several different declarations of double precision, appended every appropriate double precision constant in the code with d0, and I get the same error every time. I'm a little stumped as to how the function f is possibly producing a single precision number.
The problem is that f is not explicitly declared in your main program, so it is implicitly assumed to be of default (single precision) real type, which is real(4) for gfortran.
I completely agree with High Performance Mark's comment that you really should use implicit none in all your Fortran code, to make sure all objects are explicitly declared. This way, you would have obtained a more appropriate error message about f not being explicitly declared.
Also, you could consider two more things:
Define your function within a module and import that module in the main program. It is a good practice to define all subroutines/functions within modules only, so that the compiler can make extra checks on number and type of the arguments, when you invoke the function.
You could (again in a module) introduce a constant for the precision and use it everywhere the kind of a real must be specified. Taking the example below, by changing only the line
integer, parameter :: dp = kind(1.0d0)
into
integer, parameter :: dp = kind(1.0)
you would change all your real variables from double to single precision. Also note the _dp suffix for the literal constants instead of the d0 suffix, which would automatically adjust their precision as well.
module accuracy
implicit none
integer, parameter :: dp = kind(1.0d0)
end module accuracy
module myfunc
use accuracy
implicit none
contains
function f(n,x)
integer :: n
real(dp) :: x(n), f
f = 0.5_dp * x(1)**5 + cos(x(2)) + log(x(3)) - sqrt(x(4))
end function f
end module myfunc
program gradient
use myfunc
implicit none
real(dp) :: x(n), xhup(n), xhdown(n), d(M), r(M), dfdxi, h0, h, gradf(n)
:
end program gradient
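For completeness, here is a small self-contained sketch that puts both suggestions together and actually compiles. It is a simplified stand-in rather than the asker's program: it uses a single central difference with a fixed step h (my assumption) instead of the Richardson loop, and the name gradient_check is hypothetical.

```fortran
module accuracy
  implicit none
  integer, parameter :: dp = kind(1.0d0)
end module accuracy

module myfunc
  use accuracy
  implicit none
contains
  function f(x)
    real(dp), intent(in) :: x(:)
    real(dp) :: f
    f = x(1)**5 + cos(x(2)) + log(x(3)) - sqrt(x(4))
  end function f
end module myfunc

program gradient_check
  use accuracy
  use myfunc
  implicit none
  integer, parameter :: n = 4
  real(dp) :: x(n), xup(n), xdown(n), gradf(n), exact(n), h
  integer :: i

  x = 3.0_dp
  h = 1.0e-4_dp              ! assumed fixed step, no extrapolation
  do i = 1, n
    xup = x
    xdown = x
    xup(i) = x(i) + h
    xdown(i) = x(i) - h
    ! f comes from the module, so its kind is known to the compiler here.
    gradf(i) = (f(xup) - f(xdown)) / (2.0_dp*h)
  end do

  ! Analytic gradient, printed next to the numerical one for comparison.
  exact = [5.0_dp*x(1)**4, -sin(x(2)), 1.0_dp/x(3), -0.5_dp/sqrt(x(4))]
  do i = 1, n
    write(*,*) gradf(i), exact(i)
  end do
end program gradient_check
```

Switching dp to kind(1.0) in the accuracy module changes every real and every _dp literal in this sketch to single precision at once.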
I have a Fortran subroutine that selects a function based on the value of a string and then executes that function.
!! file:select.f90
module funcs
contains
subroutine add(x, y, xy)
real :: x, y, xy
xy = x + y
return
end subroutine
subroutine diff(x, y, xy)
real :: x, y, xy
xy = x - y
return
end subroutine
end module
subroutine simple(modname)
use funcs
procedure(), pointer :: model => null()
character(10) :: modname
real :: x, y, xy
print *, "-",modname,"-"
select case (modname)
case("add")
model => add
case("diff")
model => diff
case default
print *, "No model with that name!"
stop
end select
x = 4
y = 3
call model(x, y, xy)
print *, xy
end subroutine
I would like to call this subroutine from an R script.
# file:select.R
dyn.load("select.so")
.Fortran("simple", "add")
.Fortran("simple", "diff")
As a standalone Fortran program that takes a command line argument, this runs perfectly fine. It's even insensitive to spaces before or after modname. However, when I try to pass in a character as an argument from R, it correctly re-prints the character (without any extra spaces), but then doesn't recognize it as a case and skips to the default. What is going on here? Is there some encoding issue with R characters that makes them incompatible with Fortran?
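I believe your select case statement is not matching because modname is declared 10 characters long and what R passes does not fill the rest of those 10 characters with blanks. Fortran pads the shorter string with blanks when it compares, so trailing spaces are harmless (which is why the standalone program works), but any other bytes after the name, most likely null characters coming from the C-style string, make every case fail. The best thing to do is to also pass the length of the string to your Fortran routine, then use it to slice the character variable.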
subroutine simple(modname,length)
then select case (modname(1:length))
Fortran strings are not null-terminated as in C; a character variable simply has a fixed length.
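A minimal sketch of that idea, written so it can be tried from plain Fortran. The driver program and its buffer contents are hypothetical stand-ins for whatever R hands over; declaring the dummy argument with len=length has the same effect as slicing modname(1:length).

```fortran
module funcs
  implicit none
contains
  subroutine add(x, y, xy)
    real, intent(in)  :: x, y
    real, intent(out) :: xy
    xy = x + y
  end subroutine add

  subroutine diff(x, y, xy)
    real, intent(in)  :: x, y
    real, intent(out) :: xy
    xy = x - y
  end subroutine diff
end module funcs

! Kept outside any module so the routine stays callable by its plain name.
subroutine simple(modname, length)
  use funcs
  implicit none
  integer, intent(in) :: length
  character(len=length), intent(in) :: modname   ! only the first length characters
  procedure(add), pointer :: model => null()
  real :: x, y, xy

  select case (modname)
  case ("add")
    model => add
  case ("diff")
    model => diff
  case default
    print *, "No model with that name!"
    return
  end select

  x = 4
  y = 3
  call model(x, y, xy)
  print *, xy
end subroutine simple

program driver
  implicit none
  ! Hypothetical buffer: the name followed by a null and non-blank filler.
  character(len=10) :: buffer

  buffer = "add" // achar(0) // "??????"
  call simple(buffer, 3)     ! selects add, 4 + 3
  buffer = "diff" // achar(0) // "?????"
  call simple(buffer, 4)     ! selects diff, 4 - 3
end program driver
```

From R the length would be supplied as an integer, for example nchar("add"), as the second argument to .Fortran.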
Also when passing a string from R to .Fortran, it might be better to pass it as raw bytes. Simple example below. First is the Fortran code, then the R wrapper code.
subroutine print_this ( str, length )
integer :: length
character(length) :: str
print *, str(1:length)
end subroutine print_this
test <- function(str) {
l <- nchar(str)
str_raw <- character(l)
str_raw <- charToRaw(str)
.Fortran("print_this",str_raw,l)
l #returns length
}
I understand the interface command can be used to pass a function into a subroutine. So for example in the main program I'd define some function and then pass it to some subroutine like:
MainProgram
Use ....
Implicit None
Type declarations etc
Interface
Function test(x,y)
REAL, INTENT(IN) :: x, y
REAL :: test
END function
End Interface
Call Subroutine( limit1, limit2, test, Ans)
End MainProgram
Is this the correct way of doing this? I'm quite stuck! Also within the Subroutine is there anything I need to put to let it know that a function is coming in? The Subroutine in this case will be a library so I don't want to have to keep recompiling it to change the function.
Module:
module fmod
interface
function f_interf(x,y)
real, intent(in) :: x, y
real :: f_interf
end function
end interface
contains
function f_sum(x,y)
real, intent(in) :: x, y
real f_sum
f_sum = x + y
end function
function f_subst(x,y)
real, intent(in) :: x, y
real f_subst
f_subst = x - y
end function
subroutine subr(limit1, limit2, func, ans)
real limit1, limit2
procedure(f_interf) func
real ans
ans = func(limit1, limit2)
end subroutine
end module
main program:
program pass_func
use fmod
Implicit None
real ans, limit1, limit2
limit1 = 1.0
limit2 = 2.0
call subr( limit1, limit2, f_subst, ans)
write(*,*) ans
call subr( limit1, limit2, f_sum, ans)
write(*,*) ans
end program pass_func
and output:
-1.000000
3.000000
A simple way to do this is to go old school and just leave the function external:
program main
real f,z
external f
call subr(f,z)
write(*,*)z
end
real function f(x)
real x
f=x**2
end
! below possibly in a precompiled library:
subroutine subr(f,y)
real f,y
y=f(2.)
end
out: 4
Of course, with this approach you cannot use advanced language features that require an explicit interface. **
On the other hand, if you are interfacing with standard libraries that need function arguments, this is, I think, the only way.
** per MSB's comment you can handle that issue with an interface block in the subroutine,
for example if we want to pass a function that returns an array:
function f(x)
real x,f(2)
f(1)=x
f(2)=x**2
end
as in the first example f is an external function, and the sub can be in
a precompiled library:
subroutine subr(g,y)
interface
function g(x)
real x,g(2)
end function
end interface
real y,z(2)
z=g(2.)
y=z(1)+z(2)
end
out: 6
As noted, this is only strictly necessary if relying on language features that need the interface.
The most elegant way I know of right now is to put your functions into a module, so that you don't have to construct an interface block but can simply use 'external'. Here is an example of that.
It covers the different cases of passing a subroutine or a function as an argument to a subroutine or a function.
If you want to pass an array as an argument without receiving an empty array, the example also includes a tip for that.
Module part:
module func_arg_test
!I used ifort to compile but other compilers should also be fine.
!Written by Kee
!Feb 20, 2017
contains
!-------------------------
real function func_func(f, arg)
!========================================
!This shows how to pass number as argument
!========================================
implicit none
real, external::f !Use external to indicate the f is a name of a function
real::arg
func_func=f(arg)
end function func_func
real function func_sub(subr, arg)
!========================================
!This shows how to pass subroutine as arg to function
!========================================
implicit none
external::subr !Use external to indicate subr is a subroutine
real::arg
call subr(arg)
func_sub = arg
end function func_sub
subroutine sub_func(f,arg)
!========================================
!This shows how to pass function as argument
!in subroutine
!========================================
real::arg
real,external::f
arg = f(arg)
end subroutine sub_func
subroutine sub_sub(subr,arg)
!========================================
!This shows how to pass subroutine as argument
!in subroutine
!========================================
real::arg
external::subr
call subr(arg)
end subroutine sub_sub
real function funcmat(f, mat)
!========================================
!This shows how to pass matrix as argument
!========================================
implicit none
real, external::f
real,dimension(:)::mat!Here memory for mat is already allocated when mat is
!passed in, so don't need specific size
integer::sizeinfo
sizeinfo = size(mat)
funcmat = f(mat,sizeinfo)
end function funcmat
!--------------------------
real function f1(arg)
!This test function double the number arg
implicit none
real::arg
f1 = arg*2
return
end function f1
real function f2(arg)
!This test function square the number arg
implicit none
real::arg
f2 = arg*arg
return
end function f2
real function fmat(mat,sizeinfo)
!This test function sum up all elements in the mat
implicit none
integer::sizeinfo!This is the method I came up with to get around the
!restriction.
real,dimension(sizeinfo)::mat!This mat cannot be assumed-shape (:), otherwise it
!won't receive mat correctly: "external" gives the caller no explicit interface.
fmat = sum(mat)
end function fmat
subroutine sub(arg)
real::arg
arg = arg*3
end subroutine sub
end module
Main program:
program main
use func_arg_test
implicit none
real::a = 5d0
real::output
real, dimension(:),allocatable::mat
write(*,*) 'value of a=',a
output = func_func(f1,a)
write(*,*) 'a is doubled'
write(*,*) output
output = func_func(f2,a)
write(*,*) 'a is squared'
write(*,*) output
output = func_sub(sub,a)
write(*,*) 'a is tripled and overwritten'
write(*,*) output
call sub_func(f2,a)
write(*,*) 'a is squared and overwritten'
write(*,*) a
call sub_sub(sub,a)
write(*,*) 'a is tripled and overwritten'
write(*,*) a
allocate(mat(3))
mat = (/1d0,10d0,1d0/)!The allocatable array has to have a determined shape before
!being passed as an argument
write(*,*) '1D matrix:',mat
write(*,*) 'Summation of the matrix:'
output = funcmat(fmat,mat)!elements of mat are summed
write(*,*) output
end program
And the result is:
value of a= 5.000000
a is doubled
10.00000
a is squared
25.00000
a is tripled and overwritten
15.00000
a is squared and overwritten
225.0000
a is tripled and overwritten
675.0000
1D matrix: 1.000000 10.00000 1.000000
Summation of the matrix:
12.00000
I have a program I want to parallelize using MPI. I have not worked with MPI before.
The program calculates the behavior for a large number of objects over time. The data of
these objects is stored in arrays, e.g. double precision :: body_x(10000) for the x coordinate.
To calculate the behavior of an object the information about all other objects is needed,
so every thread needs to hold all data but will only update a portion of it. But before the
new timestep every thread needs to get the information from all other threads.
As I understand MPI_Allgather could be used for this, but it needs a send buffer and a
receive buffer. How can I synchronize an array over different threads if each thread updated
a different part of the array? Do I have to send the whole array from each thread to the
master in a receive buffer, update the specific part of the master's array and after all
threads have sent their data re-broadcast from the master?
This is a pretty basic question, but I'm very new to MPI and all examples I found are
pretty simple and do not cover this. Thanks for any help.
Pseudo-Example (assuming Fortran-Style vectors with first index 1):
(Yes, the send/receive would better be done non-blocking; this is for the sake of simplicity)
if (master) then
readInputFile
end if
MPI_Bcast(numberOfObjects)
allocate body_arrays(numberOfObjects)
if (master) then
fill body_arrays ! with the data from the input file
end if
MPI_Bcast(body_arrays)
objectsPerThread = numberOfObjects / threadCount
myStart = threadID * objectsPerThread + 1
myEnd = (threadID + 1) * objectsPerThread
do while (t < t_end)
do i = myStart, myEnd
do stuff for body_arrays(i)
end do
! here is the question
if (.not. master) then
MPI_Send(body_arrays, toMaster)
else
do i = 1, threadCount - 1
MPI_Recv(body_arrays_receive, senderID)
body_arrays(senderID*objectsPerThread+1 : (senderID+1)*objectsPerThread) = body_arrays_receive(senderID*objectsPerThread+1 : (senderID+1)*objectsPerThread)
end do
end if
MPI_Bcast(body_arrays)
! ----
t = t + dt
end do
It sounds like you want MPI_Allgather. To avoid needing a separate send buffer, you may be able to use the MPI_IN_PLACE value. That tells MPI to use the same buffer for both send and receive.
See http://mpi-forum.org/docs/mpi-2.2/mpi22-report/node99.htm#Node99
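A minimal sketch of what the in-place variant could look like, assuming every process owns the same number of objects so that plain MPI_Allgather is enough. The program name, objects_per_rank and the dummy update are illustrative assumptions, not taken from the question.

```fortran
program allgather_in_place
  use mpi
  implicit none
  ! Assumption: every rank owns the same number of objects.
  integer, parameter :: objects_per_rank = 4
  double precision, allocatable :: body_x(:)
  integer :: ierr, rank, nprocs, my_start, my_end, i

  call MPI_Init(ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
  call MPI_Comm_size(MPI_COMM_WORLD, nprocs, ierr)

  allocate(body_x(objects_per_rank*nprocs))
  body_x = 0.0d0

  ! Each rank updates only its own contiguous slice of the full array.
  my_start = rank*objects_per_rank + 1
  my_end   = (rank + 1)*objects_per_rank
  do i = my_start, my_end
    body_x(i) = dble(i)        ! stand-in for the real per-object update
  end do

  ! With MPI_IN_PLACE the send count and type are ignored; each rank's
  ! contribution is read from its own slot of the receive buffer.
  call MPI_Allgather(MPI_IN_PLACE, 0, MPI_DATATYPE_NULL, &
                     body_x, objects_per_rank, MPI_DOUBLE_PRECISION, &
                     MPI_COMM_WORLD, ierr)

  ! Every rank now holds the complete, updated array.
  if (rank == 0) write(*,*) body_x

  deallocate(body_x)
  call MPI_Finalize(ierr)
end program allgather_in_place
```

If the number of objects does not divide evenly among the processes, the slices have different sizes and MPI_Allgatherv with per-rank counts and displacements is needed instead.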
The array chunks from all processes can be combined using a call to MPI_Allgatherv. The following is a complete example in Fortran. It defines an array of 11 elements (indices -5 to 5). Then each process sets a chunk of that array to some complex number. Finally, the call to MPI_Allgatherv pulls all the chunks together. The calculations of the chunk sizes, and some of the parameters that need to be passed to MPI_Allgatherv, are encapsulated in the mpi_split routine.
program test
use mpi
implicit none
integer, parameter :: idp = 8
integer, parameter :: n_tasks = 11
real(idp), parameter :: zero = 0.0d0
complex(idp), parameter :: czero = cmplx(zero, zero, kind=idp)
integer :: mpi_n_procs, mpi_proc_id, error
integer :: i, i_from, i_to
complex(idp) :: c(-5:5)
real(idp) :: split_size
integer, allocatable :: recvcount(:), displs(:)
call MPI_Init(error)
call MPI_Comm_size(MPI_COMM_WORLD, mpi_n_procs, error)
call MPI_Comm_rank(MPI_COMM_WORLD, mpi_proc_id, error)
allocate(recvcount(mpi_n_procs))
allocate(displs(mpi_n_procs))
i_from = -5
i_to = 5
! each process covers only part of the array
call mpi_split(i_from, i_to, counts=recvcount, displs=displs)
write(*,*) "ID", mpi_proc_id,":", i_from, "..", i_to
if (mpi_proc_id == 0) then
write(*,*) "Counts: ", recvcount
write(*,*) "Displs: ", displs
end if
c(:) = czero
do i = i_from, i_to
c(i) = cmplx(real(i, idp), real(i+1, idp), kind=idp)
end do
call MPI_Allgatherv(c(i_from), i_to-i_from+1, MPI_DOUBLE_COMPLEX, c, &
& recvcount, displs, MPI_DOUBLE_COMPLEX, MPI_COMM_WORLD, &
& error)
if (mpi_proc_id == 0) then
do i = -5, 5
write(*,*) i, ":", c(i)
end do
end if
deallocate(recvcount, displs)
call MPI_Finalize(error)
contains
!! #description: split the range (a,b) into equal chunks, where each chunk is
!! handled by a different MPI process
!! #param: a On input, the lower bound of an array to be processed. On
!! output, the lower index of the chunk that the MPI process
!! `proc_id` should process
!! #param: b On input, the upper bound of an array. On output, the
!! upper index of the chunk that process `proc_id` should
!! process.
!! #param: n_procs The total number of available processes. If not given,
!! this is determined automatically from the MPI environment.
!! #param: proc_id The (zero-based) process ID (`0 <= proc_id < n_procs`). If
!! not given, the ID of the current MPI process
!! #param: counts If given, must be of size `n_procs`. On output, the chunk
!! size for each MPI process
!! #param: displs If given, must be of size `n_procs`. On output, the offset
!! of the first index processed by each MPI process, relative
!! to the input value of `a`
subroutine mpi_split(a, b, n_procs, proc_id, counts, displs)
integer, intent(inout) :: a
integer, intent(inout) :: b
integer, optional, intent(in) :: n_procs
integer, optional, intent(in) :: proc_id
integer, optional, intent(inout) :: counts(:)
integer, optional, intent(inout) :: displs(:)
integer :: mpi_n_procs, n_tasks, mpi_proc_id, error
integer :: aa, bb
real(idp) :: split_size
logical :: mpi_is_initialized
mpi_n_procs = 1
if (present(n_procs)) mpi_n_procs = n_procs
mpi_proc_id = 0
if (present(proc_id)) mpi_proc_id = proc_id
if (.not. present(n_procs)) then
call MPI_Comm_size(MPI_COMM_WORLD, mpi_n_procs, error)
end if
if (.not. present(proc_id)) then
call MPI_Comm_rank(MPI_COMM_WORLD, mpi_proc_id, error)
end if
aa = a
bb = b
n_tasks = bb - aa + 1
split_size = real(n_tasks, idp) / real(max(mpi_n_procs, 1), idp)
a = nint(mpi_proc_id * split_size) + aa
b = min(aa + nint((mpi_proc_id+1) * split_size) - 1, bb)
if (present(counts)) then
do mpi_proc_id = 0, mpi_n_procs-1
counts(mpi_proc_id+1) = max(nint((mpi_proc_id+1) * split_size) &
& - nint((mpi_proc_id) * split_size), 0)
end do
end if
if (present(displs)) then
do mpi_proc_id = 0, mpi_n_procs-1
displs(mpi_proc_id+1) = min(nint(mpi_proc_id * split_size), bb-aa)
end do
end if
end subroutine mpi_split
end program
I have a program in R which calls a couple of Fortran routines, which are openMP-enabled. There are two Fortran routines sub_1 and sub_2. The first one is called twice in an R function, while the second is called once. Both routines are almost identical except for a few minor things. I call the first routine, then the second, then the first again. However, if I have both of them openMP-enabled, the function stops doing anything (doesn't have an error or stop execution, just sits there) when it gets to the second time it uses the first fortran routine.
If I disable the openMP in sub_1 then everything runs fine. If I instead disable the openMP in sub_2, then it again hangs in the same fashion on the second usage of sub_1. This is odd because it obviously gets through the first usage fine.
I thought it may be to do with the threads not closing properly or something (I don't know too much about openMP). However, another oddity is that the R function that calls these three routines is being called four times, and if I only enable openMP in sub_2, then this works fine (ie. the second, third etc. call to sub_2 doesn't hang). I just have no idea why it would do this! For reference, this is the code for sub_1:
subroutine correlation_dd_rad(s_bins,min_s,end_s,n,pos1,dd,r)
!!! INTENT IN !!!!!!!!
integer :: s_bins !Number of separation bins
integer :: N !Number of objects
real(8) :: pos1(3,N) !Cartesian Positions of particles
real(8) :: min_s !The smallest separation calculated.
real(8) :: end_s !The largest separation calculated.
real(8) :: r(N) !The radii of each particle (ascending)
!!! INTENT OUT !!!!!!!
real(8) :: dd(N,s_bins) !The binned data.
!!! LOCAL !!!!!!!!!!!!
integer :: i,j !Iterators
integer :: bin
real(8) :: d !Distance between particles.
real(8) :: dr,mins,ends
real(8),parameter :: pi = 3.14159265359d0
integer :: counter
dd(:,:) = 0.d0
dr = (end_s-min_s)/s_bins
!Perform the separation binning
mins = min_s**2
ends = end_s**2
counter = 1000
!$OMP parallel do private(d,bin,j)
do i=1,N
!$omp critical (count_it)
counter = counter - 1
!$omp end critical (count_it)
if(counter==0)then
counter = 1000
write(*,*) "Another Thousand"
end if
do j=i+1,N
if(r(j)-r(i) .GT. end_s)then
exit
end if
d=(pos1(1,j)-pos1(1,i))**2+&
&(pos1(2,j)-pos1(2,i))**2+&
&(pos1(3,j)-pos1(3,i))**2
if(d.LT.ends .AND. d.GT.mins)then
d = Sqrt(d)
bin = Floor((d-min_s)/dr)+1
dd(i,bin) = dd(i,bin)+1.d0
dd(j,bin) = dd(j,bin)+1.d0
end if
end do
end do
!$OMP end parallel do
write(*,*) "done"
end subroutine
Does anyone have any clue why this would happen??
Cheers.
I'll add in the smallest example that I can think of that does reproduce the problem (by the way, this must be an R problem - a small example of the type that I present here but written in fortran works fine). So I have the above code and the following code in fortran, compiled to the shared object correlate.so:
subroutine correlation_dr_rad(s_bins,min_s,end_s,n,pos1,n2,pos2,dd,r1,r2)
!!! INTENT IN !!!!!!!!
integer :: s_bins !Number of separation bins
integer :: N !Number of objects
integer :: n2
real(8) :: pos1(3,N) !Cartesian Positions of particles
real(8) :: pos2(3,n2) !random particles
real(8) :: end_s !The largest separation calculated.
real(8) :: min_s !The smallest separation
real(8) :: r1(N),r2(N2) !The radii of particles (ascending)
!!! INTENT OUT !!!!!!!
real(8) :: dd(N,s_bins) !The binned data.
!!! LOCAL !!!!!!!!!!!!
integer :: i,j !Iterators
integer :: bin
real(8) :: d !Distance between particles.
real(8) :: dr,mins,ends
real(8),parameter :: pi = 3.14159265359d0
integer :: counter
dd(:,:) = 0.d0
dr = (end_s-min_s)/s_bins
!Perform the separation binning
mins = min_s**2
ends = end_s**2
write(*,*) "Got just before parallel dr"
counter = 1000
!$OMP parallel do private(d,bin,j)
do i=1,N
!$OMP critical (count)
counter = counter - 1
!$OMP end critical (count)
if(counter==0)then
write(*,*) "Another thousand"
counter = 1000
end if
do j=1,N2
if(r2(j)-r1(i) .GT. end_s)then
exit
end if
d=(pos1(1,j)-pos2(1,i))**2+&
&(pos1(2,j)-pos2(2,i))**2+&
&(pos1(3,j)-pos2(3,i))**2
if(d.GT.mins .AND. d.LT.ends)then
d = Sqrt(d)
bin = Floor((d-min_s)/dr)+1
dd(i,bin) = dd(i,bin)+1.d0
end if
end do
end do
!$OMP end parallel do
write(*,*) "Done"
end subroutine
Then in R, I have the following functions - the first two just wrap the above fortran code. The third calls it in a similar way to my actual code:
correlate_dd_rad = function(pos,r,min_r,end_r,bins){
#A wrapper for the fortran routine of the same name.
dyn.load('correlate.so')
out = .Fortran('correlation_dd_rad',
s_bins = as.integer(bins),
min_s = as.double(min_r),
end_s = as.double(end_r),
n = as.integer(length(r)),
pos = as.double(t(pos)),
dd = matrix(0,length(r),bins), #The output matrix.
r = as.double(r))
dyn.unload('correlate.so')
return(out$dd)
}
correlate_dr_rad = function(pos1,r1,pos2,r2,min_r,end_r,bins){
#A wrapper for the fortran routine of the same name
N = length(r1)
N2 = length(r2)
dyn.load('correlate.so')
out = .Fortran('correlation_dr_rad',
s_bins = as.integer(bins),
min_s = as.double(min_r),
end_s = as.double(end_r),
n = N,
pos1 = as.double(t(pos1)),
n2 = N2,
pos2 = as.double(t(pos2)),
dr = matrix(0,nrow=N,ncol=bins),
r1 = as.double(r1),
r2 = as.double(r2))
dyn.unload('correlate.so')
return(out$dr)
}
the_calculation = function(){
#Generate some data to use
pos1 = matrix(rnorm(30000),10000,3)
pos2 = matrix(rnorm(30000),10000,3)
#Find the radii
r1 = sqrt(pos1[,1]^2 + pos1[,2]^2+pos1[,3]^2)
r2 = sqrt(pos2[,1]^2 + pos2[,2]^2+pos2[,3]^2)
#usually sort them but it doesn't matter here.
#Now call the functions
print("Calculating the data-data pairs")
dd = correlate_dd_rad(pos=pos1,r=r1,min_r=0.001,end_r=0.8,bins=15)
print("Calculating the data-random pairs")
dr = correlate_dr_rad(pos1,r1,pos2,r2,min_r=0.001,end_r=0.8,bins=15)
print("Calculating the random-random pairs")
rr = correlate_dd_rad(pos=pos2,r=r2,min_r=0.001,end_r=0.8,bins=15)
#Now we would do something with it but I don't care in this example.
print("Done")
}
Running this I get the output:
[1] "Calculating the data-data pairs"
Another Thousand
Another Thousand
Another Thousand
Another Thousand
Another Thousand
Another Thousand
Another Thousand
Another Thousand
Another Thousand
Another Thousand
done
[1] "Calculating the data-random pairs"
Got just before parallel dr
Another thousand
Another thousand
And then it just sits there... Actually, running it a few times has shown that it changes where it hangs each time. Sometimes it gets most of the way through the second call to correlate_dd_rad and others it only gets halfway through the call to correlate_dr_rad.
I am not sure if this will solve your problem, but it is indeed a bug. In subroutine correlation_dd_rad when you intended to close the parallel region, you actually put a comment. To be more clear the line that reads:
!OMP end parallel do
should be converted to:
!$OMP end parallel do
As side notes:
you don't need to use omp_lib if you don't call the library functions
you can use the atomic construct (see section 2.8.5 of the latest OpenMP specifications) to access a specific storage location atomically, instead of a critical construct
always give a name to critical constructs, because (section 2.8.2 of the specifications):
All critical constructs without a name are considered to have the same unspecified name.
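As a rough illustration of the atomic suggestion, here is a small sketch in which several threads increment shared histogram bins, similar in spirit to the dd(i,bin) updates in the question. The program name, array sizes and binning rule are assumptions made up for the example.

```fortran
program atomic_bins
  implicit none
  integer, parameter :: n = 10000, nbins = 4
  integer :: hist(nbins)
  integer :: i, bin

  hist = 0

  ! bin must be private; hist is shared and updated by several threads.
  !$omp parallel do private(bin)
  do i = 1, n
    bin = mod(i, nbins) + 1
    ! The atomic construct protects just this one update of one location.
    !$omp atomic
    hist(bin) = hist(bin) + 1
  end do
  !$omp end parallel do

  write(*,*) hist          ! each bin ends up with n/nbins = 2500 counts
end program atomic_bins
```

Without the atomic directive two threads could read the same old value of hist(bin) and one increment would be lost; compiled without OpenMP the directives are plain comments and the program gives the same result serially.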