How to write out complex numbers using a Fortran format specifier

I am new to Fortran and I have the following code; it solves a simple quadratic equation and outputs the solutions:
!solving ax^2 + bx + c = 0
program output
implicit none
real :: a,b,c
character :: response
do
print*, 'Enter three coefficients a,b and c'
read *, a,b,c
call quad(a,b,c)
print*, 'Press Y to continue. Anykey for otherwise'
read *, response
if ( response /= 'y' .and. response /= 'Y') stop
end do
end program output
!Function to calculate Xs
subroutine quad(a,b,c)
implicit none
real :: a,b,c,xplus, xminus
xplus = ((-b)+sqrt((b**2)-(4*a*c)))/(2*a)
xminus = ((-b)-sqrt((b**2)-(4*a*c)))/(2*a)
if (xplus == xminus) then
print*, 'There exists 1 root only: '
write(*,12) xplus
12 format(2f10.2)
else
print*, 'Solutions of quadratic equation are'
write(*,10) xplus, xminus
10 format(1f10.5)
end if
end subroutine quad
This works. However, how would I go about handling complex solutions? I.e., how would this line change to make the format work for complex numbers?
10 format(1f10.5)
Thank you so much.
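For what it's worth, a rough sketch of one way to handle the complex case (not a definitive answer): a complex value is matched by two real edit descriptors, one for the real part and one for the imaginary part, so a format such as 2f10.5 prints one complex number per record. Reworking quad along those lines might look like this:
subroutine quad(a,b,c)
  implicit none
  real :: a,b,c
  complex :: disc, xplus, xminus
  ! the complex sqrt handles a negative discriminant as well
  disc = sqrt(cmplx(b**2 - 4*a*c, 0.0))
  xplus  = (-b + disc)/(2*a)
  xminus = (-b - disc)/(2*a)
  print*, 'Solutions of quadratic equation are'
  write(*,10) xplus, xminus
  ! each complex root consumes two f descriptors: real part, then imaginary part
10 format(2f10.5)
end subroutine quad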

Related

How do I finish a recursive binary search in Fortran?

I am trying to find the smallest index containing the value i in a sorted array. If this value is not present, I want -1 to be returned. I am using a recursive binary search subroutine. The problem is that I can't really stop this recursion and I get a lot of answers (one right and the rest wrong). Sometimes I also get an error called "segmentation fault: 11" and don't get any results at all.
I've tried deleting the call random_number since I already have a sorted array in my main program, but it did not work.
program main
implicit none
integer, allocatable :: A(:)
real :: MAX_VALUE
integer :: i,j,n,s, low, high
real :: x
N= 10 !size of table
MAX_VALUE = 10
allocate(A(n))
s = 5 ! searched value
low = 1 ! lower limit
high = n ! highest limit
!generate random table of numbers (from 0 to 1000)
call Random_Seed
do i=1, N
call Random_Number(x) !returns random x >= 0 and <1
A(i)= anint(MAX_VALUE*x)
end do
call bubble(n,a)
print *,' '
write(*,10) (a(i),i=1,N)
10 format(10i6)
call bsearch(A,n,s,low,high)
deallocate(A)
end program main
The sort subroutine:
subroutine sort(p,q)
implicit none
integer(kind=4), intent(inout) :: p, q
integer(kind=4) :: temp
if (p>q) then
temp = p
p = q
q = temp
end if
return
end subroutine sort
The bubble subroutine:
subroutine bubble(n,arr)
implicit none
integer(kind=4), intent(inout) :: n
integer(kind=4), intent(inout) :: arr(n)
integer(kind=4) :: sorted(n)
integer :: i,j
do i=1, n
do j=n, i+1, -1
call sort(arr(j-1), arr(j))
end do
end do
return
end subroutine bubble
recursive subroutine bsearch(b,n,i,low,high)
implicit none
integer(kind=4) :: b(n)
integer(kind=4) :: low, high
integer(kind=4) :: i,j,x,idx,n
real(kind=4) :: r
idx = -1
call random_Number(r)
x = low + anint((high - low)*r)
if (b(x).lt.i) then
low = x + 1
call bsearch(b,n,i,low,high)
else if (b(x).gt.i) then
high = x - 1
call bsearch(b,n,i,low,high)
else
do j = low, high
if (b(j).eq.i) then
idx = j
exit
end if
end do
end if
! Stop if high = low
if (low.eq.high) then
return
end if
print*, i, 'found at index ', idx
return
end subroutine bsearch
The goal is to get the same results as my linear search. But I'm getting either of these answers.
Sorted table:
1 1 2 4 5 5 6 7 8 10
5 found at index 5
5 found at index -1
5 found at index -1
or if the value is not found
2 2 3 4 4 6 6 7 8 8
Segmentation fault: 11
There are two issues causing your recursive search routine bsearch to either stop with unwanted output or result in a segmentation fault. Simply following the execution logic of your program, using the examples you provided, elucidates the matter:
1) value present and found, unwanted output
First, consider the first example where array b contains the value i=5 you are searching for (value and index pointed out with || in the first two lines of the code block below). Using the notation Rn to indicate the n-th level of recursion, L and H for the lower and upper bounds, and x for the current index estimate, a given run of your code could look something like this:
b(x): 1 1 2 4 |5| 5 6 7 8 10
x: 1 2 3 4 |5| 6 7 8 9 10
R0: L x H
R1: Lx H
R2: L x H
5 found at index 5
5 found at index -1
5 found at index -1
In R0 and R1, the tests b(x).lt.i and b(x).gt.i in bsearch work as intended to reduce the search interval. In R2 the do-loop in the else branch is executed, idx is assigned the correct value and this is printed - as intended. However, a return statement is now encountered which will return control to the calling program unit - in this case that is first R1(!) where execution will resume after the if-else if-else block, thus printing a message to screen with the initial value of idx=-1. The same happens upon returning from R0 to the main program. This explains the (unwanted) output you see.
2) value not present, segmentation fault
Secondly, consider the example resulting in a segmentation fault. Using the same notation as before, a possible run could look like this:
b(x): 2 2 3 4 4 6 6 7 8 8
x: 1 2 3 4 5 6 7 8 9 10
R0: L x H
R1: L x H
R2: L x H
R3: LxH
R4: H xL
.
.
.
Segmentation fault: 11
In R0 to R2 the search interval is again reduced as intended. However, in R3 the logic fails. Since the search value i is not present in array b, one of the .lt. or .gt. tests will always evaluate to .true., meaning that the test for low .eq. high to terminate a search is never reached. From this point onwards, the logic is no longer valid (e.g. high can be smaller than low) and the code will continue deepening the level of recursion until the call stack gets too big and a segmentation fault occurs.
Those are the main logical flaws in the code. A possible inefficiency is the use of a do-loop to find the lowest index containing a searched-for value. Consider a case where the value you are looking for is, e.g., i=8 and appears in the last position of your array, as below. Assume further that, by chance, the first guess for its position is x = high. This implies that your code will immediately branch to the do-loop, where in effect a linear search of very nearly the entire array is done to find the final result idx=9. Although correct, the intended binary search effectively becomes a linear search, which could reduce performance.
b(x): 2 2 3 4 4 6 6 7 |8| 8
x: 1 2 3 4 5 6 7 8 |9| 10
R0: L xH
8 found at index 9
Fixing the problems
At the very least, you should move the low .eq. high test to the start of the bsearch routine, so that recursion stops before invalid bounds can be defined (you then need an additional test to see if the search value was found or not). Also, notify about a successful search right after it occurs, i.e. after the equality test in your do-loop, or the additional test just mentioned. This still does not address the inefficiency of a possible linear search.
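A rough sketch of that reordering (hypothetical, keeping the question's argument list; crossed bounds are reported as not found so no out-of-range index is ever touched, and the linear scan in the equality branch is kept as in the original):
recursive subroutine bsearch(b,n,i,low,high)
  implicit none
  integer :: n, i, low, high
  integer :: b(n)
  integer :: x, j, idx
  real :: r
  ! termination tests first, before any new (possibly invalid) bounds are used
  if (low > high) then
    print*, i, 'found at index ', -1
    return
  end if
  if (low == high) then
    if (b(low) == i) then
      print*, i, 'found at index ', low
    else
      print*, i, 'found at index ', -1
    end if
    return
  end if
  call random_number(r)
  x = low + floor((high - low + 1)*r)   ! random guess inside [low, high]
  if (b(x) < i) then
    call bsearch(b, n, i, x + 1, high)
  else if (b(x) > i) then
    call bsearch(b, n, i, low, x - 1)
  else
    ! notify right after the hit; the do-loop still scans linearly for the leftmost occurrence
    idx = x
    do j = low, x - 1
      if (b(j) == i) then
        idx = j
        exit
      end if
    end do
    print*, i, 'found at index ', idx
  end if
end subroutine bsearch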
All taken into account, you are probably better off reading up on algorithms for finding a "leftmost" index (e.g. on Wikipedia, or looking at a tried and tested implementation - both examples use iteration instead of recursion, perhaps another improvement, but the same principles apply) and adapting that to Fortran, which could look something like this (only new code shown; ... refers to existing code from your example):
module mod_search
implicit none
contains
! Function that uses recursive binary search to look for `key` in an
! ordered `array`. Returns the array index of the leftmost occurrence
! of `key` if present in `array`, and -1 otherwise
function search_ordered (array, key) result (idx)
integer, intent(in) :: array(:)
integer, intent(in) :: key
integer :: idx
! find left most array index that could possibly hold `key`
idx = binary_search_left(1, size(array))
! if `key` is not found, return -1
if (array(idx) /= key) then
idx = -1
end if
contains
! function for recursive reduction of search interval
recursive function binary_search_left(low, high) result(idx)
integer, intent(in) :: low, high
integer :: idx
real :: r
if (high <= low ) then
! found lowest possible index where target could be
idx = low
else
! new guess
call random_number(r)
idx = low + floor((high - low)*r)
! alternative: idx = low + (high - low) / 2
if (array(idx) < key) then
! continue looking to the right of current guess
idx = binary_search_left(idx + 1, high)
else
! continue looking to the left of current guess (inclusive)
idx = binary_search_left(low, idx)
end if
end if
end function binary_search_left
end function search_ordered
! Move your routines into a module
subroutine sort(p,q)
...
end subroutine sort
subroutine bubble(n,arr)
...
end subroutine bubble
end module mod_search
! your main program
program main
use mod_search, only : search_ordered, sort, bubble ! <---- use routines from module like so
implicit none
...
! Replace your call to bsearch() with the following:
! call bsearch(A,n,s,low,high)
i = search_ordered(A, s)
if (i /= -1) then
print *, s, 'found at index ', i
else
print *, s, 'not found!'
end if
...
end program main
Finally, depending on your actual use case, you could also just consider using the Fortran intrinsic procedure minloc, saving you the trouble of implementing all this functionality yourself. In this case, it can be done by making the following modification in your main program:
! i = search_ordered(a, s) ! <---- comment out this line
j = minloc(abs(a-s), dim=1) ! <---- replace with these two
i = merge(j, -1, a(j) == s)
where j returned from minloc will be the lowest index in the array a where s may be found, and merge is used to return j when a(j) == s and -1 otherwise.
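A self-contained toy program (array values taken from the first example output above) showing the minloc/merge combination in isolation:
program minloc_demo
  implicit none
  integer :: a(10) = [1, 1, 2, 4, 5, 5, 6, 7, 8, 10]   ! sorted, as in the example
  integer :: s, i, j
  s = 5
  j = minloc(abs(a - s), dim=1)     ! lowest index minimising |a(j) - s|
  i = merge(j, -1, a(j) == s)       ! keep j only if a(j) really equals s
  print *, s, 'found at index ', i  ! prints: 5 found at index 5
end program minloc_demo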

Wrong Fortran Output [duplicate]

I've written a rudimentary algorithm in Fortran 95 to calculate the gradient of a function (an example of which is prescribed in the code) using central differences augmented with a procedure known as Richardson extrapolation.
function f(n,x)
! The scalar multivariable function to be differentiated
integer :: n
real(kind = kind(1d0)) :: x(n), f
f = x(1)**5.d0 + cos(x(2)) + log(x(3)) - sqrt(x(4))
end function f
!=====!
!=====!
!=====!
program gradient
!==============================================================================!
! Calculates the gradient of the scalar function f at x=0 using a finite !
! difference approximation, with a low order Richardson extrapolation. !
!==============================================================================!
parameter (n = 4, M = 25)
real(kind = kind(1d0)) :: x(n), xhup(n), xhdown(n), d(M), r(M), dfdxi, h0, h, gradf(n)
h0 = 1.d0
x = 3.d0
! Loop through each component of the vector x and calculate the appropriate
! derivative
do i = 1,n
! Reset step size
h = h0
! Carry out M successive central difference approximations of the derivative
do j = 1,M
xhup = x
xhdown = x
xhup(i) = xhup(i) + h
xhdown(i) = xhdown(i) - h
d(j) = ( f(n,xhup) - f(n,xhdown) ) / (2.d0*h)
h = h / 2.d0
end do
r = 0.d0
do k = 3,M
r(k) = ( 64.d0*d(k) - 20.d0*d(k-1) + d(k-2) ) / 45.d0
if ( abs(r(k) - r(k-1)) < 0.0001d0 ) then
dfdxi = r(k)
exit
end if
end do
gradf(i) = dfdxi
end do
! Print out the gradient
write(*,*) " "
write(*,*) " Grad(f(x)) = "
write(*,*) " "
do i = 1,n
write(*,*) gradf(i)
end do
end program gradient
In single precision it runs fine and gives me decent results. But when I try to change to double precision as shown in the code, I get an error when trying to compile claiming that the assignment statement
d(j) = ( f(n,xhup) - f(n,xhdown) ) / (2.d0*h)
is producing a type mismatch real(4)/real(8). I have tried several different declarations of double precision, appended every appropriate double precision constant in the code with d0, and I get the same error every time. I'm a little stumped as to how the function f is possibly producing a single precision number.
The problem is that f is not explicitly defined in your main program; therefore it is implicitly assumed to be of single precision, which is the type real(4) for gfortran.
I completely agree with the comment of High Performance Mark that you really should use implicit none in all your Fortran code, to make sure all objects are explicitly declared. This way, you would have obtained a more appropriate error message about f not being explicitly defined.
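As a side illustration (not part of the original answer), the shortest repair is then to declare the external function's result kind in the program unit that calls it, so that no implicit single-precision type is assumed; the hypothetical function g below stands in for f:
! g is a made-up stand-in for f
function g(x)
  real(kind = kind(1d0)) :: x, g
  g = x**2
end function g

program call_g
  implicit none
  real(kind = kind(1d0)), external :: g   ! result kind declared explicitly, no implicit real(4)
  print *, g(3.d0)                        ! prints 9.0000000000000000
end program call_g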
Also, you could consider two more things:
Define your function within a module and import that module in the main program. It is a good practice to define all subroutines/functions within modules only, so that the compiler can make extra checks on number and type of the arguments, when you invoke the function.
You could (again in a module) introduce a constant for the precision and use it everywhere the kind of a real must be specified. Taking the example below, by changing only the line
integer, parameter :: dp = kind(1.0d0)
into
integer, parameter :: dp = kind(1.0)
you would change all your real variables from double to single precision. Also note the _dp suffix for the literal constants instead of the d0 suffix, which would automatically adjust their precision as well.
module accuracy
implicit none
integer, parameter :: dp = kind(1.0d0)
end module accuracy
module myfunc
use accuracy
implicit none
contains
function f(n,x)
integer :: n
real(dp) :: x(n), f
f = 0.5_dp * x(1)**5 + cos(x(2)) + log(x(3)) - sqrt(x(4))
end function f
end module myfunc
program gradient
use myfunc
implicit none
real(dp) :: x(n), xhup(n), xhdown(n), d(M), r(M), dfdxi, h0, h, gradf(n)
:
end program gradient
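To see the effect of switching that constant, a throwaway check (hypothetical, not part of the answer) can print the kind and decimal precision actually in use:
program kind_check
  use accuracy            ! the module defined above
  implicit none
  real(dp) :: y
  y = 1.0_dp / 3.0_dp
  ! with dp = kind(1.0d0) this typically reports a precision of 15 digits; with kind(1.0), 6
  print *, 'kind =', kind(y), ' precision =', precision(y), ' y =', y
end program kind_check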

Recursive Combinations in Fortran

I wrote a recursive program in Fortran to calculate the combinations of npoints points of ndim dimensions, as follows. I first wrote this program in MATLAB, where it ran perfectly. But in Fortran, my problem is that after the first iteration it assigns absurd values to the list of points, with no explanation. Could somebody give me a hand?
PROGRAM MAIN
IMPLICIT NONE
INTEGER :: ndim, k, npontos, contador,i,iterate, TEST
integer, dimension(:), allocatable :: pontos
print*, ' '
print*, 'npoints?'
read *, npontos
print*, 'ndim?'
read *, ndim
k=1
contador = 1
open(450,file= 'combination.out',form='formatted',status='unknown')
write(450,100) 'Comb ','stat ',(' pt ',i,' ',i=1,ndim)
write(450,120) ('XXXXXXXXXX ',i=1,ndim+1)
allocate(pontos(ndim))
do i=1,4
pontos(i)=i
end do
TEST = iterate(pontos, ndim, npontos,k,contador)
end program MAIN
recursive integer function iterate(pontos, ndim, npontos, k,contador)
implicit NONE
integer, intent(in) :: ndim, k, npontos
integer,dimension(:) :: pontos
integer contador,inic,i,j,m
if (k.eq.ndim) then
inic=pontos(ndim)
do i = pontos(ndim),npontos
pontos(k)= i
write(*,*) pontos(:)
contador=contador+1
end do
pontos(ndim)= inic + 1
else
inic = pontos (k)
do j = pontos(k),(npontos-ndim+k)
pontos(k)=j
pontos= iterate(pontos, ndim, npontos, k+1,contador)
end do
end if
pontos(k)=inic+1;
if (pontos(k).gt.(npontos-ndim+k+1)) then
do m =k+1,ndim
pontos(m)=pontos(m-1)+1
end do
end if
end function iterate
There are too many issues in that code... I stopped debugging it. This is what I have so far; it's too much for a comment.
This doesn't make sense:
pontos= iterate(pontos, ndim, npontos, k+1,contador)
You are changing pontos inside iterate, and never set a return value within the function.
You need to either a) provide a result clause for the recursive function (and actually set it) or b) convert it to a subroutine. Since you are modifying at least one dummy argument, you should go with b); a sketch of that conversion follows after these points.
Since you are using assumed-shape dummy arguments, you need to specify an interface to the function/subroutine, either explicitly or with a module.
Neither format 100 nor format 120 is specified in your code.
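As mentioned above, here is a self-contained sketch of the conversion in point b), combined with the module requirement. It reorganises the recursion around a start index instead of patching the original index bookkeeping, so take it as an assumed restructuring rather than a minimal fix (names follow the question):
module comb_mod
  implicit none
contains
  ! recursively emit every combination of ndim values chosen from 1..npontos;
  ! pontos(1:k-1) already holds the fixed part of the current combination
  recursive subroutine iterate(pontos, ndim, npontos, k, contador)
    integer, intent(inout) :: pontos(:)          ! assumed shape: the module supplies the explicit interface
    integer, intent(in)    :: ndim, npontos, k
    integer, intent(inout) :: contador
    integer :: j, jstart
    if (k == 1) then
      jstart = 1
    else
      jstart = pontos(k-1) + 1
    end if
    do j = jstart, npontos - ndim + k
      pontos(k) = j
      if (k == ndim) then
        write(*,*) pontos(1:ndim)
        contador = contador + 1
      else
        call iterate(pontos, ndim, npontos, k+1, contador)
      end if
    end do
  end subroutine iterate
end module comb_mod

program main
  use comb_mod
  implicit none
  integer, allocatable :: pontos(:)
  integer :: ndim, npontos, contador
  npontos = 5
  ndim = 3
  contador = 0
  allocate(pontos(ndim))
  call iterate(pontos, ndim, npontos, 1, contador)
  print *, 'total combinations: ', contador   ! C(5,3) = 10
end program main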

Synchronize an array over MPI processes if each thread changed a part of it?

I have a program I want to parallelize using MPI. I have not worked with MPI before.
The program calculates the behavior of a large number of objects over time. The data for these objects is stored in arrays, e.g. double precision :: body_x(10000) for the x coordinate.
To calculate the behavior of an object, the information about all other objects is needed, so every thread needs to hold all the data but will only update a portion of it. Before the new timestep, every thread needs to get the information from all other threads.
As I understand it, MPI_Allgather could be used for this, but it needs a send buffer and a receive buffer. How can I synchronize an array over different threads if each thread updated a different part of the array? Do I have to send the whole array from each thread to the master into a receive buffer, update the specific part of the master's array, and after all threads have sent their data re-broadcast it from the master?
This is a pretty basic question, but I'm very new to MPI and all the examples I found are pretty simple and do not cover this. Thanks for any help.
Pseudo-Example (assuming Fortran-Style vectors with first index 1):
(Yes, the send/receive would be better done non-blocking; this is just for the sake of simplicity.)
if (master) then
readInputFile
end if
MPI_Bcast(numberOfObject)
allocate body_arrays(numberOfObjects)
if (master) then
fill body_arrays ! with the data from the input file
end if
MPI_Bcast(body_arrays)
objectsPerThread = numberOfObjects / threadCount
myStart = threadID * objectsPerThread + 1
myEnd = (threadID + 1) * objectsPerThread
do while (t < t_end)
do i = myStart, myEnd
do stuff for body_arrays(i)
end do
! here is the question
if (.not. master)
MPI_Send(body_arrays, toMaster)
else
do i = 1, threadCount - 1
MPI_Recive(body_arrays_recive, senderID)
body_arrays(senderID*objectsPerThread+1 : (senderId+1)*objectsPerThread) = body_arrays_recive(senderID*objectsPerThread+1 : (senderId+1)*objectsPerThread)
end if
MPI_Bcast(body_arrays)
! ----
t = t + dt
end do
It sounds like you want MPI_Allgather. To avoid needing a separate send buffer, you may be able to use the MPI_IN_PLACE value. That tells MPI to use the same buffer for both send and receive.
See http://mpi-forum.org/docs/mpi-2.2/mpi22-report/node99.htm#Node99
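A minimal sketch of that in-place pattern for the situation in the question (the array name follows the question; the fixed size of 8 and the assumption that it divides evenly among the processes are made up for the example):
program inplace_gather
  use mpi
  implicit none
  integer, parameter :: n = 8
  double precision :: body_x(n)
  integer :: ierr, rank, nprocs, nper, i
  call MPI_Init(ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
  call MPI_Comm_size(MPI_COMM_WORLD, nprocs, ierr)
  nper = n / nprocs                       ! chunk owned by each rank (n assumed divisible)
  body_x = 0d0
  do i = rank*nper + 1, (rank+1)*nper     ! each rank fills only its own chunk
    body_x(i) = dble(rank)
  end do
  ! with MPI_IN_PLACE as the send buffer, each rank contributes the chunk it already
  ! holds in body_x and receives everyone else's chunks into the same array
  call MPI_Allgather(MPI_IN_PLACE, 0, MPI_DATATYPE_NULL, &
       &             body_x, nper, MPI_DOUBLE_PRECISION, MPI_COMM_WORLD, ierr)
  if (rank == 0) print *, body_x
  call MPI_Finalize(ierr)
end program inplace_gather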
The array chunks from all processes can be combined using a call to MPI_Allgatherv. The following is a complete example in Fortran. It defines an array of size 50. Then each process sets a chunk of that array to some complex number. Finally, the call to MPI_allgatherv pulls all the chunks together. The calculations of the chunk sizes, and some of the parameters that need to be passed to MPI_allgatherv are encapsulated in the mpi_split routine.
program test
use mpi
implicit none
integer, parameter :: idp = 8
integer, parameter :: n_tasks = 11
real(idp), parameter :: zero = 0.0d0
complex(idp), parameter :: czero = cmplx(zero, zero, kind=idp)
integer :: mpi_n_procs, mpi_proc_id, error
integer :: i, i_from, i_to
complex(idp) :: c(-5:5)
real(idp) :: split_size
integer, allocatable :: recvcount(:), displs(:)
call MPI_Init(error)
call MPI_Comm_size(MPI_COMM_WORLD, mpi_n_procs, error)
call MPI_Comm_rank(MPI_COMM_WORLD, mpi_proc_id, error)
allocate(recvcount(mpi_n_procs))
allocate(displs(mpi_n_procs))
i_from = -5
i_to = 5
! each process covers only part of the array
call mpi_split(i_from, i_to, counts=recvcount, displs=displs)
write(*,*) "ID", mpi_proc_id,":", i_from, "..", i_to
if (mpi_proc_id == 0) then
write(*,*) "Counts: ", recvcount
write(*,*) "Displs: ", displs
end if
c(:) = czero
do i = i_from, i_to
c(i) = cmplx(real(i, idp), real(i+1, idp), kind=idp)
end do
call MPI_Allgatherv(c(i_from), i_to-i_from+1, MPI_DOUBLE_COMPLEX, c, &
& recvcount, displs, MPI_DOUBLE_COMPLEX, MPI_COMM_WORLD, &
& error)
if (mpi_proc_id == 0) then
do i = -5, 5
write(*,*) i, ":", c(i)
end do
end if
deallocate(recvcount, displs)
call MPI_Finalize(error)
contains
!! #description: split the range (a,b) into equal chunks, where each chunk is
!! handled by a different MPI process
!! #param: a On input, the lower bound of an array to be processed. On
!! output, the lower index of the chunk that the MPI process
!! `proc_id` should process
!! #param: b On input, the upper bound of an array. On, output the
!! upper index of the chunk that process `proc_id` should
!! process.
!! #param: n_procs The total number of available processes. If not given,
!! this is determined automatically from the MPI environment.
!! #param: proc_id The (zero-based) process ID (`0 <= proc_id < n_procs`). If
!! not given, the ID of the current MPI process
!! #param: counts If given, must be of size `n_procs`. On output, the chunk
!! size for each MPI process
!! #param: displs If given, must be of size `n_procs`. On output, the offset
!! if the first index processed by each MPI process, relative
!! to the input value of `a`
subroutine mpi_split(a, b, n_procs, proc_id, counts, displs)
integer, intent(inout) :: a
integer, intent(inout) :: b
integer, optional, intent(in) :: n_procs
integer, optional, intent(in) :: proc_id
integer, optional, intent(inout) :: counts(:)
integer, optional, intent(inout) :: displs(:)
integer :: mpi_n_procs, n_tasks, mpi_proc_id, error
integer :: aa, bb
real(idp) :: split_size
logical :: mpi_is_initialized
mpi_n_procs = 1
if (present(n_procs)) mpi_n_procs = n_procs
mpi_proc_id = 0
if (present(proc_id)) mpi_proc_id = proc_id
if (.not. present(n_procs)) then
call MPI_Comm_size(MPI_COMM_WORLD, mpi_n_procs, error)
end if
if (.not. present(proc_id)) then
call MPI_Comm_rank(MPI_COMM_WORLD, mpi_proc_id, error)
end if
aa = a
bb = b
n_tasks = bb - aa + 1
split_size = real(n_tasks, idp) / real(max(mpi_n_procs, 1), idp)
a = nint(mpi_proc_id * split_size) + aa
b = min(aa + nint((mpi_proc_id+1) * split_size) - 1, bb)
if (present(counts)) then
do mpi_proc_id = 0, mpi_n_procs-1
counts(mpi_proc_id+1) = max(nint((mpi_proc_id+1) * split_size) &
& - nint((mpi_proc_id) * split_size), 0)
end do
end if
if (present(displs)) then
do mpi_proc_id = 0, mpi_n_procs-1
displs(mpi_proc_id+1) = min(nint(mpi_proc_id * split_size), bb-aa)
end do
end if
end subroutine mpi_split
end program

OpenMP in Fortran in R - program seems to hang for no reason?

I have a program in R which calls a couple of Fortran routines that are OpenMP-enabled. There are two Fortran routines, sub_1 and sub_2. The first one is called twice in an R function, while the second is called once. Both routines are almost identical except for a few minor things. I call the first routine, then the second, then the first again. However, if I have both of them OpenMP-enabled, the function stops doing anything (it doesn't raise an error or stop execution, it just sits there) when it gets to the second use of the first Fortran routine.
If I disable the OpenMP in sub_1 then everything runs fine. If I instead disable the OpenMP in sub_2, then it again hangs in the same fashion on the second usage of sub_1. This is odd because it obviously gets through the first usage fine.
I thought it might have to do with the threads not closing properly or something (I don't know too much about OpenMP). However, another oddity is that the R function that makes these three calls is itself called four times, and if I only enable OpenMP in sub_2, then this works fine (i.e. the second, third etc. call to sub_2 doesn't hang). I just have no idea why it would do this! For reference, this is the code for sub_1:
subroutine correlation_dd_rad(s_bins,min_s,end_s,n,pos1,dd,r)
!!! INTENT IN !!!!!!!!
integer :: s_bins !Number of separation bins
integer :: N !Number of objects
real(8) :: pos1(3,N) !Cartesian Positions of particles
real(8) :: min_s !The smallest separation calculated.
real(8) :: end_s !The largest separation calculated.
real(8) :: r(N) !The radii of each particle (ascending)
!!! INTENT OUT !!!!!!!
real(8) :: dd(N,s_bins) !The binned data.
!!! LOCAL !!!!!!!!!!!!
integer :: i,j !Iterators
integer :: bin
real(8) :: d !Distance between particles.
real(8) :: dr,mins,ends
real(8),parameter :: pi = 3.14159653589
integer :: counter
dd(:,:) = 0.d0
dr = (end_s-min_s)/s_bins
!Perform the separation binning
mins = min_s**2
ends = end_s**2
counter = 1000
!$OMP parallel do private(d,bin,j)
do i=1,N
!$omp critical (count_it)
counter = counter - 1
!$omp end critical (count_it)
if(counter==0)then
counter = 1000
write(*,*) "Another Thousand"
end if
do j=i+1,N
if(r(j)-r(i) .GT. end_s)then
exit
end if
d=(pos1(1,j)-pos1(1,i))**2+&
&(pos1(2,j)-pos1(2,i))**2+&
&(pos1(3,j)-pos1(3,i))**2
if(d.LT.ends .AND. d.GT.mins)then
d = Sqrt(d)
bin = Floor((d-min_s)/dr)+1
dd(i,bin) = dd(i,bin)+1.d0
dd(j,bin) = dd(j,bin)+1.d0
end if
end do
end do
!$OMP end parallel do
write(*,*) "done"
end subroutine
Does anyone have any clue why this would happen??
Cheers.
I'll add the smallest example I can think of that reproduces the problem (by the way, this must be an R problem - a small example of the type I present here, but written purely in Fortran, works fine). So I have the above code and the following code in Fortran, compiled into the shared object correlate.so:
subroutine correlation_dr_rad(s_bins,min_s,end_s,n,pos1,n2,pos2,dd,r1,r2)
!!! INTENT IN !!!!!!!!
integer :: s_bins !Number of separation bins
integer :: N !Number of objects
integer :: n2
real(8) :: pos1(3,N) !Cartesian Positions of particles
real(8) :: pos2(3,n2) !random particles
real(8) :: end_s !The largest separation calculated.
real(8) :: min_s !The smallest separation
real(8) :: r1(N),r2(N2) !The radii of particles (ascending)
!!! INTENT OUT !!!!!!!
real(8) :: dd(N,s_bins) !The binned data.
!!! LOCAL !!!!!!!!!!!!
integer :: i,j !Iterators
integer :: bin
real(8) :: d !Distance between particles.
real(8) :: dr,mins,ends
real(8),parameter :: pi = 3.14159653589
integer :: counter
dd(:,:) = 0.d0
dr = (end_s-min_s)/s_bins
!Perform the separation binning
mins = min_s**2
ends = end_s**2
write(*,*) "Got just before parallel dr"
counter = 1000
!$OMP parallel do private(d,bin,j)
do i=1,N
!$OMP critical (count)
counter = counter - 1
!$OMP end critical (count)
if(counter==0)then
write(*,*) "Another thousand"
counter = 1000
end if
do j=1,N2
if(r2(j)-r1(i) .GT. end_s)then
exit
end if
d=(pos1(1,j)-pos2(1,i))**2+&
&(pos1(2,j)-pos2(2,i))**2+&
&(pos1(3,j)-pos2(3,i))**2
if(d.GT.mins .AND. d.LT.ends)then
d = Sqrt(d)
bin = Floor((d-min_s)/dr)+1
dd(i,bin) = dd(i,bin)+1.d0
end if
end do
end do
!$OMP end parallel do
write(*,*) "Done"
end subroutine
Then in R, I have the following functions - the first two just wrap the above Fortran code. The third calls them in a similar way to my actual code:
correlate_dd_rad = function(pos,r,min_r,end_r,bins){
#A wrapper for the fortran routine of the same name.
dyn.load('correlate.so')
out = .Fortran('correlation_dd_rad',
s_bins = as.integer(bins),
min_s = as.double(min_r),
end_s = as.double(end_r),
n = as.integer(length(r)),
pos = as.double(t(pos)),
dd = matrix(0,length(r),bins), #The output matrix.
r = as.double(r))
dyn.unload('correlate.so')
return(out$dd)
}
correlate_dr_rad = function(pos1,r1,pos2,r2,min_r,end_r,bins){
#A wrapper for the fortran routine of the same name
N = length(r1)
N2 = length(r2)
dyn.load('correlate.so')
out = .Fortran('correlation_dr_rad',
s_bins = as.integer(bins),
min_s = as.double(min_r),
end_s = as.double(end_r),
n = N,
pos1 = as.double(t(pos1)),
n2 = N2,
pos2 = as.double(t(pos2)),
dr = matrix(0,nrow=N,ncol=bins),
r1 = as.double(r1),
r2 = as.double(r2))
dyn.unload('correlate.so')
return(out$dr)
}
the_calculation = function(){
#Generate some data to use
pos1 = matrix(rnorm(30000),10000,3)
pos2 = matrix(rnorm(30000),10000,3)
#Find the radii
r1 = sqrt(pos1[,1]^2 + pos1[,2]^2+pos1[,3]^2)
r2 = sqrt(pos2[,1]^2 + pos2[,2]^2+pos2[,3]^2)
#usually sort them but it doesn't matter here.
#Now call the functions
print("Calculating the data-data pairs")
dd = correlate_dd_rad(pos=pos1,r=r1,min_r=0.001,end_r=0.8,bins=15)
print("Calculating the data-random pairs")
dr = correlate_dr_rad(pos1,r1,pos2,r2,min_r=0.001,end_r=0.8,bins=15)
print("Calculating the random-random pairs")
rr = correlate_dd_rad(pos=pos2,r=r2,min_r=0.001,end_r=0.8,bins=15)
#Now we would do something with it but I don't care in this example.
print("Done")
}
Running this I get the output:
[1] "Calculating the data-data pairs"
Another Thousand
Another Thousand
Another Thousand
Another Thousand
Another Thousand
Another Thousand
Another Thousand
Another Thousand
Another Thousand
Another Thousand
done
[1] "Calculating the data-random pairs"
Got just before parallel dr
Another thousand
Another thousand
And then it just sits there... Actually, running it a few times has shown that it changes where it hangs each time. Sometimes it gets most of the way through the second call to correlate_dd_rad, and at other times it only gets halfway through the call to correlate_dr_rad.
I am not sure if this will solve your problem, but it is indeed a bug: in subroutine correlation_dd_rad, where you intended to close the parallel region, you actually put a plain comment. To be clearer, the line that reads:
!OMP end parallel do
should be converted to:
!$OMP end parallel do
As side notes:
you don't need to use omp_lib if you don't call the library functions
you can use the atomic construct (see section 2.8.5 of the latest OpenMP specifications) to access a specific storage location atomically instead of using a critical construct; a small sketch follows after these notes
always give a name to critical constructs, since (section 2.8.2 of the specifications):
All critical constructs without a name are considered to have the same unspecified name.
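A self-contained sketch of the second side note above (compile with an OpenMP flag such as -fopenmp; the loop count is made up):
program atomic_demo
  implicit none
  integer :: counter, i
  counter = 0
  !$omp parallel do
  do i = 1, 1000
    !$omp atomic
    counter = counter + 1        ! atomic update: no named critical section needed
  end do
  !$omp end parallel do
  print *, 'counter =', counter  ! always 1000, whatever the thread count
end program atomic_demo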
