Can I point to variable ranked arrays in Fortran? [duplicate] - pointers

There are basically two ways to pass arrays to a subroutine in Fortran 90/95:
PROGRAM ARRAY
INTEGER, ALLOCATABLE :: A(:,:)
INTEGER :: N
ALLOCATE(A(N,N))
CALL ARRAY_EXPLICIT(A,N)
! or
CALL ARRAY_ASSUMED(A)
END PROGRAM ARRAY
SUBROUTINE ARRAY_EXPLICIT(A,N)
INTEGER :: N
INTEGER :: A(N,N)
! bla bla
END SUBROUTINE ARRAY_EXPLICIT
SUBROUTINE ARRAY_ASSUMED(A)
INTEGER, ALLOCATABLE :: A(:,:)
N=SIZE(A,1)
! bla bla
END SUBROUTINE ARRAY_ASSUMED
where you need an explicit interface for the second, usually through the use of a module.
From FORTRAN77, I'm used to the first alternative, and I read this is also the most efficient if you pass the whole array.
The nice thing with the explicit shape is that I can also call a subroutine and treat the array as a vector instead of a matrix:
SUBROUTINE ARRAY_EXPLICIT(A,N)
INTEGER :: N
INTEGER :: A(N**2)
! bla bla
END SUBROUTINE ARRAY_EXPLICIT
I wondered if there is a nice way to do that kind of thing using the second, assumed shape interface, without copying it.

See the RESHAPE intrinsic, e.g.
http://gcc.gnu.org/onlinedocs/gfortran/RESHAPE.html
Alternatively, if you want to avoid the copy (in some cases an optimizing compiler might be able to do a reshape without copying, e.g. if the RHS array is not used afterwards, but I wouldn't count on it), as of Fortran 2003 you can assign pointers to targets of different rank, using bounds remapping. E.g. something like
program ptrtest
real, pointer :: a(:)
real, pointer :: b(:,:)
integer :: n = 10
allocate(a(n**2))
a = 42
b (1:n, 1:n) => a
end program ptrtest

I was looking to do the same thing and came across this discussion. None of the solutions suited my purposes, but I found that there is a way to reshape an array without copying the data using iso_c_binding if you are using the fortran 2003 standard which current fortran 90/95 compilers tend to support. I know the discussion is old, but I figured I would add what I came up with for the benefit of others with this question.
The key is to use the function C_LOC to convert an array to an array pointer, and then use C_F_POINTER to convert this back into a fortran array pointer with the desired shape. One challenge with using C_LOC is that C_LOC only works for array that have a directly specified shape. This is because arrays in fortran with an incomplete size specification (i.e., that use a : for some dimension) include an array descriptor along with the array data. C_LOC does not give you the memory location of the array data, but the location of the descriptor. So an allocatable array or a pointer array don't work with C_LOC (unless you want the location of the compiler specific array descriptor data structure). The solution is to create a subroutine or function that receives the array as an array of fixed size (the size really doesn't matter). This causes the array variable in the function (or subroutine) to point to the location of the array data rather than the location of the array descriptor. You then use C_LOC to get a pointer to the array data location and C_F_POINTER to convert this pointer back into an array with the desired shape. The desired shape must be passed into this function to be used with C_F_POINTER. Below is an example:
program arrayresize
implicit none
integer, allocatable :: array1(:)
integer, pointer :: array2(:,:)
! allocate and initialize array1
allocate(array1(6))
array1 = (/1,2,3,4,5,6/)
! This starts out initialized to 2
print *, 'array1(2) = ', array1(2)
! Point array2 to same data as array1. The shape of array2
! is passed in as an array of intergers because C_F_POINTER
! uses and array of intergers as a SIZE parameter.
array2 => getArray(array1, (/2,3/))
! Change the value at array2(2,1) (same as array1(2))
array2(2,1) = 5
! Show that data in array1(2) was modified by changing
! array2(2,1)
print *, 'array(2,1) = array1(2) = ', array1(2)
contains
function getArray(array, shape_) result(aptr)
use iso_c_binding, only: C_LOC, C_F_POINTER
! Pass in the array as an array of fixed size so that there
! is no array descriptor associated with it. This means we
! can get a pointer to the location of the data using C_LOC
integer, target :: array(1)
integer :: shape_(:)
integer, pointer :: aptr(:,:)
! Use C_LOC to get the start location of the array data, and
! use C_F_POINTER to turn this into a fortran pointer (aptr).
! Note that we need to specify the shape of the pointer using an
! integer array.
call C_F_POINTER(C_LOC(array), aptr, shape_)
end function
end program

#janneb has already answered re RESHAPE. RESHAPE is a function -- usually used in an assignment statement so there will be a copy operation. Perhaps it can be done without copying using pointers. Unless the array is huge, it is probably better to use RESHAPE.
I'm skeptical that the explicit shape array is more efficient than the assumed shape, in terms of runtime. My inclination is to use the features of the Fortran >=90 language and use assumed shape declarations ... that way you don't have to bother passing the dimensions.
EDIT:
I tested the sample program of #janneb with ifort 11, gfortran 4.5 and gfortran 4.6. Of these three, it only works in gfortran 4.6. Interestingly, to go the other direction and connect a 1-D array to an existing 2-D array requires another new feature of Fortran 2008, the "contiguous" attribute -- at least according to gfortran 4.6.0 20110318. Without this attribute in the declaration, there is a compile time error.
program test_ptrs
implicit none
integer :: i, j
real, dimension (:,:), pointer, contiguous :: array_twod
real, dimension (:), pointer :: array_oned
allocate ( array_twod (2,2) )
do i=1,2
do j=1,2
array_twod (i,j) = i*j
end do
end do
array_oned (1:4) => array_twod
write (*, *) array_oned
stop
end program test_ptrs

You can use assumed-size arrays, but it can mean multiple layers of wrapper
routines:
program test
implicit none
integer :: test_array(10,2)
test_array(:,1) = (/1, 2, 3, 4, 5, 6, 7, 8, 9, 10/)
test_array(:,2) = (/11, 12, 13, 14, 15, 16, 17, 18, 19, 20/)
write(*,*) "Original array:"
call print_a(test_array)
write(*,*) "Reshaped array:"
call print_reshaped(test_array, size(test_array))
contains
subroutine print_reshaped(a, n)
integer, intent(in) :: a(*)
integer, intent(in) :: n
call print_two_dim(a, 2, n/2)
end subroutine
subroutine print_two_dim(a, n1, n2)
integer, intent(in) :: a(1:n1,1:*)
integer, intent(in) :: n1, n2
call print_a(a(1:n1,1:n2))
end subroutine
subroutine print_a(a)
integer, intent(in) :: a(:,:)
integer :: i
write(*,*) "shape:", shape(a)
do i = 1, size(a(1,:))
write(*,*) a(:,i)
end do
end subroutine
end program test

I am using ifort 14.0.3 and 2D to 1D conversion, I could use an allocatable array for 2D array and a pointer array for 1D:
integer,allocatable,target :: A(:,:)
integer,pointer :: AP(:)
allocate(A(3,N))
AP(1:3*N) => A
As #M.S.B mentioned, in case both A and AP have the pointer attribute, I had to use contiguous attribute for A to guarantee the consistency of the conversion.

Gfortran is a bit paranoid with interfaces. It not only wants to know the type, kind, rank and number of arguments, but also the shape, the target attribute and the intent (although I agree with the intent part). I encountered a similar problem.
With gfortran, there are three different dimension definition:
1. Fixed
2. Variable
3. Assumed-size
With ifort, categories 1 and 2 are considered the same, so you can do just define any dimension size as 0 in the interface and it works.
program test
implicit none
integer, dimension(:), allocatable :: ownlist
interface
subroutine blueprint(sz,arr)
integer, intent(in) :: sz
integer, dimension(0), intent(in) :: arr
! This zero means that the size does not matter,
! as long as it is a one-dimensional integer array.
end subroutine blueprint
end interface
procedure(blueprint), pointer :: ptr
allocate(ownlist(3))
ownlist = (/3,4,5/)
ptr => rout1
call ptr(3,ownlist)
deallocate(ownlist)
allocate(ownlist(0:10))
ownlist = (/3,4,5,6,7,8,9,0,1,2,3/)
ptr => rout2
call ptr(3,ownlist)
deallocate(ownlist)
contains
! This one has a dimension size as input.
subroutine rout1(sz,arr)
implicit none
integer, intent(in) :: sz
integer, dimension(sz), intent(in) :: arr
write(*,*) arr
write(*,*) arr(1)
end subroutine rout1
! This one has a fixed dimension size.
subroutine rout2(sz,arr)
implicit none
integer, intent(in) :: sz
integer, dimension(0:10), intent(in) :: arr
write(*,*) "Ignored integer: ",sz
write(*,*) arr
write(*,*) arr(1)
end subroutine rout2
end program test
Gfortran complains about the interface. Changing the 0 into 'sz' solves the problem four 'rout1', but not for 'rout2'.
However, you can fool gfortran around and say dimension(0:10+0*sz) instead of dimension(0:10) and gfortran compiles and gives the same
result as ifort.
This is a stupid trick and it relies on the existence of the integer 'sz' that may not be there. Another program:
program difficult_test
implicit none
integer, dimension(:), allocatable :: ownlist
interface
subroutine blueprint(arr)
integer, dimension(0), intent(in) :: arr
end subroutine blueprint
end interface
procedure(blueprint), pointer :: ptr
allocate(ownlist(3))
ownlist = (/3,4,5/)
ptr => rout1
call ptr(ownlist)
deallocate(ownlist)
allocate(ownlist(0:10))
ownlist = (/3,4,5,6,7,8,9,0,1,2,3/)
ptr => rout2
call ptr(ownlist)
deallocate(ownlist)
contains
subroutine rout1(arr)
implicit none
integer, dimension(3), intent(in) :: arr
write(*,*) arr
write(*,*) arr(1)
end subroutine rout1
subroutine rout2(arr)
implicit none
integer, dimension(0:10), intent(in) :: arr
write(*,*) arr
write(*,*) arr(1)
end subroutine rout2
end program difficult_test
This works under ifort for the same reasons as the previous example, but gfortran complains about the interface. I do not know how I can fix it.
The only thing I want to tell gfortran is 'I do not know the dimension size yet, but we will fix it.'. But this needs a spare integer arguemnt (or something else that we can turn into an integer) to fool gfortran around.

Related

"C pointer trickery" to allow mismatched Fortran array ranks

I'm writing a HDF5 wrapper subroutine that will read/write a double precision array of any shape from/to a dataset inside a HDF5 file. To achieve this, I use some C pointer trickery such that the subroutine takes in only the first element of the array as val, but it actually reads/writes the whole array using the temporary buffer buf(1:sz_buf).
So far I have the following for the read subroutine (after removing error checks to keep it concise):
SUBROUTINE hdf5_read_array_d( fname, path, name, val, dims )
USE ISO_C_BINDING, ONLY: C_SIZE_T, C_LOC, C_F_POINTER
! Input arguments
CHARACTER(LEN=*), INTENT(IN) :: fname, path, name
REAL(KIND(1.D0)), TARGET, INTENT(OUT) :: val
INTEGER, DIMENSION(:), INTENT(IN) :: dims
! Internal variables
INTEGER(KIND=HID_T) :: h5root, h5path, h5dset
INTEGER(KIND=HSIZE_T), DIMENSION(SIZE(dims)) :: h5dims
REAL(KIND(1.D0)), DIMENSION(:), POINTER :: buf
INTEGER(KIND=C_SIZE_T) :: sz_buf
INTEGER :: dim
! Open the file in read-only mode
CALL h5fopen_f( TRIM(fname), H5F_ACC_RDONLY_F, h5root, ierr )
! Open the pre-existing path in the file as a group
CALL h5gopen_f( h5root, TRIM(path), h5path, ierr )
! Open the dataset
CALL h5dopen_f( h5path, TRIM(name), h5dset, ierr )
! Convert dims to HSIZE_T
h5dims(:) = dims(:)
! C pointer trickery: cast double -> void* -> double*
sz_buf = PRODUCT(dims)
ALLOCATE( buf( sz_buf ) )
CALL C_F_POINTER( C_LOC(val), buf, (/sz_buf/) )
! Read data from dataset through buffer
CALL h5dread_f( h5dset, H5T_NATIVE_DOUBLE, buf, h5dims, ierr )
! Clean up and close HDF5 file
NULLIFY(buf)
CALL h5dclose_f( h5dset, ierr )
CALL h5gclose_f( h5path, ierr )
CALL h5fclose_f( h5root, ierr )
RETURN
END SUBROUTINE hdf5_read_array_d
Now, the question is, do I need to also put in DEALLOCATE(buf) in addition to / in place of the NULLIFY(buf)?
Any help would be appreciated.
Note: I am aware that Fortran 2018 includes assumed-rank arrays val(..) that will elegantly solve this problem. But again, it's a newer feature that might not be implemented by all compilers yet.
Edit: On C_F_POINTER(), here's a screenshot of Metcalf, Reid, and Cohen (4th Edition, not the newest one that has Fortran 2018 stuff):
You can use C-style pointer trickery to do what you want, but you have some things to address in your approach:
you have a memory leak with allocate(buf)
you are (subtly) lying about the scalar nature of val
you'll horribly confuse anyone reading your code
The reason why this is horribly confusing, is because you don't need to do this trickery. That's also why I won't show you how to do it, or to address the question "do I need to deallocate as well as nullify?".
You know that you have an array val to stuff n values in, in a contiguous lump. You worry that that can't do that because you (without using an assumed-rank dummy) have to match array shape. Worry not.
integer :: a(2,2,2,2), b(4,2,2), c(4,4)
are all arrays with 16 elements. So is
integer :: d(16)
You can associate actual arguments a, b and c with dummy argument d. Let's see that in action:
implicit none
integer :: a(2,2,2,2), b(4,2,2), c(4,4)
call set_them(a, SHAPE(a))
call set_them(b, SHAPE(b))
call set_them(c, SHAPE(c))
print '(16I3)', a, b, c
contains
subroutine set_them(d, dims)
integer, intent(in) :: dims(:)
integer, intent(out) :: d(PRODUCT(dims))
integer i
d=[(i,i=1,SIZE(d))]
end subroutine
end program
You can even associate array sections in this way to define portions.
You can see several other questions around here about this sequence association, in particular looking at changing shapes of arrays. This answer is more of a motivation of what to look for when tempted to do something complicated instead.

Fortran pointer to arbitrary elements of an array [duplicate]

I want to define a pointer to a subarray. For a simple range this is easily done by pointer => array(i:j), but I can't figure out how to do this for a map like k=[k1,k2,k3]. If I would define another array I could use a loop like array2=[(array1(k(j)),j=1,size(k,1))]. But it isn't possible to assign a pointer in a similar way (pointer => [(array1(k(j)),j=1,size(k,1))]) since the r.h.s. of the expression seems to define another variabel which then not even has the target attribute. For simple tasks, a trick around this, is to first assign a pointer to the total array an to use the map on the readout. But in my case this doesn't seem to be possible.
I will attach to examples: The first one shows what I described above. The second one is a more complicated example, where the trick doesn't work anymore. And in addition a two dimensional map is required.
Minimal example:
program test
integer, parameter :: n=10,n_k=3
real,target :: a(1:n)
real :: b(1:n_k)
integer :: k(1:n_k)
integer :: j
real,pointer :: p(:)
! fill array a and define map k:
a=[(real(j),j=1,n)]
k=[((j+1)*2,j=1,n_k)]
! can be used to print the arrays:
!write(*,*) a
!write(*,*) k
! can be used to write only the part of a defined by k:
!write(*,*) (a(k(j)),j=1,n_k)
! this an similar things didn't work:
!p(1:n_k) => [(a(k(j)),j=1,n_k)]
! works, but not generally:
p => a
write(*,*) (p(k(j)),j=1,n_k)
! works, only for arrays:
b=(/(a(k(j)),j=1,n_k)/)
write(*,*) b
end program
More complicated (but also kind of minimal) example which shows (hopefully) the problem I really have. For an easy understanding some explanation leads through it. There are plenty of write commands to print the arrays. I appreciate for the amount of code, but I really don't see how to make a shorter and understandable working example:
module mod1
type base
real :: a
end type
type,extends(base) :: type1
end type
type,extends(base) :: type2
type(type1),allocatable :: b(:)
end type
type(type2),allocatable,target :: c(:)
contains
subroutine printer(z)
class(*),pointer,dimension(:) :: z
integer :: j,a_z,n_z
character(len=40) :: f,ff='(F10.2,1x))',form_z
! define format for printing:
a_z=lbound(z,1)
n_z=ubound(z,1)
write(f,'(I0)') (n_z-a_z+1)
form_z="("//trim(adjustl(f))//ff
! writing:
select type(z)
class is (base)
write(*,form_z) (z(j)%a,j=a_z,n_z)
end select
end subroutine
end module
program test
use mod1
integer,parameter :: n_b=8,n_c=6,n_js=3,n_ls=2
integer :: js(1:n_js),ls(1:n_ls)
integer :: j,l
class(*),pointer :: p(:)
character(len=40) :: f,ff='(F10.2,1x))',form_c,form_b
! define format for printing:
write(f,'(I0)') n_b
form_b="("//trim(adjustl(f))//ff
write(f,'(I0)') n_c
form_c="("//trim(adjustl(f))//ff
! creating and filling the arrays:
allocate(c(n_c))
c%a=[(2d0*real(j),j=1,n_c)]
do j=1,n_c
allocate(c(j)%b(n_b))
c(j)%b%a=[(real(l)*1d1**(j-1),l=1,n_b)]
end do
! write arrays to compare later:
write(*,form_c) c%a
write(*,*)
write(*,form_b) (c(j)%b%a,j=1,n_c)
write(*,*)
! denfining two maps (size and entries will be input in the final program):
js=[1,4,6]
ls=[2,7]
! using the maps to print only the desired entries:
write(*,*) (c(js(j))%a,j=1,n_js)
write(*,*)
write(*,*) ((c(js(j))%b(ls(l))%a,j=1,n_js),l=1,n_ls)
write(*,*)
! !!! here I want to use the maps as well, but so far I only know how to use ranges:
p => c(1:4)
call printer(p)
write(*,*)
p => c(2)%b(3:6)
call printer(p)
write(*,*)
end program
Edit:
Just for the record, I solved the problem now by using arrays of derived types including pointers and slightly changing the calling subroutines.
You cannot do this with pointer association (e.g. pointer1 => array1(vector_subscript). Section 7.2.2.2 of the Fortran 2008 standard that disallows this is:
R733 pointer-assignment-stmt is data-pointer-object [ (bounds-spec-list) ] => data-target
There are two other forms, but they do not match your use, nor would they change the outcome. Reading further:
R737 data-target is variable
C724 (R737) A variable shall have either the TARGET or POINTER attribute, and shall not be an array section with a vector subscript.
This is why you cannot perform the pointer association your are attempting. You can however work around this and with pointer allocation. See this code:
n_k = 3
k = [((j+1)*2,j=1,n_k)] ! a vector subscript
p => a(k) ! NOT OK. Violates C724
allocate(p(n_k)) ! Associate your pointer this way
p = a(k) ! This is OK.
write(*,*) p
Which yields (wrapped in your example program):
% ./ptrtest
4.00000000 6.00000000 8.00000000
This allocates p to be the proper size and then assigns from a with a vector subscript. This gets around the issue of directly associating p with a map of a. This snippet assumes the variables are declared and initialized per your example code. This shows that you can assign a vector subscript of an array to a pointer, but only one that is already associated, not during the association.
As noted in a comment to your Q, if you have a regular stride, you can make the pointer association directly. For your first test case, this would be equivalent and work:
p => a(4:2:8) ! Allocation to a strided array is allowed
If however, you have an irregular vector subscript then the method in this answer will be what you need to use to accomplish the pointer association.
Another workaround you can use is passing a pointer and the map to a procedure. Consider the following code:
program test
implicit none
integer, parameter :: nx = 10, nx_m = 3
integer,dimension(nx_m) :: x_map
integer :: i
real, dimension(nx),target :: a
real, dimension(:), pointer :: p
! initialize array
a = [(real(i*2),i=1,10)]
write (*,'(10(f5.1 x))') a
!define a map
x_map = [1, 9, 4]
! associate pointer
p => a
call print_map(p, x_map)
contains
subroutine print_map(apointer, map)
implicit none
real, dimension(:), pointer :: apointer
integer, dimension(:) :: map
write (*,*) apointer(map)
end subroutine print_map
end program test
In this case, p "knows" about a and the map of elements in a can be calculated in the caller. Rather than associating (=>) p as a map of a (which cannot be done), p is associated to a and the map passed along with it.
This code produces the output:
% ./ptrtest3
2.0 4.0 6.0 8.0 10.0 12.0 14.0 16.0 18.0 20.0
2.00000000 18.0000000 8.00000000

Pointers to arrays as member variables in Fortran derived type

In Fortran, it is not possible to make a member variable of a derived type a target. (I guess this has to do with the standard not specifying how a derived type is stored in memory?) However, I can have a pointer as a member variable and associate pointers with pointers. Like I do in the example below.
module DataMod
type DataType
real(8), private, dimension(:,:), pointer, contiguous :: A
real(8), private, dimension(:,:), pointer, contiguous :: B
integer :: n
contains
procedure :: alloc
procedure :: set
procedure :: print_
final :: dealloc
end type DataType
interface DataType
procedure :: NewDataType
end interface DataType
contains
function NewDataType(dimension_) result(new)
integer, intent(in) :: dimension_
type(DataType) :: new
new%n = dimension_
end function NewDataType
subroutine alloc(dataObject)
class(DataType) :: dataObject
allocate(dataObject%A(dataObject%n,dataObject%n))
allocate(dataObject%B(dataObject%n,dataObject%n))
end subroutine alloc
subroutine set(dataObject, datas, choice)
class(DataType) :: dataObject
real(8), dimension(dataObject%n,dataObject%n), intent(in) :: datas
character(len=1), intent(in) :: choice
real(8), dimension(:,:), pointer :: dataPointer
integer :: i,j
if(choice .eq. 'A') then
datapointer => dataObject%A
elseif(choice .eq. 'B') then
datapointer => dataObject%B
else
stop
endif
do j = 1,dataObject%n
do i = 1,dataObject%n
datapointer(i,j) = datas(i,j)
enddo
enddo
end subroutine set
subroutine print_(dataObject)
class(DataType), intent(in) :: dataObject
print *, 'A'
print *, dataObject%A(1:dataObject%n,1:dataObject%n)
print *
print *, 'B'
print *, dataObject%B(1:dataObject%n,1:dataObject%n)
end subroutine print_
subroutine dealloc(dataObject)
type(DataType) :: dataObject
deallocate(dataObject%A)
deallocate(dataObject%B)
end subroutine dealloc
end module DataMod
program DataTest
use DataMod, only: DataType
implicit none
real(8), dimension(2,2) :: testArray
type(DataType) :: testType
testType = DataType(2)
call testType%alloc()
testArray(1,1) = 1
testArray(2,1) = 2
testArray(1,2) = 3
testArray(2,2) = 4
call testType%set(testArray, 'A')
testArray(1,1) = 5
testArray(2,1) = 6
testArray(1,2) = 7
testArray(2,2) = 8
call testType%set(testArray, 'B')
call testType%print_()
end program DataTest
In the set routine, I use an if statement to set a pointer to decide if it should dump the incoming matrix in A or B. In the program I'm currently working on, I must decide which combination of four different matrices to multiply together and setting a pair of pointers is much nicer than writing 16 almost identical calls to dgemm.
My question is if there are any problems with this approach besides the normal dangers of dangling pointers etc. or a way to do this without pointers? The arrays should not be accessed from outside the object. Are there any performance issues?
Components in a type definition may not be declared with the TARGET attribute (beyond the missing syntax, that would be inconsistent with other concepts and rules in the current language), but if a variable of derived type has the TARGET attribute, then all of its subobjects have the TARGET attribute too. For a type definition:
type DataType
real(8), private, dimension(:,:), allocatable :: A
real(8), private, dimension(:,:), allocatable :: B
...
The procedure set could be written...
subroutine set(dataObject, datas, choice)
class(DataType), TARGET :: dataObject
real(8), dimension(dataObject%n,dataObject%n), intent(in) :: datas
character(len=1), intent(in) :: choice
real(8), dimension(:,:), pointer :: dataPointer
! require dataobject%A and ..%B to already be allocated.
if(choice .eq. 'A') then
datapointer => dataObject%A
elseif(choice .eq. 'B') then
datapointer => dataObject%B
else
stop
endif
datapointer = datas ! or some other operation.
...
(dataPointer could be declared contiguous, the allocatable arrays that it gets pointed at are always contiguous.)
An actual argument without the TARGET attribute may be associated with a dummy argument with the TARGET attribute. If this is the case, then pointers associated with the dummy argument become undefined when execution of the procedure completes. (Such pointers may also become undefined in some cases, even when the actual argument has the TARGET attribute - see F2018 15.5.2.4p8 on for details - but these cases don't apply to scalars.)
Consequently, in the general case, if a pointer to one of the components of an object of derived type needs to outlive a procedure like set above (e.g. if dataPointer was not local to set) and you can't ensure that the actual argument will always have the TARGET attribute, then the original method using pointer components may be more appropriate. The implementation in the question appears to be ok - though I would suggest making the finalizer IMPURE ELEMENTAL to make things more robust to future changes.

Odd behavior of fortran strings in a derived type after accessing the type by a pointer

[Working example at the end of this post!]
I'm trying to write a simple module to handle physical units in arithmetic operations. My aim is to create derived units out of primary ones.
As you can see in the following code, I have a derived type, namely unit_t, which stores a string, representing the unit itself, power of the unit, conversion factor (to convert it to SI), a logical variable to show if the unit is cloned and next and prev pointers which point to the next or previous unit (in case we have a combination of units, for example kg * m / s**2, so basically it's a linked list connecting different units to each other).
I have a function named unit_clone to clone a primary unit. The unit_int_pow function overloads the exponentiation operator (**) and it simply clones a given primary unit and updates its exponent. The units_mul function overloads the multiplication operator (*). This function first check if the two given units are cloned (if not, it clones them) and then just connect them using next and prev pointers.
Here is my code (you should be able to compile it with gfortran)
module units
implicit none
type unit_t
character(len=16) :: symb
integer :: pow
real :: conv
logical :: cloned
type(unit_t), pointer :: next => null(), prev => null()
end type unit_t
! definitions
type(unit_t), target :: m = unit_t("m", 1, 1.d0, .false.)
type(unit_t), target :: km = unit_t("km", 1, 1.d3, .false.)
type(unit_t), target :: kg = unit_t("kg", 1, 1.d0, .false.)
type(unit_t), target :: s = unit_t("s", 1, 1.d0, .false.)
interface operator (**)
procedure unit_int_pow
end interface operator (**)
interface operator (*)
procedure units_mul
end interface operator (*)
contains
!> Cloning a given node (unit)
function unit_clone(u) result (clone)
implicit none
type(unit_t), intent(in) :: u
type(unit_t), allocatable, target :: clone
allocate(clone)
clone%symb = u%symb
clone%conv = u%conv
clone%pow = u%pow
clone%cloned = .true.
clone%next => u%next
clone%prev => u%prev
end function unit_clone
!> integer powers
function unit_int_pow(u1, p) result(u)
implicit none
type(unit_t), intent(in) :: u1
integer, intent(in) :: p
type(unit_t), allocatable, target :: u
u = unit_clone(u1)
u%pow = u%pow * p
end function unit_int_pow
!> multiplication
function units_mul (u1, u2) result (u1c)
implicit none
type(unit_t), intent(in) :: u1, u2
type(unit_t), allocatable, target :: u1c, u2c
if ( u1%cloned ) then
u1c = u1
else
u1c = unit_clone(u1)
end if
if ( u2%cloned ) then
u2c = u2
else
u2c = unit_clone(u2)
end if
u2c%prev => u1c
u1c%next => u2c
end function units_mul
end module units
program test
use units
implicit none
type(unit_t) :: u
u = kg**2 * m
print *, u%symb, "^", u%pow, " [expected: kg^2]"
print *, u%next%symb, "^", u%next%pow, " [expected: m^1]"
print *, u%next%prev%symb, "^", u%next%prev%pow, " [expected: kg^2]"
end program test
The problem is, I'm getting the following output:
kg ^ 2 [expected: kg^2]
�ȷ2�U ^ 1 [expected: m^1]
�ȷ2�U ^ 2 [expected: kg^2]
Apparently, after accessing the next or next%prev unit (which is basically the head of this short linked list), the code outputs random character instead of the symbs. If I change the order of the variables in the derived type, unit_t, for example if I put symb at the end of the derived type, I will get right symbs, but this time wrong pows.
Any idea what is the culprit of this rather odd behavior?
Using Rudrigo's comment below, I rewrote the code, and it works fine now. Just for the reference, the working code is as follows (if you have further suggestion or modification, please let me know, Nombre respository)
module units
implicit none
type unit_t
character(len=16) :: symb
real :: conv
real :: pow = 1.d0
logical :: cloned = .false.
type(unit_t), pointer :: next => null(), prev => null()
end type unit_t
! units definitions
type(unit_t), target :: m = unit_t("m", 1.d0)
type(unit_t), target :: km = unit_t("km", 1.d3)
type(unit_t), target :: kg = unit_t("kg", 1.d0)
type(unit_t), target :: s = unit_t("s", 1.d0)
interface operator (**)
procedure unit_int_pow
end interface operator (**)
interface operator (*)
procedure units_mul
end interface operator (*)
contains
!> Cloning a given node (unit)
function unit_clone(u) result (clone)
implicit none
type(unit_t), intent(in) :: u
type(unit_t), pointer :: clone
allocate(clone)
clone%symb = trim(u%symb)
clone%conv = u%conv
clone%pow = u%pow
clone%cloned = .true.
clone%next => u%next
clone%prev => u%prev
end function unit_clone
!> integer powers
function unit_int_pow(u1, p) result(u)
implicit none
type(unit_t), intent(in) :: u1
integer, intent(in) :: p
type(unit_t), pointer :: u
if ( u1%cloned ) then
! TODO: should be able to handle complex cases like: a * (b * c)**3
! most likly, only updating the power of the linked list chain
! would do the job
else
u => unit_clone(u1)
end if
u%pow = u%pow * p
end function unit_int_pow
!> multiplication
function units_mul (u1, u2) result (u2c)
implicit none
type(unit_t), intent(in), target :: u1, u2
type(unit_t), pointer :: u2c
if ( u2%cloned ) then
if ( associated(u2%prev) ) then
u2c => u2%prev%next
else
u2c => u2
end if
else
u2c => unit_clone(u2)
end if
if ( u1%cloned ) then
if ( associated(u2%prev) ) then
u2c%prev => u1%prev%next
else
u2c%prev => u1
end if
else
u2c%prev => unit_clone(u1)
end if
u2c%prev%next => u2c
end function units_mul
end module units
A pointer in Fortran has three posible association status:
associated: the pointer is actually pointing to a defined and allocated variable / matching data storage (its target);
disassociated: it was (or is part of an objects that was) explicitly nullified or deallocated, or its target was properly disassociated.
undefined: anything different than the former, e.g. its target is (or became) undefined, or was deallocated by other means than by calling deallocate directly in the pointer itself, among other causes.
When execution of an instance of a subprogram completes (e.g. when function units_mul reaches end function), any unsaved local variable becomes undefined. Also, any allocatable local variable that is not saved or is a function result gets deallocated, and when an allocatable entity is deallocated, it also becomes undefined.
Back to your problem, u2c is an allocatable unsaved local variable inside units_mul function, where you associate u1c%next to it. When this function reaches the end, u2c ends its lifecycle and becomes undefined, bringing u1c%next to become also undefined, in a state referred in the Fortran lingo as dangling pointer.
This a text from the Fortran Standard describing this phenomena (even though it is referring to the case of modules host association, it's the same logic):
Note 19.10
A pointer from a module program unit might be accessible in a
subprogram via use association. Such pointers have a lifetime that is
greater than targets that are declared in the subprogram, unless such
targets are saved. Therefore, if such a pointer is associated with a
local target, there is the possibility that when a procedure defined
by the subprogram completes execution, the target will cease to exist,
leaving the pointer “dangling”. This document considers such pointers
to have an undefined association status. They are neither associated
nor disassociated. They cannot be used again in the program until
their status has been reestablished. A processor is not required to
detect when a pointer target ceases to exist.
A dangling pointer is not a reliable pointer, and the compiler has no control over it. They may, by any reason, keep pointing to their last memory address (and accidentally give the expected result in some cases, or the values would be gibberish from random memory address), but it will most certainly break, and the fail can be anything, from just wrong results to a SIGSEG fault or a memory address violation.
See this example code:
program dangling_pointer
implicit none
integer, pointer :: p(:)
integer, allocatable :: a(:)
call sub1(p)
print *, 'sub1: ', p
call sub2(p)
print *, 'sub2: ', p
call sub3(p, a)
print *, 'sub3: ', p
p => fun4()
print *, 'fun4: ', p
contains
subroutine sub1(dummy_p)
! the pointer passed as argument outlives the local target
! when the procedure ends, it becomes a "dangling pointer"
integer, pointer :: dummy_p(:)
integer, allocatable, target :: local_a(:)
allocate(local_a(5))
local_a = 100
dummy_p => local_a
end
subroutine sub2(dummy_p)
! here the local variable is saved, so it persists. No problem here.
integer, pointer :: dummy_p(:)
integer, allocatable, target, save :: saved_a(:)
allocate(saved_a(5))
saved_a = 100
dummy_p => saved_a
end
subroutine sub3(dummy_p, out_a)
! here the target is a passed argument, so it persists. No problem here.
integer, pointer :: dummy_p(:)
integer, allocatable, target :: out_a(:)
allocate(out_a(5))
out_a = 100
dummy_p => out_a
end
function fun4() result(result_p)
! here the function result will be returned as a pointer. No problem here.
integer, pointer :: result_p(:)
allocate(result_p(5))
result_p = 100
end
end
With gfortran 9.0.0 I get:
sub1: 14316208 0 14287184 0 100
sub2: 100 100 100 100 100
sub3: 100 100 100 100 100
fun4: 100 100 100 100 100
Edit
I think this snippet would work for your problem:
allocate(u1c%next)
if (u2%cloned) then
u1c%next = u2
else
u1c%next = unit_clone(u2)
end if
u1c%next%prev => u1c

Are anonymous arrays possible in fortran?

I am trying to create an array of arrays in Fortran.
Something like the following
TYPE :: array_of_arrays
REAL, DIMENSION(:), POINTER :: p => NULL()
END TYPE
TYPE(array_of_arrays), DIMENSION(2) :: some_array
So that I can do:
REAL, DIMENSION(3), TARGET :: some_vector1 = (/1.0, 2.1, 4.3/)
REAL, DIMENSION(3), TARGET :: some_vector2 = (/3.0, 1.2, 9.6/)
some_array(1)%p => some_vector1
some_array(2)%p => some_vector2
WRITE(*,*) some_array(1)%p ! I see some_vector1
WRITE(*,*) some_array(2)%p ! I see some_vector2
Now it's cumbersome for me to actually declare each of these some_vector arrays to correspond to each element in my array of arrays.
What I'd like to do is have in a separate subroutine where a temporary vector is set as a target, and that subroutine sets up my array of arrays to point to that temporary vector.
This way I can have anonymous arrays.
However, this doesn't seem to be working and I wonder if first if I am doing something that Fortran doesn't support.
So does Fortran support anonymous arrays, that is (in case I have the terms wrong), an array who can only be accessed through a reference?
Sure; as IanH suggests, you can just have the pointer refer to allocated memory directly, rather than refer to a variable; this is one of the few cases where the allocated memory doesn't automatically get deallocated once it goes out of scope.
eg,
module arrays
TYPE :: array_of_arrays
REAL, DIMENSION(:), POINTER :: p => NULL()
END TYPE
contains
subroutine alloc(aa)
type(array_of_arrays), intent(inout) :: aa(:)
integer :: i
allocate( aa(1)%p(1) )
aa(1) % p = [1.]
allocate( aa(2)%p(5) )
aa(2) % p = [ (i, i=1,5) ]
end subroutine alloc
end module arrays
program usearrays
use arrays
TYPE(array_of_arrays), DIMENSION(2) :: some_array
call alloc(some_array)
WRITE(*,*) some_array(1)%p ! I see some_vector1
WRITE(*,*) some_array(2)%p ! I see some_vector2
deallocate( some_array(1) )
deallocate( some_array(2) )
end program usearrays
and running it gives
$ gfortran -o arrays arrays.f90
$ ./arrays
1.0000000
1.0000000 2.0000000 3.0000000 4.0000000 5.0000000

Resources