Fortran array automatically growing when adding a value - vector

Is there an existing way to emulate a growing array in Fortran, like vector in C++? I was very surprised not to find anything on this subject on the Internet.
As a motivating example, suppose I compute some recurrence relation and I want to store all the intermediate numbers I get. My stopping criterion is the difference between adjacent results, so I cannot know beforehand how much memory I should allocate for this.

I am sure it has been shown somewhere on this site before, but I cannot find it.
First, in Fortran 2003, you can append one element simply with
a = [a, item]
as commented by francescalus. This is likely to reallocate the array very often and will be slow.
Instead, you can keep the array allocated to a somewhat larger size than your number of elements n. When n reaches the size of the array, size(a), you allocate a new array larger by some factor (here 2x) and copy the old elements there. There is no realloc() in Fortran, unfortunately.
module growing_array
implicit none
real, allocatable :: a(:)
integer :: n
contains
subroutine add_item(item)
real, allocatable :: tmp(:)
real, intent(in) :: item
if (n == size(a)) then
!this statement is F2003, it can be avoided, but I don't see why in 2016
call move_alloc(a, tmp)
allocate(a(n*2))
a(1:n) = tmp
end if
n = n + 1
a(n) = item
end subroutine
end module
I left out the initial allocation of a and the initialization of n; they are simple enough.
It can all be put into a derived type with type-bound procedures and used as a data structure, but that is pure Fortran 2003 and you wanted 90. So I show Fortran 95 (apart from move_alloc), because Fortran 90 is flawed in many ways for allocatable arrays and is desperately obsolete and essentially dead.
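The derived-type variant mentioned above could look roughly like the sketch below. The names real_vector, items and push_back are made up for illustration, and the initial capacity of 8 is arbitrary; it needs a Fortran 2003 compiler.

```fortran
module growing_array_type
  implicit none
  private
  public :: real_vector

  ! Hypothetical type name; not taken from the answer above.
  type :: real_vector
    real, allocatable :: items(:)   ! storage, usually larger than n
    integer :: n = 0                ! number of elements in use
  contains
    procedure :: push_back
  end type real_vector

contains

  subroutine push_back(self, item)
    class(real_vector), intent(inout) :: self
    real, intent(in) :: item
    real, allocatable :: tmp(:)

    if (.not. allocated(self%items)) then
      allocate(self%items(8))       ! arbitrary initial capacity
    else if (self%n == size(self%items)) then
      ! Storage is full: double the capacity and copy the old elements.
      call move_alloc(self%items, tmp)
      allocate(self%items(2*size(tmp)))
      self%items(1:self%n) = tmp
    end if
    self%n = self%n + 1
    self%items(self%n) = item
  end subroutine push_back

end module growing_array_type

program demo
  use growing_array_type
  implicit none
  type(real_vector) :: v
  integer :: i

  do i = 1, 20
    call v%push_back(real(i))
  end do
  ! Only the first v%n entries of v%items hold data.
  print *, v%n, size(v%items)
  print *, v%items(1:v%n)
end program demo
```

Keeping n inside the type means several independent vectors can exist at once, which the module-variable version above does not allow.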

Related

Fortran calls: advantage of passing by pointer vs passing by reference

It is my understanding that in Fortran arrays are passed by reference. So is there an advantage to passing a pointer to a large array (into a subroutine) as opposed to passing the array itself?
Could you also clarify this in the context of recursive functions? I have seen implementations where pointers are used "for efficiency", but if everything is passed by reference, then what's the benefit of pointers?
Here's an example. I have an array X (in reality let's say it's a very large array).
INTEGER :: X(:)
I can define a subroutine that takes this array as follows:
SUBROUTINE FOO(X)
INTEGER, INTENT(IN) :: X(:)
INTEGER :: I
DO I = 1, 4
WRITE(*,*) X(I)
ENDDO
END SUBROUTINE FOO
When I call the subroutine above, the array X is not copied, as Fortran passes a reference to it. Now let's say I have a modified version of the subroutine:
SUBROUTINE FOO2(X)
INTEGER, POINTER, INTENT(IN) :: X(:)
INTEGER :: I
DO I = 1, 4
WRITE(*,*) X(I)
ENDDO
END SUBROUTINE FOO2
I can call FOO2 from a program as follows:
PROGRAM TEST
IMPLICIT NONE
INTEGER, TARGET :: X(5)
INTEGER, POINTER :: Y(:)
X = (/1,2,3,4,5/)
Y => X
CALL FOO2(Y)
END PROGRAM TEST
Then here's my question: is there a performance difference between the two versions of foo? Is there any useful scenario where the declaration of FOO2 might be preferable to FOO?
In this simple case there shouldn't be any real difference. Note that the program is illegal: you don't have an explicit interface to FOO or FOO2. I will assume you just omitted it for simplicity and that they are in a module or are internal procedures.
Both arrays can be non-contiguous in principle, so there is no difference here. If that slows down the code, the contiguous attribute might help, as might assumed-size or explicit-shape arrays.
Your subroutine is too simple for there to be any danger of aliasing either. Aliasing is the common source of decreased performance with pointers. There could be potential aliasing with some other argument, or with another variable you access by host or use association, provided it has the target attribute.
The purpose of pointer arguments is actually either to allow disassociated (null()) arguments or to allow the association status to be changed inside the subroutine. Your example uses neither, and therefore the pointer attribute is superfluous.
There is one last small difference. The standard does not specify what is actually passed to the subroutine at the machine-code level for pointer variables. If it is just an address (likely for scalars), it is the same as for a non-pointer; only the aliasing rules and the allowed usage differ. Otherwise some descriptor is passed, but any overhead should be negligible, since assumed-shape arrays use a descriptor too.
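To illustrate the two cases where a pointer dummy argument is actually needed, here is a small sketch. The module and procedure names (pointer_args, print_first, point_to_tail) are invented for this example.

```fortran
module pointer_args
  implicit none
contains

  ! A pointer dummy lets the caller pass a disassociated pointer.
  subroutine print_first(x)
    integer, pointer, intent(in) :: x(:)
    if (associated(x)) then
      print *, 'first element:', x(1)
    else
      print *, 'nothing to print'
    end if
  end subroutine print_first

  ! A pointer dummy also lets the callee change the association.
  subroutine point_to_tail(x)
    integer, pointer, intent(inout) :: x(:)
    if (associated(x)) x => x(2:)
  end subroutine point_to_tail

end module pointer_args

program demo_pointer_args
  use pointer_args
  implicit none
  integer, target :: a(5)
  integer, pointer :: p(:)

  a = [1, 2, 3, 4, 5]
  p => null()
  call print_first(p)     ! disassociated argument is allowed
  p => a
  call point_to_tail(p)   ! p now points to a(2:5)
  call print_first(p)
end program demo_pointer_args
```

Neither subroutine could be written with a plain assumed-shape dummy such as the one in FOO, which is the only situation where FOO2's declaration would be preferable.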

Fortran Pointer arithmetic

That's my first question post ever ... don't be cruel, please.
My problem is the following. I'd like to assign a Fortran pointer to an expression. I think that's not possible with simple Fortran techniques. But since new Fortran versions seem to provide ways to handle things used in C and C++ (like c_ptr and c_f_pointer ... ), maybe someone knows a way to solve my problem. (I don't really have any idea about C, but I read that pointer arithmetic is possible in C.)
To make things more clear, here is the code which came to my mind immediately but isn't working:
program pointer
real(8),target :: a
real(8),pointer :: b
b=>a*2.0d0 ! b=>a is of course working
do i=1,10
a=dble(i)*2.0d0
write(*,*)b
end do
end program
I know that there are ways around this issue, but in the actual program all those that came to my mind would lead to much longer computation time and/or quite weird code.
Thanks, a lot, in advance!
Best, Peter
From Michael Metcalf,
Pointers are variables with the POINTER attribute; they are not a distinct data type (and so no 'pointer arithmetic' is possible).
They are conceptually a descriptor listing the attributes of the objects (targets) that the pointer may point to, and the address, if any, of a target. They have no associated storage until it is allocated or otherwise associated (by pointer assignment, see below):
So your idea of b=>a*2 doesn't work because pointer assignment associates b with a target object rather than giving b a value, and a*2 is not an object.
Expressions, in general (there are two and a half very significant exceptions), are not valid pointer targets. Evaluation of an expression (in general) yields a value, not an object.
(The exceptions relate to the case where the overall expression results in a reference to a function with a data pointer result. In that case the expression can be used on the right-hand side of a pointer assignment statement, or as the actual argument in a procedure reference that corresponds to a pointer dummy argument, or [perhaps, and F2008 only] in any context where a variable might be required, such as the left-hand side of an ordinary assignment statement. But your expressions do not result in such a function reference, and I don't think the use cases are relevant to what you want to do.)
I think you want the value of b to change as the "underlying" value of a changes, as per the form of the initial expression. Beyond the valid pointer target issue, this requires behaviour contrary to one of the basic principles of the language (most languages, really): evaluation of an expression uses the values of its primaries at the time the expression is evaluated, and subsequent changes in those primaries do not change the previously evaluated value.
Instead, consider writing a function that calculates b based on a.
program pointer
IMPLICIT NONE
real(8) :: a
integer :: i   ! must be declared because of IMPLICIT NONE
do i=1,10
a=dble(i)*2.0d0
write(*,*) b(a)
end do
contains
function b(x)
real(kind(a)), intent(in) :: x
real(kind(a)) :: b
b = 2.0d0 * x
end function b
end program
Update: I'm getting closer to what I wanted to have (for those who are interested):
module test
real,target :: a
real, pointer :: c
abstract interface
function func()
real :: func
end function func
end interface
procedure (func), pointer :: f => null ()
contains
function f1()
real,target :: f1
c=>a
f1 = 2.0*c
return
end function f1
end module
program test_func_ptrs
use test
implicit none
integer::i
f=>f1
do i=1,10
a=real(i)*2.0
write(*,*)f()
end do
end program test_func_ptrs
I would be completely satisfied if I could find a way to avoid the dummy arguments (at least when I'm calling f).
Additional information: The point is that I want to define different functions f1 and decide before starting the loop what f is going to be inside the loop (depending on whatever input).
Pointer arithmetic, in the sense of calculating address offsets from a pointer, is not allowed in Fortran. Pointer arithmetic can easily cause memory errors and the authors of Fortran considered it unnecessary. (One could do it via the back door of interoperability with C.)
Pointers in Fortran are useful for passing procedures as arguments, setting up data structures such as linked lists (e.g., How can I implement a linked list in fortran 2003-2008), etc.

How many bytes does a derived type (in Fortran) occupy? Are the locations contiguous? And a pointer to a derived type?

I could not find this anywhere, and even if it may be trivial I wanna be sure I have understood it correctly. I have 4 (closely related) questions:
1) If I define a derived type in Fortran like this
TYPE :: node
INTEGER :: int
REAL :: REALfirst
REAL :: REALsecond
END TYPE
TYPE(node) :: var
allocate(var)
After the above allocation it occupies 4 bytes for the integer and another 8 for the 2 single-precision reals, for a total of 12 bytes. Are they located contiguously in memory? And how does the computer store the information about the types of the variables? I guess it needs some extra memory for saving that.
2) If in the example above instead of
TYPE(node) :: var
I had written:
TYPE(node),POINTER :: var
I guess that if I compiled a 32-bit executable, the ALLOCATE statement would allocate the same amount of memory as in the example above. Is that correct?
3) Now let's suppose I declare the type
TYPE :: node
INTEGER :: int
TYPE(node), POINTER :: BEFORE
TYPE(node), POINTER :: AFTER
END TYPE
TYPE(node) :: var
allocate(var)
Here (if compiled as 32-bit) it would allocate 4 bytes for the integer and another 8 for the 2 pointers, for a total of 12 bytes. Is that correct? Again, how does the computer store the information about the types of the variables?
4) In example (3), if I now write ALLOCATE(var%BEFORE), another 12 bytes are allocated for a variable of derived type node, and the 4 bytes that were allocated for the pointer var%BEFORE (see example 3) are now freed, correct?
THANKS
A.
1) This is not covered by the Fortran standard. real and integer do not have to be 4 bytes wide. You can ensure that by specifying their kind. If you do not care about the numeric precision, but about the number of bytes, do it like this
!In Fortran 2008
use iso_fortran_env
or
!In Fortran 95
integer,parameter :: int32 = selected_int_kind(9)
integer,parameter :: real32 = selected_real_kind(p=6,r=37)
and
TYPE :: node
INTEGER(int32) :: int
REAL(real32) :: REALfirst
REAL(real32) :: REALsecond
END TYPE
The compiler is allowed to insert any padding it wants. This is likely to happen if you mix variables with 4, 8 or even more bytes. To suppress any padding use SEQUENCE.
2) The allocated memory would be the same. The compiler also uses some data structure (it may be just an address, but doesn't have to be) for bookkeeping.
3) The bookkeeping data structure I referred to is stored in the derived type. It may be just an address.
4) The pointer data structure can be 4 bytes, but it can also be more. The more important point, however, is that it is not freed. You must know where to find the allocated space on the heap, and you use the pointer for that. It does not matter whether you use this pointer to allocate new data or point it at some existing data.
Note that the bit size of the derived type cannot change at runtime; it is fixed. Polymorphic variables are another matter, but they must be allocated dynamically for this reason.
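Rather than guessing, you can ask the compiler how much storage each type takes. The sketch below uses the Fortran 2008 intrinsic storage_size, which returns a size in bits; the type name list_node is introduced here only to tell the two node types from the question apart, and dividing by 8 assumes 8-bit bytes. The numbers printed depend on the compiler and platform.

```fortran
program type_size
  use iso_fortran_env, only: int32, real32
  implicit none

  ! The plain type from question 1.
  type :: node
    integer(int32) :: int
    real(real32)   :: realfirst
    real(real32)   :: realsecond
  end type node

  ! The linked-list type from question 3 (renamed for this sketch).
  type :: list_node
    integer(int32) :: int
    type(list_node), pointer :: before => null()
    type(list_node), pointer :: after  => null()
  end type list_node

  type(node)      :: var
  type(list_node) :: lvar

  ! storage_size reports bits, including any padding the compiler added.
  print *, 'node bytes:     ', storage_size(var) / 8
  print *, 'list_node bytes:', storage_size(lvar) / 8
end program type_size
```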

Set array's rank at runtime

I have written a program which reads a file containing multidimensional data (most commonly 3D, but 2D could occur as well). To heighten simplicity I would like to store the data in an array of the same rank (or something pretending to be one), i.e. using a three-dimensional array for 3D data, etc.; the trouble is that the program only learns about the dimensionality on reading the data file.
Currently I store all data in an array of rank one and calculate each element's index in that array from the element's coordinates (this was also suggested here). However, I have also read about pointer rank remapping, which seems very elegant and just what I have been looking for, as it would allow me to scrap my procedures for array index determination (which are probably far less efficient than what goes on behind the scenes). Now, however, it looks like I'm facing the same problem as with directly declaring a multidimensional array - how to do the declaration? Again, it requires information about the rank.
How could I use pointer rank remapping or some other, more suitable technique for setting an array's rank at runtime - in case this can be done at all. Or am I best off sticking to the rank one-array that I am currently using?
I once asked something similar, i.e. how to treat a two-dimensional array as one dimension, see here: changing array dimensions in fortran.
The answers were about the RESHAPE intrinsic or pointers; however, there seems to be no way to use the same array name unless you use subroutine wrappers, and then you need callbacks to end up with a single subroutine name, so the problem grows.
program test
real, allocatable :: data(:)
allocate(data(n_data))
! read stuff, set is_2d and sizes
if (is_2d) then
call my_sub2(data, nX, nY)
else
call my_sub3(data, nX, nY, nZ)
end if
end program test
subroutine my_sub2(data, nX, nY)
real :: data(nx,nY)
! ...
end subroutine my_sub2
subroutine my_sub3(data, nX, nY, nZ)
real :: data(nx,nY,nZ)
! ...
end subroutine my_sub3
EDIT: as an alternative, set the extent of the third dimension to 1:
program test
real, allocatable, target:: data(:)
real, pointer:: my_array(:,:,:)
logical is_2d
n_data = 100
allocate(data(n_data))
! read stuff, determine is_2d and n
if (is_2d) then
i=n
j=n
k=1
else
i=n
j=n
k=n
end if
my_array(1:i,1:j,1:k) => data
write(*,*) my_array
end program test
Then you handle the 2D case as a special 3D case with third dimension 1.
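A self-contained version of this idea might look like the following sketch. The names flat and grid and the hard-coded sizes are placeholders for whatever would really be read from the file; pointer rank remapping needs Fortran 2003.

```fortran
program remap_demo
  implicit none
  real, allocatable, target :: flat(:)
  real, pointer :: grid(:,:,:)
  integer :: nx, ny, nz, i
  logical :: is_2d

  ! Stand-ins for values that would be read from the data file.
  nx = 3
  ny = 4
  nz = 2
  is_2d = .false.
  if (is_2d) nz = 1      ! treat 2-D data as 3-D with a single layer

  allocate(flat(nx*ny*nz))
  flat = [(real(i), i = 1, size(flat))]

  ! Rank remapping: grid is a 3-D view of the same memory, no copy is made.
  grid(1:nx, 1:ny, 1:nz) => flat

  print *, shape(grid)
  print *, grid(2, 1, 1), flat(2)   ! both refer to the same element
end program remap_demo
```

Because the first index varies fastest, grid(i, j, k) corresponds to flat(i + (j-1)*nx + (k-1)*nx*ny), which is the index arithmetic the remapping does for you.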
EDIT2: also, beware when passing non-contiguous arrays to subroutines with explicit-shape arrays: http://software.intel.com/sites/products/documentation/hpc/compilerpro/en-us/fortran/lin/compiler_f/optaps/fortran/optaps_prg_arrs_f.htm
If I understand correctly, you read the data into a 1-D array and want to assign it to a 2-D or 3-D array, and you only know which after reading the file. Why not declare both 2-D and 3-D arrays as allocatable and allocate only one of them, based on your data shape? You could use the intrinsic function RESHAPE to do this conveniently.
REAL,DIMENSION(:,:), ALLOCATABLE :: arr2d
REAL,DIMENSION(:,:,:),ALLOCATABLE :: arr3d
...
! Read data into 1-D array, arr1d;
...
IF(L2d)THEN
ALLOCATE(arr2d(im,jm))
arr2d=RESHAPE(arr1d,(/im,jm/))
ELSEIF(L3d)THEN
ALLOCATE(arr3d(im,jm,km))
arr3d=RESHAPE(arr1d,(/im,jm,km/))
ENDIF
You could use the EQUIVALENCE statement like this:
Program ranks
integer a_1d(12)
integer a_2d(2, 6)
integer a_3d(2, 2, 3)
equivalence (a_1d, a_2d, a_3d)
! fill array 1d
a_1d = (/1,2,3,4,5,6,7,8,9,10,11,12/)
print *, a_1d
print *, a_2d(1,1:6)
print *, a_2d(2,1:6)
print *, a_3d(1,1,1:3)
print *, a_3d(2,1,1:3)
print *, a_3d(1,2,1:3)
print *, a_3d(2,2,1:3)
end program ranks
You can write a subroutine for each array rank and group them under a generic interface.
In this example I show how to populate arrays of different ranks using an interface block:
program main
  use data_mod
  implicit none
  real,dimension(:,:,:),allocatable::data
  integer::nx,ny,nz
  nx = 5
  ny = 10
  nz = 7
  call populate(nx,ny,nz,data)
  print *,data
end program main
The module is here (it is named data_mod rather than data, because a module name may not also be used as a variable name in a scope that uses the module):
module data_mod
  implicit none
  private
  public::populate
  interface populate
    module procedure populate_1d
    module procedure populate_2d
    module procedure populate_3d
  end interface
contains
  subroutine populate_1d(x,data)
    integer,intent(in)::x
    real,dimension(:),allocatable,intent(out):: data
    allocate(data(x))
    ! standard replacement for the non-standard rand() extension
    call random_number(data)
  end subroutine populate_1d
  subroutine populate_2d(x,y,data)
    integer,intent(in)::x,y
    real,dimension(:,:),allocatable,intent(out):: data
    allocate(data(x,y))
    call random_number(data)
  end subroutine populate_2d
  subroutine populate_3d(x,y,z,data)
    integer,intent(in)::x,y,z
    real,dimension(:,:,:),allocatable,intent(out):: data
    allocate(data(x,y,z))
    call random_number(data)
  end subroutine populate_3d
end module data_mod
There is an interface to populate 1-D, 2-D and 3-D arrays. You can call the populate interface instead of calling the individual subroutines; the compiler picks the matching one from the arguments.

Why does a Fortran POINTER require a TARGET?

Why does the Fortran 90 Specification specify (5.2.8) that the TARGET keyword must be used to associate a POINTER to it? Why isn't every type a valid TARGET?
For example,
INTEGER, POINTER :: px
INTEGER, TARGET :: x
x = 5
px => x
is valid syntax but
INTEGER, POINTER :: px
INTEGER :: x
x = 5
px => x
is not valid.
Why must this be?
An item that might be pointed to could be aliased to another item, and the compiler must allow for this. Items without the target attribute should not be aliased and the compiler can make assumptions based on this and therefore produce more efficient code.
Pointers in Fortran are different from pointers in C. In Fortran 90, pointers came with a few restrictions, such as requiring a target. This was done to address speed issues and to keep pointer usage safe. However, one can also allocate a pointer directly, in which case no named target is needed. Dig deeper and you will find them!!
For better compiler optimization. When your code runs on 1K-100K cores speed does matter.
Btw, TARGET is not always used, for example when the pointer itself is used to allocate memory.
...
real, pointer :: p(:), x(:)   ! x must have rank 1 to point at p(1:5)
...
allocate(p(15))
...
x => p(1:5)
...
nullify(x)
deallocate(p)
...
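Put together as a complete program, the two ways of giving a pointer something to point at look roughly like this. The variable names are chosen for illustration only.

```fortran
program pointer_targets
  implicit none
  integer, target  :: x        ! TARGET is required to allow px => x
  integer, pointer :: px
  real, pointer    :: p(:), s(:)

  ! Case 1: pointing at a named variable, which must be a TARGET.
  x = 5
  px => x
  px = 7                       ! assigns through the pointer, so x becomes 7
  print *, x

  ! Case 2: allocating through the pointer; no named target is involved.
  allocate(p(15))
  p = 0.0
  s => p(1:5)                  ! a pointer may also point at part of another pointer's target
  s = 1.0
  print *, sum(p)

  nullify(s)                   ! drop the alias before freeing the memory
  deallocate(p)
end program pointer_targets
```

Removing the TARGET attribute from x makes the compiler reject px => x, which is exactly the restriction the question asks about.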

Resources