Fortran array of variable size arrays [closed] - multidimensional-array

It's difficult to tell what is being asked here. This question is ambiguous, vague, incomplete, overly broad, or rhetorical and cannot be reasonably answered in its current form. For help clarifying this question so that it can be reopened, visit the help center.
Closed 10 years ago.
A simplified description of the problem:
There are exactly maxSize people shopping in a store. Each of them has a shopping list, containing the price of items (as integers). Using Fortran arrays, how can I represent all the shopping lists. The shopping lists may contain any number of items (1, 10, 1000000000).
(NOTE: The actual problem is far more complicated. It is not even about shopping.)
The lazy approach would be:
integer :: array(maxSize, A_REALLY_BIG_NUMBER)
However, this is very wasteful, I basically want the second dimension to be variable, and then allocate it for each person seperately.
The obvious attempt, doomed to failure:
integer, allocatable :: array(:,:)
allocate(array(maxSize, :)) ! Compiler error
Fortran seems to require that arrays have a fixed size in each dimension.
This is wierd, since most languages treat a multidimensional array as an "array of arrays", so you can set the size of each array in the "array of arrays" seperately.
Here is something that does work:
type array1D
integer, allocatable :: elements(:) ! The compiler is fine with this!
endtype array1D
type(array1D) :: array2D(10)
integer :: i
do i=1, size(array2D)
allocate(array2D(i)%elements(sizeAt(i))
enddo
If this is the only solution, I guess I will use it. But I was kind of hoping there would be a way to do this using intrinsic functions. Having to define a custom type for such a simple thing is a bit annoying.
In C, since an array is basically a pointer with fancy syntax, you can do this with an array of pointers:
int sizeAt(int x); //Function that gets the size in the 2nd dimension
int * array[maxSize];
for (int x = 0; x < maxSize; ++x)
array[x] = (int*)(calloc(sizeAt(x) , sizeof(int)));
Fortran seems to have pointers too. But the only tutorials I have found all say "NEVER USE THESE EVER" or something similar.

You seem to be complaining that Fortran isn't C. That's true. There are probably a near infinite number of reasons why the standards committees chose to do things differently, but here are some thoughts:
One of the powerful things about fortran arrays is that they can be sliced.
a(:,:,3) = b(:,:,3)
is a perfectly valid statement. This could not be achieved if arrays were "arrays of pointers to arrays" since the dimensions along each axis would not necessarily be consistent (the very case you're trying to implement).
In C, there really is no such thing as a multidimensional array. You can implement something that looks similar using arrays of pointers to arrays, but that isn't really a multidimensional array since it doesn't share a common block of memory. This can have performance implications. In fact, in HPC (where many Fortran users spend their time), a multi-dimensional C array is often a 1D array wrapped in a macro to calculate the stride based on the size of the dimensions. Also, dereferencing a 7D array like this:
a[i][j][k][l][m][n][o]
is a good bit more difficult to type than:
a(i,j,k,l,m,n,o)
Finally, the solution that you've posted is closest to the C code that you're trying to emulate -- what's wrong with it? Note that for your problem statement, a more complex data-structure (like a linked-list) might be in order (which can be implemented in C or Fortran). Of course, linked-lists are the worst as far as performance goes, but if that's not a concern, it's probably the correct data structure to use as a "shopper" can decide to add more things into their "cart", even if it wasn't on the shopping list they took to the store.

Related

Guidelines for choosing Array of Struct or Multiple Arrays

This question has been asked for other languages. I would like to ask this in relation to Julia.
What are the general guidelines for choosing between an array of struct e.g.
struct vertex
x::Real
y::Real
gradient_x::Real
gradient_y::Real
end
myarray::Array{Vertex}
and multiple arrays.
xpositions::Array{<:Real}
ypositions::Array{<:Real}
gradient_x::Array{<:Real}
gradient_y::Array{<:Real}
Are there any performance considerations? Or is it just a style/readability issue.
Your struct as it currently stands will perform poorly. From the Performance Tips you should always:
Avoid fields with abstract type
Similarly, you should always prefer Vector{<:Real} to Vector{Real}.
The Julian way to approach this is to parameterize your struct as follows:
struct Vertex{T<:Real}
x::T
y::T
gradient_x::T
gradient_y::T
end
Given the above, the two approaches discussed in the question will now have roughly similar performance. In practice, it really depends on what kind of operations you want to perform. For example, if you frequently need a vector of just the x fields, then having multiple arrays will probably be a better approach, since any time you need a vector of x fields from a Vector{Vertex} you will need to loop over the structs to allocate it:
xvec = [ v.x for v in vertexvec ]
On the other hand, if your application lends itself to functions called over all four fields of the struct, then your code will be significantly cleaner if you use a Vector{Vertex} and will be just as performant as looping over 4 arrays. Broadcasting in particular will make for nice clean code here. For example, if you have some function:
f(x, y, gradient_x, gradient_y)
then just add the method:
f(v::Vertex) = f(v.x, v.y, v.gradient_x, v.gradient_y)
Now if you want to apply it to vv::Vector{Vertex}, you can just use:
f.(vv)
Remember, user-defined types in Julia are just as performant as "in-built" types. In fact, many types that you might think of as in-built are just defined in Julia itself, much as you are doing here.
So the short summary is: both approaches are performant, so use whichever makes more sense in the context of your application.

Strange pair of type declarations

Okay, so it may not be that strange, but I'm really new to Ada. In my job, I am translating legacy Ada to C, and have come across something that I haven't seen yet. I searched around, but couldn't really find it; here it is.
type Discrete_Names is ( ENUM_POS_4, --label names in an enum
ENUM_POS_5, --that evaluate to 4, 5, and 6
ENUM_POS_6); --respectively
type Discrete_Array_Type is Array (Discrete_Names) of Discrete.Does_Not_Matter
Side noteā€”the Discrete.Does_Not_Matter just references another type in a different library.
It would be great if someone could just help me get my bearings and just figure out what is going on here.
Well, it is quite simple. In Ada arrays can be indexed by any discrete type, that is, integers, characters or enumeration types (your case). The line
type Discrete_Array_Type is Array (Discrete_Names) of Does_Not_Matter
declares Discrete_Array_Type as the type of an array that contains values of type Does_Not_Matter and it is indexed by values of type Discrete_Names.
If your doubt stems from the fact that ENUM_POS_4 has Pos equal to 4 -- so that it seems that the first index of the array is 4 and not 0 -- my suggestion is... forget about it. The compiler will take care of that. In Ada arrays can start from any index. For example, if you say
type Array_Foo is array(Positive range <>) of Characters;
Bar : Array_Foo(10..15);
Bar will be just 6 entries long (not 16) and when you access Bar(12) the compiler -- behind the scenes -- will remove the initial offset "10" to "12" so that you will access the third memory location reserved to Bar. (Actually, I think that for the sake of efficiency it will add 12 to the address of Bar diminished by 10 times the integer sizes, but this is a detail...)
My personal experience is that in cases like this you should not consider the enumerative type like a "integer in disguise" (although it will be internally represented by an integer), but like a type of its own that can be used to index an array. Let the compiler worry about the internal low-level details.

Bit Array with Find Max

So bit arrays and hash tables don't seem to inherently allow for a find-max type operation, but there are ways around it. I'm wondering if there's a way using the bit array alone without extra variables, pointers, or manipulating the start/end of the array, in some scenarios. For example...
I have integers {1,...,n} and a n-bit bit array. To keep a subset of the integers, I use the integer itself as the key in the bit array and set the bit to 1 if it is in the subset, or 0 if it is not.
For example for integers {1,2,3,4} and subset {1,3), the bit array would look like {1,0,1,0}.
It seems like there's no way to do this without somehow moving the bits around which leads me to believe the O(1) dream is dead and perhaps the bit array won't work. Is something like this possible in O(log n)?
Thanks
Finding the highest set bit on a bit array of length n is O(n). If you need better, then you'll need to choose another data structure, or keep a high-water mark along with your bitmap.

Using 2d array vs array of derived type in Fortran 90

Assuming you want a list of arrays, each having the same size. Is it better performance-wise to use a 2D array :
integer, allocatable :: data(:,:)
or an array of derived types :
type test
integer, allocatable :: content(:)
end type
type(test), allocatable :: data(:)
Of course, for arrays of different sizes, we don't have a choice. But how is the memory managed between the 2 cases ? Also, is one of them good code practice ?
Choose the implementation which minimises the conceptual distance that your mind has to leap between the problem in your head and the solution in your code. The force of this approach increases with age, both the age of your code (good conceptual design is a solid foundation for future development) and your own age (the less effort understanding your code demands the longer you'll remain mentally competent enough to understand it).
As to the non-opinion-determined part of your question concerning the way that the memory is managed ... My naive expectation is that most compilers will, under most circumstances, allocate contiguous memory for the first of your outlines, and may not for the second. But I don't care enough about this to check, and I do not think that you should either. I don't, by this, suggest that you should not be interested in what is going on under the hood, but rather that you should be more concerned with the matters referred to in the first paragraph.
In general, you want to use the simplest data structure that suits your problem. If a 2d rectangular array meets your needs - and for a huge number of scientific computing problems, problems for which Fortran is a good choice, it does - then that's the choice you want.
The 2d array will be contiguous in memory, which will normally make accessing it faster both due to caching and one fewer level of indirection; the 2d array will also allow you to do things like data = data * 2 or data = 0. which the array-of-array approach doesn't [Edited to add: though as IanH points out in comments you can create a defined type and defined operations on those types to allow this]. Those advantages are great enough that even when you have "ragged arrays", if the range of expected row lengths isn't that large, implementing it as a rectangular 2d array is sometimes a choice worth considering.

Ocaml operations are performed out of order [closed]

This question is unlikely to help any future visitors; it is only relevant to a small geographic area, a specific moment in time, or an extraordinarily narrow situation that is not generally applicable to the worldwide audience of the internet. For help making this question more broadly applicable, visit the help center.
Closed 10 years ago.
In Ocaml I have a "global" (ie. has file scope) array initialized with some numbers, then I do some operations on those numbers and then I call a function to sum those numbers together. Now because this array is "global" I didn't bother to pass the array as an argument and what ended up happening is that Ocaml calculated the sum of the initialized numbers (in compile time I guess) instead of after my operations on the array had happened. My question is, why does this happen? I spent about 3hrs trying to track down the bug! Does this have something to do with the no-side-effects part of Ocaml? And if so what are the rules for never having something like this happen?
Thanks
EDIT: You guys are very right, I had screwed up fundamentally. This was essentially my code
let my_array = Array.make 10 0;;
let sum_array = ...;;
let my_fun =
do_stuff_with_array args;
sum_array;;
So of course sum_array was being calculated beforehand. Changed it to this and it worked, is this the best solution?
let my_array = Array.make 10 0;;
let sum_array _ = ...;;
let my_fun =
do_stuff_with_array args;
sum_array ();;
OCaml did certainly not compute the sum of the elements of your array "at compile time". There is something you haven't understood about OCaml evaluation order. It's hard to answer your question because there is no question really, it just tells us that you're a bit lost on this topic.
This is fine if we can help you by explaining things to you. It would help, however, if you could help us in spotting where your incomprehension lies, by :
giving a small source code example that does not behave as you expect
and explaining which behavior you would expect and why
The general thing to know about OCaml evaluation order is that, in a module or file, sentences are evaluated from top to bottom, that when you write let x = a in b the expression a is always evaluated before b, and that a function fun x -> a (or equivalent form such as let f x = a) evaluates to itself, without evaluating a at all -- this happens at application time.
Some people like to have a "main" sentence that contains all the side-effects of your code. It is often written like that:
let () =
(* some code that does side-effect *)
If you write code that evaluates and produce side-effects in other part of your file, well, they will be evaluated before or after this sentence depending on whether they are before or after it.

Resources