Container Semantics and Functional Style in D - functional-programming

Do containers in D have value or reference semantics by default? If they have reference semantics, doesn't that fundamentally hinder the use of a functional programming style in D (compared to C++11's move semantics), such as in the following (academic) example:
auto Y = reverse(sort(X));
where X is a container.

Whether containers have value semantics or reference semantics depends entirely on the container. The only built-in containers are dynamic arrays, static arrays, and associative arrays. Static arrays have strict value semantics, because they sit on the stack. Associative arrays have strict reference semantics. And dynamic arrays mostly have reference semantics: their elements don't get copied, but the slice itself (pointer and length) is, so they end up with semantics which are a bit particular. I'd advise reading this article on D arrays for more details.
As for containers which are official but not built-in, the containers in std.container all have reference semantics, and in general, that's how containers should be, because it's highly inefficient to do otherwise. But since anyone can implement their own containers, anyone can create containers which are value types if they want to.
However, like C++, D does not take the route of having algorithms operate on containers, so as far as algorithms go, whether containers have reference or value semantics is pretty much irrelevant. In C++, algorithms operate on iterators, so if you wanted to sort a container, you'd do something like sort(container.begin(), container.end()). In D, they operate on ranges, so you'd do sort(container[]). In neither language would you actually sort a container directly. Whether containers themselves have value or reference semantics is therefore irrelevant to your typical algorithm.
However, D does better at functional programming with algorithms than C++ does, because ranges are better suited for it. Iterators have to be passed around in pairs, which doesn't work very well for chaining functions. Ranges, on the other hand, chain quite well, and Phobos takes advantage of this. One of its primary design principles is that most of its functions operate on ranges, letting you do in code what you typically do on the Unix command line with pipes: you have a lot of generic tools/functions whose output you can pipe/pass to other tools/functions, chaining independent operations to do something specific to your needs, rather than relying on someone having written a program/function which does exactly what you want directly. Walter Bright discussed this recently in this article.
So, in D, it's easy to do something like:
auto stuff = sort(array(take(map!"a % 1000"(rndGen()), 100)));
or if you prefer UFCS (Uniform Function Call Syntax):
auto stuff = rndGen().map!"a % 1000"().take(100).array().sort();
In either case, it generates a sorted list of 100 random numbers between 0 and 999, and the code is in a functional style. C++ would have a much harder time doing this, and libraries which operate on containers rather than iterators or ranges would have an even harder time.
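For comparison, here is a rough Python analogue of the same pipeline style, with a seeded generator standing in for rndGen() (the names here are illustrative, not Phobos APIs):

```python
import itertools
import random

# An infinite stream of pseudo-random numbers, analogous to rndGen().
def rnd_gen(seed=42):
    rng = random.Random(seed)
    while True:
        yield rng.randrange(2**32)

# map -> take -> array -> sort, mirroring the D range pipeline above.
stuff = sorted(itertools.islice((a % 1000 for a in rnd_gen()), 100))

print(len(stuff))  # 100
print(all(0 <= a < 1000 for a in stuff))  # True
```

The generator expression and islice are lazy, just like Phobos ranges; only sorted forces evaluation into an actual list.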

In-Built Containers
The only in-built containers in D are slices (also called arrays/dynamic arrays) and static arrays. The latter have value semantics (unlike in C and C++) - the entire array is (shallow) copied when passed around.
As for slices, they are value types with indirection, so you could say they have both value and reference semantics.
Imagine T[] as a struct like this:
struct Slice(T)
{
    size_t length;
    T* ptr;
}
Where ptr is a pointer to the first element of the slice, and length is the number of elements within the bounds of the slice. You can access the .ptr and .length fields of a slice, but while the data structure is identical to the above, it's actually a compiler built-in and thus not defined anywhere (the name Slice is just for demonstrative purposes).
Knowing this, you can see that copying a slice (assigning it to another variable, passing it to a function, etc.) just copies a length (no indirection - value semantics) and a pointer (has indirection - reference semantics).
In other words, a slice is a view into an array (located anywhere in memory), and there can be multiple views into the same array.
Algorithms
sort and reverse from std.algorithm work in-place to cater to as many users as possible. If the user wanted to put the result in a GC-allocated copy of the slice and leave the original unchanged, that can easily be done (X.dup). If the user wanted to put the result in a custom-allocated buffer, that can be done too. Finally, if the user wanted to sort in-place, this is an option. At any rate, any extra overhead is made explicit.
However, it's important to note that most algorithms in the standard library don't require mutation, instead returning lazily-evaluated range results, which is characteristic of functional programming.
User-Defined Containers
When it comes to user-defined containers, they can have whatever semantics they want - any configuration is possible in D.
The containers in std.container are reference types with .dup methods for making copies, thus slightly emulating slices.

Related

How do functional languages handle manipulating array arguments if arguments are not mutable?

I am reading about functional languages and I can't understand this particular thing. Suppose a function takes an array of numbers and has to square each number. What if we need to remove or insert some elements? Do we have to return a copy of the mutated array for every operation? If so how are arrays of hundreds of millions of objects manipulated reasonably?
There are several ways that functional languages handle array arguments.
Don't actually use arrays.
Instead of using arrays, one should almost always use some other data structure. Lists, binary search trees, finger trees, functional queues, and other data structures are commonly employed in functional code instead of arrays. It often takes some thought to pick the best data structure.
Have a "special escape hatch" for using mutation.
In Haskell, there is a magical thing known as the ST monad. This allows you to write code in Haskell which manipulates mutable arrays in an imperative style while still guaranteeing that the mutation can't "leak out" the escape hatch. For example, if I have a function f :: Int -> Int and I call f 3 twice, I am guaranteed to get the same results each time even if the function internally uses a mutable array. This is not the case in a language like Java, since calling f(3) might read from and write to mutable state, but in Haskell, you can use mutation fairly freely without compromising purity using ST.
Use linear types.
This is a relatively recent addition to Haskell. Consider a function modify :: Int -> a -> Array a -> Array a, where modify idx new_val original_array should return a new array which is a copy of original_array, except that position idx has been overwritten with value new_val. If we never read from the array original_array after we call the modify function on it, then it's ok for the compiler to secretly modify original_array rather than creating a new array without breaking the abstraction of the code. Linear types basically enforce this restriction within the language. It's rather sophisticated and takes some getting used to, but it allows you to use an underlying mutable data structure safely with functional abstractions. It's more limited than ST but doesn't involve any "imperative thinking".
Use immutable arrays.
You might just bite the bullet and use arrays that must be copied on modification. This is very rarely optimal, but the language may offer abstractions that make this more bearable and more asymptotically efficient in certain circumstances.
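The copy-on-modification approach can be sketched in Python, using tuples as immutable arrays (a deliberately naive illustration: each update costs O(n)):

```python
def modify(arr, idx, new_val):
    """Return a copy of the immutable array with one slot replaced."""
    return arr[:idx] + (new_val,) + arr[idx + 1:]

xs = (1, 2, 3)
ys = modify(xs, 1, 99)
print(xs)  # (1, 2, 3) -- the original is untouched
print(ys)  # (1, 99, 3)
```

This is exactly the cost that persistent data structures, ST-style mutation, and linear types are designed to avoid.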

Julia functions: making mutable types immutable

Coming from Wolfram Mathematica, I like the idea that whenever I pass a variable to a function I am effectively creating a copy of that variable. On the other hand, I am learning that in Julia there are the notions of mutable and immutable types, with the former passed by reference and the latter passed by value. Can somebody explain to me the advantage of such a distinction? Why are arrays passed by reference? Naively I see this as a bad aspect, since it creates side effects and ruins the possibility of writing purely functional code. Where am I wrong in my reasoning? Is there a way to make an array immutable, such that when it is passed to a function it is effectively passed by value?
Here is an example:
# x is an Int and so is immutable: it is passed by value
x = 10
function change_value(x)
    x = 17
end
change_value(x)
println(x)
# arrays are mutable: they are passed by reference
arr = [1, 2, 3]
function change_array!(A)
    A[1] = 20
end
change_array!(arr)
println(arr)
which indeed modifies the array arr
There is a fair bit to respond to here.
First, Julia does not pass by reference or pass by value. Rather, it employs a paradigm known as pass-by-sharing. Quoting the docs:
Function arguments themselves act as new variable bindings (new locations that can refer to values), but the values they refer to are identical to the passed values.
Second, you appear to be asking why Julia does not copy arrays when passing them into functions. This is a simple one to answer: Performance. Julia is a performance oriented language. Making a copy every time you pass an array into a function is bad for performance. Every copy operation takes time.
This has some interesting side effects. For example, you'll notice that a lot of the mature Julia packages (as well as the Base code) consist of many short functions. This code structure is a direct consequence of the near-zero overhead of function calls. Languages like Mathematica and MATLAB, on the other hand, tend towards long functions. I have no desire to start a flame war here, so I'll merely state that personally I prefer the Julia style of many short functions.
Third, you are wondering about the potential negative implications of pass-by-sharing. In theory you are correct that this can result in problems when users are unsure whether a function will modify its inputs. There were long discussions about this in the early days of the language, and based on your question, you appear to have worked out that the convention is that functions that modify their arguments have a trailing ! in the function name. Interestingly, this standard is not compulsory so yes, it is in theory possible to end up with a wild-west type scenario where users live in a constant state of uncertainty. In practice this has never been a problem (to my knowledge). The convention of using ! is enforced in Base Julia, and in fact I have never encountered a package that does not adhere to this convention. In summary, yes, it is possible to run into issues when pass-by-sharing, but in practice it has never been a problem, and the performance benefits far outweigh the cost.
Fourth (and finally), you ask whether there is a way to make an array immutable. First things first, I would strongly recommend against hacks to attempt to make native arrays immutable. For example, you could attempt to disable the setindex! function for arrays... but please don't do this. It will break so many things.
As was mentioned in the comments on the question, you could use StaticArrays. However, as Simeon notes in the comments on this answer, there are performance penalties for using static arrays for really big datasets. More than 100 elements and you can run into compilation issues. The main benefit of static arrays really is the optimizations that can be implemented for smaller static arrays.
Another package-based option suggested by phipsgabler in the comments below is FunctionalCollections. This appears to do what you want, although it looks to be only sporadically maintained. Of course, that isn't always a bad thing.
A simpler approach is just to copy arrays in your own code whenever you want to implement pass-by-value. For example:
f!(copy(x))
Just be sure you understand the difference between copy and deepcopy, and when you may need to use the latter. If you're only working with arrays of numbers, you'll never need the latter, and in fact using it will probably drastically slow down your code.
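The copy-versus-deepcopy distinction is the usual shallow/deep split found in most languages; Julia's copy and deepcopy behave analogously to this Python sketch of the pitfall with nested mutable elements:

```python
import copy

nested = [[1, 2], [3, 4]]

shallow = copy.copy(nested)   # new outer list, shared inner lists
deep = copy.deepcopy(nested)  # fully independent structure

nested[0][0] = 99
print(shallow[0][0])  # 99 -- the inner list is shared
print(deep[0][0])     # 1  -- the deep copy is unaffected
```

With a flat array of plain numbers there is nothing nested to share, which is why the shallow copy suffices in that case.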
If you wanted to do a bit of work, then you could also build your own array type in the spirit of static arrays, but without all the bells and whistles that static arrays entail. For example:
struct MyImmutableArray{T,N}
    x::Array{T,N}
end
Base.getindex(y::MyImmutableArray, inds...) = getindex(y.x, inds...)
and similarly you could add any other functions you wanted to this type, while excluding functions like setindex!.

Explicitly stack-allocated data

I think in most implementations of Common Lisp cons cells are generally/always heap allocated (see Why is consing in Lisp slow?)
Common Lisp does provide a facility for returning multiple values from a function (using values when returning and multiple-value-bind at the call-site). I'm speculating a bit here, but I think the motivation for this construction is two-fold: 1) make functions like truncate easier to use in the typical case where you don't care about the discarded value and 2) make it possible to return multiple values without using a heap-allocated data structure at all and (depending on the implementation (?)) avoiding the heap entirely (and GC overhead later down the road).
Does Common Lisp (or a specific implementation like SBCL maybe) give you the ability to use stack-allocated data (maybe in conjunction with something like weak references) or create composite/large-ish value types (something like structs in C)?
Common Lisp has a DYNAMIC-EXTENT declaration. Implementations can use this information to stack allocate some data structures - they can also ignore this declaration.
See the respective documentation how some implementations support it:
Allegro CL: Stack consing
LispWorks: Stack allocation of objects with dynamic extent
SBCL: Dynamic-extent allocation
Other implementations support it also, but they may lack explicit documentation about it.
The main motivation for explicit support of returning multiple values was to get rid of consing / destructuring lists of return values or even putting some results in global variables. Thus one may now be able to return multiple values in registers or via a stack.
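A minimal sketch of both facilities (standard Common Lisp; note that DYNAMIC-EXTENT is only a hint, which implementations are free to ignore):

```lisp
;; TRUNCATE returns two values; the second is easy to ignore.
(truncate 7 2)                       ; => 3 and 1

;; Receive both values without consing a list at the call site:
(multiple-value-bind (quotient remainder)
    (truncate 7 2)
  (list quotient remainder))         ; => (3 1)

;; Hint that a temporary list may be stack-allocated:
(defun sum-three (a b c)
  (let ((tmp (list a b c)))
    (declare (dynamic-extent tmp))
    (reduce #'+ tmp)))
```

With the declaration, an implementation such as SBCL may allocate tmp on the stack because it provably does not escape sum-three.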

Why should I use a pointer (performance)?

I'm wondering if there is any perf benchmark on raw objects vs pointers to objects.
I'm aware that it doesn't make sense to use pointers on reference types (e.g. maps) so please don't mention it.
I'm aware that you "must" use pointers if the data needs to be updated so please don't mention it.
Most of the answers/docs that I've found basically rephrase the guidelines from the official documentation:
... If the receiver is large, a big struct for instance, it will be much cheaper to use a pointer receiver.
My question is simply: what does "large" / "big" mean? Is a pointer to a string overkill? What about a struct with two strings, or a struct with three string fields?
I think we deal with this use case quite often, so it's a fair question to ask. Some advise not to worry about the performance issue, but maybe some people want to use the right notation whenever they have the chance, even if the performance gain is not significant. After all, a pointer is not that expensive (i.e. one additional keystroke).
An example where it doesn't make sense to use a pointer is for reference types (slices, maps, and channels)
As mentioned in this thread:
The concept of a reference just means something that serves the purpose of referring you to something. It's not magical.
A pointer is a simple reference that tells you where to look.
A slice tells you where to start looking and how far.
Maps and channels also just tell you where to look, but the data they reference and the operations they support on it are more complex.
The point is that all the actual data is stored indirectly and all you're holding is information on how to access it.
As a result, in many cases you don't need to add another layer of indirection, unless you want a double indirection for some reason.
As twotwotwo details in "Pointers vs. values in parameters and return values", strings, interface values, and function values are also implemented with pointers.
As a consequence, you would rarely need to use a pointer to those objects.
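A quick Go sketch of the point above: the slice header itself is copied into the function, but it still refers to the same backing array, so an extra pointer is usually redundant.

```go
package main

import "fmt"

// fill mutates the elements the slice refers to. The header (pointer,
// length, capacity) is passed by value, but it points at the same
// backing array as the caller's slice.
func fill(s []int, v int) {
	for i := range s {
		s[i] = v
	}
}

func main() {
	xs := []int{1, 2, 3}
	fill(xs, 7) // no *[]int needed
	fmt.Println(xs) // [7 7 7]
}
```

A *[]int would only be required if fill needed to change the header itself, e.g. by appending and reassigning.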
To quote the official golang documentation
...the consideration of efficiency. If the receiver is large, a big struct for instance, it will be much cheaper to use a pointer receiver.
It's very hard to give you exact conditions since there can be different performance goals. As a rule of thumb, by default, all objects larger than 128 bits should be passed by pointer. Possible exceptions of the rule:
you are writing a latency-sensitive server, so you want to minimise garbage-collection pressure. To achieve that, your Request struct has an [8]byte field instead of a pointer to a Data struct which holds an [8]byte. One allocation instead of two.
the algorithm you are writing is more readable when you pass the struct by value and make a copy
etc.

How do you work around the inability to pass a list of cl_mem into a kernel invocation?

There are lots of real-world reasons you'd want to do this. Ours is because we have a list of variable length data structures, and we want to be able to change the size of one of the elements without recopying them all.
Here are a few things I've tried:
1. Just have a lot of kernel arguments. Sure, sounds hacky, but works for small N. This is actually what we've been doing.
2. Do 1) with some sort of macro loop which extends the kernel args to the max size (which I think is device dependent). I don't really want to do this... it sounds bad.
3. Create some sort of list of structs which contain pointers, and fill it before your kernel invocation. I tried this, and I think it violates the spec. According to what I've seen on the nVidia forums, preserving the address of a device pointer beyond one kernel invocation is illegal. If anyone can point to where in the spec it says this, I'd love to know, because I can't find it. However, this definitely breaks on ATI hardware, as it moves the objects around.
4. Give up, store the variable sized objects in a big array, and write a clever algorithm to use empty space so the whole array must be reflowed less often. This will work, but is an inelegant, complicated design. Also, it requires lots of scary pointer arithmetic...
Does anyone else have other ideas? What about experiences trying to do this; is there a least hacky way? Why?
To 3:
OpenCL 1.1 spec page 193 says "Arguments to kernel functions in a program cannot be declared as a pointer to a pointer(s)."
A struct containing a pointer to a pointer (a pointer to a buffer object) might not be against a strict reading of this sentence, but it's within its spirit: no pointers to buffer objects may be passed as arguments from host code to a kernel, even if they're hidden inside a user-defined struct.
I'd opt for option 5: do not use variable-size data structures. If you have any way of making them constant size, by all means do it. It will make your life a whole lot easier. To be precise, there is no 'variable-size struct': every struct definition produces constant-sized structs, so if the size has changed then the struct itself has changed, and therefore requires another mem object. Every pointer passed to a kernel function must have a single type.
In addition to sharpnelis' answer (option 5):
If the objects have similar sizes, you could use a union sized to the biggest possible object. But make sure you use explicit alignment. Pass a second buffer identifying which union member is used in each object in your variable-sized-objects-in-static-size-union buffer.
I reverted to this when using OpenCL library code that only allowed one variable array of arbitrary type. I simply used cl_float2 to pass two floats. Since the cl_floatN types are implemented as unions, what works for the built-in types will work for you as well.
