Can you cheat contracts / asserts in SPARK?

Suppose I have a subprogram written in the SPARK Ada subset whose postconditions state some property (for example, that the returned array is sorted), and whose body just calls out to a function external to SPARK (for example, a C/C++ function that sorts arrays). Is there any way to force SPARK to assume that, after this call, the array will be sorted?

In short, GNATprove takes a divide-and-conquer approach when analyzing code. The following explanation is incomplete and in practice things are slightly more complicated, but for the sake of understanding, it gives a useful perspective on how things work.
For each assertion, loop invariant, and pre-/post-condition, GNATprove creates verification conditions (VCs) that must be proven. These VCs are proven from the assumptions in effect and the semantics of the code.
When a code section is being analyzed, and this code section starts just after a call to a subprogram, then any post-condition of that subprogram is assumed to hold.
If that particular subprogram is implemented in SPARK, then GNATprove will try to prove that the post-conditions indeed hold by analyzing the subprogram. However, if the particular subprogram is not in SPARK (e.g., the subprogram is imported), then the post-conditions will remain assumptions and it is left to the developer to prove them by other means.
A nice example that illustrates the first point can be found in sections 1 and 2 of the recently published article The Work of Proof in SPARK (available here). Note in particular how a repeated call to the Increment function is being analyzed by GNATprove.
So, if you want SPARK to assume particular post-conditions to hold for a subprogram that is not in SPARK (an imported function, for example), just provide the post-conditions.
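For example, here is a minimal sketch of such a declaration (the package, type, and external symbol names are hypothetical, and the details of binding an unconstrained Ada array to C are glossed over). Because Sort is imported, GNATprove has no body to analyze, so callers simply get Sorted (A) as an assumption after each call:

package Sorting with SPARK_Mode is

   type Int_Array is array (Positive range <>) of Integer;

   --  Ghost predicate used only in contracts
   function Sorted (A : Int_Array) return Boolean is
     (for all I in A'First .. A'Last - 1 => A (I) <= A (I + 1))
   with Ghost;

   --  The body is a C routine, so GNATprove cannot verify this Post;
   --  it is simply assumed to hold at every call site.
   procedure Sort (A : in out Int_Array)
     with Import,
          Convention    => C,
          External_Name => "c_sort",   --  hypothetical C symbol
          Post          => Sorted (A);

end Sorting;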

Is Ada.Containers.Functional_Maps usable in Ada2012?

The information about Ada.Containers.Functional_Maps in the GNAT documentation is quite—let's say—abstruse.
First, it says this:
…these containers can still be used safely.
But then, in the second paragraph:
They are also memory consuming, as the allocated memory is not reclaimed when the container is no longer referenced.
It seems to me that you cannot free the memory allocated for those objects once the program exits the context where they are created. My understanding is that you could run into a memory leak. Am I right?
Read the next two sentences in the doc:
Thus, they should in general be used in ghost code and annotations, so that they can be removed from the final executable. The specification of this unit is compatible with SPARK 2014.
Because the specification of Ada.Containers.Functional_Maps is compatible with SPARK, it may help to examine it in the context of related SPARK Libraries with regard to proof, testing and annotation. In particular,
The functional maps, sets and vectors are unbounded collections of indefinite elements that are neither controlled nor limited. While they are inefficient with regard to memory, they are simple, immutable and useful "to model user defined data structures."
The functional containers can be used in Ghost Code, "parts of the code that are only meant for specification and verification", as suggested here. This related example illustrates a ghost function.
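As a rough sketch of that ghost-code use (hedged: everything except the GNAT unit name is made up here, and the exact generic profile and operation names should be checked against your compiler's documentation; Has_Key and Get are the query functions from the GNAT spec):

pragma Ada_2012;
with Ada.Containers.Functional_Maps;

package Cache_Spec with SPARK_Mode is

   package Int_Maps is new Ada.Containers.Functional_Maps
     (Key_Type => Integer, Element_Type => Integer);

   --  A purely mathematical view of the cache; being Ghost, it can be
   --  compiled out of the final executable along with the map it holds.
   Model : Int_Maps.Map with Ghost;

   procedure Put (Key, Value : Integer)
     with Post => Int_Maps.Has_Key (Model, Key)
                  and then Int_Maps.Get (Model, Key) = Value;

end Cache_Spec;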
It seems to me that you cannot free the memory allocated for those objects once the program exits the context where they are created. My understanding is that you could run into a memory leak. Am I right?
There are some things you can do in Ada to manage memory. I would be surprised if, for example, an instance used inside a declare block were not cleaned up on the block's exit. This is, in fact, how some surprisingly robust applications can get away without "dynamically allocated" memory/values (strictly speaking the values are still heap-allocated, but that is a pedantic distinction).
This sort of granular control is really nice, as you can constrain things and their usage to specific points. Combined with Ada's good facilities for presenting interfaces, this means that swapping one structure for another can be less painful than it otherwise might be.
As an example of the above, I had a nested key-value map (a JSON object) that was being used to pass parameters around. The method for doing this changed, so I had a string of values (with common-rooted keys) coming in and a procedure that took JSON as input. Obviously what was needed was a keys-and-values-to-JSON function. Inside that function I used the multiway-tree container, with the leaves representing values and the internal nodes the keys; the second step was to traverse the tree and create the JSON object as needed. Simple recursion and data-structure selection solved the problem of adapting the textual key-value pairs of these nested parameters to JSON. And because the use of multiway trees was exclusive to this function, I can be confident that the memory used by the intermediate tree object is released on the function's exit.
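A minimal sketch of that declare-block point, using one of the standard (controlled) containers; the unit and object names are just for illustration:

with Ada.Containers.Indefinite_Ordered_Maps;
with Ada.Text_IO;

procedure Scoped_Map_Demo is
   package String_Maps is new Ada.Containers.Indefinite_Ordered_Maps
     (Key_Type => String, Element_Type => String);
begin
   declare
      Params : String_Maps.Map;   --  exists only inside this block
   begin
      Params.Insert ("user", "alice");
      Params.Insert ("role", "admin");
      Ada.Text_IO.Put_Line
        ("entries:" & Ada.Containers.Count_Type'Image (Params.Length));
   end;  --  Params is finalized here and its heap storage reclaimed
end Scoped_Map_Demo;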

Julia functions: making mutable types immutable

Coming from Wolfram Mathematica, I like the idea that whenever I pass a variable to a function I am effectively creating a copy of that variable. On the other hand, I am learning that in Julia there are the notions of mutable and immutable types, with the former passed by reference and the latter passed by value. Can somebody explain to me the advantage of such a distinction? Why are arrays passed by reference? Naively I see this as a bad aspect, since it creates side effects and ruins the possibility of writing purely functional code. Where am I wrong in my reasoning? Is there a way to make an array immutable, such that when it is passed to a function it is effectively passed by value?
Here is an example of code:
#x is an Int and so is immutable: it is passed by value
x = 10
function change_value(x)
    x = 17
end
change_value(x)
println(x)
#arrays are mutable: they are passed by reference
arr = [1, 2, 3]
function change_array!(A)
    A[1] = 20
end
change_array!(arr)
println(arr)
which indeed modifies the array arr
There is a fair bit to respond to here.
First, Julia does not pass-by-reference or pass-by-value. Rather it employs a paradigm known as pass-by-sharing. Quoting the docs:
Function arguments themselves act as new variable bindings (new locations that can refer to values), but the values they refer to are identical to the passed values.
Second, you appear to be asking why Julia does not copy arrays when passing them into functions. This is a simple one to answer: Performance. Julia is a performance oriented language. Making a copy every time you pass an array into a function is bad for performance. Every copy operation takes time.
This has some interesting side-effects. For example, you'll notice that a lot of the mature Julia packages (as well as the Base code) consist of many short functions. This code structure is a direct consequence of near-zero overhead to function calls. Languages like Mathematica and MATLAB, on the other hand, tend towards long functions. I have no desire to start a flame war here, so I'll merely state that personally I prefer the Julia style of many short functions.
Third, you are wondering about the potential negative implications of pass-by-sharing. In theory you are correct that this can result in problems when users are unsure whether a function will modify its inputs. There were long discussions about this in the early days of the language, and based on your question, you appear to have worked out that the convention is that functions that modify their arguments have a trailing ! in the function name. Interestingly, this standard is not compulsory so yes, it is in theory possible to end up with a wild-west type scenario where users live in a constant state of uncertainty. In practice this has never been a problem (to my knowledge). The convention of using ! is enforced in Base Julia, and in fact I have never encountered a package that does not adhere to this convention. In summary, yes, it is possible to run into issues when pass-by-sharing, but in practice it has never been a problem, and the performance benefits far outweigh the cost.
Fourth (and finally), you ask whether there is a way to make an array immutable. First things first, I would strongly recommend against hacks to attempt to make native arrays immutable. For example, you could attempt to disable the setindex! function for arrays... but please don't do this. It will break so many things.
As was mentioned in the comments on the question, you could use StaticArrays. However, as Simeon notes in the comments on this answer, there are performance penalties for using static arrays for really big datasets. More than 100 elements and you can run into compilation issues. The main benefit of static arrays really is the optimizations that can be implemented for smaller static arrays.
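For reference, a tiny sketch of the StaticArrays approach (assuming the package is installed; this is not from the original answer):

using StaticArrays

v = SVector(1, 2, 3)    # a fixed-size, immutable vector
sum(v)                  # ordinary array-style reads work
# v[1] = 10             # MethodError: an SVector cannot be mutated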
Another package-based option suggested by phipsgabler in the comments below is FunctionalCollections. This appears to do what you want, although it looks to be only sporadically maintained. Of course, that isn't always a bad thing.
A simpler approach is just to copy arrays in your own code whenever you want to implement pass-by-value. For example:
f!(copy(x))
Just be sure you understand the difference between copy and deepcopy, and when you may need to use the latter. If you're only working with arrays of numbers, you'll never need the latter, and in fact using it will probably drastically slow down your code.
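To illustrate that distinction, here is a throwaway example (not from the original answer): copy duplicates only the outer array, while deepcopy also duplicates whatever it contains.

x = [1, 2, 3]
y = copy(x)               # new outer array; enough for arrays of numbers
y[1] = 99
println(x)                # [1, 2, 3]  (unchanged)

nested = [[1, 2], [3, 4]]
shallow = copy(nested)    # outer array copied, inner arrays still shared
shallow[1][1] = 99
println(nested)           # [[99, 2], [3, 4]]

deep = deepcopy(nested)   # inner arrays copied as well
deep[2][1] = 77
println(nested)           # still [[99, 2], [3, 4]]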
If you wanted to do a bit of work then you could also build your own array type in the spirit of static arrays, but without all the bells and whistles that static arrays entail. For example:
struct MyImmutableArray{T,N}
    x::Array{T,N}
end
Base.getindex(y::MyImmutableArray, inds...) = getindex(y.x, inds...)
and similarly you could add any other functions you wanted to this type, while excluding functions like setindex!.
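Hypothetical usage of that wrapper: reads are forwarded to the wrapped array, while writes fail because setindex! was deliberately left undefined.

a = MyImmutableArray([1, 2, 3])
a[2]          # 2, via the forwarded getindex
# a[2] = 5    # MethodError: no setindex! method for MyImmutableArray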

Can't understand why Prolog is looping infinitely

From Bratko's book, Prolog Programming for Artificial Intelligence (4th Edition)
We have the following code, which doesn't work:
anc4(X,Z) :-
    anc4(X,Y),
    parent(Y,Z).
anc4(X,Z) :-
    parent(X,Z).
In the book, on page 55, figure 2.15, it is shown that parent(Y,Z) keeps being called until the stack runs out of memory.
What I don't understand is that Prolog makes a recursive call to anc4(X,Y), and not to parent(Y,Z) first. Why doesn't Prolog go over and over to the first line, anc4(X,Y), rather than going to the second line?
Can you please elaborate on why the line parent(Y,Z) keeps being called?
Thank you.
The origin of your 'problem' (i.e., goal order) is deeply rooted in the basics of the language.
Prolog implements SLD resolution by means of the simple and efficient strategy of chronological backtracking, and left-recursive clauses, like the first clause of anc4/2, cause infinite recursion.
Indeed, the comma operator (,)/2 stands for
evaluate the right expression only if the left expression holds
So, the order of goals in a clause is actually an essential part of the program.
For your concrete case,
... , parent(Y,Z).
cannot be called if
anc4(X,Y),
doesn't hold.
The counterpart to goals order is clauses order.
That is, the whole program has a different semantics after the clauses are exchanged:
anc4(X,Z) :-
    parent(X,Z).
anc4(X,Z) :-
    anc4(X,Y),
    parent(Y,Z).
To better understand the problem, I think it's worth trying this definition as well.
Prolog cannot handle left recursion by default without a tabling mechanism. Only some Prolog systems support tabling and usually you need to explicitly declare which predicates are tabled.
If you're using e.g. XSB, YAP, or SWI-Prolog, try adding the following directive on top of your source file containing the definition for the anc4/2 predicate:
:- table(anc4/2).
and retry your queries. The tabling mechanism detects when a query calls a variant (*) of itself, suspends execution of that branch, and tries an alternative branch (provided in your case by the second clause). When an answer to the query is found, execution of the suspended branch is resumed with that answer.
(*) Variant here means that two terms are equal upon variable renaming.
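Putting it together, a minimal self-contained file for SWI-Prolog might look as follows (the parent/2 facts are made up for illustration; Bratko's book defines its own family relations):

:- table(anc4/2).

parent(pam, bob).
parent(tom, bob).
parent(bob, ann).
parent(bob, pat).

anc4(X,Z) :-
    anc4(X,Y),
    parent(Y,Z).
anc4(X,Z) :-
    parent(X,Z).

% ?- anc4(tom, Who).
% Who = bob ;
% Who = ann ;
% Who = pat.
% (answer order may vary; without the table directive this query loops)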

How does Julia implement multimethods?

I've been reading a bit from http://c2.com/cgi/wiki?ImplementingMultipleDispatch
I've been having some trouble finding reference on how Julia implements multimethods. What's the runtime complexity of dispatch, and how does it achieve it?
Dr. Bezanson's thesis is certainly the best source for descriptions of Julia's internals right now:
4.3 Dispatch system
Julia’s dispatch system strongly resembles the multimethod systems found in some object-oriented languages [17, 40, 110, 31, 32, 33]. However we prefer the term type-based dispatch, since our system actually works by dispatching a single tuple type of arguments. The difference is subtle and in many cases not noticeable, but has important conceptual implications. It means that methods are not necessarily restricted to specifying a type for each argument “slot”. For example a method signature could be Union{Tuple{Any,Int}, Tuple{Int,Any}}, which matches calls where either, but not necessarily both, of two arguments is an Int.
The section continues to describe the type and method caches, sorting by specificity, parametric dispatch, and ambiguities. Note that tuple types are covariant (unlike all other Julian types) to match the covariant behavior of method dispatch.
The biggest key here is that the method definitions are sorted by specificity, so it's just a linear search to check if the type of the argument tuple is a subtype of the signature. So that's just O(n), right? The trouble is that checking subtypes with full generality (including Unions and TypeVars, etc.) is hard. Very hard, in fact. Worse than NP-complete: it's estimated to be Π₂ᴾ (see the polynomial hierarchy); that is, even if P=NP, this problem would still take non-polynomial time! It might even be PSPACE or worse.
Of course, the best source for how it actually works is the implementation in JuliaLang/julia/src/gf.c (gf = generic function). There's a rather useful comment there:
Method caches are divided into three parts: one for signatures where the first argument is a singleton kind (Type{Foo}), one indexed by the UID of the first argument's type in normal cases, and a fallback table of everything else.
So the answer to your question about the complexity of method lookup is: "it depends." The first time a method is called with a new set of argument types, it must go through that linear search, looking for a subtype match. If it finds one, it specializes that method for the particular arguments and places it into one of those caches. This means that before embarking upon the hard subtype search, Julia can perform a quick equality check against the already-seen methods… and the number of methods it needs to check is further reduced since the caches are stored as hashtables based upon the first argument.
But, really, your question was about the runtime complexity of dispatch. In that case, the answer is quite often "what dispatch?" — because it has been entirely eliminated! Julia uses LLVM as a just-barely-ahead-of-time compiler, wherein methods are compiled on-demand as they're needed. In performant Julia code, the types should be concretely inferred at compile time, and so dispatch can also be performed at compile time. This completely eliminates the runtime dispatch overhead, and potentially even inlines the found method directly into the caller's body (if it's small) to remove all function call overhead and allow for further optimizations downstream. If the types aren't concretely inferred, there are other performance pitfalls, and I've not profiled to see how much time is typically spent in dispatch. There are ways to optimize this case further, but there's likely bigger fish to fry first… and for now it's typically easiest to just make the hot loops type-stable in the first place.
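As a small, made-up illustration of that compile-time devirtualization, you can ask the compiler what it produces for a concretely typed call:

using InteractiveUtils   # provides @code_typed outside the REPL

halve(x::Int)     = x ÷ 2
halve(x::Float64) = x / 2

f(v) = halve(v) + 1      # which halve runs depends on the type of v

@code_typed f(10)        # the Int method is chosen (and typically inlined) at compile time
@code_typed f(10.0)      # the Float64 method instead; no runtime dispatch remains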

When to use Pragma Pure/Preelaborate

Is there a set of general rules/guidelines that can help to understand when to prefer pragma Pure, pragma Preelaborate, or something else entirely? The rules and definitions presented in the standard (Ada 2012), are a little heavy-going and I'd be grateful to read something that's a little more clear and geared towards the average case.
If I wanted to be thorough without fully understanding the "why" of it, can I simply try:
Mark the package spec with pragma Pure;
If it doesn't compile, try pragma Preelaborate;
If that fails, then I've done something tricky and either need to pragma Elaborate units on a with-by-with basis, or rethink the package layout.
While this might work (does it?), given that it's recommended to mark a package as Pure whenever possible (likewise with Preelaborate), it seems a bit brain-damaged and I'd prefer to understand the process a bit better.
pragma Pure
You should use this on any package which does not have an internal state. It tells the user of the package that calls to any subprograms cannot have side effects, because there is no internal state they could change. So a function declared at library level inside a pure package will always return the same result when called with the same parameters.
The Ada implementation is allowed to cache return values of functions of a pure package, and to omit calls to subroutines if their return values won't be used because of these requirements. However, you can violate the constraints by calling imported subroutines (e.g. from a C library) inside your pure package (these may change some internal state which the Ada compiler doesn't know of). If you're evil, you can even import Ada subroutines from other parts of the software with pragma Import to bypass the requirements of pragma Pure. Needless to say: If you're doing anything like this, don't use pragma Pure.
Edit: To clarify the circumstances when calls may be omitted, let me quote the ARM:
If a library unit is declared pure, then the implementation is permitted to omit a call on a library-level subprogram of the library unit if the results are not needed after the call. Similarly, it may omit such a call and simply reuse the results produced by an earlier call on the same subprogram, provided that none of the parameters are of a limited type, and the addresses and values of all by-reference actual parameters, and the values of all by-copy-in actual parameters, are the same as they were at the earlier call. This permission applies even if the subprogram produces other side effects when called.
GNAT, for example, additionally defines that any subroutines that take a parameter of type System.Address or a type derived from it are not considered pure even if they are defined in a pure package, because the location the address points to may be altered, but GNAT does not know what kind of structure the address points to and therefore cannot run any checks about whether the referenced value of the parameter has been changed.
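A minimal sketch of a package that qualifies for pragma Pure (names are illustrative): it has no library-level state, and its only subprogram's result depends solely on its argument, so an implementation may reuse results of earlier calls.

package Angles is
   pragma Pure;

   --  No package state; the result depends only on the argument.
   function Normalize (Degrees : Integer) return Integer is
     (Degrees mod 360);

end Angles;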
pragma Preelaborate
This tells the compiler that the package won't execute any code at elaboration time (i.e. before the main procedure starts executing). At elaboration time, the following constructs will execute:
Initialization of library-level variables (this can be a function call)
Initialization of tasks declared at library level (they may start executing before the main procedure does)
Statements in a begin ... end block at library level
You generally should avoid these things if you don't need them. Use pragma Preelaborate wherever possible; it tells the caller that the package can be used safely without executing anything at elaboration time.
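By contrast, here is a sketch of a package that cannot be Pure (it has library-level state) but is still preelaborable, because its variable is initialized with a static expression and nothing needs to run at elaboration time (again, the names are made up):

package Event_Counter is
   pragma Preelaborate;

   --  Library-level state rules out pragma Pure, but the static
   --  initialization below executes no code at elaboration time.
   Count : Natural := 0;

   procedure Bump;   --  declared here, implemented in the package body

end Event_Counter;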
If something doesn't compile with one of these pragmas when you think it should, look into why it doesn't compile. It may help you discover problems with your package implementation or structure. Don't just drop the pragma when it doesn't compile. Because your choice constrains which of these pragmas can be applied to packages that depend on yours, you should always choose the strictest applicable pragma.
Elaboration Order Handling in GNAT is a helpful guide. Ideally, the standard rules will suffice for most programs. The pragmas tell the compiler to substitute your elaboration order. They should be applied to solve specific problems, rather than used empirically.
Addendum: #ajb underscores an important distinction among the pragmas. The article cited agrees with the approach outlined in the question (bullets one and two): "Consequently a good rule is to mark units as Pure or Preelaborate if possible, and if this is not possible, mark them as Elaborate_Body if possible." It goes on to discuss situations (bullet three) "where neither of these three pragmas can be used."
