Is there a set of general rules/guidelines that can help to understand when to prefer pragma Pure, pragma Preelaborate, or something else entirely? The rules and definitions presented in the standard (Ada 2012) are a little heavy-going and I'd be grateful to read something that's a little more clear and geared towards the average case.
If I wanted to be thorough without fully understanding the "why" of it, can I simply try:
Mark the package spec with pragma Pure;
If it doesn't compile, try pragma Preelaborate;
If that fails, then I've done something tricky and either need to pragma Elaborate units on a with-by-with basis, or rethink the package layout.
This might work (does it?), since it's recommended to mark a package as Pure whenever possible (likewise with Preelaborate), but it seems a bit brute-force and I'd prefer to understand the process a bit better.
pragma Pure
You should use this on any package that has no internal state. It tells the user of the package that calls to any of its subprograms cannot have side effects, because there is no internal state they could change. So a function declared at library level inside a pure package will always return the same result when called with the same parameters.
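For instance, a minimal pure package could look like this (a sketch; names are illustrative):

package Geometry is
   pragma Pure;
   Pi : constant := 3.14159_26535;
   --  No variables here: a pure package has no state, so Circle_Area
   --  always returns the same result for the same Radius.
   function Circle_Area (Radius : Float) return Float;
end Geometry;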
Because of these requirements, the Ada implementation is allowed to cache the return values of functions of a pure package, and to omit calls to its subprograms if their results won't be used. However, you can violate these constraints by calling imported subprograms (e.g. from a C library) inside your pure package (these may change some internal state the Ada compiler doesn't know about). If you're evil, you can even import Ada subprograms from other parts of the software with pragma Import to bypass the requirements of pragma Pure. Needless to say: if you're doing anything like this, don't use pragma Pure.
Edit: To clarify the circumstances when calls may be omitted, let me quote the ARM:
If a library unit is declared pure, then the implementation is permitted to omit a call on a library-level subprogram of the library unit if the results are not needed after the call. Similarly, it may omit such a call and simply reuse the results produced by an earlier call on the same subprogram, provided that none of the parameters are of a limited type, and the addresses and values of all by-reference actual parameters, and the values of all by-copy-in actual parameters, are the same as they were at the earlier call. This permission applies even if the subprogram produces other side effects when called.
GNAT, for example, additionally specifies that any subprogram taking a parameter of type System.Address (or a type derived from it) is not considered pure, even if it is declared in a pure package, because the location the address designates may be altered; GNAT does not know what kind of structure the address points to and therefore cannot check whether the designated value has changed.
pragma Preelaborate
This tells the compiler that the package won't execute any code at elaboration time (i.e. before the main procedure starts executing). At elaboration time, the following constructs execute code:
Initialization of library-level variables (this can be a function call)
Initialization of tasks declared at library level (they may start executing before the main procedure does)
Statements in a begin ... end block at library level
You generally should avoid these things if you don't need them. Use pragma Preelaborate wherever possible; it tells clients that they can safely use the package without anything being executed at elaboration time.
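As a sketch (names are illustrative), a preelaborable package may declare variables, but their initialization must not run code:

package Config is
   pragma Preelaborate;
   Max_Items : constant := 100;                              -- static: fine
   Buffer    : array (1 .. 8) of Integer := (others => 0);   -- static aggregate: fine
   --  Start_Time : constant Duration := Clock_Seconds;
   --  would be rejected: initializing a variable with a function
   --  call executes code at elaboration time
end Config;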
If something doesn't compile with one of these pragmas when you think it should, look into why it doesn't compile. It may help you discover problems with your package implementation or structure. Don't just drop the pragma when it doesn't compile. Because the pragma you choose constrains which pragmas can be applied to packages that depend on yours, you should always choose the strictest applicable one.
Elaboration Order Handling in GNAT is a helpful guide. Ideally, the standard rules will suffice for most programs. The elaboration pragmas tell the compiler to substitute your elaboration order for its own; they should be applied to solve specific problems rather than used empirically.
Addendum: @ajb underscores an important distinction among the pragmas. The article cited agrees with the approach outlined in the question (bullets one and two): "Consequently a good rule is to mark units as Pure or Preelaborate if possible, and if this is not possible, mark them as Elaborate_Body if possible." It goes on to discuss situations (bullet three) "where neither of these three pragmas can be used."
Related
I wonder what the fundamental difference is between binding and linking when working with Ada code. I couldn't find a good explanation via Google, which is why I'm asking.
For the binding process what is the input and what is the output?
What is the relation between binding and linking? I assume binding needs to be done first.
Thanks,
Bogdan.
With GNAT, there are two jobs which the binder performs: first, checking that all the necessary compilations have been done, so that the program's closure is consistent; and second, arranging for elaboration to happen (these jobs are needed for any Ada build system, but they may be implemented differently).
When using gnatmake, the first of these jobs is usually superfluous, because gnatmake has already organised all the necessary compilations. It is possible to get this wrong (by, for example, moving a unit to a different library and not deleting its compilation products from the original place) but quite hard!
Elaboration is a feature of Ada that isn’t present in many other languages. There’s explanation at gcc.gnu.org and other places, but for a simple example,
with Foo;
package Bar is
   Int : Integer := Foo.Value;
   [...]
end Bar;

package Foo is
   function Value return Integer;
   [...]
end Foo;
we don’t know what Foo.Value is going to return at compile time, and we may not know until run time (what if it reads a value from the command line?), so Foo.Value must be in a fit state to be called before Bar’s initialisation happens.
Bar’s initialisation happens when Bar is elaborated, and likewise for Foo, so it’s gnatbind’s job to recognise this and arrange that Foo is elaborated before Bar.
It does this by emitting calls to the packages' elaboration code in a procedure (usually called adainit), and a main(), which is called by the operating system and which calls adainit and then the Ada main program, say program.adb.
gnatmake then calls gnatlink, which takes the gnatbind-generated code (Ada source in files called b-program.ad[sb], b__program.ad[sb], or b~program.ad[sb], depending on the vintage of the compiler), compiles it, and links it with the program's closure to produce the final executable.
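For a single-unit program, the manual equivalent of what gnatmake orchestrates is roughly this (file names are illustrative; the tools are the standard GNAT ones):

$ gcc -c program.adb      # compile: produces program.o and program.ali
$ gnatbind program.ali    # bind: checks closure consistency, generates the binder file
$ gnatlink program.ali    # link: compiles the binder file and links the executable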
See the four points listed here: https://docs.adacore.com/gnat_ugn-docs/html/gnat_ugn/gnat_ugn/building_executable_programs_with_gnat.html#binding-with-gnatbind
You could think of it as a built-in make but without the recompilation: it ensures objects are consistent, generates a correct initialization order, compiles it, and passes everything to the linker.
As pointed out, in Ada the program entry point is not your main procedure, but one that performs a safe initialization and then calls your main procedure.
I've got a procedure within a SPARK module that calls the standard Ada.Text_IO.Put_Line.
During proving I get the following warning: warning: no Global contract available for "Put_Line".
I already know how to add the respective data dependency contract to procedures and functions written by myself, but how do I add them to procedures/functions written by others, where I can't edit the source files?
I looked through sections 5.2 and 7.4 of the AdaCore SPARK 2014 user's guide but didn't find an example with a solution to my problem.
This means that the analyzer cannot "see" whether global variables might be affected when this function is called. It therefore assumes this call is not modifying anything (otherwise all other proofs could be refuted immediately). This is likely a valid assumption for your specific example, but it might not be valid on an embedded system, where a custom implementation of Put_Line might do anything.
There are two ways to convey the missing information:
the verifier can examine the source code of the subprogram and try to generate global contracts itself;
the global contracts are specified explicitly; see SPARK RM 6.1.4 (http://docs.adacore.com/spark2014-docs/html/lrm/subprograms.html#global-aspects)
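For code you own, an explicit contract looks like this (a sketch; Counter is an illustrative global):

Counter : Natural := 0;

procedure Increment with
  Global => (In_Out => Counter);  --  the analyzer now knows exactly what is touched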
In this case, the procedure you are calling is part of the run-time system (RTS), and therefore the source is not visible, and you probably cannot/should not change it.
What to do in practice?
Suppressing warnings is almost never a good idea, especially not when you are working on something safety-critical. Usually the code has to be changed until the warning goes away, or some justification process has to start.
If you are serious about the analysis results, I recommend not using such subprograms. If you really need output there, either write your own procedure that replaces the RTS subprogram, or ensure that the subprogram really has no side effects. This is further backed up by what Frédéric has linked: even if the callee has no side effects, you don't know whether it raises an exception for specific inputs (e.g., very long strings).
If you are not so serious about the results, then you can consider this specific one as a warning that you could live with.
Wrapper packages for use in development of SPARK applications may be found here:
https://github.com/joakim-strandberg/aida_2012
I think you just can't add SPARK contracts to code you don't own, especially code from the Ada standard library.
Regarding Text_IO, I found something in the reference manual that may be valuable to you.
EDIT
Another solution, besides what Martin said, according to the book "Building High Integrity Applications with SPARK", is to create a wrapper package.
Since SPARK requires you to deal with SPARK packages but allows you to depend on a SPARK spec with a full-Ada body, the solution is to build a SPARK package wrapping your Ada.Text_IO calls.
It might be tedious, as you will have to wrap possible exceptions and perhaps define specific types, but this way you'll be able to discharge VCs on your full SPARK package.
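A minimal sketch of such a wrapper (package name and profile are illustrative): the spec is in SPARK and promises Global => null, while the body is excluded from analysis and is free to call the RTS.

package IO_Wrapper with SPARK_Mode is
   procedure Put_Line (Item : String) with Global => null;
end IO_Wrapper;

with Ada.Text_IO;
package body IO_Wrapper with SPARK_Mode => Off is
   procedure Put_Line (Item : String) is
   begin
      Ada.Text_IO.Put_Line (Item);
   end Put_Line;
end IO_Wrapper;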
When using the Extended Program Check, I get the following warning:
Do not declare fields and field symbols (variable name) globally.
This is from declaring global data before the selection screen. The obvious solution is that they should be declared locally in a subroutine.
If I decide to do this, the data will now be out of scope for the other subroutines, so I would end up creating something to the effect of a main() function from C or Java. This sounds like a good idea; however, events such as INITIALIZATION are not allowed inside subroutines, which forces a break in scope.
Observe the sample program below:
REPORT Z_EXAMPLE.
SELECTION-SCREEN BEGIN OF BLOCK upload WITH FRAME TITLE text-H01.
PARAMETERS: p_infile TYPE rlgrap-filename LOWER CASE OBLIGATORY.
SELECTION-SCREEN END OF BLOCK upload.
AT SELECTION-SCREEN ON VALUE-REQUEST FOR p_infile.
PERFORM main1 CHANGING p_infile.
INITIALIZATION.
PERFORM main2.
TOP-OF-PAGE.
PERFORM main3.
...
main1, main2, and main3 cannot, to my knowledge, pass any data to one another without global declarations. If the data is parsed from the uploaded file p_infile in main1, it cannot be accessed in main2 or main3. Aside from omitting events altogether, is there any way to abide by the warning but still pass data across events?
There are a variety of techniques - I prefer to code almost everything except for the basic selection screen handling in a separate controller class. The report simply defers to that class and calls its methods. Other than that - it's just a warning that you can ignore if you know what you're doing. Writing a program without any global variable at all will certainly not be practical - however, you should think at least twice before using global variables or attributes in a place where a method parameter would be more appropriate.
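A minimal sketch of that pattern (class and method names are illustrative; the implementation part is omitted):

CLASS lcl_controller DEFINITION.
  PUBLIC SECTION.
    METHODS:
      on_value_request CHANGING cv_file TYPE rlgrap-filename,
      on_initialization,
      on_top_of_page.
  PRIVATE SECTION.
    DATA mt_lines TYPE TABLE OF string. " shared state, no longer report-global
ENDCLASS.

DATA go_controller TYPE REF TO lcl_controller.

INITIALIZATION.
  CREATE OBJECT go_controller.
  go_controller->on_initialization( ).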
As @vwegert so rightly said, it's almost impossible to write an ABAP program that doesn't have at least a few global variables (the selection screen and events enforce that, unfortunately).
One approach is to use a controller class, another is to have a main subroutine and have it call other subroutines as required, passing values as required. I tend to favour the latter approach in a lot of cases, if only because it's easier to split the subroutines into logical groupings in separate includes (doing so with classes can sometimes be a little ugly). It really is a matter of approach though, but the key thing is reducing global variables to a minimum - unfortunately too few ABAP developers that I've encountered care about such issues.
Update
@Christian has reminded me that as of ABAP AS 7.02, subroutines are considered obsolete:
Subroutines should no longer be created in new programs for the following reasons:
The parameter interface has clear weaknesses when compared with the parameter interface of methods, such as:
positional parameters instead of keyword parameters
no genuine input parameters in pass by reference
typing is optional
no optional parameters
Every subroutine implicitly belongs to the public interface of its program. Generally this is not desirable.
Calling subroutines externally is critical with regard to the assignment of the container program to a program group in the internal session. This assignment cannot generally be defined as static.
Those are all valid points and I think in light of that, using classes for modularisation is definitely the preferred approach (and from a purely aesthetic point of view, they also "fit" better with the syntax enhancements in 7.02 and later).
There are three 'normal' modes of passing parameters in Ada: in, out, and in out.
But then there's a fourth mode, access… is there any situation in which access parameters are required?
(i.e. something that would otherwise be impossible.)
Now, I do know that the JVM-targeted GNAT Ada compiler makes pretty heavy use of them in the imported [library] specifications. (Also, they could arguably be seen as essential for C/C++ translations.)
One of the primary drivers of the access mode was to work around the restriction that, prior to Ada 2012, function parameters could only be of mode 'in'.
So while there may still be areas where they're an appropriate solution, perhaps in bindings, Ada 2012's relaxation of the allowed function parameter modes to include 'in out' will probably significantly reduce the need for access parameters.
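As a sketch of the difference (Generator is an illustrative type):

type Generator is record
   State : Integer := 42;
end record;

--  Ada 2012: functions may take 'in out' parameters directly
function Next (Gen : in out Generator) return Integer;

--  Pre-2012 workaround: smuggle the writable parameter in via 'access'
function Next_Via_Access (Gen : access Generator) return Integer;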
Regardless of what other uses there are for them, I rather like using them when coding bindings to C APIs that take pointers (if and only if 0 is not a valid value for that parameter on the C side).
This way on the Ada side I can deal with a nice object rather than a messy error-prone pointer.
Of course you can just specify in the bindings that the parameter is passed by reference, which gets you the same thing.
In my latest project, the only time I've needed access so far is when defining my own stream subprograms (Read, Write, X'Class'Output, etc.). These subprograms require a parameter of type not null access Ada.Streams.Root_Stream_Type'Class.
For example:
with Ada.Streams;

package Example is
   type Printable_Type is private;
   procedure Print_Printable
     (Stream : not null access Ada.Streams.Root_Stream_Type'Class;
      Print  : in Printable_Type);
   for Printable_Type'Write use Print_Printable;
private
   type Printable_Type is null record;  --  placeholder full view (elided in the original)
end Example;
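With that in place (and assuming a body for Example exists), writing a value via the stream attribute dispatches to Print_Printable; Ada.Text_IO.Text_Streams.Stream supplies the required access value:

with Ada.Text_IO.Text_Streams;
with Example;

procedure Demo is
   V : Example.Printable_Type;
begin
   Example.Printable_Type'Write
     (Ada.Text_IO.Text_Streams.Stream (Ada.Text_IO.Current_Output), V);
end Demo;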
What are some of the optimization steps that this declaration enables?
(optimize speed (safety 0))
Can I hand-code some of these techniques in my Lisp/Scheme program?
What these things do in CL depends on the implementation. What usually happens is: there's a bunch of optimizations and other code transformations that can be applied on your code with various tradeoffs, and these declarations are used as a higher level specification that is translated to these individual transformations. Most implementations will also let you control the individual settings, but that won't be portable.
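For concreteness, here's a sketch of the two common ways to write such declarations in CL (the function is illustrative):

;; global: affects everything compiled from here on
(declaim (optimize (speed 3) (safety 0)))

;; local: affects only this function
(defun add-floats (a b)
  (declare (type double-float a b)
           (optimize (speed 3) (safety 0)))
  (+ a b))  ; the type and optimize declarations let the compiler emit an unboxed float add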
In Scheme there is no such standard facility, even though some Schemes use a similar approach. The thing is that Scheme in general (i.e., in the standard) avoids such "real world" issues. It is possible to use some optimization techniques here and there, but that depends on the implementation. For example, in PLT the first thing you should do is make sure your code is defined in a module; this ensures that the compiler gets to do a bunch of optimizations like inlining and loop unrolling.
I don't know, but I think the SBCL internals wiki might have some starting points if you want to explore.
Higher speed settings will cause the compiler to work harder on constant folding, compile-time type inference (hence eliminating runtime dynamic dispatch for generic operations), and other code analyses/transformations; lower safety will skip runtime type checks, array bounds checks, etc. For more details, see the Advanced Compiler Use and Efficiency Hints chapter of the CMUCL User's Manual, which applies (more or less) to both CMUCL and SBCL.