Localizing global variables - global-variables

When using the Extended Program Check, I get the following warning:
Do not declare fields and field symbols (variable name) globally.
This is from declaring global data before the selection screen. The obvious solution is that they should be declared locally in a subroutine.
If I decide to do this, the data will now be out of scope for the other subroutines, so I would end up creating something to the effect of a main() function from C or Java. This sounds like a good idea - however, events such as INITIALIZATION are not allowed to be inside of subroutines, meaning that it forces a break in scope.
Observe the sample program below:
REPORT Z_EXAMPLE.
SELECTION-SCREEN BEGIN OF BLOCK upload WITH FRAME TITLE text-H01.
PARAMETERS: p_infile TYPE rlgrap-filename LOWER CASE OBLIGATORY.
SELECTION-SCREEN END OF BLOCK upload.
AT SELECTION-SCREEN ON VALUE-REQUEST FOR p_infile.
PERFORM main1 CHANGING p_infile.
INITIALIZATION.
PERFORM main2.
TOP-OF-PAGE.
PERFORM main3.
...
main1, main2, and main3 cannot to my knowledge pass any data to one another without global declaration. If the data is parsed from the uploaded file p_infile in main1, it cannot be accessed in main2 or main3. Aside from omitting events all together, is there any way to abide by the warning but let data be passed over events?

There are a variety of techniques - I prefer to code almost everything except for the basic selection screen handling in a separate controller class. The report simply defers to that class and calls its methods. Other than that - it's just a warning that you can ignore if you know what you're doing. Writing a program without any global variable at all will certainly not be practical - however, you should think at least twice before using global variables or attributes in a place where a method parameter would be more appropriate.

As #vwegert so rightly said, it's almost impossible to write an ABAP program that doesn't have at least a few global variables (the selection screen and events enforce that, unfortunately).
One approach is to use a controller class, another is to have a main subroutine and have it call other subroutines as required, passing values as required. I tend to favour the latter approach in a lot of cases, if only because it's easier to split the subroutines into logical groupings in separate includes (doing so with classes can sometimes be a little ugly). It really is a matter of approach though, but the key thing is reducing global variables to a minimum - unfortunately too few ABAP developers that I've encountered care about such issues.
Update
#Christian has reminded me that as of ABAP AS 7.02, subroutines are considered obsolete:
Subroutines should no longer be created in new programs for the following reasons:
The parameter interface has clear weaknesses when compared with the parameter interface of methods, such as:
positional parameters instead of keyword parameters
no genuine input parameters in pass by reference
typing is optional
no optional parameters
Every subroutine implicitly belongs to the public interface of its program. Generally this is not desirable.
Calling subroutines externally is critical with regard to the assignment of the container program to a program group in the internal session. This assignment cannot generally be defined as static.
Those are all valid points and I think in light of that, using classes for modularisation is definitely the preferred approach (and from a purely aesthetic point of view, they also "fit" better with the syntax enhancements in 7.02 and later).

Related

Functions with no side effect but with readonly access to state

Considering the common definition of pure functions:
the function return values are identical for identical arguments (no variation with local static variables, non-local variables, mutable reference arguments or input streams)
the function application has no side effects (no mutation of local static variables, non-local variables, mutable reference arguments or input/output streams).
What are the implications of using (only) impure functions that still have 2 but do not have 1, in the sense that they can read the current value of (some) immutable state, but can not modify (any) state ? Is such a pattern have a name and is useful or is it an anti pattern ?
An example can be using a global way to get a read only immutable version of a state, or passing as an argument a function that return the current immutable value of a state.
(Rationale - I have been trying to structure my (C#) code in a more functional way, using pure functions where possible (as static members of static classes).
It quickly became obvious how complex and tedious it is to pass state values to these pure functions even when they only need to read the value. I need to know the relevant state value at the point of calling and that often means passing it around through parts of the code which have no need to know it otherwise.
However if for example I initialize such static classes with an internal member function that can return the current immutable value of the state, other members can use it instead of having that value passed to them. And This pattern has greatly simplified my code where I used it. And it feels like I still get most of the benefits of isolating state changes etc)
An potentially impure operation without side effects fits the description of a Query in Command-Query Separation (CQS) - a decade-old object-oriented design principle.
CQS explicitly distinguishes between operations with side effects (Commands) and operations that return data (Queries). According to that principle, a Query must not have a side effect.
On the other hand, CQS says nothing about determinism, so a Query is allowed to be non-deterministic.
In C#, a fundamental example of a Query is DateTime.Now. This is essentially a method that takes no arguments - which is equivalent with unit as an input argument. Thus, we can think of DateTime.Now as a Query from unit to DateTime: () -> DateTime.
DateTime.Now is (in)famously non-deterministic, so it's clearly not a pure function. It is, however, a Query.
All pure functions are Queries, but not all Queries are pure functions.
CQS is a nice design principle, but it's not Functional Programming (FP). It's a move in the right direction, but you should attempt to have as few non-deterministic Queries as possible.
People often tend to focus on avoiding side-effects when learning FP, but it's just as important to avoid non-determinism.

Difference between the internal procedures and functions in Progress4gl?

Both internal procedures and functions are accepting the parameters to give the output. So what is the use of using Internal procedures instead of functions.
A user-defined function is used when you want to perform some calculation and return a single value. In this respect it is the same as a built-in ABL function, like the SUBSTRING or EXP functions. Putting this calculation code in a FUNCTION block instead of inline in your code allows you to put it in one place and reference it multiple times without code duplication.
An internal procedure is also an encapsulated piece of code that does some work, but it is more general-purpose. While a function must return a single value, an internal procedure may or may not have input parameters or output parameters.
https://docs.progress.com/category/openedge-archives
Also functions (like methods) parameters and return value type are checked at compile time, which removes some potential problems at run time later.
The question acknowledges that both functions and internal procedures allow OUTPUT parameters and asks "what is the use" of internal procedures instead of functions.
To me, this implies that the poster is contemplating always using functions and deprecating internal procedures and is asking: "what would I lose if I do that?"
Two things spring to mind:
Sort of the opposite of Jean-Christophe Cardot's point: you would lose some automatic type conversions and syntactic flexibility about the parameter lists. Some people see that flexibility in a negative light. Others see it as a positive.
You need to "forward declare" your functions or use dynamic invocations. With an internal procedure you can RUN it without providing a declaration earlier in the code.
If you tend to think that strict type checking is useful then these are probably not benefits that you think of as being lost. If you prefer more flexible behaviors, then you may regret choosing functions rather than internal procedures.

What is the difference between self-modifying code and reflection?

Self-modifying code is code that "alters its own instructions while it is executing". This is not typically done outside of assembly language or viruses.
Reflection is just the ability of a program to access its own namespace dynamically, in order to reference functions and classes and variables dynamically. According to this article, reflection is not just introspection (a program's ability to examine itself), but also intercession (a program's ability to modify itself).
So, is the difference that reflection refers to a mild form of self-modifying code where only the variable/class/function name gets "modified" within the instructions? That is, reflection is a milder, less "dramatic" form of modification compared with the ability to modify the nature of the entire instruction itself as in self-modifying code.
Do I have this distinction right?
No, one is about changing the code during execution. The other about reading the structure and metadata (introspection) of the code during execution.
They can be mutually exclusive. You don't need to know how the code is to modify it (if the OS allows you that is).
Typically you can use reflection to execute code in a non 'normal use case' manner, but it's still the same code. contrast this to modifying the code.
The goals are completely non aligned.
However I suppose one example where they intersect in a small way is to consider a function (F) that calls two other functions - A then B. You could reflect that knowledge and then call B then A (thus modifying the use case of (F)). as you can see it's not not modifying code, rather just the intended use case.

Limit functions to be called by specific other functions in Ada

Suppose I have a procedure that I want to only have called by another specific procedure. Is it possible to force restrictions on that procedure so that it can only be referenced by the specified calling procedure? Really what I'm wanting to know, is whether there is another way to write the code so you don't have to nest/embed procedures within procedures, to force a limited scope.
procedure one
procedure two
begin
blah
end two;
begin
end one;
EDIT: This is for Ada Code btw.
No (generally speaking).
A public procedure is a public procedure, so it can be invoked by anything that "with's" it (if it's a standalone procedure) or the package in which it is declared.
There are a few ways to constrain its visibility if any of these might fit with your implementation approach:
Declare the procedure in the private part of a package, or within the package body. Then only subprograms within that package would have access to it.
Declare the supplying package or subprogram as private, then those packages that 'with' it can only reference the supplying unit's contents (including invoking its subprograms) within their private part or package body.
"Private with" the supplying package, so that it can only reference the package within its private part/package body.
But like T.E.D. says, work with the language and exploit its capabilities rather than trying to recreate some other language's constructs.
Well, if you were to put procedure one in a package by itself and put procedure two in its private section, then no other routine would be able to call it (unless written into the package or a child package).
You could also make a tagged type with any data specific to procedure one in it, and put procedure two in its package with an object of that type as a parameter. Then others might call procedure two, but not with procedure one's object.
I'm a little confused as to why you'd want to recreate Ada's scoping behavior without using scoping though. Embrace the language.
I have two possible suggestions. The first one is slightly odd and off topic a bit but I wanted to bring it up in case you didn't know since most of the answers have to do with hiding visibility of the code or changing relationships
You could consider using the Ada Tasking features and use the 'Caller Attribute. Normally this is only for tasking and then the "caller" name only denotes the calling task to the receiving task. But once inside the receiving task's entry you can then use the Caller name to quickly end or otherwise flag the caller as wrong or not the caller you expected. This basically puts a "doorman" inside the task entry which could then decide to let them proceed, requeue the caller to a different entry, or do something else. But again this would only really work if you have a task consuming published calls from another task. This is the only thing that I'm aware of in Ada where you can detect who called you and do something about it at runtime.
However your question seemed to want to use scope and so I would agree with what's been said here and only add that in Ada it is a normal have have nested procedures (for readability) but in addition to that you could consider creating child packages and using the hierarchy in reverse. That is expose the chidren to the programmer and make the parent only accessible from the children. Design the parent to be very limited in scope such that the public spec of the parent is totally worthless to any caller that doesn't have the private view of the parent's spec. That way you have your separation and only the chidren can access the functions in the parent and can actually call them because they have a complete view of the parent's types and function definitions.
Good luck with your issue.

How to hand over variables to a function? With an array or variables?

When I try to refactor my functions, for new needs, I stumble from time to time about the crucial question:
Shall I add another variable with a default value? Or shall I use only one array, where I´m able to add an additional variable without breaking the API?
Unless you need to support a flexible number of variables, I think it's best to explicitly identify each parameter. In most cases you can add an overloaded method that has a different signature to support the extra parameter while still supporting the original method signature. If you use an array for passing variables it just makes it too confusing for users of your API. Obviously there are some inputs that lend themselves to an array (a list of points in a polygon, a list of account IDs you wish to perform an action on, etc.) but if it's not a variable that you would reasonably expect to be an array or list, you should pass it into the method as a separate parameter.
Just like many questions in programming, the right answer is "it depends".
To take Javascript/jQuery as an example, one good rule of thumb is whether the parameter will be required each time the function is called or whether it is optional. For example, the main jQuery function itself requires an expression to determine what element(s) the operation will affect:
jQuery(expresssion)
It makes no sense to try to pass this parameter as part of an array as it will be required every time this function is called.
On the other hand, many jQuery plugins require several miscellaneous parameters that may be optional. By convention, these are passed as parameters via an 'options' array. As you said, this provides a nice interface as new parameters can be added without affecting the existing API. This makes the API clean as well since the user can ignore those options that are not applicable.
In general, when several parameters are involved, passing them as an array is a nice convention as many of them are certainly going to be optional. This would have helped clean up many WIN32 API's, although it is more difficult to deal with arrays in C/C++ than in Javascript.
It depends on the programming language used.
If you have a run-of-the-mill OO language, you should use an object that you can easily extend, if you are really concerned about API consistency.
If that doesn't matter that much, there is the option of changing the method signature and overloading the method with more / different parameters.
If your language doesn't support either and you want the API to be binary stable, use an array.
There are several considerations that must be made.
Where is the function used? - Only in code you created? One place or hundreds of places? The amount of work that will need to be done to maintain existing code is important. Remember to include the amount of time it will take to communicate to other programmers that may currently be using your function.
How critical is the new parameter? - Do you want to require it to be used? If it has a default value, will that default value break existing use of the function in any subtle ways?
Ease of comprehension - How many parameters are already passed into the function? The larger the number, the more confusing and error prone it will be. Code Complete recommends that you restrict the number of parameters to 7 or less. If you need more than that, you should try to abstract some or all of the related parameters into one object.
Other special considerations - Do you want to optimize your efforts for any special conditions such as code speed or size? Are there any special considerations that must be taken into account for your execution environment? Keep in mind your goals for the project and make sure you aren't working against them with whatever design choice you make.
In his book Code Complete, Steve McConnell decrees that a function should never have more than 7 arguments, and rarely even that many. He presents compelling arguments - that I can't cite from memory, alas.
Clean Code, more recently, advocates even fewer arguments.
So unless the number of things to pass is really small, they should be passed in an enveloping structure. If they're homogenous, an array. If not, then a reasonably lightweight object should be built for the purpose.
You should do neither. Just add the parameter and change all callers to supply the proper default value. The reason is that parameters with default values can only be at the end, and will not be able to add any more required parameters anywhere in the parameters list, without having a risk of misinterpretation.
These are the critical steps to disaster:
1. add one or two parameters with defaults
2. some callers will supply it, and some will rely on defaults.
[half a year passed]
3. add a required parameter (before them)
4. change all callers to accept the required parameter
5. get a phone call, or other event which will make you forget to change one of the instances in part#2
6. now your program compiles perfectly, but is invalid.
Unfortunately, in function call semantics we usually don't have a chance to say, by name, which value goes where.
Array is also not a proper solution. Array should be used as a connection of similar objects, upon which there's a uniform activity performed. As they say here, if it's worth refactoring, it's worth refactoring now.

Resources