How to pass variables to a function? As an array or as individual variables? - conventions

When I try to refactor my functions for new needs, I stumble from time to time over the crucial question:
Shall I add another variable with a default value? Or shall I use only one array, where I'm able to add an additional variable without breaking the API?

Unless you need to support a flexible number of variables, I think it's best to explicitly identify each parameter. In most cases you can add an overloaded method that has a different signature to support the extra parameter while still supporting the original method signature. If you use an array for passing variables it just makes it too confusing for users of your API. Obviously there are some inputs that lend themselves to an array (a list of points in a polygon, a list of account IDs you wish to perform an action on, etc.) but if it's not a variable that you would reasonably expect to be an array or list, you should pass it into the method as a separate parameter.
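To make the overload idea concrete, here is a minimal sketch in Python (which lacks true overloading, so a defaulted parameter plays the same role; all names are hypothetical):

# Hypothetical sketch: extend an API without breaking existing callers.
# Python has no true overloading, so a defaulted parameter stands in for
# the extra signature an overload would provide.
def create_account(name, currency="USD"):
    # Original callers pass only `name`; newer callers may set `currency`.
    return {"name": name, "currency": currency}

create_account("alice")                 # old call sites keep working
create_account("bob", currency="EUR")   # new capability, still explicit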

As with many questions in programming, the right answer is "it depends".
To take JavaScript/jQuery as an example, one good rule of thumb is whether the parameter will be required each time the function is called or whether it is optional. For example, the main jQuery function itself requires an expression to determine what element(s) the operation will affect:
jQuery(expression)
It makes no sense to try to pass this parameter as part of an array as it will be required every time this function is called.
On the other hand, many jQuery plugins require several miscellaneous parameters that may be optional. By convention, these are passed via an 'options' object. As you said, this provides a nice interface, as new parameters can be added without affecting the existing API. It also keeps the API clean, since the user can ignore the options that are not applicable.
In general, when several parameters are involved, passing them in a single 'options' structure is a nice convention, as many of them are certainly going to be optional. This would have helped clean up many Win32 APIs, although it is more difficult to deal with such structures in C/C++ than in JavaScript.
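A rough Python analog of that split (names hypothetical): the always-required expression stays positional, while the optional extras travel in one keyword bundle that can grow without breaking callers.

# Hypothetical sketch of the jQuery-style 'options' pattern.
def highlight(selector, **options):
    # `selector` is required on every call; everything else is optional.
    color = options.get("color", "yellow")     # defaults live in one place
    duration = options.get("duration", 400)
    print(f"highlight {selector} in {color} for {duration}ms")

highlight("#header")                  # rely on all the defaults
highlight("#header", color="red")     # override just one option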

It depends on the programming language used.
If you have a run-of-the-mill OO language, you should use an object that you can easily extend, if you are really concerned about API consistency.
If that doesn't matter that much, there is the option of changing the method signature and overloading the method with more / different parameters.
If your language doesn't support either and you want the API to be binary stable, use an array.
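As a hedged sketch of such an easily extended object in Python (all names hypothetical): a field added later with a default leaves every existing caller intact.

from dataclasses import dataclass

@dataclass
class FetchOptions:
    url: str
    timeout: float = 30.0
    retries: int = 0    # added in a later release; old callers unaffected

def fetch(opts):
    print(f"GET {opts.url} (timeout={opts.timeout}s, retries={opts.retries})")

fetch(FetchOptions(url="https://example.com"))  # written before `retries` existed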

There are several considerations that must be made.
Where is the function used? - Only in code you created? In one place or hundreds of places? The amount of work that will be needed to maintain existing code is important. Remember to include the time it will take to communicate the change to other programmers who may currently be using your function.
How critical is the new parameter? - Do you want to require it to be used? If it has a default value, will that default value break existing use of the function in any subtle ways?
Ease of comprehension - How many parameters are already passed into the function? The larger the number, the more confusing and error-prone it will be. Code Complete recommends restricting the number of parameters to 7 or fewer. If you need more than that, you should try to abstract some or all of the related parameters into one object.
Other special considerations - Do you want to optimize your efforts for any special conditions such as code speed or size? Are there any special considerations that must be taken into account for your execution environment? Keep in mind your goals for the project and make sure you aren't working against them with whatever design choice you make.

In his book Code Complete, Steve McConnell decrees that a function should never have more than 7 arguments, and rarely even that many. He presents compelling arguments - that I can't cite from memory, alas.
Clean Code, more recently, advocates even fewer arguments.
So unless the number of things to pass is really small, they should be passed in an enveloping structure. If they're homogeneous, an array. If not, then a reasonably lightweight object should be built for the purpose.
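A minimal Python illustration of that rule (hypothetical names): the homogeneous points go in a plain list, while the heterogeneous settings get a small purpose-built object.

from dataclasses import dataclass

points = [(0, 0), (4, 0), (4, 3)]   # homogeneous: a plain list is fine

@dataclass
class RenderSettings:               # heterogeneous: name each field
    line_width: int = 1
    color: str = "black"
    antialias: bool = True

def draw_polygon(points, settings=None):
    settings = settings or RenderSettings()
    print(f"drawing {len(points)} points with {settings}")

draw_polygon(points)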

You should do neither. Just add the parameter and change all callers to supply the appropriate value. The reason is that parameters with default values can only appear at the end of the list, so you will not be able to add any further required parameters anywhere in the parameter list without risking misinterpretation.
These are the critical steps to disaster:
1. add one or two parameters with defaults
2. some callers will supply them, and some will rely on the defaults.
[half a year passes]
3. add a required parameter (before them)
4. change all callers to accept the required parameter
5. get a phone call, or some other event, which makes you forget to change one of the call sites from step 2
6. now your program compiles perfectly, but is invalid.
Unfortunately, in function call semantics we usually don't have a chance to say, by name, which value goes where.
An array is also not a proper solution. An array should be used as a collection of similar objects upon which some uniform activity is performed. As they say here: if it's worth refactoring, it's worth refactoring now.
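That said, some languages do let you say, by name, which value goes where, and that defuses the disaster scenario above. A hedged Python sketch (hypothetical names): the bare * makes the new parameter keyword-only, so a caller that was not updated fails loudly instead of silently shifting values into the wrong slots.

def transfer(amount, *, currency, dry_run=False):
    # `currency` must be passed by name; positional use raises a TypeError.
    print(f"transfer {amount} {currency} (dry_run={dry_run})")

transfer(100, currency="EUR")   # OK: the required parameter is named
# transfer(100, "EUR")          # TypeError: rejected at the call site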

Related

Difference between internal procedures and functions in Progress 4GL?

Both internal procedures and functions accept parameters and produce output. So what is the point of using internal procedures instead of functions?
A user-defined function is used when you want to perform some calculation and return a single value. In this respect it is the same as a built-in ABL function, like the SUBSTRING or EXP functions. Putting this calculation code in a FUNCTION block instead of inline in your code allows you to put it in one place and reference it multiple times without code duplication.
An internal procedure is also an encapsulated piece of code that does some work, but it is more general-purpose. While a function must return exactly one value, an internal procedure may have any number of input and output parameters, or none at all.
https://docs.progress.com/category/openedge-archives
Also, function (and method) parameters and return value types are checked at compile time, which removes some potential problems at run time later.
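Progress ABL aside, the shape of the distinction can be sketched in Python (hypothetical names): a function computes and returns a single value, while a procedure-style routine communicates through output parameters, simulated here with a mutable result object.

def substring_upper(text, start, length):
    # Function style: one computed return value, like an ABL FUNCTION.
    return text[start:start + length].upper()

def parse_order(line, result):
    # Procedure style: fills several outputs instead of returning one value.
    parts = line.split(",")
    result["customer"] = parts[0]
    result["quantity"] = int(parts[1])

out = {}
parse_order("ACME,42", out)
print(substring_upper("hello world", 0, 5), out)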
The question acknowledges that both functions and internal procedures allow OUTPUT parameters and asks "what is the use" of internal procedures instead of functions.
To me, this implies that the poster is contemplating always using functions and deprecating internal procedures and is asking: "what would I lose if I do that?"
Two things spring to mind:
Sort of the opposite of Jean-Christophe Cardot's point: you would lose some automatic type conversions and syntactic flexibility about the parameter lists. Some people see that flexibility in a negative light. Others see it as a positive.
You need to "forward declare" your functions or use dynamic invocations. With an internal procedure you can RUN it without providing a declaration earlier in the code.
If you tend to think that strict type checking is useful then these are probably not benefits that you think of as being lost. If you prefer more flexible behaviors, then you may regret choosing functions rather than internal procedures.

Localizing global variables

When using the Extended Program Check, I get the following warning:
Do not declare fields and field symbols (variable name) globally.
This is from declaring global data before the selection screen. The obvious solution is that they should be declared locally in a subroutine.
If I decide to do this, the data will now be out of scope for the other subroutines, so I would end up creating something to the effect of a main() function from C or Java. This sounds like a good idea - however, events such as INITIALIZATION are not allowed to be inside of subroutines, meaning that it forces a break in scope.
Observe the sample program below:
REPORT Z_EXAMPLE.
SELECTION-SCREEN BEGIN OF BLOCK upload WITH FRAME TITLE text-H01.
PARAMETERS: p_infile TYPE rlgrap-filename LOWER CASE OBLIGATORY.
SELECTION-SCREEN END OF BLOCK upload.
AT SELECTION-SCREEN ON VALUE-REQUEST FOR p_infile.
PERFORM main1 CHANGING p_infile.
INITIALIZATION.
PERFORM main2.
TOP-OF-PAGE.
PERFORM main3.
...
main1, main2, and main3 cannot, to my knowledge, pass any data to one another without global declarations. If the data is parsed from the uploaded file p_infile in main1, it cannot be accessed in main2 or main3. Aside from omitting events altogether, is there any way to abide by the warning but let data be passed across events?
There are a variety of techniques - I prefer to code almost everything except for the basic selection screen handling in a separate controller class. The report simply defers to that class and calls its methods. Other than that - it's just a warning that you can ignore if you know what you're doing. Writing a program without any global variable at all will certainly not be practical - however, you should think at least twice before using global variables or attributes in a place where a method parameter would be more appropriate.
As @vwegert so rightly said, it's almost impossible to write an ABAP program that doesn't have at least a few global variables (the selection screen and events enforce that, unfortunately).
One approach is to use a controller class, another is to have a main subroutine and have it call other subroutines as required, passing values as required. I tend to favour the latter approach in a lot of cases, if only because it's easier to split the subroutines into logical groupings in separate includes (doing so with classes can sometimes be a little ugly). It really is a matter of approach though, but the key thing is reducing global variables to a minimum - unfortunately too few ABAP developers that I've encountered care about such issues.
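For illustration, here is what the controller-class approach looks like reduced to a Python sketch (hypothetical names, not ABAP): the event blocks stay thin and the shared state lives in one object instead of in globals.

class ReportController:
    def __init__(self):
        self.records = []   # state shared by the handlers below

    def handle_file_selected(self, path):   # plays the role of main1
        with open(path) as f:
            self.records = [line.strip() for line in f]

    def handle_initialization(self):        # plays the role of main2
        self.records.clear()

    def handle_top_of_page(self):           # plays the role of main3
        print(f"{len(self.records)} records loaded")

# Each event block would simply delegate to the controller:
controller = ReportController()
controller.handle_initialization()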
Update
@Christian has reminded me that as of ABAP AS 7.02, subroutines are considered obsolete:
Subroutines should no longer be created in new programs for the following reasons:
The parameter interface has clear weaknesses when compared with the parameter interface of methods, such as:
positional parameters instead of keyword parameters
no genuine input parameters in pass by reference
typing is optional
no optional parameters
Every subroutine implicitly belongs to the public interface of its program. Generally this is not desirable.
Calling subroutines externally is critical with regard to the assignment of the container program to a program group in the internal session. This assignment cannot generally be defined as static.
Those are all valid points and I think in light of that, using classes for modularisation is definitely the preferred approach (and from a purely aesthetic point of view, they also "fit" better with the syntax enhancements in 7.02 and later).

What is the difference between self-modifying code and reflection?

Self-modifying code is code that "alters its own instructions while it is executing". This is not typically done outside of assembly language or viruses.
Reflection is just the ability of a program to access its own namespace dynamically, in order to reference functions and classes and variables dynamically. According to this article, reflection is not just introspection (a program's ability to examine itself), but also intercession (a program's ability to modify itself).
So, is the difference that reflection refers to a mild form of self-modifying code where only the variable/class/function name gets "modified" within the instructions? That is, reflection is a milder, less "dramatic" form of modification compared with the ability to modify the nature of the entire instruction itself as in self-modifying code.
Do I have this distinction right?
No: one is about changing the code during execution; the other is about reading the structure and metadata (introspection) of the code during execution.
They can be mutually exclusive. You don't need to know how the code is structured in order to modify it (if the OS allows that, that is).
Typically you can use reflection to execute code outside its 'normal use case', but it's still the same code. Contrast this with modifying the code itself.
The goals are completely unaligned.
However, I suppose one example where they intersect in a small way: consider a function F that calls two other functions - A then B. You could reflect that knowledge and then call B then A (thus modifying the use case of F). As you can see, it's not modifying the code, just the intended use case.
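A small Python sketch of that intersection (hypothetical names): F normally calls A then B; reflection - looking functions up by name at runtime - drives the same, unmodified functions in a different order. No instruction is ever rewritten.

import sys

def A(): print("A runs")
def B(): print("B runs")

def F():
    A(); B()    # the intended use case

module = sys.modules[__name__]
for name in ("B", "A"):          # reflected call order: B then A
    getattr(module, name)()      # same code, different use case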

Using generic functions of R, when and why?

I'm developing a major upgrade to an R package, and as part of the changes I want to start using S3 methods so I can use the generic plot, summary and print functions. But I'm not totally sure I understand why and when to use generic functions in general.
For example, I currently have a function called logLikSSM, which computes the log-likelihood of a state space model. Instead of using this function, I could write a function logLik.SSM or something like that, as there is a generic function logLik in R. The benefit would be that logLik is shorter to write than logLikSSM, but is there really any other point in this?
A similar case: there is a generic function called simulate in the stats package, so in theory I could use that instead of simulateSSM. But the description of the simulate function says it is used to "Simulate Responses", whereas my function actually simulates the hidden states, so it really doesn't fit the description of the simulate function. So in this case I probably shouldn't use the generic function, right?
I apologize if this question is too vague for here.
The advantages of creating methods for generics from the core of R include:
Ease of Use. Users of your package who are already familiar with those generics will have less to remember, making your package easier to use. They might even be able to do a certain amount without reading the documentation. If you come up with your own names, users must discover and remember the new names, which is an added cognitive burden.
Leverage Existing Functionality. Any other functions that make use of the generics you provide methods for can then automatically work with your objects as well; otherwise, they would have to be changed. For example, AIC uses logLik.
A disadvantage is that the generic involves the extra level of dispatch and if logLik is in the inner loop of an optimization there could be an impact (although possibly not material). In that case you could check the performance of calling the generic vs. calling the method directly and use the latter if it makes a significant difference.
If your function has a completely different purpose than the generic in the core of R, it might be more confusing than helpful, so in that case you might not create a method but keep your own function name.
You might want to read the zoo Design manual (see link to zoo Design under Vignettes near the bottom of that page) which discusses the design ideas that went into the zoo package. These include the idea being discussed here.
EDIT: Added disadvantages.
Good question.
I'll split your question into two parts; here's the first one:
[I]s there really any other point in [making functions generic]?
Well, this pattern is usually invoked when the developer doesn't know the object class of every object he/she expects a user to pass in to the method under consideration.
Because of this uncertainty, this design pattern (akin to overloading in many other languages) is invoked, which requires R to evaluate the object's class and then dispatch the object to the appropriate method for that type.
The second part of your question: [I]n this case I shouldn't use [the generic function], right?
To give you an answer useful beyond the detail of your question, consider what happens to the original method when you call setGeneric, passing that method in.
The original function body is replaced with code for performing a top-level dispatch based on the type of the object passed in. The original body slides down one level, becoming the default method that the top-level (generic) function dispatches to.
showMethods() will let you see all of the methods called by the newly created dispatch function (the generic function).
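For readers more at home in Python than R, functools.singledispatch mimics that mechanism loosely (names hypothetical): the decorated body becomes the default method, and registered implementations are dispatched on the class of the first argument, much as S4 generics do.

from functools import singledispatch

@singledispatch
def log_lik(model):
    # Default method: runs when no more specific method is registered.
    raise NotImplementedError(f"no log_lik for {type(model).__name__}")

class SSM:
    pass    # stand-in for a state space model class

@log_lik.register
def _(model: SSM):
    return -123.4    # placeholder computation for SSM objects

print(log_lik(SSM()))   # dispatches to the SSM method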
And now for one huge disadvantage:
Ease of MISUse:
Users of your package already familiar with those generics might do a certain amount without reading the documentation.
And therein lies the fallacy that components, reusable objects, services, etc. are an easy panacea for all software challenges.
And why the overwhelming majority of software is buggy, bloated, and operates inconsistently, with little hope of tech support being able to diagnose your problem.
There WAS a reason for static linking and small executables back in the day. But this generation of "code now, get paid now, debug later if ever, before the layoffs/IPO come" has no memory of the days when code actually worked reliably, and when installation/integration didn't require $200/hr Big 4 consultants or hackers spending a week trying to get some "simple" open source product installed and running productively.
But if you want to continue the tradition of writing ever shorter function/method names, be my guest.

Why are getters prefixed with the word "get"?

Generally speaking, creating a fluid API is something that makes all programmers happy, both for the creators who write the interface and the consumers who program against it. Looking beyond conventions, why is it that we prefix all our getters with the word "get"? Omitting it usually results in a more fluid, easy-to-read set of instructions, which ultimately leads to happiness (however small or passive). Consider this very simple example. (pseudo code)
Conventional:
person = new Person("Joey")
person.getName().toLower().print()
Alternative:
person = new Person("Joey")
person.name().toLower().print()
Of course this only applies to languages where getters/setters are the norm, but it is not directed at any specific language. Were these conventions developed around technical limitations (disambiguation), or simply through the pursuit of a more explicit, intentional-feeling interface? Or is this just a case of a trickle-down norm? What are your thoughts? And how would simple changes to these conventions impact your happiness / daily attitude towards your craft (however minimally)?
Thanks.
Because, in languages without Properties, name() is a function. Without some more information though, it's not necessarily specific about what it's doing (or what it's going to return).
Functions/Methods are also supposed to be Verbs because they are performing some action. name() obviously doesn't fit the bill because it tells you nothing about what action it is performing.
getName() lets you know without a doubt that the method is going to return a name.
In languages with Properties, the fact that something is a Property expresses the same meaning as having get or set attached to it. It merely makes things look a little neater.
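A hedged Python sketch of that point (names hypothetical): with properties, the attribute syntax itself signals simple access, so no get prefix is needed and the "fluid" alternative from the question works directly.

class Person:
    def __init__(self, name):
        self._name = name

    @property
    def name(self):             # read access: person.name
        return self._name

    @name.setter
    def name(self, value):      # write access: person.name = ...
        self._name = value

person = Person("Joey")
print(person.name.lower())      # reads like the alternative above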
The best answer I have ever heard for using the get/set prefixes is as such:
If you didn't use them, both the accessor and mutator (getter and setter) would have the same name; thus, they would be overloaded. Generally, you should only overload a method when each implementation of the method performs a similar function (but with different inputs).
In this case, you would have two methods with the same name that performed very different functions, and that could be confusing to users of the API.
I always appreciate consistent get/set prefixing when working with a new API and its documentation. The automatic grouping of getters and setters when all functions are listed in alphabetical order greatly helps to distinguish simple data access from more advanced functionality.
The same is true when using intellisense/auto completion within the IDE.
What about the case where a property is named after a verb?
object.action()
Does this get the type of action to be performed, or execute the action? Adding get/set/do removes the ambiguity, which is always a good thing:
object.getAction()
object.setAction(action)
object.doAction()
In school we were taught to use get to distinguish methods from data structures. I never understood why the parens wouldn't be a tipoff. I'm of the personal opinion that overuse of get/set methods can be a horrendous time waster, and it's a phase I see a lot of object oriented programmers go through soon after they start.
I may not write much Objective-C, but since I learned it I've really come to love its conventions. The very thing you are asking about is addressed by the language.
Here's a Smalltalk answer, which I like most. One has to know a few rules about Smalltalk, BTW.
Fields are only accessible in the class where they are defined. If you don't write "accessors", you won't be able to do anything with them.
The convention there is: for an instance variable (let's name it instVar1), you write a method instVar1 that just returns its value, and a method instVar1: that sets it.
I like this convention much more than anything else. If you see a : somewhere, you can bet it's some kind of "setter", one way or the other.
Custom.
Plus, in C++, if you return a reference, that potentially leaks access into the internals of the class itself.
