Is there some rule for when to use two functions and when to pass a boolean parameter?
Thanks
It has been a while since I last re-read Code Complete, but I vaguely recall McConnell addressing this, and the words "disjunctive coherence" pop into my head. Briefly,
void f(int x, int y, bool b)
versus
void f1(int x, int y)
void f2(int x, int y)
is often a choice, and depending on how similar or different f would behave under true versus false, it may make sense to break it into two functions and give them distinct names. Often a third choice is better, which is to change the bool to a two-value enum, where the enum name makes the distinction clear.
The key is to look at the call-sites, and see if the meaning is clear just from reading the code. If you are tempted to put a comment on every boolean call-site:
f(3, 4, true /* absoluteWidgetMode */ )
and the call-sites usually call with boolean constants, that's a strong smell that you should break it up into multiple functions.
Boolean parameters are meaningless most of the time, deserving basically the same criticism that magic numbers do. You have no chance of understanding what is done just by looking at the function call.
So even if it's convenient to have a boolean parameter for two very similar code paths (appending to versus overwriting a file), keep it internal and private, and don't let it be visible in the interface.
Instead, always force the programmer to be explicit:
Use enumerations to give meaningful descriptions for the distinction or just use separate functions.
Compare:
WriteFile(path, "Hello, World", true)
with
WriteFile(path, "Hello, World", FileMode.Append)
or simply
AppendFile(path, "Hello, World")
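For instance, in Python the same idea might look like this (write_file and this FileMode enum are illustrative names I made up, not a real library API):

from enum import Enum

class FileMode(Enum):
    OVERWRITE = "w"
    APPEND = "a"

def write_file(path, text, mode=FileMode.OVERWRITE):
    # the enum's value doubles as the mode string for open()
    with open(path, mode.value) as f:
        f.write(text)

write_file("greeting.txt", "Hello, World", FileMode.APPEND)

The call site now reads unambiguously, and it's impossible to pass a meaningless true/false by accident.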
The following &greet function is pure, and can appropriately be marked with the is pure trait.
sub greet(Str:D $name) { say "Hello, $name" }
my $user = get-from-db('USER');
greet($user);
This one, however, is not:
sub greet {
    my $name = get-from-db('USER');
    say "Hello, $name"
}
greet();
What about this one, though?
sub greet(Str:D $name = get-from-db('USER')) { say "Hello, $name" }
greet();
From "inside" the function, it seems pure – when is parameters are bound to the same values, it always produces the same output, without side effects. But from outside the function, it seems impure – when called twice with the same argument, it can produce different return values. Which prospective does Raku/Rakudo take?
There are at least two strategies a language might take when implementing default values for parameters:
Treat the parameter default value as something that the compiler, upon encountering a call without enough arguments, should emit at the callsite in order to produce the extra argument to pass to the callee. This makes it possible to support default values for parameters without any explicit support for them in the calling conventions. It also, however, requires that you always know where the call is going at compile time (or at least know it accurately enough to insert the right default value; one can't, for example, give method overrides in subclasses different default values and expect that to work out).
Have a calling convention powerful enough that the callee can discover that a value was not passed for the parameter, and then compute the default value.
With its dynamic nature, only the second of these really makes sense for Raku, and so that is what it does.
In a language using strategy 1, it could arguably make sense to mark such a function as pure: since the code that calculates the default lives at each callsite, anything doing an analysis (and perhaps a transformation) based on purity will already have to deal with the code that evaluates the default value, and can see that it is not a source of a pure value.
Under strategy 2, and thus Raku, we should understand default values as an implementation detail of the block or routine that has the default in its signature. Thus if the code calculating the default value is impure, then the routine as a whole is impure, and so the is pure trait is not suitable.
More generally, the is pure trait is applicable if for a given argument capture we can always expect the same return value. In the example given, the argument capture \() contradicts this.
An alternative factoring here would be to use multi subs instead of parameter defaults, and to mark only one candidate with is pure.
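Raku aside, the shape of that factoring is easy to show in Python (greet_name and this toy DB are stand-in names of mine, mirroring the question's get-from-db): the pure core takes the name explicitly, and a thin impure wrapper supplies the database lookup.

DB = {"USER": "alice"}          # toy stand-in for the database

def get_from_db(key):
    return DB[key]              # impure in spirit: DB's contents may change

def greet_name(name):
    # the "pure" candidate: same argument, same behaviour, no hidden inputs
    return f"Hello, {name}"

def greet():
    # the impure candidate: consults the database, then delegates
    return greet_name(get_from_db("USER"))

Only greet_name is a candidate for purity-based optimizations; greet never is.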
When you say that a sub is pure, you are guaranteeing that any given input will always produce the same output. In your last example of sub greet, it looks to me that you cannot guarantee that for the default-value case, as the content of the database may change, or get-from-db may have side effects.
Of course, if you are sure that the database doesn't change, and there aren't any side-effects, you could still apply is pure to the sub, but why would you be using a database then?
Why would you mark a sub as is pure anyway? Well, it allows the compiler to constant-fold a call to a subroutine at compile time. Take e.g.:
sub foo($a) is pure {
    2 * $a
}
say foo(21); # 42
If you look at the code that is generated for this:
$ raku --target=optimize -e 'sub foo($a) is pure { 2 * $a }; say foo(21)'
then you will see this near the end:
│ │ - QAST::IVal(42)
The 42 is the constant folded call for foo(21). So this way the entire call is optimized away, because the sub was marked is pure and the parameter you provided was a constant.
If I want to specify that my function returns a Bool I do:
function myfunc(a,b)::Bool
What if I want to specify that I will return a vector of 4 Int32 elements?
a = Vector{Int32}(undef, 4)
You can't, and you don't have to.
The return type annotation is to declare the return type.
The length of a Vector is not part of its type;
it is part of its value, and it can change (e.g. push! can be called on it).
Notice:
julia> typeof([1,2,3,4])
Array{Int64,1}
(Vector{T} is just a constant for Array{T,1})
So all you would do is declare the type:
function myfunc(a,b)::Vector{Int}
Alternatively, you might want an NTuple{4,Int}, i.e. a Tuple{Int, Int, Int, Int},
or an SVector{4,Int} from StaticArrays.jl.
In general, return type annotation is not super useful.
It basically boils down to the code automatically calling convert(ReturnType, raw_return_value), which may error.
This can be helpful on occasion for making your code type-stable, if you lose track of what types are being returned from different return points (if you have multiple).
Rarely, it might help the compiler type-infer (since convert always returns the indicated target type).
Some argue it serves a documentation purpose also.
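To see the mechanism, here is a rough Python imitation (a sketch only; this returns decorator is my own toy helper, not anything from Julia or Python's standard library): the annotation behaves like a conversion applied to every return value, which fails when the conversion would be lossy.

def returns(T):
    # hypothetical helper mimicking Julia's `function f(...)::T`
    def deco(fn):
        def wrapper(*args, **kwargs):
            value = fn(*args, **kwargs)
            converted = T(value)
            if converted != value:
                # Julia's convert throws InexactError on lossy conversions
                raise TypeError(f"cannot convert {value!r} to {T.__name__} exactly")
            return converted
        return wrapper
    return deco

@returns(int)
def halve(x):
    return x / 2

print(halve(4))    # 2: the intermediate 2.0 converts exactly
# halve(5)         # raises, much as convert(Int, 2.5) throws in Julia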
Let's say I have a simple function like this:
int all_true(int* bools, int len) {
    if (len < 1) return TRUE;
    return *bools && all_true(bools+1, len-1);
}
This function can be rewritten in a more obviously tail-recursive style as follows:
int all_true(int* bools, int len) {
    if (len < 1) return TRUE;
    if (!*bools) return FALSE;
    return all_true(bools+1, len-1);
}
Logically, there is zero difference between the two; assuming bools contains only TRUE or FALSE (sensibly defined), they do exactly the same thing.
My question is: if a compiler is smart enough to optimize the second as a tail-recursive call, is it reasonable to expect it to optimize the first in the same way, given that "&&" short-circuits? Obviously, if a non-short-circuiting operator were used, this would not be tail-recursive because both expressions would be evaluated before the operator is even applied, but I'm curious about the short-circuited case.
(Before I get a flood of comments telling me that C compilers don't usually optimize tail-recursive calls: consider this to be a general question about optimizing tail-recursive calls with short-circuit operators, independent of language. I'll be happy to rewrite this in Scheme, Haskell, OCaml, F#, Python, or what the heck ever else for you if you don't understand C.)
Your question is really "how smart is the compiler?" but you don't state which compiler you are using.
Given a hypothetical reasonable compiler which converts source code to an intermediary flow graph before optimizations, both fragments of code that you have written could be represented in the same way (the && operator, while convenient to type, is not nearly as trivially compiled as the & operator; so I wouldn't be surprised if it gets expanded out in one phase on a hypothetical compiler). On that assumption, it is reasonable to assert that the answer to your question is "yes".
However, if you're actually going to rely on this, you should just test it with whatever compiler you happen to be using.
Generally, I have a headache because something is wrong with my reasoning:
For one set of arguments, a referentially transparent function will always return one set of output values.
That means that such a function could be represented as a truth table (a table where one set of output values is specified for each set of arguments).
That makes the logic behind such functions combinational (as opposed to sequential).
That means that with a purely functional language (one that has only referentially transparent functions) it is possible to describe only combinational logic.
The last statement is derived from this reasoning, but it's obviously false; that means there is an error in the reasoning. [Question: where is the error in this reasoning?]
UPD2. You guys are saying lots of interesting stuff, but not answering my question. I have defined it more explicitly now. Sorry for messing up the question definition!
Question: where is the error in this reasoning?
A referentially transparent function might require an infinite truth table to represent its behavior. You will be hard pressed to design an infinite circuit in combinational logic.
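To make the finite/infinite distinction concrete, here is a small Python illustration (majority is just an example function I picked): a pure function over a finite domain really can be dumped to a truth table, but the table's size is tied to the domain's size.

from itertools import product

def majority(a, b, c):
    # pure: the output depends only on the three inputs
    return (a and b) or (a and c) or (b and c)

table = {bits: majority(*bits) for bits in product([False, True], repeat=3)}
print(len(table))   # 8 rows, i.e. 2**3

# A pure function over an unbounded domain (say, factorial over all the
# naturals) would need infinitely many rows; no finite circuit realizes that.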
Another error: the behavior of sequential logic can be represented purely functionally as a function from states to states. The fact that in the implementation these states occur sequentially in time does not prevent one from defining a purely referentially transparent function which describes how state evolves over time.
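As a small sketch of that point (a toy example of mine, not from the question): a flip-flop, the canonical piece of sequential logic, can be written as a referentially transparent transition function, with time showing up only as the order in which states are threaded through it.

def step(state, pressed):
    # a toggle flip-flop: flips on each press; pure in (state, input)
    new_state = (not state) if pressed else state
    return new_state, new_state   # (next state, output)

state = False
for pressed in [True, False, True]:
    state, out = step(state, pressed)
    print(out)   # True, True, False: same (state, input) always gives the same result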
Edit: Although I apparently missed the bullseye on the actual question, I think my answer is pretty good, so I'm keeping it :-) (see below).
I guess a more concise way to phrase the question might be: can a purely functional language compute anything an imperative one can?
First of all, suppose you took an imperative language like C and made it so you can't alter variables after defining them. E.g.:
int i;
for (i = 0;  // okay, that's one assignment
     i < 10; // just looking, that's all
     i++)    // BUZZZ! Sorry, can't do that!
Well, there goes your for loop. Do we get to keep our while loop?
while (i < 10)
Sure, but it's not very useful. i can't change, so it's either going to run forever or not run at all.
How about recursion? Yes, you get to keep recursion, and it's still plenty useful:
int sum(int *items, unsigned int count)
{
    if (count) {
        // count the first item and sum the rest
        return *items + sum(items + 1, count - 1);
    } else {
        // no items
        return 0;
    }
}
Now, with functions, we don't alter state, but variables can, well, vary. Once a variable passes into our function, it's locked in. However, we can call the function again (recursion), and it's like getting a brand new set of variables (the old ones stay the same). Although there are multiple instances of items and count, sum((int[]){1,2,3}, 3) will always evaluate to 6, so you can replace that expression with 6 if you like.
Can we still do anything we want? I'm not 100% sure, but I think the answer is "yes". You certainly can if you have closures, though.
You have it right. The idea is, once a variable is defined, it can't be redefined. A referentially transparent expression, given the same variables, always yields the same result value.
I recommend looking into Haskell, a purely functional language. Haskell doesn't have an "assignment" operator, strictly speaking. For instance:
my_sum numbers = ??? where
    i = 0
    total = 0
Here, you can't write a "for loop" that increments i and total as it goes along. All is not lost, though. Just use recursion to keep getting fresh values of i and total:
my_sum numbers = f 0 0 where
    f i total =
        if i < length numbers
            then f i' total'
            else total
        where
            i' = i + 1
            total' = total + (numbers !! i)
(Note that this is a stupid way to sum a list in Haskell, but it demonstrates a method of coping with single assignment.)
Now, consider this highly imperative-looking code:
main = do
    a <- readLn
    b <- readLn
    print (a + b)
It's actually syntactic sugar for:
main =
    readLn >>= (\a ->
    readLn >>= (\b ->
    print (a + b)))
The idea is, instead of main being a function consisting of a list of statements, main is an IO action that Haskell executes, and actions are defined and chained together with bind operations. Also, an action that does nothing, yielding an arbitrary value, can be defined with the return function.
Note that bind and return aren't specific to actions. They can be used with any type that calls itself a Monad to do all sorts of funky things.
To clarify, consider readLn. readLn is an action that, if executed, would read a line from standard input and yield its parsed value. To do something with that value, we can't store it in a variable because that would violate referential transparency:
a = readLn
If this were allowed, a's value would depend on the world and would be different every time we called readLn, meaning readLn wouldn't be referentially transparent.
Instead, we bind the readLn action to a function that deals with the action, yielding a new action, like so:
readLn >>= (\x -> print (x + 1))
The result of this expression is an action value. If Haskell got off the couch and performed this action, it would read an integer, increment it, and print it. By binding the result of an action to a function that does something with the result, we get to keep referential transparency while playing around in the world of state.
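If it helps to see the shape of this outside Haskell, here is a deliberately crude Python model (pure and bind below are toy definitions of mine, not Haskell's real machinery): an action is just a zero-argument function, and bind builds a bigger action out of an action plus a function that decides what comes next.

def pure(value):
    # like Haskell's return: an action that does nothing and yields value
    return lambda: value

def bind(action, f):
    # build a new action: perform `action`, pass its result to `f`,
    # then perform the action that `f` returns
    def chained():
        return f(action())()
    return chained

greet = bind(lambda: input(), lambda name: pure("Hello, " + name))
# `greet` is still just a value; no input has been read yet.
# greet()   # only running the action reads a line and yields the greeting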
As far as I understand it, referential transparency just means: A given function will always yield the same result when invoked with the same arguments. So, the mathematical functions you learned about in school are referentially transparent.
A language you could check out in order to learn how things are done in a purely functional language would be Haskell. There are ways to use "updateable storage possibilities" like the Reader Monad, and the State Monad for example. If you're interested in purely functional data structures, Okasaki might be a good read.
And yes, you're right: order of evaluation in a purely functional language like Haskell does not matter the way it does in non-functional languages, because if there are no side effects, there is no reason to do something before or after something else, unless the input of one depends on the output of the other, or means like monads come into play.
I don't really know about the truth-table question.
Here's my stab at answering the question:
Any system can be described as a combinational function, large or small.
There's nothing wrong with the reasoning that pure functions can only deal with combinational logic; it's true, just that functional languages hide that from you to some extent or another.
You could even describe, say, the workings of a game engine as a truth table or a combinational function.
You might have a deterministic function that takes in "the current state of the entire game" as the RAM occupied by the game engine and the keyboard input, and returns "the state of the game one frame later". The return value would be determined by the combinations of the bits in the input.
Of course, in any meaningful and sane function, the input is parsed down to blocks of integers, decimals and booleans, but the combinations of the bits in those values still determine the output of your function.
Keep in mind also that basic digital logic can be described in truth tables. The only reason that isn't done for anything more than, say, arithmetic on 4-bit integers is that the size of the truth table grows exponentially.
The error in your reasoning is the following:
"That means that such a function could be represented as a truth table".
You conclude that from a functional language's property of referential transparency. So far the conclusion sounds plausible, but you overlook that a function is able to accept collections as input and process them, in contrast to the fixed inputs of a logic gate.
Therefore a function does not equal a logic gate, but is rather a construction plan for such a logic gate, depending on the actual input determined at runtime!
To comment on your comment: functional languages can, although stateless, implement a state machine by constructing the states from scratch each time they are accessed.
Closures are poor man's objects and vice versa.
I have seen this statement at many places on the web (including SO) but I don't quite understand what it means. Could someone please explain what it exactly means?
If possible, please include examples in your answer.
Objects are poor man's closures.
Consider Java. Java is an object-oriented programming language with no language level support for real lexical closures. As a work-around Java programmers use anonymous inner classes that can close over the variables available in lexical scope (provided they're final). In this sense, objects are poor man's closures.
Closures are poor man's objects.
Consider Haskell. Haskell is a functional language with no language level support for real objects. However they can be modeled using closures, as described in this excellent paper by Oleg Kiselyov and Ralf Lammel. In this sense, closures are poor man's objects.
If you come from an OO background, you'll probably find thinking in terms of objects more natural, and may therefore think of them as a more fundamental concept than closures. If you come from a FP background, you might find thinking in terms of closures more natural, and may therefore think of them as a more fundamental concept than objects.
Moral of the story is that closures and objects are ideas that are expressible in terms of each other, and none is more fundamental than the other. That's all there is to the statement under consideration.
In philosophy, this is referred to as model-dependent realism.
The point is that closures and objects accomplish the same goal: encapsulation of data and/or functionality in a single, logical unit.
For example, you might make a Python class that represents a dog like this:
class Dog(object):
    def __init__(self):
        self.breed = "Beagle"
        self.height = 12
        self.weight = 15
        self.age = 1

    def feed(self, amount):
        self.weight += amount / 5.0

    def grow(self):
        self.weight += 2
        self.height += .25

    def bark(self):
        print("Bark!")
And then I instantiate the class as an object
>>> Shaggy = Dog()
The Shaggy object has data and functionality built in. When I call Shaggy.feed(5), he gains a pound. That pound is stored in a variable that's an attribute of the object, which more or less means that it's in the object's internal scope.
If I was coding some Javascript, I'd do something similar:
var Shaggy = function() {
    var breed = "Beagle";
    var height = 12;
    var weight = 15;
    var age = 1;
    return {
        feed : function(amount) {
            weight += amount / 5.0;
        },
        grow : function() {
            weight += 2;
            height += .25;
        },
        bark : function() {
            window.alert("Bark!");
        },
        stats : function() {
            window.alert(breed + ", " + height + ", " + weight + ", " + age);
        }
    };
}();
Here, instead of creating a scope within an object, I've created a scope within a function and then called that function. The function returns a JavaScript object composed of some functions. Because those functions access data that was allocated in the local scope, the memory isn't reclaimed, allowing you to continue to use them through the interface provided by the closure.
An object, at its simplest, is just a collection of state and functions that operate on that state. A closure is also a collection of state and a function that operates on that state.
Let's say I call a function that takes a callback. In this callback, I need to operate on some state known before the function call. I can create an object that embodies this state ("fields") and contains a member function ("method") that performs as the callback. Or, I could take the quick and easy ("poor man's") route and create a closure.
As an object:
class CallbackState {
    object state;
    public CallbackState(object state) { this.state = state; }
    public void Callback() {
        // do something with state
    }
}

void Foo() {
    object state = GenerateState();
    CallbackState callback = new CallbackState(state);
    PerformOperation(callback.Callback);
}
This is pseudo-C#, but is similar in concept to other OO languages. As you can see, there's a fair amount of boilerplate involved with the callback class to manage the state. This would be much simpler using a closure:
void Foo() {
    object state = GenerateState();
    PerformOperation(() => { /* do something with state */ });
}
This is a lambda (again, in C# syntax, but the concept is similar in other languages that support closures) that gives us all the capabilities of the class, without having to write, use, and maintain a separate class.
You'll also hear the corollary: "objects are a poor man's closure". If I can't or won't take advantage of closures, then I am forced to do their work using objects, as in my first example. Although objects provide more functionality, closures are often a better choice where a closure will work, for the reasons already stated.
Hence, a poor man without objects can often get the job done with closures, and a poor man without closures can get the job done using objects. A rich man has both and uses the right one for each job.
EDITED: The title of the question does not include "vice versa" so I'll try not to assume the asker's intent.
The two common camps are functional vs imperative languages. Both are tools that can accomplish similar tasks in different ways with different sets of concerns.
Closures are poor man's objects.
Objects are poor man's closures.
Individually, each statement usually means the author has some bias, one way or another, usually rooted in comfort with one language or class of languages and discomfort with another. If not bias, they may be constrained by one environment or the other. The authors I read who say this sort of thing are usually the zealot, purist, or language-religious types. I avoid the language-religious types if possible.
Closures are poor man's objects. Objects are poor man's closures.
The author of that is a "pragmatist" and also pretty clever. It means the author appreciates both points of view and appreciates they are conceptually one and the same. This is my sort of fellow.
Just so much sugar, as closures hide anonymous objects under their skirts.
"Objects are a poor man's closures" isn't just a statement of some theoretical equivalence — it's a common Java idiom. It's very common to use anonymous classes to wrap up a function that captures the current state. Here's how it's used:
public void foo() {
    final String message = "Hey ma, I'm closed over!";
    SwingUtilities.invokeLater(new Runnable() {
        public void run() {
            System.out.println(message);
        }
    });
}
This even looks a lot like the equivalent code using a closure in another language. For example, using Objective-C blocks (since Objective-C is reasonably similar to Java):
void foo() {
    NSString *message = @"Hey ma, I'm closed over!";
    [[NSOperationQueue currentQueue] addOperationWithBlock:^{
        printf("%s\n", [message UTF8String]);
    }];
}
The only real difference is that the functionality is wrapped in the new Runnable() anonymous class instance in the Java version.
That objects can be used as a replacement for closures is quite easy to understand: you just place the captured state in the object and turn the call into a method. Indeed, C++ lambda closures, for example, are implemented as objects (things are sort of tricky for C++ because the language doesn't provide garbage collection, so true closures with mutable shared state are hard to implement correctly, given the lifetime of the captured context).
The opposite (closures can be used as objects) is less often observed, but it's IMO a very powerful technique... consider for example (Python):
def P2d(x, y):
    def f(cmd, *args):
        nonlocal x, y
        if cmd == "x": return x
        if cmd == "y": return y
        if cmd == "set-x": x = args[0]
        if cmd == "set-y": y = args[0]
    return f
The function P2d returns a closure that has captured the two values x and y. The closure then provides access for reading and writing them using a command. For example:
p = P2d(10, 20)
p("x") # --> 10
p("set-x", 99)
p("x") # --> 99
so the closure is behaving like an object; moreover, as every access goes through the command interface, it's very easy to implement delegation, inheritance, computed attributes, etc.
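For instance, building on P2d above, a computed attribute and delegation take only a few lines (P2dWithDist and the "dist" command are names I made up for this sketch):

import math

def P2dWithDist(x, y):
    parent = P2d(x, y)                  # the "superclass" instance we wrap
    def f(cmd, *args):
        if cmd == "dist":               # a computed attribute
            return math.hypot(parent("x"), parent("y"))
        return parent(cmd, *args)       # delegate everything else
    return f

q = P2dWithDist(3, 4)
q("dist")        # --> 5.0
q("set-x", 6)
q("dist")        # --> 7.21...; the delegated write is visible here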
The nice book "Let Over Lambda" builds on this idea using Lisp as the language, but any language that supports closures can use this technique (in Lisp the advantage is that you can also bend the syntax using macros and read macros to improve usability and automatically generate all the boilerplate code). The title of the book is exactly about this... a let wrapping a lambda:
(defun p2d (x y)
  (let ((x x) (y y))
    (lambda (cmd &rest args)
      (cond
        ((eq cmd 'x) x)
        ((eq cmd 'y) y)
        ((eq cmd 'set-x) (setq x (first args)))
        ((eq cmd 'set-y) (setq y (first args)))))))
Actually, I'm not sure I agree with the "poor" adjective in this approach.