Perl 6 calculate the average of an int array at once using reduce - functional-programming

I'm trying to calculate the average of an integer array using the reduce function in one step. I can't do this:
say (reduce {($^a + $^b)}, <1 2 3>) / <1 2 3>.elems;
because it calculates the average in 2 separate pieces.
I need to do it like:
say reduce {($^a + $^b) / .elems}, <1 2 3>;
but it doesn't work of course.
How to do it in one step? (Using map or some other function is welcomed.)

TL;DR This answer starts with an idiomatic way to write equivalent code before discussing P6 flavored "tacit" programming and increasing brevity. I've also added "bonus" footnotes about the hyperoperation Håkon++ used in their first comment on your question.5
Perhaps not what you want, but an initial idiomatic solution
We'll start with a simple solution.1
P6 has built in routines2 that do what you're asking. Here's a way to do it using built in subs:
say { sum($_) / elems($_) }(<1 2 3>); # 2
And here it is using corresponding3 methods:
say { .sum / .elems }(<1 2 3>); # 2
What about "functional programming"?
First, let's replace .sum with an explicit reduction:
.reduce(&[+]) / .elems
When & is used at the start of an expression in P6 you know the expression refers to a Callable as a first class citizen.
A longhand way to refer to the infix + operator as a function value is &infix:<+>. The shorthand way is &[+].
As you clearly know, the reduce routine takes a binary operation as an argument and applies it to a list of values. In method form (invocant.reduce) the "invocant" is the list.
The above code calls two methods -- .reduce and .elems -- that have no explicit invocant. This is a form of "tacit" programming; methods written this way implicitly (or "tacitly") use $_ (aka "the topic" or simply "it") as their invocant.
Topicalizing (explicitly establishing what "it" is)
given binds a single value to $_ (aka "it") for a single statement or block.
(That's all given does. Many other keywords also topicalize but do something else too. For example, for binds a series of values to $_, not just one.)
Thus you could write:
say .reduce(&[+]) / .elems given <1 2 3>; # 2
Or:
$_ = <1 2 3>;
say .reduce(&[+]) / .elems; # 2
But given that your focus is FP, there's another way that you should know.
Blocks of code and "it"
First, wrap the code in a block:
{ .reduce(&[+]) / .elems }
The above is a Block, and thus a lambda. Lambdas without a signature get a default signature that accepts one optional argument.
Now we could again use given, for example:
say do { .reduce(&[+]) / .elems } given <1 2 3>; # 2
But we can also just use ordinary function call syntax:
say { .reduce(&[+]) / .elems }(<1 2 3>)
Because a postfix (...) calls the Callable on its left, and because in the above case one argument is passed in the parens to a block that expects one argument, the net result is the same as the do4 and the given in the prior line of code.
Brevity with built ins
Here's another way to write it:
<1 2 3>.&{.sum/.elems}.say; #2
This calls a block as if it were a method. Imo that's still eminently readable, especially if you know P6 basics.
Or you can start to get silly:
<1 2 3>.&{.sum/$_}.say; #2
This is still readable if you know P6. The / is a numeric (division) operator. Numeric operators coerce their operands to be numeric. In the above $_ is bound to <1 2 3> which is a list. And in Perls, a collection in numeric context is its number of elements.
Changing P6 to suit you
So far I've stuck with standard P6.
You can of course write subs or methods and name them using any Unicode letters. If you want single letter aliases for sum and elems then go ahead:
my (&s, &e) = &sum, &elems;
But you can also extend or change the language as you see fit. For example, you can create user defined operators:
#| LHS ⊛ RHS.
#| LHS is an arbitrary list of input values.
#| RHS is a list of reducer function, then functions to be reduced.
sub infix:<⊛> (#lhs, *#rhs (&reducer, *#fns where *.all ~~ Callable)) {
reduce &reducer, #fns».(#lhs)
}
say <1 2 3> ⊛ (&[/], &sum, &elems); # 2
I won't bother to explain this for now. (Feel free to ask questions in the comments.) My point is simply to highlight that you can introduce arbitrary (prefix, infix, circumfix, etc.) operators.
And if custom operators aren't enough you can change any of the rest of the syntax. cf "braid".
Footnotes
1 This is how I would normally write code to do the computation asked for in the question. #timotimo++'s comment nudged me to alter my presentation to start with that, and only then shift gears to focus on a more FPish solution.
2 In P6 all built in functions are referred to by the generic term "routine" and are instances of a sub-class of Routine -- typically a Sub or Method.
3 Not all built in sub routines have correspondingly named method routines. And vice-versa. Conversely, sometimes there are correspondingly named routines but they don't work exactly the same way (with the most common difference being whether or not the first argument to the sub is the same as the "invocant" in the method form.) In addition, you can call a subroutine as if it were a method using the syntax .&foo for a named Sub or .&{ ... } for an anonymous Block, or call a method foo in a way that looks rather like a subroutine call using the syntax foo invocant: or foo invocant: arg2, arg3 if it has arguments beyond the invocant.
4 If a block is used where it should obviously be invoked then it is. If it's not invoked then you can use an explicit do statement prefix to invoke it.
5 Håkon's first comment on your question used "hyperoperation". With just one easy to recognize and remember "metaop" (for unary operations) or a pair of them (for binary operations), hyperoperations distribute an operation to all the "leaves"6 of a data structure (for an unary) or create a new one based on pairing up the "leaves" of a pair of data structures (for binary operations). NB. Hyperoperations are done in parallel7.
6 What is a "leaf" for a hyperoperation is determined by a combination of the operation being applied (see the is nodal trait) and whether a particular element is Iterable.
7 Hyperoperation is applied in parallel, at least semantically. Hyperoperation assumes8 that the operations on the "leaves" have no mutually interfering side-effects -- that is to say, that any side effect when applying the operation to one "leaf" can safely be ignored in respect to applying the operation to any another "leaf".
8 By using a hyperoperation the developer is declaring that the assumption of no meaningful side-effects is correct. The compiler will act on the basis it is, but will not check that it is true. In the safety sense it's like a loop with a condition. The compiler will follow the dev's instructions, even if the result is an infinite loop.

Here is an example using given and the reduction meta operator:
given <1 2 3> { say ([+] $_)/$_.elems } ;

Related

Use a recursive dop in APL

In the tryapl.org Primer when you select the Del glyph (∇) it shows an example of a Double-Del (∇∇) where a dop (as I understand it, a user defined operator) can reference itself in a recursive fashion:
Double-Del Syntax: dop self-reference
pow←{ ⍝ power operator: apply ⍵⍵ times
⍵⍵=0:⍵ ⍝ ⍵⍵ is 0: finished
⍺⍺ ∇∇(⍵⍵-1)⍺⍺ ⍵ ⍝ otherwise: recurse
}
Can someone provide me with some examples of this particular dop in use so I can see how to utilize it? I see that a single ⍺ is not used in the body of this dop, only ⍺⍺; does that make this a monadic dop?
I have tried a number of different ways to use this operator in an expression after it's defined and can't seem to get anything but errors or instances where it appears the body/text of the dop is in an array as text, alongside what I was trying to pass to it.
Example usage
incr←{1+⍵} ⍝ define an increment function
(incr pow 3) 5 ⍝ apply it 3 times to 5
8
Lack of single ⍺
The operator has ⍵⍵ which alone makes it a dyadic operator. The lack of a single ⍺ means that, when given the two required operands, it derives a monadic function.
Troubleshooting
When an operator takes an array (as opposed to a function) operand adjacent to an argument, it is crucial to separate the operand from the argument. This can be done in any of three ways:
Name the derived function and then apply it:add3←incr pow 3
add3 5
Parenthesise the derived function:(incr pow 3) 5
Separate the operand from the argument using a function (often, the identity function ⊢ is used): incr pow 3 ⊢ 5
Failing this, the intended operand and argument will strand to a single array, which then becomes the operand, leaving the derived function with no argument that it can apply to:
incr pow (3 5)
The result is therefore the derived function; the text source you see reported.

What are some different ways to do a for loop in Julia 1.0+?

I am looking for different ways of writing for loops in Julia! I know this is a basic question but I'm wondering what some of the different options are and if there are advantages/disadvantages with respect to performance.
For loop
Pro: fully flexible has break and continue
Con: no return, must specify iterator at start
While loop
Pro: fully flexible has break and continue
Con: no return, if iterator must be handled manually
Label+goto
Please don't use this for loops
Generator comprehension/Vector comprehension
Pro: Has return value, continue is expressed with filter clause, comes in lazy (generator) and eager forms (vector), can create multidimensional return vale
Con: really ugly for anything long, no break
Broadcast
Pro: express transform of multiple input susictly, has return value with output structure matching what it should be. Can be expressed with just a dot and supports loop fusion.
Con: no break no contine. Writing body means writing a function. Wrapping things you want to broadcast as scalar in Ref is a bit ugly
Map/pmap/asyncmap
Written in do-block form
Pro: can easily change to run distributed or asynchronously, had a return value
Con: no break, no continue
foreach function
It is a lot like map but no return value. So save on allocating that.
Other than that same pros and cons
This is straight from the Julia docs:
The for loop makes common repeated evaluation idioms easier to write. Since counting up and down like the above while loop does is so common, it can be expressed more concisely with a for loop:
julia> for i = 1:5
println(i)
end
1
2
3
4
5
Here the 1:5 is a range object, representing the sequence of numbers 1, 2, 3, 4, 5. The for loop iterates through these values, assigning each one in turn to the variable i. One rather important distinction between the previous while loop form and the for loop form is the scope during which the variable is visible. If the variable i has not been introduced in another scope, in the for loop form, it is visible only inside of the for loop, and not outside/afterward. You'll either need a new interactive session instance or a different variable name to test this:
julia> for j = 1:5
println(j)
end
1
2
3
4
5
julia> j
ERROR: UndefVarError: j not defined
See Scope of Variables for a detailed explanation of the variable scope and how it works in Julia.
In general, the for loop construct can iterate over any container. In these cases, the alternative (but fully equivalent) keyword in or ∈ is typically used instead of =, since it makes the code read more clearly:
julia> for i in [1,4,0]
println(i)
end
1
4
0
julia> for s ∈ ["foo","bar","baz"]
println(s)
end
foo
bar
baz
Various types of iterable containers will be introduced and discussed in later sections of the manual (see, e.g., Multi-dimensional Arrays).

Simple example of call-by-need

I'm trying to understand the theorem behind "call-by-need." I do understand the definition, but I'm a bit confused. I would like to see a simple example which shows how call-by-need works.
After reading some previous threads, I found out that Haskell uses this kind of evaluation. Are there any other programming languages which support this feature?
I read about the call-by-name of Scala, and I do understand that call-by-name and call-by-need are similar but different by the fact that call-by-need will keep the evaluated value. But I really would love to see a real-life example (it does not have to be in Haskell), which shows call-by-need.
The function
say_hello numbers = putStrLn "Hello!"
ignores its numbers argument. Under call-by-value semantics, even though an argument is ignored, the parameter at the function call site may need to be evaluated, perhaps because of side effects that the rest of the program depends on.
In Haskell, we might call say_hello as
say_hello [1..]
where [1..] is the infinite list of naturals. Under call-by-value semantics, the CPU would run off trying to build an infinite list and never get to the say_hello at all!
Haskell merely outputs
$ runghc cbn.hs
Hello!
For less dramatic examples, the first ten natural numbers are
ghci> take 10 [1..]
[1,2,3,4,5,6,7,8,9,10]
The first ten odds are
ghci> take 10 $ filter odd [1..]
[1,3,5,7,9,11,13,15,17,19]
Under call-by-need semantics, each value — even a conceptually infinite one as in the examples above — is evaluated only to the extent required and no more.
update: A simple example, as asked for:
ff 0 = 1
ff 1 = 1
ff n = go (ff (n-1))
where
go x = x + x
Under call-by-name, each invocation of go evaluates ff (n-1) twice, each for each appearance of x in its definition (because + is strict in both arguments, i.e. demands the values of the both of them).
Under call-by-need, go's argument is evaluated at most once. Specifically, here, x's value is found out only once, and reused for the second appearance of x in the expression x + x. If it weren't needed, x wouldn't be evaluated at all, just as with call-by-name.
Under call-by-value, go's argument is always evaluated exactly once, prior to entering the function's body, even if it isn't used anywhere in the function's body.
Here's my understanding of it, in the context of Haskell.
According to Wikipedia, "call by need is a memoized variant of call by name where, if the function argument is evaluated, that value is stored for subsequent uses."
Call by name:
take 10 . filter even $ [1..]
With one consumer the produced value disappears after being produced so it might as well be call-by-name.
Call by need:
import qualified Data.List.Ordered as O
h = 1 : map (2*) h <> map (3*) h <> map (5*) h
where
(<>) = O.union
The difference is, here the h list is reused by several consumers, at different tempos, so it is essential that the produced values are remembered. In a call-by-name language there'd be much replication of computational effort here because the computational expression for h would be substituted at each of its occurrences, causing separate calculation for each. In a call-by-need--capable language like Haskell the results of computing the elements of h are shared between each reference to h.
Another example is, most any data defined by fix is only possible under call-by-need. With call-by-value the most we can have is the Y combinator.
See: Sharing vs. non-sharing fixed-point combinator and its linked entries and comments (among them, this, and its links, like Can fold be used to create infinite lists?).

Replacing functions with Table Lookups

I've been watching this MSDN video with Brian Beckman and I'd like to better understand something he says:
Every imperitive programmer goes through this phase of learning that
functions can be replaced with table lookups
Now, I'm a C# programmer who never went to university, so perhaps somewhere along the line I missed out on something everyone else learned to understand.
What does Brian mean by:
functions can be replaced with table lookups
Are there practical examples of this being done and does it apply to all functions? He gives the example of the sin function, which I can make sense of, but how do I make sense of this in more general terms?
Brian just showed that the functions are data too. Functions in general are just a mapping of one set to another: y = f(x) is mapping of set {x} to set {y}: f:X->Y. The tables are mappings as well: [x1, x2, ..., xn] -> [y1, y2, ..., yn].
If function operates on finite set (this is the case in programming) then it's can be replaced with a table which represents that mapping. As Brian mentioned, every imperative programmer goes through this phase of understanding that the functions can be replaced with the table lookups just for performance reason.
But it doesn't mean that all functions easily can or should be replaced with the tables. It only means that you theoretically can do that for every function. So the conclusion would be that the functions are data because tables are (in the context of programming of course).
There is a lovely trick in Mathematica that creates a table as a side-effect of evaluating function-calls-as-rewrite-rules. Consider the classic slow-fibonacci
fib[1] = 1
fib[2] = 1
fib[n_] := fib[n-1] + fib[n-2]
The first two lines create table entries for the inputs 1 and 2. This is exactly the same as saying
fibTable = {};
fibTable[1] = 1;
fibTable[2] = 1;
in JavaScript. The third line of Mathematica says "please install a rewrite rule that will replace any occurrence of fib[n_], after substituting the pattern variable n_ with the actual argument of the occurrence, with fib[n-1] + fib[n-2]." The rewriter will iterate this procedure, and eventually produce the value of fib[n] after an exponential number of rewrites. This is just like the recursive function-call form that we get in JavaScript with
function fib(n) {
var result = fibTable[n] || ( fib(n-1) + fib(n-2) );
return result;
}
Notice it checks the table first for the two values we have explicitly stored before making the recursive calls. The Mathematica evaluator does this check automatically, because the order of presentation of the rules is important -- Mathematica checks the more specific rules first and the more general rules later. That's why Mathematica has two assignment forms, = and :=: the former is for specific rules whose right-hand sides can be evaluated at the time the rule is defined; the latter is for general rules whose right-hand sides must be evaluated when the rule is applied.
Now, in Mathematica, if we say
fib[4]
it gets rewritten to
fib[3] + fib[2]
then to
fib[2] + fib[1] + 1
then to
1 + 1 + 1
and finally to 3, which does not change on the next rewrite. You can imagine that if we say fib[35], we will generate enormous expressions, fill up memory, and melt the CPU. But the trick is to replace the final rewrite rule with the following:
fib[n_] := fib[n] = fib[n-1] + fib[n-2]
This says "please replace every occurrence of fib[n_] with an expression that will install a new specific rule for the value of fib[n] and also produce the value." This one runs much faster because it expands the rule-base -- the table of values! -- at run time.
We can do likewise in JavaScript
function fib(n) {
var result = fibTable[n] || ( fib(n-1) + fib(n-2) );
fibTable[n] = result;
return result;
}
This runs MUCH faster than the prior definition of fib.
This is called "automemoization" [sic -- not "memorization" but "memoization" as in creating a memo for yourself].
Of course, in the real world, you must manage the sizes of the tables that get created. To inspect the tables in Mathematica, do
DownValues[fib]
To inspect them in JavaScript, do just
fibTable
in a REPL such as that supported by Node.JS.
In the context of functional programming, there is the concept of referential transparency. A function that is referentially transparent can be replaced with its value for any given argument (or set of arguments), without changing the behaviour of the program.
Referential Transparency
For example, consider a function F that takes 1 argument, n. F is referentially transparent, so F(n) can be replaced with the value of F evaluated at n. It makes no difference to the program.
In C#, this would look like:
public class Square
{
public static int apply(int n)
{
return n * n;
}
public static void Main()
{
//Should print 4
Console.WriteLine(Square.apply(2));
}
}
(I'm not very familiar with C#, coming from a Java background, so you'll have to forgive me if this example isn't quite syntactically correct).
It's obvious here that the function apply cannot have any other value than 4 when called with an argument of 2, since it's just returning the square of its argument. The value of the function only depends on its argument, n; in other words, referential transparency.
I ask you, then, what the difference is between Console.WriteLine(Square.apply(2)) and Console.WriteLine(4). The answer is, there's no difference at all, for all intents are purposes. We could go through the entire program, replacing all instances of Square.apply(n) with the value returned by Square.apply(n), and the results would be the exact same.
So what did Brian Beckman mean with his statement about replacing function calls with a table lookup? He was referring to this property of referentially transparent functions. If Square.apply(2) can be replaced with 4 with no impact on program behaviour, then why not just cache the values when the first call is made, and put it in a table indexed by the arguments to the function. A lookup table for values of Square.apply(n) would look somewhat like this:
n: 0 1 2 3 4 5 ...
Square.apply(n): 0 1 4 9 16 25 ...
And for any call to Square.apply(n), instead of calling the function, we can simply find the cached value for n in the table, and replace the function call with this value. It's fairly obvious that this will most likely bring about a large speed increase in the program.

How are functions curried?

I understand what the concept of currying is, and know how to use it. These are not my questions, rather I am curious as to how this is actually implemented at some lower level than, say, Haskell code.
For example, when (+) 2 4 is curried, is a pointer to the 2 maintained until the 4 is passed in? Does Gandalf bend space-time? What is this magic?
Short answer: yes a pointer is maintained to the 2 until the 4 is passed in.
Longer than necessary answer:
Conceptually, you're supposed to think about Haskell being defined in terms of the lambda calculus and term rewriting. Lets say you have the following definition:
f x y = x + y
This definition for f comes out in lambda calculus as something like the following, where I've explicitly put parentheses around the lambda bodies:
\x -> (\y -> (x + y))
If you're not familiar with the lambda calculus, this basically says "a function of an argument x that returns (a function of an argument y that returns (x + y))". In the lambda calculus, when we apply a function like this to some value, we can replace the application of the function by a copy of the body of the function with the value substituted for the function's parameter.
So then the expression f 1 2 is evaluated by the following sequence of rewrites:
(\x -> (\y -> (x + y))) 1 2
(\y -> (1 + y)) 2 # substituted 1 for x
(1 + 2) # substituted 2 for y
3
So you can see here that if we'd only supplied a single argument to f, we would have stopped at \y -> (1 + y). So we've got a whole term that is just a function for adding 1 to something, entirely separate from our original term, which may still be in use somewhere (for other references to f).
The key point is that if we implement functions like this, every function has only one argument but some return functions (and some return functions which return functions which return ...). Every time we apply a function we create a new term that "hard-codes" the first argument into the body of the function (including the bodies of any functions this one returns). This is how you get currying and closures.
Now, that's not how Haskell is directly implemented, obviously. Once upon a time, Haskell (or possibly one of its predecessors; I'm not exactly sure on the history) was implemented by Graph reduction. This is a technique for doing something equivalent to the term reduction I described above, that automatically brings along lazy evaluation and a fair amount of data sharing.
In graph reduction, everything is references to nodes in a graph. I won't go into too much detail, but when the evaluation engine reduces the application of a function to a value, it copies the sub-graph corresponding to the body of the function, with the necessary substitution of the argument value for the function's parameter (but shares references to graph nodes where they are unaffected by the substitution). So essentially, yes partially applying a function creates a new structure in memory that has a reference to the supplied argument (i.e. "a pointer to the 2), and your program can pass around references to that structure (and even share it and apply it multiple times), until more arguments are supplied and it can actually be reduced. However it's not like it's just remembering the function and accumulating arguments until it gets all of them; the evaluation engine actually does some of the work each time it's applied to a new argument. In fact the graph reduction engine can't even tell the difference between an application that returns a function and still needs more arguments, and one that has just got its last argument.
I can't tell you much more about the current implementation of Haskell. I believe it's a distant mutant descendant of graph reduction, with loads of clever short-cuts and go-faster stripes. But I might be wrong about that; maybe they've found a completely different execution strategy that isn't anything at all like graph reduction anymore. But I'm 90% sure it'll still end up passing around data structures that hold on to references to the partial arguments, and it probably still does something equivalent to factoring in the arguments partially, as it seems pretty essential to how lazy evaluation works. I'm also fairly sure it'll do lots of optimisations and short cuts, so if you straightforwardly call a function of 5 arguments like f 1 2 3 4 5 it won't go through all the hassle of copying the body of f 5 times with successively more "hard-coding".
Try it out with GHC:
ghc -C Test.hs
This will generate C code in Test.hc
I wrote the following function:
f = (+) 16777217
And GHC generated this:
R1.p[1] = (W_)Hp-4;
*R1.p = (W_)&stg_IND_STATIC_info;
Sp[-2] = (W_)&stg_upd_frame_info;
Sp[-1] = (W_)Hp-4;
R1.w = (W_)&integerzmgmp_GHCziInteger_smallInteger_closure;
Sp[-3] = 0x1000001U;
Sp=Sp-3;
JMP_((W_)&stg_ap_n_fast);
The thing to remember is that in Haskell, partially applying is not an unusual case. There's technically no "last argument" to any function. As you can see here, Haskell is jumping to stg_ap_n_fast which will expect an argument to be available in Sp.
The stg here stands for "Spineless Tagless G-Machine". There is a really good paper on it, by Simon Peyton-Jones. If you're curious about how the Haskell runtime is implemented, go read that first.

Resources