Given the following structure:
class G {
    Node[] nodes;
}
class Node {
    Node neighbour;
}
The deep copy operations can be defined as:
function G copy(G g) {
    G r = new G();
    Map isom = new Map();
    for (Node node in g.nodes) {
        Node c = isom.get(node);
        if (c == null) {
            c = copy(node, isom);
            isom.put(node, c);
        }
        r.nodes.add(c);
    }
    return r;
}
function Node copy(Node n, Map isom) {
    Node r = isom.get(n);
    if (r == null) {
        r = new Node();
        isom.put(n, r);
        r.neighbour = copy(n.neighbour, isom);
    }
    return r;
}
My question is how to design a function copy(Node n, Map isom), such that it does not mutate the argument isom, in a functional programming style.
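One conventional answer (my own hedged sketch, not from the original post) is state-passing style: the copy function takes the map and returns an updated map alongside the result, so the caller's map is never mutated. In plain JavaScript, copying the Map on each update (a real persistent map would avoid the O(n) copy), it might look like this. Note that the freshly allocated node is still patched in place to close cycles; fully pure construction of cyclic structures requires laziness ("tying the knot").

```javascript
// Hedged sketch: state-passing copy. `isom` is never mutated; each call
// returns [copiedNode, newIsom]. copyNode is an illustrative name.
function copyNode(n, isom) {
  const seen = isom.get(n);
  if (seen !== undefined) return [seen, isom];  // already copied: reuse it
  const r = { neighbour: null };
  const isom2 = new Map(isom);                  // fresh map; the argument is untouched
  isom2.set(n, r);
  if (n.neighbour === null) return [r, isom2];
  const [c, isom3] = copyNode(n.neighbour, isom2);
  r.neighbour = c;   // local fix-up of the node we just allocated
  return [r, isom3];
}
```

A two-node cycle round-trips correctly: copying `a` (where `a.neighbour = b` and `b.neighbour = a`) yields a fresh pair with the same cyclic shape, and the map passed in is left empty.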
After posting this question, I did some serious investigation. My finding is that functional programming is not good at handling the popular graph algorithms. People with a purely functional flavour have to treat graphs in a way distinct from the normal literature, and that is what motivated works such as the following:
functional graph algorithms with depth-first search
graph algorithm lazy functional programming language
inductive graphs and functional graph algorithms
purely functional data structures
graph algorithms with a functional flavour
Graph algorithms have long been a challenge to program in a pure
functional language. Previous attempts have either tended to be
unreadable, or have failed to achieve standard asymptotic complexity
measures.
---John Launchbury. 1995. Graph Algorithms with a Functional Flavour. In Advanced Functional Programming, First International Spring School on Advanced Functional Programming Techniques-Tutorial Text, Johan Jeuring and Erik Meijer (Eds.). Springer-Verlag, London, UK, 308-331.
Is there a way to calculate the dominator tree from the Graph type using a more imperative approach? Does the language have support for creating such a data structure directly?
I'm trying to extract a dominator tree from a Graph using the following algorithm (here is the link for the original article):
But I'm having trouble adapting those for and while statements.
There are some choices to make, like for example how to represent the output dominator tree. One typical way is to choose Graph again. Later you could transform the Graph to a constructor tree if you like by another function.
Given that choice for Graph[&T], the following template could become a rather literal translation of the given algorithm into Rascal:
Graph[&T] dominators(Graph[&T] graph, &T root) {
    result = {};
    V = carrier(graph);
    Pred = graph<to,from>;
    solve(result) {
        for (v <- V, u <- Pred[v]) {
            if (...) {
                ...
            }
        }
    }
    return result;
}
However, it is unnecessary to go to the "pred" form of the graph by first inverting it and then continuously looking up predecessors; we can loop directly over the edges instead, and this is much faster:
Graph[&T] dominators(Graph[&T] graph, &T root) {
    result = {};
    solve(result) {
        for (<u, v> <- graph) { // u is the predecessor of v
            if (...) {
                result += { };
            }
        }
    }
    return result;
}
A basic fixed-point solver directly from the definition in the Dragon book (and also equation 3.2 in the thesis you cited). (Note: I just typed this in and haven't tested it, so it may be buggy.)
rel[&T, set[&T]] dominators(Graph[&T] graph) {
    nodes = carrier(graph);
    result = {};
    preds = graph<to,from>;
    solve(result) {
        for (n <- nodes) {
            result[n] = {n} + intersect({result[p] | p <- preds[n]?{}});
        }
    }
    return result;
}
(with intersect a library function from the Set module)
And here is a "relational calculus" solution, which solves the problem using the reachX library function and returns a relation from each node to the set of nodes it dominates (taken from Rascal documentation files):
rel[&T, set[&T]] dominators(rel[&T,&T] PRED, &T ROOT) {
    set[&T] VERTICES = carrier(PRED);
    return { <V, (VERTICES - {V, ROOT}) - reachX({ROOT}, {V}, PRED)> | &T V : VERTICES};
}
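For readers who want to experiment outside Rascal, here is a hedged plain-JavaScript sketch of the same Dragon-book fixpoint (my own illustration, not from the original answers; the names are assumptions):

```javascript
// Sketch of the dominator fixpoint. `edges` is an array of [from, to] pairs;
// returns a Map from each node to its Set of dominators.
function dominators(edges, root) {
  const nodes = new Set();
  const preds = new Map();
  for (const [u, v] of edges) {
    nodes.add(u);
    nodes.add(v);
    if (!preds.has(v)) preds.set(v, []);
    preds.get(v).push(u);
  }
  // Initialize: the root is dominated only by itself; every other node
  // starts out "dominated by everything" and is whittled down.
  const dom = new Map();
  for (const n of nodes) {
    dom.set(n, n === root ? new Set([root]) : new Set(nodes));
  }
  let changed = true;
  while (changed) {            // iterate to a fixed point, like solve(result)
    changed = false;
    for (const n of nodes) {
      if (n === root) continue;
      let acc = null;          // intersection of predecessors' dominator sets
      for (const p of preds.get(n) || []) {
        const dp = dom.get(p);
        acc = acc === null ? new Set(dp) : new Set([...acc].filter(x => dp.has(x)));
      }
      const next = acc === null ? new Set() : acc;
      next.add(n);
      // Sets only shrink from the all-nodes initialization, so a size
      // change suffices to detect progress.
      if (next.size !== dom.get(n).size) {
        dom.set(n, next);
        changed = true;
      }
    }
  }
  return dom;
}
```

On the diamond graph 1→2, 1→3, 2→4, 3→4 with root 1, this yields dom(4) = {1, 4}, since neither 2 nor 3 dominates 4.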
Let's say that I have a few functions that perform business logic on some data:
function addEmployees(data, numberOfNewEmployees) {
    // Business logic...
    data.employeeCount += numberOfNewEmployees;
    return data;
}

function withdrawFunds(data, withdrawAmount) {
    // Business logic...
    data.checkingAccount -= withdrawAmount;
    return data;
}

function completeAnOrder(data) {
    // Business logic...
    data.pendingOrders -= 1;
    return data;
}
Now, to do several operations on some data, I have something like this (Let's assume that data is passed by copy):
const data = {
    employeeCount: 5,
    checkingAccount: 5000,
    pendingOrders: 2
};
let newData = addEmployees(data, 2);
newData = withdrawFunds(newData, 2000);
newData = completeAnOrder(newData);
I was curious if there is an elegant method in the functional programming world to accomplish something closer to this:
const data = {
    employeeCount: 5,
    checkingAccount: 5000,
    pendingOrders: 2
};

let biz = createBiz(data);
const newData = biz.addEmployees(2)
    .withdrawFunds(2000)
    .completeAnOrder()
    .toValue();
In JavaScript I know that an object can return this, and that is how jQuery method chaining works. But is there an elegant method in the functional world to do something similar? I realize I may be trying to force an OOP idea into FP.
Is there a Monad that solves this problem? Does it make sense to create your own custom Monads for specific business logic?
This will heavily depend on the language and the tools the language has available.
In Clojure, which is homoiconic, tasks like this are often solved using macros. In this case, this would be accomplished using a "threading" macro.
Say I have your functions:
; All of these functions return the modified data
(defn add-employees [data number-of-new-employees]
  ...)

(defn withdraw-funds [data withdraw-amount]
  ...)

(defn complete-an-order [data]
  ...)
Since "this" (the data) is the first parameter, I can use -> to automatically "thread" the argument to each call:
(def data {:employee-count 5
           :checking-account 5000
           :pending-orders 2})

(-> data
    (add-employees 2)     ; The result of this gets passed as the first argument to withdraw-funds
    (withdraw-funds 2000) ; Then the result of this gets passed to complete-an-order...
    (complete-an-order)   ; Same as above
    (to-value))
After macro expansion, this basically gets turned into:
(to-value (complete-an-order (withdraw-funds (add-employees data 2) 2000)))
But it's much more readable and easier to change in the future using ->.
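A similar effect is possible in plain JavaScript without macros: a small pipe helper that threads a value left-to-right through a list of functions. This is a hedged sketch of my own, not part of the original answer; the curried addEmployees/withdrawFunds/completeAnOrder below are illustrative reworkings of the question's functions that return fresh objects instead of mutating.

```javascript
// pipe threads a value through functions left to right, much like Clojure's ->.
const pipe = (...fns) => x => fns.reduce((acc, f) => f(acc), x);

// Curried, non-mutating versions of the question's business functions
// (illustrative; each returns a fresh object via spread).
const addEmployees = n => data =>
  ({ ...data, employeeCount: data.employeeCount + n });
const withdrawFunds = amt => data =>
  ({ ...data, checkingAccount: data.checkingAccount - amt });
const completeAnOrder = data =>
  ({ ...data, pendingOrders: data.pendingOrders - 1 });

const data = { employeeCount: 5, checkingAccount: 5000, pendingOrders: 2 };
const newData = pipe(addEmployees(2), withdrawFunds(2000), completeAnOrder)(data);
// newData is { employeeCount: 7, checkingAccount: 3000, pendingOrders: 1 },
// and data itself is unchanged.
```

Because each step returns a new object, the chain reads top to bottom like the jQuery-style chaining the question asked about, while staying purely functional.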
You would use composition. In Haskell, if the operations are pure functions that operate on a structure and return a new structure, and not I/O operations, you might write that several different ways, such as: toValue . completeOrder . withdrawFunds 2000 . addEmployees 2 $ data. (You can also write it left-to-right using &.)
You’re more likely to see that example turned into stateful code with side-effects on an external database, though. In Haskell, this would use the abstraction of applicatives or monads, but most other functional languages wouldn’t be such sticklers for mathematical formalism. The applicative version lets you write something like runValue $ completeOrder <$> withdrawFunds 2000 <$> addEmployees 2 <$> data. Or you can write this as do blocks.
Facebook gives some real-world examples of how it does this for some of its database code. The imperative code:
NumCommonFriends(x, y) = Length(Intersect(FriendsOf(x), FriendsOf(y)))
has the applicative version
numCommonFriends x y =
  length <$> (intersect <$> friendsOf x <*> friendsOf y)
which can be written with some syntactic sugar as
numCommonFriends x y = do
  fx <- friendsOf x
  fy <- friendsOf y
  return (length (intersect fx fy))
I found an answer on SO that explained how to write a randomly weighted drop system for a game. I would prefer to write this code in a more functional-programming style but I couldn't figure out a way to do that for this code. I'll inline the pseudo code here:
R = (some random int);
T = 0;
for o in os
    T = T + o.weight;
    if T > R
        return o;
How could this be written in a style that's more functional? I am using CoffeeScript and underscore.js, but I'd prefer this answer to be language agnostic because I'm having trouble thinking about this in a functional way.
Here are two more functional versions in Clojure and JavaScript, but the ideas here should work in any language that supports closures. Basically, we use recursion instead of iteration to accomplish the same thing, and instead of breaking in the middle we just return a value and stop recursing.
Original pseudo code:
R = (some random int);
T = 0;
for o in os
    T = T + o.weight;
    if T > R
        return o;
Clojure version (objects are just treated as clojure maps):
(defn recursive-version
  [r objects]
  (loop [t 0
         others objects]
    (let [obj (first others)
          new_t (+ t (:weight obj))]
      (if (> new_t r)
        obj
        (recur new_t (rest others))))))
JavaScript version (using Underscore for convenience). Be careful, because this could blow out the stack; it is conceptually the same as the Clojure version.
var js_recursive_version = function(objects, r) {
  var main_helper = function(t, others) {
    var obj = _.first(others);
    var new_t = t + obj.weight;
    if (new_t > r) {
      return obj;
    } else {
      return main_helper(new_t, _.rest(others));
    }
  };
  return main_helper(0, objects);
};
You can implement this with a fold (aka Array#reduce, or Underscore's _.reduce):
An SSCCE:
items = [
  {item: 'foo', weight: 50}
  {item: 'bar', weight: 35}
  {item: 'baz', weight: 15}
]
r = Math.random() * 100
{item} = items.reduce (memo, {item, weight}) ->
  if memo.sum > r
    memo
  else
    {item, sum: memo.sum + weight}
, {sum: 0}
console.log 'r:', r, 'item:', item
You can run it many times at coffeescript.org and see that the results make sense :)
That being said, I find the fold a bit contrived: you have to remember both the selected item and the accumulated weight between iterations, and it doesn't short-circuit when the item is found.
Maybe a compromise solution between pure FP and the tedium of reimplementing a find algorithm can be considered (using _.find):
total = 0
{item} = _.find items, ({weight}) ->
  total += weight
  total > r
Runnable example.
I find (no pun intended) this algorithm much more accessible than the first one (and it should perform better, as it doesn't create intermediate objects, and it short-circuits).
Update/side-note: the second algorithm is not "pure" because the function passed to _.find is not referentially transparent (it has the side effect of modifying the external total variable), but the algorithm as a whole is referentially transparent. If you were to encapsulate it in a findItem = (items, r) -> function, that function would be pure and would always return the same output for the same input. That's very important, because it means you can get the benefits of FP while using some non-FP constructs (for performance, readability, or whatever reason) under the hood :D
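As a hedged illustration of that encapsulation point, in plain JavaScript rather than CoffeeScript and without Underscore, the whole scan can be wrapped so the mutation never escapes:

```javascript
// findItem is pure from the outside: same (items, r) always gives the same
// result, even though a local accumulator is mutated inside.
function findItem(items, r) {
  let total = 0;                       // local accumulator, never escapes
  for (const { item, weight } of items) {
    total += weight;
    if (total > r) return item;        // short-circuit once the threshold is passed
  }
  return undefined;                    // r was beyond the total weight
}

findItem([{item: 'foo', weight: 50},
          {item: 'bar', weight: 35},
          {item: 'baz', weight: 15}], 60);
// → 'bar' (50 is not > 60, but 50 + 35 = 85 is)
```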
I think the underlying task is randomly selecting 'events' (objects) from array os with a frequency defined by their respective weights. The approach is to map (i.e. search) a random number (with uniform distribution) onto the stairstep cumulative probability distribution function.
With positive weights, their cumulative sum increases from 0 to 1 (assuming the weights are normalized). The code you gave us simply searches starting from the 0 end. To maximize speed with repeated calls, precalculate the sums, and order the events so the largest weights come first.
It really doesn't matter whether you search with iteration (looping) or recursion. Recursion is nice in a language that tries to be 'purely functional' but doesn't help understanding the underlying mathematical problem. And it doesn't help you package the task into a clean function. The underscore functions are another way of packaging the iterations, but don't change the basic functionality. Only any and all exit early when the target is found.
For a small os array this simple search is sufficient. But with a large array, a binary search will be faster. Looking in Underscore, I find that sortedIndex uses this strategy. From Lo-Dash (an Underscore drop-in): "Uses a binary search to determine the smallest index at which the value should be inserted into array in order to maintain the sort order of the sorted array".
The basic use of sortedIndex is:
os = [{name: 'one',   weight: .7},
      {name: 'two',   weight: .25},
      {name: 'three', weight: .05}]
t = 0; cumweights = (t += o.weight for o in os)
i = _.sortedIndex(cumweights, R)
os[i]
You can hide the cumulative sum calculation with a nested function like:
osEventGen = (os) ->
  t = 0; xw = (t += y.weight for y in os)
  return (R) ->
    i = _.sortedIndex(xw, R)
    return os[i]

osEvent = osEventGen(os)
osEvent(.3)
# { name: 'one', weight: 0.7 }
osEvent(.8)
# { name: 'two', weight: 0.25 }
osEvent(.99)
# { name: 'three', weight: 0.05 }
In CoffeeScript, Jed Clinger's recursive search could be written like this:
foo = (x, r, t=0) ->
  [y, x...] = x
  t += y
  return [y, t] if x.length == 0 or t > r
  return foo(x, r, t)
A loop version using the same basic idea is:
foo = (x, r) ->
  t = 0
  while x.length and t <= r
    [y, x...] = x  # the [first, rest] split
    t += y
  y
Tests on jsPerf (http://jsperf.com/sortedindex) suggest that sortedIndex is faster when os.length is around 1000, but slower than the simple loop when the length is more like 30.
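To make the binary-search strategy concrete without depending on Underscore, here is a hedged plain-JavaScript sketch of the same precompute-then-bisect idea (makePicker is my own illustrative name):

```javascript
// Precompute cumulative weights once, then binary-search per draw.
// Assumes 0 <= r < total weight.
function makePicker(os) {
  const cum = [];
  let t = 0;
  for (const o of os) cum.push(t += o.weight);   // cumulative weights
  return function pick(r) {
    // Find the smallest i with cum[i] > r.
    let lo = 0, hi = cum.length;
    while (lo < hi) {
      const mid = (lo + hi) >> 1;
      if (cum[mid] > r) hi = mid;
      else lo = mid + 1;
    }
    return os[lo];
  };
}
```

With the example weights .7/.25/.05, the cumulative array is [0.7, 0.95, 1.0], so pick(0.3) lands on 'one', pick(0.8) on 'two', and pick(0.99) on 'three', matching the sortedIndex version above.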
Given a tree, how can I get a depth -> nodes map?
I’ve come this far in JavaScript but I’m not sure how to make the outer loop functional. I could probably do it using recursive inject but I’d rather avoid recursion.
function treeToLayers(root) {
  var layers = [[root]];
  var nextLayer = root.children;
  while (nextLayer.length > 0) {
    layers.push(nextLayer);
    var lastLayer = nextLayer;
    nextLayer = _(lastLayer).chain().
      pluck('children').
      flatten().
      value();
  }
  return layers;
}
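One way the outer while loop can be made functional is as an unfold. Here is a hedged plain-JavaScript sketch of my own (it does use recursion, which the question hoped to avoid, but recursion is the idiomatic functional replacement for the loop). It assumes, as in the question, that every node has a children array:

```javascript
// Each step takes the current layer and prepends it to the layers produced
// from the next layer (all children of the current layer, flattened).
function treeToLayers(root) {
  const step = layer =>
    layer.length === 0
      ? []
      : [layer, ...step(layer.flatMap(node => node.children))];
  return step([root]);
}
```

The depth -> nodes map is then just the array index: layers[d] holds the nodes at depth d.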
For some reason, I tend to associate closures with functional languages. I believe this is mostly because the discussions I've seen concerning closures are almost always in an environment focused on functional programming. That being said, the actual practical uses of closures I can think of are all non-functional in nature.
Are there practical uses of closures in functional languages, or is the association in my mind mostly because closures are used to program in a style that's also common to functional programming languages (first class functions, currying, etc)?
Edit: I should clarify that I'm referring to actual functional languages, meaning I was looking for uses that preserve referential transparency (for the same input you get the same output).
Edit: Adding a summary of what's been posted so far:
Closures are used to implement partial evaluation. Specifically, a function that takes two arguments can be called with one argument, which results in it returning a function that takes the remaining argument. Generally, the mechanism by which this second function "stores" the first value passed in is a closure.
Objects can be implemented using closures. A function is returned that closes over a number of variables and can then use them like object attributes. The function itself may return more functions, which act as object methods and also have access to these variables. As long as the variables aren't modified, referential transparency is maintained.
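A minimal sketch of the partial-evaluation point, in JavaScript (add and addFive are illustrative names of my own):

```javascript
// Calling add with one argument returns a closure that remembers it;
// the closure is the "storage" for the first argument.
const add = x => y => x + y;
const addFive = add(5);   // closure capturing x = 5
addFive(3);               // → 8
```

Since the captured variable is never modified, addFive is referentially transparent: it returns the same output for the same input, which is exactly the property the edit above asks for.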
I use lots of closures in JavaScript code (which is a pretty functional language -- I joke that it is Scheme in C clothing). They provide encapsulation of data that is private to a function.
The most ubiquitous example:
var generateId = function() {
  var id = 0;
  return function() {
    return id++;
  };
}();
window.alert(generateId());
window.alert(generateId());
But that's the "hello, world" of JavaScript closures. There are, however, many more practical uses.
Recently, in my job, I needed to code a simple photo gallery with sliders. It does something like:
var slide = function() {
  var photoSize = ...
  var ... // lots of calculations of sizes, distances to scroll, etc.
  var scroll = function(direction, amount) {
    // here we use some of the variables defined just above
    // (it will be returned, therefore it is a closure)
  };
  return {
    up: function() { scroll(1, photoSize); },
    down: function() { scroll(-1, photoSize); }
  };
}();
slide.up();
// actually the line above would have to be associated to some
// event handler to be useful
In this case I've used closures to hide all the up and down scrolling logic, and to get code that is very semantic: to "slide up" in JavaScript, you write slide.up().
One nice use for closures is building things like decision trees. You return a classify() function that tests whether to go down the left or right tree, and then calls either its leftClassify() or rightClassify() function depending on the input data. The leaf functions simply return a class label. I've actually implemented decision trees in Python and D this way before.
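As a hedged sketch of that decision-tree idea (in JavaScript rather than Python or D; leaf, node, and classify are illustrative names), each internal node is a closure over its test and its two sub-classifiers, and leaves just return a label:

```javascript
// A leaf ignores its input and returns a class label; an internal node
// closes over a test plus left/right sub-classifiers and dispatches.
const leaf = label => _x => label;
const node = (test, left, right) => x => (test(x) ? left(x) : right(x));

// Illustrative tree: classify a number by sign.
const classify = node(x => x < 0,
  leaf('negative'),
  node(x => x === 0, leaf('zero'), leaf('positive')));

classify(-5);  // → 'negative'
classify(0);   // → 'zero'
classify(3);   // → 'positive'
```

Since every closure captures only immutable bindings, the whole classifier stays referentially transparent.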
They're used for a lot of things. Take, for example, function composition:
let compose f g = fun x -> f (g x)
This returns a closure that uses the arguments from the function environment where it was created. Functional languages like OCaml and Haskell actually use closures implicitly all over the place. For example:
let flip f a b = f b a
Usually, this will be called as something like let minusOne = flip (-) 1 to create a function that will subtract 1 from its argument. This "partially applied" function is effectively the same as doing this:
let flip f a = fun b -> f b a
It returns a closure that remembers the two arguments you passed in and takes another argument of its own.
Closures can be used to simulate objects that can respond to messages and maintain their own local state. Here is a simple counter object in Scheme:
;; counter.ss
;; A simple counter that can respond to the messages
;; 'next and 'reset.
(define (create-counter start-from)
  (let ((value start-from))
    (lambda (message)
      (case message
        ((next) (set! value (add1 value)) value)
        ((reset) (set! value start-from))
        (else (error "Invalid message!"))))))
Sample usage:
> (load "counter.ss")
> (define count-from-5 (create-counter 5))
> (define count-from-0 (create-counter 0))
> (count-from-5 'next)
6
> (count-from-5 'next)
7
> (count-from-0 'next)
1
> (count-from-0 'next)
2
> (count-from-0 'reset)
> (count-from-0 'next)
1