Doubts with transitive closure in Alloy - recursion

I am building a model in Alloy to represent a subset of the Java language. Below are some elements of this model:
sig Method {
id : one MethodId,
param: lone Type,
return: one Type,
acc: lone Accessibility,
b: one Block
}
abstract sig Expression {}
abstract sig StatementExpression extends Expression {}
sig MethodInvocation extends StatementExpression{
pExp: lone PrimaryExpression,
id_methodInvoked: one Method,
param: lone Type
}
sig Block {
statements: set StatementExpression
}
pred noRecursiveMethodInvocationCall [] {
all bl:Block | all mi, mi2: MethodInvocation | all m:Method |
bl in m.b && mi in bl.statements
&& mi2 = mi.*(id_methodInvoked.b.statements) =>
m != mi2.id_methodInvoked
}
The problem is that the predicate noRecursiveMethodInvocationCall apparently is not working, since the generated instances contain methods invoked recursively (even indirectly, e.g. m1 invokes m2, which invokes m3, which in turn invokes m1), and I want to avoid recursion.
The instances are generated through another model, see below:
open javametamodel_withfield_final
one sig BRight, CRight, BLeft, CLeft, Test extends Class{
}
one sig F extends Field{}
fact{
BRight in CRight.extend
BLeft in CLeft.extend
F in BRight.fields
F in CLeft.fields
all c:{Class-BRight-CLeft} | F !in c.fields
}
pred law6RightToLeft[]{
proviso[]
}
pred proviso [] {
some BRight.extend
some BLeft.extend
#(extend.BRight) > 2
#(extend.BLeft) > 2
no cfi:FieldAccess | ( cfi.pExp.id_cf in extend.BRight || cfi.pExp.id_cf in BRight || cfi.pExp.id_cf in extend.BLeft || cfi.pExp.id_cf in BLeft) && cfi.id_fieldInvoked=F
some Method
}
run law6RightToLeft for 9 but 15 Id, 15 Type, 15 Class
Please, does anyone have a clue what the problem is?
Thanks in advance for your attention.
Follow-on query
Still regarding this question: the suggested predicate solves the recursion problem:
pred noRecursiveMethodInvocationCall [] {
no m:Method
| m in m.^(b.statements.id_methodInvoked)
}
However, it causes an inconsistency with another predicate (see below): no instances are generated when both predicates are present.
pred atLeastOneMethodInvocNonVoidMethods [] {
all m:Method
| some mi:MethodInvocation
| mi in (m.b).statements
}
Any idea why instances can not be generated with both predicates?

You might look closely at the condition
mi2 = mi.*(id_methodInvoked.b.statements)
which seems to check whether the set of all statements reachable recursively from mi is equal to the single statement mi2. Now, unless I've confused myself about multiplicities again, mi2 is a scalar, so in any case where the method in question has a block with more than one method-invocation statement, this condition won't fire and the predicate will be vacuously true.
Changing = to in may be the simplest fix, but in that case I expect you won't get any non-empty instances, because you're using * and getting reflexive transitive closure, and not ^ (positive transitive closure).
It looks at first glance as if the condition might be simplified to something like
pred noRecursion {
no m : Method
| m in m.^(b.statements.id_methodInvoked)
}
but perhaps I'm missing something.
Postscript: a later addition to the question asks why no instances are generated when the prohibition on recursion is combined with a requirement that every method contain at least one method invocation:
pred atLeastOneMethodInvocNonVoidMethods [] {
all m:Method
| some mi:MethodInvocation
| mi in (m.b).statements
}
Perhaps the simplest way to see what's wrong is to imagine constructing a call graph. The nodes of the graph are methods, and the arcs of the graph are method invocations. There is an arc from node M1 to node M2 if the body of method M1 contains an invocation of method M2.
If we interpret the two predicates in terms of the graph, the predicate noRecursiveMethodInvocationCall means that the graph is acyclic. The predicate atLeastOneMethodInvocNonVoidMethods means that every node in the graph has at least one outgoing arc.
Try it with a single method M. This method must contain a method invocation, and this method invocation must invoke M (since there is no other method in the universe). So we have an arc from M to M, and the graph has a cycle. But the graph is not allowed to have a cycle. So we cannot create a one-method universe that satisfies both predicates.
Try again with two methods, M1 and M2. Let M1 call M2. Now, what does M2 call? It can't call M1 without making a cycle. It can't call M2 without making a cycle. Again we fail.
I don't have time just now to look it up, but the relevant fact of graph theory is this: in a finite directed graph in which every node has at least one outgoing arc, there must be a cycle. (Follow outgoing arcs from any node; since there are only finitely many nodes, you must eventually revisit one.)
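That pigeonhole argument can be made concrete with a small Haskell sketch (invented names; for simplicity it assumes exactly one outgoing call per method):
import qualified Data.Set as Set

-- Follow the single outgoing call from each method. Because there are
-- only finitely many methods, some method must eventually repeat; the
-- repeated method is the start of a cycle in the call graph.
findCycleStart :: Ord a => (a -> a) -> a -> a
findCycleStart callee = go Set.empty
  where
    go seen m
      | m `Set.member` seen = m                        -- revisited: cycle found
      | otherwise = go (Set.insert m seen) (callee m)

-- Example: three methods 0, 1, 2, each calling the next one.
main :: IO ()
main = print (findCycleStart (\m -> (m + 1) `mod` 3) (0 :: Int)) -- prints 0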

Related

Initialization of Arcs depending on Sets/Subsets in directed graphs in CPLEX

I am dealing with a directed weighted graph and have a question about how to initialize a set, defined in the following:
Assume that the graph has the following nodes, which are subdivided into three different subsets.
//Subsets of Nodes
{int} Subset1= {44,99};
{int} Subset2={123456,123457,123458};
{int} Subset3={1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
{int} Nodes=Subset1 union Subset2 union Subset3;
Now there is a set of arcs H_j, where j is in Nodes; H_j gives all arcs outgoing from node j.
The arcs are stored in an Excel file with the following structure (screenshot of the sheet omitted):
For node 44 in Nodes (Subset 1), there are the arcs <44,123456>, <44,123457>, <44,123458>. For 66 in Nodes (Subset 2), there is no arc. Can somebody help me implement this?
What is important is that the code take its input from the Excel file, because in my real case there will be too much data for manual input... :(
Maybe there is a really easy solution for that. I would be very thankful!
Thank you so much in advance!
This addition refers to the answer from Alex Fleischer:
Your code also seems to work in the overall context.
I am trying to implement the following constraints within a maximization problem (the formulations (j,99) and (j,i) in the lower sum bounds represent arcs):
(screenshot of the mathematical formulation omitted)
I tried to implement it like this:
{int} TEST= {99};
subject to {
sum(m in M, j in a[m])x[<44,j>]==3;
sum(j in destPerOrig[99], t in TEST)x[<j,t>]==3;
forall(i in Nodes_wo_Subset1)
sum(j in destPerOrig[i],i in destPerOrig[i])x[<j,i>]==1;
}
M is a set of trains, and a[m] gives a specific cost value for each individual train. CPLEX shows 33 failure messages.
The most frequent ones are that it cannot extract x[<j,i>], sum(in in destPerOrig[i]), sum(j in destPerOrig[i], and that x and destPerOrig are outside of the valid area.
Most probably the problem is that I implement the constraints in the wrong manner. Again, it is a directed graph.
Referring to the mathematical formulation in the screenshot: Could the format of destPerOrig[i] be a problem?
At the moment destPerOrig[44] gives {2 3 4}. But shouldn't it give {<44 2> <44 3> <44 4>} to work within the mathematical formulation?
I hope that this is enough information for you to help me :(
I would be very thankful!
all arcs outgoing from Node j.
How to do this depends on how you store the adjacencies of the graph.
Perhaps you store a vector of arcs:
LOOP over arcs
IF arc source == node J
ADD to output
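For illustration, here is a direct rendering of that pseudocode in Haskell (assuming the arcs are stored as (source, destination) pairs; names invented):
-- All arcs outgoing from node j, given a stored list of arcs.
outgoing :: Int -> [(Int, Int)] -> [(Int, Int)]
outgoing j arcs = [ arc | arc@(src, _) <- arcs, src == j ] -- keep arcs whose source is j

main :: IO ()
main = print (outgoing 44 [(44, 123456), (44, 123457), (66, 2)])
-- [(44,123456),(44,123457)]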
.mod
tuple arcE
{
string o;
string d;
}
{arcE} arcsInExcel=...;
{int} orig={ intValue(a.o) | a in arcsInExcel};
{int} destPerOrig[o in orig]={intValue(a.d) | a in arcsInExcel : intValue(a.o)==o && a.d!="" };
execute
{
writeln(orig);
writeln("==>");
writeln(destPerOrig);
}
/*
which gives
{44 66}
==>
[{2 3 4} {}]
*/
https://github.com/AlexFleischerParis/oplexcel/blob/main/readarcs.mod
.dat
SheetConnection s("readarcs.xlsx");
arcsInExcel from SheetRead(s,"A2:B5");
https://github.com/AlexFleischerParis/oplexcel/blob/main/readarcs.dat

F# Recursive Tree Validation

This is a somewhat beginner question. I have been trying to validate the following type of FamilyTree. I can't find a simple way to do this. All help would be appreciated.
type BirthYear = int;;
type Tree = Person of BirthYear * Children
and Children = Tree list;;
I want to validate a given family tree such that every Person is older than their Children, and furthermore check if the list of Children is sorted in order of their age (eldest first). Preferably done with a function that returns a boolean. Something along the lines of this:
let rec validate (Person(x,child)) =
let vali = child |> List.forall (fun (y,_) -> y < x)
I'd do something like this:
let rec checkAges minBirth = function
    | Person(b, _) :: t -> b >= minBirth && checkAges b t
    | [] -> true

let rec validate (Person(b, c)) =
    List.forall validate c && checkAges (b + minParentAge) c
where minParentAge is set to a reasonable minimum age to have children at.
I'd expect checkAges to be the more difficult part here: the function checks whether the first child it sees was born no earlier than the limit it is given, then recursively checks the next child, with the current child's birth year as the new limit.
Note some techniques:
The function that checks child ages takes the minimum birthday as input; this is used to validate that the parent is old enough for the first child to be reasonable.
List.forall checks a predicate against all items in a list, and bails out early if the predicate is not fulfilled.
function is a shorthand to create a function that does pattern matching on its parameter. Therefore, checkAges actually has two arguments.
Here's a very simple solution using a single recursive function. It's not relying on built-in functions like List.forall but I think it's very declarative and (hopefully) easy to follow.
Rule 1: Every Person is older than their Children
Rule 2: List of Children is sorted in order of their age (eldest first)
Code:
let rec isValid = function
    | Person ( _ , []) -> true                // Person alone without childs -> always valid
    | Person (minYear, Person (year, childs) :: brothers) ->
        year > minYear &&                     // Validate Rules (either 1 or 2)
        isValid (Person (year, childs)) &&    // Enforce Rule 1
        isValid (Person (year, brothers))     // Enforce Rule 2
I personally don't feel List.forall fits well here; it helps to solve part of the problem but not the whole, so you need to combine it with more stuff (see the other answers), and in the end you can't avoid a recursive function.
List functions are good for lists, but for trees I find recursion more natural, unless your tree already provides a way to traverse it.
Here's a way to do it. Perhaps spending some time analyzing how this works will be helpful to you.
let rec check (Person(age, children)) =
    match children with
    | [] -> true
    | Person(eldest, _) :: _ ->
        // birth years must ascend (eldest first), and the parent
        // must be born before its eldest child
        Seq.pairwise children |> Seq.forall ((<||) (<))
        && age < eldest
        && List.forall check children

How should I implement a Cayley Table in Haskell?

I'm interested in generalizing some computational tools to use a Cayley table, meaning a lookup-table-based multiplication operation.
I could create a minimal implementation as follows:
data CayleyTable = CayleyTable {
ct_name :: ByteString,
ct_products :: V.Vector (V.Vector Int)
} deriving (Read, Show)
instance Eq (CayleyTable) where
(==) a b = ct_name a == ct_name b
data CTElement = CTElement {
ct_cayleytable :: CayleyTable,
ct_index :: !Int
}
instance Eq (CTElement) where
(==) a b = assert (ct_cayleytable a == ct_cayleytable b) $
ct_index a == ct_index b
instance Show (CTElement) where
show = ("CTElement" ++) . show . ctp_index
a **** b = assert (ct_cayleytable a == ct_cayleytable b) $
(ct_products (ct_cayleytable a) V.! ct_index a) V.! ct_index b
There are, however, numerous problems with this approach, starting with the run-time type checking via ByteString comparisons, and including the fact that read cannot be made to work correctly. Any idea how I should do this correctly?
I could imagine creating a family of newtypes CTElement1, CTElement2, etc. for Int with a CTElement typeclass that provides the multiplication and verifies their type consistency, except when doing IO.
Ideally, there might be some trick for passing around only one copy of this ct_cayleytable pointer too, perhaps using an implicit parameter like ?cayleytable, but this doesn't play nicely with multiple incompatible Cayley tables and gets generally obnoxious.
Also, I've gathered that an index into a vector can be viewed as a comonad. Is there any nice comonad instance for vector or whatever that might help smooth out this sort of type checking, even if ultimately doing it at runtime?
The thing you need to realize is that Haskell's type checker only checks types. So your CaleyTable needs to be a class.
class CaleyGroup g where
caleyTable :: g -> CaleyTable
... -- Any operations you cannot implement solely by knowing the caley table
data CayleyTable = CayleyTable {
...
} deriving (Read, Show)
If the caleyTable isn't known at compile time, you have to use rank-2 types, since the compiler needs to enforce the invariant that the CaleyTable exists whenever your code uses it.
manipWithCaleyTable :: Integral i => CaleyTable -> i -> (forall g. CaleyGroup g => g -> g) -> i
can be implemented, for example. It allows you to perform group operations using the CaleyTable. It works by combining i and the CaleyTable into a new type, which it passes to its third argument.
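A minimal sketch of how that could look (names and details assumed; the index type is specialized to Int for brevity):
{-# LANGUAGE RankNTypes #-}

-- A toy CaleyTable: products[i][j] is the index of element i * j.
newtype CaleyTable = CaleyTable { products :: [[Int]] }

class CaleyGroup g where
  mult :: g -> g -> g  -- an operation implementable from the table alone

-- Private pairing of a table with an index into it.
data Element = Element CaleyTable Int

instance CaleyGroup Element where
  mult (Element t i) (Element _ j) = Element t ((products t !! i) !! j)

-- Combine the index and table into an Element, run the polymorphic
-- group computation, and project the resulting index back out.
manipWithCaleyTable :: CaleyTable -> Int -> (forall g. CaleyGroup g => g -> g) -> Int
manipWithCaleyTable table i f =
  case f (Element table i) of Element _ j -> j

main :: IO ()
main = print (manipWithCaleyTable z2 1 (\g -> g `mult` g)) -- prints 0
  where z2 = CaleyTable [[0, 1], [1, 0]] -- the group Z/2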

How are closures used in functional languages

For some reason, I tend to associate closures with functional languages. I believe this is mostly because the discussions I've seen concerning closures are almost always in an environment focused on functional programming. That being said, the practical uses of closures I can think of are all non-functional in nature.
Are there practical uses of closures in functional languages, or is the association in my mind mostly because closures are used to program in a style that's also common to functional programming languages (first class functions, currying, etc)?
Edit: I should clarify that I'm referring to actual functional languages, meaning I was looking for uses that preserve referential transparency (for the same input you always get the same output).
Edit: Adding a summary of what's been posted so far:
Closures are used to implement partial application (often loosely called partial evaluation). Specifically, a function that takes two arguments can be called with one argument, which results in it returning a function that takes one argument. Generally, the mechanism by which this second function "stores" the first value passed into it is a closure (see the sketch after this list).
Objects can be implemented using closures. A function is returned that closes over a number of variables and can then use them like object attributes. The function itself may return more functions, which act as object methods and also have access to these variables. Assuming the variables aren't modified, referential transparency is maintained.
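A minimal Haskell sketch of the first point (invented toy names):
add :: Int -> Int -> Int
add x y = x + y

-- addFive is a closure produced by partial application: it has
-- captured x = 5 and waits for the remaining argument y.
addFive :: Int -> Int
addFive = add 5

main :: IO ()
main = print (addFive 3) -- prints 8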
I use lots of closures in JavaScript code (which is a pretty functional language; I joke that it is Scheme in C clothing). They provide encapsulation of data that is private to a function.
The most ubiquitous example:
var generateId = function() {
var id = 0;
return function() {
return id++;
}
}();
window.alert(generateId());
window.alert(generateId());
But that's the hello world of JavaScript closures. There are, however, many more practical uses.
Recently, in my job, I needed to code a simple photo gallery with sliders. It does something like:
var slide = function() {
var photoSize = ...
var ... // lots of calculations of sizes, distances to scroll, etc
var scroll = function(direction, amount) {
// here we use some of the variables defined just above
// (it will be returned, therefore it is a closure)
};
return {
up: function() { scroll(1, photoSize); },
down: function() { scroll(-1, photoSize); }
}
}();
slide.up();
// actually the line above would have to be associated to some
// event handler to be useful
In this case I've used closures to hide all the up and down scrolling logic, and the resulting code is very semantic: to "slide up" in JavaScript, you write slide.up().
One nice use for closures is building things like decision trees. You return a classify() function that tests whether to go down the left or right tree, and then calls either its leftClassify() or rightClassify() function depending on the input data. The leaf functions simply return a class label. I've actually implemented decision trees in Python and D this way before.
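That idea can be sketched in Haskell (invented names; each node is a closure over its test and subtree classifiers):
type Classifier a = a -> String

-- A leaf closes over its class label.
leaf :: String -> Classifier a
leaf label = \_ -> label

-- An inner node closes over its test and the two subtree classifiers,
-- delegating to whichever branch the test selects.
node :: (a -> Bool) -> Classifier a -> Classifier a -> Classifier a
node test ifTrue ifFalse = \x -> if test x then ifTrue x else ifFalse x

main :: IO ()
main = do
  let classify = node (< 10) (leaf "small") (node (< 100) (leaf "medium") (leaf "large"))
  mapM_ (putStrLn . classify) [5, 50, 500 :: Int] -- small, medium, large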
They're used for a lot of things. Take, for example, function composition:
let compose f g = fun x -> f (g x)
This returns a closure that uses the arguments from the function environment where it was created. Functional languages like OCaml and Haskell actually use closures implicitly all over the place. For example:
let flip f a b = f b a
Usually, this will be called as something like let minusOne = flip (-) 1 to create a function that will subtract 1 from its argument. This "partially applied" function is effectively the same as doing this:
let flip f a = fun b -> f b a
It returns a closure that remembers the two arguments you passed in and takes another argument of its own.
Closures can be used to simulate objects that can respond to messages and maintain their own local state. Here is a simple counter object in Scheme:
;; counter.ss
;; A simple counter that can respond to the messages
;; 'next and 'reset.
(define (create-counter start-from)
(let ((value start-from))
(lambda (message)
(case message
((next) (set! value (add1 value)) value)
((reset) (set! value start-from))
(else (error "Invalid message!"))))))
Sample usage:
> (load "counter.ss")
> (define count-from-5 (create-counter 5))
> (define count-from-0 (create-counter 0))
> (count-from-5 'next)
6
> (count-from-5 'next)
7
> (count-from-0 'next)
1
> (count-from-0 'next)
2
> (count-from-0 'reset)
> (count-from-0 'next)
1

What is 'Pattern Matching' in functional languages?

I'm reading about functional programming and I've noticed that Pattern Matching is mentioned in many articles as one of the core features of functional languages.
Can someone explain, for a Java/C++/JavaScript developer, what it means?
Understanding pattern matching requires explaining three parts:
Algebraic data types.
What pattern matching is
Why it's awesome.
Algebraic data types in a nutshell
ML-like functional languages allow you to define simple data types called "disjoint unions" or "algebraic data types". These data structures are simple containers and can be recursively defined. For example:
type 'a list =
| Nil
| Cons of 'a * 'a list
defines a stack-like data structure. Think of it as equivalent to this C#:
public abstract class Stack<T>
{
public class Nil : Stack<T> { }
public class Cons : Stack<T>
{
public readonly T Item1;
public readonly Stack<T> Item2;
public Cons(T item1, Stack<T> item2)
{
this.Item1 = item1;
this.Item2 = item2;
}
}
}
So the Cons and Nil identifiers each define a simple class, where the of x * y * z * ... defines a constructor and its data fields. The parameters to the constructor are unnamed; they're identified by position and data type.
You create instances of your 'a list type as follows:
let x = Cons(1, Cons(2, Cons(3, Cons(4, Nil))))
Which is the same as:
Stack<int> x = new Stack<int>.Cons(1, new Stack<int>.Cons(2, new Stack<int>.Cons(3, new Stack<int>.Cons(4, new Stack<int>.Nil()))));
Pattern matching in a nutshell
Pattern matching is a kind of type-testing. So let's say we created a stack object like the one above; we can implement methods to peek at and pop the stack as follows:
let peek s =
match s with
| Cons(hd, tl) -> hd
| Nil -> failwith "Empty stack"
let pop s =
match s with
| Cons(hd, tl) -> tl
| Nil -> failwith "Empty stack"
The methods above are equivalent (although not implemented as such) to the following C#:
public static T Peek<T>(Stack<T> s)
{
if (s is Stack<T>.Cons)
{
T hd = ((Stack<T>.Cons)s).Item1;
Stack<T> tl = ((Stack<T>.Cons)s).Item2;
return hd;
}
else if (s is Stack<T>.Nil)
throw new Exception("Empty stack");
else
throw new MatchFailureException();
}
public static Stack<T> Pop<T>(Stack<T> s)
{
if (s is Stack<T>.Cons)
{
T hd = ((Stack<T>.Cons)s).Item1;
Stack<T> tl = ((Stack<T>.Cons)s).Item2;
return tl;
}
else if (s is Stack<T>.Nil)
throw new Exception("Empty stack");
else
throw new MatchFailureException();
}
(Almost always, ML languages implement pattern matching without run-time type-tests or casts, so the C# code is somewhat deceptive. Let's brush implementation details aside with some hand-waving please :) )
Data structure decomposition in a nutshell
Ok, let's go back to the peek method:
let peek s =
match s with
| Cons(hd, tl) -> hd
| Nil -> failwith "Empty stack"
The trick is understanding that the hd and tl identifiers are variables (errm... since they're immutable, they're not really "variables", but "values" ;) ). If s has the type Cons, then we're going to pull its values out of the constructor and bind them to variables named hd and tl.
Pattern matching is useful because it lets us decompose a data structure by its shape instead of its contents. So imagine if we define a binary tree as follows:
type 'a tree =
| Node of 'a tree * 'a * 'a tree
| Nil
We can define some tree rotations as follows:
let rotateLeft = function
| Node(a, p, Node(b, q, c)) -> Node(Node(a, p, b), q, c)
| x -> x
let rotateRight = function
| Node(Node(a, p, b), q, c) -> Node(a, p, Node(b, q, c))
| x -> x
(The let rotateRight = function constructor is syntax sugar for let rotateRight s = match s with ....)
So in addition to binding a data structure to variables, we can also drill down into it. Let's say we have a node let x = Node(Nil, 1, Nil). If we call rotateLeft x, we test x against the first pattern, which fails to match because the right child has type Nil instead of Node. It'll move to the next pattern, x -> x, which matches any input and returns it unmodified.
For comparison, we'd write the methods above in C# as:
public abstract class Tree<T>
{
public abstract U Match<U>(Func<U> nilFunc, Func<Tree<T>, T, Tree<T>, U> nodeFunc);
public class Nil : Tree<T>
{
public override U Match<U>(Func<U> nilFunc, Func<Tree<T>, T, Tree<T>, U> nodeFunc)
{
return nilFunc();
}
}
public class Node : Tree<T>
{
readonly Tree<T> Left;
readonly T Value;
readonly Tree<T> Right;
public Node(Tree<T> left, T value, Tree<T> right)
{
this.Left = left;
this.Value = value;
this.Right = right;
}
public override U Match<U>(Func<U> nilFunc, Func<Tree<T>, T, Tree<T>, U> nodeFunc)
{
return nodeFunc(Left, Value, Right);
}
}
public static Tree<T> RotateLeft(Tree<T> t)
{
return t.Match(
() => t,
(l, x, r) => r.Match(
() => t,
(rl, rx, rr) => new Node(new Node(l, x, rl), rx, rr)));
}
public static Tree<T> RotateRight(Tree<T> t)
{
return t.Match(
() => t,
(l, x, r) => l.Match(
() => t,
(ll, lx, lr) => new Node(ll, lx, new Node(lr, x, r))));
}
}
For seriously.
Pattern matching is awesome
You can implement something similar to pattern matching in C# using the visitor pattern, but it's not nearly as flexible, because you can't effectively decompose complex data structures. Moreover, if you are using pattern matching, the compiler will tell you if you left out a case. How awesome is that?
Think about how you'd implement similar functionality in C# or languages without pattern matching. Think about how you'd do it without type-tests and casts at runtime. It's certainly not hard, just cumbersome and bulky. And you don't have the compiler checking to make sure you've covered every case.
So pattern matching helps you decompose and navigate data structures in a very convenient, compact syntax, and it enables the compiler to check the logic of your code, at least a little bit. It really is a killer feature.
Short answer: Pattern matching arises because functional languages treat the equals sign as an assertion of equivalence instead of assignment.
Long answer: Pattern matching is a form of dispatch based on the “shape” of the value that it's given. In a functional language, the datatypes that you define are usually what are known as discriminated unions or algebraic data types. For instance, what's a (linked) list? A linked list List of things of some type a is either the empty list Nil or some element of type a Consed onto a List a (a list of as). In Haskell (the functional language I'm most familiar with), we write this
data List a = Nil
            | Cons a (List a)
All discriminated unions are defined this way: a single type has a fixed number of different ways to create it; the creators, like Nil and Cons here, are called constructors. This means that a value of the type List a could have been created with two different constructors—it could have two different shapes. So suppose we want to write a head function to get the first element of the list. In Haskell, we would write this as
-- `head` is a function from a `List a` to an `a`.
head :: List a -> a
-- An empty list has no first item, so we raise an error.
head Nil = error "empty list"
-- If we are given a `Cons`, we only want the first part; that's the list's head.
head (Cons h _) = h
Since List a values can be of two different kinds, we need to handle each one separately; this is the pattern matching. In head x, if x matches the pattern Nil, then we run the first case; if it matches the pattern Cons h _, we run the second.
Short answer, explained: I think one of the best ways to think about this behavior is by changing how you think of the equals sign. In the curly-bracket languages, by and large, = denotes assignment: a = b means “make a into b.” In a lot of functional languages, however, = denotes an assertion of equality: let Cons a (Cons b Nil) = frob x asserts that the thing on the left, Cons a (Cons b Nil), is equivalent to the thing on the right, frob x; in addition, all variables used on the left become visible. This is also what's happening with function arguments: we assert that the first argument looks like Nil, and if it doesn't, we keep checking.
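To make that concrete (using Haskell's built-in list syntax rather than the List a defined above):
-- The '=' asserts that xs matches the pattern (a : b : _), i.e. that it
-- has at least two elements; a and b become bound, and the match fails
-- at runtime for shorter lists.
firstTwoSum :: [Int] -> Int
firstTwoSum xs =
  let (a : b : _) = xs
  in a + b

main :: IO ()
main = print (firstTwoSum [10, 20, 30]) -- 30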
It means that instead of writing
double f(int x, int y) {
if (y == 0) {
if (x == 0)
return NaN;
else if (x > 0)
return Infinity;
else
return -Infinity;
} else
return (double)x / y;
}
You can write
f(0, 0) = NaN;
f(x, 0) | x > 0 = Infinity;
| else = -Infinity;
f(x, y) = (double)x / y;
Hey, C++ supports pattern matching too.
static const int PositiveInfinity = -1;
static const int NegativeInfinity = -2;
static const int NaN = -3;
template <int x, int y> struct Divide {
enum { value = x / y };
};
template <bool x_gt_0> struct aux { enum { value = PositiveInfinity }; };
template <> struct aux<false> { enum { value = NegativeInfinity }; };
template <int x> struct Divide<x, 0> {
enum { value = aux<(x>0)>::value };
};
template <> struct Divide<0, 0> {
enum { value = NaN };
};
#include <cstdio>
int main () {
printf("%d %d %d %d\n", Divide<7,2>::value, Divide<1,0>::value, Divide<0,0>::value, Divide<-1,0>::value);
return 0;
};
Pattern matching is sort of like overloaded methods on steroids. The simplest case is roughly the same as what you see in Java: arguments are a list of types with names. The correct method to call is chosen based on the arguments passed in, and it doubles as an assignment of those arguments to the parameter names.
Patterns just go a step further and can destructure the arguments passed in even further. They can also use guards to match based on the value of an argument. To demonstrate, I'll pretend JavaScript had pattern matching.
function foo(a,b,c){} //no pattern matching, just a list of arguments
function foo2([a],{prop1:d,prop2:e}, 35){} //invented pattern matching in JavaScript
In foo2, it expects the first argument to be an array and binds its first element to a; it breaks apart the second argument, expecting an object with two props (prop1, prop2) and assigning the values of those properties to the variables d and e; and it expects the third argument to be 35.
Unlike in JavaScript, languages with pattern matching usually allow multiple functions with the same name, but different patterns. In this way it is like method overloading. I'll give an example in erlang:
fibo(0) -> 0 ;
fibo(1) -> 1 ;
fibo(N) when N > 0 -> fibo(N-1) + fibo(N-2) .
Blur your eyes a little and you can imagine this in javascript. Something like this maybe:
function fibo(0){return 0;}
function fibo(1){return 1;}
function fibo(N) when N > 0 {return fibo(N-1) + fibo(N-2);}
Point being that when you call fibo, the implementation it uses is based on the arguments, but where Java is limited to types as the only means of overloading, pattern matching can do more.
Beyond function overloading as shown here, the same principle can be applied in other places, such as case statements or destructuring assignments. JavaScript even has the latter in 1.7.
Pattern matching allows you to match a value (or an object) against some patterns to select a branch of the code. From the C++ point of view, it may sound a bit similar to the switch statement. In functional languages, pattern matching can be used for matching on standard primitive values such as integers. However, it is more useful for composed types.
First, let's demonstrate pattern matching on primitive values (using extended pseudo-C++ switch):
switch(num) {
case 1:
// runs this when num == 1
case n when n > 10:
// runs this when num > 10
case _:
// runs this for all other cases (underscore means 'match all')
}
The second use deals with functional data types such as tuples (which allow you to store multiple objects in a single value) and discriminated unions, which allow you to create a type that can contain one of several options. This sounds a bit like an enum, except that each label can also carry some values. In a pseudo-C++ syntax:
enum Shape {
Rectangle of { int left, int top, int width, int height }
Circle of { int x, int y, int radius }
}
A value of type Shape can now contain either a Rectangle with all the coordinates or a Circle with the center and the radius. Pattern matching allows you to write a function for working with the Shape type:
switch(shape) {
case Rectangle(l, t, w, h):
// declares variables l, t, w, h and assigns properties
// of the rectangle value to the new variables
case Circle(x, y, r):
// this branch is run for circles (properties are assigned to variables)
}
Finally, you can also use nested patterns that combine both of the features. For example, you could use Circle(0, 0, radius) to match for all shapes that have the center in the point [0, 0] and have any radius (the value of the radius will be assigned to the new variable radius).
This may sound a bit unfamiliar from the C++ point of view, but I hope that my pseudo-C++ makes the explanation clear. Functional programming is based on quite different concepts, so it makes better sense in a functional language!
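For comparison, here is roughly the same Shape example in a real functional language (Haskell); the last case described above is the nested pattern Circle 0 0 radius:
data Shape = Rectangle Int Int Int Int -- left, top, width, height
           | Circle Int Int Int        -- x, y, radius

describe :: Shape -> String
describe (Circle 0 0 radius) = "circle at the origin with radius " ++ show radius
describe (Circle _ _ r)      = "a circle of radius " ++ show r
describe (Rectangle _ _ w h) = "a " ++ show w ++ "x" ++ show h ++ " rectangle"

main :: IO ()
main = mapM_ (putStrLn . describe) [Circle 0 0 5, Circle 1 2 3, Rectangle 0 0 4 2]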
Pattern matching is where the interpreter for your language will pick a particular function based on the structure and content of the arguments you give it.
It is not only a functional-language feature; it is available in many different languages.
The first time I came across the idea was when I learned Prolog, where it is really central to the language.
e.g.
last([LastItem], LastItem).
last([Head|Tail], LastItem) :-
last(Tail, LastItem).
The above code will give the last item of a list. The input argument is the first; the result is the second.
If there is only one item in the list, the interpreter will pick the first version, and the second argument will be set equal to the first, i.e. a value will be assigned to the result.
If the list has both a head and a tail, the interpreter will pick the second version and recurse until there is only one item left in the list.
For many people, picking up a new concept is easier if some easy examples are provided, so here we go:
Let's say you have a list of three integers and want to add the first and the third element. Without pattern matching, you could do it like this (examples in Haskell):
Prelude> let is = [1,2,3]
Prelude> head is + is !! 2
4
Now, although this is a toy example, imagine we would like to bind the first and third integer to variables and sum them:
addFirstAndThird is =
  let first = head is
      third = is !! 2
  in first + third
This extraction of values from a data structure is what pattern matching does. You basically "mirror" the structure of something, giving variables to bind for the places of interest:
addFirstAndThird [first,_,third] = first + third
When you call this function with [1,2,3] as its argument, [1,2,3] will be unified with [first,_,third], binding first to 1, third to 3 and discarding 2 (_ is a placeholder for things you don't care about).
Now, if you only wanted to match lists with 2 as the second element, you can do it like this:
addFirstAndThird [first,2,third] = first + third
This will only work for lists with 2 as their second element and throw an exception otherwise, because no definition for addFirstAndThird is given for non-matching lists.
So far, we have used pattern matching only for destructuring bindings. Beyond that, you can give multiple definitions of the same function, where the first matching definition is used; thus, pattern matching is a little like "a switch statement on steroids":
addFirstAndThird [first,2,third] = first + third
addFirstAndThird _ = 0
addFirstAndThird will happily add the first and third elements of lists with 2 as their second element, and otherwise "fall through" and "return" 0. This "switch-like" functionality can be used not only in function definitions, e.g.:
Prelude> case [1,3,3] of [a,2,c] -> a+c; _ -> 0
0
Prelude> case [1,2,3] of [a,2,c] -> a+c; _ -> 0
4
Further, it is not restricted to lists, but can be used with other types as well, for example matching the Just and Nothing value constructors of the Maybe type in order to "unwrap" the value:
Prelude> case (Just 1) of (Just x) -> succ x; Nothing -> 0
2
Prelude> case Nothing of (Just x) -> succ x; Nothing -> 0
0
Sure, those were mere toy examples, and I did not even try to give a formal or exhaustive explanation, but they should suffice to grasp the basic concept.
You should start with the Wikipedia page that gives a pretty good explanation. Then, read the relevant chapter of the Haskell wikibook.
This is a nice definition from the above wikibook:
So pattern matching is a way of assigning names to things (or binding those names to those things), and possibly breaking down expressions into subexpressions at the same time (as we did with the list in the definition of map).
Here is a really short example that shows the usefulness of pattern matching:
Let's say you want to move an element up in a list:
["Venice","Paris","New York","Amsterdam"]
to (I've moved "New York" up)
["Venice","New York","Paris","Amsterdam"]
in a more imperative language you would write:
function up(city, cities){
for(var i = 0; i < cities.length; i++){
if(cities[i] === city && i > 0){
var prev = cities[i-1];
cities[i-1] = city;
cities[i] = prev;
}
}
return cities;
}
In a functional language you would instead write:
let up list value =
match list with
| [] -> []
| previous::current::tail when current = value -> current::previous::tail
| current::tail -> current::(up tail value)
As you can see, the pattern-matched solution has less noise; you can clearly see the different cases and how easy it is to traverse and de-structure our list.
I've written a more detailed blog post about it here.
