Translate binary string to mathematical expression - math

I've been experimenting with genetic algorithms lately, and now I'd like to build mathematical expressions out of the genomes (put simply, the aim is to find an expression that matches a certain outcome).
I have genomes consisting of genes, which are represented by bytes. One genome can look like this: {12, 127, 82, 35, 95, 223, 85, 4, 213, 228}. The length is not predefined (although it must fall in a certain range), and neither is the form it takes. That is, any entry can take any byte value.
Now the trick is to translate this into mathematical expressions. It's fairly easy to determine basic expressions. For example: pick the first 2 values and treat them as operands, treat the 3rd value as an operator (+, -, *, /, ^, mod), treat the 4th value as an operand, and treat the 5th value as an operator again, working on the result of the 3rd operator applied to the first 2 operands (or just handle it as a postfix expression).
The complexity rises when you start allowing priority rules. When, for example, the entry at index 2 represents a '(', you're bound to have a ')' somewhere further on, though not at entry 3, and not necessarily at entry 4.
Of course the same goes for many things: you can't end up with an operator at the end, you can't end up with a loose number, etc.
Now I could write a HUGE switch statement (for example) covering all the possibilities, but that would make the code unreadable. I was hoping someone out there knows a good strategy for taking this on.
Thanks in advance!
** EDIT **
On request: The goal I'm trying to achieve is to make an application which can resolve a function for a set of numbers. As for the example I've given in the comment below: {4, 11, 30} and it might come up with the function (X ^ 3) + X

Belisarius in a comment gave a link to an identical topic: Algorithm for permutations of operators and operands
My code:
private static double ResolveExpression(byte[] genes, double valueForX)
{
// following: https://stackoverflow.com/questions/3947937/algorithm-for-permutations-of-operators-and-operands/3948113#3948113
Stack<double> operandStack = new Stack<double>();
for (int index = 0; index < genes.Length; index++)
{
int genesLeft = genes.Length - index;
byte gene = genes[index];
bool createOperand;
// only when there are enough possible operators left, possibly add operands
if (genesLeft > operandStack.Count)
{
// only when there are at least 2 operands on the stack
if (operandStack.Count >= 2)
{
// randomly determine whether to create an operand by treating everything below 127 as an operand and the rest as an operator (better than / 2 due to 0 values)
createOperand = gene < byte.MaxValue / 2;
}
else
{
// else we need an operand for sure since an operator is illegal
createOperand = true;
}
}
else
{
// false for sure, since otherwise there are too many operands left to reduce
createOperand = false;
}
if (createOperand)
{
operandStack.Push(GeneToOperand(gene, valueForX));
}
else
{
// the top of the stack is the right-hand operand, so pop it first
double right = operandStack.Pop();
double left = operandStack.Pop();
double result = PerformOperator(gene, left, right);
operandStack.Push(result);
}
}
// should be 1 operand left on the stack which is the ending result
return operandStack.Pop();
}
private static double PerformOperator(byte gene, double left, double right)
{
// There are 6 options currently supported, namely: +, -, *, /, ^ and log (math)
int code = gene % 6;
switch (code)
{
case 0:
return left + right;
case 1:
return left - right;
case 2:
return left * right;
case 3:
return left / right;
case 4:
return Math.Pow(left, right);
case 5:
return Math.Log(left, right);
default:
throw new InvalidOperationException("Impossible state");
}
}
private static double GeneToOperand(byte gene, double valueForX)
{
// We only support numbers 0 - 9 and X
int code = gene % 11; // Get a value between 0 and 10
if (code == 10)
{
// 10 is a placeholder for x
return valueForX;
}
else
{
return code;
}
}
#endregion // Helpers
}

Use "post-fix" notation. That handles priorities very nicely.
Post-fix notation handles the "grouping" or "priority rules" trivially.
For example, the expression b**2-4*a*c, in post-fix is
b, 2, **, 4, a, *, c, *, -
To evaluate a post-fix expression, you simply push the values onto a stack and execute the operations.
So the above becomes something approximately like the following.
stack.push( b )
stack.push( 2 )
x, y = stack.pop(), stack.pop(); stack.push( y ** x )
stack.push( 4 )
stack.push( a )
x, y = stack.pop(), stack.pop(); stack.push( y * x )
stack.push( c )
x, y = stack.pop(), stack.pop(); stack.push( y * x )
x, y = stack.pop(), stack.pop(); stack.push( y - x )
To make this work, you need to partition your string of bytes into values and operators. You also need to check the "arity" of all your operators to be sure that the number of operators and the number of operands balance out. In this case, the number of binary operators + 1 is the number of operands. Unary operators don't require extra operands.
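The evaluation steps above can be sketched in Python as a small stack machine. This is only an illustration under stated assumptions: the token list, the BINARY_OPS table and the name eval_postfix are hypothetical and not part of the question's code, and only binary operators are handled.

```python
import operator

# Hypothetical operator table; every operator here is binary (arity 2).
BINARY_OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
    "**": operator.pow,
}


def eval_postfix(tokens, variables=None):
    """Evaluate a list of postfix tokens; raise ValueError if it is unbalanced."""
    variables = variables or {}
    stack = []
    for tok in tokens:
        if tok in BINARY_OPS:
            if len(stack) < 2:
                raise ValueError(f"operator {tok!r} needs two operands")
            right = stack.pop()  # the top of the stack is the right operand
            left = stack.pop()
            stack.append(BINARY_OPS[tok](left, right))
        elif tok in variables:
            stack.append(variables[tok])
        else:
            stack.append(float(tok))
    # Balanced input (binary operators + 1 == operands) leaves exactly one value.
    if len(stack) != 1:
        raise ValueError("operands and operators do not balance")
    return stack[0]


# b**2 - 4*a*c in postfix, with a=1, b=5, c=6
tokens = ["b", "2", "**", "4", "a", "*", "c", "*", "-"]
print(eval_postfix(tokens, {"a": 1, "b": 5, "c": 6}))  # 1.0
```

The arity check happens while evaluating: an operator that finds fewer than two values on the stack, or leftover values at the end, signals an unbalanced sequence.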

As ever with GA a large part of the solution is choosing a good representation. RPN (or post-fix) has already been suggested. One concern you still have is that your GA might throw up expressions which begin with operators (or mismatch operators and operands elsewhere) such as:
+,-,3,*,4,2,5,+,-
A (small) part of the solution would be to define evaluations for operand-less operators. For example one might decide that the sequence:
+
evaluates to 0, which is the identity element for addition. Naturally
*
would evaluate to 1. Mathematics may not have figured out what the identity element for division is, but APL has.
Now you have the basis of an approach which doesn't care if you get the right sequence of operators and operands, but you still have a problem when you have too many operands for the number of operators. That is, what is the interpretation of the following postfix sequence?
2,4,5,+,3,4,-
which (possibly) evaluates to
2,9,-1
Well, now you have to invent your own convention if you want to reduce this to a single value. But you could adopt the convention that the GA has created a vector-valued function.
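One possible reading of this idea, sketched in Python. The conventions here are assumptions made for illustration, not something the answer prescribes: a missing operand is replaced by the operator's identity element (0 for + and -, 1 for * and /), and whatever remains on the stack is returned as a vector. The names OPS and eval_tolerant are hypothetical.

```python
import operator

# Hypothetical table: operator token -> (function, identity element used
# in place of any operand that is missing from the stack).
OPS = {
    "+": (operator.add, 0),
    "-": (operator.sub, 0),
    "*": (operator.mul, 1),
    "/": (operator.truediv, 1),
}


def eval_tolerant(tokens):
    """Evaluate postfix tokens without ever rejecting a sequence.

    Returns the whole stack, so surplus operands yield a vector of values.
    """
    stack = []
    for tok in tokens:
        if tok in OPS:
            fn, identity = OPS[tok]
            right = stack.pop() if stack else identity
            left = stack.pop() if stack else identity
            stack.append(fn(left, right))
        else:
            stack.append(tok)
    return stack


print(eval_tolerant(["+"]))                      # [0]
print(eval_tolerant(["*"]))                      # [1]
print(eval_tolerant([2, 4, 5, "+", 3, 4, "-"]))  # [2, 9, -1]
```

With these conventions every genome evaluates to something, which removes the need to reject offspring, at the cost of having to decide what a multi-valued result means for fitness.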
EDIT: response to OP's comment ...
If a byte can represent either an operator or an operand, and if your program places no restrictions on where a genome can be split for reproduction, then there will always be a risk that the offspring represents an invalid sequence of operators and operands. Consider, instead of having each byte encode either an operator or an operand, a byte could encode an operator+operand pair (you might run out of bytes quickly so perhaps you'd need to use two bytes). Then a sequence of bytes might be translated to something like:
(plus 1)(plus x)(power 2)(times 3)
which could evaluate, following a left-to-right rule with a meaningful interpretation for the first term, to 3((x+1)^2)
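A minimal Python sketch of the operator+operand pair encoding, assuming two bytes per pair. The operator table, the starting accumulator of 0 and the function names are hypothetical; the operand decoding (0 to 9, with 10 standing for x) mirrors the question's GeneToOperand.

```python
import operator

# Hypothetical operator table: the first byte of a pair, modulo 4, picks one.
OPERATORS = [operator.add, operator.sub, operator.mul, operator.pow]


def decode_operand(gene, x):
    """Map a byte to a digit 0-9, or to x when the code is 10."""
    code = gene % 11
    return x if code == 10 else code


def evaluate_pairs(genome, x):
    """Apply (operator, operand) pairs left to right to an accumulator.

    Starting from 0 is an assumption that gives the first pair a meaning;
    a trailing unpaired byte is ignored.
    """
    acc = 0
    for i in range(0, len(genome) - 1, 2):
        op = OPERATORS[genome[i] % len(OPERATORS)]
        acc = op(acc, decode_operand(genome[i + 1], x))
    return acc


# (plus 1)(plus x)(power 2)(times 3), i.e. 3((x+1)^2), evaluated at x = 2
genome = [0, 1, 0, 10, 3, 2, 2, 3]
print(evaluate_pairs(genome, 2))  # 27
```

Because every pair is valid on its own, splitting a genome at any pair boundary during crossover always produces another evaluable genome.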

Related

Change the sign of one number to match the sign of another number

I'm not really sure what to search for on this.
If I have a variable A = 10. And another variable B. If B is negative I want to make A = -10. If B is positive I want A = 10.
Here is how I have been doing this quite often:
A = A * abs(B) / B
The obvious issue here is that if B is zero I get a divide by zero error.
Is there a better (preferably mathematical) way to accomplish this without the complexity of conditional statements?
Backstory. I am working with students in a graphical robotics programming language called Lego EV3.
The algorithm above looks like this:
Using a conditional statement it looks like this:
Quite the waste of space, especially when you are working on 13" laptop screens. And confusing.
Just to turn @MBo's comment into an official answer: many languages have a function called sign(x) or signum(x) that returns -1, 0, or 1 if x is negative, zero, or positive respectively, and another function abs(x) (for absolute value). The two can be used together to achieve your purpose:
A = abs(A) * sign(B)
will copy the sign from B to A if B ≠ 0. If B == 0 you will have to do something extra.
Many languages (C++, Java, Python) also have a straightforward copysign(x, y) function that does exactly what you want, returning x modified to have y's sign.
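As a quick illustration in Python, math.copysign from the standard library behaves as follows. The wrapper name with_sign_of is hypothetical and only there to make the calls readable; note that the result is always a float.

```python
import math


def with_sign_of(a, b):
    """Return the magnitude of a with the sign of b (hypothetical helper)."""
    return math.copysign(a, b)


print(with_sign_of(10, -3))    # -10.0
print(with_sign_of(10, 7))     # 10.0
print(with_sign_of(10, 0))     # 10.0, a plain zero counts as positive
print(with_sign_of(10, -0.0))  # -10.0, IEEE 754 negative zero keeps its sign
```

Unlike A * abs(B) / B, this never divides, so B = 0 does not raise an error.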
In many programming languages, a simple if statement would work:
A = 10;
if (B < 0) {
A = -1*A;
}
If your language supports ternary expressions, we could reduce the above to a single line:
A = B < 0 ? -1*A : A;
Another option might be to define a helper function:
reverseSign(A, B) {
if (B < 0) {
return -1*A;
}
else {
return A;
}
}
C99 has the (POSIX) function copysign that does just this. Fortran has had this for ages. It's also an IEEE 754 recommended function.

What does this extra '+' represent in this code? Recursive function

Problem:
A digital root is the recursive sum of all the digits in a number. Given n, take the sum of the digits of n. If that value has two digits, continue reducing in this way until a single-digit number is produced. This is only applicable to the natural numbers.
example:
digital_root(16)
=> 1 + 6
=> 7
This is a function that was coded:
function digital_root(n) {
if (n < 10) {
return n;
}
return digital_root( n.toString().split('').reduce( function (a, b) {
return a + +b;
}, 0));
}
Can someone clarify what the extra + is doing in this line of code? return a + +b;
It's probably a sneaky way of converting a string to an integer. You don't say what language this is, but many dynamic languages allow variables to be any type without declaration and use + for both addition and string concatenation, with implicit conversions between strings and numbers. Such languages make it easy to accidentally get the wrong thing (concatenating when you intend to add or vice versa).
However, using a unary + is (usually) a numeric identity, which will convert its argument to a number (if it happens to be a string -- it does nothing if the argument is already a number). So then the binary + will be add rather than concatenate.
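For contrast, here is the same function sketched in Python, where + never converts implicitly, so the string-to-number step that the unary + performs has to be spelled out with int(). This is a translation for illustration, not the original code.

```python
def digital_root(n: int) -> int:
    """Repeatedly sum the decimal digits of n until one digit remains."""
    if n < 10:
        return n
    # str(n) yields digit characters; int() plays the role of the unary +.
    return digital_root(sum(int(digit) for digit in str(n)))


print("1" + "6")            # 16 (string concatenation)
print(int("1") + int("6"))  # 7  (numeric addition)
print(digital_root(16))     # 7
```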

Can someone explain this code that recursively finds the minimum element in an array in C?

I don't quite understand this piece of code. So if for example n = 5 and we have:
array[5] = {13, 27, 78, 42, 69}
Would someone explain please?
All I understand is if n = 1, that is the lowest.
But when n = 5, we would get the 4th index and compare it to the 4th index and check which is the smallest and return the smallest, then take the 4th index and compare it to the 3rd index and check which one is the smallest and return the smallest? I am confused.
int min(int a, int b)
{
return (a < b) ? a: b;
}
// Recursively find the minimum element in an array, n is the length of the
// array, which you assume is at least 1.
int find_min(int *array, int n)
{
if(n == 1)
return array[0];
return min(array[n - 1], find_min(array, n - 1));
}
Given your array:
1. initial call: find_min(array, 5)
n!=1, therefore if() doesn't trigger
2. return(min(array[4], find_min(array, 4)))
n!=1, therefore if doesn't trigger
3. return(min(array[3], find_min(array,3)))
n!=1, therefore if doesn't trigger
4. return(min(array[2], find_min(array,2)))
n!=1, therefore if() doesn't trigger
5. return(min(array[1], find_min(array,1)))
n==1, so return array[0]
4. return(min(array[1], array[0]))
return(min(27, 13))
return(13)
3. return(min(array[2], 13))
etc...
It's quite simple. Run through the code using the example you gave.
On the first run through find_min(), it will return the minimum of the last element in the array (69) and the minimum of the rest of the array. To calculate the minimum of the rest of the array, it calls itself, i.e. it is recursive. This 2nd-level call will compare the number 42 (the new "last" element) with the minimum from the rest of the array, and so on. The final call to find_min() will have n=1 with the array "{13}", so it will return 13. The layer that called it will compare 13 with 27 and find that 13 is less so it will return it, and so on back up the chain.
Note: I assume the backward quotes in your code are not supposed to be there.
The solution uses recursion to compute the minimum for the smallest possible comparison set and comparing that result with the next bigger set of numbers. Each recursive call returns a result that is compared against the next element in a backward manner until the minimum value bubbles up to the top. Recursion appears to be tricky at first, but can be quite effective once you get familiar with it.
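To see the order in which the comparisons actually happen, here is a Python transcription of find_min with an optional trace list. The trace parameter is an addition for illustration and is not in the original C code.

```python
def find_min(array, n, trace=None):
    """Recursively find the minimum of the first n elements (n >= 1)."""
    if n == 1:
        return array[0]
    # The recursive call finishes first; only then is array[n - 1] compared.
    rest_min = find_min(array, n - 1, trace)
    if trace is not None:
        trace.append((array[n - 1], rest_min))
    return min(array[n - 1], rest_min)


trace = []
print(find_min([13, 27, 78, 42, 69], 5, trace))  # 13
print(trace)  # [(27, 13), (78, 13), (42, 13), (69, 13)]
```

The trace shows that nothing is compared on the way down: the first comparison is 27 against 13 at the deepest level, and 69 is compared last, as the outermost call returns.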

java 8: How to convert following code to functional?

Instead of using the for loop, how do I use the Stream API of Java 8 on array of booleans? How do I use methods such as forEach, reduce etc.?
I want to get rid of the two variables totalRelevant and retrieved which I am using to maintain state.
As in a lambda expression, we can only reference final variables from its lexical context.
import java.util.Arrays;
import java.util.List;
public class IRLab {
public static void main(String[] args) {
// predefined list of either document is relevant or not
List<Boolean> documentRelivency = Arrays.asList(true, false, true, true, false);
System.out.println("Precision\tRecall\tF-Measure");
// variables for output
double totalRelevant = 0.0;
double retrieved = 0.0;
for (int i = 0; i < documentRelivency.size(); ++i) {
Boolean isRelevant = documentRelivency.get(i);
// check if document is relevant
if (isRelevant) totalRelevant += 1;
// total number of retrieved documents will be equal to
// number of document being processed currently, i.e. retrieved = i + 1
retrieved += 1;
// storing values using formulas
double precision = totalRelevant / retrieved;
double recall = totalRelevant / totalRelevant;
double fmeasure = (2 * precision * recall) / (precision + recall);
// Printing the final calculated values
System.out.format("%9.2f\t%.2f\t%.2f\t\n", precision, recall, fmeasure);
}
}
}
How do I convert above code to functional code using the Java 8 Stream API and Lambda Expressions? I need to maintain state for two variables as above.
Generally, converting imperative code to functional code will only be an improvement when you manage to get rid of mutable state that causes the processing of one element to depend on the processing of the previous one.
There are workarounds that allow you to incorporate mutable state, but you should first try to find a different representation of your problem that works without it. In your example, the processing of each element depends on two values, totalRelevant and retrieved. The latter is just an ascending number and can therefore be represented as a range, e.g. IntStream.range(startValue, endValue). The former stems from your list of boolean values and is the number of true values among the first retrieved elements of the list.
You could recalculate that value without needing the previous value, but reiterating the list in each step could turn out to be expensive. So instead, collect your list into a single int number representing a bitset first, i.e. [true, false, true, true, false] becomes 0b10110. Then, you can get the number of one bits using intrinsic operations:
List<Boolean> documentRelivency = Arrays.asList(true, false, true, true, false);
int numBits=documentRelivency.size(), bitset=IntStream.range(0, numBits)
.map(i -> documentRelivency.get(i)? 1<<(numBits-i-1): 0).reduce(0, (i,j) -> i|j);
System.out.println("Precision\tRecall\tF-Measure");
IntStream.rangeClosed(1, numBits)
.mapToObj(retrieved -> {
double totalRelevant = Integer.bitCount(bitset&(-1<<(numBits-retrieved)));
return String.format("%9.2f\t%.2f\t%.2f",
totalRelevant/retrieved, 1f, 2/(1+retrieved/totalRelevant));
})
.forEach(System.out::println);
This way, you have expressed the entire operation in a functional way where the processing of one element does not depend on the previous one. It could even run in parallel, though this would offer no benefit here.
If the list size exceeds 32, you have to resort to long, or java.util.BitSet for more than 64.
But the whole operation is more an example of how to change the thinking from “this is a number I increment in each iteration” to “I’m processing a continuous range of values” and from “this is a number I increment when the element is true” to “this is the count of true values in a range of this list”.
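The same change of viewpoint can be sketched in Python. Note that this swaps the bitset trick for a running prefix count via itertools.accumulate, which is a different technique from the Java code above; the function names are hypothetical.

```python
from itertools import accumulate


def precision_by_prefix_count(relevance):
    """'The count of true values in a range of this list', one range per rank."""
    return [
        sum(relevance[:retrieved]) / retrieved
        for retrieved in range(1, len(relevance) + 1)
    ]


def precision_by_running_total(relevance):
    """Same values, but each prefix count is derived from a running total."""
    counts = accumulate(int(flag) for flag in relevance)
    return [count / retrieved for retrieved, count in enumerate(counts, start=1)]


relevance = [True, False, True, True, False]
print(precision_by_prefix_count(relevance))
print(precision_by_running_total(relevance))
```

The first version recomputes each prefix independently, so no step depends on the previous one; the second avoids the repeated summing while still keeping the state out of the calling code.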
It's unclear why you need to change your code to lambdas. Currently it's quite short and lambdas will not make it shorter or cleaner. However, if you really want to, you may encapsulate your shared state in a separate object:
static class Stats {
private int totalRelevant, retrieved;
public void add(boolean relevant) {
if(relevant)
totalRelevant++;
retrieved++;
}
public double getPrecision() {
return ((double)totalRelevant) / retrieved;
}
public double getRecall() {
return 1.0; // ??? was totalRelevant/totalRelevant in original code
}
public double getFMeasure() {
double precision = getPrecision();
double recall = getRecall();
return (2 * precision * recall) / (precision + recall);
}
}
And use with lambda like this:
Stats stats = new Stats();
documentRelivency.forEach(relevant -> {
stats.add(relevant);
System.out.format("%9.2f\t%.2f\t%.2f\t\n", stats.getPrecision(),
stats.getRecall(), stats.getFMeasure());
});
The lambda is here, but not the Stream API. Involving the Stream API for such a problem does not seem to be a very good idea, as you need to output the intermediate states of a mutable container that must be mutated strictly in the given order. Well, if you desperately need the Stream API, replace .forEach with .stream().forEachOrdered.

What is 'Pattern Matching' in functional languages?

I'm reading about functional programming and I've noticed that Pattern Matching is mentioned in many articles as one of the core features of functional languages.
Can someone explain for a Java/C++/JavaScript developer what does it mean?
Understanding pattern matching requires explaining three parts:
Algebraic data types.
What pattern matching is
Why it's awesome.
Algebraic data types in a nutshell
ML-like functional languages allow you to define simple data types called "disjoint unions" or "algebraic data types". These data structures are simple containers, and can be recursively defined. For example:
type 'a list =
| Nil
| Cons of 'a * 'a list
defines a stack-like data structure. Think of it as equivalent to this C#:
public abstract class Stack<T>
{
    public class Nil : Stack<T> { }
    public class Cons : Stack<T>
    {
        public readonly T Item1;
        public readonly Stack<T> Item2;
        public Cons(T item1, Stack<T> item2)
        {
            this.Item1 = item1;
            this.Item2 = item2;
        }
    }
}
So, the Cons and Nil identifiers each define a simple class, where the of x * y * z * ... defines a constructor and some data types. The parameters to the constructor are unnamed; they're identified by position and data type.
You create instances of your 'a list type as such:
let x = Cons(1, Cons(2, Cons(3, Cons(4, Nil))))
Which is the same as:
Stack<int> x = new Stack<int>.Cons(1, new Stack<int>.Cons(2, new Stack<int>.Cons(3, new Stack<int>.Cons(4, new Stack<int>.Nil()))));
Pattern matching in a nutshell
Pattern matching is a kind of type-testing. So let's say we created a stack object like the one above, we can implement methods to peek and pop the stack as follows:
let peek s =
match s with
| Cons(hd, tl) -> hd
| Nil -> failwith "Empty stack"
let pop s =
match s with
| Cons(hd, tl) -> tl
| Nil -> failwith "Empty stack"
The methods above are equivalent (although not implemented as such) to the following C#:
public static T Peek<T>(Stack<T> s)
{
if (s is Stack<T>.Cons)
{
T hd = ((Stack<T>.Cons)s).Item1;
Stack<T> tl = ((Stack<T>.Cons)s).Item2;
return hd;
}
else if (s is Stack<T>.Nil)
throw new Exception("Empty stack");
else
throw new MatchFailureException();
}
public static Stack<T> Pop<T>(Stack<T> s)
{
if (s is Stack<T>.Cons)
{
T hd = ((Stack<T>.Cons)s).Item1;
Stack<T> tl = ((Stack<T>.Cons)s).Item2;
return tl;
}
else if (s is Stack<T>.Nil)
throw new Exception("Empty stack");
else
throw new MatchFailureException();
}
(Almost always, ML languages implement pattern matching without run-time type-tests or casts, so the C# code is somewhat deceptive. Let's brush implementation details aside with some hand-waving please :) )
Data structure decomposition in a nutshell
Ok, let's go back to the peek method:
let peek s =
match s with
| Cons(hd, tl) -> hd
| Nil -> failwith "Empty stack"
The trick is understanding that the hd and tl identifiers are variables (errm... since they're immutable, they're not really "variables", but "values" ;) ). If s has the type Cons, then we're going to pull its values out of the constructor and bind them to variables named hd and tl.
Pattern matching is useful because it lets us decompose a data structure by its shape instead of its contents. So imagine if we define a binary tree as follows:
type 'a tree =
| Node of 'a tree * 'a * 'a tree
| Nil
We can define some tree rotations as follows:
let rotateLeft = function
| Node(a, p, Node(b, q, c)) -> Node(Node(a, p, b), q, c)
| x -> x
let rotateRight = function
| Node(Node(a, p, b), q, c) -> Node(a, p, Node(b, q, c))
| x -> x
(The let rotateRight = function construct is syntax sugar for let rotateRight s = match s with ....)
So in addition to binding data structure to variables, we can also drill down into it. Let's say we have a node let x = Node(Nil, 1, Nil). If we call rotateLeft x, we test x against the first pattern, which fails to match because the right child has type Nil instead of Node. It'll move to the next pattern, x -> x, which will match any input and return it unmodified.
For comparison, we'd write the methods above in C# as:
public abstract class Tree<T>
{
public abstract U Match<U>(Func<U> nilFunc, Func<Tree<T>, T, Tree<T>, U> nodeFunc);
public class Nil : Tree<T>
{
public override U Match<U>(Func<U> nilFunc, Func<Tree<T>, T, Tree<T>, U> nodeFunc)
{
return nilFunc();
}
}
public class Node : Tree<T>
{
readonly Tree<T> Left;
readonly T Value;
readonly Tree<T> Right;
public Node(Tree<T> left, T value, Tree<T> right)
{
this.Left = left;
this.Value = value;
this.Right = right;
}
public override U Match<U>(Func<U> nilFunc, Func<Tree<T>, T, Tree<T>, U> nodeFunc)
{
return nodeFunc(Left, Value, Right);
}
}
public static Tree<T> RotateLeft(Tree<T> t)
{
return t.Match(
() => t,
(l, x, r) => r.Match(
() => t,
(rl, rx, rr) => new Node(new Node(l, x, rl), rx, rr)));
}
public static Tree<T> RotateRight(Tree<T> t)
{
return t.Match(
() => t,
(l, x, r) => l.Match(
() => t,
(ll, lx, lr) => new Node(ll, lx, new Node(lr, x, r))));
}
}
For seriously.
Pattern matching is awesome
You can implement something similar to pattern matching in C# using the visitor pattern, but it's not nearly as flexible because you can't effectively decompose complex data structures. Moreover, if you are using pattern matching, the compiler will tell you if you left out a case. How awesome is that?
Think about how you'd implement similar functionality in C# or languages without pattern matching. Think about how you'd do it without type-tests and casts at runtime. It's certainly not hard, just cumbersome and bulky. And you don't have the compiler checking to make sure you've covered every case.
So pattern matching helps you decompose and navigate data structures in a very convenient, compact syntax, and it enables the compiler to check the logic of your code, at least a little bit. It really is a killer feature.
Short answer: Pattern matching arises because functional languages treat the equals sign as an assertion of equivalence instead of assignment.
Long answer: Pattern matching is a form of dispatch based on the “shape” of the value that it's given. In a functional language, the datatypes that you define are usually what are known as discriminated unions or algebraic data types. For instance, what's a (linked) list? A linked list List of things of some type a is either the empty list Nil or some element of type a Consed onto a List a (a list of as). In Haskell (the functional language I'm most familiar with), we write this
data List a = Nil
| Cons a (List a)
All discriminated unions are defined this way: a single type has a fixed number of different ways to create it; the creators, like Nil and Cons here, are called constructors. This means that a value of the type List a could have been created with two different constructors—it could have two different shapes. So suppose we want to write a head function to get the first element of the list. In Haskell, we would write this as
-- `head` is a function from a `List a` to an `a`.
head :: List a -> a
-- An empty list has no first item, so we raise an error.
head Nil = error "empty list"
-- If we are given a `Cons`, we only want the first part; that's the list's head.
head (Cons h _) = h
Since List a values can be of two different kinds, we need to handle each one separately; this is the pattern matching. In head x, if x matches the pattern Nil, then we run the first case; if it matches the pattern Cons h _, we run the second.
Short answer, explained: I think one of the best ways to think about this behavior is by changing how you think of the equals sign. In the curly-bracket languages, by and large, = denotes assignment: a = b means “make a into b.” In a lot of functional languages, however, = denotes an assertion of equality: let Cons a (Cons b Nil) = frob x asserts that the thing on the left, Cons a (Cons b Nil), is equivalent to the thing on the right, frob x; in addition, all variables used on the left become visible. This is also what's happening with function arguments: we assert that the first argument looks like Nil, and if it doesn't, we keep checking.
It means that instead of writing
double f(int x, int y) {
if (y == 0) {
if (x == 0)
return NaN;
else if (x > 0)
return Infinity;
else
return -Infinity;
} else
return (double)x / y;
}
You can write
f(0, 0) = NaN;
f(x, 0) | x > 0 = Infinity;
| else = -Infinity;
f(x, y) = (double)x / y;
Hey, C++ supports pattern matching too.
static const int PositiveInfinity = -1;
static const int NegativeInfinity = -2;
static const int NaN = -3;
template <int x, int y> struct Divide {
enum { value = x / y };
};
template <bool x_gt_0> struct aux { enum { value = PositiveInfinity }; };
template <> struct aux<false> { enum { value = NegativeInfinity }; };
template <int x> struct Divide<x, 0> {
enum { value = aux<(x>0)>::value };
};
template <> struct Divide<0, 0> {
enum { value = NaN };
};
#include <cstdio>
int main () {
printf("%d %d %d %d\n", Divide<7,2>::value, Divide<1,0>::value, Divide<0,0>::value, Divide<-1,0>::value);
return 0;
};
Pattern matching is sort of like overloaded methods on steroids. The simplest case would be roughly the same as what you've seen in Java: arguments are a list of types with names. The correct method to call is chosen based on the arguments passed in, and the call doubles as an assignment of those arguments to the parameter names.
Patterns just go a step further, and can destructure the arguments passed in even further. It can also potentially use guards to actually match based on the value of the argument. To demonstrate, I'll pretend like JavaScript had pattern matching.
function foo(a,b,c){} //no pattern matching, just a list of arguments
function foo2([a],{prop1:d,prop2:e}, 35){} //invented pattern matching in JavaScript
In foo2, it expects a to be an array, it breaks apart the second argument, expecting an object with two props (prop1,prop2) and assigns the values of those properties to variables d and e, and then expects the third argument to be 35.
Unlike in JavaScript, languages with pattern matching usually allow multiple functions with the same name, but different patterns. In this way it is like method overloading. I'll give an example in Erlang:
fibo(0) -> 0 ;
fibo(1) -> 1 ;
fibo(N) when N > 0 -> fibo(N-1) + fibo(N-2) .
Blur your eyes a little and you can imagine this in JavaScript. Something like this maybe:
function fibo(0){return 0;}
function fibo(1){return 1;}
function fibo(N) when N > 0 {return fibo(N-1) + fibo(N-2);}
Point being that when you call fibo, the implementation it uses is based on the arguments, but where Java is limited to types as the only means of overloading, pattern matching can do more.
Beyond function overloading as shown here, the same principle can be applied in other places, such as case statements or destructuring assignments. JavaScript even has this in 1.7.
Pattern matching allows you to match a value (or an object) against some patterns to select a branch of the code. From the C++ point of view, it may sound a bit similar to the switch statement. In functional languages, pattern matching can be used for matching on standard primitive values such as integers. However, it is more useful for composed types.
First, let's demonstrate pattern matching on primitive values (using extended pseudo-C++ switch):
switch(num) {
case 1:
// runs this when num == 1
case n when n > 10:
// runs this when num > 10
case _:
// runs this for all other cases (underscore means 'match all')
}
The second use deals with functional data types such as tuples (which allow you to store multiple objects in a single value) and discriminated unions which allow you to create a type that can contain one of several options. This sounds a bit like enum except that each label can also carry some values. In a pseudo-C++ syntax:
enum Shape {
Rectangle of { int left, int top, int width, int height }
Circle of { int x, int y, int radius }
}
A value of type Shape can now contain either Rectangle with all the coordinates or a Circle with the center and the radius. Pattern matching allows you to write a function for working with the Shape type:
switch(shape) {
case Rectangle(l, t, w, h):
// declares variables l, t, w, h and assigns properties
// of the rectangle value to the new variables
case Circle(x, y, r):
// this branch is run for circles (properties are assigned to variables)
}
Finally, you can also use nested patterns that combine both of the features. For example, you could use Circle(0, 0, radius) to match for all shapes that have the center in the point [0, 0] and have any radius (the value of the radius will be assigned to the new variable radius).
This may sound a bit unfamiliar from the C++ point of view, but I hope that my pseudo-C++ makes the explanation clear. Functional programming is based on quite different concepts, so it makes better sense in a functional language!
Pattern matching is where the interpreter for your language will pick a particular function based on the structure and content of the arguments you give it.
It is not only a functional language feature but is available for many different languages.
The first time I came across the idea was when I learned Prolog, where it is really central to the language.
e.g.
last([LastItem], LastItem).
last([Head|Tail], LastItem) :-
last(Tail, LastItem).
The above code will give the last item of a list. The input arg is the first and the result is the second.
If there is only one item in the list the interpreter will pick the first version and the second argument will be set to equal the first i.e. a value will be assigned to the result.
If the list has both a head and a tail the interpreter will pick the second version and recurse until there is only one item left in the list.
For many people, picking up a new concept is easier if some easy examples are provided, so here we go:
Let's say you have a list of three integers, and wanted to add the first and the third element. Without pattern matching, you could do it like this (examples in Haskell):
Prelude> let is = [1,2,3]
Prelude> head is + is !! 2
4
Now, although this is a toy example, imagine we would like to bind the first and third integer to variables and sum them:
addFirstAndThird is =
    let first = head is
        third = is !! 2
    in first + third
This extraction of values from a data structure is what pattern matching does. You basically "mirror" the structure of something, giving variables to bind for the places of interest:
addFirstAndThird [first,_,third] = first + third
When you call this function with [1,2,3] as its argument, [1,2,3] will be unified with [first,_,third], binding first to 1, third to 3 and discarding 2 (_ is a placeholder for things you don't care about).
Now, if you only wanted to match lists with 2 as the second element, you can do it like this:
addFirstAndThird [first,2,third] = first + third
This will only work for lists with 2 as their second element and throw an exception otherwise, because no definition for addFirstAndThird is given for non-matching lists.
Until now, we used pattern matching only for destructuring binding. Beyond that, you can give multiple definitions of the same function, where the first matching definition is used; thus, pattern matching is a little like "a switch statement on steroids":
addFirstAndThird [first,2,third] = first + third
addFirstAndThird _ = 0
addFirstAndThird will happily add the first and third element of lists with 2 as their second element, and otherwise "fall through" and "return" 0. This "switch-like" functionality can not only be used in function definitions, e.g.:
Prelude> case [1,3,3] of [a,2,c] -> a+c; _ -> 0
0
Prelude> case [1,2,3] of [a,2,c] -> a+c; _ -> 0
4
Further, it is not restricted to lists, but can be used with other types as well, for example matching the Just and Nothing value constructors of the Maybe type in order to "unwrap" the value:
Prelude> case (Just 1) of (Just x) -> succ x; Nothing -> 0
2
Prelude> case Nothing of (Just x) -> succ x; Nothing -> 0
0
Sure, those were mere toy examples, and I did not even try to give a formal or exhaustive explanation, but they should suffice to grasp the basic concept.
You should start with the Wikipedia page that gives a pretty good explanation. Then, read the relevant chapter of the Haskell wikibook.
This is a nice definition from the above wikibook:
So pattern matching is a way of assigning names to things (or binding those names to those things), and possibly breaking down expressions into subexpressions at the same time (as we did with the list in the definition of map).
Here is a really short example that shows pattern matching usefulness:
Let's say you want to move an element one position up in a list:
["Venice","Paris","New York","Amsterdam"]
to (I've moved "New York" up)
["Venice","New York","Paris","Amsterdam"]
In a more imperative language you would write:
function up(city, cities){
for(var i = 0; i < cities.length; i++){
if(cities[i] === city && i > 0){
var prev = cities[i-1];
cities[i-1] = city;
cities[i] = prev;
}
}
return cities;
}
In a functional language you would instead write:
let rec up list value =
    match list with
    | [] -> []
    | previous::current::tail when current = value -> current::previous::tail
    | current::tail -> current::(up tail value)
As you can see, the pattern-matched solution has less noise: you can clearly see what the different cases are and how easy it is to traverse and destructure the list.
I've written a more detailed blog post about it here.
