F# types' properties in inconsistent order and of slightly differing types - reflection

I'm trying to iterate through an array of objects and recursively print out each object's properties.
Here is my object model:
type firmIdentifier = {
    firmId: int ;
    firmName: string ;
}
type authorIdentifier = {
    authorId: int ;
    authorName: string ;
    firm: firmIdentifier ;
}
type denormalizedSuggestedTradeRecommendations = {
    id: int ;
    ticker: string ;
    direction: string ;
    author: authorIdentifier ;
}
Here is how I am instantiating my objects:
let getMyIdeas = [|
    {id=1; ticker="msfqt"; direction="buy";
     author={authorId=0; authorName="john Smith"; firm={firmId=12; firmName="Firm1"}};};
    {id=2; ticker="goog"; direction="sell";
     author={authorId=1; authorName="Bill Jones"; firm={firmId=13; firmName="ABC Financial"}};};
    {id=3; ticker="DFHF"; direction="buy";
     author={authorId=2; authorName="Ron James"; firm={firmId=2; firmName="DEFFirm"}};}|]
And here is my algorithm to iterate, recurse and print:
let rec recurseObj (sb : StringBuilder) o =
    let props : PropertyInfo [] = o.GetType().GetProperties()
    sb.Append(o.GetType().ToString()) |> ignore
    for x in props do
        let getMethod = x.GetGetMethod()
        let value = getMethod.Invoke(o, Array.empty)
        ignore <|
            match value with
            | :? float | :? int | :? string | :? bool as f -> sb.Append(x.Name + ": " + f.ToString() + ",") |> ignore
            | _ -> recurseObj sb value
for x in getMyIdeas do
    recurseObj sb x
    sb.Append("\r\n") |> ignore
If you couldn't tell, I'm trying to create a CSV file and am printing out the types for debugging purposes. The problem is that the first element comes through in the order you'd expect, but all subsequent elements come through with a slightly different (and confusing) ordering of the "child" properties, like so:
RpcMethods+denormalizedSuggestedTradeRecommendationsid: 1,ticker: msfqt,direction: buy,RpcMethods+authorIdentifierauthorId: 0,authorName: john Smith,RpcMethods+firmIdentifierfirmId: 12,firmName: Firm1,
RpcMethods+denormalizedSuggestedTradeRecommendationsid: 2,ticker: goog,direction: sell,RpcMethods+authorIdentifierauthorName: Bill Jones,RpcMethods+firmIdentifierfirmName: ABC Financial,firmId: 13,authorId: 1,
RpcMethods+denormalizedSuggestedTradeRecommendationsid: 3,ticker: DFHF,direction: buy,RpcMethods+authorIdentifierauthorName: Ron James,RpcMethods+firmIdentifierfirmName: DEFFirm,firmId: 2,authorId: 2,
Any idea what is going on here?

Does adding this help?
for x in props |> Array.sortBy (fun p -> p.Name) do
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
In general, I think reflection returns entities (like attributes, methods, properties) in an unspecified order. So just pick a fixed sort order?
(Or did I misunderstand the issue?)

This is a reflection thing. You can't rely on the order in which reflection returns properties. I need to sort by each property's MetadataToken. I will post this solution when I get around to implementing it.
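The same caveat is documented for Java reflection: Class.getDeclaredFields() returns its elements in no particular order. As a rough cross-language sketch of the "pick a fixed sort order" advice, the snippet below sorts fields by name before printing them. The class names SortedReflectionDump and Firm are hypothetical and only mirror the firmIdentifier record from the question; this is not the F# solution the author planned to post.

```java
import java.lang.reflect.Field;
import java.util.Arrays;
import java.util.Comparator;
import java.util.stream.Collectors;

final class SortedReflectionDump {
    // Hypothetical model class, mirroring firmIdentifier from the question.
    static final class Firm {
        final int firmId;
        final String firmName;

        Firm(int firmId, String firmName) {
            this.firmId = firmId;
            this.firmName = firmName;
        }
    }

    // Prints "name: value" pairs in a deterministic order by sorting the
    // reflected fields by name instead of trusting the order reflection returns.
    static String dump(Object o) {
        return Arrays.stream(o.getClass().getDeclaredFields())
                .filter(f -> !f.isSynthetic()) // skip compiler-generated fields
                .sorted(Comparator.comparing(Field::getName))
                .map(f -> f.getName() + ": " + read(f, o))
                .collect(Collectors.joining(","));
    }

    private static Object read(Field f, Object o) {
        try {
            f.setAccessible(true);
            return f.get(o);
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Sorting by name gives a stable order, although not necessarily the declaration order; that trade-off is the reason the answer above considers metadata tokens instead.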


Usecase of Variants in Purescript/Haskell

Can someone tell me what the use case is for purescript-variants, or for variants in general?
The documentation is very well written, but I can't find any real use case scenario for it. Can someone explain how we could use Variants in the real world?
Variants are duals of records. While records are sort of extensible ad-hoc product types (consider data T = T Int String vs. type T = { i :: Int, s :: String }), variants can be seen as extensible ad-hoc sum types - e.g. data T = A Int | B String vs. Variant (a :: Int, b :: String)
For example, just as you can write a function that handles a partial record:
fullName :: forall r. { first :: String, last :: String | r } -> String
fullName r = r.first <> " " <> r.last
myFullName = fullName { first: "Fyodor", last: "Soikin", weight: "Too much" }
so too, you can write a function that handles a partial variant:
weight :: forall r. Variant (kilos :: Int, vague :: String | r) -> String
weight =
  default "Unknown"
    # on _kilos (\n -> show n <> " kg.")
    # on _vague (\s -> "Kind of a " <> s)
myWeight = weight (inj _kilos 100) -- "100 kg."
alsoMyWeight = weight (inj _vague "buttload") -- "Kind of a buttload"
But these are, of course, toy examples. For a less toy example, I would imagine something that handles alternatives, but needs to be extensible. Perhaps something like a file parser:
data FileType a = Json | Xml
basicParser :: forall a. FileType a -> String -> Maybe a
basicParser t contents = case t of
  Json -> parseJson contents
  Xml -> parseXml contents
Say I'm OK using this parser in most cases, but in some cases I'd also like to be able to parse YAML. What do I do? I can't "extend" the FileType sum type after the fact; the best I can do is aggregate it in a larger type:
data BetterFileType a = BasicType (FileType a) | Yaml
betterParser :: forall a. BetterFileType a -> String -> Maybe a
betterParser t contents = case t of
  BasicType bt -> basicParser bt contents
  Yaml -> parseYaml contents
And now whenever I call the "better parser", I have to wrap the file type awkwardly:
result = betterParser (BasicType Json) "[1,2,3]"
Worse: now every consumer has to know the hierarchy of BetterFileType -> FileType, they can't just say "json", they have to know to wrap it in BasicType. Awkward.
But if I used extensible variants for the file type, I could have flattened them nicely:
type FileType r = (json :: String, xml :: String | r)
basicParser :: forall a r. Variant (FileType r) -> Maybe a
basicParser = onMatch { json: parseJson, xml: parseXml } $ default Nothing
----
type BetterFileType r = (yaml :: String | FileType r)
betterParser :: forall a r. Variant (BetterFileType r) -> Maybe a
betterParser = onMatch { yaml: parseYaml } basicParser
Now I can use the naked variant names with either basicParser or betterParser, without knowing to wrap them or not:
r1 = betterParser $ inj _json "[1,2,3]"
r2 = betterParser $ inj _yaml "foo: [1,2,3]"

Functional composition of Optionals

I have 2 Optionals (or Maybe objects) that I would like to combine so that I get the following results:
            ||       first operand
   second   ++-------------+-------------
  operand   ||    empty    | optional(x)
============||=============|=============
   empty    ||    empty    | optional(x)
------------++-------------+-------------
optional(y) || optional(y) |optional(x+y)
In other words, a non-empty Optional always replaces/overwrites an empty one, and two non-empty Optionals are combined according to some + function.
Initially, I assumed that the standard monadic flatMap method would do the trick, but (at least in Java) Optional.flatMap always returns an empty optional when the original Optional was already empty (and I'm not sure if any other implementation would comply with the Monad Laws).
Then, as both operands are wrapped in the same monadic type, I figured that this might be a good job for an Applicative Functor. I tried a couple different functional libraries, but I couldn't implement the desired behavior with any of the zip/ap methods that I tried.
What I'm trying to do seems to me a fairly common operation that one might do with Optionals, and I realize that I could just write my own operator with the desired behavior. Still, I am wondering if there is a standard function/method in functional programming to achieve this common operation?
Update: I removed the java tag, as I'm curious how other languages handle this situation
In a functional language, you'd do this with pattern matching, such as (Haskell):
combine :: Maybe t -> Maybe t -> (t -> t -> t) -> Maybe t
combine (Just x) (Just y) f = Just (f x y)
combine (Just x) _ _ = Just x
combine _ (Just y) _ = Just y
combine Nothing Nothing _ = Nothing
There are other ways to write it, but you are basically pattern matching on the cases. Note that this still involves "unpacking" the optionals, but because it's built into the language, it is less obvious.
In Haskell you can do this by wrapping any semigroup in a Maybe. Specifically, if you want to add numbers together:
Prelude> import Data.Semigroup
Prelude Data.Semigroup> Just (Sum 1) <> Just (Sum 2)
Just (Sum {getSum = 3})
Prelude Data.Semigroup> Nothing <> Just (Sum 2)
Just (Sum {getSum = 2})
Prelude Data.Semigroup> Just (Sum 1) <> Nothing
Just (Sum {getSum = 1})
Prelude Data.Semigroup> Nothing <> Nothing
Nothing
The above linked article contains more explanations, and also some C# examples.
It's not possible to combine optional objects without "unpacking" them.
I don't know the specifics of your case. For me, creating such a logic just in order to fuse the two optionals is an overkill.
But nevertheless, there's a possible solution with streams.
I assume that you're not going to pass optional objects as arguments (because such practice is discouraged). Therefore, there are two dummy methods returning Optional<T>.
Method combine() expects a BinaryOperator<T> as an argument and creates a stream by concatenating singleton-streams produced from each of the optional objects returned by getX() and getY().
The flavor of reduce(BinaryOperator) will produce an optional result.
public static <T> Optional<T> getX(Class<T> t) {
    return // something
}
public static <T> Optional<T> getY(Class<T> t) {
    return // something
}
public static <T> Optional<T> combine(BinaryOperator<T> combiner,
                                      Class<T> t) {
    return Stream.concat(getX(t).stream(), getY(t).stream())
        .reduce(combiner);
}
If we generalize the problem to "how to combine N optional objects" then it can be solved like this:
@SafeVarargs
public static <T> Optional<T> combine(BinaryOperator<T> combiner,
                                      Supplier<Optional<T>>... suppliers) {
    return Arrays.stream(suppliers)
        .map(Supplier::get)           // fetching Optional<T>
        .filter(Optional::isPresent)  // filtering optionals that contain results to avoid NoSuchElementException while invoking `get()`
        .map(Optional::get)           // "unpacking" optionals
        .reduce(combiner);
}
Here's one way:
a.map(x -> b.map(y -> x + y).orElse(x)).or(() -> b)
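As a minimal sketch, the one-liner above can be wrapped in a generic helper and checked against the four cases from the question's table. The class name OptionalCombine and the method name combine are hypothetical, and Optional.or requires Java 9 or later.

```java
import java.util.Optional;
import java.util.function.BinaryOperator;

final class OptionalCombine {
    // Both present: apply f. Exactly one present: return it. Neither: empty.
    static <T> Optional<T> combine(Optional<T> a, Optional<T> b, BinaryOperator<T> f) {
        return a.map(x -> b.map(y -> f.apply(x, y)).orElse(x)) // a present: merge with b if b is present
                .or(() -> b);                                  // a empty: fall back to b (Java 9+)
    }
}
```

For example, OptionalCombine.combine(Optional.of(3), Optional.empty(), Integer::sum) keeps the 3, while two present values are added together.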
OptionalInt x = ...
OptionalInt y = ...
OptionalInt sum = IntStream.concat(x.stream(), y.stream())
        .reduce(Integer::sum);
Since Java 9 you can turn an Optional into a Stream.
With concat you get a stream of 0, 1 or 2 elements.
The one-argument reduce returns an empty OptionalInt for 0 elements, the single value for 1 element, and the sum for 2 elements.
Calling .sum() directly would not work here, because it returns 0 for an empty stream rather than an empty OptionalInt.
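You can combine the two values in Java with flatMap and map. Note that this produces a result only when both Optionals are present; if either one is empty, the result is empty: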
optA.flatMap(a -> optB.map(b -> a + b));
More general example:
public static void main(String[] args) {
    test(Optional.empty(), Optional.empty());
    test(Optional.of(3), Optional.empty());
    test(Optional.empty(), Optional.of(4));
    test(Optional.of(3), Optional.of(4));
}
static void test(Optional<Integer> optX, Optional<Integer> optY) {
    final Optional<Integer> optSum = apply(Integer::sum, optX, optY);
    System.out.println(optX + " + " + optY + " = " + optSum);
}
static <A, B, C> Optional<C> apply(BiFunction<A, B, C> fAB, Optional<A> optA, Optional<B> optB) {
    return optA.flatMap(a -> optB.map(b -> fAB.apply(a, b)));
}
Since flatMap and map are standard functions for Optional/Maybe (and monad types generally), this approach should work in any other language (though most FP languages will have a more concise solution). E.g. in Haskell:
combine ma mb = do a <- ma ; b <- mb ; return (a + b)
In F#, I would call this logic reduce.
Reason:
The function must be of type 'a -> 'a -> 'a, as it can only combine things of equal type.
Like other reduce operations, such as those on lists, it needs at least one value; otherwise it has nothing to return.
With an option type and two values of it, you just need to cover four cases. In F# it can be written this way:
(* Signature: ('a -> 'a -> 'a) -> option<'a> -> option<'a> -> option<'a> *)
let reduce fn x y =
    match x,y with
    | Some x, Some y -> Some (fn x y)
    | Some x, None   -> Some x
    | None  , Some y -> Some y
    | None  , None   -> None
printfn "%A" (reduce (+) (Some 3) (Some 7)) // Some 10
printfn "%A" (reduce (+) (None) (Some 7)) // Some 7
printfn "%A" (reduce (+) (Some 3) (None)) // Some 3
printfn "%A" (reduce (+) (None) (None)) // None
In a C#-like pseudo-language, it would look like this:
Option<A> Reduce(Func<A,A,A> fn, Option<A> x, Option<A> y) {
    if ( x.isSome ) {
        if ( y.isSome ) {
            return Option.Some(fn(x.Value, y.Value));
        }
        else {
            return x;
        }
    }
    else {
        if ( y.isSome ) {
            return y;
        }
        else {
            return Option.None;
        }
    }
}

Simulate polymorphic variants in F#?

I'm new to F# so forgive me in advance if this is a stupid question or if the syntax may be a bit off. Hopefully it's possible to understand the gist of the question anyways.
What I'd like to achieve is the possibility to compose e.g. Result's (or an Either or something similar) having different error types (discriminated unions) without creating an explicit discriminated union that includes the union of the two other discriminated unions.
Let me present an example.
Let's say I have a type Person defined like this:
type Person =
    { Name: string
      Email: string }
Imagine that you have a function that validates the name:
type NameValidationError =
    | NameTooLong
    | NameTooShort
let validateName person : Result<Person, NameValidationError>
and another that validates an email address:
type EmailValidationError =
    | EmailTooLong
    | EmailTooShort
let validateEmail person : Result<Person, EmailValidationError>
Now I want to compose validateName and validateEmail, but the problem is that the error type in the Result has different types. What I'd like to achieve is a function (or operator) that allows me to do something like this:
let validatedPerson = person |> validateName |>>> validateEmail
(|>>> is the "magic operator")
By using |>>> the error type of validatedPerson would be a union of NameValidationError and EmailValidationError:
Result<Person, NameValidationError | EmailValidationError>
Just to make it clear, it should be possible to use an arbitrary number of functions in the composition chain, i.e.:
let validatedPerson : Result<Person, NameValidationError | EmailValidationError | XValidationError | YValidationError> =
    person |> validateName |>>> validateEmail |>>> validateX |>>> validateY
In languages like ReasonML you can use something called polymorphic variants, but this is not available in F# as far as I can tell.
Would it be possible to somehow mimic polymorphic variants using generics with union types (or any other technique)?! Or is this impossible?
There are some interesting proposals for erased type unions, allowing for TypeScript-style anonymous union constraints.
type Goose = Goose of int
type Cardinal = Cardinal of int
type Mallard = Mallard of int
// a type abbreviation for an erased anonymous union
type Bird = (Goose | Cardinal | Mallard)
The magic operator which would give you a NameValidationError | EmailValidationError would have its type exist only at compile-time. It would be erased to object at runtime.
But it's still on the anvil, so maybe we can still have some readable code by doing the erasing ourselves?
The composition operator could 'erase' (box, really) the result error type:
let (|>>) input validate =
    match input with
    | Ok(v) -> validate v |> Result.mapError(box)
    | Error(e) -> Error(box e)
and we can have a partial active pattern to make type-matching DU cases palatable.
let (|ValidationError|_|) kind = function
    | Error(err) when Object.Equals(kind, err) -> Some ()
    | _ -> None
Example (with super biased validations):
let person = { Name = "Bob"; Email = "bob@email.com" }
let validateName person = Result.Ok(person)
let validateEmail person = Result.Ok(person)
let validateVibe person = Result.Error(NameTooShort)
let result = person |> validateName |>> validateVibe |>> validateEmail
match result with
| ValidationError NameTooShort -> printfn "Why is your name too short"
| ValidationError EmailTooLong -> printfn "That was a long address"
| _ -> ()
This will short-circuit at validateVibe, so validateEmail is never run.
This is probably more verbose than you would like but it does allow you to put things into a DU without explicitly defining it.
F# has Choice types which are defined like this:
type Choice<'T1,'T2> =
    | Choice1Of2 of 'T1
    | Choice2Of2 of 'T2
type Choice<'T1,'T2,'T3> =
    | Choice1Of3 of 'T1
    | Choice2Of3 of 'T2
    | Choice3Of3 of 'T3
// Going up to ChoiceXOf7
With your existing functions you would use them like this:
// This function returns Result<Person,Choice<NameValidationError,EmailValidationError>>
let validatePerson person =
    validateName person
    |> Result.mapError Choice1Of2
    |> Result.bind (validateEmail >> Result.mapError Choice2Of2)
This is how you would consume the result:
let displayValidationError person =
    match person with
    | Ok p -> None
    | Error (Choice1Of2 NameTooLong) -> Some "Name too long"
    | Error (Choice2Of2 EmailTooLong) -> Some "Email too long"
    // etc.
If you want to add a third validation into validatePerson you'll need to switch to Choice<_,_,_> DU cases, e.g. Choice1Of3 and so on.

F#, FParsec, and Calling a Stream Parser Recursively, Second Take

Thank you for the replies to my first post and my second post on this project. This question is basically the same question as the first, but with my code updated according to the feedback received on those two questions. How do I call my parser recursively?
I'm scratching my head and staring blankly at the code. I've no idea where to go from here. That's when I turn to Stack Overflow.
I've included in code comments the compile-time errors I'm receiving. One stumbling block may be my discriminated union. I've not worked with discriminated unions much, so I may be using mine incorrectly.
The example POST I'm working with, bits of which I've included in my previous two questions, consists of one boundary that includes a second post with a new boundary. That second post includes several additional parts separated by the second boundary. Each of those several additional parts is a new post consisting of headers and XML.
My goal in this project is to build a library to be used in our C# solution, with the library taking a stream and returning the POST parsed into headers and parts recursively. I really want F# to shine here.
namespace MultipartMIMEParser
open FParsec
open System.IO
type Header = { name : string
; value : string
; addl : (string * string) list option }
type Content = Content of string
| Post of Post list
and Post = { headers : Header list
; content : Content }
type UserState = { Boundary : string }
with static member Default = { Boundary="" }
module internal P =
let ($) f x = f x
let undefined = failwith "Undefined."
let ascii = System.Text.Encoding.ASCII
let str cs = System.String.Concat (cs:char list)
let makeHeader ((n,v),nvps) = { name=n; value=v; addl=nvps}
let runP p s = match runParserOnStream p UserState.Default "" s ascii with
| Success (r,_,_) -> r
| Failure (e,_,_) -> failwith (sprintf "%A" e)
let blankField = parray 2 newline
let delimited d e =
let pEnd = preturn () .>> e
let part = spaces
>>. (manyTill
$ noneOf d
$ (attempt (preturn () .>> pstring d)
<|> pEnd)) |>> str
in part .>>. part
let delimited3 firstDelimiter secondDelimiter thirdDelimiter endMarker =
delimited firstDelimiter endMarker
.>>. opt (many (delimited secondDelimiter endMarker
>>. delimited thirdDelimiter endMarker))
let isBoundary ((n:string),_) = n.ToLower() = "boundary"
let pHeader =
let includesBoundary (h:Header) = match h.addl with
| Some xs -> xs |> List.exists isBoundary
| None -> false
let setBoundary b = { Boundary=b }
in delimited3 ":" ";" "=" blankField
|>> makeHeader
>>= fun header stream -> if includesBoundary header
then
stream.UserState <- setBoundary (header.addl.Value
|> List.find isBoundary
|> snd)
Reply ()
else Reply ()
let pHeaders = manyTill pHeader $ attempt (preturn () .>> blankField)
let rec pContent (stream:CharStream<UserState>) =
match stream.UserState.Boundary with
| "" -> // Content is text.
let nl = System.Environment.NewLine
let unlines (ss:string list) = System.String.Join (nl,ss)
let line = restOfLine false
let lines = manyTill line $ attempt (preturn () .>> blankField)
in pipe2 pHeaders lines
$ fun h c -> { headers=h
; content=Content $ unlines c }
| _ -> // Content contains boundaries.
let b = "--" + stream.UserState.Boundary
// VS complains about pContent in the following line:
// Type mismatch. Expecting a
// Parser<'a,UserState>
// but given a
// CharStream<UserState> -> Parser<Post,UserState>
// The type 'Reply<'a>' does not match the type 'Parser<Post,UserState>'
let p = pipe2 pHeaders pContent $ fun h c -> { headers=h; content=c }
in skipString b
>>. manyTill p (attempt (preturn () .>> blankField))
// VS complains about Content.Post in the following line:
// Type mismatch. Expecting a
// Post list -> Post
// but given a
// Post list -> Content
// The type 'Post' does not match the type 'Content'
|>> Content.Post
// VS complains about pContent in the following line:
// Type mismatch. Expecting a
// Parser<'a,UserState>
// but given a
// CharStream<UserState> -> Parser<Post,UserState>
// The type 'Reply<'a>' does not match the type 'Parser<Post,UserState>'
let pStream = runP (pipe2 pHeaders pContent $ fun h c -> { headers=h; content=c })
type MParser (s:Stream) =
let r = P.pStream s
let findHeader name =
match r.headers |> List.tryFind (fun h -> h.name.ToLower() = name) with
| Some h -> h.value
| None -> ""
member p.Boundary =
let header = r.headers
|> List.tryFind (fun h -> match h.addl with
| Some xs -> xs |> List.exists P.isBoundary
| None -> false)
in match header with
| Some h -> h.addl.Value |> List.find P.isBoundary |> snd
| None -> ""
member p.ContentID = findHeader "content-id"
member p.ContentLocation = findHeader "content-location"
member p.ContentSubtype = findHeader "type"
member p.ContentTransferEncoding = findHeader "content-transfer-encoding"
member p.ContentType = findHeader "content-type"
member p.Content = r.content
member p.Headers = r.headers
member p.MessageID = findHeader "message-id"
member p.MimeVersion = findHeader "mime-version"
EDIT
In response to the feedback I've received thus far (thank you!), I made the following adjustments, receiving the errors annotated:
let rec pContent (stream:CharStream<UserState>) =
match stream.UserState.Boundary with
| "" -> // Content is text.
let nl = System.Environment.NewLine
let unlines (ss:string list) = System.String.Join (nl,ss)
let line = restOfLine false
let lines = manyTill line $ attempt (preturn () .>> blankField)
in pipe2 pHeaders lines
$ fun h c -> { headers=h
; content=Content $ unlines c }
| _ -> // Content contains boundaries.
let b = "--" + stream.UserState.Boundary
// The following complaint is about `pContent stream`:
// This expression was expected to have type
// Reply<'a>
// but here has type
// Parser<Post,UserState>
let p = pipe2 pHeaders (fun stream -> pContent stream) $ fun h c -> { headers=h; content=c }
in skipString b
>>. manyTill p (attempt (preturn () .>> blankField))
// VS complains about the line above:
// Type mismatch. Expecting a
// Parser<Post,UserState>
// but given a
// Parser<'a list,UserState>
// The type 'Post' does not match the type ''a list'
// See above complaint about `pContent stream`. Same complaint here.
let pStream = runP (pipe2 pHeaders (fun stream -> pContent stream) $ fun h c -> { headers=h; content=c })
I tried throwing in Reply ()s, but they just returned parsers, meaning c above became a Parser<...> rather than Content. That seemed to have been a step backwards, or at least in the wrong direction. I admit my ignorance, though, and welcome correction!
I can help with one of the errors.
F# generally binds arguments left to right, so you need to use either parentheses around the recursive calls to pContent or a pipe-backward operator <| to show that you want to evaluate the recursive call and bind the return value.
It's also worth noting that <| is the same as your $ operator.
Content.Post is not a constructor for a Post object. You need a function to accept a Post list and return a Post. (Does something from the List module do what you need?)
My first answer was completely wrong, but I thought I'd leave it up.
The types Post and Content are defined as:
type Content =
    | Content of string
    | Post of Post list
and Post =
    { headers : Header list
    ; content : Content }
Post is a Record, and Content is a Discriminated Union.
F# treats the cases for Discriminated Unions as a separate namespace from types. So Content is different from Content.Content, and Post is different from Content.Post. Because they are different, having the same identifier is confusing.
What is pContent supposed to be returning? If it's supposed to be returning the Discriminated Union Content, you need to wrap the Post record you are returning in the first case in the Content.Post case i.e.
$ fun h c -> Post [ { headers=h
; content=Content $ unlines c } ]
(F# is able to infer that 'Post' refers to Content.Post case, instead of the Post record type here.)

What is 'Pattern Matching' in functional languages?

I'm reading about functional programming and I've noticed that Pattern Matching is mentioned in many articles as one of the core features of functional languages.
Can someone explain for a Java/C++/JavaScript developer what does it mean?
Understanding pattern matching requires explaining three parts:
Algebraic data types.
What pattern matching is.
Why it's awesome.
Algebraic data types in a nutshell
ML-like functional languages allow you to define simple data types called "disjoint unions" or "algebraic data types". These data structures are simple containers, and can be recursively defined. For example:
type 'a list =
    | Nil
    | Cons of 'a * 'a list
defines a stack-like data structure. Think of it as equivalent to this C#:
public abstract class Stack<T>
{
    public class Nil : Stack<T> { }
    public class Cons : Stack<T>
    {
        public readonly T Item1;
        public readonly Stack<T> Item2;
        public Cons(T item1, Stack<T> item2)
        {
            this.Item1 = item1;
            this.Item2 = item2;
        }
    }
}
So, the Cons and Nil identifiers each define a simple class, where the of x * y * z * ... defines a constructor and some data types. The parameters to the constructor are unnamed; they're identified by position and data type.
You create instances of your 'a list type as such:
let x = Cons(1, Cons(2, Cons(3, Cons(4, Nil))))
Which is the same as:
Stack<int> x = new Stack<int>.Cons(1, new Stack<int>.Cons(2, new Stack<int>.Cons(3, new Stack<int>.Cons(4, new Stack<int>.Nil()))));
Pattern matching in a nutshell
Pattern matching is a kind of type-testing. So let's say we created a stack object like the one above, we can implement methods to peek and pop the stack as follows:
let peek s =
    match s with
    | Cons(hd, tl) -> hd
    | Nil -> failwith "Empty stack"
let pop s =
    match s with
    | Cons(hd, tl) -> tl
    | Nil -> failwith "Empty stack"
The methods above are equivalent (although not implemented as such) to the following C#:
public static T Peek<T>(Stack<T> s)
{
    if (s is Stack<T>.Cons)
    {
        T hd = ((Stack<T>.Cons)s).Item1;
        Stack<T> tl = ((Stack<T>.Cons)s).Item2;
        return hd;
    }
    else if (s is Stack<T>.Nil)
        throw new Exception("Empty stack");
    else
        throw new MatchFailureException();
}
public static Stack<T> Pop<T>(Stack<T> s)
{
    if (s is Stack<T>.Cons)
    {
        T hd = ((Stack<T>.Cons)s).Item1;
        Stack<T> tl = ((Stack<T>.Cons)s).Item2;
        return tl;
    }
    else if (s is Stack<T>.Nil)
        throw new Exception("Empty stack");
    else
        throw new MatchFailureException();
}
(Almost always, ML languages implement pattern matching without run-time type-tests or casts, so the C# code is somewhat deceptive. Let's brush implementation details aside with some hand-waving please :) )
Data structure decomposition in a nutshell
Ok, let's go back to the peek method:
let peek s =
    match s with
    | Cons(hd, tl) -> hd
    | Nil -> failwith "Empty stack"
The trick is understanding that the hd and tl identifiers are variables (errm... since they're immutable, they're not really "variables", but "values" ;) ). If s was built with the Cons constructor, then we pull its values out of the constructor and bind them to variables named hd and tl.
Pattern matching is useful because it lets us decompose a data structure by its shape instead of its contents. So imagine if we define a binary tree as follows:
type 'a tree =
    | Node of 'a tree * 'a * 'a tree
    | Nil
We can define some tree rotations as follows:
let rotateLeft = function
    | Node(a, p, Node(b, q, c)) -> Node(Node(a, p, b), q, c)
    | x -> x
let rotateRight = function
    | Node(Node(a, p, b), q, c) -> Node(a, p, Node(b, q, c))
    | x -> x
(The let rotateRight = function construct is syntax sugar for let rotateRight s = match s with ....)
So in addition to binding data structure to variables, we can also drill down into it. Let's say we have a node let x = Node(Nil, 1, Nil). If we call rotateLeft x, we test x against the first pattern, which fails to match because the right child has type Nil instead of Node. It'll move to the next pattern, x -> x, which will match any input and return it unmodified.
For comparison, we'd write the methods above in C# as:
public abstract class Tree<T>
{
    public abstract U Match<U>(Func<U> nilFunc, Func<Tree<T>, T, Tree<T>, U> nodeFunc);
    public class Nil : Tree<T>
    {
        public override U Match<U>(Func<U> nilFunc, Func<Tree<T>, T, Tree<T>, U> nodeFunc)
        {
            return nilFunc();
        }
    }
    public class Node : Tree<T>
    {
        readonly Tree<T> Left;
        readonly T Value;
        readonly Tree<T> Right;
        public Node(Tree<T> left, T value, Tree<T> right)
        {
            this.Left = left;
            this.Value = value;
            this.Right = right;
        }
        public override U Match<U>(Func<U> nilFunc, Func<Tree<T>, T, Tree<T>, U> nodeFunc)
        {
            return nodeFunc(Left, Value, Right);
        }
    }
    public static Tree<T> RotateLeft(Tree<T> t)
    {
        return t.Match(
            () => t,
            (l, x, r) => r.Match(
                () => t,
                (rl, rx, rr) => new Node(new Node(l, x, rl), rx, rr)));
    }
    public static Tree<T> RotateRight(Tree<T> t)
    {
        return t.Match(
            () => t,
            (l, x, r) => l.Match(
                () => t,
                (ll, lx, lr) => new Node(ll, lx, new Node(lr, x, r))));
    }
}
For seriously.
Pattern matching is awesome
You can implement something similar to pattern matching in C# using the visitor pattern, but it's not nearly as flexible because you can't effectively decompose complex data structures. Moreover, if you are using pattern matching, the compiler will tell you if you left out a case. How awesome is that?
Think about how you'd implement similar functionality in C# or languages without pattern matching. Think about how you'd do it without type-tests and casts at runtime. It's certainly not hard, just cumbersome and bulky. And you don't have the compiler checking to make sure you've covered every case.
So pattern matching helps you decompose and navigate data structures in a very convenient, compact syntax, and it enables the compiler to check the logic of your code, at least a little bit. It really is a killer feature.
Short answer: Pattern matching arises because functional languages treat the equals sign as an assertion of equivalence instead of assignment.
Long answer: Pattern matching is a form of dispatch based on the “shape” of the value that it's given. In a functional language, the datatypes that you define are usually what are known as discriminated unions or algebraic data types. For instance, what's a (linked) list? A linked list List of things of some type a is either the empty list Nil or some element of type a Consed onto a List a (a list of as). In Haskell (the functional language I'm most familiar with), we write this
data List a = Nil
| Cons a (List a)
All discriminated unions are defined this way: a single type has a fixed number of different ways to create it; the creators, like Nil and Cons here, are called constructors. This means that a value of the type List a could have been created with two different constructors—it could have two different shapes. So suppose we want to write a head function to get the first element of the list. In Haskell, we would write this as
-- `head` is a function from a `List a` to an `a`.
head :: List a -> a
-- An empty list has no first item, so we raise an error.
head Nil = error "empty list"
-- If we are given a `Cons`, we only want the first part; that's the list's head.
head (Cons h _) = h
Since List a values can be of two different kinds, we need to handle each one separately; this is the pattern matching. In head x, if x matches the pattern Nil, then we run the first case; if it matches the pattern Cons h _, we run the second.
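For a Java reader, the List type and head function above can be approximated with a sealed interface and records. This is only a sketch assuming Java 17 or later; the names PatternMatchSketch, MyList, Nil, Cons and length are hypothetical. It uses instanceof patterns, which test the shape and bind the fields' container in one step.

```java
final class PatternMatchSketch {
    // A list is either Nil or a Cons of a head and a tail: two "shapes".
    sealed interface MyList<T> permits Nil, Cons {}
    record Nil<T>() implements MyList<T> {}
    record Cons<T>(T head, MyList<T> tail) implements MyList<T> {}

    // Counterpart of the Haskell `head`: match on the Cons shape and bind it to c.
    static <T> T head(MyList<T> list) {
        if (list instanceof Cons<T> c) {
            return c.head();
        }
        // The only other permitted shape is Nil, which has no first item.
        throw new IllegalArgumentException("empty list");
    }

    // Walks the list by repeatedly matching on the Cons shape.
    static <T> int length(MyList<T> list) {
        int n = 0;
        MyList<T> cur = list;
        while (cur instanceof Cons<T> c) {
            n++;
            cur = c.tail();
        }
        return n;
    }
}
```

Unlike the Haskell version, instanceof gives no exhaustiveness checking. Java 21 adds pattern matching for switch with record patterns, where the compiler does check that every permitted shape of a sealed type is covered.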
Short answer, explained: I think one of the best ways to think about this behavior is by changing how you think of the equals sign. In the curly-bracket languages, by and large, = denotes assignment: a = b means “make a into b.” In a lot of functional languages, however, = denotes an assertion of equality: let Cons a (Cons b Nil) = frob x asserts that the thing on the left, Cons a (Cons b Nil), is equivalent to the thing on the right, frob x; in addition, all variables used on the left become visible. This is also what's happening with function arguments: we assert that the first argument looks like Nil, and if it doesn't, we keep checking.
It means that instead of writing
double f(int x, int y) {
    if (y == 0) {
        if (x == 0)
            return NaN;
        else if (x > 0)
            return Infinity;
        else
            return -Infinity;
    } else
        return (double)x / y;
}
You can write
f(0, 0) = NaN;
f(x, 0) | x > 0 = Infinity;
        | else  = -Infinity;
f(x, y) = (double)x / y;
Hey, C++ supports pattern matching too.
// Sentinel values, since the results are compile-time integers.
static const int PositiveInfinity = -1;
static const int NegativeInfinity = -2;
static const int NaN = -3;
// General case: ordinary integer division.
template <int x, int y> struct Divide {
    enum { value = x / y };
};
template <bool x_gt_0> struct aux { enum { value = PositiveInfinity }; };
template <> struct aux<false> { enum { value = NegativeInfinity }; };
// "Pattern" x / 0: the sign of x selects the infinity.
template <int x> struct Divide<x, 0> {
    enum { value = aux<(x>0)>::value };
};
// "Pattern" 0 / 0.
template <> struct Divide<0, 0> {
    enum { value = NaN };
};
#include <cstdio>
int main () {
    printf("%d %d %d %d\n", Divide<7,2>::value, Divide<1,0>::value, Divide<0,0>::value, Divide<-1,0>::value);
    return 0;
}
Pattern matching is sort of like overloaded methods on steroids. The simplest case is roughly the same as what you've seen in Java: the arguments are a list of types with names. The correct method to call is chosen based on the arguments passed in, and the call doubles as an assignment of those arguments to the parameter names.
Patterns just go a step further and can destructure the arguments passed in. They can also potentially use guards to match based on the value of the argument. To demonstrate, I'll pretend JavaScript had pattern matching.
function foo(a,b,c){} //no pattern matching, just a list of arguments
function foo2([a],{prop1:d,prop2:e}, 35){} //invented pattern matching in JavaScript
In foo2, it expects the first argument to be an array with a single element, which it binds to a; it breaks apart the second argument, expecting an object with two props (prop1, prop2), and assigns the values of those properties to variables d and e; and it expects the third argument to be 35.
Unlike in JavaScript, languages with pattern matching usually allow multiple functions with the same name, but different patterns. In this way it is like method overloading. I'll give an example in Erlang:
fibo(0) -> 0 ;
fibo(1) -> 1 ;
fibo(N) when N > 0 -> fibo(N-1) + fibo(N-2) .
Blur your eyes a little and you can imagine this in JavaScript. Something like this maybe:
function fibo(0){return 0;}
function fibo(1){return 1;}
function fibo(N) when N > 0 {return fibo(N-1) + fibo(N-2);}
Point being that when you call fibo, the implementation it uses is based on the arguments, but where Java is limited to types as the only means of overloading, pattern matching can do more.
Beyond function overloading as shown here, the same principle can be applied in other places, such as case statements or destructuring assignments. JavaScript even has this in 1.7.
Pattern matching allows you to match a value (or an object) against some patterns to select a branch of the code. From the C++ point of view, it may sound a bit similar to the switch statement. In functional languages, pattern matching can be used for matching on standard primitive values such as integers. However, it is more useful for composed types.
First, let's demonstrate pattern matching on primitive values (using extended pseudo-C++ switch):
switch(num) {
    case 1:
        // runs this when num == 1
    case n when n > 10:
        // runs this when num > 10
    case _:
        // runs this for all other cases (underscore means 'match all')
}
The second use deals with functional data types such as tuples (which allow you to store multiple objects in a single value) and discriminated unions which allow you to create a type that can contain one of several options. This sounds a bit like enum except that each label can also carry some values. In a pseudo-C++ syntax:
enum Shape {
    Rectangle of { int left, int top, int width, int height }
    Circle of { int x, int y, int radius }
}
A value of type Shape can now contain either Rectangle with all the coordinates or a Circle with the center and the radius. Pattern matching allows you to write a function for working with the Shape type:
switch(shape) {
    case Rectangle(l, t, w, h):
        // declares variables l, t, w, h and assigns properties
        // of the rectangle value to the new variables
    case Circle(x, y, r):
        // this branch is run for circles (properties are assigned to variables)
}
Finally, you can also use nested patterns that combine both of the features. For example, you could use Circle(0, 0, radius) to match for all shapes that have the center in the point [0, 0] and have any radius (the value of the radius will be assigned to the new variable radius).
This may sound a bit unfamiliar from the C++ point of view, but I hope that my pseudo-C++ makes the explanation clear. Functional programming is based on quite different concepts, so it makes better sense in a functional language!
Pattern matching is where the interpreter for your language will pick a particular function based on the structure and content of the arguments you give it.
It is not only a functional language feature but is available for many different languages.
The first time I came across the idea was when I learned Prolog, where it is really central to the language.
e.g.
last([LastItem], LastItem).
last([Head|Tail], LastItem) :-
    last(Tail, LastItem).
The above code will give the last item of a list. The input arg is the first and the result is the second.
If there is only one item in the list the interpreter will pick the first version and the second argument will be set to equal the first i.e. a value will be assigned to the result.
If the list has both a head and a tail, the interpreter will pick the second version and recurse until there is only one item left in the list.
For many people, picking up a new concept is easier if some easy examples are provided, so here we go:
Let's say you have a list of three integers, and wanted to add the first and the third element. Without pattern matching, you could do it like this (examples in Haskell):
Prelude> let is = [1,2,3]
Prelude> head is + is !! 2
4
Now, although this is a toy example, imagine we would like to bind the first and third integer to variables and sum them:
addFirstAndThird is =
  let first = head is
      third = is !! 2
  in first + third
This extraction of values from a data structure is what pattern matching does. You basically "mirror" the structure of something, giving variables to bind for the places of interest:
addFirstAndThird [first,_,third] = first + third
When you call this function with [1,2,3] as its argument, [1,2,3] will be unified with [first,_,third], binding first to 1, third to 3 and discarding 2 (_ is a placeholder for things you don't care about).
Now, if you only wanted to match lists with 2 as the second element, you can do it like this:
addFirstAndThird [first,2,third] = first + third
This will only work for lists with 2 as their second element and throw an exception otherwise, because no definition for addFirstAndThird is given for non-matching lists.
Until now, we used pattern matching only for destructuring binding. Beyond that, you can give multiple definitions of the same function, where the first matching definition is used; thus, pattern matching is a little like "a switch statement on steroids":
addFirstAndThird [first,2,third] = first + third
addFirstAndThird _ = 0
addFirstAndThird will happily add the first and third element of lists with 2 as their second element, and otherwise "fall through" and "return" 0. This "switch-like" functionality is not limited to function definitions; it also works in case expressions, e.g.:
Prelude> case [1,3,3] of [a,2,c] -> a+c; _ -> 0
0
Prelude> case [1,2,3] of [a,2,c] -> a+c; _ -> 0
4
Further, it is not restricted to lists, but can be used with other types as well, for example matching the Just and Nothing value constructors of the Maybe type in order to "unwrap" the value:
Prelude> case (Just 1) of (Just x) -> succ x; Nothing -> 0
2
Prelude> case Nothing of (Just x) -> succ x; Nothing -> 0
0
Sure, those were mere toy examples, and I did not even try to give a formal or exhaustive explanation, but they should suffice to grasp the basic concept.
You should start with the Wikipedia page that gives a pretty good explanation. Then, read the relevant chapter of the Haskell wikibook.
This is a nice definition from the above wikibook:
"So pattern matching is a way of assigning names to things (or binding those names to those things), and possibly breaking down expressions into subexpressions at the same time (as we did with the list in the definition of map)."
Here is a really short example that shows the usefulness of pattern matching:
Let's say you want to move an element one position up in a list:
["Venice","Paris","New York","Amsterdam"]
to (I've moved "New York" up)
["Venice","New York","Paris","Amsterdam"]
In a more imperative language you would write:
function up(city, cities){
    for(var i = 0; i < cities.length; i++){
        if(cities[i] === city && i > 0){
            var prev = cities[i-1];
            cities[i-1] = city;
            cities[i] = prev;
        }
    }
    return cities;
}
In a functional language you would instead write:
let rec up list value =
    match list with
    | [] -> []
    | previous::current::tail when current = value -> current::previous::tail
    | current::tail -> current::(up tail value)
As you can see, the pattern-matched solution has less noise: you can clearly see what the different cases are and how easy it is to traverse and destructure the list.
I've written a more detailed blog post about it here.

Resources