Match Comparison OCaml - functional-programming

I have come to love this syntax in OCaml
match myCompare x y with
|Greater->
|Less->
|Equal->
However, it needs two things: a custom type, and a myCompare function that returns my custom type.
Would there be any way to do this without the steps above?
The Pervasives module seems to have compare, which returns 0 if equal, a positive int when greater, and a negative int when less. Is it possible to match on those? Conceptually like so (which does not compile):
match myCompare x y with
| (>0) ->
| (0) ->
| (<0) ->
I know I could just use if statements, but pattern matching is more elegant to me. Is there an easy (if not maybe standard) way of doing this?

Is there an easy … way of doing this?
No!
The advantage of match over what switch does in other languages is that OCaml's match tells you whether you have covered all the cases (it also allows matching in depth and is compiled more efficiently, but this could also be considered an advantage of types). If you started using arbitrary conditions instead of patterns, you would lose the advantage of being warned when you do something stupid. You would just end up with a construct with the same drawbacks as a switch.
This said, actually, Yes!
You can write:
match myCompare x y with
| z when (z > 0) -> 0
| 0 -> 0
| z when (z < 0) -> 0
But using when makes you lose the advantage of being warned if you do something stupid.
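As a minimal sketch of the guard-based version: the compiler does not reason about when conditions, so a match whose integer cases are all guarded, or guarded apart from a constant, is typically reported as non-exhaustive. Leaving the last case unguarded avoids that. The name sign_label is hypothetical; compare is the standard polymorphic comparison.

```ocaml
(* Hypothetical helper: classify the result of the standard [compare]. *)
let sign_label x y =
  match compare x y with
  | 0 -> "equal"
  | c when c > 0 -> "greater"
  | _ -> "less" (* unguarded catch-all: stands for the c < 0 case *)

let () =
  print_endline (sign_label 3 2);
  print_endline (sign_label 2 2);
  print_endline (sign_label "a" "b")
```

The catch-all keeps the compiler quiet, but it also means nothing checks that the remaining case really is "less", which is the loss of safety described above.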
The custom type, type comparison = Greater | Less | Equal, with pattern matching over its three constructors, is the right way. It documents what myCompare does instead of letting it return an int that could also, in another language, represent a file descriptor. Type definitions do not have any run-time cost. There is no reason not to use one in this example.

You can use a library that already provides variant-returning compare functions. This is the case for the BatOrd module of Batteries, for example.
Otherwise your best bet is to define the type and create a conversion function from integers to comparisons.
type comparison = Lt | Eq | Gt
let comp n =
  if n < 0 then Lt
  else if n > 0 then Gt
  else Eq
(* ... *)
(* Pervasives is named Stdlib in recent OCaml versions *)
match comp (Pervasives.compare foo bar) with
| Lt -> ...
| Gt -> ...
| Eq -> ...
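To show the conversion function in context, here is a small self-contained sketch that wraps the standard compare once and then matches on the result in a binary search tree. The names compare_v, tree, insert and mem are hypothetical and chosen only for illustration.

```ocaml
type comparison = Lt | Eq | Gt

(* Variant-returning wrapper around the standard polymorphic [compare]. *)
let compare_v x y =
  let c = compare x y in
  if c < 0 then Lt else if c > 0 then Gt else Eq

type 'a tree = Leaf | Node of 'a tree * 'a * 'a tree

(* Insert into an unbalanced binary search tree; duplicates are ignored. *)
let rec insert x = function
  | Leaf -> Node (Leaf, x, Leaf)
  | Node (l, v, r) as t ->
    (match compare_v x v with
     | Lt -> Node (insert x l, v, r)
     | Gt -> Node (l, v, insert x r)
     | Eq -> t)

(* Membership test using the same three-way match. *)
let rec mem x = function
  | Leaf -> false
  | Node (l, v, r) ->
    (match compare_v x v with
     | Lt -> mem x l
     | Gt -> mem x r
     | Eq -> true)

let () =
  let t = List.fold_left (fun acc x -> insert x acc) Leaf [5; 2; 8; 2] in
  assert (mem 8 t);
  assert (not (mem 3 t))
```

If one of the three cases is deleted from either match, the compiler warns about the missing constructor, which is the benefit the variant buys over raw integers.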

Related

How to remove a sequence of array elements matching a pattern at the beginning and the end?

Suppose that I have a json file, in which the following pattern appears many times.
... [ ... ["X"], ... ,["Y"] ... ] ...
I want to remove everything between each pair of ["X"] and ["Y"]. How can I do it?
Assuming pairwise occurrences, and that "between" means excluding the border items, you could query the indices of the border items, and fetch everything in between.
Given the following sample, the following filter will produce the following output:
[
["A"], ["B"], ["X"], ["C"], ["D"], ["Y"], ["E"], ["F"],
["G"], ["H"], ["X"], ["I"], ["J"], ["Y"], ["K"], ["L"]
]
[
  (
    [[indices([["X"]]), indices([["Y"]])] | 0, transpose[][], infinite]
    | _nwise(2)
  )
  as [$a, $b] | .[$a:$b+1][]
]
[["A"],["B"],["X"],["Y"],["E"],["F"],["G"],["H"],["X"],["Y"],["K"],["L"]]
If the border items may occur in any order, and not necessarily in equal amounts, before transposing the two lists of indices they would need to be filtered first to only contain mutually successive positions.
The question seems to have two components:
(1) Given an array, how can I delete all segments that are "bookended" by two values?
(2) Given a single JSON entity (aka document), how can I perform the above-mentioned deletion operation on all arrays, no matter where they occur within the document?
Here, I will offer an alternative to @pmf's solution for (1) and show how to apply it to an entire JSON entity.
Here's the alternative, which has the possible advantage that it doesn't make any strong assumptions about the occurrence of the $x and $y values, and allows for both interpretations regarding the removal of the bookends themselves:
# Input: an array
# Remove all stretches from $x to the next $y,
# removing both bookends too if and only if $bookends.
# Both bookends must be present for a stretch to be removed.
def remove_all_xy($x; $y; $bookends):
  # The helper function removes a single stretch from $x to $y, if any
  def r:
    index($x) as $ix
    | if $ix then .[$ix+1:] as $tail
      | ($tail | index($y)) as $iy
      | if $iy
        then (if $bookends then 0 else 1 end) as $adjust
        | .[:$ix + $adjust] + ($tail | .[1+$iy - $adjust:] | r)
        else . end
      else . end;
  r;
Now let's say you decide on some function, foo($x;$y;$bookends), for performing the per-array operation. To apply it to the whole document, you could write:
walk(if type == "array" then foo($x;$y;$bookends) else . end)
This might not be as efficient as possible, but in practice it should suffice. (If not, then simply adapt the standard walk.)
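For readers who want to check the per-array logic outside jq, the same idea can be modelled on a plain OCaml list. This is only a sketch under the assumption that the array is flat and that $x and $y are single, distinct elements. The names after_first and remove_all_xy (the latter reused from the jq definition) are illustrative, and the code is a model of the algorithm, not a translation of jq semantics.

```ocaml
(* Elements following the first occurrence of [y], or None if [y] is absent. *)
let rec after_first y = function
  | [] -> None
  | h :: t -> if h = y then Some t else after_first y t

(* Remove every stretch from [x] to the next [y].
   The bookends themselves are removed only when [bookends] is true.
   A stretch is removed only if both bookends are present. *)
let rec remove_all_xy ~bookends x y = function
  | [] -> []
  | h :: t when h = x ->
    (match after_first y t with
     | None -> h :: t (* no closing bookend: leave the rest untouched *)
     | Some rest ->
       let rest' = remove_all_xy ~bookends x y rest in
       if bookends then rest' else x :: y :: rest')
  | h :: t -> h :: remove_all_xy ~bookends x y t

let sample =
  ["A"; "B"; "X"; "C"; "D"; "Y"; "E"; "F";
   "G"; "H"; "X"; "I"; "J"; "Y"; "K"; "L"]

let () =
  remove_all_xy ~bookends:false "X" "Y" sample
  |> String.concat " "
  |> print_endline
```

With bookends set to false this keeps the border items, matching the output shown for the first answer. With true, the border items are dropped as well.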

DFA to mathematical notation

Let's say I have a DFA with alphabet {0,1} which basically accepts any strings as long as there is no consecutive 0's (at most one 0 at a time). How do I express this in a mathematical notation?
I was thinking of any number of 1's followed by at most one 0, then any number of 1's, and so on, but I couldn't figure out the appropriate mathematical notation for it.
My attempt but obviously incorrect since 1010 should be accepted but the notation does not indicate so:
As a regular expression you could write this as 1*(01+)*0?. Arbitrary many ones, then arbitrary many groups of exactly one zero followed by at least one one, and in the end possibly one zero. Nico already wrote as much in a comment. Personally I'd consider such a regular expression sufficiently formal to call it mathematical.
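The language described by this regular expression can be sanity-checked against a direct simulation of the DFA. The sketch below assumes a three-state automaton. The state names Start, Zero and Dead and the functions step and accepts are hypothetical, not taken from the question.

```ocaml
(* Start: the last symbol was not 0 (or nothing read yet).
   Zero:  the last symbol was a single 0.
   Dead:  two consecutive 0's were seen, or a symbol outside {0,1}. *)
type state = Start | Zero | Dead

let step s c =
  match s, c with
  | Dead, _ -> Dead
  | _, '1' -> Start
  | Start, '0' -> Zero
  | Zero, '0' -> Dead
  | _, _ -> Dead (* symbols outside the alphabet are rejected *)

(* A word is accepted when the run never reaches the Dead state. *)
let accepts w = Seq.fold_left step Start (String.to_seq w) <> Dead

let () =
  List.iter
    (fun w -> Printf.printf "%S -> %b\n" w (accepts w))
    [""; "0"; "1010"; "0110"; "100"]
```

Both Start and Zero are accepting, which corresponds to the optional trailing 0 in the regular expression.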
Now if you want to write this using exponents, you could do something like
L = {1^{a} (0 1^{1+b_i})^{c} 0^{d mod 2} | a, b_i, c, d ∈ ℕ for 1 ≤ i ≤ c}
Writing a bit of formula in the exponents has the great benefit that you don't have to split the place where you use the exponent and the place where you define the range. Here all my numbers are natural numbers (including zero). Adding one means at least one repetition. And the modulo 2 makes the exponent 0 or 1 to express the ? in the regular expression.
Of course, there is an implied assumption here, namely that the c serves as a kind of loop, but it doesn't repeat the same expression every time, but the bi changes for each iteration. The range of the i implies this interpretation, but it might be considered confusing or even incorrect nonetheless.
The proper solution here would be using some formal product notation using a big ∏ with a subscript i = 1 and a superscript c. That would indicate that for every i from 1 through c you want to compute the given expression (i.e. 0 1^{1+b_i}) and concatenate all the resulting words.
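Written out, the product form might look as follows. This is one possible rendering, assuming that the product denotes concatenation of words and that the empty product (c = 0) is the empty word:

```latex
L = \Bigl\{\, 1^{a} \Bigl( \prod_{i=1}^{c} 0\,1^{1+b_i} \Bigr)\, 0^{d \bmod 2}
      \;\Bigm|\; a, c, d \in \mathbb{N},\ b_1, \dots, b_c \in \mathbb{N} \,\Bigr\}
```

Each factor contributes exactly one zero followed by at least one one, so no two zeros can ever be adjacent.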
You could also give a recursive definition: The minimal fixpoint of the following definition
L' = {1^{a}, 1^{a} 0 | a ∈ ℕ, a > 0} ∪ {1^{a} 0 b | a ∈ ℕ, a > 0, b ∈ L'}
is the language of all words which begin with a 1 and satisfy your conditions. From this you can build
L = {ε, 0} ∪ L' ∪ {0 a | a ∈ L'}
so you add the empty word and the lone zero, then take all the words from L' in their unmodified form and in the form with a zero added in front.

Functions with the same name but different arguments in functional languages

I see this code in the example of Elixir:
defmodule Recursion do
def print_multiple_times(msg, n) when n <= 1 do
IO.puts msg
end
def print_multiple_times(msg, n) do
IO.puts msg
print_multiple_times(msg, n - 1)
end
end
Recursion.print_multiple_times("Hello!", 3)
I see here the same function defined twice with different arguments, and I want to understand this technique.
Can I look at them as at overloaded functions?
Is it a single function with different behavior or are these two different functions, like print_only_once and print_multiple_times?
Are these functions linked anyhow or not?
Usually in functional languages a function is defined by clauses. For example, one way to implement Fibonacci in an imperative language would be the following code (not the best implementation):
def fibonacci(n):
    if n < 0:
        return None
    if n == 0 or n == 1:
        return 1
    else:
        return fibonacci(n - 1) + fibonacci(n - 2)
To define the function in Elixir you would do the following:
defmodule Test do
  def fibonacci(0), do: 1
  def fibonacci(1), do: 1
  def fibonacci(n) when n > 1 do
    fibonacci(n - 1) + fibonacci(n - 2)
  end
  def fibonacci(_), do: nil
end
Test.fibonacci/1 is only one function. A function with four clauses and arity of 1.
The first clause matches only when the number is 0.
The second clause matches only when the number is 1.
The third clause matches with any number greater than 1.
The last clause matches anything (_ is used when the value of the variable is not going to be used inside the clause or is not relevant for the match).
The clauses are evaluated in the order they are declared, so Test.fibonacci(2) will fail to match the first two clauses and will match the third one because 2 > 1.
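For comparison, the same top-to-bottom clause selection can be sketched in OCaml, where the clauses become the arms of a single match. This is an illustrative analogue rather than a translation: the option type stands in for Elixir's nil, and the function keeps the name fibonacci only for readability.

```ocaml
(* Arms are tried in order; the first one that matches is used. *)
let rec fibonacci n =
  match n with
  | 0 | 1 -> Some 1
  | n when n > 1 ->
    (match fibonacci (n - 1), fibonacci (n - 2) with
     | Some a, Some b -> Some (a + b)
     | _ -> None)
  | _ -> None (* negative input, like the last Elixir clause *)

let () =
  match fibonacci 5 with
  | Some v -> Printf.printf "fibonacci 5 = %d\n" v
  | None -> print_endline "undefined"
```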
Think of clauses as a more powerful if statement. The code looks cleaner this way, and it is very useful for recursion. For example, a map implementation (the language already provides one in Enum.map/2):
defmodule Test do
  def map([], _), do: []
  def map([x | xs], f) when is_function(f) do
    [f.(x) | map(xs, f)]
  end
end
First clause matches an empty list. No need to apply a function.
Second clause matches a list where the first element (head) is x and the rest of the list (tail) is xs and f is a function. It applies the function to the first element and recursively calls map with the rest of the list.
Calling Test.map([1,2,3], fn x -> x * 2 end) will give you the following output [2, 4, 6]
So, a function in Elixir is defined with one or more clauses, where every clause has the same arity as the rest.
I hope this answers your question.
In the example you posted, both definitions of the function have the same number of arguments, 2; the "when" part is a guard. You can also have definitions with different numbers of arguments. First, guards: they are used to express what cannot be written as a mere match, like the second line of the following:
def fac(0), do: 1
def fac(n) when n < 0, do: "no factorial for negative numbers!"
def fac(n), do: n * fac(n - 1)
since it is not possible to express being a negative number by equality/matching alone.
Btw this fac is a single definition, only with three cases. Notice the coolness of using constant "0" in the position of argument :)
You can think of this as a nicer way to write:
def fac(n) do
  if n == 0, do: 1, else: (if n < 0, do: "no factorial!", else: n * fac(n - 1))
end
or a switch case (which even looks pretty close to the above):
def fa(n) do
  case n do
    0 -> 1
    n when n > 0 -> n * fa(n - 1)
    _ -> "no no no"
  end
end
only it "looks more fancy". In fact, certain definitions (e.g. parsers, small interpreters) look much better in the former style than in the latter. Note that guard expressions are very limited: you cannot call your own functions in a guard, only a fixed set of built-in operators and functions (or macros composed of them).
Now the real thing, varying number of arguments -- check this out!
def mutant(a), do: a*a
def mutant(a,b), do: a*b
def mutant(a,b,c), do: mutant(a,b)+mutant(c)
e.g.
iex(1)> Lol.mutant(2)
4
iex(2)> Lol.mutant(2,3)
6
iex(3)> Lol.mutant(2,3,4)
22
It works a bit like (lambda arg ...) in Scheme: think of mutant as taking all its arguments as a list and matching over it. But this time, Elixir treats mutant as 3 functions, mutant/1, mutant/2, and mutant/3, and will refer to them as such.
So, to answer your question: these are not like overloaded functions, but rather scattered/fragmented definitions. You see similar ones in functional languages like Miranda, Haskell, or SML.

Why is the function addpos defined this way?

The following is the definition of the function addpos, which defines addition of a natural number to an integer. What is puzzling is that when n is matched with 0, addpos x2 0 gives succZ x2. Why can't it be just x2? Please explain.
Fixpoint addpos (x2 : Z) (n : nat) {struct n} : Z :=
  match n with
  | O => succZ x2
  | S n0 => succZ (addpos x2 n0)
  end.
I think that, given the name of the function, it is likely that this is intentional behavior. addpos means that we are adding a positive number; if we take "positive" to mean "strictly positive" (as, for instance, it is the case for the positive type in the standard library), then we see that the function is just using an element n : nat to represent the strictly positive number S n.
Why can't it be just x2?
It probably should be. Where did you get this definition from? I don't have succZ in my Coq install, so I had to change that to Z.succ. Then Eval compute in (addpos 0 0) yields 1%Z, for example. Either the definition is wrong, or it is intended to add one more than n.
EDIT: Another answer suggests that it may indeed have been intended to add S n, and the definition accepts n as an encoding for S n. I think such an encoding should be made explicit, since it is easy to do so. For example, by defining a new type for positive integers with a single OnePlus constructor with a nat parameter.
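The explicit encoding suggested here can be sketched in OCaml, with int standing in for Coq's Z and succ for succZ. Everything below is an assumption made for illustration: the types nat and pos and the functions add_nat and add_pos do not come from the original development.

```ocaml
type nat = O | S of nat

(* [OnePlus n] stands for the strictly positive number n + 1. *)
type pos = OnePlus of nat

(* Ordinary addition of a natural number: adding O changes nothing. *)
let rec add_nat x = function
  | O -> x
  | S n -> succ (add_nat x n)

(* Adding a positive number: the extra [succ] is now visible in the type. *)
let add_pos x (OnePlus n) = succ (add_nat x n)

let () =
  Printf.printf "%d\n" (add_pos 0 (OnePlus O));
  Printf.printf "%d\n" (add_pos 5 (OnePlus (S (S O))))
```

Here add_pos x (OnePlus n) computes the same value as addpos x n in the question, but the "plus one" is carried by the constructor instead of being hidden in the base case.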

Recursive Data types in sml

Is there a way to define a datatype for whole numbers, i.e. 0, 1, 2, ... rather than zero, one, ... individually?
I want to define the set of whole numbers by using 0, n, n+1 with recursion.
I tried something like this: datatype nat=0|n|n+1 . But it was fairly obvious that it would not work, because it does not recognize 0 as an integer, right?
I would appreciate any help.
Since the set of natural numbers is countably infinite, you can't enumerate all the cases.
You can represent natural numbers conceptually by Peano numbers:
datatype peano = Zero | Succ of peano
The datatype is very simple: it only defines 0 and ensures that each natural number has a successor. For example, 2 is actually represented as Succ (Succ Zero). A Peano number can be converted back to an ordinary integer by counting its constructors:
fun count Zero = 0
| count (Succ p) = 1 + count p
Using similar techniques, you can build up add, sub, and mult functions like the ones you have for natural numbers.
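A possible shape for those functions is sketched below, in OCaml rather than SML (the two are close enough that the definitions carry over almost line by line). The choice to make sub truncate at Zero is an assumption, since subtraction is not total on natural numbers.

```ocaml
type peano = Zero | Succ of peano

let rec count = function
  | Zero -> 0
  | Succ p -> 1 + count p

(* Addition by recursion on the first argument. *)
let rec add m n =
  match m with
  | Zero -> n
  | Succ m' -> Succ (add m' n)

(* Multiplication as repeated addition. *)
let rec mult m n =
  match m with
  | Zero -> Zero
  | Succ m' -> add n (mult m' n)

(* Truncated subtraction: [sub m n] is Zero whenever n >= m. *)
let rec sub m n =
  match m, n with
  | _, Zero -> m
  | Zero, _ -> Zero
  | Succ m', Succ n' -> sub m' n'

let two = Succ (Succ Zero)
let three = Succ two

let () =
  Printf.printf "%d %d %d\n"
    (count (add two three)) (count (mult two three)) (count (sub three two))
```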
