Subst refl closing duplicate subgoals. What's going on? - isabelle

In this thread Mathieu demonstrates that subst refl closes duplicate subgoals. How/Why is it doing that?

I'm not completely sure, but a quick look at the code suggests that subst calls distinct_subgoals_tac for some reason and does not restrict it to the subgoal it is working on:
fun eqsubst_tac ctxt occs thms i st =
let val nprems = Thm.nprems_of st in
if nprems < i then Seq.empty else
let
val thmseq = Seq.of_list thms;
fun apply_occ occ st =
thmseq |> Seq.maps (fn r =>
eqsubst_tac' ctxt
(skip_first_occs_search occ searchf_lr_unify_valid) r
(i + (Thm.nprems_of st - nprems)) st);
val sorted_occs = Library.sort (rev_order o int_ord) occs;
in
Seq.maps distinct_subgoals_tac (Seq.EVERY (map apply_occ sorted_occs) st)
end
end;
That does not seem like intended behaviour to me – probably an oversight in the implementation of subst. I'll write an email to the mailing list to ask about it.

Related

Optimized version of except for Map

I've written a generic except function for Maps that, given a source map and an other map, returns only the items of the source map without corresponding keys in the other map.
module MapExt =
let getKeys<'k,'v when 'k : comparison> : Map<'k,'v> -> 'k[] =
Map.toArray >> Array.map fst
let except<'k,'v when 'k : comparison>(other:Map<'k,'v>) (source:Map<'k,'v>) : ('k * 'v)[] =
source |> getKeys
|> Array.except (other |> getKeys)
|> Array.map(fun k -> (k, source.[k]))
Now, I've seen in the second part of this answer, that an optimized version of the map's keys is obtained via a Map.fold.
Therefore, can I do a similar optimization of my original MapExt module in the following way?
module MapExtOpt =
let getKeys<'k,'v when 'k : comparison> (m : Map<'k,'v>) : 'k list =
Map.fold (fun keys key _ -> key::keys) [] m
let except<'k,'v when 'k : comparison>
(other : Map<'k,'v>) (source : Map<'k,'v>) : ('k * 'v) list =
source
|> Map.fold (fun s k v ->
if (other.ContainsKey k) then
s
else
(k,v) :: s
) []
Or am I reinventing some already existing (and optimized) functions?
I don't think there is a built in function, but this is a simpler way of doing what you are trying to do. It only goes over the 'to be removed' map once, so its much more efficient.
let except toRemove source =
Map.fold (fun m k _ -> if Map.containsKey k m then Map.remove k m else m) source toRemove
Finally,
thanks to Loïc Denuzière for his comment on Slack:
The if is not necessary: if m doesn't contain k, Map.remove k m just returns m anyway
I think I can also apply a double eta reduction by considering that it makes sense to speak about the keys to remove (not about a map whose values are ignored), so I would simply redefine it as
let except<'k,'v when 'k : comparison> = List.foldBack Map.remove<'k,'v>

Limitations of let rec in OCaml

I'm studying OCaml these days and came across this:
OCaml has limits on what it can put on the righthand side of a let rec. Like this one
let memo_rec f_norec =
let rec f = memoize (fun x -> f_norec f x) in
f;;
Error: This kind of expression is not allowed as right-hand side of `let rec'
in which, the memoize is a function that take a function and turns it into a memorized version with Hashtable. It's apparent that OCaml has some restriction on the use of constructs at the right-hand side of 'let rec', but I don't really get it, could anyone explain a bit more on this?
The kind of expressions that are allowed to be bound by let rec are described in section 8.1 of the manual. Specifically, function applications involving the let rec defined names are not allowed.
A rough summary (taken from that very link):
Informally, the class of accepted definitions consists of those definitions where the defined names occur only inside function bodies or as argument to a data constructor.
You can use tying-the-knot techniques to define memoizing fixpoints. See for example those two equivalent definitions:
let fix_memo f =
let rec g = {contents = fixpoint}
and fixpoint x = f !g x in
g := memoize !g;
!g
let fix_memo f =
let g = ref (fun _ -> assert false) in
g := memoize (fun x -> f !g x);
!g
Or using lazy as reminded by Alain:
let fix_memo f =
let rec fix = lazy (memoize (fun x -> f (Lazy.force fix) x)) in
Lazy.force fix

Using continuation / CPS to implement tail-recursive MergeSort in OCaml

I am trying to implement a tail-recursive MergeSort in OCaml.
Since Mergesort naturally is not tail-recursive, so I am using CPS to implement it.
Also my implementation is inspired by Tail-recursive merge sort in OCaml
Below is my code
let merge compare_fun l1 l2 =
let rec mg l1 l2 acc =
match l1, l2 with
| ([], []) -> List.rev acc
| ([], hd2::tl2) -> mg [] tl2 (hd2::acc)
| (hd1::tl1, []) -> mg tl1 [] (hd1::acc)
| (hd1::tl1, hd2::tl2) ->
let c = compare_fun hd1 hd2
in
if c = 1 then mg l1 tl2 (hd2::acc)
else if c = 0 then mg tl1 tl2 (hd2::hd1::acc)
else mg tl1 l2 (hd1::acc)
in
mg l1 l2 [];;
let split_list p l =
let rec split_list p (acc1, acc2) = function
| [] -> (List.rev acc1, List.rev acc2)
| hd::tl ->
if p > 0 then split_list (p-1) (hd::acc1, acc2) tl
else split_list (p-2) (acc1, hd::acc2) tl
in
split_list p ([], []) l;;
let mergeSort_cps compare_fun l =
let rec sort_cps l cf = (*cf = continuation func*)
match l with
| [] -> cf []
| hd::[] -> cf [hd]
| _ ->
let (left, right) = split_list ((List.length l)/2) l
in
sort_cps left (fun leftR -> sort_cps right (fun rightR -> cf (merge compare_fun leftR rightR)))
in
sort_cps l (fun x -> x);;
When I compile it, and run it with a 1,000,000 integers, it gives the error of stackoverflow. Why?
Edit
Here is the code I used for testing:
let compare_int x y =
if x > y then 1
else if x = y then 0
else -1;;
let create_list n =
Random.self_init ();
let rec create n' acc =
if n' = 0 then acc
else
create (n'-1) ((Random.int (n/2))::acc)
in
create n [];;
let l = create_list 1000000;;
let sl = mergeSort_cps compare_int l;;
in http://try.ocamlpro.com/, it gave this error: Exception: RangeError: Maximum call stack size exceeded.
in local ocaml top level, it didn't have any problem
Adding another answer to make a separate point: it seems that much of the confusion among answerers is caused by the fact that you don't use the standard OCaml compiler, but the TryOCaml website which runs a distinct OCaml backend, on top of javascript, and has therefore slightly different optimization and runtime characteristics.
I can reliably reproduce the fact that, on the TryOCaml website, the CPS-style function mergeSort_cps you show fails on lists of length 1_000_000 with the following error:
Exception: InternalError: too much recursion.
My analysis is that this is not due to a lack of tail-rec-ness, but by a lack of support, on the Javascript backend, of the non-obvious way in which the CPS-translated call is tailrec: recursion goes through a lambda-abstraction boundary (but still in tail position).
Turning the code in the direct, non-tail-rec version makes the problem go away:
let rec merge_sort compare = function
| [] -> []
| [hd] -> [hd]
| l ->
let (left, right) = split_list (List.length l / 2) l in
merge compare (merge_sort compare left) (merge_sort compare right);;
As I said in my other answer, this code has a logarithmic stack depth, so no StackOverflow will arise from its use (tail-rec is not everything). It is simpler code that the Javascript backend handles better.
Note that you can make it noticeably faster by using a better implementation split (still with your definition of merge) that avoids the double traversal of List.length then splitting:
let split li =
let rec split ls rs = function
| [] -> (ls, rs)
| x::xs -> split rs (x::ls) xs in
split [] [] li;;
let rec merge_sort compare = function
| [] -> []
| [hd] -> [hd]
| l ->
let (left, right) = split l in
merge compare (merge_sort compare left) (merge_sort compare right);;
Reading the comments, it seems that your Stack_overflow error is hard to reproduce.
Nevertheless, your code is not entirely in CPS or tail-recursive: in merge_sort, the calls to split_list and merge are made in a non-tail-call position.
The question is: by making your CPS transform and generous use of accumulators, what will be the worst stack depth related to recursion? Saving stack depth on the sort calls is in fact not very interesting: as each split the list in two, the worst stack depth would be O(log n) for n the size the input list.
On the contrary, split and merge would have made a linear O(n) usage of the stack if they weren't written in accumulator-passing style, so they are important to make tail-rec. As your implementation of those routines is tail-rec, there should be no need to worry about stack usage, and neither to convert the sort routine itself in CPS form that makes the code harder to read.
(Note that this logarithmic-decrease argument is specific to mergesort. A quicksort can have linear stack usage in worst case, so it could be important to make it tail-rec.)

Recursive anonymous functions in SML

Is it possible to write recursive anonymous functions in SML? I know I could just use the fun syntax, but I'm curious.
I have written, as an example of what I want:
val fact =
fn n => case n of
0 => 1
| x => x * fact (n - 1)
The anonymous function aren't really anonymous anymore when you bind it to a
variable. And since val rec is just the derived form of fun with no
difference other than appearance, you could just as well have written it using
the fun syntax. Also you can do pattern matching in fn expressions as well
as in case, as cases are derived from fn.
So in all its simpleness you could have written your function as
val rec fact = fn 0 => 1
| x => x * fact (x - 1)
but this is the exact same as the below more readable (in my oppinion)
fun fact 0 = 1
| fact x = x * fact (x - 1)
As far as I think, there is only one reason to use write your code using the
long val rec, and that is because you can easier annotate your code with
comments and forced types. For examples if you have seen Haskell code before and
like the way they type annotate their functions, you could write it something
like this
val rec fact : int -> int =
fn 0 => 1
| x => x * fact (x - 1)
As templatetypedef mentioned, it is possible to do it using a fixed-point
combinator. Such a combinator might look like
fun Y f =
let
exception BlackHole
val r = ref (fn _ => raise BlackHole)
fun a x = !r x
fun ta f = (r := f ; f)
in
ta (f a)
end
And you could then calculate fact 5 with the below code, which uses anonymous
functions to express the faculty function and then binds the result of the
computation to res.
val res =
Y (fn fact =>
fn 0 => 1
| n => n * fact (n - 1)
)
5
The fixed-point code and example computation are courtesy of Morten Brøns-Pedersen.
Updated response to George Kangas' answer:
In languages I know, a recursive function will always get bound to a
name. The convenient and conventional way is provided by keywords like
"define", or "let", or "letrec",...
Trivially true by definition. If the function (recursive or not) wasn't bound to a name it would be anonymous.
The unconventional, more anonymous looking, way is by lambda binding.
I don't see what unconventional there is about anonymous functions, they are used all the time in SML, infact in any functional language. Its even starting to show up in more and more imperative languages as well.
Jesper Reenberg's answer shows lambda binding; the "anonymous"
function gets bound to the names "f" and "fact" by lambdas (called
"fn" in SML).
The anonymous function is in fact anonymous (not "anonymous" -- no quotes), and yes of course it will get bound in the scope of what ever function it is passed onto as an argument. In any other cases the language would be totally useless. The exact same thing happens when calling map (fn x => x) [.....], in this case the anonymous identity function, is still in fact anonymous.
The "normal" definition of an anonymous function (at least according to wikipedia), saying that it must not be bound to an identifier, is a bit weak and ought to include the implicit statement "in the current environment".
This is in fact true for my example, as seen by running it in mlton with the -show-basis argument on an file containing only fun Y ... and the val res ..
val Y: (('a -> 'b) -> 'a -> 'b) -> 'a -> 'b
val res: int32
From this it is seen that none of the anonymous functions are bound in the environment.
A shorter "lambdanonymous" alternative, which requires OCaml launched
by "ocaml -rectypes":
(fun f n -> f f n)
(fun f n -> if n = 0 then 1 else n * (f f (n - 1))
7;; Which produces 7! = 5040.
It seems that you have completely misunderstood the idea of the original question:
Is it possible to write recursive anonymous functions in SML?
And the simple answer is yes. The complex answer is (among others?) an example of this done using a fix point combinator, not a "lambdanonymous" (what ever that is supposed to mean) example done in another language using features not even remotely possible in SML.
All you have to do is put rec after val, as in
val rec fact =
fn n => case n of
0 => 1
| x => x * fact (n - 1)
Wikipedia describes this near the top of the first section.
let fun fact 0 = 1
| fact x = x * fact (x - 1)
in
fact
end
This is a recursive anonymous function. The name 'fact' is only used internally.
Some languages (such as Coq) use 'fix' as the primitive for recursive functions, while some languages (such as SML) use recursive-let as the primitive. These two primitives can encode each other:
fix f => e
:= let rec f = e in f end
let rec f = e ... in ... end
:= let f = fix f => e ... in ... end
In languages I know, a recursive function will always get bound to a name. The convenient and conventional way is provided by keywords like "define", or "let", or "letrec",...
The unconventional, more anonymous looking, way is by lambda binding. Jesper Reenberg's answer shows lambda binding; the "anonymous" function gets bound to the names "f" and "fact" by lambdas (called "fn" in SML).
A shorter "lambdanonymous" alternative, which requires OCaml launched by "ocaml -rectypes":
(fun f n -> f f n)
(fun f n -> if n = 0 then 1 else n * (f f (n - 1))
7;;
Which produces 7! = 5040.

Recursive lambdas in F#

Take this example code (ignore it being horribly inefficient for the moment)
let listToString (lst:list<'a>) = ;;' prettify fix
let rec inner (lst:list<'a>) buffer = ;;' prettify fix
match List.length lst with
| 0 -> buffer
| _ -> inner (List.tl lst) (buffer + ((List.hd lst).ToString()))
inner lst ""
This is a common pattern I keep coming across in F#, I need to have an inner function who recurses itself over some value - and I only need this function once, is there in any way possible to call a lambda from within it self (some magic keyword or something) ? I would like the code to look something like this:
let listToString2 (lst:list<'a>) = ;;' prettify fix
( fun
(lst:list<'a>) buffer -> match List.length lst with ;;' prettify fix
| 0 -> buffer
| _ -> ##RECURSE## (List.tl lst) (buffer + ((List.hd lst).ToString()))
) lst ""
But as you might expect there is no way to refer to the anonymous function within itself, which is needed where I put ##RECURSE##
Yes, it's possible using so called y-combinators (or fixed-point combinators). Ex:
let rec fix f x = f (fix f) x
let fact f = function
| 0 -> 1
| x -> x * f (x-1)
let _ = (fix fact) 5 (* evaluates to "120" *)
I don't know articles for F# but this haskell entry might also be helpful.
But: I wouldn't use them if there is any alternative - They're quite hard to understand.
Your code (omit the type annotations here) is a standard construct and much more expressive.
let listToString lst =
let rec loop acc = function
| [] -> acc
| x::xs -> loop (acc ^ (string x)) xs
loop "" lst
Note that although you say you use the function only once, technically you refer to it by name twice, which is why it makes sense to give it a name.

Resources