Understanding derivatives in Isabelle (meaning of `at .. within`)

This is a question about the notion of derivatives from the Isabelle libraries.
I am trying to understand what (f has_field_derivative D x) (at x within S) means. I know (at x within S) is a filter, but intuitively I was imagining that the following statement is true
lemma DERIV_at_within:
"(∀x ∈ S. (f has_field_derivative D x) (at x))
= (∀x. (f has_field_derivative D x) (at x within S))"
If it is not, how else should I interpret (at x within S) in the context of derivatives?

at x within A is the pointed neighbourhood of x, intersected with A. For instance, at_right is an abbreviation for at x within {x<..}, i.e. the right-neighbourhood of x. This allows you to express one-sided derivatives.
Occasionally, one also sees assumptions like ∀x∈{a..b}. (f has_field_derivative f' x) (at x within {a..b}). This means that f is differentiable with derivative f' between a and b, but the derivatives at the edges (i.e. at a and b) need only be one-sided.
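Informally (my paraphrase, not the library's literal definition), a derivative at the left endpoint a within {a..b} only constrains the limit of the difference quotient from the right:

\[ D = \lim_{h \to 0^{+}} \frac{f(a + h) - f(a)}{h} \]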
Note also that at x = at x within UNIV. Also, if A is an open set containing x, you simply have at x within A = at x.
Typically, you only really need has_field_derivative with at x within … if you want something like a one-sided limit (or, in higher dimensions, if you somehow want to constrain the direction of approach).


Introducing fixed representation for a quotient type in Isabelle

This question is better explained with an example. Suppose I want to prove the following lemma:
lemma int_inv: "(n::int) - (n::int) = (0::int)"
How I'd informally prove this is something along these lines:
Lemma: n - n = 0, for any integer n and 0 = abs_int(0,0).
Proof:
Let abs_int(a,b) = n for some fixed natural numbers a and b.
--- some complex and mind blowing argument here ---
That means it suffices to prove that a+b+0 = a+b+0, which is true by reflexivity.
QED.
However, I'm having trouble with the first step "Let abs_int(a,b) = n". The let statement doesn't seem to be made for this, as it only allows one term on the left side, so I'm at a loss as to how I could introduce the variables a and b as an arbitrary representation of n.
How may I introduce a fixed representation for a quotient type so I may use the variables in it?
Note: I know the statement above can be proved by auto, and the problem may be sidestepped by rewriting the lemma as "lemma int_inv: "Abs_integ(a,b) - Abs_integ(a,b) = (0::int)"". However, I'm looking specifically for a way to prove it by introducing an arbitrary representation in the proof.
You can introduce a concrete representation with the theorem int.abs_induct. However, you almost never want to do that manually.
The general method of proving statements about quotients is to first state an equivalent theorem about the underlying relation, and then use the transfer tool. It would've helped if your example wasn't automatically discharged by automation... in fact, let's create our own little int type so that it isn't:
theory Scratch
  imports Main
begin

quotient_type int = "nat × nat" / "intrel"
  morphisms Rep_Integ Abs_Integ
proof (rule equivpI)
  show "reflp intrel" by (auto simp: reflp_def)
  show "symp intrel" by (auto simp: symp_def)
  show "transp intrel" by (auto simp: transp_def)
qed

lift_definition sub :: "int ⇒ int ⇒ int"
  is "λ(x, y) (u, v). (x + v, y + u)"
  by auto

lift_definition zero :: "int" is "(0, 0)" .
Now, we have
lemma int_inv: "sub n n = zero"
apply transfer
proof (prove)
goal (1 subgoal):
1. ⋀n. intrel ((case n of (x, y) ⇒ λ(u, v). (x + v, y + u)) n) (0, 0)
So, the version we want to prove is
lemma int_inv': "intrel ((case n of (x, y) ⇒ λ(u, v). (x + v, y + u)) n) (0, 0)"
by (induct n) simp
Now we can transfer it with
lemma int_inv: "sub n n = zero"
by transfer (fact int_inv')
Note that the transfer proof method backtracks: it will try many possible transfers until one of them succeeds. Note, however, that this backtracking doesn't apply across separate apply commands. Thus you will always want to write a transfer proof as by transfer something_simple, instead of, say, proof transfer.
You can see the many possible versions with
apply transfer
back back back back back
Note also, that if your theorem mentions constants about int which weren't defined with lift_definition, you will need to prove a transfer rule for them separately. There are some examples of that here.
In general, after defining a quotient you will want to "forget" about its underlying construction as soon as possible, proving enough properties by transfer so that the rest can be proven without peeking into your type's construction.

membership proof

I need to prove the following:
lemma "m = min_list(x#xs) ⟹ m ∈ set (x#xs)"
In plain English, I need to prove that the return value from "min_list (x#xs)" is always a member of (x#xs)
I tried:
apply(induct xs)
apply(auto)
I also tried to reuse existing lemmas for the min_list by using:
find_theorems min_list
The sub-goal at this point is so long that I do not know how to proceed.
I am not looking for a full answer just hints on how to approach this lemma. Moreover, is this proof an easy one or significantly difficult one for someone just learning Isabelle?
Spoiler: it is possible to use the standard list induction and auto to prove the theorem, i.e. something similar to by (induct xs ...) (auto simp: ...). I deliberately left out sections of the proof for you to fill in on your own. You will need to think about whether any variables (i.e. m or x) need to be specified as arbitrary, and also understand what information the simplifier may need (look for clues in the specification of min_list in the theory List).
With regard to your question about the difficulty of the problem, I believe that difficulty is a function of experience. Most certainly, when I started learning Isabelle, I found it difficult to formalise proofs similar to the one in your question. After a certain amount of time spent coding in Isabelle (by the time of answering this question, I must have accrued the equivalent of 4-5 months of full-time coding in Isabelle), such problems no longer present a significant challenge for me. Of course, there are other factors that need to be taken into account, e.g. previous training in mathematics or logic and previous coding experience.
General advice from someone who is learning Isabelle on his own (the advice may not be consistent with the approach that is normally recommended by professional instructors)
I believe, when proving similar results, it is important to understand that Isabelle is, primarily, a tool for formalisation of 'pen-and-paper' proofs. Therefore, it is important to have the 'pen-and-paper' proof at hand before trying to formalise it. I would suggest the following general approach when attacking similar problems:
1. Write the proof on paper.
2. Formalise the proof using Isar, providing as many details as possible and not caring too much about the length of the proof. Also, try not to rely on the tools for automated reasoning (i.e. auto, blast, meson, metis, fastforce) and use direct methods like rule and intro as much as you can.
3. Once your Isar proof is complete, apply tools for automated reasoning (e.g. auto, blast) to your Isar proof to simplify it as much as possible.
Of course, eventually, it will become increasingly easy to omit steps 1 and 2 as you make progress in learning Isabelle.
I can provide further details, e.g. the complete short proof and the long Isar version of the proof.
UPDATE
As per your request in the comments, I provide an informal proof.
Lemma. m = min_list (x # xs) ⟹ m ∈ set (x # xs).
Remarks. For completeness, I also provide the definition of min_list and some comments about the const set. The definition of min_list can be found in the theory List:
fun min_list :: "'a::ord list ⇒ 'a" where
"min_list (x # xs) = (case xs of [] ⇒ x | _ ⇒ min x (min_list xs))"
The const set is defined implicitly and constitutes a part of the datatype infrastructure for list (see the document "Defining (Co)datatypes and Primitively (Co)recursive Functions in Isabelle/HOL" in the standard documentation of Isabelle). In particular, it is called the 'set function' of the datatype. Many basic properties of the const set can be found by inspection/search, e.g. find_theorems list.set. I believe that the theorem thm list.set is representative of the main properties of the const set (I took the liberty of renaming the schematic variables in the theorem):
set [] = {}
set (?x # ?xs) = insert ?x (set ?xs)
Proof. The proof is by structural induction on the list xs. The induction principle is stated as an unnamed lemma at the beginning of the theory List. For completeness, I restate the induction principle below:
"P [] ⟹ (⋀a list. P list ⟹ P (a # list)) ⟹ P list"
Base case: assume xs = [], show m = min_list (x # xs) ⟹ m ∈ set (x # xs) for all x. From the definition of min_list, it is trivial to see that min_list (x # []) = x. Similarly, set (x # []) = {x} can be shown directly from the properties of the const set. Substituting into the predicate above, it remains to show that m = x ⟹ m ∈ {x} for all x. This follows from basic set theory.
Inductive step: assume ⋀x. m = min_list (x # xs) ⟹ m ∈ set (x # xs), show m = min_list (a # x # xs) ⟹ m ∈ set (a # x # xs) for all a, x and xs. Fix a, x and xs. Assume m = min_list (a # x # xs). Then it remains to show that m ∈ set (a # x # xs). Given m = min_list (a # x # xs), from the definition of min_list, it is easy to infer that either m = a or m = min_list (x # xs). Consider these cases explicitly:
Case I: m = a. a ∈ set (a # x # xs) follows from the definitions. Then, m ∈ set (a # x # xs) by substitution.
Case II: m = min_list (x # xs). Then, from the assumption ⋀x. m = min_list (x # xs) ⟹ m ∈ set (x # xs) it follows that m ∈ set (x # xs). Thus, m ∈ set (a # x # xs) follows from the properties of set.
In all possible cases m ∈ set (a # x # xs), which is what was required to prove.
Thus, the proof is concluded.
Concluding thoughts. Try converting this informal proof to an Isar proof. Also, please note that the proof may not be ideal - I might make edits to the proof later.

If the order of growth of a process is `log3 a`, can we simplify it to `log a`?

I'm learning the book SICP, and for the exercise 1.15:
Exercise 1.15. The sine of an angle (specified in radians) can be computed by making use of the approximation sin x ≈ x if x is sufficiently small, and the trigonometric identity
sin x = 3 sin(x/3) - 4 sin³(x/3)
to reduce the size of the argument of sin. (For the purposes of this exercise an angle is considered "sufficiently small" if its magnitude is not greater than 0.1 radians.) These ideas are incorporated in the following procedures:
(define (cube x) (* x x x))
(define (p x) (- (* 3 x) (* 4 (cube x))))
(define (sine angle)
  (if (not (> (abs angle) 0.1))
      angle
      (p (sine (/ angle 3.0)))))
a. How many times is the procedure p applied when (sine 12.15) is evaluated?
b. What is the order of growth in space and number of steps (as a function of a) used by the process generated by the sine procedure when (sine a) is evaluated?
The answer I get for the "order of growth in number of steps" by myself is log3 a. But I found something that says the constants in the expression can be ignored, so it's the same as log a, which looks simpler.
I understand that 2n can be simplified to n, and 2n² + 1 can be simplified to n², but I'm not sure if this applies to log3 a too.
Yes, you can, since we're just interested in the order of the number of steps, not the exact number of steps.
Consider the formula for changing the base of a logarithm:
logb(x) = loga(x) / loga(b)
In other words you can rewrite log3(a) as:
log3(a) = log10(a) / log10(3)
Since log10(3) is just a constant, then for the order of growth we are only interested in the log10(a) term.
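To see the constant factor concretely, here is a small sketch in Scheme (assuming the standard log primitive, which returns the natural logarithm): a base-3 logarithm is just the built-in logarithm divided by a constant.

(define (log3 x)
  (/ (log x) (log 3)))

;; (log3 81) => approximately 4.0
;; Dividing by (log 3) is a constant factor, so it does not affect the order of growth.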

What does squaring a transformation mean?

I am trying to understand a solution that I read for an exercise that defines a logarithmic time procedure for computing the nth number in the Fibonacci sequence. The problem is 1.19 in Structure and Interpretation of Computer Programs (SICP).
SPOILER ALERT: The solution to this problem is discussed below.
Fib(n) can be calculated in linear time as follows: Start with a = 1 and b = 0. Fib(n) always equals the value of b. So initially, with n = 0, Fib(0) = 0. Each time the following transformation is applied, n is incremented by 1 and Fib(n) equals the value of b.
a <-- a + b
b <-- a
To do this in logarithmic time, the problem description defines a transformation T as the transformation
a' <-- bq + aq + ap
b' <-- bp + aq
where p = 0 and q = 1, initially, so that this transformation is the same as the one above.
Then applying the above transformation twice, the exercise guides us to express the new values a'' and b'' in terms of the original values of a and b.
a'' <-- b'q + a'q + a'p = (2pq + q^2)b + (2pq + q^2)a + (p^2 + q^2)a
b'' <-- b'p + a'q = (p^2 + q^2)b + (2pq + q^2)a
The exercise then refers to such application of applying a transformation twice as "squaring a transformation". Am I correct in my understanding?
The solution to this exercise applies the technique of using the value of squared transformations above to produce a solution that runs in logarithmic time. How does the problem run in logarithmic time? It seems to me that every time we use the result of applying a squared transformation, we need to do one transformation instead of two. So how do we successively cut the number of steps in half every time?
The solution from schemewiki.org is posted below:
(define (fib n)
  (fib-iter 1 0 0 1 n))

(define (fib-iter a b p q count)
  (cond ((= count 0) b)
        ((even? count)
         (fib-iter a
                   b
                   (+ (square p) (square q))
                   (+ (* 2 p q) (square q))
                   (/ count 2)))
        (else (fib-iter (+ (* b q) (* a q) (* a p))
                        (+ (* b p) (* a q))
                        p
                        q
                        (- count 1)))))

(define (square x) (* x x))
The exercise then refers to such application of applying a transformation twice as "squaring a transformation". Am I correct in my understanding?
Yes, squaring a transformation means applying it twice or (as is the case in the solution to this exercise) finding another transformation that is equivalent to applying it twice.
How does the problem run in logarithmic time? It seems to me that every time we use the result of applying a squared transformation, we need to do one transformation instead of two. So how do we successively cut the number of steps in half every time?
Squaring the given transformation enables us to cut down the number of steps because the values of p and q grow much faster in the squared transformation than they do in the original one. This is analogous to the way you can compute exponents using successive squaring much faster than by repeated multiplication.
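Concretely, reading the coefficients of b and a off the b'' expression in the question gives the parameters of the squared transformation:

p' = p^2 + q^2
q' = 2pq + q^2

These are exactly the values the code passes for p and q in the (even? count) branch of fib-iter, so one pass through that branch does the work of two applications of the original transformation.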
So how do we successively cut the number of steps in half every time?
This is in the code given. Whenever count is even, (/ count 2) is passed for count on the next iteration. No matter what value of n is passed in on the initial iteration, it will be even on alternating iterations (worst case).
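For example, calling (fib 10) gives count the successive values 10, 5, 4, 2, 1, 0: at worst, count is halved every other iteration, so the number of iterations grows like log n.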
You can read my blog post on SICP Exercise 1.19: Computing Fibonacci numbers if you want to see a step-by-step derivation of the squared transformation in this exercise.
Bill the Lizard provides a nice proof, but you are letting yourself be confused by what you think the words "twice" and "square" mean in relation to transformations.
a) Computing twice the term T--that is, two-times-T--is a case of multiplication. The process of multiplication is simply a process of incrementing T by a constant value at each step, where the constant value is the original term itself.
BUT by contrast:
b) The given Fibonacci transform is a process that requires the use of the most current state of the term T at each step of manipulation (as opposed to the use of a constant value). AND, the formula for the manipulation is not a simple increment but, in effect, a quadratic expression (i.e. it involves squaring at each successive step).
Like Bill says, this successive squaring effect will become very clear if you step through it in your debugger (I prefer to compute a few simple cases by hand whenever I get stuck somewhere).
Think of the process another way:
Suppose that, to reach your destination, you could cover the square of the current distance in the next step, while still somehow taking a constant amount of time to complete each step. You would get there far faster than if you took constant-sized steps, each in constant time.

How do you find the 6th root using primitive expressions in Scheme?

By primitive expressions, I mean + - * / sqrt, unless there are others that I am missing. I'm wondering how to write a Scheme expression that finds the 6th root using only these functions.
I know that I can find the cube root of the square root, but cube root doesn't appear to be a primitive expression.
Consider expt, passing in a fractional power as its second argument.
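For instance, a sketch assuming an implementation whose expt accepts a non-integer exponent (Racket and most full Schemes do):

(define (sixth-root x)
  (expt x 1/6))

;; (sixth-root 64) => 2 (or approximately 2.0, depending on the implementation)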
But let's say we didn't know about expt. Could we still compute it?
One way to do it is by applying something like Newton's method. For example, let's say we wanted to compute n^(1/4). Of course, we already know we can just take the sqrt twice to do this, but let's see how Newton's method may apply to this problem.
Given n, we'd like to discover roots x of the function:
f(x) = x^4 - n
Concretely, if we wanted to look for 16^(1/4), then we'd look for a root for the function:
f(x) = x^4 - 16
We already know if we plug in x=2 in there, we'll discover that 2 is a root of this function. But say that we didn't know that. How would we discover the x values that make this function zero?
Newton's method says that if we have a guess at x, call it x_0, we can improve that guess by doing the following process:
x_1 = x_0 - f(x_0) / f'(x_0)
where f'(x) is notation for the derivative of f(x). For the case above, the derivative of f(x) is 4x^3.
And we can get better guesses x_2, x_3, ... by repeating the computation:
x_2 = x_1 - f(x_1) / f'(x_1)
x_3 = x_2 - f(x_2) / f'(x_2)
...
until we get tired.
Let's write this all in code now:
(define (f x)
  (- (* x x x x) 16))

(define (f-prime x)
  (* 4 x x x))

(define (improve guess)
  (- guess (/ (f guess)
              (f-prime guess))))

(define approx-quad-root-of-16
  (improve (improve (improve (improve (improve 1.0))))))
The code above just expresses f(x), f'(x), and the idea of improving an initial guess five times. Let's see what the value of approx-quad-root-of-16 is:
> approx-quad-root-of-16
2.0457437305170534
Hey, cool. It's actually doing something, and it's close to 2. Not bad for starting with such a bad first guess of 1.0.
Of course, it's a little silly to hardcode 16 in there. Let's generalize, and turn it into a function that takes an arbitrary n instead, so we can compute the quad root of anything:
(define (approx-quad-root-of-n n)
  (define (f x)
    (- (* x x x x) n))
  (define (f-prime x)
    (* 4 x x x))
  (define (improve guess)
    (- guess (/ (f guess)
                (f-prime guess))))
  (improve (improve (improve (improve (improve 1.0))))))
Does this do something effective? Let's see:
> (approx-quad-root-of-n 10)
1.7800226459895
> (expt (approx-quad-root-of-n 10) 4)
10.039269440807693
Cool: it is doing something useful. But note that it's not that precise yet. To get better precision, we should keep calling improve, and not just four or five times. Think loops or recursion: repeat the improvement till the solution is "close enough".
This is a sketch of how to solve these kinds of problems. For a little more detail, look at the section on computing square roots in Structure and Interpretation of Computer Programs.
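To tie this back to the original question, here is a sketch (my own, not part of the answer above) that uses only the listed primitives: the sixth root is the cube root of the square root, and the cube root can be found with the same kind of Newton iteration, repeated until the guess is close enough. The helper names and the 1e-10 tolerance are arbitrary choices.

(define (cube-root n)
  ;; Newton's method for f(x) = x^3 - n, with f'(x) = 3x^2,
  ;; iterated until the residual is small.
  (define (f x) (- (* x x x) n))
  (define (f-prime x) (* 3 x x))
  (define (improve guess)
    (- guess (/ (f guess) (f-prime guess))))
  (define (good-enough? guess)
    (< (abs (f guess)) 1e-10))
  (define (iterate guess)
    (if (good-enough? guess)
        guess
        (iterate (improve guess))))
  (iterate 1.0))

(define (sixth-root n)
  ;; x^(1/6) = (x^(1/2))^(1/3), so only sqrt and the arithmetic primitives are needed.
  (cube-root (sqrt n)))

;; (sixth-root 64) => approximately 2.0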
You may want to try out a numeric way, which may be inefficient for larger numbers, but it works.
Also, if you count pow as a primitive (since you also count sqrt), you could do this:
pow(yournum, 1.0/6.0);
