Power with integer exponents in Isabelle

Here is my definition of power for integer exponents following this mailing-list post:
definition
"ipow x n = (if n < 0 then (1 / x) ^ n else x ^ n)"
notation ipow (infixr "^⇩i" 80)
Is there a better way to define it?
Is there an existing theory in Isabelle that already includes it so that I can reuse its results?
Context
I am dealing with complex exponentials, for instance consider this theorem:
after I proved it I realized I need to work with integers n not just naturals and this involves using powers to take out the n from the exponential.

I don't think something like this exists in the library. However, you have a typo in your definition. I believe you want something like
definition
"ipow x n = (if n < 0 then (1 / x) ^ nat (-n) else x ^ nat n)"
Apart from that, it is fine. You could write inverse x ^ nat (-n), but it should make little difference in practice. I would suggest the name int_power since the corresponding operation with natural exponents is called power.
Personally, I would avoid introducing a new constant like this because in order to actually use it productively, you also need an extensive collection of theorems around it. This means quite a bit of (tedious) work. Do you really need to talk about integers here? I find that one can often get around it in practice (in particular, note that the exponentials in question are periodic anyway).
It may be useful to introduce such a power operator nevertheless; all I'm saying is you should be aware of the trade-off.
Side note: An often overlooked function in Isabelle that is useful when talking about exponentials like this is cis (as in 'cosine + i · sine'). cis x is equivalent to exp(ix), where x is real.

Related

Math-operations (or hash functions) without order dependency

The problem I am thinking about is hash functions, although I'm mainly interested in the mathematical terms/background to describe my requested property.
Consider the case where I have a hash-function taking a secret (S) and a number (X) which creates another number (Y):
Hash : S, X → Y
I then define two different hash-functions with their own secrets (a and b):
H1(X) := Hash(a, X)
H2(X) := Hash(b, X)
The property I want is that:
H1(H2(X)) = H2(H1(X))
(I think this means that the functions commute?)
Taking a step back from programming and thinking about math, we can look at different operations. If the function consists of one operation only, then I'm quite sure that this property will always be satisfied if the operation is both associative and commutative. However, there are operations which are order insensitive but non-commutative, e.g. division. How do I know if my choice of hash function will make it commute?
Some examples that seem to work:
Simple addition:
Hash(S, X) := S + X
Bitwise xor:
Hash(S, X) := S xor X
Modular exponentiation:
Hash(S, X) := X^S mod p
if S ∈ N and X ∈ Z
How do I know if my choice of hash function will make it commute?
Commutativity under composition is an unusual property. It's not typical unless the functions apply a commutative operation of some underlying algebraic structure, such as "multiply by x". This is the form of your three examples.
The practical answer is "if you don't have a proof that it's commutative, assume it's not commutative". There's no general algorithm that will provide that proof for you.
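To make this concrete, here is a minimal sketch that brute-force checks the three examples from the question over a small sample, plus one counterexample. The helper names (`commutes`, `hash_sub`) and the modulus `P = 101` are assumptions chosen for illustration. Passing the check is only evidence, not a proof; failing it is a definite disproof.

```python
import itertools

P = 101  # small prime modulus, an arbitrary choice for this illustration


def hash_add(s, x):
    return s + x


def hash_xor(s, x):
    return s ^ x


def hash_modexp(s, x):
    return pow(x, s, P)


def hash_sub(s, x):
    # Hypothetical counterexample: a - (b - x) differs from b - (a - x) unless a == b.
    return s - x


def commutes(hash_fn, secrets, inputs):
    """True if Hash(a, Hash(b, x)) == Hash(b, Hash(a, x)) for every sampled a, b, x."""
    return all(
        hash_fn(a, hash_fn(b, x)) == hash_fn(b, hash_fn(a, x))
        for a, b in itertools.product(secrets, repeat=2)
        for x in inputs
    )


secrets = range(1, 8)
inputs = range(0, 50)
for fn in (hash_add, hash_xor, hash_modexp, hash_sub):
    print(fn.__name__, commutes(fn, secrets, inputs))
```

The three functions from the question pass because each reduces to a commutative operation on the secrets (a + b, a xor b, a * b in the exponent), while the subtraction variant fails.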

Convergence and vectors theories

Is there a convergence theory in Isabelle/HOL? I need to define ∥x(t)∥ ⟶ 0 as t ⟶ ∞.
Also, I'm looking for a theory of vectors. I found a matrix theory but couldn't find one for vectors. Does such a theory exist in Isabelle/HOL?
Cheers.
Convergence etc. are expressed with filters in Isabelle. (See the corresponding documentation)
In your case, that would be something like
filterlim (λt. norm (x t)) (nhds 0) at_top
or, using the tendsto abbreviation,
((λt. norm (x t)) ⤏ 0) at_top
where ⤏ is the Isabelle symbol \<longlongrightarrow>, which can be input using the abbreviation --->.
As a side note, I am wondering why you are writing it that way in the first place, seeing as it is equivalent to
filterlim x (nhds 0) at_top
or, with the tendsto syntax:
(x ⤏ 0) at_top
Reasoning with these filters can be tricky at first, but it has the advantage of providing a unified framework for limits and other topological concepts, and once you get the hang of it, it is very elegant.
As for vectors, just import ~~/src/HOL/Analysis/Analysis. That should have everything you need. Ideally, build the HOL-Analysis session image by starting Isabelle/jEdit with isabelle jedit -l HOL-Analysis. Then you won't have to process all of Isabelle's analysis library every time you start the system.
I assume that by 'vectors' you mean concrete finite-dimensional real vector spaces like ℝⁿ. This is provided by ~~/src/HOL/Analysis/Finite_Cartesian_Product.thy, which is part of HOL-Analysis. This provides the vec type, which takes two parameters: the component type (probably real in your case) and the index type, which specifies the dimension of the vector space.
There is also a pre-defined type n for every positive integer n, so that you can write e.g. (real, 3) vec for the vector space ℝ³. There is also type syntax so that you can write 'a ^ 'n for ('a, 'n) vec.

algorithm to find derivative

I'm writing a program in Python and I need to find the derivative of a function (a function expressed as a string).
For example: x^2+3*x
Its derivative is: 2*x+3
Are there any scripts available, or is there something helpful you can tell me?
If you are limited to polynomials (which appears to be the case), there would basically be three steps:
Parse the input string into a list of coefficients to x^n
Take that list of coefficients and convert them into a new list of coefficients according to the rules for deriving a polynomial.
Take the list of coefficients for the derivative and create a nice string describing the derivative polynomial function.
If you need to handle polynomials like a*x^15125 + x^2 + c, using a dict for the list of coefficients may make sense, but it requires a little more attention when iterating through this list.
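The three steps above can be sketched as follows. This is a minimal illustration, assuming integer coefficients, a single variable named x, and non-negative integer exponents; the function names are made up for this example. It uses a dict keyed by exponent, as suggested for sparse polynomials.

```python
def parse_poly(text):
    """Step 1: parse e.g. 'x^2+3*x-5' into a dict {exponent: coefficient}."""
    coeffs = {}
    # Turn every '-' into '+-' so that splitting on '+' keeps the sign with its term.
    terms = text.replace(" ", "").replace("-", "+-").split("+")
    for term in terms:
        if not term:
            continue  # empty piece produced by a leading sign
        if "x" in term:
            coef_part, _, exp_part = term.partition("x")
            coef_part = coef_part.rstrip("*")
            if coef_part in ("", "-"):
                coef = -1 if coef_part == "-" else 1
            else:
                coef = int(coef_part)
            exp = int(exp_part.lstrip("^")) if exp_part else 1
        else:
            coef, exp = int(term), 0
        coeffs[exp] = coeffs.get(exp, 0) + coef
    return coeffs


def differentiate(coeffs):
    """Step 2: power rule, c*x^n -> (c*n)*x^(n-1); constants vanish."""
    return {exp - 1: coef * exp for exp, coef in coeffs.items() if exp != 0 and coef != 0}


def format_poly(coeffs):
    """Step 3: render the coefficient dict back into a string."""
    parts = []
    for exp in sorted(coeffs, reverse=True):
        coef = coeffs[exp]
        if coef == 0:
            continue
        if exp == 0:
            body = str(abs(coef))
        else:
            var = "x" if exp == 1 else f"x^{exp}"
            body = var if abs(coef) == 1 else f"{abs(coef)}*{var}"
        parts.append(("-" if coef < 0 else "+", body))
    if not parts:
        return "0"
    out = ("-" if parts[0][0] == "-" else "") + parts[0][1]
    for sign, body in parts[1:]:
        out += sign + body
    return out


print(format_poly(differentiate(parse_poly("x^2+3*x"))))
```

Symbolic coefficients such as a or c would need a richer term representation than this sketch provides.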
sympy does it well.
You may find what you are looking for in the answers already provided. I, however, would like to give a short explanation on how to compute symbolic derivatives.
The business is based on operator overloading and the chain rule of derivatives. For instance, the derivative of v^n is n*v^(n-1)dv/dx, right? So, if you have v=3*x and n=3, what would the derivative be? The answer: if f(x)=(3*x)^3, then the derivative is:
f'(x)=3*(3*x)^2*(d/dx(3*x))=3*(3*x)^2*(3)=3^4*x^2
The chain rule allows you to "chain" the operation: each individual derivative is simple, and you just "chain" the complexity. Another example, the derivative of u*v is v*du/dx+u*dv/dx, right? If you get a complicated function, you just chain it, say:
d/dx(x^3*sin(x))
u=x^3; v=sin(x)
du/dx=3*x^2; dv/dx=cos(x)
d/dx(u*v)=v*du/dx+u*dv/dx=3*x^2*sin(x)+x^3*cos(x)
As you can see, differentiation is only a chain of simple operations.
Now, operator overloading.
If you can write a parser (try Pyparsing) then you can request it to evaluate both the function and derivative! I've done this (using Flex/Bison) just for fun, and it is quite powerful. For you to get the idea, the derivative is computed recursively by overloading the corresponding operator, and recursively applying the chain rule, so the evaluation of "*" would correspond to u*v for function value and u*der(v)+v*der(u) for derivative value (try it in C++, it is also fun).
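A minimal Python sketch of the operator-overloading idea follows. The `Dual` class and the helper names are assumptions made for this illustration, not code from the Flex/Bison project mentioned above. Each value carries its derivative along, and every overloaded operator applies the matching differentiation rule.

```python
import math


class Dual:
    """A value together with its derivative with respect to x."""

    def __init__(self, val, der=0.0):
        self.val = val
        self.der = der

    @staticmethod
    def lift(other):
        # Plain numbers are constants, so their derivative is 0.
        return other if isinstance(other, Dual) else Dual(other)

    def __add__(self, other):
        other = Dual.lift(other)
        return Dual(self.val + other.val, self.der + other.der)

    __radd__ = __add__

    def __mul__(self, other):
        other = Dual.lift(other)
        # Product rule: (u*v)' = u*v' + u'*v
        return Dual(self.val * other.val, self.val * other.der + self.der * other.val)

    __rmul__ = __mul__

    def __pow__(self, n):
        # Power rule with chain rule for a constant exponent n: (v^n)' = n*v^(n-1)*v'
        return Dual(self.val ** n, n * self.val ** (n - 1) * self.der)


def sin(u):
    u = Dual.lift(u)
    # Chain rule: sin(u)' = cos(u)*u'
    return Dual(math.sin(u.val), math.cos(u.val) * u.der)


def derivative(f, x):
    """Evaluate f at x with the seed dx/dx = 1 and read off the derivative part."""
    return f(Dual(x, 1.0)).der


print(derivative(lambda x: (3 * x) ** 3, 2.0))     # the (3*x)^3 example
print(derivative(lambda x: x ** 3 * sin(x), 2.0))  # the x^3*sin(x) example
```

This computes derivative values numerically at a point; producing a symbolic expression would require building an expression tree instead of numbers.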
So there you go, I know you don't mean to write your own parser - by all means use existing code (visit www.autodiff.org for automatic differentiation of Fortran and C/C++ code). But it is always interesting to know how this stuff works.
Cheers,
Juan
Better late than never?
I've always done symbolic differentiation in whatever language by working with a parse tree.
But I also recently became aware of another method using complex numbers.
The parse tree approach consists of translating the following tiny Lisp code into whatever language you like:
(defun diff (s x)
  (cond
    ((eq s x) 1)
    ((atom s) 0)
    ((or (eq (car s) '+) (eq (car s) '-))
     (list (car s)
           (diff (cadr s) x)
           (diff (caddr s) x)))
    ; ... and so on for multiplication, division, and basic functions
    ))
and following it with an appropriate simplifier, so you get rid of additions of 0, multiplying by 1, etc.
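As one possible translation, here is a small Python sketch of the same parse-tree approach. It assumes expressions are nested tuples of the form (op, left, right) with strings for variables and numbers for constants; that representation, the handling of `*`, and the tiny simplifier are choices made for this example.

```python
def simplify(s):
    """Remove additions of 0 and multiplications by 0 or 1; fold numeric constants."""
    if not isinstance(s, tuple):
        return s
    op, u, v = s
    if isinstance(u, (int, float)) and isinstance(v, (int, float)):
        return {"+": u + v, "-": u - v, "*": u * v}[op]
    if op == "+":
        if u == 0:
            return v
        if v == 0:
            return u
    if op == "-" and v == 0:
        return u
    if op == "*":
        if u == 0 or v == 0:
            return 0
        if u == 1:
            return v
        if v == 1:
            return u
    return s


def diff(s, x):
    """Differentiate the expression tree s with respect to the variable x."""
    if s == x:
        return 1
    if not isinstance(s, tuple):
        return 0  # a constant or a different variable
    op, u, v = s
    if op in ("+", "-"):
        return simplify((op, diff(u, x), diff(v, x)))
    if op == "*":
        # Product rule: (u*v)' = u'*v + u*v'
        return simplify(("+", simplify(("*", diff(u, x), v)), simplify(("*", u, diff(v, x)))))
    raise ValueError(f"unsupported operator: {op}")


# x*x + 3*x
print(diff(("+", ("*", "x", "x"), ("*", 3, "x")), "x"))
```

The result here is the tree for (x + x) + 3; merging x + x into 2*x would need a stronger simplifier than this sketch has.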
But the complex method, while completely numeric, has a certain magical quality. Instead of programming your computation F in double precision, do it in double precision complex.
Then, if you need the derivative of the computation with respect to variable X, set the imaginary part of X to a very small number h, like 1e-100.
Then do the calculation and get the result R.
Now real(R) is the result you would normally get, and imag(R)/h = dF/dX
to very high accuracy!
How does it work? Take the case of multiplying complex numbers:
(a+bi)(c+di) = ac + i(ad+bc) - bd
Now suppose the imaginary parts are all zero, except we want the derivative with respect to a.
We set b to a very small number h. Now what do we get?
(a+hi)(c) = ac + hci
So the real part of this is ac, as you would expect, and the imaginary part, divided by h, is c, which is the derivative of ac with respect to a.
The same sort of reasoning seems to apply to all the differentiation rules.
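The complex-step trick described above fits in a few lines of Python. This is a sketch under the assumption that the computation is written with operations that accept complex inputs (here `cmath.sin`); the function names are made up for this example, and h = 1e-100 is the value suggested in the text.

```python
import cmath
import math


def complex_step_derivative(f, x, h=1e-100):
    """Perturb x by the tiny imaginary amount h and read dF/dX off the imaginary part."""
    return f(complex(x, h)).imag / h


def F(x):
    # Example computation: x^3 * sin(x), written so it also works for complex x.
    return x * x * x * cmath.sin(x)


x0 = 2.0
numeric = complex_step_derivative(F, x0)
analytic = 3 * x0 ** 2 * math.sin(x0) + x0 ** 3 * math.cos(x0)
print(numeric, analytic)
```

Unlike an ordinary finite difference, no subtraction of nearly equal numbers takes place, which is why h can be made extremely small without losing accuracy.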
Symbolic Differentiation is an impressive introduction to the subject, at least for a non-specialist like me :) The code is written in C++, by the way.
Look up automatic differentiation. There are tools for Python. Also, this.
If you are thinking of writing the differentiation program from scratch, without utilizing other libraries as help, then the algorithm/approach of computing the derivative of any algebraic equation I described in my blog will be helpful.
You can try creating a class that will represent a limit rigorously and then evaluate it for (f(x)-f(a))/(x-a) as x approaches a. That should give a pretty accurate value of the limit.
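A rough numeric version of this idea, without a rigorous limit class, is to evaluate the difference quotient for x values that move closer and closer to a. The function name and the step schedule below are assumptions made for this sketch.

```python
def difference_quotients(f, a, start=1e-1, steps=6):
    """Return (f(x) - f(a)) / (x - a) for x = a + h, with h shrinking by 10x each step."""
    estimates = []
    h = start
    for _ in range(steps):
        x = a + h
        estimates.append((f(x) - f(a)) / (x - a))
        h /= 10
    return estimates


def f(x):
    return x ** 2 + 3 * x  # the example from the question; f'(1) = 5


for estimate in difference_quotients(f, 1.0):
    print(estimate)
```

The estimates approach 5 as h shrinks. Making h much smaller than this eventually makes things worse, because f(x) - f(a) then subtracts two nearly equal floating-point numbers.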
If you're using a string as input, you can separate the individual terms using the + or - character as a delimiter. You can then apply the power rule to each term: x^3 gives 3x^2, and for a more complicated term like a/(x^3), i.e. a*x^(-3), you can treat the other variables as constants, so differentiating gives -3a/(x^4). The power rule alone should be enough; however, it will require extensive use of factorization.
Unless you use an existing library, doing this is quite complex because you need to parse and handle functions and expressions.
Differentiation by itself is an easy task, since it's mechanical and can be done algorithmically, but you need a basic structure to store a function.

Is finding the equivalence of two functions undecidable?

Is it impossible to know if two functions are equivalent? For example, if a compiler writer wants to determine whether two functions that the developer has written perform the same operation, what methods can he use to figure that out? Or what can we do to find out that two TMs are identical? Is there a way to normalize the machines?
Edit: If the general case is undecidable, how much information do you need to have before you can correctly say that two functions are equivalent?
Given an arbitrary function, f, we define a function f' which returns 1 on input n if f halts on input n. Now, for some number x we define a function g which, on input n, returns 1 if n = x, and otherwise calls f'(n).
If functional equivalence were decidable, then deciding whether g is identical to f' decides whether f halts on input x. That would solve the Halting problem. Related to this discussion is Rice's theorem.
Conclusion: functional equivalence is undecidable.
There is some discussion going on below about the validity of this proof. So let me elaborate on what the proof does, and give some example code in Python.
The proof creates a function f' which on input n starts to compute f(n). When this computation finishes, f' returns 1. Thus, f'(n) = 1 iff f halts on input n, and f' doesn't halt on n iff f doesn't. Python:
def create_f_prime(f):
    def f_prime(n):
        f(n)
        return 1
    return f_prime
Then we create a function g which takes n as input, and compares it to some value x. If n = x, then g(n) = g(x) = 1, else g(n) = f'(n). Python:
def create_g(f_prime, x):
    def g(n):
        return 1 if n == x else f_prime(n)
    return g
Now the trick is, that for all n != x we have that g(n) = f'(n). Furthermore, we know that g(x) = 1. So, if g = f', then f'(x) = 1 and hence f(x) halts. Likewise, if g != f' then necessarily f'(x) != 1, which means that f(x) does not halt. So, deciding whether g = f' is equivalent to deciding whether f halts on input x. Using a slightly different notation for the above two functions, we can summarise all this as follows:
def halts(f, x):
    def f_prime(n): f(n); return 1
    def g(n): return 1 if n == x else f_prime(n)
    return equiv(f_prime, g) # If only equiv would actually exist...
I'll also toss in an illustration of the proof in Haskell (GHC performs some loop detection, and I'm not really sure whether the use of seq is fool proof in this case, but anyway):
-- Tells whether two functions f and g are equivalent.
equiv :: (Integer -> Integer) -> (Integer -> Integer) -> Bool
equiv f g = undefined -- If only this could be implemented :)

-- Tells whether f halts on input x
halts :: (Integer -> Integer) -> Integer -> Bool
halts f x = equiv f' g
  where
    f' n = f n `seq` 1
    g n = if n == x then 1 else f' n
Yes, it is undecidable. This is a form of the halting problem.
Note that I mean that it's undecidable for the general case. Just as you can determine halting for sufficiently simple programs, you can determine equivalency for sufficiently simple functions, and it's not inconceivable that this could be of some use for an application. But you cannot make a general method for determining equivalency of any two possible functions.
The general case is undecidable by Rice's Theorem, as others have already said (Rice's Theorem essentially says that any nontrivial property of a Turing-complete formalism is undecidable).
There are special cases where equivalence is decidable; the best-known example is probably equivalence of finite state automata. If I remember correctly, equivalence of (nondeterministic) pushdown automata is already undecidable, by reduction from Post's Correspondence Problem.
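For the finite-automaton case, one standard way to decide equivalence is to walk both machines in lockstep and look for a reachable pair of states where exactly one of them accepts. The sketch below assumes complete deterministic automata given as (start, accepting set, transition dict); that encoding and the names are choices made for this example.

```python
from collections import deque


def dfa_equivalent(dfa1, dfa2, alphabet):
    """Each DFA is (start, accepting, delta) with delta[(state, symbol)] -> state."""
    start1, accept1, delta1 = dfa1
    start2, accept2, delta2 = dfa2
    seen = {(start1, start2)}
    queue = deque(seen)
    while queue:
        p, q = queue.popleft()
        if (p in accept1) != (q in accept2):
            return False  # some input word leads here and only one machine accepts it
        for a in alphabet:
            nxt = (delta1[(p, a)], delta2[(q, a)])
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return True


# "Even number of 1s", once with 2 states and once with 4 states counting mod 4.
parity = {("e", "0"): "e", ("e", "1"): "o", ("o", "0"): "o", ("o", "1"): "e"}
even2 = ("e", {"e"}, parity)
odd2 = ("e", {"o"}, parity)
mod4 = {}
for i in range(4):
    mod4[(i, "0")] = i
    mod4[(i, "1")] = (i + 1) % 4
even4 = (0, {0, 2}, mod4)

print(dfa_equivalent(even2, even4, "01"))
print(dfa_equivalent(even2, odd2, "01"))
```

The search always terminates because there are only finitely many state pairs, which is exactly what fails for Turing machines.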
To prove that two given functions are equivalent you would require as input a proof of the equivalence in some formalism, which you can then check for correctness. The essential parts of this proof are the loop invariants, as these cannot be derived automatically.
In the general case it's undecidable whether two Turing machines always produce the same output for identical input. Since you can't even decide whether a TM will halt on a given input, I don't see how it should be possible to decide whether both halt AND output the same result...
It depends on what you mean by "function."
If the functions you are talking about are guaranteed to terminate -- for example, because they are written in a language in which all functions terminate -- and operate over finite domains, it's "easy" (although it might still take a very, very long time): two functions are equivalent if and only if they have the same value at every point in their shared domain.
This is called "extensional" equivalence to distinguish it from syntactic or "intensional" equivalence. Two functions are extensionally equivalent if they are intensionally equivalent, but the converse does not hold.
(All the other people above noting that it is undecidable in the general case are quite correct, of course, this is a fairly uncommon -- and usually uninteresting in practice -- special case.)
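For total functions over a finite domain, the exhaustive check described above is straightforward. A minimal sketch, assuming two-argument functions over 8-bit inputs (the function names and the domain are chosen for illustration):

```python
from itertools import product


def extensionally_equal(f, g, domain):
    """True if f and g agree at every point of the given finite domain."""
    return all(f(*args) == g(*args) for args in domain)


def byte_pairs():
    # All 65536 pairs of 8-bit values.
    return product(range(256), repeat=2)


def xor_builtin(a, b):
    return a ^ b


def xor_from_and_or(a, b):
    # Syntactically different, but the same function: bits set in exactly one of a, b.
    return (a | b) & ~(a & b)


def add(a, b):
    return a + b


print(extensionally_equal(xor_builtin, xor_from_and_or, byte_pairs()))
print(extensionally_equal(xor_builtin, add, byte_pairs()))
```

The cost grows with the size of the domain, so two 64-bit arguments already make this approach impractical.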
Note that the halting problem is decidable for linear bounded automata. Real computers are always bounded, and programs for them will always loop back to a previous configuration after sufficiently many steps. If you are using an unbounded (imaginary) computer to keep track of the configurations, you can detect that looping and take it into account.
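The loop-detection argument can be sketched as follows, assuming the machine is deterministic, has finitely many configurations, and is modelled by a `step` function that returns the next configuration or `None` once it halts. Both the modelling and the names are assumptions made for this illustration.

```python
def halts_bounded(step, start):
    """Decide halting by recording every configuration seen so far."""
    seen = set()
    config = start
    while config is not None:
        if config in seen:
            return False  # a configuration repeated, so the machine loops forever
        seen.add(config)
        config = step(config)
    return True


def step(c):
    # Toy 8-bit machine: halt at 0, otherwise add 2 modulo 256.
    return None if c == 0 else (c + 2) % 256


print(halts_bounded(step, 2))  # even values eventually wrap around to 0
print(halts_bounded(step, 1))  # odd values never reach 0
```

The `seen` set plays the role of the unbounded bookkeeping mentioned above: it may need to hold as many entries as the machine has configurations.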
You could check in your compiler to see if they are "exactly" identical, sure, but determining if they return identical values would be difficult and time consuming. You would have to basically call that method and perform its routine over an infinite number of possible calls and compare the value with that from the other routine.
Even if you could do the above, you would have to account for what global values change within the function, what objects are destroyed / changed in the function that do not affect the outcome.
You can really only compare the compiled code. So compile the compiled code to refactor?
Imagine the run time on trying to compile the code with "that" compiler. You could spend a LOT of time on here answering questions saying: "busy compiling..." :)
I think if you allow side effects, you can show that the problem can be morphed into the Post correspondence problem so you can't, in general, show if two functions are even capable of having the same side effects.
Is it impossible to know if two functions are equivalent?
No. It is possible to know that two functions are equivalent. If you have f(x), you know f(x) is equivalent to f(x).
If the question is "it is possible to determine if f(x) and g(x) are equivalent with f and g being any function and for all functions g and f", then the answer is no.
However, if the question is "can a compiler determine that if f(x) and g(x) are equivalent that they are equivalent?", then the answer is yes if they are equivalent in both output and side effects and order of side effects. In other words, if one is a transformation of the other that preserves behavior, then a compiler of sufficient complexity should be able to detect it. It also means that the compiler can transform a function f into a more optimal and equivalent function g given a particular definition of equivalent. It gets even more fun if f includes undefined behavior, because then g can also include undefined (but different) behavior!

A little diversion into floating point (im)precision, part 1

Most mathematicians agree that:
e^(πi) + 1 = 0
However, most floating point implementations disagree. How well can we settle this dispute?
I'm keen to hear about different languages and implementations, and various methods to make the result as close to zero as possible. Be creative!
It's not that most floating point implementations disagree, it's just that they cannot get the accuracy necessary to get a 100% answer. And the correct answer is that they can't.
PI is an infinite series of digits that nobody has been able to denote by anything other than a symbolic representation, and e^X is the same, and thus the only way to get to 100% accuracy is to go symbolic.
Here's a short list of implementations and languages I've tried. It's sorted by closeness to zero:
Scheme: (+ 1 (make-polar 1 (atan 0 -1)))
⇒ 0.0+1.2246063538223773e-16i (Chez Scheme, MIT Scheme)
⇒ 0.0+1.22460635382238e-16i (Guile)
⇒ 0.0+1.22464679914735e-16i (Chicken with numbers egg)
⇒ 0.0+1.2246467991473532e-16i (MzScheme, SISC, Gauche, Gambit)
⇒ 0.0+1.2246467991473533e-16i (SCM)
Common Lisp: (1+ (exp (complex 0 pi)))
⇒ #C(0.0L0 -5.0165576136843360246L-20) (CLISP)
⇒ #C(0.0d0 1.2246063538223773d-16) (CMUCL)
⇒ #C(0.0d0 1.2246467991473532d-16) (SBCL)
Perl: use Math::Complex; Math::Complex->emake(1, pi) + 1
⇒ 1.22464679914735e-16i
Python: from cmath import exp, pi; exp(complex(0, pi)) + 1
⇒ 1.2246467991473532e-16j (CPython)
Ruby: require 'complex'; Complex::polar(1, Math::PI) + 1
⇒ Complex(0.0, 1.22464679914735e-16) (MRI)
⇒ Complex(0.0, 1.2246467991473532e-16) (JRuby)
R: complex(argument = pi) + 1
⇒ 0+1.224606353822377e-16i
Is it possible to settle this dispute?
My first thought is to look to a symbolic language, like Maple. I don't think that counts as floating point though.
In fact, how does one represent i (or j for the engineers) in a conventional programming language?
Perhaps a better example is sin(π) = 0? (Or have I missed the point again?)
I agree with Ryan, you would need to move to another number representation system. The solution is outside the realm of floating point math because you need pi to be represented as an infinitely long decimal, so any limited precision scheme just isn't going to work (at least not without employing some kind of fudge-factor to make up the lost precision).
Your question seems a little odd to me, as you seem to be suggesting that the Floating Point math is implemented by the language. That's generally not true, as the FP math is done using a floating point processor in hardware. But software or hardware, floating point will always be inaccurate. That's just how floats work.
If you need better precision you need to use a different number representation. Just like if you're doing integer math on numbers that don't fit in an int or long. Some languages have libraries for that built in (I know java has BigInteger and BigDecimal), but you'd have to explicitly use those libraries instead of native types, and the performance would be (sometimes significantly) worse than if you used floats.
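As a small illustration of switching representation, Python's standard `fractions` module does exact rational arithmetic. This sketch only covers rational values; it does not make irrational numbers such as pi or e exact.

```python
from fractions import Fraction

# Binary floating point cannot represent 0.1, 0.2 or 0.3 exactly.
print(0.1 + 0.2 == 0.3)

# Exact rationals have no such rounding, at the cost of slower arithmetic.
print(Fraction(1, 10) + Fraction(2, 10) == Fraction(3, 10))

r = Fraction(2, 3)
s = 3 * r
t = s - 2
print(t)
```

As noted above, the exact type has to be used explicitly, and it is slower than native floats.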
@Ryan Fox: In fact, how does one represent i (or j for the engineers) in a conventional programming language?
Native complex data types are far from unknown. Fortran had them by the mid-sixties, and the OP exhibits a variety of other languages that support them in his follow-up.
And complex numbers can be added to other languages as libraries (with operator overloading they even look just like native types in the code).
But unless you provide a special case for this problem, the "non-agreement" is just an expression of imprecise machine arithmetic, no? It's like complaining that
float r = 2/3;
float s = 3*r;
float t = s - 2;
ends with (t != 0) (at least if you use a dumb enough compiler)...
I have had long coffee chats with my best pal about irrational numbers and how they differ from other numbers. Both of us agree on this point of view:
Irrational numbers are relations, like functions, in a way. In what way? Well, think about "if you want a perfect circle, give me a perfect pi". Circles are different from other figures (4 sides, 5, 6... 100, 200), but the more sides a figure has, the more it looks like a circle. If you have followed me so far, connecting all these ideas, here is the pi formula:
So, pi is a function, but one that never ends, because of the ∞ parameter. I like to think that you can have an "instance" of pi: if you replace the ∞ parameter with a very big Int, you will have a very big pi instance.
The same goes for e: give me a huge parameter, and I will give you a huge e.
Putting all the ideas together:
As we have memory limitations, the language and libraries provide us with huge instances of irrational numbers, in this case pi and e. As a final result, you get a close approach to 0, like the examples provided by @Chris Jester-Young.
In fact, how does one represent i (or j for the engineers) in a conventional programming language?
In a language that doesn't have a native representation, it is usually added using OOP to create a Complex class to represent i and j, with operator overloading to properly deal with operations involving other Complex numbers and or other number primitives native to the language.
E.g. Complex.java, C++ <complex>
Numerical Analysis teaches us that you can't rely on the precise value of small differences between large numbers.
This doesn't just affect the equation in question here, but can bring instability to everything from solving a near-singular set of simultaneous equations, through finding the zeros of polynomials, to evaluating log(~1) or exp(~0) (I have even seen special functions for evaluating log(x+1) and (exp(x)-1) to get round this).
I would encourage you not to think in terms of zeroing the difference -- you can't -- but rather in doing the associated calculations in such a way as to ensure the minimum error.
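Python's standard library has exactly such special functions, `math.log1p` and `math.expm1`. The short sketch below compares them with the naive formulas for a tiny argument; the value 1e-20 is an arbitrary choice for this illustration.

```python
import math

x = 1e-20

# Naive versions: 1 + x rounds to exactly 1.0, so all information about x is lost.
naive_log = math.log(1 + x)
naive_exp = math.exp(x) - 1

# Dedicated functions evaluate log(1 + x) and exp(x) - 1 without forming 1 + x.
stable_log = math.log1p(x)
stable_exp = math.expm1(x)

print(naive_log, stable_log)
print(naive_exp, stable_exp)
```

For small x both quantities are approximately x itself, which the dedicated functions reproduce while the naive versions return 0.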
I'm sorry, it's 43 years since I had this drummed into me at uni, and even if I could remember the references, I'm sure there's better stuff around now. I suggest this as a starting point.
If that sounds a bit patronising, I apologise. My "Numerical Analysis 101" was part of my Chemistry course, as there wasn't much CS in those days. I don't really have a feel for the place/importance numerical analysis has in a modern CS course.
It's a limitation of our current floating point computational architectures. Floating point arithmetic is only an approximation of numeric poles like e or pi (or anything beyond the precision your bits allow). I really enjoy these numbers because they defy classification, and appear to have greater entropy(?) than even primes, which are a canonical series. A ratio defies numerical representation; sometimes simple things like that can blow a person's mind (I love it).
Luckily entire languages and libraries can be dedicated to precision trigonometric functions by using notational concepts (similar to those described by Lasse V. Karlsen ).
Consider a library/language that describes concepts like e and pi in a form that a machine can understand. Does a machine have any notion of what a perfect circle is? Probably not, but we can create an object, circle, that satisfies all the known features we attribute to it (constant radius, relationship of radius to circumference is 2*pi*r = C). An object like pi is only described by the aforementioned ratio. r and C can be numeric objects described by whatever precision you want to give them. e can be defined as "the unique real number such that the value of the derivative (slope of the tangent line) of the function f(x) = e^x at the point x = 0 is exactly 1" (from Wikipedia).
Fun question.
