Isabelle book Exercise 2.11: Transforming expressions to polynomial form - isabelle

my question is concerning Exercise 2.11 in the book Concrete Semantics (http://concrete-semantics.org/):
Define arithmetic expressions in one variable over integers
(type int) as a data type:
datatype exp = Var | Const int | Add exp exp | Mult exp exp
Define a function eval :: exp => int => int such that eval e x evaluates e at
the value x.
A polynomial can be represented as a list of coefficients, starting with the
constant. For example, [4, 2, -1, 3] represents the polynomial 4+2x-x^2+3x^3.
Define a function evalp :: int list => int => int that evaluates a polynomial at
the given value. Define a function coeffs :: exp => int list that transforms an
expression into a polynomial. This may require auxiliary functions. Prove that
coeffs preserves the value of the expression: evalp (coeffs e) x = eval e x.
---end
It's all pretty straightforward until you get to coeffs. We would have to deal with expressions like (X + X)*(2*X + 3*X*X) which have to be recursively expanded bottom-up using a distributive law until its in polynomial form. The resulting expression might still be something like (X*X + X*2*X + 3*X*X + 4*X*X*X) so then its necessary to normalize product terms (so eg X*2*X becomes 2*X*X), collect together like terms, and finally order them in order of increasing degree! This just seems significantly more complicated than any of the exercises so far that I wonder if I'm missing something or overly complicating it.

I think this exercise is considerably easier than you think. You can write a single primitively-recursive function coeffs that does the job: the coefficients of Var are [0,1], the coefficients of Const c are [c]. Similarly, if you have two subexpressions and you know their coefficients, you can combine those two coefficient lists into a single list for addition/multiplication.
For that, you should ideally write two auxiliary functions add_coeffs and mult_coeffs which add and multiply two lists of coefficients. (the latter will probably make use of the former)
You will have to prove that add_coeffs and mult_coeffs do the right thing (w.r.t. eval and evalp). The resulting lemmas also make good [simp] rules.
The proofs are all simple inductions where each case is automatic.
As a general rule: a good definition often makes the difference between a long and tedious proof and a straight-forward or even completely automatic proof. Doing a long-winded expansion and then grouping summands etc. as you suggested in your question is sure to lead to a tedious proof.
Of course, the method that I suggested in this answer is not very efficient, but when you want to do things in a theorem prover, efficiency is usually not a big concern – you want things to be simple and elegant and amenable to nice proofs. If you need efficient code, you can still develop your nice and simple abstract formulation into something more efficient later and show equivalence.

Related

Convergence and vectors theories

Is there a convergence theory in Isabelle/HOL? I need to define ∥x(t)∥ ⟶ 0 as t ⟶ ∞.
Also, I'm looking for vectors theory, I found a matrix theory but I couldn't find the vectors one, Is there exist such theory in Isabelle/HOL?
Cheers.
Convergence etc. are expressed with filters in Isabelle. (See the corresponding documentation)
In your case, that would be something like
filterlim (λt. norm (x t)) (nhds 0) at_top
or, using the tendsto abbreviation,
((λt. norm (x t)) ⤏ 0) at_top
where ⤏ is the Isabelle symbol \<longlongrightarrow>, which can be input using the abbreviation --->.
As a side note, I am wondering why you are writing it that way in the first place, seeing as it is equivalent to
filterlim x (nhds 0) at_top
or, with the tendsto syntax:
(x ⤏ 0) at_top
Reasoning with these filters can be tricky at first, but it has the advantage of providing a unified framework for limits and other topological concepts, and once you get the hang of it, it is very elegant.
As for vectors, just import ~~/src/HOL/Analysis/Analysis. That should have everything you need. Ideally, build the HOL-Analysis session image by starting Isabelle/jEdit with isabelle jedit -l HOL-Analysis. Then you won't have to process all of Isabelle's analysis library every time you start the system.
I assume that by ‘vectors’ you mean concrete finite-dimensional real vector spaces like ℝn. This is provided by ~~/src/HOL/Analysis/Finite_Cartesian_Product.thy, which is part of HOL-Analysis. This provides the vec type, which takes two parameters: the component type (probably real in your case) and the index type, which specifies the dimension of the vector space.
There is also a pre-defined type n for every positive integer n, so that you can write e.g. (real, 3) vec for the vector space ℝ³. There is also type syntax so that you can write 'a ^ 'n for ('a, 'n) vec.

In pure functional languages, is data (strings, ints, floats.. ) also just functions?

I was thinking about pure Object Oriented Languages like Ruby, where everything, including numbers, int, floats, and strings are themselves objects. Is this the same thing with pure functional languages? For example, in Haskell, are Numbers and Strings also functions?
I know Haskell is based on lambda calculus which represents everything, including data and operations, as functions. It would seem logical to me that a "purely functional language" would model everything as a function, as well as keep with the definition that a function most always returns the same output with the same inputs and has no state.
It's okay to think about that theoretically, but...
Just like in Ruby not everything is an object (argument lists, for instance, are not objects), not everything in Haskell is a function.
For more reference, check out this neat post: http://conal.net/blog/posts/everything-is-a-function-in-haskell
#wrhall gives a good answer. However you are somewhat correct that in the pure lambda calculus it is consistent for everything to be a function, and the language is Turing-complete (capable of expressing any pure computation that Haskell, etc. is).
That gives you some very strange things, since the only thing you can do to anything is to apply it to something else. When do you ever get to observe something? You have some value f and want to know something about it, your only choice is to apply it some value x to get f x, which is another function and the only choice is to apply it to another value y, to get f x y and so on.
Often I interpret the pure lambda calculus as talking about transformations on things that are not functions, but only capable of expressing functions itself. That is, I can make a function (with a bit of Haskelly syntax sugar for recursion & let):
purePlus = \zero succ natCase ->
let plus = \m n -> natCase m n (\m' -> plus m' n)
in plus (succ (succ zero)) (succ (succ zero))
Here I have expressed the computation 2+2 without needing to know that there are such things as non-functions. I simply took what I needed as arguments to the function I was defining, and the values of those arguments could be church encodings or they could be "real" numbers (whatever that means) -- my definition does not care.
And you could think the same thing of Haskell. There is no particular reason to think that there are things which are not functions, nor is there a particular reason to think that everything is a function. But Haskell's type system at least prevents you from applying an argument to a number (anybody thinking about fromInteger right now needs to hold their tongue! :-). In the above interpretation, it is because numbers are not necessarily modeled as functions, so you can't necessarily apply arguments to them.
In case it isn't clear by now, this whole answer has been somewhat of a technical/philosophical digression, and the easy answer to your question is "no, not everything is a function in functional languages". Functions are the things you can apply arguments to, that's all.
The "pure" in "pure functional" refers to the "freedom from side effects" kind of purity. It has little relation to the meaning of "pure" being used when people talk about a "pure object-oriented language", which simply means that the language manipulates purely (only) in objects.
The reason is that pure-as-in-only is a reasonable distinction to use to classify object-oriented languages, because there are languages like Java and C++, which clearly have values that don't have all that much in common with objects, and there are also languages like Python and Ruby, for which it can be argued that every value is an object1
Whereas for functional languages, there are no practical languages which are "pure functional" in the sense that every value the language can manipulate is a function. It's certainly possible to program in such a language. The most basic versions of the lambda calculus don't have any notion of things that are not functions, but you can still do arbitrary computation with them by coming up with ways of representing the things you want to compute on as functions.2
But while the simplicity and minimalism of the lambda calculus tends to be great for proving things about programming, actually writing substantial programs in such a "raw" programming language is awkward. The function representation of basic things like numbers also tends to be very inefficient to implement on actual physical machines.
But there is a very important distinction between languages that encourage a functional style but allow untracked side effects anywhere, and ones that actually enforce that your functions are "pure" functions (similar to mathematical functions). Object-oriented programming is very strongly wed to the use of impure computations3, so there are no practical object-oriented programming languages that are pure in this sense.
So the "pure" in "pure functional language" means something very different from the "pure" in "pure object-oriented language".4 In each case the "pure vs not pure" distinction is one that is completely uninteresting applied to the other kind of language, so there's no very strong motive to standardise the use of the term.
1 There are corner cases to pick at in all "pure object-oriented" languages that I know of, but that's not really very interesting. It's clear that the object metaphor goes much further in languages in which 1 is an instance of some class, and that class can be sub-classed, than it does in languages in which 1 is something else than an object.
2 All computation is about representation anyway. Computers don't know anything about numbers or anything else. They just have bit-patterns that we use to represent numbers, and operations on bit-patterns that happen to correspond to operations on numbers (because we designed them so that they would).
3 This isn't fundamental either. You could design a "pure" object-oriented language that was pure in this sense. I tend to write most of my OO code to be pure anyway.
4 If this seems obtuse, you might reflect that the terms "functional", "object", and "language" have vastly different meanings in other contexts also.
A very different angle on this question: all sorts of data in Haskell can be represented as functions, using a technique called Church encodings. This is a form of inversion of control: instead of passing data to functions that consume it, you hide the data inside a set of closures, and to consume it you pass in callbacks describing what to do with this data.
Any program that uses lists, for example, can be translated into a program that uses functions instead of lists:
-- | A list corresponds to a function of this type:
type ChurchList a r = (a -> r -> r) --^ how to handle a cons cell
-> r --^ how to handle the empty list
-> r --^ result of processing the list
listToCPS :: [a] -> ChurchList a r
listToCPS xs = \f z -> foldr f z xs
That function is taking a concrete list as its starting point, but that's not necessary. You can build up ChurchList functions out of just pure functions:
-- | The empty 'ChurchList'.
nil :: ChurchList a r
nil = \f z -> z
-- | Add an element at the front of a 'ChurchList'.
cons :: a -> ChurchList a r -> ChurchList a r
cons x xs = \f z -> f z (xs f z)
foldChurchList :: (a -> r -> r) -> r -> ChurchList a r -> r
foldChurchList f z xs = xs f z
mapChurchList :: (a -> b) -> ChurchList a r -> ChurchList b r
mapChurchList f = foldChurchList step nil
where step x = cons (f x)
filterChurchList :: (a -> Bool) -> ChurchList a r -> ChurchList a r
filterChurchList pred = foldChurchList step nil
where step x xs = if pred x then cons x xs else xs
That last function uses Bool, but of course we can replace Bool with functions as well:
-- | A Bool can be represented as a function that chooses between two
-- given alternatives.
type ChurchBool r = r -> r -> r
true, false :: ChurchBool r
true a _ = a
false _ b = b
filterChurchList' :: (a -> ChurchBool r) -> ChurchList a r -> ChurchList a r
filterChurchList' pred = foldChurchList step nil
where step x xs = pred x (cons x xs) xs
This sort of transformation can be done for basically any type, so in theory, you could get rid of all "value" types in Haskell, and keep only the () type, the (->) and IO type constructors, return and >>= for IO, and a suitable set of IO primitives. This would obviously be hella impractical—and it would perform worse (try writing tailChurchList :: ChurchList a r -> ChurchList a r for a taste).
Is getChar :: IO Char a function or not? Haskell Report doesn't provide us with a definition. But it states that getChar is a function (see here). (Well, at least we can say that it is a function.)
So I think the answer is YES.
I don't think there can be correct definition of "function" except "everything is a function". (What is "correct definition"? Good question...) Consider the next example:
{-# LANGUAGE NoMonomorphismRestriction #-}
import Control.Applicative
f :: Applicative f => f Int
f = pure 1
g1 :: Maybe Int
g1 = f
g2 :: Int -> Int
g2 = f
Is f a function or datatype? It depends.

New to OCaml: How would I go about implementing Gaussian Elimination?

I'm new to OCaml, and I'd like to implement Gaussian Elimination as an exercise. I can easily do it with a stateful algorithm, meaning keep a matrix in memory and recursively operating on it by passing around a reference to it.
This statefulness, however, smacks of imperative programming. I know there are capabilities in OCaml to do this, but I'd like to ask if there is some clever functional way I haven't thought of first.
OCaml arrays are mutable, and it's hard to avoid treating them just like arrays in an imperative language.
Haskell has immutable arrays, but from my (limited) experience with Haskell, you end up switching to monadic, mutable arrays in most cases. Immutable arrays are probably amazing for certain specific purposes. I've always imagined you could write a beautiful implementation of dynamic programming in Haskell, where the dependencies among array entries are defined entirely by the expressions in them. The key is that you really only need to specify the contents of each array entry one time. I don't think Gaussian elimination follows this pattern, and so it seems it might not be a good fit for immutable arrays. It would be interesting to see how it works out, however.
You can use a Map to emulate a matrix. The key would be a pair of integers referencing the row and column. You'll want to use your own get x y function to ensure x < n and y < n though, instead of accessing the Map directly. (edit) You can use the compare function in Pervasives directly.
module OrderedPairs = struct
type t = int * int
let compare = Pervasives.compare
end
module Pairs = Map.Make (OrderedPairs)
let get_ n set x y =
assert( x < n && y < n );
Pairs.find (x,y) set
let set_ n set x y v =
assert( x < n && y < n );
Pairs.add (x,y) set v
Actually, having a general set of functions (get x y and set x y at a minimum), without specifying the implementation, would be an even better option. The functions then can be passed to the function, or be implemented in a module through a functor (a better solution, but having a set of functions just doing what you need would be a first step since you're new to OCaml). In this way you can use a Map, Array, Hashtbl, or a set of functions to access a file on the hard-drive to implement the matrix if you wanted. This is the really important aspect of functional programming; that you trust the interface over exploiting the side-effects, and not worry about the underlying implementation --since it's presumed to be pure.
The answers so far are using/emulating mutable data-types, but what does a functional approach look like?
To see, let's decompose the problem into some functional components:
Gaussian elimination involves a sequence of row operations, so it is useful first to define a function taking 2 rows and scaling factors, and returning the resultant row operation result.
The row operations we want should eliminate a variable (column) from a particular row, so lets define a function which takes a pair of rows and a column index and uses the previously defined row operation to return the modified row with that column entry zero.
Then we define two functions, one to convert a matrix into triangular form, and another to back-substitute a triangular matrix to the diagonal form (using the previously defined functions) by eliminating each column in turn. We could iterate or recurse over the columns, and the matrix could be defined as a list, vector or array of lists, vectors or arrays. The input is not changed, but a modified matrix is returned, so we can finally do:
let out_matrix = to_diagonal (to_triangular in_matrix);
What makes it functional is not whether the data-types (array or list) are mutable, but how they they are used. This approach may not be particularly 'clever' or be the most efficient way to do Gaussian eliminations in OCaml, but using pure functions lets you express the algorithm cleanly.

Derivative Calculator

I'm interested in building a derivative calculator. I've racked my brains over solving the problem, but I haven't found a right solution at all. May you have a hint how to start? Thanks
I'm sorry! I clearly want to make symbolic differentiation.
Let's say you have the function f(x) = x^3 + 2x^2 + x
I want to display the derivative, in this case f'(x) = 3x^2 + 4x + 1
I'd like to implement it in objective-c for the iPhone.
I assume that you're trying to find the exact derivative of a function. (Symbolic differentiation)
You need to parse the mathematical expression and store the individual operations in the function in a tree structure.
For example, x + sin²(x) would be stored as a + operation, applied to the expression x and a ^ (exponentiation) operation of sin(x) and 2.
You can then recursively differentiate the tree by applying the rules of differentiation to each node. For example, a + node would become the u' + v', and a * node would become uv' + vu'.
you need to remember your calculus. basically you need two things: table of derivatives of basic functions and rules of how to derivate compound expressions (like d(f + g)/dx = df/dx + dg/dx). Then take expressions parser and recursively go other the tree. (http://www.sosmath.com/tables/derivative/derivative.html)
Parse your string into an S-expression (even though this is usually taken in Lisp context, you can do an equivalent thing in pretty much any language), easiest with lex/yacc or equivalent, then write a recursive "derive" function. In OCaml-ish dialect, something like this:
let rec derive var = function
| Const(_) -> Const(0)
| Var(x) -> if x = var then Const(1) else Deriv(Var(x), Var(var))
| Add(x, y) -> Add(derive var x, derive var y)
| Mul(a, b) -> Add(Mul(a, derive var b), Mul(derive var a, b))
...
(If you don't know OCaml syntax - derive is two-parameter recursive function, with first parameter the variable name, and the second being mathched in successive lines; for example, if this parameter is a structure of form Add(x, y), return the structure Add built from two fields, with values of derived x and derived y; and similarly for other cases of what derive might receive as a parameter; _ in the first pattern means "match anything")
After this you might have some clean-up function to tidy up the resultant expression (reducing fractions etc.) but this gets complicated, and is not necessary for derivation itself (i.e. what you get without it is still a correct answer).
When your transformation of the s-exp is done, reconvert the resultant s-exp into string form, again with a recursive function
SLaks already described the procedure for symbolic differentiation. I'd just like to add a few things:
Symbolic math is mostly parsing and tree transformations. ANTLR is a great tool for both. I'd suggest starting with this great book Language implementation patterns
There are open-source programs that do what you want (e.g. Maxima). Dissecting such a program might be interesting, too (but it's probably easier to understand what's going on if you tried to write it yourself, first)
Probably, you also want some kind of simplification for the output. For example, just applying the basic derivative rules to the expression 2 * x would yield 2 + 0*x. This can also be done by tree processing (e.g. by transforming 0 * [...] to 0 and [...] + 0 to [...] and so on)
For what kinds of operations are you wanting to compute a derivative? If you allow trigonometric functions like sine, cosine and tangent, these are probably best stored in a table while others like polynomials may be much easier to do. Are you allowing for functions to have multiple inputs,e.g. f(x,y) rather than just f(x)?
Polynomials in a single variable would be my suggestion and then consider adding in trigonometric, logarithmic, exponential and other advanced functions to compute derivatives which may be harder to do.
Symbolic differentiation over common functions (+, -, *, /, ^, sin, cos, etc.) ignoring regions where the function or its derivative is undefined is easy. What's difficult, perhaps counterintuitively, is simplifying the result afterward.
To do the differentiation, store the operations in a tree (or even just in Polish notation) and make a table of the derivative of each of the elementary operations. Then repeatedly apply the chain rule and the elementary derivatives, together with setting the derivative of a constant to 0. This is fast and easy to implement.

algorithm to find derivative

I'm writing program in Python and I need to find the derivative of a function (a function expressed as string).
For example: x^2+3*x
Its derivative is: 2*x+3
Are there any scripts available, or is there something helpful you can tell me?
If you are limited to polynomials (which appears to be the case), there would basically be three steps:
Parse the input string into a list of coefficients to x^n
Take that list of coefficients and convert them into a new list of coefficients according to the rules for deriving a polynomial.
Take the list of coefficients for the derivative and create a nice string describing the derivative polynomial function.
If you need to handle polynomials like a*x^15125 + x^2 + c, using a dict for the list of coefficients may make sense, but require a little more attention when doing the iterations through this list.
sympy does it well.
You may find what you are looking for in the answers already provided. I, however, would like to give a short explanation on how to compute symbolic derivatives.
The business is based on operator overloading and the chain rule of derivatives. For instance, the derivative of v^n is n*v^(n-1)dv/dx, right? So, if you have v=3*x and n=3, what would the derivative be? The answer: if f(x)=(3*x)^3, then the derivative is:
f'(x)=3*(3*x)^2*(d/dx(3*x))=3*(3*x)^2*(3)=3^4*x^2
The chain rule allows you to "chain" the operation: each individual derivative is simple, and you just "chain" the complexity. Another example, the derivative of u*v is v*du/dx+u*dv/dx, right? If you get a complicated function, you just chain it, say:
d/dx(x^3*sin(x))
u=x^3; v=sin(x)
du/dx=3*x^2; dv/dx=cos(x)
d/dx=v*du+u*dv
As you can see, differentiation is only a chain of simple operations.
Now, operator overloading.
If you can write a parser (try Pyparsing) then you can request it to evaluate both the function and derivative! I've done this (using Flex/Bison) just for fun, and it is quite powerful. For you to get the idea, the derivative is computed recursively by overloading the corresponding operator, and recursively applying the chain rule, so the evaluation of "*" would correspond to u*v for function value and u*der(v)+v*der(u) for derivative value (try it in C++, it is also fun).
So there you go, I know you don't mean to write your own parser - by all means use existing code (visit www.autodiff.org for automatic differentiation of Fortran and C/C++ code). But it is always interesting to know how this stuff works.
Cheers,
Juan
Better late than never?
I've always done symbolic differentiation in whatever language by working with a parse tree.
But I also recently became aware of another method using complex numbers.
The parse tree approach consists of translating the following tiny Lisp code into whatever language you like:
(defun diff (s x)(cond
((eq s x) 1)
((atom s) 0)
((or (eq (car s) '+)(eq (car s) '-))(list (car s)
(diff (cadr s) x)
(diff (caddr s) x)
))
; ... and so on for multiplication, division, and basic functions
))
and following it with an appropriate simplifier, so you get rid of additions of 0, multiplying by 1, etc.
But the complex method, while completely numeric, has a certain magical quality. Instead of programming your computation F in double precision, do it in double precision complex.
Then, if you need the derivative of the computation with respect to variable X, set the imaginary part of X to a very small number h, like 1e-100.
Then do the calculation and get the result R.
Now real(R) is the result you would normally get, and imag(R)/h = dF/dX
to very high accuracy!
How does it work? Take the case of multiplying complex numbers:
(a+bi)(c+di) = ac + i(ad+bc) - bd
Now suppose the imaginary parts are all zero, except we want the derivative with respect to a.
We set b to a very small number h. Now what do we get?
(a+hi)(c) = ac + hci
So the real part of this is ac, as you would expect, and the imaginary part, divided by h, is c, which is the derivative of ac with respect to a.
The same sort of reasoning seems to apply to all the differentiation rules.
Symbolic Differentiation is an impressive introduction to the subject-at least for non-specialist like me :) The code is written in C++ btw.
Look up automatic differentiation. There are tools for Python. Also, this.
If you are thinking of writing the differentiation program from scratch, without utilizing other libraries as help, then the algorithm/approach of computing the derivative of any algebraic equation I described in my blog will be helpful.
You can try creating a class that will represent a limit rigorously and then evaluate it for (f(x)-f(a))/(x-a) as x approaches a. That should give a pretty accurate value of the limit.
if you're using string as an input, you can separate individual terms using + or - char as a delimiter, which will give you individual terms. Now you can use power rule to solve for each term, say you have x^3 which using power rule will give you 3x^2, or suppose you have a more complicated term like a/(x^3) or a(x^-3), again you can single out other variables as a constant and now solving for x^-3 will give you -3a/(x^2). power rule alone should be enough, however it will require extensive use of the factorization.
Unless any already made library deriving it's quite complex because you need to parse and handle functions and expressions.
Deriving by itself it's an easy task, since it's mechanical and can be done algorithmically but you need a basic structure to store a function.

Resources