I'm looking for lemmas about the time derivative of vectors of finite dimension over the real numbers (finite-dimensional real vector spaces like ℝ^n) and of matrices in ℝ^(n×n). I found the Jacobian derivative of matrices in Cartesian_Euclidean_Space. Could anyone point me to the theory name, or explain how to implement the time derivative for vectors and matrices?
Okay, I've taken a brief look at the theory we have in Isabelle and this is what I came up with:
The concept you are looking for is called a vector derivative in Isabelle.
The operation itself is called vector_derivative and is defined on any real normed vector space.
In practice, you almost never use vector_derivative directly, though; you use has_vector_derivative instead.
For instance, you can prove the following:
lemma
"((λt. (t, t^2)) has_vector_derivative (1, 2*t)) (at t)"
by (auto intro!: derivative_eq_intros
simp: has_field_derivative_iff_has_vector_derivative [symmetric])
Here, I show that the curve t ↦ (t, t²) has the derivative (1, 2t). Note that I used real × real instead of real ^ 2 here, because the former is more convenient for demonstration. Depending on what exactly you want to do, working with real × real × real instead of real ^ 3 may also be more convenient for you.
Also note the related concept of a field_derivative. This is the ‘normal’ derivative on a normed field (e.g. the real or complex numbers). In the above proof, the automation first reduces the goal to showing what the vector derivatives of t and t ^ 2 are, and the last rule in the proof takes care of reducing that to showing what the field derivatives are (since these are now real functions), which can be done automatically.
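As a sanity check on the statement proved above, the derivative can also be worked out by hand from the difference quotient, componentwise. This is ordinary calculus written in LaTeX, not Isabelle syntax:

```latex
\[
\lim_{h \to 0} \frac{\bigl(t+h,\,(t+h)^2\bigr) - \bigl(t,\,t^2\bigr)}{h}
= \lim_{h \to 0} \bigl(1,\; 2t + h\bigr)
= (1,\, 2t).
\]
```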
Here is the question and solution to Structure and Interpretation of Computer Programs' exercise 1.15 (see here). My problem is, I don't know how the combination of these formulae actually works:
$\sin x = 3\sin\frac{x}{3} - 4\sin^3\frac{x}{3}$
and
$\sin x = x$
for small x radian values.
I understand the idea that the closer the radian angle gets to zero, the more closely it approximates the sine of that angle. I've seen excellent explanations (MIT OCW, Khan Academy). I have also worked out how the $\sin x = 3\sin\frac{x}{3} - 4\sin^3\frac{x}{3}$ formula is derived. But how are they being used together to derive an answer to sin(x)? The p function seems to simply take the variable angle divided by 3 on each recursive pass until angle is below 0.1. Then on the way back, we perform p as many times as we had to divide by 3. So it seems $3x - 4x^3$ magically becomes the same as $3\sin\frac{x}{3} - 4\sin^3\frac{x}{3}$ through recursive application. How? I'm not very deeply versed in recursion theory. Also, if this is logarithmically getting closer to 0.1, it's not as if we're totaling up lots of small x's a la integration. This seems to be doing something vaguely like the Y-combinator, which I also don't grasp that well yet.
Also, when we see the recursive steps (recursion) repeatedly dividing angle by $3$, what tells you definitively this is logarithmic? I mean, it looks like it's taking those giant order of magnitude leaps at each division, but is there another analytical way to call this logarithmic reduction?
The first thing to point out is that $\sin x = x$ is not exactly accurate, since x is just an approximation. The correct notation is $\sin x \approx x$. This might seem a little nitpicky, but it's important because it explains the exercise and the definition of sine given in the book.
The way $\sin x \approx x$ and $\sin x = 3\sin\frac{x}{3} - 4\sin^3\frac{x}{3}$ are combined is in the definition of the sine procedure. The idea is that we would like to return either the approximation or the second formula ($3\sin\frac{x}{3} - 4\sin^3\frac{x}{3}$), depending on the value of x.
If x is "sufficiently small", then we just return x as an approximation for sin(x). But if it's not "sufficiently small", we use $\sin x = 3\sin\frac{x}{3} - 4\sin^3\frac{x}{3}$. This is obviously fine since it's an equality. It might seem unnecessary until you notice that x/3 is smaller than x and therefore might be "sufficiently small". This is why the procedure is recursive: we keep doing this until the argument for sine is "sufficiently small".
It seems that the source of your confusion is here:
So it seems $3x - 4x^3$ magically becomes the same as $3\sin\frac{x}{3} - 4\sin^3\frac{x}{3}$.
This is not the case. It's a bit tricky, since (define (p x) (- (* 3 x) (* 4 (cube x)))) doesn't include any sine, but remember that the x in this definition is just a local variable. If we look at the final line of the definition of the sine procedure, we can see that we are actually calling (p (sine (/ angle 3.0))), so the sine is in the argument of the p call.
The reason why the recursion is logarithmic in the number of steps is that the number of calls to p is about the number of times we have to divide the angle by 3.0 to get a value smaller than 0.1. So we call p n times, where n is the smallest integer with angle/(3.0^n) < 0.1. For a large angle, the threshold 0.1 differs from 1 only by a constant factor, so this is roughly the n such that 3.0^n > angle, which is approximately $\log_3(\text{angle})$.
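To make the counting argument concrete, here is a minimal Python transliteration of the book's sine procedure. The counter argument is an addition for illustration only and is not part of the original Scheme code:

```python
import math

def cube(x):
    return x * x * x

def p(x):
    # One application of sin x = 3 sin(x/3) - 4 sin^3(x/3),
    # where x stands for the already-computed value of sin(angle / 3).
    return 3 * x - 4 * cube(x)

def sine(angle, counter=None):
    # Base case: the angle is small enough to use sin x ≈ x.
    if abs(angle) <= 0.1:
        return angle
    if counter is not None:
        counter[0] += 1  # hypothetical instrumentation: count calls to p
    return p(sine(angle / 3.0, counter))

calls = [0]
approx = sine(12.15, calls)
print(approx, math.sin(12.15), calls[0])
```

For angle = 12.15, p is applied 5 times, which matches ⌈log₃(12.15 / 0.1)⌉ = 5: each extra factor of 3 in the angle costs one more call.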
Is there a convergence theory in Isabelle/HOL? I need to define ∥x(t)∥ ⟶ 0 as t ⟶ ∞.
Also, I'm looking for a theory of vectors. I found a matrix theory, but I couldn't find one for vectors. Does such a theory exist in Isabelle/HOL?
Cheers.
Convergence etc. are expressed with filters in Isabelle. (See the corresponding documentation)
In your case, that would be something like
filterlim (λt. norm (x t)) (nhds 0) at_top
or, using the tendsto abbreviation,
((λt. norm (x t)) ⤏ 0) at_top
where ⤏ is the Isabelle symbol \<longlongrightarrow>, which can be input using the abbreviation --->.
As a side note, I am wondering why you are writing it that way in the first place, seeing as it is equivalent to
filterlim x (nhds 0) at_top
or, with the tendsto syntax:
(x ⤏ 0) at_top
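The equivalence holds because the distance of norm (x t) from 0 equals the distance of x t from 0. Spelled out as ordinary mathematics (a sketch in LaTeX, not Isabelle syntax):

```latex
\[
\bigl|\,\lVert x(t)\rVert - 0\,\bigr| = \lVert x(t) - 0 \rVert
\quad\Longrightarrow\quad
\Bigl(\lVert x(t)\rVert \to 0 \iff x(t) \to 0\Bigr)
\quad\text{as } t \to \infty .
\]
```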
Reasoning with these filters can be tricky at first, but it has the advantage of providing a unified framework for limits and other topological concepts, and once you get the hang of it, it is very elegant.
As for vectors, just import ~~/src/HOL/Analysis/Analysis. That should have everything you need. Ideally, build the HOL-Analysis session image by starting Isabelle/jEdit with isabelle jedit -l HOL-Analysis. Then you won't have to process all of Isabelle's analysis library every time you start the system.
I assume that by ‘vectors’ you mean concrete finite-dimensional real vector spaces like ℝⁿ. This is provided by ~~/src/HOL/Analysis/Finite_Cartesian_Product.thy, which is part of HOL-Analysis. This provides the vec type, which takes two parameters: the component type (probably real in your case) and the index type, which specifies the dimension of the vector space.
There is also a pre-defined type n for every positive integer n, so that you can write e.g. (real, 3) vec for the vector space ℝ³. There is also type syntax so that you can write 'a ^ 'n for ('a, 'n) vec.
I've been struggling with the basics of functional programming lately. I started writing small functions in SML. So far so good, although there is one problem I cannot solve. It's on Project Euler (https://projecteuler.net/problem=5) and it simply asks for the smallest natural number that is divisible by all the numbers from 1 to n (where n is the argument of the function I'm trying to build).
Searching for the solution, I've found that through prime factorization, you analyze all the numbers from 1 to 10, and then keep the numbers where the highest power of each prime occurs (after performing the prime factorization). Then you multiply them and you have your result (e.g. for n = 10, that number is 2520).
Can you help me on implementing this to an SML function?
Thank you for your time!
Since coding is not a spectator sport, it wouldn't be helpful for me to give you a complete working program; you'd have no way to learn from it. Instead, I'll show you how to get started, and start breaking down the pieces a bit.
Now, Mark Dickinson is right in his comments above that your proposed approach is neither the simplest nor the most efficient; nonetheless, it's quite workable, and plenty efficient enough to solve the Project Euler problem. (I tried it; the resulting program completed instantly.) So, I'll go with it.
To start with, if we're going to be operating on the prime decompositions of positive integers (that is: the results of factorizing them), we need to figure out how we're going to represent these decompositions. This isn't difficult, but it's very helpful to lay out all the details explicitly, so that when we write the functions that use them, we know exactly what assumptions we can make, what requirements we need to satisfy, and so on. (I can't tell you how many times I've seen code-writing attempts where different parts of the program disagree about what the data should look like, because the exact easiest form for one function to work with was a bit different from the exact easiest form for a different function to work with, and it was all done in an ad hoc way without really planning.)
You seem to have in mind an approach where a prime decomposition is a product of primes to the power of exponents: for example, 12 = 2² × 3¹. The simplest way to represent that in Standard ML is as a list of pairs: [(2,2),(3,1)]. But we should be a bit more precise than this; for example, we don't want 12 to sometimes be [(2,2),(3,1)] and sometimes [(3,1),(2,2)] and sometimes [(3,1),(5,0),(2,2)]. So, we can say something like "The prime decomposition of a positive integer is represented as a list of prime–exponent pairs, with the primes all being positive primes (2,3,5,7,…), the exponents all being positive integers (1,2,3,…), and the primes all being distinct and arranged in increasing order." This ensures a unique, easy-to-work-with representation. (N.B. 1 is represented by the empty list, nil.)
By the way, I should mention — when I tried this out, I found that everything was a little bit simpler if instead of storing exponents explicitly, I just repeated each prime the appropriate number of times, e.g. [2,2,3] for 12 = 2 × 2 × 3. (There was no single big complication with storing exponents explicitly, it just made a lot of little things a bit more finicky.) But the below breakdown is at a high level, and applies equally to either representation.
So, the overall algorithm is as follows:
1. Generate a list of the integers from 1 to 10, or 1 to 20.
   This part is optional; you can just write the list by hand, if you want, so as to jump into the meatier part faster. But since your goal is to learn the basics of functional programming, you might as well do this using List.tabulate.
2. Use this to generate a list of the prime decompositions of these integers.
   Specifically: you'll want to write a factorize or decompose function that takes a positive integer and returns its prime decomposition. You can then use map, a.k.a. List.map, to apply this function to each element of your list of integers.
   Note that this decompose function will need to keep track of the "next" prime as it's factoring the integer. In some languages, you would use a mutable local variable for this; but in Standard ML, the normal approach is to write a recursive helper function with a parameter for this purpose. Specifically, you can write a function helper such that, if n and p are positive integers, p ≥ 2, where n is not divisible by any prime less than p, then helper n p is the prime decomposition of n. Then you just write
   local
     fun helper n p = ...
   in
     fun decompose n = helper n 2
   end
3. Use this to generate the prime decomposition of the least common multiple of these integers.
   To start with, you'll probably want to write a lcmTwoDecompositions function that takes a pair of prime decompositions, and computes the least common multiple (still in prime-decomposition form). (Writing this pairwise function is much, much easier than trying to create a multi-way least-common-multiple function from scratch.)
   Using lcmTwoDecompositions, you can then use foldl or foldr, a.k.a. List.foldl or List.foldr, to create a function that takes a list of zero or more prime decompositions instead of just a pair. This makes use of the fact that the least common multiple of { n_1, n_2, …, n_N } is lcm(n_1, lcm(n_2, lcm(…, lcm(n_N, 1)…))). (This is a variant of what Mark Dickinson mentions above.)
4. Use this to compute the least common multiple of these integers.
   This just requires a recompose function that takes a prime decomposition and computes the corresponding integer.
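To see how the pieces of this outline fit together without giving away the SML, here is a rough Python sketch. It is only an illustration under assumptions: it uses the repeated-prime representation (e.g. [2, 2, 3] for 12), and the names lcm_two and smallest_multiple are made up here; the SML version is still yours to write.

```python
from functools import reduce

def helper(n, p):
    # Invariant: n is not divisible by any prime less than p.
    if n == 1:
        return []
    if n % p == 0:
        return [p] + helper(n // p, p)
    return helper(n, p + 1)

def decompose(n):
    return helper(n, 2)

def lcm_two(a, b):
    # Merge two sorted prime lists, keeping the larger multiplicity of each prime.
    if not a:
        return b
    if not b:
        return a
    if a[0] == b[0]:
        return [a[0]] + lcm_two(a[1:], b[1:])
    if a[0] < b[0]:
        return [a[0]] + lcm_two(a[1:], b)
    return [b[0]] + lcm_two(a, b[1:])

def recompose(primes):
    return reduce(lambda acc, q: acc * q, primes, 1)

def smallest_multiple(n):
    decompositions = [decompose(k) for k in range(1, n + 1)]
    # Fold the pairwise lcm over the list; [] represents 1.
    return recompose(reduce(lcm_two, decompositions, []))

print(smallest_multiple(10))
```

The fold starts from the empty list because 1 is the identity for lcm, which mirrors the lcm(n_N, 1) at the innermost position of the formula above.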
I am looking for the correct notation for Pareto dominance, but I don't know whether this is a question for Math Stack Exchange or for here.
In multiple papers, like the classic NSGA-II paper by Deb, you can find the Pareto dominance relation written like this:
A dominates B if
On Wikipedia and in other papers, like this one by Zitzler, you find another notation:
A dominates B if
What is the best and correct mathematical notation here?
What is the exact name of this symbol?
≺ is called precedes, ≻ succeeds.
According to the well known Evolutionary Optimization Algorithms the standard notation for A dominates B is A ≻ B:
20.1 Pareto Optimality
[...]
Domination: a point x* is said to dominate x if the following two conditions hold:
f_i(x*) ≤ f_i(x) for all i ∈ [1,k]
f_j(x*) < f_j(x) for at least one j ∈ [1,k]
that is x* is at least as good as x for all objective function values and it's better than x for at least one objective function value. We use the notation:
x* ≻ x
to indicate that x* dominates x.
This notation can be confusing because the symbol ≻ looks like a
"greater than" symbol but since we deal mainly with minimization problems,
the symbol ≻ means the function values of x* are less than
or equal to those of x.
However this notation is standard in the
literature, so this is the notation that we use.
However, the reverse notation is also used (probably to avoid the confusion the author refers to!).
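Whichever symbol is used, the two conditions in the quoted definition translate directly into code. A minimal Python sketch, assuming minimization and objective vectors of equal length (the function name dominates is chosen here for illustration):

```python
def dominates(a, b):
    """Return True if objective vector a Pareto-dominates b (minimization)."""
    # Condition 1: a is at least as good as b in every objective.
    no_worse = all(x <= y for x, y in zip(a, b))
    # Condition 2: a is strictly better than b in at least one objective.
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better

print(dominates((1, 2), (2, 2)))
print(dominates((1, 3), (2, 2)))
```

Note that the relation is only a partial order: in the second call neither vector dominates the other.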
I'm writing a program in Python and I need to find the derivative of a function (given as a string).
For example: x^2+3*x
Its derivative is: 2*x+3
Are there any scripts available, or is there something helpful you can tell me?
If you are limited to polynomials (which appears to be the case), there would basically be three steps:
1. Parse the input string into a list of coefficients of x^n.
2. Take that list of coefficients and convert it into a new list of coefficients according to the rules for differentiating a polynomial.
3. Take the list of coefficients for the derivative and create a nice string describing the derivative polynomial function.
If you need to handle polynomials like a*x^15125 + x^2 + c, using a dict for the list of coefficients may make sense, but it requires a little more attention when iterating through this list.
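The three steps could be sketched as follows. This is a minimal illustration, assuming a single variable x, integer coefficients, non-negative integer exponents and input in the style of the question (e.g. x^2+3*x); the function names are made up here, and a dict from exponent to coefficient is used as suggested above.

```python
def parse_polynomial(text):
    """Step 1: 'x^2+3*x' -> {2: 1, 1: 3} (exponent -> coefficient)."""
    coeffs = {}
    # Turn every '-' into '+-' so that splitting on '+' keeps the signs.
    for term in text.replace(" ", "").replace("-", "+-").split("+"):
        if not term:
            continue
        if "x" in term:
            coef_part, _, exp_part = term.partition("x")
            coef_part = coef_part.rstrip("*")
            coef = {"": 1, "-": -1}.get(coef_part)
            if coef is None:
                coef = int(coef_part)
            exp = int(exp_part[1:]) if exp_part else 1  # skip the '^'
        else:
            coef, exp = int(term), 0
        coeffs[exp] = coeffs.get(exp, 0) + coef
    return coeffs

def differentiate(coeffs):
    """Step 2: power rule, d/dx c*x^n = c*n*x^(n-1); constants vanish."""
    return {exp - 1: coef * exp for exp, coef in coeffs.items() if exp != 0}

def format_polynomial(coeffs):
    """Step 3: {1: 2, 0: 3} -> '2*x+3'."""
    parts = []
    for exp in sorted(coeffs, reverse=True):
        coef = coeffs[exp]
        if coef == 0:
            continue
        mag = abs(coef)
        if exp == 0:
            body = str(mag)
        else:
            var = "x" if exp == 1 else f"x^{exp}"
            body = var if mag == 1 else f"{mag}*{var}"
        sign = "-" if coef < 0 else ("+" if parts else "")
        parts.append(sign + body)
    return "".join(parts) or "0"

def derivative_string(text):
    return format_polynomial(differentiate(parse_polynomial(text)))

print(derivative_string("x^2+3*x"))
```

Symbolic coefficients such as the a and c in a*x^15125 + x^2 + c are deliberately out of scope for this sketch.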
sympy does it well.
You may find what you are looking for in the answers already provided. I, however, would like to give a short explanation on how to compute symbolic derivatives.
The business is based on operator overloading and the chain rule of derivatives. For instance, the derivative of v^n is n*v^(n-1)dv/dx, right? So, if you have v=3*x and n=3, what would the derivative be? The answer: if f(x)=(3*x)^3, then the derivative is:
f'(x)=3*(3*x)^2*(d/dx(3*x))=3*(3*x)^2*(3)=3^4*x^2
The chain rule allows you to "chain" the operation: each individual derivative is simple, and you just "chain" the complexity. Another example, the derivative of u*v is v*du/dx+u*dv/dx, right? If you get a complicated function, you just chain it, say:
d/dx(x^3*sin(x))
u=x^3; v=sin(x)
du/dx=3*x^2; dv/dx=cos(x)
d/dx(u*v) = v*du/dx + u*dv/dx = sin(x)*3*x^2 + x^3*cos(x)
As you can see, differentiation is only a chain of simple operations.
Now, operator overloading.
If you can write a parser (try Pyparsing) then you can request it to evaluate both the function and derivative! I've done this (using Flex/Bison) just for fun, and it is quite powerful. For you to get the idea, the derivative is computed recursively by overloading the corresponding operator, and recursively applying the chain rule, so the evaluation of "*" would correspond to u*v for function value and u*der(v)+v*der(u) for derivative value (try it in C++, it is also fun).
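A minimal Python sketch of the operator-overloading idea, leaving the parser out: each object carries a function value and a derivative value, and every overloaded operator applies the corresponding differentiation rule. The class name Dual and the helper sin are illustrative choices, not taken from any particular library.

```python
import math

class Dual:
    """A value together with its derivative with respect to x."""

    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    @staticmethod
    def lift(other):
        # Plain numbers are constants, so their derivative is 0.
        return other if isinstance(other, Dual) else Dual(other)

    def __add__(self, other):
        other = Dual.lift(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)
    __radd__ = __add__

    def __mul__(self, other):
        # Product rule: (u*v)' = u*v' + v*u'
        other = Dual.lift(other)
        return Dual(self.value * other.value,
                    self.value * other.deriv + other.value * self.deriv)
    __rmul__ = __mul__

    def __pow__(self, n):
        # Power rule with the chain rule: (v^n)' = n*v^(n-1)*v'; n is a plain number.
        return Dual(self.value ** n, n * self.value ** (n - 1) * self.deriv)

def sin(u):
    # Chain rule: sin(u)' = cos(u)*u'
    return Dual(math.sin(u.value), math.cos(u.value) * u.deriv)

x = Dual(2.0, 1.0)       # the variable x at 2.0, with dx/dx = 1
f = (3 * x) ** 3         # derivative should be 3^4 * x^2 = 324 at x = 2
g = x ** 3 * sin(x)      # derivative should be 3*x^2*sin(x) + x^3*cos(x)
print(f.value, f.deriv)
print(g.value, g.deriv)
```

Both examples from the text above are reproduced: the evaluation of "*" computes u*v for the value and u*der(v)+v*der(u) for the derivative in a single pass.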
So there you go, I know you don't mean to write your own parser - by all means use existing code (visit www.autodiff.org for automatic differentiation of Fortran and C/C++ code). But it is always interesting to know how this stuff works.
Cheers,
Juan
Better late than never?
I've always done symbolic differentiation in whatever language by working with a parse tree.
But I also recently became aware of another method using complex numbers.
The parse tree approach consists of translating the following tiny Lisp code into whatever language you like:
(defun diff (s x)
  (cond
    ((eq s x) 1)
    ((atom s) 0)
    ((or (eq (car s) '+) (eq (car s) '-))
     (list (car s)
           (diff (cadr s) x)
           (diff (caddr s) x)))
    ; ... and so on for multiplication, division, and basic functions
    ))
and following it with an appropriate simplifier, so you get rid of additions of 0, multiplying by 1, etc.
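As one possible translation, here is a Python sketch of the same idea. It assumes expressions are nested tuples of the form (op, left, right), with variables as strings and constants as numbers; only +, - and * are handled, and the simplifier is deliberately tiny.

```python
def diff(s, x):
    """Differentiate the expression tree s with respect to the variable x."""
    if s == x:
        return 1
    if not isinstance(s, tuple):       # any other atom: constant or other variable
        return 0
    op, u, v = s
    if op in ("+", "-"):
        return (op, diff(u, x), diff(v, x))
    if op == "*":                      # product rule
        return ("+", ("*", diff(u, x), v), ("*", u, diff(v, x)))
    raise ValueError(f"unsupported operator: {op}")

def simplify(s):
    """Remove additions of 0 and multiplications by 0 or 1; fold constants."""
    if not isinstance(s, tuple):
        return s
    op, u, v = s[0], simplify(s[1]), simplify(s[2])
    if isinstance(u, (int, float)) and isinstance(v, (int, float)):
        return u + v if op == "+" else u - v if op == "-" else u * v
    if op == "+" and u == 0:
        return v
    if op in ("+", "-") and v == 0:
        return u
    if op == "*":
        if u == 0 or v == 0:
            return 0
        if u == 1:
            return v
        if v == 1:
            return u
    return (op, u, v)

# x*x + 3*x
expr = ("+", ("*", "x", "x"), ("*", 3, "x"))
result = simplify(diff(expr, "x"))
print(result)
```

For x*x + 3*x this yields the tree for (x + x) + 3; collecting like terms into 2*x + 3 would need a stronger simplifier than this sketch provides.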
But the complex method, while completely numeric, has a certain magical quality. Instead of programming your computation F in double precision, do it in double precision complex.
Then, if you need the derivative of the computation with respect to variable X, set the imaginary part of X to a very small number h, like 1e-100.
Then do the calculation and get the result R.
Now real(R) is the result you would normally get, and imag(R)/h = dF/dX to very high accuracy!
How does it work? Take the case of multiplying complex numbers:
(a+bi)(c+di) = ac + i(ad+bc) - bd
Now suppose the imaginary parts are all zero, except we want the derivative with respect to a.
We set b to a very small number h. Now what do we get?
(a+hi)(c) = ac + hci
So the real part of this is ac, as you would expect, and the imaginary part, divided by h, is c, which is the derivative of ac with respect to a.
The same sort of reasoning seems to apply to all the differentiation rules.
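A small Python sketch of this complex-step trick, assuming the computation F is written so that it accepts complex arguments (which is why cmath.sin is used instead of math.sin); the function names are illustrative:

```python
import cmath
import math

def complex_step_derivative(f, x, h=1e-100):
    # Perturb only the imaginary part of x, then read the derivative
    # off the imaginary part of the result.
    return f(complex(x, h)).imag / h

def f(z):
    return z * z * z + 2 * z      # f'(x) = 3*x^2 + 2

def g(z):
    return z * cmath.sin(z)       # g'(x) = sin(x) + x*cos(x)

df = complex_step_derivative(f, 2.0)
dg = complex_step_derivative(g, 1.0)
print(df, dg)
```

Unlike an ordinary finite difference, no subtraction of nearly equal numbers takes place, which is why h can be made as small as 1e-100 without losing precision.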
Symbolic Differentiation is an impressive introduction to the subject, at least for a non-specialist like me :) The code is written in C++, by the way.
Look up automatic differentiation. There are tools for Python. Also, this.
If you are thinking of writing the differentiation program from scratch, without utilizing other libraries as help, then the algorithm/approach of computing the derivative of any algebraic equation I described in my blog will be helpful.
You can try creating a class that will represent a limit rigorously and then evaluate it for (f(x)-f(a))/(x-a) as x approaches a. That should give a pretty accurate value of the limit.
If you're using a string as input, you can split it into individual terms using the + or - characters as delimiters. You can then apply the power rule to each term: x^3 gives 3x^2, and for a more complicated term like a/(x^3), i.e. a*x^-3, you treat the other variables as constants, so differentiating gives -3a/(x^4). The power rule alone should be enough; however, it will require extensive use of factorization.
Without an existing library, computing derivatives is quite complex, because you need to parse and handle functions and expressions.
Differentiation by itself is an easy task, since it is mechanical and can be done algorithmically, but you need a basic structure to store a function.