About correct notation of pareto dominance in optimization - math

I am looking for the correct notation for Pareto dominance, but I don't know whether this is a question for Mathexchange or for here.
In multiple papers, like the classic NSGA-II paper by Deb, you can find the Pareto dominance relation written like this:
A dominates B if
In Wikipedia, or in other papers, like this one by Zitzler, you find another notation:
A dominates B if
What is the best and correct mathematical notation here?
What is the exact name of this symbol?

≺ is called precedes, ≻ succeeds.
According to the well-known book Evolutionary Optimization Algorithms, the standard notation for "A dominates B" is A ≻ B:
20.1 Pareto Optimality
[...]
Domination: a point x* is said to dominate x if the following two conditions hold:
fi(x*) <= fi(x) for all i ∈ [1,k]
fj(x*) < fj(x) for at least one j ∈ [1,k]
that is x* is at least as good as x for all objective function values and it's better than x for at least one objective function value. We use the notation:
x* ≻ x
to indicate that x* dominates x.
This notation can be confusing because the symbol ≻ looks like a
"greater than" symbol but since we deal mainly with minimization problems,
the symbol ≻ means the function values of x* are less than
or equal to those of x.
However this notation is standard in the
literature, so this is the notation that we use.
However, the reverse notation is also used (probably to avoid the confusion the author refers to!).
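The two conditions in the quoted definition can be checked mechanically. The following is a minimal Python sketch, assuming every objective is minimized and each point is represented by a tuple of its objective values; the function name dominates is made up for illustration and is not taken from the book.

```python
def dominates(fa, fb):
    """Return True if objective vector fa Pareto-dominates fb (minimization).

    Assumption: fa and fb are equal-length sequences of objective values,
    and smaller is better for every objective.
    """
    if len(fa) != len(fb):
        raise ValueError("objective vectors must have the same length")
    no_worse = all(a <= b for a, b in zip(fa, fb))        # first condition
    strictly_better = any(a < b for a, b in zip(fa, fb))  # second condition
    return no_worse and strictly_better


print(dominates((1.0, 2.0), (2.0, 2.0)))  # better in f1, equal in f2
print(dominates((1.0, 3.0), (2.0, 2.0)))  # a trade-off: no dominance
print(dominates((1.0, 2.0), (1.0, 2.0)))  # identical points do not dominate
```

Whichever symbol a paper uses, ≻ or ≺, it is this pair of conditions that defines the relation.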

Related

how to judge that the function is non-convex or convex

For example: (6-z)(x1+x2+x3) < 40.
z is a positive integer variable; x1, x2 and x3 are all real variables. How can I judge whether the function is convex or non-convex?
This is actually a more complicated question than one would expect.
This is not a function but a constraint.
A constraint should never have a < but rather a <=.
One definition of a convex constraint f(x) <= c is that f(lambda*x1+(1-lambda)*x2) <= lambda*f(x1)+(1-lambda)*f(x2) for all x1,x2,0<lambda<1. (Strict convexity requires < instead of <=)
Often it is easier to prove that the matrix of second derivatives (the Hessian) is symmetric and positive semidefinite for all x.
For quadratic constraints, like in your example, write the constraint as x'Qx + a'x <= c (this is just a different notation) and prove that Q is positive semidefinite, e.g. by looking at the eigenvalues.
Another way is to try to solve the problem using CVXPY. It will complain if the problem is not convex.
If the model converges to different solutions (with different objective values) depending on the starting point, the problem is non-convex.
Another heuristic I often use: throw it into a global solver such as Baron, inspect the log and see if it does any branching. If it does, the problem is non-convex.
For quadratic problems: throw it at Cplex or Gurobi and see if it complains about non-convexity. Gurobi can solve non-convex quadratic problems, but this requires setting an option.
For some problem classes we know whether the problem is convex or not (be familiar with the literature on this).
Constraints with integer variables are convex if the relaxation is convex (i.e. ignore the integer restrictions).
Constraints with integer/binary variables can often be reformulated into a set of linear inequalities.
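The definition-based test from the list above can be tried numerically. This is a rough Python sketch, assuming z is relaxed to a real number (as suggested for integer variables); the helper violates_convexity and the two sample points are chosen purely for illustration. A single violation proves non-convexity, whereas finding none proves nothing.

```python
def violates_convexity(f, p, q, lam=0.5):
    """True if f(lam*p + (1-lam)*q) > lam*f(p) + (1-lam)*f(q)."""
    mix = [lam * a + (1 - lam) * b for a, b in zip(p, q)]
    # Small tolerance so that rounding noise is not reported as a violation.
    return f(mix) > lam * f(p) + (1 - lam) * f(q) + 1e-12


def g(v):
    """Left-hand side of the example constraint, with z relaxed to a real."""
    z, x1, x2, x3 = v
    return (6 - z) * (x1 + x2 + x3)


p = [0.0, 0.0, 0.0, 0.0]  # z = 0, x1 = x2 = x3 = 0, so g(p) = 0
q = [2.0, 2.0, 0.0, 0.0]  # z = 2, x1 = 2,           so g(q) = 8
# The midpoint is z = 1, x1 = 1 with g = 5, above the average (0 + 8) / 2 = 4.
print(violates_convexity(g, p, q))

# For comparison, a convex function passes the same check at these points.
print(violates_convexity(lambda v: v[0] ** 2, [0.0], [2.0]))
```

For these two points the left-hand side lies above the chord, so the relaxed constraint function is not convex.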

Power with integer exponents in Isabelle

Here is my definition of power for integer exponents following this mailing-list post:
definition
"ipow x n = (if n < 0 then (1 / x) ^ n else x ^ n)"
notation ipow (infixr "^⇩i" 80)
Is there a better way to define it?
Is there an existing theory in Isabelle that already includes it so that I can reuse its results?
Context
I am dealing with complex exponentials, for instance consider this theorem:
After I proved it, I realized I need to work with integers n, not just naturals, and this involves using powers to take the n out of the exponential.
I don't think something like this exists in the library. However, you have a typo in your definition. I believe you want something like
definition
"ipow x n = (if n < 0 then (1 / x) ^ nat (-n) else x ^ nat n)"
Apart from that, it is fine. You could write inverse x ^ nat (-n), but it should make little difference in practice. I would suggest the name int_power since the corresponding operation with natural exponents is called power.
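To see what the corrected definition computes, here is a small Python sketch of the same case split, using exact rationals. The name int_power follows the suggestion above; everything else is illustrative. One difference to keep in mind: Isabelle/HOL defines 1 / 0 = 0, whereas this sketch raises ZeroDivisionError for x = 0 and a negative exponent.

```python
from fractions import Fraction


def int_power(x, n):
    """x raised to an integer exponent n, mirroring the case split above."""
    if n < 0:
        return (1 / x) ** (-n)  # plays the role of (1 / x) ^ nat (-n)
    return x ** n               # plays the role of x ^ nat n


x = Fraction(3, 7)
print(int_power(Fraction(2), -3))
print(int_power(x, -2) * int_power(x, 5) == int_power(x, 3))
```

The second line checks one instance of the exponent law x^(m+n) = x^m * x^n for nonzero x, the kind of theorem that would have to be proved for the new constant.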
Personally, I would avoid introducing a new constant like this because in order to actually use it productively, you also need an extensive collection of theorems around it. This means quite a bit of (tedious) work. Do you really need to talk about integers here? I find that one can often get around it in practice (in particular, note that the exponentials in question are periodic anyway).
It may be useful to introduce such a power operator nevertheless; all I'm saying is you should be aware of the trade-off.
Side note: An often overlooked function in Isabelle that is useful when talking about exponentials like this is cis (as in 'cosine + i · sine'). cis x is equivalent to 'exp(ix)' where x is real.

SICP: Why does this recursion-based sine approximation work?

Here is the question and solution to Structure and Interpretation of Computer Programs' exercise 1.15 (see here). My problem is, I don't know how the combination of these formulae actually works:
$\sin x = 3\sin(x/3) - 4\sin^3(x/3)$
and
$\sin x \approx x$
for small x radian values.
I understand the idea that the closer the radian angle gets to zero, the more closely it approximates the sine of that angle. I've seen excellent explanations (MIT OCW, Khan Academy). I have also worked out how the $\sin x = 3\sin(x/3) - 4\sin^3(x/3)$ formula is derived. But how are they being used together to derive an answer to sin(x)? The p function seems to simply take the variable angle divided by 3 on each recursive pass until angle is below 0.1. Then on the way back, we perform p as many times as we had to divide by 3. So it seems
magically becomes the same as
through recursive application. How? I'm not very deeply versed in recursion theory. Also, if this is logarithmically getting closer to 0.1, it's not as if we're totaling up lots of small x's a la integration. This seems to be doing something vaguely like the Y-combinator -- which I also don't grasp that well yet.
Also, when we see the recursive steps (recursion) repeatedly dividing angle by $3$, what tells you definitively this is logarithmic? I mean, it looks like it's taking those giant order of magnitude leaps at each division, but is there another analytical way to call this logarithmic reduction?
The first thing to point out is that $\sin x = x$ is not exactly accurate, since x is just an approximation. The correct notation is $\sin x \approx x$. This might seem a little nitpicky, but it's important because this explains the exercise and the definition of sine given in the book.
The way $\sin x \approx x$ and $\sin x = 3\sin(x/3) - 4\sin^3(x/3)$ are combined is in the definition of the sine procedure. The idea is that we would like to return either the approximation or the second formula ($\sin x = 3\sin(x/3) - 4\sin^3(x/3)$) depending on the value of x.
If x is "sufficiently small", then we just return x as an approximation for sin(x). But if it's not "sufficiently small", we use $\sin x = 3\sin(x/3) - 4\sin^3(x/3)$. This is obviously fine since it's an equality. It might seem unnecessary until you notice that the argument x/3 is smaller than x and therefore might be "sufficiently small". This is why the procedure is recursive: we keep doing this until the argument for sine is "sufficiently small".
It seems that the source of your confusion is here:
So it seems magically becomes the same as .
This is not the case. It's a bit tricky, since (define (p x) (- (* 3 x) (* 4 (cube x)))) doesn't include any sine, but remember that the x in this definition is just a local variable. If we look at the final line of the definition of the sine procedure, we can see that we are actually calling (p (sine (/ angle 3.0))), so the sine is in the argument of the p call.
The reason why the recursion is logarithmic in terms of the number of steps is that the number of times we call the p procedure is about the number of times we have to divide the angle by 3.0 to get a value smaller than 0.1. Compared with a large angle, 0.1 is a value close to 1. So we call p until angle/(3.0^n) < 0.1, which is approximately the n such that 3.0^n > angle, which is approximately $\log_3(\text{angle})$.
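The step count can be observed directly. The following is a Python transcription of the book's sine, p and cube procedures, offered as a sketch rather than the book's code; the counter argument is an addition for illustration and records how many times p is applied. The input 12.15 is the angle used in the exercise.

```python
import math


def cube(x):
    return x * x * x


def p(x):
    # Contains no sine itself; the sine enters through the argument.
    return 3 * x - 4 * cube(x)


def sine(angle, counter):
    """Approximate sin(angle); counter[0] counts the applications of p."""
    if abs(angle) <= 0.1:
        return angle              # sin x is approximately x for small x
    counter[0] += 1
    return p(sine(angle / 3.0, counter))


calls = [0]
approx = sine(12.15, calls)
# Smallest n with 12.15 / 3^n <= 0.1, i.e. the ceiling of log_3(12.15 / 0.1).
predicted = math.ceil(math.log(12.15 / 0.1, 3))
print(approx, math.sin(12.15), calls[0], predicted)
```

The angle is divided by 3 five times (12.15, 4.05, 1.35, 0.45, 0.15, 0.05), so p is applied five times on the way back, matching the logarithm-based prediction.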

Time derivative for vectors and matrixes on the real field

I'm looking for lemmas about the time derivative of finite-dimensional vectors over the real numbers (finite-dimensional real vector spaces like ℝ^n) and also of matrices in ℝ^(n×n). I found the Jacobian derivative of matrices in Cartesian_Euclidean_Space. Could anyone point me to the theory name, or to how to implement the time derivative with vectors and matrices?
Okay, I've taken a brief look at the theory we have in Isabelle and this is what I came up with:
The concept you are looking for is called a vector derivative in Isabelle.
The operation itself is called vector_derivative and is defined on any real normed vector space.
In practice, though, you almost never use vector_derivative itself; you use has_vector_derivative instead.
For instance, you can prove the following:
lemma
"((λt. (t, t^2)) has_vector_derivative (1, 2*t)) (at t)"
by (auto intro!: derivative_eq_intros
simp: has_field_derivative_iff_has_vector_derivative [symmetric])
Here, I show that the curve t ↦ (t, t²) has the derivative (1, 2t). Note that for convenience, I used real × real instead of real ^ 2 here, the reason being that the former is more convenient for demonstration. Depending on what you want to do exactly, working with real × real × real instead of real ^ 3 may also be more convenient for you.
Also note the related concept of a field_derivative. This is the ‘normal’ derivative on a normed field, (e.g. the real or complex numbers). In the above proof, the automation first reduces the goal to showing what the vector derivatives of t and t ^ 2 are, and the last rule in the proof takes care of reducing that to showing what the field derivatives are (since these are now real functions), and that can be done automatically.
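As a numerical sanity check of the statement proved in the lemma (not a substitute for the Isabelle proof), the derivative of the curve can be approximated componentwise with central differences. This is a minimal Python sketch; the names curve and numeric_derivative and the step size h are assumptions made for illustration.

```python
def curve(t):
    """The curve t ↦ (t, t²) from the lemma."""
    return (t, t ** 2)


def numeric_derivative(f, t, h=1e-6):
    """Componentwise central difference (f(t + h) - f(t - h)) / (2h)."""
    forward = f(t + h)
    backward = f(t - h)
    return tuple((a - b) / (2 * h) for a, b in zip(forward, backward))


t0 = 3.0
approx = numeric_derivative(curve, t0)
exact = (1.0, 2 * t0)  # the claimed vector derivative (1, 2t) at t = t0
print(approx, exact)
```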

What is the difference between equality and equivalence?

I've come across a few instances in mathematics and computer science texts that use the equivalence symbol ≡ (basically an '=' with three lines), and it always makes sense to me to read this as if it were equality. What is the difference between these two concepts?
Wikipedia: Equivalence relation:
In mathematics, an equivalence
relation is a binary relation between
two elements of a set which groups
them together as being "equivalent" in
some way. Let a, b, and c be arbitrary
elements of some set X. Then "a ~ b"
or "a ≡ b" denotes that a is
equivalent to b.
An equivalence relation "~" is reflexive, symmetric, and transitive.
In other words, = is just an instance of equivalence relation.
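On a finite set the three properties can be checked by brute force. The following Python sketch does so; the helper name is_equivalence and the sample relations are assumptions made for illustration.

```python
from itertools import product


def is_equivalence(related, elements):
    """Check reflexivity, symmetry and transitivity of related on elements."""
    elements = list(elements)
    reflexive = all(related(a, a) for a in elements)
    symmetric = all(
        related(b, a)
        for a, b in product(elements, repeat=2)
        if related(a, b)
    )
    transitive = all(
        related(a, c)
        for a, b, c in product(elements, repeat=3)
        if related(a, b) and related(b, c)
    )
    return reflexive and symmetric and transitive


words = ["a", "b", "ab", "cd", "xyz"]
print(is_equivalence(lambda u, v: len(u) == len(v), words))  # same length
print(is_equivalence(lambda u, v: u == v, words))            # equality
print(is_equivalence(lambda u, v: u <= v, words))            # not symmetric
```

"Same length" and equality both pass, which illustrates that equality is one equivalence relation among many, while the ordering <= fails the symmetry test.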
Edit: These seemingly simple criteria of being reflexive, symmetric, and transitive are not always trivial to satisfy. See Bloch's Effective Java, 2nd ed., p. 35, for example:
public final class CaseInsensitiveString {
    ...
    // broken
    @Override public boolean equals(Object o) {
        if (o instanceof CaseInsensitiveString)
            return s.equalsIgnoreCase(
                ((CaseInsensitiveString) o).s);
        if (o instanceof String) // One-way interoperability!
            return s.equalsIgnoreCase((String) o);
        return false;
    }
}
The above equals implementation breaks the symmetry because CaseInsensitiveString knows about String class, but the String class doesn't know about CaseInsensitiveString.
I take your question to be about math notation rather than programming. The triple equal sign you refer to can be written &equiv; in HTML or \equiv in LaTeX.
a ≡ b most commonly means "a is defined to be b" or "let a be equal to b".
So 2+2=4 but φ ≡ (1+sqrt(5))/2.
Here's a handy equivalence table:
Mathematicians    Computer scientists
--------------    -------------------
=                 ==
≡                 =
(The other answers about equivalence relations are correct too but I don't think those are as common. There's also a ≡ b (mod m) which is pronounced "a is congruent to b, mod m" and in programmer parlance would be expressed as mod(a,m) == mod(b,m). In other words, a and b are equal after mod'ing by m.)
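The congruence reading can be written out in a few lines. This is a small Python sketch; the function name congruent is made up, and it relies on Python's % operator returning a non-negative remainder for a positive modulus.

```python
def congruent(a, b, m):
    """True if a ≡ b (mod m), i.e. a and b leave the same remainder."""
    return a % m == b % m


print(congruent(17, 5, 12))  # both leave remainder 5
print(congruent(-1, 4, 5))   # -1 % 5 is 4 in Python
print(congruent(7, 8, 5))    # remainders 2 and 3 differ
```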
A lot of languages distinguish between equality of the objects and equality of the values of those objects.
Ruby, for example, has 3 different ways to test equality. The first, equal?, compares two variables to see if they point to the same instance. This is equivalent, in a C-style language, to checking whether 2 pointers refer to the same address. The second method, ==, tests value equality. So 3 == 3.0 would be true in this case. The third, eql?, compares both value and class type.
Lisp also has different concepts of equality depending on what you're trying to test.
In languages that I have seen that differentiate between equality and equivalence, equality usually means the type and value are the same while equivalence means that just the values are the same. For example:
int i = 3;
double d = 3.0;
i and d would have an equivalence relationship since they represent the same value, but not equality since they have different types. Other languages may have different ideas of equivalence (such as whether two variables represent the same object).
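The same three-way distinction (same value, same type and value, same object) can be sketched in Python. The helper same_type_and_value is hypothetical and only loosely mirrors the type-sensitive comparison described above.

```python
def same_type_and_value(a, b):
    """Stricter comparison: the types must match as well as the values."""
    return type(a) is type(b) and a == b


i = 3
d = 3.0
print(i == d)                     # value comparison only
print(same_type_and_value(i, d))  # int versus float

xs = [1, 2]
ys = [1, 2]
print(xs == ys, xs is ys)         # equal values, but two distinct objects
```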
The answers above are right / partially right but they don't explain what the difference is exactly. In theoretical computer science (and probably in other branches of maths) it has to do with quantification over free variables of the logical equation (that is when we use the two notations at once).
For me the best way to understand the difference is:
By definition
A ≡ B
means
For all possible values of free variables in A and B, A = B
or
A ≡ B <=> [A = B]
By example
x=2x
iff (in fact iff is the same as ≡)
x=0
x ≡ 2x
iff (because it is not the case that x = 2x for all possible values of x)
False
I hope it helps
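The example can be replayed over a finite range of values. This is only a Python sketch of the quantifier reading, with the domain range(-5, 6) chosen arbitrarily for illustration.

```python
domain = range(-5, 6)  # assumed finite domain, for illustration only

# "x = 2x" read as an equation: for which values of x does it hold?
solutions = [x for x in domain if x == 2 * x]

# "x ≡ 2x" read as a claim about all values of the free variable x.
holds_everywhere = all(x == 2 * x for x in domain)

# By contrast, "x + x ≡ 2x" does hold for every x.
identity_holds = all(x + x == 2 * x for x in domain)

print(solutions, holds_everywhere, identity_holds)
```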
Edit:
Another thing that came to my head is the definitions of the two.
A = B is defined as A <= B and A >= B, where <= (smaller equal, not implies) can be any ordering relation
A ≡ B is defined as A <=> B (iff, if and only if, implies both sides), worth noting that implication is also an ordering relation and so it is possible (but less precise and often confusing) to use = instead of ≡.
I guess the conclusion is that when you see =, you have to figure out the author's intention based on the context.
Take it outside the realm of programming.
(31) equal -- (having the same quantity, value, or measure as another; "on equal terms"; "all men are equal before the law")
equivalent, tantamount -- (being essentially equal to something; "it was as good as gold"; "a wish that was equivalent to a command"; "his statement was tantamount to an admission of guilt")
At least in my dictionary, 'equivalence' means it's a good-enough substitute for the original, but not necessarily identical, and likewise 'equality' conveys being completely identical.
null == 0 # true, null is equivalent to 0 (in PHP)
null === 0 # false, null is not equal to 0 (in PHP)
( Some people use ≈ to represent nonidentical values instead )
The difference resides above all in the level at which the two concepts are introduced. '≡' is a symbol of formal logic where, given two propositions a and b, a ≡ b means (a => b AND b => a).
'=' is instead the typical example of an equivalence relation on a set, and presumes at least a theory of sets. When one defines a particular set, one usually provides it with a suitable notion of equality, which comes in the form of an equivalence relation and uses the symbol '='. For example, when you define the set Q of the rational numbers, you define equality a/b = c/d (where a/b and c/d are rational) if and only if ad = bc (where ad and bc are integers, the notion of equality for integers having already been defined elsewhere).
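The rational-number example translates directly into code. In this Python sketch a rational is represented as a (numerator, denominator) pair of integers; the function name rational_equal is an assumption, and the standard library's Fraction is shown only for comparison.

```python
from fractions import Fraction


def rational_equal(p, q):
    """a/b = c/d if and only if a*d == b*c, using integer equality only."""
    (a, b), (c, d) = p, q
    if b == 0 or d == 0:
        raise ZeroDivisionError("denominator must be non-zero")
    return a * d == b * c


print(rational_equal((1, 2), (2, 4)))    # different pairs, same rational
print(rational_equal((1, 2), (2, 3)))
print(Fraction(1, 2) == Fraction(2, 4))  # the library agrees
```

The pairs (1, 2) and (2, 4) are different as pairs but equal as rationals, which is why this equality is an equivalence relation on pairs.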
Sometimes you will find the informal notation f(x) ≡ g(x), where f and g are functions: It means that f and g have the same domain and that f(x) = g(x) for each x in such domain (this is again an equivalence relation). Finally, sometimes you find ≡ (or ~) as a generic symbol to denote an equivalence relation.
You could have two statements that have the same truth value (equivalent) or two statements that are the same (equality). As well the "equal sign with three bars" can also mean "is defined as."
Equality really is a special kind of equivalence relation, in fact. Consider what it means to say:
0.9999999999999999... = 1
That suggests that equality is just an equivalence relation on "string numbers" (which are defined more formally as functions from Z -> {0,...,9}). And we can see from this case, the equivalence classes are not even singletons.
The first problem is, what equality and equivalence mean in this case? Essentially, contexts are quite free to define these terms.
The general tenor I got from various definitions is: For values called equal, it should make no difference which one you read from.
The grossest example that violates this expectation is C++: x and y are said to be equal if x == y evaluates to true, and x and y are said to be equivalent if !(x < y) && !(y < x). Even apart from user-defined overloads of these operators, for floating-point numbers (float, double) those are not the same: All NaN values are equivalent to each other (in fact, equivalent to everything), but not equal to anything including themselves, and the values -0.0 and +0.0 compare equal (and equivalent) although you can distinguish them if you’re clever.
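The floating-point behaviour described here can be reproduced outside C++. The following Python sketch assumes Python floats are IEEE 754 doubles, which is the case on common platforms; the function equivalent is a hypothetical helper that mirrors the !(x < y) && !(y < x) test.

```python
import math


def equivalent(x, y):
    """Equivalence induced by <: neither value is less than the other."""
    return not (x < y) and not (y < x)


nan = float("nan")
print(nan == nan)            # NaN is not equal to itself
print(equivalent(nan, nan))  # but it is "equivalent" to itself...
print(equivalent(nan, 1.0))  # ...and to every other value

print(-0.0 == 0.0)           # the two zeros compare equal
print(math.copysign(1.0, -0.0), math.copysign(1.0, 0.0))  # yet they differ in sign
```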
In a lot of cases, you’d need better terms to convey your intent precisely. Given two variables x and y,
identity or “the same” for expressing that there is only one object and x and y refer to it. Any change done through x is inadvertently observable through y and vice versa. In Java, reference type variables are checked for identity using ==, in C# using the ReferenceEquals method. In C++, if x and y are references, std::addressof(x) == std::addressof(y) will do (whereas &x == &y will work most of the time, but & can be customized for user-defined types).
bitwise or structure equality for expressing that the internal representations of x and y are the same. Notice that bitwise equality breaks down when objects can reference (parts of) themselves internally. To get the intended meaning, the notion has to be refined in such cases to say: Structured the same. In D, bitwise equality is checked via is and C offers memcmp. I know of no language that has built-in structure equality testing.
indistinguishability or substitutability for expressing that values cannot be distinguished (through their public interface): If a function f takes two parameters and x and y are indistinguishable, the calls f(x, y), f(x, x), and f(y, y) always return indistinguishable values – unless f checks for identity (see bullet point above) directly or maybe by mutating the parameters. An example could be two search-trees that happen to contain indistinguishable elements, but the internal trees are laid out differently. The internal tree layout is an implementation detail that normally cannot be observed through its public methods.
This is also called Leibniz-equality after Gottfried Wilhelm Leibniz who defined equality as the lack of differences.
equivalence for expressing that objects represent values considered essentially the same from some abstract reasoning. For an example for distinguishable equivalent values, observe that floating-point numbers have a negative zero -0.0 distinct from +0.0, and e.g. sign(1/x) is different for -0.0 and +0.0. Equivalence for floating-point numbers is checked using == in many languages with C-like syntax (aka. Algol syntax). Most object-oriented languages check equivalence of objects using an equals (or similarly named) method. C# has the IEquatable<T> interface to designate that the class has a standard/canonical/default equivalence relation defined on it. In Java, one overrides the equals method every class inherits from Object.
As you can see, the notions become increasingly vague. Checking for identity is something most languages can express. Identity and bitwise equality usually cannot be hooked by the programmer as the notions are independent from interpretations. There was a C++20 proposal, which ended up being rejected, that would have introduced the last two notions as strong† and weak equality†. († This site looks like CppReference, but is not; it is not up-to-date.) The original paper is here.
There are languages without mutation, primarily functional languages like Haskell. The difference between equality and equivalence there is less of an issue and tilts to the mathematical use of those words. (In math, generally speaking, (recursively defined) sequences are used instead of re-assignments.)
Everything C has, is also available to C++ and any language that can use C functionality. Everything said about C# is true for Visual Basic .NET and probably all languages built on the .NET framework. Analogously, Java represents the JRE languages that also include Kotlin and Scala.
If you just want stupid definitions without wisdom: An equivalence relation is a reflexive, symmetrical, and transitive binary relation on a set. Equality then is the intersection of all those equivalence relations.
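The last claim can be verified by brute force on a small set. This Python sketch uses the fact that equivalence relations on a set correspond to its partitions; the helper names partitions and relation_of and the three-element set are assumptions made for illustration.

```python
def partitions(items):
    """Yield every partition of items as a list of blocks (lists)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for smaller in partitions(rest):
        # Put first into each existing block in turn...
        for i, block in enumerate(smaller):
            yield smaller[:i] + [[first] + block] + smaller[i + 1:]
        # ...or into a new block of its own.
        yield [[first]] + smaller


def relation_of(partition):
    """The equivalence relation (set of pairs) whose classes are the blocks."""
    return {(a, b) for block in partition for a in block for b in block}


elements = [0, 1, 2]
all_relations = [relation_of(part) for part in partitions(elements)]
intersection = set.intersection(*all_relations)
identity = {(x, x) for x in elements}
print(len(all_relations), intersection == identity)
```

A three-element set has five equivalence relations, and the only pairs common to all of them are the (x, x) pairs, i.e. equality.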

Resources