What's the meaning of "unique up to isomorphism"? [closed]

What's the meaning of "unique up to isomorphism"? To give some context, I came across the phrase while reading about initial algebras.
It seems that "up to" means "ignoring" (sometimes said as "modulo"). Isomorphism means that the objects are the same in some way (with a bidirectional structure-preserving mapping). However, "unique, ignoring that they are the same" still perplexes me.

Rather than "unique, ignoring that they are the same" it is more like "unique, ignoring differences that are not real differences in the context we are discussing here".
For example, if you are considering geometric figures, an equilateral triangle is "the same" as another equilateral triangle of twice the size that is upside-down, so you can count them as one unique figure.

Suppose I have the set of numbers {0, 1, 2, ..., 11} under addition modulo 12, or a regular 12-gon under the rotations generated by a rotation of 30 degrees. The two sets are different, but the corresponding algebraic structure is the same (it's the cyclic group on 12 elements). There's an isomorphism between them in which addition of "1" modulo 12 corresponds to rotation (say clockwise) by 30 degrees.
It's awkward to say "look at this unique structure" when it has clearly shown up in at least two distinct settings. But the distinguishing features of these two examples are non-essential: they disappear under isomorphism, while the algebraic structure is preserved. Hence one concedes "it's unique, up to isomorphism."
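To make the correspondence concrete, here is a small Python check (my own illustration, not part of the original answer) that the map k -> 30k degrees turns addition mod 12 into composition of rotations:

# Two presentations of the cyclic group of order 12: numbers {0..11}
# under addition mod 12, and rotations of a regular 12-gon by
# multiples of 30 degrees under composition.

def add_mod12(a, b):
    return (a + b) % 12

def compose_rotations(r, s):
    # Rotation angles in degrees; composing rotations adds angles.
    return (r + s) % 360

# phi(k) = 30*k is the isomorphism: a bijection on the 12 elements
# that turns addition into composition.
def phi(k):
    return 30 * k

for a in range(12):
    for b in range(12):
        assert phi(add_mod12(a, b)) == compose_rotations(phi(a), phi(b))
print("phi preserves the operation on all 144 pairs")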

The background notion is that of an equivalence relation. An equivalence relation on a set S is a relation ~ that shares with equality three properties: reflexivity (x ~ x for all x), symmetry (x ~ y => y ~ x), and transitivity (x ~ y and y ~ z => x ~ z). They are ubiquitous, and familiar, in mathematics. For example, 1/2 is equivalent to 5/10 even though 1/2 is manifestly not identical to 5/10. Whenever you have an equivalence relation you can have objects which are the same from one perspective but different from another. For example, it is a common undergraduate programming exercise to implement sets as lists. As sets you wouldn't distinguish {1,2,3} from {2,3,1}, but as lists you can distinguish [1,2,3] from [2,3,1]. The latter are different qua lists but the same qua sets, as the snippet below shows.
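In Python (my illustration, not the original answer's code) the contrast is immediate:

# The same elements: equal qua sets, distinguishable qua lists.
assert {1, 2, 3} == {2, 3, 1}   # sets carry no order, so these are equal
assert [1, 2, 3] != [2, 3, 1]   # lists carry order, so these differ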
Isomorphism is an equivalence relation on algebraic structures. To say that initial algebras are unique up to isomorphism means that they are all equivalent to each other with respect to the equivalence relation of isomorphism. @AlfredRossi's example is an excellent illustration of the way this plays out in abstract algebra.

Related

lattice puzzles in the Constant Propagation for hotspot

I read the article below about constant propagation for HotSpot through a lattice.
http://www.cliffc.org/blog/2012/02/27/too-much-theory-part-2/
It says: "(top meet -1) == -1 == (-1 meet top), so this example works. So does: (1 meet 2,3) == bottom == (2,3 meet 1). So does: (0 meet 0,1) == 0,1 == (0,1 meet 0)".
However, I cannot understand why (top meet -1) == -1 and (-1 meet top) == -1, why (1 meet 2,3) and (2,3 meet 1) equal bottom, and how the meet is calculated in general.
Jacky, from your question it seems that you are missing some basic concepts. Have you tried reading the linked Lattice wiki article?
I'm not sure I can do better than the collective mind of the Wiki, but I'll try.
Let's start with a poset, aka "partially ordered set". Having a poset means that you have a set of objects and a comparator <= that you can feed two objects to, and it will say which one is "less or equal". What distinguishes a partially ordered set from a totally ordered one is that in the more usual totally ordered set, at least one of a <= b and a >= b always holds, while in a partially ordered set both might be false at the same time. I.e. you can have elements that you can't compare at all.
Now a lattice is a structure over a poset (and not every poset can be turned into a lattice). To define a lattice you need to define two operations, meet and join. meet is a function from a pair of elements of the poset to an element of the poset such that (I will use meet(a, b) syntax instead of a meet b as it seems more friendly for Java developers):
For every pair of elements a and b there is an element inf = meet(a,b) i.e. meet is defined for every pair of elements
For every pair of elements a and b meet(a,b) <= a and meet(a,b) <= b
For every pair of elements a and b, if inf = meet(a,b) there is no other element c in the set such that c <= a AND c <= b AND NOT c <= inf, i.e. meet(a,b) is the greatest common lower bound (or more technically the infimum), and such an element is unique.
The same goes for join, except join finds the "maximum" of two elements, or more technically the supremum.
So now let's go back to the example you referenced. The poset in that example consists of 4 types, or rather layers, of elements:
Top - an artificial element added to the poset such that it is greater than any other element
Single integers
Pairs of neighboring integers (ranges) such as "[0, 1]" (here, unlike the author, I will use "[" and "]" to denote ranges so as not to confuse them with applications of meet)
Bottom - an artificial element added to the poset such that it is less than any other element
Note that all elements within a single layer are not comparable(!). So 1 is not less than 2 under this poset structure, but [1,2] is less than both 1 and 2. Top is greater than anything, Bottom is less than anything, and a range [x,y] is comparable with a raw number z if and only if z lies inside the range (in which case the range is the lesser one); otherwise they are not comparable.
You may notice that the structure of the poset "induces" a corresponding lattice. Given this structure it is easy to see how to define the meet function so that it satisfies all the requirements:
meet(Top, a) = meet(a, Top) = a for any a
meet(Bottom, a) = meet(a, Bottom) = Bottom for any a
meet(x, y) where both x and y are integers (i.e. for layer #2) is either:
Just x if x = y
Range [x, y] if x + 1 = y
Range [y, x] if y + 1 = x
Bottom otherwise
(I'm not sure if this is the right definition; it might always be the range [min(x,y), max(x,y)] unless x = y. It is not clear from the examples, but it is not very important.)
meet([x,y], z) = meet(z, [x,y]) where x, y, and z are integers, i.e. the meet of an integer (layer #2) and a range (layer #3), is:
Range [x, y] if x = z or y = z (in other words, if [x,y] < z)
Bottom otherwise
So the meet of a range and an integer is almost always Bottom, except in the most trivial cases
meet(a, b) where both a and b are ranges, i.e. the meet of two ranges (layer #3), is:
Range a if a = b
Bottom otherwise
So the meet of two ranges is also Bottom, except in the most trivial cases. A runnable sketch of this meet follows below.
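Here is a minimal sketch of that meet in Python (my own reconstruction from the description above, not the article's code); it reproduces the article's three quoted examples:

# Top and Bottom are sentinels; an integer stands for itself; a range
# of neighboring integers is a tuple (x, y).
TOP, BOTTOM = "top", "bottom"

def meet(a, b):
    if a == TOP: return b
    if b == TOP: return a
    if a == BOTTOM or b == BOTTOM: return BOTTOM
    a_int, b_int = isinstance(a, int), isinstance(b, int)
    if a_int and b_int:                       # two integers (layer #2)
        if a == b: return a
        if abs(a - b) == 1: return (min(a, b), max(a, b))
        return BOTTOM
    if a_int or b_int:                        # integer vs range
        z, rng = (a, b) if a_int else (b, a)
        return rng if z in rng else BOTTOM    # z must be an endpoint of (x, y)
    return a if a == b else BOTTOM            # two ranges

assert meet(TOP, -1) == -1 == meet(-1, TOP)
assert meet(1, (2, 3)) == BOTTOM == meet((2, 3), 1)
assert meet(0, (0, 1)) == (0, 1) == meet((0, 1), 0)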
That part of the example is actually about "inducing" the lattice from the structure and verifying that most of the desirable properties hold (except for symmetry, which is added in the next example).
Hope this helps
Update (answers to comments)
It is hard to answer "why". It is because the author built his poset that way (probably because that way will be useful later). I think you are confused because the set of natural numbers has a "natural" (pun not intended) sort order that we are all used to. But there is nothing to prohibit me from taking the same set (i.e. the same objects = all natural numbers) and defining some other sort order. Are you familiar with the java.util.Comparator interface? Using that interface you can specify any sorting rule for the Integer type, such as "all even numbers are greater than all odd ones, and inside the even or odd classes the usual comparison rule applies", and you can use such a Comparator to sort a collection if for some reason that order makes sense for your task. This is the same case: for the author's task it makes sense to define an alternative (custom) sort order. Moreover, he wants it to be only a partial order (which is impossible with Comparator). And the way he defines his order is the way I described.
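The same idea in Python (a hypothetical example of mine, not from the article): a custom key gives the integers a different total order:

# "All even numbers are greater than all odd ones; the usual order
# applies inside each class" - a custom sort order, as with Comparator.
nums = [5, 2, 8, 1, 4, 7]
print(sorted(nums, key=lambda n: (n % 2 == 0, n)))  # [1, 5, 7, 2, 4, 8]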
Also, is it possible to compare [1,2] with 0 or 3?
Yes, you can compare them, and the answer directly follows from my claim that "all elements in a single layer are not comparable(!) but all elements in any higher layer are greater than all elements in any lower layer": any number such as 0, 3, 42 or Integer.MAX_VALUE from layer #2 is greater than any range (layer #3), including the [1,2] range.
After some more thinking, my original answer here was wrong. To satisfy the author's goal, the range [1,2] should not be comparable with 0 or 3. So the answer is no. My specification of the meet above is correct, but my original description of the sort order (quoted above) was wrong.
Also, your explanation of top and bottom differs from the original author's: he explained that "bottom" == "we don't know what values this might take on" and "top" == "we can pick any value we like". I have no idea if both explanations of top and bottom actually refer to the same thing.
Here you are mixing up how the author defines top and bottom as part of a mathematical structure called a "lattice", and how he uses them for his practical task.
The article is about an algorithm that analyzes code for optimization based on "analysis of constants", and that algorithm is built upon the lattice of the described structure. The algorithm processes different objects of the defined poset and involves finding their meet multiple times. What the quoted passage describes is how to interpret the final value the algorithm produces, rather than how those values are defined.
AFAIU the basic idea behind the algorithm is the following: we have some variable and we see a few places where a value is assigned to it. For various optimizations it is good to know the possible range of values the variable can take, without running the actual code on all possible inputs. So the suggested algorithm is based on a simple idea: if we have two (possibly conditional) assignments to the variable, and in the first the values are in the range [L1, R1] and in the second the values are in the range [L2, R2], then we can be sure the value is in the range [min(L1, L2), max(R1, R2)] (this is effectively how meet is defined on that lattice). So now we can analyze all assignments in a function and assign to each a range of possible values. Note that this structure of numbers and unbounded ranges also forms a lattice, which the author describes in the first article (http://www.cliffc.org/blog/2012/02/12/too-much-theory/).
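A sketch of that merge step (my paraphrase in Python, not the article's code):

# Combine two known ranges for a variable coming from two different
# (possibly conditional) assignments.
def merge_ranges(r1, r2):
    (lo1, hi1), (lo2, hi2) = r1, r2
    return (min(lo1, lo2), max(hi1, hi2))

# One branch leaves x in [0, 5], the other in [3, 9]:
print(merge_ranges((0, 5), (3, 9)))  # (0, 9) - x is somewhere in [0, 9]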
Note that Top is effectively impossible in Java because the language provides certain guarantees, but in C/C++, as the author mentions, we can have a variable that is never assigned at all. In that case the C language standard allows the compiler to treat the variable as having any value of the compiler's choice, i.e. the compiler may assume whatever is most useful for optimization; this is what Top stands for. On the other hand, if some value comes in as an argument to a method, it is Bottom, because it can be any value outside the compiler's control, i.e. the compiler cannot assume anything about it.
In the second article the author points out that although the lattice from the first article is good in theory, in practice it can be computationally very inefficient. So to simplify the computations he reduces his lattice to a much simpler one, but the general theory stays the same: we assign ranges to the variables at each assignment, so that later we can analyze the code and optimize it. When all the ranges have been computed, the interpretation for optimization, assuming the analyzed line is the if in:
if (variable > 0) {
block#1
}
else {
block#2
}
is the following:
Top - if the line of code can be optimized by assuming the variable has some specific value, the compiler is free to do that optimization. So in the example the compiler is free to eliminate the branch, deciding either that the code always goes to block#1 and removing block#2 altogether, OR that the code always goes to block#2 and removing block#1, whichever alternative seems better to the compiler.
x - i.e. some constant value x - if the line of code can be optimized by assuming the variable has the value exactly x, the compiler is free to do that optimization. So in the example the compiler can evaluate x > 0 with that constant and leave only the code branch that corresponds to the calculated boolean value, removing the other branch.
[x, y] - i.e. the range from x to y. If the line of code can be optimized by assuming the variable has a value between x and y, the compiler is free to do that optimization. So in the example: if x > 0 (and thus y > 0), the compiler can remove block#2; if y <= 0 (and thus x <= 0), the compiler can remove block#1; if x <= 0 and y > 0, the compiler can't optimize that code.
Bottom - the compiler can't optimize that code.

How is a table like a mathematical relation? [closed]

I have recently been reviewing Codd's relational algebra and relational databases. I recall that a relation is a set of ordered tuples and that a function is a relation satisfying the additional property that each point in the domain maps to a single point in the codomain. In this sense, each table defines a finite-point function from the primary key onto the space of the codomain defined by all the other columns. Is this the sense in which a table is a relation? If so, why is relational algebra not functional algebra, and why not call it a functional database instead?
Thanks.
BTW, sorry if this is not quite a normal form for stackoverflow (hah, a DB joke!) but I looked at all the forums and this seemed the best.
Well, there are C.J. Date's "An Introduction to Database Systems" and H. Darwen's "An Introduction to Relational Database Theory". Both are excellent books and I highly recommend reading them both.
Now to the actual question. In mathematics, if you have n sets A1, A2, ..., An, you can form their Cartesian product A1 x A2 x ... x An, which is the set of all possible n-tuples (a1, a2, ..., an), where ai is an element of Ai. An n-ary relation R is, by definition, a subset of the Cartesian product of n sets.
Functions are binary relations: they are subsets of Dom x Cod. But there are relations of higher arity. For example, on the set Humans x Humans x Humans we can define a relation R by taking all tuples (x, y, z) where x and y are the parents of z.
Now there is one important notion from logic: the predicate. A predicate is a map from a Cartesian product A1 x A2 x ... x An to the set of statements. Consider the predicate P(x,y,z) = "x and y are parents of z". For each tuple (x,y,z) from Humans x Humans x Humans we obtain a statement, true or false. And the set of all tuples which give us true statements, the predicate's truth set, is... a relation!
And notice that having a truth set is all we actually need in order to work with a predicate. So when we model our enterprise, we invent a bunch of predicates which describe it, and store their truth sets in the relational database.
And so each operation on relations has a corresponding operation on predicates: when we take relations and join, project, and filter them, we end up with a new relation, and we know which predicate it is the truth set of. We just take the corresponding predicates, AND them, bind variables with existential quantifiers, and we get a new predicate whose truth set we know.
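A tiny Python sketch of this correspondence (names invented for illustration):

# A relation is the truth set of a predicate; joining relations
# corresponds to ANDing predicates and quantifying away the shared variable.
parent_of = {("ann", "bob"), ("bob", "cal")}  # truth set of "x is a parent of z"

def is_parent(x, z):
    # The predicate, recovered from its truth set.
    return (x, z) in parent_of

assert is_parent("ann", "bob")

# G(x, z) = exists y: parent_of(x, y) AND parent_of(y, z) - "grandparent".
grandparent_of = {(x, z)
                  for (x, y) in parent_of
                  for (y2, z) in parent_of
                  if y == y2}
print(grandparent_of)  # {('ann', 'cal')}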
Edit: Now, I have to note that since a relation is a set, its tuples are not ordered. So a table is just a model of a relation: you can take two different tables which represent the same relation. Also, it is customary in relational theory to work with more generally defined tuples and Cartesian products. Above I defined a tuple as (a1, a2, ..., an) - basically a function from {1, 2, ..., n} to A1 U A2 U ... U An (where the image of i must be in Ai). In relational theory, we take a tuple to be a function from a set of names { name1, name2, ..., namen } to A1 U A2 U ... U An, so it becomes a record: a tuple with named components. And of course this means a record's components are not ordered: (x: 1, y: 2), a function from { "x", "y" } to N which maps x to 1 and y to 2, is the same tuple/record as (y: 2, x: 1).
So if you take a table, swap rows, and swap columns (with their headers!), you end up with a new table which represents the same relation.
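In Python terms (my illustration), a dict behaves like such a record, with component order irrelevant:

# Records with named components are unordered, like relational tuples.
assert {"x": 1, "y": 2} == {"y": 2, "x": 1}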
This Wikipedia page goes into detail about the rationale behind the model. Conceptually, the key is just a means of accessing a given tuple, not part of the tuple itself--see also Codd's 12 rules, #2.

Compute expectation when multiple vectors are involved [closed]

From my understanding, the expectation of a vector (say n x 1) is equivalent to finding its mean. However, if we have two vectors x and y, both n x 1, what does it mean to find the expectation of the product of these vectors?
e.g.:
E[x * y] = ?
Are we taking the inner product or the outer product here? If I were using Matlab, would I be doing:
E[x' * y]
or
E[x * y']
or
E[x .* y]
I'm not really understanding the intuition behind expectation as applied to the product of vectors (my background is not in mathematics), so if someone could shed light on this for me I would really appreciate it. Thanks!
== EDIT ==
You're right, I wasn't clear. I came across the definition of the covariance where the formula given was:
Cov[X, Y] = E[X * Y] - E[X] * E[Y]
And the part where E[X * Y] came up is what confused me. I should have put this up on a math site, and will next time. Thanks for the help.
As much as I believe this belongs either on a math or statistics site, I'm feeling bored at the moment, so I'll say a few words.
You need to define WHAT you are doing, and to understand what you want to see. Numbers and vectors by themselves are just that - numbers. There is no meaning without context. I'd argue this is your problem.
For example, you can view a vector as a list of numbers, thus samples from some distribution - but samples of a scalar-valued parameter. So my vector might be a list of the temperatures in my house over the course of a day, or of the rainfall for the last week. As such, we can talk about the mean of those measurements. If we had a distribution, we could talk about the expected value of that distribution.
You might also look at a vector as a SINGLE piece of information. It might represent my location on the surface of the earth, so perhaps [latitude, longitude, elevation]. As such, it makes no sense to take the mean of these three pieces of information. However, I might be interested in an average location, taken over many such location measurements over a period of time.
As for worrying about inner versus outer products: they are confusing you. Instead, think about WHAT these numbers represent and what you need to do with them, and only THEN worry about how to compute what you need.
Following on from @woodchips's answer: when it does make sense to multiply two random variables and find the expectation of the product, in the discrete case it depends on whether the values of X and Y correspond with each other, i.e. whether for each event you have both an x and a y. In that case, to find the expectation of the product you simply multiply each pair of x and y and take the mean, as in the sketch below. If they're independent and you just have two vectors of samples with no co-occurrence, the expectation of the product is simply the product of their individual expectations.
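A quick numeric illustration with numpy (made-up data, my own sketch):

import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100_000)
y = 2 * x + rng.normal(size=100_000)  # paired samples: x and y co-occur

# E[X*Y] from paired samples: multiply each pair, then average
# (elementwise, like x .* y in Matlab - not an inner or outer product).
e_xy = np.mean(x * y)
cov = e_xy - np.mean(x) * np.mean(y)  # Cov[X, Y] = E[XY] - E[X]E[Y]
print(cov)  # close to 2.0, the true covariance here

# If X and Y were independent, E[X*Y] would simply equal E[X]*E[Y],
# and the covariance would be close to 0.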

Solving a system of linear equations in a non-square matrix [closed]

I have a system of linear equations that makes up an N x M (i.e. non-square) matrix, which I need to solve - or at least attempt to solve in order to show that there is no solution to the system (more likely than not, there will be no solution).
As I understand it, if my matrix is not square (over- or under-determined), then no exact solution can be found - am I correct in thinking this? Is there a way to transform my matrix into a square matrix in order to calculate the determinant, apply Gaussian elimination, Cramer's rule, etc.?
It may be worth mentioning that the coefficients of my unknowns may be zero, so in certain, rare cases it would be possible to have a zero-column or zero-row.
Whether or not your matrix is square is not what determines the solution space. It is the rank of the matrix compared to the number of columns that determines that (see the rank-nullity theorem). In general you can have zero, one, or an infinite number of solutions to a linear system of equations, depending on the relationship between its rank and nullity.
To answer your question, however: you can use Gaussian elimination to find the rank of the matrix and, if this indicates that solutions exist, find a particular solution x0 and the nullspace Null(A) of the matrix. Then you can describe all solutions as x = x0 + xn, where xn is any element of Null(A). For example, if a matrix has full column rank, its nullspace is trivial (only the zero vector) and the linear system has at most one solution; if its rank also equals the number of rows, you have exactly one solution. If the nullspace has dimension one, your solution set is a line through x0, with every point on that line satisfying the linear equations.
OK, first off: a non-square system of equations can have an exact solution:
[ 1 0 0 ] [x]   [1]
[ 0 0 1 ] [y] = [1]
          [z]
clearly has a solution (actually, it has a 1-dimensional family of solutions: x = z = 1, y arbitrary). Even if the system is overdetermined instead of underdetermined, it may still have a solution:
[ 1 0 ] [x]   [1]
[ 0 1 ] [y] = [1]
[ 1 1 ]       [2]
(x = y = 1). You may want to start by looking at least-squares solution methods, which find the exact solution if one exists, and "the best" approximate solution (in some sense) if one does not.
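For instance, with numpy (an illustrative sketch, not the only option) the overdetermined example above is solved in the least-squares sense, and lstsq also reports the rank:

import numpy as np

A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
b = np.array([1.0, 1.0, 2.0])

x, residuals, rank, singular_values = np.linalg.lstsq(A, b, rcond=None)
print(x)     # [1. 1.] - the exact solution x = y = 1 exists here
print(rank)  # 2: full column rank, so the solution is unique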
Take Ax = b, with A having m columns and n rows. We are not guaranteed to have one and only one solution; in many cases that is because we have more equations than unknowns (n bigger than m). This can happen because of repeated measurements, which we may actually want because we are cautious about the influence of noise.
If we observe that we cannot find a solution, that actually means there is no way to reach b travelling within the column space spanned by A (as Ax is only ever a combination of the columns).
We can, however, ask for the point in the space spanned by A that is nearest to b. How can we find such a point? Walking on a plane, the closest you can get to a point outside it is to walk until you are right below it. Geometrically speaking, this is when our line of sight is perpendicular to the plane.
Now that is something we can formulate mathematically. A perpendicular vector reminds us of orthogonal projections, and that is what we are going to use. In the simplest case, projecting b onto a single vector a involves computing a.T b; for the whole matrix we compute A.T b.
For our equation, let us apply the transformation to both sides: A.T A x = A.T b.
The last step is to solve for x by taking the inverse of A.T A:
x = (A.T A)^-1 * A.T b
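Written out with numpy (illustrative; forming the explicit inverse mirrors the formula above, but np.linalg.lstsq is numerically preferable in practice):

import numpy as np

A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
b = np.array([1.0, 1.0, 2.0])

# Normal equations: A.T A x = A.T b, solved via the inverse as above.
x = np.linalg.inv(A.T @ A) @ (A.T @ b)
print(x)  # [1. 1.]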
The least squares recommendation is a very good one.
I'll add that you can also try a singular value decomposition (SVD), which will give you the best answer possible and provides information about the nullspace for free.
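A sketch of that with numpy (my illustration), using the underdetermined example from the earlier answer:

import numpy as np

A = np.array([[1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0]])
b = np.array([1.0, 1.0])

U, s, Vt = np.linalg.svd(A)
rank = int(np.sum(s > 1e-12))
x0 = np.linalg.pinv(A) @ b  # minimum-norm least-squares solution (built on the SVD)
null_basis = Vt[rank:]      # remaining rows of Vt span Null(A)
print(x0)                   # [1. 0. 1.]
print(null_basis)           # spans the y direction (up to sign):
                            # every solution is x0 + t * [0, 1, 0]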

If there are M different boxes and N identical balls

and we need to put these balls into boxes.
How many possible states could there be?
This is part of a computer simulation puzzle. I've almost forgotten all my math knowledge.
I believe you are looking for the Multinomial Coefficient.
I will check myself and expand my answer.
Edit:
If you take a look at the wikipedia article I gave a link to, you can see that the M and N you defined in your question correspond to the m and n defined in the Theorem section.
This means that your question corresponds to: "What is the number of possible coefficient orderings when expanding a polynomial raised to an arbitrary power?", where N is the power, and M is the number of variables in the polynomial.
In other words:
What you are looking for is to sum over the multinomial coefficients of a polynomial of M variables when raised to the power of N.
The exact equations are a bit long, but they are explained very clearly in wikipedia.
Why is this true:
The multinomial coefficient gives you the number of ways to distribute identical balls between boxes in a specific grouping (for example, 5 balls grouped into 3, 1, and 1 - in this case N=5 and M=3). When summing over all grouping options you get all possible combinations.
I hope this helped you out.
These notes explain how to solve the "balls in boxes" problem in general: whether the balls are labeled or not, whether the boxes are labeled or not, whether you have to have at least one ball in each box, etc.
This is a basic combinatorial question (distribution of identical objects into non-identical slots).
The number of states is [(N+M-1) choose (M-1)], which the brute-force check below confirms for small values.
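This is the "stars and bars" count; a quick check in Python (my own, for small values) agrees with the formula:

from itertools import product
from math import comb

M, N = 3, 5  # 3 distinct boxes, 5 identical balls
brute = sum(1 for counts in product(range(N + 1), repeat=M)
            if sum(counts) == N)
print(brute, comb(N + M - 1, M - 1))  # 21 21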
