I have recently been reviewing Codd's relational algebra and relational databases. I recall that a relation is a set of ordered tuples and that a function is a relation satisfying the additional property that each point in the domain maps to a single point in the codomain. In this sense, each table defines a finite-point function from the primary key onto the space of the codomain defined by all the other columns. Is this the sense in which a table is a relation? If so, why is relational algebra not functional algebra, and why not call it a functional database instead?
Thanks.
BTW, sorry if this is not quite a normal form for stackoverflow (hah, a DB joke!) but I looked at all the forums and this seemed the best.
Well, there is C.J. Date's "An Introduction to Database Systems", and H. Darwen's "An Introduction to Relational Database Theory". Both are excellent books and I highly recommend reading them both.
Now to the actual question. In mathematics, if you have n sets A1, A2, ..., An, you can form their Cartesian product A1 x A2 x ... x An, which is the set of all n-tuples (a1, a2, ..., an), where ai is an element of Ai. An n-ary relation R is, by definition, a subset of the Cartesian product of n sets.
Functions are binary relations: they are subsets of Dom x Cod. But there are relations of higher arity. For example, on the set Humans x Humans x Humans, we can define a relation R by taking all tuples (x, y, z) where x and y are the parents of z.
Now there is one important notion from logic: the predicate. A predicate is a map from a Cartesian product A1 x A2 x ... x An to the set of statements. Consider the predicate P(x,y,z) = "x and y are parents of z". For each tuple (x,y,z) from Humans x Humans x Humans we obtain a statement, true or false. And the set of all tuples which give us true statements (the predicate's truth set) is... a relation!
And notice that having its truth set is all we actually need to work with a predicate. So, when we model our enterprise, we invent a bunch of predicates which describe it, and store their truth sets in the relational database.
And so, each operation on relations has a corresponding operation on predicates: when we join, project, and filter relations, we end up with a new relation, and we know which predicate it is the truth set of. We just take the corresponding predicates, AND them together, and bind variables with existential quantifiers, and we get a new predicate whose truth set we know.
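To make this concrete, here is a small sketch in Haskell (names and sample data invented for illustration), modelling relations as lists of tuples; a join is then exactly an AND of two predicates, and a projection is an existential quantification:

-- Truth set of the predicate "x is a parent of y" (made-up sample data).
parentOf :: [(String, String)]
parentOf = [("ann", "carl"), ("bob", "carl"), ("carl", "dora")]

-- Truth set of "x and y are parents of z": join parentOf with itself
-- on the child column, i.e. AND the predicates P(x,z) and P(y,z).
parentsOf :: [(String, String, String)]
parentsOf = [ (x, y, z) | (x, z)  <- parentOf
                        , (y, z') <- parentOf
                        , z == z', x /= y ]

-- Projection is existential quantification: "z has two parents" means
-- "there exist x and y such that x and y are parents of z".
hasTwoParents :: [String]
hasTwoParents = [ z | (_, _, z) <- parentsOf ]

Running hasTwoParents yields ["carl","carl"]; a real system would also remove duplicates, which corresponds to the fact that a relation is a set.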
Edit: Now, I have to note that since a relation is a set, its tuples are not ordered. So a table is just a model for a relation: you can take two different tables which represent the same relation. Also, it is customary in relational theory to work with more generally defined tuples and Cartesian products. Above, I defined a tuple (a1, a2, ..., an) as, essentially, a function from {1, 2, ..., n} to A1 U A2 U ... U An (where the image of i must be in Ai). In relational theory, we take a tuple to be a function from a set of names { name1, name2, ..., namen } to A1 U A2 U ... U An, so it becomes a record, a tuple with named components. And of course, this means that a record's components are not ordered: (x: 1, y: 2), a function from { "x", "y" } to N which maps x to 1 and y to 2, is the same tuple/record as (y: 2, x: 1).
So, if you take a table and swap its rows, or swap its columns (together with their headers!), you end up with a new table which represents the same relation.
This Wikipedia page goes into detail about the rationale behind the model. Conceptually, the key is just a means of accessing a given tuple, not part of the tuple itself--see also Codd's 12 rules, #2.
I am finding it difficult to understand:
1) AST matching: how are two ASTs similar? Are types included in the comparison/matching, or only the operations like +, -, ++, ...etc.?
2) Two statements being "syntactically similar" (a term I read somewhere in a paper): can we say, in the example below, that the two statements are syntactically similar?
int x = 1 + 2
String y = "1" + "2"
Java/Eclipse is what I am using right now and trying to understand the AST for.
Best Regards,
What ASTs are:
An AST is a data structure representing program source text. It consists of nodes that contain a node type, possibly a literal value, and a list of child nodes. The node type corresponds to what OP calls "operations" (+, -, ...), but also includes language commands (do, if, assignment, call, ...), declarations (int, struct, ...) and literals (number, string, boolean). [It is unclear what OP means by "type".] (ASTs often carry additional information in each node referring back to the point of origin in the source text file.)
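As a rough sketch (type and field names are my own invention, not Eclipse's API), such a structure might look like this in Haskell; the derived Eq instance gives exactly the recursive node-by-node comparison discussed further below:

-- One node type, an optional literal value, and a list of children.
data Ast = Ast
  { nodeType :: String        -- e.g. "+", "if", "assignment", "number"
  , literal  :: Maybe String  -- literal value, if any (e.g. "1", "\"1\"")
  , children :: [Ast]
  } deriving (Show, Eq)

-- The expression 1 + 2, i.e. the S-expression (+ 1 2):
plusExample :: Ast
plusExample = Ast "+" Nothing
  [ Ast "number" (Just "1") []
  , Ast "number" (Just "2") [] ]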
What ASTs are for:
OP seems puzzled by the presence of ASTs in Eclipse.
ASTs are used to represent program text in a form which is easier to interpret than the raw text. They provide a means to reason about the program's structure or content; sometimes they are used to enable modification of a program ("refactoring") by modifying the AST and then regenerating the text from it.
Comparing ASTs for similarity is not a really common use in my experience, except in clone detection and/or pattern matching.
Comparing ASTs:
Comparing ASTs for equality is easy: compare the root node type/literal value for equality; if not equal, the comparison is complete; otherwise (recursively) compare the child nodes.
Comparing ASTs for similarity is harder because you must decide how to relax the equality comparison. In particular, you must decide on a precise definition of similarity. There are many ways to define this, some rather shallowly syntactic, some more semantically sophisticated.
My paper Clone Detection Using Abstract Syntax Trees describes one way to do this, using similarity defined as the number of shared nodes divided by the total number of nodes in both trees. Shared nodes are computed by comparing the trees top-down to the point where some child is different. (The actual comparison computes an anti-unifier.) This similarity measure is rather shallow, but it works remarkably well in finding code clones in big software systems.
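A drastically simplified sketch of such a measure (not the paper's actual algorithm, and with literal values folded into the node label):

data Node = Node String [Node]          -- node label and children

size :: Node -> Int
size (Node _ cs) = 1 + sum (map size cs)

-- Count node pairs that match top-down; stop descending on a mismatch.
shared :: Node -> Node -> Int
shared (Node a as) (Node b bs)
  | a == b    = 1 + sum (zipWith shared as bs)
  | otherwise = 0

-- Shared nodes divided by the total number of nodes in both trees.
similarity :: Node -> Node -> Double
similarity t u = fromIntegral (shared t u)
               / fromIntegral (size t + size u)

Applied to OP's two declarations (shown next), this yields the 2/12 figure discussed below.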
From that perspective, OP's examples:
int x = 1 + 2
String y = "1" + "2"
have trees written as S-expressions:
(declaration_with_assignment (int x) (+ 1 2))
(declaration_with_assignment (String y) (+ "1" "2"))
These are not very similar; they only share a root node whose type is "declaration-with-assignment" and the top of the + subtree. (Here the node count is 12 with only 2 matching nodes for a similarity of 2/12).
These would be more similar:
int x = 1 + 2
float x = 1.0 + 2
(S-expressions)
(declaration_with_assignment (int x) (+ 1 2))
(declaration_with_assignment (float x) (+ 1.0 2))
which share the declaration with assignment, the add node, the literal leaf node 2, and arguably the literal nodes for integer 1 and float 1.0, depending on whether you wish to define them as "equal" or not, for a similarity of 4/12.
If you change one of the trees to be a pattern tree, in which some "leaves" are pattern variables, you can then use such pattern trees to find code that has certain structure.
The surface syntax pattern:
?type ?variable = 1 + ?expression
with S-expression
(declaration_with_assignment (?type ?variable) (+ 1 ?expression))
matches the first of OP's examples but not the second.
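A sketch of how such matching could be implemented (the representation is my own invention: pattern variables are leaves whose label starts with '?'); a real matcher would also record bindings and check that repeated variables match consistently:

data Tree = Tree String [Tree]

-- Does the pattern match the subject tree?
match :: Tree -> Tree -> Bool
match (Tree ('?':_) _) _ = True   -- a pattern variable matches any subtree
match (Tree p ps) (Tree s ss) =
  p == s && length ps == length ss && and (zipWith match ps ss)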
As far as I know, Eclipse doesn't offer any pattern-based matching abilities.
But these are very useful in program analysis and/or program transformation tools. For some specific examples, too long to include here, see DMS Rewrite Rules.
(Full disclosure: DMS is a product of my company. I'm the architect).
What's the meaning of "unique up to isomorphism"? To give some context, I came across the phrase while reading about initial algebras.
It seems that "up to" means "ignoring" (sometimes said as "modulo"). Isomorphism means that the objects are the same in some way (with a bidirectional mapping). However, "unique, ignoring that they are the same" still perplexes me.
Rather than "unique ignoring they are the same", it is more like "unique (ignoring irrelevant differences, which are not real differences in the context we are discussing)".
For example, if you are considering geometric figures, an equilateral triangle is "the same" as another equilateral triangle of twice the size that is upside-down, so you count them as one unique figure.
Suppose I have the set of numbers {0, 1, 2, ..., 11} under addition modulo 12, or a regular 12-gon under the rotations generated by a rotation of 30 degrees. The two sets are different, but the corresponding algebraic structure is the same (it's the cyclic group on 12 elements). There's an isomorphism between them in which addition of "1" modulo 12 corresponds to rotation (say, clockwise) by 30 degrees.
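A quick sanity check of that correspondence, sketched in Haskell: map n to a rotation by 30*n degrees, and verify that addition modulo 12 on one side matches composition of rotations (addition of angles modulo 360) on the other:

phi :: Int -> Int              -- n (mod 12) -> rotation angle in degrees
phi n = (30 * n) `mod` 360

-- The homomorphism property, checked exhaustively over all pairs.
isHomomorphism :: Bool
isHomomorphism = and [ phi ((a + b) `mod` 12) == (phi a + phi b) `mod` 360
                     | a <- [0 .. 11], b <- [0 .. 11] ]   -- True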
It's awkward to say "look at this unique structure" when it has clearly shown up in at least two distinct settings. But somehow the distinguishing features between these two examples are non-essential: they disappear under isomorphism, while the algebraic structure is preserved. Hence the concession "it's unique, up to isomorphism."
The background notion is that of equivalence relations. An equivalence relation ~ on a set S is a relation which shares with equality the three properties of reflexivity (x ~ x for all x), symmetry (x ~ y => y ~ x), and transitivity (x ~ y ~ z => x ~ z). Equivalence relations are ubiquitous, and familiar, in mathematics. For example, 1/2 is equivalent to 5/10 even though 1/2 is manifestly not identical to 5/10. Whenever you have an equivalence relation you can have objects which are the same from one perspective but different from another. For example, it is a common undergraduate programming exercise to implement sets as lists. As sets you wouldn't distinguish {1,2,3} from {2,3,1}, but if you represent them as lists, you can distinguish [1,2,3] from [2,3,1]. The latter are different qua lists but the same qua sets.
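That last example is easy to make concrete in Haskell: the very same data compares as different qua lists and equal qua sets (here sets-as-lists are compared up to reordering by sorting first):

import Data.List (sort)

xs, ys :: [Int]
xs = [1, 2, 3]
ys = [2, 3, 1]

sameQuaLists, sameQuaSets :: Bool
sameQuaLists = xs == ys            -- False: order matters for lists
sameQuaSets  = sort xs == sort ys  -- True: equal up to reordering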
Isomorphism is an equivalence relation on algebraic structures. To say that initial algebras are unique up to isomorphism means that they are all equivalent to each other with respect to the equivalence relation of isomorphism. @AlfredRossi's example is an excellent illustration of the way this plays out in abstract algebra.
While learning Scala/Haskell, I came across the concept of algebraic data types. I've read the explanation on Wikipedia, but I still have a question:
Why does it use the word "algebraic" in its name? Does it have some relationship with algebra?
Consider the type Bool. This type, of course, can take on one of two possible values: True or False.
Now consider
data EitherBool = Left Bool | Right Bool
How many values can this type take on? There are 4: Left False, Left True, Right False, Right True. How about
data EitherBoolInt = Left Bool | Right Int8
Here there are 2 possible values in the Left branch, and 2^8 in the Right branch, for a total of 2 + 2^8 possible values for EitherBoolInt. It should be easy to see that for any set of constructors and types, this kind of construction will give you a datatype whose space of possible values has a size equal to the sum of the numbers of possible values of each individual constructor. For this reason, it's called a sum type.
Consider instead
data BoolAndInt = BAndI Bool Int8
or simply
type BoolAndInt = (Bool, Int8)
How many values can this take on? For each possible Int8, there are two BoolAndInts, for a total of 2 * 2^8 = 2^9 values. The total number of possible values is the product of the numbers of values of each field of the constructor, so this is called a product type.
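Since all the types involved are finite and enumerable, you can even check these counts by brute force (a throwaway sketch; Int8 keeps the enumerations small):

import Data.Int (Int8)

sumCount :: Int                        -- |Bool| + |Int8| = 2 + 2^8 = 258
sumCount = length ([minBound .. maxBound] :: [Bool])
         + length ([minBound .. maxBound] :: [Int8])

productCount :: Int                    -- |Bool| * |Int8| = 2 * 2^8 = 512
productCount = length [ (b, i) | b <- ([minBound .. maxBound] :: [Bool])
                               , i <- ([minBound .. maxBound] :: [Int8]) ]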
This idea can be extended further -- for example, functions a -> b form an exponential datatype (see The Algebra of Algebraic Datatypes). You can even create a reasonable notion of the derivative of a datatype. This is not a purely theoretical idea: it's the basis for the functional construct of "zippers". See The Derivative of a Datatype is the Type of its One-Hole Contexts and the Wikipedia entry on zippers.
In simple words, we must consider the relationship between algebra and types. Haskell's algebraic data types are so named because they correspond to an initial algebra in category theory.
Wikipedia says:
In computer programming, particularly functional programming and type
theory, an algebraic data type is a kind of composite type, i.e. a
type formed by combining other types.
Let's take Maybe a data type:
data Maybe a = Nothing | Just a
Maybe a indicates that it might contain something of type a (Just Int, for example) but can also be empty: Nothing. In Haskell, types are objects, for example Int. Type operators take types and produce new types, for example Maybe Int. "Algebraic" refers to the property that an algebraic data type is created by the algebraic operations of sums and products, where:
"sum" is alternation (A | B, meaning A or B but not both)
"product" is combination (A B, meaning A and B together)
For example, let's see the sum for Maybe a. To start, let's define an Add type:
data Add a b = Left a | Right b
In Haskell, | is "or", so a value can be either Left a or Right b. The vertical bar | shows us that Maybe, which we defined above, is a sum type, which means we can write it (schematically) with Add:
type Maybe a = Add Nothing (Just a)
Nothing here is a unit type:
In the area of mathematical logic and computer science known as type
theory, a unit type is a type that allows only one value
data Unit = Unit
Or () in haskell.
Just a is a singleton type. Singleton types are those types which have only one value.
data Just a = Just a
After that we can rewrite Maybe as:
type Maybe a = Add () a
So we have the unit type, which is 1, and the singleton type, which is a. Now we can say that Maybe a is the same as 1 + a.
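In actual Haskell syntax, the Add type above is Either from the standard library, and "Maybe a is the same as 1 + a" is an isomorphism rather than a literal type equation. A small sketch witnessing it with two mutually inverse functions:

toSum :: Maybe a -> Either () a
toSum Nothing  = Left ()
toSum (Just x) = Right x

fromSum :: Either () a -> Maybe a
fromSum (Left ()) = Nothing
fromSum (Right x) = Just x

-- fromSum . toSum = id  and  toSum . fromSum = id,
-- so Maybe a and Either () a (i.e. 1 + a) are isomorphic.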
If you want to go deep - The Algebra of Data, and the Calculus of Mutation
Related question: https://math.stackexchange.com/questions/50375/whats-the-meaning-of-algebraic-data-type
My answer (there): it's all about algebraic theories.
I am very new to Prolog and I was given this assignment.
My code is as follows:
relatives(cindy,tanya).
relatives(tanya,alan).
relatives(alan,mike).
relatives(kerry,jay).
relatives(jay,alan).
isRelated(X,Y) :-
    relatives(X,Y).
isRelated(X,Y) :-
    relatives(X,Z),
    isRelated(Z,Y).
Simple enough. This shows that if:
?- isRelated(cindy,mike).
Prolog will return true. Now, I'm stuck on how to make it return true if:
?- isRelated(mike,cindy).
I've been trying to come up with ideas like if isRelated(Z,Y) returns false, then switch X and Y, and run isRelated again. But I'm not sure if Prolog even allows such an idea. Any hints or advice would be greatly appreciated. Thanks!
UPDATE:
So I added:
isRelated(X,Y) :-
    relatives(X,Y);
    relatives(Y,X).
That satisfies "direct" relationships, but I quickly found out that it doesn't satisfy indirect relationships.
I really want to do something like, if the initial query:
isRelated(mike,cindy)
fails, then try and see if the reverse is true by switching X and Y:
isRelated(cindy,mike)
That will definitely return true. I just don't know how to do this syntactically in Prolog.
Further hint to those in the comments, as I can't leave comments yet: with your original set of rules and facts, isRelated(cindy,tanya) is true, but isRelated(tanya,cindy) is not, so you need to make isRelated(X,Y) symmetric; what simple addition to isRelated would achieve that?
Also, you could try drawing a graph of the relation relatives(X,Y), with an arrow from X to Y for all your base facts, and see if that helps you think about how the Prolog interpreter is going to attempt to satisfy a query.
So, to answer your last question: you don't switch the values of X and Y in Prolog the way you would call swap(x,y) in C, say. The value held by a logic variable cannot be changed explicitly, only backtracked over. But you can easily use Y where you would use X, and vice versa:
somePred(X,Y):- is_it(X,Y).
somePred(X,Y):- is_it(Y,X).
This defines somePred predicate as a logical disjunction, an "OR". It can be written explicitly too, like
somePred(X,Y):- is_it(X,Y) ; is_it(Y,X).
Note the semicolon there. A comma , between predicates OTOH defines a conjunction, an "AND" (a comma inside a compound term just serves to delimit the term's "arguments").
You're almost there; you're just trying, I think, to cram too much stuff into one predicate.
Write the problem statement in English and work from that:
A relationship exists between two people, X and Y
if X and Y are directly related, or
if any direct relative of X, P, is related to Y.
Then it gets easy. I'd approach it like this:
First, you have your set of facts about relatives.
related( cindy, tanya ).
...
related( jay, alan ).
Then, a predicate describing a direct relationship in terms of those facts:
directly_related( X , Y ) :- % a direct relationship exists
related(X,Y) % if X is related to Y
. % ... OR ...
directly_related( X , Y ) :- % a direct relationship exists
related(Y,X) % if Y is related to X
. %
Finally, a predicate describing any relationship:
is_related(X,Y) :- % a relationship exists between X and Y
directly_related(X,Y) % if a direct relationship exists between them
. % ... OR ...
is_related(X,Y) :- % a relationship exists between X and Y
directly_related(X,P) , % if a direct relationship exists between X and some other person P
is_related(P,Y) % and [recursively] a relationship exists between P and Y.
. %
The solution is actually more complicated than this:
The facts about relationships describe one or more graphs. More on graphs at http://web.cecs.pdx.edu/~sheard/course/Cs163/Doc/Graphs.html. What you're doing is finding a path from node X to Node Y in the graph.
If the graphs described by the facts about relationships have one or more paths between X and Y, the above solution can (and will) succeed multiple times (on backtracking), once for every such path. The solution needs to be deterministic. Normally, having established that two people are related, we're done: just because I have two cousins doesn't mean I'm related to my aunt twice.
If the graph of relationships contains cycles (almost certainly true) such that a "circular" path exists: A → B → C → A …, the solution is susceptible to unlimited recursion. That means the solution needs to detect and deal with cycles. How might that be accomplished?
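One common answer, sketched here in Haskell because cycle handling is the same idea in any language (in Prolog you would thread an extra "visited" argument through a hypothetical is_related/3 helper), is to carry the set of nodes already visited and refuse to revisit them:

-- The facts as an edge list, plus their reverses to make it symmetric.
related :: [(String, String)]
related = base ++ [ (y, x) | (x, y) <- base ]
  where base = [ ("cindy", "tanya"), ("tanya", "alan"), ("alan", "mike")
               , ("kerry", "jay"), ("jay", "alan") ]

isRelated :: String -> String -> Bool
isRelated = go []
  where
    go visited x y
      | x `elem` visited      = False   -- already seen: cut the cycle
      | (x, y) `elem` related = True
      | otherwise = any (\z -> go (x : visited) z y)
                        [ z | (x', z) <- related, x' == x ]

Because the symmetric closure turns every edge into a two-node cycle, the visited check is exactly what guarantees termination here.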
I've seen a few instances in reading mathematics and computer science that use the equivalence symbol ≡ (basically an '=' with three lines), and it always makes sense to me to read this as if it were equality. What is the difference between these two concepts?
Wikipedia: Equivalence relation:
In mathematics, an equivalence
relation is a binary relation between
two elements of a set which groups
them together as being "equivalent" in
some way. Let a, b, and c be arbitrary
elements of some set X. Then "a ~ b"
or "a ≡ b" denotes that a is
equivalent to b.
An equivalence relation "~" is reflexive, symmetric, and transitive.
In other words, = is just an instance of equivalence relation.
Edit: These seemingly simple criteria of being reflexive, symmetric, and transitive are not always trivial to satisfy. See Bloch's Effective Java, 2nd ed., p. 35 for an example:
public final class CaseInsensitiveString {
    private final String s;
    ...
    // broken
    @Override public boolean equals(Object o) {
        if (o instanceof CaseInsensitiveString)
            return s.equalsIgnoreCase(
                ((CaseInsensitiveString) o).s);
        if (o instanceof String)  // One-way interoperability!
            return s.equalsIgnoreCase((String) o);
        return false;
    }
}
The above equals implementation breaks the symmetry because CaseInsensitiveString knows about String class, but the String class doesn't know about CaseInsensitiveString.
I take your question to be about math notation rather than programming. The triple equal sign you refer to can be written ≡ in HTML or \equiv in LaTeX.
a ≡ b most commonly means "a is defined to be b" or "let a be equal to b".
So 2+2=4 but φ ≡ (1+sqrt(5))/2.
Here's a handy equivalence table:
Mathematicians Computer scientists
-------------- -------------------
= ==
≡ =
(The other answers about equivalence relations are correct too but I don't think those are as common. There's also a ≡ b (mod m) which is pronounced "a is congruent to b, mod m" and in programmer parlance would be expressed as mod(a,m) == mod(b,m). In other words, a and b are equal after mod'ing by m.)
A lot of languages distinguish between equality of the objects and equality of the values of those objects.
Ruby, for example, has 3 different ways to test equality. The first, equal?, compares two variables to see if they point to the same instance. This is equivalent, in a C-style language, to checking whether 2 pointers refer to the same address. The second method, ==, tests value equality. So 3 == 3.0 would be true in this case. The third, eql?, compares both value and class type.
Lisp also has different concepts of equality depending on what you're trying to test.
In languages that I have seen that differentiate between equality and equivalence, equality usually means the type and value are the same while equivalence means that just the values are the same. For example:
int i = 3;
double d = 3.0;
i and d would have an equivalence relationship, since they represent the same value, but not equality, since they have different types. Other languages may have different ideas of equivalence (such as whether two variables represent the same object).
The answers above are right or partially right, but they don't explain what the difference is exactly. In theoretical computer science (and probably in other branches of maths) it has to do with quantification over the free variables of the logical equation (that is, when we use the two notations at once).
For me the best way to understand the difference is:
By definition
A ≡ B
means
For all possible values of free variables in A and B, A = B
or
A ≡ B  <=>  ∀x. [A = B]   (where x ranges over the free variables of A and B)
By example
x = 2x
iff (in fact, "iff" is the same as ≡)
x = 0

x ≡ 2x
iff (because it is not the case that x = 2x for all possible values of x)
False
I hope it helps
Edit:
Another thing that came to my head is the definitions of the two.
A = B is defined as A <= B and A >= B, where <= (smaller-or-equal, not "implies") can be any ordering relation.
A ≡ B is defined as A <=> B (iff, "if and only if": implication in both directions). It is worth noting that implication is also an ordering relation, and so it is possible (but less precise and often confusing) to use = instead of ≡.
I guess the conclusion is that when you see =, you have to figure out the author's intention based on the context.
Take it outside the realm of programming.
equal -- (having the same quantity, value, or measure as another; "on equal terms"; "all men are equal before the law")
equivalent, tantamount -- (being essentially equal to something; "it was as good as gold"; "a wish that was equivalent to a command"; "his statement was tantamount to an admission of guilt")
At least in my dictionary, 'equivalence' means it is a good-enough substitute for the original, but not necessarily identical, and likewise 'equality' conveys complete identity.
null == 0   # true:  null is equivalent to 0 (in PHP)
null === 0  # false: null is not equal to 0 (in PHP)
(Some people use ≈ to represent nonidentical values instead.)
The difference resides above all in the level at which the two concepts are introduced. '≡' is a symbol of formal logic where, given two propositions a and b, a ≡ b means (a => b AND b => a).
'=' is instead the typical example of an equivalence relation on a set, and presumes at least a theory of sets. When one defines a particular set, one usually provides it with a suitable notion of equality, which comes in the form of an equivalence relation and uses the symbol '='. For example, when you define the set Q of the rational numbers, you define equality a/b = c/d (where a/b and c/d are rational) if and only if ad = bc (where ad and bc are integers, the notion of equality for integers having already been defined elsewhere).
Sometimes you will find the informal notation f(x) ≡ g(x), where f and g are functions: It means that f and g have the same domain and that f(x) = g(x) for each x in such domain (this is again an equivalence relation). Finally, sometimes you find ≡ (or ~) as a generic symbol to denote an equivalence relation.
You could have two statements that have the same truth value (equivalent) or two statements that are the same (equality). As well the "equal sign with three bars" can also mean "is defined as."
Equality really is a special kind of equivalence relation, in fact. Consider what it means to say:
0.9999999999999999... = 1
That suggests that equality is just an equivalence relation on "string numbers" (which are defined more formally as functions from Z -> {0,...,9}). And we can see from this case that the equivalence classes are not even singletons.
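For completeness, the standard algebraic argument for that identity, written out in LaTeX:

\begin{align*}
  x       &= 0.999\ldots \\
  10x     &= 9.999\ldots \\
  10x - x &= 9.999\ldots - 0.999\ldots = 9 \\
  9x      &= 9 \;\Longrightarrow\; x = 1
\end{align*}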
The first problem is: what do equality and equivalence mean in this case? Essentially, contexts are quite free to define these terms.
The general tenor I got from various definitions is: For values called equal, it should make no difference which one you read from.
The grossest example that violates this expectation is C++: x and y are said to be equal if x == y evaluates to true, and x and y are said to be equivalent if !(x < y) && !(y < x). Even apart from user-defined overloads of these operators, for floating-point numbers (float, double) those are not the same: All NaN values are equivalent to each other (in fact, equivalent to everything), but not equal to anything including themselves, and the values -0.0 and +0.0 compare equal (and equivalent) although you can distinguish them if you’re clever.
In a lot of cases, you’d need better terms to convey your intent precisely. Given two variables x and y,
identity or "the same" for expressing that there is only one object and x and y refer to it. Any change done through x is inadvertently observable through y and vice versa. In Java, reference type variables are checked for identity using ==, in C# using the ReferenceEquals method. In C++, if x and y are references, std::addressof(x) == std::addressof(y) will do (whereas &x == &y will work most of the time, but & can be customized for user-defined types).
bitwise or structure equality for expressing that the internal representations of x and y are the same. Notice that bitwise equality breaks down when objects can reference (parts of) themselves internally. To get the intended meaning, the notion has to be refined in such cases to say: Structured the same. In D, bitwise equality is checked via is and C offers memcmp. I know of no language that has built-in structure equality testing.
indistinguishability or substitutability for expressing that values cannot be distinguished (through their public interface): If a function f takes two parameters and x and y are indistinguishable, the calls f(x, y), f(x, x), and f(y, y) always return indistinguishable values – unless f checks for identity (see bullet point above) directly or maybe by mutating the parameters. An example could be two search-trees that happen to contain indistinguishable elements but whose internal trees are laid out differently. The internal tree layout is an implementation detail that normally cannot be observed through the public methods.
This is also called Leibniz-equality after Gottfried Wilhelm Leibniz who defined equality as the lack of differences.
equivalence for expressing that objects represent values considered essentially the same from some abstract reasoning. For an example for distinguishable equivalent values, observe that floating-point numbers have a negative zero -0.0 distinct from +0.0, and e.g. sign(1/x) is different for -0.0 and +0.0. Equivalence for floating-point numbers is checked using == in many languages with C-like syntax (aka. Algol syntax). Most object-oriented languages check equivalence of objects using an equals (or similarly named) method. C# has the IEquatable<T> interface to designate that the class has a standard/canonical/default equivalence relation defined on it. In Java, one overrides the equals method every class inherits from Object.
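The floating-point corner cases mentioned above are easy to observe directly; a quick sketch in Haskell (the same results hold in any language with IEEE-754 floats):

nan, negZero :: Double
nan     = 0 / 0
negZero = -0.0

main :: IO ()
main = do
  print (nan == nan)               -- False: NaN is not equal even to itself
  print (negZero == 0.0)           -- True: -0.0 and +0.0 compare equal...
  print (recip negZero, recip 0.0) -- (-Infinity,Infinity): ...yet they are
                                   --   distinguishable if you're clever
  print (isNegativeZero negZero)   -- True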
As you can see, the notions become increasingly vague. Checking for identity is something most languages can express. Identity and bitwise equality usually cannot be hooked by the programmer as the notions are independent from interpretations. There was a C++20 proposal, which ended up being rejected, that would have introduced the last two notions as strong† and weak equality†. († This site looks like CppReference, but is not; it is not up-to-date.) The original paper is here.
There are languages without mutation, primarily functional languages like Haskell. The difference between equality and equivalence there is less of an issue and tilts to the mathematical use of those words. (In math, generally speaking, (recursively defined) sequences are used instead of re-assignments.)
Everything C has is also available to C++ and any language that can use C functionality. Everything said about C# is true for Visual Basic .NET and probably all languages built on the .NET framework. Analogously, Java represents the JRE languages that also include Kotlin and Scala.
If you just want stupid definitions without wisdom: an equivalence relation is a reflexive, symmetric, and transitive binary relation on a set. Equality then is the intersection of all those equivalence relations.