How do I denote this syntax's semantics? - bnf

I am writing a language specification, and I need the following rudimentary question resolved. Suppose I have the (admittedly contrived) abstract syntax:
<A> ::= <B> | <C>
<B> ::= 1 | 2 | 3
<C> ::= 4 | 5 | 6
What does the denotational semantics for this language look like? Non-terminals are enclosed in '<' and '>' and terminals are not. I want to map 1 ... 6 to the domain of natural numbers. The thing that is not at all clear to me is whether I need to provide mappings for non-terminals. It seems like I should not need to since, e.g., <A> ::= <B> | <C> has no meaning; it is simply a bit of structure. Ignore, for the moment, that we could eliminate this rule entirely.
So, as it stands, this is what I think the complete denotational definition should look like, where the right-hand side represents the corresponding value from the natural numbers:
[[1]] = one
[[2]] = two
[[3]] = three
[[4]] = four
[[5]] = five
[[6]] = six
Aesthetically, it seems odd not to mention A, B, or C at all, but I suppose that those symbols will never appear in an actual program (like: 4), so maybe this suffices. All of the materials I have on this subject omit this very simple, nuts-and-bolts aspect of the language definition process from their discussions.

Semantics is applied to the programs generated by the grammar. When you generate a program from this grammar you will not see non-terminals, so you do not need to define semantics for them.
Consider this example:
Exp ::= Num | Exp + Exp
Num ::= 0,1,2,...
In this example it is necessary to define the semantics of Num case by case, but it is also important to define semantics for Exp, because one of its productions, even though it is not a terminal (though it has some notion of terminality: the + is not going to change), has a special meaning assigned to it: addition.
So you will do as follows:
[[0]] = 0 // first 0 is just some string, the second is a number from N
[[1]] = 1 ...
[[Exp + Exp]] = [[Exp]] + [[Exp]] // the first + sign is just a string,
// the second is addition in N
As you can see, when you define semantics you are striving to give meaning to every possible program which can be generated by your grammar.
You can think of it as follows: when your production can be applied recursively you need to define semantics for it - that is the case for Exp + Exp. Num, however, leads straight to a terminal no matter what you do.
Sidenote:
It is worth mentioning why we are given a grammar when defining semantics.
Grammars are the handiest definitions of a language for this purpose, because semantics can be defined by induction over the structure of the language.
We supply rules for the base cases - the terminals - and for the non-trivial productions, like + in our example.
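To make the structural-induction idea concrete, here is a minimal Haskell sketch of a semantic function for the Exp grammar above (the data type and the name denote are mine, purely for illustration):
-- Abstract syntax for:  Exp ::= Num | Exp + Exp,   Num ::= 0,1,2,...
data Exp = Num Integer     -- a numeral, already read off as an Integer
         | Add Exp Exp     -- the "Exp + Exp" production
  deriving Show

-- The semantic function [[_]] : Exp -> N, defined by induction on structure.
denote :: Exp -> Integer
denote (Num n)     = n                      -- base case: numerals denote numbers
denote (Add e1 e2) = denote e1 + denote e2  -- the syntactic '+' means addition in N

main :: IO ()
main = print (denote (Add (Num 1) (Add (Num 2) (Num 3))))   -- [[1 + (2 + 3)]] = 6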


Perl 6 calculate the average of an int array at once using reduce

I'm trying to calculate the average of an integer array using the reduce function in one step. I can't do this:
say (reduce {($^a + $^b)}, <1 2 3>) / <1 2 3>.elems;
because it calculates the average in 2 separate pieces.
I need to do it like:
say reduce {($^a + $^b) / .elems}, <1 2 3>;
but it doesn't work of course.
How to do it in one step? (Using map or some other function is welcomed.)
TL;DR This answer starts with an idiomatic way to write equivalent code before discussing P6 flavored "tacit" programming and increasing brevity. I've also added "bonus" footnotes about the hyperoperation Håkon++ used in their first comment on your question.5
Perhaps not what you want, but an initial idiomatic solution
We'll start with a simple solution.1
P6 has built in routines2 that do what you're asking. Here's a way to do it using built in subs:
say { sum($_) / elems($_) }(<1 2 3>); # 2
And here it is using corresponding3 methods:
say { .sum / .elems }(<1 2 3>); # 2
What about "functional programming"?
First, let's replace .sum with an explicit reduction:
.reduce(&[+]) / .elems
When & is used at the start of an expression in P6 you know the expression refers to a Callable as a first class citizen.
A longhand way to refer to the infix + operator as a function value is &infix:<+>. The shorthand way is &[+].
As you clearly know, the reduce routine takes a binary operation as an argument and applies it to a list of values. In method form (invocant.reduce) the "invocant" is the list.
The above code calls two methods -- .reduce and .elems -- that have no explicit invocant. This is a form of "tacit" programming; methods written this way implicitly (or "tacitly") use $_ (aka "the topic" or simply "it") as their invocant.
Topicalizing (explicitly establishing what "it" is)
given binds a single value to $_ (aka "it") for a single statement or block.
(That's all given does. Many other keywords also topicalize but do something else too. For example, for binds a series of values to $_, not just one.)
Thus you could write:
say .reduce(&[+]) / .elems given <1 2 3>; # 2
Or:
$_ = <1 2 3>;
say .reduce(&[+]) / .elems; # 2
But given that your focus is FP, there's another way that you should know.
Blocks of code and "it"
First, wrap the code in a block:
{ .reduce(&[+]) / .elems }
The above is a Block, and thus a lambda. Lambdas without a signature get a default signature that accepts one optional argument.
Now we could again use given, for example:
say do { .reduce(&[+]) / .elems } given <1 2 3>; # 2
But we can also just use ordinary function call syntax:
say { .reduce(&[+]) / .elems }(<1 2 3>)
Because a postfix (...) calls the Callable on its left, and because in the above case one argument is passed in the parens to a block that expects one argument, the net result is the same as the do4 and the given in the prior line of code.
Brevity with built ins
Here's another way to write it:
<1 2 3>.&{.sum/.elems}.say; #2
This calls a block as if it were a method. Imo that's still eminently readable, especially if you know P6 basics.
Or you can start to get silly:
<1 2 3>.&{.sum/$_}.say; #2
This is still readable if you know P6. The / is a numeric (division) operator. Numeric operators coerce their operands to be numeric. In the above $_ is bound to <1 2 3> which is a list. And in Perls, a collection in numeric context is its number of elements.
Changing P6 to suit you
So far I've stuck with standard P6.
You can of course write subs or methods and name them using any Unicode letters. If you want single letter aliases for sum and elems then go ahead:
my (&s, &e) = &sum, &elems;
But you can also extend or change the language as you see fit. For example, you can create user defined operators:
#| LHS ⊛ RHS.
#| LHS is an arbitrary list of input values.
#| RHS is a list of reducer function, then functions to be reduced.
sub infix:<⊛> (@lhs, *@rhs (&reducer, *@fns where *.all ~~ Callable)) {
  reduce &reducer, @fns».(@lhs)
}
say <1 2 3> ⊛ (&[/], &sum, &elems); # 2
I won't bother to explain this for now. (Feel free to ask questions in the comments.) My point is simply to highlight that you can introduce arbitrary (prefix, infix, circumfix, etc.) operators.
And if custom operators aren't enough you can change any of the rest of the syntax. cf "braid".
Footnotes
1 This is how I would normally write code to do the computation asked for in the question. @timotimo++'s comment nudged me to alter my presentation to start with that, and only then shift gears to focus on a more FPish solution.
2 In P6 all built in functions are referred to by the generic term "routine" and are instances of a sub-class of Routine -- typically a Sub or Method.
3 Not all built in sub routines have correspondingly named method routines. And vice-versa. Conversely, sometimes there are correspondingly named routines but they don't work exactly the same way (with the most common difference being whether or not the first argument to the sub is the same as the "invocant" in the method form.) In addition, you can call a subroutine as if it were a method using the syntax .&foo for a named Sub or .&{ ... } for an anonymous Block, or call a method foo in a way that looks rather like a subroutine call using the syntax foo invocant: or foo invocant: arg2, arg3 if it has arguments beyond the invocant.
4 If a block is used where it should obviously be invoked then it is. If it's not invoked then you can use an explicit do statement prefix to invoke it.
5 Håkon's first comment on your question used "hyperoperation". With just one easy to recognize and remember "metaop" (for unary operations) or a pair of them (for binary operations), hyperoperations distribute an operation to all the "leaves"6 of a data structure (for a unary operation) or create a new one based on pairing up the "leaves" of a pair of data structures (for binary operations). NB. Hyperoperations are done in parallel7.
6 What is a "leaf" for a hyperoperation is determined by a combination of the operation being applied (see the is nodal trait) and whether a particular element is Iterable.
7 Hyperoperation is applied in parallel, at least semantically. Hyperoperation assumes8 that the operations on the "leaves" have no mutually interfering side-effects -- that is to say, that any side effect when applying the operation to one "leaf" can safely be ignored with respect to applying the operation to any other "leaf".
8 By using a hyperoperation the developer is declaring that the assumption of no meaningful side-effects is correct. The compiler will act on the basis it is, but will not check that it is true. In the safety sense it's like a loop with a condition. The compiler will follow the dev's instructions, even if the result is an infinite loop.
Here is an example using given and the reduction meta operator:
given <1 2 3> { say ([+] $_)/$_.elems } ;

Simple example of call-by-need

I'm trying to understand the theorem behind "call-by-need." I do understand the definition, but I'm a bit confused. I would like to see a simple example which shows how call-by-need works.
After reading some previous threads, I found out that Haskell uses this kind of evaluation. Are there any other programming languages which support this feature?
I read about the call-by-name of Scala, and I do understand that call-by-name and call-by-need are similar but different by the fact that call-by-need will keep the evaluated value. But I really would love to see a real-life example (it does not have to be in Haskell), which shows call-by-need.
The function
say_hello numbers = putStrLn "Hello!"
ignores its numbers argument. Under call-by-value semantics, even though an argument is ignored, the parameter at the function call site may need to be evaluated, perhaps because of side effects that the rest of the program depends on.
In Haskell, we might call say_hello as
say_hello [1..]
where [1..] is the infinite list of naturals. Under call-by-value semantics, the CPU would run off trying to build an infinite list and never get to the say_hello at all!
Haskell merely outputs
$ runghc cbn.hs
Hello!
For less dramatic examples, the first ten natural numbers are
ghci> take 10 [1..]
[1,2,3,4,5,6,7,8,9,10]
The first ten odds are
ghci> take 10 $ filter odd [1..]
[1,3,5,7,9,11,13,15,17,19]
Under call-by-need semantics, each value — even a conceptually infinite one as in the examples above — is evaluated only to the extent required and no more.
update: A simple example, as asked for:
ff 0 = 1
ff 1 = 1
ff n = go (ff (n-1))
  where
    go x = x + x
Under call-by-name, each invocation of go evaluates ff (n-1) twice, once for each appearance of x in its definition (because + is strict in both arguments, i.e. it demands the values of both of them).
Under call-by-need, go's argument is evaluated at most once. Specifically, here, x's value is found out only once, and reused for the second appearance of x in the expression x + x. If it weren't needed, x wouldn't be evaluated at all, just as with call-by-name.
Under call-by-value, go's argument is always evaluated exactly once, prior to entering the function's body, even if it isn't used anywhere in the function's body.
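One way to observe the sharing described above is to attach Debug.Trace.trace to the argument and count how often the message is printed (a small sketch of my own, not part of the answer):
import Debug.Trace (trace)

go :: Int -> Int
go x = x + x   -- x appears twice, but under call-by-need it is evaluated once

main :: IO ()
main = print (go (trace "evaluating the argument" (2 + 3)))
-- GHC (call-by-need) prints "evaluating the argument" exactly once, then 10.
-- Under call-by-name it would be printed twice; under call-by-value, once,
-- but before go's body is even entered, whether or not x were used.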
Here's my understanding of it, in the context of Haskell.
According to Wikipedia, "call by need is a memoized variant of call by name where, if the function argument is evaluated, that value is stored for subsequent uses."
Call by name:
take 10 . filter even $ [1..]
With one consumer the produced value disappears after being produced so it might as well be call-by-name.
Call by need:
import qualified Data.List.Ordered as O
h = 1 : map (2*) h <> map (3*) h <> map (5*) h
  where
    (<>) = O.union
The difference is, here the h list is reused by several consumers, at different tempos, so it is essential that the produced values are remembered. In a call-by-name language there'd be much replication of computational effort here because the computational expression for h would be substituted at each of its occurrences, causing separate calculation for each. In a call-by-need--capable language like Haskell the results of computing the elements of h are shared between each reference to h.
Another example: most any data defined by fix is only possible under call-by-need. With call-by-value the most we can have is the Y combinator.
See: Sharing vs. non-sharing fixed-point combinator and its linked entries and comments (among them, this, and its links, like Can fold be used to create infinite lists?).
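As a small illustration of the point about fix (my own sketch): data tied into a cycle with fix only works if the result of the recursive call is shared, i.e. under call-by-need.
import Data.Function (fix)

-- An infinite list of ones: under call-by-need this is a single cons cell
-- that refers back to itself, built in constant space.
ones :: [Integer]
ones = fix (1 :)   -- fix f = let x = f x in x, so ones = 1 : ones

main :: IO ()
main = print (take 5 ones)   -- [1,1,1,1,1]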

Left/Right recursion and Bison parsing stack behavior

So in my < 24 hours of bison/flex investigation I've seen lots of documentation that indicates that left recursion is better than right recursion. Some places even mention that with left recursion, you need constant space on the Bison parser stack whereas right recursion requires order N space. However, I can't quite find any sources that explains what is going on explicitly.
As an example (parser that only adds and subtracts):
Scanner:
%%
[0-9]+ {return NUMBER;}
%%
Parser:
%%
/* Left */
expression:
NUMBER
| expression '+' NUMBER { $$ = $1 + $3; }
| expression '-' NUMBER { $$ = $1 - $3; }
;
/* Right */
expression:
NUMBER
| NUMBER '+' expression { $$ = $1 + $3; }
| NUMBER '-' expression { $$ = $1 - $3; }
;
%%
For the example of 1+5-2, it seems with left recursion, the parser receives '1' from the lexer and sees that '1' matches expression: NUMBER and pushes an expression of value 1 to the parser stack. It sees + and pushes. Then it sees 5 and the expression(1), + and 5 matches expression: expression '+' NUMBER so it pops twice, does the math and pushes a new expression with value 6 on the stack and then repeats for the subtraction. At any one point, there is at max, 3 symbols on the stack. So it's like an in-place calculation and operates left to right.
With right recursion, I'm not sure why it has to load all symbols on the stack but I'm going to attempt to describe why that might be the case. It sees a 1 and matches expression: NUMBER so it pushes an expression with value 1 on the stack. It pushes '+' on the stack. When it sees the 5, my first thought is that 5 on its own could match expression: NUMBER and hence be an expression of value 5 and then it plus the last two symbols on the stack could match expression: NUMBER '+' expression but my assumption is that because expression is on the right of the rule, that it can't jump the gun and evaluate 5 as an expression as a NUMBER since with LALR(1), it already knows more symbols are coming so it has to wait until it hits the end of the list?
TL;DR;
Can someone maybe explain with some detail how Bison manages its parse stack relative to how it does its shift/reduction with the parser grammar rules? Silly/contrived examples welcome!
With LR (bottom-up) parsing, each non-terminal is reduced precisely when its last token is encountered. (LALR parsing is a simplified LR parse which handles lookahead slightly less precisely.) Until a non-terminal is reduced, all its components live on the stack. So if you use right recursion and you are parsing
NUMBER + NUMBER + NUMBER + NUMBER
the reductions won't start until you get to the end, because each NUMBER starts an expression and all the expressions end at the last NUMBER.
If you use left recursion, each NUMBER terminates an expression, so a reduction happens each time a NUMBER is encountered.
That's not the reason to use left-recursion though. You use left-recursion because it accurately describes the language. If you have 7 - 2 - 1, you want the result to be 4, because that's what algebraic rules require: the expression is parsed as though it were (7 - 2) - 1, so 7 - 2 must be reduced first. With right-recursion, you would incorrectly evaluate that as 6, because the 2 - 1 would reduce first.
Most operators associate to the left, so you use left-recursion. For the occasional operator which associates to the right, you need right recursion and you have to live with the stack growing. It's not a big deal. Your machine has tons of memory.
As an example, consider assignment. a = b = 42 means a = (b = 42). If you did it left associatively, you'd first set a to b, and then attempt to set something to 42; (a = b) = 42 wouldn't make sense in most languages and it is certainly not the expected action.
LL (topdown) parsing uses lookahead to predict which production will be reduced. It can't handle left-recursion at all because the prediction ends up in a recursive loop: expression starts with an expression which starts with an expression … and the parser never manages to predict a NUMBER. So with LL parsers, you have to use right-recursion and then your grammar does not correctly describe the language (assuming that the language has left-associative operators, as is usually the case). Some people don't mind this, but I think that grammars should actually indicate the correct parse, and I find the modification necessary to make a grammar parseable with a top-down parser to be messy and hard to read. Your mileage may vary.
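To see the recursive loop concretely, here is a sketch of my own (in Haskell rather than Bison, since a hand-written recursive-descent parser is the simplest stand-in for an LL parser): the left-recursive version calls itself before consuming any input and never terminates, while the right-recursive version terminates but associates to the right.
data Token = TNum Int | TPlus deriving Show

-- Top-down parser for the left-recursive rule  expression ::= expression '+' NUMBER | NUMBER
-- It recurses on expression before consuming a single token, so it loops forever.
parseExprL :: [Token] -> Maybe (Int, [Token])
parseExprL ts = case parseExprL ts of
  Just (lhs, TPlus : TNum n : rest) -> Just (lhs + n, rest)
  _ -> case ts of
         TNum n : rest -> Just (n, rest)
         _             -> Nothing

-- Top-down parser for the right-recursive rule  expression ::= NUMBER '+' expression | NUMBER
-- This one terminates, because each call consumes a NUMBER before recursing.
parseExprR :: [Token] -> Maybe (Int, [Token])
parseExprR (TNum n : TPlus : rest) = fmap (\(r, ts) -> (n + r, ts)) (parseExprR rest)
parseExprR (TNum n : rest)         = Just (n, rest)
parseExprR _                       = Nothing

main :: IO ()
main = print (parseExprR [TNum 1, TPlus, TNum 5, TPlus, TNum 2])   -- Just (8,[])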
By the way, "force down your throat" is a very ungenerous description of documentation which is trying to give you good advice. It is good to be skeptical -- you understand things better if you work at figuring out why they work the way they do -- but many people just want the good advice.
So after reading this rather important page in the bison documentation:
https://www.gnu.org/software/bison/manual/html_node/Lookahead.html#Lookahead
combined with running with
%debug
and
yydebug = 1;
in my main()
I think I see exactly what is happening.
With left recursion, it sees a 1 and shifts it. The lookahead is now +. Then it determines that it can reduce the 1 to an expression via expression: NUMBER. So it pops and puts an expression on the stack. It shifts + and NUMBER(5) and then sees it can reduce via expression: expression '+' NUMBER and pops 3x and pushes a new expression(6). Basically, by using the lookahead and the rules, bison can determine if it needs to shift or reduce at any time as it reads the tokens. As this repeats, at most there are 3 symbols/groupings on the parse stack (for this simplified expression evaluation section).
With right recursion, it sees a 1 and shifts it. The lookahead is now +. The parser sees no reason to reduce 1 to an expression(1) because there is no rule that goes expression '+' so it just continues shifting each token until it gets to the end of the input. At this point, there is [NUMBER, +, NUMBER, -, NUMBER] on the stack so it sees that the most recent number can be reduced to expression(2) and then shifts that. Then the rules start getting applied (expression: NUMBER '-' expression) etc etc.
So the key to my understanding is that Bison uses the lookahead token to make intelligent decisions about reducing now or just shifting based on the rules it has at its disposal.

Count all possible computer programs of length N

I will make this easy to understand using Javascript as language
Using ascii as the alphabet, and given all possible "statements" of length N formed using ascii, how many will be valid JS programs? For example, for length 8:
var i=0; //Is a string 8 characters long, and valid JS
jar i=0; //Is a string 8 characters long, but invalid JS
gfjsjhh3 //Is a string 8 characters long, but invalid JS
now imagine we have all possible strings 8 characters long.
How many will be valid JS?
Further rules:
1) variables are as short as possible
2) no blank spaces except where necessary
More formal definition of problem:
If we are given a grammar G for alphabet K, and also the collection of all possible combinations (statements) of length N that can be formed with K(or K* up to length N), how many of those statements will satisfy grammar G.
I know you are thinking this is some academic, away-from-reality stuff. However, if the number of statements that are programs is much less than the total number of combinations, you could use small "numbers" to refer to locations where these sparse programs appear in the sea of combinations, and be able to send this "address" instead of the whole program, greatly reducing the payload.
The grammar is usually only one aspect of validity checks, check out ES6 §5.3. And grammars are poorly suited to capturing your "as short as possible" requirement. But grammars are a good tool to reason about things, so let's concentrate on that. It will still allow invalid programs, and programs with long variable names, but as a starting point for proof of concept (whatever your concept may be) it should suffice.
You might start by thinking about derivation trees based on some BNF grammar for JS. Something like sections §11 to §15 of the ES6 standard, but make sure that the grammar you use goes down to the character level, not treating a whole identifier as a single terminal. For your hypothetical compression scheme, if you encode the choice at each node of the tree, you have something like the compression effect you describe.
In order to count the number of programs of a given length, you could do dynamic programming on the BNF rules. So if you have a rule which says
IndexedMemberExpression ::= MemberExpression '[' Expression ']'
(loosely modeled after ES6 §12.3) then you know that an IndexedMemberExpression of length n is made up of a MemberExpression of length i and an Expression of length n − i − 2, for some 0 ≤ i ≤ n − 2, so if you know how many ways there are for those, you know for the whole expression. Have one array of length 1000 for each BNF non-terminal, fill them in order of increasing length, and you get the number of derivation trees (i.e. grammatically correct programs) for the root rule.
Actually an IndexedMemberExpression is just one kind of MemberExpression, and you'd want to sum all of those up. So it's more like this:
allocate an array of size 1000 for each non-terminal,
and initialize all the elements to zero
for n from 0 to 1000:
    …
    # compute PrimaryExpression[n] first
    # rule MemberExpression ::= PrimaryExpression
    MemberExpression[n] = PrimaryExpression[n]
    # rule MemberExpression ::= MemberExpression '[' Expression ']'
    if n >= 2:
        for i from 0 to n - 2:
            MemberExpression[n] += MemberExpression[i] * Expression[n-i-2]
    # rule MemberExpression ::= MemberExpression '.' IdentifierName
    if n >= 1:
        for i from 0 to n - 1:
            MemberExpression[n] += MemberExpression[i] * IdentifierName[n-i-1]
    …
Do this for all the rules, in an order which makes sure that you fully update each non-terminal on the left hand side of the increments before using it for the first time on the right hand side.
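For a proof of concept, here is the same dynamic program in Haskell for a toy grammar of my own (standing in for the ES6 rules); laziness takes care of the evaluation order the answer describes, since each entry only refers to entries for strictly shorter lengths:
import Data.Array

-- Toy grammar:  Expr ::= Num | Expr '+' Expr,   Num ::= '0' | '1' | ... | '9'
-- expr ! n = number of derivation trees for Expr whose yield is n characters long.
-- (For an ambiguous grammar this counts trees, not distinct strings.)
maxLen :: Int
maxLen = 20

num, expr :: Array Int Integer
num  = listArray (0, maxLen) [ if n == 1 then 10 else 0 | n <- [0 .. maxLen] ]
expr = listArray (0, maxLen) [ countExpr n | n <- [0 .. maxLen] ]
  where
    -- rule Expr ::= Num, plus rule Expr ::= Expr '+' Expr (the '+' costs one character)
    countExpr n = num ! n + sum [ expr ! i * expr ! (n - i - 1) | i <- [0 .. n - 2] ]

main :: IO ()
main = mapM_ (\n -> print (n, expr ! n)) [1, 3, 5]   -- (1,10), (3,100), (5,2000)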
Note that for almost any character sequence of length n, the corresponding string literal will be a valid JS expression of length n + 2. So don't expect the number of valid programs to have a radically different asymptotic behavior compared to the number of arbitrary character sequences.

What are the most interesting equivalences arising from the Curry-Howard Isomorphism?

I came upon the Curry-Howard Isomorphism relatively late in my programming life, and perhaps this contributes to my being utterly fascinated by it. It implies that for every programming concept there exists a precise analogue in formal logic, and vice versa. Here's a "basic" list of such analogies, off the top of my head:
program/definition | proof
type/declaration | proposition
inhabited type | theorem/lemma
function | implication
function argument | hypothesis/antecedent
function result | conclusion/consequent
function application | modus ponens
recursion | induction
identity function | tautology
non-terminating function | absurdity/contradiction
tuple | conjunction (and)
disjoint union | disjunction (or) -- corrected by Antal S-Z
parametric polymorphism | universal quantification
So, to my question: what are some of the more interesting/obscure implications of this isomorphism? I'm no logician so I'm sure I've only scratched the surface with this list.
For example, here are some programming notions for which I'm unaware of pithy names in logic:
currying | "((a & b) => c) iff (a => (b => c))"
scope | "known theory + hypotheses"
And here are some logical concepts which I haven't quite pinned down in programming terms:
primitive type? | axiom
set of valid programs? | theory
Edit:
Here are some more equivalences collected from the responses:
function composition | syllogism -- from Apocalisp
continuation-passing | double negation -- from camccann
Since you explicitly asked for the most interesting and obscure ones:
You can extend C-H to many interesting logics and formulations of logics to obtain a really wide variety of correspondences. Here I've tried to focus on some of the more interesting ones rather than on the obscure, plus a couple of fundamental ones that haven't come up yet.
evaluation | proof normalisation/cut-elimination
variable | assumption
S K combinators | axiomatic formulation of logic
pattern matching | left-sequent rules
subtyping | implicit entailment (not reflected in expressions)
intersection types | implicit conjunction
union types | implicit disjunction
open code | temporal next
closed code | necessity
effects | possibility
reachable state | possible world
monadic metalanguage | lax logic
non-termination | truth in an unobservable possible world
distributed programs | modal logic S5/Hybrid logic
meta variables | modal assumptions
explicit substitutions | contextual modal necessity
pi-calculus | linear logic
EDIT: A reference I'd recommend to anyone interested in learning more about extensions of C-H:
"A Judgmental Reconstruction of Modal Logic" http://www.cs.cmu.edu/~fp/papers/mscs00.pdf - this is a great place to start because it starts from first principles and much of it is aimed to be accessible to non-logicians/language theorists. (I'm the second author though, so I'm biased.)
You're muddying things a little bit regarding nontermination. Falsity is represented by uninhabited types, which by definition can't be non-terminating because there's nothing of that type to evaluate in the first place.
Non-termination represents contradiction--an inconsistent logic. An inconsistent logic will of course allow you to prove anything, including falsity, however.
Ignoring inconsistencies, type systems typically correspond to an intuitionistic logic, and are by necessity constructivist, which means certain pieces of classical logic can't be expressed directly, if at all. On the other hand this is useful, because if a type is a valid constructive proof, then a term of that type is a means of constructing whatever you've proven the existence of.
A major feature of the constructivist flavor is that double negation is not equivalent to non-negation. In fact, negation is rarely a primitive in a type system, so instead we can represent it as implying falsehood, e.g., not P becomes P -> Falsity. Double negation would thus be a function with type (P -> Falsity) -> Falsity, which clearly is not equivalent to something of just type P.
However, there's an interesting twist on this! In a language with parametric polymorphism, type variables range over all possible types, including uninhabited ones, so a fully polymorphic type such as ∀a. a is, in some sense, almost-false. So what if we write double almost-negation by using polymorphism? We get a type that looks like this: ∀a. (P -> a) -> a. Is that equivalent to something of type P? Indeed it is, merely apply it to the identity function.
But what's the point? Why write a type like that? Does it mean anything in programming terms? Well, you can think of it as a function that already has something of type P somewhere, and needs you to give it a function that takes P as an argument, with the whole thing being polymorphic in the final result type. In a sense, it represents a suspended computation, waiting for the rest to be provided. In this sense, these suspended computations can be composed together, passed around, invoked, whatever. This should begin to sound familiar to fans of some languages, like Scheme or Ruby--because what it means is that double-negation corresponds to continuation-passing style, and in fact the type I gave above is exactly the continuation monad in Haskell.
Your chart is not quite right; in many cases you have confused types with terms.
function type | implication
function | proof of implication
function argument | proof of hypothesis
function result | proof of conclusion
function application | RULE modus ponens
recursion | n/a [1]
structural induction | fold (foldr for lists)
mathematical induction | fold for naturals (data N = Z | S N)
identity function | proof of A -> A, for all A
non-terminating function | n/a [2]
tuple | normal proof of conjunction
sum | disjunction
n/a [3] | first-order universal quantification
parametric polymorphism | second-order universal quantification
currying | (A,B) -> C -||- A -> (B -> C), for all A,B,C
primitive type | axiom
types of typeable terms | theory
function composition | syllogism
substitution | cut rule
value | normal proof
[1] The logic for a Turing-complete functional language is inconsistent. Recursion has no correspondence in consistent theories. In an inconsistent logic/unsound proof theory you could call it a rule which causes inconsistency/unsoundness.
[2] Again, this is a consequence of Turing-completeness. This would be a proof of an anti-theorem if the logic were consistent -- thus, it can't exist.
[3] Doesn't exist in functional languages, since they elide first-order logical features: all quantification and parametrization is done over formulae. If you had first-order features, there would be a kind other than *, * -> *, etc.; the kind of elements of the domain of discourse. For example, in Father(X,Y) :- Parent(X,Y), Male(X), X and Y range over the domain of discourse (call it Dom), and Male :: Dom -> *.
function composition | syllogism
I really like this question. I don't know a whole lot, but I do have a few things (assisted by the Wikipedia article, which has some neat tables and such itself):
I think that sum types/union types (e.g. data Either a b = Left a | Right b) are equivalent to inclusive disjunction. And, though I'm not very well acquainted with Curry-Howard, I think this demonstrates it. Consider the following function:
andImpliesOr :: (a,b) -> Either a b
andImpliesOr (a,_) = Left a
If I understand things correctly, the type says that (a ∧ b) → (a ★ b) and the definition says that this is true, where ★ is either inclusive or exclusive or, whichever Either represents. You have Either representing exclusive or, ⊕; however, (a ∧ b) ↛ (a ⊕ b). For instance, ⊤ ∧ ⊤ ≡ ⊤, but ⊤ ⊕ ⊤ ≡ ⊥, and ⊤ ↛ ⊥. In other words, if both a and b are true, then the hypothesis is true but the conclusion is false, and so this implication must be false. However, clearly, (a ∧ b) → (a ∨ b), since if both a and b are true, then at least one is true. Thus, if discriminated unions are some form of disjunction, they must be the inclusive variety. I think this holds as a proof, but feel more than free to disabuse me of this notion.
Similarly, your definitions for tautology and absurdity as the identity function and non-terminating functions, respectively, are a bit off. The true formula is represented by the unit type, which is the type which has only one element (data ⊤ = ⊤; often spelled () and/or Unit in functional programming languages). This makes sense: since that type is guaranteed to be inhabited, and since there's only one possible inhabitant, it must be true. The identity function just represents the particular tautology that a → a.
Your comment about non-terminating functions is, depending on what precisely you meant, more off. Curry-Howard functions on the type system, but non-termination is not encoded there. According to Wikipedia, dealing with non-termination is an issue, as adding it produces inconsistent logics (e.g., I can define wrong :: a -> b by wrong x = wrong x, and thus “prove” that a → b for any a and b). If this is what you meant by “absurdity”, then you're exactly correct. If instead you meant the false statement, then what you want instead is any uninhabited type, e.g. something defined by data ⊥—that is, a data type without any way to construct it. This ensures that it has no values at all, and so it must be uninhabited, which is equivalent to false. I think you could probably also use a -> b, since if we forbid non-terminating functions, then this is also uninhabited, but I'm not 100% sure.
Wikipedia says that axioms are encoded in two different ways, depending on how you interpret Curry-Howard: either in the combinators or in the variables. I think the combinator view means that the primitive functions we are given encode the things we can say by default (similar to the way that modus ponens is an axiom because function application is primitive). And I think that the variable view may actually mean the same thing—combinators, after all, are just global variables which are particular functions. As for primitive types: if I'm thinking about this correctly, then I think that primitive types are the entities—the primitive objects that we're trying to prove things about.
According to my logic and semantics class, the fact that (a ∧ b) → c ≡ a → (b → c) (and also that b → (a → c)) is called the exportation equivalence law, at least in natural deduction proofs. I didn't notice at the time that it was just currying—I wish I had, because that's cool!
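For what it's worth, the exportation/currying equivalence is witnessed directly by the standard curry and uncurry functions (the names exportation and importation below are mine):
-- curry   :: ((a, b) -> c) -> a -> b -> c   --  (A ∧ B) → C  gives  A → (B → C)
-- uncurry :: (a -> b -> c) -> (a, b) -> c   --  A → (B → C)  gives  (A ∧ B) → C
exportation :: ((a, b) -> c) -> a -> b -> c
exportation = curry

importation :: (a -> b -> c) -> (a, b) -> c
importation = uncurry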
While we now have a way to represent inclusive disjunction, we don't have a way to represent the exclusive variety. We should be able to use the definition of exclusive disjunction to represent it: a ⊕ b ≡ (a ∨ b) ∧ ¬(a ∧ b). I don't know how to write negation, but I do know that ¬p ≡ p → ⊥, and both implication and falsehood are easy. We should thus able to represent exclusive disjunction by:
data ⊥
data Xor a b = Xor (Either a b) ((a,b) -> ⊥)
This defines ⊥ to be the empty type with no values, which corresponds to falsity; Xor is then defined to contain both (and) Either an a or a b (or) and a function (implication) from (a,b) (and) to the bottom type (false). However, I have no idea what this means. (Edit 1: Now I do, see the next paragraph!) Since there are no values of type (a,b) -> ⊥ (are there?), I can't fathom what this would mean in a program. Does anyone know a better way to think about either this definition or another one? (Edit 1: Yes, camccann.)
Edit 1: Thanks to camccann's answer (more particularly, the comments he left on it to help me out), I think I see what's going on here. To construct a value of type Xor a b, you need to provide two things. First, a witness to the existence of an element of either a or b as the first argument; that is, a Left a or a Right b. And second, a proof that there are not elements of both types a and b—in other words, a proof that (a,b) is uninhabited—as the second argument. Since you'll only be able to write a function from (a,b) -> ⊥ if (a,b) is uninhabited, what does it mean for that to be the case? That would mean that some part of an object of type (a,b) could not be constructed; in other words, that at least one, and possibly both, of a and b are uninhabited as well! In this case, if we're thinking about pattern matching, you couldn't possibly pattern-match on such a tuple: supposing that b is uninhabited, what would we write that could match the second part of that tuple? Thus, we cannot pattern match against it, which may help you see why this makes it uninhabited. Now, the only way to have a total function which takes no arguments (as this one must, since (a,b) is uninhabited) is for the result to be of an uninhabited type too—if we're thinking about this from a pattern-matching perspective, this means that even though the function has no cases, there's no possible body it could have either, and so everything's OK.
A lot of this is me thinking aloud/proving (hopefully) things on the fly, but I hope it's useful. I really recommend the Wikipedia article; I haven't read through it in any sort of detail, but its tables are a really nice summary, and it's very thorough.
Here's a slightly obscure one that I'm surprised wasn't brought up earlier: "classical" functional reactive programming corresponds to temporal logic.
Of course, unless you're a philosopher, mathematician or obsessive functional programmer, this probably brings up several more questions.
So, first off: what is functional reactive programming? It's a declarative way to work with time-varying values. This is useful for writing things like user interfaces because inputs from the user are values that vary over time. "Classical" FRP has two basic data types: events and behaviors.
Events represent values which only exist at discrete times. Keystrokes are a great example: you can think of the inputs from the keyboard as a character at a given time. Each keypress is then just a pair with the character of the key and the time it was pressed.
Behaviors are values that exist constantly but can be changing continuously. The mouse position is a great example: it is just a behavior of x, y coordinates. After all, the mouse always has a position and, conceptually, this position changes continually as you move the mouse. After all, moving the mouse is a single protracted action, not a bunch of discrete steps.
And what is temporal logic? Appropriately enough, it's a set of logical rules for dealing with propositions quantified over time. Essentially, it extends normal first-order logic with two quantifiers: □ and ◇. The first means "always": read □φ as "φ always holds". The second is "eventually": ◇φ means that "φ will eventually hold". This is a particular kind of modal logic. The following two laws relate the quantifiers:
□φ ⇔ ¬◇¬φ
◇φ ⇔ ¬□¬φ
So □ and ◇ are dual to each other in the same way as ∀ and ∃.
These two quantifiers correspond to the two types in FRP. In particular, □ corresponds to behaviors and ◇ corresponds to events. If we think about how these types are inhabited, this should make sense: a behavior is inhabited at every possible time, while an event only happens once.
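A minimal model of the two classical FRP types as functions of time makes the analogy concrete (this is my own simplified sketch, not the API of any real FRP library): a Behavior has a value at every time, like □, while an Event is a set of discrete occurrences, like ◇.
type Time = Double

-- "Always": a behavior has a value at every point in time.
newtype Behavior a = Behavior (Time -> a)

-- "Eventually": an event is a list of discrete (time, value) occurrences.
newtype Event a = Event [(Time, a)]

-- The mouse position as a behavior (a made-up trajectory, for illustration):
mousePos :: Behavior (Double, Double)
mousePos = Behavior (\t -> (sin t, cos t))

-- Keystrokes as an event stream:
keyPresses :: Event Char
keyPresses = Event [(0.5, 'h'), (1.2, 'i')]

main :: IO ()
main = let Behavior f = mousePos in print (f 0)   -- (0.0,1.0)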
Related to the relationship between continuations and double negation, the type of call/cc is Peirce's law http://en.wikipedia.org/wiki/Call-with-current-continuation
C-H is usually stated as correspondence between intuitionistic logic and programs. However if we add the call-with-current-continuation (callCC) operator (whose type corresponds to Peirce's law), we get a correspondence between classical logic and programs with callCC.
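In Haskell, the type of callCC from the standard Control.Monad.Cont module makes the connection visible: reading Cont r as plain truth, ((a -> Cont r b) -> Cont r a) -> Cont r a is exactly Peirce's law ((A → B) → A) → A. A small usage sketch (the example itself is mine):
import Control.Monad.Cont (Cont, callCC, runCont)

-- Escape early from a computation: return the first negative element, or 0.
firstNegative :: [Int] -> Cont r Int
firstNegative xs = callCC $ \exit -> do
  mapM_ (\x -> if x < 0 then exit x else return ()) xs
  return 0   -- reached only when no negative element was found

main :: IO ()
main = print (runCont (firstNegative [3, 1, -4, 1, 5]) id)   -- -4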
2-continuation | Sheffer stroke
n-continuation language | Existential graph
Recursion | Mathematical Induction
One thing that is important, but has not yet been investigated, is the relationship between 2-continuations (continuations that take 2 parameters) and the Sheffer stroke. In classical logic, the Sheffer stroke can form a complete logical system by itself (plus some non-operator concepts), which means the familiar and, or, and not can be implemented using only the Sheffer stroke, or nand.
This is an important fact for the corresponding programming types, because it suggests that a single type combinator could be used to form all other types.
The type signature of a 2-continuation is (a,b) -> Void. With this we can define a 1-continuation (a normal continuation) as (a,a) -> Void, a product type as ((a,b)->Void,(a,b)->Void)->Void, and a sum type as ((a,a)->Void,(b,b)->Void)->Void. This gives an impression of its expressive power.
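Written out as Haskell types, those encodings look like this (a sketch of my own; the type names are made up):
import Data.Void (Void)

-- A 2-continuation consumes a pair and never returns:
type K2 a b = (a, b) -> Void

-- A normal (1-)continuation as the diagonal case:
type K1 a = K2 a a                          -- (a, a) -> Void

-- Product and sum types rebuilt from 2-continuations alone:
type Product a b = K2 (K2 a b) (K2 a b)     -- ((a,b) -> Void, (a,b) -> Void) -> Void
type Sum a b     = K2 (K1 a) (K1 b)         -- ((a,a) -> Void, (b,b) -> Void) -> Void

-- For example, an ordinary pair can be injected into the encoded product:
pairToProduct :: (a, b) -> Product a b
pairToProduct ab (k, _) = k ab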
If we dig further, we will find that Peirce's existential graphs are equivalent to a language whose only data type is the n-continuation, but I haven't seen any existing language in this form. So inventing one could be interesting, I think.
While it's not a simple isomorphism, this discussion of constructive LEM is a very interesting result. In particular, in the conclusion section, Oleg Kiselyov discusses how the use of monads to get double-negation elimination in a constructive logic is analogous to distinguishing computationally decidable propositions (for which LEM is valid in a constructive setting) from all propositions. The notion that monads capture computational effects is an old one, but this instance of the Curry-Howard isomorphism helps put it in perspective and helps get at what double-negation really "means".
First-class continuation support allows you to express P ∨ ¬P.
The trick is based on the fact that not calling the continuation and exiting with some expression is equivalent to calling the continuation with that same expression.
For more detailed view please see: http://www.cs.cmu.edu/~rwh/courses/logic/www-old/handouts/callcc.pdf
