Ada Numeric Literals and Underline - ada

This is from the online Ada reference manual:
http://www.adaic.org/resources/add_content/standards/05rm/RM.pdf (section 2.3)
A decimal_literal is a numeric_literal in the conventional decimal notation (that is, the base is ten).
Syntax
decimal_literal ::= numeral [.numeral] [exponent]
**numeral ::= digit {[underline] digit}**
exponent ::= E [+] numeral | E – numeral
digit ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
An exponent for an integer literal shall not have a minus sign.
Static Semantics
**An underline character in a numeric_literal does not affect its meaning.** The letter E of an exponent can be
written either in lower case or in upper case, with the same meaning.
If I do
my_literal := 123_456;
what does the underscore (underline) mean? It says it doesn't affect the meaning. Then what is it for? I am sure there is a simple answer, but reading and re-reading the passage hasn't helped me.

It's the same reason for, say, commas (,) in currency or [other large] numbers: grouping.
Thus:
Million : Constant:= 1_000_000;
Furthermore, you could use it in conjunction with base-setting as a set-up for masking:
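-- Assumes "with Interfaces;" in the context clause.
-- Bit is a subtype of the index type (Positive), so Masks(Bit) is a legal constraint.
SubType Bit is Positive Range 1..8;
SubType Byte is Interfaces.Unsigned_8;
Type Masks is Array(Positive Range <>) of Byte;
Mask_Map : Constant Masks(Bit):=
  (
   2#0000_0001#,
   2#0000_0010#,
   2#0000_0100#,
   2#0000_1000#,
   2#0001_0000#,
   2#0010_0000#,
   2#0100_0000#,
   2#1000_0000#
  );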
You could then combine Mask_Map and Bit with or, and, and xor to do bit manipulation. This may seem like more work than defining a set of constants and manipulating them directly, but it is more flexible: you can later replace Mask_Map with a function without changing any client code. That would be even more useful if the function's result were a parameterized integer type, with Bit defined as 1..PARAMETER'Size.

Related

Backus–Naur form with boolean algebra. Problem with brackets and parse tree

boolean algebra
I want to write these boolean expressions in Backus-Naur form.
What I have got is:
< variable > ::= < signal > | < operator> | < bracket >< variable>
< signal> ::= <p> | <q> | <r>| <s>
< operator> ::= <AND> | <OR> | <implication>| <equivalence>| <NOT>
< bracket> ::= < ( > | < ) >
I made a recursion with < bracket >< variable>, so that whenever there is a bracket it starts a new instance, but I still do not know when to close the brackets. With this grammar a closing bracket can also start a new instance, but I only want that for opening brackets.
Can I separate < bracket> into < open bracket> and < closing bracket>?
Is my Backus-Naur form even correct? There isn't much information about Backus-Naur form with boolean algebra on the internet. What does the parse tree for this look like?
I assume that you want to define a grammar for boolean expressions using the Backus-Naur form, and that your examples are concrete instances of such expressions. There are multiple problems with your grammar:
First of all, you want your grammar to only generate correct boolean expressions. With your grammar, you could generate a simple operator ∨ as valid expression using the path <variable> -> <operator> -> <OR>, which is clearly wrong since the operator is missing its operands. In other words, ∨ on its own cannot be a correct boolean expression. Various other incorrect expressions can be derived with your grammar. For the same reason, the opening and closing brackets should appear together somewhere within a production rule, since you want to ensure that every opening bracket has a closing bracket. Putting them in separate production rules might destroy that guarantee, depending on the overall structure of your grammar.
Secondly, you want to differentiate between non-terminal symbols (the ones that are refined by production rules, i.e. the ones written between < and >) and terminal symbols (atomic symbols like your variables p, q, r and s). Hence, your non-terminal symbols <p>, <q>, <r> and <s> should be terminal symbols p, q, r and s. Same goes for other symbols like brackets and operators.
Thirdly, in order to get an unambiguous parse tree, you want to get your precedence and associativity of your operators correct, i.e., you want to make sure that, for example, negation is evaluated before implication, since it has a higher precedence (similar to arithmetic expressions where multiplication must be evaluated before addition). In other words, we want operators with higher precedence to appear closer to the leaf nodes of the parse tree, and operators with lower precedence to appear closer to the root node of the tree, since the leaves of the tree are evaluated first. We can achieve that by defining our grammar in a way that reflects the precedences of the operators in a decreasing manner:
<expression> ::= <expression> ↔ <implication> | <implication>
<implication> ::= <implication> → <disjunction> | <disjunction>
<disjunction> ::= <disjunction> ∨ <conjunction> | <conjunction>
<conjunction> ::= <conjunction> ∧ <negation> | <negation>
<negation> ::= ¬ <negation> | <variable> | ( <expression> )
<variable> ::= p | q | r | s
Starting with <expression>, we can see that a valid boolean expression starts by chaining all the ↔ operators together, then all the → operators, then all the ∨ operators, and so on, according to their precedence. Hence, operators with lower precedence (e.g., ↔) are located near the root of the tree, whereas operators with higher precedence (e.g., ¬) are located near the leaves of the tree.
Note that the grammar above is left-recursive, which might cause problems with software tools that cannot handle left recursion (e.g., some parser generators).
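To see the parse trees this grammar produces, here is a minimal hand-written recursive-descent sketch in Python. It is one possible way to work around the left recursion (each left-recursive rule becomes a loop), not the only one; the class and function names are made up for illustration, and trees are plain nested tuples.

```python
# Binary operators from lowest to highest precedence, mirroring
# <expression>, <implication>, <disjunction>, <conjunction>.
BINARY_LEVELS = ["↔", "→", "∨", "∧"]
VARIABLES = {"p", "q", "r", "s"}


def tokenize(text):
    """Split the input into single-character tokens, ignoring whitespace."""
    return [ch for ch in text if not ch.isspace()]


class Parser:
    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self, expected=None):
        token = self.peek()
        if token is None or (expected is not None and token != expected):
            raise SyntaxError(f"expected {expected!r}, got {token!r}")
        self.pos += 1
        return token

    def parse(self):
        tree = self.binary(0)
        if self.peek() is not None:
            raise SyntaxError(f"unexpected token {self.peek()!r}")
        return tree

    def binary(self, level):
        """Parse one precedence level; the loop replaces the left recursion."""
        if level == len(BINARY_LEVELS):
            return self.negation()
        op = BINARY_LEVELS[level]
        left = self.binary(level + 1)
        while self.peek() == op:
            self.take(op)
            right = self.binary(level + 1)
            left = (op, left, right)  # left-associative, as in the grammar
        return left

    def negation(self):
        """<negation> ::= ¬ <negation> | <variable> | ( <expression> )"""
        token = self.peek()
        if token == "¬":
            self.take("¬")
            return ("¬", self.negation())
        if token == "(":
            self.take("(")
            inner = self.binary(0)
            self.take(")")  # brackets are matched inside one rule
            return inner
        if token in VARIABLES:
            return self.take()
        raise SyntaxError(f"unexpected token {token!r}")
```

For example, Parser("¬p ∨ q ∧ r → s").parse() returns ("→", ("∨", ("¬", "p"), ("∧", "q", "r")), "s"): the lowest-precedence operator ends up at the root, and a lone "∨" or an unclosed "(p" is rejected with a SyntaxError.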

Can we represent a++ + ++a + a++ as an expression tree?

I'm confused about how to construct expression trees for unary operators like negation (-), post/pre-increment (++), and post/pre-decrement (--).
Depending on language, (a++) + (++a) + (a++) is perfectly legal, is undefined behaviour or is a downright compile error.
In a perfectly legal case, I would imagine it to be like
expression = unary_op operand | operand unary_op | operand binary_op operand
operand = literal | expression
unary_op = ++ | --
binary_op = + | - | * | /
expressions must be evaluated from left to right
Hence the expression tree in prefix notation would be
tree = unary_op{tree} | binary_op{tree, tree} | literal
Note that we differentiate pre/post increment/decrement in the tree representation.
The sample expression would have a tree of +{+{++_post{a}, ++_pre{a}}, ++_post{a}} evaluated in dfs.
In the case of undefined behaviour, it is likely that the language says the order in which the operations are evaluated is not specified, meaning (++a) / (++a) may be less than or greater than 1.
In the case of compile error, the language just says: we are not going into this mess.
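For the perfectly legal case, the tree +{+{++_post{a}, ++_pre{a}}, ++_post{a}} and its depth-first, left-to-right evaluation can be sketched in Python. The tuple layout and node names ("var", "pre++", "post++") are assumptions made for this illustration only.

```python
# Nodes are tuples: ("var", name), ("pre++", node), ("post++", node),
# or ("+", left, right).
def evaluate(node, env):
    """Evaluate depth-first, left operand before right, mutating env."""
    kind = node[0]
    if kind == "var":
        return env[node[1]]
    if kind == "pre++":
        name = node[1][1]
        env[name] += 1          # increment first, then yield the new value
        return env[name]
    if kind == "post++":
        name = node[1][1]
        old = env[name]
        env[name] = old + 1     # yield the old value, then increment
        return old
    if kind == "+":
        left = evaluate(node[1], env)
        right = evaluate(node[2], env)
        return left + right
    raise ValueError(f"unknown node kind: {kind!r}")


a = ("var", "a")
# a++ + ++a + a++, with pre and post increment kept as distinct node kinds
tree = ("+", ("+", ("post++", a), ("pre++", a)), ("post++", a))
env = {"a": 1}
result = evaluate(tree, env)
```

Starting from a = 1, the three operands evaluate to 1, 3 and 3, so result is 7 and a ends up as 4. A different evaluation order would give a different answer, which is why languages that leave the order unspecified cannot promise a result.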
The parser needs to distinguish the various operators syntactically, and then it doesn't matter that they're all spelled using the same character. The expression tree shouldn't care about spelling.
Perhaps it's easier to see if you substitute new spelling, say $ for pre-increment and # for post-increment:
A # + $ A + A #
Since it's already been parsed, might as well normalize all the unary operators to prefix notation:
# A + $ A + # A

When/Why Does Mutual Left Recursion Happen in Antlr?

I have an expression that is a collection of my other top-level things. In expression I have math that is expression (op) expression. With this I get
The following sets of rules are mutually left-recursive [expression, math]
compileUnit : expression EOF;
expression
: parens
| operation
| math
| variable
| number
| comparisonGroup
;
math : expression op=( ADD | SUBSTRACT | MULTIPLY | DIVIDE ) expression #mathExpression;
HOWEVER!
This is not a problem-
expression
: parens
| operation
| expression op=( ADD | SUBSTRACT | MULTIPLY | DIVIDE ) expression
| variable
| number
| comparisonGroup
;
And neither is this!-
math : op=( ADD | SUBSTRACT | MULTIPLY | DIVIDE ) expression expression #mathExpression;
So why is it that my first code block behaves differently than the other two examples?
Antlr4 can handle direct left recursion, but not indirect left recursion, where a left recursive rule is defined as a rule that "either directly or indirectly invokes itself on the left edge of an alternative" (TDAR; pg 71).
When, as in the first example, the #mathExpression alternative is factored out of the expression rule and into a separate math rule, the direct left recursion becomes indirect, i.e., the rules are 'mutually left-recursive'.
As the second example shows, a typical solution is to simply combine the indirectly left-recursive rules into a single rule. The third example avoids the problem in a different way: math now begins with an operator token, so it no longer invokes expression on its left edge.
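The distinction between direct and indirect left recursion can be checked mechanically. The Python sketch below is an illustration only, not how ANTLR works internally. It assumes a grammar given as a dict of rule name to alternatives, treats any symbol without an entry (parens, operation, op and so on, whose definitions are not shown in the question) as a token, and ignores rules that can match empty input.

```python
def left_edges(grammar):
    """Map each rule to the rules that can appear first in one of its alternatives."""
    return {
        rule: {alt[0] for alt in alts if alt and alt[0] in grammar}
        for rule, alts in grammar.items()
    }


def classify_left_recursion(grammar):
    """Return (direct, indirect) sets of left-recursive rule names."""
    edges = left_edges(grammar)
    direct = {rule for rule, firsts in edges.items() if rule in firsts}
    indirect = set()
    for start in grammar:
        # Follow left edges through other rules, skipping start's own self-loop.
        stack = [rule for rule in edges[start] if rule != start]
        seen = set()
        while stack:
            rule = stack.pop()
            if rule == start:
                indirect.add(start)
                break
            if rule in seen:
                continue
            seen.add(rule)
            stack.extend(edges[rule])
    return direct, indirect


OTHER_ALTS = [["parens"], ["operation"], ["variable"], ["number"], ["comparisonGroup"]]

first = {   # math factored out: expression -> math -> expression
    "expression": OTHER_ALTS + [["math"]],
    "math": [["expression", "op", "expression"]],
}
second = {  # everything in one rule
    "expression": OTHER_ALTS + [["expression", "op", "expression"]],
}
third = {   # prefix operator: math starts with a token
    "expression": OTHER_ALTS + [["math"]],
    "math": [["op", "expression", "expression"]],
}
```

Under these assumptions, classify_left_recursion(first) reports expression and math as indirectly left-recursive, second reports expression as directly left-recursive (which ANTLR 4 accepts), and third reports no left recursion at all.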

Common Lisp: A good way to represent grammar rules?

This is a Common Lisp data representation question.
What is a good way to represent grammars? By "good" I mean a representation that is simple, easy to understand, and I can operate on the representation without a lot of fuss. The representation doesn't have to be particularly efficient; the other properties (simple, understandable, process-able) are more important to me.
Here is a sample grammar:
Session → Facts Question
Session → ( Session ) Session
Facts → Fact Facts
Facts → ε
Fact → ! STRING
Question → ? STRING
The representation should allow the code that operates on the representation to readily distinguish between terminal symbols and non-terminal symbols.
Non-terminal symbols: Session, Facts, Fact, Question
Terminal symbols: (, ), ε, !, ?
This particular grammar uses parentheses symbols, which conflicts with Common Lisp's use of parentheses symbols. What's a good way to handle that?
I want my code to be able to recognize the symbol for the empty string, ε. What's a good way to represent the symbol for the empty string, ε?
I want my code to be able to distinguish between the left-hand side and the right-hand side of a grammar rule.
Below are some common operations that I want to perform on the representation.
Consider this rule:
A → u1u2...un
Operations: I want to get the first symbol of a grammar rule's right-hand side. Then I want to know: is it a terminal symbol? Is it the ε-symbol? If it's a non-terminal symbol, then I want to get its grammar rule.
GRAIL (GRAmmar In Lisp)
Description of GRAIL
Slightly modified version of GRAIL with a function generator included
I'm including the BNF of GRAIL from the second link in case it expires:
<grail-list> ::= "'(" {<grail-rule>} ")"
<grail-rule> ::= <assignment> | <alternation>
<assignment> ::= "(" <type> " ::= " <s-exp> ")"
<alternation> ::= "(" <type> " ::= " <type> {<type>} ")"
<s-exp> ::= <symbol> | <nonterminal> | "(" {<s-exp>} ")"
<type> ::= "#(" <type-name> ")"
<nonterminal> ::= "#(" {<arg-name> " "} <type-name> ")"
<type-name> ::= <symbol>
<arg-name> ::= <symbol>
DCG Format (Definite Clause Grammar)
There is an implementation of a definite clause grammar in Paradigms of Artificial Intelligence Programming. Technically it's Prolog, but it's all implemented as Lisp in the book.
Grammar of English in DCG Format as used in PAIP
DCG Parser
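Independently of those libraries, the operations asked for do not need much machinery. The sketch below is written in Python rather than Common Lisp, purely to show the shape of a simple representation; all names in it are hypothetical. The same layout (a list of left-hand side / right-hand side pairs, with the parenthesis terminals stored as strings so they cannot clash with list syntax) should carry over to Lisp lists directly.

```python
EPSILON = "ε"

# Each rule is a (left-hand side, right-hand side) pair; the right-hand
# side is a tuple of symbols. Parentheses are ordinary strings here.
RULES = [
    ("Session", ("Facts", "Question")),
    ("Session", ("(", "Session", ")", "Session")),
    ("Facts", ("Fact", "Facts")),
    ("Facts", (EPSILON,)),
    ("Fact", ("!", "STRING")),
    ("Question", ("?", "STRING")),
]

# A symbol is a non-terminal exactly when it appears on some left-hand side.
NONTERMINALS = {lhs for lhs, _ in RULES}


def is_nonterminal(symbol):
    return symbol in NONTERMINALS


def is_terminal(symbol):
    """Everything else is a terminal (ε included, as in the question's list)."""
    return symbol not in NONTERMINALS


def is_epsilon(symbol):
    return symbol == EPSILON


def first_symbol(rule):
    """First symbol of the rule's right-hand side."""
    _, rhs = rule
    return rhs[0]


def rules_for(symbol):
    """All rules whose left-hand side is the given non-terminal."""
    return [rule for rule in RULES if rule[0] == symbol]
```

For instance, first_symbol(RULES[0]) is "Facts", is_nonterminal("Facts") is true, and rules_for("Facts") returns its two rules, one of which has ε as its first symbol.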
Hope this helps!

What are the different kinds of cases?

I'm interested in the different kinds of identifier cases, and what people call them. Do you know of any additions to this list, or other alternative names?
myIdentifier : Camel case (e.g. in java variable names)
MyIdentifier : Capital camel case (e.g. in java class names)
my_identifier : Snake case (e.g. in python variable names)
my-identifier : Kebab case (e.g. in racket names)
myidentifier : Flat case (e.g. in java package names)
MY_IDENTIFIER : Upper case (e.g. in C constant names)
flatcase or mumblecase
kebab-case. Also called caterpillar-case, dash-case, hyphen-case, lisp-case, spinal-case and css-case
camelCase
PascalCase or CapitalCamelCase
snake_case or c_case
MACRO_CASE, UPPER_CASE or SCREAM_CASE
COBOL-CASE or TRAIN-CASE
Names are either generic, after a language, or colorful; most don’t have a standard name outside of a specific community.
There are many names for these naming conventions (names for names!); see Naming convention: Multiple-word identifiers, particularly for CamelCase (UpperCamelCase, lowerCamelCase). However, many don’t have a standard name. Consider the Python style guide PEP 0008 – it calls them by generic names like “lower_case_with_underscores”.
One convention is to name after a well-known use. This results in:
PascalCase
MACRO_CASE (C preprocessor macros)
…and suggests these names, which are not widely used:
c_case (used in K&R and in the standard library, like size_t)
lisp-case, css-case
COBOL-CASE
Alternatively, there are illustrative names, of which the best established is CamelCase. snake_case is more recent (2004), but is now well-established. kebab-case is yet more recent and still not established, and may have originated on Stack Overflow! (What's the name for dash-separated case?) There are many more colorful suggestions, like caterpillar-case, Train-case (initial capital), caravan-case, etc.
+--------------------------+-------------------------------------------------------------+
| Formatting               | Name(s)                                                     |
+--------------------------+-------------------------------------------------------------+
| namingidentifier         | flat case/Lazy Case                                         |
| NAMINGIDENTIFIER         | upper flat case                                             |
| namingIdentifier         | (lower) camelCase, dromedaryCase                            |
| NamingIdentifier         | (upper) CamelCase, PascalCase, StudlyCase, CapitalCamelCase |
| naming_identifier        | snake_case, pothole_case, C Case                            |
| Naming_Identifier        | Camel_Snake_Case                                            |
| NAMING_IDENTIFIER        | SCREAMING_SNAKE_CASE, MACRO_CASE, UPPER_CASE, CONSTANT_CASE |
| naming-identifier        | kebab-case, caterpillar-case, dash-case, hyphen-case,       |
|                          | lisp-case, spinal-case, css-case                            |
| NAMING-IDENTIFIER        | TRAIN-CASE, COBOL-CASE, SCREAMING-KEBAB-CASE                |
| Naming-Identifier        | Train-Case, HTTP-Header-Case                                |
| _namingIdentifier        | Underscore notation ("_" prefix followed by camelCase)      |
| datatypeNamingIdentifier | Hungarian notation (variable names prefixed by data-type    |
|                          | metadata; outdated)                                         |
+--------------------------+-------------------------------------------------------------+
MyVariable : Pascal Case => Used for classes
myVariable : Camel Case => Used for variables in Java, C#, etc.
myvariable : Flat Case => Used for packages in Java, etc.
my_variable : Snake Case => Used for variables in Python, PHP, etc.
my-variable : Kebab Case => Used for CSS
The most common case types:
Camel case
Snake case
Kebab case
Pascal case
Upper case (with snake case)
camelCase
camelCase (1) starts with a lowercase letter and (2) capitalizes the first letter of every subsequent word, joining the words without separators.
An example of camel case of the variable camel case var is camelCaseVar.
snake_case
snake_case is as simple as replacing all spaces with a "_" and lowercasing all the words. It's possible to snake_case and mix camelCase and PascalCase but imo, that ultimately defeats the purpose.
An example of snake case of the variable snake case var is snake_case_var.
kebab-case
kebab-case is as simple as replacing all spaces with a "-" and lowercasing all the words. It's possible to kebab-case and mix camelCase and PascalCase but that ultimately defeats the purpose.
An example of kebab case of the variable kebab case var is kebab-case-var.
PascalCase
In PascalCase, every word starts with an uppercase letter (unlike camelCase, where the first word starts with a lowercase letter).
An example of pascal case of the variable pascal case var is PascalCaseVar.
Note: It's common to see this confused for camel case, but it's a separate case type altogether.
UPPER_CASE_SNAKE_CASE
UPPER_CASE_SNAKE_CASE is replacing all the spaces with a "_" and converting all the letters to capitals.
An example of upper case snake case of the variable upper case snake case var is UPPER_CASE_SNAKE_CASE_VAR.
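The rules above are mechanical enough to sketch as small Python converters. The function names are made up for illustration, and the word splitting is a simplification: it handles spaces, underscores, hyphens and lower-to-upper boundaries, but not acronyms such as "HTTPServer".

```python
import re


def split_words(phrase):
    """Split on spaces, underscores, hyphens, and lower-to-upper boundaries."""
    # "camelCaseVar" -> "camel Case Var"
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", phrase)
    return [word.lower() for word in re.split(r"[\s_\-]+", spaced) if word]


def camel_case(phrase):
    words = split_words(phrase)
    if not words:
        return ""
    return words[0] + "".join(word.capitalize() for word in words[1:])


def pascal_case(phrase):
    return "".join(word.capitalize() for word in split_words(phrase))


def snake_case(phrase):
    return "_".join(split_words(phrase))


def kebab_case(phrase):
    return "-".join(split_words(phrase))


def upper_snake_case(phrase):
    return "_".join(split_words(phrase)).upper()
```

With these, camel_case("camel case var") gives camelCaseVar, snake_case("snake case var") gives snake_case_var, and the converters also accept already-cased input, so kebab_case("PascalCaseVar") gives pascal-case-var.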
For Python specifically, it is best to use snake_case for variable and function names, UPPER_CASE for constants (even though we don't have any keywords that specifically say that our variable is a constant) and PascalCase for class names.
camelCase is not recommended for Python (although languages such as JavaScript use it as their main casing), and kebab-case would be invalid because Python names cannot contain a hyphen (-).
variable_name = 'Hello World!'
def function_name():
    pass
CONSTANT_NAME = 'Constant Hello World!!'
class ClassName:
    pass

Resources