Does ast(abstract syntax tree) is necessary for a compiler?Why is it a tree structure? - abstract-syntax-tree

I know that the concept of AST exists in the compiler, interpreter in many programming languages, but after learning more about it, I have some doubts and am eager for help.Thnks.
1.Does ast is necessary for a compiler, whether the translation of the code can't be done without it?
2.Why is it a tree structure, could it be any other structure?
3.If it's not necessary, then under what scenes do I need to use it for a better experience?
4.If i want to write a new ast parser, what are the relations between the language ast-parser used and the language ast-parser parsed? Could they be the same?

Related

What is the diffrence between a function and a module?

I am very new to c++ and confused between what is the difference between modular programming and function oriented programming.I have never done modular programming so I just know modules by definition that it contains functions.So what is the difference between a sequential(function-oriented language)and modular programming?Thanks in advance.
EDIT:
I was reading about C++'s OOP.It started something like what is unstructured programming, than gave a basic idea about structured programming, than modular programming and finally,OOP.
Modular programming is mostly a strategy to reduce coupling in a computer program, mostly by means of encapsulation.
Before modular programming, local coherence of the code was ensured by structured programming, but global coherence was lacking: if you decided that your spell-checking dictionary would be implemented as a red-black tree, then this implementation would be exposed to everyone else in the program, so that the programmer working on, say, text rendering, would be able to access the red-black tree nodes to do meaningful things with them.
Of course, this became hell once you needed to change the implementation of your dictionary, because then you would have to fix the code of other programmers as well.
Even worse, if the implementation detail involved global variables, then you had to be exceedingly careful of who changed them and in what order, or strange bugs would crop up.
Modular programming applied encapsulation to all of this, by separating the implementation (private to the module) from the interface (what the rest of the program can use). So, a dictionary module could expose an abstract type that would only be accessible through module functions such as findWord(word,dictionary). Someone working on the dictionary module would never need to peek outside that module to check if someone might be using an implementation detail.
They are both ways of structuring your code. If your interested in function-oriented programming and want to understand it a bit better, I'd take a look at lisp. C++ isn't truly function oriented as every function should return a value yet C++ functions can return void (making it a procedure rather than a function), so it's not a true functional programming language in the sense.
"I have never done modular programming so I just know modules by definition that it contains functions".
Modules are a level higher than functions.
That's a good start. Think of a function as a unit of work that does something and when you have several functions that you can group in a certain way, you put them in a module. So, string.h has a bunch of functions for working with strings, but you simply include the header and you have access to all those functions directly. You can then reuse those modules in other projects as you'd already used the modules previously and they've been (I assume) debugged and tested and stop people from reinventing the wheel. The whole point is to benefit from the cumulative experience.
I'd suggest you think of a project you'd like and write some functions and think about how you'd like to organize the code for another developer to use.
Hope this is of some use to you.
I believe functional programming leads us to micro services paradigm as for now while modular programming tends to similar to OOP concept.

Patterns for functional, dynamic and aspect-oriented programming

We have a very nice GoF book (Design Patterns: Elements of Reusable Object-Oriented Software) about patterns in Object Oriented Programming, and plenty of articles and resources in the web on this subject.
Are there any books (articles, resources) on patterns(best practices) for functional programming?
For dynamic programming in languages like Python and Ruby?
For AOP?
A related question was asked before: "Does functional programming replace GoF design patterns", with great responses.
The equivalent of "design patterns" is very vague in FP. In general, every time you see a "pattern" in your code you should create something to cover all of its uses in a uniform way. Often it will be a higher-order function.
For example, the following C code
for (int i = 0; i < n; i++)
if (a[i] == 42)
return true;
return false;
can be thought of some basic "design pattern" - checking if there's some special element on the list. This snippet could appear many times in code with different conditions. In FP, you simply use a higher order function several times. It's not a "pattern" anymore.
Functional programming has its own practices, but they are much different from "design patterns" in OOP. They include use of polymorphism, lists, higher-order functions, immutability/purity, laziness [not all are essential or specific to FP]... See also "what are core concepts of FP". Also, type classes (Haskell), modules and functors (OCaml), continuations, monads, zippers, finger trees, monoids, arrows, applicative functors, monad transformers, many purely functional data structures (book) etc. Functional pearls, already mentioned by Randall Schulz, form a very rich resource of FP at its best.
To learn how to write idiomatic code, any book/resource on a functional programming language will suffice IMHO (for example, RWH and LYAH); differences between thinking imperatively and functionally are always explained there.
In dynamic languages, Jeff Foster's link is a good collection; here is a very clever use of memoization in JavaScript that could be considered a "design pattern".
The list of design patterns described in GoF is written for languages like C++ and Java. It is sometimes considered a list of workarounds to make inflexible languages more dynamic. For example the Visitor pattern is not really needed in Ruby because you can simply change add member functions to your class at runtime. The Decorator pattern is obsolete if you can use mixins.
It's my experience that when I'm implementing a solution in C++ I tend to spend most of my time writing scaffolding code. I begin with creating a platform that allows me to think in my application's program domain. Design patterns probably were developed as a way to categorize different kinds of scaffolding.
I should mention that when I am programming in Ruby I don't have much supporting code. There just doesn't seem to be a need for it.
My theory is that other languages don't emphasize the concept of design patterns simply because their basic language constructs are sufficient. In defense of Java and C++: maybe this is because functional and AOP languages are often used in more specific problem domains or niches, while Java and C++ are used for everything.
And now for something different. If you are getting a bit bored with OO design and you want to learn something new then you might be interested in the the book Elements of Programming written by Stepanov. In this book he explains how programming can be approached from a mathematical point of view. For a preview, check out his Class notes for Adobe (found among others on this page). You may also be interested in Adobe's Collected Papers.
Here's a link to Design Patterns in Dynamic Programming
Aren't the Functional Pearls (one of) the canonical set(s) of design patterns for functional programming?
There is a Design patten in Ruby.
Beside the design patterns mentioned in GOF, it also list some others pattern like Convention over Configuration .
Personally my most important pattern for dynamic languages - write tests. It's even more important than in statically-typed languages.

Compiled dynamic language

I search for a programming language for which a compiler exists and that supports self modifying code. I’ve heared that Lisp supports these features, but I was wondering if there is a more C/C++/D-Like language with these features.
To clarify what I mean:
I want to be able to have in some way access to the programms code at runtime and apply any kind of changes to it, that is, removing commands, adding commands, changing them.
As if i had the AstTree of my programm. Of course i can’t have that tree in a compiled language, so it must be done different. The compile would need to translate the self-modifying commands into their binary equivalent modifications so they would work in runtime with the compiled code.
I don’t want to be dependent on an VM, thats what i meant with compiled :)
Probably there is a reason Lisp is like it is? Lisp was designed to program other languages and to compute with symbolic representations of code and data. The boundary between code and data is no longer there. This influences the design AND the implementation of a programming language.
Lisp has got its syntactical features to generate new code, translate that code and execute it. Thus pre-parsed code is also using the same data structures (symbols, lists, numbers, characters, ...) that are used for other programs, too.
Lisp knows its data at runtime - you can query everything for its type or class. Classes are objects themselves, as are functions. So these elements of the programming language and the programs also are first-class objects, they can be manipulated as such. Dynamic language has nothing to do with 'dynamic typing'.
'Dynamic language' means that the elements of the programming language (for example via meta classes and the meta-object protocol) and the program (its classes, functions, methods, slots, inheritance, ...) can be looked at runtime and can be modified at runtime.
Probably the more of these features you add to a language, the more it will look like Lisp. Since Lisp is pretty much the local maximum of a simple, dynamic, programmable programming language. If you want some of these features, then you might want to think which features of your other program language you have to give up or are willing to give up. For example for a simple code-as-data language, the whole C syntax model might not be practical.
So C-like and 'dynamic language' might not really be a good fit - the syntax is one part of the whole picture. But even the C syntax model limits us how easy we can work with a dynamic language.
C# has always allowed for self-modifying code.
C# 1 allowed you to essentially create and compile code on the fly.
C# 3 added "expression trees", which offered a limited way to dynamically generate code using an object model and abstract syntax trees.
C# 4 builds on that by incorporating support for the "Dynamic Language Runtime". This is probably as close as you are going to get to LISP-like capabilities on the .NET platform in a compiled language.
You might want to consider using C++ with LLVM for (mostly) portable code generation. You can even pull in clang as well to work in C parse trees (note that clang has incomplete support for C++ currently, but is written in C++ itself)
For example, you could write a self-modification core in C++ to interface with clang and LLVM, and the rest of the program in C. Store the parse tree for the main program alongside the self-modification code, then manipulate it with clang at runtime. Clang will let you directly manipulate the AST tree in any way, then compile it all the way down to machine code.
Keep in mind that manipulating your AST in a compiled language will always mean including a compiler (or interpreter) with your program. LLVM is just an easy option for this.
JavaScirpt + V8 (the Chrome JavaScript compiler)
JavaScript is
dynamic
self-modifying (self-evaluating) (well, sort of, depending on your definition)
has a C-like syntax (again, sort of, that's the best you will get for dynamic)
And you now can compile it with V8: http://code.google.com/p/v8/
"Dynamic language" is a broad term that covers a wide variety of concepts. Dynamic typing is supported by C# 4.0 which is a compiled language. Objective-C also supports some features of dynamic languages. However, none of them are even close to Lisp in terms of supporting self modifying code.
To support such a degree of dynamism and self-modifying code, you should have a full-featured compiler to call at run time; this is pretty much what an interpreter really is.
Try groovy. It's a dynamic Java-JVM based language that is compiled at runtime. It should be able to execute its own code.
http://groovy.codehaus.org/
Otherwise, you've always got Perl, PHP, etc... but those are not, as you suggest, C/C++/D- like languages.
I don’t want to be dependent on an VM, thats what i meant with compiled :)
If that's all you're looking for, I'd recommend Python or Ruby. They can both run on their own virtual machines and the JVM and the .Net CLR. Thus, you can choose any runtime you want. Of the two, Ruby seems to have more meta-programming facilities, but Python seems to have more mature implementations on other platforms.

What are the main issues in designing an interpreter for a functional language?

Suppose I want to implement an interpreter for a functional language. I would like to understand the issues involved in doing so and suitable literature that is available. This is a new language that is in early design stages, that is why the question is broad in scope.
For the purpose of this discussion we can assume that the purpose of the language is not important and that its functional features can be changed (even drastically) if it makes a significant difference in the ease of writing an interpreter.
The MIT website has an online copy of Structure and Interpretation of Computer Programs as well as videos of the MIT 6.001 lectures using Scheme, recorded at HP in 1986. These form a great introduction to language design.
I would highly recommend Structure and Interpretation of Computer Programs (SICP) as a starting point. This book will introduce the idea of what it means to write an interpreter (and a compiler), and is generally a must-read for anybody designing languages.
Implementing an interpreter for a functional language isn't likely to be too much different from implementing an interpreter for any other general purpose language. There's lexical analysis, parsing, AST construction, semantic analysis, plus execution (for a pure interpreter) or code generation and optimisation (for a compiler, even compiling to bytecode like Java/Perl/Python). SICP will introduce the difference between "applicative order" and "normal order" evaluation, which may be important for you in a pure functional context.
For just about any language interpreter or compiler, the main issues are the same, I think.
You need to decide certain basic characteristics of the language (semantics, not syntax), and the bulk of the design of the thing follows from that.
For example, does your language have
a type system? If so, what sorts of
types does it have? Is it going to be
statically typed, dynamically typed,
duck-typed?
What sort of expressions are you
planning to support? Do you need to
define an order of operations? Will
you even have operators?
What will you use as the run-time
representation of the program? Will
you convert the text to a byte-code
representation, or an AST, or a
tokenized form of the source text?
There are toolkits available to help take some of the tedium out of the actual parsing of text (ANTLR and Bison, to name two), but I don't know of anything that helps with the actual interpretation part of the task. I'm sure somebody will suggest something.
The main issue is having a semantics for the language you're implementing -- with that, the implementation becomes straightforward. Otherwise, this question is incredibly broad and hard to answer.
I'd recommend Essentials of Programming Languages as a good complement to SICP, particularly if you're interested in interpreters: Official EOPL site. You may want to check out the third edition-- the site hasn't been updated for it yet.
Edit: spam prevention is making me choose between links, so the official page is now unheated. It's easily Google-able, though.

A universal reflections API?

Some time back I was working on an algorithm that processed code, and required a reflections API. We were interested in its implementation for multiple languages, but the reflections API for a language would not work for any other language. So is there any thing like a "universal reflections API" that would work for all languages, or maybe for a few mainstream languages (.NET,Java,Ruby,Python)
If there isnt any, is it possible to build such a thing that can process classes from different languages.
How would you go about having a unified way to process OO code from multiple languages
I don't believe there is universal Reflection API. Any Reflection API depends on the metadata that the compiler generates for the language constructs and these can vary quite a lot from language to language, even though there is a common subset across multiple languages.
In .NET there is CodeDOM, which provides a way to generate a universal syntax tree and then serialize it as (C#, VB .NET etc...) code and/or compile it. Of course that's the mirror image of Reflection, but if anyone ever writes a tool to generate the AST directly from IL the functionality could start to overlap.
In any case its the closest thing I can think of.
A reflection API depends on the metadata generated for the code, so you can have a universal API for all languages on the JVM, or all languages on the CLR...but it wouldn't really be possible to make one that does Python, Java, and VB etc...
If you want a universal API, you need to step outside the language. See our DMS meta-tool for processing arbitrary languages, and answering arbitrary questions, including those you think of as reflection.
(Op asked for support for various languages: DMS has full parsers for C#, VB.net, Java, and Python. Ruby not yet in the list; we're working on it).

Resources