Static analyzer for functional programming languages, e.g.Scheme - functional-programming

I seldom see static analyzer for functional programming languages, like Racket/Scheme, I even doubt that whether there are any. I would like to write a static analyzer for functional languages, say Scheme/Racket. How should I go about it?

Yes, there is some work on static analysis of dynamic languages like Scheme. For instance, see the work of Olin Shivers (http://www.ccs.neu.edu/home/shivers/citations.html) and Manuel Serrano (http://www-sop.inria.fr/members/Manuel.Serrano/index-1.html).

There's already, e.g., typed racket: http://docs.racket-lang.org/ts-guide/index.html
Since valid racket code is valid typed racket, you just have to change the language you're working in. Then, for libraries with typed versions, load those instead of the untyped versions, and certain type errors can get caught statically already. Further type annotations can be added to your own code to get guarantees of type correctness beyond that...

First read this paper by Shivers, explaining why there is no static control flow graph available in Scheme.
Might implemented k-CFA in Scheme. Matt Might's site and blog is a good starting point for exploring static analysis of higher-order languages.
I did some static analysis implementations for Scheme in Java as well:
k-CFA implementation
Interprocedural Dependence Analysis implementation based on a paper by Might and Prabhu

Related

What is more in the spirit of the Julia language and philosophy?

I recently started programming in Julia for research purposes. Going through it I started loving the syntax, I positively experienced the community here in SO and now I am thinking about porting some code from other programming languages.
Working with highly computational expensive forecasting models, it would be nice to have them all in a powerful modern language as Julia.
I would like to create a project and I am wondering how I should design it. I am concerned both from a performance and a language perspective (i.e.: Would it be better to create modules – submodules – functions or something else would be preferred? Is it better off to use dictionaries or custom types?).
I have looked at different GitHub projects in my field, but I haven't really found a common standard. Therefore I am wondering: what is more in the spirit of the Julia language and philosophy?
EDIT:
It has been pointed out that this question might be too generic. Therefore, I would like to focus it on how it would be better structuring modules (i.e. separate modules for main functions and subroutines versus modules and submodules, etc.). I believe this would be enough for me to have a feel about what might be considered in the spirit of the Julia language and philosophy. Of course, additional examples and references are more than welcome.
The most you'll find is that there is an "official" style-guide. The rest of the "Julian" style is ill-defined, but there are some ways to heuristically define it.
First of all, it means designing the software around multiple dispatch and the type system. A software which follows a Julian design philosophy usually won't be defining a bunch of functions like test_pumpkin and test_pineapple, instead it will use dispatches on test for types Pumpkin and Pineapple. This allows for clean/understandable code. It will break tasks up into small type-stable functions which will allow for good performance. It likely will also be written very generically, allowing the user to use items that are subtypes of AbstractArray or Number, and using the power of dispatch to allow their software to work on numbers they've never even heard of. (In this respect, custom types are recommended over dictionaries when you need performance. However, for a type you have to know all of the fields at the beginning, which means some things require dictionaries.)
A software which follows a Julian design philosophy may also implement a DSL (Domain-Specific Language) to allow a simpler interface to the user. Instead of requiring the user to conform to archaic standards derived from C/Fortran, or write large repetitive items and inputs, the package may provide macros to allow the user to more heuristically define the problem for the software to solve.
Other items which are part of the Julian design philosophy are up for much debate. Is proper Julia code devectorized? I would say no, and the loop fusing broadcast . is a powerful way to write MATLAB-style "vectorized" code and have it be perform like a devectorized loop. However, I have seen others prefer devectorized styles.
Also note that Julia is very different from something like Python where in Julia, you can essentially "build your own standard way of doing something". Since there's no performance penalty for functions/types declared in packages rather than Base, you can build your own Julia world if you want, using macros to define your own "function-like" objects, etc. I mean, you can re-create Java styles in Julia if you wanted.

F# and tuple output

Over at http://diditwith.net, I see that, in F#, it isn't strictly necessary to pass out parameters to a function that otherwise requires them. The language will auto-magically stuff the result and the output parameter into a tuple. (!)
Is this some kind of side effect (pardon the pun) of the general mechanics of the language, or a feature that was specifically articulated in the F# specification and deliberately programmed into the language?
It's an awesome feature, and if it was expressly put into F#, then I'm wondering what other nuggets of gold like this are lurking within the language, because I've pored over dozens of web pages and read through three books (by D. Syme, T. Petricek, and C. Smith) and I hadn't seen this particular trick mentioned at all.
EDIT: As Mr. Petricek has responded, below, he does mention the feature in at least two places in his book, Real-World Functional Programming. My bad.
This is not a side-effect of some other, more general, mechanism in the F# language.
It has been added specifically for this purpose. .NET libraries often return multiple values by adding out (or ref) parameters at the end of the method signature. In F#, returning multiple values is done by returning tuple, so it makes sense to turn the .NET style into the typical F# pattern.
I don't think F# does many similar tricks, especially when it comes to interoperability, but you can browse through some of the handy snippets here and here.
(I quickly checked and Real-World Functional Programming mentions the trick briefly on pages 88 and 111.)
This is a specific feature to make interop with .NET methods more pleasant - all trailing out parameters can instead be treated as part of the return value (but note that this only affects trailing out parameters, so a method with the C# signature like void f(out int i, int j) can't be called this way).
Arguably, out parameters are just a way to work around the lack of tuples in .NET 1.0, anyway. It seems likely that many methods that use them would be written differently if they targeted later versions of the framework (by using Nullable<_> types or tuples as return types).

What is the diffrence between a function and a module?

I am very new to c++ and confused between what is the difference between modular programming and function oriented programming.I have never done modular programming so I just know modules by definition that it contains functions.So what is the difference between a sequential(function-oriented language)and modular programming?Thanks in advance.
EDIT:
I was reading about C++'s OOP.It started something like what is unstructured programming, than gave a basic idea about structured programming, than modular programming and finally,OOP.
Modular programming is mostly a strategy to reduce coupling in a computer program, mostly by means of encapsulation.
Before modular programming, local coherence of the code was ensured by structured programming, but global coherence was lacking: if you decided that your spell-checking dictionary would be implemented as a red-black tree, then this implementation would be exposed to everyone else in the program, so that the programmer working on, say, text rendering, would be able to access the red-black tree nodes to do meaningful things with them.
Of course, this became hell once you needed to change the implementation of your dictionary, because then you would have to fix the code of other programmers as well.
Even worse, if the implementation detail involved global variables, then you had to be exceedingly careful of who changed them and in what order, or strange bugs would crop up.
Modular programming applied encapsulation to all of this, by separating the implementation (private to the module) from the interface (what the rest of the program can use). So, a dictionary module could expose an abstract type that would only be accessible through module functions such as findWord(word,dictionary). Someone working on the dictionary module would never need to peek outside that module to check if someone might be using an implementation detail.
They are both ways of structuring your code. If your interested in function-oriented programming and want to understand it a bit better, I'd take a look at lisp. C++ isn't truly function oriented as every function should return a value yet C++ functions can return void (making it a procedure rather than a function), so it's not a true functional programming language in the sense.
"I have never done modular programming so I just know modules by definition that it contains functions".
Modules are a level higher than functions.
That's a good start. Think of a function as a unit of work that does something and when you have several functions that you can group in a certain way, you put them in a module. So, string.h has a bunch of functions for working with strings, but you simply include the header and you have access to all those functions directly. You can then reuse those modules in other projects as you'd already used the modules previously and they've been (I assume) debugged and tested and stop people from reinventing the wheel. The whole point is to benefit from the cumulative experience.
I'd suggest you think of a project you'd like and write some functions and think about how you'd like to organize the code for another developer to use.
Hope this is of some use to you.
I believe functional programming leads us to micro services paradigm as for now while modular programming tends to similar to OOP concept.

Compiled dynamic language

I search for a programming language for which a compiler exists and that supports self modifying code. I’ve heared that Lisp supports these features, but I was wondering if there is a more C/C++/D-Like language with these features.
To clarify what I mean:
I want to be able to have in some way access to the programms code at runtime and apply any kind of changes to it, that is, removing commands, adding commands, changing them.
As if i had the AstTree of my programm. Of course i can’t have that tree in a compiled language, so it must be done different. The compile would need to translate the self-modifying commands into their binary equivalent modifications so they would work in runtime with the compiled code.
I don’t want to be dependent on an VM, thats what i meant with compiled :)
Probably there is a reason Lisp is like it is? Lisp was designed to program other languages and to compute with symbolic representations of code and data. The boundary between code and data is no longer there. This influences the design AND the implementation of a programming language.
Lisp has got its syntactical features to generate new code, translate that code and execute it. Thus pre-parsed code is also using the same data structures (symbols, lists, numbers, characters, ...) that are used for other programs, too.
Lisp knows its data at runtime - you can query everything for its type or class. Classes are objects themselves, as are functions. So these elements of the programming language and the programs also are first-class objects, they can be manipulated as such. Dynamic language has nothing to do with 'dynamic typing'.
'Dynamic language' means that the elements of the programming language (for example via meta classes and the meta-object protocol) and the program (its classes, functions, methods, slots, inheritance, ...) can be looked at runtime and can be modified at runtime.
Probably the more of these features you add to a language, the more it will look like Lisp. Since Lisp is pretty much the local maximum of a simple, dynamic, programmable programming language. If you want some of these features, then you might want to think which features of your other program language you have to give up or are willing to give up. For example for a simple code-as-data language, the whole C syntax model might not be practical.
So C-like and 'dynamic language' might not really be a good fit - the syntax is one part of the whole picture. But even the C syntax model limits us how easy we can work with a dynamic language.
C# has always allowed for self-modifying code.
C# 1 allowed you to essentially create and compile code on the fly.
C# 3 added "expression trees", which offered a limited way to dynamically generate code using an object model and abstract syntax trees.
C# 4 builds on that by incorporating support for the "Dynamic Language Runtime". This is probably as close as you are going to get to LISP-like capabilities on the .NET platform in a compiled language.
You might want to consider using C++ with LLVM for (mostly) portable code generation. You can even pull in clang as well to work in C parse trees (note that clang has incomplete support for C++ currently, but is written in C++ itself)
For example, you could write a self-modification core in C++ to interface with clang and LLVM, and the rest of the program in C. Store the parse tree for the main program alongside the self-modification code, then manipulate it with clang at runtime. Clang will let you directly manipulate the AST tree in any way, then compile it all the way down to machine code.
Keep in mind that manipulating your AST in a compiled language will always mean including a compiler (or interpreter) with your program. LLVM is just an easy option for this.
JavaScirpt + V8 (the Chrome JavaScript compiler)
JavaScript is
dynamic
self-modifying (self-evaluating) (well, sort of, depending on your definition)
has a C-like syntax (again, sort of, that's the best you will get for dynamic)
And you now can compile it with V8: http://code.google.com/p/v8/
"Dynamic language" is a broad term that covers a wide variety of concepts. Dynamic typing is supported by C# 4.0 which is a compiled language. Objective-C also supports some features of dynamic languages. However, none of them are even close to Lisp in terms of supporting self modifying code.
To support such a degree of dynamism and self-modifying code, you should have a full-featured compiler to call at run time; this is pretty much what an interpreter really is.
Try groovy. It's a dynamic Java-JVM based language that is compiled at runtime. It should be able to execute its own code.
http://groovy.codehaus.org/
Otherwise, you've always got Perl, PHP, etc... but those are not, as you suggest, C/C++/D- like languages.
I don’t want to be dependent on an VM, thats what i meant with compiled :)
If that's all you're looking for, I'd recommend Python or Ruby. They can both run on their own virtual machines and the JVM and the .Net CLR. Thus, you can choose any runtime you want. Of the two, Ruby seems to have more meta-programming facilities, but Python seems to have more mature implementations on other platforms.

What are the main issues in designing an interpreter for a functional language?

Suppose I want to implement an interpreter for a functional language. I would like to understand the issues involved in doing so and suitable literature that is available. This is a new language that is in early design stages, that is why the question is broad in scope.
For the purpose of this discussion we can assume that the purpose of the language is not important and that its functional features can be changed (even drastically) if it makes a significant difference in the ease of writing an interpreter.
The MIT website has an online copy of Structure and Interpretation of Computer Programs as well as videos of the MIT 6.001 lectures using Scheme, recorded at HP in 1986. These form a great introduction to language design.
I would highly recommend Structure and Interpretation of Computer Programs (SICP) as a starting point. This book will introduce the idea of what it means to write an interpreter (and a compiler), and is generally a must-read for anybody designing languages.
Implementing an interpreter for a functional language isn't likely to be too much different from implementing an interpreter for any other general purpose language. There's lexical analysis, parsing, AST construction, semantic analysis, plus execution (for a pure interpreter) or code generation and optimisation (for a compiler, even compiling to bytecode like Java/Perl/Python). SICP will introduce the difference between "applicative order" and "normal order" evaluation, which may be important for you in a pure functional context.
For just about any language interpreter or compiler, the main issues are the same, I think.
You need to decide certain basic characteristics of the language (semantics, not syntax), and the bulk of the design of the thing follows from that.
For example, does your language have
a type system? If so, what sorts of
types does it have? Is it going to be
statically typed, dynamically typed,
duck-typed?
What sort of expressions are you
planning to support? Do you need to
define an order of operations? Will
you even have operators?
What will you use as the run-time
representation of the program? Will
you convert the text to a byte-code
representation, or an AST, or a
tokenized form of the source text?
There are toolkits available to help take some of the tedium out of the actual parsing of text (ANTLR and Bison, to name two), but I don't know of anything that helps with the actual interpretation part of the task. I'm sure somebody will suggest something.
The main issue is having a semantics for the language you're implementing -- with that, the implementation becomes straightforward. Otherwise, this question is incredibly broad and hard to answer.
I'd recommend Essentials of Programming Languages as a good complement to SICP, particularly if you're interested in interpreters: Official EOPL site. You may want to check out the third edition-- the site hasn't been updated for it yet.
Edit: spam prevention is making me choose between links, so the official page is now unheated. It's easily Google-able, though.

Resources