How is a lattice used by a compiler - math

In my graduate class on compiler construction we've been introduced to the concept of a lattice. Three lectures have been devoted to lattices, and so far it seems like an interesting tangent, but the lectures haven't explained how a compiler uses a lattice to solve a concrete problem.
We have already covered parsing and typechecking. We're about to start liveness analysis and register allocation.
Note, I'm not looking for resources on building compilers. The following list of links has that covered pretty well. What I'm looking for is an explanation of the relationship between compilers and lattices, with bonus points for the most examples.
Learning Resources on Parsers, Interpreters, and Compilers
How much of the compiler should we know?
Learning to write a compiler

Lattices are a very useful structure for representing state while doing static analysis on the program being compiled, e.g. for liveness analysis (used to remove dead code), available/very busy expressions, reaching definitions, sign analysis and constant propagation.
Here is a very good read if you want the details: Lecture Notes on Static Analysis
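To make the connection concrete, here is a minimal sketch of liveness analysis written as a fixed-point computation over a lattice. The lattice elements are sets of variable names ordered by subset, the join is set union, and the bottom element is the empty set. The control-flow graph, the CFG encoding and the function name liveness are made up for illustration; they are not taken from the lecture notes.

```python
# Liveness analysis as a fixed point over the powerset lattice of variables.
# Each CFG node maps to (variables used, variables defined, successor nodes).
# The program below is a hypothetical example:
#   1: a = 0
#   2: b = a + 1
#   3: c = c + b
#   4: a = b * 2
#   5: if a < 9 goto 2
#   6: return c
CFG = {
    1: (set(), {"a"}, [2]),
    2: ({"a"}, {"b"}, [3]),
    3: ({"b", "c"}, {"c"}, [4]),
    4: ({"b"}, {"a"}, [5]),
    5: ({"a"}, set(), [2, 6]),
    6: ({"c"}, set(), []),
}


def liveness(cfg):
    """Iterate the dataflow equations until nothing changes.

    live_out[n] = union of live_in[s] over the successors s of n   (the join)
    live_in[n]  = use[n] | (live_out[n] - def[n])                  (the transfer)
    """
    live_in = {n: set() for n in cfg}    # start every node at bottom
    live_out = {n: set() for n in cfg}
    changed = True
    while changed:
        changed = False
        for n, (use, define, succs) in cfg.items():
            out = set()
            for s in succs:
                out |= live_in[s]
            new_in = use | (out - define)
            if out != live_out[n] or new_in != live_in[n]:
                live_out[n], live_in[n] = out, new_in
                changed = True
    return live_in, live_out


live_in, live_out = liveness(CFG)
for n in sorted(CFG):
    print(n, sorted(live_in[n]), sorted(live_out[n]))
```

The loop terminates because the sets can only grow and the lattice is finite, which is exactly the property the lattice theory is there to guarantee. In this example a and b never appear in the same set, so a register allocator could put them in the same register; an assignment whose target is missing from live_out at that node would be a candidate for dead code removal.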

Related

SIGNAL vs Esterel vs Lustre

I'm very interested in dataflow and concurrency focused languages. I've read up on the subject and repeatedly I see SIGNAL, Esterel, and Lustre mentioned; so I take it they're prominent players in those fields. However, many of their links in the resources I found are dead and they don't seem very accessible. I managed to find a couple compilers I can compile from source (Polychrony Toolset for SIGNAL and the Columbia Compiler for Esterel) but they've both had issues when trying to compile with cmake. Even textbooks teaching these languages have been tough to come by.
With the background out of the way, my actual questions are: is anyone really familiar with this field of programming? Are these languages still big deals, or have they "died out" by now? Could it be they're just available to big companies with a hefty price tag, so the average programmer wouldn't really be able to pick those languages up?
I ran into a couple other dataflow/concurrent paradigm languages, such as Oz or E, but they seemed to be mostly for education and not suitable for real world projects. Not to say they aren't impressive languages, but their implementation was limited and it would be unlikely to see them in production contexts. Does anyone know of other languages in this field they can recommend that are actually accessible (have good documentation, tutorials, and an installable compiler to actually code in)? Or can anyone clarify a language such as Oz or E and hopefully show that they indeed are good enough for large real world projects?
None of the languages you mentioned is widespread. This means their compilers and runtimes have bugs, the community is small and can give little help, and linking with general purpose libraries can be problematic.
I recommend using an actively supported general purpose language such as Java, Scala, Kotlin or C++. They all have libraries to support asynchronous computations, and dataflow is no more than support for asynchronous procedure calls. You can even develop your own dataflow library. This is not that hard: I wrote a dataflow library for Java which is only 40 kilobytes of source code.
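As a rough illustration of the claim that dataflow can be built on asynchronous calls, here is a small sketch using Python's asyncio. It is not the Java library mentioned above; the helper names source, node and main are hypothetical. Each node waits for all of its inputs and then fires once.

```python
import asyncio


async def source(value):
    """A dataflow source: produces one value asynchronously."""
    await asyncio.sleep(0)   # give other tasks a chance to run
    return value


async def node(fn, *inputs):
    """A dataflow node: waits until every input is available, then applies fn."""
    args = await asyncio.gather(*inputs)
    return fn(*args)


async def main():
    # Graph: (a + b) * a, where a and b arrive asynchronously.
    a = asyncio.create_task(source(2))
    b = asyncio.create_task(source(3))
    total = asyncio.create_task(node(lambda x, y: x + y, a, b))
    product = asyncio.create_task(node(lambda x, y: x * y, total, a))
    return await product


result = asyncio.run(main())
print(result)  # 10
```

The graph is wired up before any value exists, and the order of evaluation is decided by data availability rather than by the order of the statements, which is the essence of the dataflow style.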
Have you tried Céu? It is a recent variant of Esterel, and compiles to C. It is simple to understand, and provides a reactive and concurrent structuring of control flow. Native C calls can be made by just prefixing them with an underscore ("_printf").
http://ceu-lang.org
Also, see the paper "Structured Synchronous Reactive Programming with Céu" for a nice overview.
http://www.ceu-lang.org/chico/ceu_mod15_pre.pdf
These academic languages have mostly disappeared as standalone languages and live on in industrial tools:
Esterel and Lustre are the basis of Ansys' SCADE.
Signal is used in 3DS' ControlBuild.
Esterel was used in Synopsys' ConcentricStudio.
Researchers also use Heptagon to study synchronous languages: code generation, formal methods, new concepts.

Short implementation examples of abstract interpretation

I am taking a course on abstract interpretation, but I haven't seen any examples of how the theory maps down to actual code.
I am looking for short code examples, where I preferably won't have to work with a whole compiler. The analysis doesn't have to be useful, I would just like to see an example where the analysis is derived and then implemented.
Does anyone know of any such examples, perhaps from a university course?
Abstract interpretation (AI) is based on a mathematical theory called Galois connections. The idea is very simple:
Abstract the behaviour of the program.
Perform the analysis on the abstract level.
Use a Galois connection to relate the concrete and the abstract program.
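The three steps above can be sketched for a toy expression language using the sign domain. This is a minimal illustration, not a full analyzer: the expression encoding (tuples such as ("+", left, right)) and the names alpha, abs_add, abs_mul and abs_eval are assumptions made up for this example. alpha plays the role of the abstraction half of the Galois connection for a single integer.

```python
# Abstract values of the sign domain. BOT means "no value", TOP means "any sign".
BOT, NEG, ZERO, POS, TOP = "bot", "-", "0", "+", "top"


def alpha(n):
    """Abstraction: map one concrete integer to its sign."""
    if n == 0:
        return ZERO
    return POS if n > 0 else NEG


def abs_mul(x, y):
    """Abstract multiplication on signs."""
    if BOT in (x, y):
        return BOT
    if ZERO in (x, y):
        return ZERO
    if TOP in (x, y):
        return TOP
    return POS if x == y else NEG


def abs_add(x, y):
    """Abstract addition on signs."""
    if BOT in (x, y):
        return BOT
    if x == ZERO:
        return y
    if y == ZERO:
        return x
    if x == y:
        return x        # same sign stays the same sign; TOP + TOP is TOP
    return TOP          # mixed signs (or TOP with anything): sign unknown


def abs_eval(expr, env):
    """Evaluate an expression over signs instead of integers.

    expr is an int literal, a variable name, or a tuple (op, left, right)
    with op being "+" or "*". env maps variable names to abstract values.
    """
    if isinstance(expr, int):
        return alpha(expr)
    if isinstance(expr, str):
        return env[expr]
    op, left, right = expr
    f = abs_add if op == "+" else abs_mul
    return f(abs_eval(left, env), abs_eval(right, env))


# x * x + 1 is positive whenever the sign of x is known.
expr = ("+", ("*", "x", "x"), 1)
print(abs_eval(expr, {"x": NEG}))   # +
print(abs_eval(expr, {"x": TOP}))   # top
```

The second result shows the price of abstraction: x * x + 1 is always positive, but once the sign of x is unknown this simple domain can no longer prove it. The analysis stays sound (top covers every sign) while losing precision.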
This is the best tutorial I have seen so far about Abstract Interpretation:
There is this paper by Bertot
Structural abstract interpretation, A formal study using Coq
That gives a full implementation of an abstract interpreter for a simple toy language using the Coq Proof Assistant. I used this for a concrete reference, and found it useful, although a little hard going, which is to be expected given the subject matter. Coq is a great little piece of software.
I also came across in a Cousot paper:
A gentle introduction to formal verification of computer systems by abstract interpretation
rough details (but I am sure there will be useful citations for full details) of an implementation in Astrée. I am not familiar with Astrée, so I didn't actually read that section, but I think it meets your criteria.
If you come across any more, please let me know! I would especially like to see a Prolog abstract interpreter.
Maybe this tool is also interesting for you:
Interproc Analyzer
It is an abstract analyzer for a very simple language, which however offers
interprocedural analyses. You can try out the analysis and get numerical invariants about the analyzed program. The source code is available (OCaml).
A really thorough and precise course, given by one of the "creators" of Abstract Interpretation, Patrick Cousot (already mentioned in one of the answers):
MIT course about Abstract Interpretation. The course also offers assignments, in OCaml.
There is MonoREIL, which comes with the recently open sourced tool BinNavi.
Here is a short intro.
Note that the context of the MonoREIL framework is not compilers but the analysis of binary code. Yet, it has been used for real world applications, see slide 34 ff of this introduction (which contains more formal background).

Modelling / documenting functional programs

I've found UML useful for documenting various aspects of OO systems, particularly class diagrams for overall architecture and sequence diagrams to illustrate particular routines. I'd like to do the same kind of thing for my clojure applications. I'm not currently interested in Model Driven Development, simply on communicating how applications work.
Is UML a common / reasonable approach to modelling functional programming? Is there a better alternative to UML for FP?
The "many functions on a single data structure" approach of idiomatic Clojure code waters down the typical "this uses that" UML diagram because many of the functions end up pointing at map/reduce/filter.
I get the impression that because Clojure is a somewhat more data centric language a way of visualizing the flow of data could help more than a way of visualizing control flow when you take lazy evaluation into account. It would be really useful to get a "pipe line" diagram of the functions that build sequences.
map, reduce, etc. would turn these into trees.
Most functional programmers prefer types to diagrams. (I mean types very broadly speaking, to include such things as Caml "module types", SML "signatures", and PLT Scheme "units".) To communicate how a large application works, I suggest three things:
Give the type of each module. Since you are using Clojure you may want to check out the "Units" language invented by Matthew Flatt and Matthias Felleisen. The idea is to document the types and the operations that the module depends on and that the module provides.
Give the import dependencies of the interfaces. Here a diagram can be useful; in many cases you can create a diagram automatically using dot. This has the advantage that the diagram always accurately reflects the code.
For some systems you may want to talk about important dependencies of implementations. But usually not—the point of separating interfaces from implementations is that the implementations can be understood only in terms of the interfaces they depend on.
There was recently a related question on architectural thinking in functional languages.
It's an interesting question (I've upvoted it), I expect you'll get at least as many opinions as you do responses. Here's my contribution:
What do you want to represent on your diagrams? In OO one answer to that question might be, considering class diagrams, state (or attributes if you prefer) and methods. So, obviously I would suggest, class diagrams are not the right thing to start from since functions have no state and, generally, implement one function (aka method). Do any of the other UML diagrams provide a better starting point for your thinking? The answer is probably yes but you need to consider what you want to show and find that starting point yourself.
Once you've written a (sub-)system in a functional language, then you have a (UML) component to represent on the standard sorts of diagram, but perhaps that is too high-level, too abstract, for you.
When I write functional programs, which is not a lot I admit, I tend to document functions as I would document mathematical functions (I work in scientific computing, lots of maths knocking around so this is quite natural for me). For each function I write:
an ID;
sometimes, a description;
a specification of the domain;
a specification of the co-domain;
a statement of the rule, ie the operation that the function performs;
sometimes I write post-conditions too though these are usually adequately specified by the co-domain and rule.
I use LaTeX for this, it's good for mathematical notation, but any other reasonably flexible text or word processor would do. As for diagrams, no not so much. But that's probably a reflection of the primitive state of the design of the systems I program functionally. Most of my computing is done on arrays of floating-point numbers, so most of my functions are very easy to compose ad-hoc and the structuring of a system is very loose. I imagine a diagram which showed functions as nodes and inputs/outputs as edges between nodes -- in my case there would be edges between each pair of nodes in most cases. I'm not sure drawing such a diagram would help me at all.
I seem to be coming down on the side of telling you no, UML is not a reasonable way of modelling functional systems. Whether it's common SO will tell us.
This is something I've been trying to experiment with also, and after a few years of programming in Ruby I was used to class/object modeling. In the end I think the types of designs I create for Clojure libraries are actually pretty similar to what I would do for a large C program.
Start by doing an outline of the domain model. List the main pieces of data being moved around and the primary functions being performed on this data. I write these in my notebook and a lot of the time it will be just a name with 3-5 bullet points underneath it. This outline will probably be a good approximation of your initial namespaces, and it should point out some of the key high level interfaces.
If it seems pretty straight forward then I'll create empty functions for the high level interface, and just start filling them in. Typically each high level function will require a couple support functions, and as you build up the whole interface you will find opportunities for sharing more code, so you refactor as you go.
If it seems like a more difficult problem then I'll start diagramming out the structure of the data and the flow of key functions. Oftentimes the diagram and conceptual model that makes the most sense will depend on the type of abstractions you choose to use in a specific design. For example if you use a dataflow library for a Swing GUI then using a dependency graph would make sense, but if you are writing a server to process relational database queries then you might want to diagram pools of agents and pipelines for processing tuples. I think these kinds of models and diagrams are also much more descriptive in terms of conveying to another developer how a program is architected. They show more of the functional connectivity between aspects of your system, rather than the pretty non-specific information conveyed by something like UML.

How To: Pattern Recognition

I'm interested in learning more about pattern recognition. I know that's somewhat of a broad field, so I'll list some specific types of problems I would like to learn to deal with:
Finding patterns in a seemingly random set of bytes.
Recognizing known shapes (such as circles and squares) in images.
Noticing movement patterns given a stream of positions (Vector3)
This is a new area of experimentation for me personally, and to be honest, I simply don't know where to start :-) I'm obviously not looking for the answers to be provided to me on a silver platter, but some search terms and/or online resources where I can start to acquaint myself with the concepts of the above problem domains would be awesome.
Thanks!
ps: For extra credit, if said resources provide code examples/discussion in C# would be grand :-) but doesn't need to be
Hidden Markov Models are a great place to look, as well as Artificial Neural Networks.
Edit: You could take a look at NeuronDotNet, it's open source and you could poke around the code.
Edit 2: You can also take a look at ITK, it's also open source and implements a lot of these types of algorithms.
Edit 3: Here's a pretty good intro to neural nets. It covers a lot of the basics and includes source code (albeit in C++). He implemented an unsupervised learning algorithm, I think you may be looking for a supervised backpropagation algorithm to train your network.
Edit 4: Another good intro, avoids really heavy math, but provides references to a lot of that detail at the bottom, if you want to dig into it. Includes pseudo-code, good diagrams, and a lengthy description of backpropagation.
This is kind of like saying "I'd like to learn more about electronics.. anyone tell me where to start?" Pattern Recognition is a whole field - there are hundreds, if not thousands of books out there, and any university has at least several (probably 10 or more) courses at the grad level on this. There are numerous journals dedicated to this as well, that have been publishing for decades ... conferences ..
You might start with the wikipedia.
http://en.wikipedia.org/wiki/Pattern_recognition
This is kind of an old question, but it's relevant so I figured I'd post it here :-) Stanford began offering an online Machine Learning class here - http://www.ml-class.org
OpenCV has some functions for pattern recognition in images.
You might want to look at this: http://opencv.willowgarage.com/documentation/pattern_recognition.html. (broken link: closest thing in the new doc is http://opencv.willowgarage.com/documentation/cpp/ml__machine_learning.html, although it is no longer what I'd call helpful documentation for a beginner - see other answers)
However, I also recommend starting with Matlab because openCV is not intuitive to use.
Lot of useful links on this page on computer vision related pattern recognition. Some of the links seem to be broken now but you may find it useful.
I am not an expert on this, but reading about Hidden Markov Models is a good way to start.
Beware false patterns! For any decently large data set you will find subsets that appear to have pattern, even if it is a data set of coin flips. No good process for pattern recognition should be without statistical techniques to assess confidence that the detected patterns are real. When possible, run your algorithms on random data to see what patterns they detect. These experiments will give you a baseline for the strength of a pattern that can be found in random (a.k.a "null") data. This kind of technique can help you assess the "false discovery rate" for your findings.
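Here is a small sketch of that baseline experiment, under assumptions chosen purely for illustration: the "pattern detector" is simply the longest run of identical bits, and the null data are fair coin flips. The function names longest_run, null_distribution and empirical_p_value are hypothetical.

```python
import random


def longest_run(bits):
    """Toy pattern score: length of the longest run of identical values."""
    best = cur = 0
    prev = None
    for b in bits:
        cur = cur + 1 if b == prev else 1
        prev = b
        best = max(best, cur)
    return best


def null_distribution(n_flips, n_trials, seed=0):
    """Scores the detector reports on purely random ("null") coin flips."""
    rng = random.Random(seed)
    return [
        longest_run([rng.randint(0, 1) for _ in range(n_flips)])
        for _ in range(n_trials)
    ]


def empirical_p_value(observed, null_scores):
    """Fraction of null runs scoring at least as high as the observed data.

    The +1 terms keep the estimate away from an impossible p-value of 0.
    """
    hits = sum(1 for s in null_scores if s >= observed)
    return (hits + 1) / (len(null_scores) + 1)


null_scores = null_distribution(n_flips=200, n_trials=1000)
print("typical longest run in random data:", sorted(null_scores)[len(null_scores) // 2])
print("p-value for an observed run of 12:", empirical_p_value(12, null_scores))
```

If the score found in the real data is not clearly larger than what the same detector finds in random data, the "pattern" should not be trusted.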
Learning pattern recognition is easier in MATLAB.
There are several examples and there are functions to use.
It is good for understanding concepts and for experiments.
I would recommend starting with some MATLAB toolbox. MATLAB is an especially convenient place to start playing around with stuff like this due to its interactive console. A nice toolbox I personally used and really liked is PRTools (http://prtools.org); they have an implementation of pretty much every pattern recognition tool and also some other machine learning tools (Neural Networks, etc.). But the nice thing about MATLAB is that there are many other toolboxes as well you can try out (there is even a proprietary toolbox from Mathworks)
Once you feel comfortable enough with the different tools (and have found out which classifier performs best for your problem), you can start thinking about implementing the machine learning in a different application.

Concepts that surprised you when you read SICP?

SICP - "Structure and Interpretation of Computer Programs"
An explanation of each would be nice.
Can someone explain metalinguistic abstraction?
SICP really drove home the point that it is possible to look at code and data as the same thing.
I understood this before when thinking about universal Turing machines (the input to a UTM is just a representation of a program) or the von Neumann architecture (where a single storage structure holds both code and data), but SICP made the idea much more clear. Scheme (Lisp) helped here, as the syntax for a program is exactly the same as the syntax for lists in general, namely S-expressions.
Once you have the "equivalence" of code and data, suddenly a lot of things become easy. For example, you can write programs that have different evaluation methods (lazy, nondeterministic, etc). Previously, I might have thought that this would require an extension to the programming language; in reality, I can just add it on to the language myself, thus allowing the core language to be minimal. As another example, you can similarly implement an object-oriented framework; again, this is something I might have naively thought would require modifying the language.
Incidentally, one thing I wish SICP had mentioned more: types. Type checking at compilation time is an amazing thing. The SICP implementation of object-oriented programming did not have this benefit.
I didn't read that book yet, I have only looked at the video courses, but it taught me a lot. Functions as first class citizens was mind blowing for me. Executing a "variable" was something very new to me. After watching those videos the way I now see JavaScript and programming in general has greatly changed.
Oh, I think I've lied, the thing that really struck me was that + was a function.
I think the most surprising thing about SICP is to see how few primitives are actually required to make a Turing complete language--almost anything can be built from almost nothing.
Since we are discussing SICP, I'll put in my standard plug for the video lectures at http://groups.csail.mit.edu/mac/classes/6.001/abelson-sussman-lectures/, which are the best Introduction to Computer Science you could hope to get in 20 hours.
The one that I thought was really cool was streams with delayed evaluation. The one about generating primes was something I thought was really neat. Like a "PEZ" dispenser that magically dispenses the next prime in the sequence.
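For readers who have not seen it, the idea can be approximated in Python, where generators play the role of SICP's delayed streams. This is a sketch in the spirit of the book's sieve, not a transcription of its Scheme code; the name sieve is chosen here for illustration.

```python
from itertools import count, islice


def sieve(stream):
    """Lazily yield primes from a stream of candidate integers.

    Nothing is computed until a value is requested: each call takes the next
    number p from the stream, yields it, and then sieves the (still lazy)
    rest of the stream with the multiples of p filtered out.
    """
    p = next(stream)
    yield p
    yield from sieve(n for n in stream if n % p != 0)


primes = sieve(count(2))          # an "infinite" stream of primes
print(list(islice(primes, 10)))   # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

Each new prime adds one more nested generator, so this version will hit Python's recursion limit if you pull a large number of primes. It is meant to show the delayed evaluation idea, not to be an efficient prime generator.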
One example of "the data and the code are the same thing" from A. Rex's answer got me in a very deep way.
When I was taught Lisp back in Russia, our teachers told us that the language was about lists: car, cdr, cons. What really amazed me was the fact that you don't need those functions at all - you can write your own, given closures. So, Lisp is not about lists after all! That was a big surprise.
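The trick can be shown in a few lines. The sketch below is written in Python rather than Scheme, but the idea is the same: a pair is nothing more than a closure that remembers two values and hands them to whatever selector it is given.

```python
def cons(a, b):
    """Build a pair as a closure: no list, tuple or object involved."""
    def pair(select):
        return select(a, b)
    return pair


def car(p):
    """First element: ask the pair to apply a selector that keeps the first value."""
    return p(lambda a, b: a)


def cdr(p):
    """Second element: ask the pair to apply a selector that keeps the second value."""
    return p(lambda a, b: b)


# A three-element list built only from closures, terminated by None.
lst = cons(1, cons(2, cons(3, None)))
print(car(lst))            # 1
print(car(cdr(lst)))       # 2
print(car(cdr(cdr(lst))))  # 3
```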
A concept I was completely unfamiliar with was the idea of coroutines, i.e. having two functions doing complementary work and having the program flow control alternate between them.
I was still in high school when I read SICP, and I had focused on the first and second chapters. For me at the time, I liked that you could express all those mathematical ideas in code, and have the computer do most of the dirty work.
When I was tutoring SICP, I got impressed by different aspects. For one, the conundrum that data and code are really the same thing, because code is executable data. The chapter on metalinguistic abstractions is mind-boggling to many and has many take-home messages. The first is that all the rules are arbitrary. This bothers some students, specially those who are physicists at heart. I think the beauty is not in the rules themselves, but in studying the consequence of the rules. A one-line change in code can mean the difference between lexical scoping and dynamic scoping.
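To illustrate the last point, here is a tiny evaluator sketch in Python (not the book's Scheme evaluator). The expression encoding and the names evaluate and PROGRAM are invented for this example. The only difference between lexical and dynamic scoping is which environment the function body is evaluated in, and that choice lives on a single line.

```python
def evaluate(expr, env, dynamic=False):
    """Evaluate a tiny expression language.

    expr is a number, a variable name (str), or a tuple:
      ("lambda", param, body)     one-argument function
      ("let", name, value, body)  local binding
      ("+", left, right)          addition
      ("call", fn, arg)           function application
    """
    if isinstance(expr, (int, float)):
        return expr
    if isinstance(expr, str):
        return env[expr]
    tag = expr[0]
    if tag == "lambda":
        _, param, body = expr
        return ("closure", param, body, env)   # remember the defining environment
    if tag == "let":
        _, name, value, body = expr
        new_env = {**env, name: evaluate(value, env, dynamic)}
        return evaluate(body, new_env, dynamic)
    if tag == "+":
        return evaluate(expr[1], env, dynamic) + evaluate(expr[2], env, dynamic)
    if tag == "call":
        _, param, body, defining_env = evaluate(expr[1], env, dynamic)
        arg = evaluate(expr[2], env, dynamic)
        base = env if dynamic else defining_env   # the one-line difference
        return evaluate(body, {**base, param: arg}, dynamic)
    raise ValueError(f"unknown expression: {expr!r}")


# let x = 1 in
#   let f = (lambda y. x + y) in
#     let x = 100 in f(10)
PROGRAM = ("let", "x", 1,
           ("let", "f", ("lambda", "y", ("+", "x", "y")),
            ("let", "x", 100,
             ("call", "f", 10))))

print(evaluate(PROGRAM, {}))                # 11  (lexical: x is the 1 seen at definition)
print(evaluate(PROGRAM, {}, dynamic=True))  # 110 (dynamic: x is the 100 seen at the call)
```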
Today, though SICP is still fun and insightful to many, I do understand that it's becoming dated. For one, it doesn't teach debugging skills and tools (I include type systems in there), which is essential for working in today's gigantic systems.
I was most surprised by how easy it is to implement languages: one can write an interpreter for Scheme on a blackboard.
I came to see recursion in a different light after reading some of the chapters of SICP.
I am right now on Section "Sequences as Conventional Interfaces" and have found the concept of procedures as first class citizens quite fascinating. Also, the application of recursion is something I have never seen in any language.
Closures.
Coming from a primarily imperative background (Java, C#, etc. -- I only read SICP a year or so ago for the first time, and am re-reading it now), thinking in functional terms was a big revelation for me; it totally changed the way I think about my work today.
I read most of the book (without doing the exercises). What I have learned is how to abstract the real world at a specific level, and how to implement a language.
Each chapter has ideas that surprised me:
The first two chapters show me two ways of abstracting the real world: abstraction with the procedure, and abstraction with data.
Chapter 3 introduces time in the real world. That results in states. We try assignment, which raises problems. Then we try streams.
Chapter 4 is about metalinguistic abstraction, in other words, we implement a new language by constructing an evaluator, which determines the meaning of expressions.
Since the evaluator in Chapter 4 is itself a Lisp program, it inherits the control structure of the underlying Lisp system. So in Chapter 5, we dive into the step-by-step operation of a real computer with the help of an abstract model, the register machine.
Thanks.

Resources