Navigating the automatic differentiation ecosystem in Julia

Julia has a somewhat sprawling AD ecosystem, with perhaps more than a dozen different packages by now. As far as I can tell, these span forward-mode (ForwardDiff.jl, ForwardDiff2.jl), reverse-mode (ReverseDiff.jl, Nabla.jl, AutoGrad.jl), and source-to-source (Zygote.jl, Yota.jl, Enzyme.jl, and presumably also the forthcoming Diffractor.jl) approaches at several different steps of the compilation pipeline, as well as more exotic things like NiLang.jl.
Between such packages, what is the support for different language constructs (control flow, mutation, etc.), and are there any rules of thumb for choosing a given AD package for a given task? I believe there was a compare-and-contrast table on the Julia Slack at some point, but I can't seem to find anything like that reproduced for posterity in the relevant Discourse threads or other likely places (1, 2).
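For readers new to the terminology, forward-mode AD is commonly implemented with dual numbers and operator overloading. The following is only a minimal sketch of that idea, written in Python rather than Julia; the names Dual, derivative, and poly are made up for illustration and do not correspond to any package's API. It also shows why ordinary control flow (here a loop) is unproblematic for this style of AD: the loop is simply executed on the overloaded numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value paired with its derivative with respect to one input."""
    val: float
    der: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.val + other.val, self.der + other.der)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)

    __rmul__ = __mul__


def _lift(x):
    """Treat plain numbers as constants (derivative 0)."""
    return x if isinstance(x, Dual) else Dual(float(x), 0.0)


def derivative(f, x):
    """Differentiate f at x by seeding the input with derivative 1."""
    return f(Dual(float(x), 1.0)).der


def poly(x, n=3):
    """1 + x + x^2 + ... + x^n, written with an ordinary loop."""
    acc, term = 0, 1
    for _ in range(n + 1):
        acc = acc + term
        term = term * x
    return acc


print(derivative(poly, 2.0))  # 1 + 2*2 + 3*2**2 = 17.0
```

Reverse-mode and source-to-source tools work quite differently (they record or transform the program rather than overloading arithmetic), which is where support for constructs such as mutation starts to differ between packages.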

I'd also love to hear an informed answer to this. Some more links that might be of interest.
Diffractor now has a GitHub repo, which lays out the implementation plan. After reading the text there, my take is that it will require long-term implementation work before Diffractor is production-ready. On the other hand, there is a feeling that Zygote may be in "maintenance mode" while awaiting Diffractor. At least from a distance, the situation seems a bit awkward. The good news is that the ChainRules.jl ecosystem seems to make it possible to easily swap between autodiff systems.
As of Sept 2021, Yota seems to be rapidly evolving. The 0.5 release brings support for ChainRules, which seems to unlock it for production use. There is a lot of interesting discussion at this release thread. My understanding from reading through those threads is that the scope of Yota is more limited than that of Zygote (e.g., autodiff through mutation is not supported). This limited scope has the advantage of opening up optimization opportunities, such as preallocation and kernel fusion, that may not be possible in a more general autodiff system. As such, Yota might be better suited to fill the niche of, e.g., PyTorch-style modeling.

Related

Understanding the reasons behind OpenMDAO design

I am reading about MDO and I find OpenMDAO really interesting. However, I have trouble understanding and justifying the reasons behind some basic choices.
Why gradient-based optimization? Since a gradient-based optimizer can never guarantee a global optimum, why is it preferred? I understand that finding a global minimum is really hard for MDO problems with numerous design variables, and that a local optimum is far better than a human design. But considering that the application is generally for expensive systems like aircraft or satellites, why settle for a local minimum? Wouldn't it be better to use meta-heuristics, or meta-heuristics on top of gradient methods, to converge to the global optimum? The computation time would be high, but now that almost every university and leading company has access to supercomputers, I would say it is an acceptable trade-off.
Speaking about computation time, why Python? I agree that Python makes scripting convenient and can be interfaced to compiled languages. Does this alone tip the scales in favor of Python? If computation time is one of the primary reasons that finding the global minimum is really hard, wouldn't it be better to use C++ or another efficient compiled language?
To clarify, the only intention of this post is to justify (to myself) using OpenMDAO, as I am just starting to learn about MDO.

No algorithm can guarantee that it finds a global optimum in finite time, but gradient-based methods generally find local optima faster than gradient-free methods. OpenMDAO concentrates on gradient-based methods because they are able to traverse the design space much more rapidly than gradient-free methods.
Gradient-free methods are generally good for exploring the design space more broadly for better local optima, and there's nothing to prevent users from wrapping the gradient-based optimization drivers under a gradient-free caller (see the literature about algorithms like Monotonic Basin Hopping, for instance).
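To make the "gradient-based driver under a gradient-free caller" idea concrete, here is a minimal sketch in plain Python. It does not use OpenMDAO at all: the objective f, the fixed-step gradient descent, and the function names are hypothetical stand-ins, and the outer loop is a simplified form of monotonic basin hopping (perturb the incumbent, re-optimise locally, keep only improvements).

```python
import random


def f(x):
    """Hypothetical 1-D objective with two local minima."""
    return x**4 - 3 * x**2 + x


def grad_f(x):
    return 4 * x**3 - 6 * x + 1


def gradient_descent(x, step=0.01, iters=2000):
    """Local, gradient-based search: converges to a nearby minimum."""
    for _ in range(iters):
        x -= step * grad_f(x)
    return x


def monotonic_basin_hopping(x0, hops=50, radius=2.0, seed=0):
    """Gradient-free outer loop around the gradient-based inner solver."""
    rng = random.Random(seed)
    best = gradient_descent(x0)
    for _ in range(hops):
        trial = best + rng.uniform(-radius, radius)
        trial = max(-2.0, min(2.0, trial))  # keep start points where the fixed step is stable
        candidate = gradient_descent(trial)
        if f(candidate) < f(best):          # monotonic: accept improvements only
            best = candidate
    return best


local_only = gradient_descent(1.5)         # stays in the basin it started in
with_hops = monotonic_basin_hopping(1.5)   # can escape to the better basin
print(local_only, f(local_only))
print(with_hops, f(with_hops))
```

Each hop costs one full local optimisation, which is the trade-off described above: the gradient-based solver does the fast local work, and the gradient-free wrapper spends extra runs looking for a better basin.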
Python was chosen because, while it's not the most efficient in run-time, it considerably reduces the development time. Since using OpenMDAO means writing code, the relatively low learning curve, ease of access, and cross-platform nature of Python made it attractive. There's also a LOT of open-source code out there that's written in Python, which makes it easier to incorporate things like 3rd party solvers and drivers. OpenMDAO is only possible because we stand on a lot of shoulders.
Despite being written in Python, we achieve relatively good performance because the algorithms involved are very efficient and we attempt to minimize the performance issues of Python by doing things like using vectorization via Numpy rather than Python loops.
Also, the calculations that Python handles at the core of OpenMDAO are generally very low cost. For complex engineering calculations like PDE solvers (e.g. CFD or FEA), the expensive parts of the code can be written in C, C++, Fortran, or even Julia. These languages are easy to interface with Python, and many OpenMDAO users do just that.
OpenMDAO is actively used in a number of applications, and the needs of those applications drive its design. While we don't have a built-in monotonic-basin-hopping capability right now (for instance), if that were determined to be a need by our stakeholders we'd look to add it. As our development continues, if we were to hit roadblocks that could be overcome by switching to a different language, we would consider it, but backwards compatibility (the ability of users to use their existing Python-based models) would be a requirement.

SIGNAL vs Esterel vs Lustre

I'm very interested in dataflow and concurrency focused languages. I've read up on the subject and repeatedly I see SIGNAL, Esterel, and Lustre mentioned; so I take it they're prominent players in those fields. However, many of their links in the resources I found are dead and they don't seem very accessible. I managed to find a couple compilers I can compile from source (Polychrony Toolset for SIGNAL and the Columbia Compiler for Esterel) but they've both had issues when trying to compile with cmake. Even textbooks teaching these languages have been tough to come by.
With the background out of the way, my actual questions are: is anyone really familiar with this field of programming? Are these languages still big deals, or have they "died out" by now? Could it be they're just available to big companies with a hefty price tag, so the average programmer wouldn't really be able to pick those languages up?
I ran into a couple other dataflow/concurrent paradigm languages, such as Oz or E, but they seemed to be mostly for education and not suitable for real world projects. Not to say they aren't impressive languages, but their implementation was limited and it would be unlikely to see them in production contexts. Does anyone know of other languages in this field they can recommend that are actually accessible (have good documentation, tutorials, and an installable compiler to actually code in)? Or can anyone clarify a language such as Oz or E and hopefully show that they indeed are good enough for large real world projects?

None of the languages you mentioned is widespread. This means their compilers and runtimes have bugs, the community is small and can give little help, and linking with general-purpose libraries can be problematic.
I recommend using an actively supported general-purpose language such as Java, Scala, Kotlin or C++. They all have libraries to support asynchronous computations, and dataflow is no more than support for asynchronous procedure calls. You can even develop your own dataflow library. This is not that hard: I wrote a dataflow library for Java which is only 40 kilobytes of source code.
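As a rough illustration of the claim that dataflow can be built on asynchronous calls, here is a minimal sketch using Python's standard asyncio module (Python is used here only for brevity; it is not the Java library mentioned above, and the helper names source and node are made up). Each node is a coroutine that fires once all of its inputs are available.

```python
import asyncio


async def source(value):
    """A node with no inputs: produces a value asynchronously."""
    await asyncio.sleep(0)  # stand-in for real asynchronous work
    return value


async def node(func, *inputs):
    """A dataflow node: waits until all inputs are available, then fires."""
    args = await asyncio.gather(*inputs)
    return func(*args)


async def main():
    # Wrapping each node in a task lets several downstream nodes share it.
    a = asyncio.create_task(source(2))
    b = asyncio.create_task(source(3))
    total = asyncio.create_task(node(lambda x, y: x + y, a, b))
    product = asyncio.create_task(node(lambda x, y: x * y, a, b))
    # The final node depends on the two intermediate nodes.
    return await node(lambda s, p: (s, p, s - p), total, product)


result = asyncio.run(main())
print(result)  # (5, 6, -1)
```

The graph is expressed purely through data dependencies: nothing in main says in which order total and product run, only what each node consumes. This covers one-shot dataflow; it does not model the clocked, synchronous semantics of languages like Esterel or Lustre.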

Have you tried Céu? It is a recent variant of Esterel, and compiles to C. It is simple to understand, and provides a reactive and concurrent structuring of control flow. Native C calls can be made by just prefixing them with an underscore ("_printf").
http://ceu-lang.org
Also, see the paper "Structured Synchronous Reactive Programming with Céu" for a nice overview.
http://www.ceu-lang.org/chico/ceu_mod15_pre.pdf

These academic languages have mostly disappeared as such and live on in industrial tools:
Esterel and Lustre are the basis of Ansys' SCADE.
Signal is used in 3DS' ControlBuild.
Esterel was used in Synopsys' ConcentricStudio.
Researchers also use Heptagon for synchronous language studies covering code generation, formal methods, and new concepts.

What are the main differences between CLISP, ECL, and SBCL?

I want to do some simulations with ACT-R and I will need a Common Lisp implementation. I have three Common Lisp implementations available: (1) CLISP [1], (2) ECL [1], and (3) SBCL [1]. As you might have gathered from the links I have read a bit about all three of them on Wikipedia. But I would like the opinion of some experienced users. More specifically I would like to know:
(i) What are the main differences between the three implementations (e.g.: What are they best at? Is any of them used only for specific purposes and might therefore not be suited for specific tasks?)?
(ii) Is there an obvious choice either based on the fact that I will be using ACT-R or based on general reasons?
As this could be interpreted as a subjective question, I checked "What topics can I ask about here" and "What types of questions should I avoid asking?", and if I read correctly it should not qualify as forbidden fruit.

I wrote a moderately-sized application and ran it in SBCL, CCL, ECL, CLISP, ABCL, and LispWorks. For my application, SBCL is far and away the fastest, and it's got a pretty good debugger. It's a bit strict about some warnings--you may end up coding in a slightly more regimented way, or turn off one or more warnings.
I agree with Sylwester: If possible, write to the standard, and then you can run your code in any implementation. You'll figure out through testing which is best for your project.
Since SBCL compiles so aggressively, once in a while the stacktrace in the debugger is less informative than I'd like. This can probably be controlled with parameters, but I just rerun the same code in one of the other implementations. ABCL has an informative stacktrace, for example, as I recall. (It's also very slow, but if you want real Common Lisp and Java interoperability, it's the only option.)
One of the nice things about Common Lisp is how many high-quality implementations there are, most of them free.
For informal use--e.g. to learn Common Lisp--CCL or CLISP may be a better choice than SBCL.
I have never tried compiling to C using ECL. It's possible that it would beat SBCL on speed for some applications. I have no idea.
CLISP and LispWorks will not handle arbitrarily long argument lists (unless that's been fixed in the last couple of years, but I doubt it). This turned out to be a problem with my application, but would not be a problem for most code.
Doesn't ACT-R come out of Carnegie Mellon? What do its authors use? My guess would be CMUCL or SBCL, which is derived from CMUCL. (I only tried CMUCL briefly. Its interpreter is very slow, but I assume that compiled code is very fast. I think that most people choose SBCL over CMUCL, however.)
(It's possible that this question belongs on Programmers.SE.)

In general, SBCL is the default choice among open-source Lisps. It is solid, well-supported, produces fast code, and provides many goodies beyond what the standard mandates (concurrency primitives, profiling, etc.) Another implementation with similar properties is CCL.
CLISP is more suitable if you're not an engineer, or you want to quickly show Lisp to someone non-engineer. It's a pretty basic implementation, but quick to get running and user-friendly. A Lisp-calculator :)
ECL's major selling point is that it's embeddable, i.e. it is rather easy to make it work inside some C application, like a web server. It's a good choice for geeks who want to explore solutions on the boundary of Lisp and the outside world. If you're not interested in such a use case, I wouldn't recommend trying it, especially since it is not actively supported at the moment.

Their names, their bugs, and their non-standard additions (using them will lock you in).
I use CLISP as a REPL and for testing during development, and usually SBCL for production. I've never used ECL.
I recommend you test your code with more than one implementation.

Tuning Mathematical Parallel Codes

Assuming that I am interested in performance rather than portability of my multi-threaded iterative linear algebra solver, and that I have the results of profiling my code in hand, how do I go about tuning my code to run optimally on the machine of my choice?
The algorithm involves Matrix-Vector multiplications, norms and dot-products. (FWIW, I am working on CG and GMRES).
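For readers who want to see where those three kernels sit in the algorithm, here is a minimal, untuned sketch of conjugate gradient in plain Python. It is only an illustration: the real code in question is C/Fortran linked against MKL, and the helper names and the 2x2 test system below are made up.

```python
import math


def matvec(A, x):
    """Dense matrix-vector product: the O(N^2) kernel."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def conjugate_gradient(A, b, tol=1e-10, max_iter=100):
    """Solve A x = b for a symmetric positive-definite matrix A."""
    x = [0.0] * len(b)
    r = list(b)              # residual b - A x for the initial guess x = 0
    p = list(r)              # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if math.sqrt(rs_old) < tol:        # norm of the residual
            break
        Ap = matvec(A, p)                  # one matrix-vector product per iteration
        alpha = rs_old / dot(p, Ap)        # dot product
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)                 # dot product
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x = conjugate_gradient(A, b)
print(x)  # close to [1/11, 7/11]
```

Each iteration performs one matrix-vector product and a handful of vector operations, so for a dense matrix nearly all of the memory traffic comes from streaming A once per iteration.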
I am working on codes whose matrices are roughly the full size of the RAM (~6GB). I'll be working on an Intel i3 laptop, and I'll be linking my codes against Intel MKL.
Specifically,
Is there a good resource (PDF/book/paper) for learning manual tuning? There are numerous things that I learnt by doing, for instance that manual unrolling isn't always optimal, or things about compiler flags, but I would prefer a centralized resource.
I need something to translate profiler information into improved performance. For instance, my profiler tells me that the stack of one processor is being accessed by another, or that my mulpd ASM instruction is taking too much time. I have no clue what these mean or how I could use this information to improve my code.
My intention is to spend as much time as needed to squeeze out as much compute power as possible. It's more of a learning experience than for actual use or distribution as of now.
(I am concerned about manual tuning not auto-tuning)
Misc Details:
This differs from usual performance tuning since the major portions of the code are linked to Intel's proprietary MKL library.
Because of Memory Bandwidth issues in O(N^2) matrix-vector multiplications and dependencies, there is a limit to what I could manage on my own through simple observation.
I write in C and Fortran and I have tried both and as discussed a million times on SO, I found no difference in either if I tweak them appropriately.

Gosh, this still has no answers. After you've read this you'll still have no useful answers ...
You imply that you've already done all the obvious and generic things to make your codes fast. Specifically you have:
chosen the fastest algorithm for your problem (either that, or your problem is to optimise the implementation of an algorithm rather than to optimise the finding of a solution to a problem);
worked your compiler like a dog to squeeze out the last drop of execution speed;
linked in the best libraries you can find which are any use at all (and tested to ensure that they do in fact improve the performance of your program);
hand-crafted your memory access to optimise r/w performance;
done all the obvious little tricks that we all do (e.g. when comparing the norms of 2 vectors you don't need to take a square root to determine that one is 'larger' than another, ...);
hammered the parallel scalability of your program to within a gnat's whisker of the S==P line on your performance graphs;
always executed your program on the right size of job, for a given number of processors, to maximise some measure of performance;
and still you are not satisfied!
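As a tiny illustration of the square-root trick in the list above: because the square root is monotonic on non-negative numbers, comparing squared norms gives the same answer as comparing norms. A minimal Python sketch (the helper names are made up):

```python
import math


def norm_sq(v):
    """Squared Euclidean norm: a dot product, no square root."""
    return sum(x * x for x in v)


def is_longer(u, v):
    """True if ||u|| > ||v||, decided without computing either square root."""
    return norm_sq(u) > norm_sq(v)


u = [3.0, 4.0]        # norm 5
v = [1.0, 2.0, 2.0]   # norm 3
print(is_longer(u, v))                                  # True
print(math.sqrt(norm_sq(u)) > math.sqrt(norm_sq(v)))    # True, same decision
```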
Now, unfortunately, you are close to the bleeding edge and the information you seek is not to be found easily in books or on web-sites. Not even here on SO. Part of the reason for this is that you are now engaged in optimising your code on your platform and you are in the best position to diagnose problems and to fix them. But these problems are likely to be very local indeed; you might conclude that no-one else outside your immediate research group would be interested in what you do. I know you wouldn't be interested in any of the micro-optimisations I do on my code on my platform.
The second reason is that you have stepped into an area that is still an active research front and the useful lessons (if any) are published in the academic literature. For that you need access to a good research library; if you don't have one nearby then both the ACM and IEEE-CS Digital Libraries are good places to start. (Post or comment if you don't know what these are.)
In your position I'd be looking at journals on 2 topics: peta- and exa-scale computing for science and engineering, and compiler developments. I trust that the former is obvious, the latter may be less obvious: but if your compiler already did all the (useful) cutting-edge optimisations you wouldn't be asking this question and compiler-writers are working hard so that your successors won't have to.
You're probably looking for optimisations which, like loop unrolling, say, were relatively difficult to find implemented in compilers 25 years ago and were therefore bleeding-edge back then, and which themselves will be old and established in another 25 years.
EDIT
First, let me make explicit something that was originally only implicit in my 'answer': I am not prepared to spend long enough on SO to guide you through even a summary of the knowledge I have gained in 25+ years in scientific/engineering and high-performance computing. I am not given to writing books, but many are and Amazon will help you find them. This answer was way longer than most I care to post before I added this bit.
Now, to pick up on the points in your comment:
on 'hand-crafted memory access' start at the Wikipedia article on 'loop tiling' (see, you can't even rely on me to paste the URL here) and read out from there; you should be able to quickly pick up the terms you can use in further searches.
on 'working your compiler like a dog' I do indeed mean becoming familiar with its documentation and gaining a detailed understanding of the intentions and realities of the various options; ultimately you will have to do a lot of testing of compiler options to determine which are 'best' for your code on your platform(s).
on 'micro-optimisations', well here's a start: Performance Optimization of Numerically Intensive Codes. Don't run away with the idea that you will learn all (or even much) of what you want to learn from this book. It's now about 10 years old. The take-away messages are:
performance optimisation requires intimacy with machine architecture;
performance optimisation is made up of 1001 individual steps and it's generally impossible to predict which ones will be most useful (and which ones actually harmful) without detailed understanding of a program and its run-time environment;
performance optimisation is a participation sport, you can't learn it without doing it;
performance optimisation requires obsessive attention to detail and good record-keeping.
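To give a feel for what the 'loop tiling' pointer above refers to, here is a minimal sketch of the loop structure, written in plain Python purely for readability. In Python it will not run faster; the point of tiling in C or Fortran is that each small block of the operands is reused while it is still in cache. The function names and the tile size are made up for illustration.

```python
def matmul_naive(A, B):
    """Reference triple loop: C = A * B."""
    n, m, p = len(A), len(B), len(B[0])
    C = [[0.0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            for k in range(m):
                C[i][j] += A[i][k] * B[k][j]
    return C


def matmul_tiled(A, B, tile=2):
    """Same arithmetic, but iterated block by block (loop tiling)."""
    n, m, p = len(A), len(B), len(B[0])
    C = [[0.0] * p for _ in range(n)]
    for ii in range(0, n, tile):            # loops over tiles
        for jj in range(0, p, tile):
            for kk in range(0, m, tile):
                # loops within one tile; min() handles ragged edges
                for i in range(ii, min(ii + tile, n)):
                    for j in range(jj, min(jj + tile, p)):
                        acc = C[i][j]
                        for k in range(kk, min(kk + tile, m)):
                            acc += A[i][k] * B[k][j]
                        C[i][j] = acc
    return C


A = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
B = [[9.0, 8.0, 7.0], [6.0, 5.0, 4.0], [3.0, 2.0, 1.0]]
print(matmul_tiled(A, B) == matmul_naive(A, B))  # True
```

Choosing the tile size is exactly the kind of platform-specific experiment described above: it depends on cache sizes and has to be measured, not guessed.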
Oh, and never write a clever piece of optimisation that you can't easily un-write when the next compiler release implements a better approach. I spend a fair amount of time removing clever tricks from 20-year-old Fortran that were justified (if at all) on the grounds of boosting execution performance but which now just confuse the programmer (they annoy the hell out of me too) and get in the way of the compiler doing its job.
Finally, one piece of wisdom I am prepared to share: these days I do very little optimisation that is not under one of the items in my first list above; I find that the cost/benefit ratio of micro-optimisations is unfavourable to my employers.

Is a functional language a good choice for a Flight Simulator? How about Lisp?

I have been doing object-oriented programming for a few years now, and I have not done much functional programming. I have an interest in flight simulators, and am curious about the functional programming aspect of Lisp. Flight simulators, or any other real-world simulators, make sense to me in an object-oriented paradigm.
Here are my questions:
Is object oriented the best way to represent a real world simulation domain?
I know that Common Lisp has CLOS (OO for Lisp), but my question is really about writing a flight simulator in a functional language. So if you were going to write it in Lisp, would you choose to use CLOS or write it in a functional manner?
Does anyone have any thoughts on coding a flight simulator in Lisp or any functional language?
UPDATE 11/8/12 - A similar SO question for those interested -> How does functional programming apply to simulations?

It's a common mistake to think of "Lisp" as a functional language. Really it is best thought of as a family of languages, probably, but these days when people say Lisp they usually mean Common Lisp.
Common Lisp allows functional programming, but it isn't a functional language per se. Rather it is a general purpose language. Scheme is a much smaller variant, that is more functional in orientation, and of course there are others.
As for your question, is it a good choice? That really depends on your plans. Common Lisp in particular has some real strengths for this sort of thing. It's both interactive and introspective at a level you usually see in so-called scripting languages, making it very quick to develop in. At the same time it's compiled and has efficient compilers, so you can expect performance in the same ballpark as other efficient compilers (within a factor of two of C is typical, in my experience). While a large language, it has a much more consistent design than things like C++, and the metaprogramming capabilities can make for very clean, easy-to-understand code for your particular application. If you only look at these aspects, Common Lisp looks amazing.
However, there are downsides. The community is small, so you won't find many people to help if that's what you're looking for. While the built-in library is large, you won't find as many 3rd party libraries, so you may end up writing more of it from scratch. Finally, while it's by no means a walled garden, CL doesn't have the kind of smooth integration with foreign libraries that, say, Python does. That doesn't mean you can't call C code; there are nice tools for this.
By the way, CLOS is about the most powerful OO system I can think of, but it takes quite a different approach. If you're coming from a mainstream C++/Java/C#/etc. OO background (yes, they differ, but beyond single vs. multiple inheritance not that much), you may find it a bit strange at first, almost turned inside out.
If you go this route, you are going to have to watch for some issues with performance of the actual rendering pipeline, if you write that yourself with CLOS. The class system has incredible runtime flexibility (i.e. updating class definitions at runtime, not via monkey patching etc. but via actually changing the class and updating instances); however, you pay some dispatch cost for this.
For what it's worth, I've used CL in the past for research code requiring numerical efficiency, i.e. simulations of a different sort. It works well for me. In that case I wasn't worried about using existing code -- it didn't exist, so I was writing pretty much everything from scratch anyway.
In summary, it could be a fine choice of language for this project, but not the only one. If you don't use a language with both high-level aspects and good performance (as CL has, as does OCaml, and a few others), I would definitely look at the possibility of a two-level approach, with a language like Lua or perhaps Python (lots of libs) on top of some C or C++ code doing the heavy lifting.

If you look at the game or simulator industry you find a lot of C++ plus maybe some added scripting component. There can also be tools written in other languages for scenery design or related tasks. But there is only very little Lisp used in that domain. You need to be a good hacker to get the necessary performance out of Lisp and to be able to access or write the low-level code. How do you get this knowhow? Try, fail, learn, try, fail less, learn, ... There is nothing but writing code and experimenting with it. Lisp is really useful for good software engineers or those that have the potential to be a good software engineer.
One of the main obstacles is the garbage collector. Either you have a very simple one (then you have a performance problem with random pauses) or you have a sophisticated one (then you have a problem getting it working right). Only a few garbage collectors exist that would be suitable. Most Lisp implementations have good GC implementations, but those are still not tuned for real-time or near real-time use. Exceptions do exist. With C++ you can forget the GC, because there usually is none.
The other alternative to automatic memory management with a garbage collector is to use no GC and manage memory 'manually'. This is used by some (even commercial) Lisp applications that need to support some real-time response (for example process control expert systems).
The nearest thing that was developed in that area was the Crash Bandicoot game (and also later games) for the PlayStation I (later games were for the PlayStation II) from Naughty Dog. Since they have been bought by Sony, they switched to C++ for the PlayStation III. Their development environment was written in Allegro Common Lisp and it included a compiler for a Scheme (a Lisp dialect) variant. On the development system the code gets compiled and then downloaded to the PlayStation during development. They had their own 3D engine (very impressive, always got excellent reviews from game magazines), incremental level loading, complex behaviour control for lots of different actors, etc. So the PlayStation was really executing the Scheme code, but memory management was not done via GC (afaik). They had to develop all the technology on their own - nobody was offering Lisp-based tools - but they could, because they were excellent software developers. Since then I haven't heard of a similar project. Note that this was not just Lisp for scripting - it was Lisp all the way down.
On the Scheme side there is also a new interesting implementation called Ypsilon Scheme. It is developed for a pinball game - this could be the base for other games, too.
On the Common Lisp side, there have been Lisp applications talking to flight simulators and controlling aspects of them. There are some game libraries that are based on SDL. There are interfaces to OpenGL. There is also something like the 'Open Agent Engine'. There are also some 3D graphics applications written in Common Lisp - even some complex ones. But in the area of flight simulation there is very little prior art.
On the topic of CLOS vs. functional programming: probably one would use neither. If you need to squeeze all possible performance out of a system, then CLOS already has some overheads that one might want to avoid.

Take a look at Functional Reactive Programming. There are a number of frameworks for this in Haskell (don't know about other languages), most of which are based around arrows. The basic idea is to represent relationships between time-varying values and events. So for example you would write (in Haskell arrow notation using no particular library):
velocity <- {some expression of airspeed, heading, gravity etc.}
position <- integrate <- velocity
The second line declares the relationship between position and velocity. The <- arrow operators are syntactic sugar for a bunch of library calls that tie everything together.
Then later on you might say something like:
groundLevel <- getGroundLevel <- position
altitude <- getAltitude <- position
crashed <- liftA2 (<) altitude groundLevel
to declare that if your altitude is less than the ground level at your position then you have crashed. Just as with the other variables here, "crashed" is not just a single value; it's a time-varying stream of values. That is why the "liftA2" function is used to "lift" the comparison operator from simple values to streams.
IO is not a problem in this paradigm. Inputs are time varying values such as joystick X and Y, while the image on the screen is simply another time varying value. At the very top level your entire simulator is an arrow from the inputs to the outputs. Then you call a "run" function that converts the arrow into an IO action that runs the game.
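The core idea above (time-varying values, plus "lifting" ordinary operators onto them) can be sketched without Haskell or arrows. The following is a deliberately naive Python illustration in which a signal is just a function from time to a value; lift2 and integrate are made-up helpers, not the API of any FRP library, and the falling-object scenario is hypothetical.

```python
def lift2(op, a, b):
    """Lift a binary operator on plain values to one on signals."""
    return lambda t: op(a(t), b(t))


def integrate(signal, dt=0.001, initial=0.0):
    """Signal whose value at time t is initial + integral of `signal` over [0, t].

    Uses the midpoint rule and recomputes from scratch on every call,
    which is fine for a sketch but far too slow for a real simulator.
    """
    def integrated(t):
        steps = int(round(t / dt))
        return initial + dt * sum(signal((i + 0.5) * dt) for i in range(steps))
    return integrated


G = 9.81  # m/s^2

# Hypothetical scenario: an object dropped from 100 m above flat ground.
vertical_velocity = lambda t: -G * t
altitude = integrate(vertical_velocity, initial=100.0)
ground_level = lambda t: 0.0
crashed = lift2(lambda alt, ground: alt < ground, altitude, ground_level)

print(altitude(2.0))   # roughly 100 - 0.5 * 9.81 * 2**2
print(crashed(4.0), crashed(5.0))
```

As in the description above, crashed is not a single boolean but something you can sample at any time; real FRP frameworks add events, efficient incremental evaluation, and the top-level "run" loop that this sketch leaves out.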
If you write this in Lisp you will probably find yourself creating a bunch of macros that basically re-invent arrows, so it might be worth just finding out about arrows to start with.

I don't know anything about flight sims, and you haven't listed anything in particular they consist of, so this is mostly a guess about writing a FS in Lisp.
Why not:
Lisp excels at exploratory programming. I think that since FSs have been around so long, and there are free and open-source examples, a flight sim would not benefit as much from this type of programming.
Flight sims are mostly (I'm guessing) written in static, natively compiled languages. If you're looking for pure runtime performance, in Lisp this tends to mean type declarations and other not-so-Lispy constructs. If you don't get the performance you want with naive approaches, your optimized Lisp might end up looking a lot like C, and Lisp isn't as good as C at writing C.
A lot of a FS, I'm guessing, is interfacing to a graphics library like OpenGL, which is written in C. Depending on how your FFI / OpenGL bindings are, this might, again, make your code look like C-in-Lisp. You might not have the big win that Lisp does in, say, a web app (which consists of generating a tree structure of plain text, which Lisp is great at).
Why:
I took a glance at the FlightGear source code, and I see a lot of structural boilerplate -- even a straight port might end up being half the size.
They use strings for keys all over the place (C++ doesn't have symbols). They use XML for semi-human-readable config files (C++ doesn't have a runtime reader). Simply switching to native Lisp constructs here could be a big win for minimal effort.
Nothing looks at all complex, even the "AI". It's simply a matter of keeping everything organized, and Lisp will be great at this because it'll be a lot shorter.
But the neat thing about Lisp is that it's multi-paradigm. You can use OO for organizing the "objects", and FP for computation within each object. I say just start writing and see where it takes you.
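The "OO for organizing, FP for computation" split is not specific to Lisp. As a language-neutral illustration, here is a minimal Python sketch: a class holds the (heavily simplified, hypothetical) aircraft state, while the update is a pure function that returns a new state instead of mutating the old one. In Common Lisp the analogous pieces would be a class or struct plus ordinary functions.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Aircraft:
    """Immutable snapshot of a hypothetical, heavily simplified aircraft."""
    altitude: float     # metres
    climb_rate: float   # metres per second


def step(state, dt, climb_accel=0.0):
    """Pure function: returns the next state and leaves `state` untouched."""
    return replace(
        state,
        altitude=state.altitude + state.climb_rate * dt,
        climb_rate=state.climb_rate + climb_accel * dt,
    )


start = Aircraft(altitude=1000.0, climb_rate=5.0)
state = start
for _ in range(10):          # simulate one second in 0.1 s steps
    state = step(state, dt=0.1)

print(start)   # unchanged: the old snapshot is still available
print(state)
```

Keeping old snapshots around for free makes things like replay or rollback easy, at the cost of allocating a new object per step, which ties back to the garbage-collection concerns raised in another answer.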

I would first think of the nature of the simulation.
Some simulations require interaction, like a flight simulator. I don't think functional programming is a good choice for an interactive (read: CPU-intensive/response-critical) application. Of course, if you have access to 8 PS3s wired together with Linux, you'll not care too much about performance.
For simulations like evolutionary/genetic programming, where you set it up and let 'er rip, a functional language may help model the problem domain better than an OO language. Not that I'm an expert in functional programming, but the ease of coding recursion and the idea of lazy evaluation common in functional languages seem to me a good fit for the 'let her rip' sort of sims.

I wouldn't say functional programming lends itself particularly well to flight simulation. In general, functional languages can be very useful for writing scientific simulations, though this is a slightly specialised case. Really, you'd probably be better off with a standard imperative (preferably OOP) language like C++/C#/Java, as they would tend to have the better physics libraries as well as graphics APIs, both of which you would need to use very heavily. Also, the OOP approach might make it easier to represent your environment. Another point to consider is that (as far as I know) the popular flight simulators on the market today are written pretty much entirely in C++.
Essentially, my philosophy is that if there's no particularly good reason that you should need to use functional paradigms, then don't use a functional language (though there's nothing to stop you using functional constructs in OOP/mixed languages). I suspect you're going to have a lot less painful of a development process using the well-tested APIs for C++ and languages more commonly associated with game development (which has many commonalities with flight sim). Now, if you want to add some complex AI to the simulator, Lisp might seem like a rather more obvious choice, though even then I wouldn't at all jump for it. And finally, if you're really keen on using a functional language, I would recommend you go with one of the more general purpose ones like Python or even F# (both mixed imperative-functional languages really), as opposed to Lisp, which could end up getting rather ugly for such a project.

There is a problem with functional languages: they don't mesh well with state, but they do go well with process. So in a way it could be said they are action oriented. This means you'll be wasting your time simulating a plane; what you want to do is simulate the actions of flying a plane. Once you grok that, you can probably get it to work.
Now, as a side point, Haskell wouldn't be good IMHO, because it's too abstract for a "game". This sort of app is all about input/output, but Haskell is about avoiding IO, so it'll become a monad nightmare, and you'll be working against the language. Lisp is a better choice, or Lua or JavaScript; they are also functional, but not purely functional, so for your case try Lisp. Anyway, in any of these languages your graphics will be C or C++.
A serious issue, however, is that there is very little documentation, and there are even fewer tutorials, about functional languages and "games". Of course scientific simulation is academically documented, but those papers are quite dense. If you succeed, maybe you could write up your experiences for others, as it's a rather empty field right now.
