designing large projects in OCaml [closed] - functional-programming

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 4 years ago.
Improve this question
What is a best practices to write large software projects in OCaml?
How do you structure your projects?
What features of OCaml should and should not be used to simplify code management? Exceptions? First-class modules? GADTs? Object types?
Build system? Testing framework? Library stack?
I found great recommendations for haskell and I think it would be good to have something similar for OCaml.

I am going to answer for a medium-sized project in the conditions that I am familiar with, that is between 100K and 1M lines of source code and up to 10 developers. This is what we are using now, for a project started two months ago in August 2013.
Build system and code organization:
one source-able shell script defines PATH and other variables for our project
one .ocamlinit file at the root of our project loads a bunch of libraries when starting a toplevel session
omake, which is fast (with -j option for parallel builds); but we avoid making crazy custom omake plugins
one root Makefile contains all the essential targets (setup, build, test, clean, and more)
one level of subdirectories, not two
most subdirectories build into an OCaml library
some subdirectories contain other things (setup, scripts, etc.)
OCAMLPATH contains the root of the project; each library subdirectory produces a META file, making all OCaml parts of the projects accessible from the toplevel using #require.
only one OCaml executable is built for the whole project (saves a lot of linking time; still not sure why)
libraries are installed via a setup script using opam
local opam packages are made for software that it not in the official opam repository
we use an opam switch which is an alias named after our project, avoiding conflicts with other projects on the same machine
Source-code editing:
emacs with opam packages ocp-indent and ocp-index
Source control and management:
we use git and github
all new code is peer-reviewed via github pull requests
tarballs for non-opam non-github libraries are stored in a separate git repository (that can be blown away if history gets too big)
bleeding-edge libraries existing on github are forked into our github account and installed via our own local opam package
Use of OCaml:
OCaml will not compensate for bad programming practices; teaching good taste is beyond the scope of this answer. http://ocaml.org/learn/tutorials/guidelines.html is a good starting point.
OCaml 4.01.0 makes it much easier than before to reuse record field labels and variant constructors (i.e. type t1 = {x:int} type t2 = {x:int;y:int} let t1_of_t2 ({x}:t2) : t1 = {x} now works)
we try to not use camlp4 syntax extensions in our own code
we do not use classes and objects unless mandated by some external library
in theory since OCaml 4.01.0 we should prefer classic variants over polymorphic variants
we use exceptions to indicate errors and let them go through happily until our main server loop catches them and interprets them as "internal error" (default), "bad request", or something else
exceptions such as Exit or Not_found can be used locally when it makes sense, but in module interfaces we prefer to use options.
Libraries, protocols, frameworks:
we use Batteries for all commodity functions that are missing from OCaml's standard library; for the rest we have a "util" library
we use Lwt for asynchronous programming, without the syntax extensions, and the bind operator (>>=) is the only operator that we use (if you have to know, we do reluctantly use camlp4 preprocessing for better exception tracking on bind points).
we use HTTP and JSON to communicate with 3rd-party software and we expect every modern service to provide such APIs
for serving HTTP, we run our own SCGI server (ocaml-scgi) behind nginx
as an HTTP client we use Cohttp
for JSON serialization we use atdgen
"Cloud" services:
we use quite a lot of them as they are usually cheap, easy to interact with, and solve scalability and maintenance problems for us.
Testing:
we have one make/omake target for fast tests and one for slow tests
fast tests are unit tests; each module may provide a "test" function; a test.ml file runs the list of tests
slow tests are those that involve running multiple services; these are crafted specifically for our project, but they cover as much as possible as a production service. Everything runs locally either on Linux or MacOS, except for cloud services for which we find ways to not interfere with production.
Setting this all up is quite a bit of work, especially for someone not familiar with OCaml. There is no framework taking care of all that yet, but at least you get the choice of the tools.

OASIS
To add to Pavel answer:
Disclaimer: I am the author of OASIS.
OASIS also has oasis2opam that can help to create OPAM package quickly and oasis2debian to create Debian packages. This is extremly useful if you want to create a 'release' target that automate most of the tasks to upload a package.
OASIS is also shipped with a script called oasis-dist.ml that creates automatically tarball for upload.
Look all this in https://github.com/ocaml.org.
Testing
I use OUnit to do all my tests. This is simple and pretty efficient if you are used to xUnit testing.
Source control/management
Disclaimer: I am the owner/maintainer of forge.ocamlcore.org (aka forge.o.o)
If you want to use git, I recommend to use github. This is really efficient for review.
If you use darcs or subversion, you can create an account on forge.o.o.
In both case having a public mailing list where you send all commit notification is a must have, so that everyone can see them and review them. You can use either Google groups or a mailing list on forge.o.o.
I recommend to have a nice web (github or forge.o.o) page with OCamldoc documentation build everytime you commit. If you have a huge code base this will help you to use the OCamldoc generated documentation right from the beginning (and fix it quickly).
I recommend to create tarballs when you reach a stable stage. Don't just rely on checking out the latest git/svn version. This tip has saved me hours of work in the past. As said by Martin, store all your tarballs in a central place (a git repository is a good idea for that).

This one probably doesn't answer your question completely, but here is my experience regarding build environment:
I really appreciate OASIS. It has a nice set of features, helping not only to build the project, but also to write documentation and support test environment.
Build system
OASIS generates setup.ml file from the specification (_oasis file), which works basically as a building script. It accepts -configure, -build, -test, -distclean flags. I quite used to them while working with different GNU and other projects that usually use Makefiles and I find it convenient that it is possible to use all of them automatically here.
Makefiles. Instead of generating setup.ml, it is also possible to generate Makefile with all options described above available.
Structure
Usually my project that is built by OASIS has at least three directories: src, _build, scripts and tests.
In the former directory all source files are stored in one directory: source (.ml) and interface (.mli) files are stored together. May be if the project is too large, it is worth introducing more subdirectories.
The _build directory is under the influence of OASIS build system. It stores both source and object files there and I like that build files are not interfered with source files, so I can easily delete it in case something goes wrong.
I store multiple shell scripts in the scripts directory. Some of them are for test execution and interface file generation.
All input and output files for tests I store in a separate directory.
Interfaces/Documentation
The use of interface files (.mli) has both advantages and drawbacks for me. It really helps to find type errors, but if you have them, you have to edit them as well when making changes or improvements in your code. Sometimes forgetting this causes nasty errors.
But the main reason why I like interface files is documentation. I use ocamldoc to generate (OASIS supports this feature with -doc flag) html pages with documentation automatically. In my opinion it is enough to write comments describing each function in the interface and not to insert comments in the middle of code. In OCaml functions are usually short and concise and if there is a necessity to insert extra comments there, may be it is better to split the function.
Also be aware of -i flag for ocamlc. The compiler can automatically generate interface file for a module.
Tests
I didn't find a reasonable solution for supporting tests (I would like to have some ocamltest application), that's why I am using my own scripts for executing and verifying use cases. Fortunately, OASIS supports executing custom commands when setup.ml is run with -test flag.
I don't use OASIS for a long time and if anyone knows any other cool features, I would like also to know about them.
Also, it you are not aware of OPAM, it is definitely worth looking at. Without it installing and managing new packages is a nightmare.

Related

Using MSBUILD like a classic MAKEfile -- how do I do this?

I'm frustrated by the lack of flexibility in the Visual Studio project/solution, but I realized that now that it uses MSBUILD it might be quite powerful but just doesn't expose that to the IDE. So I took a look at MSBUILD docs and don't know where to start! I wish there was a Nutshell book for that. Is there any good tutorial someone could point me to?
More specifically, here is the kinds of things I want to do:
Run a utility pre-processor to generate .CPP and .H files, which are then used by a regular C++ project. There are multiple inputs (to figure dependencies of; specifically should know if a normal .h file it uses has changed) and multiple outputs (at least one .cpp and one .h file) that are used as files in another project.
FWIW, the most complex case involves using Qt in a "normal" C++ project that can be built using VS Express 2010 or MSBUILD directly from a script on a server. Since that is a common library, there might be some guides or whatever to help? Note that a VS plug-in is not useful for the building stage, but could be used to initially generate project files that then rely only on MSBUILD and stuff included with the source code.
Would somebody please point me in the right direction?
--John
It gets worse from there, but that's my first goal.
I found the kind of information I was looking for in a book MSBuild Trickery: 99 Ways to Bend the Build Engine to Your Will by Brian Kretzler.
In the first 18 pages I found a few key pieces of information that, along with the on-line documentations I've already gone through, helps clear things up enough to try tackling my project. Details of interest include the processing order of how MSBuild reads and operates on the things in the file, quick points on when wildcard in items are expanded and how to handle generated files, and how to see what's happening in some practical cases or even step in the debugger.
FWIW, I managed to attack my problem without using the murky ".targets"/rules files that I have yet to understand, but only using better documented/exampled features (in particular, a Target that has wildcard items doesn't care that the file name extension is not in any ".target"; is simple enough to copy from example and allows the files to be seen in the IDE Project and added to the list using the IDE; again, the FileExtension there just works OK.)

Documentation for writing GNOME Shell extensions

I've been asked to customise the layout of the GNOME 3 desktop. Apparently the way to do that is by writing an "extension".
I've managed to do some of the things I wanted to do, but I feel utterly starved of information. I cannot find any useful documentation anywhere. I've wasted entire days of my life frantically googling every imaginable search term in a desperate attempt to find useful information.
The GNOME website has hundreds of extensions for download. These are not trivial 3-liners; they're sophisticated pieces of code. It defies belief that anybody could write these without documentation explaining how to do it.
Please, can somebody tell me where the actual documentation is? So far, the best I've managed to do is take apart existing extensions trying to track down the magic command that does the specific bit I'm interested in. (Not an easy task!)
Command names, object paths, example programs, anything would be helpful!
I have recently dug into it myself. The documentation is usually sparse or outdated. Here are some sources which helped me to get started (and through development):
Basic Stuff
Step-by-step tutorial (Gnome 3.4)
Unofficial documentation for the JavaScript bindings of many libraries
The sources of the gnome-shell's JavaScript bindings
Explanation of the St (Shell Toolkit) Ui-Toolkit components.
Some unofficial guidelines to get your extension on extensions.gnome.org
Since the documentation is nearly unavailable (or up to date), you'll need to do a lot of source-reading. I linked the gnome-shell sources above (the JavaScript part) which is a good start when diving into parts that are not covered by the In-official documentation (which is the most complete thing you'll find).
What's also particular helpful is checking extensions.gnome.org for extensions which do similar things to what you want to create, and look at their sources (most of them are open-source on GitHub or Bitbucket. You can also install them and find the sources under ~/.local/share/gnome-shell/extensions/).
When searching for something to use or more documentation on a particular function, you can also consult manuals for bindings in different languages (thought the parameters and return-values might not match).
Last but not least, here is some debugging advice:
LookingGlass is not particularly helpful. It only shows one line of an exception (the description) and only if they occur at startup time (when your extension is first started).
For full StackTraces and runtime-exceptions, consult the ~/.xsession-errors-file. It might be very long and bloated. I use this handy script to read it:
# Grabs the last session-errors from the current X11 session.
# This includes full Stack-Trace of gnome-shell-extension errors.
# See https://live.gnome.org/GnomeShell/Extensions/StepByStepTutorial#lookingGlass
tail -n100 ~/.cache/gdm/session.log | less
Note that since Gnome 3.6, if you are using gdm as display manager, the current session log is the file ~/.cache/gdm/session.log.
On some newer distros using systemd, you can get the error logs with:
journalctl -f /usr/bin/gnome-session
For debugging the prefs-part of your extension, you can launch the preferences by using the gnome-shell-extension-prefs-tool from a terminal, to see any exception-output on the console (you can also call the tool like gnome-shell-extension-prefs [uuid], to directly show your extensions preferences).
Since there is currently no real way of debugging with breakpoints (there is, but it's tricky), you can log on the console for quick checking, use the print()-function. You will see the output as mentioned above (either in the sessions-error file or on the terminal when starting gnome-shell-extension-prefs-tool).
Although it might be a little hard to get into it, the extension framework is quite powerful. Have fun!
I wrote a Blog-Post with somewhat greater detail, which can be found here: Making Gnome-Shell Extensions
An extensive list of references can be found on the Gnome Developer - API Reference page.
I used the following for my extension, but your use may vary:
GTK+ 3
GTK+ is the primary library used to construct user interfaces in GNOME applications. It provides user interface controls and signal callbacks to control user interfaces.
GDK 3
GDK is an intermediate layer which isolates GTK+ from the details of the windowing system.
Clutter
Clutter is a GObject based library for creating fast, visually rich, graphical user interfaces.
GObject Introspection
GObject Introspection is striving to provide a middleware layer between (GObject based) C libraries and language bindings.
Shell
Shell Reference Manual
St
St - Shell Toolkit - is the GNOME Shell's custom Clutter-based toolkit that defines useful actors. Some of these actors, such as StBoxLayout and StBin implement various layout options.
Icon Theme Specification
This freedesktop.org specification describes a common way to store icon themes.
NOTE: These last two are very helpful in finding visual element parameters!
PyGTK
PyGTK is GTK+ for Python. This reference contains a chapter for each Python PyGTK module (that corresponds to the underlying GTK+ library) containing the class descriptions.
PyGObject
PyGObject is a Python extension module that gives clean and consistent access to the entire GNOME software platform through the use of GObject Introspection. Specifically speaking, it is Python Bindings for GLib, GObject, GIO and GTK+.
This reference contains a chapter for each PyGObject module containing the class descriptions.
The documentation is on:
https://gjs.guide/extensions/
For the documentation of libraries:
https://gjs-docs.gnome.org/
More details on https://gjs.guide/extensions/overview/architecture.html
The other stuff you might want to check are
https://gitlab.gnome.org/GNOME/gnome-shell/blob/main/js/ui/popupMenu.js
https://gitlab.gnome.org/GNOME/gnome-shell/blob/main/js/ui/dialog.js
https://gitlab.gnome.org/GNOME/gnome-shell/blob/main/js/ui/modalDialog.js
https://gitlab.gnome.org/GNOME/gnome-shell/blob/main/js/ui/panelMenu.js
https://gitlab.gnome.org/GNOME/gnome-shell/tree/main/js
https://gitlab.gnome.org/GNOME/mutter
You can browse under js/ for more code to be reused.
You might also want to check https://gi.readthedocs.io/en/latest/index.html
Question:
I could not find anything under https://gjs-docs.gnome.org/ except some CSS and Javascript documentation ?!?!
Answer:
You have to first enable the docs to use them. Here, you will be mainly looking for:
clutter
meta
shell
st
Create a file like:
echo '{"docs":"clutter9~9_api/clutterx118~8_api/gobject20~2.66p/meta9~9_api/shell01~0.1_api/st10~1.0_api","hideIntro":"1"}' > devdocs.json
Import this file to https://gjs-docs.gnome.org/settings
Now you will be able to visit:
https://gjs-docs.gnome.org/shell01~0.1_api-global/
https://gjs-docs.gnome.org/shell01~0.1_api/
https://gjs-docs.gnome.org/meta9~9_api/
https://gjs-docs.gnome.org/st10~1.0_api/
https://gjs-docs.gnome.org/clutter9~9_api/
https://gjs-docs.gnome.org/clutter9~9_api-actor/
Warning: The version on the devdocs.json file is hardcoded. It will be outdated in no time, so you might want to check the version. The point is - you can not access docs until you enable them.
P.S. I know, this is a mess. This is how they did it.

What use does ./configure serve (other than checking dependencies)

Why does every source package that uses a makefile come with a ./configure script, what does it do? As far as I can tell, it actually generates the makefile?
Is there anything that can't be done in the makefile?
configure is usually a result of the 'autoconf' system. It can generate headers, makefiles, really anything. 'Usually,' since some are hand-crafted.
The reason for these things is to allow source code to be compiled in disparate environments. There are many variations on Unix / Linux, with slightly different system headers and libraries. configure scripts allow code to auto-adapt.
The configure step is a sort of meta build. It generates the makefile that will work on the specific hardware / distribution you're running. For instance it determines the name of the C or C++ compiler and embeds that in the makefile.
The configure step will also frequently take a set of parameters, the values of which may determine what libraries need to be linked against. For instance if you compile Apache HTTP with SSL enabled it needs to link against more shared libraries than if you don't. Since linking is handled by the makefile, you need an extra step to create a custom makefile (rather than requiring the make command to require dozens or hundreds of options.
Everything can be done from within the makefile but some build systems were implemented otherwise.
I don't personally use configure files for my projects but I admit I mostly export Erlang & Python based projects.
I don't think of the makefile as a script, I think of it as an input to the make utility.
The configure script (as its name suggests) configures the makefile, including as you say resolving dependencies.
If only from the idea of avoiding self-modifying code, the things in the configure script don't really belong in the makefile.
the point is that autoconf autohdr automake form an integrated system the makes cross platform building on unix relatively str8forward. THe docs are really bad and there are lots of horrible gotchas but on the other hand there are a lot of working samples
When I first came across this stuff I thought - "ha I can do that with a nice clean makefile" and proceeded to rework the source that way. Big mistake. Learn to write and edit configure.ac and makefile.am files, you will be happy in the end
To answer your question. Configure is good for
is function foo available on this platform and if so which include and library do I need
letting the builder choose if they want feature wizzbang included in a nice simple consistent way

What is your experience with non-recursive make? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 9 years ago.
Improve this question
A few years ago, I read the Recursive Make Considered Harmful paper and implemented the idea in my own build process. Recently, I read another article with ideas about how to implement non-recursive make. So I have a few data points that non-recursive make works for at least a few projects.
But I'm curious about the experiences of others. Have you tried non-recursive make? Did it make things better or worse? Was it worth the time?
We use a non-recursive GNU Make system in the company I work for. It's based on Miller's paper and especially the "Implementing non-recursive make" link you gave. We've managed to refine Bergen's code into a system where there's no boiler plate at all in subdirectory makefiles. By and large, it works fine, and is much better than our previous system (a recursive thing done with GNU Automake).
We support all the "major" operating systems out there (commercially): AIX, HP-UX, Linux, OS X, Solaris, Windows, even the AS/400 mainframe. We compile the same code for all of these systems, with the platform dependent parts isolated into libraries.
There's more than two million lines of C code in our tree in about 2000 subdirectories and 20000 files. We seriously considered using SCons, but just couldn't make it work fast enough. On the slower systems, Python would use a couple of dozen seconds just parsing in the SCons files where GNU Make did the same thing in about one second. This was about three years ago, so things may have changed since then. Note that we usually keep the source code on an NFS/CIFS share and build the same code on multiple platforms. This means it's even slower for the build tool to scan the source tree for changes.
Our non-recursive GNU Make system is not without problems. Here are some of biggest hurdles you can expect to run into:
Making it portable, especially to Windows, is a lot of work.
While GNU Make is almost a usable functional programming language, it's not suitable for programming in the large. In particular, there are no namespaces, modules, or anything like that to help you isolate pieces from each other. This can cause problems, although not as much as you might think.
The major wins over our old recursive makefile system are:
It's fast. It takes about two seconds to check the entire tree (2k directories, 20k files) and either decide it's up to date or start compiling. The old recursive thing would take more than a minute to do nothing.
It handles dependencies correctly. Our old system relied on the order subdirectories were built etc. Just like you'd expect from reading Miller's paper, treating the whole tree as a single entity is really the right way to tackle this problem.
It's portable to all of our supported systems, after all the hard work we've poured into it. It's pretty cool.
The abstraction system allows us to write very concise makefiles. A typical subdirectory which defines just a library is just two lines. One line gives the name of the library and the other lists the libraries this one depends on.
Regarding the last item in the above list. We ended up implementing a sort of macro expansion facility within the build system. Subdirectory makefiles list programs, subdirectories, libraries, and other common things in variables like PROGRAMS, SUBDIRS, LIBS. Then each of these are expanded into "real" GNU Make rules. This allows us to avoid much of the namespace problems. For example, in our system it's fine to have multiple source files with the same name, no problem there.
In any case, this ended up being a lot of work. If you can get SCons or similar working for your code, I'd advice you look at that first.
After reading the RMCH paper, I set out with the goal of writing a proper non-recursive Makefile for a small project I was working on at the time. After I finished, I realized that it should be possible to create a generic Makefile "framework" which can be used to very simply and concisely tell make what final targets you would like to build, what kind of targets they are (e.g. libraries or executables) and what source files should be compiled to make them.
After a few iterations I eventually created just that: a single boilerplate Makefile of about 150 lines of GNU Make syntax that never needs any modification -- it just works for any kind of project I care to use it on, and is flexible enough to build multiple targets of varying types with enough granularity to specify exact compile flags for each source file (if I want) and precise linker flags for each executable. For each project, all I need to do is supply it with small, separate Makefiles that contain bits similar to this:
TARGET := foo
TGT_LDLIBS := -lbar
SOURCES := foo.c baz.cpp
SRC_CFLAGS := -std=c99
SRC_CXXFLAGS := -fstrict-aliasing
SRC_INCDIRS := inc /usr/local/include/bar
A project Makefile such as the above would do exactly what you'd expect: build an executable named "foo", compiling foo.c (with CFLAGS=-std=c99) and baz.cpp (with CXXFLAGS=-fstrict-aliasing) and adding "./inc" and "/usr/local/include/bar" to the #include search path, with final linking including the "libbar" library. It would also notice that there is a C++ source file and know to use the C++ linker instead of the C linker. The framework allows me to specify a lot more than what is shown here in this simple example.
The boilerplate Makefile does all the rule building and automatic dependency generation required to build the specified targets. All build-generated files are placed in a separate output directory hierarchy, so they're not intermingled with source files (and this is done without use of VPATH so there's no problem with having multiple source files that have the same name).
I've now (re)used this same Makefile on at least two dozen different projects that I've worked on. Some of the things I like best about this system (aside from how easy it is to create a proper Makefile for any new project) are:
It's fast. It can virtually instantly tell if anything is out-of-date.
100% reliable dependencies. There is zero chance that parallel builds will mysteriously break, and it always builds exactly the minimum required to bring everything back up-to-date.
I will never need to rewrite a complete makefile again :D
Finally I'd just mention that, with the problems inherent in recursive make, I don't think it would have been possible for me to pull this off. I'd probably have been doomed to rewriting flawed makefiles over and over again, trying in vain to create one that actually worked properly.
Let me stress one argument of Miller's paper: When you start to manually resolve dependency relationships between different modules and have a hard time to ensure the build order, you are effectively reimplementing the logic the build system was made to solve in the first place. Constructing reliable recursive make build systems is very hard. Real life projects have many interdependent parts whose build order is non-trivial to figure out and thus, this task should be left to the build system. However, it can only resolve that problem if it has global knowledge of the system.
Furthermore, recursive make build-systems are prone to fall apart when building concurrently on multiple processors/cores. While these build systems may seem to work reliably on a single processor, many missing dependencies go undetected until you start to build your project in parallel. I've worked with a recursive make build system which worked on up to four processors, but suddenly crashed on a machine with two quad-cores. Then I was facing another problem: These concurrency issues are next to impossible to debug and I ended up drawing a flow-chart of the whole system to figure out what went wrong.
To come back to your question, I find it hard to think of good reasons why one wants to use recursive make. The runtime performance of non-recursive GNU Make build systems is hard to beat and, quite the contrary, many recursive make systems have serious performance problems (weak parallel build support is again a part of the problem). There is a paper in which I evaluated a specific (recursive) Make build system and compared it to a SCons port. The performance results are not representative because the build system was very non-standard, but in this particular case the SCons port was actually faster.
Bottom line: If you really want to use Make to control your software builds, go for non-recursive Make because it makes your life far easier in the long run. Personally, I would rather use SCons for usability reasons (or Rake - basically any build system using a modern scripting language and which has implicit dependency support).
I made a half-hearted attempt at my previous job at making the build system (based on GNU make) completely non-recursive, but I ran into a number of problems:
The artifacts (i.e. libraries and executables built) had their sources spread out over a number of directories, relying on vpath to find them
Several source files with the same name existed in different directories
Several source files were shared between artifacts, often compiled with different compiler flags
Different artifacts often had different compiler flags, optimization settings, etc.
One feature of GNU make which simplifies non-recursive use is target-specific variable values:
foo: FOO=banana
bar: FOO=orange
This means that when building target "foo", $(FOO) will expand to "banana", but when building target "bar", $(FOO) will expand to "orange".
One limitation of this is that it is not possible to have target-specific VPATH definitions, i.e. there is no way to uniquely define VPATH individually for each target. This was necessary in our case in order to find the correct source files.
The main missing feature of GNU make needed in order to support non-recursiveness is that it lacks namespaces. Target-specific variables can in a limited manner be used to "simulate" namespaces, but what you really would need is to be able to include a Makefile in a sub-directory using a local scope.
EDIT: Another very useful (and often under-used) feature of GNU make in this context is the macro-expansion facilities (see the eval function, for example). This is very useful when you have several targets which have similar rules/goals, but differ in ways which cannot be expressed using regular GNU make syntax.
I agree with the statements in the refered article, but it took me a long time to find a good template which does all this and is still easy to use.
Currenty I'm working on a small research project, where I'm experimenting with continuous integration; automatically unit-test on pc, and then run a system test on a (embedded) target. This is non-trivial in make, and I've searched for a good solution. Finding that make is still a good choice for portable multiplatform builds I finally found a good starting point in http://code.google.com/p/nonrec-make
This was a true relief. Now my makefiles are
very simple to modify (even with limited make knowledge)
fast to compile
completely checking (.h) dependencies with no effort
I will certainly also use it for the next (big) project (assuming C/C++)
I have developed a non-recursive make system for a one medium sized C++ project, which is intended for use on unix-like systems (including macs). The code in this project is all in a directory tree rooted at a src/ directory. I wanted to write a non-recursive system in which it is possible to type "make all" from any subdirectory of the top level src/ directory in order to compile all of the source files in the directory tree rooted at the working directory, as in a recursive make system. Because my solution seems to be slightly different from others I have seen, I'd like to describe it here and see if I get any reactions.
The main elements of my solution were as follows:
1) Each directory in the src/ tree has a file named sources.mk. Each such file defines a makefile variable that lists all of the source files in the tree rooted at the directory. The name of this variable is of the form [directory]_SRCS, in which [directory] represents a canonicalized form of the path from the top level src/ directory to that directory, with backslashes replaced by underscores. For example, the file src/util/param/sources.mk defines a variable named util_param_SRCS that contains a list of all source files in src/util/param and its subdirectories, if any. Each sources.mk file also defines a variable named [directory]_OBJS that contains a list of the the corresponding object file *.o targets. In each directory that contains subdirectories, the sources.mk includes the sources.mk file from each of the subdirectories, and concatenates the [subdirectory]_SRCS variables to create its own [directory]_SRCS variable.
2) All paths are expressed in sources.mk files as absolute paths in which the src/ directory is represented by a variable $(SRC_DIR). For example, in the file src/util/param/sources.mk, the file src/util/param/Componenent.cpp would be listed as $(SRC_DIR)/util/param/Component.cpp. The value of $(SRC_DIR) is not set in any sources.mk file.
3) Each directory also contains a Makefile. Every Makefile includes a global configuration file that sets the value of the variable $(SRC_DIR) to the absolute path to the root src/ directory. I chose to use a symbolic form of absolute paths because this appeared to be the easiest way to create multiple makefiles in multiple directories that would interpret paths for dependencies and targets in the same way, while still allowing one to move the entire source tree if desired, by changing the value of $(SRC_DIR) in one file. This value is set automatically by a simple script that the user is instructed to run when the package is dowloaded or cloned from the git repository, or when the entire source tree is moved.
4) The makefile in each directory includes the sources.mk file for that directory. The "all" target for each such Makefile lists the [directory]_OBJS file for that directory as a dependency, thus requiring compilation of all of the source files in that directory and its subdirectories.
5) The rule for compiling *.cpp files create a dependency file for each source file, with a suffix *.d, as a side-effect of compilation, as described here: http://mad-scientist.net/make/autodep.html. I chose to use the gcc compiler for dependency generation, using the -M option. I use gcc for dependency generation even when using another compiler to compile the source files, because gcc is almost always available on unix-like systems, and because this helps standardize this part of the build system. A different compiler can be used to actually compile the source files.
6) The use of absolute paths for all files in the _OBJS and _SRCS variables required that I write a script to edit the dependency files generated by gcc, which creates files with relative paths. I wrote a python script for this purpose, but another person might have used sed. The paths for dependencies in the resulting dependency files are literal absolute paths. This is fine in this context because the dependency files (unlike the sources.mk files) are generated locally and rather than being distributed as part of the package.
7) The Makefile in each director includes the sources.mk file from the same directory, and contains a line "-include $([directory]_OBJS:.o=.d)" that attempts to include a dependency files for every source file in the directory and its subdirectories, as described in the URL given above.
The main difference between this an other schemes that I have seen that allow "make all" to be invoked from any directory is the use of absolute paths to allow the same paths to be interpreted consistently when Make is invoked from different directories. As long as these paths are expressed using a variable to represent the top level source directory, this does not prevent one from moving the source tree, and is simpler than some alternative methods of achieving the same goal.
Currently, my system for this project always does an "in-place" build: The object file produced by compiling each source file is placed in the same directory as the source file. It would be straightforward to enable out-of place builds by changing the script that edits the gcc dependency files so as to replace the absolute path to the src/ dirctory by a variable $(BUILD_DIR) that represents the build directory in the expression for the object file target in the rule for each object file.
Thus far, I've found this system easy to use and maintain. The required makefile fragments are short and comparatively easy for collaborators to understand.
The project for which I developed this system is written in completely self-contained ANSI C++ with no external dependencies. I think that this sort of homemade non-recursive makefile system is a reasonable option for self-contained, highly portable code. I would consider a more powerful build system such as CMake or gnu autotools, however, for any project that has nontrivial dependencies on external programs or libraries or on non-standard operating system features.
I know of at least one large scale project (ROOT), which advertises using [powerpoint link] the mechanism described in Recursive Make Considered Harmful. The framework exceeds a million lines of code and compiles quite smartly.
And, of course, all the largish projects I work with that do use recursive make are painfully slow to compile. ::sigh::
I written a not very good non-recursive make build system, and since then a very clean modular recursive make build system for a project called Pd-extended. Its basically kind of like a scripting language with a bunch of libraries included. Now I'm also working Android's non-recursive system, so that's the context of my thoughts on this topic.
I can't really say much about the differences in performance between the two, I haven't really paid attention since full builds are really only ever done on the build server. I am usually working either on the core language, or a particular library, so I am only interested in building that subset of the whole package. The recursive make technique has the huge advantage of making the build system be both standalone and integrated into a larger whole. This is important to us since we want to use one build system for all libraries, whether they are integrated in or written by an external author.
I'm now working on building custom version of Android internals, for example an version of Android's SQLite classes that are based on the SQLCipher encrypted sqlite. So I have to write non-recursive Android.mk files that are wrapping all sorts of weird build systems, like sqlite's. I can't figure out how to make the Android.mk execute an arbitrary script, while this would be easy in a traditional recursive make system, from my experience.

How to get doxygen to run faster?

Doxygen is a bit slow - it takes about a couple of minutes to process my whole project, so for small incremental changes this is longer than actually building the rest of my code. There are thousands of files without any documentation so I guess it is spending most of its time processing them. Is there any way to get it to skip files without any documentation?
What about getting it to only process changed files?
From Doxygen documentation:
How can I exclude all test directories
from my directory tree?
Simply put an exclude pattern like
this in the configuration file:
EXCLUDE_PATTERNS = /test/
So, you should be using patterns to exclude files. It's been a long time since I've used Doxygen, but i don't remember any option to process only changed files.
I found that turning off the option SEARCH_INCLUDES made a big difference. It was looking through the whole platform SDK and include paths for the compiler which were not documented anyway and would not appear in the generated documentation.
There is a DOT_NUM_THREADS options which may increase the performance on multicore machines. Unfortunately doxygen itself is just single threaded.
Another approach would be to organize your code into modules run for each module a separate doxygen instance and link the resulting tags together: http://www.doxygen.nl/manual/external.html
Doxygen is good at finding connections between files, either changed or not. But Doxygen does not remember informations about unchanged files, so it must process the whole codebase each time.
May be a solution would be to organize the project such that never changed files belong to one module which is excluded from Doxygen scope and whose documentation is already available. Then it would be possible to tell Doxygen to link newly built documentation to this existing module documentation.
Going further, it would also be possible to make Doxygen running module by module, processing only changed modules and a top level documentation which links to all module documentations.
I don't think having Doxygen run on a normal dev cycle is a good idea. Our Doxygen build runs as part of our Continuous Integration server's responsibilities.
That said, there are some benefits of running doxygen every build to catch missing docs.
So I would trim the doxygen config for dev builds removing diagrams, and even stop apple importing it into xcode.

Resources