Thompson's Trojan Compiler - unix

I'm trying to get a better understanding of Thompson's Trojan Compiler (discussed in his 1984 ACM Turing Award speech "Reflections on Trusting Trust"), and so far this is how I understand it:
"The original login program for Unix would accept whatever login and password the root instructed it to. It would only accept a certain password, known only by the man who wrote the system. This could let him log in to the system as root."
Is this the right concept? I'm not 100% sure if I understand the whole concept.
If someone could make it clearer, it would help.
(See also Bruce Schneier's "Countering 'Trusting Trust'".)

The original login program accepts matching pairs of name and password from a file.
The modification is to add a super-powerful password, compiled into the login program, that allows root access. To ensure this code isn't visible when reading the login program's source, there's a change to the compiler: it recognizes this section of the login program in its original form and compiles it into the binary containing the super-powerful password. Then, to hide the existence of that change in the compiler itself, there's a second change to the compiler: it recognizes the section of the compiler where the first change was added and outputs the modified form.
Once the changed compiler code exists, you can compile the compiler and install it in the standard place, and then revert the source code for both the login program and the compiler to their unmodified form. The installed compiled compiler will then take the unchanged login program and output the insecure form. Similarly, the installed compiler will compile the unmodified compiler source code into the devious variant. Anyone inspecting the source code for either one will agree that there's nothing unusual in them.
Of course, it only works until the source code for either program evolves far enough that the modified compiler no longer recognizes it. Since the modified compiler's source code is no longer present, it can't be maintained, and (assuming that the compiler and login continue to evolve) it will eventually stop producing the insecure output.
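To make the two-stage trick more concrete, here is a toy sketch (my own illustration, not Thompson's actual code) of the pattern-match-and-inject idea; the pattern "check_password(" and the password "backdoor" are invented for this sketch:
    /* Toy illustration only -- not Thompson's code. A "compiler" pass that
       copies source text through unchanged, except that when it sees the
       login password check it silently appends a backdoor clause. */
    #include <stdio.h>
    #include <string.h>

    static void emit_line(const char *line)
    {
        fputs(line, stdout);
        /* Stage 1: recognize the password test in login.c and emit an
           extra clause accepting a hard-wired password. */
        if (strstr(line, "check_password(") != NULL)
            fputs("    || strcmp(pw, \"backdoor\") == 0\n", stdout);
        /* Stage 2 (omitted here): recognize this very function in the
           compiler's own source and re-insert both stages, so the trojan
           survives recompilation from clean sources. */
    }

    int main(void)
    {
        char line[1024];
        while (fgets(line, sizeof line, stdin))
            emit_line(line);
        return 0;
    }
The essential point is the second comment: because the compiler binary re-inserts both checks whenever it compiles its own clean source, no trace of the trojan remains in any source file.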

I had never encountered the concept before, but this is pretty interesting - I found a neat write-up at http://scienceblogs.com/goodmath/2007/04/strange_loops_dennis_ritchie_a.php

Yes, it is the right concept. There's more to it, though: the modified compiler must also compile the unmodified compiler source into a similarly modified copy of itself. To keep doing that across even trivial variations of that source, the modified compiler would, in the limit, have to solve something like the halting problem.

Related

How to do image-based development in Common Lisp?

I am new to Common Lisp. This is how I develop programs in other languages, and also how I now develop programs in Common Lisp:
1. Open a text editor (e.g. vim or emacs) to create/edit a text file.
2. Write source code into the text file. (If unsure about the behavior of a snippet of code, and a REPL is available, evaluate the snippet in the REPL, verify that it evaluates as expected, and then go back to writing more code.)
3. Save the text file.
4. Ask the compiler/interpreter to load and run the source code in the text file (e.g. sbcl --script myprog.lisp).
5. Go to step 1 if needed.
This is the conventional write-compile-run development cycle for most programming languages. However, in the lisp world, I hear things like "interactive development" and "image-based development", and I feel that I am missing out on an important feature of Common Lisp. How do I do "image-based development" instead of "write-compile-run development"?
Can someone provide a step-by-step example of "image-based development" similar to how I described "write-compile-run development" above?
(Note: I am using SBCL)
In typical Common Lisp implementations the runtime, the compiler, parts of the development environment and the program you are developing reside in the same process and share the same object space. The compiler is always available while you develop the program, and the program can be developed incrementally. The development tools have access to all objects and can inspect their state. One can also undefine/remove, replace, or enhance functionality in the running program.
Thus:
Don't restart the program you are developing. Stay connected and update it, even over days, weeks, or months if possible.
Write code in such a way that the program can be replicated and built from scratch if necessary. Build it from time to time and fix any build problems.
Once you use your program and hit an error, fix the error within the running program, while being able to inspect the full error state.
Creating a runnable program means either loading all code into a plain Lisp each time, or saving an executable image with the loaded code/data (a minimal sketch follows this list).
Fixes for program bugs can also be shipped to the user as compiled Lisp files, which are loaded into the delivered program and update the code in place.
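As a minimal sketch of the image-saving option with SBCL (the implementation mentioned in the question; the entry function main is my own invention):
    ;; Sketch: after loading your code into a running SBCL image,
    ;; dump it as a standalone executable. MAIN is a hypothetical entry point.
    ;; (Typically run from a plain sbcl, not from inside a multithreaded SLIME session.)
    (defun main ()
      (format t "Hello from the saved image~%"))

    ;; SBCL-specific; other implementations have their own equivalents.
    (sb-ext:save-lisp-and-die "myprog"
                              :executable t
                              :toplevel #'main)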
Let's say that you are using SBCL with Emacs and SLIME (e. g. through Portacle).
1. Open Emacs.
2. Start SLIME (M-x slime): this starts a “plain” Lisp process in the background, connects the editor functions provided by SLIME to it, and gives you a REPL that is also connected to this process (the image).
3. Open a text file (e.g. foo.lisp).
4. Type some code.
5. Press C-c C-k to compile the file and load it into the running Lisp process.
6. Switch to the REPL and try it out.
7. Switch back to the Lisp file (step 4).
This is just very basic usage. Further things to do/learn
You can also compile and load just a single toplevel form (C-c C-c)
Learn about packages
Learn about systems (ASDF)
Learn how to use Quicklisp to get the libraries you want
Learn how to access inline documentation from the REPL
Note that you never need to unload your program, you just modify it, even when downloading and loading new libraries. This makes the feedback cycle instantaneous in most cases. You also never need to switch away from the IDE (Emacs).
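As a small made-up example of the cycle described in the steps above, foo.lisp might contain:
    ;; foo.lisp -- made-up example for the edit/compile/try-it-out cycle.
    (defpackage :foo (:use :cl))
    (in-package :foo)

    (defun greet (name)
      (format t "Hello, ~a!~%" name))

    ;; Compile and load the whole file with C-c C-k, then in the REPL:
    ;;   (foo::greet "world")
    ;; Now change GREET (say, to shout the greeting), press C-c C-c on the
    ;; form, and call it again in the REPL -- the running image is updated,
    ;; no restart required.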

Loading the dynamic linker

If the dynamic linker/loader is itself a shared object file, how is it properly loaded into a dynamically linked program's process image space if it's not already there? Is this some kind of catch 22 thing?
This answer provides some details (although there are technical mistakes in it).
Is this some kind of catch 22 thing?
Yes: ld.so is special -- it is a self-relocating binary.
It starts by carefully executing code that doesn't require any relocations. That code relocates ld.so itself. After this self-relocation / bootstrap process is done, ld.so continues just as a regular shared library.
Refer to
Oracle Solaris 11.1 Linkers and Libraries Guide
It's the best linkers reference I have come across: concise, and it explains things well.
On page 89:
As part of the initialization and execution of a dynamic executable, an interpreter is called to complete the binding of the application to its dependencies. In the Oracle Solaris OS, this interpreter is referred to as the runtime linker.
During the link-editing of a dynamic executable, a special .interp section, together with an associated program header, are created. This section contains a path name specifying the program's interpreter. The default name supplied by the link-editor is the name of the runtime linker: /usr/lib/ld.so.1 for a 32-bit executable and /usr/lib/64/ld.so.1 for a 64-bit executable.
Note: ld.so.1 is a special case of a shared object. Here, a version number of 1 is used. However, later Oracle Solaris OS releases might provide higher version numbers.
During the process of executing a dynamic object, the kernel loads the file and reads the program header information. See “Program Header” on page 371. From this information, the kernel locates the name of the required interpreter. The kernel loads, and transfers control to this interpreter, passing sufficient information to enable the interpreter to continue executing the application.
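The same mechanism can be observed on Linux/ELF systems, where the interpreter path lives in the PT_INTERP program header (readelf -l shows it as "Requesting program interpreter: ..."). A minimal sketch, assuming a 64-bit ELF file:
    /* Sketch: print an ELF executable's interpreter path (64-bit ELF assumed).
       Equivalent information is shown by `readelf -l <file>`. */
    #include <elf.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        if (argc != 2) {
            fprintf(stderr, "usage: %s <elf-file>\n", argv[0]);
            return 1;
        }
        FILE *f = fopen(argv[1], "rb");
        if (!f) {
            perror("fopen");
            return 1;
        }
        Elf64_Ehdr eh;
        if (fread(&eh, sizeof eh, 1, f) != 1)
            return 1;
        for (int i = 0; i < eh.e_phnum; i++) {
            Elf64_Phdr ph;
            fseek(f, (long)(eh.e_phoff + (Elf64_Off)i * eh.e_phentsize), SEEK_SET);
            if (fread(&ph, sizeof ph, 1, f) != 1)
                break;
            if (ph.p_type == PT_INTERP) {        /* the ".interp" segment */
                char *interp = malloc(ph.p_filesz);
                fseek(f, (long)ph.p_offset, SEEK_SET);
                if (interp && fread(interp, 1, ph.p_filesz, f) == ph.p_filesz)
                    printf("interpreter: %s\n", interp); /* e.g. /lib64/ld-linux-x86-64.so.2 */
                free(interp);
            }
        }
        fclose(f);
        return 0;
    }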

Automatically log changes to system files and allow revert

I'm trying to learn about the guts of Unix right now, mostly through experimentation. When I was first starting, I found myself looking through forum posts, copying and pasting bash code. When I broke something, I often had to do a fresh install because I couldn't remember exactly what I had changed where. Now, the simple solution is to keep a log of all the system files I've changed, plus original copies of all the default files, so I can revert if necessary. It would be great if there were a command-line tool that did this for me automatically. It would be even better if I could step back through changes. Basically, I'm looking to version-control my entire OS.
Does anything like this exist? I would also accept alternative strategies for spelunking through Unix without causing permanent damage if you think I'm going about this wrong.
Using Debian, if it matters.

Couple of basic questions about Julia on Windows

I run Julia on Windows with the julia.bat file given in the zip archive; this launches a DOS console. I have a couple of basic questions.
When typing a plot() command, Julia returns "plot not defined". How do I use the plot() function? Is there a graphical interface available?
When typing help I get:
What does it mean?
There is also the launch-julia-webserver.bat file in the zip archive. When running this file, two DOS windows open but nothing else happens. What can we do with this file, and how?
By the way, I cannot find any documentation answering such basic questions... of course, if you know where to find such documentation, that would be an ideal answer.
To answer your immediate question, help is implemented as a function, and functions must be called with parentheses. Try help(), or to get help for a particular function in the standard library supply it as an argument; i.e., help(help).
When you enter a function name without the parentheses, the default is to print all of the implementations with their argument types.
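For example, at the Julia prompt (with the 2012-era Windows build described in the question):
    # Typed at the julia> prompt:
    help          # just the name: prints the function's implementations and argument types
    help()        # calling it with no arguments: prints the general help text
    help(help)    # help for a particular standard-library function, here help itself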
The main Julia documentation is available online at http://docs.julialang.org/. We also have a mailing list at https://groups.google.com/forum/#!forum/julia-dev.
The webserver is pretty rough, especially on Windows. You should be able to open up http://localhost:2000/ with it running and access a web-based command environment. But you'll probably just want to stick to the normal command line.
Another contributor highlighted the response to help as a potential issue for new users, and we've opened a bug on it at https://github.com/JuliaLang/julia/issues/1320. It's a new language and there are still plenty of rough edges, so thanks for helping us file those down!
To use launch-julia-webserver.bat, after you double-click it and the two DOS windows open, one of them should say "Connect to http://localhost:2000/ for the web REPL". If you open a web browser to http://localhost:2000/, you should be greeted with a welcome page that asks for your name and a session name.

Is it possible to have the entire contents of a class that tripped an error included in the stacktrace?

A lot of time can pass between the moment a stack trace is generated and the moment the stack trace is thoroughly investigated. During that time, a lot can happen to the file in question, sometimes obscuring the original error. The error might have been fixed in the meantime (overlapping bugs).
Is it possible to get stack traces that show the offending file as it was at the time of the error?
Not elegantly, and you normally don't want the user browsing through code that's throwing unexpected exceptions anyway (open door to an attacker).
Usually, what happens in a dev shop is that the user reports an error, stack trace, and the build it occurred on. As a tester, you can grab that build from your archives (you ARE keeping an archive of all supported releases somewhere handy, RIGHT?), install, run, and try to reproduce the error, working with the user to provide additional info as necessary. I've seen very few bugs that couldn't be reproduced EVENTUALLY, even if it required running the program against a backup of the user's production database to do it.
As a developer, you can download that build's source code from your version control repository (you ARE using version control, RIGHT?), and examine the lines in the stack trace to try to discover the problem by inspection, and/or build and run it to reproduce the error. Then, you go back to the latest source version, build, and run the same steps (a UI automation system can help out here), and if you don't get the error, someone else already found and fixed it. If you still get the error, you also got an updated stack trace with lines that match the current build, allowing you to set your breakpoints and step through.
What KeithS said, plus there are ways to capture more helpful state information at the time of the Exception using the Exception.Data property. See http://blog.abodit.com/2010/03/using-exception-data-to-add-additional-information-to-an-exception/
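A minimal made-up C# sketch of that idea (ProcessOrder and the dictionary keys are invented for illustration):
    using System;
    using System.Collections;

    class Program
    {
        // Hypothetical operation that fails somewhere deep inside.
        static void ProcessOrder(int orderId)
        {
            try
            {
                throw new InvalidOperationException("payment service unavailable");
            }
            catch (Exception ex)
            {
                // Attach contextual state; it travels with the exception object.
                ex.Data["OrderId"] = orderId;
                ex.Data["TimestampUtc"] = DateTime.UtcNow;
                throw;  // rethrow, preserving the original stack trace
            }
        }

        static void Main()
        {
            try
            {
                ProcessOrder(42);
            }
            catch (Exception ex)
            {
                // Whatever logs the stack trace can also dump the extra state.
                Console.Error.WriteLine(ex);
                foreach (DictionaryEntry entry in ex.Data)
                    Console.Error.WriteLine($"  {entry.Key} = {entry.Value}");
            }
        }
    }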
While KeithS' answer is pretty much correct, it can be easier and more elegant than you think. If you can collect a dump file (instead of just a stack trace), you can use a Symbol Server and Source Server in combination with your debugger to automatically pull the correct-version code from source control.
For example: if you enable PDB output and source-server integration in MSBuild, and upload the resulting PDBs to a symbol server, Visual Studio can automatically load the correct source from a TFS or SourceSafe repository based on the information in a minidump.
